Data Lake and Data Warehouse

Seven years ago I wrote an article about data warehouses. At that time, only a few organizations talked about data warehousing and data mining.

Nowadays, in the era of big data, where huge volumes of data are generated, people have started to think differently about how data is stored and how it can be used for analytical purposes.

In a traditional data warehouse, data is loaded into an RDBMS only after its use has been defined. For example, an organization might use the data warehouse to keep the total goods sold for every city, region, state, and country, along with the goods type.
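To make the "define the use first, then load" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the RDBMS. The table name, column names, and sample rows are all hypothetical and chosen only for illustration; a real warehouse would typically use a richer dimensional model.

```python
import sqlite3

# Hypothetical sample data; names and values are assumptions for illustration.
SALES_ROWS = [
    # (city, region, state, country, goods_type, units_sold)
    ("City A", "Region 1", "State X", "Country 1", "electronics", 120),
    ("City B", "Region 1", "State X", "Country 1", "groceries", 300),
    ("City C", "Region 2", "State Y", "Country 1", "electronics", 80),
    ("City A", "Region 1", "State X", "Country 1", "groceries", 50),
]


def build_warehouse(rows):
    """Create the fixed schema first, then load rows that fit it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sales (
            city TEXT NOT NULL,
            region TEXT NOT NULL,
            state TEXT NOT NULL,
            country TEXT NOT NULL,
            goods_type TEXT NOT NULL,
            units_sold INTEGER NOT NULL
        )
        """
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def total_units_by(conn, level):
    """Total units sold, grouped by one geographic level and goods type."""
    allowed = {"city", "region", "state", "country"}
    if level not in allowed:
        raise ValueError(f"unknown level: {level}")
    # The column name is checked against a whitelist before being interpolated.
    query = (
        f"SELECT {level}, goods_type, SUM(units_sold) FROM sales "
        f"GROUP BY {level}, goods_type ORDER BY {level}, goods_type"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse(SALES_ROWS)
    for row in total_units_by(warehouse, "state"):
        print(row)
```

Because the schema is fixed before any data arrives, the questions the warehouse can answer (totals per city, region, state, or country, by goods type) are decided up front.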

Data lakes and data warehouses serve different objectives in an enterprise. Some of the key differences are shown here:


| Data Lake | Data Warehouse |
| --- | --- |
| Captures all types of data, structured, semi-structured, and unstructured, in their most natural form from source systems | Captures structured information and processes it as it is acquired into a fixed model defined for data warehouse purposes |
| Possesses enough processing power to process and analyze all kinds of data and make the results available for access | Processes structured data into a dimensional or reporting model for advanced reporting and analytics |
| Usually contains more relevant information that has a good probability of being accessed and can serve the operational needs of an enterprise | Usually stores and retains data for the long term, so that the data can be accessed on demand |
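The first row of the comparison is often described as "schema-on-read" (lake) versus "schema-on-write" (warehouse). The sketch below illustrates that contrast under simple assumptions: the event formats, field names, and function names are hypothetical, and plain Python lists stand in for the actual storage layers.

```python
import json

# Hypothetical raw events as they might arrive from source systems.
RAW_EVENTS = [
    '{"type": "sale", "city": "City A", "goods_type": "electronics", "units_sold": 3}',
    '{"type": "clickstream", "page": "/home", "user": "u1"}',
    "free-text support note: customer asked about delivery",
]


def ingest_into_lake(events):
    """Lake-style ingestion: keep every record as-is, whatever its shape."""
    return list(events)


def ingest_into_warehouse(events):
    """Warehouse-style ingestion: keep only records that fit the fixed model."""
    required = ("city", "goods_type", "units_sold")
    rows = []
    for event in events:
        try:
            record = json.loads(event)
        except json.JSONDecodeError:
            continue  # unstructured text does not fit the fixed model
        if isinstance(record, dict) and all(key in record for key in required):
            rows.append(
                (record["city"], record["goods_type"], int(record["units_sold"]))
            )
    return rows


def read_pages_from_lake(lake):
    """Apply structure at read time: pull page views out of the raw store."""
    pages = []
    for event in lake:
        try:
            record = json.loads(event)
        except json.JSONDecodeError:
            continue  # skip records that are not JSON for this particular query
        if isinstance(record, dict) and record.get("type") == "clickstream":
            pages.append(record["page"])
    return pages


if __name__ == "__main__":
    lake = ingest_into_lake(RAW_EVENTS)
    print(len(lake), "raw records kept in the lake")
    print(ingest_into_warehouse(RAW_EVENTS))
    print(read_pages_from_lake(lake))
```

The lake keeps all three records, including the free-text note, so a question nobody anticipated at load time (such as which pages were viewed) can still be answered later. The warehouse keeps only the one record that matches its predefined sales model.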

[Figure: Layers in a data lake]



