What is a DataWarehouse? And why would you need one?
A data warehouse is a central repository that aggregates data from all transactional and other data sources within a firm, to create a historical archive of all of the firm's data even when transactional systems have hard data retention constraints.
It provides for the following capabilities:
What is an Operational Data Store (ODS)?
An ODS is a database designed to integrate data from multiple sources. It allows for cleaning, resolving redundancy and integrity checking before additional operations. The data is then passed back to the operational systems and to the DWH for storage and reporting. It is usually designed to contain atomic or low level data such as transactions and prices and also has limited history which is captured real time or near real time. Much greater volume of data is stored in the DWH generally on a less frequent basis.
Why do we add Data/Strategic Marts to most modern data management platforms?
Data marts are fit for purpose access layers that support specific reporting use for individual teams or use cases for e.g. a sales and operations data mart, or a marketing strategy data mart. Usually a subset of the DWH, and very focused on the elements needed for the purpose it is designed for. The usual reasons to create data marts are -
And finally what are Data Lakes and Swamps?
A single store of all the data in the Enterprise in its raw form. It is a method of storing data within a system or repository in its natural format and facilitates the colocation of data in various schemas, structured and unstructured in files or object blobs or data bases. A deteriorated data lake, inaccessible to its int