A data warehouse is a system that enables us to store the data and support analytical processing. The data warehousing can be seen from the business perspective and technical perspective.
First, let us see data warehousing from a business perspective. Let us consider an online retail store like Amazon, there are several business processes that would be ongoing as below:
- Webshop department where the customers can order and buy things.
- Retailer department where the details of the retailers are maintained.
- Management department where they monitor all the sales and the profit trends.
A single database is not enough to address the needs of all the activities of the different departments. For example, we know what information you would like to analyze in the management department. There might be a need for the retailer department to maintain the data of all the retailers throughout the world. Hence, as the requirement changes based on the nature of the business, the organization of the data in the database also changes. It is also clear that one database is not enough to support the needs of all the departments.
In the business perspective, it is very important to segregate the data based on the analytical purpose and the operational purpose. For example, the management team require data for analytics. However, in the webshop department, it is very important to register each and every transaction of the data whenever a customer buys or returns a product. This kind of data is operational data.
When you consider the analytical database, we shoot several queries to get the desired results. It is very important to organize the data in such a way that the data is retrieved very easily through the queries. When you consider the operational database, all the transactions should be registered to make the process easy and correct for the customers. Data integrity is very important in the transactional databases. Hence, it is very important to choose databases based on their needs. If the volume of data is very less, it is possible to store both the analytical and operational data in the same database.
In data warehousing, we consider only the analytical data but not the operational data. The data warehouse is a system that retrieves and stores data periodically from the source systems into a dimensional or normalized data store. Datawarehouse keeps track of the history of data to support analytics. They are not updated whenever any transaction occurs in the source database. On the contrary, They are updated in batches.
When you take a look at data warehousing from a technical perspective, the data is extracted from different data sources used for the operational processes, transform the data and load it to the dimensional model. The dimensional model is designed to make it easy for business users to use the data and to improve the analytical query performance.
The business intelligence applications that includes the reports, dashboards, graphs are used by the business users to view the result of analytical processes.