The Birth of Dimensional Data Model
Many of the early data-modelers have been computer engineers and scientists who believe that data storage has to be optimized. To do so, they have built data-models in a “normalized” form. However, these data-models have been very complex to understand by a business analyst. Even if they did, it has been very difficult to write reports on top of these normalized data models.
When ad-hoc reporting tools started coming around, business users wanted to point them at the underlying data-models, drag-and-drop relevant columns, and create reports instantaneously. One of the genius data architects, Ralph Kimball, has thought of organizing the entire data set within a company in simple dimensions and facts. Hence the Dimensional Data Model is born around 1991.
What is a Dimensional Data Model?
A Dimensional Model presents the data in an intuitive and easily understandable format mimicking the way most of us think about data. We as humans remember things (store our memories) in the form of events. We remember “the first time our baby walked.” To store this, we know that “walking” is a FACT. Then, who walked, when did the baby walk, where did this happen, etc., are all DIMENSIONS. So, a dimensional model uses
If we were to register first walks of all the babies in the world, we need a list of all the babies in the world (Babies dimension table), list of addresses where all the babies have walked (Location dimension table), generic time log (Time dimension table). And, to register the FACTs, we need one table to register the total number of steps that the baby took before falling (Walks Fact table) that will connect these three dimension tables.
I know everyone explains with products, customers, and sales example. I thought I will make it a little more pleasant and memorable (there is another FACT for you!) with the baby walking example.
How is a Dimensional Data Model created?
Basic ingredients for a Dimensional Data Model are:
- Business Process: We need to understand which business process that we are modeling. Then we need to understand the level of details that we need to store (called granularity).
- Data: We obviously need to identify which of the data used in this business process belongs to FACTs and which of them belong to DIMENSIONs.