type
Post
Created date
Aug 23, 2022 03:21 AM
category
Data Science
tags
Data Warehousing
status
Published
Language
English
From
School
summary
slug
password
Author
Jason Ching Yuen Siu
Priority
Featured
Featured
Cover
Origin
Type
URL
Youtube
Youtube
icon

Level of Aggregations and Data Warehousing Architecture

Summary

  • The higher the level of aggregation, the more aggregation in the fact measure.
  • A data warehouse is built primarily used for drilling down some interesting data for business decision.
  • It is common when we design a data warehouse, we start from a high level of aggregation, where fact measures contain aggregated values. Lowering down the level of aggregation can be done by changing the granularity of the dimension, or by simply adding new dimensions.
  • Determining whether a star schema is in Level-0 or not can be tricky. Not having an aggregated fact measure does not always mean that the star schema is on Level-0. Hence, it is important to understand the concept of transaction recorded in the E/R diagram of the operational database

What does levels of granularity imply?

notion image
Aggregate values have different levels of granularity.
  • The lower the level of granularity:
    • The higher the level of aggregation (e.g., level 0,1,2)
    • The more specific information we can get. (越低越細)Level 0 is the highest level (Most detail).
    • In another words, the granularity (or details) of dim is low, so the level of agg is high.
Example:
  • Total Sales per Year has lower granularity than Total Sales per Quarter.
  • Number of Logins in the lab per Semester has different level of focus (or granularity) than Number of Logins in the lab per Month
notion image
notion image

How to lower down the level of aggregations

  1. Add a new dimension. When we add a new dimension, each value in the fact measure will literally be broken down more details on each record of the new dimension.
notion image
  1. Replace an existing dimension with a higher granularity dimension. The values of the fact measures will also be broken down more details because the fact measure has a lower detail dimension
notion image
 

How do we know if the schema is in its lowest level (i.e., Lv 0)?

  1. Op = Schema. Attributes are the same as the ones in the operation database.
  1. Unaggregated fact measures. Lv 0 has no aggregation so values of fact measures are 1.
notion image

What are the case when level of aggregation CANNOT be compared?

notion image

Case Studies

Case study 1: The number of logins per hour at night?
notion image
notion image
notion image
Star schema
notion image
notion image
notion image
notion image
Case study 2 (Lab) : Clothing Company
notion image
 
notion image
notion image
notion image
Case study 3 (Lab) : Toll Way
notion image
notion image
notion image
notion image
notion image

Star Schemas with No Aggregation (i.e., Level 0 )

notion image

Although dims cannot be broken down, having lowest dims does NOT gurantee that the values of FM is Unaggregated. So, these are NOT in lv 0.

notion image
notion image
notion image
This is not Level-0 because the fact measures do not capture both sides of the m-m relationship between Purchase Order and Item entities in the E/R diagram; rather, it captures one side only, which is from the Purchase Order side and not from the Item side.

There are three possible lv0 schemas

If we need to include all possible dimensions in the star schema, we can add Product Dimension to the star schema.
And Level-0 star schema must have the Purchase Order Dimension and Item Dimension as its core.
 
Since the star schemas below come with these two core dimensions, it is at lv 0.
notion image
notion image
 
The most complete star schema with Customer Dimension and Product Dimension
or with either Customer Dimension (Fig. 14.14) or Product Dimension only
notion image

Conclusion

The Level-0 star schema should focus on the transaction level of the E/R diagram, which is denoted by the m-m relationship.

Understanding the Relationship between Transactions and Fact Measures

 
 
 
FIT3003 - Unique SQL functions in Oracle DatabaseBusiness framework for solving problems - Canva PDF