DIMENSIONAL DATA MODELLING
About This Course
Course Overview :
Dimensional modeling (DM) is the name of a set of techniques and concepts used in data warehouse design. It is considered to be different from entity-relationship modeling (ER). Dimensional Modeling does not necessarily involve a relational database. The same modeling approach, at the logical level, can be used for any physical form, such as a multidimensional database or even flat files. According to data warehousing consultant Ralph Kimball, DM is a design technique for databases intended to support end-user queries in a data warehouse. It is oriented around understandability and performance. According to him, although transaction-oriented ER is very useful for the transaction capture, it should be avoided for end-user delivery.
Course Content:
- Dimensional Modeling Fundamentals
- Publishing responsibilities of DW/BI professionals
- Role of dimensional modeling in the Kimball, Corporate Information Factory (CIF), and hybrid architectures
- Fact and dimension table characteristics
- Surrogate key for dimensions
- Fact table granularity
- Degenerate dimensions
- Benefits of dimensional modeling
- 4-step design process
- Retail Sales Case Study
- Transaction fact tables
- Denormalized dimension table hierarchies
- Dealing with nulls
- Dimension role-playing
- Date and time-of-day dimension considerations
- Centipede fact tables with too many dimensions
- Star versus snowflake schemas
- Factless fact tables
- Order Management Design Workshop
- Complications with operational header/line data
- Allocated facts at different levels of detail
- Abstract, generic dimensions
- Freeform text comments
- Junk dimensions for miscellaneous transaction indicators
- Multiple currencies
- Inventory Case Study
- Implications of business processes on data architecture
- Semi-additive facts
- Three types of fact tables – transaction, periodic snapshot and accumulating snapshot
- Conformed dimensions – identical and shrunken roll-ups
- Enterprise Data Warehouse Bus Architecture and matrix for master data and integration
- Drilling across fact tables
- Consolidated cross-process fact tables
- Billing Design Review Exercise
- Common design flaws and mistakes to avoid
- Checklist for conducting design reviews
- Slowly Changing Dimensions
- Basic Type 1, 2 and 3 techniques
- Advanced techniques to deliver current and point-in-time attribute values
- Mini-dimensions for large, rapidly changing dimensions
- Multiple mini-dimensions and outriggers
- Credit Card Design Workshop
- Complementary transaction and periodic snapshot schemas
- Design considerations for one dimension versus two dimensions
- Bridge tables for many-valued dimension attributes
- Fact table normalization
- Insurance Case Study
- Review of design patterns and techniques
- Development of bus matrix from extended case study
- Complex, unpredictable accumulating snapshots
- Detailed implementation bus matrix
- Dimensional Modeling Process
- Process flow, tasks and deliverables
- Financial Applications – Profit Equation
- Allocating costs to the same grain as revenue
- Profit margin point analysis and value banding
- Financial Applications – General Ledger
- Tracking instantaneous balances
- Multiple time zones
- Drilling down in the general ledger to a document
- Financial Applications – Budgeting Value Chain
- Budgets, commitments and expenditures
- Bridge tables for variable-depth ragged hierarchies
- Shared ownership and time-varying ragged hierarchies
- Pathstring alternative for ragged hierarchies
- Tracking the “age of the book”
- Calculating the “policy loss triangle”
- Retail Bank Account Tracking Workshop
- Multiple account types with hundreds of potential attributes and facts
- Many-to-many account to customer map and weighted versus “impact” reports
- Tagging accounts as “about to go bankrupt”
- Super-types and sub-types
- Automobile Options Exercise
- Column versus row trade-offs based on usability and scalability
- Compliance-Enabled Data Warehouses
- Eliminating Type 1 and Type 3 updates
- ETL Back Room Dimensional Designs
- Tracking data quality with error event fact table
- Column, structure, and business rule tests for data quality
- Reporting data quality with audit dimension
- Customer Relationship Management Payoffs Discussion
- Business users’ expectations and bottom line impact?
- Data sources needed? Common quality/integration problems?
- Complex Customer Behavior Case Studies
- Building study groups
- Sequential time dependent study groups
- Applying study groups to marketing panels and medical outcomes
- Customer Dimension Modeling Challenges
- Sparse but wide demographics attributes
- Finding detailed customer profile at random times in the past
- Tricky time span queries
- Simultaneous facts and dimensions
- Relationship between prospects and customers
- Real Time Customer Tracking
- Hot partitions
- Handling unresolved customer identities in real time
- Modeling Sequential Behavior
- Step dimension for describing sequential behavior
- RFID and web page challenges
- Modeling product purchase sequences
- Big Data Analytic Use Cases
- Competing DBMS and Hadoop architectures
- Attaching dimensions to big data
- Drilling across conventional and big data sources
- Final Customer-Centric Topics
- “Text” facts for customer cluster identification
- Structured questionnaires