Data Warehousing Community Forum

Data Warehousing => Data Warehousing Concepts => Topic started by: ashgo2k10 on March 29, 2010, 02:33:18 pm



Title: Fact table to capture data across different time grains
Post by: ashgo2k10 on March 29, 2010, 02:33:18 pm
Hi,

I am very new to data warehousing. I need to construct a data warehouse for a metrics application which can generate dynamic reports from millions of data. These reports can be hourly, daily, weekly, monthly and so on.

I know I will have to create a time dimension table storing a time key (integer) and other information like day of month, week , month, year, etc.

My question is should i create a single fact table that stores all the information at lowest grain (i.e. hour) and then query this fact table to generate all kinds of reports?

I see a disadvantage to this approach: If I need a weekly report, then lot of time would be taken to query such huge data in way of grouping and summing. So should i be creating a different fact tables which capture the calculated data for different time grains.. like a fact table for weekly reports, another one for monthly reports, and so on? Or should I create just one fact table and have different indexed views for better performance?

Please bear with me as I am a novice and would greatly appreciate any help that comes my way.


Title: Re: Fact table to capture data across different time grains
Post by: Whoever on March 29, 2010, 10:31:52 pm
My question is should i create a single fact table that stores all the information at lowest grain (i.e. hour) and then query this fact table to generate all kinds of reports?

I see a disadvantage to this approach: If I need a weekly report, then lot of time would be taken to query such huge data in way of grouping and summing. So should i be creating a different fact tables which capture the calculated data for different time grains.. like a fact table for weekly reports, another one for monthly reports, and so on? Or should I create just one fact table and have different indexed views for better performance?
One fact table would be the most logical/maintainable solution.

You really shouldn't worry about the performance degradation due to the required aggregation/summation.
You would be surprised as to just how efficiently database engines can aggregate upon integer fields, especially if indexed well.

Also, indexed views is a good idea if you face performance issues.

However, the ideal solution would be to use a dedicated multi-dimensional "OLAP" engine (http://en.wikipedia.org/wiki/MOLAP). (such as Microsoft Analysis Services)
Such engines are designed specifically for quick retrieval of aggregated views of data.


Title: Re: Fact table to capture data across different time grains
Post by: ashgo2k10 on March 30, 2010, 08:48:37 pm
Thanks so much, Whoever!