|
Title: Fact table to capture data across different time grains Post by: ashgo2k10 on March 29, 2010, 02:33:18 pm Hi,
I am very new to data warehousing. I need to construct a data warehouse for a metrics application which can generate dynamic reports from millions of data. These reports can be hourly, daily, weekly, monthly and so on. I know I will have to create a time dimension table storing a time key (integer) and other information like day of month, week , month, year, etc. My question is should i create a single fact table that stores all the information at lowest grain (i.e. hour) and then query this fact table to generate all kinds of reports? I see a disadvantage to this approach: If I need a weekly report, then lot of time would be taken to query such huge data in way of grouping and summing. So should i be creating a different fact tables which capture the calculated data for different time grains.. like a fact table for weekly reports, another one for monthly reports, and so on? Or should I create just one fact table and have different indexed views for better performance? Please bear with me as I am a novice and would greatly appreciate any help that comes my way. Title: Re: Fact table to capture data across different time grains Post by: Whoever on March 29, 2010, 10:31:52 pm My question is should i create a single fact table that stores all the information at lowest grain (i.e. hour) and then query this fact table to generate all kinds of reports? One fact table would be the most logical/maintainable solution.I see a disadvantage to this approach: If I need a weekly report, then lot of time would be taken to query such huge data in way of grouping and summing. So should i be creating a different fact tables which capture the calculated data for different time grains.. like a fact table for weekly reports, another one for monthly reports, and so on? Or should I create just one fact table and have different indexed views for better performance? You really shouldn't worry about the performance degradation due to the required aggregation/summation. You would be surprised as to just how efficiently database engines can aggregate upon integer fields, especially if indexed well. Also, indexed views is a good idea if you face performance issues. However, the ideal solution would be to use a dedicated multi-dimensional "OLAP" engine (http://en.wikipedia.org/wiki/MOLAP). (such as Microsoft Analysis Services) Such engines are designed specifically for quick retrieval of aggregated views of data. Title: Re: Fact table to capture data across different time grains Post by: ashgo2k10 on March 30, 2010, 08:48:37 pm Thanks so much, Whoever!
|