When should you use a STAR and when a SNOW-FLAKE schema?
Answer Posted / tanmay kumar meher
The snowflake and star schemas are two ways of storing
multidimensional data (i.e. data which can be analysed by
any or all of a number of independent factors) in a
relational database.
The snowflake schema (sometimes called snowflake join
schema) is a more complex schema than the star schema
because the tables which describe the dimensions are
normalized.
In a snowflake schema, one dimension table is connected to
another dimension table, and so on.
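
As a rough illustration of that chaining, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (dim_product, dim_category, dim_department) are hypothetical and chosen only for this example; each dimension table holds a key pointing at the next, more general one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical snowflaked product dimension: product -> category -> department.
    CREATE TABLE dim_department (
        department_id   INTEGER PRIMARY KEY,
        department_name TEXT NOT NULL
    );
    CREATE TABLE dim_category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES dim_department(department_id)
    );
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_id  INTEGER NOT NULL REFERENCES dim_category(category_id)
    );
    INSERT INTO dim_department VALUES (1, 'Grocery');
    INSERT INTO dim_category VALUES (10, 'Dairy', 1), (11, 'Bakery', 1);
    INSERT INTO dim_product VALUES (100, 'Milk', 10), (101, 'Butter', 10), (102, 'Bread', 11);
""")

# Finding a product's department means following the chain of dimension tables.
product_paths = conn.execute("""
    SELECT p.product_name, c.category_name, d.department_name
    FROM dim_product p
    JOIN dim_category c   ON c.category_id = p.category_id
    JOIN dim_department d ON d.department_id = c.department_id
    ORDER BY p.product_id
""").fetchall()
print(product_paths)
```

Each category and department name is stored once, at the cost of one extra join per level when querying.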
------------
Snowflake
------------
- If a dimension is very sparse (i.e. most of the
possible values for the dimension have no data) and/or a
dimension has a very long list of attributes which may be
used in a query, the dimension table may occupy a
significant proportion of the database, and snowflaking may
be appropriate.
- A multidimensional view is sometimes added to an
existing transactional database to aid reporting. In this
case, the tables which describe the dimensions will already
exist and will typically be normalized, so a snowflake
schema will be easier to implement.
- A snowflake schema can sometimes reflect the way in
which users think about data. In some cases users may still
prefer to write queries as if against a star schema,
whether or not the underlying organization of the database
reflects that.
- Some users may wish to submit queries to the
database which, using conventional multidimensional
reporting tools, cannot be expressed within a simple star
schema. This is particularly common in data mining of
customer databases, where a frequent requirement is to
locate common factors between customers who bought products
meeting complex criteria. Some snowflaking would typically
be required to permit simple query tools such as Cognos
PowerPlay to form such a query, especially if provision
for these forms of query wasn't anticipated when the data
warehouse was first designed.
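
The second point above, a reporting view layered over existing normalized tables, might look something like the following sketch. It uses Python's sqlite3 module; the tables (regions, customers, orders) and the view name v_sales_by_region are invented for illustration and assume a very small transactional schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Existing normalized transactional tables (hypothetical).
    CREATE TABLE regions (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE customers (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT,
        region_id     INTEGER REFERENCES regions(region_id)
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount      REAL
    );
    INSERT INTO regions VALUES (1, 'North'), (2, 'South');
    INSERT INTO customers VALUES (1, 'Acme', 1), (2, 'Bolt', 2), (3, 'Core', 1);
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 2, 50.0), (3, 3, 25.0), (4, 1, 10.0);

    -- Reporting view added on top; the base tables are left unchanged.
    -- orders acts as the fact table, customers -> regions as a snowflaked dimension.
    CREATE VIEW v_sales_by_region AS
    SELECT r.region_name, SUM(o.amount) AS total_amount
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN regions r   ON r.region_id = c.region_id
    GROUP BY r.region_name;
""")

sales_by_region = dict(
    conn.execute("SELECT region_name, total_amount FROM v_sales_by_region")
)
print(sales_by_region)
```

Because the dimension tables already exist in normalized form, nothing has to be copied or restructured; the view simply follows the existing keys.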
------------
Star
------------
The star schema (sometimes called star join schema) is the
simplest data warehouse schema. It consists of a single
"fact table" with a compound primary key, with one segment
for each "dimension", plus additional columns of additive,
numeric facts.
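
A minimal sketch of that layout, again with Python's sqlite3 module, is shown here. All names (fact_sales, dim_date, dim_store, dim_product and their columns) are assumptions made for the example. The fact table's primary key is made up of one key per dimension, and the remaining columns are additive numeric facts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date  (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, store_name TEXT, city TEXT);
    CREATE TABLE dim_product (
        product_id    INTEGER PRIMARY KEY,
        product_name  TEXT,
        category_name TEXT  -- kept inline (denormalized), not in its own table
    );
    CREATE TABLE fact_sales (
        date_id    INTEGER REFERENCES dim_date(date_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        units_sold INTEGER,  -- additive fact
        revenue    REAL,     -- additive fact
        PRIMARY KEY (date_id, store_id, product_id)  -- one segment per dimension
    );
    INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
    INSERT INTO dim_store VALUES (1, 'Main St', 'Leeds');
    INSERT INTO dim_product VALUES (100, 'Milk', 'Dairy'), (101, 'Bread', 'Bakery');
    INSERT INTO fact_sales VALUES
        (20240101, 1, 100, 3, 6.0),
        (20240101, 1, 101, 2, 5.0),
        (20240201, 1, 100, 4, 8.0);
""")

# Additive facts can be summed along any dimension with a single join.
totals_by_category = conn.execute("""
    SELECT p.category_name, SUM(f.units_sold), SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.category_name
    ORDER BY p.category_name
""").fetchall()

# The compound key allows only one fact row per (date, store, product) combination.
try:
    conn.execute("INSERT INTO fact_sales VALUES (20240101, 1, 100, 1, 2.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```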
The star schema makes multi-dimensional database (MDDB)
functionality possible using a traditional relational
database. Because relational databases are the most common
data management system in organizations today, implementing
multi-dimensional views of data using a relational database
is very appealing. Even if you are using a specific MDDB
solution, its sources likely are relational databases.
Another reason for using a star schema is its ease of
understanding. Fact tables in a star schema are mostly in
third normal form (3NF), but dimension tables are in
denormalized second normal form (2NF). If you normalize
the dimension tables, the schema looks like a snowflake
(see snowflake schema) and the same problems of relational
databases arise: you need complex queries, and business
users cannot easily understand the meaning of the data.
Although query performance may be improved by advanced DBMS
technology and hardware, highly normalized tables make
reporting difficult and applications complex.
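
To make the relationship between the two shapes concrete, the sketch below collapses a snowflaked dimension into the single wide table a star schema would use. It is plain Python with hypothetical data (a store -> city -> country chain) and a hypothetical helper, flatten_store_dimension; it is only one way of illustrating the idea, not a prescribed ETL step.

```python
# Hypothetical snowflaked dimension: each table points to the next level up.
countries = {1: {"country_name": "UK"}}
cities = {
    10: {"city_name": "Leeds", "country_id": 1},
    11: {"city_name": "York", "country_id": 1},
}
stores = {
    100: {"store_name": "Main St", "city_id": 10},
    101: {"store_name": "Station Rd", "city_id": 11},
}


def flatten_store_dimension(stores, cities, countries):
    """Collapse the store -> city -> country chain into one row per store."""
    flat_rows = []
    for store_id, store in sorted(stores.items()):
        city = cities[store["city_id"]]
        country = countries[city["country_id"]]
        flat_rows.append({
            "store_id": store_id,
            "store_name": store["store_name"],
            "city_name": city["city_name"],
            "country_name": country["country_name"],
        })
    return flat_rows


star_dim_store = flatten_store_dimension(stores, cities, countries)
for row in star_dim_store:
    print(row)
```

The flattened rows repeat the country name for every store, which is the redundancy a star schema accepts in exchange for queries that need only one join per dimension.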