what is a junk dimension ?
Answers were Sorted based on User's Feedback
Answer / vijay
A "junk" dimension is a collection of random transactional
codes, flags and/or text attributes that are unrelated to
any particular dimension.
The junk dimension is simply a structure that provides a
convenient place to store the junk attributes. A good
example would be a trade fact in a company that brokers
equity trades.
The fact would contain several metrics (principal amount,
net amount, price per share, commission, margin amount,
etc.) and would be related to several dimensions such as
account, date, rep, office, exchange, etc. This fact would
also contain several codes and flags that were related to
the transaction rather than any of the dimensions ... such
as origin code (that indicates whether the trade was
initiated with a phone call or via the Web), a reinvest
flag (that indicates whether or not this trade as was the
result of the reinvestment of a dividend payout) and a
comment field for storing special instructions from the
customer.
These three attributes would normally be
removed from the fact table and stored in a junk
dimension ... perhaps called the trade dimension. In this
way, the number of indexes on the fact table would be
reduced, and performance (not to mention ease of use) would
be enhanced. Hope this helps.
| Is This Answer Correct ? | 57 Yes | 0 No |
Answer / aparna
A junk dimension is a convenient grouping of flags and
indicators. It's helpful, but not absolutely
required, if there's a positive correlation among the
values. The benefits of a junk dimension
include: ? Provide a recognizable, user-intuitive location
for related codes, indicators and their
descriptors in a dimensional framework.
? Clean up a cluttered design that already has too many
dimensions. There might be five
or more indicators that could be collapsed into a single 4-
byte integer surrogate key in
the fact table.
? Provide a smaller, quicker point of entry for queries
compared to performance from
constraining directly on these attributes in the fact
table. If your database supports bitmapped
indices, this potential benefit may be irrelevant, although
the others are still
valid.
| Is This Answer Correct ? | 20 Yes | 3 No |
Answer / srinu
junk dim is convient for group of flags and attributes to
get them out of a fact table inti useful dimension framework
| Is This Answer Correct ? | 15 Yes | 5 No |
Answer / pullareddy palle
junk dimensions are the dimensions which can not participate
in mapping data flow(not useful). like boolean, flags
| Is This Answer Correct ? | 7 Yes | 0 No |
Answer / arpita
A dimension which cannot be used to indicate the key
performance indicators is known as junk dimension.....
eg: Address,Phone number,Faxno,Description......etc...
| Is This Answer Correct ? | 7 Yes | 3 No |
Answer / mohan
junk dimention is a dimention table which holds only flags
or status where dimension field values are like True/False,
open/Close,complete/incomplete/inprogress, etc.
| Is This Answer Correct ? | 4 Yes | 0 No |
Answer / ramesh
what is junk dimension? Give an example
When developing a dimensional model, we often encounter miscellaneous flags and indicators. These flags do not logically belong to the core dimension tables.
A junk dimension is grouping of low cardinality flags and indicators. This junk dimension helps in avoiding cluttered design of data warehouse. Provides an easy way to access the dimensions from a single point of entry and improves the performance of sql queries.
Example: For example, assume that there are two dimension tables (gender and marital status). The data of these two tables are shown below:
Code:
Table: Gender
Id Gender_status
----------------
1 Male
2 Female
Table: Marital Status
Id Marital_Status
----------------
1 Single
2 Married
Here both the dimensions have low cardinality flags. This will cause maintenance of two tables and decrease performance of sql queries.
We can combine these two dimensions into a single table by cross joining and can maintain a single dimension table. The result of cross join is shown below:
Code:
id gender mrg_status
--------------------
1 Male Single
2 Male Married
3 Female Single
4 Female Married
This new dimension table is called a junk dimension. This will improve the manageability and improves the sql queries performance.
| Is This Answer Correct ? | 4 Yes | 0 No |
Answer / vamshi
In data warehouse design, frequently we run into a situation where there are yes/no indicator fields in the source system. However, if u keep all those indicator fields in the fact table, not only do we need to build many small dimension tables, but the amount of information stored in the fact table also increases tremendously,
Junk dimension is the way to solve this problem. In a junk dimension, we combine these indicator fields into a single dimension. This way, we'll only need to build a single dimension table, and the number of fields in the fact table, as well as the size of the fact table, can be decreased
The content in the junk dimension table is the combination of all possible values of the individual indicator fields
| Is This Answer Correct ? | 1 Yes | 0 No |
Answer / prakash m
A convenient grouping of random flags and attributes to get
them out of the fact table
Retain only useful fields
Remove fields that make no sense at all
Remove fields that are inconsistently filled
Remove fields that are of operational interest only
Design similar to demographics; maximum unique
combinations, assign integer key, plug into fact
Create new combination (insert new dimension record) at ETL
run-time
E.g. Yes/No Flags in old retail transaction data
| Is This Answer Correct ? | 2 Yes | 2 No |
How i can Schdule the Informatica job in "Unix Corn Schduling tool" ?
How to create the list file having millions of flat files while indirect loading in informatica? In indirect file loading, suppose we have less no.of flat files then we can enter files names manually in list file creation. If millions of files are there, how can we enter the flat file names in list file?
There are 100 lines in a file. How to print line number 31-50 and 81-90 in unix with a single command.
if i hv 6 table as source table but can i make 12 as a dim table and 6 as fact table belongs to that src table.
I have data like sno mailid 1 subbu@dell.com 2 arthi@dell.com 3 thiru@dell.com .. ..... like this for N no.of how we will send a message 'meet hr' at a time in windows environment by the informatica..
How can the following be achieved in 1 single Informatica Mapping. * If the Header table record has error value(NULL) then those records and the corresponding child records in the SUBHEAD and DETAIL tables should also not be loaded into the target(TARGET1,TARGET 2 or TARGET3). * If the HEADER table record is valid, but the SUBHEAD or DETAIL table record has an error value (NULL) then the no data should be loaded into the target TARGET1,TARGET 2 or TARGET3. * If the HEADER table record is valid and the SUBHEAD or DETAIL table record also has valid records only then the data should be loaded into the target TARGET1,TARGET 2 and TARGET3. =================================================== HEADER COL1 COL2 COL3 COL5 COL6 1 ABC NULL NULL CITY1 2 XYZ 456 TUBE CITY2 3 GTD 564 PIN CITY3 SUBHEAD COL1 COL2 COL3 COL5 COL6 1 1001 VAL3 748 543 1 1002 VAL4 33 22 1 1003 VAL6 23 11 2 2001 AAP1 334 443 2 2002 AAP2 44 22 3 3001 RAD2 NULL 33 3 3002 RAD3 NULL 234 3 3003 RAD4 83 31 DETAIL COL1 COL2 COL3 COL5 COL6 1 D001 TXX2 748 543 1 D002 TXX3 33 22 1 D003 TXX4 23 11 2 D001 PXX2 56 224 2 D002 PXX3 666 332 ======================================================== TARGET1 2 XYZ 456 TUBE CITY2 TARGET2 2 2001 AAP1 334 443 2 2002 AAP2 44 22 TARGET3 2 D001 PXX2 56 224 2 D002 PXX3 666 332
tell me the informatica architecture
Hi ETL gurus can any one tell me with a flow how to implement SCD Type 1 and SCD Type 2 in a single mapping.For some fields SCD type has to be implemented and for some fields scd type has to be implementd..Thank in advance..please let me know in case of any concerns...
What is the cumulative sum and moving sum?
If I have 10 flat files with same name abc.txt files with different timestamps as source I need to load them in tgt table oracle. in between job execution fails and rows are not loaded into tgt. how can I make them load in that target even if my job fails?
What is the difference between STOP and ABORT options in Workflow Monitor?
What are multi-group transformations?