When the Sort stage already has a remove-duplicates option, why
do we also have a Remove Duplicates stage in PX, given that it is
recommended to sort the data before using the Remove Duplicates
stage? I have been thinking about this for days...
Answer / prasu
The Remove Duplicates stage has more options for removing
duplicates than the Sort stage. If you have a small amount
of data, you can use the Sort stage to remove duplicates. If
you have a large amount of data, go for the Remove Duplicates
stage.
| Is This Answer Correct ? | 8 Yes | 0 No |
Answer / phani kumar
The Sort stage sorts the data and has an option to identify
duplicate records through the value of the key change
column. But sorting and removing duplicates in the same stage
decreases performance, so it is preferable only for
small amounts of data.
The Remove Duplicates stage returns only unique records,
keeping either the first or the last occurrence of each key. For
large amounts of data, it needs already sorted input for better performance.
Correct me if I am wrong.
Thanks and regards,
Phani kumar
| Is This Answer Correct ? | 8 Yes | 0 No |
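To make the key change idea above concrete, here is a minimal Python sketch (an analogy only, not DataStage code). The function name `add_key_change`, the sample rows, and the output column name `keyChange` are assumptions made for illustration. It sorts the rows on a key and flags the first row of each key group with 1 and every later row with 0, so that duplicates can be filtered out afterwards.

```python
from operator import itemgetter


def add_key_change(rows, key):
    """Sort rows on `key` and flag the first row of each key group.

    Hypothetical helper: the flag is 1 for the first row of a group
    and 0 for every following row that repeats the same key value.
    """
    ordered = sorted(rows, key=itemgetter(key))  # stable sort on the key
    flagged = []
    previous = object()  # sentinel that never equals a real key value
    for row in ordered:
        flag = 1 if row[key] != previous else 0
        flagged.append({**row, "keyChange": flag})
        previous = row[key]
    return flagged


# Illustrative data, not taken from the question.
rows = [{"id": 2, "v": "b"}, {"id": 1, "v": "a"}, {"id": 2, "v": "c"}]
flagged = add_key_change(rows, "id")

# Keeping only the flagged rows leaves one record per key.
unique = [r for r in flagged if r["keyChange"] == 1]
```

In this sketch the sort and the flagging happen in one pass over the sorted data, and a separate filter step is still needed to actually drop the duplicates.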
Answer / data master
The Sort stage both sorts the data and removes duplicate
records, which slows down the job (hence it is better
to sort the data at the database level).
If the data is already sorted, then use the Remove Duplicates
stage to remove duplicate records, which gives better
job performance than the situation above.
| Is This Answer Correct ? | 3 Yes | 2 No |
Answer / swati
In the Remove Duplicates stage you get only unique records.
In the Sort stage you get both unique and duplicate records, distinguished by the key change column.
| Is This Answer Correct ? | 1 Yes | 0 No |
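As a rough illustration of what "only unique records" means on input that is already sorted, the following Python sketch (again an analogy, not DataStage code) keeps either the first or the last record of each run of equal keys. The function name `remove_duplicates`, the `retain` parameter, and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def remove_duplicates(sorted_rows, key, retain="first"):
    """Keep one row per key from rows that are already sorted on `key`.

    Hypothetical helper: `retain` selects the first or the last row
    of each run of consecutive rows sharing the same key value.
    """
    result = []
    for _, group in groupby(sorted_rows, key=itemgetter(key)):
        group = list(group)
        result.append(group[0] if retain == "first" else group[-1])
    return result


# Illustrative data, already sorted on "id".
sorted_rows = [
    {"id": 1, "v": "a"},
    {"id": 2, "v": "b"},
    {"id": 2, "v": "c"},
]
first_kept = remove_duplicates(sorted_rows, "id", retain="first")
last_kept = remove_duplicates(sorted_rows, "id", retain="last")
```

Because `groupby` only merges consecutive equal keys, the sketch gives correct results only when the input is sorted on the key, which mirrors the advice in the answers to sort before removing duplicates.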