When the Sort stage already has a remove-duplicates option, why
do we also have a Remove Duplicates stage in PX, even though it is
recommended to sort the data before using the Remove Duplicates
stage? I have been thinking about this for days...
Answers are sorted based on users' feedback
Answer / prasu
The Remove Duplicates stage has more options for removing
duplicates than the Sort stage. If you have a small amount
of data, you can use the Sort stage to remove duplicates. If
you have a large amount of data, go for the Remove Duplicates
stage.
Is this answer correct? 8 Yes, 0 No
Answer / phani kumar
The Sort stage is used to sort the data, and it has an option for
identifying duplicate records through the value of the key
change column. However, sorting and removing duplicates in the
same stage decreases performance, so it is preferable only for
small amounts of data.
The Remove Duplicates stage is used to get only unique records,
keeping either the first or the last occurrence. For large
amounts of data, the input should already be sorted for better performance.
Correct me if I am wrong.
Thanks and regards....
Phani kumar
Is this answer correct? 8 Yes, 0 No
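To make the key change column and the first/last retention described in the answer above more concrete, here is a minimal Python sketch. It is not DataStage code and does not call any DataStage API; the function names `add_key_change` and `retain`, the column name `keyChange`, and the sample rows are all assumptions made for illustration. It only mimics the described behaviour on data that has already been sorted on the key.

```python
from itertools import groupby
from operator import itemgetter


def add_key_change(sorted_rows, key):
    """Flag the first row of each key group with 1 and later rows with 0.

    Hypothetical helper that imitates a key change column; the input
    must already be sorted on `key`.
    """
    flagged = []
    previous = object()  # sentinel that never equals a real key value
    for row in sorted_rows:
        current = row[key]
        flagged.append({**row, "keyChange": 1 if current != previous else 0})
        previous = current
    return flagged


def retain(sorted_rows, key, which="first"):
    """Keep one row per key group: the first or the last occurrence."""
    kept = []
    for _, group in groupby(sorted_rows, key=itemgetter(key)):
        rows = list(group)
        kept.append(rows[0] if which == "first" else rows[-1])
    return kept


# Made-up sample data, not taken from any real job.
rows = [
    {"eid": 11, "ename": "AA"},
    {"eid": 22, "ename": "DD"},
    {"eid": 11, "ename": "CC"},
    {"eid": 22, "ename": "AA"},
]

# Python's sort is stable, so rows with equal keys keep their input order.
sorted_rows = sorted(rows, key=itemgetter("eid"))

flagged = add_key_change(sorted_rows, "eid")
first_rows = retain(sorted_rows, "eid", "first")
last_rows = retain(sorted_rows, "eid", "last")

print([r["keyChange"] for r in flagged])   # [1, 0, 1, 0]
print([r["ename"] for r in first_rows])    # ['AA', 'DD']
print([r["ename"] for r in last_rows])     # ['CC', 'AA']
```

Flagging keeps every row and only marks duplicates, while retaining first or last drops the other rows of each group.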
Answer / data master
The Sort stage both sorts the data and removes duplicate
records, which slows down the job
(hence it is better to sort the data at the database level).
If the data is already sorted, use the Remove Duplicates
stage to remove the duplicate records, which gives better
job performance than the situation above.
Is this answer correct? 3 Yes, 2 No
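The point in the answer above about already-sorted data can be sketched in Python as well. This is an illustration under stated assumptions, not the stage's actual implementation: the hypothetical function `dedupe_adjacent` compares each row only with the previous one, which is cheap but only removes every duplicate when equal keys are adjacent, that is, when the input arrives sorted on the key.

```python
def dedupe_adjacent(rows, key):
    """Yield the first row of each run of equal keys.

    Only the previous key value is held in memory, so this works as a
    single pass over a stream. Hypothetical helper for illustration.
    """
    previous = object()  # sentinel that never equals a real key value
    for row in rows:
        current = row[key]
        if current != previous:
            yield row
        previous = current


# Made-up sample data.
unsorted_rows = [{"eid": 11}, {"eid": 22}, {"eid": 11}]
presorted_rows = sorted(unsorted_rows, key=lambda r: r["eid"])

on_sorted = [r["eid"] for r in dedupe_adjacent(presorted_rows, "eid")]
on_unsorted = [r["eid"] for r in dedupe_adjacent(unsorted_rows, "eid")]

print(on_sorted)    # [11, 22]
print(on_unsorted)  # [11, 22, 11]  (the second 11 is not adjacent, so it survives)
```

If the sort can be pushed to the database, as the answer suggests, the job itself only needs this kind of single adjacent-comparison pass.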
Answer / swati
In the Remove Duplicates stage you get only the unique records.
In the Sort stage you get both the unique and the duplicate records, distinguished by the key change column.
Is this answer correct? 1 Yes, 0 No