How to remove duplicates in transformer stage? in parallel
mode
Answers were Sorted based on User's Feedback
Answer / kiran
partition the data by key and sort the data and click on
unique value. This will automatically delete duplicate
data.
Is This Answer Correct ? | 20 Yes | 3 No |
Answer / praveen sarva
STEP 1) TRANSFORMER STAGE PROPERTIES--> ADVANCED -->
EXECUTION MODE ---> PARLLEL
STEP 2) TRANSFORMER STAGE PROPERTIES --> INPUT -->
PARTITIONING--> PARTITION TYPE --> HASH ---> ENABLE SORT ---
> ENABLE UNIQUE
Simple u will get non duplicate records....
Is This Answer Correct ? | 11 Yes | 0 No |
Answer / kiran
i am not sure who marked my answer as wrong. Can you please
be responsible enough to state why its wrong?
Is This Answer Correct ? | 1 Yes | 0 No |
Answer / satya
run u r job in sequencial mode and sort the source data
then play with stage variable's in Transformer.
because in parallel mode data is partioned .
Is This Answer Correct ? | 1 Yes | 1 No |
Answer / prasad
Take 2 Stage variables in transformer stage
sV1 =Column_Name
sV2 =if Column_Name=sV1 Then 0 Else 1
put it constraint sV2=1 (only will get unique records)
if u want duplicates sV2=0
Is This Answer Correct ? | 0 Yes | 1 No |
Answer / santhosh
go to transformer stage properties->input->define any kind of partition over there and enable perform sort check box....
n also define the particular column need to be sorted..
it gives the sorted column out view...
Is This Answer Correct ? | 1 Yes | 6 No |
whom do you report?
Difference between IBM DATA STAGE8.5 and DATA STAGE9.1 ?
How do you reject records in a transformer?
What is the version control how can i apply this in DataStage can any one tell me the anser
how to run jon in unix back round process what is command use in runing a job?
You enter values in a schema file for RCP and you also entered values in sequential file? which one will it take?
WHAT are unix quentios in datastage
if we using two sources having same meta data and how to check the data in two sources is same or not? and if the data is not same i want to abort the job ?how we can do this?
Out of 4 mill records only 3 mill records are loaded to target and then job aborted. How to load only those 1 mill(not loaded records) for next run. This job is not sequential job, it is stand alone parallel job.What are the possibilities available in datastage8.1?
How to write a expression to display the first letter in Caps in each word using transformer stage ? Please let me know ASAP Thanks in advance...
0 Answers Alpharithm Technologies,
how can u handle null values in transformer stage.
Can you explain tagbatch restructure operator?