How to remove duplicates in the Transformer stage in parallel mode?
Answer / kiran
Partition the data by key, sort the data, and select the
Unique option. This will automatically remove duplicate
rows.
| Is This Answer Correct ? | 20 Yes | 3 No |
Answer / praveen sarva
STEP 1) TRANSFORMER STAGE PROPERTIES --> ADVANCED -->
EXECUTION MODE --> PARALLEL
STEP 2) TRANSFORMER STAGE PROPERTIES --> INPUT -->
PARTITIONING --> PARTITION TYPE --> HASH --> ENABLE SORT -->
ENABLE UNIQUE
With these settings you will get only non-duplicate records.
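What the hash partition, sort and unique settings do together can be sketched outside DataStage. The Python below is only an illustration, not DataStage code: the function names, the "id" key column and the partition count are assumptions made for the example.

```python
from collections import defaultdict

def hash_partition(rows, key, num_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions

def sort_unique(rows, key):
    """Sort one partition on the key and keep the first row per key value."""
    result = []
    for row in sorted(rows, key=lambda r: r[key]):
        if not result or result[-1][key] != row[key]:
            result.append(row)
    return result

def dedupe_parallel(rows, key, num_partitions=4):
    """Hash partition, then sort and de-duplicate each partition on its own."""
    output = []
    for part in hash_partition(rows, key, num_partitions).values():
        output.extend(sort_unique(part, key))
    return output

# Hypothetical sample data: ids 1 and 3 each appear twice.
rows = [
    {"id": 3, "v": "c"},
    {"id": 1, "v": "a"},
    {"id": 3, "v": "c2"},
    {"id": 1, "v": "a2"},
    {"id": 2, "v": "b"},
]
unique_rows = dedupe_parallel(rows, "id")
print(sorted(r["id"] for r in unique_rows))
```

Because rows with the same key always reach the same partition, each partition can drop its duplicates without looking at the others.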
| Is This Answer Correct ? | 11 Yes | 0 No |
Answer / kiran
I am not sure who marked my answer as wrong. Could you please
be responsible enough to state why it's wrong?
| Is This Answer Correct ? | 1 Yes | 0 No |
Answer / satya
Run your job in sequential mode, sort the source data,
and then use stage variables in the Transformer,
because in parallel mode the data is partitioned.
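Why partitioning matters here can be shown with a small sketch. It is plain Python for illustration, not DataStage code, and it assumes round-robin as the example of a partitioning method that ignores the key. When rows with the same key land in different partitions, a per-partition de-duplication never sees them together.

```python
def round_robin(keys, n):
    """Deal rows out to partitions in turn, ignoring the key value."""
    parts = [[] for _ in range(n)]
    for i, k in enumerate(keys):
        parts[i % n].append(k)
    return parts

def hash_by_key(keys, n):
    """Place rows so that equal keys always share a partition."""
    parts = [[] for _ in range(n)]
    for k in keys:
        parts[hash(k) % n].append(k)
    return parts

def dedupe_each(parts):
    """Sort and de-duplicate every partition independently, then collect."""
    out = []
    for p in parts:
        out.extend(sorted(set(p)))
    return out

keys = [1, 2, 1, 3, 2, 1]  # hypothetical key column with duplicates
rr = dedupe_each(round_robin(keys, 2))
hp = dedupe_each(hash_by_key(keys, 2))
print(len(rr), sorted(hp))
```

With round-robin, duplicates survive across partitions; with key-based hash partitioning, the collected output holds each key once, so running sequentially is one option but not the only one.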
| Is This Answer Correct ? | 1 Yes | 1 No |
Answer / prasad
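Take 2 stage variables in the Transformer stage, in this order
(stage variables are evaluated top to bottom, so sV2 has to be
compared against the previous row's value before sV1 is overwritten):
sV2 = If Column_Name = sV1 Then 0 Else 1
sV1 = Column_Name
Put the constraint sV2 = 1 (you will get only unique records).
If you want the duplicates, use sV2 = 0.
The input must be sorted on Column_Name, and in parallel mode
partitioned on it as well, for this comparison to work.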
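The previous-row comparison behind the stage-variable approach can be sketched in Python. This is an illustration only, not Transformer syntax: prev_value and is_new are hypothetical stand-ins for sV1 and sV2, and the input is assumed to be sorted on the key column. The comparison has to run before the stored value is overwritten, otherwise every row compares equal to itself.

```python
def flag_first_occurrences(sorted_values):
    """Yield (value, is_new); is_new is 1 for the first row of each key, else 0."""
    prev_value = None  # holds the previous row's key (assumes None is not a real key)
    for value in sorted_values:
        is_new = 0 if value == prev_value else 1  # compare with the previous row first
        prev_value = value                        # then remember the current key
        yield value, is_new

values = sorted(["b", "a", "b", "c", "a"])  # hypothetical column values
flags = list(flag_first_occurrences(values))
unique = [v for v, is_new in flags if is_new == 1]      # constraint is_new = 1
duplicates = [v for v, is_new in flags if is_new == 0]  # constraint is_new = 0
print(unique)
print(duplicates)
```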
| Is This Answer Correct ? | 0 Yes | 1 No |
Answer / santhosh
Go to Transformer stage properties -> Input, define a partitioning type there, and enable the Perform sort check box.
Also define the particular column that needs to be sorted.
The output will then be sorted on that column.
| Is This Answer Correct ? | 1 Yes | 6 No |