How to remove duplicates in the Transformer stage in parallel mode?
Answer / kiran
Partition the data by key, sort it, and select the Unique
option. This automatically drops the duplicate records.
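As a rough illustration of what "partition by key, sort, unique" does to the data, here is a minimal Python sketch. It is not DataStage code: the function names, the `id` column, and the use of `zlib.crc32` as the hash are all assumptions made for the example.

```python
import zlib
from itertools import groupby


def hash_partition(rows, key, n_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        index = zlib.crc32(str(row[key]).encode()) % n_partitions
        partitions[index].append(row)
    return partitions


def sort_unique(partition, key):
    """Sort one partition on the key and keep the first row of each key group."""
    ordered = sorted(partition, key=lambda row: row[key])
    return [next(group) for _, group in groupby(ordered, key=lambda row: row[key])]


def dedupe(rows, key, n_partitions=4):
    """Partition by key, then sort and de-duplicate each partition on its own."""
    result = []
    for partition in hash_partition(rows, key, n_partitions):
        result.extend(sort_unique(partition, key))
    return result


rows = [
    {"id": "A", "val": 1},
    {"id": "B", "val": 2},
    {"id": "A", "val": 3},
    {"id": "C", "val": 4},
    {"id": "B", "val": 5},
]
unique_rows = dedupe(rows, "id")
print(sorted(row["id"] for row in unique_rows))  # ['A', 'B', 'C']
```

Because all rows with the same key land in one partition, each partition can drop its duplicates without looking at the others.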
Answer / praveen sarva
STEP 1) TRANSFORMER STAGE PROPERTIES --> ADVANCED -->
EXECUTION MODE --> PARALLEL
STEP 2) TRANSFORMER STAGE PROPERTIES --> INPUT -->
PARTITIONING --> PARTITION TYPE --> HASH --> ENABLE SORT -->
ENABLE UNIQUE
With these settings you will get only non-duplicate records.
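Why HASH in step 2: the unique option only compares rows inside one partition, so rows with the same key have to arrive in the same partition. The Python sketch below (not DataStage code; the function names and the `zlib.crc32` hash are assumptions for illustration) compares round-robin and hash partitioning on two partitions.

```python
import zlib


def round_robin(keys, n_partitions):
    """Deal rows out to partitions in turn, ignoring the key value."""
    parts = [[] for _ in range(n_partitions)]
    for position, key in enumerate(keys):
        parts[position % n_partitions].append(key)
    return parts


def hash_by_key(keys, n_partitions):
    """Choose the partition from the key, so equal keys always stay together."""
    parts = [[] for _ in range(n_partitions)]
    for key in keys:
        parts[zlib.crc32(key.encode()) % n_partitions].append(key)
    return parts


def unique_per_partition(parts):
    """Sort and de-duplicate each partition independently, then collect."""
    collected = []
    for part in parts:
        collected.extend(sorted(set(part)))
    return collected


keys = ["A", "A", "B", "B"]
after_round_robin = unique_per_partition(round_robin(keys, 2))
after_hash = unique_per_partition(hash_by_key(keys, 2))

print(len(after_round_robin))  # 4: each partition holds one A and one B
print(sorted(after_hash))      # ['A', 'B']
```

With round robin, the two A rows sit in different partitions, so neither partition sees a duplicate and both survive.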
Answer / kiran
I am not sure who marked my answer as wrong. Could whoever did
please be responsible enough to state why it is wrong?
Answer / satya
Run your job in sequential mode and sort the source data,
then use stage variables in the Transformer to detect duplicates,
because in parallel mode the data is partitioned.
Answer / prasad
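Take 2 stage variables in the Transformer stage, on input sorted by Column_Name.
Stage variables are evaluated top to bottom, so the comparison must come before sV1 is updated:
sV2 = If Column_Name = sV1 Then 0 Else 1
sV1 = Column_Name
Put the constraint sV2 = 1 on the output link (you will get only unique records).
If you want the duplicates, use sV2 = 0.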
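The stage-variable idea is a "compare with the previous row" loop. Below is a minimal Python sketch of that logic, assuming the rows are already sorted on the key and that the variables are evaluated in the order listed for each row; `split_unique_and_duplicates`, `prev`, and `is_new` are names invented for the example (`prev` plays the role of sV1, `is_new` the role of sV2).

```python
def split_unique_and_duplicates(sorted_keys):
    """Walk sorted rows once, comparing each key with the previous row's key."""
    prev = None  # role of sV1: the key seen on the previous row
    unique, duplicates = [], []
    for key in sorted_keys:
        # Role of sV2: must be computed BEFORE prev is overwritten,
        # otherwise the key would be compared with itself and never look new.
        is_new = 0 if key == prev else 1
        prev = key
        if is_new == 1:
            unique.append(key)      # constraint sV2 = 1
        else:
            duplicates.append(key)  # constraint sV2 = 0
    return unique, duplicates


unique, duplicates = split_unique_and_duplicates(["a", "a", "b", "c", "c", "c"])
print(unique)      # ['a', 'b', 'c']
print(duplicates)  # ['a', 'c', 'c']
```

The loop only works when equal keys are adjacent, which is why the data has to be sorted first and, in parallel mode, partitioned on the same key.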
Answer / santhosh
Go to Transformer stage properties -> Input, define a partitioning type there, and enable the Perform sort check box.
Also define the particular column that needs to be sorted.
The output is then sorted on that column.