How many ways are there to remove duplicate records in DataStage?
Answer / prabhu rathnam
1. Remove Duplicates stage
2. Sort stage
3. Copy stage (with Perform sort and Unique set on the Partitioning tab of its input link)
Answer / rohit babar
There are 3 ways to remove duplicate records:
1. By using the Remove Duplicates stage. Here we control which record of each group of duplicates is kept: the Duplicate To Retain option can be set to First or Last. The input should be sorted on the key so that duplicate records are adjacent.
2. By using the Sort stage. It has an Allow Duplicates option: set it to True to keep duplicate records, or to False to get unique records.
3. By in-line sorting. On the Partitioning tab of a stage's input link, selecting a key-based partitioning method (for example, Hash) enables the Perform sort option. Checking it enables two further options, Stable and Unique. Stable preserves the input order of records that have the same key, so duplicates are kept; Unique keeps only one record for each key value, which removes the duplicates.
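The First/Last choice described above can be illustrated outside DataStage. This is only a minimal Python sketch of the idea, not DataStage code: the function name `remove_duplicates`, the `retain` parameter, and the sample rows are made up for illustration. Like the stage, it assumes the records are already sorted on the key.

```python
from itertools import groupby
from operator import itemgetter


def remove_duplicates(records, key, retain="first"):
    """Keep one record per key value from records already sorted on `key`.

    `retain` is a hypothetical stand-in for the First/Last choice.
    """
    if retain not in ("first", "last"):
        raise ValueError("retain must be 'first' or 'last'")
    result = []
    # groupby only groups adjacent records, which is why sorted input matters.
    for _, group in groupby(records, key=itemgetter(key)):
        rows = list(group)
        result.append(rows[0] if retain == "first" else rows[-1])
    return result


rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "c"},
    {"id": 1, "name": "b"},
]
# sorted() is stable: records with equal keys keep their input order.
rows = sorted(rows, key=itemgetter("id"))
first = remove_duplicates(rows, "id", retain="first")
last = remove_duplicates(rows, "id", retain="last")
```

With a stable sort, "first" and "last" refer to the original arrival order within each key, which is the same reason the Stable option matters when Unique is also selected.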
Answer / soumya
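1. Sort the data on the key column, then use three Transformer stage variables, defined in this order:
SV1 = Col1
SV2 = If SV1 = SV3 Then 'DUP' Else 'UNQ'
SV3 = SV1 (initial value = 0; pick a value that cannot occur as a key)
Stage variables are evaluated from top to bottom, so SV3 still holds the previous row's key when SV2 is evaluated. In the output link constraint, use SV2 = 'UNQ' to pass the first record of each key, or SV2 = 'DUP' to capture the duplicates.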
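2. Use an Aggregator stage grouped by the key column (Col1) that counts the rows in each group, followed by a Filter stage. The condition count > 1 returns the keys that have duplicates; count = 1 returns the keys that occur only once.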
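Both ideas in this answer can be sketched in Python to show the logic. This is an illustration only, not Transformer or Aggregator syntax; the function names `flag_duplicates` and `duplicated_keys` and the sample keys are hypothetical.

```python
from collections import Counter

# Sentinel that cannot equal any real key, so the first row is never flagged.
_NO_PREVIOUS = object()


def flag_duplicates(sorted_keys):
    """Compare each key with the previous row's key, like the stage variables."""
    flags = []
    previous = _NO_PREVIOUS  # plays the role of SV3 before the first row
    for current in sorted_keys:  # SV1 = Col1
        flag = "DUP" if current == previous else "UNQ"  # SV2
        flags.append((current, flag))
        previous = current  # SV3 = SV1, updated after SV2 is computed
    return flags


def duplicated_keys(keys):
    """Group by key, count rows per group, and keep keys with count > 1."""
    counts = Counter(keys)
    return sorted(k for k, n in counts.items() if n > 1)


keys = [10, 10, 20, 30, 30, 30]  # already sorted on the key
flags = flag_duplicates(keys)
unique_rows = [k for k, flag in flags if flag == "UNQ"]
dup_keys = duplicated_keys(keys)
```

The first function depends on sorted input because it only looks at the previous row; the counting approach does not, but it reports keys rather than individual rows.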