How many ways are there to remove duplicate records?
Answers were Sorted based on User's Feedback
Answer / prabhu rathnam
1. Remove Duplicates stage
2. Sort stage
3. Copy stage
| Is This Answer Correct ? | 8 Yes | 0 No |
Answer / rohit babar
There are three ways to remove duplicate records:
1. Use the Remove Duplicates stage. It lets us control which record of each duplicate group is kept: the stage has a "record to retain" option where we select first or last.
2. Use the Sort stage. It has an "allow duplicates" option: set it to true to keep duplicate records, or to false to get unique records.
3. Use in-line sorting. In the Partitioning tab of any stage, selecting a key-based partitioning technique enables the "perform sort" option. Checking it enables two further options, "stable" and "unique". Stable keeps duplicate records, preserving their input order; unique removes duplicates so that only unique records pass through.
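The "retain first or last" behaviour described above can be illustrated outside DataStage. The following is only a rough Python sketch of the idea, not how the stage is implemented; the function name `remove_duplicates`, the `retain` parameter, and the sample rows are all made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def remove_duplicates(records, key_col, retain="first"):
    """Keep one record per key value: the first or the last of each group.

    Hypothetical helper for illustration only; it mimics the idea of
    sorting on a key and then retaining one record per duplicate group.
    """
    if retain not in ("first", "last"):
        raise ValueError("retain must be 'first' or 'last'")

    # sorted() is stable, so records with equal keys keep their input order.
    ordered = sorted(records, key=itemgetter(key_col))

    result = []
    for _, group in groupby(ordered, key=itemgetter(key_col)):
        rows = list(group)
        result.append(rows[0] if retain == "first" else rows[-1])
    return result


# Made-up sample data: empno 11 appears twice.
rows = [
    {"empno": 12, "ename": "bb"},
    {"empno": 11, "ename": "aa"},
    {"empno": 11, "ename": "ax"},
]

first_kept = remove_duplicates(rows, "empno", retain="first")
last_kept = remove_duplicates(rows, "empno", retain="last")
```

Here `first_kept` holds the "aa" row for empno 11 and `last_kept` holds the "ax" row, which is the same choice the first/last option makes for each group of duplicates.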
| Is This Answer Correct ? | 5 Yes | 0 No |
Answer / soumya
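1. Sort the data on a key column, then use three Transformer stage variables:
SV1 = Col1
SV2 = If SV1 = SV3 Then 'DUP' Else 'UNQ'
SV3 = SV1 (initial value = 0)
Stage variables are evaluated top to bottom, so SV3 still holds the previous row's key when SV2 is computed. In the constraint, use SV2 = 'UNQ' to pass the unique records or SV2 = 'DUP' to pass the duplicates.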
2. Use an Aggregator stage: group by the key column (col1) and count the rows in each group, then use a Filter stage with count > 1 to select the duplicated keys.
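Both approaches in this answer can be sketched in Python to show the logic. This is an illustration under assumptions, not DataStage code: the function names `flag_duplicates` and `duplicated_keys` are hypothetical, and the "previous key" starts as `None` rather than 0 so that a real key value of 0 is not mistaken for a duplicate.

```python
from collections import Counter


def flag_duplicates(sorted_keys):
    """Flag each row 'UNQ' or 'DUP' by comparing it with the previous key.

    Mirrors the stage-variable idea: the input must already be sorted
    on the key so that equal keys are adjacent.
    """
    previous = None  # plays the role of SV3 (assumed initial value)
    flags = []
    for current in sorted_keys:  # plays the role of SV1
        # Plays the role of SV2: compare with the previous row's key.
        flags.append("DUP" if current == previous else "UNQ")
        # Update only after the comparison, as with SV3.
        previous = current
    return flags


def duplicated_keys(keys):
    """Group by key, count rows per key, and keep keys with count > 1."""
    counts = Counter(keys)
    return sorted(k for k, n in counts.items() if n > 1)


# Made-up key column, already sorted.
keys = [11, 11, 12, 13, 13, 13]

flags = flag_duplicates(keys)
dups = duplicated_keys(keys)
```

Keeping only the rows flagged 'UNQ' gives one row per key, while the count-based version reports which keys occur more than once.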
| Is This Answer Correct ? | 1 Yes | 2 No |
What is the process of killing a job in DataStage?
Differentiate between an operational data store (ODS) and a data warehouse?
What are the environment settings for DataStage while working on parallel jobs?
Hi friends, I am new to DataStage and have a question; could anyone please reply to my post? I have an Excel workbook (named, e.g., xxxx) containing two tables: emp(eid, ename, salary, deptno) and dep(deptno, name, state). My source is an ODBC Enterprise stage. I want to read the emp and dep tables, join the two tables, and write the values for dept no 10 (eid, ename, salary, name, state) to the target. Thanks, Badari
How will you design file watch jobs?
Two source files contain the same metadata and a third file contains different data types. Can I funnel that file?
How can we achieve parallelism?
Define DataStage Designer?
Explain usage analysis in DataStage?
How can you abort the job if the data contains duplicates?
I have source data like: empno, ename: 11, aa; 12, bb. I want output like: empno, ename: 11, aa; 12, bb; 11, aa; 12, bb.
At the source level I have 40 columns, but I want only 20 columns at the target. What are the various ways to do this?