Drop duplicate records

The source looks like this:

ID flag1 flag2

100 N Y
100 N N
100 Y N
101 Y Y
101 N Y
102 Y N
103 N N
104 Y Y
105 N N
106 N Y
102 N Y
105 Y Y

In the file above, if any ID has a record in which both flags are "N", then
all records for that ID should be dropped.

For the data above, the output should be:

ID flag1 flag2

101 Y Y
101 N Y
102 Y N
102 N Y
104 Y Y
106 N Y

Steps:

1) Identify the IDs that have a record in which both
flag values are "N".

2) Look up these IDs against the existing records and drop every record whose ID matches.
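
The two steps above can be sketched outside DataStage to check the logic against the sample data. This is a minimal Python sketch, not DataStage syntax; the names `ROWS` and `drop_flagged_ids` are hypothetical and chosen only for illustration.

```python
# Sample rows from the question: (ID, flag1, flag2)
ROWS = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def drop_flagged_ids(rows):
    """Drop every record whose ID has at least one N/N record."""
    # Step 1: collect the IDs that have a record with both flags "N".
    flagged_ids = {rid for rid, f1, f2 in rows if f1 == "N" and f2 == "N"}
    # Step 2: look up each record's ID in that set and keep only non-matches.
    kept = [row for row in rows if row[0] not in flagged_ids]
    # Sort by ID (stable sort) so the result is grouped like the expected output.
    return sorted(kept, key=lambda row: row[0])


result = drop_flagged_ids(ROWS)
for rid, f1, f2 in result:
    print(rid, f1, f2)
```

On the sample data this keeps only the records for IDs 101, 102, 104 and 106, which agrees with the expected output listed in the question.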

Answer Posted / vz

Put a constraint in the Transformer stage as shown below.

flag1 = 'Y' Or flag2 = 'Y'

I think this will help you.
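
A constraint of this kind is evaluated one row at a time. The following Python sketch (again not DataStage syntax; `row_constraint` and `SAMPLE` are hypothetical names) applies the same condition to the question's sample data, to show which rows a row-level filter alone would keep.

```python
# Sample rows from the question: (ID, flag1, flag2)
SAMPLE = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def row_constraint(rows):
    """Keep a row when flag1 is 'Y' or flag2 is 'Y' (row-level check only)."""
    return [row for row in rows if row[1] == "Y" or row[2] == "Y"]


kept_rows = row_constraint(SAMPLE)
kept_ids = sorted({row[0] for row in kept_rows})
print(kept_ids)
```

On this data the filter removes only the three N/N rows themselves, so the other records for IDs 100 and 105 remain. To match the expected output in the question, the constraint would presumably need to be combined with the ID lookup described in the question's steps.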
