Drop duplicate records ...



The source looks like this:

ID flag1 flag2

100 N Y
100 N N
100 Y N
101 Y Y
101 N Y
102 Y N
103 N N
104 Y Y
105 N N
106 N Y
102 N Y
105 Y Y

In the above file, if any ID has a record with both flags set to "N", then
all records for that ID should be dropped.

For the above input, the output should be:

ID flag1 flag2

101 Y Y
101 N Y
102 Y N
102 N Y
104 Y Y
106 N Y



Steps:

1) Identify the IDs that have a record with both flag
values set to "N".

2) Look up all existing records against these IDs and drop the ones that match.
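
Outside DataStage, the two steps can be sketched in plain Python to check the logic against the sample data. This is only an illustration of the rule, not a DataStage job; the function name drop_flagged_ids and the tuple layout (ID, flag1, flag2) are assumptions made for the example.

```python
# Sample records from the question, as (ID, flag1, flag2) tuples.
records = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def drop_flagged_ids(rows):
    """Drop every record whose ID has at least one record with both flags 'N'.

    Hypothetical helper written for this example.
    """
    # Step 1: collect the IDs that have a record with flag1 = flag2 = "N".
    bad_ids = {rec_id for rec_id, f1, f2 in rows if f1 == "N" and f2 == "N"}
    # Step 2: look each record's ID up in that set and keep only the misses.
    return [row for row in rows if row[0] not in bad_ids]


# Sort by ID only; the sort is stable, so records keep their input order per ID.
result = sorted(drop_flagged_ids(records), key=lambda row: row[0])
for row in result:
    print(*row)
```

With the sample input, IDs 100, 103 and 105 are collected in step 1, so only the records for 101, 102, 104 and 106 remain.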

Answer / dipal

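Step 1
Use a Filter stage with the condition
Flag1 = "N" AND Flag2 = "N" and send the matching records to link1.
Also define a reject link, which receives all remaining records.
Step 2
Use a Lookup stage keyed on ID, with the Filter's reject link as the
stream (primary) input and link1 as the reference input,
and define a reject link on the Lookup as well.
Records whose ID is not found in link1 go to the Lookup's reject link, which therefore carries the required output.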
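
The data flow of this answer can be mimicked in Python to see which rows end up on each link. This is a rough sketch, not DataStage code: filter_stage and lookup_stage are hypothetical stand-ins for the stages, and it assumes the Filter's reject link feeds the Lookup as the stream input while link1 is the reference input.

```python
def filter_stage(rows):
    """Split rows into link1 (both flags 'N') and a reject link (all other rows)."""
    link1 = [r for r in rows if r[1] == "N" and r[2] == "N"]
    rejects = [r for r in rows if not (r[1] == "N" and r[2] == "N")]
    return link1, rejects


def lookup_stage(stream, reference):
    """Look each stream row's ID up in the reference rows.

    Returns (matched, rejected): rows whose ID was found, and rows whose ID
    was not found (what a reject link would carry in this sketch).
    """
    ref_ids = {r[0] for r in reference}
    matched = [r for r in stream if r[0] in ref_ids]
    rejected = [r for r in stream if r[0] not in ref_ids]
    return matched, rejected


# Sample records from the question, as (ID, flag1, flag2) tuples.
rows = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]

link1, rest = filter_stage(rows)
matched, output = lookup_stage(stream=rest, reference=link1)

for row in output:
    print(*row)
```

Here the matched rows are the leftover records of IDs 100 and 105, which are discarded, and the rejected rows are the records for IDs 101, 102, 104 and 106 in their input order.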

Answer / vz

Put a constraint in the Transformer stage as shown below:

flag1 = "Y" Or flag2 = "Y"

That is, a row passes when either flag field is "Y".

I think this will help you.
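
To see what a row-level constraint of this kind lets through, it can be tried on the sample data in Python. This is only a sketch: the function name constraint is an assumption for the example, and the flags are compared against uppercase "Y" as they appear in the source.

```python
# Sample records from the question, as (ID, flag1, flag2) tuples.
rows = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def constraint(row):
    """Row-level test: pass the row when either flag is 'Y'."""
    _, flag1, flag2 = row
    return flag1 == "Y" or flag2 == "Y"


# Each row is tested on its own, without looking at other rows of the same ID.
passed = [r for r in rows if constraint(r)]

for row in passed:
    print(*row)
```

Because the constraint is evaluated one row at a time, it removes only the three N/N rows. The other records of IDs 100 and 105 still pass, so for the output asked for in the question the constraint would have to be combined with an ID-level step such as a lookup.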
