Drop duplicate records
The source looks like:
ID flag1 flag2
100 N Y
100 N N
100 Y N
101 Y Y
101 N Y
102 Y N
103 N N
104 Y Y
105 N N
106 N Y
102 N Y
105 Y Y
In the above file, if any ID has a record with both flags set to "N",
then all records for that ID should be dropped.
In the above case the output should be:
ID flag1 flag2
101 Y Y
101 N Y
102 Y N
102 N Y
104 Y Y
106 N Y
Steps to do:
1) Identify the IDs that have a record with both
flag values set to “N”.
2) Look up these IDs against the existing IDs and drop every record that matches.
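The two steps above can be sketched outside DataStage in plain Python. This is only an illustration of the logic, not a DataStage job; the function name `drop_ids_with_nn` and the tuple layout (ID, flag1, flag2) are assumptions made for the example.

```python
# Sample rows from the question: (ID, flag1, flag2)
ROWS = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def drop_ids_with_nn(rows):
    """Drop every row whose ID has at least one row with both flags 'N'."""
    # Step 1: collect the IDs that have a row with flag1 = 'N' and flag2 = 'N'.
    bad_ids = {rid for rid, f1, f2 in rows if f1 == "N" and f2 == "N"}
    # Step 2: look each row's ID up in that set and keep only the misses.
    return [row for row in rows if row[0] not in bad_ids]


# Sort by ID only (stable), so rows of one ID keep their source order.
result = sorted(drop_ids_with_nn(ROWS), key=lambda r: r[0])
for rid, f1, f2 in result:
    print(rid, f1, f2)
```

With the sample data, the IDs collected in step 1 are 100, 103 and 105, so only the rows for 101, 102, 104 and 106 remain.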
Answer Posted / dipal
Step 1
Filter the records on the condition
Flag1 = N AND Flag2 = N and send the matching records to link1.
Also define a reject link for the records that do not match.
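Step 2
Feed the Filter reject link (records where the flags are not both N) into a
Lookup stage as the stream (primary) input and link1 as the reference input,
with ID as the lookup key.
Set the lookup failure action to Reject and define a reject link.
The Lookup reject link then carries the records whose ID has no N/N record, which is the required output.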
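The data flow of this answer can be mimicked in Python to check it against the sample data. This is a rough sketch, not DataStage code: `filter_stage` and `lookup_stage` are hypothetical stand-ins for the Filter and Lookup stages, and it assumes the Filter reject rows are the stream input of the lookup while link1 is the reference input.

```python
# Sample rows from the question: (ID, flag1, flag2)
ROWS = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def filter_stage(rows):
    """Split rows into (link1, reject); link1 holds rows with both flags 'N'."""
    link1, reject = [], []
    for row in rows:
        _, f1, f2 = row
        if f1 == "N" and f2 == "N":
            link1.append(row)
        else:
            reject.append(row)
    return link1, reject


def lookup_stage(stream, reference):
    """Look up each stream row's ID among the reference rows.

    Returns (matched, reject); reject holds stream rows whose ID was not found.
    """
    ref_ids = {row[0] for row in reference}
    matched = [row for row in stream if row[0] in ref_ids]
    reject = [row for row in stream if row[0] not in ref_ids]
    return matched, reject


link1, filter_reject = filter_stage(ROWS)
matched, lookup_reject = lookup_stage(filter_reject, link1)
# lookup_reject is the required output (in source order, not sorted by ID).
print(lookup_reject)
```

Here `matched` holds the leftover rows of IDs 100 and 105, which are discarded, while ID 103 never reaches the lookup's stream input because its only row went to link1.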