I have 2 files 1st contains duplicate records only, 2nd file contains Unique records.EX:
File1:
1 subhash 10000
1 subhash 10000
2 raju 20000
2 raju 20000
3 chandra 30000
3 chandra 30000
File2:
1 subhash 10000
5 pawan 15000
7 reddy 25000
3 chandra 30000
Output file:-- capture all the duplicates in both file with count.
1 subhash 10000 3
1 subhash 10000 3
1 subhash 10000 3
2 raju 20000 2
2 raju 20000 2
3 chandra 30000 3
3 chandra 30000 3
3 chandra 30000 3
Answer Posted / ankit gosain
Hi,
This problem can be solved by creating a job with following
stages:
File2 File2
| |
| |
| |
File1-----Funnel----Aggregator----Join----Filter---Tgt_File
|
|
|
File1
1. Funnel both the files (Now you have Unique & Duplicates
records).
2. Aggregate on the basis of any i/p column and mention the
calculation type = Count Rows (say o/p column row_count).
3. Join the aggregated o/p with the i/p file1,2 one the
basis of key & mention the join type = Inner Join.
4. In filter stage, mention the where clause as row_count>1.
If you have further doubt or query, catch me on
ankitgosian@gmail.com
Cheers,
Ankit :)
| Is This Answer Correct ? | 1 Yes | 0 No |
Post New Answer View All Answers
Define meta stage?
How a routine is called in datastage job?
Differentiate between datastage and informatica?
What are some prerequisites for datastage?
What are the types of containers?
1)How will u implement SCD2 by using surrogate key. 2)What are the disadvantages with surrogate key. 3)How will you handle nulls in your project for the varchar, integer data types. 4)Can I use two fact tables in star schema. 5)3 jobs are running on the 2 nodes after I added one more node so can I compile those jobs to run on three nodes.
Describe stream connector?
what is use of SDR function?
How many types of stage?
What is a quality stage in datastage tool?
What can we do with datastage director?
what is the use of skid in reporting?
Define repository tables in datastage?
in oracle target stage when we use load option and when we use upsert option?
how to abort the job its matain duplicates?