I have two files: the first contains only duplicate records, and the second contains unique records. Example:
File1:
1 subhash 10000
1 subhash 10000
2 raju 20000
2 raju 20000
3 chandra 30000
3 chandra 30000
File2:
1 subhash 10000
5 pawan 15000
7 reddy 25000
3 chandra 30000
Output file: capture all the records that are duplicated across both files, each with its count.
1 subhash 10000 3
1 subhash 10000 3
1 subhash 10000 3
2 raju 20000 2
2 raju 20000 2
3 chandra 30000 3
3 chandra 30000 3
3 chandra 30000 3
Answer Posted / ankit gosain
Hi,
This problem can be solved by creating a job with the following
stages:
          File2                  File2
            |                      |
            |                      |
            |                      |
File1-----Funnel----Aggregator----Join----Filter---Tgt_File
                                   |
                                   |
                                   |
                                 File1
1. Funnel both files into one stream (it now holds both the unique and
the duplicate records).
2. In the Aggregator, group on the key column(s) and set the
calculation type = Count Rows (say the output column is row_count).
3. Join the aggregated output with the input records from File1 and
File2 on the key, and set the join type = Inner Join.
4. In the Filter stage, set the where clause to row_count>1.
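The four steps above can be sketched outside DataStage as a small Python script. This is only an illustration of the logic, not a DataStage job: the function name duplicates_with_count is hypothetical, the whole record is assumed to be the key, and step 3 is read as joining the counts back onto the funneled records. The final sort is added only so the output matches the order shown in the question.

```python
from collections import Counter

# Sample data copied from the question.
file1 = [
    "1 subhash 10000",
    "1 subhash 10000",
    "2 raju 20000",
    "2 raju 20000",
    "3 chandra 30000",
    "3 chandra 30000",
]
file2 = [
    "1 subhash 10000",
    "5 pawan 15000",
    "7 reddy 25000",
    "3 chandra 30000",
]


def duplicates_with_count(first, second):
    """Hypothetical helper mirroring Funnel -> Aggregator -> Join -> Filter."""
    # Step 1 (Funnel): combine both inputs into one stream.
    funneled = first + second
    # Step 2 (Aggregator): count rows per key (assumed here: the whole record).
    row_count = Counter(funneled)
    # Step 3 (Join): attach row_count to every funneled record.
    joined = [(record, row_count[record]) for record in funneled]
    # Step 4 (Filter): keep only records with row_count > 1.
    kept = [(record, n) for record, n in joined if n > 1]
    # Sort by the numeric id in the first column to match the example output.
    kept.sort(key=lambda pair: int(pair[0].split()[0]))
    return [f"{record} {n}" for record, n in kept]


result = duplicates_with_count(file1, file2)
for line in result:
    print(line)
```

With the sample files, the records for pawan and reddy are dropped because their row_count is 1.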
If you have any further doubts or queries, contact me at
ankitgosian@gmail.com
Cheers,
Ankit :)