I have two files: the first contains only duplicate records, and the second contains unique records. Example:
File1:
1 subhash 10000
1 subhash 10000
2 raju 20000
2 raju 20000
3 chandra 30000
3 chandra 30000
File2:
1 subhash 10000
5 pawan 15000
7 reddy 25000
3 chandra 30000
Output file:-- capture all the duplicates in both file with count.
1 subhash 10000 3
1 subhash 10000 3
1 subhash 10000 3
2 raju 20000 2
2 raju 20000 2
3 chandra 30000 3
3 chandra 30000 3
3 chandra 30000 3
Answer / subbuchamala
File1,File2====Funnel-----Copy=======1st link AGG, 2nd link JOIN----Filter----OutputFile
1. pass the 2 files to funnel stage and then copy stage.
2. from copy stage 1st link to AGG stage, 2nd link to JOIN stage
3. In AGG stage, Group by Key column say ID, NAME take the count and JOIN based on KEY column
4. Filter on COUNT>1 send the output OutputFile
we get desired output
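The stage flow above can be sketched outside DataStage to check that it produces the expected output. This is only a plain-Python illustration, not DataStage code: the function name `duplicates_with_count` and the tuple layout (ID, NAME, SALARY) are assumptions made for the example, and the final sort is added only so the rows come out in the same order as in the question.

```python
from collections import Counter

# Sample data from the question, as (ID, NAME, SALARY) tuples.
file1 = [
    (1, "subhash", 10000), (1, "subhash", 10000),
    (2, "raju", 20000), (2, "raju", 20000),
    (3, "chandra", 30000), (3, "chandra", 30000),
]
file2 = [
    (1, "subhash", 10000), (5, "pawan", 15000),
    (7, "reddy", 25000), (3, "chandra", 30000),
]


def duplicates_with_count(*inputs):
    """Hypothetical helper mimicking Funnel -> Copy -> Aggregator/Join -> Filter."""
    # Funnel: append all input files into a single stream.
    funnelled = [row for rows in inputs for row in rows]
    # Aggregator (1st Copy link): row count per key (ID, NAME).
    counts = Counter((row[0], row[1]) for row in funnelled)
    # Join (2nd Copy link): attach the count to every row on the same key.
    joined = [row + (counts[(row[0], row[1])],) for row in funnelled]
    # Filter: keep only rows whose key occurs more than once.
    # The sort is not part of the job design; it only fixes the output order.
    return sorted(row for row in joined if row[3] > 1)


for row in duplicates_with_count(file1, file2):
    print(*row)
```

With the sample files, pawan and reddy are dropped because their count is 1, and every remaining row is repeated once per occurrence with its count appended.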
Answer / ankit gosain
Hi,
This problem can be solved by creating a job with the following
stages:
          File2                   File2
            |                       |
            |                       |
            |                       |
File1-----Funnel----Aggregator----Join----Filter---Tgt_File
                                    |
                                    |
                                    |
                                  File1
1. Funnel both files (now you have the unique and the duplicate
records in one stream).
2. Aggregate on the key column and set the calculation type to
Count Rows (say the output column is row_count).
3. Join the aggregated output with the input File1 and File2 on
the key, and set the join type to Inner Join.
4. In the Filter stage, set the where clause to row_count > 1.
If you have any further doubts or queries, contact me at
ankitgosian@gmail.com
Cheers,
Ankit :)