I have 2 files: the 1st contains duplicate records only, the 2nd contains unique records

I have 2 files. The 1st contains only duplicate records; the 2nd contains only unique records. Example:
File1:
1 subhash 10000
1 subhash 10000
2 raju 20000
2 raju 20000
3 chandra 30000
3 chandra 30000
File2:
1 subhash 10000
5 pawan 15000
7 reddy 25000
3 chandra 30000
Output file: capture every record that occurs more than once across both files, together with its count.
1 subhash 10000 3
1 subhash 10000 3
1 subhash 10000 3
2 raju 20000 2
2 raju 20000 2
3 chandra 30000 3
3 chandra 30000 3
3 chandra 30000 3

Question Posted / subbuchamala

Answer / subbuchamala

File1, File2 ==> Funnel ==> Copy ==> (1st link: Aggregator, 2nd link: Join) ==> Filter ==> OutputFile
1. Pass the 2 files to a Funnel stage and then to a Copy stage.
2. From the Copy stage, send the 1st link to an Aggregator stage and the 2nd link to a Join stage.
3. In the Aggregator stage, group by the key columns (say ID, NAME) and take the row count, then join the counts back to the 2nd link on the same key columns.
4. Filter on COUNT > 1 and send the output to OutputFile.
This gives the desired output.
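
To check the logic outside DataStage, the same flow can be sketched in plain Python. This is only an illustration, not DataStage code: the function name `duplicates_with_count`, the tuple layout (ID, NAME, SALARY), and the final sort (added so the rows print in the order shown in the question) are all assumptions made for the example.

```python
from collections import Counter

# Sample records from the question, as (id, name, salary) tuples.
file1 = [
    (1, "subhash", 10000), (1, "subhash", 10000),
    (2, "raju", 20000), (2, "raju", 20000),
    (3, "chandra", 30000), (3, "chandra", 30000),
]
file2 = [
    (1, "subhash", 10000), (5, "pawan", 15000),
    (7, "reddy", 25000), (3, "chandra", 30000),
]


def duplicates_with_count(*inputs, key=lambda rec: (rec[0], rec[1])):
    """Hypothetical helper mirroring Funnel -> Aggregator -> Join -> Filter."""
    # Funnel: merge all inputs into a single stream of records.
    funneled = [rec for records in inputs for rec in records]
    # Aggregator: count rows per key (here ID, NAME).
    counts = Counter(key(rec) for rec in funneled)
    # Join: attach the per-key count to every funneled record.
    joined = [rec + (counts[key(rec)],) for rec in funneled]
    # Filter: keep only records whose key occurs more than once.
    # Sorting is an assumption, used only to get a stable print order.
    return sorted(row for row in joined if row[-1] > 1)


result = duplicates_with_count(file1, file2)
for row in result:
    print(*row)
```

Because the count is joined back onto every funneled record rather than onto the aggregated rows, each copy of a duplicated record survives the filter, which is why the sample output repeats each record as many times as it occurs.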

Answer / ankit gosain

Hi,

This problem can be solved by creating a job with the following
stages:

          File2                  File2
            |                      |
            |                      |
            |                      |
File1----Funnel----Aggregator----Join----Filter----Tgt_File
                                   |
                                   |
                                   |
                                 File1

1. Funnel both files (now you have the unique and the duplicate
records together).
2. Aggregate on an input column (the key) and set the
calculation type = Count Rows (say, output column row_count).
3. Join the aggregated output with the input records of File1 and
File2 on the key, and set the join type = Inner Join.
4. In the Filter stage, set the where clause to row_count>1.

If you have any further doubts or queries, contact me at
ankitgosian@gmail.com

Cheers,
Ankit :)
