Golgappa.net | Golgappa.org | BagIndia.net | BodyIndia.Com | CabIndia.net | CarsBikes.net | CarsBikes.org | CashIndia.net | ConsumerIndia.net | CookingIndia.net | DataIndia.net | DealIndia.net | EmailIndia.net | FirstTablet.com | FirstTourist.com | ForsaleIndia.net | IndiaBody.Com | IndiaCab.net | IndiaCash.net | IndiaModel.net | KidForum.net | OfficeIndia.net | PaysIndia.com | RestaurantIndia.net | RestaurantsIndia.net | SaleForum.net | SellForum.net | SoldIndia.com | StarIndia.net | TomatoCab.com | TomatoCabs.com | TownIndia.com
Interested to Buy Any Domain ? << Click Here >> for more details...

I have 2 files 1st contains duplicate records only, 2nd file contains Unique records.EX:
File1:
1 subhash 10000
1 subhash 10000
2 raju 20000
2 raju 20000
3 chandra 30000
3 chandra 30000
File2:
1 subhash 10000
5 pawan 15000
7 reddy 25000
3 chandra 30000
Output file:--&#61664; capture all the duplicates in both file with count.
1 subhash 10000 3
1 subhash 10000 3
1 subhash 10000 3
2 raju 20000 2
2 raju 20000 2
3 chandra 30000 3
3 chandra 30000 3
3 chandra 30000 3

Answer Posted / ankit gosain

Hi,

This problem can be solved by creating a job with following
stages:

File2 File2
| |
| |
| |
File1-----Funnel----Aggregator----Join----Filter---Tgt_File
|
|
|
File1

1. Funnel both the files (Now you have Unique & Duplicates
records).
2. Aggregate on the basis of any i/p column and mention the
calculation type = Count Rows (say o/p column row_count).
3. Join the aggregated o/p with the i/p file1,2 one the
basis of key & mention the join type = Inner Join.
4. In filter stage, mention the where clause as row_count>1.

If you have further doubt or query, catch me on
ankitgosian@gmail.com

Cheers,
Ankit :)

Is This Answer Correct ?    1 Yes 0 No



Post New Answer       View All Answers


Please Help Members By Posting Answers For Below Questions

Define meta stage?

1207


How a routine is called in datastage job?

1121


Differentiate between datastage and informatica?

1188


What are some prerequisites for datastage?

1104


What are the types of containers?

1239


1)How will u implement SCD2 by using surrogate key. 2)What are the disadvantages with surrogate key. 3)How will you handle nulls in your project for the varchar, integer data types. 4)Can I use two fact tables in star schema. 5)3 jobs are running on the 2 nodes after I added one more node so can I compile those jobs to run on three nodes.

4047


Describe stream connector?

1312


what is use of SDR function?

5178


How many types of stage?

1181


What is a quality stage in datastage tool?

1081


What can we do with datastage director?

1242


what is the use of skid in reporting?

2381


Define repository tables in datastage?

1164


in oracle target stage when we use load option and when we use upsert option?

2285


how to abort the job its matain duplicates?

2576