Interview Questions Answers in Data Stage, Informatica

subhash

{ City } hyderabad
< Country > india
* Profession * datastage developer
User No # 101246

Total Questions Posted # 15
Total Answers Posted # 100

Total Answers Posted for My Questions # 43
Total Views for My Questions # 151078

Users Marked my Answers as Correct # 555
Users Marked my Answers as Wrong # 33

Answers / { subhash }

Question { TCS, 31359 }

A flatfile contains 200 records.I want to load first 50
records at first time running the job,second 50 records at
second time running and so on,how u can develop the job?pls
give the steps?pls pls

Answer

other than VARUN solution

1. Add 'row number' column in Seq File stage, so that each record has a number associated with it.
2. Add a job param with which we can provide the number of record from where we want to run the job. We can pass this either using Sequence Start LOOP(List type variables-50,100,150,200) or by shell script.
3. In the tfm, use a stage variable to run only from the record number till 50 records by counting each record.

Is This Answer Correct ?

2 Yes

0 No

Question { IBM, 11838 }

i/p o/p1 o/p2
1 1 4
1 1 5
1 1 6
2 2
2 2
2 2
3
3
4
5
6
how to populates i/p rows into o/p1&o/p2 using datastage
stages?and also the same scenario using sql?

Answer

SQL> small correct in Query provided by Kalpana.

SELECT COL1 FROM TAB1 WHERE COL1 IN
(SELECT COL1 FROM TAB1 GROUP BY COL1 HAVING COUNT(*) > 1) ;

O/P 1 :
1
1
1
2
2
2
3
3
SQL>

SELECT COL1 FROM TAB1 GROUP BY COL1 HAVING COUNT(*) = 1 ;

O/P 2 :
4
5
6

Is This Answer Correct ?

5 Yes

0 No

Question { L&T, 9606 }

can we half project in parallel jobs and half project in
server jobs?

Answer

in one project you can both server and parallel jobs.
People use both type of canvas to develop jobs based on their requirements.
Using sequence job, you can call both types of job along with other actvities

Is This Answer Correct ?

0 Yes

0 No

Question { Wipro, 8177 }

what are the different type of errors in datastage?

Answer

There are three message types:
1. Error/fatal
2. Warning
3. Informational/info

Is This Answer Correct ?

1 Yes

0 No

Question { 5586 }

i 10 jobs first two jobs are runing in 2nodes,next 2 jobs
are running in 4 nodes, next 4 jobs are running in 6 nodes
and the remaining jobs are running on 10 nodes. how to
change the node configuration?

Answer

First create 3 Configuration files(2 Node, 4 Node and 6 Node
config files).
in the job, go to the 'TOOLS' tab,
then go to 'Configurations' select the on which node
configuration file you want to run the job.

Is This Answer Correct ?

3 Yes

0 No

Question { 5586 }

Answer

First create 4 Configuration files(2 Node, 4 Node 6 Node and 10 Node config files).

And add the environment variable($APT_CONFIG_FILE) to all the jobs in the job Properties.

-->For first2 jobs: Assign the 2 Node configuration file path to the $APT_CONFIG_FILE variable in Job properties.
-->For next 2 jobs: Assign the 4 Node configuration file path to the $APT_CONFIG_FILE variable in Job properties.

do this for all jobs.

Bsed on this config file setting, Jobs will pick required Config file in runtime.

Is This Answer Correct ?

4 Yes

0 No

Question { IBM, 6720 }

Hi,
In source I have records like this
No name address
10 manoj mum
10 manoj dilhi
20 kumar usa
20 kumar Tokyo
I want records in target like shown below
No name addr1 addr2
10 manoj mum dilhi
20 kumar usa Tokyo

If it is reverse we can do this by using Normalizer
transformation by setting occurance as 2.
Somebody will say use denoralization technique. But as of my
knowledge I couldn’t find any denormalization technique. Is
there any concept like that?
I tryid this seriously but I could find any idea to
implement this.
Can any one please help me ?
Advance Thanks

Answer

SELECT NO, NAME,
MAX(ADDRESS) AS ADDR1,
MIN(ADDRESS) AS ADDR2
FROM TABLE_NAME
GROUP BY (NO,NAME)

Is This Answer Correct ?

2 Yes

0 No

Question { IBM, 10144 }

source file is having 5 records while moving into target it
want to be 10 records

Answer

Different ways are:

1. SRC-->COPY-->(2links)-->FUNNEL-->TGT
from copy stage get 2 copies of output data and send to funnel.

2. SRC-->Transfermer-->(2links)-->FUNNEL-->TGT
in the TFM, map the all the columns to both the links and then funnel them.

3. SRC--->TGT(with append mode)
Run job 2 times with TGT Seq file mode as "Append"

4. 2 SRCs(2 same Seq file Stages)-->Funnel-->TGT
read the same file from 2 Seqfile Stages and send them to funnel.

5. SRC--->TGT
Add 2 "File" properties in the Seq file stage and for both of them give the same file name and path.

File=directory/file.txt
File=directory/file.txt

Is This Answer Correct ?

1 Yes

0 No

Question { TCS, 6011 }

SEQUENTIAL FILE I HAVE ONE RECORD,I WANT 100 RECORDS IN
TARGET?HOW CAN WE DO THAT?PLS EXPLAIN ME AND WHAT STAGES ARE
THERE?WHAT LOGIC?

Answer

1)
JOB1: SRC---->COPY---->TGT
SEQuence:
START LOOP---->JOB1----->END LOOP Activity.

In TGT stage use 'Append' Mode.
By Looping 100 time, we can get 100 records in target.

2)
SRC---->Transformer---->TGT
By using Looping Variable in the Transformer, we can achieve
this.
Loop While Condition "@ITERATION <=100"

Is This Answer Correct ?

9 Yes

0 No

Question { Hexaware, 5620 }

What is configuration your file structure 2)I have two
databases both are Oracle while loading data from source to
target the job takes 30 min but I want to load less time how?

Answer

1) configuration file structure:
{
node "node1"
{
fastname ""
pools ""
resource disk "" {pools
""}
resource scratchdisk " Path>" {pools ""}
}
node "node2"
{
fastname "mctux034"
pools ""
resource disk
"/dsg/IBM/IBM/InformationServer/Server/Datasets" {pools ""}
resource scratchdisk
"/dsg/IBM/IBM/InformationServer/Server/Scratch" {pools ""}
}
}
2)
SRC--->COPY---->TGT
Here Copy stage increase performance and works as a Buffer
when diff in reading count and writing record count.
by setting/checking below in Target ORACLE Connector:
Record Count=6000 (default is 2000, should be multiple of
Array Size)
Array Size=3000 (default is 2000)
Write mode=Bulk Load
Rebuild Indexes After Bulk Load

Is This Answer Correct ?

3 Yes

0 No

Question { Polaris, 21658 }

how to retrive the max(sal),deptno,empno in datastage?

Answer

SRC--> COPY stage--> 2 o/p links from COPY Stage-->
one is --> AGGREGATOR and other is --> JOIN-->TGT

-->In AGG Stage:
set group=DEPNO
aggregator type=calculation
column for calculation=SAL
max value output column=SAL

-->O/P of AGG Stage:
SAL(max(sal), DEPNO

-->In JOIN stage:
Inner Join on DEPNO and SAL and get the EMPNO as well.
By doing this, we get Max(sal) of the employee on particular DEPNO.

Is This Answer Correct ?

2 Yes

0 No

Question { Accenture, 22482 }

how can i get 2nd highest salary in datastage?can u send me
,thanQ
2)if i had source has 2 records 1st record ie 1st column
contains 1,2,3 and 2nd coulmn contains 10,10,10 i have to
get target as 2nd columns as 20,30,40 how can i?

Answer

for 2nd Qn:
Source is:
Col1 Col2
1 10
2 10
3 10
In transformer, we put the below logic:

Link.Col1 * Link.Col2 + Link.Col2------>Col2

1*10+10------------->20
2*10+10------------->30
3*10+10------------->40

Hope this is fine.

Is This Answer Correct ?

7 Yes

1 No

Question { 8143 }

1)Source file contains one record, I want 100 records in
target file.

Answer

Another best way is:
SRC--->TFM--->TGT
in TFM, we can do this by using LOOPing concept.
@ITERATION<=100
map the SRC record to TGT.

Is This Answer Correct ?

10 Yes

0 No

Question { 4701 }

I have 100 records how can I load at a time from the single
time

Answer

by setting/checking below in Target ORACLE Connector:
Record Count=2000 (default is 2000)
Array Size=2000 (default is 2000)
Write mode=Bulk Load

Is This Answer Correct ?

1 Yes

0 No

Question { Cap Gemini, 18819 }

i have a table
col1
10
20
30
40
10
20
50
my requirement is how to retrive only duplicates like
10
10
20
20
like this how it's possible in SQL?

Answer

From above all, these 3 are the correct ways using SQL:
A.
select * from emp as p where rowid<(select max(rowid) from
emp s where p.deptno=s.deptno);
B.
select * from sno
where sno in (select sno from emp
group by sno
having count(sno)>1
)
C.
1. SELECT A.COL1 FROM TABLE1 A

2. ( SELECT COL1, COUNT(COL1)FROM TABLE1
GROUP BY (COL1)
HAVING COUNT(COL1)=1)B

WHERE

3. A.COL1<>B.COL1

RESULT : 1. 10 20 30 40 10 20 50
2. 30 40 50
3. 10 10 20 20
And Using DataStage:
A.
Take source and copy,aggregate,JOIN,FILTER
1) give one copy output link to aggr, other to JOIN

2)In arggr perform count rows,
col1 COUNT
10 2
20 2
30 1
40 1
50 1

3) JOIN the two links(link1 from copy, link2 from AGG) and
use LEFT OUTER JOIN
col1 COUNT
10 2
20 2
30 1
40 1
10 2
20 2
50 1

4)And in FILTER, give constrains as count=1 to TARGET1 and
count>1 to TARGET2. in the TARGET2, u will get the desired
output.

Is This Answer Correct ?

4 Yes

0 No

Prev 1 2 [3] 4 5 6 7 Next