sharath

City: Hyderabad
Country: India
Profession: Dev
User No: 104284
Total Questions Posted: 1
Total Answers Posted: 32

Total Answers Posted for My Questions: 2
Total Views for My Questions: 7918

Users Marked my Answers as Correct: 62
Users Marked my Answers as Wrong: 14

Answers / sharath

Question { IBM, 29327 }

How to exclude the first and last lines while reading data from a sequential file (having some 1000 records)? I guess probably by using the Unix filter option, but I am not sure which one to use.


Answer

Well, if you must use the filter option in the Sequential File stage, then
the answer is:

head -n 10 'FileName' | tail -n 9

Explanation:
Suppose your file has 10 records plus the column-name line, so the total
number of lines is 11.
col1
1
2
3
4
5
6
7
8
9
10

head -n 10 fetches the column name plus records 1 to 9 (dropping the last
line), and tail -n 9 then drops the column-name line, leaving records 1 to 9:
1
2
3
4
5
6
7
8
9
That's it.
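
Since the real file has around 1000 records, a filter that does not depend on the line count may be handier (a sketch, assuming sed is available on the engine host):

sed '1d;$d' FileName

Here 1d deletes the first line (the column name) and $d deletes the last line, whatever the total record count is.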

Is This Answer Correct ?    0 Yes 2 No

Question { 8715 }

What are the prerequisites for the Join stage?


Answer

1. Inputs are the left, right and intermediate links (datasets).
2. It can do left outer, right outer, inner and full outer joins.
3. N inputs are allowed for left, right and inner joins, but only 2 inputs for a full outer join.
4. Inputs should be sorted and partitioned on the key for better performance.
5. Removing duplicates is optional, as the join operation is highly optimized: it works sequentially, with less I/O and fewer page faults.
6. Memory requirements are comparatively lower than for the Lookup stage.
7. The join key column names should be the same in the primary and secondary datasets.

Is This Answer Correct ?    0 Yes 0 No


Question { 5782 }

What is the use of surrogate key stage?


Answer

Here are the uses:
1. We can generate sequence numbers (surrogate keys).
2. A surrogate key acts as an alternative to the primary key.
3. The start value can be passed in through parameters.
4. The parameters can be passed at run time, so the stage is fully parameterizable.

cool...:)

Is This Answer Correct ?    1 Yes 0 No

Question { IBM, 14383 }

What is the default padding character?


Answer

Well, the default padding character in DataStage is NUL
(0x00), a hexadecimal value. This can be changed to suit our
requirements by adding the environment variable
$APT_STRING_PADCHAR and setting it to 0x20, so that it pads
with spaces instead.
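
For example, setting it at the project or job level (a sketch; the exact dialog depends on the DataStage version):

APT_STRING_PADCHAR = 0x20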

Is This Answer Correct ?    0 Yes 1 No

Question { 12477 }

What is the default execution order of the sequential file?


Answer

It behaves differently with respect to its reading capability:
when it reads a single file it uses sequential mode by default,
and when it reads multiple files it uses parallel mode by default.

If we need to improve the performance of the Sequential File stage
when reading a single large file, we have two options:
1. Read From Multiple Nodes
2. Number Of Readers Per Node

The first one is applicable only when the flat file is a fixed-length
file; it is not applicable to delimited files.

The latter works for delimited files: by specifying the number of
readers, the stage reads the single file in partitions in parallel,
improving performance.

That's it.

Is This Answer Correct ?    4 Yes 0 No

Question { 25219 }

How many input links can you give to a Transformer stage?


Answer

Parallel jobs: 1 input link --> TX --> N output links, plus 1 reject link.

Server jobs:   1 input link and N reference links --> TX --> N output links, plus 1 reject link.

Is This Answer Correct ?    4 Yes 1 No

Question { IBM, 17796 }

I want to send all my duplicate records to one target and all unique
records to another target. How do we perform this? Explain.
Example:
input data
eid
251
251
456
456
951
985
output / target1
251
251
456
456
output / target2
951
985
How can we achieve this?


Answer

Hi guys,

Here is the answer.

Seq --> Copy --> Aggregator --> Join --> Filter --> DS1
                                            \-----> DS2
(Copy also sends a second link directly to the Join.)

Explanation:

From the Sequential File stage go to a Copy stage. From the Copy stage,
send one link to the Aggregator to count the rows per key,
and the other link to the Join stage for clubbing them back together.
At the Join, join on the input key column. From the Join go to a Filter,
and at the Filter give the conditions in the where clause:
count > 1 on output link 0 (the duplicate records) and count = 1 on
output link 1 (the unique records).
Finally use Datasets as the targets DS1 and DS2.

Ok, that's it.
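
For comparison, the same count-and-filter idea expressed in SQL (a sketch, assuming the input sits in a table named src with column eid; both names are just for illustration):

select eid
from (select eid, count(*) over (partition by eid) as cnt
      from src) t
where cnt > 1;   -- duplicates (target1); use cnt = 1 for the unique records (target2)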

Is This Answer Correct ?    0 Yes 0 No

Question { IBM, 9124 }

How to change the left and right links in the Join stage?


Answer

In the Join stage, on the Link Ordering tab,
we can change the order of the links.

Is This Answer Correct ?    0 Yes 0 No

Question { 5300 }

Is there no issue when you try to convert a not-null column
to nullable, and vice versa, in the Aggregator stage and
Transformer stage? When I tried, I got warnings, but I can see
such scenarios in running code. Please explain.


Answer

When a column is converted from not null to nullable, there is no
problem whether the incoming data has null or not-null values.

But when a nullable column is converted to not null, there are
two cases:
i) when the data coming from the source has no NULL values, there is no
problem;
ii) when the data coming from the source has NULL values, the job
simply ABORTS.

Is This Answer Correct ?    0 Yes 0 No

Question { IBM, 10570 }

Input Data is:
Emp_Id, EmpInd
100, 0
100, 0
100, 0
101, 1
101, 1
102, 0
102, 0
102, 1
103, 1
103, 1
I want the output:
100, 0
100, 0
100, 0
101, 1
101, 1
Meaning the indicator should be either all ZEROs or all ONEs per
EmpId.
Implement this using both SQL and DataStage.


Answer

Ok
1) SQL:
select empno, dense_rank() over (order by empno) Ind from e2;
then we get:
100 1
100 1
100 1
101 2
101 2
102 3
102 3
102 3
103 4
103 4
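
The ranking above only numbers the groups; if the requirement is to keep only the EmpIds whose indicator values are all the same, a SQL sketch could be (assuming the data sits in a table named emp with columns Emp_Id and EmpInd; note that by this rule 103 would also qualify):

select Emp_Id, EmpInd
from (select Emp_Id, EmpInd,
             min(EmpInd) over (partition by Emp_Id) as mn,
             max(EmpInd) over (partition by Emp_Id) as mx
      from emp) t
where mn = mx;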

2) DataStage:
Sequential File --> Sort --> TX --> Target.
At the Sort stage:
create a key-change column; then we get
100 1
100 0
100 0
101 1
101 0
102 1
102 0
102 0
103 1
103 0
At the TX use two stage variables:
sv1, integer, initial value 0
sv2, integer, initial value 0
and in the derivations assign:
1) keyChange ---> sv1
2) if sv1 = 1 then sv2 + 1 else sv2 ----> sv2

Map the columns empno and sv2,
and we get the results.
That's it.
shar

Is This Answer Correct ?    1 Yes 1 No

Question { CTS, 15881 }

INPUT file 'A' contains:
1
2
3
4
5
6
7
8
9
10

input file 'B' contains:
6
7
8
9
10
11
12
13
14
15

Output file 'X' contains:
1
2
3
4
5

Output file 'Y' contains:
6
7
8
9
10

Output file 'Z' contains:
11
12
13
14
15

How can we implement this in a single DataStage job?


Answer

Yes, a union of the two files is a good idea,
but there is no need for a Transformer stage; we can do it this way:

src1, src2 --> Funnel --> Sort --> Filter --> trg1, trg2, trg3

At the Sort stage just set Allow Duplicates = False,
and at the Filter stage give the conditions:
1. col1 >= 1 and col1 < 6    --> trg1
2. col1 >= 6 and col1 < 11   --> trg2
3. col1 >= 11                --> trg3
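
For reference, the same funnel-sort-split logic at the command line (a sketch, assuming A and B are plain text files as shown in the question and sort/awk are available):

sort -n -u A B | awk '$1 < 6 { print > "X" }
                      $1 >= 6 && $1 < 11 { print > "Y" }
                      $1 >= 11 { print > "Z" }'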

Is This Answer Correct ?    9 Yes 3 No

Question { 5052 }

What is the difference between the Join and the Lookup? Explain with one example.


Answer

                     Join                                Lookup
Input names          left, right, intermediate           primary, secondary
Join operations      left, right, inner, full outer      left, inner
Inputs and outputs   n inputs (left/right/inner joins),  n inputs (normal lookup),
                     2 inputs (full outer join),         2 inputs (sparse lookup),
                     1 output                            1 output
Reject links         not available                       one
Sorted data          mandatory                           optional
Key column names     must match                          optional (can differ)
De-duplication       no problem                          warnings on the secondary input
Memory usage         light                               high
Is This Answer Correct ?    0 Yes 0 No

Question { TCS, 7757 }

Question
4)
source       target

c1           c1 c2 c3
c2           c4 c4 c5
c3           c6 c7
c4
c5
c6
c7

Single source and single target only, Subash.


Answer

Here is my answer, an alternative without the Pivot stage:

SRC --> TX1 --> TX2 --> RemDup --> TRG

From SRC, send the data to TX1 together with a row number column (Row_Num);
the data is then:
col1 Row_Num
c1 0
c2 1
c3 2
c4 3
c4 4
c5 5
c6 6
c7 7

Now in TX1 take a stage variable sv1 and, using the Row_Num field,
give the condition: If Row_Num >= 0 And Row_Num < 3 Then 1
Else If Row_Num >= 3 And Row_Num < 6 Then 2 Else 3
o/p:
col1 col2
c1 1
c2 1
c3 1
c4 2
c4 2
c5 2
c6 3
c7 3

(If you don't have Vertical Pivoting available, we can alternatively
use another Transformer and a Remove Duplicates stage to get
the result.)

Now at TX2 we take three stage variables sv1, sv2, sv3, all varchar,
and give the conditions to concatenate the input using col2.
Stage variable assignments:
col2 ---> sv1
If sv1 <> sv3 Then col2 Else sv2 : ',' : col2 ---> sv2
col2 ---> sv3

At this level you should get the output:

newcol1 newcol2
1 c1
1 c1,c2
1 c1,c2,c3
2 c4
2 c4,c4
2 c4,c4,c5
3 c6
3 c6,c7

Finally use Remove Duplicates, keyed on the group column, selecting
Duplicates to Retain = Last.
o/p:
1 c1,c2,c3
2 c4,c4,c5
3 c6,c7

(In case you don't want the first column, just drop it using a
Modify stage or a Copy stage.)
Yeah, I know it's a lengthy way; I did it out of curiosity.
Better to use the Pivot stage, a wonderful stage.

Is This Answer Correct ?    0 Yes 0 No

Question { TCS, 7757 }

Question
4) (Same question as above.)


Answer

A small correction to my answer above: the sv2 derivation should be
If sv1 <> sv3 Then col1 Else sv2 : ',' : col1 ---> sv2
and everything else is correct.
Is This Answer Correct ?    0 Yes 0 No

Question { TCS, 10689 }

I have a sequential file, and I don't want the 10th and 11th records (or any two records, like the 20th and 30th) in the output.


Answer

Yes, Mr. Ankit Gosian's answer is correct. But what if the record
numbers are 1, 4, 7, 9, 10, 11, 13 and so on?
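
In that case one option is an awk filter on the Sequential File stage (a sketch, assuming the record numbers to drop are listed one per line in a file called skip.txt; both file names are just for illustration):

awk 'NR==FNR { skip[$1]; next } !(FNR in skip)' skip.txt FileName

The first block reads the skip list into an array, and the second prints only the records whose line number is not in that list.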

Is This Answer Correct ?    0 Yes 0 No
