Input Data is:
Emp_Id, EmpInd
100, 0
100, 0
100, 0
101, 1
101, 1
102, 0
102, 0
102, 1
103, 1
103, 1
I want Output
100, 0
100, 0
100, 0
101, 1
101, 1
Means Indicator should either all ZEROs or all ONEs per
EmpId.
Impliment this using SQL and DataStage both.
Answers were Sorted based on User's Feedback
In DataStage:
SRC--->CPY---->JOIN----TFM---TGT
--------|---- /
--------|--- /
--------|-- /
--------|- /
--------AGG
In AGG, GROUP BY EmpId, calculate MIN and MAX for each
EmpId.
JOIN both one copy from CPY and 2nd Aggrigated copy from
AGG.
In TFM, put constraint: IF MIN=MAX, then populate to TGT
then u will get required output.
Is This Answer Correct ? | 11 Yes | 0 No |
Answer / subhash
1. SQL:
SELECT * FROM
( SELECT EmpId, COUNT(*) AS CNT1 FROM EMP GROUP BY
EmpId) E1,
( SELECT EmpId, COUNT(*) AS CNT2 FROM EMP GROUP BY
EmpId, EmpInd) E2,
WHERE E1.EmpID = E2.EmpId AND
E1.CNT1 = E2.CNT2;
2.DataStage:
SRC--->CPY---->JOIN----TFM---TGT
| /
| /
| /
| /
AGG
In AGG, GROUP BY EmpId, calculate CNT and SUM.
JOIN both one copy from CPY and 2nd Aggrigated copy from
AGG.
In TFM, put constraint: IF CNT=SUM, then populate to TGT
then u will get required output.
Is This Answer Correct ? | 5 Yes | 2 No |
Answer / akila ramu
DB--->Transformer--->Output File
Sample data propagation through these stages:
In table->DB stage--->Tfm----->outputfile
101 0---->100 0 2 2-->100 0
100 0---->101 0 2 1-->100 0
101 1---->101 1 2 1
100 0
DB: Use the bvelow query in this stage
select emp_id, ind, count(emp_id) c1, count(emp_id ind) c2
from table_name
group by emp_id, ind
order by emp_id, ind
So similar empid-ind are grouped and the count of each
empid-ind pair is also sent in a seperate column c2. The
count of each emp_id is sent in c1.
Tfm: Output link Contraint:c1=c2
Looping contraint: @ITERATION<=c2
Looping variables: l_empid=emp_id
l_ind=ind
Pass these two looping variables as the emp_id and the ind
to the output file.
Is This Answer Correct ? | 3 Yes | 1 No |
Answer / lb14447
The sql query would be
SELECT * FROM EMPTEST WHERE EMP_ID IN (SELECT EMP_ID FROM EMPTEST GROUP BY EMP_ID HAVING SUM(EMP_IND)/COUNT(EMP_IND) = 0
OR SUM(EMP_IND)/COUNT(EMP_IND) = 1);
Datastage implementation:
SRC --> CPY ----> AGG---> FILTER
- |
- |
- |
- |
- |
--------> Look up ----> TGT
In the Aggregator stage calculate the Sum and Count fields.In the filter stage bypass the unwanted records using Sum and Count calculated in Aggr stage.
Is This Answer Correct ? | 1 Yes | 0 No |
Answer / mahalakshmi
SRC--SRT---FLT---|
| LKP
CPY---| |
| |
|__FNL_|
|
TGT
1.In Sort Stage (Keys on Empid & Emp Ind), Create Key
changeColumn = True
2. In Filter WhereClause-(Keychange=0)OutputLink=0
WhereClause-(Keychange=1)OutputLink=1
3.In Lookup InnerJoin on EmpId
4.In Funnel, Funnel Type=Sorted Funnel
Is This Answer Correct ? | 1 Yes | 0 No |
Ok
1) sql:
select empno, rank() over(order by empno) Ind from e2;
then we get:
100 1
100 1
100 1
101 2
101 2
103 3
2) Datastage:
sequl file-->sort-->tx--> Target.
@sort:
create keychange column then we get
100 0
100 1
100 1
101 0
101 1
101 1
102 0
102 1
103 0
@ Tx use two stage variables:
sv1 integer = 0 value
sv2 integer = 0 value
and at derivations :
assign
1) keychange ---> sv1
2) if sv1=0 then sv2+1 else sv2 ----> sv2
map the columns empno and sv2
and we get the results.
Thats it.
shar
Is This Answer Correct ? | 1 Yes | 1 No |
WHERE YOU USE UNIX COMMANDS AS A ETL DEVELOPER?
how to run a sequential file stage in parallel if the stage is used on the TARGET side
what is the difference between == and eq in UNIX shell scripting?
How many input links can you give to a Transformer stage?
there are two schemas x and y are there. some data is in x schema. i want to use that in y schema..how can i use? please give some possibilities
What are the differences between datastage and informatica?
How one source columns or rows to be loaded in to two different tables?
What are the partition techniques available in your last project?
Hi this madan, in data stage one file in Empno 12345678910 in a table, i want target is Empno 1 2 3 4 5 6 7 8 9 10
Why do we use link partitioner and link collector in datastage?
how can find maximum salary by using Remove duplicate stage?
1.i have 5 jobs(1-5),i connect with each other,i want run from 3-5 only how? 2.how to schedual the job in datastage7.5 2? what is the deff bet grip and fgrep command? how do you cleanse the data in your project