How would you delete duplicate observations?
Answer / mohan reddy
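Use the NODUP (NODUPRECS) option in the PROC SORT statement. PROC SORT needs a BY statement, and NODUP only drops an observation that is identical to the one written before it, so sort by all variables.
Example:
PROC SORT DATA=EMP NODUP;
BY _ALL_;
RUN;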
The NODUPKEY option also deletes duplicate observations, but it compares only the values of the BY variable and keeps the first observation in each BY group.
Example:
PROC SORT DATA=EMP NODUPKEY;
BY ENO;
RUN;
Answer / vijay
NODUP: in PROC SORT, deletes duplicate observations (rows identical in every variable).
NODUPKEY: deletes observations with duplicate values of the key (BY) variables.
Answer / ananth
Use the NODUPKEY option in the PROC SORT statement.
Or use FIRST.byvariable or LAST.byvariable in a DATA step.
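A minimal sketch of the DATA step approach, assuming a data set EMP keyed by ENO as in the earlier answer. The sample rows, the NAME variable, and the output data set names EMP_UNIQUE and EMP_DUPS are made up for illustration.

```sas
/* Hypothetical sample data; EMP and ENO follow the names used earlier in the thread */
data emp;
  input eno name $;
  datalines;
101 Ravi
102 Anil
101 Ravi
103 Sita
102 Kiran
;
run;

/* BY-group processing requires the data to be sorted by the BY variable */
proc sort data=emp;
  by eno;
run;

/* FIRST.eno is 1 on the first observation of each ENO value, 0 otherwise */
data emp_unique emp_dups;
  set emp;
  by eno;
  if first.eno then output emp_unique;  /* keep one row per key */
  else output emp_dups;                 /* route the repeats to a second data set */
run;
```

Using LAST.eno instead would keep the last observation of each BY group. Unlike NODUPKEY, this approach also lets you keep the rejected rows for inspection.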
Answer / prr
In PROC SORT:
NODUPKEY: deletes duplicate observations based on the BY variable(s).
NODUPRECS: compares the complete observation and deletes duplicate observations.
NODUP: an alias for NODUPRECS; it tells SAS to delete duplicate observations and keep only the first one.
In a DATA step: FIRST. and LAST.
In PROC SQL: the DISTINCT keyword.
Order of clauses in a PROC SQL query:
1. SELECT (DISTINCT goes right after SELECT)
2. FROM
3. WHERE
4. GROUP BY
5. HAVING
6. ORDER BY
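The PROC SQL route mentioned above might look like the following sketch. The EMP data set, its variables, and the output table name EMP_DISTINCT are assumptions for illustration only.

```sas
/* Hypothetical sample data with one fully repeated row */
data emp;
  input eno name $;
  datalines;
101 Ravi
102 Anil
101 Ravi
103 Sita
;
run;

proc sql;
  /* DISTINCT applies to the whole select list, so with * it removes
     rows that are identical in every column */
  create table emp_distinct as
  select distinct *
  from emp
  order by eno;
quit;
```

To deduplicate on a key only, list just the key columns after SELECT DISTINCT; the other columns are then not carried into the result.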
Answer / ganesh
To eliminate duplicate observations from a data set, use the NODUP option in PROC SORT.
To eliminate observations with duplicate key values in the specified variables, use the NODUPKEY option in PROC SORT.
Answer / reddy
NODUP eliminates only successive (adjacent) duplicate observations.
NODUPKEY eliminates all observations with duplicate values of the BY variable(s), keeping the first one.
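The adjacency point can be checked with a small sketch like this one. The data set EMP, the variables ENO and DEPT, and the output names BY_KEY and BY_ALL are hypothetical.

```sas
/* Hypothetical data: rows 1 and 3 are identical but not adjacent */
data emp;
  input eno dept $;
  datalines;
1 A
1 B
1 A
;
run;

/* Sorting by ENO alone leaves the two identical rows separated by "1 B",
   so NODUP, which compares each row with the previous one written,
   should not remove either of them */
proc sort data=emp out=by_key nodup;
  by eno;
run;

/* Sorting by all variables brings identical rows together,
   so NODUP can remove the repeat */
proc sort data=emp out=by_all nodup;
  by _all_;
run;
```

Comparing the observation counts of BY_KEY and BY_ALL in the log shows whether the non-adjacent duplicate survived.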
Answer / thirumalesh.e.
We can delete duplicates with the PROC SORT options NODUPKEY, NODUPRECS and NODUPLICATES, together with the SORTDUP= system option; in a DATA step with FIRST. and LAST.; or in PROC SQL with CREATE TABLE ... AS SELECT UNIQUE * ...