What is the difference between nodup and nodupkey options?
Answer / asha
The NODUP option checks for and eliminates duplicate
observations, comparing all variables.
The NODUPKEY option checks for and eliminates observations
with duplicate BY variable values.
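A minimal sketch of the two options side by side. The data set dups and its variables id and val are made up for illustration; OUT= is used so that the input data set is not overwritten.

```sas
/* Hypothetical sample data: rows 1 and 2 are identical records;
   row 3 shares the key (id) with them but differs in val. */
data dups;
  input id $ val;
  datalines;
a 1
a 1
a 2
b 5
;
run;

/* NODUP (alias of NODUPRECS): drops an observation only when it is
   identical, across all variables, to the previous one written. */
proc sort data=dups out=no_dup nodup;
  by id;
run;

/* NODUPKEY: keeps the first observation for each value of the
   BY variable(s) and drops the rest. */
proc sort data=dups out=no_dupkey nodupkey;
  by id;
run;
```

With this data, no_dup should hold three observations (a 1, a 2, b 5) and no_dupkey two (a 1, b 5).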
Answer / stopby
Nodup: can only remove duplicates that are next to each other.
The BY variables are therefore very important when removing
duplicates in which all the variables have the same value.
Nodupkey: will remove duplicates when they have the
same values for the BY variables.
Answer / tariq sharjil
Nodup:
It deletes an observation if every variable in the
data set has the same value as in the preceding observation.
Nodupkey:
It deletes duplicate observations based on the sorting (BY)
variables. It retains the first observation and deletes all
others coming after it.
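To see which observations get deleted, the DUPOUT= option of PROC SORT can write them to a separate data set. A sketch, assuming a made-up data set dups with BY variable id (all data set names here are illustrative):

```sas
/* Hypothetical sample data. */
data dups;
  input id $ val;
  datalines;
a 1
a 1
a 2
b 5
;
run;

/* firsts receives the first observation per id;
   dropped receives the observations NODUPKEY removed. */
proc sort data=dups out=firsts dupout=dropped nodupkey;
  by id;
run;

proc print data=dropped;
run;
```

Here dropped should contain the second "a 1" row and the "a 2" row, since only the first observation with id=a is retained.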
Answer / majid
data test1;
input id1 $ id2 $ extra ;
cards;
aa ab 3
aa ab 1
aa ab 2
aa ab 3
;
proc sort nodup data=test1;
by id1 ;
run;
proc print data=test1;
run;
output will be like this:
Obs id1 id2 extra
1 aa ab 3
2 aa ab 1
3 aa ab 2
4 aa ab 3
* "nodup" is an alias for "noduprecs", which appears to
mean "no duplicate records", but there is no way SAS can
know about these duplicate records unless they happen to
land next to each other in sequence. It is a big mistake
to think sorting with "nodup" will always remove duplicate
records. Sometimes it will, sometimes it won't. The only way
you can be sure of removing duplicate records is to use "proc
sort nodupkey" and include enough key variables to be sure you
will lose the duplicates you want to lose. In the case
shown above, if we knew that records with the same "extra"
values were the duplicates we wanted to remove, then this
variable should be included in the list of sort variables, and
the duplicates will then be removed, as shown below (here with
"nodup", which works because sorting by all the variables
brings identical records next to each other).
;
proc sort nodup data=test1;
by id1 id2 extra;
run;
proc print data=test1;
run;
output will be like this:
Obs id1 id2 extra
1 aa ab 1
2 aa ab 2
3 aa ab 3
So, as you can see, nodup eliminates all duplicate observations
if you sort them by all variables, but nodupkey keeps only
the first observation for each value of the BY variables.
proc sort nodupkey data=test1;
by id1 ;
run;
options nocenter;
proc print data=test1;
run;
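Output, if this step is run on the original four-row test1 (the sorts above overwrite test1 in place, so re-create it first), will be like this: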
Obs id1 id2 extra
1 aa ab 3
Answer / pavan
Nodup: it deletes observations in which each and every
variable value is the same, irrespective of the sorting variable.
Nodupkey: it deletes observations based on the sorting
variable.
Answer / sas d
NODUP - removes the duplicates. Here the key to remove the
duplicates is the entire record.
NODUPKEY - removes the duplicates. Here the key is the
variable(s) specified by the BY statement.
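One way to make the entire record the key with NODUPKEY is to list every variable in the BY statement via the special name _ALL_. A sketch on a made-up data set recs (names are illustrative), where the two identical "a 1" rows are deliberately not adjacent in the input:

```sas
/* Hypothetical sample data: rows 1 and 3 are identical. */
data recs;
  input id $ val;
  datalines;
a 1
b 5
a 1
a 2
;
run;

/* Sorting by all variables brings identical records together,
   and NODUPKEY then keeps one observation per distinct record. */
proc sort data=recs out=uniq nodupkey;
  by _all_;
run;
```

With this data, uniq should hold three observations: a 1, a 2 and b 5.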
Answer / chiranjeevi
nodup:
When the nodup option is used with the proc sort procedure,
it checks for and eliminates duplicate records.
nodupkey:
NODUPKEY eliminates observations with duplicate key
(BY variable) values in the data set.
Answer / susheel
The nodup option in the sort procedure eliminates observations that are exactly the same across all variables.
The nodupkey option in the sort procedure eliminates observations that are exactly the same across the BY variables.
Answer / chandu
NODUPKEY :
It checks for matching BY variable values and deletes duplicate
observations in the data set based on those values.
NODUP :
It will be available with the latest version. It checks for and
deletes duplicate observations in the data set.
Any comments, please?
Answer / shalabh tyagi
Nodup: Checks for duplication across all the variables in a row,
keeps the first such row in the final output, and
deletes the rest.
Nodupkey: Checks for duplication among the variables
specified in the BY statement, keeps the first row for each
combination of values, and deletes the rest.
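The "keeps the first row" behaviour of NODUPKEY can also be written out by hand with a plain sort followed by a DATA step using FIRST. processing. A sketch on a made-up data set dups (all names are illustrative):

```sas
/* Hypothetical sample data. */
data dups;
  input id $ val;
  datalines;
a 1
a 1
a 2
b 5
;
run;

/* Plain sort, no duplicate removal. */
proc sort data=dups out=sorted;
  by id;
run;

/* Keep only the first observation in each BY group,
   which is what NODUPKEY does for BY id. */
data first_only;
  set sorted;
  by id;
  if first.id;
run;
```

With this data, first_only should contain a 1 and b 5. Changing the condition to last.id would keep the last row per key instead, which NODUPKEY itself does not offer.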