In the SAS DATA step, what is the difference between
subsetting done by WHERE and subsetting done by IF?
Answers are sorted based on user feedback.
Answer / k
SAS handles IF and WHERE conditions differently. A WHERE
condition is applied to the data before it enters the Program
Data Vector (PDV), whereas an IF condition is applied after the
observation has been read into the PDV. So, if you have
created a variable in the same DATA step:
WHERE: the condition cannot reference the created variable.
IF: the condition can reference the created variable.
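A minimal sketch of that difference, assuming the SASHELP.CLASS sample data set that ships with SAS; the output data set names and the height_cm variable are made up for illustration:

```sas
/* IF version: height_cm is created in this step and then tested. */
data tall_if;
    set sashelp.class;
    height_cm = height * 2.54;   /* new variable, exists only in the PDV */
    if height_cm > 160;          /* subsetting IF tests the PDV value */
run;

/* WHERE version: the condition must be written in terms of input     */
/* variables. Writing "where height_cm > 160;" here would fail,       */
/* because height_cm is not a variable in SASHELP.CLASS.              */
data tall_where;
    set sashelp.class;
    where height * 2.54 > 160;   /* evaluated before the PDV is loaded */
    height_cm = height * 2.54;
run;
```

Both steps should keep the same observations; only the place where the condition is evaluated differs.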
| Is This Answer Correct ? | 19 Yes | 0 No |
Answer / gangadhar
Apply the following rules when deciding which approach to
take when subsetting your data set in the DATA step. If none
of the special cases below applies to your subset condition,
the WHERE and IF statements should produce identical results.
In such cases, use the WHERE statement, since it is more
efficient. Note that having both WHERE and IF statements
within the same DATA step has a cumulative effect.
• The WHERE statement can be used when the condition
refers only to data set variables
• Use the IF statement when the condition refers to
automatic variables or new variables created within the DATA step
• Use the IF statement when the condition refers to
FIRST.variable or LAST.variable
• Use the IF statement when specifying the OBS= or
FIRSTOBS= data set options or the POINT= option of the SET statement
• In general, use the IF statement when merging data sets,
so that the subset condition is applied after the merge
• Use the WHERE statement when you want to take advantage of indexes
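Two of the rules above can be sketched as follows, again assuming the SASHELP.CLASS sample data set; the output data set names are made up for illustration:

```sas
/* FIRST.variable / LAST.variable exist only in the PDV, so they */
/* can be tested with IF but not with WHERE.                     */
proc sort data=sashelp.class out=class_sorted;
    by sex;
run;

data first_per_sex;
    set class_sorted;
    by sex;
    if first.sex;          /* keep the first observation of each BY group */
run;

/* OBS= interacts differently with the two statements. */
data where_obs;
    set sashelp.class(obs=5);
    where sex = 'F';       /* WHERE is applied first, then OBS= counts */
run;

data if_obs;
    set sashelp.class(obs=5);
    if sex = 'F';          /* observations 1-5 are read, then filtered */
run;
```

Because WHERE filters before OBS= counts, where_obs and if_obs can end up with different numbers of observations, which is why the rule suggests IF when you want the option applied to the raw observation numbers.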
| Is This Answer Correct ? | 6 Yes | 0 No |
Answer / s.s.suresh
The WHERE statement can only be used with variables in an
existing data set, whereas the IF statement can also be used
with raw data.
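A small sketch of the raw-data case; the data set name, variable names and values are invented for illustration:

```sas
/* Raw data read with INPUT: there is no input data set, so only */
/* a subsetting IF can be used here.                             */
data adults;
    input name $ age;
    if age >= 18;          /* tested on the values just read by INPUT */
    datalines;
Ann 25
Bob 12
Cid 40
;
run;
```

Replacing the IF with a WHERE statement in this step should produce an error, since there is no input data set for WHERE to be applied to.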
| Is This Answer Correct ? | 7 Yes | 1 No |
Answer / chowdary vamsi
WHERE: the condition is checked first, and only then is the observation checked for errors.
IF: the observation is checked for errors first, and then the condition is checked.
WHERE generally takes less processing time than IF,
because IF has to read every observation before testing it.
| Is This Answer Correct ? | 0 Yes | 0 No |
Answer / govardhan bandari
IF - works with new variables
WHERE - cannot work with new variables
IF - works after the PDV
WHERE - works before the PDV
IF - behind the scenes, processes every observation and outputs those that meet the condition
BUT
WHERE - processes only the observations that meet the condition
IF - cannot be used in procedures (except PROC REPORT with a COMPUTE block)
WHERE - works with procedures
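For the point about procedures, a minimal sketch using the SASHELP.CLASS sample data set (the choice of procedures and the condition are only an illustration):

```sas
/* WHERE statement inside procedure steps: only matching */
/* observations are passed to the procedure.             */
proc print data=sashelp.class;
    where age > 13;
run;

proc means data=sashelp.class;
    where sex = 'M';
    var height weight;
run;
```

A subsetting IF in the same position would not be accepted, since it is a DATA step statement.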
| Is This Answer Correct ? | 0 Yes | 0 No |
Answer / ramesh
Two WHERE conditions can be used at a time on one variable
Two IF conditions cannot be used at a time on one variable
In a WHERE condition, either <= or =< can be used
In an IF condition, only <= can be used.
WHERE is both a data set option and a statement
IF is only a statement
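The last point, WHERE as both a data set option and a statement, can be sketched like this, assuming the SASHELP.CLASS sample data set; the output data set names are hypothetical:

```sas
/* WHERE= as a data set option on the input data set */
data teens_option;
    set sashelp.class(where=(age >= 13));
run;

/* The same subset written with the WHERE statement */
data teens_statement;
    set sashelp.class;
    where age >= 13;
run;
```

Both steps should select the same observations; there is no corresponding IF= data set option.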
| Is This Answer Correct ? | 0 Yes | 10 No |