Conditional Statements

Often we want to execute a statement or a group of statements for just selected observations in our data set. In SAS this is accomplished with an IF-THEN statement.

If-Then

The basic syntax for IF-THEN is just

IF condition THEN statement;

where condition is an expression that can be interpreted as a logical value and statement is any executable SAS statement.

The statement is executed only if the condition is true.

Two examples where this is frequently used are searching for unusual values in your data and recoding.
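
As a minimal sketch of the first use, the step below writes a note to the SAS log for each observation whose number of cylinders is missing. The choice of sashelp.cars and of the variables printed is an assumption made for illustration; data _null_ is used so that no output data set is created.

data _null_;
  set sashelp.cars;
  /* write identifying variables to the log when cylinders is missing */
  if (cylinders eq .) then put "Missing cylinders: " make= model=;
run;

To execute more than one statement when the condition is true, the statements can be wrapped in a DO; ... END; group after THEN.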

If-Then Recodes

Recodes

IF-THEN recodes are often better handled as IF-THEN/ELSE. This is both more computationally efficient and less prone to logic errors.

Indicator Coding

One common approach to coding indicator variables is to initialize a variable with one value (typically 0), and use IF-THEN to indicate the other value.

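A problem with this approach is that people often forget to account for missing data: every observation starts at 0, so an observation with a missing value is coded 0 rather than missing.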

data cars;
  set sashelp.cars;
  eightcyl = 0;
  if (cylinders eq 8) then eightcyl = 1;
  run;

proc freq data=cars;
  tables cylinders*eightcyl / nocol nopercent missing;
run;
                            The FREQ Procedure

                      Table of Cylinders by eightcyl

                    Cylinders     eightcyl

                    Frequency|
                    Row Pct  |       0|       1|  Total
                    ---------+--------+--------+
                           . |      2 |      0 |      2
                             | 100.00 |   0.00 |
                    ---------+--------+--------+
                           3 |      1 |      0 |      1
                             | 100.00 |   0.00 |
                    ---------+--------+--------+
                           4 |    136 |      0 |    136
                             | 100.00 |   0.00 |
                    ---------+--------+--------+
                           5 |      7 |      0 |      7
                             | 100.00 |   0.00 |
                    ---------+--------+--------+
                           6 |    190 |      0 |    190
                             | 100.00 |   0.00 |
                    ---------+--------+--------+
                           8 |      0 |     87 |     87
                             |   0.00 | 100.00 |
                    ---------+--------+--------+
                          10 |      2 |      0 |      2
                             | 100.00 |   0.00 |
                    ---------+--------+--------+
                          12 |      3 |      0 |      3
                             | 100.00 |   0.00 |
                    ---------+--------+--------+
                    Total         341       87      428

Another approach here might be to explicitly code both values of the indicator.

data cars;
  set sashelp.cars;
  if (cylinders ne .) then eightcyl = 0;
  if (cylinders eq 8) then eightcyl = 1;
  run;

proc freq data=cars;
  tables cylinders*eightcyl / nocol nopercent missing;
run;
                            The FREQ Procedure

                      Table of Cylinders by eightcyl

               Cylinders     eightcyl

               Frequency|
               Row Pct  |       .|       0|       1|  Total
               ---------+--------+--------+--------+
                      . |      2 |      0 |      0 |      2
                        | 100.00 |   0.00 |   0.00 |
               ---------+--------+--------+--------+
                      3 |      0 |      1 |      0 |      1
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                      4 |      0 |    136 |      0 |    136
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                      5 |      0 |      7 |      0 |      7
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                      6 |      0 |    190 |      0 |    190
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                      8 |      0 |      0 |     87 |     87
                        |   0.00 |   0.00 | 100.00 |
               ---------+--------+--------+--------+
                     10 |      0 |      2 |      0 |      2
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                     12 |      0 |      3 |      0 |      3
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
               Total           2      339       87      428

This second approach gets us the desired result, but it evaluates both conditions for every observation. It would be more efficient if we add an ELSE statement.

If-Then/Else

An ELSE statement tells SAS what to do if the condition is not true, and must be the next statement after an IF-THEN. Indentation is not required, but helps us humans see the statements as a group.

The syntax is:

IF condition THEN statement1;
  ELSE statement2;

If the condition is true, then statement1 will execute. If it is not, statement2 will execute. Note that statement2 can also be an IF-THEN, which allows you to deal with many possibilities. Revising our recode example from above:

data cars;
  set sashelp.cars;
  if (cylinders eq 8) then eightcyl = 1;
    else if (cylinders ne .) then eightcyl = 0;
  run;

proc freq data=cars;
  tables cylinders*eightcyl / nocol nopercent missing;
run;
                            The FREQ Procedure

                      Table of Cylinders by eightcyl

               Cylinders     eightcyl

               Frequency|
               Row Pct  |       .|       0|       1|  Total
               ---------+--------+--------+--------+
                      . |      2 |      0 |      0 |      2
                        | 100.00 |   0.00 |   0.00 |
               ---------+--------+--------+--------+
                      3 |      0 |      1 |      0 |      1
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                      4 |      0 |    136 |      0 |    136
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                      5 |      0 |      7 |      0 |      7
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                      6 |      0 |    190 |      0 |    190
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                      8 |      0 |      0 |     87 |     87
                        |   0.00 |   0.00 | 100.00 |
               ---------+--------+--------+--------+
                     10 |      0 |      2 |      0 |      2
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
                     12 |      0 |      3 |      0 |      3
                        |   0.00 | 100.00 |   0.00 |
               ---------+--------+--------+--------+
               Total           2      339       87      428

In another example, suppose we wanted to recode vehicle weight (a continuous variable) into thousand-pound categories.

One approach might be to use a series of IF-THEN statements. Notice that this example assumes no missing data: SAS treats a missing numeric value as smaller than any number, so a missing weight would be coded 2. For each observation, five conditions are checked.

data cars;
    set sashelp.cars;
    if (weight lt 3000)                    then wgt = 2;
    if (weight ge 3000 and weight lt 4000) then wgt = 3;
    if (weight ge 4000 and weight lt 5000) then wgt = 4;
    if (weight ge 5000 and weight lt 6000) then wgt = 5;
    if (weight ge 6000)                    then wgt = 6;
    run;

A more efficient approach is:

data cars;
    set sashelp.cars;
    if weight lt 3000          then wgt = 2;
        else if weight lt 4000 then wgt = 3;
        else if weight lt 5000 then wgt = 4;
        else if weight lt 6000 then wgt = 5;
        else                        wgt = 6;
    run;
    
proc freq data=cars;
  tables wgt / nocum;
run;
                            The FREQ Procedure

                       wgt    Frequency     Percent
                       ----------------------------
                         2          87       20.33 
                         3         238       55.61 
                         4          81       18.93 
                         5          19        4.44 
                         6           3        0.70 

Here, only one condition is checked for 87 observations, and the majority of observations are handled with one or two checks. The code is also much more readable.
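
The savings can be tallied directly. The sketch below is an illustration, not part of the original example: it uses a sum statement to count, in a hypothetical variable named checks, how many conditions the IF-THEN/ELSE chain evaluates, and writes the total to the log after the last observation.

data _null_;
  set sashelp.cars end=last;
  /* add the number of conditions evaluated before a branch is taken */
  if weight lt 3000 then checks + 1;
    else if weight lt 4000 then checks + 2;
    else if weight lt 5000 then checks + 3;
    else checks + 4;  /* wgt = 5 and wgt = 6 both take four checks */
  if last then put checks=;
run;

Going by the frequency table above, this comes to 87(1) + 238(2) + 81(3) + 19(4) + 3(4) = 894 checks, compared with 5 x 428 = 2140 for the series of separate IF-THEN statements.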

Select/When/Otherwise/End

Like IF-THEN/ELSE, SELECT-WHEN stops processing subsequent statements once it encounters a true condition.

data cars;
  set sashelp.cars;
  select;
    when (weight lt 3000) wgt = 2;
    when (weight lt 4000) wgt = 3;
    when (weight lt 5000) wgt = 4;
    when (weight lt 6000) wgt = 5;
    otherwise wgt = 6;
    end;
run;

proc freq data=cars;
  tables wgt / nocum;
run;
                            The FREQ Procedure

                       wgt    Frequency     Percent
                       ----------------------------
                         2          87       20.33 
                         3         238       55.61 
                         4          81       18.93 
                         5          19        4.44 
                         6           3        0.70
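
SELECT also accepts an expression in parentheses, in which case each WHEN value is compared to that expression for equality. As a sketch of this form (an illustration under that assumption, not taken from the examples above), the indicator recode from earlier might be written as follows.

data cars;
  set sashelp.cars;
  select (cylinders);
    when (.)  eightcyl = .;   /* keep missing values missing */
    when (8)  eightcyl = 1;
    otherwise eightcyl = 0;
  end;
run;

An OTHERWISE statement is worth including: if no WHEN condition is true and there is no OTHERWISE, SAS stops the step with an error.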