Monday, July 31, 2017

(R) Conditionals

Today we will be discussing conditional statements within R. Conditionals are easy to understand and extremely powerful when implemented. As usual, I will first address the coding concept and then follow it with an example of the code in use.

If you have written code in other languages, you have most likely already encountered conditional statements.

Typically, in other languages, an IF statement would resemble something like:

if (condition is met) DOSOMETHING;

The exact structuring of the statement depends on the coding language.

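R has a standard if statement as well, but when working with vectors or data frame columns, conditional coding typically uses the vectorized ifelse() function, which resembles the following: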

ifelse(condition, if true do this, if false do this)
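To make the pattern above concrete, here is a minimal sketch. The vector name scores, the cutoff of 4, and the labels are illustrative assumptions and not part of the original example. It shows ifelse() evaluating every element of a vector, alongside a plain if/else that evaluates a single value.

```r
# Hypothetical scores, used only for illustration
scores <- c(1, 5, 3, 8)

# ifelse() tests every element and returns a vector of the same length
labels <- ifelse(scores > 4, "high", "low")
print(labels)

# A plain if/else evaluates one TRUE/FALSE value at a time
first_score <- scores[1]
if (first_score > 4) {
  message("first score is high")
} else {
  message("first score is low")
}
```

For column-wide flags such as the one built in the example below, ifelse() is usually the more convenient of the two, since it avoids writing an explicit loop.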

Example:

For this example, we will pretend that you are again using the iconic DataFrameA, and in this particular scenario, you want to create a flag variable in a new column, C.

# First we will create our sample data frame with the code below: #

A <- c(1,1,1,2,2,3,3)
B <- c(2,1,3,2,3,3,1)
DataFrameA <- data.frame(A, B)
DataFrameA

#########################################################

DataFrameA

A  B  C
1  2
1  1
1  3
2  2
2  3
3  3
3  1

The code that you will create compares column A with column B, row by row. Wherever the two values in a row match, an 'X' is written to column C; otherwise, column C receives a blank.

To achieve this, the following line of code can be utilized:

DataFrameA$C <- ifelse(DataFrameA$A == DataFrameA$B, 'X', ' ')

Additionally, if you wanted code that creates an 'X' value for a matching row and a 'Y' value for a non-matching row, the following code can be utilized:

DataFrameA$C <- ifelse(DataFrameA$A == DataFrameA$B, 'X', 'Y')

In the first example, the newly modified DataFrameA would resemble:

A  B  C
1  2
1  1  X
1  3
2  2  X
2  3
3  3  X
3  1

In the second example, the newly modified DataFrameA would resemble:

A  B  C
1  2  Y
1  1  X
1  3  Y
2  2  X
2  3  Y
3  3  X
3  1  Y
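Both results can be reproduced in a single self-contained run. In the sketch below, the column names C_blank and C_xy are hypothetical, chosen only so that the two variants can sit side by side in one data frame; the article itself writes both to column C.

```r
# Rebuild the sample data frame from the article
A <- c(1, 1, 1, 2, 2, 3, 3)
B <- c(2, 1, 3, 2, 3, 3, 1)
DataFrameA <- data.frame(A, B)

# First variant: 'X' on a match, a blank otherwise (hypothetical column name)
DataFrameA$C_blank <- ifelse(DataFrameA$A == DataFrameA$B, "X", " ")

# Second variant: 'X' on a match, 'Y' otherwise (hypothetical column name)
DataFrameA$C_xy <- ifelse(DataFrameA$A == DataFrameA$B, "X", "Y")

print(DataFrameA)
```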

A few quick notes on conditionals in R. Please note the use of '==' instead of '=' in the above listed examples. In R, '==' tests whether two values are equal, whereas '=' assigns a value (much like '<-'). Also, the '$' operator selects a column by name: DataFrameA$C refers to column C in DataFrameA, DataFrameA$A refers to column A, and DataFrameA$B refers to column B.
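As a small illustration of these notes, the sketch below looks at the comparison on its own, before ifelse() is applied. The variable names matches, n_matches, and match_rows are assumptions introduced for this illustration.

```r
# Rebuild the sample data frame from the article
A <- c(1, 1, 1, 2, 2, 3, 3)
B <- c(2, 1, 3, 2, 3, 3, 1)
DataFrameA <- data.frame(A, B)

# '==' compares the two columns element by element,
# returning one TRUE/FALSE value per row
matches <- DataFrameA$A == DataFrameA$B
print(matches)

# sum() counts the TRUE values; which() returns their row positions
n_matches <- sum(matches)
match_rows <- which(matches)
print(n_matches)
print(match_rows)
```

This logical vector is what ifelse() receives as its condition: each TRUE becomes the second argument and each FALSE becomes the third.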

These examples are simple, but the applications for this concept are endless. In the next article, we will be discussing some of the similarities between R and SAS, and how to achieve similar functionality in R as it pertains to SAS.
