# 2 Data Types

Data values in R come in several different types. We can begin by considering three fundamental types of data (later we’ll add more):

• numeric values (5, 3.14)
• character values ("abc", "Wisconsin")
• logical values (TRUE, FALSE)
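As a quick check, the three types listed above can be inspected with `class()`. This is a minimal sketch; the variable names (`num_type`, `chr_type`, `lgl_type`) are only illustrative and do not appear elsewhere in this chapter.

```r
# class() reports the type of a value
num_type <- class(3.14)         # a numeric value
chr_type <- class("Wisconsin")  # a character value
lgl_type <- class(TRUE)         # a logical value

print(num_type)
print(chr_type)
print(lgl_type)
```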

The distinction is fundamental because it is common for operators (+, &) and functions to only work with specific types of data. When you are creating or debugging an R script, getting the data type right will be a common theme.

As a very simple example, we can add numbers, but not character values.

5 + 3.14
[1] 8.14
"abc" + "Wisconsin"
Error in "abc" + "Wisconsin": non-numeric argument to binary operator

Similarly, we can use the “and” operator (&) with logical values, but not character values.

TRUE & FALSE
[1] FALSE
"abc" & "Wisconsin"
Error in "abc" & "Wisconsin": operations are possible only for numeric, logical or complex types
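Because operators fail on the wrong type, one possible habit is to check the type before applying an operator. The sketch below assumes a hypothetical helper, `add_if_numeric()`, which is not part of base R and is defined here only for illustration. It also uses `tryCatch()` to show how a script might recover from a type error instead of halting.

```r
# Hypothetical helper (not a base R function): add two values
# only when both are numeric, otherwise stop with a clear message.
add_if_numeric <- function(a, b) {
  if (!is.numeric(a) || !is.numeric(b)) {
    stop("both arguments must be numeric; got ",
         class(a)[1], " and ", class(b)[1])
  }
  a + b
}

total <- add_if_numeric(5, 3.14)
print(total)

# tryCatch() intercepts the error from "+" and returns NA instead
result <- tryCatch("abc" + "Wisconsin", error = function(e) NA)
print(result)
```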

## 2.1 Dynamic Typing

In R, the type of a data object can be changed at any point: types are dynamic or mutable. We call the process of changing the data type coercion. Coercion may occur in many different contexts.
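One context where coercion occurs is combining values of different types with `c()`. The following is a minimal sketch; the names `mixed` and `flags` are illustrative only.

```r
# Combining a numeric and a character value: both become character
mixed <- c(1.5, "a")
print(mixed)
print(class(mixed))

# Combining a logical and a character value: both become character
flags <- c(TRUE, "a")
print(flags)
print(class(flags))
```

In both cases the result is a character vector, because a vector holds values of a single type.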

### 2.1.1 Replacing Values in a Vector

Suppose we have a numeric vector x, and we replace the first element of x with a character value. Then all the values in x are coerced to the character type.

x <- sample(1:5, 5)
x
[1] 2 1 4 3 5
x[4] + x[5] # add the last two elements of x
[1] 8
x[1] <- "abc" # replace the first value
x
[1] "abc" "1"   "4"   "3"   "5"
x[4] + x[5]  # now add the last two elements of x again
Error in x[4] + x[5]: non-numeric argument to binary operator

Notice that R gives no message of any kind when the type of x changes. Data coercion is a routine part of R processing. This is great when it works well, but the cause can be difficult to track down when something later breaks.

You can tell that x has become a character vector both by the quotes around the printed values, and by the error message when we try to add two elements.

We also have a variety of functions that test or report on the type of a data object. See help(is.numeric).

is.numeric(x)
[1] FALSE
mode(x)
[1] "character"
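Since coercion happens silently, a type check placed right after an assignment can catch the change early. The sketch below uses fixed values rather than `sample()` so the result is reproducible; the names `type_before` and `type_after` are illustrative only.

```r
x <- c(2, 1, 4, 3, 5)      # fixed numeric values (an assumption, for reproducibility)
type_before <- class(x)

x[1] <- "abc"              # silently coerces every element to character
type_after <- class(x)

print(c(before = type_before, after = type_after))

# A check like this reports the change as soon as it happens
if (!is.numeric(x)) {
  message("x is no longer numeric; it is ", class(x))
}
```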

## 2.2 Exercises

1. We have seen a numeric-to-character coercion in Replacing Values in a Vector. What happens when we try to go the other way, from character to numeric with as.numeric()? Try out

• an integer coercion: as.numeric("8"). The quotes make the initial value a character type, which you can check with is.character("8").
• a decimal coercion: "2.7".
• a negative number: "-1".
• a number with extra white space around it: " 2.7 ".
• a number written with a comma: "5,432".
• a fraction: "2/3".
• a number with a currency symbol: "$24.99".
• a non-numeric character: "B".

In what cases can R successfully parse quoted numeric values?

2. Logical-to-numeric coercion (as.numeric()):

• TRUE
• FALSE
• NA

3. Numeric-to-logical coercion (as.logical()):

• 1
• 2
• 2.14
• -2.14
• 0

What conclusions do you draw? What is the difference when we move from logical to numeric, versus numeric to logical?

4. Character-to-logical coercion:

• Quoted logical values: "TRUE", "FALSE", and "NA".
• Abbreviated quoted logical values: "F".
• Lowercase quoted logical values: "true".
• Mixed case quoted logical values: "FAlse".
• Quoted numeric values: "1".
• Other character values: "green".