# 3 Data Structures

The most fundamental data structure in R is the vector, an ordered collection of atomic elements (individual data values), all of the same type. A very simple example is a sequence of integers:

x <- 0:5
x
[1] 0 1 2 3 4 5
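As a minimal sketch of what "all of the same type" means in practice, the following checks the type and length of the vector above, then shows how c() converts mixed inputs to one common type. The name `mixed` is chosen here only for illustration.

```r
x <- 0:5
typeof(x)   # "integer"
length(x)   # 6

# Combining values of different types yields a single common type;
# here the integer and the double are both converted to character
mixed <- c(1L, 2.5, "a")
typeof(mixed)   # "character"
mixed           # "1" "2.5" "a"
```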

The most complex data structure in R is the list, an ordered collection of arbitrary data objects, where the individual data objects may themselves be of different types and structures.

A simple example is a list composed of two differing vectors:

x <- 1:5
y <- c("a", "b")
z <- list(x, y)
z
[[1]]
[1] 1 2 3 4 5

[[2]]
[1] "a" "b"
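The double brackets in the printed output are also how list elements are extracted. The sketch below rebuilds the same list and contrasts `[[ ]]`, which returns the element itself, with `[ ]`, which returns a shorter list. The names `first`, `sub`, and `zNamed` and the element names `numbers` and `chars` are assumptions made for illustration.

```r
x <- 1:5
y <- c("a", "b")
z <- list(x, y)

# Double brackets return the stored object itself
first <- z[[1]]
typeof(first)   # "integer"

# Single brackets return a list containing the selected elements
sub <- z[1]
typeof(sub)     # "list"

# List elements may also be named and then extracted with $
zNamed <- list(numbers = x, chars = y)
zNamed$chars    # "a" "b"
```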

Between vectors and lists in complexity lie matrices and dataframes.

Most data wrangling - preparing data for analysis - will involve vectors and dataframes. Most statistical modeling will begin with a dataframe and return results in the form of a list.
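The path from dataframe to list can be sketched with a simple linear model fitted to the built-in mtcars dataframe. The object name `fit` is an assumption chosen for illustration.

```r
# Start from a dataframe (mtcars is available in a default R session)
fit <- lm(mpg ~ wt, data = mtcars)

# The fitted model is a list with class "lm"
is.list(fit)   # TRUE
class(fit)     # "lm"

# Its components are reached like any other list elements
names(fit)
fit$coefficients
```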

## 3.1 Four Basic Structures

• Vectors
  • All elements of one type
  • Scalars (individual values) are vectors of length one
  • Example: myVector <- 1:4
• Matrices and Arrays
  • All elements of one type
  • Two or more dimensions
  • Example: myMatrix <- matrix(1:8, ncol = 2)
  • Example: myArray <- array(1:16, dim = c(4, 2, 2))
• Dataframes
  • A collection of vectors, all of the same length
  • Columns (vectors) may be of different types
  • Always has column names and row names
  • Dataframes whose columns are all of one type may be used as matrices
  • Example: myDataframe <- mtcars
• Lists
  • An ordered collection of arbitrary data structures
  • May or may not have names
  • Note that a dataframe is a special kind of list
  • Example: myList <- lm(mpg ~ wt, data = mtcars)
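A few of the properties listed above can be checked directly. The following is a minimal sketch using the example objects from the list; the name `m` is an assumption introduced for illustration.

```r
myVector <- 1:4
myMatrix <- matrix(1:8, ncol = 2)
myArray <- array(1:16, dim = c(4, 2, 2))
myDataframe <- mtcars

# A scalar is a vector of length one
length(5)             # 1

# Matrices and arrays carry dimensions; plain vectors do not
dim(myVector)         # NULL
dim(myMatrix)         # 4 2
dim(myArray)          # 4 2 2

# A dataframe is stored as a list of equal-length columns
is.list(myDataframe)  # TRUE
length(myDataframe)   # 11, the number of columns

# Every column of mtcars is numeric, so it converts to a numeric matrix
m <- as.matrix(myDataframe)
is.numeric(m)         # TRUE
```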

We can query the structure of any data object with the str() function.

str(myMatrix)
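As a small sketch of how str() behaves on other kinds of objects, the example below applies it to a numeric vector and to a named list. The objects `scores` and `record` are hypothetical and exist only for this illustration.

```r
# Hypothetical objects, chosen only for illustration
scores <- c(90.5, 82, 77.25)
record <- list(id = 7L, tags = c("a", "b"))

# One line for a vector: type, index range, and leading values
str(scores)

# For a list: a header line, then one line per element
str(record)

# str() prints its description rather than returning it;
# capture.output() collects the printed lines as a character vector
desc <- capture.output(str(record))
length(desc)   # 3
```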

## 3.2 Exercises

• Create each of the objects given as examples below. Use the str() function with each. What does R print to describe each structure?

• Example: myVector <- 1:4
• Example: myMatrix <- matrix(1:8, ncol = 2)
• Example: myArray <- array(1:16, dim = c(4, 2, 2))
• Example: myDataframe <- mtcars
• Example: myList <- lm(mpg ~ wt, data = mtcars)