1. Getting started summary

Animated gif of the R logo with magenta and red hearts moving upward in a loop to the left of the "R."
Figure 1: Some pretty R from Allison Horst.

Links to Summary. Chatbot tutor. Questions. Glossary. R functions. R packages. Additional resources.

Chapter summary

R is much more than a simple calculator: it can keep track of variables, and it has functions to make plots, summarize data, and build statistical models. R also has many packages that extend its capabilities. Now that we are familiar with R, RStudio, vectors, functions, data types, and packages, we are ready to build our R skills even further to work with data!

Chatbot tutor

Please interact with the custom chatbot (link here) that I made to help you with this chapter. I suggest at least ten back-and-forth exchanges to ramp up, then stopping once you feel you have what you needed from it.

Practice Questions

Animated gif with pastel lines in the background. The words "CODE HERO" in bold black text scroll across repeatedly.
Figure 2: Some encouragement from Allison Horst.

The interactive R environment below allows you to work without switching tabs.

Q1) Entering "p"^2 into R produces which error?
Q2) Which logical question provides an unexpected answer?

This is a floating-point precision issue. R (like most programming languages) stores numbers in binary, and some decimal values, such as 0.1, cannot be represented exactly in binary. To see this, try (0.2 + 0.1) - 0.3:

(0.2 + 0.1) - 0.3
[1] 5.551115e-17

If you are worried about floating-point errors, use the all.equal() function instead of ==, or round to 10 decimal places before asking logical questions.
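As a minimal sketch of these two workarounds (the variable names are illustrative, not from the chapter), the strict comparison fails while the tolerant and rounded comparisons succeed. Wrapping all.equal() in isTRUE() is a common precaution, because all.equal() returns a description of the difference, rather than FALSE, when the values differ.

```r
x <- 0.2 + 0.1                                    # illustrative name

exact_match   <- x == 0.3                         # strict comparison
near_match    <- isTRUE(all.equal(x, 0.3))        # comparison with a small tolerance
rounded_match <- round(x, 10) == round(0.3, 10)   # compare after rounding

exact_match     # FALSE
near_match      # TRUE
rounded_match   # TRUE
```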

Q3) R has a built-in dataset called iris. You can look at it or give it to functions by typing iris. Which variable type is the Species in the iris dataset?
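One way to explore a built-in dataset like iris is sketched below. It assumes nothing beyond base R; the $ operator, which pulls out a single column, may not have been introduced yet and is included here only as a convenience.

```r
head(iris)            # first six rows of the dataset
str(iris)             # compact overview: one line per column, with its type
class(iris$Species)   # type of a single column, extracted with $
```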

For the following questions consider the diabetes dataset available at: https://rb.gy/fan785

Q4) Which variable in the diabetes dataset is a character but should be a number?

Q5) True or False: The numeric variable, bp.1d, is a double, but could be changed to an integer without changing any of our analyses.

Q6) Which categorical variable in the dataset is ordinal?
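For thinking about doubles versus integers (as in Q5), here is a small sketch using made-up values rather than the real diabetes data. The vector bp and its contents are hypothetical. Note that as.integer() drops anything after the decimal point, so this kind of conversion is only harmless when every value is already a whole number.

```r
# Hypothetical whole-number measurements (NOT the real diabetes data)
bp <- c(59, 68, 92, 50)

typeof(bp)                   # "double": plain numbers are stored as doubles by default
bp_int <- as.integer(bp)     # convert to integer storage
typeof(bp_int)               # "integer"

all(bp == bp_int)            # TRUE: no value changed
mean(bp) == mean(bp_int)     # TRUE: the summary is unchanged
```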

Q7) You collected five leaves of the wild grape (Vitis riparia) and measured their length and width. You have a table of lengths and widths of each leaf and a formula for grape leaf area (below).

The area of a grape leaf is:

\[\text{leaf area} = 0.851 \times \text{leaf length} \times \text{leaf width}\]

The data are below; each column is a leaf:

length 5.0 6.1 5.8 4.9 6.0
width 3.2 3.0 4.1 2.9 4.5

The mean leaf area is

  • First, make vectors for length and width:

      • length = c(5, 6.1, 5.8, 4.9, 6)

      • width = c(3.2, 3, 4.1, 2.9, 4.5)

  • Then multiply these vectors by each other and by 0.851.

  • Finally, find the mean.

# Create length and width vectors
length <- c(5, 6.1, 5.8, 4.9, 6)
width <- c(3.2, 3, 4.1, 2.9, 4.5)
leaf_areas <- 0.851 * length * width # find area
mean(leaf_areas)                     # find mean
[1] 16.89916
# or in one step:
(0.851 * length * width) |>
  mean()
[1] 16.89916

Glossary of Terms

  • R: A programming language designed for statistical computing and data analysis.

  • RStudio: An Integrated Development Environment (IDE) that makes using R more user-friendly.

  • Vector: An ordered sequence of values of the same data type in R.

  • Assignment Operator (<-): Used to store a value in a variable.

  • Logical Operator: A symbol used to compare values and return TRUE or FALSE (e.g., ==, !=, >, <).

  • Numeric Variable: A variable that represents numbers, either as whole numbers (integers) or decimals (doubles).

  • Character Variable: A variable that stores text (e.g., "Clarkia xantiana").

  • Package: A collection of R functions and data sets that extend R’s capabilities.
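Several of these terms can be seen together in a short sketch. The variable names and numbers below are made up for illustration.

```r
species <- "Clarkia xantiana"        # character variable, stored with <-
petal_lengths <- c(1.4, 4.7, 6.0)    # numeric vector (made-up values)

class(species)                       # "character"
class(petal_lengths)                 # "numeric"
petal_lengths > 2                    # FALSE TRUE TRUE: applied to each element
species == "Clarkia xantiana"        # TRUE
```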


New R functions

  • c(): Combines values into a vector.

  • install.packages(): Installs an R package.

  • library(): Loads an installed R package for use.

  • log(): Computes the logarithm of a number, with an optional base.

  • mean(): Calculates the average (mean) of a numeric vector.

  • read_csv() (readr): Reads a CSV file into R as a data frame.

  • round(): Rounds a number to a specified number of decimal places.

  • sqrt(): Finds the square root of a number.

  • View(): Opens a data frame in a spreadsheet-style viewer.
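A quick sketch of the functions above that run without installing anything (the numbers are made up; install.packages(), library(), read_csv(), and View() are left out because they depend on your setup):

```r
values <- c(4, 9, 16)        # c() combines values into a vector

sqrt(values)                 # 2 3 4
mean(values)                 # 9.666667
round(mean(values), 2)       # 9.67
log(100, base = 10)          # 2
log(8, base = 2)             # 3
```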


R Packages Introduced

  • base: The core R package that provides fundamental functions like c(), log(), sqrt(), and round().

  • readr: A tidyverse package for reading rectangular data files (e.g., read_csv()).

  • dplyr: A tidyverse package for data manipulation, including mutate(), glimpse(), and across().

  • conflicted: Helps resolve function name conflicts when multiple packages have functions with the same name.

Additional resources

These optional resources reinforce or go beyond what we have learned.

R Recipes:

Videos: