The only prerequisite for the R training is to install the latest version of R and RStudio on your computer and the packages we're using for the training. These are available online and are free to download and install. We'll talk about the difference between R and RStudio on the first day, but for now, just make sure they're installed.
NOTE: If you already have R and RStudio on your computer, the R version should be at least 4.4.3 and the RStudio version at least 2025.06.2-418 to make sure everyone's code behaves the same way. You can check your software versions by opening RStudio. The console in the bottom left will display the R version. Check the RStudio version by going to Help > About RStudio. The first 4 numbers should be 2025 or higher.
A number of packages are required to follow along with the data wrangling and visualization sessions. Please try to install these in RStudio ahead of time by running the code below. If you don't know how to run the code, view the Running Code Screencast below for how to do this.
packages <- c("tidyverse", # for Day 2 and 3 data wrangling
"RColorBrewer", "viridis", "patchwork", # for Day 3 ggplot
"readxl", "writexl", # for day 1 importing from excel
"car") # for Levene's test - also a great stats R package
install.packages(setdiff(packages, rownames(installed.packages())))
# Check that installation worked
library(tidyverse) # turns on core tidyverse packages
library(RColorBrewer) # palette generator
library(viridis) # more palettes
library(patchwork) # multipanel plots
library(readxl) # reading xlsx
library(writexl) # writing xlsx
When you run library(tidyverse), you will get a message in your console listing the attached packages and any conflicts. This means the tidyverse successfully loaded on your machine and is not an error.
We'll download the training datasets as we run through the code. This check makes sure your computer doesn't have security restrictions that prevent importing datasets from GitHub into R. All datasets used in this training will also be available via a zip file. Hosting the datasets on GitHub just makes it easier to ensure everyone downloads the same version from the same place.
Copy the code below into RStudio and run it. If you don't get an error message, you're good.
If you do get an error message, you'll just need to import the datasets locally using the provided zip file.
Goals for Day 1:
Feedback: Please leave feedback in the training feedback
form. You can submit feedback multiple times and don't need to answer
every question. Responses are anonymous.
R is a programming language that was originally developed by statisticians for statistical computing and graphics. R is free and open source. That means you will never need a paid license to use it, and you can view the underlying source code of any function and suggest fixes and improvements. Since its first official release in 1995, R remains one of the leading programming languages for statistics and data visualization, and its capabilities continue to grow.
When you install R, it comes with a simple user interface that lets you write and execute code. However, writing code in this interface is similar to writing a report in Notepad: it's simple and straightforward, but you likely need more features than Notepad has to format your document. This is where RStudio comes in.
For more information on the history of R, visit the R Project website.
This is primarily where you write code. When you create a new script or open an existing one, it displays here. In the screenshot above, there's a script called bat_data_wrangling.R open in the source pane. Note that if you haven't yet opened or created a new script, you won't see this pane until you do.
The source pane color-codes your code to make it easier to read, and detects syntax errors (the coding equivalent of a spell checker) by flagging the line number with a red "x" and showing a squiggly line under the offending code.
When you're ready to run all or part of your script:
This is where the code actually runs. When you first open RStudio, the console will tell you the version of R that you're running (should be R 4.4.1 or greater).
While most often you'll run code from a script in the source pane, you can also run code directly in the console. Code in the console won't get saved to a file, but it's a great way to experiment and test out lines of code before adding them to your script in the source pane. The console is also where errors appear if your code breaks. Deciphering errors can be a challenge that gets easier over time. Googling errors is a good place to start.
File organization is an important part of being a good coder. Keeping code, input data, and results together in one place will protect your sanity and the sanity of the person who inherits the project. RStudio projects help with this. Creating a new RStudio project for each new code project makes it easier to manage settings and file paths.
Before we create a project, take a look at the Console tab.
Notice that at the top of the console there is a folder path. That path
is your current working directory.
If you refer to a file in R using a relative path, for example
./data/my_data_file.csv, R will look in your current
working directory for a folder called data containing a
file called my_data_file.csv.
Note the use of forward slashes instead of backslashes for file paths. In R, you can use either a forward slash (/) or a double backslash (\\) in file paths. The paths below are equivalent, and each is the full file path that the relative path above specifies.
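As a small sketch of that equivalence: the full paths here are hypothetical placeholders (the user name and project folder are made up, not taken from this training). Converting the double-backslash form to forward slashes gives exactly the same text as the forward-slash form.

```r
# Hypothetical full paths; the user name and project folder are placeholders
path_forward <- "C:/Users/your_name/Documents/R/my_project/data/my_data_file.csv"
path_double  <- "C:\\Users\\your_name\\Documents\\R\\my_project\\data\\my_data_file.csv"

# Replace each backslash with a forward slash so the two forms can be compared
path_double_fwd <- gsub("\\", "/", path_double, fixed = TRUE)
identical(path_forward, path_double_fwd)

# file.path() builds a path from its pieces using forward slashes
file.path(".", "data", "my_data_file.csv")
```

A single backslash inside quotes is treated by R as the start of an escape sequence, which is why a literal backslash has to be doubled.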
Using relative paths is helpful because the full path will be specific to your computer and likely won't work on a different computer. But there's no guarantee that everyone has the same default R working directory. This is where projects come in. Projects package all of your code, data, output, etc. into a file type that is easily transferable to other machines regardless of file location.
When you create a new project, name it mma_r_intro. Next, you'll select what folder to keep your project folder in. Documents/R is a good place to store all of your R projects but it's up to you. When you are done, click on Create Project.
If you successfully started a project named
mma_r_intro, you should see it listed at the very top right
of your screen. As you start new projects, you'll want to check that
you're working in the right one before you start coding. Take a look at
the Console tab again. Notice that your current working directory
is now your project folder. When you look in the Files tab of the
bottom right pane, you'll see that it also defaults to the project
folder.
You can also check the contents of your project folder with the list.files() function, which lists everything in the working directory of your project.
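A minimal sketch of checking where R is looking and what it can see there. Both functions are part of base R; the "data" subfolder in the last line is only an example and may not exist yet in your project.

```r
# Print the current working directory (the project folder when a project is open)
getwd()

# List the files and folders in the working directory
list.files()

# List the contents of a subfolder; returns an empty result if it doesn't exist
list.files("data")
```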
We're going to store all of our datasets for this training in a data folder. First, create the data folder in your project using the code below.
Create data folder
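One possible way to create the folder, shown as a sketch rather than the exact code used in the training. It assumes your working directory is the project folder, and it checks for an existing folder first so that rerunning the script doesn't produce a warning.

```r
# Create a "data" folder in the working directory only if it doesn't exist yet
if (!dir.exists("data")) {
  dir.create("data")
}

# Confirm the folder is there
dir.exists("data")
```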
Download files from github repo
We're going to try downloading all of the datasets you're going to use for the training into this data folder. Copy the code in the chunk below and paste it into a new script (File > New File > R Script). Then select and run the code.
file_list <- c(
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/ACAD_Jordan_Pond_water_chem.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/BASHAR_motile_invert_counts.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/BASHAR_Point_Intercept_data.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/bat_site_info.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/bat_captures.csv",
"https://raw.githubusercontent.com/KateMMiller/IMD_R_Training_2026/refs/heads/main/data/HOBO_temp_example.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/motile_invert_species_table.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/SHIHAR_photoplot_cover.csv")
file_names <- sub(".*data/", "", file_list)
lapply(seq_along(file_list), function(x){
download.file(file_list[x],
destfile = paste0("./data/", file_names[x]))
})
If you're not able to download, extract the MMI_R_training_data.zip file into this new data folder.
Next, create a new script called day_1_script.R. Make sure you are working in the mma_r_intro project that you just created. Click on the New File icon, choose R Script, and save the new script as day_1_script.R.
We'll start with something simple. Basic math in R is pretty straightforward and the syntax is similar to simply using a graphing calculator. You can use the examples below or come up with your own. Even if you're using the examples, try to actually type the code instead of copy-pasting - you’ll learn to code faster that way.
To run a single line of code in your script, place your cursor anywhere in that line and press CTRL+ENTER (or click the Run button in the top right of the script pane). To run multiple lines of code, highlight the lines you want to run and hit CTRL+ENTER or click Run.
To leave notes in your script, use the hashtag/pound sign (#). Anything after the # on that line changes color, which shows that R will read it as a comment and won't run it. Commenting your code is one of the best habits you can form. Comments are a gift to your future self and anyone else who tries to use your code.
Type code below in your script and run each line.
# Commented text: try this line to generate some basic text and become familiar with where results will appear:
print("Welcome to R!")
## [1] "Welcome to R!"
## [1] 2
## [1] 1.5
## [1] 3
## [1] 669.6619
## [1] -1
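A few basic operations, as a sketch. These are illustrative and are not necessarily the examples that produced the output above; try changing the numbers and rerunning each line.

```r
1 + 1        # addition
3 / 2        # division
2^3          # exponent
sqrt(9)      # square root
(4 + 6) * 2  # parentheses control the order of operations
10 %% 3      # remainder after division (modulo)
```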
Coding Tip: Notice that when you run a line of code, the code and the result appear in the console. You can also type code directly into the console, but it won't be saved anywhere. As you get more comfortable with R, it can be helpful to use the console as a "scratchpad" for experimenting and troubleshooting. For now, it's best to err on the side of saving your code as a script so that you don't accidentally lose useful work.
Occasionally, it's enough to just run a line of code and display the result in the console. But typically our code is more complex than adding one plus one, and we want to store the result and use it later in the script. This is where variables come in. Variables allow you to assign a value (whether that's a number, a data table, a chunk of text, or any other type of data that R can handle) to a short, human-readable name. Anywhere you put a variable in your code, R will replace it with its value when your code runs. Variables are also called objects in R.
R uses the <- symbol for variable assignment. If
you've used other programming languages, you may be tempted to use
= instead. It will work, but there are subtle differences
between <- and =, so you should get in the
habit of using <-.
R is case-sensitive. So if you name one object fishdata
and another Fishdata or FISHDATA, R will
interpret these all as unique objects. While you can do things like
this, it's best practice not to use the same name for different objects,
as it makes code difficult to follow.
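A short sketch of case sensitivity in practice. The values assigned here are arbitrary; the point is that the three names are stored as three separate objects, and a capitalization that was never assigned does not exist.

```r
fishdata <- 10
Fishdata <- 20
FISHDATA <- 30

# Each name refers to a separate object
fishdata + Fishdata + FISHDATA

# exists() checks whether an object with a given name has been created
exists("fishData")  # this capitalization was never assigned
```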
Type code below to assign values to variables named a and b
# the value of 12.098 is assigned to variable 'a'
a <- 12.098
# and the value 65.3475 is assigned to variable 'b'
b <- 65.3475
# we can now perform whatever mathematical operations we want using these two
# variables without having to repeatedly type out the actual numbers:
a*b
## [1] 790.5741
## [1] 7.305156e+68
## [1] 538.7261
In the code above, we assign the variables a and
b once. We can then reuse them as often as we want. This is
helpful because we save ourselves some typing, reduce the chances of
making a typo somewhere, and if we need to change the value of
a or b, we only have to do it in one
place.
Also notice that when you assign variables, you can see them listed in your Environment tab (top right pane). Remember, everything you see in the environment is just in R's temporary memory and won't be saved when you close out of RStudio.
All of the examples you've seen so far are fairly contrived for the sake of simplicity. Let's take a look at some code that everyone here will make use of at some point: reading data from a CSV.
It's hard to get very far in R without making use of functions. Think of a function as a programmed task that takes some kind of input (the argument(s)) from the user and outputs a result (the return value).
Coding Tip: Note the difference in how RStudio color codes what it thinks are functions. There are a lot of pre-programmed functions in base R, which is what comes along with R when you install R. Installing R packages will add additional functions. You can also build your own. Names that R recognizes as a function are color coded differently than what R recognizes as text, numbers, etc. It's good practice to not use existing functions as new object names.
Commonly used base R functions include:
mean(): calculate the mean of a set of numbers
min(): calculate the minimum of a set of numbers
max(): calculate the maximum of a set of numbers
range(): calculate the min and max of a set of numbers
sd(): calculate the standard deviation of a set of numbers
sqrt(): calculate the square root of a value
Calculate mean and range to see how functions work
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
# equivalent to x <- 1:10
# bad coding: this reuses the name of the built-in mean() function
#mean <- mean(x)
# good coding
mean_x <- mean(x)
mean_x
## [1] 5.5
range(x)
## [1] 1 10
Most of the work we do in R relies on one or more existing datasets that we want to query or summarize, rather than creating our own in R. Importing data in R is therefore an important skill. R can import just about any data type, including CSV and MS Excel files. You can also import tables from MS Access and SQL databases using ODBC drivers. That's beyond the scope of this class, but I can share examples for anyone needing to import from a database. For now, I'll show how to work with CSVs and Excel spreadsheets.
We use the read.csv() function to import CSVs in R. The
read.csv() function takes the file path or url to the CSV
as input and outputs a data frame containing the data from the CSV. Here
we're going to read a CSV from a website, then save that in the data
folder of our project. We'll talk more about what data frames are
next.
Run the following line to import a teaching data set of motile invertebrates collected in the Bass Harbor rocky intertidal zone from the GitHub repository for this training.
# read in the data from BASHAR_motile_invert_counts.csv and assign it as a dataframe to the variable "motinv"
motinv <- read.csv("https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/BASHAR_motile_invert_counts.csv")
View the data in a separate window by running the View()
function.
Or, check out the first few or last few records in your console.
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 0 2 No
## 2 Littorina littorea Common periwinkle LITLIT 0 3 No
## 3 Littorina obtusata Smooth periwinkle LITOBT 1 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 0 6 No
## 5 Nucella lapillus Dogwhelk NUCLAP 0 1 No
## 6 Littorina littorea Common periwinkle LITLIT 0 2 No
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 677 NETN ACAD BASHAR 7/18/2021 2021 FALSE R5 Red Algae
## 678 NETN ACAD BASHAR 5/22/2022 2022 FALSE R5 Red Algae
## 679 NETN ACAD BASHAR 5/22/2022 2022 FALSE R5 Red Algae
## 680 NETN ACAD BASHAR 5/22/2022 2022 FALSE R5 Red Algae
## 681 NETN ACAD BASHAR 6/12/2023 2023 FALSE R5 Red Algae
## 682 NETN ACAD BASHAR 6/29/2024 2024 FALSE R5 Red Algae
## ScientificName CommonName SpeciesCode Damage No.Damage
## 677 Testudinalia testudinalis Limpet TECTES 0 1
## 678 Littorina littorea Common periwinkle LITLIT 5 45
## 679 Littorina obtusata Smooth periwinkle LITOBT 0 1
## 680 Testudinalia testudinalis Limpet TECTES 0 2
## 681 Littorina littorea Common periwinkle LITLIT 2 26
## 682 Littorina littorea Common periwinkle LITLIT 0 5
## Subsampled
## 677 No
## 678 No
## 679 No
## 680 No
## 681 No
## 682 No
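Previews like the ones above come from head() and tail(). The sketch below uses a small made-up data frame (demo_df is a hypothetical name with invented values) so it runs on its own; with the training data you would pass motinv instead.

```r
# Toy data frame standing in for an imported dataset (values are made up)
demo_df <- data.frame(
  PlotName = c("A1", "A1", "A2", "A2", "B1", "B1", "B2", "B2"),
  Count    = c(2, 3, 2, 6, 1, 2, 1, 6)
)

head(demo_df)         # first 6 rows by default
tail(demo_df)         # last 6 rows by default
head(demo_df, n = 3)  # first 3 rows

# View(demo_df) opens the data in a separate viewer tab. It only works in an
# interactive session such as RStudio, so it is commented out here.
```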
Now write the CSV to disk and import it from your computer.
# Write the data frame to your data folder using a relative path.
# By default, write.csv adds a column with row names that are numbers. I don't
# like that, so I turn that off.
write.csv(motinv, "./data/BASHAR_motile_invert_counts.csv", row.names = FALSE)
Make sure the writing to disk worked by importing the CSV from your computer.
# Read the data frame in using a relative path
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
# Equivalent code to read in the data frame using full path on my computer, but won't match another user.
motinv <- read.csv("C:/Users/KMMiller/OneDrive - DOI/NETN/R_Dev/MMA_R_Training_2026/data/BASHAR_motile_invert_counts.csv")
Base R does not have a way to import MS Excel files. The first step
for working with Excel files (i.e., files with .xls or .xlsx
extensions), therefore, is to install the readxl package to
import .xlsx files and writexl to write files to .xlsx. The
readxl package has a couple of options for loading Excel
spreadsheets, depending on whether the extension is .xls, .xlsx, or
unknown, along with options to import different worksheets within a
spreadsheet.
The code below installs the required packages (if you forgot to ahead of time), loads them, then writes the motile invertebrate data we just imported to an .xlsx file. The last step imports the .xlsx version of the motile invertebrate data.
Note that the read_xlsx() function can't read from a URL like read.csv() can.
## # A tibble: 6 × 14
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## <chr> <chr> <chr> <chr> <dbl> <lgl> <chr> <chr>
## 1 NETN ACAD BASHAR 2013-06-21 2013 FALSE A1 Ascophyllum
## 2 NETN ACAD BASHAR 2013-06-21 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 2014-06-21 2014 FALSE A1 Ascophyllum
## 4 NETN ACAD BASHAR 2014-06-21 2014 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 2016-06-28 2016 FALSE A1 Ascophyllum
## 6 NETN ACAD BASHAR 2016-06-28 2016 FALSE A1 Ascophyllum
## # ℹ 6 more variables: ScientificName <chr>, CommonName <chr>,
## # SpeciesCode <chr>, Damage <dbl>, No.Damage <dbl>, Subsampled <chr>
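A minimal round-trip sketch with writexl and readxl, assuming both packages are installed. It uses a made-up data frame (demo_df) and a temporary file so nothing in your project is overwritten; in the training you would write motinv to a path in your data folder instead.

```r
library(readxl)   # provides read_xlsx()
library(writexl)  # provides write_xlsx()

# Toy data frame (made-up values) standing in for the imported CSV
demo_df <- data.frame(PlotName = c("A1", "A2"), Count = c(2, 5))

# Write to a temporary .xlsx file
xlsx_path <- tempfile(fileext = ".xlsx")
write_xlsx(demo_df, xlsx_path)

# Read it back in; read_xlsx() returns a tibble rather than a base data frame
demo_xlsx <- read_xlsx(xlsx_path)
head(demo_xlsx)
```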
The data frame we just examined is a type of data structure. A data structure is what it sounds like: a structure that holds data in an organized way. There are multiple data structures in R, including vectors, lists, arrays, matrices, data frames, and tibbles (more on this data structure later). Today we'll focus on vectors and data frames.
Vectors are the simplest data structure in R. Vectors are like a single column of data in an Excel spreadsheet. Vectors only have one dimension, and can be accessed by their row number. Here are some examples of vectors:
## [1] 1 2 3 4 5 6 7 8 9 10
## [1] 2 3 4 5 6 7 8 9 10 11
## [1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
## [1] 12.5 20.4 18.1 38.5 19.3
bird_ids <- c("song sparrow", "dark-eyed junco", "golden-crowned kinglet", "dark-eyed junco")
bird_ids
## [1] "song sparrow" "dark-eyed junco" "golden-crowned kinglet"
## [4] "dark-eyed junco"
Note the use of c(). The c() function stands for combine, and it combines elements into a single vector, with each element separated by a comma in code. The c() function is a fairly universal way to combine multiple elements in R, and you’re going to see it over and over. Note how, when we added 1 to digits, every value in digits increased by 1. This highlights the concept of vectorization in R. The general idea is that you can apply a single operation to a vector (or column in a data frame), and it will apply to all elements of that vector.
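A short sketch of vectorization. The vector names and values follow the ones mentioned in this section, but the exact code that built them in the training is an assumption here.

```r
digits <- 1:10
digits + 1     # adds 1 to every element

tree_dbh <- c(12.5, 20.4, 18.1, 38.5, 19.3)
tree_dbh * 2   # doubles every element
tree_dbh > 19  # comparisons are vectorized too: one TRUE/FALSE per element
```

No loop is needed in any of these lines; R applies the operation to each element for you.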
If you need to access a single element of a vector, you can use the
syntax my_vector[x] where x is the element's
index (the number corresponding to its position in the vector).
You can also use a vector of indices to extract multiple elements from
the vector. Note that in R, indexing starts at 1 (i.e.
my_vector[1] is the first element of
my_vector). If you've coded in other languages, you may be
used to indexing starting at 0.
## [1] "dark-eyed junco"
## [1] "song sparrow" "dark-eyed junco"
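One way to produce results like those above, as a sketch; the specific indices are assumptions chosen to match the output shown.

```r
bird_ids <- c("song sparrow", "dark-eyed junco", "golden-crowned kinglet", "dark-eyed junco")

bird_ids[2]        # second element
bird_ids[c(1, 2)]  # first and second elements
bird_ids[1:2]      # same result using a sequence
bird_ids[-1]       # a negative index drops that element instead
```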
You can also return only unique values from a vector. The
bird_ids vector has dark-eyed juncos listed twice. To get
only unique species, run the following code. I also added
sort() to sort the list alphabetically.
## [1] "dark-eyed junco" "golden-crowned kinglet" "song sparrow"
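A sketch of the unique-and-sort step described above, using base R's unique() and sort(). Wrapping one function inside another runs the inner one first.

```r
bird_ids <- c("song sparrow", "dark-eyed junco", "golden-crowned kinglet", "dark-eyed junco")

unique(bird_ids)          # drops the repeated junco, keeps original order
sort(unique(bird_ids))    # unique values in alphabetical order
length(unique(bird_ids))  # number of distinct species
```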
In the examples above, each vector contains a different type of data:
digits contains integers, is_odd contains
logical (TRUE/FALSE) values, bird_ids contains text, and
tree_dbh contains decimal numbers. That's because a given
vector can only contain a single type of data.
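In R, there are six main data types:
character: text, wrapped in quotation marks (e.g. "hello", "3", "R is my favorite programming language")
numeric: numbers, with or without decimals (e.g. 23, 3.1415)
integer: whole numbers. To specify an integer, append L to it or use as.integer() (e.g. 5L, as.integer(30)).
logical: true/false values (TRUE, FALSE). Note that TRUE and FALSE must be all-uppercase.
You can use the class() function to get the data type of a vector: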
## [1] "character"
## [1] "numeric"
## [1] "integer"
## [1] "logical"
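A sketch of class() on single values of each type, followed by what happens when types are mixed in one vector. The variable name mixed is just for illustration.

```r
class("hello")  # character
class(3.1415)   # numeric
class(5L)       # integer
class(TRUE)     # logical

# A vector holds only one type, so mixing numbers and text
# converts everything to character
mixed <- c(1, "two", 3)
class(mixed)
```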
Data frames are the main way we will be interacting with data in R. They're essentially like spreadsheets in Excel with specific properties.
Properties of data frames:
Coding Tip: R is strict about assigning data types to columns,
such that any text in an otherwise numeric field will turn the entire
column into a character. Similarly, if there's anything besides TRUE,
FALSE, or a blank in a field meant to be TRUE/FALSE, R will treat that
as a character field instead of logical. So, if R treats as a character
something that should be a numeric field, it's a good clue there may be
a typo or issue in your data needing attention. You can check the
assigned data types using the str() function.
## 'data.frame': 682 obs. of 14 variables:
## $ Network : chr "NETN" "NETN" "NETN" "NETN" ...
## $ UnitCode : chr "ACAD" "ACAD" "ACAD" "ACAD" ...
## $ SiteCode : chr "BASHAR" "BASHAR" "BASHAR" "BASHAR" ...
## $ StartDate : chr "6/24/2013" "6/21/2013" "6/24/2013" "6/21/2013" ...
## $ Year : int 2013 2013 2013 2013 2013 2014 2014 2016 2016 2017 ...
## $ QAQC : logi TRUE FALSE TRUE FALSE TRUE FALSE ...
## $ PlotName : chr "A1" "A1" "A1" "A1" ...
## $ CommunityType : chr "Ascophyllum" "Ascophyllum" "Ascophyllum" "Ascophyllum" ...
## $ ScientificName: chr "Littorina littorea" "Littorina littorea" "Littorina obtusata" "Littorina obtusata" ...
## $ CommonName : chr "Common periwinkle" "Common periwinkle" "Smooth periwinkle" "Smooth periwinkle" ...
## $ SpeciesCode : chr "LITLIT" "LITLIT" "LITOBT" "LITOBT" ...
## $ Damage : chr "0" "0" "1" "0" ...
## $ No.Damage : int 2 3 2 6 1 2 1 6 9 41 ...
## $ Subsampled : chr "No" "No" "No" "No" ...
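To see how a single stray value changes a column's type, here is a sketch using a few lines of made-up CSV text (the values and the "1x" typo are invented for illustration). It also shows one possible way to track down the offending entries.

```r
# Made-up CSV text: the second Damage value contains a typo ("1x")
csv_text <- "PlotName,Damage,No.Damage
A1,0,2
A1,1x,3
A2,1,6"

demo_df <- read.csv(text = csv_text)
str(demo_df)

# Damage is read as character because of the typo; No.Damage stays integer
class(demo_df$Damage)
class(demo_df$No.Damage)

# Find the values that can't be converted to a number
bad_values <- demo_df$Damage[is.na(suppressWarnings(as.numeric(demo_df$Damage)))]
bad_values
```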
$
One way to access the column dimension in data frames is to use the
$ syntax. The $ is used to separate the data
frame name from the column name. It's similar to the
[table_name].[column_name] syntax in Access.
To view the names of the columns in a data frame, you can use the
names() function, or use head() to see the
first 6 rows with column names. Whatever you prefer. I'll use the former
for now.
See column names in the motile invertebrate data.
## [1] "Network" "UnitCode" "SiteCode" "StartDate"
## [5] "Year" "QAQC" "PlotName" "CommunityType"
## [9] "ScientificName" "CommonName" "SpeciesCode" "Damage"
## [13] "No.Damage" "Subsampled"
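A sketch of names() and the $ syntax on a small made-up data frame (demo_df and its values are hypothetical). The same calls work on motinv, for example names(motinv) or motinv$PlotName.

```r
# Toy data frame with a few of the same column names (values are made up)
demo_df <- data.frame(
  PlotName = c("A1", "A1", "A2"),
  ScientificName = c("Littorina littorea", "Littorina obtusata", "Nucella lapillus"),
  No.Damage = c(2, 3, 1)
)

names(demo_df)            # column names
demo_df$PlotName          # all rows of one column, returned as a vector
unique(demo_df$PlotName)  # a column is a vector, so vector functions work on it
mean(demo_df$No.Damage)   # summarize a numeric column
```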
See all rows in the PlotName and ScientificName columns in the motile invertebrate data.
## [1] "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1"
## [16] "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A1" "A2" "A2" "A2"
## [31] "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2"
## [46] "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2" "A2"
## [61] "A2" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3"
## [76] "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3" "A3"
## [91] "A3" "A3" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4"
## [106] "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4" "A4"
## [121] "A4" "A4" "A4" "A4" "A4" "A4" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5"
## [136] "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5" "A5"
## [151] "A5" "A5" "A5" "A5" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1"
## [166] "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1" "B1"
## [181] "B1" "B1" "B1" "B1" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2"
## [196] "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B2"
## [211] "B2" "B2" "B2" "B2" "B2" "B2" "B2" "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3"
## [226] "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3" "B3"
## [241] "B3" "B3" "B3" "B3" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4"
## [256] "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B4"
## [271] "B4" "B4" "B4" "B4" "B4" "B4" "B4" "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5"
## [286] "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5" "B5"
## [301] "B5" "B5" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1"
## [316] "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1"
## [331] "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1" "F1"
## [346] "F1" "F1" "F1" "F1" "F1" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2"
## [361] "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2"
## [376] "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2" "F2"
## [391] "F2" "F2" "F2" "F2" "F2" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3"
## [406] "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3"
## [421] "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3" "F3"
## [436] "F3" "F3" "F3" "F3" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4"
## [451] "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4"
## [466] "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4" "F4"
## [481] "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5"
## [496] "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5"
## [511] "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5" "F5"
## [526] "F5" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1"
## [541] "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R1"
## [556] "R1" "R1" "R1" "R1" "R1" "R1" "R1" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2"
## [571] "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2"
## [586] "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R2" "R3"
## [601] "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3"
## [616] "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R3" "R4" "R4" "R4"
## [631] "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4"
## [646] "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R4" "R5" "R5"
## [661] "R5" "R5" "R5" "R5" "R5" "R5" "R5" "R5" "R5" "R5" "R5" "R5" "R5" "R5" "R5"
## [676] "R5" "R5" "R5" "R5" "R5" "R5" "R5"
## [1] "Littorina littorea" "Littorina littorea"
## [3] "Littorina obtusata" "Littorina obtusata"
## [5] "Nucella lapillus" "Littorina littorea"
## [7] "Littorina obtusata" "Littorina littorea"
## [9] "Littorina obtusata" "Littorina littorea"
## [11] "Littorina obtusata" "Littorina littorea"
## [13] "Littorina obtusata" "Littorina littorea"
## [15] "Littorina obtusata" "Carcinus maenas"
## [17] "Littorina littorea" "Littorina obtusata"
## [19] "Carcinus maenas" "Littorina littorea"
## [21] "Littorina obtusata" "Carcinus maenas"
## [23] "Littorina littorea" "Littorina obtusata"
## [25] "Carcinus maenas" "Littorina littorea"
## [27] "Littorina saxatilis" "Littorina littorea"
## [29] "Littorina littorea" "Littorina obtusata"
## [31] "Littorina obtusata" "Nucella lapillus"
## [33] "Nucella lapillus" "Littorina littorea"
## [35] "Littorina obtusata" "Littorina saxatilis"
## [37] "Littorina littorea" "Littorina obtusata"
## [39] "Nucella lapillus" "Littorina littorea"
## [41] "Littorina obtusata" "Nucella lapillus"
## [43] "Littorina littorea" "Littorina obtusata"
## [45] "Littorina littorea" "Littorina obtusata"
## [47] "Littorina littorea" "Littorina obtusata"
## [49] "Carcinus maenas" "Littorina littorea"
## [51] "Littorina obtusata" "Carcinus maenas"
## [53] "Littorina littorea" "Littorina obtusata"
## [55] "Nucella lapillus" "Carcinus maenas"
## [57] "Littorina littorea" "Littorina obtusata"
## [59] "Carcinus maenas" "Littorina littorea"
## [61] "Littorina obtusata" "Littorina littorea"
## [63] "Littorina littorea" "Littorina obtusata"
## [65] "Littorina obtusata" "Nucella lapillus"
## [67] "Nucella lapillus" "Littorina littorea"
## [69] "Littorina obtusata" "Littorina littorea"
## [71] "Littorina obtusata" "Littorina littorea"
## [73] "Littorina obtusata" "Nucella lapillus"
## [75] "Littorina littorea" "Littorina obtusata"
## [77] "Littorina littorea" "Littorina obtusata"
## [79] "Carcinus maenas" "Littorina littorea"
## [81] "Littorina obtusata" "Carcinus maenas"
## [83] "Littorina littorea" "Littorina obtusata"
## [85] "Carcinus maenas" "Littorina littorea"
## [87] "Littorina obtusata" "Carcinus maenas"
## [89] "Littorina littorea" "Littorina obtusata"
## [91] "Littorina littorea" "Littorina obtusata"
## [93] "Littorina littorea" "Littorina littorea"
## [95] "Littorina obtusata" "Littorina obtusata"
## [97] "Nucella lapillus" "Littorina littorea"
## [99] "Littorina obtusata" "Nucella lapillus"
## [101] "Littorina littorea" "Littorina obtusata"
## [103] "Littorina littorea" "Littorina obtusata"
## [105] "Nucella lapillus" "Littorina littorea"
## [107] "Littorina obtusata" "Littorina littorea"
## [109] "Littorina obtusata" "Nucella lapillus"
## [111] "Carcinus maenas" "Littorina littorea"
## [113] "Littorina obtusata" "Carcinus maenas"
## [115] "Littorina littorea" "Littorina obtusata"
## [117] "Carcinus maenas" "Littorina littorea"
## [119] "Littorina obtusata" "Nucella lapillus"
## [121] "Carcinus maenas" "Littorina littorea"
## [123] "Littorina obtusata" "Nucella lapillus"
## [125] "Littorina littorea" "Littorina obtusata"
## [127] "Littorina littorea" "Littorina littorea"
## [129] "Littorina obtusata" "Littorina obtusata"
## [131] "Littorina littorea" "Littorina obtusata"
## [133] "Littorina littorea" "Littorina obtusata"
## [135] "Littorina littorea" "Littorina obtusata"
## [137] "Littorina littorea" "Littorina obtusata"
## [139] "Nucella lapillus" "Littorina littorea"
## [141] "Littorina obtusata" "Littorina littorea"
## [143] "Littorina obtusata" "Carcinus maenas"
## [145] "Littorina littorea" "Littorina obtusata"
## [147] "Littorina littorea" "Littorina obtusata"
## [149] "Nucella lapillus" "Carcinus maenas"
## [151] "Littorina littorea" "Littorina obtusata"
## [153] "Littorina littorea" "Littorina obtusata"
## [155] "Littorina littorea" "Littorina saxatilis"
## [157] "Nucella lapillus" "Littorina littorea"
## [159] "Littorina obtusata" "Nucella lapillus"
## [161] "Littorina littorea" "Littorina obtusata"
## [163] "Nucella lapillus" "Littorina littorea"
## [165] "Littorina obtusata" "Littorina littorea"
## [167] "Littorina obtusata" "Littorina saxatilis"
## [169] "Littorina littorea" "Littorina obtusata"
## [171] "Littorina saxatilis" "Carcinus maenas"
## [173] "Littorina littorea" "Littorina obtusata"
## [175] "Littorina saxatilis" "Nucella lapillus"
## [177] "Littorina littorea" "Littorina obtusata"
## [179] "Nucella lapillus" "Littorina littorea"
## [181] "Littorina obtusata" "Littorina saxatilis"
## [183] "Littorina littorea" "Littorina obtusata"
## [185] "Littorina littorea" "Littorina obtusata"
## [187] "Littorina littorea" "Littorina obtusata"
## [189] "Littorina littorea" "Littorina obtusata"
## [191] "Littorina littorea" "Littorina obtusata"
## [193] "Nucella lapillus" "Littorina littorea"
## [195] "Littorina obtusata" "Nucella lapillus"
## [197] "Littorina littorea" "Littorina obtusata"
## [199] "Littorina littorea" "Littorina obtusata"
## [201] "Nucella lapillus" "Carcinus maenas"
## [203] "Littorina littorea" "Littorina obtusata"
## [205] "Nucella lapillus" "Littorina littorea"
## [207] "Littorina obtusata" "Nucella lapillus"
## [209] "Carcinus maenas" "Littorina littorea"
## [211] "Littorina obtusata" "Nucella lapillus"
## [213] "Carcinus maenas" "Littorina littorea"
## [215] "Littorina obtusata" "Littorina saxatilis"
## [217] "Nucella lapillus" "Littorina obtusata"
## [219] "Littorina littorea" "Littorina obtusata"
## [221] "Littorina littorea" "Littorina obtusata"
## [223] "Nucella lapillus" "Littorina littorea"
## [225] "Littorina obtusata" "Littorina littorea"
## [227] "Littorina obtusata" "Littorina saxatilis"
## [229] "Littorina littorea" "Littorina obtusata"
## [231] "Nucella lapillus" "Carcinus maenas"
## [233] "Littorina littorea" "Littorina obtusata"
## [235] "Littorina littorea" "Littorina obtusata"
## [237] "Nucella lapillus" "Littorina littorea"
## [239] "Littorina obtusata" "Nucella lapillus"
## [241] "Carcinus maenas" "Littorina littorea"
## [243] "Littorina obtusata" "Nucella lapillus"
## [245] "Littorina obtusata" "Littorina saxatilis"
## [247] "Littorina littorea" "Littorina obtusata"
## [249] "Littorina littorea" "Littorina obtusata"
## [251] "Nucella lapillus" "Littorina littorea"
## [253] "Littorina obtusata" "Nucella lapillus"
## [255] "Testudinalia testudinalis" "Littorina littorea"
## [257] "Littorina obtusata" "Littorina saxatilis"
## [259] "Littorina littorea" "Littorina obtusata"
## [261] "Littorina saxatilis" "Littorina littorea"
## [263] "Littorina obtusata" "Littorina saxatilis"
## [265] "Nucella lapillus" "Littorina littorea"
## [267] "Littorina obtusata" "Littorina saxatilis"
## [269] "Nucella lapillus" "Littorina littorea"
## [271] "Littorina obtusata" "Littorina saxatilis"
## [273] "Nucella lapillus" "Carcinus maenas"
## [275] "Littorina littorea" "Littorina obtusata"
## [277] "Nucella lapillus" "Littorina obtusata"
## [279] "Littorina littorea" "Littorina obtusata"
## [281] "Littorina littorea" "Littorina littorea"
## [283] "Littorina obtusata" "Littorina littorea"
## [285] "Littorina obtusata" "Nucella lapillus"
## [287] "Carcinus maenas" "Littorina littorea"
## [289] "Littorina obtusata" "Littorina saxatilis"
## [291] "Nucella lapillus" "Testudinalia testudinalis"
## [293] "Littorina littorea" "Littorina obtusata"
## [295] "Littorina saxatilis" "Carcinus maenas"
## [297] "Littorina littorea" "Littorina obtusata"
## [299] "Nucella lapillus" "Littorina littorea"
## [301] "Littorina obtusata" "Nucella lapillus"
## [303] "Littorina littorea" "Littorina littorea"
## [305] "Littorina obtusata" "Littorina obtusata"
## [307] "Nucella lapillus" "Testudinalia testudinalis"
## [309] "Testudinalia testudinalis" "Littorina littorea"
## [311] "Littorina obtusata" "Testudinalia testudinalis"
## [313] "Littorina littorea" "Littorina obtusata"
## [315] "Nucella lapillus" "Testudinalia testudinalis"
## [317] "Littorina littorea" "Littorina obtusata"
## [319] "Nucella lapillus" "Littorina littorea"
## [321] "Littorina obtusata" "Nucella lapillus"
## [323] "Testudinalia testudinalis" "Littorina littorea"
## [325] "Littorina obtusata" "Nucella lapillus"
## [327] "Testudinalia testudinalis" "Carcinus maenas"
## [329] "Littorina littorea" "Littorina obtusata"
## [331] "Testudinalia testudinalis" "Carcinus maenas"
## [333] "Littorina littorea" "Littorina obtusata"
## [335] "Nucella lapillus" "Testudinalia testudinalis"
## [337] "Carcinus maenas" "Littorina littorea"
## [339] "Littorina obtusata" "Nucella lapillus"
## [341] "Testudinalia testudinalis" "Carcinus maenas"
## [343] "Littorina littorea" "Littorina obtusata"
## [345] "Nucella lapillus" "Testudinalia testudinalis"
## [347] "Carcinus maenas" "Littorina littorea"
## [349] "Littorina obtusata" "Testudinalia testudinalis"
## [351] "Littorina littorea" "Littorina obtusata"
## [353] "Littorina obtusata" "Nucella lapillus"
## [355] "Nucella lapillus" "Testudinalia testudinalis"
## [357] "Littorina littorea" "Littorina obtusata"
## [359] "Nucella lapillus" "Testudinalia testudinalis"
## [361] "Littorina littorea" "Littorina obtusata"
## [363] "Nucella lapillus" "Littorina littorea"
## [365] "Littorina obtusata" "Littorina saxatilis"
## [367] "Nucella lapillus" "Testudinalia testudinalis"
## [369] "Littorina littorea" "Littorina obtusata"
## [371] "Nucella lapillus" "Littorina littorea"
## [373] "Littorina obtusata" "Nucella lapillus"
## [375] "Testudinalia testudinalis" "Littorina littorea"
## [377] "Littorina obtusata" "Nucella lapillus"
## [379] "Carcinus maenas" "Littorina littorea"
## [381] "Littorina obtusata" "Nucella lapillus"
## [383] "Carcinus maenas" "Littorina littorea"
## [385] "Littorina obtusata" "Littorina saxatilis"
## [387] "Nucella lapillus" "Testudinalia testudinalis"
## [389] "Carcinus maenas" "Littorina littorea"
## [391] "Littorina obtusata" "Nucella lapillus"
## [393] "Littorina littorea" "Littorina obtusata"
## [395] "Nucella lapillus" "Littorina littorea"
## [397] "Littorina littorea" "Littorina obtusata"
## [399] "Littorina obtusata" "Nucella lapillus"
## [401] "Nucella lapillus" "Testudinalia testudinalis"
## [403] "Testudinalia testudinalis" "Littorina littorea"
## [405] "Littorina obtusata" "Nucella lapillus"
## [407] "Testudinalia testudinalis" "Littorina littorea"
## [409] "Littorina obtusata" "Testudinalia testudinalis"
## [411] "Littorina littorea" "Littorina obtusata"
## [413] "Nucella lapillus" "Littorina littorea"
## [415] "Littorina obtusata" "Nucella lapillus"
## [417] "Testudinalia testudinalis" "Littorina littorea"
## [419] "Littorina obtusata" "Nucella lapillus"
## [421] "Testudinalia testudinalis" "Littorina littorea"
## [423] "Littorina obtusata" "Littorina littorea"
## [425] "Littorina obtusata" "Nucella lapillus"
## [427] "Testudinalia testudinalis" "Littorina littorea"
## [429] "Littorina obtusata" "Nucella lapillus"
## [431] "Testudinalia testudinalis" "Carcinus maenas"
## [433] "Littorina littorea" "Littorina obtusata"
## [435] "Nucella lapillus" "Carcinus maenas"
## [437] "Littorina littorea" "Littorina obtusata"
## [439] "Nucella lapillus" "Littorina littorea"
## [441] "Littorina littorea" "Littorina obtusata"
## [443] "Littorina obtusata" "Nucella lapillus"
## [445] "Testudinalia testudinalis" "Testudinalia testudinalis"
## [447] "Littorina littorea" "Littorina obtusata"
## [449] "Nucella lapillus" "Littorina littorea"
## [451] "Littorina obtusata" "Littorina littorea"
## [453] "Littorina obtusata" "Nucella lapillus"
## [455] "Testudinalia testudinalis" "Littorina littorea"
## [457] "Littorina obtusata" "Nucella lapillus"
## [459] "Littorina littorea" "Littorina obtusata"
## [461] "Nucella lapillus" "Testudinalia testudinalis"
## [463] "Littorina littorea" "Littorina obtusata"
## [465] "Testudinalia testudinalis" "Carcinus maenas"
## [467] "Littorina littorea" "Littorina obtusata"
## [469] "Nucella lapillus" "Testudinalia testudinalis"
## [471] "Littorina littorea" "Littorina obtusata"
## [473] "Nucella lapillus" "Testudinalia testudinalis"
## [475] "Littorina littorea" "Littorina obtusata"
## [477] "Nucella lapillus" "Carcinus maenas"
## [479] "Littorina littorea" "Littorina obtusata"
## [481] "Littorina littorea" "Littorina littorea"
## [483] "Littorina obtusata" "Littorina obtusata"
## [485] "Nucella lapillus" "Nucella lapillus"
## [487] "Testudinalia testudinalis" "Testudinalia testudinalis"
## [489] "Littorina littorea" "Littorina obtusata"
## [491] "Testudinalia testudinalis" "Littorina littorea"
## [493] "Littorina obtusata" "Nucella lapillus"
## [495] "Testudinalia testudinalis" "Littorina littorea"
## [497] "Littorina obtusata" "Nucella lapillus"
## [499] "Testudinalia testudinalis" "Littorina littorea"
## [501] "Littorina obtusata" "Nucella lapillus"
## [503] "Testudinalia testudinalis" "Littorina littorea"
## [505] "Littorina obtusata" "Nucella lapillus"
## [507] "Testudinalia testudinalis" "Littorina littorea"
## [509] "Littorina obtusata" "Nucella lapillus"
## [511] "Carcinus maenas" "Littorina littorea"
## [513] "Littorina obtusata" "Nucella lapillus"
## [515] "Testudinalia testudinalis" "Littorina littorea"
## [517] "Littorina obtusata" "Nucella lapillus"
## [519] "Littorina littorea" "Littorina obtusata"
## [521] "Nucella lapillus" "Testudinalia testudinalis"
## [523] "Carcinus maenas" "Littorina littorea"
## [525] "Littorina obtusata" "Nucella lapillus"
## [527] "Littorina littorea" "Littorina littorea"
## [529] "Littorina obtusata" "Littorina obtusata"
## [531] "Nucella lapillus" "Testudinalia testudinalis"
## [533] "Littorina littorea" "Littorina obtusata"
## [535] "Nucella lapillus" "Testudinalia testudinalis"
## [537] "Littorina littorea" "Littorina obtusata"
## [539] "Testudinalia testudinalis" "Littorina littorea"
## [541] "Nucella lapillus" "Testudinalia testudinalis"
## [543] "Littorina littorea" "Littorina obtusata"
## [545] "Littorina littorea" "Littorina obtusata"
## [547] "Testudinalia testudinalis" "Littorina littorea"
## [549] "Littorina obtusata" "Testudinalia testudinalis"
## [551] "Carcinus maenas" "Littorina littorea"
## [553] "Littorina obtusata" "Testudinalia testudinalis"
## [555] "Littorina littorea" "Littorina obtusata"
## [557] "Nucella lapillus" "Littorina littorea"
## [559] "Carcinus maenas" "Littorina littorea"
## [561] "Nucella lapillus" "Testudinalia testudinalis"
## [563] "Littorina obtusata" "Littorina littorea"
## [565] "Littorina obtusata" "Testudinalia testudinalis"
## [567] "Littorina littorea" "Littorina obtusata"
## [569] "Nucella lapillus" "Testudinalia testudinalis"
## [571] "Littorina littorea" "Littorina obtusata"
## [573] "Testudinalia testudinalis" "Littorina littorea"
## [575] "Littorina obtusata" "Testudinalia testudinalis"
## [577] "Littorina littorea" "Littorina obtusata"
## [579] "Nucella lapillus" "Testudinalia testudinalis"
## [581] "Littorina littorea" "Littorina obtusata"
## [583] "Nucella lapillus" "Testudinalia testudinalis"
## [585] "Carcinus maenas" "Littorina littorea"
## [587] "Littorina obtusata" "Testudinalia testudinalis"
## [589] "Littorina littorea" "Littorina obtusata"
## [591] "Nucella lapillus" "Testudinalia testudinalis"
## [593] "Littorina littorea" "Testudinalia testudinalis"
## [595] "Carcinus maenas" "Littorina littorea"
## [597] "Littorina obtusata" "Nucella lapillus"
## [599] "Testudinalia testudinalis" "Littorina littorea"
## [601] "Nucella lapillus" "Littorina littorea"
## [603] "Testudinalia testudinalis" "Littorina littorea"
## [605] "Littorina obtusata" "Nucella lapillus"
## [607] "Testudinalia testudinalis" "Littorina littorea"
## [609] "Testudinalia testudinalis" "Littorina littorea"
## [611] "Littorina obtusata" "Littorina saxatilis"
## [613] "Testudinalia testudinalis" "Littorina littorea"
## [615] "Testudinalia testudinalis" "Littorina littorea"
## [617] "Littorina obtusata" "Nucella lapillus"
## [619] "Testudinalia testudinalis" "Littorina littorea"
## [621] "Littorina littorea" "Littorina obtusata"
## [623] "Nucella lapillus" "Carcinus maenas"
## [625] "Littorina littorea" "Testudinalia testudinalis"
## [627] "Littorina littorea" "Littorina littorea"
## [629] "Littorina littorea" "Littorina littorea"
## [631] "Littorina obtusata" "Testudinalia testudinalis"
## [633] "Littorina littorea" "Littorina obtusata"
## [635] "Testudinalia testudinalis" "Littorina littorea"
## [637] "Testudinalia testudinalis" "Littorina littorea"
## [639] "Littorina obtusata" "Nucella lapillus"
## [641] "Testudinalia testudinalis" "Littorina littorea"
## [643] "Littorina obtusata" "Nucella lapillus"
## [645] "Testudinalia testudinalis" "Littorina littorea"
## [647] "Littorina obtusata" "Testudinalia testudinalis"
## [649] "Carcinus maenas" "Littorina littorea"
## [651] "Nucella lapillus" "Testudinalia testudinalis"
## [653] "Littorina littorea" "Littorina obtusata"
## [655] "Testudinalia testudinalis" "Littorina littorea"
## [657] "Testudinalia testudinalis" "Littorina littorea"
## [659] "Nucella lapillus" "Nucella lapillus"
## [661] "Littorina littorea" "Littorina littorea"
## [663] "Littorina littorea" "Nucella lapillus"
## [665] "Testudinalia testudinalis" "Littorina littorea"
## [667] "Nucella lapillus" "Testudinalia testudinalis"
## [669] "Littorina littorea" "Nucella lapillus"
## [671] "Testudinalia testudinalis" "Littorina littorea"
## [673] "Littorina obtusata" "Testudinalia testudinalis"
## [675] "Littorina littorea" "Nucella lapillus"
## [677] "Testudinalia testudinalis" "Littorina littorea"
## [679] "Littorina obtusata" "Testudinalia testudinalis"
## [681] "Littorina littorea" "Littorina littorea"
[ , ]
Remember that every data frame has 2 dimensions. The first dimension is rows and the second is columns. Thinking of the data in two dimensions, in the order of rows then columns, helps you understand how brackets work.
Square brackets [rows, columns] are how you access specific rows and columns in a data frame using base R.
Square brackets were one of the hardest concepts when I was starting out. Don't worry if this isn't immediately intuitive. There are easier ways to work with data frame rows and columns, which you'll learn on Day 2. It is still useful to have a basic understanding of how to interpret square brackets, as you will likely encounter them on StackOverflow or other R help sites. We'll work through some examples of using square brackets to access rows, columns, or both.
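As a minimal sketch of the [rows, columns] pattern, the code below builds a small made-up data frame and pulls out a single cell, a single row, and a single column. The name toy and its values are hypothetical stand-ins for illustration only, not the training data.

```r
# Hypothetical 3-row data frame used only for illustration
toy <- data.frame(SiteCode = c("A", "B", "C"),
                  Year = c(2013, 2014, 2015),
                  Count = c(2, 3, 6))

# Row 2, column 3: a single value
cell <- toy[2, 3]

# Row 2, all columns (nothing after the comma)
row2 <- toy[2, ]

# All rows, the column named "Year" (nothing before the comma)
years <- toy[, "Year"]

cell
row2
years
```

With a base R data frame, asking for a single column this way returns a plain vector rather than a data frame, which is why a one-column request can print as a long list of values.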
The code below asks for the dimensions of the motinv data frame and returns 682 14. That means there are 682 rows and 14 columns.
Return the number of rows and columns by checking the data frame's dimensions.
## [1] 682 14
## [1] 682
## [1] 14
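The three results above have the shape you would get from the base R functions dim(), nrow(), and ncol(). The sketch below runs them on a made-up data frame. It is an assumption that the training code calls them on motinv in the same way, and toy is a hypothetical name.

```r
# Hypothetical data frame with 4 rows and 3 columns
toy <- data.frame(SiteCode = c("A", "A", "B", "B"),
                  Year = c(2013, 2014, 2013, 2014),
                  Count = c(2, 3, 6, 1))

dims <- dim(toy)     # number of rows, then number of columns
n_rows <- nrow(toy)  # rows only
n_cols <- ncol(toy)  # columns only

dims
n_rows
n_cols
```

dim() always reports rows first and columns second, the same order used inside square brackets.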
Return the first 5 rows of the motile invertebrate data frame.
Note the comma with nothing to the right. That means return all columns.
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 0 2 No
## 2 Littorina littorea Common periwinkle LITLIT 0 3 No
## 3 Littorina obtusata Smooth periwinkle LITOBT 1 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 0 6 No
## 5 Nucella lapillus Dogwhelk NUCLAP 0 1 No
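A request for the first 5 rows with every column can be written with a row range to the left of the comma. The sketch below uses a made-up data frame named toy (hypothetical); swapping in motinv should give output shaped like the table above, assuming motinv is loaded in your session.

```r
# Hypothetical 8-row data frame used only for illustration
toy <- data.frame(id = 1:8,
                  Year = rep(c(2013, 2014), each = 4),
                  Count = c(2, 3, 2, 6, 1, 2, 1, 6))

# Rows 1 through 5, all columns (nothing after the comma)
first5 <- toy[1:5, ]
first5
```

The 1:5 shorthand expands to the row numbers 1, 2, 3, 4, 5, so any range or vector of row numbers can go in that position.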
Return all rows and a subset of columns of the data frame.
Note how the left side of the comma is empty. That means return all rows.
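One way to ask for a subset of columns is to put a character vector of column names to the right of the comma. This is a sketch on a made-up data frame. The names toy and keep are hypothetical, and the training code may select the columns differently.

```r
# Hypothetical data frame used only for illustration
toy <- data.frame(SiteCode = c("A", "B", "C"),
                  PlotName = c("A1", "A2", "A3"),
                  Year = c(2013, 2014, 2015),
                  Count = c(2, 3, 6))

# Column names to keep, in the order they should appear
keep <- c("SiteCode", "Count", "Year")

# All rows (nothing before the comma), only the named columns
subset_cols <- toy[, keep]
subset_cols
```

Columns come back in the order you list them, not the order they have in the original data frame. That is consistent with the output for motinv, where Year appears after CommonName even though it sits before ScientificName in the full data frame.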
## SiteCode ScientificName CommonName Year Damage No.Damage
## 1 BASHAR Littorina littorea Common periwinkle 2013 0 2
## 2 BASHAR Littorina littorea Common periwinkle 2013 0 3
## 3 BASHAR Littorina obtusata Smooth periwinkle 2013 1 2
## 4 BASHAR Littorina obtusata Smooth periwinkle 2013 0 6
## 5 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 6 BASHAR Littorina littorea Common periwinkle 2014 0 2
## 7 BASHAR Littorina obtusata Smooth periwinkle 2014 0 1
## 8 BASHAR Littorina littorea Common periwinkle 2016 0 6
## 9 BASHAR Littorina obtusata Smooth periwinkle 2016 1 9
## 10 BASHAR Littorina littorea Common periwinkle 2017 0 41
## 11 BASHAR Littorina obtusata Smooth periwinkle 2017 0 1
## 12 BASHAR Littorina littorea Common periwinkle 2018 1 11
## 13 BASHAR Littorina obtusata Smooth periwinkle 2018 0 3
## 14 BASHAR Littorina littorea Common periwinkle 2019 0 9
## 15 BASHAR Littorina obtusata Smooth periwinkle 2019 0 5
## 16 BASHAR Carcinus maenas Green crab 2021 0 1
## 17 BASHAR Littorina littorea Common periwinkle 2021 0 2
## 18 BASHAR Littorina obtusata Smooth periwinkle 2021 0 16
## 19 BASHAR Carcinus maenas Green crab 2022 0 1
## 20 BASHAR Littorina littorea Common periwinkle 2022 0 4
## 21 BASHAR Littorina obtusata Smooth periwinkle 2022 0 5
## 22 BASHAR Carcinus maenas Green crab 2023 0 2
## 23 BASHAR Littorina littorea Common periwinkle 2023 0 1
## 24 BASHAR Littorina obtusata Smooth periwinkle 2023 0 1
## 25 BASHAR Carcinus maenas Green crab 2024 0 2
## 26 BASHAR Littorina littorea Common periwinkle 2024 7 35
## 27 BASHAR Littorina saxatilis Rough periwinkle 2024 1 1
## 28 BASHAR Littorina littorea Common periwinkle 2013 0 3
## 29 BASHAR Littorina littorea Common periwinkle 2013 0 8
## 30 BASHAR Littorina obtusata Smooth periwinkle 2013 4 19
## 31 BASHAR Littorina obtusata Smooth periwinkle 2013 0 25
## 32 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 33 BASHAR Nucella lapillus Dogwhelk 2013 0 3
## 34 BASHAR Littorina littorea Common periwinkle 2014 0 4
## 35 BASHAR Littorina obtusata Smooth periwinkle 2014 0 29
## 36 BASHAR Littorina saxatilis Rough periwinkle 2014 0 1
## 37 BASHAR Littorina littorea Common periwinkle 2015 1 28
## 38 BASHAR Littorina obtusata Smooth periwinkle 2015 0 12
## 39 BASHAR Nucella lapillus Dogwhelk 2015 1 0
## 40 BASHAR Littorina littorea Common periwinkle 2016 5 52
## 41 BASHAR Littorina obtusata Smooth periwinkle 2016 3 50
## 42 BASHAR Nucella lapillus Dogwhelk 2016 0 1
## 43 BASHAR Littorina littorea Common periwinkle 2017 0 65
## 44 BASHAR Littorina obtusata Smooth periwinkle 2017 0 20
## 45 BASHAR Littorina littorea Common periwinkle 2018 4 75
## 46 BASHAR Littorina obtusata Smooth periwinkle 2018 4 31
## 47 BASHAR Littorina littorea Common periwinkle 2019 0 70
## 48 BASHAR Littorina obtusata Smooth periwinkle 2019 0 23
## 49 BASHAR Carcinus maenas Green crab 2021 0 5
## 50 BASHAR Littorina littorea Common periwinkle 2021 0 23
## 51 BASHAR Littorina obtusata Smooth periwinkle 2021 0 39
## 52 BASHAR Carcinus maenas Green crab 2022 0 1
## 53 BASHAR Littorina littorea Common periwinkle 2022 0 67
## 54 BASHAR Littorina obtusata Smooth periwinkle 2022 0 30
## 55 BASHAR Nucella lapillus Dogwhelk 2022 0 2
## 56 BASHAR Carcinus maenas Green crab 2023 0 3
## 57 BASHAR Littorina littorea Common periwinkle 2023 2 46
## 58 BASHAR Littorina obtusata Smooth periwinkle 2023 0 2
## 59 BASHAR Carcinus maenas Green crab 2024 0 2
## 60 BASHAR Littorina littorea Common periwinkle 2024 6 78
## 61 BASHAR Littorina obtusata Smooth periwinkle 2024 1 16
## 62 BASHAR Littorina littorea Common periwinkle 2013 2 14
## 63 BASHAR Littorina littorea Common periwinkle 2013 0 16
## 64 BASHAR Littorina obtusata Smooth periwinkle 2013 5 23
## 65 BASHAR Littorina obtusata Smooth periwinkle 2013 0 34
## 66 BASHAR Nucella lapillus Dogwhelk 2013 0 2
## 67 BASHAR Nucella lapillus Dogwhelk 2013 0 4
## 68 BASHAR Littorina littorea Common periwinkle 2014 0 14
## 69 BASHAR Littorina obtusata Smooth periwinkle 2014 0 18
## 70 BASHAR Littorina littorea Common periwinkle 2015 1 27
## 71 BASHAR Littorina obtusata Smooth periwinkle 2015 0 19
## 72 BASHAR Littorina littorea Common periwinkle 2016 0 59
## 73 BASHAR Littorina obtusata Smooth periwinkle 2016 2 20
## 74 BASHAR Nucella lapillus Dogwhelk 2016 0 1
## 75 BASHAR Littorina littorea Common periwinkle 2017 0 54
## 76 BASHAR Littorina obtusata Smooth periwinkle 2017 0 10
## 77 BASHAR Littorina littorea Common periwinkle 2018 5 66
## 78 BASHAR Littorina obtusata Smooth periwinkle 2018 2 20
## 79 BASHAR Carcinus maenas Green crab 2019 0 4
## 80 BASHAR Littorina littorea Common periwinkle 2019 0 137
## 81 BASHAR Littorina obtusata Smooth periwinkle 2019 0 49
## 82 BASHAR Carcinus maenas Green crab 2021 0 7
## 83 BASHAR Littorina littorea Common periwinkle 2021 0 30
## 84 BASHAR Littorina obtusata Smooth periwinkle 2021 0 37
## 85 BASHAR Carcinus maenas Green crab 2022 0 1
## 86 BASHAR Littorina littorea Common periwinkle 2022 4 81
## 87 BASHAR Littorina obtusata Smooth periwinkle 2022 0 26
## 88 BASHAR Carcinus maenas Green crab 2023 0 3
## 89 BASHAR Littorina littorea Common periwinkle 2023 5 26
## 90 BASHAR Littorina obtusata Smooth periwinkle 2023 1 5
## 91 BASHAR Littorina littorea Common periwinkle 2024 6 81
## 92 BASHAR Littorina obtusata Smooth periwinkle 2024 0 12
## 93 BASHAR Littorina littorea Common periwinkle 2013 0 20
## 94 BASHAR Littorina littorea Common periwinkle 2013 1 29
## 95 BASHAR Littorina obtusata Smooth periwinkle 2013 1 6
## 96 BASHAR Littorina obtusata Smooth periwinkle 2013 5 15
## 97 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 98 BASHAR Littorina littorea Common periwinkle 2014 0 35
## 99 BASHAR Littorina obtusata Smooth periwinkle 2014 0 22
## 100 BASHAR Nucella lapillus Dogwhelk 2014 0 1
## 101 BASHAR Littorina littorea Common periwinkle 2015 4 98
## 102 BASHAR Littorina obtusata Smooth periwinkle 2015 2 27
## 103 BASHAR Littorina littorea Common periwinkle 2016 5 113
## 104 BASHAR Littorina obtusata Smooth periwinkle 2016 1 52
## 105 BASHAR Nucella lapillus Dogwhelk 2016 1 1
## 106 BASHAR Littorina littorea Common periwinkle 2017 3 134
## 107 BASHAR Littorina obtusata Smooth periwinkle 2017 1 19
## 108 BASHAR Littorina littorea Common periwinkle 2018 1 96
## 109 BASHAR Littorina obtusata Smooth periwinkle 2018 2 11
## 110 BASHAR Nucella lapillus Dogwhelk 2018 0 1
## 111 BASHAR Carcinus maenas Green crab 2019 0 2
## 112 BASHAR Littorina littorea Common periwinkle 2019 0 92
## 113 BASHAR Littorina obtusata Smooth periwinkle 2019 0 37
## 114 BASHAR Carcinus maenas Green crab 2021 0 5
## 115 BASHAR Littorina littorea Common periwinkle 2021 9 124
## 116 BASHAR Littorina obtusata Smooth periwinkle 2021 1 54
## 117 BASHAR Carcinus maenas Green crab 2022 0 2
## 118 BASHAR Littorina littorea Common periwinkle 2022 0 78
## 119 BASHAR Littorina obtusata Smooth periwinkle 2022 0 29
## 120 BASHAR Nucella lapillus Dogwhelk 2022 1 0
## 121 BASHAR Carcinus maenas Green crab 2023 0 3
## 122 BASHAR Littorina littorea Common periwinkle 2023 24 75
## 123 BASHAR Littorina obtusata Smooth periwinkle 2023 3 12
## 124 BASHAR Nucella lapillus Dogwhelk 2023 0 1
## 125 BASHAR Littorina littorea Common periwinkle 2024 14 101
## 126 BASHAR Littorina obtusata Smooth periwinkle 2024 3 16
## 127 BASHAR Littorina littorea Common periwinkle 2013 5 17
## 128 BASHAR Littorina littorea Common periwinkle 2013 1 22
## 129 BASHAR Littorina obtusata Smooth periwinkle 2013 0 23
## 130 BASHAR Littorina obtusata Smooth periwinkle 2013 4 23
## 131 BASHAR Littorina littorea Common periwinkle 2014 0 49
## 132 BASHAR Littorina obtusata Smooth periwinkle 2014 0 22
## 133 BASHAR Littorina littorea Common periwinkle 2015 1 65
## 134 BASHAR Littorina obtusata Smooth periwinkle 2015 PM 11
## 135 BASHAR Littorina littorea Common periwinkle 2016 4 113
## 136 BASHAR Littorina obtusata Smooth periwinkle 2016 2 30
## 137 BASHAR Littorina littorea Common periwinkle 2017 4 62
## 138 BASHAR Littorina obtusata Smooth periwinkle 2017 0 14
## 139 BASHAR Nucella lapillus Dogwhelk 2017 0 2
## 140 BASHAR Littorina littorea Common periwinkle 2018 1 65
## 141 BASHAR Littorina obtusata Smooth periwinkle 2018 1 15
## 142 BASHAR Littorina littorea Common periwinkle 2019 1 93
## 143 BASHAR Littorina obtusata Smooth periwinkle 2019 0 24
## 144 BASHAR Carcinus maenas Green crab 2021 0 4
## 145 BASHAR Littorina littorea Common periwinkle 2021 1 45
## 146 BASHAR Littorina obtusata Smooth periwinkle 2021 0 45
## 147 BASHAR Littorina littorea Common periwinkle 2022 3 45
## 148 BASHAR Littorina obtusata Smooth periwinkle 2022 1 21
## 149 BASHAR Nucella lapillus Dogwhelk 2022 0 1
## 150 BASHAR Carcinus maenas Green crab 2023 0 3
## 151 BASHAR Littorina littorea Common periwinkle 2023 5 45
## 152 BASHAR Littorina obtusata Smooth periwinkle 2023 0 4
## 153 BASHAR Littorina littorea Common periwinkle 2024 14 74
## 154 BASHAR Littorina obtusata Smooth periwinkle 2024 2 7
## 155 BASHAR Littorina littorea Common periwinkle 2013 0 1
## 156 BASHAR Littorina saxatilis Rough periwinkle 2013 0 1
## 157 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 158 BASHAR Littorina littorea Common periwinkle 2015 0 3
## 159 BASHAR Littorina obtusata Smooth periwinkle 2015 4 40
## 160 BASHAR Nucella lapillus Dogwhelk 2015 1 0
## 161 BASHAR Littorina littorea Common periwinkle 2016 0 5
## 162 BASHAR Littorina obtusata Smooth periwinkle 2016 0 40
## 163 BASHAR Nucella lapillus Dogwhelk 2016 0 8
## 164 BASHAR Littorina littorea Common periwinkle 2017 1 12
## 165 BASHAR Littorina obtusata Smooth periwinkle 2017 0 17
## 166 BASHAR Littorina littorea Common periwinkle 2018 6 47
## 167 BASHAR Littorina obtusata Smooth periwinkle 2018 7 44
## 168 BASHAR Littorina saxatilis Rough periwinkle 2018 0 5
## 169 BASHAR Littorina littorea Common periwinkle 2019 0 27
## 170 BASHAR Littorina obtusata Smooth periwinkle 2019 0 44
## 171 BASHAR Littorina saxatilis Rough periwinkle 2019 0 1
## 172 BASHAR Carcinus maenas Green crab 2021 0 2
## 173 BASHAR Littorina littorea Common periwinkle 2021 0 21
## 174 BASHAR Littorina obtusata Smooth periwinkle 2021 0 53
## 175 BASHAR Littorina saxatilis Rough periwinkle 2021 0 2
## 176 BASHAR Nucella lapillus Dogwhelk 2021 0 2
## 177 BASHAR Littorina littorea Common periwinkle 2022 0 35
## 178 BASHAR Littorina obtusata Smooth periwinkle 2022 0 75
## 179 BASHAR Nucella lapillus Dogwhelk 2022 0 7
## 180 BASHAR Littorina littorea Common periwinkle 2023 3 46
## 181 BASHAR Littorina obtusata Smooth periwinkle 2023 0 15
## 182 BASHAR Littorina saxatilis Rough periwinkle 2023 0 1
## 183 BASHAR Littorina littorea Common periwinkle 2024 1 18
## 184 BASHAR Littorina obtusata Smooth periwinkle 2024 1 7
## 185 BASHAR Littorina littorea Common periwinkle 2013 0 1
## 186 BASHAR Littorina obtusata Smooth periwinkle 2013 0 1
## 187 BASHAR Littorina littorea Common periwinkle 2014 0 2
## 188 BASHAR Littorina obtusata Smooth periwinkle 2014 0 1
## 189 BASHAR Littorina littorea Common periwinkle 2015 0 10
## 190 BASHAR Littorina obtusata Smooth periwinkle 2015 2 20
## 191 BASHAR Littorina littorea Common periwinkle 2016 0 2
## 192 BASHAR Littorina obtusata Smooth periwinkle 2016 0 61
## 193 BASHAR Nucella lapillus Dogwhelk 2016 0 2
## 194 BASHAR Littorina littorea Common periwinkle 2017 1 56
## 195 BASHAR Littorina obtusata Smooth periwinkle 2017 1 47
## 196 BASHAR Nucella lapillus Dogwhelk 2017 0 3
## 197 BASHAR Littorina littorea Common periwinkle 2018 4 52
## 198 BASHAR Littorina obtusata Smooth periwinkle 2018 6 25
## 199 BASHAR Littorina littorea Common periwinkle 2019 1 41
## 200 BASHAR Littorina obtusata Smooth periwinkle 2019 1 43
## 201 BASHAR Nucella lapillus Dogwhelk 2019 0 4
## 202 BASHAR Carcinus maenas Green crab 2021 0 1
## 203 BASHAR Littorina littorea Common periwinkle 2021 0 54
## 204 BASHAR Littorina obtusata Smooth periwinkle 2021 0 91
## 205 BASHAR Nucella lapillus Dogwhelk 2021 0 4
## 206 BASHAR Littorina littorea Common periwinkle 2022 0 20
## 207 BASHAR Littorina obtusata Smooth periwinkle 2022 0 24
## 208 BASHAR Nucella lapillus Dogwhelk 2022 0 1
## 209 BASHAR Carcinus maenas Green crab 2023 0 1
## 210 BASHAR Littorina littorea Common periwinkle 2023 1 23
## 211 BASHAR Littorina obtusata Smooth periwinkle 2023 0 10
## 212 BASHAR Nucella lapillus Dogwhelk 2023 0 2
## 213 BASHAR Carcinus maenas Green crab 2024 0 1
## 214 BASHAR Littorina littorea Common periwinkle 2024 0 25
## 215 BASHAR Littorina obtusata Smooth periwinkle 2024 0 13
## 216 BASHAR Littorina saxatilis Rough periwinkle 2024 0 1
## 217 BASHAR Nucella lapillus Dogwhelk 2024 0 2
## 218 BASHAR Littorina obtusata Smooth periwinkle 2014 1 4
## 219 BASHAR Littorina littorea Common periwinkle 2015 0 1
## 220 BASHAR Littorina obtusata Smooth periwinkle 2015 0 10
## 221 BASHAR Littorina littorea Common periwinkle 2016 0 3
## 222 BASHAR Littorina obtusata Smooth periwinkle 2016 0 20
## 223 BASHAR Nucella lapillus Dogwhelk 2016 0 11
## 224 BASHAR Littorina littorea Common periwinkle 2017 1 57
## 225 BASHAR Littorina obtusata Smooth periwinkle 2017 0 21
## 226 BASHAR Littorina littorea Common periwinkle 2018 2 51
## 227 BASHAR Littorina obtusata Smooth periwinkle 2018 4 23
## 228 BASHAR Littorina saxatilis Rough periwinkle 2018 0 1
## 229 BASHAR Littorina littorea Common periwinkle 2019 0 17
## 230 BASHAR Littorina obtusata Smooth periwinkle 2019 0 39
## 231 BASHAR Nucella lapillus Dogwhelk 2019 0 6
## 232 BASHAR Carcinus maenas Green crab 2021 0 9
## 233 BASHAR Littorina littorea Common periwinkle 2021 0 9
## 234 BASHAR Littorina obtusata Smooth periwinkle 2021 0 11
## 235 BASHAR Littorina littorea Common periwinkle 2022 1 31
## 236 BASHAR Littorina obtusata Smooth periwinkle 2022 0 45
## 237 BASHAR Nucella lapillus Dogwhelk 2022 0 4
## 238 BASHAR Littorina littorea Common periwinkle 2023 1 23
## 239 BASHAR Littorina obtusata Smooth periwinkle 2023 0 8
## 240 BASHAR Nucella lapillus Dogwhelk 2023 0 1
## 241 BASHAR Carcinus maenas Green crab 2024 0 2
## 242 BASHAR Littorina littorea Common periwinkle 2024 0 25
## 243 BASHAR Littorina obtusata Smooth periwinkle 2024 0 5
## 244 BASHAR Nucella lapillus Dogwhelk 2024 0 1
## 245 BASHAR Littorina obtusata Smooth periwinkle 2013 0 1
## 246 BASHAR Littorina saxatilis Rough periwinkle 2013 0 1
## 247 BASHAR Littorina littorea Common periwinkle 2014 0 1
## 248 BASHAR Littorina obtusata Smooth periwinkle 2014 2 0
## 249 BASHAR Littorina littorea Common periwinkle 2015 0 1
## 250 BASHAR Littorina obtusata Smooth periwinkle 2015 2 19
## 251 BASHAR Nucella lapillus Dogwhelk 2015 0 2
## 252 BASHAR Littorina littorea Common periwinkle 2016 0 4
## 253 BASHAR Littorina obtusata Smooth periwinkle 2016 0 21
## 254 BASHAR Nucella lapillus Dogwhelk 2016 1 4
## 255 BASHAR Testudinalia testudinalis Limpet 2016 0 1
## 256 BASHAR Littorina littorea Common periwinkle 2017 1 25
## 257 BASHAR Littorina obtusata Smooth periwinkle 2017 1 26
## 258 BASHAR Littorina saxatilis Rough periwinkle 2017 0 1
## 259 BASHAR Littorina littorea Common periwinkle 2018 1 28
## 260 BASHAR Littorina obtusata Smooth periwinkle 2018 2 29
## 261 BASHAR Littorina saxatilis Rough periwinkle 2018 0 1
## 262 BASHAR Littorina littorea Common periwinkle 2019 3 19
## 263 BASHAR Littorina obtusata Smooth periwinkle 2019 0 56
## 264 BASHAR Littorina saxatilis Rough periwinkle 2019 0 1
## 265 BASHAR Nucella lapillus Dogwhelk 2019 0 1
## 266 BASHAR Littorina littorea Common periwinkle 2021 0 18
## 267 BASHAR Littorina obtusata Smooth periwinkle 2021 0 28
## 268 BASHAR Littorina saxatilis Rough periwinkle 2021 0 1
## 269 BASHAR Nucella lapillus Dogwhelk 2021 0 2
## 270 BASHAR Littorina littorea Common periwinkle 2022 1 29
## 271 BASHAR Littorina obtusata Smooth periwinkle 2022 0 72
## 272 BASHAR Littorina saxatilis Rough periwinkle 2022 0 2
## 273 BASHAR Nucella lapillus Dogwhelk 2022 0 11
## 274 BASHAR Carcinus maenas Green crab 2024 0 1
## 275 BASHAR Littorina littorea Common periwinkle 2024 0 8
## 276 BASHAR Littorina obtusata Smooth periwinkle 2024 0 5
## 277 BASHAR Nucella lapillus Dogwhelk 2024 0 4
## 278 BASHAR Littorina obtusata Smooth periwinkle 2015 1 10
## 279 BASHAR Littorina littorea Common periwinkle 2016 0 1
## 280 BASHAR Littorina obtusata Smooth periwinkle 2016 0 26
## 281 BASHAR Littorina littorea Common periwinkle 2017 0 5
## 282 BASHAR Littorina littorea Common periwinkle 2018 1 10
## 283 BASHAR Littorina obtusata Smooth periwinkle 2018 2 20
## 284 BASHAR Littorina littorea Common periwinkle 2019 2 14
## 285 BASHAR Littorina obtusata Smooth periwinkle 2019 0 9
## 286 BASHAR Nucella lapillus Dogwhelk 2019 0 1
## 287 BASHAR Carcinus maenas Green crab 2021 0 1
## 288 BASHAR Littorina littorea Common periwinkle 2021 0 7
## 289 BASHAR Littorina obtusata Smooth periwinkle 2021 1 13
## 290 BASHAR Littorina saxatilis Rough periwinkle 2021 0 1
## 291 BASHAR Nucella lapillus Dogwhelk 2021 0 1
## 292 BASHAR Testudinalia testudinalis Limpet 2021 1 0
## 293 BASHAR Littorina littorea Common periwinkle 2022 0 47
## 294 BASHAR Littorina obtusata Smooth periwinkle 2022 0 107
## 295 BASHAR Littorina saxatilis Rough periwinkle 2022 0 1
## 296 BASHAR Carcinus maenas Green crab 2023 0 1
## 297 BASHAR Littorina littorea Common periwinkle 2023 0 21
## 298 BASHAR Littorina obtusata Smooth periwinkle 2023 0 6
## 299 BASHAR Nucella lapillus Dogwhelk 2023 0 3
## 300 BASHAR Littorina littorea Common periwinkle 2024 0 17
## 301 BASHAR Littorina obtusata Smooth periwinkle 2024 0 17
## 302 BASHAR Nucella lapillus Dogwhelk 2024 0 7
## 303 BASHAR Littorina littorea Common periwinkle 2013 2 18
## 304 BASHAR Littorina littorea Common periwinkle 2013 2 30
## 305 BASHAR Littorina obtusata Smooth periwinkle 2013 0 31
## 306 BASHAR Littorina obtusata Smooth periwinkle 2013 1 40
## 307 BASHAR Nucella lapillus Dogwhelk 2013 0 5
## 308 BASHAR Testudinalia testudinalis Limpet 2013 0 4
## 309 BASHAR Testudinalia testudinalis Limpet 2013 0 11
## 310 BASHAR Littorina littorea Common periwinkle 2014 0 43
## 311 BASHAR Littorina obtusata Smooth periwinkle 2014 0 58
## 312 BASHAR Testudinalia testudinalis Limpet 2014 0 12
## 313 BASHAR Littorina littorea Common periwinkle 2015 0 10
## 314 BASHAR Littorina obtusata Smooth periwinkle 2015 3 31
## 315 BASHAR Nucella lapillus Dogwhelk 2015 0 1
## 316 BASHAR Testudinalia testudinalis Limpet 2015 0 31
## 317 BASHAR Littorina littorea Common periwinkle 2016 0 84
## 318 BASHAR Littorina obtusata Smooth periwinkle 2016 0 74
## 319 BASHAR Nucella lapillus Dogwhelk 2016 0 2
## 320 BASHAR Littorina littorea Common periwinkle 2017 0 96
## 321 BASHAR Littorina obtusata Smooth periwinkle 2017 0 19
## 322 BASHAR Nucella lapillus Dogwhelk 2017 0 9
## 323 BASHAR Testudinalia testudinalis Limpet 2017 0 2
## 324 BASHAR Littorina littorea Common periwinkle 2018 4 66
## 325 BASHAR Littorina obtusata Smooth periwinkle 2018 0 18
## 326 BASHAR Nucella lapillus Dogwhelk 2018 0 1
## 327 BASHAR Testudinalia testudinalis Limpet 2018 0 1
## 328 BASHAR Carcinus maenas Green crab 2019 0 1
## 329 BASHAR Littorina littorea Common periwinkle 2019 0 116
## 330 BASHAR Littorina obtusata Smooth periwinkle 2019 0 38
## 331 BASHAR Testudinalia testudinalis Limpet 2019 0 6
## 332 BASHAR Carcinus maenas Green crab 2021 0 3
## 333 BASHAR Littorina littorea Common periwinkle 2021 2 131
## 334 BASHAR Littorina obtusata Smooth periwinkle 2021 1 90
## 335 BASHAR Nucella lapillus Dogwhelk 2021 0 6
## 336 BASHAR Testudinalia testudinalis Limpet 2021 0 1
## 337 BASHAR Carcinus maenas Green crab 2022 0 3
## 338 BASHAR Littorina littorea Common periwinkle 2022 0 234
## 339 BASHAR Littorina obtusata Smooth periwinkle 2022 0 46
## 340 BASHAR Nucella lapillus Dogwhelk 2022 1 2
## 341 BASHAR Testudinalia testudinalis Limpet 2022 0 2
## 342 BASHAR Carcinus maenas Green crab 2023 0 4
## 343 BASHAR Littorina littorea Common periwinkle 2023 11 188
## 344 BASHAR Littorina obtusata Smooth periwinkle 2023 3 24
## 345 BASHAR Nucella lapillus Dogwhelk 2023 0 5
## 346 BASHAR Testudinalia testudinalis Limpet 2023 0 2
## 347 BASHAR Carcinus maenas Green crab 2024 0 1
## 348 BASHAR Littorina littorea Common periwinkle 2024 1 116
## 349 BASHAR Littorina obtusata Smooth periwinkle 2024 1 18
## 350 BASHAR Testudinalia testudinalis Limpet 2024 0 2
## 351 BASHAR Littorina littorea Common periwinkle 2013 0 5
## 352 BASHAR Littorina obtusata Smooth periwinkle 2013 0 16
## 353 BASHAR Littorina obtusata Smooth periwinkle 2013 1 80
## 354 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 355 BASHAR Nucella lapillus Dogwhelk 2013 0 6
## 356 BASHAR Testudinalia testudinalis Limpet 2013 0 3
## 357 BASHAR Littorina littorea Common periwinkle 2014 0 10
## 358 BASHAR Littorina obtusata Smooth periwinkle 2014 0 47
## 359 BASHAR Nucella lapillus Dogwhelk 2014 0 1
## 360 BASHAR Testudinalia testudinalis Limpet 2014 0 4
## 361 BASHAR Littorina littorea Common periwinkle 2015 1 29
## 362 BASHAR Littorina obtusata Smooth periwinkle 2015 2 26
## 363 BASHAR Nucella lapillus Dogwhelk 2015 1 1
## 364 BASHAR Littorina littorea Common periwinkle 2016 1 42
## 365 BASHAR Littorina obtusata Smooth periwinkle 2016 2 103
## 366 BASHAR Littorina saxatilis Rough periwinkle 2016 0 1
## 367 BASHAR Nucella lapillus Dogwhelk 2016 0 6
## 368 BASHAR Testudinalia testudinalis Limpet 2016 0 1
## 369 BASHAR Littorina littorea Common periwinkle 2017 0 71
## 370 BASHAR Littorina obtusata Smooth periwinkle 2017 0 35
## 371 BASHAR Nucella lapillus Dogwhelk 2017 0 4
## 372 BASHAR Littorina littorea Common periwinkle 2018 1 100
## 373 BASHAR Littorina obtusata Smooth periwinkle 2018 1 29
## 374 BASHAR Nucella lapillus Dogwhelk 2018 0 1
## 375 BASHAR Testudinalia testudinalis Limpet 2018 0 1
## 376 BASHAR Littorina littorea Common periwinkle 2019 1 85
## 377 BASHAR Littorina obtusata Smooth periwinkle 2019 0 34
## 378 BASHAR Nucella lapillus Dogwhelk 2019 0 1
## 379 BASHAR Carcinus maenas Green crab 2021 0 5
## 380 BASHAR Littorina littorea Common periwinkle 2021 0 90
## 381 BASHAR Littorina obtusata Smooth periwinkle 2021 0 87
## 382 BASHAR Nucella lapillus Dogwhelk 2021 0 2
## 383 BASHAR Carcinus maenas Green crab 2022 0 1
## 384 BASHAR Littorina littorea Common periwinkle 2022 4 123
## 385 BASHAR Littorina obtusata Smooth periwinkle 2022 0 44
## 386 BASHAR Littorina saxatilis Rough periwinkle 2022 0 2
## 387 BASHAR Nucella lapillus Dogwhelk 2022 0 2
## 388 BASHAR Testudinalia testudinalis Limpet 2022 0 2
## 389 BASHAR Carcinus maenas Green crab 2023 0 1
## 390 BASHAR Littorina littorea Common periwinkle 2023 1 148
## 391 BASHAR Littorina obtusata Smooth periwinkle 2023 0 37
## 392 BASHAR Nucella lapillus Dogwhelk 2023 0 3
## 393 BASHAR Littorina littorea Common periwinkle 2024 1 84
## 394 BASHAR Littorina obtusata Smooth periwinkle 2024 0 41
## 395 BASHAR Nucella lapillus Dogwhelk 2024 0 3
## 396 BASHAR Littorina littorea Common periwinkle 2013 0 3
## 397 BASHAR Littorina littorea Common periwinkle 2013 0 4
## 398 BASHAR Littorina obtusata Smooth periwinkle 2013 0 17
## 399 BASHAR Littorina obtusata Smooth periwinkle 2013 0 24
## 400 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 401 BASHAR Nucella lapillus Dogwhelk 2013 0 2
## 402 BASHAR Testudinalia testudinalis Limpet 2013 0 4
## 403 BASHAR Testudinalia testudinalis Limpet 2013 0 13
## 404 BASHAR Littorina littorea Common periwinkle 2014 0 41
## 405 BASHAR Littorina obtusata Smooth periwinkle 2014 0 59
## 406 BASHAR Nucella lapillus Dogwhelk 2014 0 3
## 407 BASHAR Testudinalia testudinalis Limpet 2014 0 3
## 408 BASHAR Littorina littorea Common periwinkle 2015 3 23
## 409 BASHAR Littorina obtusata Smooth periwinkle 2015 0 25
## 410 BASHAR Testudinalia testudinalis Limpet 2015 0 5
## 411 BASHAR Littorina littorea Common periwinkle 2016 2 46
## 412 BASHAR Littorina obtusata Smooth periwinkle 2016 1 42
## 413 BASHAR Nucella lapillus Dogwhelk 2016 1 2
## 414 BASHAR Littorina littorea Common periwinkle 2017 0 121
## 415 BASHAR Littorina obtusata Smooth periwinkle 2017 0 30
## 416 BASHAR Nucella lapillus Dogwhelk 2017 0 2
## 417 BASHAR Testudinalia testudinalis Limpet 2017 0 1
## 418 BASHAR Littorina littorea Common periwinkle 2018 7 165
## 419 BASHAR Littorina obtusata Smooth periwinkle 2018 0 34
## 420 BASHAR Nucella lapillus Dogwhelk 2018 0 4
## 421 BASHAR Testudinalia testudinalis Limpet 2018 0 6
## 422 BASHAR Littorina littorea Common periwinkle 2019 0 86
## 423 BASHAR Littorina obtusata Smooth periwinkle 2019 0 27
## 424 BASHAR Littorina littorea Common periwinkle 2021 7 249
## 425 BASHAR Littorina obtusata Smooth periwinkle 2021 1 42
## 426 BASHAR Nucella lapillus Dogwhelk 2021 0 2
## 427 BASHAR Testudinalia testudinalis Limpet 2021 0 1
## 428 BASHAR Littorina littorea Common periwinkle 2022 0 165
## 429 BASHAR Littorina obtusata Smooth periwinkle 2022 0 34
## 430 BASHAR Nucella lapillus Dogwhelk 2022 0 1
## 431 BASHAR Testudinalia testudinalis Limpet 2022 0 4
## 432 BASHAR Carcinus maenas Green crab 2023 0 1
## 433 BASHAR Littorina littorea Common periwinkle 2023 2 151
## 434 BASHAR Littorina obtusata Smooth periwinkle 2023 0 16
## 435 BASHAR Nucella lapillus Dogwhelk 2023 0 3
## 436 BASHAR Carcinus maenas Green crab 2024 0 3
## 437 BASHAR Littorina littorea Common periwinkle 2024 5 106
## 438 BASHAR Littorina obtusata Smooth periwinkle 2024 0 26
## 439 BASHAR Nucella lapillus Dogwhelk 2024 0 1
## 440 BASHAR Littorina littorea Common periwinkle 2013 1 22
## 441 BASHAR Littorina littorea Common periwinkle 2013 1 26
## 442 BASHAR Littorina obtusata Smooth periwinkle 2013 2 32
## 443 BASHAR Littorina obtusata Smooth periwinkle 2013 0 80
## 444 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 445 BASHAR Testudinalia testudinalis Limpet 2013 0 2
## 446 BASHAR Testudinalia testudinalis Limpet 2013 0 9
## 447 BASHAR Littorina littorea Common periwinkle 2014 0 25
## 448 BASHAR Littorina obtusata Smooth periwinkle 2014 0 25
## 449 BASHAR Nucella lapillus Dogwhelk 2014 0 1
## 450 BASHAR Littorina littorea Common periwinkle 2015 2 18
## 451 BASHAR Littorina obtusata Smooth periwinkle 2015 1 4
## 452 BASHAR Littorina littorea Common periwinkle 2016 0 68
## 453 BASHAR Littorina obtusata Smooth periwinkle 2016 0 51
## 454 BASHAR Nucella lapillus Dogwhelk 2016 0 1
## 455 BASHAR Testudinalia testudinalis Limpet 2016 0 12
## 456 BASHAR Littorina littorea Common periwinkle 2017 0 124
## 457 BASHAR Littorina obtusata Smooth periwinkle 2017 0 41
## 458 BASHAR Nucella lapillus Dogwhelk 2017 0 1
## 459 BASHAR Littorina littorea Common periwinkle 2018 4 181
## 460 BASHAR Littorina obtusata Smooth periwinkle 2018 1 28
## 461 BASHAR Nucella lapillus Dogwhelk 2018 0 2
## 462 BASHAR Testudinalia testudinalis Limpet 2018 0 3
## 463 BASHAR Littorina littorea Common periwinkle 2019 0 102
## 464 BASHAR Littorina obtusata Smooth periwinkle 2019 0 31
## 465 BASHAR Testudinalia testudinalis Limpet 2019 0 4
## 466 BASHAR Carcinus maenas Green crab 2021 0 3
## 467 BASHAR Littorina littorea Common periwinkle 2021 2 212
## 468 BASHAR Littorina obtusata Smooth periwinkle 2021 0 34
## 469 BASHAR Nucella lapillus Dogwhelk 2021 0 1
## 470 BASHAR Testudinalia testudinalis Limpet 2021 0 1
## 471 BASHAR Littorina littorea Common periwinkle 2022 17 282
## 472 BASHAR Littorina obtusata Smooth periwinkle 2022 4 33
## 473 BASHAR Nucella lapillus Dogwhelk 2022 0 9
## 474 BASHAR Testudinalia testudinalis Limpet 2022 1 5
## 475 BASHAR Littorina littorea Common periwinkle 2023 1 130
## 476 BASHAR Littorina obtusata Smooth periwinkle 2023 0 7
## 477 BASHAR Nucella lapillus Dogwhelk 2023 0 1
## 478 BASHAR Carcinus maenas Green crab 2024 0 2
## 479 BASHAR Littorina littorea Common periwinkle 2024 11 138
## 480 BASHAR Littorina obtusata Smooth periwinkle 2024 1 21
## 481 BASHAR Littorina littorea Common periwinkle 2013 0 5
## 482 BASHAR Littorina littorea Common periwinkle 2013 0 13
## 483 BASHAR Littorina obtusata Smooth periwinkle 2013 0 45
## 484 BASHAR Littorina obtusata Smooth periwinkle 2013 1 89
## 485 BASHAR Nucella lapillus Dogwhelk 2013 1 0
## 486 BASHAR Nucella lapillus Dogwhelk 2013 0 11
## 487 BASHAR Testudinalia testudinalis Limpet 2013 0 5
## 488 BASHAR Testudinalia testudinalis Limpet 2013 0 13
## 489 BASHAR Littorina littorea Common periwinkle 2014 0 9
## 490 BASHAR Littorina obtusata Smooth periwinkle 2014 0 35
## 491 BASHAR Testudinalia testudinalis Limpet 2014 0 1
## 492 BASHAR Littorina littorea Common periwinkle 2015 5 17
## 493 BASHAR Littorina obtusata Smooth periwinkle 2015 4 35
## 494 BASHAR Nucella lapillus Dogwhelk 2015 1 6
## 495 BASHAR Testudinalia testudinalis Limpet 2015 0 2
## 496 BASHAR Littorina littorea Common periwinkle 2016 0 61
## 497 BASHAR Littorina obtusata Smooth periwinkle 2016 2 49
## 498 BASHAR Nucella lapillus Dogwhelk 2016 1 10
## 499 BASHAR Testudinalia testudinalis Limpet 2016 0 4
## 500 BASHAR Littorina littorea Common periwinkle 2017 0 80
## 501 BASHAR Littorina obtusata Smooth periwinkle 2017 0 28
## 502 BASHAR Nucella lapillus Dogwhelk 2017 0 1
## 503 BASHAR Testudinalia testudinalis Limpet 2017 0 2
## 504 BASHAR Littorina littorea Common periwinkle 2018 0 97
## 505 BASHAR Littorina obtusata Smooth periwinkle 2018 2 39
## 506 BASHAR Nucella lapillus Dogwhelk 2018 0 3
## 507 BASHAR Testudinalia testudinalis Limpet 2018 0 10
## 508 BASHAR Littorina littorea Common periwinkle 2019 0 70
## 509 BASHAR Littorina obtusata Smooth periwinkle 2019 0 18
## 510 BASHAR Nucella lapillus Dogwhelk 2019 1 7
## 511 BASHAR Carcinus maenas Green crab 2021 0 1
## 512 BASHAR Littorina littorea Common periwinkle 2021 3 130
## 513 BASHAR Littorina obtusata Smooth periwinkle 2021 2 39
## 514 BASHAR Nucella lapillus Dogwhelk 2021 0 3
## 515 BASHAR Testudinalia testudinalis Limpet 2021 0 1
## 516 BASHAR Littorina littorea Common periwinkle 2022 6 134
## 517 BASHAR Littorina obtusata Smooth periwinkle 2022 0 63
## 518 BASHAR Nucella lapillus Dogwhelk 2022 0 10
## 519 BASHAR Littorina littorea Common periwinkle 2023 1 168
## 520 BASHAR Littorina obtusata Smooth periwinkle 2023 0 11
## 521 BASHAR Nucella lapillus Dogwhelk 2023 0 3
## 522 BASHAR Testudinalia testudinalis Limpet 2023 0 1
## 523 BASHAR Carcinus maenas Green crab 2024 0 2
## 524 BASHAR Littorina littorea Common periwinkle 2024 8 73
## 525 BASHAR Littorina obtusata Smooth periwinkle 2024 3 29
## 526 BASHAR Nucella lapillus Dogwhelk 2024 0 2
## 527 BASHAR Littorina littorea Common periwinkle 2013 0 2
## 528 BASHAR Littorina littorea Common periwinkle 2013 0 4
## 529 BASHAR Littorina obtusata Smooth periwinkle 2013 0 1
## 530 BASHAR Littorina obtusata Smooth periwinkle 2013 0 1
## 531 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 532 BASHAR Testudinalia testudinalis Limpet 2013 0 2
## 533 BASHAR Littorina littorea Common periwinkle 2014 0 6
## 534 BASHAR Littorina obtusata Smooth periwinkle 2014 0 2
## 535 BASHAR Nucella lapillus Dogwhelk 2014 0 2
## 536 BASHAR Testudinalia testudinalis Limpet 2014 0 3
## 537 BASHAR Littorina littorea Common periwinkle 2015 9 69
## 538 BASHAR Littorina obtusata Smooth periwinkle 2015 1 18
## 539 BASHAR Testudinalia testudinalis Limpet 2015 0 6
## 540 BASHAR Littorina littorea Common periwinkle 2016 3 18
## 541 BASHAR Nucella lapillus Dogwhelk 2016 0 2
## 542 BASHAR Testudinalia testudinalis Limpet 2016 0 6
## 543 BASHAR Littorina littorea Common periwinkle 2017 0 92
## 544 BASHAR Littorina obtusata Smooth periwinkle 2017 0 2
## 545 BASHAR Littorina littorea Common periwinkle 2018 5 94
## 546 BASHAR Littorina obtusata Smooth periwinkle 2018 0 5
## 547 BASHAR Testudinalia testudinalis Limpet 2018 0 2
## 548 BASHAR Littorina littorea Common periwinkle 2019 0 234
## 549 BASHAR Littorina obtusata Smooth periwinkle 2019 0 9
## 550 BASHAR Testudinalia testudinalis Limpet 2019 0 6
## 551 BASHAR Carcinus maenas Green crab 2021 0 3
## 552 BASHAR Littorina littorea Common periwinkle 2021 18 261
## 553 BASHAR Littorina obtusata Smooth periwinkle 2021 0 4
## 554 BASHAR Testudinalia testudinalis Limpet 2021 0 5
## 555 BASHAR Littorina littorea Common periwinkle 2022 11 233
## 556 BASHAR Littorina obtusata Smooth periwinkle 2022 0 12
## 557 BASHAR Nucella lapillus Dogwhelk 2022 0 1
## 558 BASHAR Littorina littorea Common periwinkle 2023 10 182
## 559 BASHAR Carcinus maenas Green crab 2024 0 2
## 560 BASHAR Littorina littorea Common periwinkle 2024 10 153
## 561 BASHAR Nucella lapillus Dogwhelk 2024 0 3
## 562 BASHAR Testudinalia testudinalis Limpet 2024 0 1
## 563 BASHAR Littorina obtusata Smooth periwinkle 2013 0 1
## 564 BASHAR Littorina littorea Common periwinkle 2014 0 23
## 565 BASHAR Littorina obtusata Smooth periwinkle 2014 0 3
## 566 BASHAR Testudinalia testudinalis Limpet 2014 0 1
## 567 BASHAR Littorina littorea Common periwinkle 2015 10 94
## 568 BASHAR Littorina obtusata Smooth periwinkle 2015 0 5
## 569 BASHAR Nucella lapillus Dogwhelk 2015 1 0
## 570 BASHAR Testudinalia testudinalis Limpet 2015 0 12
## 571 BASHAR Littorina littorea Common periwinkle 2016 0 30
## 572 BASHAR Littorina obtusata Smooth periwinkle 2016 1 0
## 573 BASHAR Testudinalia testudinalis Limpet 2016 0 2
## 574 BASHAR Littorina littorea Common periwinkle 2017 0 106
## 575 BASHAR Littorina obtusata Smooth periwinkle 2017 0 1
## 576 BASHAR Testudinalia testudinalis Limpet 2017 0 1
## 577 BASHAR Littorina littorea Common periwinkle 2018 12 95
## 578 BASHAR Littorina obtusata Smooth periwinkle 2018 0 4
## 579 BASHAR Nucella lapillus Dogwhelk 2018 0 1
## 580 BASHAR Testudinalia testudinalis Limpet 2018 0 3
## 581 BASHAR Littorina littorea Common periwinkle 2019 1 170
## 582 BASHAR Littorina obtusata Smooth periwinkle 2019 0 8
## 583 BASHAR Nucella lapillus Dogwhelk 2019 0 1
## 584 BASHAR Testudinalia testudinalis Limpet 2019 0 1
## 585 BASHAR Carcinus maenas Green crab 2021 1 0
## 586 BASHAR Littorina littorea Common periwinkle 2021 257 0
## 587 BASHAR Littorina obtusata Smooth periwinkle 2021 1 0
## 588 BASHAR Testudinalia testudinalis Limpet 2021 0 1
## 589 BASHAR Littorina littorea Common periwinkle 2022 9 123
## 590 BASHAR Littorina obtusata Smooth periwinkle 2022 0 8
## 591 BASHAR Nucella lapillus Dogwhelk 2022 0 4
## 592 BASHAR Testudinalia testudinalis Limpet 2022 0 4
## 593 BASHAR Littorina littorea Common periwinkle 2023 6 169
## 594 BASHAR Testudinalia testudinalis Limpet 2023 0 2
## 595 BASHAR Carcinus maenas Green crab 2024 0 2
## 596 BASHAR Littorina littorea Common periwinkle 2024 6 96
## 597 BASHAR Littorina obtusata Smooth periwinkle 2024 0 1
## 598 BASHAR Nucella lapillus Dogwhelk 2024 0 1
## 599 BASHAR Testudinalia testudinalis Limpet 2024 0 1
## 600 BASHAR Littorina littorea Common periwinkle 2013 0 1
## 601 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 602 BASHAR Littorina littorea Common periwinkle 2014 0 6
## 603 BASHAR Testudinalia testudinalis Limpet 2014 0 3
## 604 BASHAR Littorina littorea Common periwinkle 2015 5 15
## 605 BASHAR Littorina obtusata Smooth periwinkle 2015 0 3
## 606 BASHAR Nucella lapillus Dogwhelk 2015 1 3
## 607 BASHAR Testudinalia testudinalis Limpet 2015 0 3
## 608 BASHAR Littorina littorea Common periwinkle 2016 0 51
## 609 BASHAR Testudinalia testudinalis Limpet 2016 0 2
## 610 BASHAR Littorina littorea Common periwinkle 2017 0 63
## 611 BASHAR Littorina obtusata Smooth periwinkle 2017 0 1
## 612 BASHAR Littorina saxatilis Rough periwinkle 2017 0 1
## 613 BASHAR Testudinalia testudinalis Limpet 2017 0 2
## 614 BASHAR Littorina littorea Common periwinkle 2018 5 101
## 615 BASHAR Testudinalia testudinalis Limpet 2018 0 4
## 616 BASHAR Littorina littorea Common periwinkle 2019 7 125
## 617 BASHAR Littorina obtusata Smooth periwinkle 2019 0 2
## 618 BASHAR Nucella lapillus Dogwhelk 2019 0 1
## 619 BASHAR Testudinalia testudinalis Limpet 2019 0 5
## 620 BASHAR Littorina littorea Common periwinkle 2021 13 107
## 621 BASHAR Littorina littorea Common periwinkle 2022 0 148
## 622 BASHAR Littorina obtusata Smooth periwinkle 2022 0 2
## 623 BASHAR Nucella lapillus Dogwhelk 2022 0 2
## 624 BASHAR Carcinus maenas Green crab 2023 0 1
## 625 BASHAR Littorina littorea Common periwinkle 2023 34 180
## 626 BASHAR Testudinalia testudinalis Limpet 2023 0 2
## 627 BASHAR Littorina littorea Common periwinkle 2024 4 36
## 628 BASHAR Littorina littorea Common periwinkle 2013 0 1
## 629 BASHAR Littorina littorea Common periwinkle 2013 0 1
## 630 BASHAR Littorina littorea Common periwinkle 2014 2 12
## 631 BASHAR Littorina obtusata Smooth periwinkle 2014 0 1
## 632 BASHAR Testudinalia testudinalis Limpet 2014 0 6
## 633 BASHAR Littorina littorea Common periwinkle 2015 9 58
## 634 BASHAR Littorina obtusata Smooth periwinkle 2015 0 1
## 635 BASHAR Testudinalia testudinalis Limpet 2015 0 5
## 636 BASHAR Littorina littorea Common periwinkle 2016 1 6
## 637 BASHAR Testudinalia testudinalis Limpet 2016 0 1
## 638 BASHAR Littorina littorea Common periwinkle 2017 1 131
## 639 BASHAR Littorina obtusata Smooth periwinkle 2017 0 1
## 640 BASHAR Nucella lapillus Dogwhelk 2017 0 1
## 641 BASHAR Testudinalia testudinalis Limpet 2017 0 2
## 642 BASHAR Littorina littorea Common periwinkle 2018 12 106
## 643 BASHAR Littorina obtusata Smooth periwinkle 2018 0 1
## 644 BASHAR Nucella lapillus Dogwhelk 2018 0 3
## 645 BASHAR Testudinalia testudinalis Limpet 2018 0 5
## 646 BASHAR Littorina littorea Common periwinkle 2019 11 1960
## 647 BASHAR Littorina obtusata Smooth periwinkle 2019 0 1
## 648 BASHAR Testudinalia testudinalis Limpet 2019 0 3
## 649 BASHAR Carcinus maenas Green crab 2021 0 2
## 650 BASHAR Littorina littorea Common periwinkle 2021 15 224
## 651 BASHAR Nucella lapillus Dogwhelk 2021 3 0
## 652 BASHAR Testudinalia testudinalis Limpet 2021 0 3
## 653 BASHAR Littorina littorea Common periwinkle 2022 3 100
## 654 BASHAR Littorina obtusata Smooth periwinkle 2022 0 4
## 655 BASHAR Testudinalia testudinalis Limpet 2022 0 6
## 656 BASHAR Littorina littorea Common periwinkle 2023 26 150
## 657 BASHAR Testudinalia testudinalis Limpet 2023 0 3
## 658 BASHAR Littorina littorea Common periwinkle 2024 3 62
## 659 BASHAR Nucella lapillus Dogwhelk 2013 0 1
## 660 BASHAR Nucella lapillus Dogwhelk 2013 0 2
## 661 BASHAR Littorina littorea Common periwinkle 2014 0 1
## 662 BASHAR Littorina littorea Common periwinkle 2015 4 16
## 663 BASHAR Littorina littorea Common periwinkle 2016 1 5
## 664 BASHAR Nucella lapillus Dogwhelk 2016 0 2
## 665 BASHAR Testudinalia testudinalis Limpet 2016 0 2
## 666 BASHAR Littorina littorea Common periwinkle 2017 0 96
## 667 BASHAR Nucella lapillus Dogwhelk 2017 0 1
## 668 BASHAR Testudinalia testudinalis Limpet 2017 0 2
## 669 BASHAR Littorina littorea Common periwinkle 2018 0 101
## 670 BASHAR Nucella lapillus Dogwhelk 2018 0 3
## 671 BASHAR Testudinalia testudinalis Limpet 2018 0 2
## 672 BASHAR Littorina littorea Common periwinkle 2019 0 39
## 673 BASHAR Littorina obtusata Smooth periwinkle 2019 0 1
## 674 BASHAR Testudinalia testudinalis Limpet 2019 0 1
## 675 BASHAR Littorina littorea Common periwinkle 2021 6 11
## 676 BASHAR Nucella lapillus Dogwhelk 2021 2 0
## 677 BASHAR Testudinalia testudinalis Limpet 2021 0 1
## 678 BASHAR Littorina littorea Common periwinkle 2022 5 45
## 679 BASHAR Littorina obtusata Smooth periwinkle 2022 0 1
## 680 BASHAR Testudinalia testudinalis Limpet 2022 0 2
## 681 BASHAR Littorina littorea Common periwinkle 2023 2 26
## 682 BASHAR Littorina littorea Common periwinkle 2024 0 5
Return first 5 rows and a subset of columns of the data frame
## SiteCode ScientificName CommonName Year Damage No.Damage
## 1 BASHAR Littorina littorea Common periwinkle 2013 0 2
## 2 BASHAR Littorina littorea Common periwinkle 2013 0 3
## 3 BASHAR Littorina obtusata Smooth periwinkle 2013 1 2
## 4 BASHAR Littorina obtusata Smooth periwinkle 2013 0 6
## 5 BASHAR Nucella lapillus Dogwhelk 2013 0 1
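A minimal sketch of how output like this can be produced with bracket indexing. The full motinv data set is not reproduced here, so the block builds a small stand-in called motinv_toy (a hypothetical name) from six rows printed in this section. With the real data you would index motinv directly.

```r
# motinv_toy is a hypothetical stand-in for motinv, built from six rows shown in this section
motinv_toy <- data.frame(
  SiteCode = "BASHAR",
  ScientificName = c("Littorina littorea", "Littorina littorea", "Littorina obtusata",
                     "Littorina obtusata", "Nucella lapillus", "Littorina littorea"),
  CommonName = c("Common periwinkle", "Common periwinkle", "Smooth periwinkle",
                 "Smooth periwinkle", "Dogwhelk", "Common periwinkle"),
  Year = c(2013, 2013, 2013, 2013, 2013, 2014),
  QAQC = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
  Damage = c(0, 0, 1, 0, 0, 0),
  No.Damage = c(2, 3, 2, 6, 1, 2)
)

# Rows go before the comma, columns go after it
keep_cols <- c("SiteCode", "ScientificName", "CommonName", "Year", "Damage", "No.Damage")
first5 <- motinv_toy[1:5, keep_cols]
first5
```

Leaving the row position empty, as in motinv_toy[, keep_cols], would return every row for the same columns.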
Return all rows and first 4 columns of the data frame
motinv_sub <- motinv[, 1:4] # works, but risky
motinv_sub2 <- motinv[, c("Network", "UnitCode", "SiteCode", "StartDate")] # same result, but better
Coding Tip: As shown above, you can specify columns by name or by column number. However, it's almost always best to refer to columns by name. It makes your code easier to read and prevents it from breaking if columns get reordered.
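To illustrate why column numbers are risky, here is a small sketch that reorders the columns of a stand-in data frame and then selects by position and by name. The names motinv_toy and motinv_reordered are hypothetical; only the column names come from motinv.

```r
# Hypothetical two-row stand-in using the first four motinv column names
motinv_toy <- data.frame(
  Network = c("NETN", "NETN"),
  UnitCode = c("ACAD", "ACAD"),
  SiteCode = c("BASHAR", "BASHAR"),
  StartDate = c("6/24/2013", "6/21/2013")
)

# Simulate a later version of the file where the columns arrive in a different order
motinv_reordered <- motinv_toy[, c("StartDate", "SiteCode", "Network", "UnitCode")]

by_number <- motinv_reordered[, 1:2]                       # now StartDate and SiteCode
by_name   <- motinv_reordered[, c("Network", "UnitCode")]  # still Network and UnitCode

names(by_number)
names(by_name)
```

Selecting by number runs without complaint but quietly returns different columns, which is the kind of bug that is hard to spot later.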
CHALLENGE: How would you look at the first 4 even rows (2, 4, 6, 8) and the first 2 columns of the motinv data frame?
## Network UnitCode
## 2 NETN ACAD
## 4 NETN ACAD
## 6 NETN ACAD
## 8 NETN ACAD
## [1] "Network" "UnitCode" "SiteCode" "StartDate"
## [5] "Year" "QAQC" "PlotName" "CommunityType"
## [9] "ScientificName" "CommonName" "SpeciesCode" "Damage"
## [13] "No.Damage" "Subsampled"
## Network UnitCode
## 2 NETN ACAD
## 4 NETN ACAD
## 6 NETN ACAD
## 8 NETN ACAD
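One possible answer to the challenge, sketched on a stand-in data frame because motinv itself is not reproduced here. The name motinv_toy and its 8 identical rows are assumptions for illustration. With the real data, replace motinv_toy with motinv.

```r
# Hypothetical 8-row stand-in; only the column names come from motinv
motinv_toy <- data.frame(
  Network = rep("NETN", 8),
  UnitCode = rep("ACAD", 8),
  SiteCode = rep("BASHAR", 8)
)

even_rows <- seq(2, 8, by = 2)   # same values as c(2, 4, 6, 8)

by_number <- motinv_toy[even_rows, 1:2]                     # columns by position
names(motinv_toy)                                           # look up the column names
by_name <- motinv_toy[even_rows, c("Network", "UnitCode")]  # columns by name
by_name
```

Both approaches return the same four rows. The version that uses column names is the safer habit.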
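= vs == vs %in%
R code uses both a single = and a double ==, and they do different things.
A single = assigns a value, most often to an argument inside a function call.
A double == tests whether two values are equal and returns TRUE or FALSE. != is interpreted as not equal to for similar use.
To match against more than one value, use %in%. This operator works just like ==, but for multiple conditions. The == operator is not designed to take more than 1 condition, even though it won't give you an error. Instead, it recycles the values and compares them position by position, so it can silently miss matches.
As you get more comfortable with R, this will become natural. If you forget, R will often error and may even give you a hint when you used = instead of ==.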
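A small sketch of the three operators side by side, using species names from this data set. The vectors species and targets are made up for illustration. The point to notice is that == with two values on the right compares position by position, while %in% checks each element against the whole set.

```r
species <- c("Littorina littorea", "Littorina obtusata",
             "Nucella lapillus", "Littorina littorea")
targets <- c("Littorina obtusata", "Littorina littorea")

# A single = inside a function call assigns a value to an argument
sorted <- sort(species, decreasing = TRUE)

# == with one value tests every element against that value
is_dogwhelk <- species == "Nucella lapillus"

# == with two values: targets is recycled and compared position by position
eq_match <- species == targets
# %in% asks whether each element appears anywhere in targets
in_match <- species %in% targets

eq_match
in_match
species[species != "Nucella lapillus"]   # != keeps everything that is not a match
```

Here eq_match is TRUE only for the last element, while in_match is TRUE for every periwinkle record, which is what a filter on several species needs.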
Pattern match (filter) to return a data frame of surveys that were not QAQC visits (QAQC visits are the rows where QAQC is TRUE), and return all columns.
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 0 2 No
## 2 Littorina littorea Common periwinkle LITLIT 0 3 No
## 3 Littorina obtusata Smooth periwinkle LITOBT 1 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 0 6 No
## 5 Nucella lapillus Dogwhelk NUCLAP 0 1 No
## 6 Littorina littorea Common periwinkle LITLIT 0 2 No
##
## FALSE TRUE
## 640 42
##
## FALSE
## 640
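The output above is consistent with a workflow of previewing the data, tabulating QAQC, filtering, and tabulating again. A minimal sketch of that workflow follows, assuming a stand-in data frame motinv_toy (hypothetical name) built from the six rows printed by head() above. The object name motinv_nq is also an assumption.

```r
# Hypothetical stand-in built from the six rows printed by head() above
motinv_toy <- data.frame(
  StartDate = c("6/24/2013", "6/21/2013", "6/24/2013",
                "6/21/2013", "6/24/2013", "6/21/2014"),
  QAQC = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
  ScientificName = c("Littorina littorea", "Littorina littorea", "Littorina obtusata",
                     "Littorina obtusata", "Nucella lapillus", "Littorina littorea"),
  No.Damage = c(2, 3, 2, 6, 1, 2)
)

table(motinv_toy$QAQC)                               # counts before filtering
motinv_nq <- motinv_toy[motinv_toy$QAQC == FALSE, ]  # keep non-QAQC rows, all columns
table(motinv_nq$QAQC)                                # only FALSE should remain
```

Leaving the column position empty after the comma is what keeps all columns. Running table() before and after is a quick check that the filter did what you expected.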
Filter data to only return the ScientificName column of rows where CommunityType is "Barnacle".
## [1] "Littorina littorea" "Littorina saxatilis"
## [3] "Nucella lapillus" "Littorina littorea"
## [5] "Littorina obtusata" "Nucella lapillus"
## [7] "Littorina littorea" "Littorina obtusata"
## [9] "Nucella lapillus" "Littorina littorea"
## [11] "Littorina obtusata" "Littorina littorea"
## [13] "Littorina obtusata" "Littorina saxatilis"
## [15] "Littorina littorea" "Littorina obtusata"
## [17] "Littorina saxatilis" "Carcinus maenas"
## [19] "Littorina littorea" "Littorina obtusata"
## [21] "Littorina saxatilis" "Nucella lapillus"
## [23] "Littorina littorea" "Littorina obtusata"
## [25] "Nucella lapillus" "Littorina littorea"
## [27] "Littorina obtusata" "Littorina saxatilis"
## [29] "Littorina littorea" "Littorina obtusata"
## [31] "Littorina littorea" "Littorina obtusata"
## [33] "Littorina littorea" "Littorina obtusata"
## [35] "Littorina littorea" "Littorina obtusata"
## [37] "Littorina littorea" "Littorina obtusata"
## [39] "Nucella lapillus" "Littorina littorea"
## [41] "Littorina obtusata" "Nucella lapillus"
## [43] "Littorina littorea" "Littorina obtusata"
## [45] "Littorina littorea" "Littorina obtusata"
## [47] "Nucella lapillus" "Carcinus maenas"
## [49] "Littorina littorea" "Littorina obtusata"
## [51] "Nucella lapillus" "Littorina littorea"
## [53] "Littorina obtusata" "Nucella lapillus"
## [55] "Carcinus maenas" "Littorina littorea"
## [57] "Littorina obtusata" "Nucella lapillus"
## [59] "Carcinus maenas" "Littorina littorea"
## [61] "Littorina obtusata" "Littorina saxatilis"
## [63] "Nucella lapillus" "Littorina obtusata"
## [65] "Littorina littorea" "Littorina obtusata"
## [67] "Littorina littorea" "Littorina obtusata"
## [69] "Nucella lapillus" "Littorina littorea"
## [71] "Littorina obtusata" "Littorina littorea"
## [73] "Littorina obtusata" "Littorina saxatilis"
## [75] "Littorina littorea" "Littorina obtusata"
## [77] "Nucella lapillus" "Carcinus maenas"
## [79] "Littorina littorea" "Littorina obtusata"
## [81] "Littorina littorea" "Littorina obtusata"
## [83] "Nucella lapillus" "Littorina littorea"
## [85] "Littorina obtusata" "Nucella lapillus"
## [87] "Carcinus maenas" "Littorina littorea"
## [89] "Littorina obtusata" "Nucella lapillus"
## [91] "Littorina obtusata" "Littorina saxatilis"
## [93] "Littorina littorea" "Littorina obtusata"
## [95] "Littorina littorea" "Littorina obtusata"
## [97] "Nucella lapillus" "Littorina littorea"
## [99] "Littorina obtusata" "Nucella lapillus"
## [101] "Testudinalia testudinalis" "Littorina littorea"
## [103] "Littorina obtusata" "Littorina saxatilis"
## [105] "Littorina littorea" "Littorina obtusata"
## [107] "Littorina saxatilis" "Littorina littorea"
## [109] "Littorina obtusata" "Littorina saxatilis"
## [111] "Nucella lapillus" "Littorina littorea"
## [113] "Littorina obtusata" "Littorina saxatilis"
## [115] "Nucella lapillus" "Littorina littorea"
## [117] "Littorina obtusata" "Littorina saxatilis"
## [119] "Nucella lapillus" "Carcinus maenas"
## [121] "Littorina littorea" "Littorina obtusata"
## [123] "Nucella lapillus" "Littorina obtusata"
## [125] "Littorina littorea" "Littorina obtusata"
## [127] "Littorina littorea" "Littorina littorea"
## [129] "Littorina obtusata" "Littorina littorea"
## [131] "Littorina obtusata" "Nucella lapillus"
## [133] "Carcinus maenas" "Littorina littorea"
## [135] "Littorina obtusata" "Littorina saxatilis"
## [137] "Nucella lapillus" "Testudinalia testudinalis"
## [139] "Littorina littorea" "Littorina obtusata"
## [141] "Littorina saxatilis" "Carcinus maenas"
## [143] "Littorina littorea" "Littorina obtusata"
## [145] "Nucella lapillus" "Littorina littorea"
## [147] "Littorina obtusata" "Nucella lapillus"
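Two equivalent ways to get a vector like the one printed here, sketched on a stand-in data frame. In motinv_toy (a hypothetical name) the species names are taken from the output, but the assignment of rows to community types is made up for illustration.

```r
# Hypothetical stand-in: species names come from the output, plot assignments are made up
motinv_toy <- data.frame(
  CommunityType = c("Barnacle", "Ascophyllum", "Barnacle", "Ascophyllum", "Barnacle"),
  ScientificName = c("Littorina littorea", "Littorina obtusata", "Littorina saxatilis",
                     "Nucella lapillus", "Nucella lapillus")
)

is_barnacle <- motinv_toy$CommunityType == "Barnacle"

# Option 1: filter rows and pick one column inside the brackets
sp1 <- motinv_toy[is_barnacle, "ScientificName"]
# Option 2: pull the column with $ first, then filter it
sp2 <- motinv_toy$ScientificName[is_barnacle]

sp1
identical(sp1, sp2)
```

Both options return a character vector rather than a data frame, because selecting a single column with base R brackets drops the data frame structure by default.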
Filter data to return all records where a Littorina species was detected.
lit_spp <- c("Littorina littorea", "Littorina obtusata", "Littorina saxatilis")
motinv_lit <- motinv[motinv$ScientificName %in% lit_spp,
c("SiteCode", "PlotName", "ScientificName", "Year")]
motinv_lit
## SiteCode PlotName ScientificName Year
## 1 BASHAR A1 Littorina littorea 2013
## 2 BASHAR A1 Littorina littorea 2013
## 3 BASHAR A1 Littorina obtusata 2013
## 4 BASHAR A1 Littorina obtusata 2013
## 6 BASHAR A1 Littorina littorea 2014
## 7 BASHAR A1 Littorina obtusata 2014
## 8 BASHAR A1 Littorina littorea 2016
## 9 BASHAR A1 Littorina obtusata 2016
## 10 BASHAR A1 Littorina littorea 2017
## 11 BASHAR A1 Littorina obtusata 2017
## 12 BASHAR A1 Littorina littorea 2018
## 13 BASHAR A1 Littorina obtusata 2018
## 14 BASHAR A1 Littorina littorea 2019
## 15 BASHAR A1 Littorina obtusata 2019
## 17 BASHAR A1 Littorina littorea 2021
## 18 BASHAR A1 Littorina obtusata 2021
## 20 BASHAR A1 Littorina littorea 2022
## 21 BASHAR A1 Littorina obtusata 2022
## 23 BASHAR A1 Littorina littorea 2023
## 24 BASHAR A1 Littorina obtusata 2023
## 26 BASHAR A1 Littorina littorea 2024
## 27 BASHAR A1 Littorina saxatilis 2024
## 28 BASHAR A2 Littorina littorea 2013
## 29 BASHAR A2 Littorina littorea 2013
## 30 BASHAR A2 Littorina obtusata 2013
## 31 BASHAR A2 Littorina obtusata 2013
## 34 BASHAR A2 Littorina littorea 2014
## 35 BASHAR A2 Littorina obtusata 2014
## 36 BASHAR A2 Littorina saxatilis 2014
## 37 BASHAR A2 Littorina littorea 2015
## 38 BASHAR A2 Littorina obtusata 2015
## 40 BASHAR A2 Littorina littorea 2016
## 41 BASHAR A2 Littorina obtusata 2016
## 43 BASHAR A2 Littorina littorea 2017
## 44 BASHAR A2 Littorina obtusata 2017
## 45 BASHAR A2 Littorina littorea 2018
## 46 BASHAR A2 Littorina obtusata 2018
## 47 BASHAR A2 Littorina littorea 2019
## 48 BASHAR A2 Littorina obtusata 2019
## 50 BASHAR A2 Littorina littorea 2021
## 51 BASHAR A2 Littorina obtusata 2021
## 53 BASHAR A2 Littorina littorea 2022
## 54 BASHAR A2 Littorina obtusata 2022
## 57 BASHAR A2 Littorina littorea 2023
## 58 BASHAR A2 Littorina obtusata 2023
## 60 BASHAR A2 Littorina littorea 2024
## 61 BASHAR A2 Littorina obtusata 2024
## 62 BASHAR A3 Littorina littorea 2013
## 63 BASHAR A3 Littorina littorea 2013
## 64 BASHAR A3 Littorina obtusata 2013
## 65 BASHAR A3 Littorina obtusata 2013
## 68 BASHAR A3 Littorina littorea 2014
## 69 BASHAR A3 Littorina obtusata 2014
## 70 BASHAR A3 Littorina littorea 2015
## 71 BASHAR A3 Littorina obtusata 2015
## 72 BASHAR A3 Littorina littorea 2016
## 73 BASHAR A3 Littorina obtusata 2016
## 75 BASHAR A3 Littorina littorea 2017
## 76 BASHAR A3 Littorina obtusata 2017
## 77 BASHAR A3 Littorina littorea 2018
## 78 BASHAR A3 Littorina obtusata 2018
## 80 BASHAR A3 Littorina littorea 2019
## 81 BASHAR A3 Littorina obtusata 2019
## 83 BASHAR A3 Littorina littorea 2021
## 84 BASHAR A3 Littorina obtusata 2021
## 86 BASHAR A3 Littorina littorea 2022
## 87 BASHAR A3 Littorina obtusata 2022
## 89 BASHAR A3 Littorina littorea 2023
## 90 BASHAR A3 Littorina obtusata 2023
## 91 BASHAR A3 Littorina littorea 2024
## 92 BASHAR A3 Littorina obtusata 2024
## 93 BASHAR A4 Littorina littorea 2013
## 94 BASHAR A4 Littorina littorea 2013
## 95 BASHAR A4 Littorina obtusata 2013
## 96 BASHAR A4 Littorina obtusata 2013
## 98 BASHAR A4 Littorina littorea 2014
## 99 BASHAR A4 Littorina obtusata 2014
## 101 BASHAR A4 Littorina littorea 2015
## 102 BASHAR A4 Littorina obtusata 2015
## 103 BASHAR A4 Littorina littorea 2016
## 104 BASHAR A4 Littorina obtusata 2016
## 106 BASHAR A4 Littorina littorea 2017
## 107 BASHAR A4 Littorina obtusata 2017
## 108 BASHAR A4 Littorina littorea 2018
## 109 BASHAR A4 Littorina obtusata 2018
## 112 BASHAR A4 Littorina littorea 2019
## 113 BASHAR A4 Littorina obtusata 2019
## 115 BASHAR A4 Littorina littorea 2021
## 116 BASHAR A4 Littorina obtusata 2021
## 118 BASHAR A4 Littorina littorea 2022
## 119 BASHAR A4 Littorina obtusata 2022
## 122 BASHAR A4 Littorina littorea 2023
## 123 BASHAR A4 Littorina obtusata 2023
## 125 BASHAR A4 Littorina littorea 2024
## 126 BASHAR A4 Littorina obtusata 2024
## 127 BASHAR A5 Littorina littorea 2013
## 128 BASHAR A5 Littorina littorea 2013
## 129 BASHAR A5 Littorina obtusata 2013
## 130 BASHAR A5 Littorina obtusata 2013
## 131 BASHAR A5 Littorina littorea 2014
## 132 BASHAR A5 Littorina obtusata 2014
## 133 BASHAR A5 Littorina littorea 2015
## 134 BASHAR A5 Littorina obtusata 2015
## 135 BASHAR A5 Littorina littorea 2016
## 136 BASHAR A5 Littorina obtusata 2016
## 137 BASHAR A5 Littorina littorea 2017
## 138 BASHAR A5 Littorina obtusata 2017
## 140 BASHAR A5 Littorina littorea 2018
## 141 BASHAR A5 Littorina obtusata 2018
## 142 BASHAR A5 Littorina littorea 2019
## 143 BASHAR A5 Littorina obtusata 2019
## 145 BASHAR A5 Littorina littorea 2021
## 146 BASHAR A5 Littorina obtusata 2021
## 147 BASHAR A5 Littorina littorea 2022
## 148 BASHAR A5 Littorina obtusata 2022
## 151 BASHAR A5 Littorina littorea 2023
## 152 BASHAR A5 Littorina obtusata 2023
## 153 BASHAR A5 Littorina littorea 2024
## 154 BASHAR A5 Littorina obtusata 2024
## 155 BASHAR B1 Littorina littorea 2013
## 156 BASHAR B1 Littorina saxatilis 2013
## 158 BASHAR B1 Littorina littorea 2015
## 159 BASHAR B1 Littorina obtusata 2015
## 161 BASHAR B1 Littorina littorea 2016
## 162 BASHAR B1 Littorina obtusata 2016
## 164 BASHAR B1 Littorina littorea 2017
## 165 BASHAR B1 Littorina obtusata 2017
## 166 BASHAR B1 Littorina littorea 2018
## 167 BASHAR B1 Littorina obtusata 2018
## 168 BASHAR B1 Littorina saxatilis 2018
## 169 BASHAR B1 Littorina littorea 2019
## 170 BASHAR B1 Littorina obtusata 2019
## 171 BASHAR B1 Littorina saxatilis 2019
## 173 BASHAR B1 Littorina littorea 2021
## 174 BASHAR B1 Littorina obtusata 2021
## 175 BASHAR B1 Littorina saxatilis 2021
## 177 BASHAR B1 Littorina littorea 2022
## 178 BASHAR B1 Littorina obtusata 2022
## 180 BASHAR B1 Littorina littorea 2023
## 181 BASHAR B1 Littorina obtusata 2023
## 182 BASHAR B1 Littorina saxatilis 2023
## 183 BASHAR B1 Littorina littorea 2024
## 184 BASHAR B1 Littorina obtusata 2024
## 185 BASHAR B2 Littorina littorea 2013
## 186 BASHAR B2 Littorina obtusata 2013
## 187 BASHAR B2 Littorina littorea 2014
## 188 BASHAR B2 Littorina obtusata 2014
## 189 BASHAR B2 Littorina littorea 2015
## 190 BASHAR B2 Littorina obtusata 2015
## 191 BASHAR B2 Littorina littorea 2016
## 192 BASHAR B2 Littorina obtusata 2016
## 194 BASHAR B2 Littorina littorea 2017
## 195 BASHAR B2 Littorina obtusata 2017
## 197 BASHAR B2 Littorina littorea 2018
## 198 BASHAR B2 Littorina obtusata 2018
## 199 BASHAR B2 Littorina littorea 2019
## 200 BASHAR B2 Littorina obtusata 2019
## 203 BASHAR B2 Littorina littorea 2021
## 204 BASHAR B2 Littorina obtusata 2021
## 206 BASHAR B2 Littorina littorea 2022
## 207 BASHAR B2 Littorina obtusata 2022
## 210 BASHAR B2 Littorina littorea 2023
## 211 BASHAR B2 Littorina obtusata 2023
## 214 BASHAR B2 Littorina littorea 2024
## 215 BASHAR B2 Littorina obtusata 2024
## 216 BASHAR B2 Littorina saxatilis 2024
## 218 BASHAR B3 Littorina obtusata 2014
## 219 BASHAR B3 Littorina littorea 2015
## 220 BASHAR B3 Littorina obtusata 2015
## 221 BASHAR B3 Littorina littorea 2016
## 222 BASHAR B3 Littorina obtusata 2016
## 224 BASHAR B3 Littorina littorea 2017
## 225 BASHAR B3 Littorina obtusata 2017
## 226 BASHAR B3 Littorina littorea 2018
## 227 BASHAR B3 Littorina obtusata 2018
## 228 BASHAR B3 Littorina saxatilis 2018
## 229 BASHAR B3 Littorina littorea 2019
## 230 BASHAR B3 Littorina obtusata 2019
## 233 BASHAR B3 Littorina littorea 2021
## 234 BASHAR B3 Littorina obtusata 2021
## 235 BASHAR B3 Littorina littorea 2022
## 236 BASHAR B3 Littorina obtusata 2022
## 238 BASHAR B3 Littorina littorea 2023
## 239 BASHAR B3 Littorina obtusata 2023
## 242 BASHAR B3 Littorina littorea 2024
## 243 BASHAR B3 Littorina obtusata 2024
## 245 BASHAR B4 Littorina obtusata 2013
## 246 BASHAR B4 Littorina saxatilis 2013
## 247 BASHAR B4 Littorina littorea 2014
## 248 BASHAR B4 Littorina obtusata 2014
## 249 BASHAR B4 Littorina littorea 2015
## 250 BASHAR B4 Littorina obtusata 2015
## 252 BASHAR B4 Littorina littorea 2016
## 253 BASHAR B4 Littorina obtusata 2016
## 256 BASHAR B4 Littorina littorea 2017
## 257 BASHAR B4 Littorina obtusata 2017
## 258 BASHAR B4 Littorina saxatilis 2017
## 259 BASHAR B4 Littorina littorea 2018
## 260 BASHAR B4 Littorina obtusata 2018
## 261 BASHAR B4 Littorina saxatilis 2018
## 262 BASHAR B4 Littorina littorea 2019
## 263 BASHAR B4 Littorina obtusata 2019
## 264 BASHAR B4 Littorina saxatilis 2019
## 266 BASHAR B4 Littorina littorea 2021
## 267 BASHAR B4 Littorina obtusata 2021
## 268 BASHAR B4 Littorina saxatilis 2021
## 270 BASHAR B4 Littorina littorea 2022
## 271 BASHAR B4 Littorina obtusata 2022
## 272 BASHAR B4 Littorina saxatilis 2022
## 275 BASHAR B4 Littorina littorea 2024
## 276 BASHAR B4 Littorina obtusata 2024
## 278 BASHAR B5 Littorina obtusata 2015
## 279 BASHAR B5 Littorina littorea 2016
## 280 BASHAR B5 Littorina obtusata 2016
## 281 BASHAR B5 Littorina littorea 2017
## 282 BASHAR B5 Littorina littorea 2018
## 283 BASHAR B5 Littorina obtusata 2018
## 284 BASHAR B5 Littorina littorea 2019
## 285 BASHAR B5 Littorina obtusata 2019
## 288 BASHAR B5 Littorina littorea 2021
## 289 BASHAR B5 Littorina obtusata 2021
## 290 BASHAR B5 Littorina saxatilis 2021
## 293 BASHAR B5 Littorina littorea 2022
## 294 BASHAR B5 Littorina obtusata 2022
## 295 BASHAR B5 Littorina saxatilis 2022
## 297 BASHAR B5 Littorina littorea 2023
## 298 BASHAR B5 Littorina obtusata 2023
## 300 BASHAR B5 Littorina littorea 2024
## 301 BASHAR B5 Littorina obtusata 2024
## 303 BASHAR F1 Littorina littorea 2013
## 304 BASHAR F1 Littorina littorea 2013
## 305 BASHAR F1 Littorina obtusata 2013
## 306 BASHAR F1 Littorina obtusata 2013
## 310 BASHAR F1 Littorina littorea 2014
## 311 BASHAR F1 Littorina obtusata 2014
## 313 BASHAR F1 Littorina littorea 2015
## 314 BASHAR F1 Littorina obtusata 2015
## 317 BASHAR F1 Littorina littorea 2016
## 318 BASHAR F1 Littorina obtusata 2016
## 320 BASHAR F1 Littorina littorea 2017
## 321 BASHAR F1 Littorina obtusata 2017
## 324 BASHAR F1 Littorina littorea 2018
## 325 BASHAR F1 Littorina obtusata 2018
## 329 BASHAR F1 Littorina littorea 2019
## 330 BASHAR F1 Littorina obtusata 2019
## 333 BASHAR F1 Littorina littorea 2021
## 334 BASHAR F1 Littorina obtusata 2021
## 338 BASHAR F1 Littorina littorea 2022
## 339 BASHAR F1 Littorina obtusata 2022
## 343 BASHAR F1 Littorina littorea 2023
## 344 BASHAR F1 Littorina obtusata 2023
## 348 BASHAR F1 Littorina littorea 2024
## 349 BASHAR F1 Littorina obtusata 2024
## 351 BASHAR F2 Littorina littorea 2013
## 352 BASHAR F2 Littorina obtusata 2013
## 353 BASHAR F2 Littorina obtusata 2013
## 357 BASHAR F2 Littorina littorea 2014
## 358 BASHAR F2 Littorina obtusata 2014
## 361 BASHAR F2 Littorina littorea 2015
## 362 BASHAR F2 Littorina obtusata 2015
## 364 BASHAR F2 Littorina littorea 2016
## 365 BASHAR F2 Littorina obtusata 2016
## 366 BASHAR F2 Littorina saxatilis 2016
## 369 BASHAR F2 Littorina littorea 2017
## 370 BASHAR F2 Littorina obtusata 2017
## 372 BASHAR F2 Littorina littorea 2018
## 373 BASHAR F2 Littorina obtusata 2018
## 376 BASHAR F2 Littorina littorea 2019
## 377 BASHAR F2 Littorina obtusata 2019
## 380 BASHAR F2 Littorina littorea 2021
## 381 BASHAR F2 Littorina obtusata 2021
## 384 BASHAR F2 Littorina littorea 2022
## 385 BASHAR F2 Littorina obtusata 2022
## 386 BASHAR F2 Littorina saxatilis 2022
## 390 BASHAR F2 Littorina littorea 2023
## 391 BASHAR F2 Littorina obtusata 2023
## 393 BASHAR F2 Littorina littorea 2024
## 394 BASHAR F2 Littorina obtusata 2024
## 396 BASHAR F3 Littorina littorea 2013
## 397 BASHAR F3 Littorina littorea 2013
## 398 BASHAR F3 Littorina obtusata 2013
## 399 BASHAR F3 Littorina obtusata 2013
## 404 BASHAR F3 Littorina littorea 2014
## 405 BASHAR F3 Littorina obtusata 2014
## 408 BASHAR F3 Littorina littorea 2015
## 409 BASHAR F3 Littorina obtusata 2015
## 411 BASHAR F3 Littorina littorea 2016
## 412 BASHAR F3 Littorina obtusata 2016
## 414 BASHAR F3 Littorina littorea 2017
## 415 BASHAR F3 Littorina obtusata 2017
## 418 BASHAR F3 Littorina littorea 2018
## 419 BASHAR F3 Littorina obtusata 2018
## 422 BASHAR F3 Littorina littorea 2019
## 423 BASHAR F3 Littorina obtusata 2019
## 424 BASHAR F3 Littorina littorea 2021
## 425 BASHAR F3 Littorina obtusata 2021
## 428 BASHAR F3 Littorina littorea 2022
## 429 BASHAR F3 Littorina obtusata 2022
## 433 BASHAR F3 Littorina littorea 2023
## 434 BASHAR F3 Littorina obtusata 2023
## 437 BASHAR F3 Littorina littorea 2024
## 438 BASHAR F3 Littorina obtusata 2024
## 440 BASHAR F4 Littorina littorea 2013
## 441 BASHAR F4 Littorina littorea 2013
## 442 BASHAR F4 Littorina obtusata 2013
## 443 BASHAR F4 Littorina obtusata 2013
## 447 BASHAR F4 Littorina littorea 2014
## 448 BASHAR F4 Littorina obtusata 2014
## 450 BASHAR F4 Littorina littorea 2015
## 451 BASHAR F4 Littorina obtusata 2015
## 452 BASHAR F4 Littorina littorea 2016
## 453 BASHAR F4 Littorina obtusata 2016
## 456 BASHAR F4 Littorina littorea 2017
## 457 BASHAR F4 Littorina obtusata 2017
## 459 BASHAR F4 Littorina littorea 2018
## 460 BASHAR F4 Littorina obtusata 2018
## 463 BASHAR F4 Littorina littorea 2019
## 464 BASHAR F4 Littorina obtusata 2019
## 467 BASHAR F4 Littorina littorea 2021
## 468 BASHAR F4 Littorina obtusata 2021
## 471 BASHAR F4 Littorina littorea 2022
## 472 BASHAR F4 Littorina obtusata 2022
## 475 BASHAR F4 Littorina littorea 2023
## 476 BASHAR F4 Littorina obtusata 2023
## 479 BASHAR F4 Littorina littorea 2024
## 480 BASHAR F4 Littorina obtusata 2024
## 481 BASHAR F5 Littorina littorea 2013
## 482 BASHAR F5 Littorina littorea 2013
## 483 BASHAR F5 Littorina obtusata 2013
## 484 BASHAR F5 Littorina obtusata 2013
## 489 BASHAR F5 Littorina littorea 2014
## 490 BASHAR F5 Littorina obtusata 2014
## 492 BASHAR F5 Littorina littorea 2015
## 493 BASHAR F5 Littorina obtusata 2015
## 496 BASHAR F5 Littorina littorea 2016
## 497 BASHAR F5 Littorina obtusata 2016
## 500 BASHAR F5 Littorina littorea 2017
## 501 BASHAR F5 Littorina obtusata 2017
## 504 BASHAR F5 Littorina littorea 2018
## 505 BASHAR F5 Littorina obtusata 2018
## 508 BASHAR F5 Littorina littorea 2019
## 509 BASHAR F5 Littorina obtusata 2019
## 512 BASHAR F5 Littorina littorea 2021
## 513 BASHAR F5 Littorina obtusata 2021
## 516 BASHAR F5 Littorina littorea 2022
## 517 BASHAR F5 Littorina obtusata 2022
## 519 BASHAR F5 Littorina littorea 2023
## 520 BASHAR F5 Littorina obtusata 2023
## 524 BASHAR F5 Littorina littorea 2024
## 525 BASHAR F5 Littorina obtusata 2024
## 527 BASHAR R1 Littorina littorea 2013
## 528 BASHAR R1 Littorina littorea 2013
## 529 BASHAR R1 Littorina obtusata 2013
## 530 BASHAR R1 Littorina obtusata 2013
## 533 BASHAR R1 Littorina littorea 2014
## 534 BASHAR R1 Littorina obtusata 2014
## 537 BASHAR R1 Littorina littorea 2015
## 538 BASHAR R1 Littorina obtusata 2015
## 540 BASHAR R1 Littorina littorea 2016
## 543 BASHAR R1 Littorina littorea 2017
## 544 BASHAR R1 Littorina obtusata 2017
## 545 BASHAR R1 Littorina littorea 2018
## 546 BASHAR R1 Littorina obtusata 2018
## 548 BASHAR R1 Littorina littorea 2019
## 549 BASHAR R1 Littorina obtusata 2019
## 552 BASHAR R1 Littorina littorea 2021
## 553 BASHAR R1 Littorina obtusata 2021
## 555 BASHAR R1 Littorina littorea 2022
## 556 BASHAR R1 Littorina obtusata 2022
## 558 BASHAR R1 Littorina littorea 2023
## 560 BASHAR R1 Littorina littorea 2024
## 563 BASHAR R2 Littorina obtusata 2013
## 564 BASHAR R2 Littorina littorea 2014
## 565 BASHAR R2 Littorina obtusata 2014
## 567 BASHAR R2 Littorina littorea 2015
## 568 BASHAR R2 Littorina obtusata 2015
## 571 BASHAR R2 Littorina littorea 2016
## 572 BASHAR R2 Littorina obtusata 2016
## 574 BASHAR R2 Littorina littorea 2017
## 575 BASHAR R2 Littorina obtusata 2017
## 577 BASHAR R2 Littorina littorea 2018
## 578 BASHAR R2 Littorina obtusata 2018
## 581 BASHAR R2 Littorina littorea 2019
## 582 BASHAR R2 Littorina obtusata 2019
## 586 BASHAR R2 Littorina littorea 2021
## 587 BASHAR R2 Littorina obtusata 2021
## 589 BASHAR R2 Littorina littorea 2022
## 590 BASHAR R2 Littorina obtusata 2022
## 593 BASHAR R2 Littorina littorea 2023
## 596 BASHAR R2 Littorina littorea 2024
## 597 BASHAR R2 Littorina obtusata 2024
## 600 BASHAR R3 Littorina littorea 2013
## 602 BASHAR R3 Littorina littorea 2014
## 604 BASHAR R3 Littorina littorea 2015
## 605 BASHAR R3 Littorina obtusata 2015
## 608 BASHAR R3 Littorina littorea 2016
## 610 BASHAR R3 Littorina littorea 2017
## 611 BASHAR R3 Littorina obtusata 2017
## 612 BASHAR R3 Littorina saxatilis 2017
## 614 BASHAR R3 Littorina littorea 2018
## 616 BASHAR R3 Littorina littorea 2019
## 617 BASHAR R3 Littorina obtusata 2019
## 620 BASHAR R3 Littorina littorea 2021
## 621 BASHAR R3 Littorina littorea 2022
## 622 BASHAR R3 Littorina obtusata 2022
## 625 BASHAR R3 Littorina littorea 2023
## 627 BASHAR R3 Littorina littorea 2024
## 628 BASHAR R4 Littorina littorea 2013
## 629 BASHAR R4 Littorina littorea 2013
## 630 BASHAR R4 Littorina littorea 2014
## 631 BASHAR R4 Littorina obtusata 2014
## 633 BASHAR R4 Littorina littorea 2015
## 634 BASHAR R4 Littorina obtusata 2015
## 636 BASHAR R4 Littorina littorea 2016
## 638 BASHAR R4 Littorina littorea 2017
## 639 BASHAR R4 Littorina obtusata 2017
## 642 BASHAR R4 Littorina littorea 2018
## 643 BASHAR R4 Littorina obtusata 2018
## 646 BASHAR R4 Littorina littorea 2019
## 647 BASHAR R4 Littorina obtusata 2019
## 650 BASHAR R4 Littorina littorea 2021
## 653 BASHAR R4 Littorina littorea 2022
## 654 BASHAR R4 Littorina obtusata 2022
## 656 BASHAR R4 Littorina littorea 2023
## 658 BASHAR R4 Littorina littorea 2024
## 661 BASHAR R5 Littorina littorea 2014
## 662 BASHAR R5 Littorina littorea 2015
## 663 BASHAR R5 Littorina littorea 2016
## 666 BASHAR R5 Littorina littorea 2017
## 669 BASHAR R5 Littorina littorea 2018
## 672 BASHAR R5 Littorina littorea 2019
## 673 BASHAR R5 Littorina obtusata 2019
## 675 BASHAR R5 Littorina littorea 2021
## 678 BASHAR R5 Littorina littorea 2022
## 679 BASHAR R5 Littorina obtusata 2022
## 681 BASHAR R5 Littorina littorea 2023
## 682 BASHAR R5 Littorina littorea 2024
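The filter above relies on %in%, which checks each value against the whole set of names. Here's a minimal sketch using a made-up vector (the names spp and keep are hypothetical, not part of the training data) that shows why == is not a safe substitute when you're matching more than one value:

```r
# Toy species vector (made up for illustration, not the training data)
spp <- c("Littorina littorea", "Nucella lapillus",
         "Littorina obtusata", "Littorina littorea")
keep <- c("Littorina littorea", "Littorina obtusata")

# %in% checks each element of spp against every value in keep
in_match <- spp %in% keep

# == compares element by element and recycles keep,
# so a match depends on where each value happens to sit
eq_match <- spp == keep

spp[in_match] # all three Littorina records
spp[eq_match] # only the first record
```

When you're matching a single value, == is fine. As soon as you have a vector of values to match, %in% is the safer choice.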
Coding Tip: There are often multiple ways to perform a task. The best code is code that 1) works, 2) is easy to follow, and 3) is unlikely to break (e.g. use column names instead of numbers). That still means there are typically multiple equally valid approaches. There are other ways to judge good code as you advance, but for now, aspire to write code that meets these three qualities.
unique(), sort(), length()
Determining the number of records that match a certain condition can
be useful too. Say we want to know how many unique plots were sampled in
the motinv data frame. We can use a combination of brackets
and other functions to summarize that, like below.
Sort a list of unique plot names alphabetically.
# Return a vector of unique plot names, sorted alphabetically
plots_unique <- sort(unique(motinv[,"PlotName"]))
plots_unique
## [1] "A1" "A2" "A3" "A4" "A5" "B1" "B2" "B3" "B4" "B5" "F1" "F2" "F3" "F4" "F5"
## [16] "R1" "R2" "R3" "R4" "R5"
Determine number of unique plots
## [1] 20
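A count like the one above can be produced by wrapping unique() in length(). This is a minimal sketch on a small made-up data frame (toy and toy_plots are hypothetical names), not necessarily the exact code used in the training:

```r
# Made-up data frame standing in for motinv (hypothetical values)
toy <- data.frame(PlotName = c("A2", "A1", "B1", "A1", "B1"),
                  Year = c(2013, 2013, 2013, 2014, 2014))

# unique() drops repeats, sort() orders them, length() counts them
toy_plots <- sort(unique(toy[, "PlotName"]))
toy_plots
length(toy_plots)
```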
CHALLENGE: How many unique species are there in the
motinv data frame?
We've already explored the motile invertebrate data a bit using
head(), str(), names(), and
View(). These are functions that you will use over and over
as you work with data in R. Below, I'm going to show how I get to know a
data set in R. First, to help you picture how these data are collected,
here's a site map of the Bass Harbor monitoring site.
Read in example rocky intertidal motile invertebrate data
Look at first few records
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 0 2 No
## 2 Littorina littorea Common periwinkle LITLIT 0 3 No
## 3 Littorina obtusata Smooth periwinkle LITOBT 1 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 0 6 No
## 5 Nucella lapillus Dogwhelk NUCLAP 0 1 No
## 6 Littorina littorea Common periwinkle LITLIT 0 2 No
Look at structure of each column
## 'data.frame': 682 obs. of 14 variables:
## $ Network : chr "NETN" "NETN" "NETN" "NETN" ...
## $ UnitCode : chr "ACAD" "ACAD" "ACAD" "ACAD" ...
## $ SiteCode : chr "BASHAR" "BASHAR" "BASHAR" "BASHAR" ...
## $ StartDate : chr "6/24/2013" "6/21/2013" "6/24/2013" "6/21/2013" ...
## $ Year : int 2013 2013 2013 2013 2013 2014 2014 2016 2016 2017 ...
## $ QAQC : logi TRUE FALSE TRUE FALSE TRUE FALSE ...
## $ PlotName : chr "A1" "A1" "A1" "A1" ...
## $ CommunityType : chr "Ascophyllum" "Ascophyllum" "Ascophyllum" "Ascophyllum" ...
## $ ScientificName: chr "Littorina littorea" "Littorina littorea" "Littorina obtusata" "Littorina obtusata" ...
## $ CommonName : chr "Common periwinkle" "Common periwinkle" "Smooth periwinkle" "Smooth periwinkle" ...
## $ SpeciesCode : chr "LITLIT" "LITLIT" "LITOBT" "LITOBT" ...
## $ Damage : chr "0" "0" "1" "0" ...
## $ No.Damage : int 2 3 2 6 1 2 1 6 9 41 ...
## $ Subsampled : chr "No" "No" "No" "No" ...
Look at summary of the columns
## Network UnitCode SiteCode StartDate
## Length:682 Length:682 Length:682 Length:682
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## Year QAQC PlotName CommunityType
## Min. :2013 Mode :logical Length:682 Length:682
## 1st Qu.:2015 FALSE:640 Class :character Class :character
## Median :2018 TRUE :42 Mode :character Mode :character
## Mean :2018
## 3rd Qu.:2022
## Max. :2024
## ScientificName CommonName SpeciesCode Damage
## Length:682 Length:682 Length:682 Length:682
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## No.Damage Subsampled
## Min. : 0.0 Length:682
## 1st Qu.: 2.0 Class :character
## Median : 8.0 Mode :character
## Mean : 30.9
## 3rd Qu.: 35.0
## Max. :1960.0
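The outputs above come from a handful of base R functions. Here's a minimal sketch of those calls on a small made-up data frame (toy is a hypothetical stand-in, so the printed values will differ from the motinv output):

```r
# Made-up data frame for illustration (not the training data)
toy <- data.frame(PlotName = c("A1", "A1", "B1", "B1"),
                  Year = c(2013L, 2014L, 2013L, 2014L),
                  Count = c(2, 6, 0, 41))

first_two <- head(toy, 2) # first n rows (default is 6)
first_two
str(toy)                  # column types and a preview of values
summary(toy)              # quartiles for numbers, length and class for text
names(toy)                # column names
```

I find str() the most useful of these for catching columns that were read in as the wrong type.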
Check for complete cases, assuming every column requires a value (i.e. no blanks).
##
## TRUE
## 682
##
## FALSE TRUE
## 12 670
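A tally like the ones above can be made with complete.cases(), which returns TRUE for every row that has no NAs. This is a minimal sketch on made-up data (toy and cc are hypothetical names), not necessarily the exact code behind the output shown:

```r
# Made-up data frame with a few missing values
toy <- data.frame(PlotName = c("A1", "A2", "B1", "B2"),
                  Count = c(2, NA, 5, 7),
                  Damage = c("0", "1", NA, "2"))

# TRUE where every column in the row has a value
cc <- complete.cases(toy)
table(cc)

# Look at the incomplete rows to figure out why they have NAs
toy[!cc, ]
```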
To keep data frames rectangular, R treats missing data (i.e. blanks)
as NA (stands for not available). A foundational philosophy of R is that
the user must tell R functions what to do if NAs are in the data.
Ideally that forces the user to investigate the NAs to determine their
reason for being there, whether there's a way to fix it, if those
records should be dropped, etc. If you try to calculate the mean of a
column that has a blank in it, and you don't tell R what to do with NAs,
the returned value will be NA. Most summary functions in R
have an argument na.rm, which is logical (TRUE/FALSE). To
drop NAs, you include na.rm = TRUE.
It's important every time you have NAs in your data to think about what they mean and how best to treat them. Sometimes, it's best to drop them. Other times, converting the blanks to 0 is the best approach. It depends entirely on your data and what you intend to do with it.
Test NA use with mean() function
## [1] NA
## [1] 4
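Here's a minimal sketch of that behavior, using a made-up vector (x is a hypothetical example, not a column from the training data):

```r
# Made-up vector with one missing value
x <- c(1, 4, NA, 7)

mean(x)               # NA, because one value is missing
mean(x, na.rm = TRUE) # 4, the mean of 1, 4, and 7

# Count the NAs before deciding how to treat them
sum(is.na(x))
```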
Look at unique values for Damage.
## [1] "0" "1" "10" "11" "12" "13" "14" "15" "17" "18" "2" "24"
## [13] "257" "26" "3" "34" "4" "5" "6" "7" "8" "9" "PM"
##
## 0 1 10 11 12 13 14 15 17 18 2 24 257 26 3 34 4 5 6 7
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 8 9 PM
## 1 1 1
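Notice that "10" sorts before "2" in the output above. That's a sign the column is stored as text. Here's a minimal sketch with a made-up character vector (dmg and not_num are hypothetical names) showing one way to find the values that are blocking a numeric conversion:

```r
# Made-up character vector with one non-numeric code
dmg <- c("2", "10", "0", "PM", "2", "1")

sort(unique(dmg)) # sorted as text, so "10" comes before "2"
table(dmg)        # number of records for each value

# Values that can't be converted become NA; suppressWarnings() hides the
# coercion warning because here we're converting on purpose to find them
not_num <- unique(dmg[is.na(suppressWarnings(as.numeric(dmg)))])
not_num
```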
There's 1 value called "PM", which stands for Permanently Missing in our data. We will convert PM to a blank, which R calls NA, and create a new column called Damage_num that stores Damage as numeric.
Convert "PM" to blank. I will first make a copy of the data frame.
motinv2 <- motinv
motinv2$Damage[motinv2$Damage == "PM"] <- NA
motinv2$Damage_num <- as.numeric(motinv2$Damage)
# check that it worked
str(motinv2) # Damage_num is numeric
## 'data.frame': 682 obs. of 15 variables:
## $ Network : chr "NETN" "NETN" "NETN" "NETN" ...
## $ UnitCode : chr "ACAD" "ACAD" "ACAD" "ACAD" ...
## $ SiteCode : chr "BASHAR" "BASHAR" "BASHAR" "BASHAR" ...
## $ StartDate : chr "6/24/2013" "6/21/2013" "6/24/2013" "6/21/2013" ...
## $ Year : int 2013 2013 2013 2013 2013 2014 2014 2016 2016 2017 ...
## $ QAQC : logi TRUE FALSE TRUE FALSE TRUE FALSE ...
## $ PlotName : chr "A1" "A1" "A1" "A1" ...
## $ CommunityType : chr "Ascophyllum" "Ascophyllum" "Ascophyllum" "Ascophyllum" ...
## $ ScientificName: chr "Littorina littorea" "Littorina littorea" "Littorina obtusata" "Littorina obtusata" ...
## $ CommonName : chr "Common periwinkle" "Common periwinkle" "Smooth periwinkle" "Smooth periwinkle" ...
## $ SpeciesCode : chr "LITLIT" "LITLIT" "LITOBT" "LITOBT" ...
## $ Damage : chr "0" "0" "1" "0" ...
## $ No.Damage : int 2 3 2 6 1 2 1 6 9 41 ...
## $ Subsampled : chr "No" "No" "No" "No" ...
## $ Damage_num : num 0 0 1 0 0 0 0 0 1 0 ...
## [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 17 18 24
## [20] 26 34 257
Using the motinv2 data frame, which fixed the Damage
column by making the Damage_num field numeric, we're now
going to drop visits that were for QAQC using a new base R function
called subset(). The subset() function allows
you to reduce the dimensions of a data frame. You can reduce rows,
columns, or both in the same function call. I will also show the bracket
approach.
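Here's a minimal sketch of both approaches on a small made-up data frame (toy, toy_sub, and toy_brk are hypothetical names, and this may not match the exact code used in the training):

```r
# Made-up data frame with a QAQC flag and a column we want to drop
toy <- data.frame(PlotName = c("A1", "A1", "B1", "B1"),
                  QAQC = c(TRUE, FALSE, FALSE, TRUE),
                  Damage = c("0", "1", "PM", "2"),
                  Count = c(2, 3, 6, 1))

# subset(): the second argument filters rows, select keeps or drops columns
toy_sub <- subset(toy, QAQC == FALSE, select = -Damage)

# Bracket equivalent: rows go before the comma, columns after
toy_brk <- toy[toy$QAQC == FALSE, c("PlotName", "QAQC", "Count")]

all.equal(toy_sub, toy_brk) # checks that the two results match
```

One difference to keep in mind: if the filter column contains NAs, subset() drops those rows, while the bracket approach returns them as rows full of NAs.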
Remove QAQC visits (QAQC == TRUE) and drop the original Damage column
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## 7 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## 8 NETN ACAD BASHAR 6/28/2016 2016 FALSE A1 Ascophyllum
## 9 NETN ACAD BASHAR 6/28/2016 2016 FALSE A1 Ascophyllum
## ScientificName CommonName SpeciesCode No.Damage Subsampled
## 2 Littorina littorea Common periwinkle LITLIT 3 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 6 No
## 6 Littorina littorea Common periwinkle LITLIT 2 No
## 7 Littorina obtusata Smooth periwinkle LITOBT 1 No
## 8 Littorina littorea Common periwinkle LITLIT 6 No
## 9 Littorina obtusata Smooth periwinkle LITOBT 9 No
## Damage_num
## 2 0
## 4 0
## 6 0
## 7 0
## 8 0
## 9 1
Convert StartDate from character to a Date.
# Create new column called Date
motinv3$Date <- as.Date(motinv3$StartDate, format = "%m/%d/%Y")
str(motinv3)
## 'data.frame': 640 obs. of 15 variables:
## $ Network : chr "NETN" "NETN" "NETN" "NETN" ...
## $ UnitCode : chr "ACAD" "ACAD" "ACAD" "ACAD" ...
## $ SiteCode : chr "BASHAR" "BASHAR" "BASHAR" "BASHAR" ...
## $ StartDate : chr "6/21/2013" "6/21/2013" "6/21/2014" "6/21/2014" ...
## $ Year : int 2013 2013 2014 2014 2016 2016 2017 2017 2018 2018 ...
## $ QAQC : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
## $ PlotName : chr "A1" "A1" "A1" "A1" ...
## $ CommunityType : chr "Ascophyllum" "Ascophyllum" "Ascophyllum" "Ascophyllum" ...
## $ ScientificName: chr "Littorina littorea" "Littorina obtusata" "Littorina littorea" "Littorina obtusata" ...
## $ CommonName : chr "Common periwinkle" "Smooth periwinkle" "Common periwinkle" "Smooth periwinkle" ...
## $ SpeciesCode : chr "LITLIT" "LITOBT" "LITLIT" "LITOBT" ...
## $ No.Damage : int 3 6 2 1 6 9 41 1 11 3 ...
## $ Subsampled : chr "No" "No" "No" "No" ...
## $ Damage_num : num 0 0 0 0 0 1 0 0 1 0 ...
## $ Date : Date, format: "2013-06-21" "2013-06-21" ...
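The format argument has to match how the dates are written in the character column. Here's a minimal sketch with two made-up date strings (start and d are hypothetical names) showing what a matching and a mismatched format return:

```r
# Made-up date strings written as month/day/4-digit year
start <- c("6/21/2013", "6/28/2016")

# %m = month, %d = day, %Y = 4-digit year
d <- as.Date(start, format = "%m/%d/%Y")
class(d)
d

# Once it's a Date, you can do date math and pull out pieces
diff(d)         # time between the two visits
format(d, "%Y") # year as text

# A format that doesn't match returns NA instead of an error
bad <- as.Date(start, format = "%d/%m/%Y")
bad
```

Because a wrong format fails quietly, it's worth checking for new NAs after any date conversion.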
Renaming columns in base R is kind of a pain, and I have to look it up every time I need to do it. I'll show you an easier way to do this tomorrow.
Rename ScientificName column
## [1] "Network" "UnitCode" "SiteCode" "StartDate"
## [5] "Year" "QAQC" "PlotName" "CommunityType"
## [9] "ScientificName" "CommonName" "SpeciesCode" "No.Damage"
## [13] "Subsampled" "Damage_num" "Date"
names(motinv3)[names(motinv3) == "ScientificName"] <- "Species"
names(motinv3) # check that it worked
## [1] "Network" "UnitCode" "SiteCode" "StartDate"
## [5] "Year" "QAQC" "PlotName" "CommunityType"
## [9] "Species" "CommonName" "SpeciesCode" "No.Damage"
## [13] "Subsampled" "Damage_num" "Date"
Site_Plot column via paste()
The paste() and paste0() functions are very
handy for creating new columns that are combinations of existing
columns. The code below will create a new column named
Site_Plot that's a combination of SiteCode and
PlotName.
Create new Site_Plot column
motinv3$Site_Plot <- paste(motinv3$SiteCode, motinv3$PlotName, sep = "-")
motinv3$Site_Plot <- paste0(motinv3$SiteCode, "-", motinv3$PlotName) # equivalent: paste0() puts no separator between elements by default
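Here's a minimal sketch comparing the separators, using two short made-up vectors (site and plot_name are hypothetical names):

```r
# Made-up vectors for illustration
site <- c("BASHAR", "BASHAR")
plot_name <- c("A1", "B2")

with_space <- paste(site, plot_name)            # default separator is a space
with_dash <- paste(site, plot_name, sep = "-")  # custom separator
with_dash0 <- paste0(site, "-", plot_name)      # paste0() adds no separator

# collapse joins all elements into a single string
one_string <- paste(plot_name, collapse = ", ")

with_space
with_dash
one_string
```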
Coding Tip: In most cases, it does not matter whether you use
single (') or double (") quotes, as long as you open and close with the same. The
cases where it matters are where you have quotes within quotes. There
you have to alternate your usage, like
print("Text in outer quote 'text printed as being within quotes' end with closing quote").
Option 1. Subset the data with brackets and use the
sort(unique()) to give an easier to read output.
# OPTION 1
gcrab <- motinv3[motinv3$Species == "Carcinus maenas",]
sort(unique(gcrab$Year)) # 2019, 2021, 2022, 2023, 2024
## [1] 2019 2021 2022 2023 2024
Option 2. Subset data then use table() to tally the
years and number of rows green crabs were found.
##
## 2019 2021 2022 2023 2024
## 3 16 6 11 11
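A tally like the one above can be made by passing the Year column of the subset to table(). This is a minimal sketch on made-up data (toy, crab, and crab_tab are hypothetical names), and may not match the exact code used in the training:

```r
# Made-up records for illustration (not the training data)
toy <- data.frame(Species = c("Carcinus maenas", "Littorina littorea",
                              "Carcinus maenas", "Carcinus maenas"),
                  Year = c(2019, 2019, 2021, 2021))

# Subset to green crab records, then tally rows by year
crab <- toy[toy$Species == "Carcinus maenas", ]
crab_tab <- table(crab$Year)
crab_tab

# The names of the table are the years with at least one record
names(crab_tab)
```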
There are multiple ways to do this. Two examples are below.
Option 1. View the data and sort by No.Damage.
Option 2. Find the max No.Damage count and subset the data frame
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 646 NETN ACAD BASHAR 6/11/2019 2019 FALSE R4 Red Algae
## Species CommonName SpeciesCode No.Damage Subsampled
## 646 Littorina littorea Common periwinkle LITLIT 1960 No
## Damage_num Date Site_Plot
## 646 11 2019-06-11 BASHAR-R4
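Here's a minimal sketch of both options on a small made-up data frame (toy and max_count are hypothetical names). For Option 1, order() is used here as a code-based stand-in for sorting in the View() window:

```r
# Made-up counts for illustration (not the training data)
toy <- data.frame(PlotName = c("R3", "R4", "R5"),
                  Count = c(35, 1960, 8))

# Option 1 in code: sort rows from largest to smallest count
toy_sorted <- toy[order(toy$Count, decreasing = TRUE), ]
toy_sorted

# Option 2: find the max, then return the row(s) that match it
max_count <- max(toy$Count, na.rm = TRUE)
toy_max <- toy[toy$Count == max_count, ]
toy_max
```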
CHALLENGE: Fix the No.Damage typo by replacing 1960 with 196.
Let's say that you looked at the datasheet, and the actual count for No.Damage was 196 instead of 1960. You can change that value in the original CSV by hand. But even better is to document that change in code. There are multiple ways to do this. Two examples are below.
But first, it's good to create a new data frame when modifying the original data frame, so you can refer back to the original if needed. I also use a really specific filter to make sure I'm not accidentally changing other data.
Replace 1960 with 196
# create copy of motinv data
motinv_fix <- motinv3
# find the problematic value, and change it to 196
motinv_fix$No.Damage[motinv_fix$Year == 2019 &
motinv_fix$PlotName == "R4" &
motinv_fix$No.Damage == 1960] <- 196
# check your work
range(motinv3$No.Damage) # 1960
## [1] 0 1960
range(motinv_fix$No.Damage) # 282
## [1] 0 282
Visualizing the data is also important to get a sense for the data
and look for potential errors and outliers. Base R has plotting
functions that allow you to create quick plots without having to know a
lot of code. I often use Base R plot functions when I'm exploring data
but not making plots I plan to use for publication. When I need to
create more complex plots, I use ggplot2, which we'll cover
on Day 2 and 3.
Histograms are a great start. The code below generates a basic
histogram plot of a specific column in the dataframe using the
hist() function.
Plot histogram of motile invertebrate No.Damage counts
Looking at the histogram, it looks like all of the counts are below
500 except for one that's way out in the 2000 range. You can also make a
scatterplot of the data. If you only specify one column, the x axis will
be the row number for each record, and the y axis will be the specified
column.
Make point plot of No.Damage counts
Again, you can see there's one value that's greater than all of the
others.
We can also plot two variables in a scatterplot.
Make scatterplot of No.Damage vs. Damage_num (Option 1)
Make scatterplot of No.Damage vs. Damage_num (Option 2- better axis labels)
Here you can see there's one value that's greater than all of the
others in both sets of counts. These would be worth looking at more
carefully to determine if they're errors in the data.
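The plots described above can be sketched with base R on a small made-up data frame. The `toy` data frame is hypothetical, and the two scatterplot options are my guess at one common way to get default versus cleaner axis labels, not necessarily the calls used in the training.

```r
# Hypothetical stand-in for motinv3 (values are made up for illustration)
toy <- data.frame(
  No.Damage  = c(2, 3, 6, 41, 9, 1960, 15, 0),
  Damage_num = c(0, 0, 1, 0, 1, 11, 2, 0)
)

# Histogram of one column
hist(toy$No.Damage)

# One column only: x axis is the row number, y axis is the value
plot(toy$No.Damage)

# Two columns, Option 1: axis labels include the data frame name
plot(toy$Damage_num ~ toy$No.Damage)

# Two columns, Option 2: the data argument gives cleaner axis labels
plot(Damage_num ~ No.Damage, data = toy)
```

In each plot, the made-up value of 1960 stands apart from the rest, which is the kind of pattern to look for in real data.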
Topics in this session: dplyr; ifelse() and case_when() conditional statements; summarize() and mutate().
We are now going to learn how to subset rows and columns and other
common data wrangling tasks using packages in the
tidyverse. Taken directly from
tidyverse.org: "The tidyverse is an
opinionated collection of R packages designed for data science. All
packages share an underlying design philosophy, grammar, and data
structures."
You should have installed all of the tidyverse packages in preparation for this training. If you missed that step, install tidyverse packages using code below. It can take a few minutes for all the packages to install.
Only run if you haven't installed these packages yet
Load the tidyverse
Coding Tip: When you type library(tidyverse), you're
loading all nine of the core packages in the tidyverse. If you're only using one
or two packages, it's better to just load those two packages. It's
clearer to the user which packages are needed to run your code, and it
reduces dependencies. For this session, we're only going to use
dplyr, so I will just load that.
purrr: functions like map() that allow you
to iterate functions or processes like a for loop.
readr: read functions for csv and other
formats. The read_csv() function, for example, has more
bells and whistles than the base R read.csv() function.
I've never needed those extra features, so I just use
read.csv().
head(data.frame) over the format for
head(tibble).
dplyr
The dplyr package is perhaps the single most useful
package in R for working with your data.
Artwork by
@allison_horst
dplyr functions and their use:
Now, using the dplyr package in the
tidyverse, we're going to do the same operations we did
yesterday with brackets.
Read in example motile invertebrate data
Replace "PM" with NA (blank) in Damage columns.
# Base R
motinv2 <- motinv
motinv2$Damage[motinv2$Damage == "PM"] <- NA
motinv2$Damage_num <- as.numeric(motinv2$Damage)
# dplyr approach with mutate
motinv2 <- mutate(motinv, Damage_num = as.numeric(replace(Damage, Damage == "PM", NA)))
str(motinv2)
## 'data.frame': 682 obs. of 15 variables:
## $ Network : chr "NETN" "NETN" "NETN" "NETN" ...
## $ UnitCode : chr "ACAD" "ACAD" "ACAD" "ACAD" ...
## $ SiteCode : chr "BASHAR" "BASHAR" "BASHAR" "BASHAR" ...
## $ StartDate : chr "6/24/2013" "6/21/2013" "6/24/2013" "6/21/2013" ...
## $ Year : int 2013 2013 2013 2013 2013 2014 2014 2016 2016 2017 ...
## $ QAQC : logi TRUE FALSE TRUE FALSE TRUE FALSE ...
## $ PlotName : chr "A1" "A1" "A1" "A1" ...
## $ CommunityType : chr "Ascophyllum" "Ascophyllum" "Ascophyllum" "Ascophyllum" ...
## $ ScientificName: chr "Littorina littorea" "Littorina littorea" "Littorina obtusata" "Littorina obtusata" ...
## $ CommonName : chr "Common periwinkle" "Common periwinkle" "Smooth periwinkle" "Smooth periwinkle" ...
## $ SpeciesCode : chr "LITLIT" "LITLIT" "LITOBT" "LITOBT" ...
## $ Damage : chr "0" "0" "1" "0" ...
## $ No.Damage : int 2 3 2 6 1 2 1 6 9 41 ...
## $ Subsampled : chr "No" "No" "No" "No" ...
## $ Damage_num : num 0 0 1 0 0 0 0 0 1 0 ...
Convert StartDate (character) to Date (date-time).
# dplyr approach with mutate
motinv2 <- mutate(motinv2, Date = as.Date(StartDate, format = "%m/%d/%Y"))
Rename the ScientificName column to Species.
# dplyr approach with rename
motinv2 <- rename(motinv2, "Species" = "ScientificName")
names(motinv2)
## [1] "Network" "UnitCode" "SiteCode" "StartDate"
## [5] "Year" "QAQC" "PlotName" "CommunityType"
## [9] "Species" "CommonName" "SpeciesCode" "Damage"
## [13] "No.Damage" "Subsampled" "Damage_num" "Date"
Create a Site_Plot column that's a combination of SiteCode and PlotName.
# dplyr approach with mutate
motinv2 <- mutate(motinv2, Site_Plot = paste(SiteCode, PlotName, sep = "-"))
Drop records that are QAQC visits and drop original Damage column.
# Base R
motinv3 <- subset(motinv2, QAQC == FALSE, select = -Damage) # Note the importance of FALSE all caps
# dplyr
motinv3a <- filter(motinv2, QAQC == FALSE)
motinv3 <- select(motinv3a, -Damage)
head(motinv3)
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/28/2016 2016 FALSE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/28/2016 2016 FALSE A1 Ascophyllum
## Species CommonName SpeciesCode No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 3 No
## 2 Littorina obtusata Smooth periwinkle LITOBT 6 No
## 3 Littorina littorea Common periwinkle LITLIT 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 1 No
## 5 Littorina littorea Common periwinkle LITLIT 6 No
## 6 Littorina obtusata Smooth periwinkle LITOBT 9 No
## Damage_num Date Site_Plot
## 1 0 2013-06-21 BASHAR-A1
## 2 0 2013-06-21 BASHAR-A1
## 3 0 2014-06-21 BASHAR-A1
## 4 0 2014-06-21 BASHAR-A1
## 5 0 2016-06-28 BASHAR-A1
## 6 1 2016-06-28 BASHAR-A1
dplyr.
The filter() function reduces rows. The
select() function reduces columns.
Reclass No.Damage outlier
|>
The pipe (|> or %>%) makes
dplyr and other tidyverse packages even more powerful. The
pipe |> allows you to string together commands. So,
taking all of the code above, we can do it all in the same function
call.
Wrangle motile invertebrate data with pipes
motinv_final <- motinv |>
mutate(Damage_num = as.numeric(replace(Damage, Damage == "PM", NA)), # Fix Damage PM
SitePlot = paste(SiteCode, PlotName, sep = "-"), # create new SitePlot column
Date = as.Date(StartDate, format = "%m/%d/%Y"), # create new Date column
No.Damage_fix = replace(No.Damage, No.Damage == 1960, 196)) |> # fix error in No.Damage
rename("Species" = "ScientificName") |> # change column name
filter(QAQC == FALSE) |> # drop QAQC visits
select(-Damage) |> # drop original Damage column
arrange(SitePlot, Year, Species) # optional sorting the data
head(motinv_final)
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/28/2016 2016 FALSE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/28/2016 2016 FALSE A1 Ascophyllum
## Species CommonName SpeciesCode No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 3 No
## 2 Littorina obtusata Smooth periwinkle LITOBT 6 No
## 3 Littorina littorea Common periwinkle LITLIT 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 1 No
## 5 Littorina littorea Common periwinkle LITLIT 6 No
## 6 Littorina obtusata Smooth periwinkle LITOBT 9 No
## Damage_num SitePlot Date No.Damage_fix
## 1 0 BASHAR-A1 2013-06-21 3
## 2 0 BASHAR-A1 2013-06-21 6
## 3 0 BASHAR-A1 2014-06-21 2
## 4 0 BASHAR-A1 2014-06-21 1
## 5 0 BASHAR-A1 2016-06-28 6
## 6 1 BASHAR-A1 2016-06-28 9
Hopefully you agree that pipes are amazing! They allow for more
efficient coding in relatively easy-to-follow steps, and make
dplyr functions like mutate() so much more useful.
Outside of pipes, for example, mutate() doesn't feel more
useful than base R for creating a new column. From now on, I will use
pipes regularly in the code.
You may also see %>%, which also functions as a
pipe in code. The %>% pipe was the original pipe,
introduced to the tidyverse by the magrittr package.
The magrittr pipe was so popular that, starting in R 4.1, a
base R pipe was introduced (|>). It's supposed to be
better optimized for order of operations and means one less package
to install. So, in general, use the base R pipe |>. It's
also why I had you set the default pipe in Global Options to
|>. A useful keyboard shortcut for the pipe is Ctrl +
Shift + M. You should see the |> pipe in your script
when you type that shortcut. If you get the %>% pipe
instead, you need to change that default setting in Global Options (see
Day 1 > R and RStudio > RStudio Global Options > Step 3. Change
default pipe).
Coding Tip: While the number of steps you can pipe together is virtually endless, piping many tasks, especially complex ones, can make code hard to read and troubleshoot. It's best to limit the number of pipes to 3-4, and/or to run complex tasks that might fail or require checking on their own.
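To see what the pipe is doing, it can help to compare a piped chain with the equivalent nested calls. This is a minimal sketch using only base R functions and a made-up vector `x`. It assumes R 4.1 or later, since that's when |> became available.

```r
# A small numeric vector to work with
x <- c(1, 4, 9, 16)

# Nested calls read inside-out
nested <- round(mean(sqrt(x)), 1)

# The same steps with the base R pipe read top to bottom
piped <- x |>
  sqrt() |>
  mean() |>
  round(1)

identical(nested, piped)
```

The pipe passes the result on its left as the first argument of the function on its right, so both versions return the same value.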
CHALLENGE: Using motinv, how many species are found
in PlotName A1 in 2024?
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/11/2019 2019 FALSE R4 Red Algae
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 11 1960 No
CHALLENGE: Fix the No.Damage typo by replacing 1960 with 196.
Let's say that you looked at the datasheet, and the actual count for No.Damage was 196 instead of 1960. You can change that value in the original CSV by hand. But even better is to document that change in code. There are multiple ways to do this. Two examples are below.
But first, it's good to create a new data frame when modifying the original data frame, so you can refer back to the original if needed. I also use a really specific filter to make sure I'm not accidentally changing other data.
Replace 1960 with 196
# Reminder of the base R approach
# create copy of motinv data
motinv_fix <- motinv
# find the problematic value, and change it to 196
motinv_fix$No.Damage[motinv_fix$Year == 2019 &
motinv_fix$PlotName == "R4" &
motinv_fix$No.Damage == 1960] <- 196
# dplyr approach
motinv_fix <- motinv |> mutate(No.Damage = replace(No.Damage, No.Damage == 1960, 196))
range(motinv_fix$No.Damage)
## [1] 0 282
Conditional functions ifelse(),
if(){ }else{ }, and case_when() allow you to
return results that depend on specified conditions.
ifelse(): Primarily for use with data frames. Takes 3
arguments: 1) the condition to test; 2) the value to return if the condition
is true; 3) the value to return if the condition is false. Function can
only handle 2 possible outcomes, although nested ifelse()
statements are possible (see example below). This function is
vectorized, which means it's optimized for working on columns in data
frames. Of the 3 conditionals, it tends to perform the fastest on large
data sets.
case_when(): Primarily for use with data frames. Can take
any number of condition statements and their value to return. Requires
dplyr package to be loaded. Syntax is a bit tricky to
figure out at first, but once you have it, it's about as easy as using
ifelse(). This function is akin to SQL CASE WHEN. On large
data sets, it consistently performs slower than ifelse().
if(){ }else{ }: Can be used with data frames, but is more
commonly used for operations outside of data frames. An example would be
only running a chunk of code if a certain condition is met (e.g., if the
data frame has > 0 rows, run next line of code.)
ifelse()
The ifelse() function takes 3 arguments organized like:
ifelse(condition == TRUE, return this, return this instead).
The first is the condition you're testing. The second argument is what
to return if the condition is met. The third is what to return if the
condition is not met. You can also nest ifelse() to include
more than 2 conditions, but it can quickly get out of hand and hard to
follow (see below).
Let's start by adding a column to the motile invertebrate data that uses
the SpeciesCode to create a new column called
native that is either TRUE for native species, or FALSE for
non-native species. We'll add a second column named
native_status that is "native", "exotic", or "invasive". The
invasive group includes Asian shore crabs and green crabs.
Create nativity column conditioning on SpeciesCode
# green crab, Asian shore crab, and common periwinkle species codes
exo_spp <- c("CARMAE", "HEMISAN", "LITLIT")
# smooth periwinkle, rough periwinkle, dogwhelk, and limpet species codes
nat_spp <- c("LITOBT", "LITSAX", "NUCLAP", "TECTES")
# Make a table of species codes in BASHAR
table(motinv$SpeciesCode)
##
## CARMAE LITLIT LITOBT LITSAX NUCLAP TECTES
## 47 220 197 20 116 82
# Add native column with ifelse
motinv <- motinv |> mutate(native = ifelse(SpeciesCode %in% nat_spp, TRUE, FALSE))
# Add native_status column with nested ifelse
motinv <- motinv |> mutate(native_status = ifelse(SpeciesCode %in% nat_spp, "native",
ifelse(SpeciesCode %in% c("CARMAE", "HEMISAN"), "invasive",
"exotic")))
table(motinv$SpeciesCode, motinv$native)
##
## FALSE TRUE
## CARMAE 47 0
## LITLIT 220 0
## LITOBT 0 197
## LITSAX 0 20
## NUCLAP 0 116
## TECTES 0 82
table(motinv$SpeciesCode, motinv$native_status)
##
## exotic invasive native
## CARMAE 0 47 0
## LITLIT 220 0 0
## LITOBT 0 0 197
## LITSAX 0 0 20
## NUCLAP 0 0 116
## TECTES 0 0 82
Note the use of %in% instead of ==
because exotic has multiple species codes. Only use == when
there's only one value to match against. The %in%
approach can work with any number of matching values, so I
almost always use %in% instead of ==.
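Here is a small base R sketch of why == is risky when matching against more than one value. The `codes` vector is made up for illustration. The warning from == is suppressed here only so the comparison can be shown side by side.

```r
codes <- c("CARMAE", "LITLIT", "LITOBT", "HEMISAN")
exo_spp <- c("CARMAE", "HEMISAN", "LITLIT")

# %in% asks, for each code, "is this value anywhere in exo_spp?"
in_match <- codes %in% exo_spp

# == compares element by element and recycles the shorter vector,
# so it compares codes[1] to exo_spp[1], codes[2] to exo_spp[2], and so on
eq_match <- suppressWarnings(codes == exo_spp)

data.frame(codes, in_match, eq_match)
```

With %in%, CARMAE, LITLIT, and HEMISAN are all flagged. With ==, only CARMAE is flagged, because it happens to sit in the same position in both vectors.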
case_when()
The case_when() function allows you to have multiple
conditions, each with their own return. The syntax is a bit different
than ifelse() to allow for the multiple conditions and
returns. Using the same approach as above, we'll recreate the
native_status column with case_when(). We'll then add a
fourth output for species codes that don't match any of the previous
conditions and set that as 'unknown'. Basically the TRUE
just means, any records left are assigned 'unknown'.
Note the order of operations in case_when(). The first
step assigns native species a 'native' status. Then, only non-native
species are left to condition on. The next step assigns CARMAE and
HEMISAN as 'invasive'. The third step conditions on species in the
exo_spp group, but only those that weren't already handled in previous
steps. The fourth statement then assigns 'unknown' to any species not matched as
native, invasive, or exotic. Rather than relying on this function behavior,
it's better to not have overlapping categories (e.g., not include CARMAE
and HEMISAN in exo_spp). I include it here to demonstrate the
point.
Create status column conditioning on SpeciesCode
# green crab, Asian shore crab, and common periwinkle species codes
exo_spp <- c("CARMAE", "HEMISAN", "LITLIT")
# smooth periwinkle, rough periwinkle, dogwhelk, and limpet species codes
nat_spp <- c("LITOBT", "LITSAX", "NUCLAP", "TECTES")
motinv <- motinv |>
mutate(native_status = case_when(SpeciesCode %in% nat_spp ~ 'native',
SpeciesCode %in% c("CARMAE", "HEMISAN") ~ 'invasive',
SpeciesCode %in% exo_spp ~ 'exotic',
TRUE ~ 'unknown'))
table(motinv$SpeciesCode, motinv$native_status) # check that the output worked
##
## exotic invasive native
## CARMAE 0 47 0
## LITLIT 220 0 0
## LITOBT 0 0 197
## LITSAX 0 0 20
## NUCLAP 0 0 116
## TECTES 0 0 82
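To make the point about order concrete, here is a sketch that runs the same conditions in two different orders. The `codes` data frame and the "XXXXXX" code are made up for illustration. The column names `status_a` and `status_b` are hypothetical.

```r
library(dplyr)

# Hypothetical species codes; "XXXXXX" is a made-up code that matches nothing
codes <- data.frame(SpeciesCode = c("CARMAE", "LITLIT", "LITOBT", "XXXXXX"))
exo_spp <- c("CARMAE", "HEMISAN", "LITLIT")
nat_spp <- c("LITOBT", "LITSAX", "NUCLAP", "TECTES")

codes <- codes |>
  mutate(
    # invasive is tested before exotic, so CARMAE is labeled invasive
    status_a = case_when(SpeciesCode %in% nat_spp ~ "native",
                         SpeciesCode %in% c("CARMAE", "HEMISAN") ~ "invasive",
                         SpeciesCode %in% exo_spp ~ "exotic",
                         TRUE ~ "unknown"),
    # exotic is tested first, so the invasive line is never reached for CARMAE
    status_b = case_when(SpeciesCode %in% nat_spp ~ "native",
                         SpeciesCode %in% exo_spp ~ "exotic",
                         SpeciesCode %in% c("CARMAE", "HEMISAN") ~ "invasive",
                         TRUE ~ "unknown")
  )
codes
```

Each row gets the value from the first condition it matches, which is why non-overlapping categories are the safer design.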
if(){ }else{ }
This style of conditional, if(){ }else{ }, hereafter called if/else,
is best used for operations outside of data frames, like
turning code on or off based on specific conditions. I use if/else with
ggplot (graphing R package we'll cover later) to turn certain features
on or off based on a condition in the data or a condition I set. If/else
statements are also helpful for error handling in your code. For example,
if you want the code to send a warning when your data frame is empty (no
rows), you can have an if/else statement that prints to the console. You
can string together multiple conditions to test by adding
else if(){ } statements.
Print warning in console that indicates if invasive species are found in the motile invertebrate data.
inv <- motinv |> filter(native_status == "invasive")
spp_det <- unique(inv$CommonName)
if(nrow(inv) > 0){print(paste0("The following invasive species were detected in the data: ",
paste0(spp_det, collapse = ", ")))
} else {print("No invasive species were detected in the data.")}
## [1] "The following invasive species were detected in the data: Green crab"
Force the else statement to print, by filtering out invasive species before testing. I added another potential else statement just to show that syntax.
inv <- motinv |> filter(SpeciesCode %in% nat_spp) |>
filter(native_status == "invasive")
spp_det <- unique(inv$CommonName)
if(nrow(inv) > 0){print(paste0("The following invasive species were detected in the data: ",
paste0(spp_det, collapse = ", ")))
} else {print("No invasive species were detected in the data.")}
## [1] "No invasive species were detected in the data."
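For the syntax with more than two branches, here is a minimal base R sketch that uses else if. The function `describe_rows()` and the row-count cutoff of 5 are hypothetical and only there to show the structure.

```r
# Hypothetical helper for illustration; not part of the training code
describe_rows <- function(df) {
  n <- nrow(df)
  if (n == 0) {
    "The data frame is empty."
  } else if (n < 5) {
    paste0("The data frame is small (", n, " rows).")
  } else {
    paste0("The data frame has ", n, " rows.")
  }
}

empty_df <- data.frame(x = numeric(0))
small_df <- data.frame(x = 1:3)
big_df   <- data.frame(x = 1:10)

print(describe_rows(empty_df))
print(describe_rows(small_df))
print(describe_rows(big_df))
```

Conditions are checked from top to bottom, and only the first branch that is TRUE runs.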
pred <- c("CARMAE", "NUCLAP")
# base R
motinv$trophic <- ifelse(motinv$SpeciesCode %in% pred, "predator", "herbivore")
table(motinv$trophic, motinv$SpeciesCode)
##
## CARMAE LITLIT LITOBT LITSAX NUCLAP TECTES
## herbivore 0 220 197 20 0 82
## predator 47 0 0 0 116 0
# tidyverse
motinv <- motinv |> mutate(trophic = ifelse(SpeciesCode %in% pred, "predator", "herbivore"))
table(motinv$trophic, motinv$SpeciesCode)
##
## CARMAE LITLIT LITOBT LITSAX NUCLAP TECTES
## herbivore 0 220 197 20 0 82
## predator 47 0 0 0 116 0
# Base R using a nested ifelse()
motinv$count_level <-
ifelse(motinv$No.Damage > 35, "High",
ifelse(motinv$No.Damage >= 10 & motinv$No.Damage <= 35, "Medium", "Low"))
table(motinv$count_level) # check that it worked
##
## High Low Medium
## 167 352 163
# Tidyverse using case_when() and between
motinv <- motinv |> mutate(count_level = case_when(No.Damage > 35 ~ "High",
between(No.Damage, 10, 35) ~ "Medium",
No.Damage < 10 ~ "Low"))
table(motinv$count_level) # check that it worked
##
## High Low Medium
## 167 352 163
Note the use of the between() function, which saves
typing. This function is inclusive on both ends (>= and <=).
summarize()
Yesterday, we used functions like mean(),
min(), and max() to summarize entire datasets.
Now we're going to use those same functions to summarize data by
grouping variables, such as park, year, plot, etc. The process is
similar to using Totals in Access or subtotals in Excel, although it is
more flexible and efficient in R.
Differences between summarize() and mutate():
mutate() returns the same number of rows as the
original data frame. This function also returns all of the columns that
were in the original data frame.
summarize() returns the same number of rows as there are
grouping levels in the original data frame. This function only
returns the columns that were part of the .by = c() and that
were created in the summarize() function.
mean(): calculate the group means
min(): calculate the group minimums
max(): calculate the group maximums
sum(): calculate the group sums
sd(): calculate the group standard deviations
n(): tally the number of rows within each group
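The functions listed above can all be used in one summarize() call. This is a sketch on a made-up data frame. `toy`, `grp`, and the output column names are hypothetical, and the `.by` argument assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical counts for two groups (values are made up for illustration)
toy <- data.frame(
  grp   = c("a", "a", "a", "b", "b"),
  count = c(2, 4, 6, 10, 20)
)

toy_sum <- toy |>
  summarize(mean_count = mean(count),
            min_count  = min(count),
            max_count  = max(count),
            sum_count  = sum(count),
            sd_count   = sd(count),
            n_rows     = n(),                    # rows in each group
            se_count   = sd(count) / sqrt(n()),  # standard error from the group's own n
            .by = grp)
toy_sum
```

Using n() inside the standard error calculation is one option when groups can have different numbers of rows, rather than typing a fixed sample size.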
To demonstrate summarize and mutate in dplyr, we're going to use the point intercept data collected along 3 transects in the Bass Harbor site. The data have already been summarized by transect (T1, T2, T3). We now want to calculate site-level median elevation and percent frequency for each cover type.
Read in the point intercept data
## SiteCode PlotName Year CoverType CoverCode med_elev num_counts
## 1 BASHAR T1 2018 Rock ROCK 4.340922 19
## 2 BASHAR T1 2018 Crustose non-coralline NONCOR 3.389422 14
## 3 BASHAR T1 2018 Water WATER 4.404461 4
## 4 BASHAR T1 2018 Bolt BOLT 4.107183 2
## 5 BASHAR T1 2018 Other Algae - Green ALGGRE 3.823654 7
## 6 BASHAR T1 2018 Barnacle BARSPP 2.519213 6
## samp_counts pct_freq
## 1 153 12.418301
## 2 153 9.150327
## 3 153 2.614379
## 4 153 1.307190
## 5 153 4.575163
## 6 153 3.921569
Using mutate(), calculate the average percent frequency and
median elevation by CoverType and Year
pi_dat_mut <- pi_dat |> mutate(med_elev_sl = median(med_elev),
avg_pct_freq = mean(pct_freq),
.by = c(SiteCode, Year, CoverType, CoverCode))
nrow(pi_dat) #314
nrow(pi_dat_mut) #314
head(pi_dat_mut)
Note how pi_dat_mut has the same number of rows and all
the original columns plus the two we calculated (site-level median, and
average % frequency). More often we're interested in reducing the data
to one row per grouping level. That's what summarize() is
for.
Using summarize(), calculate the average percent frequency,
median elevation, min/max of frequency and elevation by CoverType and
Year.
pi_dat_sum <- pi_dat |> summarize(elev_sl_med = median(med_elev),
elev_sl_min = min(med_elev),
elev_sl_max = max(med_elev),
avg_pct_freq = mean(pct_freq),
.by = c(SiteCode, Year, CoverType, CoverCode))
nrow(pi_dat) #314
nrow(pi_dat_sum) #124
head(pi_dat_sum)
Note how pi_dat_sum has fewer rows (124 instead of 314) and only the
grouping columns and the four we calculated (site-level median, minimum,
and maximum elevation, and average % frequency).
The mutate(.by = c()) approach is helpful if you're
trying to standardize values within your group. But in most cases, the
summarize(.by = c()) approach, which collapses to the group
level, is what you're looking for. Note that in older versions of dplyr,
the syntax was group_by() |> summarize() with no
.by = c().
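If you run into the older syntax in someone else's code, this sketch shows the two forms side by side on a made-up data frame. `toy`, `by_new`, and `by_old` are hypothetical names, and the `.by` form assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical data (values are made up for illustration)
toy <- data.frame(
  Year      = c(2018, 2018, 2019, 2019),
  CoverCode = c("ROCK", "ROCK", "ROCK", "BOLT"),
  pct_freq  = c(10, 20, 30, 5)
)

# Newer syntax: grouping applies to this call only
by_new <- toy |>
  summarize(avg_pct_freq = mean(pct_freq), .by = c(Year, CoverCode))

# Older syntax: group first, summarize, then drop the grouping
by_old <- toy |>
  group_by(Year, CoverCode) |>
  summarize(avg_pct_freq = mean(pct_freq), .groups = "drop")

# Same values; group_by() sorts the groups, .by keeps the order of first appearance
by_new
by_old
```

The values match, but the row order can differ, so add arrange() if the order matters for your output.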
Summarize the average and standard error of total counts for the motile invertebrate data by CommunityType.
We will first fix the 1960 and PM errors from before, drop QAQC visits, then combine the Damage and No.Damage columns to make a total_count column.
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196), # condition must test No.Damage, not Damage
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
head(motinv_sum)
CHALLENGE: Using the point intercept data (pi_dat),
calculate the average percent frequency of each non-vegetated substrate
by year. Note that non-vegetated substrates are CoverCode = c('BOLT',
'ROCK', 'WATER').
CHALLENGE: Using the point intercept data (pi_dat),
calculate the average percent frequency of non-vegetated vs.
vegetated cover types by year. Note that non-vegetated substrates are
CoverCode = c('BOLT', 'ROCK', 'WATER').
pi_subtype <- pi_dat |>
mutate(sub_type = ifelse(CoverCode %in% c("BOLT", "ROCK", "WATER"), "nonveg", "veg")) |> # classify as nonveg or veg
summarize(avg_freq = mean(pct_freq), # calc avg.
.by = c(SiteCode, Year, sub_type)) |> # grouping variables
arrange(SiteCode, Year, sub_type) # sort variables
head(pi_subtype) # check output
Reshaping data from long to wide and wide to long is a common task with our data. Datasets are usually described as long or wide. The long form, which is the structure database tables often take, consists of each row being an observation, and each column being a variable (i.e., tidy format). However, in summary tables, we often want to reshape the data to be wide so it is easier to digest.
We’ll work with the point intercept data again to demonstrate pivoting, and will use the data frame we created by summarizing median elevation and average percent frequency of each cover code by year. If you don't have that data frame yet, run the code below.
# load the package
library(dplyr)
library(tidyr) # for pivot functions
#--- import the raw point intercept data
pi_dat <- read.csv("./data/BASHAR_Point_Intercept_data.csv")
# summarize data by site, year, and cover type
pi_dat_sum <- pi_dat |> summarize(med_elev_sl = median(med_elev, na.rm = TRUE),
avg_pct_freq = mean(pct_freq, na.rm = TRUE),
.by = c(SiteCode, Year, CoverType, CoverCode))
With pi_dat_sum we're going to pivot the data wide to
make each CoverCode a separate column and the values in
each cell be the avg_pct_freq. The code below is pretty straightforward,
with names_from being the column you want to turn into
column names, and values_from being the column whose values you want
in the cells.
Pivot point intercept data to wide
When pivoting long to wide, you're reducing the number of rows to
have one observation for each level of the variable you're pivoting
on. If you have other variables in your data frame, like CoverType and
med_elev_sl in this data frame, you have to drop those columns for the
pivot to result in one observation per level of the pivoted variable. If
that doesn't make sense, try running the pivot_wider()
without dropping CoverType or med_elev_sl, and you'll see what I mean. I
also added the arrange() by CoverCode, so the columns are
sorted alphabetically in the pivot.
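Here is a small sketch of what happens when an extra column is left in before pivoting. The `toy` data frame is made up for illustration, and `wide_extra` and `wide_clean` are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Hypothetical summary table (values are made up for illustration)
toy <- data.frame(
  Year         = c(2018, 2018, 2019, 2019),
  CoverType    = c("Rock", "Bolt", "Rock", "Bolt"),
  CoverCode    = c("ROCK", "BOLT", "ROCK", "BOLT"),
  avg_pct_freq = c(12, 1, 15, 2)
)

# Keeping CoverType: it acts as an extra identifier, so rows don't collapse
wide_extra <- toy |>
  pivot_wider(names_from = CoverCode, values_from = avg_pct_freq)

# Dropping CoverType first: one row per Year
wide_clean <- toy |>
  select(-CoverType) |>
  pivot_wider(names_from = CoverCode, values_from = avg_pct_freq)

nrow(wide_extra) # 4 rows, with NAs
nrow(wide_clean) # 2 rows
```

Any column that isn't named in names_from or values_from is treated as an identifier, so each unique combination of those columns gets its own row.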
pi_wide <- pi_dat_sum |>
arrange(CoverCode, Year) |> # sort by CoverCode and year
select(-CoverType, -med_elev_sl) |> # Drop extra column
pivot_wider(names_from = CoverCode, # column that will produce column names
values_from = avg_pct_freq) # column to make the values
head(pi_wide)
## # A tibble: 6 × 22
## SiteCode Year ALGGRE ALGRED ARTCOR ASCNOD BARSPP BOLT CHOMAS CRUCOR FUCEPI
## <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 BASHAR 2013 0.654 0.983 1.99 NA 16.2 NA 10.5 NA 1.94
## 2 BASHAR 2014 1.98 NA 0.652 NA 14.9 0.662 7.46 NA 3.50
## 3 BASHAR 2015 4.16 NA NA 0.976 12.1 NA 5.47 NA 2.40
## 4 BASHAR 2016 3.98 NA 0.667 NA 9.14 NA 6.86 0.667 1.99
## 5 BASHAR 2017 10.3 0.983 0.987 NA 12.6 NA 10.1 NA 4.16
## 6 BASHAR 2018 5.19 NA 0.650 NA 8.59 1.30 3.71 NA 6.53
## # ℹ 11 more variables: FUCSPP <dbl>, NONCOR <dbl>, OTHINV <dbl>, OTHSUB <dbl>,
## # PALPAL <dbl>, PORSPP <dbl>, ROCK <dbl>, ULVINT <dbl>, ULVLAC <dbl>,
## # UNIDEN <dbl>, WATER <dbl>
That was pretty simple. But there are a lot of blanks where a
CoverCode wasn't detected in a given year and site. We can use the
values_fill argument to save us time filling blanks as
0s.
Pivot point intercept data to wide filling blanks as 0
pi_wide <- pi_dat_sum |>
arrange(CoverCode, Year) |>
select(-CoverType, -med_elev_sl) |>
pivot_wider(names_from = CoverCode,
values_from = avg_pct_freq,
values_fill = 0) # new line
head(pi_wide)
## # A tibble: 6 × 22
## SiteCode Year ALGGRE ALGRED ARTCOR ASCNOD BARSPP BOLT CHOMAS CRUCOR FUCEPI
## <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 BASHAR 2013 0.654 0.983 1.99 0 16.2 0 10.5 0 1.94
## 2 BASHAR 2014 1.98 0 0.652 0 14.9 0.662 7.46 0 3.50
## 3 BASHAR 2015 4.16 0 0 0.976 12.1 0 5.47 0 2.40
## 4 BASHAR 2016 3.98 0 0.667 0 9.14 0 6.86 0.667 1.99
## 5 BASHAR 2017 10.3 0.983 0.987 0 12.6 0 10.1 0 4.16
## 6 BASHAR 2018 5.19 0 0.650 0 8.59 1.30 3.71 0 6.53
## # ℹ 11 more variables: FUCSPP <dbl>, NONCOR <dbl>, OTHINV <dbl>, OTHSUB <dbl>,
## # PALPAL <dbl>, PORSPP <dbl>, ROCK <dbl>, ULVINT <dbl>, ULVLAC <dbl>,
## # UNIDEN <dbl>, WATER <dbl>
Now we see that every cell has a value. Another useful argument in
pivot_wider() is names_prefix. That allows you
to add a string before the column names that are generated in the pivot.
This is helpful if you're pivoting on a number column, like year or plot
number. R doesn't like column names that start with a number. The
names_prefix is a quick way to fix that. To demonstrate, I'll pivot on
year instead of CoverCode.
Pivot point intercept data wide using Year instead of CoverCode and add prefix.
pi_wide_yr <- pi_dat_sum |>
arrange(Year) |>
select(-med_elev_sl) |>
pivot_wider(names_from = Year, # pivot on year instead of CoverCode
values_from = avg_pct_freq,
values_fill = 0,
names_prefix = "yr_") # new line
head(pi_wide_yr)
## # A tibble: 6 × 14
## SiteCode CoverType CoverCode yr_2013 yr_2014 yr_2015 yr_2016 yr_2017 yr_2018
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 BASHAR Rock ROCK 15.4 14.6 16.4 10.5 16.2 11.3
## 2 BASHAR Water WATER 2.82 1.95 3.03 2.84 1.96 2.82
## 3 BASHAR Crustose n… NONCOR 10.2 10.3 8.03 7.72 4.57 8.65
## 4 BASHAR Barnacle BARSPP 16.2 14.9 12.1 9.14 12.6 8.59
## 5 BASHAR Rockweed FUCSPP 31.2 43.7 45.7 56.3 35.7 50.1
## 6 BASHAR Unidentifi… UNIDEN 2.65 0.658 0 0 0 0
## # ℹ 5 more variables: yr_2019 <dbl>, yr_2021 <dbl>, yr_2022 <dbl>,
## # yr_2023 <dbl>, yr_2024 <dbl>
CHALLENGE: Use the motinv_sum data frame from
the "Summarizing with dplyr" tab to pivot on SpeciesCode and mean_count,
and fill the NAs with 0s. If you don't have the motinv_sum data frame
handy, run the code below to create it.
Hint: Drop the
ScientificName and CommonName columns before you pivot.
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196), # condition must test No.Damage, not Damage
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
motinv_wide <- motinv_sum |>
arrange(SpeciesCode) |> # sorting so columns are alphabetical
select(-ScientificName, -CommonName) |>
pivot_wider(names_from = SpeciesCode,
values_from = mean_count,
values_fill = 0)
head(motinv_wide)
## # A tibble: 6 × 10
## SiteCode Year CommunityType se_counts CARMAE LITLIT LITOBT LITSAX NUCLAP
## <chr> <int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 BASHAR 2021 Ascophyllum 0.980 4.4 0 0 0 0
## 2 BASHAR 2022 Ascophyllum 0.224 1 0 0 0 0
## 3 BASHAR 2023 Ascophyllum 0.2 2.8 0 0 0 0
## 4 BASHAR 2024 Ascophyllum 0 0.8 0 0 0 0
## 5 BASHAR 2019 Ascophyllum 0.632 1.2 0 0 0 0
## 6 BASHAR 2021 Barnacle 1.73 2.6 0 0 0 0
## # ℹ 1 more variable: TECTES <dbl>
CHALLENGE: Use the motinv_sum data frame from
the "Summarizing with dplyr" tab to pivot on Year and mean_count, fill
the NAs with 0s, and add "yr_" to the column names to prevent column
names starting with numbers. If you don't have the motinv_sum data frame
handy, run the code below to create it.
Hint: Drop the se_counts
column before you pivot.
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196), # condition must test No.Damage, not Damage
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
motinv_wide_yr <- motinv_sum |>
arrange(Year) |> # sorting so year columns are in chronological order
select(-se_counts) |>
pivot_wider(names_from = Year,
values_from = mean_count,
values_fill = 0,
names_prefix = "yr_")
head(motinv_wide_yr)
## # A tibble: 6 × 16
## SiteCode CommunityType ScientificName CommonName SpeciesCode yr_2013 yr_2014
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 BASHAR Ascophyllum Littorina litto… Common pe… LITLIT 14 20.8
## 2 BASHAR Ascophyllum Littorina obtus… Smooth pe… LITOBT 19 18.4
## 3 BASHAR Ascophyllum Nucella lapillus Dogwhelk NUCLAP 0.6 0.2
## 4 BASHAR Barnacle Littorina litto… Common pe… LITLIT 0.4 0.6
## 5 BASHAR Barnacle Littorina obtus… Smooth pe… LITOBT 0.2 1.6
## 6 BASHAR Fucus Littorina litto… Common pe… LITLIT 13.6 25.6
## # ℹ 9 more variables: yr_2015 <dbl>, yr_2016 <dbl>, yr_2017 <dbl>,
## # yr_2018 <dbl>, yr_2019 <dbl>, yr_2021 <dbl>, yr_2022 <dbl>, yr_2023 <dbl>,
## # yr_2024 <dbl>
We can reshape the point intercept data back to long, which will give us
data similar to before, but with the 0s added. For the
pivot_longer() function, you have to tell it which columns to
pivot on. If you don't specify, it will make the entire dataset into 2
long columns, which you typically don't want. Here I tell R not to pivot
on the SiteCode and Year columns, because I know they're in the data frame
and unlikely to change. If I instead specified the species codes to
pivot on, and a new species were found in the next year of sampling, I'd
have to update this code to include that new species.
pi_long <- pi_wide |> pivot_longer(cols = -c(SiteCode, Year),
names_to = "SpeciesCode",
values_to = "Avg_Pct_Freq")
head(pi_long)
## # A tibble: 6 × 4
## SiteCode Year SpeciesCode Avg_Pct_Freq
## <chr> <int> <chr> <dbl>
## 1 BASHAR 2013 ALGGRE 0.654
## 2 BASHAR 2013 ALGRED 0.983
## 3 BASHAR 2013 ARTCOR 1.99
## 4 BASHAR 2013 ASCNOD 0
## 5 BASHAR 2013 BARSPP 16.2
## 6 BASHAR 2013 BOLT 0
Note that for pivot_longer() the names_prefix = "" argument removes
the string you specify from the names of the columns you're pivoting
on, whereas in pivot_wider() it adds the string to the new column
names. In other words, it does the opposite.
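As a minimal sketch of that round trip, the toy example below adds a prefix with pivot_wider() and strips it again with pivot_longer(). The toy_long data frame and its values are made up for illustration and are not part of the training data.

```r
library(tidyr)

# Hypothetical toy data standing in for a summarized count table
toy_long <- data.frame(SiteCode = c("A", "A", "B"),
                       Year = c(2023, 2024, 2024),
                       mean_count = c(1.5, 2.0, 0.5))

# Wide: names_prefix ADDS "yr_" to the new column names
toy_wide <- pivot_wider(toy_long,
                        names_from = Year,
                        values_from = mean_count,
                        values_fill = 0,
                        names_prefix = "yr_")
names(toy_wide)

# Long: names_prefix REMOVES "yr_" from the column names being pivoted
toy_back <- pivot_longer(toy_wide,
                         cols = -SiteCode,
                         names_to = "Year",
                         values_to = "mean_count",
                         names_prefix = "yr_")
toy_back
```

Note that Year comes back as a character column after pivoting longer, so it may need to be converted with as.numeric() before further use.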
CHALLENGE: Pivot the motinv_wide_yr data frame on the years
columns, and remove the "yr_" from the year names using
names_prefix = 'yr_'.
We often need to combine data from separate tables in our work (e.g.,
relational database tables). In R we do this using either the
merge() function in base R or the *_join() functions in dplyr.
Because I find dplyr join functions to be more intuitive and to perform
faster than base R's merge, I'm going to show how to use dplyr. If you
understand the basic concepts of the join functions, you can figure out
how to merge in base R.
To demonstrate the different joins, we're going to use some fake bat capture data. One table has species captured by year and site. The other has additional site information, like X/Y coordinates, full site names, etc.
Read in fake bat site and capture data
#site data
bat_sites <- read.csv("./data/bat_site_info.csv")
# bat capture data
bat_cap <- read.csv("./data/bat_captures.csv")
# View sites listed in each
sort(unique(bat_sites$Site)) # Sites 1, 2, 3, 4, 5
## [1] "site_001" "site_002" "site_003" "site_004" "site_005"
sort(unique(bat_cap$Site)) # Sites 1, 2, 3, 5, 6
## [1] "site_001" "site_002" "site_003" "site_005" "site_006"
The key in the two bat datasets is the "Site" column. In the
bat_sites data frame, there are 5 unique sites, numbered
1:5. In the bat_cap data there are 5 unique sites, numbered
1, 2, 3, 5, 6. Therefore site_004 is only found in
bat_sites and site_006 is only found in
bat_cap.
Full join
##
## site_001 site_002 site_003 site_004 site_005 site_006
## 7 7 7 1 7 1
| Site | Unit | X | Y | SiteName | Year | LASCIN | MYOLEI | MYOSEP | MYOLUC |
|---|---|---|---|---|---|---|---|---|---|
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2019 | 1 | 0 | 0 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2020 | 0 | 1 | 1 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2021 | 0 | 1 | 0 | 1 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2022 | 1 | 2 | 0 | 1 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2023 | 0 | 1 | 0 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2024 | 0 | 0 | 0 | 2 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2025 | 0 | 2 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2019 | 1 | 1 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2020 | 0 | 2 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2021 | 1 | 1 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2022 | 0 | 0 | 0 | 1 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2023 | 0 | 0 | 1 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2024 | 0 | 2 | 0 | 1 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2025 | 1 | 0 | 1 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2019 | 0 | 1 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2020 | 0 | 1 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2021 | 0 | 3 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2022 | 0 | 2 | 0 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2023 | 0 | 1 | 0 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2024 | 0 | 2 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2025 | 0 | 1 | 0 | 1 |
| site_004 | Mount Desert Island | 549931 | 4903409 | Western Mtns | NA | NA | NA | NA | NA |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2019 | 0 | 1 | 1 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2020 | 1 | 0 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2021 | 0 | 1 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2022 | 0 | 0 | 1 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2023 | 1 | 0 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2024 | 0 | 0 | 0 | 2 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2025 | 0 | 1 | 0 | 0 |
| site_006 | NA | NA | NA | NA | 2025 | 0 | 0 | 1 | 0 |
Note how site_004, which was in the bat_sites data but not in the
bat_cap capture data, is included with NAs for the columns that came
from the bat_cap data. Likewise, site_006, which was only in the
bat_cap capture data and not in the bat_sites data, has NAs for the
columns that came from the bat_sites data.
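A full join like the one described above can be sketched with a small stand-in for the two tables. The sites and caps data frames below are hypothetical subsets typed in by hand so the example runs on its own. With the training data, the equivalent call would presumably use bat_sites and bat_cap.

```r
library(dplyr)

# Hypothetical stand-ins for the site and capture tables
sites <- data.frame(Site = c("site_001", "site_004"),
                    SiteName = c("Jordan Pond", "Western Mtns"))
caps <- data.frame(Site = c("site_001", "site_001", "site_006"),
                   Year = c(2024, 2025, 2025),
                   MYOLUC = c(2, 0, 0))

# Keep every row from both tables, matching on the Site key
full <- full_join(sites, caps, by = "Site")
table(full$Site) # number of rows per site after the join
full
```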
Inner join
##
## site_001 site_002 site_003 site_005
## 7 7 7 7
| Site | Unit | X | Y | SiteName | Year | LASCIN | MYOLEI | MYOSEP | MYOLUC |
|---|---|---|---|---|---|---|---|---|---|
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2019 | 1 | 0 | 0 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2020 | 0 | 1 | 1 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2021 | 0 | 1 | 0 | 1 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2022 | 1 | 2 | 0 | 1 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2023 | 0 | 1 | 0 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2024 | 0 | 0 | 0 | 2 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2025 | 0 | 2 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2019 | 1 | 1 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2020 | 0 | 2 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2021 | 1 | 1 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2022 | 0 | 0 | 0 | 1 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2023 | 0 | 0 | 1 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2024 | 0 | 2 | 0 | 1 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2025 | 1 | 0 | 1 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2019 | 0 | 1 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2020 | 0 | 1 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2021 | 0 | 3 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2022 | 0 | 2 | 0 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2023 | 0 | 1 | 0 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2024 | 0 | 2 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2025 | 0 | 1 | 0 | 1 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2019 | 0 | 1 | 1 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2020 | 1 | 0 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2021 | 0 | 1 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2022 | 0 | 0 | 1 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2023 | 1 | 0 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2024 | 0 | 0 | 0 | 2 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2025 | 0 | 1 | 0 | 0 |
The inner join only returns records for sites that the two datasets
have in common. Therefore, site_004 in the bat_sites data and
site_006 in the bat_cap capture data were dropped.
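The same behavior can be sketched on a small scale. The sites and caps data frames here are hypothetical stand-ins typed in by hand, not the training files.

```r
library(dplyr)

# Hypothetical stand-ins for the site and capture tables
sites <- data.frame(Site = c("site_001", "site_004"),
                    SiteName = c("Jordan Pond", "Western Mtns"))
caps <- data.frame(Site = c("site_001", "site_001", "site_006"),
                   Year = c(2024, 2025, 2025),
                   MYOLUC = c(2, 0, 0))

# Keep only rows whose Site value occurs in both tables
inner <- inner_join(sites, caps, by = "Site")
unique(inner$Site) # only the shared site remains
```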
Left join
##
## site_001 site_002 site_003 site_004 site_005
## 7 7 7 1 7
| Site | Unit | X | Y | SiteName | Year | LASCIN | MYOLEI | MYOSEP | MYOLUC |
|---|---|---|---|---|---|---|---|---|---|
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2019 | 1 | 0 | 0 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2020 | 0 | 1 | 1 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2021 | 0 | 1 | 0 | 1 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2022 | 1 | 2 | 0 | 1 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2023 | 0 | 1 | 0 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2024 | 0 | 0 | 0 | 2 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2025 | 0 | 2 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2019 | 1 | 1 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2020 | 0 | 2 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2021 | 1 | 1 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2022 | 0 | 0 | 0 | 1 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2023 | 0 | 0 | 1 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2024 | 0 | 2 | 0 | 1 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2025 | 1 | 0 | 1 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2019 | 0 | 1 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2020 | 0 | 1 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2021 | 0 | 3 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2022 | 0 | 2 | 0 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2023 | 0 | 1 | 0 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2024 | 0 | 2 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2025 | 0 | 1 | 0 | 1 |
| site_004 | Mount Desert Island | 549931 | 4903409 | Western Mtns | NA | NA | NA | NA | NA |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2019 | 0 | 1 | 1 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2020 | 1 | 0 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2021 | 0 | 1 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2022 | 0 | 0 | 1 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2023 | 1 | 0 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2024 | 0 | 0 | 0 | 2 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2025 | 0 | 1 | 0 | 0 |
The left join takes every row in the left data, bat_sites, and only
the rows in the right data, bat_cap, that have a matching site. Note
how site_004, which is only in bat_sites, is included with NAs for the
columns that came from the bat_cap data, because it had no match
there. site_006, which was only in the bat_cap data, was dropped.
Coding tip: I use left joins more than any other join because I'm usually joining tables that have a 1-to-many relationship, where the left dataset has 1 row for 1 or more rows in the right dataset. For example, say I have a dataset that only includes data for plots where an invasive species was detected and I want to do summary statistics that require the full number of plots. Using a left join, where the left dataset is a table of all of the plots and the right dataset is the invasive detections, will return the full set of plots to calculate summary statistics from. You may also have to fill in 0s where the join introduces NAs before generating summary statistics. Do this with care, and only where an NA truly means nothing was detected.
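The invasive species scenario in the tip can be sketched as follows. The all_plots and inv_detect tables, their column names, and the cover values are all made up for illustration. Using coalesce() to fill the 0s is one option among several.

```r
library(dplyr)

# Hypothetical data: 4 plots were sampled, but only 2 had a detection
all_plots <- data.frame(Plot = c("P1", "P2", "P3", "P4"))
inv_detect <- data.frame(Plot = c("P2", "P4"),
                         pct_cover = c(10, 30))

# Left join keeps all 4 plots; plots without a detection get NA
plots_inv <- left_join(all_plots, inv_detect, by = "Plot") |>
  mutate(pct_cover = coalesce(pct_cover, 0)) # NA here means not detected

mean_detected_only <- mean(inv_detect$pct_cover) # ignores the empty plots
mean_all_plots <- mean(plots_inv$pct_cover)      # uses the full set of plots
```

In this toy case the mean cover drops from 20 to 10 once the two plots without detections are counted as 0s.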
Right join
##
## site_001 site_002 site_003 site_005 site_006
## 7 7 7 7 1
| Site | Unit | X | Y | SiteName | Year | LASCIN | MYOLEI | MYOSEP | MYOLUC |
|---|---|---|---|---|---|---|---|---|---|
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2019 | 1 | 0 | 0 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2020 | 0 | 1 | 1 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2021 | 0 | 1 | 0 | 1 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2022 | 1 | 2 | 0 | 1 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2023 | 0 | 1 | 0 | 0 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2024 | 0 | 0 | 0 | 2 |
| site_001 | Mount Desert Island | 559205 | 4907461 | Jordan Pond | 2025 | 0 | 2 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2019 | 1 | 1 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2020 | 0 | 2 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2021 | 1 | 1 | 0 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2022 | 0 | 0 | 0 | 1 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2023 | 0 | 0 | 1 | 0 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2024 | 0 | 2 | 0 | 1 |
| site_002 | Schoodic | 574712 | 4909721 | SERC Campus | 2025 | 1 | 0 | 1 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2019 | 0 | 1 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2020 | 0 | 1 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2021 | 0 | 3 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2022 | 0 | 2 | 0 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2023 | 0 | 1 | 0 | 0 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2024 | 0 | 2 | 0 | 1 |
| site_003 | Mount Desert Island | 554607 | 4895800 | Bass Harbor | 2025 | 0 | 1 | 0 | 1 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2019 | 0 | 1 | 1 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2020 | 1 | 0 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2021 | 0 | 1 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2022 | 0 | 0 | 1 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2023 | 1 | 0 | 0 | 0 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2024 | 0 | 0 | 0 | 2 |
| site_005 | Mount Desert Island | 563101 | 4912371 | Sieur de Monts | 2025 | 0 | 1 | 0 | 0 |
| site_006 | NA | NA | NA | NA | 2025 | 0 | 0 | 1 | 0 |
The right join takes every row in the right data, bat_cap, and only
the rows in the left data, bat_sites, that have a matching site. Note
how site_006, which is only in bat_cap, is included with NAs for the
columns that came from the bat_sites data, because it had no match
there. site_004, which was only in the bat_sites data, was dropped.
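A small-scale sketch of a right join is below. As before, sites and caps are hypothetical stand-ins typed in by hand.

```r
library(dplyr)

# Hypothetical stand-ins for the site and capture tables
sites <- data.frame(Site = c("site_001", "site_004"),
                    SiteName = c("Jordan Pond", "Western Mtns"))
caps <- data.frame(Site = c("site_001", "site_001", "site_006"),
                   Year = c(2024, 2025, 2025),
                   MYOLUC = c(2, 0, 0))

# Keep every row of the right table (caps), plus matching site info
right <- right_join(sites, caps, by = "Site")
right
```

The same rows could be obtained with left_join(caps, sites, by = "Site"). Only the column order would differ.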
Anti join to find sites not in bat_cap
## Site Unit X Y SiteName
## 1 site_004 Mount Desert Island 549931 4903409 Western Mtns
Anti join to find sites not in bat_sites
## Site Year LASCIN MYOLEI MYOSEP MYOLUC
## 1 site_006 2025 0 0 1 0
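Both anti joins can be sketched with the same kind of toy tables. The sites and caps data frames are hypothetical stand-ins. An anti join returns the rows of the first table that have no match in the second, and it never adds columns.

```r
library(dplyr)

# Hypothetical stand-ins for the site and capture tables
sites <- data.frame(Site = c("site_001", "site_004"),
                    SiteName = c("Jordan Pond", "Western Mtns"))
caps <- data.frame(Site = c("site_001", "site_001", "site_006"),
                   Year = c(2024, 2025, 2025),
                   MYOLUC = c(2, 0, 0))

# Sites with no capture records
no_caps <- anti_join(sites, caps, by = "Site")

# Capture records with no site information
no_site <- anti_join(caps, sites, by = "Site")
```

This is a handy QA/QC check to run before a join, since it shows which keys would be dropped or padded with NAs.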
CHALLENGE: Join the motile invertebrate count data frame to the motile invertebrate species table to get Invasive and Exotic columns added to the data.
Import motinv data frames
#--- Read in motinv data if you haven't yet
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
#--- Read in species table
motspp <- read.csv("./data/motile_invert_species_table.csv")
head(motspp)
## ScientificName CommonName SpeciesCode Invasive Exotic
## 1 Littorina littorea Common periwinkle LITLIT FALSE TRUE
## 2 Littorina obtusata Smooth periwinkle LITOBT FALSE FALSE
## 3 Carcinus maenas Green crab CARMAE TRUE TRUE
## 4 Littorina saxatilis Rough periwinkle LITSAX FALSE FALSE
## 5 Nucella lapillus Dogwhelk NUCLAP FALSE FALSE
## 6 Testudinalia testudinalis Limpet TECTES FALSE FALSE
## [1] "ScientificName" "CommonName" "SpeciesCode"
# Left join the species table to motinv, so species not found in the count data are not included
motinv_spp <- left_join(motinv,
motspp,
by = c("SpeciesCode", "ScientificName", "CommonName"))
head(motinv_spp)
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 0 2 No
## 2 Littorina littorea Common periwinkle LITLIT 0 3 No
## 3 Littorina obtusata Smooth periwinkle LITOBT 1 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 0 6 No
## 5 Nucella lapillus Dogwhelk NUCLAP 0 1 No
## 6 Littorina littorea Common periwinkle LITLIT 0 2 No
## Invasive Exotic
## 1 FALSE TRUE
## 2 FALSE TRUE
## 3 FALSE FALSE
## 4 FALSE FALSE
## 5 FALSE FALSE
## 6 FALSE TRUE
There are a number of other more advanced joins out there, the rolling join being one of them. For more information on all possible joins, refer to Chapter 19 in R for Data Science.
Rolling joins can come in handy if the key values in your two datasets don't perfectly match, and you want to join on the closest match. An example of where I've used rolling joins is to relate timing of high tide to the nearest water temperature measurement from a HOBO logger. You can allow for the nearest match in both directions or specify the direction (e.g., >= or <=).
Unfortunately, dplyr's rolling join doesn't perform the way I've
needed it to. It only matches in one direction, like the closest
temperature measurement after high tide, or the closest temperature
measurement before high tide. If you need to do a rolling join, the
data.table package is your best bet. It requires learning a
new syntax and coding approach, so I'm not covering it here. But it's
helpful to know that if you're working with huge datasets,
data.table tends to perform much faster than dplyr and may
have more features for joining and summarizing your data than dplyr.
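For reference, here is a minimal sketch of what the one-directional dplyr rolling join looks like, using join_by() with closest(), which requires dplyr 1.1.0 or later. The high_tide and hobo tables are made up, and times are stored as minutes since midnight to keep the example simple. The last step, which picks the nearer of the two directions, is just one possible workaround and not the approach used in this training.

```r
library(dplyr) # join_by() and closest() require dplyr >= 1.1.0

# Hypothetical data: times are minutes since midnight
high_tide <- data.frame(tide_id = c("tide1", "tide2"),
                        tide_min = c(100, 200))
hobo <- data.frame(temp_min = c(90, 98, 105, 195, 230),
                   temp_c = c(10.1, 10.4, 10.9, 12.0, 12.6))

# Closest temperature reading at or BEFORE each high tide
before <- left_join(high_tide, hobo,
                    by = join_by(closest(tide_min >= temp_min)))

# Closest temperature reading at or AFTER each high tide
after <- left_join(high_tide, hobo,
                   by = join_by(closest(tide_min <= temp_min)))

# Workaround: stack both directions and keep the smaller time gap
nearest <- bind_rows(before, after) |>
  mutate(gap = abs(tide_min - temp_min)) |>
  slice_min(gap, n = 1, by = tide_id, with_ties = FALSE) |>
  arrange(tide_id)
```

For large logger datasets this two-join approach may be slow, which is where data.table is likely the better tool.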
Dates, times and date-times are all special types of data in R. When
you read in a dataset that has any of these, they typically will read in
as a character. You then have to convert it into a date/time to do
anything meaningful with it. The first place to start is knowing the
codes R uses to define year, month, day, hours, minutes, and seconds.
The codes below are the ones you're most likely to come across, either
to define a date/time format, or to return a specific format (like day
of the week, month written in full, Julian day, etc.). For the full
list, check out the help for strptime by running: ?strptime.
| Code | Definition |
|---|---|
| %a | Abbreviated weekday name in the current locale on this platform. |
| %A | Full weekday name in the current locale. |
| %b | Abbreviated month name in the current locale on this platform. Case-insensitive on input. |
| %B | Full month name in the current locale. Case-insensitive on input. |
| %d | Day of the month as decimal number (01-31). |
| %H | Hours as decimal number (00-23). As a special exception strings such as "24:00:00" are accepted for input. |
| %I | Hours as decimal number (01-12). |
| %j | Day of year (Julian) as decimal number (001-366): For input, 366 is only valid in a leap year. |
| %m | Month as decimal number (01-12). |
| %M | Minute as decimal number (00-59). |
| %p | AM/PM indicator in the locale. Used in conjunction with %I and not with %H. For input the match is case-insensitive. |
| %S | Second as integer (00-61) |
| %u | Weekday as a decimal number (1-7, Monday is 1). |
| %y | Year without century (00-99). |
| %Y | Year with century. |
Look at current time and date output.
## [1] "2026-04-27 13:00:49 EDT"
## [1] "POSIXct" "POSIXt"
## [1] "2026-04-27"
## [1] "Date"
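Output like the above can be produced with base R's Sys.time() and Sys.Date(). The sketch below is an assumption about which calls were used, and the values will differ depending on when and where you run it.

```r
# Current date-time, stored as POSIXct
now <- Sys.time()
now
class(now)

# Current date only, stored as Date
today <- Sys.Date()
today
class(today)
```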
For date-only columns, you convert to a Date type. A few
different versions of defining dates are below, based on the different
formats of the input date. This requires matching the format exactly.
So, if the day, month, and year are separated by - or /, the format has
to use that same symbol. If the output returns NA instead of a Date,
something was wrong either in how you specified the format, or the
column you're trying to format may have more than 1 format represented.
Example formatting for dates
# date with slashes and full year
date_chr1 <- "3/12/2026"
date1 <- as.Date(date_chr1, format = "%m/%d/%Y")
str(date1)
# date with dashes and 2-digit year
date_chr2 <- "3-12-26"
date2 <- as.Date(date_chr2, format = "%m-%d-%y")
str(date2)
# date written out (%B = full month name)
date_chr3 <- "March 12, 2026"
date3 <- as.Date(date_chr3, format = "%B %d, %Y")
str(date3)
## Date[1:1], format: "2026-03-12"
Extract information about dates
## [1] 71
## [1] "Thursday"
## [1] "Thu"
## [1] "March 12, 2026"
## [1] "Mar 12, 2026"
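Information like this can be pulled from a Date with format() and the codes in the table above. The calls below are a sketch of one way to do it, not necessarily the code that produced the output shown. Weekday and month names depend on your locale.

```r
date1 <- as.Date("3/12/2026", format = "%m/%d/%Y")

# Day of year; format() returns a character, so convert to numeric
doy <- as.numeric(format(date1, "%j"))
doy

format(date1, "%A")         # full weekday name
format(date1, "%a")         # abbreviated weekday name
format(date1, "%B %d, %Y")  # month written in full
format(date1, "%b %d, %Y")  # abbreviated month
```

Note that format() always returns a character vector, so the result is for display or labeling rather than further date math.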
Do math with dates
## [1] "2026-03-13"
## [1] "2026-03-19"
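Because a Date is stored as a number of days, adding and subtracting works directly. The sketch below shows the kind of arithmetic that gives output like the above. The end_of_year comparison is an extra illustration, not from the original.

```r
date1 <- as.Date("2026-03-12")

next_day <- date1 + 1   # one day later
next_week <- date1 + 7  # one week later

# Subtracting two dates returns a difftime; convert to get a plain number
end_of_year <- as.Date("2026-12-31")
days_left <- as.numeric(difftime(end_of_year, date1, units = "days"))
```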
Create a vector of evenly spaced dates.
This can be helpful for setting up axis labels where one axis is dates.
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")
# by 15 days
seq.Date(date_list[1], date_list[2], by = "15 days")
## [1] "2026-01-01" "2026-01-16" "2026-01-31" "2026-02-15" "2026-03-02"
## [6] "2026-03-17" "2026-04-01" "2026-04-16" "2026-05-01" "2026-05-16"
## [11] "2026-05-31" "2026-06-15" "2026-06-30" "2026-07-15" "2026-07-30"
## [16] "2026-08-14" "2026-08-29" "2026-09-13" "2026-09-28" "2026-10-13"
## [21] "2026-10-28" "2026-11-12" "2026-11-27" "2026-12-12" "2026-12-27"
## [1] "2026-01-01" "2026-02-01" "2026-03-01" "2026-04-01" "2026-05-01"
## [6] "2026-06-01" "2026-07-01" "2026-08-01" "2026-09-01" "2026-10-01"
## [11] "2026-11-01" "2026-12-01"
## [1] "2026-01-01" "2026-07-01"
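Monthly and 6-month sequences like those in the output above can be made by changing the by argument. This is a sketch assuming the same date_list as before. The month_labels step is an added illustration of how such a sequence might feed axis labels.

```r
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")

# First day of every month
by_month <- seq.Date(date_list[1], date_list[2], by = "1 month")

# Every 6 months
by_6mon <- seq.Date(date_list[1], date_list[2], by = "6 months")

# Abbreviated month names, e.g. for axis labels (locale dependent)
month_labels <- format(by_month, "%b")
```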
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")
seq.Date(date_list[1], date_list[2], by = "1 week")
## [1] "2026-01-01" "2026-01-08" "2026-01-15" "2026-01-22" "2026-01-29"
## [6] "2026-02-05" "2026-02-12" "2026-02-19" "2026-02-26" "2026-03-05"
## [11] "2026-03-12" "2026-03-19" "2026-03-26" "2026-04-02" "2026-04-09"
## [16] "2026-04-16" "2026-04-23" "2026-04-30" "2026-05-07" "2026-05-14"
## [21] "2026-05-21" "2026-05-28" "2026-06-04" "2026-06-11" "2026-06-18"
## [26] "2026-06-25" "2026-07-02" "2026-07-09" "2026-07-16" "2026-07-23"
## [31] "2026-07-30" "2026-08-06" "2026-08-13" "2026-08-20" "2026-08-27"
## [36] "2026-09-03" "2026-09-10" "2026-09-17" "2026-09-24" "2026-10-01"
## [41] "2026-10-08" "2026-10-15" "2026-10-22" "2026-10-29" "2026-11-05"
## [46] "2026-11-12" "2026-11-19" "2026-11-26" "2026-12-03" "2026-12-10"
## [51] "2026-12-17" "2026-12-24" "2026-12-31"
If your dataset is huge, working with the lighter weight POSIXct may be best. Outside of that, whatever you choose may not matter too much in your workflow. We will use the lighter weight POSIXct version for our examples.
Look under the hood of the info stored by the two POSIX types
## [1] 1773293400
## attr(,"tzone")
## [1] "America/New_York"
## $sec
## [1] 0
##
## $min
## [1] 30
##
## $hour
## [1] 1
##
## $mday
## [1] 12
##
## $mon
## [1] 2
##
## $year
## [1] 126
##
## $wday
## [1] 4
##
## $yday
## [1] 70
##
## $isdst
## [1] 1
##
## $zone
## [1] "EDT"
##
## $gmtoff
## [1] NA
##
## attr(,"tzone")
## [1] "America/New_York"
## attr(,"balanced")
## [1] TRUE
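The two structures can be inspected with unclass(). The sketch below assumes a date-time of 2026-03-12 01:30 in the America/New_York timezone, which is consistent with the values printed above. The exact input string used originally is not shown, so treat dt_chr as an assumption.

```r
dt_chr <- "2026-03-12 01:30:00" # assumed input, Eastern time

# POSIXct: a single number (seconds since 1970-01-01 UTC) plus a timezone
dt_ct <- as.POSIXct(dt_chr, format = "%Y-%m-%d %H:%M:%S",
                    tz = "America/New_York")
unclass(dt_ct)

# POSIXlt: a list of components (sec, min, hour, mday, mon, year, ...)
dt_lt <- as.POSIXlt(dt_chr, format = "%Y-%m-%d %H:%M:%S",
                    tz = "America/New_York")
unclass(dt_lt)

# Components are easy to pull out, but note the offsets:
dt_lt$hour        # hour of day
dt_lt$mon + 1     # months are counted from 0
dt_lt$year + 1900 # years are counted from 1900
```

The single-number storage is why POSIXct is the lighter option for large datasets, while POSIXlt is convenient when you need individual components.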
Note the use of timezone in the code above. Here I specified the eastern
timezone. There are two handy ways to check timezones in R.
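Both checks are available in base R through Sys.timezone() and OlsonNames(). The sketch below is an assumption about the calls behind the output that follows. The grep() line is an added convenience for searching the long list of names.

```r
# Timezone your computer is set to (can be NA on some systems)
Sys.timezone()

# All timezone names R recognizes
tz_names <- OlsonNames()
length(tz_names)
head(tz_names)

# Check or search for a specific timezone name
has_ny <- "America/New_York" %in% tz_names
grep("^America/New", tz_names, value = TRUE)
```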
Check the timezone of your computer
## [1] "America/New_York"
Check the timezones built into base R
## [1] "Africa/Abidjan" "Africa/Accra"
## [3] "Africa/Addis_Ababa" "Africa/Algiers"
## [5] "Africa/Asmara" "Africa/Asmera"
## [7] "Africa/Bamako" "Africa/Bangui"
## [9] "Africa/Banjul" "Africa/Bissau"
## [11] "Africa/Blantyre" "Africa/Brazzaville"
## [13] "Africa/Bujumbura" "Africa/Cairo"
## [15] "Africa/Casablanca" "Africa/Ceuta"
## [17] "Africa/Conakry" "Africa/Dakar"
## [19] "Africa/Dar_es_Salaam" "Africa/Djibouti"
## [21] "Africa/Douala" "Africa/El_Aaiun"
## [23] "Africa/Freetown" "Africa/Gaborone"
## [25] "Africa/Harare" "Africa/Johannesburg"
## [27] "Africa/Juba" "Africa/Kampala"
## [29] "Africa/Khartoum" "Africa/Kigali"
## [31] "Africa/Kinshasa" "Africa/Lagos"
## [33] "Africa/Libreville" "Africa/Lome"
## [35] "Africa/Luanda" "Africa/Lubumbashi"
## [37] "Africa/Lusaka" "Africa/Malabo"
## [39] "Africa/Maputo" "Africa/Maseru"
## [41] "Africa/Mbabane" "Africa/Mogadishu"
## [43] "Africa/Monrovia" "Africa/Nairobi"
## [45] "Africa/Ndjamena" "Africa/Niamey"
## [47] "Africa/Nouakchott" "Africa/Ouagadougou"
## [49] "Africa/Porto-Novo" "Africa/Sao_Tome"
## [51] "Africa/Timbuktu" "Africa/Tripoli"
## [53] "Africa/Tunis" "Africa/Windhoek"
## [55] "America/Adak" "America/Anchorage"
## [57] "America/Anguilla" "America/Antigua"
## [59] "America/Araguaina" "America/Argentina/Buenos_Aires"
## [61] "America/Argentina/Catamarca" "America/Argentina/ComodRivadavia"
## [63] "America/Argentina/Cordoba" "America/Argentina/Jujuy"
## [65] "America/Argentina/La_Rioja" "America/Argentina/Mendoza"
## [67] "America/Argentina/Rio_Gallegos" "America/Argentina/Salta"
## [69] "America/Argentina/San_Juan" "America/Argentina/San_Luis"
## [71] "America/Argentina/Tucuman" "America/Argentina/Ushuaia"
## [73] "America/Aruba" "America/Asuncion"
## [75] "America/Atikokan" "America/Atka"
## [77] "America/Bahia" "America/Bahia_Banderas"
## [79] "America/Barbados" "America/Belem"
## [81] "America/Belize" "America/Blanc-Sablon"
## [83] "America/Boa_Vista" "America/Bogota"
## [85] "America/Boise" "America/Buenos_Aires"
## [87] "America/Cambridge_Bay" "America/Campo_Grande"
## [89] "America/Cancun" "America/Caracas"
## [91] "America/Catamarca" "America/Cayenne"
## [93] "America/Cayman" "America/Chicago"
## [95] "America/Chihuahua" "America/Ciudad_Juarez"
## [97] "America/Coral_Harbour" "America/Cordoba"
## [99] "America/Costa_Rica" "America/Coyhaique"
## [101] "America/Creston" "America/Cuiaba"
## [103] "America/Curacao" "America/Danmarkshavn"
## [105] "America/Dawson" "America/Dawson_Creek"
## [107] "America/Denver" "America/Detroit"
## [109] "America/Dominica" "America/Edmonton"
## [111] "America/Eirunepe" "America/El_Salvador"
## [113] "America/Ensenada" "America/Fort_Nelson"
## [115] "America/Fort_Wayne" "America/Fortaleza"
## [117] "America/Glace_Bay" "America/Godthab"
## [119] "America/Goose_Bay" "America/Grand_Turk"
## [121] "America/Grenada" "America/Guadeloupe"
## [123] "America/Guatemala" "America/Guayaquil"
## [125] "America/Guyana" "America/Halifax"
## [127] "America/Havana" "America/Hermosillo"
## [129] "America/Indiana/Indianapolis" "America/Indiana/Knox"
## [131] "America/Indiana/Marengo" "America/Indiana/Petersburg"
## [133] "America/Indiana/Tell_City" "America/Indiana/Vevay"
## [135] "America/Indiana/Vincennes" "America/Indiana/Winamac"
## [137] "America/Indianapolis" "America/Inuvik"
## [139] "America/Iqaluit" "America/Jamaica"
## [141] "America/Jujuy" "America/Juneau"
## [143] "America/Kentucky/Louisville" "America/Kentucky/Monticello"
## [145] "America/Knox_IN" "America/Kralendijk"
## [147] "America/La_Paz" "America/Lima"
## [149] "America/Los_Angeles" "America/Louisville"
## [151] "America/Lower_Princes" "America/Maceio"
## [153] "America/Managua" "America/Manaus"
## [155] "America/Marigot" "America/Martinique"
## [157] "America/Matamoros" "America/Mazatlan"
## [159] "America/Mendoza" "America/Menominee"
## [161] "America/Merida" "America/Metlakatla"
## [163] "America/Mexico_City" "America/Miquelon"
## [165] "America/Moncton" "America/Monterrey"
## [167] "America/Montevideo" "America/Montreal"
## [169] "America/Montserrat" "America/Nassau"
## [171] "America/New_York" "America/Nipigon"
## [173] "America/Nome" "America/Noronha"
## [175] "America/North_Dakota/Beulah" "America/North_Dakota/Center"
## [177] "America/North_Dakota/New_Salem" "America/Nuuk"
## [179] "America/Ojinaga" "America/Panama"
## [181] "America/Pangnirtung" "America/Paramaribo"
## [183] "America/Phoenix" "America/Port-au-Prince"
## [185] "America/Port_of_Spain" "America/Porto_Acre"
## [187] "America/Porto_Velho" "America/Puerto_Rico"
## [189] "America/Punta_Arenas" "America/Rainy_River"
## [191] "America/Rankin_Inlet" "America/Recife"
## [193] "America/Regina" "America/Resolute"
## [195] "America/Rio_Branco" "America/Rosario"
## [197] "America/Santa_Isabel" "America/Santarem"
## [199] "America/Santiago" "America/Santo_Domingo"
## [201] "America/Sao_Paulo" "America/Scoresbysund"
## [203] "America/Shiprock" "America/Sitka"
## [205] "America/St_Barthelemy" "America/St_Johns"
## [207] "America/St_Kitts" "America/St_Lucia"
## [209] "America/St_Thomas" "America/St_Vincent"
## [211] "America/Swift_Current" "America/Tegucigalpa"
## [213] "America/Thule" "America/Thunder_Bay"
## [215] "America/Tijuana" "America/Toronto"
## [217] "America/Tortola" "America/Vancouver"
## [219] "America/Virgin" "America/Whitehorse"
## [221] "America/Winnipeg" "America/Yakutat"
## [223] "America/Yellowknife" "Antarctica/Casey"
## [225] "Antarctica/Davis" "Antarctica/DumontDUrville"
## [227] "Antarctica/Macquarie" "Antarctica/Mawson"
## [229] "Antarctica/McMurdo" "Antarctica/Palmer"
## [231] "Antarctica/Rothera" "Antarctica/South_Pole"
## [233] "Antarctica/Syowa" "Antarctica/Troll"
## [235] "Antarctica/Vostok" "Arctic/Longyearbyen"
## [237] "Asia/Aden" "Asia/Almaty"
## [239] "Asia/Amman" "Asia/Anadyr"
## [241] "Asia/Aqtau" "Asia/Aqtobe"
## [243] "Asia/Ashgabat" "Asia/Ashkhabad"
## [245] "Asia/Atyrau" "Asia/Baghdad"
## [247] "Asia/Bahrain" "Asia/Baku"
## [249] "Asia/Bangkok" "Asia/Barnaul"
## [251] "Asia/Beirut" "Asia/Bishkek"
## [253] "Asia/Brunei" "Asia/Calcutta"
## [255] "Asia/Chita" "Asia/Choibalsan"
## [257] "Asia/Chongqing" "Asia/Chungking"
## [259] "Asia/Colombo" "Asia/Dacca"
## [261] "Asia/Damascus" "Asia/Dhaka"
## [263] "Asia/Dili" "Asia/Dubai"
## [265] "Asia/Dushanbe" "Asia/Famagusta"
## [267] "Asia/Gaza" "Asia/Harbin"
## [269] "Asia/Hebron" "Asia/Ho_Chi_Minh"
## [271] "Asia/Hong_Kong" "Asia/Hovd"
## [273] "Asia/Irkutsk" "Asia/Istanbul"
## [275] "Asia/Jakarta" "Asia/Jayapura"
## [277] "Asia/Jerusalem" "Asia/Kabul"
## [279] "Asia/Kamchatka" "Asia/Karachi"
## [281] "Asia/Kashgar" "Asia/Kathmandu"
## [283] "Asia/Katmandu" "Asia/Khandyga"
## [285] "Asia/Kolkata" "Asia/Krasnoyarsk"
## [287] "Asia/Kuala_Lumpur" "Asia/Kuching"
## [289] "Asia/Kuwait" "Asia/Macao"
## [291] "Asia/Macau" "Asia/Magadan"
## [293] "Asia/Makassar" "Asia/Manila"
## [295] "Asia/Muscat" "Asia/Nicosia"
## [297] "Asia/Novokuznetsk" "Asia/Novosibirsk"
## [299] "Asia/Omsk" "Asia/Oral"
## [301] "Asia/Phnom_Penh" "Asia/Pontianak"
## [303] "Asia/Pyongyang" "Asia/Qatar"
## [305] "Asia/Qostanay" "Asia/Qyzylorda"
## [307] "Asia/Rangoon" "Asia/Riyadh"
## [309] "Asia/Saigon" "Asia/Sakhalin"
## [311] "Asia/Samarkand" "Asia/Seoul"
## [313] "Asia/Shanghai" "Asia/Singapore"
## [315] "Asia/Srednekolymsk" "Asia/Taipei"
## [317] "Asia/Tashkent" "Asia/Tbilisi"
## [319] "Asia/Tehran" "Asia/Tel_Aviv"
## [321] "Asia/Thimbu" "Asia/Thimphu"
## [323] "Asia/Tokyo" "Asia/Tomsk"
## [325] "Asia/Ujung_Pandang" "Asia/Ulaanbaatar"
## [327] "Asia/Ulan_Bator" "Asia/Urumqi"
## [329] "Asia/Ust-Nera" "Asia/Vientiane"
## [331] "Asia/Vladivostok" "Asia/Yakutsk"
## [333] "Asia/Yangon" "Asia/Yekaterinburg"
## [335] "Asia/Yerevan" "Atlantic/Azores"
## [337] "Atlantic/Bermuda" "Atlantic/Canary"
## [339] "Atlantic/Cape_Verde" "Atlantic/Faeroe"
## [341] "Atlantic/Faroe" "Atlantic/Jan_Mayen"
## [343] "Atlantic/Madeira" "Atlantic/Reykjavik"
## [345] "Atlantic/South_Georgia" "Atlantic/St_Helena"
## [347] "Atlantic/Stanley" "Australia/ACT"
## [349] "Australia/Adelaide" "Australia/Brisbane"
## [351] "Australia/Broken_Hill" "Australia/Canberra"
## [353] "Australia/Currie" "Australia/Darwin"
## [355] "Australia/Eucla" "Australia/Hobart"
## [357] "Australia/LHI" "Australia/Lindeman"
## [359] "Australia/Lord_Howe" "Australia/Melbourne"
## [361] "Australia/North" "Australia/NSW"
## [363] "Australia/Perth" "Australia/Queensland"
## [365] "Australia/South" "Australia/Sydney"
## [367] "Australia/Tasmania" "Australia/Victoria"
## [369] "Australia/West" "Australia/Yancowinna"
## [371] "Brazil/Acre" "Brazil/DeNoronha"
## [373] "Brazil/East" "Brazil/West"
## [375] "Canada/Atlantic" "Canada/Central"
## [377] "Canada/Eastern" "Canada/Mountain"
## [379] "Canada/Newfoundland" "Canada/Pacific"
## [381] "Canada/Saskatchewan" "Canada/Yukon"
## [383] "CET" "Chile/Continental"
## [385] "Chile/EasterIsland" "CST6CDT"
## [387] "Cuba" "EET"
## [389] "Egypt" "Eire"
## [391] "EST" "EST5EDT"
## [393] "Etc/GMT" "Etc/GMT-0"
## [395] "Etc/GMT-1" "Etc/GMT-10"
## [397] "Etc/GMT-11" "Etc/GMT-12"
## [399] "Etc/GMT-13" "Etc/GMT-14"
## [401] "Etc/GMT-2" "Etc/GMT-3"
## [403] "Etc/GMT-4" "Etc/GMT-5"
## [405] "Etc/GMT-6" "Etc/GMT-7"
## [407] "Etc/GMT-8" "Etc/GMT-9"
## [409] "Etc/GMT+0" "Etc/GMT+1"
## [411] "Etc/GMT+10" "Etc/GMT+11"
## [413] "Etc/GMT+12" "Etc/GMT+2"
## [415] "Etc/GMT+3" "Etc/GMT+4"
## [417] "Etc/GMT+5" "Etc/GMT+6"
## [419] "Etc/GMT+7" "Etc/GMT+8"
## [421] "Etc/GMT+9" "Etc/GMT0"
## [423] "Etc/Greenwich" "Etc/UCT"
## [425] "Etc/Universal" "Etc/UTC"
## [427] "Etc/Zulu" "Europe/Amsterdam"
## [429] "Europe/Andorra" "Europe/Astrakhan"
## [431] "Europe/Athens" "Europe/Belfast"
## [433] "Europe/Belgrade" "Europe/Berlin"
## [435] "Europe/Bratislava" "Europe/Brussels"
## [437] "Europe/Bucharest" "Europe/Budapest"
## [439] "Europe/Busingen" "Europe/Chisinau"
## [441] "Europe/Copenhagen" "Europe/Dublin"
## [443] "Europe/Gibraltar" "Europe/Guernsey"
## [445] "Europe/Helsinki" "Europe/Isle_of_Man"
## [447] "Europe/Istanbul" "Europe/Jersey"
## [449] "Europe/Kaliningrad" "Europe/Kiev"
## [451] "Europe/Kirov" "Europe/Kyiv"
## [453] "Europe/Lisbon" "Europe/Ljubljana"
## [455] "Europe/London" "Europe/Luxembourg"
## [457] "Europe/Madrid" "Europe/Malta"
## [459] "Europe/Mariehamn" "Europe/Minsk"
## [461] "Europe/Monaco" "Europe/Moscow"
## [463] "Europe/Nicosia" "Europe/Oslo"
## [465] "Europe/Paris" "Europe/Podgorica"
## [467] "Europe/Prague" "Europe/Riga"
## [469] "Europe/Rome" "Europe/Samara"
## [471] "Europe/San_Marino" "Europe/Sarajevo"
## [473] "Europe/Saratov" "Europe/Simferopol"
## [475] "Europe/Skopje" "Europe/Sofia"
## [477] "Europe/Stockholm" "Europe/Tallinn"
## [479] "Europe/Tirane" "Europe/Tiraspol"
## [481] "Europe/Ulyanovsk" "Europe/Uzhgorod"
## [483] "Europe/Vaduz" "Europe/Vatican"
## [485] "Europe/Vienna" "Europe/Vilnius"
## [487] "Europe/Volgograd" "Europe/Warsaw"
## [489] "Europe/Zagreb" "Europe/Zaporozhye"
## [491] "Europe/Zurich" "GB"
## [493] "GB-Eire" "GMT"
## [495] "GMT-0" "GMT+0"
## [497] "GMT0" "Greenwich"
## [499] "Hongkong" "HST"
## [501] "Iceland" "Indian/Antananarivo"
## [503] "Indian/Chagos" "Indian/Christmas"
## [505] "Indian/Cocos" "Indian/Comoro"
## [507] "Indian/Kerguelen" "Indian/Mahe"
## [509] "Indian/Maldives" "Indian/Mauritius"
## [511] "Indian/Mayotte" "Indian/Reunion"
## [513] "Iran" "Israel"
## [515] "Jamaica" "Japan"
## [517] "Kwajalein" "Libya"
## [519] "MET" "Mexico/BajaNorte"
## [521] "Mexico/BajaSur" "Mexico/General"
## [523] "MST" "MST7MDT"
## [525] "Navajo" "NZ"
## [527] "NZ-CHAT" "Pacific/Apia"
## [529] "Pacific/Auckland" "Pacific/Bougainville"
## [531] "Pacific/Chatham" "Pacific/Chuuk"
## [533] "Pacific/Easter" "Pacific/Efate"
## [535] "Pacific/Enderbury" "Pacific/Fakaofo"
## [537] "Pacific/Fiji" "Pacific/Funafuti"
## [539] "Pacific/Galapagos" "Pacific/Gambier"
## [541] "Pacific/Guadalcanal" "Pacific/Guam"
## [543] "Pacific/Honolulu" "Pacific/Johnston"
## [545] "Pacific/Kanton" "Pacific/Kiritimati"
## [547] "Pacific/Kosrae" "Pacific/Kwajalein"
## [549] "Pacific/Majuro" "Pacific/Marquesas"
## [551] "Pacific/Midway" "Pacific/Nauru"
## [553] "Pacific/Niue" "Pacific/Norfolk"
## [555] "Pacific/Noumea" "Pacific/Pago_Pago"
## [557] "Pacific/Palau" "Pacific/Pitcairn"
## [559] "Pacific/Pohnpei" "Pacific/Ponape"
## [561] "Pacific/Port_Moresby" "Pacific/Rarotonga"
## [563] "Pacific/Saipan" "Pacific/Samoa"
## [565] "Pacific/Tahiti" "Pacific/Tarawa"
## [567] "Pacific/Tongatapu" "Pacific/Truk"
## [569] "Pacific/Wake" "Pacific/Wallis"
## [571] "Pacific/Yap" "Poland"
## [573] "Portugal" "PRC"
## [575] "PST8PDT" "ROC"
## [577] "ROK" "Singapore"
## [579] "Turkey" "UCT"
## [581] "Universal" "US/Alaska"
## [583] "US/Aleutian" "US/Arizona"
## [585] "US/Central" "US/East-Indiana"
## [587] "US/Eastern" "US/Hawaii"
## [589] "US/Indiana-Starke" "US/Michigan"
## [591] "US/Mountain" "US/Pacific"
## [593] "US/Samoa" "UTC"
## [595] "W-SU" "WET"
## [597] "Zulu"
## attr(,"Version")
## [1] "2025b"
If you understand how to set up a Date type in R, setting up date-times isn't that different. It just takes a bit more attention to get the format right. To demonstrate, we'll read in HOBO temperature data and set the timestamp column as a POSIXct date-time. HOBO data usually needs a bit of cleaning beyond setting the timestamp as a POSIXct date-time. I'll show the whole process below.
Read in temperature data and look at it
## Plot.Title.HOBO_temp_example.csv X
## 1 # Date Time, GMT-05:00
## 2 1 7/18/2021 10:26
## 3 2 7/18/2021 11:26
## 4 3 7/18/2021 12:26
## 5 4 7/18/2021 13:26
## 6 5 7/18/2021 14:26
## X.1
## 1 Temp, °F (LGR S/N: 20672839, SEN S/N: 20672839)
## 2 58.842
## 3 58.712
## 4 58.109
## 5 56.208
## 6 56.208
## X.2 X.3
## 1 Coupler Detached (LGR S/N: 20672839) Coupler Attached (LGR S/N: 20672839)
## 2 Logged
## 3
## 4
## 5
## 6
## X.4 X.5
## 1 Stopped (LGR S/N: 20672839) End Of File (LGR S/N: 20672839)
## 2
## 3
## 4
## 5
## 6
Note the extra row on top showing the file name. HOBO data often has some metadata in the first row. The next code chunk imports a cleaner version of the data by skipping the first row, only pulling in the first 3 columns (we don't care about the columns that report Logged), and cleaning up the column names.
Clean up non-date HOBO data
temp_data <- read.csv("./data/HOBO_temp_example.csv", skip = 1)[,1:3]
colnames(temp_data) <- c("index", "date_time", "tempF")
| index | date_time | tempF |
|---|---|---|
| 1 | 7/18/2021 10:26 | 58.842 |
| 2 | 7/18/2021 11:26 | 58.712 |
| 3 | 7/18/2021 12:26 | 58.109 |
| 4 | 7/18/2021 13:26 | 56.208 |
| 5 | 7/18/2021 14:26 | 56.208 |
| 6 | 7/18/2021 15:26 | 55.342 |
| 7 | 7/18/2021 16:26 | 55.602 |
| 8 | 7/18/2021 17:26 | 55.949 |
| 9 | 7/18/2021 18:26 | 55.602 |
| 10 | 7/18/2021 19:26 | 55.733 |
| 11 | 7/18/2021 20:26 | 55.819 |
| 12 | 7/18/2021 21:26 | 55.776 |
| 13 | 7/18/2021 22:26 | 56.469 |
| 14 | 7/18/2021 23:26 | 56.642 |
| 15 | 7/19/2021 0:26 | 56.556 |
| 16 | 7/19/2021 1:26 | 55.863 |
| 17 | 7/19/2021 2:26 | 55.819 |
| 18 | 7/19/2021 3:26 | 55.733 |
| 19 | 7/19/2021 4:26 | 55.733 |
| 20 | 7/19/2021 5:26 | 55.733 |
| 21 | 7/19/2021 6:26 | 55.949 |
| 22 | 7/19/2021 7:26 | 55.776 |
| 23 | 7/19/2021 8:26 | 56.035 |
| 24 | 7/19/2021 9:26 | 56.079 |
| 25 | 7/19/2021 10:26 | 56.901 |
| 26 | 7/19/2021 11:26 | 63.090 |
| 27 | 7/19/2021 12:26 | 63.732 |
| 28 | 7/19/2021 13:26 | 57.420 |
| 29 | 7/19/2021 14:26 | 56.685 |
| 30 | 7/19/2021 15:26 | 56.383 |
| 31 | 7/19/2021 16:26 | 56.469 |
| 32 | 7/19/2021 17:26 | 56.512 |
| 33 | 7/19/2021 18:26 | 56.815 |
| 34 | 7/19/2021 19:26 | 56.122 |
| 35 | 7/19/2021 20:26 | 57.074 |
| 36 | 7/19/2021 21:26 | 56.469 |
| 37 | 7/19/2021 22:26 | 56.122 |
| 38 | 7/19/2021 23:26 | 56.772 |
| 39 | 7/20/2021 0:26 | 57.979 |
| 40 | 7/20/2021 1:26 | 57.807 |
| 41 | 7/20/2021 2:26 | 56.469 |
| 42 | 7/20/2021 3:26 | 56.728 |
| 43 | 7/20/2021 4:26 | 56.295 |
| 44 | 7/20/2021 5:26 | 56.035 |
| 45 | 7/20/2021 6:26 | 56.079 |
| 46 | 7/20/2021 7:26 | 56.079 |
| 47 | 7/20/2021 8:26 | 56.165 |
| 48 | 7/20/2021 9:26 | 56.469 |
| 49 | 7/20/2021 10:26 | 57.031 |
| 50 | 7/20/2021 11:26 | 57.979 |
Convert date_time to POSIXct
We can see that the date is formatted as M/D/YYYY, then there's a space, then the time is formatted as H:MM, with hours following the 0-23 pattern and minutes 00-59. There are no seconds.
temp_data$timestamp <- as.POSIXct(temp_data$date_time,
format = "%m/%d/%Y %H:%M",
tz = "America/New_York")
head(temp_data)
## index date_time tempF timestamp
## 1 1 7/18/2021 10:26 58.842 2021-07-18 10:26:00
## 2 2 7/18/2021 11:26 58.712 2021-07-18 11:26:00
## 3 3 7/18/2021 12:26 58.109 2021-07-18 12:26:00
## 4 4 7/18/2021 13:26 56.208 2021-07-18 13:26:00
## 5 5 7/18/2021 14:26 56.208 2021-07-18 14:26:00
## 6 6 7/18/2021 15:26 55.342 2021-07-18 15:26:00
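One thing worth double-checking is the time zone. The HOBO header above lists the timestamps as GMT-05:00, which is a fixed offset, while America/New_York switches to daylight saving time (GMT-04:00) in July. The sketch below uses a single made-up timestamp string to compare the two, and also shows that a format string that doesn't match the text returns NA instead of an error. Whether a fixed offset or a named zone is right depends on how the logger was launched, so treat this as a check to run on your own data, not a correction.

```r
# One timestamp in the same layout as the HOBO file (illustrative value)
x <- "7/18/2021 10:26"
fmt <- "%m/%d/%Y %H:%M"

# Named zone: observes daylight saving time, so July is UTC-4
ny <- as.POSIXct(x, format = fmt, tz = "America/New_York")

# Fixed offset: "Etc/GMT+5" is UTC-5 all year (the sign is inverted in Etc/ names)
fixed <- as.POSIXct(x, format = fmt, tz = "Etc/GMT+5")

format(ny, "%z")     # "-0400"
format(fixed, "%z")  # "-0500"

# The same clock time is a different instant in the two zones (1 hour apart)
difftime(fixed, ny, units = "hours")

# A format that doesn't match the text silently returns NA
bad <- as.POSIXct(x, format = "%Y-%m-%d %H:%M", tz = "America/New_York")
is.na(bad)
```

After converting a real column, sum(is.na(temp_data$timestamp)) is a quick way to count rows that failed to parse.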
Extract the YYYYMMDD date, month, Julian day, time, and hour of the timestamp.
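temp_data$date <- format(temp_data$timestamp, "%Y%m%d") # date as YYYYMMDD
temp_data$month <- format(temp_data$timestamp, "%b") # abbreviated month name
temp_data$time <- format(temp_data$timestamp, "%I:%M") # %I is the 12-hour clock (01-12); use %H for 00-23
temp_data$hour <- as.numeric(format(temp_data$timestamp, "%I")) # 12-hour clock, so 13:26 returns 1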
head(temp_data)
## index date_time tempF timestamp date month time hour
## 1 1 7/18/2021 10:26 58.842 2021-07-18 10:26:00 20210718 Jul 10:26 10
## 2 2 7/18/2021 11:26 58.712 2021-07-18 11:26:00 20210718 Jul 11:26 11
## 3 3 7/18/2021 12:26 58.109 2021-07-18 12:26:00 20210718 Jul 12:26 12
## 4 4 7/18/2021 13:26 56.208 2021-07-18 13:26:00 20210718 Jul 01:26 1
## 5 5 7/18/2021 14:26 56.208 2021-07-18 14:26:00 20210718 Jul 02:26 2
## 6 6 7/18/2021 15:26 55.342 2021-07-18 15:26:00 20210718 Jul 03:26 3
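Notice in the output above that 13:26 became 01:26 and hour 1. That's because %I is the 12-hour clock, and without AM/PM the morning and afternoon readings look the same. Here's a small sketch, using one made-up timestamp, of the options for the hour. Which one you want depends on what you're doing with it; for sorting or plotting by hour, the 24-hour version is usually safer.

```r
# A single afternoon timestamp (illustrative value)
ts <- as.POSIXct("7/18/2021 13:26", format = "%m/%d/%Y %H:%M",
                 tz = "America/New_York")

format(ts, "%I:%M")     # 12-hour clock, AM/PM is lost
format(ts, "%I:%M %p")  # 12-hour clock with the AM/PM indicator
format(ts, "%H:%M")     # 24-hour clock

# Numeric hour on the 0-23 scale
hour24 <- as.numeric(format(ts, "%H"))
hour24
```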
## index date_time tempF timestamp date month time hour
## 1 1 7/18/2021 10:26 58.842 2021-07-18 10:26:00 20210718 Jul 10:26 10
## 2 2 7/18/2021 11:26 58.712 2021-07-18 11:26:00 20210718 Jul 11:26 11
## 3 3 7/18/2021 12:26 58.109 2021-07-18 12:26:00 20210718 Jul 12:26 12
## 4 4 7/18/2021 13:26 56.208 2021-07-18 13:26:00 20210718 Jul 01:26 1
## 5 5 7/18/2021 14:26 56.208 2021-07-18 14:26:00 20210718 Jul 02:26 2
## 6 6 7/18/2021 15:26 55.342 2021-07-18 15:26:00 20210718 Jul 03:26 3
## month_num
## 1 7
## 2 7
## 3 7
## 4 7
## 5 7
## 6 7
## index date_time tempF timestamp date month time hour
## 1 1 7/18/2021 10:26 58.842 2021-07-18 10:26:00 20210718 Jul 10:26 10
## 2 2 7/18/2021 11:26 58.712 2021-07-18 11:26:00 20210718 Jul 11:26 11
## 3 3 7/18/2021 12:26 58.109 2021-07-18 12:26:00 20210718 Jul 12:26 12
## 4 4 7/18/2021 13:26 56.208 2021-07-18 13:26:00 20210718 Jul 01:26 1
## 5 5 7/18/2021 14:26 56.208 2021-07-18 14:26:00 20210718 Jul 02:26 2
## 6 6 7/18/2021 15:26 55.342 2021-07-18 15:26:00 20210718 Jul 03:26 3
## month_num julian
## 1 7 199
## 2 7 199
## 3 7 199
## 4 7 199
## 5 7 199
## 6 7 199
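The code that created the month_num and julian columns isn't shown above. One way to get them, assuming the same format() approach as the other columns, is sketched below on a single made-up timestamp. Here "Julian day" means day of the year (1-366), which is what %j returns.

```r
# A single timestamp in the HOBO layout (illustrative value)
ts <- as.POSIXct("7/18/2021 10:26", format = "%m/%d/%Y %H:%M",
                 tz = "America/New_York")

# Month as a number (1-12)
month_num <- as.numeric(format(ts, "%m"))

# Day of the year (1-366)
julian <- as.numeric(format(ts, "%j"))

month_num  # 7
julian     # 199
```

On the full dataset, the same calls would take temp_data$timestamp in place of ts.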
Goals for Day 3:
Artwork by @allison_horst
Feedback: Please leave feedback in the training feedback
form. You can submit feedback multiple times and don't need to answer
every question. Responses are anonymous.
Consider three example data visualizations that demonstrate how some approaches are more effective than others in conveying patterns.
Most people can understand this figure of daily Covid cases faster than they can understand the table of daily Covid cases.
| state | timestamp | cases | total_population |
|---|---|---|---|
| AK | 2022-01-25T04:00:00Z | 203110 | 731545 |
| AL | 2022-01-25T04:00:00Z | 1153149 | 4903185 |
| AR | 2022-01-25T04:00:00Z | 738652 | 3017804 |
| AZ | 2022-01-25T04:00:00Z | 1767303 | 7278717 |
| CA | 2022-01-25T04:00:00Z | 7862003 | 39512223 |
| CO | 2022-01-25T04:00:00Z | 1207991 | 5758736 |
| CT | 2022-01-25T04:00:00Z | 683731 | 3565287 |
This table shows average monthly revenue for Acme products.
| category | product | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| party supplies | balloons | 892 | 1557 | 1320 | 972 | 1309 | 1174 | 1153 | 1138 | 1275 | 1178 | 1325 | 1422 |
| party supplies | confetti | 1271 | 1311 | 829 | 1020 | 1233 | 1061 | 1088 | 1395 | 1376 | 1152 | 1568 | 1412 |
| party supplies | party hats | 1338 | 1497 | 1445 | 956 | 1372 | 1482 | 1048 | 877 | 1404 | 1030 | 1458 | 1547 |
| party supplies | wrapping paper | 1396 | 1026 | 932 | 891 | 1364 | 896 | 900 | 1221 | 1146 | 967 | 1394 | 1507 |
| school supplies | backpacks | 1802 | 1773 | 1611 | 1723 | 1799 | 1730 | 1813 | 1676 | 1748 | 1652 | 1819 | 1759 |
| school supplies | notebooks | 1153 | 1471 | 1541 | 1371 | 1592 | 1514 | 1725 | 1702 | 1457 | 1604 | 1729 | 1279 |
| school supplies | pencils | 1679 | 1304 | 1054 | 1259 | 1425 | 1608 | 1972 | 1811 | 1610 | 1004 | 1417 | 1283 |
| school supplies | staplers | 1074 | 1708 | 1439 | 1154 | 1551 | 1099 | 1793 | 1601 | 1647 | 1666 | 1389 | 1511 |
Use the table above to answer these questions:
Now let's display the same table as a heat map, with larger numbers represented by darker color cells. How quickly can we answer those same two questions? What patterns can we see in the heat map that were not obvious in the table above?
In 1973, Francis Anscombe published "Graphs in statistical analysis", a paper describing four bivariate datasets with identical means, variances, and correlations.
| x1 | y1 | x2 | y2 | x3 | y3 | x4 | y4 |
|---|---|---|---|---|---|---|---|
| 10 | 8.04 | 10 | 9.14 | 10 | 7.46 | 8 | 6.58 |
| 8 | 6.95 | 8 | 8.14 | 8 | 6.77 | 8 | 5.76 |
| 13 | 7.58 | 13 | 8.74 | 13 | 12.74 | 8 | 7.71 |
| 9 | 8.81 | 9 | 8.77 | 9 | 7.11 | 8 | 8.84 |
| 11 | 8.33 | 11 | 9.26 | 11 | 7.81 | 8 | 8.47 |
| 14 | 9.96 | 14 | 8.10 | 14 | 8.84 | 8 | 7.04 |
| 6 | 7.24 | 6 | 6.13 | 6 | 6.08 | 8 | 5.25 |
| 4 | 4.26 | 4 | 3.10 | 4 | 5.39 | 19 | 12.50 |
| 12 | 10.84 | 12 | 9.13 | 12 | 8.15 | 8 | 5.56 |
| 7 | 4.82 | 7 | 7.26 | 7 | 6.42 | 8 | 7.91 |
| 5 | 5.68 | 5 | 4.74 | 5 | 5.73 | 8 | 6.89 |
| | x1 | y1 | x2 | y2 | x3 | y3 | x4 | y4 |
|---|---|---|---|---|---|---|---|---|
| mean | 9 | 7.50 | 9 | 7.50 | 9 | 7.50 | 9 | 7.50 |
| var | 11 | 4.13 | 11 | 4.13 | 11 | 4.12 | 11 | 4.12 |
Anscombe data as plots: Despite their identical statistics, when we plot the data we see the four datasets are actually very different. Anscombe's point was that to understand the data, we must plot the data.
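If you want to check this yourself, the Anscombe data ship with base R as the anscombe data frame. The sketch below recomputes the summary statistics and draws quick base R scatterplots. It isn't the code used to make the figures in this training, just a minimal way to reproduce the idea.

```r
# anscombe ships with base R (datasets package): columns x1-x4 and y1-y4
data(anscombe)

# Means and variances of every column, rounded like the summary table
round(sapply(anscombe, mean), 2)
round(sapply(anscombe, var), 2)

# Correlation of each x/y pair (all four are about 0.816)
cors <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
round(cors, 3)

# 2 x 2 grid of scatterplots, one per dataset
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i), pch = 19)
}
par(op)  # restore the previous plotting layout
```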
ggplot2
The ggplot2 package is the most popular R
package for plotting. It takes a little effort to learn how the pieces
of a ggplot object fit together. However, once you get the hang of it,
you can create and customize a large variety of attractive plots with
just a few lines of R code. The package is called
ggplot2 because there was originally a ggplot package.
The developer, Hadley Wickham, didn't want to break the original package
while improving it, so he created ggplot2.
The ggplot2 online book
and cheatsheets
can be very helpful while you are learning to use the
ggplot2 package.
The ggplot2 package was developed using the grammar of
graphics as the underlying philosophy, which basically breaks a plot up
into individual building blocks related to aesthetics (e.g., color,
size, shape), geometries (e.g. points, lines, boxes), and themes (e.g.
axis label font size, legend placement, etc.).
ggplot2:
aes() argument.
ggplot() call is the data. This also means you can pipe data
into a ggplot object.
aes() argument either at the ggplot() level,
or within the specific geom.
aes(). If you
specify aes(color = park), then scale is where you can
specify a custom color for each park instead of ggplot's default color
scheme. The scale is where you can set the labels of groups in a legend
(if different than how the data are labeled). You can also customize
axis ranges, breaks, and labels with different scales.
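To make the building blocks concrete, here's a minimal sketch with each piece labeled. The data frame and the park codes are made up for illustration, and the colors are arbitrary hex codes I picked.

```r
library(ggplot2)

# Made-up data: two parks sampled in three years (illustrative only)
dat <- data.frame(
  year  = rep(2021:2023, times = 2),
  park  = rep(c("ACAD", "MABI"), each = 3),
  cover = c(40, 45, 50, 30, 28, 35)
)

p_demo <- ggplot(data = dat,                                # data
                 aes(x = year, y = cover, color = park)) +  # aesthetics mapped to columns
  geom_line() +                                             # geometry
  geom_point(size = 3) +                                    # a second geometry layer
  scale_color_manual(values = c("ACAD" = "#1b9e77", "MABI" = "#d95f02"),
                     name = "Park") +                       # scale: custom colors, legend title
  scale_x_continuous(breaks = 2021:2023) +                  # scale: axis breaks
  theme_bw()                                                # theme

p_demo
```

Because data is the first argument, the same plot can start with dat |> ggplot(aes(x = year, y = cover, color = park)).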
We're going to use site-level photoplot % cover data from Ship Harbor to create the plot below, and will work through the code one piece at a time.
Mean percent cover by community type (panels) in Ship Harbor. Error bars represent +/- 1 SE.
Import photoplot cover data from Ship Harbor
# load package
library(ggplot2)
# import data
pcov <- read.csv("./data/SHIHAR_photoplot_cover.csv")
# check out the data
head(pcov)
Create the ggplot template of average cover over time.
I'm going to assign this to an object named p, so we can
build it one layer at a time. We're going to have unique colors and
shapes for each level of CoverCode in our data, so we need to indicate
that in the aes() along with our x and y variables. If we
don't include color, fill, and shape in the aes(), the
points would all be the same color (default = black) and shape (default
is filled circle).
p <- ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode,
fill = CoverCode,
shape = CoverCode))
p
Add point and errorbar geometry
Geometries are drawn in the order they're added, so later layers sit on top of earlier ones. I prefer the look of the points drawn over the error bars, so geom_point() comes second.
p2 <- p +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover)) +
geom_point()
p2
Specify colors and shapes
The default colors and symbols in ggplot aren't great. We're going to
start by specifying our own colors and shapes manually. Then we'll use
color palettes from different packages. Specifying shapes in R requires
knowing the shape's symbol code. To view that, run ?points,
or search "pch in R plot" and you'll get the info below. Note that 0-14
are just lines with no fill. To change their color, use the color
aesthetic. Symbols 15-20 are solid, and their color is also set with the color
aesthetic. Symbols 21-25 have both a color (outline) and fill (inside)
aesthetic.
Figure of symbol codes in R.
p3 <- p2 +
scale_fill_manual(values = c("ASCNOD" = "#bcb02f", "BARSPP" = "#CAC7B6",
"NONCOR" = "#420816", "FUCSPP" = "#646519",
"MUSSPP" = "#170461", "REDGRP" = "#9e224d")) +
scale_color_manual(values = c("ASCNOD" = "#bcb02f", "BARSPP" = "#CAC7B6",
"NONCOR" = "#420816", "FUCSPP" = "#646519",
"MUSSPP" = "#170461", "REDGRP" = "#9e224d")) +
scale_shape_manual(values = c("ASCNOD" = 23, "BARSPP" = 24, "NONCOR" = 23,
"FUCSPP" = 25, "MUSSPP" = 23, "REDGRP" = 25))
p3
Assign colors and shapes to variables
Adding the name argument to a scale renames the legend title. But to keep the shapes and colors in the same legend, you have to give all of the scales the same name.
cols <- c("ASCNOD" = "#bcb02f", "BARSPP" = "#CAC7B6",
"NONCOR" = "#420816", "FUCSPP" = "#646519",
"MUSSPP" = "#170461", "REDGRP" = "#9e224d")
shps <- c("ASCNOD" = 23, "BARSPP" = 24, "NONCOR" = 23,
"FUCSPP" = 25, "MUSSPP" = 23, "REDGRP" = 25)
p3 <- p2 +
geom_point(color = 'dimgrey', size = 2.5) + # setting point outline to dark grey
scale_fill_manual(values = cols, name = "Species Group") +
scale_color_manual(values = cols, name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group")
p3
Improve axis ticks and labels
# Determine year range (so not hard coded/easily updated in future)
xrange <- range(pcov$Year)
p4 <- p3 +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover")
p4
Improve theme components
p5 <- p4 +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1), # angle x text
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),# makes background white
legend.key = element_blank()) # removes square fill around symbols in legend
p5
Facet on CommunityType
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
Save plot to disk
Copying and pasting figures from R into Word documents or PowerPoint
usually results in a poor resolution figure. The better approach is to
save figures to disk using ggsave(). By default, the function saves the
most recent ggplot that was displayed in the Plots tab in the bottom
right pane. You can also specify the ggplot object to save with the plot
argument, which comes after the file name. If you want a jpg or png
instead of svg, just use that as the file extension. An svg is a
vectorized image, which for figures with lines is usually the best
quality because it won't become pixelated when zoomed. The only caveat
is that not all software supports svgs.
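Here's a minimal ggsave() sketch. The plot is a stand-in built from the built-in mtcars data, and the file goes to a temporary folder so it runs on any machine; the file name, size, and dpi are values I picked for illustration. I used png here because saving an svg with ggsave() also requires the svglite package to be installed.

```r
library(ggplot2)

# Small stand-in plot built from the built-in mtcars data
p_save <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Write to a temporary folder so the sketch runs anywhere; swap in your own path
out_file <- file.path(tempdir(), "example_plot.png")

# filename comes first; plot names the object to save (defaults to the last plot)
ggsave(filename = out_file, plot = p_save,
       width = 6, height = 4, units = "in", dpi = 300)

file.exists(out_file)
```

Setting width and height explicitly is worth the extra typing, because otherwise the saved size follows whatever size your Plots pane happens to be.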
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 1.5) + # changed this line
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 1.2) + # changed this line
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = "Year", y = "Avg Percent Cover") + # changed this line
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
Add line geometry to figure
Note that for lines to plot properly in ggplot, you have to assign a
grouping variable in the aes(). The fill, color, and shape
aesthetics in the beginning already did that for us, but here's how it
would look within geom_line() if you needed to do that.
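A minimal sketch of setting the group inside geom_line() is below. The data frame is made up (two cover codes over three years), so the numbers mean nothing; the point is only where the group aesthetic goes.

```r
library(ggplot2)

# Made-up cover values for two cover codes over three years (illustrative only)
line_dat <- data.frame(
  Year      = rep(2019:2021, times = 2),
  CoverCode = rep(c("ASCNOD", "FUCSPP"), each = 3),
  avg_cover = c(60, 55, 58, 20, 25, 22)
)

# group tells ggplot which rows belong to the same line
p_line <- ggplot(line_dat, aes(x = Year, y = avg_cover)) +
  geom_line(aes(group = CoverCode)) +
  geom_point()

p_line
```

Without the group, ggplot would connect all six points as a single zig-zagging line, because nothing else in the aes() splits the data.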
Add smoother to figure
geom_smooth() plots a smoothed line using the y ~ x formula (unless you specify a different formula). By default the method is a LOESS smoother (for datasets with fewer than 1,000 observations), but you can specify a range of methods, including linear regression by adding method = 'lm' to geom_smooth().
Note that I turned off the standard error ribbon that plots by default using se = FALSE. It's too busy for this plot. I also don't use the SE unless I've fit an actual model and checked the diagnostics. The statistics under the hood of geom_smooth() are also pretty black boxy, and I don't always know if I can trust its calculation of SE.
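Here's a small sketch of both smoothers, using the built-in mtcars data as a stand-in for the cover data. The last line fits the same linear model with lm(), which is the model I'd actually check diagnostics on before reporting a slope or SE.

```r
library(ggplot2)

# LOESS smoother without the SE ribbon
p_loess <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

# Linear regression line without the SE ribbon
p_lm <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

p_loess
p_lm

# The straight line from method = "lm" is an ordinary lm() fit of y on x
coef(lm(mpg ~ wt, data = mtcars))
```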
Add horizontal dashed line called 50% line
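A minimal sketch of the simpler case is below, on made-up percent cover data. Because yintercept and linetype are set as constants outside of aes(), the line is drawn but doesn't get a legend entry.

```r
library(ggplot2)

# Made-up percent cover values (illustrative only)
hline_dat <- data.frame(
  Year      = 2015:2020,
  avg_cover = c(35, 42, 55, 61, 48, 52)
)

p_hline <- ggplot(hline_dat, aes(x = Year, y = avg_cover)) +
  geom_point() +
  geom_hline(yintercept = 50, linetype = "dashed")  # constants, so no legend entry

p_hline
```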
Add horizontal dashed line called 50% line and make it show in the legend
p6 + geom_hline(aes(yintercept = 50, linetype = "50% line")) +
scale_linetype_manual(values = c("50% line" = "dashed"),
name = "Threshold")
Move legend to the bottom
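A minimal sketch of moving the legend, using the built-in mtcars data as a stand-in: legend.position in theme() accepts "bottom", "top", "left", "right", or "none".

```r
library(ggplot2)

p_leg <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(color = "Cylinders") +
  theme(legend.position = "bottom")  # place the legend below the panel

p_leg
```

The same theme() line can be added to any existing ggplot object, e.g. a faceted plot, without rebuilding it.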
Change facet strips fill, color, and font size
p6 + theme(strip.background = element_rect(fill = "#F5F0DC", color = "black"),
strip.text = element_text(size = 10))
Stacked bars instead of points
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_bar(stat = 'identity', position = 'fill', color = 'dimgrey') +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Median. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
Bars with error bars instead of points (note I filtered on Barnacle CommunityType)
ggplot(data = pcov |> filter(CommunityType == "Barnacle"), # note filter
aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_bar(stat = 'identity', position = 'dodge', color = 'dimgrey') + # new line
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover), linewidth = 0.6) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CoverCode) # Different facet
Faceting is helpful when your observations are all within the same
column. But say you have data in multiple columns and want to arrange
those plots into a grid. Faceting won't help because the data to plot
are in different columns. There are multiple packages that make it easy
to arrange multiple plots into a grid to look similar to faceted plots.
Packages include grid (and gridExtra),
cowplot, ggpubr, and patchwork.
We're going to use patchwork, a relative newcomer, and one
of the easiest I've found to code and customize. Here we're going to
plot pH, temperature, DO, and conductance for Jordan Pond in Acadia NP
and arrange them using patchwork.
The patchwork package has a lot of options to customize plot layouts. See the patchwork package website for more information.
Load patchwork and read in water chemistry data
# load packages
library(ggplot2)
library(patchwork) # multipanel plots
# load data
chem <- read.csv("./data/ACAD_Jordan_Pond_water_chem.csv")
# make date field a date
chem$date <- as.Date(chem$date, format = "%Y-%m-%d")
Create a ggplot object for each parameter
# pH plot
p_pH <-
ggplot(chem, aes(x = date, y = pH)) +
theme_bw() +
geom_smooth(se = F, span = 0.5) +
geom_point(color = "dimgrey", alpha = 0.5, size = 2) +
labs(y = "pH", x = "Year") +
scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
# temp plot
p_temp <-
ggplot(chem, aes(x = date, y = Temp_F)) +
theme_bw() +
geom_smooth(se = F, span = 0.5) +
geom_point(color = "dimgrey", alpha = 0.5, size = 2) +
labs(y = "Temp (F)", x = "Year") +
scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
# Diss. Oxygen plot
p_do <-
ggplot(chem, aes(x = date, y = DO_mgL)) +
theme_bw() +
geom_smooth(se = F, span = 0.5) +
geom_point(color = "dimgrey", alpha = 0.5, size = 2) +
labs(y = "DO (mg/L)", x = "Year") +
scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
# Conductance plot
p_cond <-
ggplot(chem, aes(x = date, y = SpCond_uScm)) +
theme_bw() +
geom_smooth(se = F, span = 0.5) +
geom_point(color = "dimgrey", alpha = 0.5, size = 2) +
labs(y = "Spec. Cond. (uS/cm)", x = "Year") +
scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
Arrange plots using patchwork
This is almost too easy to be true, but it really is this easy with patchwork. The patchwork package includes a bunch of options to customize sizes, add annotations, and share axes across plots.
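Here's a minimal sketch of the patchwork operators. The four panels are stand-ins built from the built-in mtcars data, not the water chemistry plots; with the objects created above, the equivalent would be combining p_pH, p_temp, p_do, and p_cond the same way.

```r
library(ggplot2)
library(patchwork)

# Four stand-in plots from the built-in mtcars data
pa <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
pb <- ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point()
pc <- ggplot(mtcars, aes(x = disp, y = mpg)) + geom_point()
pd <- ggplot(mtcars, aes(x = qsec, y = mpg)) + geom_point()

grid_2x2 <- pa + pb + pc + pd  # + adds plots and fills a grid
stacked  <- pa / pb            # / stacks plots vertically
beside   <- pa | pb            # | places plots side by side

grid_2x2
```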
Arrange plots using patchwork in column of 4 and share x axis.
You can also collect the legend using a similar approach to collecting the axes.
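A sketch of a single column with shared axes and a shared legend is below, again using mtcars stand-ins. The make_panel() helper is something I wrote for this example, not a patchwork function. Note that axes = "collect" in plot_layout() assumes a recent patchwork (I believe version 1.2.0 or later), while guides = "collect" has been around longer.

```r
library(ggplot2)
library(patchwork)

# Hypothetical helper: one panel per y variable, all sharing the same x and legend
make_panel <- function(yvar) {
  ggplot(mtcars, aes(x = wt, y = .data[[yvar]], color = factor(cyl))) +
    geom_point() +
    labs(x = "Weight", y = yvar, color = "Cylinders")
}

panels <- lapply(c("mpg", "hp", "disp", "qsec"), make_panel)

# One column of four; drop duplicated x axes and merge identical legends
col_of_4 <- wrap_plots(panels, ncol = 1) +
  plot_layout(axes = "collect", guides = "collect")

col_of_4
```

Axes and legends are only merged when they're identical across panels, so the x-axis label, breaks, and legend title need to match.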
The palettes in RColorBrewer can be viewed by running the code below.
The first group shows the sequential palettes (e.g. YlOrRd - Yellow
Orange Red). The second group shows the qualitative colors. The last
group shows the diverging palettes. The main drawback of these palettes
is they are limited by the number of levels in your data. So, if you
specify Set2 to color code different levels of a factor,
there are only 8 colors available to you. If your factor has more than 8
levels (e.g., 9 sites, 10 parks, etc.), then the levels beyond 8 won't
get plotted and you'll get a warning in the console similar to what we
saw for ggplot's default number of symbols.
View RColorBrewer palettes
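A minimal sketch for viewing the palettes and checking how many colors each one offers:

```r
library(RColorBrewer)

# Plot every palette, grouped as sequential, qualitative, and diverging
display.brewer.all()

# Only the palettes flagged as colorblind friendly
display.brewer.all(colorblindFriendly = TRUE)

# Hex codes for all 8 colors in Set2
set2 <- brewer.pal(n = 8, name = "Set2")
set2

# Maximum number of colors, category, and colorblind flag for Set2
brewer.pal.info["Set2", ]
```

Printing the full brewer.pal.info table is a quick way to find a palette with enough colors for your factor levels.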
Going back to the photoplot cover plots we made before, we'll use RColorBrewer to color code each CoverCode instead of doing this manually. The next chunk builds the base plot, and the chunks after it swap in different color palettes.
Create basic plot
p_pal <- ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = "Year", y = "Avg Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
Use Set2 palette for species groups
p_pal + scale_color_brewer(name = "Species Group", palette = "Set2",
aesthetics = c("fill", "color"))
Note how I used the aesthetics argument in
scale_color_brewer() to set fill and color at the same
time. We could have done this in the code above too.
Use Dark2 palette on temperature plot
p_pal + scale_color_brewer(name = "Species Group", palette = "Dark2",
aesthetics = c("fill", "color"))
The viridis package comes with 8 palettes. The benefit of viridis is that the number of levels is not capped the way RColorBrewer palettes are (8 to 12 colors, depending on the palette). The palette options are below for 12 levels.
View viridis palettes with hexcodes
You can view the hexcodes of the different palettes by running the
code below. Just change viridis() to one of the other
palette names to get the hexcodes for those levels.
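A minimal sketch of what that code could look like, assuming the viridis package is installed. The functions viridis() and turbo() are palette functions from the package; the object names vir_hex and turbo_hex are made up for illustration.

```r
library(viridis)

# Hex codes for 12 levels of the default viridis palette
vir_hex <- viridis(12)
vir_hex

# Swap in another palette function, such as turbo(), to get its hex codes
turbo_hex <- turbo(12)
turbo_hex
```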
Use viridis default palette on CoverCode
The scale_color_viridis_d() selects the viridis palette
option (purple, green, yellow) for discrete values (i.e. categories).
For a continuous scale (e.g. temperature), you would specify
scale_color_viridis_c().
p_pal + scale_color_viridis_d(name = "Species Group", aesthetics = c("fill", "color")) # default viridis
Use turbo palette on CoverCode
Setting option = 'turbo' in scale_color_viridis_d() swaps the default
viridis palette for the turbo palette, still for discrete values (i.e. categories).
For a continuous scale (e.g. temperature), you would set the same option in
scale_color_viridis_c().
p_pal + scale_color_viridis_d(name = "Species Group", aesthetics = c("fill", "color"), option = 'turbo')
Heatmaps via geom_tile() are a place where viridis
palettes are especially helpful for producing useful sequential or diverging
color palettes. We'll use the temperature data to plot heatmaps by month
for each site. Heatmaps are a bit different than other plots we've seen,
as the x and y values create a discrete grid, and the color in the cell
represents the value for that level of x and y. That means we have to
change how the x, y and color aesthetics are specified. Here we will
plot temperature by month and year faceted on site.
Basic heatmap code
Note the use of base R's month.abb to set the labels on
the x-axis. The month.abb is a vector of the 12 months
abbreviated as 3 letters. By setting 5:10, I'm taking the months May -
Oct.
p_heat <-
ggplot(chem, aes(x = mon, y = year, color = Temp_F, fill = Temp_F)) +
theme_bw() +
geom_tile() +
labs(y = "Year", x = "Month") +
scale_x_continuous(breaks = c(5, 6, 7, 8, 9, 10),
limits = c(4, 11),
labels = month.abb[5:10]) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
Plot heatmap with viridis continuous palette
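A sketch of the default viridis heatmap, under the assumption that a small made-up data frame (chem_toy) can stand in for the chem temperature data. The column names mon, year and Temp_F follow the heatmap code above; the temperature values and the object names p_heat_toy and p_heat_vir are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for the chem data: one temperature per month and year
chem_toy <- expand.grid(mon = 5:10, year = 2020:2023)
chem_toy$Temp_F <- 70 - 3 * abs(chem_toy$mon - 8)  # made-up values peaking in August

# Basic heatmap, following the structure of p_heat
p_heat_toy <- ggplot(chem_toy, aes(x = mon, y = year, color = Temp_F, fill = Temp_F)) +
  theme_bw() +
  geom_tile() +
  labs(y = "Year", x = "Month")

# Default viridis palette for a continuous variable, applied to fill and color
p_heat_vir <- p_heat_toy +
  scale_color_viridis_c(name = "Temp. (F)", aesthetics = c("fill", "color"))
p_heat_vir
```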
Plot heatmap with plasma continuous palette, reverse scale
p_heat + scale_color_viridis_c(name = "Temp. (F)", aesthetics = c("fill", "color"),
option = "plasma", direction = -1)
You can also create your own color ramp via
scale_color_gradient(), which creates a 2-color gradient,
scale_color_gradient2(), which creates a diverging color
gradient (low-mid-high), and scale_color_gradientn(),
which creates an n-color gradient.
Create 2-color gradient
p_heat + scale_color_gradient(low = "#FCFC9A", high = "#F54927",
aesthetics = c("fill", 'color'),
name = "Temp. (F)")
Create diverging gradient
For the divergent palette to be meaningful, you usually need to set the midpoint if it's not 0.
p_heat + scale_color_gradient2(low = "navy", mid = "#FCFC9A", high = "#F54927",
aesthetics = c("fill", 'color'),
midpoint = mean(chem$Temp_F),
name = "Temp. (F)")
Create diverging gradient with multiple colors
Note the change in the legend by using guide = 'legend'.
Default is guide = 'colorbar'. I also customized the breaks
into 5-degree bins using the breaks argument and
seq().
p_heat + scale_color_gradientn(colors = c("#805A91", "#406AC2", "#FBFFAD", "#FFA34A", "#AB1F1F"),
aesthetics = c("fill", 'color'),
guide = "legend",
breaks = c(seq(40, 85, 5)),
name = "Temp. (F)")
Load libraries and import data
library(tidyverse)
library(readxl)
ctd_mma <- read_xlsx("./data/PR_PF_2903444 (2).xlsx") |> data.frame()
Graph data with reversed Y axis and color coded by station
ggplot(ctd_mma, aes(x = `TEMP..degree_Celsius.`,
y = `PRES..decibar.`,
group = Station,
color = Station)) +
geom_line() +
theme_bw() +
labs(x = "Temp. (C)", y = "Pressure (dbars)") +
scale_color_gradientn(colors = c("#805A91", "#406AC2", "#FBFFAD", "#FFA34A", "#AB1F1F"),
aesthetics = c('color'),
guide = "legend", # makes legend distinct, rather than color band
breaks = 1:15, # number of stations
name = "Station ID") +
scale_y_reverse() + # flip y axis
scale_x_continuous(limits = c(0, 30),
breaks = seq(0, 30, 5),
position = 'top') + # plot x-axis on top
theme(legend.position = 'bottom')
R was designed to facilitate statistical analysis and data visualization, making tests like linear regression and Analysis of Variance (ANOVA) relatively straightforward. Using the motile invertebrate count data, I'll demonstrate how to set up the models, check diagnostics (at least how I do it), and summarize output. For the linear regression, we'll look at whether common periwinkle (SpeciesCode LITLIT) counts have changed over time in the Red Algae community plots (we'll ignore the lack of independence between years). For the ANOVA, we'll test whether common periwinkle counts differ between the community types.
Import data and load packages
library(dplyr)
library(ggplot2)
# install.packages('car') # uncomment and run if you don't have this package installed
library(car) # for levene's test
# import data
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
head(motinv)
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 0 2 No
## 2 Littorina littorea Common periwinkle LITLIT 0 3 No
## 3 Littorina obtusata Smooth periwinkle LITOBT 1 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 0 6 No
## 5 Nucella lapillus Dogwhelk NUCLAP 0 1 No
## 6 Littorina littorea Common periwinkle LITLIT 0 2 No
Prep data for analysis
# prep data for analysis
motinv_final <- motinv |>
mutate(Damage = as.numeric(replace(Damage, Damage == "PM", NA)), # Fix Damage PM
SitePlot = paste(SiteCode, PlotName, sep = "-"), # create new SitePlot column
Date = as.Date(StartDate, format = "%m/%d/%Y"), # create new Date column
No.Damage_fix = replace(No.Damage, No.Damage == 1960, 196),
total_count = Damage + No.Damage,
total_count_fix = Damage + No.Damage_fix, # fix error in No.Damage
year_st = Year - 2012) |> # set start year to 1 instead of 2013 for better interpretation
filter(QAQC == FALSE) |> # drop QAQC visits
arrange(SitePlot, Year, ScientificName) # optional sorting the data
# summarize counts, so 1 count per year, species and community type
motinv_sum <- motinv_final |> summarize(mean_count = mean(total_count, na.rm = TRUE),
mean_count_fix = mean(total_count_fix, na.rm = TRUE),
.by = c(SiteCode, year_st, CommunityType, SpeciesCode))
# prep for linear regression
motinv_reg <- motinv_sum |> filter(SpeciesCode == "LITLIT" & CommunityType == "Red Algae")
head(motinv_reg)
## SiteCode year_st CommunityType SpeciesCode mean_count mean_count_fix
## 1 BASHAR 1 Red Algae LITLIT 1.333333 1.333333
## 2 BASHAR 2 Red Algae LITLIT 10.000000 10.000000
## 3 BASHAR 3 Red Algae LITLIT 57.800000 57.800000
## 4 BASHAR 4 Red Algae LITLIT 23.000000 23.000000
## 5 BASHAR 5 Red Algae LITLIT 97.800000 97.800000
## 6 BASHAR 6 Red Algae LITLIT 106.200000 106.200000
# prep for analysis of variance
motinv_aov <- motinv_sum |> filter(SpeciesCode == "LITLIT") |>
mutate(ComCode = toupper(substr(CommunityType, 1, 3))) # create community code for easier plotting
head(motinv_aov)
## SiteCode year_st CommunityType SpeciesCode mean_count mean_count_fix ComCode
## 1 BASHAR 1 Ascophyllum LITLIT 14.0 14.0 ASC
## 2 BASHAR 2 Ascophyllum LITLIT 20.8 20.8 ASC
## 3 BASHAR 4 Ascophyllum LITLIT 71.4 71.4 ASC
## 4 BASHAR 5 Ascophyllum LITLIT 72.6 72.6 ASC
## 5 BASHAR 6 Ascophyllum LITLIT 65.0 65.0 ASC
## 6 BASHAR 7 Ascophyllum LITLIT 80.4 80.4 ASC
Model formula
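The model fit could look something like the sketch below. The object name lm_mod is taken from the outlier code that follows, and the formula mean_count ~ year_st is an assumption that mirrors the mean_count_fix formula shown in the summary output further down. To keep the example self-contained, it uses only the first six rows of motinv_reg from the head() output (stored in a hypothetical motinv_reg6), so its estimates will not match the full dataset.

```r
# First six rows of motinv_reg, copied from the head() output above
motinv_reg6 <- data.frame(
  year_st = 1:6,
  mean_count = c(1.333333, 10, 57.8, 23, 97.8, 106.2)
)

# Linear regression; the formula reads response ~ predictor
lm_mod <- lm(mean_count ~ year_st, data = motinv_reg6)
lm_mod
```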
Model diagnostic plots
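One common way to draw diagnostic plots for a linear model is base R's plot() method, sketched here under the same assumption as before: a six-row stand-in (motinv_reg6) for motinv_reg. The names lm_diag, old_par and diag_vals are made up, and this may differ from how the training draws its diagnostics.

```r
# Six-row stand-in for motinv_reg, copied from the head() output above
motinv_reg6 <- data.frame(
  year_st = 1:6,
  mean_count = c(1.333333, 10, 57.8, 23, 97.8, 106.2)
)
lm_diag <- lm(mean_count ~ year_st, data = motinv_reg6)

# plot() on an lm object draws the standard diagnostic plots;
# a 2 x 2 layout shows four of them on one page
old_par <- par(mfrow = c(2, 2))
plot(lm_diag)
par(old_par)  # restore the previous plotting layout

# The values behind the residual plots
diag_vals <- data.frame(fitted = fitted(lm_diag), resid = resid(lm_diag))
diag_vals
```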
Check outliers
# detect outliers as > 2 SD of residuals
outliers <- which(abs(resid(lm_mod)) > 2 * sd(resid(lm_mod)))
# Highlight the outliers in a scatterplot
plot(mean_count ~ year_st, data = motinv_reg)
points(motinv_reg$year_st[outliers], motinv_reg$mean_count[outliers], col = "red", pch = 19)
Residual and QQ plots clearly show an outlier. This is the error in No.Damage, which has 1960 instead of 196 for a count. I'll show the diagnostics for the fixed data now.
Rerun model with fixed counts
Model diagnostic plots, take 2
Summarize output
##
## Call:
## lm(formula = mean_count_fix ~ year_st, data = motinv_reg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -86.400 -24.845 8.066 21.626 58.397
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.813 27.538 0.429 0.67802
## year_st 12.466 3.773 3.304 0.00917 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 44.73 on 9 degrees of freedom
## Multiple R-squared: 0.5481, Adjusted R-squared: 0.4979
## F-statistic: 10.92 on 1 and 9 DF, p-value: 0.009172
Estimates are the betas, such that the Estimate for year_st is the slope. These results suggest that for every year, there's an average of 12 more common periwinkles found in Red Algae plots. Though note that in plotting the results of the linear regression (using the geom_smooth() with linear method), the trend does not look linear.
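To make the interpretation of the betas concrete, here is a small worked example that plugs the coefficients from the summary output into the regression equation. The years chosen and the object names b0, b1 and pred_count are illustrative only.

```r
# Coefficients copied from the summary output above
b0 <- 11.813  # intercept
b1 <- 12.466  # slope for year_st

# Predicted mean count = intercept + slope * year_st
year_st <- c(1, 5, 10)
pred_count <- b0 + b1 * year_st

# year_st was defined as Year - 2012, so add 2012 to recover the calendar year
data.frame(year = year_st + 2012, year_st, pred_count)
```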
Plot model results
ggplot(data = motinv_reg, aes(x = year_st, y = mean_count_fix)) +
geom_point() +
geom_smooth(method = 'lm') +
scale_x_continuous(breaks = seq(1, 13, 2),
labels = seq(1, 13, 2) + 2012) +
labs(x = "Year", y = "Mean common periwinkle count") +
theme_bw()
Model formula
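A sketch of the ANOVA fit, assuming the formula mean_count_fix ~ ComCode and the object name aov_mod that appear in the Tukey and Shapiro-Wilk output further down. The data here are simulated (aov_toy is hypothetical) with 4 groups of 11 values each, so the numbers will not match the training results.

```r
set.seed(42)  # simulated data, not the training dataset
aov_toy <- data.frame(
  ComCode = factor(rep(c("ASC", "BAR", "FUC", "RED"), each = 11)),
  mean_count_fix = c(rnorm(11, 60, 30), rnorm(11, 25, 10),
                     rnorm(11, 100, 50), rnorm(11, 95, 50))
)

# One-way ANOVA; the formula reads response ~ grouping factor
aov_mod <- aov(mean_count_fix ~ ComCode, data = aov_toy)
aov_mod
```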
Model diagnostic plots
We are going to say that we're okay with the model diagnostics.
Levene's test of equal variance
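Levene's test could be run with leveneTest() from the car package, as sketched below on simulated data. The data frame lev_toy and the object lev_out are hypothetical, so the result will differ from the output shown for the training data.

```r
library(car)

set.seed(42)  # simulated data, not the training dataset
lev_toy <- data.frame(
  ComCode = factor(rep(c("ASC", "BAR", "FUC", "RED"), each = 11)),
  mean_count_fix = c(rnorm(11, 60, 30), rnorm(11, 25, 10),
                     rnorm(11, 100, 50), rnorm(11, 95, 50))
)

# Test for equal variance among groups; centers on the median by default
lev_out <- leveneTest(mean_count_fix ~ ComCode, data = lev_toy)
lev_out
```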
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 3 7.7178 0.0003493 ***
## 40
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Significant p-value for Levene test indicates non-equal variance among groups (not surprisingly).
Shapiro-Wilk test of normality
A significant p-value rejects the null hypothesis of normality. The non-significant p-value below suggests normality isn't a problem.
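The test is applied to the standardized residuals of the model, as indicated by the "data: rstandard(aov_mod)" line in the output. A sketch on simulated data follows; shap_toy, shap_mod and shap_out are hypothetical names, and the result will differ from the training output.

```r
set.seed(42)  # simulated data, not the training dataset
shap_toy <- data.frame(
  ComCode = factor(rep(c("ASC", "BAR", "FUC", "RED"), each = 11)),
  mean_count_fix = c(rnorm(11, 60, 30), rnorm(11, 25, 10),
                     rnorm(11, 100, 50), rnorm(11, 95, 50))
)
shap_mod <- aov(mean_count_fix ~ ComCode, data = shap_toy)

# Shapiro-Wilk test on the standardized residuals of the fitted model
shap_out <- shapiro.test(rstandard(shap_mod))
shap_out
```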
##
## Shapiro-Wilk normality test
##
## data: rstandard(aov_mod)
## W = 0.96188, p-value = 0.153
Summarize output
## Df Sum Sq Mean Sq F value Pr(>F)
## ComCode 3 42785 14262 6.719 0.000888 ***
## Residuals 40 84903 2123
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Significant p-value indicates at least one community type has a different mean count of common periwinkle.
Tukey's pairwise comparisons
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = mean_count_fix ~ ComCode, data = motinv_aov)
##
## $ComCode
## diff lwr upr p adj
## BAR-ASC -36.681818 -89.33845 15.97481 0.2582119
## FUC-ASC 41.031818 -11.62481 93.68845 0.1742661
## RED-ASC 35.443939 -17.21269 88.10057 0.2864043
## FUC-BAR 77.713636 25.05700 130.37027 0.0016647
## RED-BAR 72.125758 19.46913 124.78239 0.0037846
## RED-FUC -5.587879 -58.24451 47.06875 0.9918492
Plot Tukey's pairwise comparisons
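Both the pairwise comparisons and their plot can be produced with base R's TukeyHSD(), sketched below on simulated data. The names tuk_toy, tuk_mod and tuk_out are hypothetical, and the values will not match the training output.

```r
set.seed(42)  # simulated data, not the training dataset
tuk_toy <- data.frame(
  ComCode = factor(rep(c("ASC", "BAR", "FUC", "RED"), each = 11)),
  mean_count_fix = c(rnorm(11, 60, 30), rnorm(11, 25, 10),
                     rnorm(11, 100, 50), rnorm(11, 95, 50))
)
tuk_mod <- aov(mean_count_fix ~ ComCode, data = tuk_toy)

# Pairwise differences with family-wise 95% confidence intervals
tuk_out <- TukeyHSD(tuk_mod)
tuk_out

# Intervals that do not cross 0 indicate pairs that differ significantly
plot(tuk_out, las = 1)
```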
Plot model results
# reorder community by elevation
motinv_aov$ComCode_fac <- factor(motinv_aov$ComCode, levels = c("BAR", "ASC", "FUC", "RED"))
ggplot(data = motinv_aov, aes(x = ComCode_fac, y = mean_count_fix)) +
stat_summary(geom = 'bar', fun.data = mean_se, fill = 'grey', color = 'dimgrey') +
stat_summary(geom = 'errorbar', fun.data = mean_se, color = 'dimgrey', width = 0.3) +
labs(x = "Community Type", y = "Mean common periwinkle count") +
geom_text(aes(x = 1, y = 30, label = "A"), size = 5) + # BAR differs from FUC and RED
geom_text(aes(x = 2, y = 70, label = "AB"), size = 5) + # ASC doesn't differ from any group
geom_text(aes(x = 3, y = 125, label = "B"), size = 5) +
geom_text(aes(x = 4, y = 118, label = "B"), size = 5) +
theme_bw()
Knowing how to code is only part of being a good coder. Below are general best practices that make code easier to run and understand, and more stable with a relatively low maintenance cost. Many of these suggestions come from lessons learned working with my own and other people's code. R for Data Science also has a lot of great information on coding best practices.
# libraries
library(dplyr) # for mutate and filter
# parameters
analysis_year <- 2023
# import data set
photo_dat <- read.csv("./data/SHIHAR_photoplot_cover.csv")
# Filtering on Barnacle community type and analysis year
photo_dat2 <- photo_dat |> filter(CommunityType == "Barnacle") |>
filter(Year == analysis_year)
Object names must start with a letter and can only contain letters, numbers, underscores, and periods. Spaces aren't allowed in object names, and are best avoided in column names of data frames too. Descriptive object names will help you digest code, and often you'll want more than one word in the name. There are multiple cases that people tend to use, the most common of which tends to be snake_case. Other examples are below.
Order the words in names so that objects that are similar or derived from each other sort together. This also makes coding easier, as like objects will sort together in the popups that you see as you code.
# good word order
ACAD_rocky <- data.frame(year = 2020:2025, plot = 1:6)
ACAD_rocky2 <- ACAD_rocky |> filter(year > 2020)
ACAD_rocky3 <- ACAD_rocky2 |> mutate(plot_type = "vital signs")
# bad word order
rocky_ACAD <- data.frame(year = 2020:2025, plot = 1:6)
ACAD_after_2020 <- rocky_ACAD |> filter(year > 2020)
vital_ACAD_2020 <- ACAD_after_2020 |> mutate(plot_type = "vital signs")
It's helpful to balance descriptive names with length. The longer the object name, the more typing you have to do to refer to that object. Coding long names, such as long column names in data frames, is cumbersome and inefficient. Compare the two objects below. While I doubt many would make super long object names like this, I commonly see excessively long column names in data packages. Limiting column names to 12 characters or fewer is super helpful for coders using those data.
# super long names
ACAD_rocky_intertidal_sampling_data <- data.frame(years_plots_were_sampled = c(2020:2025), wetland_plots_sampled = c(1:6))
ACAD_rocky_intertidal_sampling_data2 <- ACAD_rocky_intertidal_sampling_data |> filter(years_plots_were_sampled > 2020)
# shorter still meaningful
ACAD_rocky <- data.frame(year = 2020:2025, plot = 1:6)
ACAD_rocky2 <- ACAD_rocky |> filter(year > 2020)
Code style refers to consistent use of case, indenting, spacing, line width, etc. There are several style conventions out there. I tend to use the tidyverse style guide, which is based on Google's R style guide.
Style conventions I follow include putting spaces around operators such as <-, =, ==, |>, +, etc.
Example 1. Style for pipes
# Good code
trees_final <- trees |>
mutate(DecayClassCode_num = as.numeric(DecayClassCode),
Plot_Name = paste(ParkUnit, PlotCode, sep = "-"),
Date = as.Date(SampleDate, format = "%m/%d/%Y")) |>
rename("Species" = "ScientificName") |>
filter(IsQAQC == FALSE) |>
select(-DecayClassCode) |>
arrange(Plot_Name, TagCode)
# Same code, but much harder to follow
trees_final <- trees|>mutate(DecayClassCode_num=as.numeric(DecayClassCode), Plot_Name=paste(ParkUnit,PlotCode,sep = "-"), Date=as.Date(SampleDate,format="%m/%d/%Y"))|> rename("Species"="ScientificName")|>filter(IsQAQC==FALSE)|>select(-DecayClassCode)|>arrange(Plot_Name,TagCode)
Example 2. Style for ggplot object
# Good code
ggplot(data = visits, aes(x = Year, y = Annual_Visits/1000)) +
geom_line() +
geom_point(color = "black", fill = "#82C2a3", size = 2.5, shape = 24) +
labs(x = "Year",
y = "Annual visitors in 1000's") +
scale_y_continuous(limits = c(2000, 4500),
breaks = seq(2000, 4500, by = 500)) +
scale_x_continuous(limits = c(1994, 2024),
breaks = c(seq(1994, 2024, by = 5))) +
theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
title = element_text(size = 10)
)
# Same code but hard to follow
ggplot(data=visits,aes(x=Year,y=Annual_Visits/1000))+geom_line()+geom_point(color="black",fill="#82C2a3",size=2.5,shape=24) +
labs(x = "Year", y = "Annual visitors in 1000's")+
scale_y_continuous(limits=c(2000,4500),breaks=seq(2000,4500,by=500))+
scale_x_continuous(limits=c(1994,2024),breaks=c(seq(1994,2024,by=5)))+
theme(axis.text.x=element_text(size=10,angle=45,hjust=1), panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),panel.background=element_rect(fill='white',color='dimgrey'),
title = element_text(size = 10))
Using projects instead of standalone scripts helps keep the various pieces of an analysis project in one place and makes them more easily transferable across computers. Logical naming of scripts, so they sort easily, is also helpful.
Order and purpose of file names easy to follow
Hard to know script order and purpose
If you're starting a new R session to answer these questions, you'll need to read in the motile invertebrate data frame again.
Read in example Bass Harbor motile invertebrate data from url
CHALLENGE: How would you look at the first 4 even rows
(2, 4, 6, 8), and first 2 columns of the motinv data
frame?
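Two possible answers are sketched below using bracket indexing. A small made-up data frame (motinv_toy) stands in for motinv so the example runs on its own; only its column names are taken from the training data.

```r
# Small stand-in for motinv with a few of the same columns
motinv_toy <- data.frame(
  Network = rep("NETN", 10),
  UnitCode = rep("ACAD", 10),
  SiteCode = rep("BASHAR", 10),
  Year = rep(2013:2014, each = 5)
)

# Option 1: index rows and columns by position, as [rows, columns]
even_pos <- motinv_toy[c(2, 4, 6, 8), 1:2]
even_pos

# Option 2: build the row index with seq() and select columns by name
even_nm <- motinv_toy[seq(2, 8, by = 2), c("Network", "UnitCode")]
even_nm
```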
## Network UnitCode
## 2 NETN ACAD
## 4 NETN ACAD
## 6 NETN ACAD
## 8 NETN ACAD
## [1] "Network" "UnitCode" "SiteCode" "StartDate"
## [5] "Year" "QAQC" "PlotName" "CommunityType"
## [9] "ScientificName" "CommonName" "SpeciesCode" "Damage"
## [13] "No.Damage" "Subsampled"
## Network UnitCode
## 2 NETN ACAD
## 4 NETN ACAD
## 6 NETN ACAD
## 8 NETN ACAD
CHALLENGE: How many unique species are there in the
motinv data frame?
Option 1. Subset the data with brackets and use the
sort(unique()) to give an easier to read output.
# OPTION 1
gcrab <- motinv[motinv$ScientificName == "Carcinus maenas",]
sort(unique(gcrab$Year)) # 2019, 2021, 2022, 2023, 2024
## [1] 2019 2021 2022 2023 2024
Option 2. Subset data then use table() to tally the
years and number of rows green crabs were found.
##
## 2019 2021 2022 2023 2024
## 3 16 6 11 11
There are multiple ways to do this. Two examples are below.
Option 1. View the data and sort by No.Damage.
Option 2. Find the max No.Damage count and subset the data frame
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType Species
## NA <NA> <NA> <NA> <NA> NA NA <NA> <NA> <NA>
## CommonName SpeciesCode No.Damage Subsampled Damage_num Date Site_Plot
## NA <NA> <NA> NA <NA> NA <NA> <NA>
CHALLENGE: Fix the No.Damage typo by replacing 1960 with 196.
Let's say that you looked at the datasheet, and the actual count for No.Damage was 196 instead of 1960. You can change that value in the original CSV by hand. But even better is to document that change in code. There are multiple ways to do this. Two examples are below.
But first, it's good to create a new data frame when modifying the original data frame, so you can refer back to the original if needed. I also use a really specific filter to make sure I'm not accidentally changing other data.
Replace 1960 with 196
# create copy of motinv data
motinv_fix <- motinv
# find the problematic value, and change it to 196
motinv_fix$No.Damage[motinv_fix$Year == 2019 &
motinv_fix$PlotName == "R4" &
motinv_fix$No.Damage == 1960] <- 196
# check your work
range(motinv$No.Damage) # 0 1960
## [1] 0 1960
range(motinv_fix$No.Damage) # 0 282
## [1] 0 282
If you're starting a new session to answer these questions, you'll
need to load dplyr and read in the motile invertebrate data frame
again.
Load dplyr
Read in example motile invertebrate and point intercept data
#--- Point intercept data ---
pi_dat <- read.csv("./data/BASHAR_Point_Intercept_data.csv")
#--- Motile invert count ---
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
#--- Motile invert species table ---
motspp <- read.csv("./data/motile_invert_species_table.csv")
#--- hobo temp data ---
temp_data <- read.csv("./data/HOBO_temp_example.csv", skip = 1)[,1:3]
colnames(temp_data) <- c("index", "date_time", "tempF")
CHALLENGE: Using motinv, how many species are found
in PlotName = A1 in 2024?
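One possible dplyr approach is sketched here: filter to the plot and year, then count the distinct species codes. The data frame motinv_toy is a made-up stand-in that only borrows column names from motinv, so the count it returns says nothing about the real data.

```r
library(dplyr)

# Made-up stand-in for motinv with only the columns needed here
motinv_toy <- data.frame(
  PlotName = c("A1", "A1", "A1", "A1", "B2", "A1"),
  Year = c(2024, 2024, 2024, 2023, 2024, 2024),
  SpeciesCode = c("LITLIT", "LITOBT", "LITLIT", "NUCLAP", "CARMAE", "NUCLAP")
)

# Keep only rows for plot A1 in 2024
a1_2024 <- motinv_toy |> filter(PlotName == "A1", Year == 2024)

# Count the unique species codes in the filtered rows
n_spp <- n_distinct(a1_2024$SpeciesCode)
n_spp
```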
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/11/2019 2019 FALSE R4 Red Algae
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 11 1960 No
CHALLENGE: Fix the No.Damage typo by replacing 1960 with 196.
Let's say that you looked at the datasheet, and the actual count for No.Damage was 196 instead of 1960. You can change that value in the original CSV by hand. But even better is to document that change in code. There are multiple ways to do this. Two examples are below.
But first, it's good to create a new data frame when modifying the original data frame, so you can refer back to the original if needed. I also use a really specific filter to make sure I'm not accidentally changing other data.
Replace 1960 with 196
# dplyr approach
motinv_fix <- motinv |> mutate(No.Damage = replace(No.Damage, No.Damage == 1960, 196))
range(motinv$No.Damage)
## [1] 0 1960
range(motinv_fix$No.Damage)
## [1] 0 282
pred <- c("CARMAE", "NUCLAP")
# base R
motinv$trophic <- ifelse(motinv$SpeciesCode %in% pred, "predator", "herbivore")
table(motinv$trophic, motinv$SpeciesCode)
##
## CARMAE LITLIT LITOBT LITSAX NUCLAP TECTES
## herbivore 0 220 197 20 0 82
## predator 47 0 0 0 116 0
# tidyverse
motinv <- motinv |> mutate(trophic = ifelse(SpeciesCode %in% pred, "predator", "herbivore"))
table(motinv$trophic, motinv$SpeciesCode)
##
## CARMAE LITLIT LITOBT LITSAX NUCLAP TECTES
## herbivore 0 220 197 20 0 82
## predator 47 0 0 0 116 0
# Base R using a nested ifelse()
motinv$count_level <-
ifelse(motinv$No.Damage > 35, "High",
ifelse(motinv$No.Damage >= 10 & motinv$No.Damage <= 35, "Medium", "Low"))
table(motinv$count_level) # check that it worked
##
## High Low Medium
## 167 352 163
# Tidyverse using case_when() and between()
motinv <- motinv |> mutate(count_level = case_when(No.Damage > 35 ~ "High",
between(No.Damage, 10, 35) ~ "Medium",
No.Damage < 10 ~ "Low"))
table(motinv$count_level) # check that it worked
##
## High Low Medium
## 167 352 163
Note the between() function, which saves typing.
This function is inclusive on both ends, matching as >= and <=.
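The inclusive behavior can be checked with a short example, using the same 10 and 35 cutoffs as the code above. The vector counts and the object names in_range and manual are made up for illustration.

```r
library(dplyr)

# Values just outside, on, and inside the 10 to 35 range
counts <- c(9, 10, 20, 35, 36)

# between() includes both boundaries
in_range <- between(counts, 10, 35)

# Equivalent long-hand version with >= and <=
manual <- counts >= 10 & counts <= 35

identical(in_range, manual)
```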
CHALLENGE: Using the point intercept data (pi_dat),
calculate the average percent frequency of each non-vegetated substrate
by year. Note that non-vegetated substrates are CoverCode = c('BOLT',
'ROCK', 'WATER').
pi_nonveg <- pi_dat |> filter(CoverCode %in% c("BOLT", "ROCK", "WATER")) |> # filter nonveg grps
summarize(avg_freq = mean(pct_freq), # calc avg.
.by = c(SiteCode, Year, CoverCode, CoverType)) # grouping variables
head(pi_nonveg) # check output
## SiteCode Year CoverCode CoverType avg_freq
## 1 BASHAR 2018 ROCK Rock 11.314530
## 2 BASHAR 2018 WATER Water 2.824469
## 3 BASHAR 2018 BOLT Bolt 1.301859
## 4 BASHAR 2022 WATER Water 1.774598
## 5 BASHAR 2022 ROCK Rock 8.864728
## 6 BASHAR 2019 ROCK Rock 16.006219
CHALLENGE: Using the point intercept data (pi_dat),
calculate the average percent frequency of non-vegetated vs
vegetated cover types by year. Note that non-vegetated substrates are
CoverCode = c('BOLT', 'ROCK', 'WATER').
pi_subtype <- pi_dat |>
mutate(sub_type = ifelse(CoverCode %in% c("BOLT", "ROCK", "WATER"), "nonveg", "veg")) |> # label nonveg vs veg
summarize(avg_freq = mean(pct_freq), # calc avg.
.by = c(SiteCode, Year, sub_type)) |> # grouping variables
arrange(SiteCode, Year, sub_type) # sort variables
head(pi_subtype) # check output
## SiteCode Year sub_type avg_freq
## 1 BASHAR 2013 nonveg 9.134165
## 2 BASHAR 2013 veg 9.807801
## 3 BASHAR 2014 nonveg 7.170307
## 4 BASHAR 2014 veg 9.992314
## 5 BASHAR 2015 nonveg 9.699333
## 6 BASHAR 2015 veg 10.075167
CHALLENGE: Use the motinv_sum data frame from
the "Summarizing with dplyr" tab to pivot on SpeciesCode and mean_count,
and fill the NAs with 0s. If you don't have the motinv_sum data frame
handy, run the code below to create it.
Hint: Drop the
ScientificName and CommonName columns before you pivot.
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196),
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
motinv_wide <- motinv_sum |>
arrange(SpeciesCode) |> # sorting so columns are alphabetical
select(-ScientificName, -CommonName) |>
pivot_wider(names_from = SpeciesCode,
values_from = mean_count,
values_fill = 0)
head(motinv_wide)
## # A tibble: 6 × 10
## SiteCode Year CommunityType se_counts CARMAE LITLIT LITOBT LITSAX NUCLAP
## <chr> <int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 BASHAR 2021 Ascophyllum 0.980 4.4 0 0 0 0
## 2 BASHAR 2022 Ascophyllum 0.224 1 0 0 0 0
## 3 BASHAR 2023 Ascophyllum 0.2 2.8 0 0 0 0
## 4 BASHAR 2024 Ascophyllum 0 0.8 0 0 0 0
## 5 BASHAR 2019 Ascophyllum 0.632 1.2 0 0 0 0
## 6 BASHAR 2021 Barnacle 1.73 2.6 0 0 0 0
## # ℹ 1 more variable: TECTES <dbl>
CHALLENGE: Use the motinv_sum data frame from
the "Summarizing with dplyr" tab to pivot on Year and mean_count, fill
the NAs with 0s, and add "yr_" to the column names to prevent column
names starting with numbers. If you don't have the motinv_sum data frame
handy, run the code below to create it.
Hint: Drop the se_counts
column before you pivot.
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196),
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
motinv_wide_yr <- motinv_sum |>
arrange(Year) |> # sorting so year columns are in chronological order
select(-se_counts) |>
pivot_wider(names_from = Year,
values_from = mean_count,
values_fill = 0,
names_prefix = "yr_")
head(motinv_wide_yr)
## # A tibble: 6 × 16
## SiteCode CommunityType ScientificName CommonName SpeciesCode yr_2013 yr_2014
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 BASHAR Ascophyllum Littorina litto… Common pe… LITLIT 14 20.8
## 2 BASHAR Ascophyllum Littorina obtus… Smooth pe… LITOBT 19 18.4
## 3 BASHAR Ascophyllum Nucella lapillus Dogwhelk NUCLAP 0.6 0.2
## 4 BASHAR Barnacle Littorina litto… Common pe… LITLIT 0.4 0.6
## 5 BASHAR Barnacle Littorina obtus… Smooth pe… LITOBT 0.2 1.6
## 6 BASHAR Fucus Littorina litto… Common pe… LITLIT 13.6 25.6
## # ℹ 9 more variables: yr_2015 <dbl>, yr_2016 <dbl>, yr_2017 <dbl>,
## # yr_2018 <dbl>, yr_2019 <dbl>, yr_2021 <dbl>, yr_2022 <dbl>, yr_2023 <dbl>,
## # yr_2024 <dbl>
CHALLENGE: Pivot the motinv_wide_yr data frame on the years
columns, and remove the "yr_" from the year names using
names_prefix = 'yr_'.
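A sketch of one possible answer using tidyr's pivot_longer(). To keep it self-contained, it runs on a three-row, two-year stand-in (wide_toy) built from values in the motinv_wide_yr output above; the object name long_toy is made up.

```r
library(tidyr)

# Small stand-in for motinv_wide_yr, using values from the output above
wide_toy <- data.frame(
  SpeciesCode = c("LITLIT", "LITOBT", "NUCLAP"),
  yr_2013 = c(14, 19, 0.6),
  yr_2014 = c(20.8, 18.4, 0.2)
)

# Stack the year columns into rows; names_prefix strips "yr_" from the names
long_toy <- wide_toy |>
  pivot_longer(cols = starts_with("yr_"),
               names_to = "Year",
               names_prefix = "yr_",
               values_to = "mean_count")
long_toy
```

Note that Year comes back as a character column in this sketch; it could be converted with as.numeric() if a numeric year is needed.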
CHALLENGE: Join the motile invertebrate count data frame to the motile invertebrate species table to get Invasive and Exotic columns added to the data.
Import motinv data frames
# Read in motinv data if you haven't yet
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
head(motinv)
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 0 2 No
## 2 Littorina littorea Common periwinkle LITLIT 0 3 No
## 3 Littorina obtusata Smooth periwinkle LITOBT 1 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 0 6 No
## 5 Nucella lapillus Dogwhelk NUCLAP 0 1 No
## 6 Littorina littorea Common periwinkle LITLIT 0 2 No
## ScientificName CommonName SpeciesCode Invasive Exotic
## 1 Littorina littorea Common periwinkle LITLIT FALSE TRUE
## 2 Littorina obtusata Smooth periwinkle LITOBT FALSE FALSE
## 3 Carcinus maenas Green crab CARMAE TRUE TRUE
## 4 Littorina saxatilis Rough periwinkle LITSAX FALSE FALSE
## 5 Nucella lapillus Dogwhelk NUCLAP FALSE FALSE
## 6 Testudinalia testudinalis Limpet TECTES FALSE FALSE
## [1] "ScientificName" "CommonName" "SpeciesCode"
# left join species to motinv, because don't want to include species not found in count data
motinv_spp <- left_join(motinv,
motspp,
by = c("SpeciesCode", "ScientificName", "CommonName"))
head(motinv_spp)
## Network UnitCode SiteCode StartDate Year QAQC PlotName CommunityType
## 1 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 2 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 3 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 4 NETN ACAD BASHAR 6/21/2013 2013 FALSE A1 Ascophyllum
## 5 NETN ACAD BASHAR 6/24/2013 2013 TRUE A1 Ascophyllum
## 6 NETN ACAD BASHAR 6/21/2014 2014 FALSE A1 Ascophyllum
## ScientificName CommonName SpeciesCode Damage No.Damage Subsampled
## 1 Littorina littorea Common periwinkle LITLIT 0 2 No
## 2 Littorina littorea Common periwinkle LITLIT 0 3 No
## 3 Littorina obtusata Smooth periwinkle LITOBT 1 2 No
## 4 Littorina obtusata Smooth periwinkle LITOBT 0 6 No
## 5 Nucella lapillus Dogwhelk NUCLAP 0 1 No
## 6 Littorina littorea Common periwinkle LITLIT 0 2 No
## Invasive Exotic
## 1 FALSE TRUE
## 2 FALSE TRUE
## 3 FALSE FALSE
## 4 FALSE FALSE
## 5 FALSE FALSE
## 6 FALSE TRUE
CHALLENGE: How would you return date1 as YYYYMMDD (20260312)?
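One possible answer uses format() with date codes, sketched below. The definition of date1 is not shown in this section, so the value here is an assumption based on the 20260312 in the challenge; date1_chr is a made-up name.

```r
# Assumed definition of date1, matching the date in the challenge
date1 <- as.Date("2026-03-12")

# %Y = 4-digit year, %m = 2-digit month, %d = 2-digit day, no separators
date1_chr <- format(date1, "%Y%m%d")
date1_chr
```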
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")
seq.Date(date_list[1], date_list[2], by = "1 week")
## [1] "2026-01-01" "2026-01-08" "2026-01-15" "2026-01-22" "2026-01-29"
## [6] "2026-02-05" "2026-02-12" "2026-02-19" "2026-02-26" "2026-03-05"
## [11] "2026-03-12" "2026-03-19" "2026-03-26" "2026-04-02" "2026-04-09"
## [16] "2026-04-16" "2026-04-23" "2026-04-30" "2026-05-07" "2026-05-14"
## [21] "2026-05-21" "2026-05-28" "2026-06-04" "2026-06-11" "2026-06-18"
## [26] "2026-06-25" "2026-07-02" "2026-07-09" "2026-07-16" "2026-07-23"
## [31] "2026-07-30" "2026-08-06" "2026-08-13" "2026-08-20" "2026-08-27"
## [36] "2026-09-03" "2026-09-10" "2026-09-17" "2026-09-24" "2026-10-01"
## [41] "2026-10-08" "2026-10-15" "2026-10-22" "2026-10-29" "2026-11-05"
## [46] "2026-11-12" "2026-11-19" "2026-11-26" "2026-12-03" "2026-12-10"
## [51] "2026-12-17" "2026-12-24" "2026-12-31"
## index date_time tempF month_num
## 1 1 7/18/2021 10:26 58.842 NA
## 2 2 7/18/2021 11:26 58.712 NA
## 3 3 7/18/2021 12:26 58.109 NA
## 4 4 7/18/2021 13:26 56.208 NA
## 5 5 7/18/2021 14:26 56.208 NA
## 6 6 7/18/2021 15:26 55.342 NA
## index date_time tempF month_num julian
## 1 1 7/18/2021 10:26 58.842 NA NA
## 2 2 7/18/2021 11:26 58.712 NA NA
## 3 3 7/18/2021 12:26 58.109 NA NA
## 4 4 7/18/2021 13:26 56.208 NA NA
## 5 5 7/18/2021 14:26 56.208 NA NA
## 6 6 7/18/2021 15:26 55.342 NA NA
Load packages and prep data for ggplot sections
# packages
library(dplyr)
library(ggplot2)
library(patchwork) # for arranging ggplot objects
library(RColorBrewer) # for palettes
library(viridis) # for palettes
# load data
pcov <- read.csv("./data/SHIHAR_photoplot_cover.csv") # import data
# define color and shape objects
cols <- c("ASCNOD" = "#C5B47B", "BARSPP" = "#A9A9A9",
"NONCOR" = "#574F91", "FUCSPP" = "#FFD560",
"MUSSPP" = "#6F88BF", "REDGRP" = "#FF4C53")
shps <- c("ASCNOD" = 23, "BARSPP" = 24, "NONCOR" = 23,
"FUCSPP" = 25, "MUSSPP" = 23, "REDGRP" = 25)
# Set x axis range
xrange <- range(pcov$Year)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 1.5) + # changed this line
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 1.2) + # changed this line
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = "Year", y = "Avg Percent Cover") + # changed this line
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
CHALLENGE: How would you change the smoother in the code below from LOESS to a linear model and make the line dashed? Hint: method = 'lm'.
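One possible answer to this challenge is sketched below with made-up data, so it runs without pcov. The data frame toy and its values are assumptions for illustration only; the same geom_smooth() layer can be added to p6.

```r
library(ggplot2)

# Hypothetical stand-in for pcov: two cover codes tracked over 12 years
toy <- data.frame(
  Year = rep(2013:2024, times = 2),
  CoverCode = rep(c("ASCNOD", "FUCSPP"), each = 12),
  avg_cover = c(seq(60, 38, by = -2), seq(10, 32, by = 2))
)

p_lm <- ggplot(toy, aes(x = Year, y = avg_cover,
                        color = CoverCode, group = CoverCode)) +
  geom_point() +
  # method = "lm" swaps the default LOESS smoother for a linear model;
  # linetype = "dashed" changes the line style
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE,
              linetype = "dashed", linewidth = 0.6)

p_lm
```

Setting se = FALSE hides the confidence ribbon; drop it if you want the ribbon drawn around each line.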
Code for p6 if you don't have it.
p6 <- ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
CHALLENGE: How would you specify the 'RdYlBu' palette instead of the ones used above?
Hint: Start with p_pal to save time coding. Code for p_pal is below if you don't have it.
p_pal <- ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = "Year", y = "Avg Percent Cover") + # changed this line
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
CHALLENGE: Create your own palette with at least three colors.
Hint: Start with p_heat to save time coding. Code is below if you don't have it.
# Create p_heat for 'palettes' section
p_heat <-
ggplot(chem, aes(x = mon, y = year, color = Temp_F, fill = Temp_F)) +
theme_bw() +
geom_tile() +
labs(y = "Year", x = "Month") +
scale_x_continuous(breaks = c(5, 6, 7, 8, 9, 10),
limits = c(4, 11),
labels = month.abb[5:10]) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
There are a number of options to get help with R. If you're trying to figure out how to use a function, you can type ?function_name in the console. For example, ?plot will show the R documentation for that function in the Help panel.
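A few other helpers complement ?function_name. The sketch below uses only functions that ship with R; mean and sd are just example targets.

```r
# help("mean") is the long form of ?mean; printing the result opens the page
help_page <- help("mean")

# args() shows a function's arguments without opening the help page
args(sd)

# apropos() lists objects whose names match a pattern,
# handy when you only remember part of a function name
mean_funs <- apropos("^mean")
mean_funs
```

At the console, ??keyword searches all installed help pages when you don't know the function name at all.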
Get help for the functions below
You can also press F1 while the cursor is on a function name to access the help for that function. Help documents in R are standardized to help you find what you're looking for.
Great online resources to find answers to questions include Stack Exchange and Stack Overflow. Google searches are usually my first step, and I include "in R" and the package name (if applicable) in every search related to R code. If you're troubleshooting an error message, copying and pasting the error message verbatim into a search engine often helps.
Don't hesitate to reach out to colleagues for help as well! If you are stuck on something and the answers on Google are more confusing than helpful, don't be afraid to ask a human. Every experienced R programmer was a beginner once, so chances are they've encountered the same problem as you at some point. There is an R-focused Data Science Community of Practice for I&M folks, which anyone working in R (regardless of experience!) is invited and encouraged to join.
Unmatched parenthesis
mean_x <- mean(c(1, 3, 5, 7, 8, 21) # missing closing parenthesis
mean_x <- mean(c(1, 3, 5, 7, 8, 21)) # correct
Unmatched quotes
birds <- c("black-capped chickadee", "golden-crowned kinglet, "wood thrush") # missing quote after kinglet
Missing a comma between elements
birds <- c("black-capped chickadee", "golden-crowned kinglet" "wood thrush") # missing comma after kinglet
birds <- c("black-capped chickadee", "golden-crowned kinglet", "wood thrush") # corrected
Misspelled function name
Incorrect use of dimensions with brackets
## Error in `[.data.frame`:
## ! undefined columns selected
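The last two error types can be reproduced safely by wrapping the faulty call in tryCatch(), which captures the error message instead of stopping the script. The misspelled name maen() and the data frame df are made up for illustration.

```r
# Misspelled function name: maen() instead of mean()
msg_fun <- tryCatch(maen(c(1, 3, 5)),
                    error = function(e) conditionMessage(e))
msg_fun

# Incorrect use of dimensions with brackets: df has 2 columns,
# so asking for column 5 fails
df <- data.frame(site = c("A", "B", "C"), count = c(4, 0, 7))
msg_dim <- tryCatch(df[, 5],
                    error = function(e) conditionMessage(e))
msg_dim

# Corrected versions
mean(c(1, 3, 5))
df[, 2]
```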
There's a lot of great online material for learning new applications of R. The ones we've used the most are listed below.
While we won't get to these topics during this training, the 2022 Advanced R training has sessions covering all of them. The Resources tab lists other online resources that cover these topics as well.
knitr::opts_chunk$set(warning=FALSE, message=FALSE)
hooks = knitr::knit_hooks$get()
hook_foldable = function(type) {
force(type)
function(x, options) {
res = hooks[[type]](x, options)
if (isFALSE(options[[paste0("fold.", type)]])) return(res)
paste0(
"<details><summary class='code2'>View R ", type, "</summary>\n",
res, "\n\n",
"</details>",
"\n\n",
"<hr style='height:1px; margin-bottom:15px; padding-bottom:15px; padding-top:-15px;margin-top:-15px;visibility:hidden;'>",
"\n\n"
)
}
}
knitr::knit_hooks$set(
output = hook_foldable("output"),
plot = hook_foldable("plot")
)
library(tidyverse)
#------------------------------------
# Day 0 - prep code
#------------------------------------
rm(list = ls())
packages <- c("tidyverse", # for Day 2 and 3 data wrangling
"RColorBrewer", "viridis", "patchwork", # for Day 3 ggplot
"readxl", "writexl", # for day 1 importing from excel
"car") # for Levene's test - also a great stats R package
install.packages(setdiff(packages, rownames(installed.packages())))
# Check that installation worked
library(tidyverse) # turns on core tidyverse packages
library(RColorBrewer) # palette generator
library(viridis) # more palettes
library(patchwork) # multipanel plots
library(readxl) # reading xlsx
library(writexl) # writing xlsx
motinv <- read.csv(
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/BASHAR_motile_invert_counts.csv")
#------------------------------------
# Day 1: Project Setup Code
#------------------------------------
# forward slash file path approach
"C:/Users/KMMiller/OneDrive - DOI/data/"
# backward slash file path approach (each backslash must be doubled)
"C:\\Users\\KMMiller\\OneDrive - DOI\\data\\"
dir.create("data")
list.files() # you should see a data folder listed
file_list <- c(
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/ACAD_Jordan_Pond_water_chem.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/BASHAR_motile_invert_counts.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/BASHAR_Point_Intercept_data.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/bat_site_info.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/bat_captures.csv",
"https://raw.githubusercontent.com/KateMMiller/IMD_R_Training_2026/refs/heads/main/data/HOBO_temp_example.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/motile_invert_species_table.csv",
"https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/SHIHAR_photoplot_cover.csv")
file_names <- sub(".*data/", "", file_list)
lapply(seq_along(file_list), function(x){
download.file(file_list[x],
destfile = paste0("./data/", file_names[x]))
})
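The sub() call above strips everything up to and including "data/" from each URL, leaving only the file name to use as the download destination. The pattern can be checked on a single URL without downloading anything; basename() is shown as a base R alternative.

```r
url <- "https://raw.githubusercontent.com/KateMMiller/MMA_R_Training_2026/refs/heads/main/data/bat_captures.csv"

# ".*data/" matches everything through the last "data/" in the string
file_name <- sub(".*data/", "", url)
file_name

# basename() returns the part of a path after the final slash
basename(url)

# Build the relative destination path passed to download.file()
dest <- paste0("./data/", file_name)
dest
```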
#------------------------------------
# Day 1: Start Coding Code
#------------------------------------
# Commented text: try this line to generate some basic text and become familiar with where results will appear:
print("Welcome to R!")
# simple math
1+1
(2*3)/4
sqrt(9)
# calculate basal area (cm^2) of a tree with 14.6 cm diameter: pi * (diameter/2)^2; note pi is a built-in constant in R
pi * (14.6/2)^2
# get the cosine of 180 degrees - note that trig functions in R expect angles in radians
cos(pi)
# the value of 12.098 is assigned to variable 'a'
a <- 12.098
# and the value 65.3475 is assigned to variable 'b'
b <- 65.3475
# we can now perform whatever mathematical operations we want using these two
# variables without having to repeatedly type out the actual numbers:
a*b
(a^b)/((b+a))
sqrt((a^7)/(b*2))
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
# equivalent to x <- 1:10
# bad coding
#mean <- mean(x)
# good coding
mean_x <- mean(x)
mean_x
range_x <- range(x)
range_x
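Because R's trig functions expect radians, angles measured in degrees need converting first. A small helper makes that explicit; deg2rad() is a hypothetical name, not a built-in function.

```r
# Convert degrees to radians: radians = degrees * pi / 180
deg2rad <- function(deg) {
  deg * pi / 180
}

cos(deg2rad(180)) # same as cos(pi)
sin(deg2rad(90))
cos(180)          # not the same: 180 is treated as radians here
```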
#------------------------------------
# Day 1: Read and Write Code
#------------------------------------
# read in the data from BASHAR_motile_invert_counts.csv and assign it as a dataframe to the variable "motinv"
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
# View the BASHAR_motile data frame we just created
View(motinv)
# Look at the top 6 rows of the data frame
head(motinv)
# Look at the bottom 6 rows of the data frame
tail(motinv)
# Write the data frame to your data folder using a relative path.
# By default, write.csv adds a column with row names that are numbers. I don't
# like that, so I turn that off.
write.csv(motinv, "./data/BASHAR_motile_invert_counts.csv", row.names = FALSE)
# Read the data frame in using a relative path
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
# Equivalent code to read in the data frame using full path on my computer, but won't match another user.
motinv <- read.csv("C:/Users/KMMiller/OneDrive - DOI/NETN/R_Dev/MMA_R_Training_2026/data/BASHAR_motile_invert_counts.csv")
install.packages("readxl") # only need to run once.
install.packages("writexl")
library(writexl) # saving xlsx
library(readxl) # importing xlsx
write_xlsx(motinv, "./data/BASHAR_motile_invert_counts.xlsx")
motinv_xls <- read_xlsx(path = "./data/BASHAR_motile_invert_counts.xlsx", sheet = "Sheet1")
head(motinv_xls)
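To see what row.names = FALSE changes, the sketch below writes a small made-up data frame to a temporary file both ways and reads each back. The object toy and the temporary file paths are illustrative only.

```r
toy <- data.frame(PlotName = c("A1", "A2", "A3"), Count = c(12, 0, 5))

# Temporary files so nothing is written to your project folder
f_default <- tempfile(fileext = ".csv")
f_norows  <- tempfile(fileext = ".csv")

write.csv(toy, f_default)                    # default keeps row names
write.csv(toy, f_norows, row.names = FALSE)  # row names dropped

names(read.csv(f_default)) # extra first column, named "X" by read.csv
names(read.csv(f_norows))  # matches the original column names
```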
#------------------------------------
# Day 1: Vectors Code
#------------------------------------
digits <- c(1:10) # Use x:y to create a sequence of integers starting at x and ending at y
digits
digits + 1 # note how 1 was added to every element of digits.
is_odd <- rep(c(TRUE, FALSE), 5) # Use rep(x, n) to create a vector by repeating x n times; digits starts at 1, which is odd
is_odd
tree_dbh <- c(12.5, 20.4, 18.1, 38.5, 19.3)
tree_dbh
bird_ids <- c("song sparrow", "dark-eyed junco", "golden-crowned kinglet", "dark-eyed junco")
bird_ids
second_bird <- bird_ids[2]
second_bird
top_two_birds <- bird_ids[c(1,2)]
top_two_birds
sort(unique(bird_ids))
class(bird_ids)
class(tree_dbh)
class(digits)
class(is_odd)
str(motinv)
names(motinv)
motinv$PlotName
motinv$ScientificName
dim(motinv)
nrow(motinv) # first dim
ncol(motinv) # second dim
motinv[1:5,]
motinv[c(1, 2, 3, 4, 5),] #equivalent but more typing
motinv[, c("SiteCode", "ScientificName", "CommonName", "Year", "Damage", "No.Damage")]
motinv[1:5, c("SiteCode", "ScientificName", "CommonName", "Year", "Damage", "No.Damage")]
motinv_sub <- motinv[, 1:4] # works, but risky
motinv_sub2 <- motinv[, c("Network", "UnitCode", "SiteCode", "StartDate")] #same result, but better
# compare the two data frames to the original
head(motinv)
head(motinv_sub)
head(motinv_sub2)
motinv[c(2, 4, 6, 8), c(1, 2)]
names(motinv)[1:2] # get the names of the first 2 columns
motinv[c(2, 4, 6, 8), c("Network", "UnitCode")]
head(motinv)
motinv_nonQ <- motinv[motinv$QAQC == FALSE, ]
table(motinv$QAQC) # 42 T
table(motinv_nonQ$QAQC) # 0 T
motinv$ScientificName[motinv$CommunityType == "Barnacle"]
motinv[motinv$CommunityType == "Barnacle", "ScientificName"] # equivalent
lit_spp <- c("Littorina littorea", "Littorina obtusata", "Littorina saxatilis")
motinv_lit <- motinv[motinv$ScientificName %in% lit_spp,
c("SiteCode", "PlotName", "ScientificName", "Year")]
motinv_lit
# Return a vector of unique plot names, sorted alphabetically
plots_unique <- sort(unique(motinv[,"PlotName"]))
plots_unique
# Returns the number of elements in the plots_unique vector
length(plots_unique) # 20
# Option 1
length(unique(motinv[, "ScientificName"])) # 6
# Option 2
length(unique(motinv$ScientificName)) # equivalent
# Option 1 - use unique() to return each year only once
unique(motinv$Year[motinv$QAQC == TRUE]) # 2013
# Option 2
unique(motinv[motinv$QAQC == TRUE, "Year"])
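The bracket patterns in this section all follow one rule: data[rows, columns]. A minimal sketch on a made-up data frame (toy and its values are hypothetical) shows the logical vector that does the row filtering.

```r
toy <- data.frame(
  PlotName = c("A1", "A1", "B2", "B2", "C3"),
  Species  = c("Littorina littorea", "Carcinus maenas",
               "Littorina obtusata", "Nucella lapillus", "Carcinus maenas"),
  QAQC     = c(FALSE, FALSE, TRUE, FALSE, FALSE)
)

# The condition returns one TRUE/FALSE per row
keep <- toy$QAQC == FALSE
keep

# Rows where keep is TRUE, all columns (note the trailing comma)
toy[keep, ]

# %in% tests each element against a set of values
lit_spp <- c("Littorina littorea", "Littorina obtusata")
toy_lit <- toy[toy$Species %in% lit_spp, c("PlotName", "Species")]
toy_lit

# Selecting a single column returns a vector, so unique() and length() work
length(unique(toy$PlotName))
```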
#-----------------------------------------
# Day 1: Data Exploration Code
#-----------------------------------------
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
head(motinv)
str(motinv)
summary(motinv)
table(complete.cases(motinv[,1:13]))# first 13 columns are all complete
table(complete.cases(motinv$Subsampled))# where the FALSE are introduced
x <- c(1, 3, 8, 3, 5, NA)
mean(x) # returns NA
mean(x, na.rm = TRUE)
sort(unique(motinv$Damage)) # sorts the unique values in the column
table(motinv$Damage) # shows the number of records per value - very handy
motinv2 <- motinv
motinv2$Damage[motinv2$Damage == "PM"] <- NA
motinv2$Damage_num <- as.numeric(motinv2$Damage)
# check that it worked
str(motinv2) # Damage_num is numeric
sort(unique(motinv2$Damage_num)) # Only numbers show in table
motinv3 <- subset(motinv2, QAQC == FALSE, select = -Damage) # Note the importance of FALSE all caps
motinv3 <- subset(motinv2, QAQC != TRUE, select = -Damage) # equivalent
motinv3 <- motinv2[motinv2$QAQC == FALSE, -12] #equivalent but not as easy to follow
# Look at the start date format
head(motinv3) # month/day/year
# Create new column called Date
motinv3$Date <- as.Date(motinv3$StartDate, format = "%m/%d/%Y")
str(motinv3)
names(motinv3) # original names
names(motinv3)[names(motinv3) == "ScientificName"] <- "Species"
names(motinv3) # check that it worked
motinv3$Site_Plot <- paste(motinv3$SiteCode, motinv3$PlotName, sep = "-")
motinv3$Site_Plot <- paste0(motinv3$SiteCode, "-", motinv3$PlotName) # equivalent - paste0() adds no separator between elements
# with brackets
A1_2024 <- motinv3[motinv3$PlotName == "A1" & motinv3$Year == 2024, ]
nrow(A1_2024) # 3
# with base R subset
A1_2024b <- subset(motinv3, PlotName == "A1" & Year == 2024)
View(A1_2024b) # 3
# OPTION 2
gcrab <- motinv3[motinv3$Species == "Carcinus maenas",]
sort(unique(gcrab$Year)) #2019, 2021, 2022, 2023, 2024
gcrab2 <- subset(motinv3, Species == "Carcinus maenas")
table(gcrab2$Year)
View(motinv3)
max_nd <- max(motinv3$No.Damage, na.rm = TRUE)
motinv3[motinv3$No.Damage == max_nd,]
# create copy of motinv data
motinv_fix <- motinv3
# find the problematic value, and change it to 196
motinv_fix$No.Damage[motinv_fix$Year == 2019 &
motinv_fix$PlotName == "R4" &
motinv_fix$No.Damage == 1960] <- 196
# check your work
range(motinv3$No.Damage) #1960
range(motinv_fix$No.Damage) # now 282
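The reason "PM" is replaced with NA before calling as.numeric() is that R cannot convert text that isn't a number; it returns NA and issues a warning. A sketch with a made-up vector:

```r
damage <- c("3", "0", "PM", "12")

# Converting directly works but warns: "NAs introduced by coercion"
damage_num1 <- suppressWarnings(as.numeric(damage))

# Replacing the known text value first makes the NA deliberate
damage2 <- damage
damage2[damage2 == "PM"] <- NA
damage_num2 <- as.numeric(damage2)

identical(damage_num1, damage_num2) # same values, no warning the second way

# Most summary functions return NA unless told to drop missing values
mean(damage_num2)
mean(damage_num2, na.rm = TRUE)
```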
#------------------------------------
# Day 1: Basic Plotting Code
#------------------------------------
hist(x = motinv3$No.Damage)
plot(motinv3$No.Damage)
plot(motinv3$No.Damage ~ motinv3$Damage_num)
plot(No.Damage ~ Damage_num, data = motinv3) # equivalent but cleaner axis titles
hist(motinv3$Damage_num)
#------------------------------------
# Day 2: Tidyverse Code
#------------------------------------
install.packages('tidyverse')
library(tidyverse)
library(dplyr)
#------------------------------------
# Day 2: Data Wrangling Code
#------------------------------------
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
# Base R
motinv2 <- motinv
motinv2$Damage[motinv2$Damage == "PM"] <- NA
motinv2$Damage_num <- as.numeric(motinv2$Damage)
# dplyr approach with mutate
motinv2 <- mutate(motinv, Damage_num = as.numeric(replace(Damage, Damage == "PM", NA)))
str(motinv2)
# Base R
motinv2$Date <- as.Date(motinv2$StartDate, format = "%m/%d/%Y")
# dplyr approach with mutate
motinv2 <- mutate(motinv2, Date = as.Date(StartDate, format = "%m/%d/%Y"))
# Base R code
names(motinv2)[names(motinv2) == "ScientificName"] <- "Species"
# dplyr approach with rename
motinv2 <- rename(motinv2, "Species" = "ScientificName")
names(motinv2)
# Base R
motinv2$Site_Plot <- paste(motinv2$SiteCode, motinv2$PlotName, sep = "-")
# dplyr approach with mutate
motinv2 <- mutate(motinv2, Site_Plot = paste(SiteCode, PlotName, sep = "-"))
# Base R
motinv3 <- subset(motinv2, QAQC == FALSE, select = -Damage) # Note the importance of FALSE all caps
# dplyr
motinv3a <- filter(motinv2, QAQC == FALSE)
motinv3 <- select(motinv3a, -Damage)
head(motinv3)
motinv4 <- mutate(motinv3, No.Damage = replace(No.Damage, No.Damage == 1960, 196))
motinv_final <- motinv |>
mutate(Damage_num = as.numeric(replace(Damage, Damage == "PM", NA)), # Fix Damage PM
SitePlot = paste(SiteCode, PlotName, sep = "-"), # create new SitePlot column
Date = as.Date(StartDate, format = "%m/%d/%Y"), # create new Date column
No.Damage_fix = replace(No.Damage, No.Damage == 1960, 196)) |> # fix error in No.Damage
rename("Species" = "ScientificName") |> # change column name
filter(QAQC == FALSE) |> # drop QAQC visits
select(-Damage) |> # drop original Damage column
arrange(SitePlot, Year, Species) # optional sorting the data
head(motinv_final)
# with dplyr filter()
A1_2024 <- motinv |> filter(PlotName == "A1" & Year == 2024)
nrow(A1_2024) # 3
gcrab <- motinv |> filter(ScientificName == "Carcinus maenas") |>
select(Year) |> unique()
gcrab
max_nd <- max(motinv$No.Damage, na.rm = TRUE)
motinv |> filter(No.Damage == max_nd)
# Reminder of the base R approach
# create copy of motinv data
motinv_fix <- motinv
# find the problematic value, and change it to 196
motinv_fix$No.Damage[motinv_fix$Year == 2019 &
motinv_fix$PlotName == "R4" &
motinv_fix$No.Damage == 1960] <- 196
# dplyr approach
motinv_fix <- motinv |> mutate(No.Damage = replace(No.Damage, No.Damage == 1960, 196))
range(motinv_fix$No.Damage)
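replace() does the work inside several of the mutate() calls above. It takes a vector, a condition or index, and the replacement value, and returns the modified vector. A base R sketch with made-up counts:

```r
counts <- c(12, 1960, 45, 3)

# replace(x, list, values): swap the elements of x selected by list
counts_fix <- replace(counts, counts == 1960, 196)
counts_fix

# Equivalent bracket assignment on a copy
counts_fix2 <- counts
counts_fix2[counts_fix2 == 1960] <- 196

identical(counts_fix, counts_fix2)
```

Note that replace() leaves the original vector untouched, so the result has to be assigned to a column or object to be kept.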
#------------------------------------
# Day 2: Conditionals Code
#------------------------------------
# green crab, Asian shore crab, and common periwinkle species codes
exo_spp <- c("CARMAE", "HEMISAN", "LITLIT")
# smooth periwinkle, rough periwinkle, dogwhelk, and limpet species codes
nat_spp <- c("LITOBT", "LITSAX", "NUCLAP", "TECTES")
# Make a table of species codes in BASHAR
table(motinv$SpeciesCode)
# Add native column with ifelse
motinv <- motinv |> mutate(native = ifelse(SpeciesCode %in% nat_spp, TRUE, FALSE))
# Add native_status column with nested ifelse
motinv <- motinv |> mutate(native_status = ifelse(SpeciesCode %in% nat_spp, "native",
ifelse(SpeciesCode %in% c("CARMAE", "HEMISAN"), "invasive",
"exotic")))
table(motinv$SpeciesCode, motinv$native)
table(motinv$SpeciesCode, motinv$native_status)
# green crab, Asian shore crab, and common periwinkle species codes
exo_spp <- c("CARMAE", "HEMISAN", "LITLIT")
# smooth periwinkle, rough periwinkle, dogwhelk, and limpet species codes
nat_spp <- c("LITOBT", "LITSAX", "NUCLAP", "TECTES")
motinv <- motinv |>
mutate(native_status = case_when(SpeciesCode %in% nat_spp ~ 'native',
SpeciesCode %in% c("CARMAE", "HEMISAN") ~ 'invasive',
SpeciesCode %in% exo_spp ~ 'exotic',
TRUE ~ 'unknown'))
table(motinv$SpeciesCode, motinv$native_status) # check that the output worked
inv <- motinv |> filter(native_status == "invasive")
spp_det <- unique(inv$CommonName)
if(nrow(inv) > 0){print(paste0("The following invasive species were detected in the data: ",
paste0(spp_det, collapse = ", ")))
} else {print("No invasive species were detected in the data.")}
inv <- motinv |> filter(SpeciesCode %in% nat_spp) |>
filter(native_status == "invasive")
spp_det <- unique(inv$CommonName)
if(nrow(inv) > 0){print(paste0("The following invasive species were detected in the data: ",
paste0(spp_det, collapse = ", ")))
} else {print("No invasive species were detected in the data.")}
pred <- c("CARMAE", "NUCLAP")
# base R
motinv$trophic <- ifelse(motinv$SpeciesCode %in% pred, "predator", "herbivore")
table(motinv$trophic, motinv$SpeciesCode)
# tidyverse
motinv <- motinv |> mutate(trophic = ifelse(SpeciesCode %in% pred, "predator", "herbivore"))
table(motinv$trophic, motinv$SpeciesCode)
# Base R using a nested ifelse()
motinv$count_level <-
ifelse(motinv$No.Damage > 35, "High",
ifelse(motinv$No.Damage >= 10 & motinv$No.Damage <= 35, "Medium", "Low"))
table(motinv$count_level) # check that it worked
# Tidyverse using case_when() and between
motinv <- motinv |> mutate(count_level = case_when(No.Damage > 35 ~ "High",
between(No.Damage, 10, 35) ~ "Medium",
No.Damage < 10 ~ "Low"))
table(motinv$count_level) # check that it worked
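case_when() checks its conditions in order and uses the first one that is TRUE, so the nested ifelse() and case_when() versions give the same categories. A sketch on made-up counts that include the boundary values 10 and 35; it assumes dplyr is installed, and toy is hypothetical.

```r
library(dplyr)

toy <- data.frame(No.Damage = c(0, 9, 10, 35, 36, 120))

toy <- toy |>
  mutate(
    # nested ifelse(): the inner call only runs where the outer test is FALSE
    level_ifelse = ifelse(No.Damage > 35, "High",
                          ifelse(No.Damage >= 10, "Medium", "Low")),
    # case_when(): first matching condition wins;
    # between() includes both endpoints
    level_case = case_when(No.Damage > 35 ~ "High",
                           between(No.Damage, 10, 35) ~ "Medium",
                           No.Damage < 10 ~ "Low")
  )

toy
table(toy$level_case)
```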
#------------------------------------
# Day 2: Summarizing Code
#------------------------------------
pi_dat <- read.csv("./data/BASHAR_Point_Intercept_data.csv")
head(pi_dat)
pi_dat_mut <- pi_dat |> mutate(med_elev_sl = median(med_elev),
avg_pct_freq = mean(pct_freq),
.by = c(SiteCode, Year, CoverType, CoverCode))
nrow(pi_dat) #314
nrow(pi_dat_mut) #314
head(pi_dat_mut)
pi_dat_sum <- pi_dat |> summarize(elev_sl_med = median(med_elev),
elev_sl_min = min(med_elev),
elev_sl_max = max(med_elev),
avg_pct_freq = mean(pct_freq),
.by = c(SiteCode, Year, CoverType, CoverCode))
nrow(pi_dat) #314
nrow(pi_dat_sum) #124
head(pi_dat_sum)
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196),
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
head(motinv_sum)
pi_nonveg <- pi_dat |> filter(CoverCode %in% c("BOLT", "ROCK", "WATER")) |> # filter nonveg grps
summarize(avg_freq = mean(pct_freq), # calc avg.
.by = c(SiteCode, Year, CoverCode, CoverType)) # grouping variables
head(pi_nonveg) # check output
pi_subtype <- pi_dat |>
mutate(sub_type = ifelse(CoverCode %in% c("BOLT", "ROCK", "WATER"), "nonveg", "veg")) |> # classify cover codes as nonveg or veg
summarize(avg_freq = mean(pct_freq), # calc avg.
.by = c(SiteCode, Year, sub_type)) |> # grouping variables
arrange(SiteCode, Year, sub_type) # sort variables
head(pi_subtype) # check output
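The difference between mutate() and summarize() with .by is easiest to see on a few rows: mutate() keeps every row and repeats the group value, while summarize() returns one row per group. A sketch with made-up data; it assumes dplyr 1.1.0 or later, where the .by argument was introduced.

```r
library(dplyr)

toy <- data.frame(
  SiteCode = c("BASHAR", "BASHAR", "BASHAR", "SHIHAR", "SHIHAR"),
  pct_freq = c(10, 20, 30, 5, 15)
)

toy_mut <- toy |> mutate(avg_pct_freq = mean(pct_freq), .by = SiteCode)
toy_sum <- toy |> summarize(avg_pct_freq = mean(pct_freq), .by = SiteCode)

nrow(toy_mut) # same number of rows as toy
nrow(toy_sum) # one row per SiteCode
toy_sum
```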
#------------------------------------
# Day 3: Pivot Code
#------------------------------------
# load the package
library(dplyr)
library(tidyr) # for pivot functions
#--- import the raw point intercept data
pi_dat <- read.csv("./data/BASHAR_Point_Intercept_data.csv")
# summarize data by site, year, and cover type
pi_dat_sum <- pi_dat |> summarize(med_elev_sl = median(med_elev, na.rm = T),
avg_pct_freq = mean(pct_freq, na.rm = T),
.by = c(SiteCode, Year, CoverType, CoverCode))
pi_wide <- pi_dat_sum |>
arrange(CoverCode, Year) |> # sort by CoverCode and year
select(-CoverType, -med_elev_sl) |> # Drop extra column
pivot_wider(names_from = CoverCode, # column that will produce column names
values_from = avg_pct_freq) # column to make the values
head(pi_wide)
pi_wide <- pi_dat_sum |>
arrange(CoverCode, Year) |>
select(-CoverType, -med_elev_sl) |>
pivot_wider(names_from = CoverCode,
values_from = avg_pct_freq,
values_fill = 0) # new line
head(pi_wide)
pi_wide_yr <- pi_dat_sum |>
arrange(Year) |>
select(-med_elev_sl) |>
pivot_wider(names_from = Year, # pivot on year instead of CoverCode
values_from = avg_pct_freq,
values_fill = 0,
names_prefix = "yr_") # new line
head(pi_wide_yr)
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196),
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
motinv_wide <- motinv_sum |>
arrange(SpeciesCode) |> # sorting so columns are alphabetical
select(-ScientificName, -CommonName, -se_counts) |> # drop se_counts so it isn't kept as an identifier column
pivot_wider(names_from = SpeciesCode,
values_from = mean_count,
values_fill = 0)
head(motinv_wide)
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196),
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
motinv_wide_yr <- motinv_sum |>
arrange(Year) |> # sorting so year columns are in chronological order
select(-se_counts) |>
pivot_wider(names_from = Year,
values_from = mean_count,
values_fill = 0,
names_prefix = "yr_")
head(motinv_wide_yr)
pi_long <- pi_wide |> pivot_longer(cols = -c(SiteCode, Year),
names_to = "CoverCode", # pi_wide columns came from CoverCode
values_to = "Avg_Pct_Freq")
head(pi_long)
motinv_long_yr <- pivot_longer(motinv_wide_yr,
cols = -c(SiteCode, CommunityType, ScientificName,
CommonName, SpeciesCode),
names_to = "Year",
values_to = "mean_counts",
names_prefix = "yr_") # drops this string from values
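pivot_wider() and pivot_longer() are inverses of each other. A round trip on a made-up table shows what values_fill and names_prefix do; it assumes tidyr is installed, and toy is hypothetical.

```r
library(tidyr)

toy <- data.frame(
  SiteCode = c("BASHAR", "BASHAR", "SHIHAR"),
  Year     = c(2023, 2024, 2024),
  avg_freq = c(12.5, 15.0, 8.0)
)

# Long to wide: one column per year; the missing SHIHAR 2023 cell becomes 0
toy_wide <- pivot_wider(toy, names_from = Year, values_from = avg_freq,
                        values_fill = 0, names_prefix = "yr_")
toy_wide

# Wide to long: names_prefix strips "yr_" so Year holds just the digits
toy_long <- pivot_longer(toy_wide, cols = -SiteCode,
                         names_to = "Year", values_to = "avg_freq",
                         names_prefix = "yr_")
toy_long
```

After the round trip, Year is a character column and the filled-in 0 is now an explicit row, so the long table has one more row than the original.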
#------------------------------------
# Day 3: Join Code
#------------------------------------
#site data
bat_sites <- read.csv("./data/bat_site_info.csv")
# bat capture data
bat_cap <- read.csv("./data/bat_captures.csv")
# View sites listed in each
sort(unique(bat_sites$Site)) # Sites 1, 2, 3, 4, 5
sort(unique(bat_cap$Site)) # Sites 1, 2, 3, 5, 6
bat_full <- full_join(bat_sites, bat_cap, by = "Site")
table(bat_full$Site)
knitr::kable(bat_full, align = 'c') |>
kableExtra::scroll_box(height = "300px") |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 10) |>
kableExtra::column_spec(1:10, background = 'white', include_thead = T)
bat_inner <- inner_join(bat_sites, bat_cap, by = "Site")
table(bat_inner$Site)
knitr::kable(bat_inner) |>
kableExtra::scroll_box(height = "300px") |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 10) |>
kableExtra::column_spec(1:10, background = 'white', include_thead = T)
bat_left <- left_join(bat_sites, bat_cap, by = "Site")
table(bat_left$Site)
knitr::kable(bat_left) |>
kableExtra::scroll_box(height = "300px") |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 10) |>
kableExtra::column_spec(1:10, background = 'white', include_thead = T)
bat_right <- right_join(bat_sites, bat_cap, by = "Site")
table(bat_right$Site)
knitr::kable(bat_right) |>
kableExtra::scroll_box(height = "300px") |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 10) |>
kableExtra::column_spec(1:10, background = 'white', include_thead = T)
anti_join(bat_sites, bat_cap, by = "Site")
anti_join(bat_cap, bat_sites, by = "Site")
#--- Read in motinv data if you haven't yet
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
#--- Read in species table
motspp <- read.csv("./data/motile_invert_species_table.csv")
head(motspp)
intersect(names(motinv), names(motspp)) # 3 columns in common
# left join species to motinv because we don't want to include species not found in the count data
motinv_spp <- left_join(motinv,
motspp,
by = c("SpeciesCode", "ScientificName", "CommonName"))
head(motinv_spp)
# anti join of species table against count data: species in motspp with no records in motinv
anti_join(motspp, motinv, by = c("SpeciesCode", "ScientificName", "CommonName"))
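The join types differ only in which unmatched rows they keep. The sketch below uses two made-up tables that mirror the bat site pattern above (sites 1 to 5 in one table; sites 1, 2, 3, 5, 6 in the other). It assumes dplyr is installed, and the columns habitat and n_bats are hypothetical.

```r
library(dplyr)

sites <- data.frame(Site = c(1, 2, 3, 4, 5),
                    habitat = c("forest", "field", "forest", "wetland", "field"))
caps  <- data.frame(Site = c(1, 2, 3, 5, 6),
                    n_bats = c(4, 0, 7, 2, 9))

full  <- full_join(sites, caps, by = "Site")   # every site from either table
inner <- inner_join(sites, caps, by = "Site")  # only sites found in both
left  <- left_join(sites, caps, by = "Site")   # every site in sites
right <- right_join(sites, caps, by = "Site")  # every site in caps

sapply(list(full = full, inner = inner, left = left, right = right), nrow)

anti_join(sites, caps, by = "Site") # site 4: no capture data
anti_join(caps, sites, by = "Site") # site 6: no site info
```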
#------------------------------------
# Day 3: Dates and Time Code
#------------------------------------
codes <- read.csv("./data/datetime_codes.csv", encoding = "Latin-1")
knitr::kable(codes) |>
#kableExtra::scroll_box(width = "300px") |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 11,
bootstrap_options = "condensed") |>
kableExtra::column_spec(1:2, background = 'white', include_thead = T)
Sys.time()
class(Sys.time()) # POSIXct POSIXt
Sys.Date()
class(Sys.Date()) # Date
# date with slashes and full year
date_chr1 <- "3/12/2026"
date1 <- as.Date(date_chr1, format = "%m/%d/%Y")
str(date1)
# date with dashes and 2-digit year
date_chr2 <- "3-12-26"
date2 <- as.Date(date_chr2, format = "%m-%d-%y")
str(date2)
# date written out
date_chr3 <- "March 12, 2026"
date3 <- as.Date(date_chr3, format = "%b %d, %Y")
str(date3)
#Julian date as numeric
as.numeric(format(date1, format = "%j"))
#Return day of week
format(date1, format = "%A")
#Return abbreviated day of week
format(date1, format = "%a")
#Return written out date with month name
format(date1, format = "%B %d, %Y")
#Return abbreviated written out date with month name
format(date1, format = "%b %d, %Y")
date1 + 1 # add a day
date1 + 7 # add a week
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")
# by 15 days
seq.Date(date_list[1], date_list[2], by = "15 days")
# by month
seq.Date(date_list[1], date_list[2], by = "1 month")
# by 6 months
seq.Date(date_list[1], date_list[2], by = "6 months")
format(date1, format = "%Y%m%d")
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")
seq.Date(date_list[1], date_list[2], by = "3 months")
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")
seq.Date(date_list[1], date_list[2], by = "1 week")
unclass(as.POSIXct("2026-03-12 01:30:00", format = "%Y-%m-%d %H:%M:%S", tz = "America/New_York"))
unclass(as.POSIXlt("2026-03-12 01:30:00", format = "%Y-%m-%d %H:%M:%S", tz = "America/New_York"))
Sys.timezone()
OlsonNames()
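POSIXct stores a date-time as the number of seconds since 1970-01-01 00:00:00 UTC, which is why unclass() returns a single large number, while POSIXlt stores a list of components. The time zone changes how that instant is displayed, not the instant itself. A base R sketch using an easy-to-check value:

```r
# One day after the epoch, in UTC
t_utc <- as.POSIXct("1970-01-02 00:00:00",
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
as.numeric(t_utc) # seconds since the epoch: 24 * 60 * 60

# POSIXlt holds named components such as hour and mday
t_lt <- as.POSIXlt(t_utc)
t_lt$hour
t_lt$mday

# Same instant displayed in another time zone
format(t_utc, tz = "America/New_York", usetz = TRUE)
```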
temp_data1 <- read.csv("./data/HOBO_temp_example.csv")
# check data
head(temp_data1)
temp_data <- read.csv("./data/HOBO_temp_example.csv", skip = 1)[,1:3]
colnames(temp_data) <- c("index", "date_time", "tempF")
View(temp_data)
knitr::kable(temp_data[1:50,], caption = "First 50 rows of temp_data") |>
kableExtra::scroll_box(height = "300px") |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 10) |>
kableExtra::column_spec(1:3, background = 'white', include_thead = T)
temp_data$timestamp <- as.POSIXct(temp_data$date_time,
format = "%m/%d/%Y %H:%M",
tz = "America/New_York")
head(temp_data)
temp_data$date <- format(temp_data$timestamp, "%Y%m%d")
temp_data$month <- format(temp_data$timestamp, "%b")
temp_data$time <- format(temp_data$timestamp, "%H:%M") # 24-hour clock; %I (12-hour) is ambiguous without %p
temp_data$hour <- as.numeric(format(temp_data$timestamp, "%H")) # hour of day, 0-23
head(temp_data)
temp_data$month_num <- as.numeric(format(temp_data$timestamp, "%m"))
head(temp_data)
temp_data$julian <- as.numeric(format(temp_data$timestamp, "%j"))
head(temp_data)
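# A minimal sketch, not part of the original training code, of one thing the
# derived date columns make possible: summarizing logger readings by day.
# The toy_temp values are invented for illustration and are not HOBO data.
toy_temp <- data.frame(
  date_time = c("03/12/2026 01:00", "03/12/2026 13:00",
                "03/13/2026 01:00", "03/13/2026 13:00"),
  tempF = c(40, 50, 42, 54))
toy_temp$timestamp <- as.POSIXct(toy_temp$date_time,
                                 format = "%m/%d/%Y %H:%M",
                                 tz = "America/New_York")
# keep the date in the logger's time zone rather than UTC
toy_temp$date <- as.Date(toy_temp$timestamp, tz = "America/New_York")
# mean temperature for each date
toy_daily <- aggregate(tempF ~ date, data = toy_temp, FUN = mean)
toy_daily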
#----------------------------------------------
# Day 2: Data Viz. Best Practices Code
#----------------------------------------------
library(knitr)
library(kableExtra)
library(tidyr) # for pivot_longer()
library(ggplot2) # for the revenue heat map
covid_numbers <- read.csv("./data/covid_numbers.csv")
head(covid_numbers, 7) |>
knitr::kable(align = "c", caption = "<h6><b>Table 1.</b> Daily Covid cases and population numbers by state (only showing first 7 records)</h6>") |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 12) |>
kableExtra::column_spec(1:4, background = 'white', include_thead = T)
acme_in <- read.csv("./data/acme_sales.csv") |>
dplyr::arrange(category, product)
acme_in |>
knitr::kable(align = "c", caption = "<h6><b>Table 2. </b>Average monthly revenue (in $1000's) from Acme product sales, 1950 - 2020</h6>") |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 12) |>
kableExtra::column_spec(1:14, background = 'white', include_thead = T)
acme <- acme_in |>
pivot_longer(-c(category, product), names_to = "month", values_to = "revenue")
acme$month <- factor(acme$month, levels = month.abb)
ggplot(acme, aes(x=month, y=product, fill=revenue)) +
geom_raster() +
geom_text(aes(label=revenue, color = revenue > 1250)) + # color of text conditional on revenue relative to 1250
scale_color_manual(guide = "none", values = c("black", "white")) + # set color of text
scale_fill_viridis_c(direction = -1, name = "Monthly revenue,\nin $1000's") +
scale_y_discrete(limits=rev) + # reverses order of y-axis bc ggplot reverses it from the data
labs(#title = "Average monthly revenue (in $1000's) from Acme product sales, 1950 - 2020",
x = "Month", y = "Product") +
theme_bw(base_size = 11) +
facet_grid(rows = vars(category), scales = "free") # set scales to free so each facet only shows its own levels
ansc <- anscombe |>
dplyr::select(x1, y1, x2, y2, x3, y3, x4, y4)
ansc |>
knitr::kable(align = "c", caption = "<h6><b>Table 3.</b> Anscombe's Quartet - Four bivariate datasets with identical summary statistics</h6>") |>
kableExtra::column_spec (c(2,4,6),border_left = F, border_right = T) |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 12) |>
kableExtra::column_spec(1:8, background = 'white', include_thead = T)
sapply(ansc, function(x) c(mean=round(mean(x), 2), var=round(var(x), 2))) |>
knitr::kable(align = "c", caption = "<h6><b>Table 4. </b>Means and variances are identical in the four datasets. The correlation between x and y (r = 0.82) is also identical across the datasets.</h6>") |>
kableExtra::column_spec (c(1,3,5,7), border_left = F, border_right = T) |>
kableExtra::kable_styling(full_width = F, html_font = 'Arial', font_size = 12) |>
kableExtra::column_spec(1:9, background = 'white', include_thead = T)
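# A minimal sketch, not part of the original training code, that computes the
# correlations quoted in the Table 4 caption directly from the built-in
# anscombe dataset. The object name ansc_cor is made up for illustration.
ansc_cor <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
names(ansc_cor) <- paste0("dataset_", 1:4)
round(ansc_cor, 2) # one correlation per x-y pair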
#------------------------------------
# Day 2: Intro to ggplot Code
#------------------------------------
knitr::opts_chunk$set(warning=FALSE, message=FALSE, fig.align = 'center', fig.height = 4, fig.width = 6)
library(ggplot2) # load ggplot
library(dplyr)
pcov <- read.csv("./data/SHIHAR_photoplot_cover.csv") # import data
cols <- c("ASCNOD" = "#bcb02f", "BARSPP" = "#CAC7B6",
"NONCOR" = "#430816", "FUCSPP" = "#65651a",
"MUSSPP" = "#170461", "REDGRP" = "#9e224d")
shps <- c("ASCNOD" = 23, "BARSPP" = 24, "NONCOR" = 23,
"FUCSPP" = 25, "MUSSPP" = 23, "REDGRP" = 25)
xrange <- range(pcov$Year)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
#geom_smooth(se = F, span = 0.75) +
#geom_line(linewidth = 0.6) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
# load package
library(ggplot2)
# import data
pcov <- read.csv("./data/SHIHAR_photoplot_cover.csv")
# check out the data
head(pcov)
p <- ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode,
fill = CoverCode,
shape = CoverCode))
p
p2 <- p +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover)) +
geom_point()
p2
p3 <- p2 +
scale_fill_manual(values = c("ASCNOD" = "#bcb02f", "BARSPP" = "#CAC7B6",
"NONCOR" = "#420816", "FUCSPP" = "#646519",
"MUSSPP" = "#170461", "REDGRP" = "#9e224d")) +
scale_color_manual(values = c("ASCNOD" = "#bcb02f", "BARSPP" = "#CAC7B6",
"NONCOR" = "#420816", "FUCSPP" = "#646519",
"MUSSPP" = "#170461", "REDGRP" = "#9e224d")) +
scale_shape_manual(values = c("ASCNOD" = 23, "BARSPP" = 24, "NONCOR" = 23,
"FUCSPP" = 25, "MUSSPP" = 23, "REDGRP" = 25))
p3
cols <- c("ASCNOD" = "#bcb02f", "BARSPP" = "#CAC7B6",
"NONCOR" = "#420816", "FUCSPP" = "#646519",
"MUSSPP" = "#170461", "REDGRP" = "#9e224d")
shps <- c("ASCNOD" = 23, "BARSPP" = 24, "NONCOR" = 23,
"FUCSPP" = 25, "MUSSPP" = 23, "REDGRP" = 25)
p3 <- p2 +
geom_point(color = 'dimgrey', size = 2.5) + # setting point outline to dark grey
scale_fill_manual(values = cols, name = "Species Group") +
scale_color_manual(values = cols, name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group")
p3
# Determine year range (so not hard coded/easily updated in future)
xrange <- range(pcov$Year)
p4 <- p3 +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover")
p4
p5 <- p4 +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1), # angle x text
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),# makes background white
legend.key = element_blank()) # removes square fill around symbols in legend
p5
p6 <- p5 + facet_wrap(~CommunityType)
p6
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggsave("SHIHAR_photoplot_cover.svg", height = 8, width = 7) # saves the last plot; writing .svg requires the svglite package
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 1.5) + # changed this line
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 1.2) + # changed this line
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = "Year", y = "Avg Percent Cover") + # changed this line
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
p6 + geom_line(linewidth = 0.8, aes(group = CoverCode))
p6 + geom_smooth(se = F, span = 0.75)
p6 + geom_smooth(method = 'lm', se = F, linetype = 'dashed')
p6 + geom_hline(aes(yintercept = 50), linetype = "dashed")
p6 + geom_hline(aes(yintercept = 50, linetype = "50% line")) +
scale_linetype_manual(values = c("50% line" = "dashed"),
name = "Threshold")
p6 + theme(legend.position = 'bottom')
p6 + theme(legend.title = element_text(size = 12, face = 'bold'),
legend.text = element_text(size = 11))
p6 + theme(strip.background = element_rect(fill = "#F5F0DC", color = "black"),
strip.text = element_text(size = 10))
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_bar(stat = 'identity', position = 'fill', color = 'dimgrey') +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Proportion of Avg. Cover") + # position = 'fill' rescales bars to proportions
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov |> filter(CommunityType == "Barnacle"), # note filter
aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_bar(stat = 'identity', position = 'dodge', color = 'dimgrey') + # new line
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover), linewidth = 0.6) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CoverCode) # Different facet
# load packages
library(ggplot2)
library(patchwork) # multipanel plots
# load data
chem <- read.csv("./data/ACAD_Jordan_Pond_water_chem.csv")
# make date field a date
chem$date <- as.Date(chem$date, format = "%Y-%m-%d")
# pH plot
p_pH <-
ggplot(chem, aes(x = date, y = pH)) +
theme_bw() +
geom_smooth(se = F, span = 0.5) +
geom_point(color = "dimgrey", alpha = 0.5, size = 2) +
labs(y = "pH", x = "Year") +
scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
# temp plot
p_temp <-
ggplot(chem, aes(x = date, y = Temp_F)) +
theme_bw() +
geom_smooth(se = F, span = 0.5) +
geom_point(color = "dimgrey", alpha = 0.5, size = 2) +
labs(y = "Temp (F)", x = "Year") +
scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
# Diss. Oxygen plot
p_do <-
ggplot(chem, aes(x = date, y = DO_mgL)) +
theme_bw() +
geom_smooth(se = F, span = 0.5) +
geom_point(color = "dimgrey", alpha = 0.5, size = 2) +
labs(y = "DO (mg/L)", x = "Year") +
scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
# Conductance plot
p_cond <-
ggplot(chem, aes(x = date, y = SpCond_uScm)) +
theme_bw() +
geom_smooth(se = F, span = 0.5) +
geom_point(color = "dimgrey", alpha = 0.5, size = 2) +
labs(y = "Spec. Cond. (uScm)", x = "Year")+
scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
p_pH + p_temp + p_do + p_cond
library(patchwork)
p_pH / p_temp / p_do / p_cond + plot_layout(axes = "collect_x")
#------------------------------------
# Day 3: ggplot Palettes Code
#------------------------------------
library(RColorBrewer) # for display.brewer.all()
library(viridis) # for viridis()
display.brewer.all(colorblindFriendly = TRUE)
p_pal <- ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = "Year", y = "Avg Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
p_pal + scale_color_brewer(name = "Species Group", palette = "Set2",
aesthetics = c("fill", "color"))
p_pal + scale_color_brewer(name = "Species Group", palette = "Dark2",
aesthetics = c("fill", "color"))
p_pal + scale_color_brewer(name = "Species Group", palette = "RdYlBu",
aesthetics = c("fill", "color"))
# viridis
scales::show_col(viridis(12), cex_label = 0.45, ncol = 6)
p_pal + scale_color_viridis_d(name = "Species Group", aesthetics = c("fill", "color")) #default viridis
p_pal + scale_color_viridis_d(name = "Species Group", aesthetics = c("fill", "color"), option = 'turbo')
p_heat <-
ggplot(chem, aes(x = mon, y = year, color = Temp_F, fill = Temp_F)) +
theme_bw() +
geom_tile() +
labs(y = "Year", x = "Month") +
scale_x_continuous(breaks = c(5, 6, 7, 8, 9, 10),
limits = c(4, 11),
labels = month.abb[5:10]) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
p_heat + scale_color_viridis_c(name = "Temp. (F)", aesthetics = c("fill", "color"))
p_heat + scale_color_viridis_c(name = "Temp. (F)", aesthetics = c("fill", "color"),
option = "plasma", direction = -1)
p_heat + scale_color_gradient(low = "#FCFC9A", high = "#F54927",
aesthetics = c("fill", 'color'),
name = "Temp. (F)")
p_heat + scale_color_gradient2(low = "navy", mid = "#FCFC9A", high = "#F54927",
aesthetics = c("fill", 'color'),
midpoint = mean(chem$Temp_F),
name = "Temp. (F)")
p_heat + scale_color_gradientn(colors = c("#805A91", "#406AC2", "#FBFFAD", "#FFA34A", "#AB1F1F"),
aesthetics = c("fill", 'color'),
guide = "legend",
breaks = c(seq(40, 85, 5)),
name = "Temp. (F)")
p_heat + scale_color_gradient2(low = "#3E693D", mid = "#FDFFC7", high = "#7A6646",
aesthetics = c("fill", 'color'),
midpoint = mean(chem$Temp_F),
name = "Temp. (F)")
library(tidyverse)
library(readxl)
ctd_mma <- read_xlsx("./data/PR_PF_2903444 (2).xlsx") |> data.frame()
ggplot(ctd_mma, aes(x = `TEMP..degree_Celsius.`,
y = `PRES..decibar.`,
group = Station,
color = Station)) +
geom_line() +
theme_bw() +
labs(x = "Temp. (C)", y = "Pressure (dbars)") +
scale_color_gradientn(colors = c("#805A91", "#406AC2", "#FBFFAD", "#FFA34A", "#AB1F1F"),
aesthetics = c('color'),
guide = "legend", # makes legend distinct, rather than color band
breaks = 1:15, # number of stations
name = "Station ID") +
scale_y_reverse() + # flip y axis
scale_x_continuous(limits = c(0, 30),
breaks = seq(0, 30, 5),
position = 'top') + # plot x-axis on top
theme(legend.position = 'bottom')
library(dplyr)
library(ggplot2)
# install.packages('car') # uncomment and run if you don't have this package installed
library(car) # for levene's test
# import data
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
head(motinv)
# prep data for analysis
motinv_final <- motinv |>
mutate(Damage = as.numeric(replace(Damage, Damage == "PM", NA)), # Fix Damage PM
SitePlot = paste(SiteCode, PlotName, sep = "-"), # create new SitePlot column
Date = as.Date(StartDate, format = "%m/%d/%Y"), # create new Date column
No.Damage_fix = replace(No.Damage, No.Damage == 1960, 196),
total_count = Damage + No.Damage,
total_count_fix = Damage + No.Damage_fix, # fix error in No.Damage
year_st = Year - 2012) |> # set start year to 1 instead of 2013 for better interpretation
filter(QAQC == FALSE) |> # drop QAQC visits
arrange(SitePlot, Year, ScientificName) # optional sorting the data
# summarize counts, so 1 count per year, species and community type
motinv_sum <- motinv_final |> summarize(mean_count = mean(total_count, na.rm = TRUE),
mean_count_fix = mean(total_count_fix, na.rm = T),
.by = c(SiteCode, year_st, CommunityType, SpeciesCode))
# prep for linear regression
motinv_reg <- motinv_sum |> filter(SpeciesCode == "LITLIT" & CommunityType == "Red Algae")
head(motinv_reg)
# prep for analysis of variance
motinv_aov <- motinv_sum |> filter(SpeciesCode == "LITLIT") |>
mutate(ComCode = toupper(substr(CommunityType, 1, 3))) # create community code for easier plotting
head(motinv_aov)
lm_mod <- lm(mean_count ~ year_st, data = motinv_reg)
par(mfrow = c(2,2)) # makes diagnostic plots 2 x 2 grid
plot(lm_mod)
par(mfrow = c(1,1)) # resets to 1 plot
hist(resid(lm_mod))
# detect outliers as > 2 SD of residuals
outliers <- which(abs(resid(lm_mod)) > 2 * sd(resid(lm_mod)))
# Highlight the outliers in a scatterplot
plot(mean_count ~ year_st, data = motinv_reg)
points(motinv_reg$year_st[outliers], motinv_reg$mean_count[outliers], col = "red", pch = 19)
lm_mod_fix <- lm(mean_count_fix ~ year_st, data = motinv_reg)
par(mfrow = c(2,2)) # makes diagnostic plots 2 x 2 grid
plot(lm_mod_fix)
par(mfrow = c(1,1)) # resets to 1 plot
hist(resid(lm_mod_fix))
summary(lm_mod_fix)
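# A minimal sketch, not part of the original training code, of pulling the
# slope and its p-value out of a fitted model as numbers rather than reading
# them off the printed summary. The built-in cars dataset is a stand-in so the
# example runs without the motinv data; all toy_ names are hypothetical.
toy_mod <- lm(dist ~ speed, data = cars)
toy_coefs <- coef(summary(toy_mod)) # matrix of estimates, SEs, t values, p-values
toy_slope <- toy_coefs["speed", "Estimate"]
toy_pval <- toy_coefs["speed", "Pr(>|t|)"]
toy_slope
toy_pval
# The same indexing should work on lm_mod_fix with "year_st" as the row name.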
ggplot(data = motinv_reg, aes(x = year_st, y = mean_count_fix)) +
geom_point() +
geom_smooth(method = 'lm') +
scale_x_continuous(breaks = seq(1, 13, 2),
labels = seq(1, 13, 2) + 2012) +
labs(x = "Year", y = "Mean common periwinkle count") +
theme_bw()
aov_mod <- aov(mean_count_fix ~ ComCode, data = motinv_aov)
par(mfrow = c(2,2)) # makes diagnostic plots 2 x 2 grid
plot(aov_mod)
par(mfrow = c(1,1)) # resets to 1 plot
hist(resid(aov_mod))
library(car)
leveneTest(aov_mod)
shapiro.test(rstandard(aov_mod))
summary(aov_mod)
TukeyHSD(aov_mod, conf.level = 0.95)
plot(TukeyHSD(aov_mod, conf.level = 0.95), las = 2)
# reorder community by elevation
motinv_aov$ComCode_fac <- factor(motinv_aov$ComCode, levels = c("BAR", "ASC", "FUC", "RED"))
ggplot(data = motinv_aov, aes(x = ComCode_fac, y = mean_count_fix)) +
stat_summary(geom = 'bar', fun.data = mean_se, fill = 'grey', color = 'dimgrey') +
stat_summary(geom = 'errorbar', fun.data = mean_se, color = 'dimgrey', width = 0.3) +
labs(x = "Community Type", y = "Mean common periwinkle count") +
annotate("text", x = 1, y = 30, label = "AB", size = 5) + # annotate() draws each label once, not once per row
annotate("text", x = 2, y = 70, label = "A", size = 5) +
annotate("text", x = 3, y = 125, label = "B", size = 5) +
annotate("text", x = 4, y = 118, label = "B", size = 5) +
theme_bw()
#------------------------------------
# Day 3: Best Practices Code
#------------------------------------
# libraries
library(dplyr) # for mutate and filter
# parameters
analysis_year <- 2023
# import data set
photo_dat <- read.csv("./data/SHIHAR_photoplot_cover.csv")
# Filtering on Barnacle community type and analysis year
photo_dat2 <- photo_dat |> filter(CommunityType == "Barnacle") |>
filter(Year == analysis_year)
snake_case # most common in R
camelCase # capitalize new words after the first
period.separation # separate words by periods
whyWOULDyouDOthisTOsomeone # excess capitalization is a pain
# good word order
ACAD_rocky <- data.frame(year = 2020:2025, plot = 1:6)
ACAD_rocky2 <- ACAD_rocky |> filter(year > 2020)
ACAD_rocky3 <- ACAD_rocky2 |> mutate(plot_type = "vital signs")
# bad word order
rocky_ACAD <- data.frame(year = 2020:2025, plot = 1:6)
ACAD_after_2020 <- rocky_ACAD |> filter(year > 2020)
vital_ACAD_2020 <- ACAD_after_2020 |> mutate(plot_type = "vital signs")
# super long names
ACAD_rocky_intertidal_sampling_data <- data.frame(years_plots_were_sampled = c(2020:2025), wetland_plots_sampled = c(1:6))
ACAD_rocky_intertidal_sampling_data2 <- ACAD_rocky_intertidal_sampling_data |> filter(years_plots_were_sampled > 2020)
# shorter still meaningful
ACAD_rocky <- data.frame(year = 2020:2025, plot = 1:6)
ACAD_rocky2 <- ACAD_rocky |> filter(year > 2020)
# Good code
trees_final <- trees |>
mutate(DecayClassCode_num = as.numeric(DecayClassCode),
Plot_Name = paste(ParkUnit, PlotCode, sep = "-"),
Date = as.Date(SampleDate, format = "%m/%d/%Y")) |>
rename("Species" = "ScientificName") |>
filter(IsQAQC == FALSE) |>
select(-DecayClassCode) |>
arrange(Plot_Name, TagCode)
# Same code, but much harder to follow
trees_final <- trees|>mutate(DecayClassCode_num=as.numeric(DecayClassCode), Plot_Name=paste(ParkUnit,PlotCode,sep = "-"), Date=as.Date(SampleDate,format="%m/%d/%Y"))|> rename("Species"="ScientificName")|>filter(IsQAQC==FALSE)|>select(-DecayClassCode)|>arrange(Plot_Name,TagCode)
# Good code
ggplot(data = visits, aes(x = Year, y = Annual_Visits/1000)) +
geom_line() +
geom_point(color = "black", fill = "#82C2a3", size = 2.5, shape = 24) +
labs(x = "Year",
y = "Annual visitors in 1000's") +
scale_y_continuous(limits = c(2000, 4500),
breaks = seq(2000, 4500, by = 500)) +
scale_x_continuous(limits = c(1994, 2024),
breaks = c(seq(1994, 2024, by = 5))) +
theme(axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
title = element_text(size = 10)
)
# Same code but hard to follow
ggplot(data=visits,aes(x=Year,y=Annual_Visits/1000))+geom_line()+geom_point(color="black",fill="#82C2a3",size=2.5,shape=24) +
labs(x = "Year", y = "Annual visitors in 1000's")+
scale_y_continuous(limits=c(2000,4500),breaks=seq(2000,4500,by=500))+
scale_x_continuous(limits=c(1994,2024),breaks=c(seq(1994,2024,by=5)))+
theme(axis.text.x=element_text(size=10,angle=45,hjust=1), panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),panel.background=element_rect(fill='white',color='dimgrey'),
title = element_text(size = 10))
#------------------------------------
# Day 1: Challenges Code
#------------------------------------
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
motinv[c(2, 4, 6, 8), c(1, 2)]
names(motinv) # get the names of the first 2 columns
motinv[c(2, 4, 6, 8), c("Network", "UnitCode")]
# Option 1
length(unique(motinv[, "ScientificName"])) # 6
# Option 2
length(unique(motinv$ScientificName)) # equivalent
# Option 1 - use unique() to return just the unique years
unique(motinv$Year[motinv$QAQC == TRUE]) # 2013
# Option 2
unique(motinv[motinv$QAQC == TRUE, "Year"])
# with brackets
A1_2024 <- motinv[motinv$PlotName == "A1" & motinv$Year == 2024, ]
nrow(A1_2024) # 3
# with base R subset
A1_2024b <- subset(motinv, PlotName == "A1" & Year == 2024)
nrow(A1_2024b) # 3
# Option 1: with brackets
gcrab <- motinv[motinv$ScientificName == "Carcinus maenas",]
sort(unique(gcrab$Year)) #2019, 2021, 2022, 2023, 2024
# Option 2: with base R subset
gcrab2 <- subset(motinv, ScientificName == "Carcinus maenas")
table(gcrab2$Year)
View(motinv)
max_nd <- max(motinv$No.Damage, na.rm = TRUE)
motinv[motinv$No.Damage == max_nd,]
# create copy of motinv data
motinv_fix <- motinv
# find the problematic value, and change it to 196
motinv_fix$No.Damage[motinv_fix$Year == 2019 &
motinv_fix$PlotName == "R4" &
motinv_fix$No.Damage == 1960] <- 196
# check your work
range(motinv$No.Damage) # 0 1960
range(motinv_fix$No.Damage) # 0 282
hist(as.numeric(motinv$Damage))
#------------------------------------
# Day 2: Challenges Code
#------------------------------------
library(dplyr)
library(tidyr) # for pivot_wider() and pivot_longer()
#--- Point intercept data ---
pi_dat <- read.csv("./data/BASHAR_Point_Intercept_data.csv")
#--- Motile invert count ---
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
#--- Motile invert site ---
motspp <- read.csv("./data/motile_invert_species_table.csv")
#--- hobo temp data ---
temp_data <- read.csv("./data/HOBO_temp_example.csv", skip = 1)[,1:3]
colnames(temp_data) <- c("index", "date_time", "tempF")
# with dplyr filter
A1_2024 <- motinv |> filter(PlotName == "A1" & Year == 2024)
nrow(A1_2024) # 3
gcrab <- motinv |> filter(ScientificName == "Carcinus maenas") |>
select(Year) |> unique()
gcrab
max_nd <- max(motinv$No.Damage, na.rm = TRUE)
motinv |> filter(No.Damage == max_nd)
# dplyr approach
motinv_fix <- motinv |> mutate(No.Damage = replace(No.Damage, No.Damage == 1960, 196))
range(motinv$No.Damage)
range(motinv_fix$No.Damage)
pred <- c("CARMAE", "NUCLAP")
# base R
motinv$trophic <- ifelse(motinv$SpeciesCode %in% pred, "predator", "herbivore")
table(motinv$trophic, motinv$SpeciesCode)
# tidyverse
motinv <- motinv |> mutate(trophic = ifelse(SpeciesCode %in% pred, "predator", "herbivore"))
table(motinv$trophic, motinv$SpeciesCode)
# Base R using a nested ifelse()
motinv$count_level <-
ifelse(motinv$No.Damage > 35, "High",
ifelse(motinv$No.Damage >= 10 & motinv$No.Damage <= 35, "Medium", "Low"))
table(motinv$count_level) # check that it worked
# Tidyverse using case_when() and between()
motinv <- motinv |> mutate(count_level = case_when(No.Damage > 35 ~ "High",
between(No.Damage, 10, 35) ~ "Medium",
No.Damage < 10 ~ "Low"))
table(motinv$count_level) # check that it worked
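# A minimal sketch, not part of the original training code, of a third way to
# bin counts: base R cut(). It assumes counts are whole numbers, so a break at
# 9 separates "Low" (< 10) from "Medium" (10 to 35). The toy_ objects are made
# up for illustration and sit on the bin edges to test them.
toy_counts <- c(0, 9, 10, 35, 36)
toy_ifelse <- ifelse(toy_counts > 35, "High",
                     ifelse(toy_counts >= 10, "Medium", "Low"))
toy_cut <- as.character(cut(toy_counts,
                            breaks = c(-Inf, 9, 35, Inf), # bins are (lower, upper]
                            labels = c("Low", "Medium", "High")))
identical(toy_ifelse, toy_cut) # check that both approaches agree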
pi_nonveg <- pi_dat |> filter(CoverCode %in% c("BOLT", "ROCK", "WATER")) |> # filter nonveg grps
summarize(avg_freq = mean(pct_freq), # calc avg.
.by = c(SiteCode, Year, CoverCode, CoverType)) # grouping variables
head(pi_nonveg) # check output
pi_subtype <- pi_dat |>
mutate(sub_type = ifelse(CoverCode %in% c("BOLT", "ROCK", "WATER"), "nonveg", "veg")) |> # filter nonveg grps
summarize(avg_freq = mean(pct_freq), # calc avg.
.by = c(SiteCode, Year, sub_type)) |> # grouping variables
arrange(SiteCode, Year, sub_type) # sort variables
head(pi_subtype) # check output
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196), # the typo is in No.Damage
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
motinv_wide <- motinv_sum |>
arrange(SpeciesCode) |> # sorting so columns are alphabetical
select(-ScientificName, -CommonName, -se_counts) |> # drop se_counts so each group collapses to one row
pivot_wider(names_from = SpeciesCode,
values_from = mean_count,
values_fill = 0)
head(motinv_wide)
# Fix the data issues again
motinv <- motinv |>
mutate(NoDamage_fix = replace(No.Damage, No.Damage == 1960, 196), # the typo is in No.Damage
Damage_fix = as.numeric(replace(Damage, Damage == "PM", NA)),
total_count = NoDamage_fix + Damage_fix) |>
filter(QAQC == FALSE)
# Summarize the mean count per plot of each species by year and community type
motinv_sum <- motinv |>
summarize(mean_count = sum(total_count)/5, # 5 plots per site
se_counts = sd(total_count)/sqrt(5), # 5 plots per site
.by = c(SiteCode, Year, CommunityType,
ScientificName, CommonName, SpeciesCode))
motinv_wide_yr <- motinv_sum |>
arrange(Year) |> # sorting so year columns are in chronological order
select(-se_counts) |>
pivot_wider(names_from = Year,
values_from = mean_count,
values_fill = 0,
names_prefix = "yr_")
head(motinv_wide_yr)
motinv_long_yr <- pivot_longer(motinv_wide_yr,
cols = -c(SiteCode, CommunityType, ScientificName,
CommonName, SpeciesCode),
names_to = "Year",
values_to = "mean_counts",
names_prefix = "yr_") # drops this string from values
# Read in motinv data if you haven't yet
motinv <- read.csv("./data/BASHAR_motile_invert_counts.csv")
head(motinv)
# Read in species table
motspp <- read.csv("./data/motile_invert_species_table.csv")
head(motspp)
intersect(names(motinv), names(motspp)) # 3 columns in common
# left join species to motinv because we don't want to include species not found in the count data
motinv_spp <- left_join(motinv,
motspp,
by = c("SpeciesCode", "ScientificName", "CommonName"))
head(motinv_spp)
# anti join returns species in the species table that have no records in the count data
anti_join(motspp, motinv, by = c("SpeciesCode", "ScientificName", "CommonName"))
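# A minimal sketch, not part of the original training code, of the same two
# joins in base R, which can help show what left_join() and anti_join() are
# doing. The toy tables and species codes below are invented for illustration.
toy_counts_df <- data.frame(SpeciesCode = c("AAA", "BBB", "AAA"),
                            count = c(3, 5, 2))
toy_spp_df <- data.frame(SpeciesCode = c("AAA", "BBB", "CCC"),
                         CommonName = c("species a", "species b", "species c"))
# left join: keep every count record and add matching species columns
toy_left <- merge(toy_counts_df, toy_spp_df, by = "SpeciesCode", all.x = TRUE)
toy_left
# anti join: species in the lookup table that never appear in the counts
toy_anti <- toy_spp_df[!toy_spp_df$SpeciesCode %in% toy_counts_df$SpeciesCode, ]
toy_anti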
# Create date1 if you don't have it already
date1 <- as.Date("3/12/2026", format = "%m/%d/%Y")
format(date1, format = "%Y%m%d")
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")
seq.Date(date_list[1], date_list[2], by = "3 months")
date_list <- as.Date(c("01/01/2026", "12/31/2026"), format = "%m/%d/%Y")
seq.Date(date_list[1], date_list[2], by = "1 week")
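# Create timestamp if you don't have it already
temp_data$timestamp <- as.POSIXct(temp_data$date_time,
format = "%m/%d/%Y %H:%M",
tz = "America/New_York")
temp_data$month_num <- as.numeric(format(temp_data$timestamp, "%m"))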
head(temp_data)
temp_data$julian <- as.numeric(format(temp_data$timestamp, "%j"))
head(temp_data)
#------------------------------------
# Day 3: Challenges Code
#------------------------------------
# packages
library(dplyr)
library(ggplot2)
library(patchwork) # for arranging ggplot objects
library(RColorBrewer) # for palettes
library(viridis) # for palettes
# load data
pcov <- read.csv("./data/SHIHAR_photoplot_cover.csv") # import data
# define color and shape objects
cols <- c("ASCNOD" = "#C5B47B", "BARSPP" = "#A9A9A9",
"NONCOR" = "#574F91", "FUCSPP" = "#FFD560",
"MUSSPP" = "#6F88BF", "REDGRP" = "#FF4C53")
shps <- c("ASCNOD" = 23, "BARSPP" = 24, "NONCOR" = 23,
"FUCSPP" = 25, "MUSSPP" = 23, "REDGRP" = 25)
# Set x axis range
xrange <- range(pcov$Year)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 1.5) + # changed this line
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 1.2) + # changed this line
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = "Year", y = "Avg Percent Cover") + # changed this line
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
p6 <- ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_fill_manual(values = cols, aesthetics = c("fill", "color"),
name = "Species Group") +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = NULL, y = "Avg. Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(), # turns off major grids
panel.grid.minor = element_blank(), # turns off minor grids
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
p6 + geom_smooth(se = FALSE, span = 0.75)
p6 + geom_smooth(method = 'lm', se = FALSE, linetype = 'dashed')
p6 + theme(legend.title = element_text(size = 12, face = 'bold'),
legend.text = element_text(size = 11))
p_pal <- ggplot(data = pcov, aes(x = Year, y = avg_cover,
color = CoverCode, group = CoverCode,
fill = CoverCode, shape = CoverCode)) +
geom_errorbar(aes(ymin = avg_cover - se_cover, ymax = avg_cover + se_cover),
linewidth = 0.6) +
geom_point(color = "dimgrey", size = 2.5) +
scale_shape_manual(values = shps, name = "Species Group") +
scale_x_continuous(limits = c(xrange[1] - 1, xrange[2] + 1),
breaks = c(seq(xrange[1] - 1, xrange[2] + 1, 2))) +
labs(x = "Year", y = "Avg Percent Cover") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 0),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = 'white', color = 'dimgrey'),
legend.key = element_blank()) +
facet_wrap(~CommunityType)
p_pal + scale_color_brewer(name = "Species Group", palette = "RdYlBu",
aesthetics = c("fill", "color"))
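# Sketch for previewing the hex codes that a brewer palette supplies, assuming the
# RColorBrewer package is installed. toy_pal is a hypothetical name; 6 matches the
# number of species groups plotted above.
toy_pal <- RColorBrewer::brewer.pal(n = 6, name = "RdYlBu")
toy_pal # character vector of 6 hex color codes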
# Create p_heat for 'palettes' section
p_heat <-
ggplot(chem, aes(x = mon, y = year, color = Temp_F, fill = Temp_F)) +
theme_bw() +
geom_tile() +
labs(y = "Year", x = "Month") +
scale_x_continuous(breaks = c(5, 6, 7, 8, 9, 10),
limits = c(4, 11),
labels = month.abb[5:10]) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
p_heat + scale_color_gradient2(low = "#3E693D", mid = "#FDFFC7", high = "#7A6646",
aesthetics = c("fill", 'color'),
midpoint = mean(chem$Temp_F),
name = "Temp. (F)")
?plot
?dplyr::filter
mean_x <- mean(c(1, 3, 5, 7, 8, 21) # missing closing parenthesis
mean_x <- mean(c(1, 3, 5, 7, 8, 21)) # correct
birds <- c("black-capped chickadee", "golden-crowned kinglet, "wood thrush") # missing quote after kinglet
birds <- c("black-capped chickadee", "golden-crowned kinglet", "wood thrush") # corrected
birds <- c("black-capped chickadee", "golden-crowned kinglet" "wood thrush") # missing comma after kinglet
birds <- c("black-capped chickadee", "golden-crowned kinglet", "wood thrush") # corrected
x_mean <- maen(x) # misspelled mean
x_mean <- mean(x) # Corrected
# Missing comma to indicate subsetting rows (records)
motinv <- motinv[!is.na(motinv$SiteCode)]
# Correct
motinv <- motinv[!is.na(motinv$SiteCode),]
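# Sketch of why the comma matters, using a made-up data frame (toy_df is hypothetical).
# With the comma, the logical vector is applied to rows and all columns are kept.
toy_df <- data.frame(SiteCode = c("A", NA, "B"), Count = c(1, 2, 3))
toy_rows <- toy_df[!is.na(toy_df$SiteCode), ]
nrow(toy_rows) # 2 rows remain; the row with a missing SiteCode is dropped
# Without the comma, R treats the logical vector as a column selector instead.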