Using the waterGLKN R package

Getting started

Installation

Step 1. Install R, RStudio, and RTools44 in Software Center

Note that RTools is needed to install R packages from GitHub, and RTools44 only works with R 4.4.x. While R 4.5 is available in Software Center, the matching RTools45 isn’t available there yet. Until that changes, link RStudio to the latest version of R 4.4 (I’m currently using R 4.4.3).

Troubleshooting Build Tools

Unfortunately, Software Center installs RTools44 in C:/Program Files/ rather than C:/, which is where RStudio looks for it by default. The following code helps RStudio find RTools. You may occasionally have to rerun this code (except for the usethis line), so keep it handy. You’ll know it’s time to rerun it when you try to rebuild a package and a window pops up asking if you want to install missing build files.

RTools step 1. Install the usethis package if you don’t already have it installed.

install.packages('usethis')

RTools step 2. Open the .Renviron file by running the line of code below.

usethis::edit_r_environ()

RTools step 3. Copy the following text and paste it into the .Renviron file, save, then close/reopen RStudio. You don’t need to run anything for this step. Copy/paste are all you need to do.

Sys.setenv(PATH = paste("C:\\PROGRA~1\\Rtools44\\bin", Sys.getenv("PATH"), sep=";"))
Sys.setenv(BINPREF = "C:\\PROGRA~1\\Rtools44\\mingw_$(WIN)\\bin\\")
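Whichever way the two lines above get applied, it can be reassuring to confirm that R actually sees the RTools folder. Keep in mind that .Renviron normally holds plain NAME=value lines rather than R code, so if nothing seems to change after restarting, running the two Sys.setenv() lines in the console is another thing to try. The sketch below uses only base R. dir_on_path() is a hypothetical helper written for this example, not part of waterGLKN or usethis.

```r
# Hypothetical helper: is a folder listed on the PATH that R sees?
dir_on_path <- function(dir, path = Sys.getenv("PATH"), sep = .Platform$path.sep) {
  entries <- strsplit(path, sep, fixed = TRUE)[[1]]
  # Compare with forward slashes, no trailing slash, and ignoring case
  norm <- function(x) tolower(sub("/+$", "", gsub("\\", "/", x, fixed = TRUE)))
  norm(dir) %in% norm(entries)
}

# TRUE if the RTools44 bin folder from the step above is on the PATH
dir_on_path("C:\\PROGRA~1\\Rtools44\\bin")

# Returns the full path to make if R can find it, or "" if it can't
Sys.which("make")
```

If the first line returns FALSE or the second returns an empty string, RStudio probably still can't find the build tools.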

Step 2. Install devtools package in R:

install.packages('devtools')


Step 3. Install waterGLKN from GitHub

You must have a GitHub login and be logged into GitHub for this step to work.

Once you’ve successfully completed Steps 1-3, you should only have to repeat them if you have a new computer or if you move to R 4.5 or higher (note: R 4.5 will require installing RTools45).

Whenever the waterGLKN package is updated, you can rerun the code below to install the latest version.

library(devtools)
install_github("KateMMiller/waterGLKN")

Troubleshooting GitHub package installation

If you’re unable to install the R package via GitHub (often an error about permission being denied), download the following script from my OneDrive and open it in R: fix_TLS_inspection.R

Once this script is open in RStudio, press Control + A to select all of the code, then Control + Enter to run it. Assuming no errors are returned, you should be able to install from GitHub. Now try to reinstall waterGLKN. If you’re still running into issues, it could be that devtools is missing some package dependencies, which the error message will often mention. Install any missing packages and then try installing waterGLKN again. If you’re still not successful, send me a screenshot of the error and I’ll help you troubleshoot.


Step 4. Load waterGLKN R package

library(waterGLKN)

Step 5. Import data

Note that R is not able to connect to files on SharePoint or MS Teams (because Teams also stores all files on SharePoint). That means you need to store data package files on your local machine or on a server. By default, importData() adds the data package views (i.e., flat files) to an environment called GLKN_WQ in your workspace (visible in the Environment tab in the top right panel of RStudio). If you would rather import each view directly into your R session, set the new_env argument to FALSE (e.g., importData(new_env = F)).
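If environments are new to you, the toy sketch below shows the general idea in base R. It does not call importData(). toy_env and the two small data frames are made up for illustration, and the real GLKN_WQ environment holds the full views.

```r
# Create an environment and store two made-up "views" in it
toy_env <- new.env()
assign("Locations", data.frame(Location_ID = c("SITE_01", "SITE_02")), envir = toy_env)
assign("Results", data.frame(Location_ID = "SITE_01", value = 7.9), envir = toy_env)

# List what is stored in the environment
sort(names(toy_env))

# Pull one view out with $, the same way you would with GLKN_WQ$Results
toy_results <- toy_env$Results
head(toy_results)
```

Keeping the views inside one environment keeps your workspace tidy, while importing them individually means you can refer to each view directly by name.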

You can download GLKN water packages from DataStore using the NPSutils package, or you can go to DataStore on IRMA and download the data packages. Code to download via NPSutils is below. Note, however, that the NPSutils package has a lot of dependencies, some of which are circular, and it can take multiple tries to install. I’m working on a slimmed down package that performs the same function, but for now, I just download them from DataStore. Once the data package is on your computer, you can run the importData() function.

# Download lakes and rivers data package from DataStore
devtools::install_github("nationalparkservice/NPSutils") # might take a few tries to install
NPSutils::get_data_packages(c("2306516", "2309369"))

Option 1. Import latest Rivers data package by specifying the folder where the unzipped csvs live.

importData(type = 'csv', filepath = "../data/GLKN_water/2309369") # filepath is the path on my computer

Option 2. Import latest Lakes data package by specifying the folder where the unzipped csvs live.

importData(type = 'csv', filepath = "../data/GLKN_water/2306516") # filepath is the path on my computer

Option 3. Import both Rivers and Lakes data packages using a zipped file of each data package. When both are imported simultaneously, each view (e.g. Locations, Results, etc.) is row-bound so that a single view includes both rivers and lakes data. If the views in the Rivers and Lakes data packages ever have different columns, this approach won’t work.

river_zip <- "../data/GLKN_water/records-2309369.zip"
lake_zip <- "../data/GLKN_water/records-2306516.zip"
importData(type = 'zip', filepath = c(river_zip, lake_zip))
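To see why matching columns matter for Option 3, here is a small base R illustration of row binding. I'm assuming nothing about how importData() does this internally. The data frames and the Max_Depth_m column are made up.

```r
# Two made-up views with identical columns
rivers <- data.frame(Location_ID = c("RIV_01", "RIV_02"), SiteType = "river")
lakes  <- data.frame(Location_ID = "LAKE_01", SiteType = "lake")

# Row binding stacks them into one view
both <- rbind(rivers, lakes)
nrow(both)  # 3 rows: 2 river sites + 1 lake site

# If one view gains an extra column, base rbind() refuses to combine them
lakes_extra <- cbind(lakes, Max_Depth_m = 12)
result <- tryCatch(rbind(rivers, lakes_extra),
                   error = function(e) conditionMessage(e))
result  # the error message, returned as text
```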

Step 6. Play with the data

The functions in the waterGLKN package are designed to work with the views, and are the best way to interact with the data to query by park, site, site type, year, parameter, etc. However, if you want to view the raw data, and you imported the data into the GLKN_WQ environment, you can access them with the code below:

# See list of the views
names(GLKN_WQ)

# View one of the views
View(GLKN_WQ$Results)

# Assign a view to a data frame named res in R. Interact with res the way you would work with any normal data frame in R. 
res <- GLKN_WQ$Results

If you want to use the print_head() function that shows output in the markdown, run the code below. This makes the results print cleaner in the markdown report. For your purposes, you can just run: head(dataframe).

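print_head <- function(df){
  # Show at most the first 6 rows; seq_len() also handles empty data frames
  nrows <- min(6, nrow(df))
  knitr::kable(df[seq_len(nrows), , drop = FALSE]) |>  
    kableExtra::kable_classic(full_width = FALSE, font_size = 12, 
                              bootstrap_options = c("condensed"))
}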

Getting help

Getting (and improving) help

The functions in waterGLKN have help documentation like any R package. To view the help, you can go to the Packages tab and click on waterGLKN. That will show you all the functions in the package. Clicking on individual functions will take you to the help documentation for that function.

You can also see the help of a function by running, for example:

?importData

If waterGLKN isn’t loaded yet, you’d run:

?waterGLKN::importData

Each function’s help includes a Description, Usage (i.e. function arguments and their defaults), Argument options/definitions, and several examples showing how the function can be used.

This is where you come in! If you notice typos or can think of better descriptions, examples, error messages, etc., please send them my way! After we’re more comfortable with R packages and get versed on GitHub, you’ll be able to make those changes directly in the package. For now, you can just send me your suggestions and I’ll make the changes.

Finally, if you ever want to peek under the hood at a function, you can view its code several ways.
  1. Keep F2 key pressed and click on the function name in RStudio. This trick works for many but not all functions in R.
  2. View code in the GitHub katemmiller/waterGLKN repo. The functions are in the R folder.
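Another option that works in any R session is to type a function's name without parentheses, which prints its source. The sketch below uses stats::sd so it runs without waterGLKN installed. The commented line shows what I'd expect the equivalent to look like for a waterGLKN function.

```r
# Typing a function name without parentheses prints its code
stats::sd

# Same idea for a waterGLKN function (assumes the package is installed)
# waterGLKN::getLocations

# deparse() returns the code as text, which is handy for searching it
src <- deparse(stats::sd)
head(src)
```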

get Data functions

getLocations()

Query location-level data. This function is a good building block for other functions, but may be less helpful on its own. One helpful use, though, is to get the site codes for a given park or site.

Get location info for SACN lake sites

SACN_lakes <- getLocations(park = "SACN", site_type = "lake")
print_head(SACN_lakes)

| Org_Code | Park_Code | Location_ID | Location_Name | Location_Type | SiteType | Latitude | Longitude | State_Code | County_Code | active |
|---|---|---|---|---|---|---|---|---|---|---|
| GLKN | SACN | SACN_PACQ_SP_01 | Pacwawong Spring Pond | Lake | lake | 46.13805 | -91.34151 | WI | Sawyer | TRUE |
| GLKN | SACN | SACN_PHIP_SP_01 | Phipps Spring Pond | Lake | lake | 46.06870 | -91.40085 | WI | Sawyer | TRUE |
| GLKN | SACN | SACN_STCR_15.8 | Pool #2 on Lake St. Croix Downstream of I94 Bridge in Hudson | Lake | lake | 44.95903 | -92.76014 | WI | ST CROIX | TRUE |
| GLKN | SACN | SACN_STCR_2.0 | Pool #4 on Lake St. Croix near Prescott, WI | Lake | lake | 44.76972 | -92.80503 | WI | PIERCE | TRUE |
| GLKN | SACN | SACN_STCR_20.0 | Pool #1 of Lake St. Croix near Bayport, MN | Lake | lake | 45.01838 | -92.76500 | MN | WASHINGTON | TRUE |

Get site info for Trapper’s Lake in PIRO

PIRO2 <- getLocations(site = "PIRO_02")
print_head(PIRO2)

| Org_Code | Park_Code | Location_ID | Location_Name | Location_Type | SiteType | Latitude | Longitude | State_Code | County_Code | active |
|---|---|---|---|---|---|---|---|---|---|---|
| GLKN | PIRO | PIRO_02 | Trapper’s Lake | Lake | lake | 46.58729 | -86.31504 | MI | ALGER | TRUE |

List Location_IDs and their full names for VOYA

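# select() comes from dplyr, so call it with dplyr:: (or load dplyr first)
voya_sites <- getLocations(park = "VOYA") |> dplyr::select(Location_ID, Location_Name)
print_head(voya_sites)

| Location_ID | Location_Name |
|---|---|
| VOYA_01 | Locator Lake |
| VOYA_05 | Shoepack Lake |
| VOYA_09 | Ek Lake |
| VOYA_12 | Brown Lake |
| VOYA_14 | Peary Lake |
| VOYA_16 | Cruiser Lake |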

getResults()

This function allows you to query the Results view by park, site, site type, year, month, parameter, QC type, sample type, and sample depth type. The returned data frame is long (i.e. stacked) to facilitate data summary and plotting.
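As a quick illustration of what "long" means here, the sketch below builds a tiny made-up data frame with one row per site, date, and parameter, then summarizes it and spreads it into a wide layout with base R. The column names mimic the Results view, but the values are invented.

```r
# Made-up long data: one row per sample date x parameter
long <- data.frame(
  Location_ID = "SITE_01",
  sample_date = as.Date(c("2023-06-01", "2023-06-01", "2023-07-01", "2023-07-01")),
  param_name  = c("pH", "TempWater_C", "pH", "TempWater_C"),
  value       = c(7.9, 18.2, 8.1, 22.5)
)

# Long data makes grouped summaries easy: mean value per parameter
param_means <- aggregate(value ~ param_name, data = long, FUN = mean)
param_means

# If you need one column per parameter, spread it to wide
wide <- tapply(long$value,
               list(as.character(long$sample_date), long$param_name),
               mean)
wide
```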

Get all non-QC, non-censored samples at all depths, parks, sites, and years (default), and only return important columns.

res <- getResults()
print_head(res)

| Org_Code | Park_Code | Project_ID | Location_ID | Location_Name | SiteType | sample_date | year | month | doy | Activity_Relative_Depth | Activity_Depth | Activity_Depth_Unit | Activity_Type | samp_type | Characteristic_Name | param_name | Result_Detection_Condition | Result_Text | value | censored | Result_Unit | Method_Detection_Limit | Lower_Quantification_Limit | Upper_Quantification_Limit | Result_Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2018-07-09 | 2018 | 7 | 190 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 69 | 69 | FALSE | mg/l | 6 | 19 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2015-07-07 | 2015 | 7 | 188 | Surface | NA | NA | Sample-Routine | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 62 | 62 | FALSE | mg/l | 5 | 18 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2017-04-02 | 2017 | 4 | 92 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 78 | 78 | FALSE | mg/l | 10 | 34 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2022-07-05 | 2022 | 7 | 186 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 74 | 74 | FALSE | mg/l | 21 | 70 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2022-10-03 | 2022 | 10 | 276 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 79 | 79 | FALSE | mg/l | 21 | 70 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2019-04-01 | 2019 | 4 | 91 | Surface | NA | NA | Sample-Routine | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 44 | 44 | FALSE | mg/l | 6 | 19 | NA | NA |

Same as above, but also return censored values.

resc <- getResults(include_censored = T)
print_head(resc)

| Org_Code | Park_Code | Project_ID | Location_ID | Location_Name | SiteType | sample_date | year | month | doy | Activity_Relative_Depth | Activity_Depth | Activity_Depth_Unit | Activity_Type | samp_type | Characteristic_Name | param_name | Result_Detection_Condition | Result_Text | value | censored | Result_Unit | Method_Detection_Limit | Lower_Quantification_Limit | Upper_Quantification_Limit | Result_Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2018-07-09 | 2018 | 7 | 190 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 69 | 69 | FALSE | mg/l | 6 | 19 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2015-07-07 | 2015 | 7 | 188 | Surface | NA | NA | Sample-Routine | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 62 | 62 | FALSE | mg/l | 5 | 18 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2024-07-08 | 2024 | 7 | 190 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Present Below Quantification Limit | NA | 70 | TRUE | mg/l | 21 | 70 | NA | Reported Value: 60 Reported Value: 60 |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2017-04-02 | 2017 | 4 | 92 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 78 | 78 | FALSE | mg/l | 10 | 34 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2022-07-05 | 2022 | 7 | 186 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 74 | 74 | FALSE | mg/l | 21 | 70 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2022-10-03 | 2022 | 10 | 276 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 79 | 79 | FALSE | mg/l | 21 | 70 | NA | NA |

Same as first example, but return QC samples too.

resq <- getResults(sample_type = 'all')
print_head(resq)

| Org_Code | Park_Code | Project_ID | Location_ID | Location_Name | SiteType | sample_date | year | month | doy | Activity_Relative_Depth | Activity_Depth | Activity_Depth_Unit | Activity_Type | samp_type | Characteristic_Name | param_name | Result_Detection_Condition | Result_Text | value | censored | Result_Unit | Method_Detection_Limit | Lower_Quantification_Limit | Upper_Quantification_Limit | Result_Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2018-07-09 | 2018 | 7 | 190 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 69 | 69 | FALSE | mg/l | 6 | 19 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2015-07-07 | 2015 | 7 | 188 | Surface | NA | NA | Sample-Routine | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 62 | 62 | FALSE | mg/l | 5 | 18 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2017-04-02 | 2017 | 4 | 92 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 78 | 78 | FALSE | mg/l | 10 | 34 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2022-07-05 | 2022 | 7 | 186 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 74 | 74 | FALSE | mg/l | 21 | 70 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2022-10-03 | 2022 | 10 | 276 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 79 | 79 | FALSE | mg/l | 21 | 70 | NA | NA |
| GLKN | SACN | GLKNRVWQ | SACN_NAKA_4.8 | Namekagon River at Namekagon Trail | river | 2019-04-01 | 2019 | 4 | 91 | Surface | NA | NA | Sample-Routine | VS | Alkalinity, Total (total hydroxide+carbonate+bicarbonate) | Alkalinity_mgL | Detected and Quantified | 44 | 44 | FALSE | mg/l | 6 | 19 | NA | NA |

Get Sonde parameters for all sites and non-QAQC events in ISRO in 2023.

Note that QC_type = "VS" is the default for this function, which returns only non-QAQC events. Note also the use of named objects for the arguments. This allows you to set them at the top of a script, rather than having to type them out repeatedly. You can then change them in one place (e.g., update the year to 2024) and rerun the code.

sonde_params <- c("DO_mgL", "DOsat_pct", "pH", "SpecCond_uScm", "TempWater_C")
isro_sonde <- getResults(park = "ISRO", years = 2023, parameter = sonde_params)
print_head(isro_sonde)

| Org_Code | Park_Code | Project_ID | Location_ID | Location_Name | SiteType | sample_date | year | month | doy | Activity_Relative_Depth | Activity_Depth | Activity_Depth_Unit | Activity_Type | samp_type | Characteristic_Name | param_name | Result_Detection_Condition | Result_Text | value | censored | Result_Unit | Method_Detection_Limit | Lower_Quantification_Limit | Upper_Quantification_Limit | Result_Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2023-06-30 | 2023 | 6 | 181 | Midwater | 2.04 | m | Field Msr/Obs-Portable Data Logger | VS | Dissolved oxygen (DO) | DO_mgL | Detected and Quantified | 5.94 | 5.94 | FALSE | mg/l | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2023-06-30 | 2023 | 6 | 181 | Midwater | 1.45 | m | Field Msr/Obs-Portable Data Logger | VS | Dissolved oxygen (DO) | DO_mgL | Detected and Quantified | 7.51 | 7.51 | FALSE | mg/l | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2023-07-19 | 2023 | 7 | 200 | Midwater | 1.06 | m | Field Msr/Obs-Portable Data Logger | VS | Dissolved oxygen (DO) | DO_mgL | Detected and Quantified | 8.59 | 8.59 | FALSE | mg/l | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2023-05-18 | 2023 | 5 | 138 | Surface | 0.50 | m | Field Msr/Obs-Portable Data Logger | VS | Dissolved oxygen (DO) | DO_mgL | Detected and Quantified | 9.37 | 9.37 | FALSE | mg/l | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2023-05-18 | 2023 | 5 | 138 | Midwater | 1.02 | m | Field Msr/Obs-Portable Data Logger | VS | Dissolved oxygen (DO) | DO_mgL | Detected and Quantified | 9.37 | 9.37 | FALSE | mg/l | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2023-05-18 | 2023 | 5 | 138 | Bottom | 2.50 | m | Field Msr/Obs-Portable Data Logger | VS | Dissolved oxygen (DO) | DO_mgL | Detected and Quantified | 7.26 | 7.26 | FALSE | mg/l | NA | NA | NA | NA |

Get surface-only measurements for pH in SACN, all years.

pH_SACN <- getResults(park = "SACN", parameter = "pH", sample_depth = 'surface')

Get censored and non-censored NO2+NO3 data for SLBE in all years. Note that the censored column is TRUE if the value is censored, and FALSE if the value was detected and quantified.

slbecens <- getResults(park = "SLBE", parameter = "NO2+NO3_ugL", include_censored = TRUE)
print_head(slbecens)

| Org_Code | Park_Code | Project_ID | Location_ID | Location_Name | SiteType | sample_date | year | month | doy | Activity_Relative_Depth | Activity_Depth | Activity_Depth_Unit | Activity_Type | samp_type | Characteristic_Name | param_name | Result_Detection_Condition | Result_Text | value | censored | Result_Unit | Method_Detection_Limit | Lower_Quantification_Limit | Upper_Quantification_Limit | Result_Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLKN | SLBE | GLKNLKWQ | SLBE_01 | Manitou Lake | lake | 2012-09-13 | 2012 | 9 | 257 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Nitrogen, Nitrite (NO2) + Nitrate (NO3) as N | NO2+NO3_ugL | Present Below Quantification Limit | NA | 8.000 | TRUE | ug/l | 2.0 | 8 | NA | Reported Value: 5.687 |
| GLKN | SLBE | GLKNLKWQ | SLBE_01 | Manitou Lake | lake | 2007-05-30 | 2007 | 5 | 150 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Nitrogen, Nitrite (NO2) + Nitrate (NO3) as N | NO2+NO3_ugL | Not Detected | NA | NA | TRUE | ug/l | 3.0 | 6 | NA | Reported value of 0. |
| GLKN | SLBE | GLKNLKWQ | SLBE_01 | Manitou Lake | lake | 2012-07-31 | 2012 | 7 | 213 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Nitrogen, Nitrite (NO2) + Nitrate (NO3) as N | NO2+NO3_ugL | Detected and Quantified | 154.705 | 154.705 | FALSE | ug/l | 2.0 | 8 | NA | NA |
| GLKN | SLBE | GLKNLKWQ | SLBE_01 | Manitou Lake | lake | 2013-09-10 | 2013 | 9 | 253 | Midwater | NA | ft | Sample-Integrated Vertical Profile | VS | Nitrogen, Nitrite (NO2) + Nitrate (NO3) as N | NO2+NO3_ugL | Detected and Quantified | 17.348 | 17.348 | FALSE | ug/l | 2.0 | 8 | NA | NA |
| GLKN | SLBE | GLKNLKWQ | SLBE_01 | Manitou Lake | lake | 2019-06-17 | 2019 | 6 | 168 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Nitrogen, Nitrite (NO2) + Nitrate (NO3) as N | NO2+NO3_ugL | Present Below Quantification Limit | NA | 8.000 | TRUE | ug/l | 2.3 | 8 | NA | Reported Value: 3.7499 |
| GLKN | SLBE | GLKNLKWQ | SLBE_01 | Manitou Lake | lake | 2007-09-24 | 2007 | 9 | 267 | Surface | NA | ft | Sample-Integrated Vertical Profile | VS | Nitrogen, Nitrite (NO2) + Nitrate (NO3) as N | NO2+NO3_ugL | Present Below Quantification Limit | NA | 6.000 | TRUE | ug/l | 3.0 | 6 | NA | Reported value of 5. |
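Once censored values are included, the censored column is the easiest way to keep them separate in a summary. The sketch below uses a small stand-in data frame (toy) with a few values copied from the output above, so it runs without the data package. With real data you would use slbecens instead.

```r
# Stand-in for a few rows of the Results view (values copied from the output above)
toy <- data.frame(
  sample_date = as.Date(c("2012-07-31", "2012-09-13", "2013-09-10", "2019-06-17")),
  value       = c(154.705, 8, 17.348, 8),
  censored    = c(FALSE, TRUE, FALSE, TRUE)
)

# How many values are censored vs. quantified?
table(toy$censored)

# Keep only the quantified values before calculating summary statistics
quantified <- toy[!toy$censored, ]
range(quantified$value)
```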

Get Secchi depths for all sites in ISRO and VOYA for all years.

secchi <- getResults(park = c("ISRO", "VOYA"), parameter = "Secchi_m")
print_head(secchi)

| Org_Code | Park_Code | Project_ID | Location_ID | Location_Name | SiteType | sample_date | year | month | doy | Activity_Relative_Depth | Activity_Depth | Activity_Depth_Unit | Activity_Type | samp_type | Characteristic_Name | param_name | Result_Detection_Condition | Result_Text | value | censored | Result_Unit | Method_Detection_Limit | Lower_Quantification_Limit | Upper_Quantification_Limit | Result_Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2020-07-24 | 2020 | 7 | 206 | NA | NA | NA | Field Msr/Obs | VS | Depth, Secchi Disk Depth | Secchi_m | Detected and Quantified | 1.99 | 1.990 | FALSE | m | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2019-07-19 | 2019 | 7 | 200 | NA | NA | NA | Field Msr/Obs | VS | Depth, Secchi Disk Depth | Secchi_m | Detected and Quantified | 1.65 | 1.650 | FALSE | m | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2014-08-22 | 2014 | 8 | 234 | NA | NA | NA | Field Msr/Obs | VS | Depth, Secchi Disk Depth | Secchi_m | Detected and Quantified | 2.79 | 2.790 | FALSE | m | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2010-08-22 | 2010 | 8 | 234 | Surface | NA | NA | Field Msr/Obs | VS | Depth, Secchi Disk Depth | Secchi_m | Detected and Quantified | 2.2 | 2.200 | FALSE | m | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2013-06-19 | 2013 | 6 | 170 | NA | NA | NA | Field Msr/Obs | VS | Depth, Secchi Disk Depth | Secchi_m | Detected and Quantified | 1.45 | 1.450 | FALSE | m | NA | NA | NA | NA |
| GLKN | ISRO | GLKNLKWQ | ISRO_01 | Lake Ahmik | lake | 2023-08-30 | 2023 | 8 | 242 | NA | NA | NA | Field Msr/Obs | VS | Depth, Secchi Disk Depth | Secchi_m | Detected and Quantified | 2.395 | 2.395 | FALSE | m | NA | NA | NA | NA |

Plotting Functions

plotLakeProfile()

This function produces a heatmap in 1-m bins. You can filter on park, site, year, month, and parameter. You can only specify one parameter at a time, but see example for combining plots. If multiple sites or years are selected, plots will be faceted on those factors. Keep options limited for best plotting.

Note that occasionally profiles skip a bin, which shows up as a white section in the plot. If you specify a lake x year x parameter combination that doesn’t exist (e.g., a year a lake isn’t sampled), the function will return an error message instead of an empty plot.

The width of each profile takes into account the number of days between sampling events. For the first and last months (e.g. June and August for ISRO), the left/right sides of the profiles are padded by 14 days. Otherwise, profile widths are centered on the sample day with the left side representing half the number of days between that visit and the previous visit and the right side representing half the number of days between that visit and the following visit. Black lines are the thermocline, as calculated by rLakeAnalyzer. Note that you must install the rLakeAnalyzer package to use that feature. If you try to plot the thermocline (default) and don’t have rLakeAnalyzer installed, an error message will tell you to install it.
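The width rule described above can be sketched in a few lines of base R. This is my reading of the description, not the code inside plotLakeProfile(). profile_edges() is a hypothetical helper and the day-of-year values are made up.

```r
# Hypothetical helper: left/right edges (day of year) for each sampled profile
profile_edges <- function(doy, pad = 14) {
  doy  <- sort(doy)
  gaps <- diff(doy)                 # days between consecutive visits
  left  <- doy - c(pad, gaps / 2)   # first visit is padded on the left
  right <- doy + c(gaps / 2, pad)   # last visit is padded on the right
  data.frame(doy = doy, left = left, right = right)
}

# Three visits on days 160, 190 and 230
edges <- profile_edges(c(160, 190, 230))
edges
# left edges:  146, 175, 210
# right edges: 175, 210, 244
```

Each profile's right edge meets the next profile's left edge, which is why the heatmap has no gaps between visits.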

There are several arguments to customize plots.
  • Choose whether to plot the thermocline as points on each profile via plot_thermocline = TRUE (default). The thermocline is calculated by rLakeAnalyzer, and is the depth at which the largest change in temperature occurs in the sampled water column. If no thermocline is detected, as defined by rLakeAnalyzer::thermo.depth(), nothing is plotted.
  • Choose whether to include only active sites (default) or all sites that have been monitored via active.
  • Add gridlines on the y, x or both axes.
  • Choose palette. Currently enabled themes are ‘viridis’ (yellow - green - blue) and the built-in continuous color palettes in RColorBrewer. If you prefer other palettes, I can add those too. The only thing I’m trying to avoid is creating the palette manually, since the number of 1-m bins varies by site and across years. More info on built-in ggplot scales can be found here: https://ggplot2-book.org/scales-colour.
  • Choose position of legend via legend_position. If you don’t want to show the legend, legend_position = 'none'.
  • Include Location_ID as plot title (plot_title = TRUE). Only enabled when 1 site is selected. Otherwise site names will be in the facets. Note that for more meaningful site names, the data packages will need a column with abbreviated names. Many of the river site names are too long to consider using as a title.

Continuous RColorBrewer palettes are below. Note that there are only as many colors as shown.
RColorBrewer::display.brewer.all(type = 'div')

RColorBrewer::display.brewer.all(type = 'seq')
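To make the thermocline definition in the list above more concrete, here is a deliberately simplified base R sketch that finds the depth with the steepest temperature change in one made-up profile. It is not what rLakeAnalyzer::thermo.depth() does internally, which is more involved. steepest_gradient_depth() is a hypothetical helper for illustration only.

```r
# Hypothetical helper: depth midway between the two readings with the
# largest temperature change per meter
steepest_gradient_depth <- function(temp, depth) {
  ord   <- order(depth)
  temp  <- temp[ord]
  depth <- depth[ord]
  grad  <- abs(diff(temp) / diff(depth))
  i     <- which.max(grad)
  mean(depth[c(i, i + 1)])
}

# Made-up profile: warm at the surface, sharp drop between 3 and 4 m
depth <- 0:8
temp  <- c(22, 22, 21.8, 21.5, 17, 13, 12, 11.8, 11.7)
thermo <- steepest_gradient_depth(temp, depth)
thermo  # 3.5
```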


Simple plots

Plot water temperature for ISRO_07 for years 2007 - 2023 with thermocline plotted as black lines.

plotLakeProfile(site = "ISRO_07", parameter = "TempWater_C", years = 2007:2023)

Plot water temperature for all PIRO lakes sampled in 2023, with fixed Y axis range.

plotLakeProfile(park = "PIRO", parameter = "TempWater_C", years = 2023, palette = "Spectral")

Plot water temperature for all PIRO lakes sampled in 2023, with varying Y axis range.

plotLakeProfile(park = "PIRO", parameter = "TempWater_C", years = 2023, palette = "Spectral",
                facet_scales = "free_y")

Plot temperature for VOYA_01 all years with thermocline plotted as black lines.

plotLakeProfile(site = "VOYA_01", parameter = "TempWater_C")

Plot Specific Conductance for SLBE_01 all years without thermocline or plot title

plotLakeProfile(site = "SLBE_01", parameter = "SpecCond_uScm", plot_thermocline = F, plot_title = F)

Plot water temp using Red-Yellow-Blue palette

plotLakeProfile(site = "PIRO_01", parameter = "TempWater_C", palette = "RdYlBu") # PIRO has more months

Plot DO for all sites in PIRO sampled in 2023 with mako palette

plotLakeProfile(park = "PIRO", years = 2023, parameter = "DOsat_pct", palette = "mako")

Plot pH for Lake St. Croix sites in 2023

lkst <- c("SACN_STCR_20.0", "SACN_STCR_15.8", "SACN_STCR_2.0")
plotLakeProfile(site = lkst, parameter = "pH", years = 2023)


Combining plots

Combine plots for temp, DO, pH, and conductance in Lake St. Croix 15.8 for 2023 using the cowplot package.

To minimize typing, I define the parameters I want at the beginning. This allows you to adjust the parameters once (e.g., change the site) and run through the rest of the code without having to edit it.

The cowplot package must be installed to use this code. Install the package via install.packages('cowplot'). There are other packages to combine plots, including grid and gridExtra, and the function ggarrange() in ggpubr. I tend to start with cowplot, because it’s easy to use and has a great help page. If I really need to customize a plot (like custom spacing for each plot), then I use grid/gridExtra, which allows for more customization, but is a bit harder to work with.

library(cowplot)
sitecode = "SACN_STCR_15.8"
year = 2023
depth = 'elev'

tplot <- plotLakeProfile(site = sitecode, parameter = "TempWater_C", years = year, 
                         plot_title = F)

doplot <- plotLakeProfile(site = sitecode, parameter = "DOsat_pct", years = year, 
                          color_rev = T, plot_title = F)

pHplot <- plotLakeProfile(site = sitecode, parameter = "pH", years = year, 
                          palette = "RdYlBu", color_rev = T, plot_title = F)

cnplot <- plotLakeProfile(site = sitecode, parameter = "SpecCond_uScm", years = year, 
                          palette = 'RdBu', plot_title = F)

# Default settings
plot_grid(tplot, doplot, pHplot, cnplot)
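If the default layout isn't what you want, plot_grid() has arguments to control the number of columns, panel labels, and alignment. The sketch below uses two made-up ggplot2 plots (p1 and p2) so it runs without the water data. The same arguments should work with the tplot, doplot, pHplot and cnplot objects above, assuming plotLakeProfile() returns ggplot objects.

```r
library(ggplot2)
library(cowplot)

# Two made-up plots standing in for the lake profile plots
toy <- data.frame(x = 1:10, y = (1:10)^2)
p1 <- ggplot(toy, aes(x, y)) + geom_point()
p2 <- ggplot(toy, aes(x, y)) + geom_line()

# Stack in one column, label the panels, and align them vertically
combined <- plot_grid(p1, p2, ncol = 1, labels = c("A", "B"), align = "v")
combined
```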

plotScatterPlot()

This function produces points or loess smoothed lines of 2 variables, filtered on park, site, year, month, and 2 parameters. Works best with Sonde and lab chemistry data, but can also plot Secchi depth and other parameters. By default, only surface measurements are included, although Secchi depth ignores that. To include all sample depths, specify sample_depth = 'all'. If multiple sites are specified, they will be plotted on the same figure, unless facet_site = T. Note that if you specify a site and parameter combination that doesn’t exist, the function will return an error message instead of an empty plot. Censored values are not permitted in this function.

Plot Temp vs DO surface measurements (default) for VOYA all years on same figure

plotScatterPlot(park = "VOYA", parameters = c("DO_mgL", "TempWater_C"),
  palette = 'viridis', facet_site = F, legend_position = "bottom")

Plot Temp vs DO surface measurements for VOYA all years on separate figures and same color

plotScatterPlot(park = "VOYA", parameters = c("DO_mgL", "TempWater_C"),
  palette = 'dimgrey', facet_site = T, legend_position = "none")

Plot Secchi depth vs. surface ChlA in PIRO 1-4

plotScatterPlot(site = c("PIRO_01", "PIRO_02", "PIRO_03", "PIRO_04"), 
                parameters = c("Secchi_m", "ChlA_ugL"),
  span = 0.9, facet_site = F, legend_position = 'bottom', 
  palette = c("red", "orange", "purple4", "blue"))

Same as above, but with a linear instead of smoothed line

plotScatterPlot(site = c("PIRO_01", "PIRO_02", "PIRO_03", "PIRO_04"), 
                parameters = c("Secchi_m", "ChlA_ugL"),
  facet_site = F, legend_position = 'bottom', 
  palette = c("red", "orange", "purple4", "blue"),
  layers = c('points', 'line'))

Same as above, but points only

plotScatterPlot(site = c("PIRO_01", "PIRO_02", "PIRO_03", "PIRO_04"), 
                parameters = c("Secchi_m", "ChlA_ugL"),
  facet_site = F, legend_position = 'bottom', 
  palette = c("red", "orange", "purple4", "blue"),
  layers = 'points')

plotTrend()

This function produces a trend plot filtered on park, site, year, month, and parameter. If multiple sites are specified, they will be plotted on the same figure. If multiple parameters are specified, they will be plotted on separate figures. If smooth = T, a loess smoothed line will connect through the data. If smooth = F and layers includes “lines”, then lines will connect the sample points.
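If you haven't worked with loess smoothing before, the span controls how much of the data each local fit uses: smaller spans follow the points more closely and larger spans give a smoother line. The sketch below uses base R's loess() on made-up data to show the idea. It is only an illustration, not the code plotTrend() runs.

```r
# Made-up seasonal-looking data
set.seed(1)
x <- 1:40
y <- sin(x / 6) + rnorm(40, sd = 0.3)

# Same data, two spans
fit_wiggly <- loess(y ~ x, span = 0.3)  # follows the points closely
fit_smooth <- loess(y ~ x, span = 0.9)  # much smoother line

# Compare the two fitted lines
plot(x, y)
lines(x, fitted(fit_wiggly), lty = 1)
lines(x, fitted(fit_smooth), lty = 2)
```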

There are several arguments to customize plots.
  • Choose whether to include only active sites (default) or all sites that have been monitored via active.
  • Choose whether to add points, lines, or both (default) via layers argument.
  • If lines are chosen as a layer, choose whether to plot a loess smoothed line (default) or a line that connects the sample points via smooth.
  • (Not enabled yet) Choose whether to plot any water quality thresholds that exist via threshold. Upper limits are dashed. Lower limits are dotted.
  • Choose whether to add gridlines, either both, grid_y or grid_x. Default is none.
  • Choose whether to plot surface (default) or all depth measurements via sample_depth.
  • Choose whether to include censored values or not via include_censored.
  • Choose color palette via palette. Default is ‘viridis’, but other options are magma (yellow, red, purple), plasma (brighter version of magma), turbo (rainbow), or specify a vector of colors manually. See the intro to viridis site for more info on built-in color palettes.
  • Choose position of legend via legend_position. If you don’t want to show the legend, legend_position = 'none'.
  • Additional customizations are defined in the help documentation, accessible via ?plotTrend

Single site; single parameter

Plot smoothed surface ChlA for ISRO_01 for all years with default settings of surface measurements, smoothed line with span = 0.3, no gridlines, and no censored points.

plotTrend(site = "ISRO_01", parameter = "ChlA_ugL")

Same as above, but change palette, add x gridlines and include all sample depths, and censored points.

plotTrend(site = "ISRO_01", parameter = "ChlA_ugL", palette = 'navyblue', gridlines = 'grid_x', 
          sample_depth = 'all', include_censored = T, legend_position = 'right')

Plot smoothed surface pH for VOYA_01 for 2013 through 2023 using default span of 0.3 and by default not including the legend.

plotTrend(site = "VOYA_01", parameter = "pH", palette = 'dimgrey', years = 2013:2023)

Plot smoothed surface pH for PIRO_02 for all years, with turbo palette, and using span of 0.75.

plotTrend(site = "PIRO_02", parameter = "pH", span = 0.75, palette = "turbo",
          legend_position = 'bottom', facet_site = F)

Plot smoothed surface SO4 for VOYA_17 for all years with 0.6 span.

plotTrend(site = "VOYA_17", parameter = "SO4_mgL", legend_position = "right",
  span = 0.6, point_size = 2.5)
Multiple sites or params

Plot smoothed surface pH for 4 lakes in ISRO over all years with 0.6 span.

plotTrend(site = c("ISRO_02", "ISRO_03", "ISRO_04", "ISRO_05"), 
          parameter = "pH", legend_position = "right", span = 0.6)

Plot smoothed surface values of multiple Sonde parameters for all PIRO sites on the same figure over all years with 0.6 span.

params <- c("TempWater_C", "SpecCond_uScm", "DOsat_pct", "pH")
plotTrend(park = "PIRO", parameter = params, legend_position = "right", span = 0.6, 
          facet_site = F)

Plot smoothed Alkalinity, Nitrite + Nitrate, P, and SO4 in all ISRO sites for all years, including the legend, using a different color palette and a span of 0.6.

plotTrend(park = "ISRO", parameter = c('Alkalinity_mgL', "NO2+NO3_ugL", "P_ugL", "SO4_mgL"), 
          span = 0.6, legend_position = 'bottom', palette = 'plasma', 
          facet_site = F)

plotWaterBands()

This function produces a plot that summarizes the range of historic data compared with current measurements. The function can handle most water quality parameters. Historic measurements are displayed as the min-max range of values previously recorded (outermost band), and the upper and lower 95% distribution and middle 50% distribution (inner quartiles) of values previously recorded (inner bands). The line represents the median value.
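
To make the bands more concrete, here is a minimal sketch of how the summary statistics behind each band could be computed with base R's quantile(). This is not the package's code. It assumes the 95% band runs from the 2.5th to the 97.5th percentile and that values are summarized by month, and the data and names (historic, band_stats, bands) are made up for illustration.

```r
# Simulated historic pH values, 30 per month for May through October (made-up data)
set.seed(1)
historic <- data.frame(
  month = rep(5:10, each = 30),
  value = rnorm(180, mean = 8, sd = 0.3)
)

# Hypothetical helper: returns the statistics that would define each band
band_stats <- function(v) {
  q <- quantile(v, probs = c(0, 0.025, 0.25, 0.5, 0.75, 0.975, 1), names = FALSE)
  c(min = q[1], lower95 = q[2], lower50 = q[3], median = q[4],
    upper50 = q[5], upper95 = q[6], max = q[7])
}

# One row per month, one column per statistic
bands <- t(sapply(split(historic$value, historic$month), band_stats))
print(round(bands, 2))
```

Each row then nests from the outside in: min and max bound the outermost band, lower95 and upper95 bound the next band, lower50 and upper50 bound the middle 50%, and median gives the line.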

Currently you can only specify one parameter at a time, and one comparison year at a time (i.e. year_curr). You can add gridlines to the plot via the gridlines argument. If multiple sites are specified, they will be faceted in the order they were specified. If include_censored = TRUE, censored values will be plotted as an asterisk instead of a circle.

[Not yet enabled–]Values that exceed water quality thresholds (where they exist) are plotted as orange and will show an orange point in the legend. Values within WQ thresholds or for parameters without set thresholds are black. You can include threshold lines (default), or remove them, where they make the y axis range too big, via threshold = FALSE. [–Not yet enabled].

Plot pH in sites in Lake St. Croix for 2023 with gridlines on the y-axis

lksc <- c('SACN_STCR_2.0', 'SACN_STCR_15.8', 'SACN_STCR_20.0')
plotWaterBands(site = lksc, year_curr = 2023, years_historic = 2007:2022, 
  parameter = "pH", legend_position = 'right', gridlines = 'grid_y')

Plot ChlA in sites in Lake St. Croix for 2023 including censored values, and move legend to bottom.

plotWaterBands(site = lksc, year_curr = 2023, years_historic = 2007:2022, 
  parameter = "ChlA_ugL", legend_position = 'bottom', include_censored = T)

Plot DO in SLBE_01 for 2023 with gridlines on both x and y axes

plotWaterBands(site = "SLBE_01", year_curr = 2023, years_historic = 2007:2022,
  parameter = "DO_mgL", legend_position = 'right', gridlines = "both")

Plot Specific Conductance in sites ISRO_01 through ISRO_03 for 2023

isro_sites <- c("ISRO_01", "ISRO_02", "ISRO_03")
plotWaterBands(site = isro_sites, year_curr = 2023, years_historic = 2007:2022, 
               parameter = "SpecCond_uScm", legend_position = 'bottom')

Summary Functions

sumEvents()

Summarize the number of samples collected per park, site, month, and parameter. The resulting data frame shows the number of samples collected for each month, and whether the values are real (month) or censored (month_cens).
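
As a rough picture of the layout described above, here is a minimal sketch in base R that counts real and censored samples per site and month and puts them side by side. It is not how sumEvents() is implemented. The toy data, site names, and the assumption that the month columns count only non-censored samples are mine for illustration.

```r
# Toy sample records (made-up data); months restricted to Apr through Oct
samples <- data.frame(
  Location_ID = factor(c("SITE_A", "SITE_A", "SITE_A", "SITE_B", "SITE_B", "SITE_B")),
  month = factor(c("May", "Jun", "Jun", "Jun", "Jul", "Jul"), levels = month.abb[4:10]),
  censored = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
)

# Count non-censored and censored samples separately by site and month
is_cens <- samples$censored
real <- table(samples$Location_ID[!is_cens], samples$month[!is_cens])
cens <- table(samples$Location_ID[is_cens], samples$month[is_cens])

# Label the censored counts with a _cens suffix and combine into one wide table
colnames(cens) <- paste0(colnames(cens), "_cens")
wide <- cbind(as.data.frame.matrix(real), as.data.frame.matrix(cens))
print(wide)
```

Because month and Location_ID are factors, months or sites with no samples still appear in the table with a count of 0, similar to the zeros in the output shown in this section.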

Summarize all events for ISRO for all years and active sites

isro_ev <- sumEvents(park = "ISRO")
print_head(isro_ev)
Park_Code SiteType Location_ID param_name year_range num_years Apr May Jun Jul Aug Sep Oct Apr_cens May_cens Jun_cens Jul_cens Aug_cens Sep_cens Oct_cens
ISRO lake ISRO_01 pH 2007 – 2023 17 0 10 66 75 76 7 0 0 0 0 0 0 0 0
ISRO lake ISRO_02 pH 2007 – 2023 17 0 42 210 224 219 14 13 0 0 0 0 15 0 0
ISRO lake ISRO_03 pH 2007 – 2023 17 0 57 185 175 220 48 0 0 0 0 0 12 0 0
ISRO lake ISRO_04 pH 2007 – 2023 17 0 12 105 89 110 19 0 0 0 0 0 1 0 0
ISRO lake ISRO_05 pH 2007 – 2023 17 0 0 85 88 101 11 0 0 0 0 0 6 0 0
ISRO lake ISRO_06 pH 2007 – 2023 17 0 0 73 66 73 8 0 0 0 0 0 0 0 0

Summarize only lake events for SACN for all years

sacn_lk <- sumEvents(park = "SACN", site_type = "lake")
print_head(sacn_lk)
Park_Code SiteType Location_ID param_name year_range num_years Apr May Jun Jul Aug Sep Oct Apr_cens May_cens Jun_cens Jul_cens Aug_cens Sep_cens Oct_cens
SACN lake SACN_PACQ_SP_01 pH 2019 – 2019 1 0 0 6 5 0 0 0 0 0 0 0 0 0 0
SACN lake SACN_PHIP_SP_01 pH 2019 – 2019 1 0 0 0 5 0 0 0 0 0 0 0 0 0 0
SACN lake SACN_STCR_15.8 pH 2007 – 2024 18 128 142 148 152 172 124 184 0 0 9 0 0 0 0
SACN lake SACN_STCR_2.0 pH 2007 – 2024 18 213 236 271 252 271 207 301 0 0 15 0 0 0 0
SACN lake SACN_STCR_20.0 pH 2007 – 2024 18 147 177 181 190 189 146 217 0 0 11 0 0 0 0

Summarize all GLKN events for 2023

glkn23 <- sumEvents(years = 2023)
print_head(glkn23)
Park_Code SiteType Location_ID param_name year_range num_years Apr May Jun Jul Aug Sep Oct Apr_cens May_cens Jun_cens Jul_cens Aug_cens Sep_cens Oct_cens
INDU lake INDU_05 pH 2023 – 2023 1 0 5 0 5 0 5 0 0 0 0 0 0 0 0
ISRO lake ISRO_01 pH 2023 – 2023 1 0 5 5 5 5 0 0 0 0 0 0 0 0 0
ISRO lake ISRO_02 pH 2023 – 2023 1 0 14 14 13 13 0 0 0 0 0 0 0 0 0
ISRO lake ISRO_03 pH 2023 – 2023 1 0 11 11 11 11 0 0 0 0 0 0 0 0 0
ISRO lake ISRO_04 pH 2023 – 2023 1 0 6 10 6 5 0 0 0 0 0 0 0 0 0
ISRO lake ISRO_05 pH 2023 – 2023 1 0 0 5 8 7 0 0 0 0 0 0 0 0 0