WaterUnderground

Doing Hydrogeology in R

Post by Sam Zipper (@ZipperSam), current Postdoctoral Fellow at the University of Victoria and soon-to-be research scientist with the Kansas Geological Survey at the University of Kansas.


Using programming languages to interact with, analyze, and visualize data is an increasingly important skill for hydrogeologists. Coding-based science makes it easier to process and visualize large amounts of data, and it increases the reproducibility of your work, both for yourself and for others.

There are many programming languages out there; anecdotally, the most commonly used languages in the hydrogeology community are Python, MATLAB, and R. Kevin previously wrote a post highlighting Python’s role in the hydrogeology toolbox, in particular the excellent FloPy package for creating and interacting with MODFLOW models. 

In this post, we’ll focus on R and explore some of the tools that can be used for hydrogeology. R uses ‘packages’, which are collections of functions related to a similar task. There are thousands of R packages; recently, two colleagues and I put together a ‘Hydrology Task View’, which compiles and describes a large number of water-related packages. We found that water-related R packages can be broadly categorized into data retrieval, data analysis, and modelling applications. Though packages related to surface water and meteorological data constitute the bulk of the Task View, there are many groundwater-relevant packages for each step of a typical workflow.

Here, I’ll focus on some of the packages I use most frequently. 

Data Retrieval:

Instead of downloading data as a CSV file and reading it into R, you can use one of the many packages that interface directly with online water data portals. For instance, dataRetrieval and waterData connect to the US Geological Survey water information service, tidyhydat to the Canadian streamflow monitoring network, and rnrfa to the UK National River Flow Archive.

Data Analysis:

Many common data analysis tasks are contained in various R packages. hydroTSM and zoo are excellent for working with timeseries data, and lfstat calculates various low-flow statistics. The EcoHydRology package contains an automated digital filter for baseflow separation from streamflow data.
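To give a feel for what a digital baseflow filter does, here is a minimal base-R sketch of a single forward pass of a one-parameter recursive filter of the Lyne-Hollick type. It is not the EcoHydRology implementation; the function name separate_baseflow, the default filter parameter of 0.925, and the streamflow values are all assumptions made up for illustration.

```r
# Hypothetical helper (not from any package): one forward pass of a
# one-parameter recursive digital filter of the Lyne-Hollick type.
# q     : numeric vector of streamflow at a regular time step
# alpha : filter parameter (0.925 is an assumed, commonly quoted value)
separate_baseflow <- function(q, alpha = 0.925) {
  n <- length(q)
  quick <- numeric(n)  # filtered quickflow, initialised at zero
  for (t in seq_len(n)[-1]) {
    # Recursive filter: quickflow responds to changes in streamflow
    quick[t] <- alpha * quick[t - 1] + (1 + alpha) / 2 * (q[t] - q[t - 1])
    # Keep quickflow between zero and total streamflow
    quick[t] <- min(max(quick[t], 0), q[t])
  }
  data.frame(streamflow = q, baseflow = q - quick, quickflow = quick)
}

# Illustrative (made-up) daily streamflow with a single storm peak
q <- c(5, 5, 20, 35, 22, 14, 10, 8, 7, 6)
sep <- separate_baseflow(q)
print(sep)
```

Packaged filters typically run several forward and backward passes to smooth the result, so expect their output to differ from this single-pass sketch.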

Modelling:

While R does not have an interface to MODFLOW, there are many other models that can be run within R. The boussinesq package, unsurprisingly, contains functions to solve the 1D Boussinesq equation, and the kwb.hantush package models groundwater mounding beneath an infiltration basin. The first and only package I’ve ever made, streamDepletr, contains analytical models for estimating streamflow depletion due to groundwater pumping. To evaluate your model, check out the hydroGOF package, which calculates many common goodness-of-fit metrics.
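As a rough illustration of what an analytical streamflow depletion model and a goodness-of-fit metric look like in code, here is a base-R sketch of the classic Glover-Balmer solution and the Nash-Sutcliffe efficiency. These are hand-written stand-ins, not the streamDepletr or hydroGOF functions; the helper names depletion_fraction and nse and all parameter values are assumptions chosen for illustration.

```r
# Complementary error function via the normal CDF (base R has no erfc())
erfc <- function(x) 2 * pnorm(x * sqrt(2), lower.tail = FALSE)

# Hypothetical helper: Glover-Balmer depletion fraction, i.e. the share of
# the pumping rate supplied by the stream after pumping for time t.
# t  : time since pumping started
# d  : well-to-stream distance
# S  : storage coefficient (dimensionless)
# Tr : transmissivity (length^2 / time, in units consistent with d and t)
depletion_fraction <- function(t, d, S, Tr) {
  erfc(sqrt(S * d^2 / (4 * Tr * t)))
}

# Hypothetical helper: Nash-Sutcliffe efficiency of simulated vs. observed
nse <- function(sim, obs) {
  1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)
}

# Illustrative (made-up) case: well 100 m from the stream, S = 0.1,
# Tr = 100 m^2/day, evaluated after 1, 10, 100 and 1000 days of pumping
t_days <- c(1, 10, 100, 1000)
frac <- depletion_fraction(t_days, d = 100, S = 0.1, Tr = 100)
print(data.frame(t_days = t_days, depletion_fraction = frac))
```

Multiplying the depletion fraction by the pumping rate gives the estimated streamflow depletion at each time. A perfect simulation has a Nash-Sutcliffe efficiency of 1, and a simulation no better than the observed mean scores 0.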

How do I get and learn R?

R is open-source software, freely available from the Comprehensive R Archive Network (CRAN). RStudio is a user-friendly interface for working with R. RStudio has also compiled a number of tutorials to help you get started!

Other Useful Resources

Louise Slater and many co-authors currently have a paper under discussion about ‘Using R in Hydrology’ which has many excellent resources.

While not hydrogeology-specific, there are many packages for generic data analysis and visualization that will be of use to hydrogeologists. In particular, the Tidyverse has a number of packages for reading, tidying, and visualizing data such as dplyr and ggplot2.

Claus Wilke’s Fundamentals of Data Visualization book (free online) was written entirely within R and shows examples of the many ways that R can be used to make beautiful graphs.

Groundwater—the world’s largest freshwater store— is a life-sustaining resource that supplies water to billions of people, plays a central part in irrigated agriculture and influences the health of many ecosystems. Water Underground is a groundwater nerd blog written by a global collective of hydrogeologic researchers for water resource professionals, academics and anyone interested in groundwater, research, teaching and supervision. The blog, started by Tom Gleeson and managed by Xander Huggins, is the first blog hosted on both the EGU blogs and the AGU blogosphere.

