GeoLog

Geoscientific Methods

How to increase reproducibility and transparency in your research

Contemporary science faces many challenges in publishing results that are reproducible. This is due to the increased use of data and digital technologies as well as heightened demands for scholarly communication. These challenges have led to widespread calls from the science community for more research transparency, accessibility, and reproducibility. This article presents current findings and solutions to these problems, including recently released software that makes writing submission-ready manuscripts for journals of Copernicus Publications a lot easier.

While it can be debated if science really faces a reproducibility crisis, the challenges of computer-based research have sparked numerous articles on new good research practices and their evaluation. The challenges have also driven researchers to develop infrastructure and tools to help scientists effectively write articles, publish data, share code for computations, and communicate their findings in a reproducible way, for example Jupyter, ReproZip and research compendia.

Recent studies showed that the geosciences and geographic information science are not immune to issues with reproducibility, just like other domains. Therefore, more and more journals have adopted policies on sharing data and code. However, it is equally important that scientists foster an open research culture and teach researchers how to adopt more transparent and reproducible workflows, for example at skill-building workshops at conferences offered by fellow researchers, such as the EGU short courses, community-led non-profit organisations such as the Carpentries, open courses for students, small discussion groups at research labs, or individual efforts of self-learning. Given the ongoing difficulty of agreeing on a common definition of reproducibility, Philip Stark, a statistics professor and associate dean of mathematical and physical sciences at the University of California, Berkeley, recently coined the term preproducibility: “An experiment or analysis is preproducible if it has been described in adequate detail for others to undertake it.” The neologism is intended to reduce confusion and also to embrace a positive attitude for more openness, honesty, and helpfulness in scholarly communication processes.

In the spirit of these activities, this article describes a modern workflow made possible by recent software releases. The new features allow the EGU community to write preproducible manuscripts for submission to the large variety of academic journals published by Copernicus Publications. The new workflow might require some researchers to make hard-earned adjustments, but it pays off through increased transparency and effectiveness. This is especially the case for early career scientists. An open and reproducible workflow enables researchers to build on others’ and their own previous work and to collaborate better on solving the societal challenges of today.

Reproducible research manuscripts

Open digital notebooks, which interweave data and code and can be exported to different output formats such as PDF, are powerful means to improve transparency and preproducibility of research. Jupyter Notebook, Stencila and R Markdown let researchers combine long-form text of a publication and source code for analysis and visualisation in a single document. Having text and code side-by-side makes them easier to grasp and ensures consistency, because each rendering of the document executes the whole workflow using the original data. Caching for long-lasting computations is possible, and researchers working with supercomputing infrastructures or huge datasets may limit the executed code to purposes of visualisation using processed data as input. Authors can transparently expose specific code snippets to readers but also publish the complete source code of the document openly for collaboration and review.
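The caching idea can be illustrated with a small hand-rolled sketch in base R. This is not how the notebook tools implement it (knitr, the engine behind R Markdown, offers a `cache = TRUE` chunk option for that purpose); the helper name `cached()`, the file name, and the toy computation are made up for illustration.

```r
# Hypothetical helper: run `compute()` once, store the result on disk,
# and reuse the stored result on later calls.
cached <- function(cache_file, compute) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))  # reuse the earlier result
  }
  result <- compute()
  saveRDS(result, cache_file)
  result
}

# Stand-in for a long-running analysis step (assumption, not from the article).
slow_mean <- function() {
  Sys.sleep(0.1)
  mean(c(2, 4, 6))
}

cache_file <- file.path(tempdir(), "slow_result.rds")
first  <- cached(cache_file, slow_mean)  # computed and stored (or read, if the file exists)
second <- cached(cache_file, slow_mean)  # read from disk, no recomputation
```

Deleting the cache file forces a fresh computation, which is worth doing before a final rendering so that the document really reflects the original data.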

The popular notebook formats are plain-text based, like Markdown in the case of R Markdown. Therefore an R Markdown document can be managed with version control software, that is, programs for managing multiple versions of and contributions to the same documents, even by different people. Version control provides traceability of authorship, a time machine for going back to any previous “working” version, and online collaboration such as on GitLab. This kind of workflow also stops the madness of using file names for versions yet still lets authors use awesome file names and apply domain-specific guidelines for packaging research.

R Markdown supports different programming languages besides the popular namesake R and is a sensible solution even if you do not analyse data with scripts or have any code in your scholarly manuscript. It is easy to write, allows you to manage your bibliography effectively, and can be used for websites, books or blogs, but most importantly it does not fall short when it is time to submit a manuscript to a journal.

The rticles extension package for R provides a number of templates for popular journals and publishers. Since version 0.6 (published Oct 9 2018) these templates include one that follows the Copernicus Publications manuscript preparation guidelines for authors. The Copernicus Publications staff were kind enough to give a test document a quick review and all seems in order, though of course any problems and questions should be directed to the software’s vibrant community and not the publishers.

The following code snippet and screenshot demonstrate the workflow. Lines starting with # are code comments and explain the steps. The code examples provided here are ready to use and only lack the installation commands for required packages.

# load required R extension packages:
library("rticles")
library("rmarkdown")

# create a new document using a template:
rmarkdown::draft(file = "MyArticle.Rmd",
                 template = "copernicus_article",
                 package = "rticles", edit = FALSE)

# render the source of the document to the default output format:
rmarkdown::render(input = "MyArticle/MyArticle.Rmd")

{: .language-r}
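Since the snippet leaves out the installation step, here is a minimal sketch of how one might check which of the two packages are still missing. The helper name `missing_packages()` is made up for illustration; `install.packages()` is the standard way to install from CRAN and is left commented out so the sketch does not change your system unasked.

```r
# Hypothetical helper: return the subset of `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("rticles", "rmarkdown"))

if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # Uncomment to install from CRAN (needs network access):
  # install.packages(to_install)
} else {
  message("All required packages are available.")
}
```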

The commands created a directory with the Copernicus Publications template’s files, including an R Markdown (.Rmd) file ready to be edited by you (left-hand side of the screenshot), a LaTeX (.tex) file for submission to the publisher, and a .pdf file for inspecting the final results and sharing with your colleagues (right-hand side of the screenshot). You can see how simple it is to format text, insert citations, chemical formulas or equations, and add figures, and how they are rendered into a high-quality output file.

All of these steps may also be completed with user-friendly forms when using RStudio, a popular development and authoring environment available for all operating systems. The left-hand side of the following screenshot shows the form for creating a new document based on a template, and the right-hand side shows the menu for rendering, called “knitting” with R Markdown because code and text are combined into one document like threads in a garment.

And in case you decide at the last minute to submit to a different journal, rticles supports many publishers, so you only have to adjust the template while the whole content stays the same.
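To see which templates an installed package offers, one option is to look into the package's template directory. The sketch below assumes the usual R Markdown convention that templates live under `rmarkdown/templates` inside the installed package; the helper name `list_rmd_templates()` is made up, and the names it prints depend on the installed rticles version.

```r
# Hypothetical helper: list the R Markdown template directories shipped
# with an installed package (empty result if the package has none or is
# not installed).
list_rmd_templates <- function(package) {
  path <- system.file("rmarkdown", "templates", package = package)
  if (!nzchar(path)) {
    return(character(0))
  }
  list.dirs(path, full.names = FALSE, recursive = FALSE)
}

print(list_rmd_templates("rticles"))
```

One of the printed names can then be passed as the `template` argument of `rmarkdown::draft()` in place of "copernicus_article".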

Sustainable access to supplemental data

Data published today should be deposited and properly cited using appropriate research data repositories following the FAIR data principles. Journals require authors to follow these principles, see for example the Copernicus Publications data policy or a recent announcement by Nature. Other publishers required, or still require today, authors to store supplemental information (SI), such as dataset files, extra figures, or extensive descriptions of experimental procedures, as part of the article. Usually only the article itself receives a digital object identifier (DOI) for long-term identification and availability. The DOI minted by the publisher is not suitable for direct access to supplemental files, because it points to a landing page about the identified object. This landing page is designed to be read by humans but not by computers.
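To make the landing page problem concrete, the following sketch builds the resolver URL for a DOI. The helper name `doi_url()` is made up for illustration; the DOI is the Biogeosciences article used as an example elsewhere in this post. Opening the resulting address in a browser leads to the article's web page, not to any of its supplemental files.

```r
# Hypothetical helper: turn a DOI into the address of the public DOI resolver.
# Assumes a plain DOI without characters that would need URL encoding.
doi_url <- function(doi) {
  paste0("https://doi.org/", doi)
}

landing_page <- doi_url("10.5194/bg-14-1739-2017")
print(landing_page)
```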

The R package suppdata closes this gap. It supports downloading supplemental information using the article’s DOI. This way suppdata enables long-term reproducible data access when data was published as SI in the past or in exceptional cases today, for example if you write about a reproduction of a published article. In the latest version available from GitHub (suppdata is on its way to CRAN) the supported publishers include Copernicus Publications. The following example code downloads a data file for the article “Divergence of seafloor elevation and sea level rise in coral reef ecosystems” by Yates et al. published in Biogeosciences in 2017. The code then creates a mostly meaningless plot shown below.

# load required R extension package:
library("suppdata")

# download a specific supplemental information (SI) file
# for an article using the article's DOI:
csv_file <- suppdata::suppdata(
  x = "10.5194/bg-14-1739-2017",
  si = "Table S1 v2 UFK FOR_PUBLICATION.csv")

# read the data and plot it (toy example!):
my_data <- read.csv(file = csv_file, skip = 3)
plot(x = my_data$NAVD88_G03, y = my_data$RASTERVALU,
     xlab = "Historical elevation (NAVD88 GEOID03)",
     ylab = "LiDAR elevation (NAVD88 GEOID03)",
     main = "A data plot for article 10.5194/bg-14-1739-2017",
     pch = 20, cex = 0.5)

{: .language-r}
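Because the toy example relies on two specific column names and on skipping three header lines, a small check before plotting can catch a changed file layout early. The sketch below is an optional addition, not part of the original workflow; the helper name `check_columns()` and the stand-in values are made up for illustration.

```r
# Hypothetical helper: stop with a clear message if expected columns are missing.
check_columns <- function(data, required) {
  missing <- setdiff(required, names(data))
  if (length(missing) > 0) {
    stop("Missing column(s): ", paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up stand-in for `my_data`, only to keep the sketch self-contained:
toy_data <- data.frame(NAVD88_G03 = c(-1.2, -0.8),
                       RASTERVALU = c(-1.1, -0.9))

check_columns(toy_data, c("NAVD88_G03", "RASTERVALU"))
```

With the real download, the call would be `check_columns(my_data, c("NAVD88_G03", "RASTERVALU"))` right after `read.csv()`.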

Main takeaways

Authoring submission-ready manuscripts for journals of Copernicus Publications just got a lot easier. Everybody who can write manuscripts with a word processor can quickly learn R Markdown and benefit from a preproducible data science workflow. Digital notebooks not only improve day-to-day research habits, but the same workflow is suitable for authoring high-quality scholarly manuscripts and graphics. The interaction with the publisher is smooth thanks to the LaTeX submission format, but you never have to write any LaTeX. The workflow is based on an established Free and Open Source software stack and embraces the idea of preproducibility and the principles of Open Science. The software is maintained by an active, growing, and welcoming community of researchers and developers with a strong connection to the geospatial sciences. Because of the complete and consistent notebook, you, a colleague, or a student can easily pick up the work at a later time. The road to effective and transparent research begins with a first step – take it!

Acknowledgements

The software updates were contributed by Daniel Nüst from the project Opening Reproducible Research (o2r) at the Institute for Geoinformatics, University of Münster, Germany, but would not have been possible without the support of Copernicus Publications, the software maintainers, most notably Yihui Xie and Will Pearse, and the general awesomeness of the R, R-spatial, Open Science, and Reproducible Research communities. The blog text was greatly improved with feedback by EGU’s Olivia Trani and Copernicus Publications’ Xenia van Edig. Thank you!

By Daniel Nüst, researcher at the Institute for Geoinformatics, University of Münster, Germany

[This article is cross-posted on the Opening Reproducible Research project blog]

Imaggeo on Mondays: The calm before the storm

The picture was taken during the 2015 research cruise HE441 in the southern German Bight, North Sea. It features the research vessel Heincke, on a remarkably calm and warm spring day, forming a seemingly steady wake.

The roughly 55-metre-long FS Heincke, owned by the German federal government and operated by the Alfred Wegener Institute, provides a great platform for local studies of the North Sea shelf. Eleven scientists and students from the University of Bremen, MARUM Research Faculty, University of Kiel, and Federal Waterways Engineering and Research Institute, along with the ship’s crew, formed a great team under the supervision of chief scientist Christian Winter.

On deck, different autonomous underwater observatories were waiting to be deployed. Their purpose was to measure seabed dynamics and hydrodynamics in a targeted area of the German Bight. Investigating the interaction between geomorphology, sedimentology and biogeochemistry is crucial to understanding the processes acting on this unique and dynamic environment. In the German Bight various stakeholders with diverse interests come together. Profound knowledge, backed by cutting-edge research, helps to resolve future conflicts between use and protection of the environment.

While this photo features a tranquil day at sea, some days later the weather and wave conditions got so bad that the cruise had to be abandoned. Storm Niklas, causing wave heights of more than three metres, made deployment and recovery of the observatories too dangerous for the crew, scientists, and delicate instruments.

Despite the severe weather, the research cruise was still able to gather important data in the time available. Schedules on research vessels are tight and optimized to fit as many high-quality measurements as possible into time slots that depend on favourable sea (tide) and weather conditions. State-of-the-art research equipment was prepared, deployed, recovered and assessed several times during the cruise, which in the end lasted only eight days. Measurements were supported by ship-based seabed mapping and water column profiling. Transit times, like the one depicted, were used to prepare the different sensors and instruments for the upcoming deployment.

The rare occasion of good weather combined with idle time was utilized to take this long exposure photo. A calm sea, a stable clamp temporarily attached to a handrail, and a neutral density filter were additionally required to increase the exposure time of the camera to 13 seconds, in order to capture this picture. The long exposure time smooths all movement relative to the ship, enhancing the effect of the wake behind the Heincke vessel.

Over the course of several years, regular Heincke research cruises and the collaboration between the different institutions have led to the successful completion of research projects, with findings published in various journals, listed below.

By Markus Benninghoff, MARUM, University of Bremen, Germany

Further reading

Ahmerkamp, S, Winter, C, Janssen, F, Kuypers, MMM and Holtappels, M (2015) The impact of bedform migration on benthic oxygen fluxes. Journal of Geophysical Research: Biogeosciences, 120(11). 2229-2242. doi:10.1002/2015JG003106

Ahmerkamp, S, Winter, C, Krämer, K, de Beer, D, Janssen, F, Friedrich, J, Kuypers, MMM and Holtappels, M (2017) Regulation of benthic oxygen fluxes in permeable sediments of the coastal ocean. Limnology and Oceanography. doi:10.1002/lno.10544

Amirshahi, SM, Kwoll, E and Winter, C (2018) Near bed suspended sediment flux by single turbulent events. Continental Shelf Research, 152. 76-86. doi:10.1016/j.csr.2017.11.005

Krämer, K and Winter, C (2016) Predicted ripple dimensions in relation to the precision of in situ measurements in the southern North Sea. Ocean Science, 12(6). 1221-1235. doi:10.5194/os-12-1221-2016

Krämer, K, Holler, P, Herbst, G, Bratek, A, Ahmerkamp, S, Neumann, A, Bartholomä, A, van Beusekom, JEE, Holtappels, M and Winter, C (2017) Abrupt emergence of a large pockmark field in the German Bight, southeastern North Sea. Scientific Reports, 7(1). doi:10.1038/s41598-017-05536-1

Oehler, T, Martinez, R, Schückel, U, Winter, C, Kröncke, I and Schlüter, M (2015) Seasonal and spatial variations of benthic oxygen and nitrogen fluxes in the Helgoland Mud Area (southern North Sea). Continental Shelf Research, 106. 118-129. doi:10.1016/j.csr.2015.06.009

If you pre-register for the 2019 General Assembly (Vienna, 07–12 April), you can take part in our annual photo competition! From 15 January until 15 February, every participant pre-registered for the General Assembly can submit up to three original photos and one moving image related to the Earth, planetary, and space sciences in competition for free registration to next year’s General Assembly! These can include fantastic field photos, a stunning shot of your favourite thin section, what you’ve captured out on holiday or under the electron microscope – if it’s geoscientific, it fits the bill. Find out more about how to take part at http://imaggeo.egu.eu/photo-contest/information/.

Imaggeo is the EGU’s online open access geosciences image repository. All geoscientists (and others) can submit their photographs and videos to this repository and, since it is open access, these images can be used for free by scientists for their presentations or publications, by educators and the general public, and some images can even be used freely for commercial purposes. Photographers also retain full rights of use, as Imaggeo images are licensed and distributed by the EGU under a Creative Commons licence. Submit your photos at http://imaggeo.egu.eu/upload/.

Imaggeo on Mondays: Crowned elephant seals do citizen science

In the Southern Ocean and North Pacific lives a peculiar type of elephant seal. This group acts like any other marine mammal; they dive deep into the ocean, chow down on fish, and sunbathe on the beach. However, they do all this with scientific instruments attached to their heads. While the seals carry out their usual activities, the devices collect important oceanographic data that help scientists better understand our marine environment.

The practice of tagging elephant seals to obtain data started in 2004, and today equipped seals are the largest contributors of temperature and salinity profiles south of the 60th parallel south. You can find all sorts of data that has been collected by instrumented sea creatures through the Marine Mammals Exploring the Oceans Pole to Pole database online.

The female elephant seal, pictured here at Point Suzanne on the eastern end of the Kerguelen Islands in the Southern Ocean, is a member of this unusual headgear-wearing cohort. This particular seal had been roaming the sea for several months with the device (also known as a miniature Conductivity-Temperature-Depth sensor) on her head. As the seal dove hundreds of metres below the sea surface, the instrument captured the vertical profile of the area, recording the ocean’s temperature and salinity, as well as chlorophyll a fluorescence and concentrations. When the seal resurfaced, the sensor sent the data it had accrued to scientists by satellite.

Etienne Pauthenet, a PhD student at Stockholm University who was involved in a seal tagging campaign, had a chance to snap this photo before tranquilising the seal and retrieving the tag.

Using elephant seals and other marine mammals to collect data gives scientists the opportunity to analyse remote regions of the ocean that aren’t very accessible by vehicles. Studying these parts of the world is important for gaining insight into how oceans and their inhabitants are responding to climate change, for example. With the help of data-gathering elephant seals, researchers are able to amass in situ measurements from regions that previously had been hard to reach, apply this data to oceanographic models, and make predictions on ocean climate processes.

While gathering data via elephant seals is crucial to oceanographic research, Pauthenet explains that the practice is sometimes quite difficult. “It can be complicated to find back the seal, because of the Argo satellite signal precision. The quality of the signal depends on the position of the seal, if she is lying on her back for example, or if she is still in the water.”

While on the research campaign, Pauthenet and his colleagues were stationed at a small cabin on the shore of Point Suzanne and they walked the shore every day in search of the seal, relying on location points transmitted from a VHF radio. After seven days they finally located her and removed her valuable crown. The seal was then free to go about her business, having given her contribution to the hundreds of thousands of vertical profiles collected by marine mammal citizen scientists.

by Olivia Trani, EGU Communications Officer

Imaggeo on Mondays: The best of imaggeo in 2018

Imaggeo, our open access image repository, is packed with beautiful images showcasing the best of the Earth, space and planetary sciences. Throughout the year we use the photographs submitted to the repository to illustrate our social media and blog posts.

For the past few years we’ve celebrated the end of the year by rounding up some of the best Imaggeo images. But it’s no easy task to pick which of the featured images are the best! Instead, we turned the job over to you! We compiled a Facebook album which included all the images we’ve used as header images across our social media channels and in Imaggeo on Mondays blog posts in 2018 and asked you to vote for your favourites.

Today’s blog post rounds up the best 12 images of Imaggeo in 2018, as chosen by you, our readers.

Of course, these are only a few of the very special images we highlighted in 2018, but take a look at our image repository, Imaggeo, for many other spectacular geo-themed pictures, including the winning images of the 2018 Photo Contest. The competition will be running again this year, so if you’ve got a flair for photography or have managed to capture a unique field work moment, consider uploading your images to Imaggeo and entering the 2019 Photo Competition.

A view of the southern edge of the Ladebakte mountain in the Sarek national park in northern Sweden. At this place the rivers Rahpajaka and Sarvesjaka meet to form the biggest river of the Sarek national park, the Rahpaädno. The rivers are fed by glaciers and carry a lot of rock material, which leads to distinct sedimentation and a fascinating river delta for which the Sarek park, lying west of the Kungsleden hiking trail, is famous.

Melt ponds. Credit: Michael Tjernström (distributed via imaggeo.egu.eu)

The February 2018 header image used across our social media channels. The photo features ponds of melted snow on top of sea ice in summer. The photo was taken from the Swedish icebreaker Oden during the “Arctic Summer Cloud Ocean Study” in 2008 as part of the International Polar Year.

Karstification in Chabahar Beach, IRAN. Credit: Reza Derakhshani (distributed via imaggeo.egu.eu)

The June 2018 header image used for our social media channels. The photo was taken on the northern coast of the Oman Sea, where the subduction of Oman’s oceanic plate under the continental plate of Iran is taking place.

River in a Charoite Schist. Credit: Bernardo Cesare (distributed via imaggeo.egu.eu)

A polarized light photomicrograph of a thin section of a charoite-bearing schist. Charoite is a rare silicate found only at one location in Yakutia, Russia. For its beautiful and uncommon purple color it is used as a semi-precious stone in jewelry.

Under the microscope charoite-bearing rocks give an overall feeling of movement, with charoite forming fibrous mats that swirl and fold as a result of deformation during metamorphism. It may be difficult to conceive, but these microstructures tell us that solid rocks can flow!

Refuge in a cloudscape. Credit: Julien Seguinot (distributed via imaggeo.egu.eu)

The action of glaciers combined with the structure of the rock to form this little platform, probably once a small lake enclosed between a moraine at the mountain side and the ice in the valley.

Now it has become a green haven in the mountain landscape, a perfect place for an alp. In the Alps, stratus clouds opening up on autumn mornings often create gorgeous light displays.

Antarctic Fur Seal and columnar basalt Credit: Etienne Pauthenet (distributed via imaggeo.egu.eu).

This female fur seal is sitting on hexagonal columns of basalt rock that can be found at Pointe Suzanne at the extreme east of the Kerguelen Islands, near Antarctica. This photo was the November 2018 header image for our social media channels.

Silent swamp predator. Credit: Nikita Churilin (distributed via imaggeo.egu.eu).

A macro shot of a modified leaf of the sundew Drosera rotundifolia waiting for an insect at the Krugloe swamp. This photo was the January 2018 header image and one of the finalists in the 2017 Imaggeo Photo Competition.

Once there was a road…the clay wall. Credit: Chiara Arrighi (distributed via imaggeo.egu.eu)

The badlands valley of Civita di Bagnoregio is a hidden natural gem in the province of Viterbo, Italy, just 100 kilometres from Rome. Pictured here is the ‘wall,’ one of the valley’s most peculiar features, where you can even find the wooden structural remains of a trail used for agricultural purposes in the 19th and 20th centuries.

New life on ancient rock. Credit: Gerrit de Rooij (distributed via imaggeo.egu.eu).

“After two days of canoeing in the rain on lake Juvuln in the western part of the middle of Sweden, the weather finally improved in the evening, just before we reached the small, unnamed, uninhabited but blueberry-rich island on which this picture was taken. The wind was nearly gone, and the ragged clouds were the remainder of the heavier daytime cloud cover,” said Gerrit de Rooij, who took this photograph and provided some information about the picture, which features some of the oldest rocks in the world but is bursting with new life, in this blog post.

Cordillera de la Sal. Credit: Martin Mergili (distributed via imaggeo.egu.eu)

The photograph shows the Valle de la Luna, part of the amazing Cordillera de la Sal mountain range in northern Chile. Rising only 200 metres above the basin of the Salar de Atacama salt flat, the ridges of the Cordillera de la Sal represent a strongly folded sequence of clastic sediments and evaporites (salt can be seen in the left portion of the image), with interspersed volcanic material.

Robberg Peninsula – a home of seals. Credit: Elizaveta Kovaleva (distributed via imaggeo.egu.eu).

“This picture is taken from the Robberg Peninsula, one of the most beautiful places, and definitely one of my favorite places in South Africa. The Peninsula forms the Robberg Nature Reserve and is situated close to the Plettenberg Bay on the picturesque Garden Route. “Rob” in Dutch means “seal”, so the name of the Peninsula is translated as “the seal mountain”. This name was given to the landmark by the early Dutch mariners, who observed large colonies of these noisy and restless animals on the rocky cliffs of the Peninsula,” said Elizaveta Kovaleva in this blog post.

The great jump of the Tequendama. Credit: Maria Cristina Arenas Bautista (distributed via imaggeo.egu.eu)

Tequendama Falls is a natural waterfall in Colombia. This blog post highlights a Colombian myth about the origins of the waterfall, which is tied to a real climate event.

If you pre-register for the 2019 General Assembly (Vienna, 07–12 April), you can take part in our annual photo competition! From 15 January until 15 February, every participant pre-registered for the General Assembly can submit up to three original photos and one moving image related to the Earth, planetary, and space sciences in competition for free registration to next year’s General Assembly! These can include fantastic field photos, a stunning shot of your favourite thin section, what you’ve captured out on holiday or under the electron microscope – if it’s geoscientific, it fits the bill. Find out more about how to take part at http://imaggeo.egu.eu/photo-contest/information/.