What controlled the evolution of Plate Tectonics on Earth?

Great Unconformity - Immensity River, Grand Canyon
Stephan Sobolev

Prof. Dr. Stephan Sobolev. Head of the Geodynamic Modelling section of GFZ Potsdam.

Plate tectonics is a key geological process on Earth, shaping its surface and making it unique among the planets in the Solar System. Yet how plate tectonics emerged and which factors controlled its evolution remain controversial. A recently published paper in Nature by Sobolev and Brown suggests new ideas to solve this problem.

What makes plate tectonics possible on contemporary Earth?

It is widely accepted that plate tectonics is driven by mantle convection, but is the presence of convection sufficient for plate tectonics? The answer is no; otherwise plate tectonics would be present on Mars and Venus and not only on Earth. The geodynamic community has recognized that another necessary condition for plate tectonics is low strength at plate boundaries, particularly along the plate interfaces in subduction zones (e.g. Zhong and Gurnis 1992, Tackley 1998, Moresi and Solomatov 1998, Bercovici 2003). To quantify the required strength at subduction interfaces, we have used global models of plate tectonics (Fig. 1A) that combine a finite element numerical technique employing visco-elasto-plastic rheology to model deformation in the upper 300 km of the Earth (Popov and Sobolev 2008) with a spectral code to model convection in the deeper mantle (Steinberger and Calderwood 2006). The model reproduces present-day plate velocities well if the effective friction at convergent plate boundaries is about 0.03 (Fig. 1B). Such low strength corresponds to subduction interfaces that are well lubricated by continental sediments (low friction: Lamb and Davis 2003, Sobolev and Babeyko 2005; or low viscosity: Behr and Becker 2018). If sediments were in short supply in the trenches (corresponding to a friction coefficient of 0.07-0.1), plate velocities would first drop by about a factor of two (Fig. 1C) and then decrease even further, because less negatively buoyant material would be subducted into the mantle, reducing the driving force of convection.
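To get a feel for what an effective friction coefficient of 0.03 means in terms of stress, here is a back-of-the-envelope sketch (not part of the study itself); it assumes the normal stress on the interface is of the order of the lithostatic pressure at a representative depth, and all numbers are purely illustrative:

# Back-of-the-envelope estimate of the shear stress on a subduction interface,
# tau = mu_eff * sigma_n, assuming the normal stress sigma_n is of the order of
# the lithostatic pressure rho*g*z. All numbers are illustrative assumptions.

rho = 3000.0   # average rock density [kg/m^3] (assumed)
g = 9.81       # gravitational acceleration [m/s^2]
z = 30e3       # representative interface depth [m] (assumed)

sigma_n = rho * g * z                 # ~0.9 GPa lithostatic normal stress

for mu_eff in (0.03, 0.07, 0.10):     # lubricated vs. sediment-starved interface
    tau = mu_eff * sigma_n
    print(f"effective friction {mu_eff:4.2f} -> shear stress ~ {tau / 1e6:3.0f} MPa")

With these assumed numbers, a well-lubricated interface supports only a few tens of MPa of shear stress, roughly an order of magnitude less than a dry, Byerlee-type friction of 0.6-0.85 would imply.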

Figure 1. Global numerical model showing the effect of sediments on contemporary subduction. (A) The global model combines two computational domains coupled through continuity of velocities and tractions at 300 km depth. (B) NUVEL 1A plate velocities in a no-net-rotation reference frame (black arrows) versus computed velocities (blue arrows) for the global model with a friction of 0.03 at convergent plate boundaries. (C) Root mean square of computed plate velocities in the global model versus friction coefficient at convergent plate boundaries.

Hypothesis and its testing

Based on the previous discussion, we infer that continental sediments in subduction channels act as a lubricant for subduction. In addition, the presence of these sediments in trenches is a necessary condition for the stable operation of plate tectonics, particularly earlier in Earth’s evolution when the mantle was warmer and slabs were relatively weak. With this hypothesis we challenge the popular view that secular cooling of the Earth has been the only major control on the evolution of plate tectonics since about 3 Ga. The hypothesis predicts that periods of stable plate tectonics should follow widespread surface erosion events, whereas times of diminished surface erosion should be associated with reduced subduction and possibly intermittent plate tectonics.

We test this prediction using geological proxies believed to record plate tectonic activity (supercontinental cycles) and geochemical proxies that trace the influence of the continental crust on the composition of seawater (Sr isotopes in ocean sediments; Shields 2007) and of continental sediments in the source of subduction-related magmas (oxygen and Hf isotopes in zircons; Cawood et al. 2013, Spencer et al. 2017). All three geochemical markers indeed show that just before or at the beginning of supercontinental cycles the influence of sediments increases, while it decreases before periods of diminished plate tectonic activity, such as the boring billion between 1.7 and 0.7 Ga (Cawood and Hawkesworth 2014; Fig. 2). The largest surface erosion and subduction lubrication events were likely associated with the global glaciation events identified at the beginning (2.5-2.2 Ga) and at the end (0.7-0.6 Ga) of the Proterozoic Eon (Hoffman and Schrag 2002). The latter snowball Earth glaciation terminated the boring billion period and kick-started the modern phase of active plate tectonics.

Another prediction of our hypothesis is that in order to start plate tectonics, continents had to rise above sea level and provide sediments to the oceans. This prediction is again consistent with observations: there are many arguments for the beginning of plate tectonics between 3 and 2.5 Ga (see the review of Condie 2018), and this is also most likely the period when the continents rose above sea level (Korenaga et al. 2017).

Figure 2. Cartoon summarizing the factors that control the emergence and evolution of plate tectonics on Earth. Enhanced surface erosion due to the rise of the continents and major glaciations stabilized subduction and plate tectonics for some periods after 3 Ga and, particularly after 0.7 Ga, once the mantle had cooled. Blue boxes mark major glaciations; transparent green rectangles show the time intervals when all three geochemical proxies consistently indicate increasing sediment influence (major lubrication events); and a thick black dashed curve separates hypothetical domains of stable and unstable plate tectonics. The reddish domain shows the number of passive margins (Bradley 2008), used here as a proxy for plate tectonic intensity.

What was before plate tectonics?

The earlier geodynamic regime could have involved episodic lid overturn and resurfacing due to retreating large-scale subduction triggered by mantle plumes (Gerya et al. 2015) or meteoritic impacts (O’Neill et al. 2017). Retreating slabs would bring water into the upwelling hot asthenospheric mantle, generating a large volume of magma that formed protocontinents. Extension of the protocontinental crust could have produced nascent subduction channels (Rey et al. 2014) along the edges of the protocontinents lubricated by the sediments. In this way, a global plate tectonics regime could have evolved from a retreating subduction regime.

What is next?

Despite the support from existing data, more geochemical information is required to conclusively test our hypothesis about the role of sediments in the evolution of plate tectonics. Additionally, this hypothesis must be fully quantified, which in turn will require coupled modelling of mantle convection and plate tectonics, surface processes and climate.

References
Behr, W. M. and Becker, T. W. Sediment control on subduction plate speeds. Earth Planet. Sci. Lett. 502, 166-173 (2018).

Bercovici, D. The generation of plate tectonics from mantle convection. Earth Planet. Sci. Lett. 205, 107–121 (2003).

Bradley, D. C. Passive margins through earth history. Earth Sci. Rev. 91, 1-26 (2008).

Cawood, P. A., Hawkesworth, C. J. and Dhuime, B. The continental record and the generation of continental crust. Geol. Soc. Amer. Bull. 125, 14-32 (2013).

Cawood, P. A. and Hawkesworth, C. J. Earth's middle age. Geology 42, 503-506 (2014).

Condie, K. C. A planet in transition: The onset of plate tectonics on Earth between 3 and 2 Ga? Geosci. Front. 9, 51-60 (2018).

Gerya, T.V. et al. Plate tectonics on the Earth triggered by plume-induced subduction initiation. Nature 527, 221-225 (2015).

Hoffman, P. F. and Schrag, D. P. The snowball Earth hypothesis: testing the limits of global change. Terra Nova 14, 129–155 (2002).

Korenaga, J., Planavsky, N. J. and Evans, D. A. D. Global water cycle and the coevolution of the Earth's interior and surface environment. Phil. Trans. R. Soc. A 375, 20150393 (2017).

Lamb, S. and Davis, P. Cenozoic climate change as a possible cause for the rise of the Andes. Nature 425, 792-797 (2003).

Moresi, L. and Solomatov, V. Mantle convection with a brittle lithosphere: Thoughts on the global tectonic style of the Earth and Venus. Geophys. J. Int. 133, 669-682 (1998).

O’Neill, C. et al. Impact-driven subduction on the Hadean Earth. Nature Geosci. 10, 793-797 (2017).

Popov, A.A. and Sobolev, S. V. SLIM3D: A tool for three-dimensional thermomechanical modeling of lithospheric deformation with elasto-visco-plastic rheology. Phys. Earth Planet. Inter. 171, 55-75 (2008).

Rey, P. F., Coltice, N. and Flament, N. Spreading continents kick-started plate tectonics. Nature 513, 405–408 (2014).

Shields, G. A. A normalised seawater strontium isotope curve: possible implications for Neoproterozoic-Cambrian weathering rates and the further oxygenation of the Earth. eEarth 2, 35-42 (2007).

Sobolev, S. V. and Babeyko, A. Y. What drives orogeny in the Andes? Geology 33, 617-620 (2005).

Spencer, C. J., Roberts, N. M. W. and Santosh, M. Growth, destruction, and preservation of Earth's continental crust. Earth. Sci. Rev. 172, 87-106 (2017).

Steinberger, B. and Calderwood, A. Models of large-scale viscous flow in the Earth’s mantle with constraints from mineral physics and surface observations. Geophys. J. Int. 167, 1461–1481 (2006).

Tackley, P. J. Self-consistent generation of tectonic plates in three-dimensional mantle convection. Earth Planet. Sci. Lett. 157, 9-22, (1998).

Zhong, S. and Gurnis, M. Viscous flow model of a subduction zone with a faulted lithosphere: long and short wavelength topography, gravity and geoid. Geophys. Res. Lett. 19, 1891–1894 (1992).

 

The past is the key

Lorenzo Colli

“The present is the key to the past” is an oft-used phrase in the context of understanding our planet’s complex evolution. But this perspective can also be flipped, reflected, and reframed. In this Geodynamics 101 post, Lorenzo Colli, Research Assistant Professor at the University of Houston, USA, showcases some of the recent advances in modelling mantle convection.

 

Mantle convection is the fundamental process that drives a large part of the geologic activity at the Earth’s surface. Indeed, mantle convection can be framed as a dynamical theory that complements and expands the kinematic theory of plate tectonics: on the one hand it aims to describe and quantify the forces that cause tectonic processes; on the other, it provides an explanation for features – such as hotspot volcanism, chains of seamounts, large igneous provinces and anomalous non-isostatic topography – that aren’t accounted for by plate tectonics.

Mantle convection is both very simple and very complicated. In its essence, it is simply thermal convection: hot (and lighter) material goes up, cold (and denser) material goes down. We can describe thermal convection using classical equations of fluid dynamics, which are based on well-founded physical principles: the continuity equation enforces conservation of mass; the Navier-Stokes equation deals with conservation of momentum; and the heat equation embodies conservation of energy. Moreover, given the extremely large viscosity of the Earth’s mantle and the low rates of deformation, inertia and turbulence are utterly negligible and the Navier-Stokes equation can be simplified accordingly. One incredible consequence is that the flow field only depends on an instantaneous force balance, not on its past states, and it is thus time reversible. And when I say incredible, I really mean it: it looks like a magic trick. Check it out yourself.
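For reference, one common incompressible (Boussinesq) form of these equations is written below; individual codes differ in their approximations (compressibility, variable properties, internal heating), so take this as a generic sketch rather than the exact system solved in any particular study.

\nabla \cdot \mathbf{u} = 0 \qquad \text{(conservation of mass)}

-\nabla p + \nabla \cdot \left[ \eta \left( \nabla \mathbf{u} + \nabla \mathbf{u}^{T} \right) \right] + \rho_0 \alpha (T - T_0) \, g \, \hat{\mathbf{z}} = \mathbf{0} \qquad \text{(conservation of momentum, inertia neglected)}

\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T = \kappa \nabla^{2} T + H \qquad \text{(conservation of energy)}

Here \mathbf{u} is the velocity, p the dynamic pressure, \eta the viscosity, \rho_0 the reference density, \alpha the thermal expansivity, T the temperature, \kappa the thermal diffusivity and H an internal heating term. Note that no \partial \mathbf{u} / \partial t term appears in the momentum equation: the velocity field is determined entirely by the instantaneous buoyancy distribution, which is what makes the flow time reversible.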

With four parameters I can fit an elephant, and with five I can make him wiggle his trunk

This is as simple as it gets, in the sense that from here onward every additional aspect of mantle convection results in a more complex system: 3D variations in rheology and composition; phase transitions, melting and, more generally, the thermodynamics of mantle minerals; the feedbacks between deep Earth dynamics and surface processes. Each of these additional aspects results in a system that is harder and costlier to solve numerically, so much so that numerical models need to compromise, including some but excluding others, or giving up dimensionality, domain size or the ability to advance in time. More importantly, most of these aspects are so-called subgrid-scale processes: they deal with the macroscopic effect of some microscopic process that cannot be modelled at the same scale as the macroscopic flow and is too costly to model at the appropriate scale. Consequently, it needs to be parametrized. To make matters worse, some of these microscopic processes are not understood sufficiently well to begin with: the parametrizations are not formally derived from first-principle physics but are long-range extrapolations of semi-empirical laws. The end result is that it is possible to generate more complex – thus, in this regard, more Earth-like – models of mantle convection at the cost of an increase in tunable parameters. But what parameters give a truly better model? How can we test it?

Figure 1: The mantle convection model on the left runs in ten minutes on your laptop. It is not the Earth. The one on the right takes two days on a supercomputer. It is fancier, but it is still not the real Earth.

Meteorologists face similar issues with their models of atmospheric circulation. For example, processes related to turbulence, clouds and rainfall need to be parametrized. Early weather forecast models were… less than ideal. But meteorologists can compare their model predictions every day with what actually occurs, thus objectively and quantitatively assessing what works and what doesn’t. As a result, weather predictions have improved steadily during the last 40 years (Bauer et al., 2015). Current models are better at using available information (what is technically called data assimilation; more on this later) and have parametrizations that better represent the physics of the underlying processes.

If time travel is possible, where are the geophysicists from the future?

We could do the same, in theory. We can initialize a mantle convection model with some best estimate for the present-day state of the Earth’s mantle and let it run forward into the future, with the explicit aim of forecasting its future evolution. But mantle convection evolves over millions of years instead of days, thus making future predictions impractical. Another option would be to initialize a mantle convection model in the distant past and run it forward, thus making predictions-in-the-past. But in this case we really don’t know the state of the mantle in the past. And as mantle convection is a chaotic process, even a small error in the initial condition quickly grows into a completely different model trajectory (Bello et al., 2014). One can mitigate this chaotic divergence by using data assimilation and imposing surface velocities as reconstructed by a kinematic model of past plate motions (Bunge et al., 1998), which indeed tends to bring the modelled evolution closer to the true one (Colli et al., 2015). But it would take hundreds of millions of years of error-free plate motions to eliminate the influence of the unknown initial condition.

As I mentioned before, the flow field is time reversible, so one can try to start from the present-day state and integrate the governing equations backward in time. But while the flow field is time reversible, the temperature field is not. Heat diffusion is physically irreversible and mathematically unstable when solved back in time. Plainly said, the temperature field blows up. Heat diffusion needs to be turned off [1], thus keeping only heat advection. This approach, aptly called backward advection (Steinberger and O’Connell, 1997), is limited to only a few tens of millions of years in the past (Conrad and Gurnis, 2003; Moucha and Forte, 2011): the errors induced by neglecting heat diffusion add up and the recovered “initial condition”, when integrated forward in time (or should I say, back to the future), doesn’t land back at the desired present-day state, following instead a divergent trajectory.
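A toy illustration of why the backward heat equation misbehaves (a generic 1D finite-difference sketch of my own, nothing to do with the actual codes cited above): diffusing a smooth temperature profile forward in time is stable, but integrating the same equation backward amplifies grid-scale noise without bound.

import numpy as np

# Generic 1D finite-difference sketch: forward heat diffusion smooths and is
# stable, but integrating the same equation backward in time amplifies
# grid-scale noise without bound.

n, dx, kappa = 100, 1.0, 1.0
dt = 0.2 * dx**2 / kappa                 # respects the explicit stability limit
x = np.arange(n) * dx
T0 = np.exp(-((x - 50.0) / 5.0) ** 2)    # smooth initial temperature anomaly

def diffuse(T, nsteps, sign=+1):
    """Explicit diffusion with periodic boundaries; sign=-1 runs time backward."""
    T = T.copy()
    for _ in range(nsteps):
        laplacian = np.roll(T, -1) - 2.0 * T + np.roll(T, 1)
        T += sign * kappa * dt / dx**2 * laplacian
    return T

T_forward = diffuse(T0, 500)                    # smooth and bounded
T_backward = diffuse(T_forward, 500, sign=-1)   # attempt to undo the diffusion
print("max |T| after forward run :", np.abs(T_forward).max())
print("max |T| after backward run:", np.abs(T_backward).max())   # blows up

Running this, the forward field stays smooth and bounded, while the backward field grows by many orders of magnitude, which is precisely why backward advection schemes switch diffusion off.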

Per aspera ad astra

As all the simple approaches turn out to be either unfeasible or unsatisfactory, we need to turn our attention to more sophisticated ones. One option is to be more clever about data assimilation, for example using a Kalman filter (Bocher et al., 2016; 2018). This methodology allows the physics of the system, as embodied by the numerical model, to be combined with observational data, while at the same time taking into account their relative uncertainties. A different approach is to pose a formal inverse problem aimed at finding the “optimal” initial condition that evolves into the known (best-estimate) present-day state of the mantle. This inverse problem can be solved using the adjoint method (Bunge et al., 2003; Liu and Gurnis, 2008), a rather elegant mathematical technique that exploits the physics of the system to compute the sensitivity of the final condition to variations in the initial condition. Both methodologies are computationally very expensive. Like, many millions of CPU-hours expensive. But they allow for explicit predictions of the past history of mantle flow (Spasojevic & Gurnis, 2012; Colli et al., 2018), which can then be compared with evidence of past flow states as preserved by the geologic record, for example in the form of regional- and continental-scale unconformities (Friedrich et al., 2018) and planation surfaces (Guillocheau et al., 2018). The past history of the Earth thus holds the key to significantly advancing our understanding of mantle dynamics, by allowing us to test and improve our models of mantle convection.
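To make the adjoint idea less abstract, here is a deliberately tiny linear toy problem (my own sketch, not the actual mantle-flow adjoint of the cited papers): the forward "model" applies an assumed linear operator for a number of steps, and the gradient of the present-day misfit with respect to the initial condition is obtained by propagating the residual backward with the transposed operator.

import numpy as np

# Tiny linear toy problem to illustrate the adjoint idea. The "forward model"
# applies an assumed linear operator A for nsteps "time steps" to an initial
# state x0; we seek the x0 whose final state matches the observed present-day
# state d. The gradient of the misfit with respect to x0 is obtained by
# propagating the residual backward with the transposed (adjoint) operator.

rng = np.random.default_rng(0)
n, nsteps = 20, 30
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = 0.98 * Q                        # assumed, mildly dissipative toy "physics"
x_true = rng.standard_normal(n)     # unknown true initial condition

def forward(x):                     # evolve an initial state to "present day"
    for _ in range(nsteps):
        x = A @ x
    return x

d = forward(x_true)                 # synthetic present-day "observations"
x0 = np.zeros(n)                    # first guess (the orange circle in Fig. 2)

for _ in range(200):                # iterative corrections (the green stars)
    residual = forward(x0) - d      # mismatch at present day
    g = residual
    for _ in range(nsteps):         # adjoint run, backward over the time steps
        g = A.T @ g
    x0 -= g                         # shift the initial condition downhill

print("final present-day misfit:", np.linalg.norm(forward(x0) - d))
print("initial-condition error :", np.linalg.norm(x0 - x_true))

In a real mantle-convection adjoint the forward operator is nonlinear and the adjoint equations must be derived and discretized explicitly, but the structure of the iteration (forward run, misfit, adjoint run, correction of the initial condition) is the same as in Figure 2.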

Figure 2: A schematic illustration of a reconstruction of past mantle flow obtained via the adjoint method. Symbols represent model states at discrete times. They are connected by lines representing model evolution over time. The procedure starts from a first guess of the state of the mantle in the distant past (orange circle). When evolved in time (red triangles) it will not reproduce the present-day state of the real Earth (purple cross). The adjoint method tells you in which direction the initial condition needs to be shifted in order to move the modeled present-day state closer to the real Earth. By iteratively correcting the first guess an optimized evolution (green stars) can be obtained, which matches the present-day state of the Earth.

1. Or even reversed in sign, to make the time-reversed heat equation unconditionally stable.

Thoughts on geological modelling: an analogue perspective

In geodynamics we study the dynamics of the Earth (and other planets). We ground our studies in as much data as possible; however, we are constrained by the fact that almost all the direct information we can collect from the interior of the Earth only shows its present-day state. The surface rock record gives us a glimpse into the past dynamics and evolution of our planet, but this record gets sparser as we go back in time. This is why modelling is commonly used in geodynamics to fill this knowledge gap. There are different types of modelling, and this week João Duarte writes about the importance of analogue modelling.

João Duarte. Researcher at Instituto Dom Luiz and Invited Professor at the Geology Department, Faculty of Sciences of the University of Lisbon. Adjunct Researcher at Monash University.

The first time I went to EGU, in 2004, I presented a poster with some new marine geology data and a few sets of analogue models. I was doing accretionary wedges in sandboxes. At the time, I was in the third year of my bachelor’s degree and I was completely overwhelmed by the magnitude of the conference. It was incredible to see the faces of all those scientists whose articles I used to read. But one thing impressed me the most: Earth Sciences were consolidating as a modern, theory-based science. The efforts to develop an integrated dynamic theory of plate tectonics as part of mantle convection were obvious. The new emergent numerical models looked incredible, with all those colours, complex rheologies and stunning visualizations that allowed us to “see” stresses, temperature gradients and non-linear viscosities. I was amazed.

Analogue modelling was at a relative peak in 2004; however, some also anticipated that it would quickly disappear (and indeed several analogue labs have closed since). It was with this mindset that I later did the experiments for my PhD, which I finished in 2012 (Duarte et al., 2011). But I was fortunate. My supervisors, Filipe Rosas and Pedro Terrinha, took me to state-of-the-art labs, namely Toronto and Montpellier (led at the time by Sandy Cruden and Jacques Malavieille, respectively), and I started to develop a passion for this kind of modelling. When I moved to Monash for a post-doc position, in 2011, this turned out to be a great advantage. There, modellers such as Wouter Schellart, Louis Moresi, Fabio Capitanio, David Boutelier and Sandy Cruden (yes, I met Sandy again at Monash) were using analogue models to benchmark numerical models. Why? Because many times, even though numerical models produce spectacular results, they might not be physically consistent. And there is only one way to check this: to make sure that whatever numerical code we are using can reproduce simple experiments that we can run in a lab. The classical example is the sinking of a negatively buoyant sphere in a viscous medium.

Sandbox analogue model of an accretionary wedge. Part of the same experiment as shown in the header figure. Here, a sliced section cut after wetting, is shown. University of Lisbon, 2009. Experiments published in Duarte et al. (2011).

That was what we were doing at Monash. I worked with Wouter Schellart on the development of subduction experiments with an overriding plate, which were advancing step by step in both analogue and numerical schemes (see e.g. Duarte et al., 2013 and Chen et al., 2015, 2016 for the analogue models, and Schellart and Moresi, 2013 for numerical equivalents). The tricky bit was that we wanted self-consistent dynamic experiments in which we prescribed the forces (negative buoyancy of the slab, viscosity of the upper mantle, etc.) and let the kinematics (i.e. the velocities) be an emergent phenomenon. So, no lateral push or active kinematic boundary conditions were applied to the plates. This is because we now recognize that, in general, it is slab pull at subduction zones that mainly drives the plates, and not the other way around. Therefore, if we want to investigate the fundamental physics and dynamics of subduction zones, we need to use self-consistent models (both analogue and numerical). In order to carry out these models, we had to develop a new rheology for the subduction interface, which is a complex problem in both the analogue and the numerical approaches (Duarte et al. 2013, 2014, 2015). But that is another very long story that would merit a publication by itself.

Analogue models of subduction with an overriding plate and an interplate rheology. Monash University, 2012. Adapted from Duarte et al. (2013)

But what is analogue modelling all about? Basically, analogue models are scaled models that we can build in the laboratory using analogue materials (such as sand) that, at the scale of our models, have physical properties similar to those of natural materials (such as brittle rocks). As is widely known, under certain circumstances (at large time and space scales) rocks behave like fluids, and for that we use analogue fluids, such as silicone putties, glucose and honey. We can also use fluids to simulate the interaction between subduction zones and mantle plumes in a fluid reservoir (see the figures below and the links to videos of scaled experiments using three different fluids to study slab-plume interaction; Meriaux et al., 2015a, 2015b, 2016). These are generally called geodynamic analogue models.

End of a slab-plume experiment in the upper mantle (see below). The tank is partially filled with glucose. The slab (laying at the analogue 660 discontinuity) is made of silicone mixed with iron powder. The plume is made of a water solution of glucose dyed with a red colorant. And that’s me on the left. Monash University, 2014.

I usually consider two main branches of analogue models. The first, which is the one mostly used by geologists, was started by Sir James Hall (1761-1832), who squeezed layers of clay to reproduce the patterns of folded rocks that he had observed in nature. This method was later improved by King Hubbert (1937), who laid the groundwork for the field by developing a theory of scaling for analogue models applied to geological processes.
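To give a flavour of what such a scaling theory does in practice, here is a small back-of-the-envelope sketch; the length ratio and material properties are generic textbook-style assumptions, not the values of any specific published experiment:

# Rough worked example of Hubbert-style scaling for a viscous analogue
# experiment. All numbers are illustrative assumptions, not values from a
# specific published model.

L_ratio   = 1e-7            # 1 cm in the lab ~ 100 km in nature (assumed)
rho_ratio = 1400.0 / 3300.0 # silicone vs. mantle density (assumed)
g_ratio   = 1.0             # experiments run at Earth gravity
eta_ratio = 1e4 / 1e21      # silicone putty vs. upper-mantle viscosity (assumed)

stress_ratio = rho_ratio * g_ratio * L_ratio   # buoyancy stresses scale as rho*g*L
time_ratio   = eta_ratio / stress_ratio        # viscous time scale: t ~ eta / stress

one_myr = 1e6 * 365.25 * 24 * 3600             # seconds in a million years
print(f"1 Myr in nature ~ {time_ratio * one_myr / 3600:.1f} hours in the lab")

With ratios of this order, a million years of slab sinking corresponds to a couple of hours in the tank, which is what makes such experiments practical.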

The other branch is probably as old as humans. It began when we started to manipulate objects and use them to understand basic empirical laws, such as the fact that objects always fall. When Galileo used small spheres on inclined surfaces to extract the physical laws that describe the movement of bodies, from rocks to planets, he was in a certain way using analogue models. He understood that many laws are scale invariant. Still today, these techniques are widely used by physicists and engineers to understand, for example, the aerodynamics of airplanes, the stability of bridges, the dynamics of rivers or the resistance of dams. They use scaled models that reproduce, at suitable laboratory scales, the objects and processes they are investigating.

What we did at Monash was a mixture of both approaches, though we were less interested in exactly reproducing nature from a purely geometric and kinematic point of view than in understanding the physics of the object we were investigating: subduction zones. Therefore, we had to guarantee that we were using the correct dynamic approach in order to extract generic physical empirical laws, hoping that these laws would provide us with some insight into the dynamics of natural subduction zones. These empirical laws could readily be incorporated into numerical models, which would then help to explore the space of controlling parameters of the system more efficiently.

Slab-Plume interaction in the upper mantle. Experiments published in Meriaux et al. (2015a, 2015b).

I want to finish with a question that I believe concerns all of us: are there still advantages in using analogue models? Yes, I believe so! One of the most important advantages is that analogue models are always three-dimensional and high-resolution. Furthermore, they allow good tracking of strain and of how it develops in discontinuous media, for example when investigating the localization of deformation or the propagation of cracks. Numerical schemes still struggle with these problems. It is very difficult to have an efficient code that can deal simultaneously with very high resolution and large-scale three-dimensional problems, as is required to investigate the process of subduction. Nevertheless, numerical models are of great help when it comes to tracking stresses and modelling complex rheologies and temperature gradients. To sum up: nowadays we recognize that certain problems can only be tackled using self-consistent dynamic models that capture the whole system in three dimensions across different scales. For this, the combination of analogue and numerical models is still one of the most powerful tools we have. An interesting example of a field in which a combined approach is being used is the fascinating investigation of the seismic cycle (for example, see here).

Links to videos:

VIDEO 1: https://www.youtube.com/watch?v=U1TXC2XPbFA&feature=youtu.be
(Subduction with an overriding plate and an interplate rheology. Duarte et al., 2013)

VIDEO 2: https://www.youtube.com/watch?v=n5P2TzS6h_0&feature=youtu.be
(Slab-plume interaction at mantle scale. Side-view of the experiment on the top, and top-view of the experiment on the bottom. Meriaux et al., 2016)
References:

Chen, Z., Schellart, W.P., Strak, V., Duarte, J.C., 2016. Does subduction-induced mantle flow drive backarc extension? Earth and Planetary Science Letters 441, 200-210. https://doi.org/10.1016/j.epsl.2016.02.027

Chen, Z., Schellart, W.P., Duarte, J.C., 2015. Overriding plate deformation and variability of forearc deformation during subduction: Insight from geodynamic models and application to the Calabria subduction zone. Geochemistry, Geophysics, Geosystems 16, 3697–3715. https://doi.org/10.1002/2015GC005958

Duarte, J.C., Schellart, W.P., Cruden, A.R., 2015. How weak is the subduction zone interface? Geophysical Research Letters 41, 1-10. https://doi.org/10.1002/2014GL062876

Duarte, J.C., Schellart, W.P., Cruden, A.R., 2014. Rheology of petrolatum – paraffin oil mixtures: applications to analogue modelling of geological processes. Journal of Structural Geology 63, 1-11. https://doi.org/10.1016/j.jsg.2014.02.004

Duarte, J.C., Schellart, W.P., Cruden, A.R., 2013. Three-dimensional dynamic laboratory models of subduction with an overriding plate and variable interplate rheology. Geophysical Journal International 195, 47-66. https://doi.org/10.1093/gji/ggt257

Duarte, J.C., Rosas, F.M., Terrinha, P., Gutscher, M.-A., Malavieille, J., Silva, S., Matias, L., 2011. Thrust–wrench interference tectonics in the Gulf of Cadiz (Africa–Iberia plate boundary in the North-East Atlantic): Insights from analog models. Marine Geology 289, 135–149. https://doi.org/10.1016/j.margeo.2011.09.014

Hubbert, M.K., 1937. Theory of scale models as applied to the study of geologic structures. GSA Bulletin 48, 1459-1520. https://doi.org/10.1130/GSAB-48-1459

Meriaux, C., Meriaux, A-S., Schellart, W.P., Duarte, J.C., Duarte, S.S., Chen, Z., 2016. Mantle plumes in the vicinity of subduction zones. Earth and Planetary Science Letters 454, 166-177. https://doi.org/10.1016/j.epsl.2016.09.001

Mériaux, C.A., Duarte, J.C., Schellart, W.P., Mériaux, A-S., 2015. A two-way interaction between the Hainan plume and the Manila subduction zone. Geophysical Research Letters 42, 5796–5802. https://doi.org/10.1002/2015GL064313

Meriaux, C.A., Duarte, J.C., Duarte, S., Chen, Z., Rosas, F.M., Mata, J., Schellart, W.P., and Terrinha, P. 2015. Capture of the Canary mantle plume material by the Gibraltar arc mantle wedge during slab rollback. Geophysical Journal International 201, 1717-1721. https://doi.org/10.1093/gji/ggv120

Schellart, W.P., Moresi, L., 2013. A new driving mechanism for backarc extension and backarc shortening through slab sinking induced toroidal and poloidal mantle flow: Results from dynamic subduction models with an overriding plate. Journal of Geophysical Research: Solid Earth 118, 3221-3248. https://doi.org/10.1002/jgrb.50173

Reproducible Computational Science

 

Krister with his bat-signal shirt for reproducibility.

We’ve all been there: you’re reading through a great new paper, keen to get to the Data Availability statement, only to find nothing listed, or the uninspiring “data provided on request”. This week Krister Karlsen, PhD student at the Centre for Earth Evolution and Dynamics (CEED), University of Oslo, shares some context and tips for increasing the reproducibility of your research from a computational science perspective. Spread the good word and reach for the “Gold Standard”!

Historically, computational methods and modelling have been considered the third avenue of the sciences; today they are among the most important, on a par with experimental and theoretical approaches. Thanks to the rapid development of electronics and theoretical advances in numerical methods, mathematical models combined with strong computing power provide an excellent tool to study what we cannot observe or sample directly (Fig. 1). In addition to being able to simulate complex physical phenomena on computer clusters, these advances have drastically improved our ability to gather and examine high-dimensional data. For these reasons, computational science is in fact the leading tool in many branches of physics, chemistry, biology, and geodynamics.

Figure 1: Time–depth diagram presenting availability of geodynamic data. Modified from (Gerya, 2014).

A side effect of the improvement of methods for simulation and data gathering is the availability of a vast variety of software packages and huge data sets. This poses a challenge in terms of documentation sufficient to allow a study to be reproduced. With great computing power comes great responsibility.

“Non-reproducible single occurrences are of no significance to science.” – Popper (1959)

Reproducibility is the cornerstone of cumulative science; the ultimate standard by which scientific claims are judged. With replication, independent researchers address a scientific hypothesis and build up evidence for, or against, it. This methodology represents the self-correcting path that science should take to ensure robust discoveries, separating science from pseudoscience. Reports indicate increasing pressure to publish manuscripts whilst applying for competitive grants and positions (Baker, 2016). Furthermore, a growing burden of bureaucracy takes away precious time from designing experiments and doing research. As the time available for actual research decreases, the number of articles that mention a “reproducibility crisis” rises towards its present-day peak (Fig. 2). Does this mean we have become sloppy in terms of proper documentation?

Figure 2: Number of titles, abstracts, or keywords that contain one of the following phrases: “reproducibility crisis,” “scientific crisis,” “science in crisis,” “crisis in science,” “replication crisis,” “replicability crisis”, found in the Web of Science records. Modified from (Fanelli, 2018).

Are we facing a reproducibility crisis?

A survey conducted by Nature asked 1,576 researchers this exact question and reported that 52% responded with “Yes, a significant crisis” and 38% with “Yes, a slight crisis” (Baker, 2016). Perhaps more alarming is that 70% report they have unsuccessfully tried to reproduce another scientist’s findings, and more than half have failed to reproduce their own results. To what degree these statistics apply to our own field of geodynamics is not clear, but it is nonetheless a timely reminder that reproducibility must remain at the forefront of our dissemination. Multiple journals have implemented policies on data and software sharing upon publication to ensure that computational science can be replicated and reproduced. But how well are they working? A recent empirical analysis of journal policy effectiveness for computational reproducibility sheds light on this issue (Stodden et al., 2018). The study randomly selected 204 papers published in Science after the implementation of its code and data sharing policy. Of these articles, 24 contained sufficient information, whereas for the remaining 180 publications the authors had to be contacted directly. Only 131 authors replied to the request; of these, 36% provided some of the requested material and 7% simply refused to share code and data. Apparently the implementation of policies was not enough, and there is still a lot of confusion among researchers when it comes to obligations related to data and code sharing, as the anonymized responses highlighted by Stodden et al. (2018) underline.

Putting aside for the moment that you are, in many cases, obliged to share your code and data to enhance reproducibility, are there any additional motivating factors for making your computational research reproducible? Freire et al. (2012) list a few simple benefits of reproducible research:

1. Reproducible research is well cited. A study (Vandewalle et al., 2009) found that published articles that reported reproducible results have higher impact and visibility.

2. Code and software comparisons. Well documented computational research allows software developed for similar purposes to be compared in terms of performance (e.g. efficiency and accuracy). This can potentially reveal interesting and publishable differences between seemingly identical programs.

3. Efficient communication of science between researchers. Newcomers to a field of research can more efficiently understand how to modify and extend an existing program, allowing them to more easily build upon recently published discoveries (this is simply the positive counterpart to the argument made against software sharing earlier).

“Replicability is not reproducibility: nor is it good science.” – Drummond (2009)

I have discussed reproducibility over quite a few paragraphs already, without yet giving it a proper definition. What precisely is reproducibility? Drummond (2009) proposes a distinction between reproducibility and replicability. He argues that reproducibility requires, at a minimum, minor changes in the experiment or model setup, while replication uses an identical setup. In other words, reproducibility refers to a phenomenon that can be predicted to recur with slightly different experimental conditions, while replicability describes the ability to obtain an identical result when an experiment is performed under precisely the same conditions. I think this distinction makes the utmost sense in computational science, because if all software, data, post-processing scripts, random number seeds and so on are shared and reported properly, the results should indeed be identical. However, replicability does not ensure the validity of the scientific discovery. A robust discovery made using computational methods should be reproducible with different software (made for similar purposes, of course) and with small perturbations of the input data, such as initial conditions, physical parameters, etc. This is critical because we rarely, if ever, know the model inputs with zero error bars. A way for authors to address such issues is to include a sensitivity analysis of different parameters, initial conditions and boundary conditions in the publication or the supplementary material.
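As a minimal sketch of what such a sensitivity analysis can look like in its simplest, brute-force form (the model function and all numbers below are hypothetical placeholders):

import numpy as np

# Brute-force sensitivity test: rerun a model with small perturbations of an
# uncertain input and report the spread of the output. `run_model` is a
# hypothetical placeholder for whatever forward model is being published.

def run_model(rayleigh_number):
    # Made-up smooth function standing in for a scalar diagnostic (e.g. a mean
    # surface velocity) returned by a full model run.
    return 2.0 * rayleigh_number ** 0.33

ra_ref = 1e7                                      # reference input value (assumed)
factors = np.array([0.9, 0.95, 1.0, 1.05, 1.1])   # +/- 10% perturbations
outputs = np.array([run_model(f * ra_ref) for f in factors])

print("output range:", outputs.min(), "to", outputs.max())
print(f"relative spread: {100 * (outputs.max() - outputs.min()) / outputs.mean():.1f} %")

Reporting even such a simple table of perturbed runs tells the reader how strongly the headline result depends on an uncertain input.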

Figure 3: Illustration of the “spectrum of reproducibility”, ranging from not reproducible to the gold standard that includes code, data and executable files that can directly replicate the reported results. Modified from (Peng, 2011).

However, the gold standard of reproducibility in computation-involved science, like geodynamics, is often described as what Drummond would classify as replication (Fig. 3). That is, making all data and code available to others to execute easily. Even though this does not ensure reproducibility (only replicability), it provides other researchers with a level of detail regarding the workflow and analysis that is beyond what can usually be achieved using common language. And this deeper understanding can be crucial when trying to reproduce (and not just replicate) the original results. Thus replication is a natural step towards reproduction. Open-source community codes for geodynamics, such as ASPECT (Heister et al., 2017), and more general FEM libraries such as FEniCS (Logg et al., 2012), allow for friction-free replication of results. An input file describing the model setup provides a one-to-one relation to the actual results1 (which in many cases is reasonable because the data are too large to be easily shared). Thus, sharing the post-processing scripts together with the input file on, e.g., GitHub allows for complete replication of the results at low cost in terms of data storage.
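As a concrete, deliberately minimal example of what such replication-friendly sharing can look like, the sketch below records the code version, the input file and the environment next to the results; the file names are hypothetical and the script assumes it is run inside the project's git repository.

import hashlib
import json
import platform
import subprocess
import sys

# Minimal sketch of a "reproducibility manifest" that could be committed next
# to the post-processing scripts. The input file name is hypothetical, and the
# script assumes it is run inside the project's git repository.

def sha256(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

input_file = "model_setup.prm"   # hypothetical model input file

manifest = {
    "git_commit": subprocess.check_output(
        ["git", "rev-parse", "HEAD"], text=True).strip(),
    "input_file": input_file,
    "input_sha256": sha256(input_file),
    "python_version": sys.version,
    "platform": platform.platform(),
}

with open("manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)

Committing such a manifest together with the input file and post-processing scripts gives others (and your future self) a fair chance of re-running exactly what was published.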

Light at the end of the tunnel?

In order to improve practices for reproducibility, contributions will need to come from multiple directions. The community needs to develop, encourage and maintain a culture of reproducibility. Journals and funding agencies can play an important role here. The American Geophysical Union (AGU) has shared a list of best practices regarding research data2 associated with a publication:

• Deposit the data in support of your publication in a leading domain repository that handles such data.

• If a domain repository is not available for some or all of your data, deposit your data in a general repository such as Zenodo, Dryad, or Figshare. All of these repositories can assign a DOI to deposited data, or use your institution’s archive.

• Data should not be listed as “available from authors.”

• Make sure that the data are available publicly at the time of publication and available to reviewers at submission—if you are unable to upload to a public repository before submission, you may provide access through an embargoed version in a repository or in datasets or tables uploaded with your submission (Zenodo, Dryad, Figshare, and some domain repositories provide embargoed access.) Questions about this should be sent to journal staff.

• Cite data or code sets used in your study as part of the reference list. Citations should follow the Joint Declaration of Data Citation Principles.

• Develop and deposit software on GitHub, where it can be cited, or include simple scripts in a supplement. Code on GitHub can be archived separately and assigned a DOI through Zenodo for submission.

In addition to best practice guidelines, wonderful initiatives from other communities include research prizes. The European College of Neuropsychopharmacology offers an award (11,800 USD) for negative results, more specifically for careful experiments that do not confirm an accepted hypothesis or previous result. Another example is the International Organization for Human Brain Mapping, which awards 2,000 USD for the best replication study, successful or not. Whilst not a prize per se, at recent EGU General Assemblies in Vienna the GD community has held sessions around the theme of failed models. Hopefully, similar initiatives will lead by example so that others in the community will follow.

1 To obtain the exact same results, information about the software version, compilers, operating system, etc. would also typically be needed.

2 AGU’s definition of data here includes all code, software, data, methods and protocols used to produce the results.

References

AGU, Best Practices. https://publications.agu.org/author-resource-center/publication-policies/datapolicy/data-policy-faq/ Accessed: 2018-08-31.

Baker, Monya. Reproducibility crisis? Nature, 533:26, 2016.

Drummond, Chris. Replicability is not reproducibility: nor is it good science. 2009.

Fanelli, Daniele. Opinion: Is science really facing a reproducibility crisis, and do we need it to? Proceedings of the National Academy of Sciences, 115(11):2628–2631, 2018.

Freire, Juliana; Bonnet, Philippe, and Shasha, Dennis. Computational reproducibility: state-of-the-art, challenges, and database research opportunities. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pages 593–596. ACM, 2012.

Gerya, Taras. Precambrian geodynamics: concepts and models. Gondwana Research, 25(2):442–463, 2014.

Heister, Timo; Dannberg, Juliane; Gassmöller, Rene, and Bangerth, Wolfgang. High accuracy mantle convection simulation through modern numerical methods. II: Realistic models and problems. Geophysical Journal International, 210(2):833–851, 2017. doi: 10.1093/gji/ggx195.

Logg, Anders; Mardal, Kent-Andre; Wells, Garth N., et al. Automated Solution of Differential Equations by the Finite Element Method. Springer, 2012. ISBN 978-3-642-23098-1. doi: 10.1007/978-3-642-23099-8.

Peng, Roger D. Reproducible research in computational science. Science, 334(6060):1226–1227, 2011.

Popper, Karl Raimund. The Logic of Scientific Discovery. University Press, 1959.

Stodden, Victoria; Seiler, Jennifer, and Ma, Zhaokun. An empirical analysis of journal policy effectiveness for computational reproducibility. Proceedings of the National Academy of Sciences, 115(11):2584–2589, 2018.

Vandewalle, Patrick; Kovacevic, Jelena, and Vetterli, Martin. Reproducible research in signal processing. IEEE Signal Processing Magazine, 26(3), 2009.