It’s just coding … – Scientific software development in geodynamics

The Spaghetti code challenge. Source: Wikimedia Commons, Plamen petkov 92, CC-BY-SA 4.0

As big software packages become commonplace in geodynamics, which skills should a geodynamicist aim to acquire in software development? Which techniques should be considered a minimum standard for our software? This week Rene Gassmöller, project scientist at the Computational Infrastructure for Geodynamics, UC Davis, shares his insights on best practices that make scientific software better, and how we can work to translate these into our field. Enjoy the read!

Rene Gassmöller

Nowadays we often equate geodynamics with computational geodynamics. While there are still interesting analytical studies to be made, and important data to be gathered, it is increasingly common that PhD students in geodynamics are expected to work exclusively on data interpretation, computational models, and in particular the accompanying development of geodynamic software packages. But as it turns out, letting an unprepared PhD student (or unprepared postdoc or faculty member for that matter) work on a big software package is a near guarantee for the project to develop into a sizeable bowl of spaghetti code (see figure above for a representative illustration).

Note that I intentionally write about ‘software packages’ instead of ‘code’, as many of these packages — think of GPlates (Müller et al., 2018), ObsPy (Krischer et al., 2015), FEniCS (Alnæs et al., 2015), or the project I am working on, ASPECT (Heister et al., 2017) — have necessarily left the stage of a quickly written ‘code’ for a single purpose, and developed into multi-purpose tools with a complex internal structure. With this growing complexity, the activity of scientific ‘coding’ has evolved into ‘developing software’. However, when students enter the field of geophysics, they are rarely prepared for this challenge. Hannay et al. (2009) report that while researchers typically spend 30% or more of their time developing software, 90% of them are primarily self-taught, and only a few of them have received formal training in writing software, including tests and documentation. Nobody told them: programming and engineering software are two very different things. Many undergraduate and graduate geoscience curricula today include classes about the basics of programming (e.g. in Python, R, or Matlab), and also discuss numerical and computational methods. While these concepts are crucial for solving scientific problems, they are not sufficient for managing the complexity of growing scientific software. Writing a 50-line script is a very different task from contributing to an inherited and poorly documented PhD project of 1,000 lines, which again is very different from managing a multi-developer project of 100,000 lines of source code. A recurring theme is that these differences are only discovered once the damage has already been done. Hannay et al. (2009) note:

Codes often start out small and only grow large with time as the software proves its usefulness in scientific investigations. The demand for proper software engineering is therefore seldom visible until it is “too late”.

But what are these ‘proper software engineering techniques’?

Best practices vs. Best techniques in practice

In a previous blog post, Krister Karlsen already discussed the value of version control systems for the reproducibility of computational research. Needless to say, these systems (originally also termed source code control systems, e.g. Rochkind, 1975) are just as valuable for scientific software development as they are for the reproducibility of results. However, they are not sufficient for developing reliable scientific software. Wilson et al. (2014) summarize a list of eight best practices that make scientific software better:

  1. Write programs for people, not computers.
    • A program should not require its readers to hold more than a handful of facts in memory at once.
    • Make names consistent, distinctive, and meaningful.
    • Make code style and formatting consistent.
  2. Let the computer do the work.
    • Make the computer repeat tasks.
    • Save recent commands in a file for re-use.
    • Use a build tool to automate workflows.
  3. Make incremental changes.
    • Work in small steps with frequent feedback and course correction.
    • Use a version control system.
    • Put everything that has been created manually in version control.
  4. Don’t repeat yourself (or others).
    • Every piece of data must have a single authoritative representation in the system.
    • Modularize code rather than copying and pasting.
    • Re-use code instead of rewriting it.
  5. Plan for mistakes.
    • Add assertions to programs to check their operation.
    • Use an off-the-shelf unit testing library.
    • Turn bugs into test cases (see the sketch after this list).
    • Use a symbolic debugger.
  6. Optimize software only after it works correctly.
    • Use a profiler to identify bottlenecks.
    • Write code in the highest-level language possible.
  7. Document design and purpose, not mechanics.
    • Document interfaces and reasons, not implementations.
    • Refactor code in preference to explaining how it works.
    • Embed the documentation for a piece of software in that software.
  8. Collaborate.
    • Use pre-merge code reviews.
    • Use pair programming when bringing someone new up to speed and when tackling particularly tricky problems.
    • Use an issue tracking tool.
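
To make point 5 concrete, here is a minimal Python sketch of how these ideas might look in practice, using the off-the-shelf pytest library; the half-space cooling function and its thresholds are hypothetical choices for illustration, not part of Wilson et al.’s paper:

```python
import math

def plate_temperature(age_myr, depth_km, t_surface=273.0, t_mantle=1600.0):
    """Half-space cooling temperature (K) at a given depth and plate age."""
    # Assertion that checks the program's operation (practice 5).
    assert age_myr > 0, "plate age must be positive"
    kappa = 1e-6                      # thermal diffusivity in m^2/s
    age_s = age_myr * 3.156e13        # convert Myr to seconds
    arg = depth_km * 1e3 / (2.0 * math.sqrt(kappa * age_s))
    return t_surface + (t_mantle - t_surface) * math.erf(arg)

def test_surface_temperature():
    # A bug turned into a test case: suppose an earlier version wrongly
    # returned the mantle temperature at zero depth; this test keeps
    # that mistake from ever silently returning.
    assert plate_temperature(50.0, 0.0) == 273.0

def test_old_plate_approaches_mantle_temperature():
    # Unit test of a known limit: deep in an old plate, T -> t_mantle.
    assert abs(plate_temperature(100.0, 300.0) - 1600.0) < 1.0
```

Running `pytest` on this file executes both tests automatically, so every fixed bug that is captured this way is checked again on every future change.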

There is a lot to be said about each of these techniques, but that would be beyond the scope of this blog post (please see Wilson et al.’s excellent and concise paper if you are interested). What I would like to emphasize here is that these techniques are often requested, but rarely taught. What are peer code reviews? How do I gradually introduce tests and refactor legacy code? How do I know whether it is better to use unit testing, integration testing, regression testing, or benchmarking for a given change to the code? And do I really need to know the difference? After all, a common argument against using software development techniques in applied computational science disciplines boils down to:

  • We cannot expect these software development techniques from geodynamicists.
  • We should not employ the same best practices as Google, Amazon, Apple, because they do not apply to us.
  • There is no time to learn/apply these techniques, because we have to conduct our research, write our publications, secure our funding.

While from a philosophical standpoint it is easy to dismiss these statements as not adhering to best practices, and possibly compromising the reliability of the created software, it is harder to tackle them from a practical perspective. Of course it is true that implementing a sophisticated testing infrastructure for a one-line shell command is neither useful nor necessary. Maybe the same is true for a 20-line script written specifically to convert one dataset into another, but in this case putting it under version control would already be useful in order to record your process and apply it to other datasets. And from my own experience it is extraordinarily easy to miss the threshold, at around 40-100 lines, at which writing documentation and implementing first testing procedures become crucial to avoid cursing yourself in the future for not explaining what you did and why you did it. So why are there detailed instructions for lab notes and experimental procedures, but not for geodynamic software design and the reliability of scientific software? Geoscience, chemistry, and physics have established multi-semester lab and field exercises to train students in careful scientific analysis. Should we develop comparable exercises for scientific software development (beyond numerical methods and basic programming)? What would an equivalent of these classes look like for computational methods? And is there a point where the skills of software development and geodynamics research grow so far apart that we have to consider them separately and establish a unique career track, such as that of the Research Software Engineer, which is becoming more common in the UK?

In my personal opinion we have made great progress over the last few years in defining best practices for scientific software (see e.g. https://software.ac.uk/resources/online-sustainability-evaluation, or https://geodynamics.org/cig/dev/best-practices/). However, it is still considered a personal task to acquire the necessary skills and to find the correct balance between careful engineering and overdesigning software. Establishing courses and resources that discuss these questions could greatly benefit our community, and allow for more reliable scientific progress in geodynamics.

Collaborative software development – The overlooked social challenge

The contributor funnel. The atmosphere and usability of a project influence how many users will join a project, how long they stick around, and whether they will take responsibility for the project by contributing to it or eventually becoming maintainers. Credit: https://opensource.guide/

Now that we have covered every topic a scientist can learn about scientific software development in a single blog post, what can go wrong when you put several of them together to work on a software package? Needless to say, a lot. Whether your software project is a closed-source, intra-workgroup project, or an open-source project with users and developers spread over different continents, things are going to get exponentially more complicated the more people work on your software. Not only do discussion and interaction take more time, there will also be conflicting ideas about computational methods, software design, or implementation. Using state-of-the-art tools like collaborative development platforms (GitHub, GitLab, Bitbucket, pick your favourite) and modern discussion channels like chats (Slack, Gitter), forums (Discourse), or video conferences (Skype, Hangouts, Zoom) can alleviate a part of the communication barriers. But ultimately, the social challenges remain. How does a project decide between the competing goals of flexibility and performance? Who is going to enforce a code of conduct in a project to keep the development environment open and friendly? Does a project create a welcoming atmosphere that invites new contributions, or does it repel newcomers with unrealistic standards and inappropriate behavior? How should maintainers of scientific software deal with unrealistic feature requests from users? How do we encourage new users to become contributors and take responsibility for the software they benefit from? How should developers balance contributing improvements to the upstream project against publishing them as scientific papers? How do we give credit to contributors?

In my opinion it is unfortunate that these questions about scientific software projects are discussed even less than reproducibility, which is now at least receiving increasing awareness. On the bright side, there is already a trove of experience in the open-source community. The same questions about attribution and credit, collaboration and community management, and correctness and security have been discussed over the past decades in open-source projects all over the world, and nowadays a good number of resources provide guidance, such as https://opensource.guide/, or the excellent book ‘Producing Open Source Software: How to Run a Successful Free Software Project’ (Fogel, 2017). Not all of it can be transferred to science, but we would waste time and energy if we dismissed these experiences and repeated the same mistakes instead.

Let us talk about engineering scientific software

I realize that in this blog post I raised more questions than I answered. Maybe that is because I am not aware of the answers that are already out there. But maybe it is also caused by the lack of attention that these questions receive. I feel that there are no established guidelines for which software development skills a geodynamicist should have, and which techniques should be considered a minimum standard for our software. If that is the case, I would invite you to have a discussion about it. Maybe we can agree on a set of guidelines and improve the state of software in geodynamics. But at the very least I hope I have inspired some thought about the topic, and provided some resources to learn more about a discussion that will likely grow more important over the coming years.

References:

Alnæs, M. S., Blechta, J., Hake, J., Johansson, A., Kehlet, B., Logg, A., Richardson, C., Ring, J., Rognes, M. E., & Wells, G. N. (2015). The FEniCS Project Version 1.5. Archive of Numerical Software, 3. http://dx.doi.org/10.11588/ans.2015.100.20553.

Fogel, K. (2017). Producing Open Source Software: How to Run a Successful Free Software Project. O'Reilly Media, 2nd edition.

Hannay, J. E., MacLeod, C., Singer, J., Langtangen, H. P., Pfahl, D., & Wilson, G. (2009). How do scientists develop and use scientific software? In Proceedings of the 2009 ICSE Workshop on Software Engineering for Computational Science and Engineering (pp. 1-8). IEEE Computer Society.

Heister, T., Dannberg, J., Gassmöller, R., & Bangerth, W. (2017). High accuracy mantle convection simulation through modern numerical methods–II: realistic models and problems. Geophysical Journal International, 210(2), 833-851.

Krischer, L., Megies, T., Barsch, R., Beyreuther, M., Lecocq, T., Caudron, C., & Wassermann, J. (2015). ObsPy: A bridge for seismology into the scientific Python ecosystem. Computational Science & Discovery, 8(1), 014003.

Müller, R.D., Cannon, J., Qin, X., Watson, R.J., Gurnis, M., Williams, S., Pfaffelmoser, T., Seton, M., Russell, S.H., & Zahirovic, S. (2018). GPlates: Building a virtual Earth through deep time. Geochemistry, Geophysics, Geosystems.

Open Source Guides. https://opensource.guide/. Accessed October 2018.

Rochkind, M. J. (1975). The source code control system. IEEE Transactions on Software Engineering, SE-1(4), 364-370.

Wilson, G., Aruliah, D.A., Brown, C.T., Hong, N.P.C., Davis, M., Guy, R.T., Haddock, S.H., Huff, K.D., Mitchell, I.M., Plumbley, M.D. and Waugh, B. (2014). Best practices for scientific computing. PLoS biology, 12(1), e1001745.

Presentation skills – 1. Voice

Presenting: some people love it, some people hate it. I firmly place myself in the first category and apparently, this presentation joy translates itself into being a good – and confident – speaker. Over the years, quite a few people have asked me for my secrets to presenting (which – immediate full disclosure – I do not have) and this is the result: a running series on the EGU GD Blog that covers my own personal tips and experience in the hope that it will help someone (you?) become a better and – more importantly – more confident speaker. In this first instalment, I discuss everything regarding your voice.

Disregarding the content of your talk (I can’t really help you with that), mastering your voice is an important first step towards presenting well and presenting with (or feigning) confidence. An important thing to always remember is that your audience doesn’t know how you feel. If you come across as confident, people will perceive you as such, even if you are not necessarily feeling confident yourself. And with time, I promise, you will end up feeling at ease and confident in front of an audience.
Using your voice optimally is, obviously, very important: it is the one thing people have to listen to in order to get your message. Therefore, knowing how to use your voice is essential to presenting well. And note that your ‘presenting voice’ doesn’t necessarily need to match your ‘normal voice’.

1. Volume

First things first: make sure all people can hear you wherever they are in the room! This is a very basic tip, but one of the most important ones as well: if people can’t hear you, it doesn’t matter how well you present, they won’t understand what you’re talking about, because they literally won’t be able to hear it. Depending on your voice, this will result in one of the following adjustments to get into proper ‘presentation voice mode’:
• You will raise your voice to make sure everyone in the back can clearly hear you. I always do this myself, so my ‘presentation voice’ is always louder than my more natural, soft everyday-talking voice.
• You will lower your voice, so that the people in the first row don’t get blown away: you don’t want your voice to be so loud as to be a nuisance for people sitting close by.

Make sure your voice carries across the room

To test how loudly you need to speak, you can ‘scout’ the room beforehand with a friend. Ask them to stand at the back of the room while you walk up to the front and start talking in your ‘presentation voice’. Can your friend clearly hear everything you say? Then you are good to go. Otherwise, adjust the volume of your voice and test again according to your friend’s comments. No time or opportunity for a test round? Start your presentation with ‘Can everybody hear me?’ and you’ll soon find out how loud you need to speak.

Help! There is a microphone: now what?!

If there is a microphone available, you should refrain from using your loud presentation voice, because no one wants to go home from a conference with hearing damage. Often, you can test the microphone shortly before your presentation. Make use of that opportunity, so that you don’t face any surprises! Also, if there is a stationary microphone (i.e., not a headset), make sure to always talk into the microphone. Adjust it to your height and make sure your voice is optimally picked up. It is very tempting to start looking at your slides and turn your head, but that means your voice isn’t optimally picked up by the microphone, and people in the back won’t be able to hear you! If you alternate between speaking into the microphone and turning your head, the sound of your voice during your presentation becomes a rollercoaster of soft-loud-soft-loud. This is very annoying to listen to, so try to avoid it! Having said that, I find this to be one of the hardest things ever, because I’m not used to talking into a stationary microphone… Let’s say practice makes perfect, right?

2. Tonality

It is incredibly boring to listen to someone who speaks in a dull, monotonous voice. No matter how interesting the content of your talk, if you can’t get the excitement and passion for your research across in your voice, chances are that people will start falling asleep during your presentation. And we all know how hard it is to stay awake during even the most animated of presentations, just because of irritating things like jetlag (or trying to finish your own presentation in the dead of the night on the previous evening). Therefore, I suggest practising the tonality of your voice.

Speak with emotion

If you want your audience to feel excited about your research or motivated to collaborate with you, you need to convey those emotions in your voice. Think about what you want your audience to feel and how you can convey that emotion with your voice. For example, if you want people to get excited, you can increase the pitch of your voice to indicate excitement.

Emphasise the right words

Another way of getting rid of a monotonous voice is putting emphasis on the right words, to make your point. Obviously the effect is negated when you overuse this method, but when used in moderation, you can use emphasis on words to get your message across more easily.
You can practice the tonality of your voice all the time: try reading a book out loud, tell a story about your weekend in an animated way, incorporate it into your day-to-day conversations, etc. Try to let your tonality come across as natural (and not over the top) and engaging. Recording your talks and listening back to them, or asking for comments from friends and family, can help when you practice your presentation.

3. Pitch

The pitch of your voice should be pleasant for the audience. Now, of course you can’t (and shouldn’t) change your voice completely, but a very high-pitched, squeaky voice can be very annoying to listen to and a very deep voice can be hard to understand. So, depending on your voice and on what you think people find pleasant, you could consider slightly altering the pitch of your voice.

Don’t worry if your voice gets squeaky, because there is an easy way around it

My voice (and everyone else’s) gets really high-pitched and squeaky when I get excited, and presentations make me very excited. So, I always make sure that my presentation voice has an ever-so-slightly lower pitch than my normal speaking voice (and doesn’t get near the high-pitched excitement voice). By lowering the pitch of my voice I (think I) am more clearly understandable, and if I do get excited and my pitch increases due to the emotion in my voice, it is still at a very manageable and pleasant pitch, so no-one gets a headache on my watch.

Bearing these tips in mind, you can start honing your perfect presentation voice. Next time, we will start using our voice and tackle the subject of speech!

Reproducible Computational Science

Krister with his bat-signal shirt for reproducibility.

We’ve all been there – you’re reading through a great new paper, keen to get to the Data Availability section, only to find nothing listed, or the uninspiring “data provided on request”. This week Krister Karlsen, PhD student at the Centre for Earth Evolution and Dynamics (CEED), University of Oslo, shares some context and tips for increasing the reproducibility of your research from a computational science perspective. Spread the good word and reach for the “Gold Standard”!

Historically, computational methods and modelling have been considered the third avenue of the sciences, but they are now among the most important, paralleling experimental and theoretical approaches. Thanks to the rapid development of electronics and theoretical advances in numerical methods, mathematical models combined with strong computing power provide an excellent tool to study what is not available for us to observe or sample (Fig. 1). In addition to enabling simulations of complex physical phenomena on computer clusters, these advances have drastically improved our ability to gather and examine high-dimensional data. For these reasons, computational science is in fact the leading tool in many branches of physics, chemistry, biology, and geodynamics.

Figure 1: Time–depth diagram presenting availability of geodynamic data. Modified from (Gerya, 2014).

A side effect of these improved methods for simulation and data gathering is the availability of a vast variety of software packages and huge data sets. This poses a challenge in terms of documentation sufficient to allow a study to be reproduced. With great computing power comes great responsibility.

“Non-reproducible single occurrences are of no significance to science.” – Popper (1959)

Reproducibility is the cornerstone of cumulative science; the ultimate standard by which scientific claims are judged. With replication, independent researchers address a scientific hypothesis and build up evidence for, or against, it. This methodology represents the self-correcting path that science should take to ensure robust discoveries, separating science from pseudoscience. Reports indicate increasing pressure to publish manuscripts whilst applying for competitive grants and positions (Baker, 2016). Furthermore, a growing burden of bureaucracy takes away precious time for designing experiments and doing research. As the time available for actual research decreases, the number of articles that mention a “reproducibility crisis” is rising towards its present-day peak (Fig. 2). Does this mean we have become sloppy in terms of proper documentation?

Figure 2: Number of titles, abstracts, or keywords that contain one of the following phrases: “reproducibility crisis,” “scientific crisis,” “science in crisis,” “crisis in science,” “replication crisis,” “replicability crisis”, found in the Web of Science records. Modified from (Fanelli, 2018).

Are we facing a reproducibility crisis?

A survey conducted by Nature asked 1,576 researchers this exact question: 52% responded with “Yes, a significant crisis,” and 38% with “Yes, a slight crisis” (Baker, 2016). Perhaps more alarming is that 70% report they have unsuccessfully tried to reproduce another scientist’s findings, and more than half have failed to reproduce their own results. To what degree these statistics apply to our own field of geodynamics is not clear, but it is nonetheless a timely reminder that reproducibility must remain at the forefront of our dissemination. Multiple journals have implemented policies on data and software sharing upon publication to ensure that the replication and reproduction of computational science remain possible. But how well are these policies working? A recent empirical analysis of journal policy effectiveness for computational reproducibility sheds light on this issue (Stodden et al., 2018). The study randomly selected 204 papers published in Science after the implementation of its code and data sharing policy. Of these articles, 24 contained sufficient information; for the remaining 180 publications the authors had to be contacted directly. Only 131 authors replied to the request; of these, 36% provided some of the requested material and 7% simply refused to share code and data. Apparently the implementation of policies was not enough, and there is still a lot of confusion among researchers when it comes to obligations related to data and code sharing, as some of the anonymized responses highlighted by Stodden et al. (2018) underline.

Putting aside for the moment that you are, in many cases, obliged to share your code and data to enhance reproducibility: are there any additional motivating factors for making your computational research reproducible? Freire et al. (2012) list a few simple benefits of reproducible research:

1. Reproducible research is well cited. Vandewalle et al. (2009) found that published articles reporting reproducible results have higher impact and visibility.

2. Code and software comparisons. Well documented computational research allows software developed for similar purposes to be compared in terms of performance (e.g. efficiency and accuracy). This can potentially reveal interesting and publishable differences between seemingly identical programs.

3. Efficient communication of science between researchers. Newcomers to a field of research can more efficiently understand how to modify and extend an existing program, allowing them to more easily build upon recently published discoveries (this is simply the positive counterpart to the argument often made against software sharing).

“Replicability is not reproducibility: nor is it good science.” – Drummond (2009)

I have discussed reproducibility over quite a few paragraphs already, without yet giving it a proper definition. What precisely is reproducibility? Drummond (2009) proposes a distinction between reproducibility and replicability. He argues that reproducibility requires, at a minimum, minor changes in the experiment or model setup, while replication uses an identical setup. In other words, reproducibility refers to a phenomenon that can be predicted to recur with slightly different experimental conditions, while replicability describes the ability to obtain an identical result when an experiment is performed under precisely the same conditions. I think this distinction makes the utmost sense in computational science, because if all software, data, post-processing scripts, random number seeds, and so on are shared and reported properly, the results should indeed be identical. However, replicability does not ensure the validity of the scientific discovery. A robust discovery made using computational methods should be reproducible with different software (made for similar purposes, of course) and with small perturbations to the input data, such as initial conditions and physical parameters. This is critical because we rarely, if ever, know the model inputs with zero error bars. A way for authors to address such issues is to include a sensitivity analysis of different parameters, initial conditions, and boundary conditions in the publication or the supplementary material.
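
As a sketch of what such a sensitivity analysis could look like in its simplest form (the model function below is a hypothetical stand-in, using the classical Nu ~ Ra^(1/3) convection scaling purely for illustration), one might perturb an input parameter by a few percent and report how an output diagnostic responds:

```python
def run_model(rayleigh_number):
    """Hypothetical stand-in for a geodynamic model run; returns a
    scalar diagnostic, here a Nusselt number from Nu ~ Ra^(1/3)."""
    return 0.2 * rayleigh_number ** (1.0 / 3.0)

ra_reference = 1e6
for perturbation in (-0.05, 0.0, 0.05):   # -5%, reference, +5%
    ra = ra_reference * (1.0 + perturbation)
    print(f"Ra = {ra:.3e}  ->  Nu = {run_model(ra):.3f}")
```

If the reported conclusions survive such perturbations, readers can have more confidence that the discovery is robust rather than an artefact of one particular parameter choice.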

Figure 3: Illustration of the “spectrum of reproducibility”, ranging from not reproducible to the gold standard that includes code, data and executable files that can directly replicate the reported results. Modified from (Peng, 2011).

However, the gold standard of reproducibility in computation-involved science like geodynamics is often described as what Drummond would classify as replication (Fig. 3): making all data and code available for others to easily execute. Even though this only ensures replicability, not reproducibility, it provides other researchers with a level of detail regarding the workflow and analysis that is beyond what can usually be achieved using common language. This deeper understanding can be crucial when trying to reproduce (and not just replicate) the original results; replication is thus a natural step towards reproduction. Open-source community codes for geodynamics, like e.g. ASPECT (Heister et al., 2017), and more general FEM libraries like FEniCS (Logg et al., 2012), allow for friction-free replication of results. An input file describing the model setup provides a 1-to-1 relation to the actual results [1] (which in many cases is reasonable because the data are too large to be easily shared). Thus, sharing the post-processing scripts accompanied by the input file on e.g. GitHub will allow for complete replication of the results, at low cost in terms of data storage.
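
As a minimal sketch of this practice (the file name and layout are my own assumptions, not a prescribed standard), a short script stored next to the input file could record the environment information that footnote [1] mentions, so that anyone cloning the repository knows which software state produced the results:

```python
import json
import platform
import subprocess
import sys
from datetime import datetime, timezone

def record_provenance(path="provenance.json"):
    """Write software versions and the git commit of the scripts to disk."""
    info = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "python": sys.version,
        "os": platform.platform(),
        # Commit hash of the repository holding the input and
        # post-processing files (assumes a git working copy).
        "git_commit": subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True).stdout.strip(),
    }
    with open(path, "w") as handle:
        json.dump(info, handle, indent=2)

if __name__ == "__main__":
    record_provenance()
```

Committing the resulting provenance.json together with the input file and post-processing scripts costs almost nothing in storage, but preserves exactly the context a replication attempt needs.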

Light at the end of the tunnel?

In order to improve practices for reproducibility, contributions will need to come from multiple directions. The community needs to develop, encourage, and maintain a culture of reproducibility. Journals and funding agencies can play an important role here. The American Geophysical Union (AGU) has shared a list of best practices regarding research data [2] associated with a publication:

• Deposit the data in support of your publication in a leading domain repository that handles such data.

• If a domain repository is not available for some or all of your data, deposit your data in a general repository such as Zenodo, Dryad, or Figshare, all of which can assign a DOI to deposited data, or use your institution’s archive.

• Data should not be listed as “available from authors.”

• Make sure that the data are available publicly at the time of publication and available to reviewers at submission. If you are unable to upload to a public repository before submission, you may provide access through an embargoed version in a repository or in datasets or tables uploaded with your submission (Zenodo, Dryad, Figshare, and some domain repositories provide embargoed access). Questions about this should be sent to journal staff.

• Cite data or code sets used in your study as part of the reference list. Citations should follow the Joint Declaration of Data Citation Principles.

• Develop and deposit software on GitHub, where it can be cited, or include simple scripts in a supplement. Code on GitHub can be archived separately and assigned a DOI through Zenodo for submission.

In addition to best practice guidelines, there are wonderful initiatives from other communities, including research prizes. The European College of Neuropsychopharmacology offers an 11,800 USD award for negative results, more specifically for careful experiments that do not confirm an accepted hypothesis or previous result. Another example is the International Organization for Human Brain Mapping, which awards 2,000 USD for the best replication study, successful or not. Whilst not a prize per se, at recent EGU General Assemblies in Vienna the GD community has held sessions around the theme of failed models. Hopefully, similar initiatives will lead by example so that others in the community will follow.

[1] To obtain the exact same results, information about the software version, compilers, operating system, etc. would typically also be needed.

[2] AGU’s definition of data here includes all code, software, data, methods, and protocols used to produce the results.

References

AGU, Best Practices. https://publications.agu.org/author-resource-center/publication-policies/datapolicy/data-policy-faq/ Accessed: 2018-08-31.

Baker, Monya. Reproducibility crisis? Nature, 533:26, 2016.

Drummond, Chris. Replicability is not reproducibility: nor is it good science. 2009.

Fanelli, Daniele. Opinion: Is science really facing a reproducibility crisis, and do we need it to? Proceedings of the National Academy of Sciences, 115(11):2628–2631, 2018.

Freire, Juliana; Bonnet, Philippe, and Shasha, Dennis. Computational reproducibility: state-of-the-art, challenges, and database research opportunities. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pages 593–596. ACM, 2012.

Gerya, Taras. Precambrian geodynamics: concepts and models. Gondwana Research, 25(2):442–463, 2014.

Heister, Timo; Dannberg, Juliane; Gassmöller, Rene, and Bangerth, Wolfgang. High accuracy mantle convection simulation through modern numerical methods. II: Realistic models and problems. Geophysical Journal International, 210(2):833–851, 2017. doi: 10.1093/gji/ggx195. URL https://doi.org/10.1093/gji/ggx195.

Logg, Anders; Mardal, Kent-Andre; Wells, Garth N., et al. Automated Solution of Differential Equations by the Finite Element Method. Springer, 2012. ISBN 978-3-642-23098-1. doi: 10.1007/978-3-642-23099-8.

Peng, Roger D. Reproducible research in computational science. Science, 334(6060):1226–1227, 2011.

Popper, Karl Raimund. The Logic of Scientific Discovery. University Press, 1959.

Stodden, Victoria; Seiler, Jennifer, and Ma, Zhaokun. An empirical analysis of journal policy effectiveness for computational reproducibility. Proceedings of the National Academy of Sciences, 115(11):2584–2589, 2018.

Vandewalle, Patrick; Kovacevic, Jelena, and Vetterli, Martin. Reproducible research in signal processing. IEEE Signal Processing Magazine, 26(3), 2009.

CIDER summer school

And we’re back! After a refreshing holiday (or was it?), the EGU GD Blog Team is ready to provide you with amazing blog posts once more! Although holidays can be great, one thing that can be even greater is a good summer school. Yep, you heard that correctly! Let me convince you to apply for the CIDER summer school program next year.

Let’s start with the basics. What the hell is CIDER? Well, CIDER stands for the Cooperative Institute for Dynamic Earth Research. One of its main focuses is the interdisciplinary training of early career scientists. To that end, they organise a summer school every year (usually in June/July) that lasts for 4 weeks.

4 weeks?!

Again, you heard that correctly. You are very good at listening!
The first two weeks of the summer school are dedicated to getting up to speed on the topic of the summer school by means of lectures, tutorials, a little field trip, etc. During the last two weeks you will work together in groups on a project of your choosing. The projects are determined during the first two weeks, when you figure out where the knowledge gaps are and you start making teams (no worries, nobody will be left out). You will come up with possible project topics yourself, so you can imagine that there can be quite some lobbying going on to make sure your team gets sufficient members to pursue your favourite project!

Together with your team of students and postdocs, you will confer with established experts in the field to make your project a success. After two weeks, you can probably show some reasonable first results during the final presentation in front of everyone.

If you want to continue working on your project with your team afterwards, you can even write a small proposal to CIDER to request some funding to meet up again and turn your project into a paper. Although they can’t reimburse intercontinental flights, it is still a pretty awesome opportunity!

The topic of the summer school changes every year and alternates between a ‘deep’ topic and a ‘shallow’ topic. I attended the CIDER 2017 summer school with the topic ‘Subduction zone structure and dynamics‘ – a shallow topic. This year (2018), the topic was ‘Relating Geophysical and Geochemical Heterogeneity in the Deep Earth‘ – clearly a deep topic. If you want to know more about this year’s summer school, our Blog Reporter Diogo wrote about it here. Students from all kinds of different disciplines are encouraged to apply: geology, geochemistry, seismology, geodynamics, mineral physics, etc. The more diversity the better, because you need to learn from each other!

More/actual reasons to apply

Now that we have all the details out of the way, I can properly start to convince you to apply! Did I already mention that the summer school is in an exotic place in California, USA? In 2017, the summer school was in Berkeley and this year it was in Santa Barbara. These locations are always fixed, with the ‘shallow’ topics being held in Berkeley, and the deep topics being held in Santa Barbara. Maybe this can act as your guide for finding out which kind of topic to ultimately pursue in your career.

Also, can you imagine? Four weeks in beautiful, sunny California for ‘work’? Because, yes, technically it is work, but it won’t feel like it. Actually, it’s kind of like being transported to one of those American high school / college movies. Does anyone else watch those? Nope, just me? Okay then. You will get the full American student experience, as you will sleep in an actual dorm with all your fellow students and go to the dining hall religiously for breakfast, lunch, and dinner, each and every day! Yes, also at the weekend, because it’s free and you’re a poor student! A minor side effect is that you won’t be able to look at – let alone stomach – burgers, fries, pizzas, and hotdogs for at least a year, but it’s totally worth it for this all-American, movie-like experience. Obviously, sharing a dorm with all your fellow students and complaining about the food will forge bonds that last far longer than the duration of the summer school, and you are guaranteed to have a lot of fun during the summer school, also after the lectures.

Although the program is pretty packed, you will have free evenings (during which you might catch up on your actual work) and you will have some days off during the weekends. Of course, you can’t have all weekend days off, because it wouldn’t be a proper summer school experience if you didn’t return completely exhausted, right? However, on your precious days off, you can explore beyond the campus and do some nice day trips to a nearby city or nature reserve. You can of course also use your free evenings and weekends to sample some of the night life of whatever Californian city you are staying in!

My CIDER 2017 experience

I thoroughly enjoyed my own CIDER experience in Berkeley in 2017. I learned loads of things about subduction zones, and a lot of my knowledge was refreshed, specifically on geochemistry, mineral physics, and geology. It was great fun to live on an American campus (I mean, I really did feel as if I’d stumbled into an American teen movie) and we did some pretty cool things besides the summer school! There was a lovely field trip to learn a bit more about rocks, which was also a great opportunity to see something of the landscape and enjoy incredible views over San Francisco. Of course, San Francisco itself was also visited during one of our days off, and I finally saw the Golden Gate Bridge up close and ate crab at Fisherman’s Wharf. Unforgettable experience. Best day of the summer school. I cannot recommend it enough! We also went out for dinner and drinks on occasion in the city centre of Berkeley, and we even snuck in a visit to the musical ‘Monsoon Wedding’ at Berkeley Rep.

After the summer school, our project group applied for funding to meet up again (I just couldn’t get enough of the American vibe) and lo and behold, we actually got the funding! So this spring, I found myself in Austin, Texas, to work on our project.

Howdy y’all!

It was pretty amazing to have an opportunity like that, and I can assure you that we also had lots of fun in Austin. I mean, it’s Texas, what did you expect? I was already over the moon at the prospect of spotting men wearing cowboy boots for real and not just for carnival!

All in all, I can thoroughly recommend the CIDER summer school as a great learning experience and opportunity for meeting fellow scientists interested in your topic of choice.

Next year, the topic will be ‘Volcanoes‘, so if you have any interest in that, be sure to apply! There is also always a one-day pre-AGU workshop, where you can get a little taste of the summer school: the progress on the previous year’s projects is reported and lectures anticipating the coming topic are held.

So, are you going to apply to CIDER next year? I mean, who doesn’t lava volcanoes?!