Applications to attend OpenCon 2015 on November 14-16 in Brussels, Belgium are now open! The application is available on the OpenCon website at opencon2015.org/attend and includes the opportunity to apply for a travel scholarship to cover the cost of travel and accommodations. Applications will close on June 22nd at 11:59pm PDT.
OpenCon seeks to bring together the most capable, motivated students and early career academic professionals from around the world to advance Open Access, Open Education, and Open Data—regardless of their ability to cover travel costs. In 2014, more than 80% of attendees received support. Due to this, attendance at OpenCon is by application only.
Palaeontology is the study of the history of life on Earth. Whenever I get asked what I do, my answer always gets one of three predictable responses: “Oh, like Ross from Friends?”, “So Jurassic Park?”, or “So you dig dinosaurs?”
None of these is close to what I, my colleagues, or the broader field are doing. Well, apart from the digging dinos. We have to have some perks (not that I’ve actually ever been on a dig…).
What I want to highlight are a couple of recent developments in the field that show that palaeontology is just as technically advanced as any other major domain of science out there. They both involve the generation and analysis of large data sets that we’re constantly using to test large-scale patterns and processes through time, known as macroevolution. Deciphering the patterns and processes of evolution that led to the fauna we have today is key to predicting its future as we destroy the planet.
One is rather inspired. OpenCon 2014 was a wonderful time bringing together the best minds in early career research and the ‘world of open’ to discuss how we make access to knowledge, data, and educational resources better for everyone. It wasn’t so much an event*, as a milestone. Here’s the story of its success.
I don’t want to run through the basics of each aspect of open access, data, and education. Let me instead tell you about how we just marked a revolutionary point in making the fundamental right to research a reality. When I use the word ‘publishers’ throughout this post, I’m talking primarily about legacy ones – those who operate on a paywall-based model and publicly declare themselves to be enemies of progressing research (I’m not going to name names, we all know who they are – PeerJ is clearly safe). This does not include many learned societies, which I think are an invaluable component of academic communities and are a completely separate discussion we need to have.
A new initiative has just been announced that could help to revolutionise palaeontology. PaleoDeepDive is essentially an automated version of the Paleobiology Database, which is an online, professionally crowd-sourced and curated database of fossil occurrences pulled from the literature.
They have a launch video on YouTube.
I have a couple of reservations about this. Firstly, how do they expect to mine data, at least legally, from articles that are mostly still locked behind paywalls?
I’m also a little concerned about the precision of their algorithms. Towards the end, they mention that in a sample of 500 articles, they get 15000 species names, whereas the PaleobioDB only picks up 1100. Well, in the latter, these names are occurrences – explicit records of fossils in time and space. What these 15000 represent is not clear – are they just those that are mentioned in the text, and therefore don’t really have any use, or are all the palaeontologists really just missing out on 90% of the data when extracting manually?
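To make the distinction between a name that is merely mentioned and an actual occurrence a bit more concrete, here is a minimal Python sketch. The `Occurrence` class, its fields, and the toy data are all hypothetical and invented for illustration; they are not the PaleobioDB schema or anything PaleoDeepDive actually produces. Only the 1100 and 15000 figures come from the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """A name tied to a place and a time (hypothetical, simplified record)."""
    taxon: str
    locality: str
    age_ma: float  # age in millions of years


# Made-up example: every species name that appears anywhere in some text...
mentions = [
    "Tyrannosaurus rex",
    "Triceratops horridus",
    "Tyrannosaurus rex",
    "Allosaurus fragilis",
]

# ...versus the explicit records a human curator might enter from that text.
occurrences = [Occurrence("Tyrannosaurus rex", "Hell Creek Formation", 66.0)]

mentioned_names = set(mentions)
occurrence_names = {o.taxon for o in occurrences}

# Names that were mentioned but have no place and time attached to them.
names_without_record = mentioned_names - occurrence_names

# The figures quoted in the video: 1100 manual occurrences vs 15000 mined names.
manual_share = 1100 / 15000

print(sorted(names_without_record))
print(f"Manual extraction covers about {manual_share:.0%} of the mined names")
```

In this toy case two of the three distinct names are only mentions, which is exactly the kind of thing the 15000 figure might be inflated by. Taken at face value, the quoted numbers would put manual extraction at roughly 7% of the mined names.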
Additionally, I am concerned about the linking of metadata, such as the location and age of fossils, as well as data about the geology, environment of deposition, taphonomy, etc. When extraction is manual, all of this has to be sifted out from a host of other information in each article. I’m not sure a machine will be able to distinguish, for example, the geological age of a fossil from a related date in the text that isn’t directly the age of that fossil.
Anyway, these are just preliminary thoughts, and I’m sure that they have crossed the developers’ minds at some point. I look forward to seeing how this progresses, and undermines a lot of my work! 😉
Oops.
Also, I’d love to hear any thoughts or comments you have about it!