
Which palaeontology stories in 2015 captured the public’s imagination?

This was originally posted here!

Happy New Year, everyone! It’s that time of year when all the summaries of an amazing year of research are coming out, and goodness, what a year it’s been! The folks over at Altmetric have been kind enough to summarise the top 100 articles of 2015, ranked by their Altmetric scores – a measure of the social media chatter around articles. All the data are available on Figshare, and here I just want to highlight the palaeontology stories that stood out in the media this year according to the list.


How to write to your MEPs about European Copyright reform

This was originally posted here.

I mentioned in a previous post how important it is for researchers to equip themselves with knowledge about copyright issues (like this), and to become active in the struggle with publishers to retain fair re-use rights for research. In the European Commission, this has been quite a high-profile debate this year (see here for example), with some preliminary results already released.

Recently, Peter Murray-Rust of ContentMine and the University of Cambridge posted an open letter asking that our MEPs become active in copyright reform here in the EU. I used a personalised version of this letter and the writetothem.org website to send a message to the MEPs for my East Midlands constituency, and I present the letter here in full:

Dear Roger Helmer, Glenis Willmott, Emma McClarkin, Andrew Lewer and Margot Parker,

Reform of European Copyright to allow Text and Data Mining (TDM)

I am a PhD student and researcher at Imperial College London, and I write to urge you to promote the reform of European laws and directives relating to copyright, particularly the current restrictions on Text and Data Mining (“ContentMining”). The reforms that MEP Reda promoted to the European Parliament earlier this year [1] are sensible, pragmatic and beneficial, and I urge you to represent them to Commissioner Oettinger before he produces the policy document on the Digital Single Market (expected in early December 2015).

Science and medicine publish over 2 million research papers a year, and billions of Euros’ worth of publicly funded research lies unused, since no human can read the vast current literature. That is an opportunity cost (at worst, people die) and potentially a huge new industry. Many of my colleagues have been working for years to develop the technology and practice of text and data mining (especially in the bio- and chemical sciences). This has led to initiatives like ContentMine (http://contentmine.org/), which are making unparalleled leaps forward for researchers. I am convinced that Europe is falling badly behind the US: “fair use” (see the recent Google [2] and HathiTrust books cases) is now often held to allow Americans, but not Europeans (with only “fair dealing” at best), to mine science and publish the results.

Over several years, my colleagues have tried to find practical ways forward, but the rightsholders (mainly mega-publishers such as Elsevier/RELX, Springer, Wiley, Taylor and Francis, and Nature Publishing Group) have been unwilling to engage. The key issue is “licences”, whereby rightsholders require readers to apply for further permissions (and perhaps make additional payments) just to allow machines to read and process the literature. The EC’s initiative “Licences for Europe” failed in 2013, with institutions such as LIBER, RLUK, and the British Library effectively walking out [3]. Nonetheless, there has been massive industry lobbying this year to try to convince MEPs and Commissioners that licences are the way forward [4].

The issue is simply encapsulated in my phrase “The Right to Read is the Right to Mine”: if a human has the right to read a document, they should be allowed to use their machines to help them. We have found scientists who have to read 10,000 papers to make useful judgements (for example, in systematic reviews of clinical trials, animal testing, and other critical evaluations of the literature). This can take weeks or months of highly skilled scientists’ time, whereas a machine can filter out perhaps 90% of the papers, saving thousands of Euros. This type of activity is carried out in many European laboratories, so the total waste is very significant. In my own field of palaeontology, recent advances in text and data mining have allowed us to automatically reconstruct the entire history of the diversity of life on Earth through an initiative (developed in the US) known as PaleoDeepDive [5].

Unfortunately, the rightsholders are confusing and frightening the scientific and library community. Two weeks ago, a Dutch statistician [6] was analysing the scientific literature on a large scale to detect important errors in the conclusions reached by statistical methods. After he had downloaded 30,000 papers, the publisher Elsevier demanded that his university (Tilburg) stop him doing his research, and the university complied. Anecdotally, such events are becoming more common. This is against natural justice and is also effectively killing innovation – it is often said that Google and other such industries could not have started in Europe because of restrictive copyright.

In summary, European knowledge workers require the legal assurance that they can mine and republish anything they can read, for commercial as well as non-commercial purposes. This will create a new community and industry of mining which will bring major benefits to Europe (see [7]).

[1] https://juliareda.eu/copyright-evaluation-report-explained/ and https://juliareda.eu/2015/07/eu-parliament-defends-freedom-of-panorama-calls-for-copyright-reform/
[2] http://fortune.com/2015/10/16/google-fair-use/
[3] https://edri.org/failure-of-licenses-for-europe/ and http://ipkitten.blogspot.co.uk/2013/11/licences-for-europe-insiders-report.html
[4] The use of “APIs” is now being promoted by rightsholders as a solution to the impasse. APIs are irrelevant; it is the additional licences (terms and conditions), almost invariably attached to them, that are the problem.
[5] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0113523
[6] “Elsevier stopped me doing my research”: http://onsnetwork.org/chartgerink/2015/11/16/elsevier-stopped-me-doing-my-research/
[7] http://contentmine.org/2015/11/contentmining-in-the-uk-a-contentmine-perspective/

Yours sincerely,

Jonathan Tennant

So thanks to Peter for making this a relatively painless task, and one which could have a potentially high impact in return. It’s vital that researchers have their voices heard in these sorts of debates, and I strongly encourage anyone who cares about the future of research to become active in this respect. You can write to MEPs and other policymakers about anything you are interested in: it’s dead easy, and you have nothing to lose!

So far, I’ve only had one response that wasn’t an ‘out of office’ or automatic reply; I copy the full text of the response from Glenis Willmott MEP below:

Dear Jonathan,

Thank you very much for your email.

I can assure you that Labour MEPs are on the side of research and understand the situation of researchers and our research institutions more generally. The European Commission has promised a wide-ranging and long-term revision of the European copyright framework, and we will be certain to keep the interests of educational establishments at the forefront of these negotiations.

In particular, the issue of licensing solutions as against a general exception for content mining has been one of our main focuses. During the discussions on the Reda Report, the Labour Party proposed an amendment which would have had the effect of extending the scope of exceptions and limitations to new technologies or new uses of existing technology, which would of course take into account new methods of content mining. This was adopted by a large majority in the European Parliament, and we are confident the European Commission, when proposing its copyright reform, will take this into account. A leaked Commission document entitled “Towards a modern, more European copyright framework” suggests a broad exception for “public interest research organisations”, and Labour MEPs will endeavour to tie down this definition and ensure its effective application.

We fully understand the need for legal clarity for consumers and end users, as well as flexibility to ensure that legislation takes account of the pace of technological change.

I hope you have found this information useful. If you have further questions on this, or any other issue, please do not hesitate to contact me.

Best wishes

Glenis Willmott MEP

So that’s that! I hope some of you decide that this sort of thing is worth campaigning for, and consider adding your voice to the discussion.

A thought on impact factors

OK, bear with me on this one. It’s a bit of a thought dump, but it would be interesting to see what people think.

You can’t go anywhere in academia these days without hearing about impact factors. An impact factor is a metric assigned to a journal that measures the average number of citations per article over the preceding two-year interval. It was originally designed to help libraries select which journals were actually being used by academics in their research, and therefore which subscriptions were worth renewing. In modern academia, however, it is often used to measure the individual ‘impact’, or quality, of a single paper within a journal – that is, the metric assigned to a journal is used as a proxy for the value of each article inside it. This doesn’t make much sense on the face of things, especially when you hear stories about how much impact factors are gamed (read: purchased) by journals and their publishers (see link below), to the extent that they are at best meaningless and at worst complete lies.
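For concreteness, here is the standard two-year impact factor calculation, sketched in LaTeX notation (this follows the usual Thomson Reuters definition; the symbols are my own shorthand, not anything official):

```latex
% Two-year impact factor of journal J in year y:
% citations received in year y to items J published in the two
% preceding years, divided by the number of citable items J
% published in those same two years.
\[
  \mathrm{IF}_{y}(J) \;=\;
  \frac{C_{y}(J_{y-1}) + C_{y}(J_{y-2})}{N_{y-1}(J) + N_{y-2}(J)}
\]
% C_y(J_{y-i}) : citations in year y to items published in year y-i
% N_{y-i}(J)  : number of citable items J published in year y-i
```

So a 2015 impact factor of 3, say, means that articles the journal published in 2013 and 2014 were cited an average of three times each during 2015. Note that it is an average over the whole journal, which is exactly why it says so little about any individual paper.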

The evidence suggests that the only thing an impact factor, and journal rank, reliably reflects is academic malpractice – that is, fraud. The higher the impact factor, the higher the probability that there has been data fudging of some sort (or the higher the probability of such practices being detected). A rather appealing option seems to be to do away with journals altogether and replace them with an architecture built within universities, one that removes all the negative aspects of impact factor-based assessment while also removing power from profit-driven parasitic publishers. It’s not really too much of a stretch of the imagination to do this – for example, Latin America already uses the SciELO platform to publish its research, and is free from the potential negative consequences of the impact factor. University College London also recently established its own open access press, the first of its kind in the UK. The Higher Education Funding Council for England (HEFCE) recently released a report on the role of metrics in higher education, finding that the impact factor was too often mis-used or ‘gamed’ by academics, and recommending its discontinuation as a measure of personal assessment. So there is plenty of evidence that we are moving away from a system dominated by impact factors and commercial publishers (although see this post by Zen Faulkes).

But I think there might be a hidden aspect behind impact factors that has often been overlooked, and is difficult to measure. Hear me out.

Impact factors, whether we like it or not, are still used as a proxy for quality. Everyone equates a higher impact factor with a better piece of research. We do it automatically as scientists, irrespective of whether we’ve even read the article. How many times do you hear, “Oh, you got an article in Nature – nice one!”? I’m not really sure whether this means well done for publishing good work, or well done for beating the system and getting published in a glamour magazine. Either way, this is natural now within academia; it’s ingrained into the system (and by system, I include people). The flip side is that researchers, following this practice, submit the research they perceive to be of ‘higher quality’ (irrespective of any subjective ties or a priori semblance of what that might mean) to higher impact factor journals. The inverse is also true – research perceived to be less useful in its results, or of lower quality, gets sent to lower impact factor journals. Quality here can refer to any combination of things: strong conclusions, a good data set, relevance to the field.

Now, I’m not trying to defend the impact factor and its use as a personal measure for researchers. But what if there is some qualitative aspect of quality that it is capturing, based on this? Instead of thinking “It’s been published in this journal, therefore it’s high quality”, rethink it as “This research is high quality, therefore I’m going to submit it to this journal.” Researchers know journals well, and they submit to venues for numerous reasons – among them the appropriateness of a venue based on its publishing history and subject matter. If a journal publishes hardcore quantitative research, large-scale meta-analyses and the like, then it’s probably going to accrue more citations because it’s of more ‘use’ – applicable to a wider range of subjects and projects.

For example, in my field, palaeontology, the research typically published in high impact factor journals involves fairly ground-breaking new studies on developmental biology, macroevolution and extinctions – large-scale patterns that offer great insight into the history of life on Earth. Research published in lower impact factor journals, on the other hand, might be more technical and specialist, perhaps concerning descriptive taxonomy or systematics – the naming of a new species, for example. An obvious exception to this is anything with feathers, which makes its way into Nature irrespective of its actual value in progressing the field (I’ll give you a clue: no-one cares about new feathered dinosaurs any more. Get over it, Nature).

So I’ll leave you with a question: do you submit to higher impact factor journals if you think your research is ‘better’ in some way? And following on from this, do you think impact factors capture a qualitative aspect of research quality that you don’t really see if you only think about what impact factors mean in a post-publication context? Thoughts below! Feel free to smash this thought to shreds.

The Open Research Glossary round 2

A few months ago, we published the crowd-sourced Open Research Glossary, details of which can be found here. We’ve now taken this to the next level and published an updated and much prettier version of the resource on Figshare. This means it is now openly licensed for re-use, and can be cited like any normal research article. We also popped it on Zenodo, because why not!

The original document can be edited here, and remains an open crowd-sourced initiative, which means anyone can add or change anything they want. We strongly encourage the academic community to contribute to and broadly share this resource, so that we can all be a little bit more informed about the vastly complex topic of ‘Open Scholarship’.

This latest update is thanks to the hard work of Joe McArthur of the Right to Research Coalition, who has been kind enough not only to assist with formatting and the generation of an XML version of the document (pending), but also to host the resource on their website.

If anyone has any questions, comments, or suggestions, I’d love to hear them! In the meantime, I hope you find this useful. Thanks again to everyone who has contributed to or shared this work.