The rise of open research data

This was originally posted at: http://exchanges.wiley.com/blog/2015/08/26/the-rise-of-open-research-data/

As a junior researcher in the UK, it has given me great pleasure over the last few years to see the dramatic development of open access publishing. Most major research funders in the UK now require public access to published research articles in one form or another, and many other research-intensive nations across the globe are following suit.

Along with this global increase in public access to papers, there has been a gear shift in demand for the availability of additional outputs of research, including code, videos, software, and raw data. One of the most recent steps in increasing access to these outputs has been the RECODE project for researchers in the EU, which seeks to develop an open data ecosystem through shifting research practices. With progress being made in the USA too, the wheels are truly in motion for a global shift towards open access to all research outputs.

For researchers, there is a clear incentive for making your research data open – for example, through enhanced citations of your research, but also in making the clear statement that you are not afraid to have your research openly scrutinized and built upon. It can be very difficult to trust the research in a paper when the authors refuse to make the underlying data openly available.

The recent partnership between Wiley and Figshare is a more than welcome development. As Wiley is one of the dominant publishers of global research, liberating the data behind those publications will be a great advantage for the development of ‘open science’. Wiley joins other publishers, such as PLOS, in recognizing the value of open research data. There are several levels to achieving open access to research data, and to embracing the full power of the Web, and this partnership takes important steps towards both. There is still a dependence on researchers to make their data available in appropriate formats, and in a transparent context.

From a personal perspective, Figshare has always been one of my favorite players in the rise of open scholarship. Founded by Mark Hahnel out of frustration at not getting full recognition for all of the outputs of his PhD, it has grown into a platform where pretty much any digital output from research can be freely and easily shared. Most importantly, from a researcher’s perspective, anything which is shared is citable through possession of a DOI (a unique digital object identifier), and protected through appropriate use of Creative Commons licenses.

With this new partnership, and the increasing global interest in open data, come additional questions regarding appropriate data sharing and citation practices, as well as the recognition of outputs beyond papers when it comes down to academic assessment criteria. As many funders now require data to be more openly available, it falls to these funders to make sure that credit for making data open is given, and to researchers to recognize that research outputs go beyond the generation of a PDF manuscript.

One of the biggest hurdles to cross is making data re-usable – simply having data available is not much use. What is needed is transparency in data creation and development, and the creation of community-based data sharing standards that allow other researchers to re-use and innovate using open data. Part of this relies on making sure shared data is machine readable, and accompanied by transparent methods regarding its use. Journals should make sure that methods sections are suitably detailed, and form the core of manuscripts, instead of being relegated to a short note towards the end of papers.

The next step for publishers and funders is to enforce the sharing of the data behind publications. There is a clear role for academic editors here, in making sure that data is available via a public archive such as Figshare, upon publication of a manuscript, as well as in encouraging data citation. Where needed, appropriate embargoes can be applied to datasets, which can be important for early career researchers who want to maximize personal usage of the data they have generated. As Figshare provides an institutional service, this could be a great way to achieve support for open data practices closer to home for researchers.

There really is no excuse for not having data openly available to support papers these days. With open data, researchers will be able to re-analyze and develop existing datasets, and open new doors for research. By embracing the principles underpinning open access to research data, researchers ultimately enhance global scientific discourse, and who knows what we might be able to achieve!