It is important that researchers are aware of the responsibilities that using a data collection entails. This is especially the case where data has been obtained from restricted or subscription sources. It will also apply to data that includes confidential content.
- How they will use data in practice,
- How data is being combined (where multiple sources are being used),
- How it will be presented during and after analysis,
- Whether data is being appropriately stored and backed up.
Terms and conditions
Some data collections attract particular restrictions, but there are also general terms and conditions that should be observed as part of good research practice.
- Used for academic research only,
- Not used for commercial purposes,
- Not shared with anyone else,
- Destroyed once used and not re-used for other projects,
- Properly attributed so original sources may be consulted if necessary.
In most cases this should be straightforward, but if you have any concerns, raise them with your Subject Librarian.
- Developing a research project with commercial potential or collaborators,
- Fulfilling requirements from funding bodies that research data be preserved,
- Creating a working environment that satisfies restricted-access data suppliers' requirements for secure access.
Data citation is rapidly emerging as a key practice supporting data preservation, access and reuse, as well as sound scholarship. The motivation to cite datasets arises from a recognition that data generated and archived in the course of research are just as valuable to the ongoing academic discourse as papers and monographs. This view is shared by research institutions, funding councils and a growing number of publishers.
The Data Citation Synthesis Group (FORCE11) has published a set of data citation principles, the Joint Declaration of Data Citation Principles, a formal statement that draws together practices in common use in research and publishing. The declaration comprises eight principles that stress the importance and legitimacy of data, the need to give scholarly credit to contributors, and the importance of data as evidence. Cited data should have a unique and persistent identifier, such as a Digital Object Identifier (DOI), which is the equivalent of an ISBN for data. DOIs are issued by data repositories such as ORA-Data. See Research Data Oxford for more details.
For example, see the citation for an existing ORA-Data deposit:
or a citation from the ESRC’s Economic and Social Data Service (ESDS):
University of Essex. Institute for Social and Economic Research and National Centre for Social Research, Understanding Society: Wave 1, 2009-2010 and Wave 2, Year 1 (Interim Release), 2010 [computer file]. 3rd Edition. Colchester, Essex: UK Data Archive [distributor], February 2012. SN: 6614, http://dx.doi.org/10.5255/UKDA-SN-6614-3
In short, when citing data, include the author(s), title, year of deposit, repository or distributor, and a DOI (the standard persistent digital object identifier) or other access location. Make sure your citation includes enough information for a reader to find the data easily.
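The elements listed above can also be assembled programmatically, which is useful when many datasets need citing consistently. The following is a minimal Python sketch only: the function name `format_data_citation` and the output layout are hypothetical and do not follow any particular citation style, so adapt them to the style your publisher or repository requires. The sample values are taken from the Understanding Society citation quoted above.

```python
def format_data_citation(authors, title, year, distributor, identifier):
    """Join the citation elements into one string (illustrative layout only)."""
    elements = {
        "authors": authors,
        "title": title,
        "year": year,
        "distributor": distributor,
        "identifier": identifier,
    }
    # Refuse to build a citation that would be hard to trace back to the data.
    missing = [name for name, value in elements.items() if not value]
    if missing:
        raise ValueError("Missing citation elements: " + ", ".join(missing))
    return f"{'; '.join(authors)} ({year}). {title}. {distributor}. {identifier}"


# Values taken from the Understanding Society citation quoted above.
citation = format_data_citation(
    authors=[
        "University of Essex. Institute for Social and Economic Research",
        "National Centre for Social Research",
    ],
    title=(
        "Understanding Society: Wave 1, 2009-2010 and Wave 2, "
        "Year 1 (Interim Release), 2010"
    ),
    year=2012,
    distributor="UK Data Archive",
    identifier="http://dx.doi.org/10.5255/UKDA-SN-6614-3",
)
print(citation)
```

The check for missing elements reflects the advice above: a citation without an author, title, year, distributor or access location may not give readers enough information to find the data.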
An exhaustive guide from the Digital Curation Centre (DCC), How to Cite Datasets and Link to Publications, discusses data citation in great detail, with information for both researchers and data repositories.
- A general guide to citing data from DataCite,
- Quick Guide to Data Citation from IASSIST,
- Data citation of evolving data by Research Data Alliance.
Data archives may provide guidelines on how to cite their data, and sometimes a website gives this information on individual data set pages. More often, the website or database where you found the data will explain how to cite it in its FAQs, "About" page, or "How to Use" information.
- Why documenting your research data is important, and why documentation is important for using others’ data,
- Why and when to use metadata,
- The importance of citing data, and how to do it.
Copyright and data
Copyright, an intellectual property right assigned automatically to the creator, prevents unauthorised copying and publishing of an original work. Under the Copyright, Designs and Patents Act 1988, copyright applies to research data, which falls under the category of literary, dramatic and musical works, and it plays a role when creating, sharing and re-using data. The 2014 amendments to the Act (see section 29A) and the Government's official Guidance Note set out more recent changes to copyright and intellectual property law and how they affect researchers.
You can also read guidance from JISC and UKDS outlining the implications of the new text and data mining copyright exception for researchers, research support services and librarians in UK universities, together with a summary of text and data mining best practice.
The copyright law of the European Union consists of a number of directives and the judgments of the European Court of Justice.
The EU legislation on Text and Data Mining (TDM) is very much a work in progress.
In September 2016, as part of its update of EU copyright rules, the European Commission proposed a copyright exception that would permit researchers to carry out large-scale analysis of scientific data to which they have lawful access.
You can find data mining guides and tools on the Future TDM project website.
Additional advice on general obligations of using or sharing data and more specifically on citation of data may be found on the Research Data Oxford website.