Data Obligations


Researchers should be aware of the responsibilities that come with using a data collection. This is especially the case where data has been obtained from restricted or subscription sources, or where it includes confidential content.

Researchers need to think about:
  • How they will use data in practice,
  • How data is being combined (where multiple sources are being used),
  • How it will be presented during and after analysis,
  • Whether data is being appropriately stored and backed up.

Terms and conditions

Some data collections attract particular restrictions, but there are also general terms and conditions that should be observed as part of good research practice.

Databases or data collections should be:
  • Used for academic research only,
  • Not used for commercial purposes,
  • Not shared with anyone else,
  • Destroyed once used and not re-used for other projects,
  • Properly attributed so original sources may be consulted if necessary.

Further clarification

In most cases this should be straightforward, but if you have any concerns, raise them with your Subject Librarian.

Situations where more clarification may be needed include:
  • Developing a research project with commercial potential or collaborators,
  • Fulfilling requirements from funding bodies that research data be preserved,
  • Creating a working environment that satisfies restricted-access data suppliers' requirements for secure access.

Data citation

Data citation is rapidly emerging as a key practice supporting data preservation, access and reuse, as well as sound scholarship. The motivation to cite datasets arises from a recognition that data generated and archived in the course of research are just as valuable to the ongoing academic discourse as papers and monographs. This view is shared by research institutions, funding councils and a growing number of publishers.

The Data Citation Synthesis Group (FORCE11) has published a set of data citation principles, the Joint Declaration of Data Citation Principles, a formal statement that pulls together practices in common use in the research and publishing arenas. The declaration encompasses eight principles that stress the importance and legitimacy of data, the need to give scholarly credit to contributors and the importance of data as evidence. Cited data should have a unique and persistent identifier, such as a Digital Object Identifier (DOI), which is the equivalent of an ISBN for data. DOIs are issued by data repositories such as ORA-Data. See Research Data Oxford for more details.

For example, a citation for an existing ORA-Data deposit:

Tomkins, D. & Jackson, A. (2015) “Ephemera and the British Empire - colour illustrations”. Oxford University Research Archive. doi:10.5287/bodleian:xp68kg235

or a citation from the ESRC’s Economic and Social Data Service (ESDS):

University of Essex. Institute for Social and Economic Research and National Centre for Social Research, Understanding Society: Wave 1, 2009-2010 and Wave 2, Year 1 (Interim Release), 2010 [computer file]. 3rd Edition. Colchester, Essex: UK Data Archive [distributor], February 2012. SN: 6614,

In short, when citing data, include the author(s), title, year of deposit, repository or distributor, and the DOI (the standard persistent identifier) or other access location. Make sure your citation includes enough information to find the data easily.
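
As a rough illustration of the elements listed above, the following Python sketch assembles a citation from its parts. The class name DataCitation and its field names are hypothetical, chosen for this example only, and the output layout simply mirrors the ORA-Data example given earlier; it is not a prescribed standard, so check the citation style your repository or publisher requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the elements of a data citation."""
    authors: str                            # e.g. "Tomkins, D. & Jackson, A."
    year: int                               # year of deposit
    title: str
    repository: str                         # repository or distributor
    doi: Optional[str] = None               # preferred persistent identifier
    access_location: Optional[str] = None   # fallback if there is no DOI

    def format(self) -> str:
        """Return a one-line citation, preferring the DOI as the locator."""
        if self.doi:
            locator = f"doi:{self.doi}"
        elif self.access_location:
            locator = self.access_location
        else:
            # Without a locator, readers cannot easily find the data.
            raise ValueError("a citation needs a DOI or another access location")
        return (
            f"{self.authors} ({self.year}) “{self.title}”. "
            f"{self.repository}. {locator}"
        )


# Rebuild the ORA-Data example citation from its elements.
citation = DataCitation(
    authors="Tomkins, D. & Jackson, A.",
    year=2015,
    title="Ephemera and the British Empire - colour illustrations",
    repository="Oxford University Research Archive",
    doi="10.5287/bodleian:xp68kg235",
)
print(citation.format())
```

Keeping the elements separate like this makes it easy to spot a missing one (here, a citation with neither a DOI nor another access location is rejected) before the citation goes into a publication.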

An exhaustive guide, How to Cite Datasets and Link to Publications, from the Digital Curation Centre (DCC), discusses data citation in great detail, with information for researchers and data repositories.

ORA-Data, the part of the Oxford University Research Archive designed to help researchers access, create, archive, share and cite research data, advises researchers to follow these guidelines.

Other guides include:

Data archives may provide guidelines on how to cite their data, sometimes on individual dataset pages. More frequently, the website or database where you found your data will have information on how to cite that data in its FAQs, "About" page, or "How to Use" information.

MANTRA, a research data management training course, offers an interactive training module which introduces the concepts of documentation and metadata, including:
  • Why documenting your research data is important, and why documentation is important for using others’ data,
  • Why and when to use metadata,
  • The importance of citing data, and how to do it.

Additional advice on general obligations of using or sharing data and more specifically on citation of data may be found on the Research Data Oxford website.
