Researchers need to be aware of the responsibilities that come with using a data collection. This is especially the case where data has been obtained from restricted or subscription sources, and it also applies to data that includes confidential content.
Researchers need to think about:
how they will use data in practice
how data is being combined (where multiple sources are being used)
how it will be presented during and after analysis
whether data is being appropriately stored and backed up.
One obligation in working with data is to be open and transparent about what was used and in what way. One way of doing this is to keep a clear record of how the data was used. This is an important step in developing habits that demonstrate research integrity and, where appropriate, support reproducibility and verification. Such documentation may then be expanded to describe the whole of the research process.
A well documented research project will be:
a useful resource for the data creator during and after the research;
evidence of a commitment to research transparency.
Data citation is rapidly emerging as a key practice supporting data preservation, access and reuse, as well as sound scholarship. The motivation to cite datasets arises from a recognition that data generated and archived in the course of research are just as valuable to the ongoing academic discourse as papers and monographs. This view is shared by research institutions, funding councils and a growing number of publishers.
The Data Citation Synthesis Group (FORCE11) has published a set of data citation principles, the Joint Declaration of Data Citation Principles. This is a formal statement that draws together practices already in common use in research and publishing. The declaration comprises eight principles that stress the importance and legitimacy of data, the need to give scholarly credit to contributors and the importance of data as evidence.
Cited data should have a unique and persistent identifier, such as a Digital Object Identifier (DOI), which is the equivalent of an ISBN for data. DOIs are issued by data repositories such as ORA-Data. Visit Research Data Oxford for more details.
Here is an example of a citation for an existing ORA-Data deposit:
Tomkins, D. & Jackson, A. (2015) “Ephemera and the British Empire - colour illustrations”. Oxford University Research Archive. doi:10.5287/bodleian:xp68kg235
or a citation from the ESRC’s Economic and Social Data Service (ESDS):
University of Essex. Institute for Social and Economic Research and National Centre for Social Research, Understanding Society: Wave 1, 2009-2010 and Wave 2, Year 1 (Interim Release), 2010 [computer file]. 3rd Edition. Colchester, Essex: UK Data Archive [distributor], February 2012. SN: 6614, http://dx.doi.org/10.5255/UKDA-SN-6614-3
In short, when citing data, include the author(s), title, year of deposit, repository or distributor, and the DOI (the standard persistent digital object identifier) or other access location. Make sure your citation includes enough information for a reader to find the data easily.
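As a rough illustration, the citation elements can be held in a small record and assembled into a citation string. The Python sketch below is only one possible approach: the names `DataCitation`, `format_citation` and `doi_url` are hypothetical, and the output order simply mirrors the ORA-Data example above. Where a repository publishes its own citation guidelines, those take precedence.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    """Core elements of a data citation (field names are illustrative)."""
    authors: List[str]
    title: str
    year: int          # year of deposit
    repository: str    # repository or distributor
    doi: str = ""      # bare DOI, e.g. "10.5287/bodleian:xp68kg235"


def format_citation(c: DataCitation) -> str:
    """Render the elements in one possible order (an assumed style, not a standard)."""
    authors = " & ".join(c.authors)
    text = f'{authors} ({c.year}) "{c.title}". {c.repository}.'
    if c.doi:
        text += f" doi:{c.doi}"
    return text


def doi_url(doi: str) -> str:
    """Turn a bare DOI into a resolvable link via the doi.org resolver."""
    return f"https://doi.org/{doi}"


example = DataCitation(
    authors=["Tomkins, D.", "Jackson, A."],
    title="Ephemera and the British Empire - colour illustrations",
    year=2015,
    repository="Oxford University Research Archive",
    doi="10.5287/bodleian:xp68kg235",
)
print(format_citation(example))
print(doi_url(example.doi))
```

Keeping the elements as separate fields, rather than storing a finished string, makes it easier to re-render the same citation in whatever style a publisher or repository asks for.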
Data archives may provide guidelines on how to cite the data. Some websites provide this information on individual dataset pages. More frequently, the website or database where you found your data will explain how to cite that data in its FAQs, 'About' page, or 'How to Use' information.
why documenting your research data is important, and why documentation is important for using others’ data,
why and when to use metadata,
the importance of citing data, and how to do it.
Copyright and data
Copyright is an intellectual property right assigned automatically to the creator; it prevents unauthorised copying and publishing of an original work. Under the Copyright, Designs and Patents Act 1988, copyright applies to research data, which falls under the category of literary, dramatic and musical works, and it plays a role when creating, sharing and re-using data. However, amendments made in 2014 (see section 29A, which covers text and data analysis for non-commercial research) and the accompanying official Government guidance note changed how copyright and intellectual property law affects researchers.