
Data Introduction: Cite

Getting started with managing, finding, sharing, creating, and citing data, and with understanding data ethics.

Purpose & Recommendations on Citing Data

Purpose

Citing data is important for the same reasons as citing any other information taken from other sources. Other researchers can find the data you used and replicate the research. You help credit researchers through citation counts in Web of Science, Scopus or Google Scholar. You also help data producers and funding agencies track how and where their data are used and measure their impact.

There is no consensus yet in the academic community on how to cite data:

  • electronic dataset files often change location, file format or metadata (title, author, date and others), as is the case with all digital assets
  • there are issues with digital preservation, such as whether data formats remain accessible after some time
  • some journals might not include data in their reference/citation formatting.

Recommendations

As a general rule, when citing data, make sure you include all the information you would include for a regular citation. You should be familiar with the citation style your professor asked you to use or the one you have chosen. The required elements often vary depending on the style, but note down at least the following, if present:

  • Author: creator of the data(set). It can be an individual researcher, a research team or an organization, like a government agency.
  • Title: if the data come from a research paper, the title will probably match the title of the paper. Otherwise, a dataset title might be available, e.g. for population data from a specific country.
  • Edition or Version: datasets sometimes change over time, so make sure you note down the version you actually used.
  • Date: at least the year the dataset was published (online) should be indicated.
  • Editor: datasets might have editors, compilers or curators.
  • Publisher and Publisher Location: the publisher is the entity that makes the dataset available to the public; it is sometimes associated with a location.
  • Material Designator: the type of file and whether it is available online or on another medium, e.g. a hard disk.
  • Electronic Retrieval Location: usually the Digital Object Identifier (DOI), or a URL if no DOI is available.
  • Date accessed online: the exact date you last looked up the data
Hao, Z., AghaKouchak, A., Nakhjiri, N., Farahmand, A. Global Integrated Drought Monitoring and Prediction System (GIDMaPS) [Data sets]. Figshare. http://dx.doi.org/10.6084/m9.figshare.853801 (2014)

 

Kessler, Ronald C. National Comorbidity Survey: Baseline (NCS-1), 1990-1992 (Restricted Version) [Computer file]. ICPSR25381-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-05-11. doi:10.3886/ICPSR2538 

 

Pew Hispanic Center. (2008). 2007 Hispanic Healthcare Survey [Data file and code book]. http://www.pewhispanic.org/2007/09/23/2007-hispanic-healthcare-survey/
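
If you keep your data sources in a script or notebook, the elements listed above can be recorded in a small structure and joined into a reference string. The following is only a minimal sketch: the class name `DatasetCitation`, its field names and the output layout are assumptions made for illustration, loosely modelled on the Pew Hispanic Center example above. It does not implement any official citation style, so always check the result against your style guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the citation elements of a dataset."""
    author: str
    title: str
    year: int
    version: Optional[str] = None      # edition or version, if the dataset has one
    designator: Optional[str] = None   # material designator, e.g. "Data file and code book"
    publisher: Optional[str] = None
    location: Optional[str] = None     # DOI or URL
    accessed: Optional[str] = None     # date you last looked up the data

    def format(self) -> str:
        """Join the elements that are present into one reference string."""
        title = self.title
        if self.version:
            title += f" (Version {self.version})"
        if self.designator:
            title += f" [{self.designator}]"
        parts = [f"{self.author}. ({self.year}). {title}."]
        if self.publisher:
            parts.append(f"{self.publisher}.")
        if self.location:
            parts.append(self.location)
        if self.accessed:
            parts.append(f"(accessed {self.accessed})")
        return " ".join(parts)


# Elements taken from the Pew Hispanic Center example above.
pew = DatasetCitation(
    author="Pew Hispanic Center",
    year=2008,
    title="2007 Hispanic Healthcare Survey",
    designator="Data file and code book",
    location="http://www.pewhispanic.org/2007/09/23/2007-hispanic-healthcare-survey/",
)
print(pew.format())
```

Elements that are missing are simply left out, which mirrors the advice to note down each element "if present".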

You will usually find the citation information on the webpage of the data repository or publisher itself. Copy the citation directly in the citation style you are using, if provided, or export it to your citation tool to format it later. Some data providers also give suggestions on how to cite their data; take those into consideration.

What if the citation is not provided in the style you need (APA, MLA or other)?

  • Do it manually; check the additional resources at the bottom of the page for links and books
  • Use a citation formatting webpage like CrossCite: paste the DOI and choose your preferred style
  • Use a citation tool to format the citation/reference. See below for instructions
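
The DOI-based option can also be scripted. The sketch below assumes that the DOI's registration agency supports DOI content negotiation (Crossref and DataCite document this service): a request to https://doi.org/ with an `Accept: text/x-bibliography` header returns a formatted reference. The function names `build_citation_request` and `fetch_citation` are made up for this example, and which style identifiers are accepted (for example `apa`) depends on the service, so treat this as an illustration rather than a guaranteed recipe.

```python
import urllib.request
from urllib.parse import quote


def build_citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Prepare (but do not send) a request for a formatted citation of a DOI."""
    url = "https://doi.org/" + quote(doi)
    # Assumption: the resolver honours this media type for the given DOI.
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return urllib.request.Request(url, headers=headers)


def fetch_citation(doi: str, style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the formatted citation as text."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

With a network connection, calling `fetch_citation("10.6084/m9.figshare.853801")` would request an APA-style reference for the GIDMaPS dataset cited above. Compare the returned text with your style guide before using it, since automatically generated references can contain errors.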

Caution! When you find datasets through a registry of repositories like re3data.org, don't cite the registry! Instead, go to the specific URL of the dataset and cite that. If your purpose is to cite the registry itself, then use the re3data.org citation tool.

Zotero

If you are using Zotero as a citation tool:

  • use 'Journal Article' as the primary type, especially if you are citing research data from a paper
  • select the 'Extra' field in the 'Info' panel of your document
  • and fill in 'itemType: dataset'. This will mark your citation as a dataset, providing all the functionality needed (source: Zotero forums)

detail from Zotero interface

EndNote

In EndNote you can add a dataset directly as a reference type. You can do that either from the online library or from the browser plugin:

  1. From the EndNote online library:
    • go to your references list
    • click on the title and manually change the "Reference type" field to "Dataset"
  2. From the browser plugin 'Capture Reference'
    • choose 'Dataset' as 'Reference Type':

detail from EndNote interface

Further Resources
