Purpose
Cite data for the same reasons you cite any other information taken from another source. A citation lets other researchers find the data you used and replicate the research. It credits the researchers who created the data through citation counts in Web of Science, Scopus or Google Scholar. It also helps data producers and funding agencies track how and where their data are used and measure their impact.
There is no consensus yet in the academic community on how to cite data, for several reasons:
- electronic dataset files often change location, file format and metadata (title, author, date and so on), as is the case with all digital assets
- digital preservation is an issue, as is continued access to data formats over time
- some journals might not cover data in their reference/citation formatting guidelines
Recommendations
As a general rule, when citing data, include all the information you would include in a regular citation. You should be familiar with the citation style your professor asked you to use or the one you have chosen. The required elements often vary by style, but note down at least the following, if present:
- Author: creator of the data(set). It can be an individual researcher, a research team or an organization, like a government agency.
- Title: if the data come from a research paper, the title will probably match the title of the paper. Otherwise, a dataset title might be available, for example for population data from a specific country.
- Edition or Version: datasets sometimes change over time, so make sure you note down the version you actually used.
- Date: at least the year of publishing the dataset (online) should be indicated
- Editor: datasets might have editors, compilers or curators.
- Publisher and Publisher Location: the publisher is the entity that makes the dataset available to the public; it is sometimes associated with a location.
- Material Designator: the file format and whether the data are available online or on another medium, for example a hard disk
- Electronic Retrieval Location: usually the Digital Object Identifier (DOI), or a URL if no DOI is available
- Date accessed online: the exact date you last looked up the data
Kessler, Ronald C. National Comorbidity Survey: Baseline (NCS-1), 1990-1992 (Restricted Version) [Computer file]. ICPSR25381-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-05-11. doi:10.3886/ICPSR2538
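As a rough illustration of how the elements listed above combine into a citation like the Kessler example, the following Python sketch joins them in a fixed order. The class name, field names and punctuation rules are assumptions chosen only to reproduce that one example; they do not implement APA, MLA or any other official style.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Elements of a dataset citation (hypothetical field names)."""
    author: str
    title: str
    date: str
    publisher: str
    version: Optional[str] = None
    material: Optional[str] = None   # material designator, e.g. "Computer file"
    location: Optional[str] = None   # publisher location
    doi: Optional[str] = None
    url: Optional[str] = None
    accessed: Optional[str] = None   # date accessed online

    def format(self) -> str:
        # The material designator follows the title in square brackets.
        title = self.title
        if self.material:
            title += f" [{self.material}]"
        parts = [self.author, title]
        if self.version:
            parts.append(self.version)
        place = f"{self.location}: {self.publisher}" if self.location else self.publisher
        parts.append(f"{place}, {self.date}")
        # Prefer the DOI; fall back to a URL plus the access date.
        if self.doi:
            parts.append(f"doi:{self.doi}")
        elif self.url:
            retrieval = self.url
            if self.accessed:
                retrieval += f" (accessed {self.accessed})"
            parts.append(retrieval)
        # Strip trailing periods so that joining does not double them.
        return ". ".join(part.rstrip(".") for part in parts)


# Values copied from the example citation above.
example = DataCitation(
    author="Kessler, Ronald C.",
    title="National Comorbidity Survey: Baseline (NCS-1), 1990-1992 (Restricted Version)",
    material="Computer file",
    version="ICPSR25381-v1",
    location="Ann Arbor, MI",
    publisher="Inter-university Consortium for Political and Social Research [distributor]",
    date="2009-05-11",
    doi="10.3886/ICPSR2538",
)
print(example.format())
```

Keeping the elements as separate fields, rather than as one pre-formatted string, makes it easier to reorder them later for whichever style is required.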
Usually you will find the citation information on the webpage of the data repository or publisher itself. Copy the citation directly in the citation style you are using, if provided, or export it to your citation tool to format it later. Some data providers also give suggestions on how to cite their data; take those into consideration.
What if the citation is not offered in the style you need (APA, MLA or other)?
- Do it manually. Check our additional resources at the bottom of the page for links and books
- Use a citation formatting webpage like CrossCite: paste the DOI and choose your preferred style
- Use a citation tool to format the citation/reference. See below for instructions
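The DOI-based option can also be scripted through DOI content negotiation, where a request to https://doi.org/ carries an Accept header naming the desired citation style. Below is a minimal Python sketch, assuming the DOI's registration agency (for example Crossref or DataCite) supports formatted citations this way. The function names are hypothetical, and 10.1234/example is a placeholder rather than a real DOI.

```python
import urllib.parse
import urllib.request


def build_citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Build a content-negotiation request for a formatted citation."""
    url = "https://doi.org/" + urllib.parse.quote(doi)
    # The Accept header asks for a formatted bibliography entry in the given style.
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return urllib.request.Request(url, headers=headers)


def fetch_citation(doi: str, style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the citation text (requires network access)."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


# Usage, with a placeholder DOI (replace it with the DOI of your dataset):
# print(fetch_citation("10.1234/example", style="apa"))
```

Always check the returned text against the style guide, since automatically generated citations can miss elements such as the version or the material designator.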
Caution! When you find datasets through a registry of repositories like re3data.org, don't cite the registry! Instead, go to the specific URL of the dataset and cite that. If your purpose is to cite the registry itself, use the re3data.org citation tool.
