Data & Data Visualization

Resources for locating and citing data sets.

Cite your data!

Citing datasets is just as important as citing journal articles, books, and other sources that contributed to your research.

By citing your use of a dataset, you support the reproducibility of your research and give credit to those who provided the data, including datasets that you have created yourself. Citations also allow for tracking reuse and measuring impact.

Style manuals do not consistently provide examples for dataset citations, so check with the data provider as well: many recommend a preferred citation or supply an example.

Data Citation Tips

  • Be sure to provide enough information in your citation so that the reader can identify, retrieve, and access the same unique dataset you have used.
  • Provide citations for data sets when you have either conducted secondary analyses of publicly archived data or archived your own data being presented for the first time in the current work.
  • If you are citing existing data or statistics, cite the publication in which the data were published (e.g., a journal article, report, or webpage) rather than the data set itself.
  • The date in the reference is the year of publication for the version of the data used.
  • Provide the publisher of the data set in the source element. ICPSR is one common example.

from: ICPSR and APA Style Guide


Double-check which citation style your professor requires!

Citation Styles

Elements of Data Citation

  • Author: Name(s) of each individual or organizational entity responsible for the creation of the dataset.

  • Date of Publication: Year the dataset was published or disseminated.

  • Title: Complete title of the dataset, including the edition or version number, if applicable.

  • Publisher and/or Distributor: Organizational entity that makes the dataset available by archiving, producing, publishing, and/or distributing the dataset.

  • Electronic Location or Identifier: Web address or unique, persistent, global identifier used to locate the dataset (such as a DOI). Append the date retrieved if the title and locator are not specific to the exact instance of the data you used.

These are the minimum elements required for dataset identification and retrieval. Fewer or additional elements may be requested by author guidelines or style manuals. Be sure to include as many elements as needed to precisely identify the dataset you have used.

from: ICPSR
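As a rough illustration of the checklist above, the sketch below stores the five minimum elements in one record and reports any that are blank. The class name, field names, and helper function are hypothetical choices made for this example; they do not come from ICPSR.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetRecord:
    """The five minimum citation elements (field names are illustrative)."""
    author: str = ""
    year: str = ""
    title: str = ""
    publisher: str = ""
    locator: str = ""  # DOI, other persistent identifier, or web address


def missing_elements(record: DatasetRecord) -> list[str]:
    """Return the names of any minimum elements that were left blank."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


record = DatasetRecord(
    author="O'Donohue, W.",
    year="2017",
    title="Content analysis of undergraduate psychology textbooks",
    publisher="ICPSR",
)
print(missing_elements(record))  # ['locator']
```

A check like this only confirms that each element is present. Whether the citation identifies the exact version of the data you used still needs a human read.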

APA style, Datasets

General Format

Author(s). (Year). Title (version, if applicable) [Data set]. Publisher. URL

Example:

O’Donohue, W. (2017). Content analysis of undergraduate psychology textbooks (ICPSR 21600; Version V1) [Data set]. ICPSR. https://doi.org/10.3886/ICPSR36966.v1
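If you keep dataset details in a spreadsheet or script, a small helper can assemble an APA-style reference from them. The following is a minimal sketch, not an official APA tool: the function and parameter names are made up for this example, and it assumes the bracketed description "[Data set]" follows the title (and any version note) directly, as in APA 7 reference examples.

```python
def apa_dataset_citation(author, year, title, publisher, url, version_note=""):
    """Assemble an APA-style dataset reference from its elements.

    `author` is expected in reference-list form, e.g. "O'Donohue, W.".
    `version_note` is optional text such as "ICPSR 21600; Version V1".
    """
    title_part = f"{title} ({version_note})" if version_note else title
    return f"{author} ({year}). {title_part} [Data set]. {publisher}. {url}"


print(apa_dataset_citation(
    author="O'Donohue, W.",
    year="2017",
    title="Content analysis of undergraduate psychology textbooks",
    version_note="ICPSR 21600; Version V1",
    publisher="ICPSR",
    url="https://doi.org/10.3886/ICPSR36966.v1",
))
```

The helper does not italicize the title or handle organizational authors whose names do not end in a period, so treat its output as a draft to check against the APA manual.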

The Chicago Manual of Style does not directly address how to cite datasets. However, the basic Chicago formatting for a source is as follows:

General format

Author. Title of Work. Format. Date of creation or completion. Medium. Name of Institution. URL (accessed month dd, yyyy).

  • Author(s), if known: last name, first name middle initial.
  • Format: how it is organized (e.g. table, dataset, chart)
  • Medium: the file format or software required to access the work (e.g., CSV, JSON, Avro)

Example

O’Donohue, William. Content analysis of undergraduate psychology textbooks (ICPSR 21600; Version V1). Table. 2017. csv. ICPSR. https://doi.org/10.3886/ICPSR36966.v1 (accessed August 21, 2023).

DataCite is a global network of dataset researchers whose goal, as stated on its website, is “to help make data more accessible and more useful; our purpose is to develop and support methods to locate, identify and cite data and other research objects.” DataCite recommends the following when citing a dataset:

General Format

Creator (PublicationYear). Title. Version. Publisher [or Distributor]. (ResourceType.) Identifier

  • Creator: the author or producer of the dataset
  • Publication year: year that the dataset (and/or data visualization) was published or made publicly available
  • Title: the title of the dataset, followed by any information on the specific version of the dataset referenced.
  • Publisher: the publisher or distributor of the dataset
  • Resource type: where available. For example: (dataset.)
  • Identifier: a unique identifier assigned to the dataset and/or data visualization. For citation purposes, DataCite recommends displaying DOI names as linkable, permanent URLs (https://doi.org/10.6068/DP15E5374E97A17) rather than in the form doi: 10.6068/DP15E5374E97A17.

Example:

Organisation for Economic Co-operation and Development (OECD) (2018-04-06). Main Economic Indicators (MEI): Finance | Country: Argentina | Indicator ID: CCUS, 01/1959 - 12/2017. Data Planet™ Statistical Datasets: A SAGE Publishing Resource. (dataset). Dataset-ID: 062-003-004. https://doi.org/10.6068/DP163F9ED671E6
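The DOI recommendation and the general format above can be sketched in a few lines. This is an illustrative helper rather than anything published by DataCite: the function names are hypothetical, the list of prefixes it strips is an assumption about how DOIs commonly appear, and the creator, title, and publisher in the usage lines are placeholders.

```python
def doi_to_url(identifier: str) -> str:
    """Return a DOI as a linkable https://doi.org/ URL.

    Accepts a bare DOI, a "doi:" prefixed DOI, or an existing DOI URL.
    """
    ident = identifier.strip()
    lowered = ident.lower()
    for prefix in ("https://doi.org/", "http://doi.org/",
                   "https://dx.doi.org/", "http://dx.doi.org/", "doi:"):
        if lowered.startswith(prefix):
            ident = ident[len(prefix):].strip()
            break
    return f"https://doi.org/{ident}"


def datacite_citation(creator, year, title, publisher, identifier,
                      version="", resource_type=""):
    """Join the elements in the order given above, skipping optional blanks."""
    parts = [f"{creator} ({year}).", f"{title}."]
    if version:
        parts.append(f"{version}.")
    parts.append(f"{publisher}.")
    if resource_type:
        parts.append(f"({resource_type}).")
    parts.append(doi_to_url(identifier))
    return " ".join(parts)


print(doi_to_url("doi: 10.6068/DP15E5374E97A17"))
# https://doi.org/10.6068/DP15E5374E97A17

# Placeholder creator, title, and publisher; only the DOI comes from this guide.
print(datacite_citation("Example Organization", "2018", "Example Dataset Title",
                        "Example Publisher", "doi: 10.6068/DP163F9ED671E6",
                        resource_type="dataset"))
```

The sketch places the period after the closing parenthesis of the resource type, matching the OECD example rather than the "(ResourceType.)" pattern in the general format.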

The Institute of Electrical and Electronics Engineers (IEEE) is a professional organization supporting many branches of engineering, computer science, and information technology. In addition to publishing journals, magazines, and conference proceedings, IEEE also develops standards for a wide variety of industries.

IEEE citation style includes in-text citations, numbered in square brackets, which refer to the full citation listed in the reference list at the end of the paper. The reference list is organized numerically, not alphabetically. For examples, see the IEEE Editorial Style Manual and the IEEE Citation Guidelines.

General Format

[number of reference] Author, Title. Location of publisher: publisher, date. [format]. Available: URL [accessed month dd, yyyy]

  • Author: First initial. Middle initial. Last name
  • Format: the file format of the data, e.g. CSV or JSON
  • Accessed date: abbreviate the month to three letters (e.g., January = Jan.), followed by dd, yyyy

Example:

[1] B. D. Hoen, J. E. Diffendorfer, J. T. Rand, L. A. Kramer, C. P. Garrity, and H. E. Hunt, United States Wind Turbine Database. U.S. Geological Survey, 2018. [JSON]. Available: https://eerscmap.usgs.gov/arcgis/rest/services/uswtdb/uswtdbTiled/MapServer [accessed Aug. 22, 2023]
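Two of the fiddlier parts of this format, the author initials and the accessed date, can be sketched as small helpers. The function names and the month table are assumptions made for this example. The table follows this guide's three-letter rule, so check the IEEE Editorial Style Manual for any months it abbreviates differently.

```python
from datetime import date

# Three-letter month abbreviations, per the rule in this guide (an assumption;
# "May" is left without a period because it is not abbreviated).
MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "Jun.",
          "Jul.", "Aug.", "Sep.", "Oct.", "Nov.", "Dec."]


def ieee_author(first: str, last: str, middle: str = "") -> str:
    """Format a name as 'F. M. Last' from given name(s) or initials."""
    initials = f"{first[0]}."
    if middle:
        initials += f" {middle[0]}."
    return f"{initials} {last}"


def ieee_accessed(d: date) -> str:
    """Format the accessed date as '[accessed Mon. dd, yyyy]'."""
    return f"[accessed {MONTHS[d.month - 1]} {d.day}, {d.year}]"


print(ieee_author("B", "Hoen", "D"))      # B. D. Hoen
print(ieee_accessed(date(2023, 8, 22)))   # [accessed Aug. 22, 2023]
```

An explicit month table is used instead of `strftime("%b")` because the latter depends on the system locale and never adds the trailing period.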

The MLA Handbook does not directly mention datasets. However, the basic MLA formatting is as follows:

General Format

Author. Title of Dataset (including date range of dataset). Publisher, Publication Date. Database Name, DOI. 

  • Author(s), if known: Last name, first name. 
  • Title of Source: Name of the dataset
  • Title of container: Website or study 
  • Publisher: Organization making the dataset available
  • Publication date: When it was made available
  • Database Name: If it came from a database or larger study, put name here
  • Publication location: DOI or URL

Example:

Energy Information Administration. Retail Gasoline Prices: Retail Gasoline Prices - All Grades, 08/20/1990 - 05/30/2016. Data Planet Statistical Datasets: A SAGE Publishing Resource, 17 Sept. 2017. https://doi.org/10.6068/DP15E5374E97A17