Data Package Design for Special Cases

Version 1

published

Community-developed considerations for creating well-designed datasets that include data specialized by type, format, or acquisition method. Examples are images, code, documents, and raw data in other repositories.

Authors

Affiliations

Corinna Gries

University of Wisconsin, Madison

Stace Beaulieu

Woods Hole Oceanographic Institution

Renée F. Brown

University of New Mexico

Sarah Elmendorf

University of Colorado Boulder

Hap Garritt

Woods Hole Oceanographic Institution

Gastil Gastil-Buhl

University of California Santa Barbara

Hsun-Yi Hseih

Michigan State University

Li Kui

University of California Santa Barbara

Mary Martin

University of New Hampshire

Gregory E. Maurer

New Mexico State University

An T. Nguyen

University of Texas at Austin

John H. Porter

University of Virginia

Adam Sapp

University of Georgia

Mark Servilla

University of New Mexico

Tim Whiteaker

University of Texas at Austin

LTER Network Information Management Committee

Various

Published

February 19, 2021

Abstract

This book covers considerations for creating well-designed datasets for publishing data types outside the most common cases for research data. Specialized needs for preparing and publishing a dataset may arise because the data are not tabular in structure, are stored in non-standard file formats, are collected with unique or new acquisition methods, or are large in size. Some examples of specialized data types described here are imagery, code, document archives, spatial data files, and raw genomic data stored in other repositories. This is Version 1 of the document, written and edited by the Non-tabular Data Working Group of the U.S. LTER Network Information Management Committee.

Keywords

data management, EML, dataset, imagery, big data, documents, drone, code, research, environmental science, metadata, publishing, guide, repository

Reuse

CC BY 4.0

Citation

BibTeX citation:

@book{gries2021,
  author = {Gries, Corinna and Beaulieu, Stace and Brown, Renée F. and
    Elmendorf, Sarah and Garritt, Hap and Gastil-Buhl, Gastil and Hseih,
    Hsun-Yi and Kui, Li and Martin, Mary and Maurer, Gregory E. and
    Nguyen, An T. and Porter, John H. and Sapp, Adam and Servilla, Mark
    and Whiteaker, Tim and {LTER Network Information Management
    Committee}},
  title = {Data {Package} {Design} for {Special} {Cases}},
  version = {1},
  date = {2021-02-19},
  url = {https://prerelease-edi-docs.netlify.app/guide-special-cases/},
  doi = {10.6073/pasta/9d4c803578c3fbcb45fc23f13124d052},
  langid = {en},
  abstract = {This book covers considerations for creating well-designed
    datasets for publishing data types outside the most common cases for
    research data. Specialized needs for preparing and publishing a
    dataset may arise because the data are not tabular in structure, are
    stored in non-standard file formats, are collected with unique or
    new acquisition methods, or are large in size. Some examples of
    specialized data types described here are imagery, code, document
    archives, spatial data files, and raw genomic data stored in other
    repositories. This is Version 1 of the document, written and edited
    by the Non-tabular Data Working Group of the U.S. LTER Network
    Information Management Committee.}
}

For attribution, please cite this work as:

Gries, Corinna, Stace Beaulieu, Renée F. Brown, Sarah Elmendorf, Hap Garritt, Gastil Gastil-Buhl, Hsun-Yi Hseih, et al. 2021. Data Package Design for Special Cases (version 1). Environmental Data Initiative. https://doi.org/10.6073/pasta/9d4c803578c3fbcb45fc23f13124d052.