Creating-Open-Datasets.md

Four steps to creating open datasets:

1) initial data build (choosing data to archive)

1a) redundancy (lock-down original source)

1b) alignment (create data matrix with consistent order)

1c) test data quality (produce test plots)
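
As a minimal sketch of step 1, the snippet below records a fingerprint of the original source (1a) and sorts the records into a fixed order (1b). The sample records, the `id` column, and the helper names are hypothetical, and SHA-256 is only one possible way to lock down a source. Test plots (1c) need a plotting library and are left out here.

```python
import hashlib


def sha256_of_bytes(raw: bytes) -> str:
    """Return a fingerprint of the original source for later comparison."""
    return hashlib.sha256(raw).hexdigest()


def align_rows(rows, key):
    """Return the rows sorted by one identifier column (assumed unique)."""
    return sorted(rows, key=lambda row: row[key])


# Hypothetical raw source: three records in arbitrary order.
raw_source = b"id,value\ns3,0.7\ns1,0.2\ns2,0.5\n"
source_digest = sha256_of_bytes(raw_source)

rows = [
    {"id": "s3", "value": 0.7},
    {"id": "s1", "value": 0.2},
    {"id": "s2", "value": 0.5},
]
aligned = align_rows(rows, "id")
```

Storing `source_digest` alongside the dataset makes it possible to check later that the archived copy of the source is unchanged.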

2) annotation

2a) label and metadata for each variable

2b) basic descriptive statistics
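
The annotation step could look roughly like the following, which attaches a label and metadata to one variable (2a) and computes basic descriptive statistics (2b). The variable, its values, and the metadata field names are illustrative assumptions, not a prescribed schema.

```python
import json
import statistics

# Hypothetical variable; values and field names are for illustration only.
values = [0.2, 0.5, 0.7, 0.6]

annotation = {
    "name": "value",
    "label": "Example measurement",
    "units": "arbitrary units",
    "n": len(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "stdev": statistics.stdev(values),  # sample standard deviation
    "min": min(values),
    "max": max(values),
}

# Serialize so the annotation can be stored next to the data.
annotation_json = json.dumps(annotation, indent=2)
```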

3) verification step (before archiving)

3a) sort/realignment (verify data matrix has a consistent order)

3b) verify quality (reproduce plots)
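
One possible way to carry out check 3a before archiving is sketched below: test whether the identifiers are in a consistent order, and realign if they are not. The `id` column and function names are hypothetical. Reproducing the plots (3b) is not covered by this sketch.

```python
def is_consistently_ordered(ids):
    """True when identifiers are strictly increasing (sorted, no duplicates)."""
    return all(a < b for a, b in zip(ids, ids[1:]))


def realign(rows, key):
    """Return the rows sorted by the identifier column."""
    return sorted(rows, key=lambda row: row[key])


# Hypothetical data matrix whose rows have drifted out of order.
rows = [
    {"id": "s2", "value": 0.5},
    {"id": "s1", "value": 0.2},
    {"id": "s3", "value": 0.7},
]

ordered_before = is_consistently_ordered([r["id"] for r in rows])
rows = realign(rows, "id")
ordered_after = is_consistently_ordered([r["id"] for r in rows])
```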

4) archiving step

4a) HDFView (https://www.hdfgroup.org/) for viewing and editing data files in HDF5 format.

4b) Open Science Framework (https://osf.io) or GitHub (https://www.github.com) for storage and a stable URL. For GitHub repo storage, data can be converted to CSV (comma-delimited) format, which GitHub renders as a table in the repo.
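
The CSV conversion mentioned for repo storage can be sketched with Python's standard `csv` module. The records and column names below are hypothetical, and the snippet builds the CSV text in memory rather than assuming any particular file layout.

```python
import csv
import io


def to_csv_text(rows, fieldnames):
    """Serialize a list of dict records to comma-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical aligned records ready for archiving.
rows = [
    {"id": "s1", "value": 0.2},
    {"id": "s2", "value": 0.5},
]
csv_text = to_csv_text(rows, ["id", "value"])
```

The resulting text can then be written to a `.csv` file and committed to the repo.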
