GitHub - ITC-CRIB/fairly: A package to create, publish, and clone research datasets


fairly

A package to create, publish and clone research datasets.

Installation

fairly requires Python 3.8 or later and ruamel.yaml 0.17.26 or later. It can be installed directly from PyPI or conda-forge.

```
# Using pip
pip install fairly

# Using Anaconda or Miniconda
conda install conda-forge::fairly
```
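
To confirm that an environment satisfies the stated requirements, a check along the following lines may help. It is a minimal sketch using only the standard library. The helper names `parse_version`, `meets_minimum` and `check_environment` are illustrative and not part of fairly, and the simple numeric parsing assumes plain release versions such as `0.17.26`.

```python
import sys


def parse_version(text):
    """Turn a plain release string such as '0.17.26' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version_text, minimum):
    """Return True if version_text is at least the given minimum tuple."""
    return parse_version(version_text) >= minimum


def check_environment():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or later is required")
        return problems  # importlib.metadata is not available before 3.8

    from importlib import metadata

    try:
        installed = metadata.version("ruamel.yaml")
    except metadata.PackageNotFoundError:
        installed = None  # assumed to be pulled in as a dependency on install
    if installed is not None and not meets_minimum(installed, (0, 17, 26)):
        problems.append(f"ruamel.yaml {installed} is older than 0.17.26")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

An empty result only means that no conflict was detected; pip or conda still resolves the actual dependencies during installation.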

Installing from source

Clone or download the source code:

```
git clone https://github.com/ITC-CRIB/fairly.git
```

Go to the root directory:
```
cd fairly/
```
Compile and install using pip:
```
pip install .
```

Usage

Basic example to create a local research dataset and deposit it to a repository:

```
import fairly

# Initialize a local dataset
dataset = fairly.init_dataset('/path/dataset')

# Set metadata
dataset.metadata['license'] = 'MIT'
dataset.set_metadata(
    title='My dataset',
    keywords=['FAIR', 'research', 'data'],
    authors=[
        '0000-0002-0156-185X',
        {'name': 'John', 'surname': 'Doe'}
    ]
)

# Add data files
dataset.includes.extend([
    'README.txt',
    '*.csv',
    'train/*.jpg'
])

# Save dataset
dataset.save()

# Upload to a data repository
remote_dataset = dataset.upload('zenodo')
```
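
The include patterns in the example look like shell-style globs. As a rough illustration of which files such patterns would select, the sketch below expands them relative to a dataset directory using `pathlib`. This is an assumption about the matching semantics, not fairly's actual implementation, and `match_includes` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def match_includes(root, patterns):
    """Return sorted relative POSIX paths of files under root matching any pattern."""
    root = Path(root)
    matched = set()
    for pattern in patterns:
        # Path.glob interprets the pattern relative to root
        for path in root.glob(pattern):
            if path.is_file():
                matched.add(path.relative_to(root).as_posix())
    return sorted(matched)


if __name__ == "__main__":
    # Build a throwaway directory tree to try the patterns on
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "train").mkdir()
        for name in ["README.txt", "a.csv", "notes.md", "train/1.jpg", "train/2.png"]:
            (base / name).write_text("")
        print(match_includes(base, ["README.txt", "*.csv", "train/*.jpg"]))
```

With the temporary tree above, `notes.md` and `train/2.png` match none of the patterns and are left out.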

Basic example to access a remote dataset and store it locally:

```
import fairly

# Open a remote dataset
dataset = fairly.dataset('doi:10.4121/21588096.v1')

# Get dataset information
dataset.id
>>> {'id': '21588096', 'version': '1'}

dataset.url
>>> 'https://data.4tu.nl/articles/dataset/.../21588096/1'

dataset.size
>>> 33339

len(dataset.files)
>>> 6

dataset.metadata
>>> Metadata({'keywords': ['Earthquakes', 'precursor', ...], ...})

# Update metadata
dataset.metadata['keywords'] = ['Landslides', 'precursor']
dataset.save_metadata()

# Store dataset to a local directory (i.e. clone dataset)
local_dataset = dataset.store('/path/dataset')
```
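
The identifier `doi:10.4121/21588096.v1` and the reported `dataset.id` suggest that this DOI form encodes a numeric dataset id and a version. The sketch below splits such a string with a regular expression. It is only an illustration under that assumption: `parse_versioned_doi` is a hypothetical helper, and fairly may resolve identifiers differently, for example by querying the repository.

```python
import re

# Optional "doi:" scheme, a DOI prefix, a numeric id, and an optional ".vN" suffix
_DOI_PATTERN = re.compile(
    r"^(?:doi:)?(?P<prefix>10\.\d{4,9})/(?P<id>\d+)(?:\.v(?P<version>\d+))?$",
    re.IGNORECASE,
)


def parse_versioned_doi(text):
    """Split a DOI such as 'doi:10.4121/21588096.v1' into its parts."""
    match = _DOI_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised DOI form: {text!r}")
    result = {"prefix": match.group("prefix"), "id": match.group("id")}
    if match.group("version") is not None:
        result["version"] = match.group("version")
    return result


if __name__ == "__main__":
    print(parse_versioned_doi("doi:10.4121/21588096.v1"))
```

DOIs that do not follow this numeric pattern raise `ValueError` here; a real client would need to handle other DOI shapes as well.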

Currently, the package supports the following research data management platforms:

Invenio
Figshare
Djehuty (experimental)

All research data repositories based on the listed platforms are supported.

For more details and examples, consult the package documentation.

Testing

Unit tests can be run using the pytest command in the root directory.

Contributions

Read the guidelines (CONTRIBUTING.md) to learn how you can contribute to this open-source project.

JupyterLab Extension

An extension for JupyterLab is being developed in a separate repository.

Citation

Please cite this software as follows:

Girgin, S., Garcia Alvarez, M., & Urra Llanusa, J., fairly: a package to create, publish and clone research datasets [Computer software]

Acknowledgements

This research is funded by the Dutch Research Council (NWO) Open Science Fund, File No. 203.001.114.

Project members:
