This repository has been archived by the owner on Apr 2, 2022. It is now read-only.

Update README on daily updates
J535D165 committed Jun 12, 2020
1 parent 9107ac5 commit 662d15a
Showing 2 changed files with 42 additions and 39 deletions.
57 changes: 24 additions & 33 deletions README.md
@@ -3,64 +3,55 @@
Extension to add publications on COVID-19 to [ASReview](https://github.com/asreview/asreview).

# ASReview against COVID-19
The Active learning for Systematic Reviews software [ASReview](https://github.com/asreview/asreview) implements learning algorithms that interactively query the researcher during the title and abstract reading phase of a systematic search. This way of interactive training is known as active learning. ASReview offers support for classical learning algorithms and state-of-the-art learning algorithms like neural networks. The software can be used for traditional systematic reviews for which the user uploads a dataset of papers, or one can make use of the built-in datasets.

To help combat the COVID-19 crisis, the ASReview team released an extension that integrates the latest scientific datasets on COVID-19 in the ASReview software.
To help combat the COVID-19 crisis, the ASReview team released an extension that integrates the latest scientific datasets on COVID-19 in the ASReview software. Experts can **start reviewing the latest scientific literature on COVID-19 immediately!** See [datasets](#datasets) for an overview of the datasets (daily updates).

## CORD-19 dataset
The [CORD-19 dataset](https://pages.semanticscholar.org/coronavirus-research) is a dataset with scientific publications on COVID-19 and coronavirus-related research (e.g. SARS, MERS, etc.) from PubMed Central, the WHO COVID-19 database of publications, the preprint servers bioRxiv, medRxiv and arXiv, and papers contributed by specific publishers (currently Elsevier). The dataset is compiled and maintained by a collaboration of the Allen Institute for AI, the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, Microsoft Research, and the National Library of Medicine of the National Institutes of Health. The full dataset contains metadata of >140K publications on COVID-19 and coronavirus-related research. The CORD-19 dataset is updated daily.

The most recent version of the dataset can be downloaded here:
[https://ai2-semanticscholar-cord-19.s3-us-west-2.amazonaws.com/latest/metadata.csv](https://ai2-semanticscholar-cord-19.s3-us-west-2.amazonaws.com/latest/metadata.csv).
Older versions are archived on [Zenodo](https://doi.org/10.5281/zenodo.3715505) and [Amazon AWS](https://ai2-semanticscholar-cord-19.s3-us-west-2.amazonaws.com/historical_releases.html).
## Installation and usage

## COVID19 preprints dataset
The [COVID19 preprints dataset](https://github.com/nicholasmfraser/covid19_preprints) is created by [Nicholas Fraser](https://github.com/nicholasmfraser) and [Bianca Kramer](https://github.com/bmkramer), by collecting metadata of COVID19-related preprints from over 15 preprint servers with DOIs registered with Crossref or DataCite, and from arXiv. The dataset contains metadata of >10K preprints on COVID-19 and coronavirus-related research. The COVID19 preprints dataset is updated weekly.
The COVID-19 plug-in requires ASReview 0.8 or higher. Install ASReview by following the instructions in [Installation of ASReview](https://asreview.readthedocs.io/en/latest/installation.html).

The most recent version of the dataset can be downloaded here (csv):
[https://github.com/nicholasmfraser/covid19_preprints/blob/master/data/covid19_preprints.csv](https://github.com/nicholasmfraser/covid19_preprints/blob/master/data/covid19_preprints.csv).
All versions are archived on [Figshare](https://doi.org/10.6084/m9.figshare.12033672)
Install the extension with pip:

## ASReview plugin
```bash
pip install asreview-covid19
```
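The version requirement stated above (ASReview 0.8 or higher) can be checked before starting a review. The following is a minimal sketch, not part of the plugin: it assumes the package is installed under the distribution name `asreview`, and the helper names `parse_major_minor` and `meets_minimum` are hypothetical.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum (major, minor) version taken from the README text above.
MINIMUM = (0, 8)


def parse_major_minor(version_string):
    """Return the leading (major, minor) integers of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"cannot parse version: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_string, minimum=MINIMUM):
    """Return True if the version is at least the required minimum."""
    return parse_major_minor(version_string) >= minimum


if __name__ == "__main__":
    try:
        installed = version("asreview")  # assumed distribution name
    except PackageNotFoundError:
        print("asreview is not installed")
    else:
        status = "ok" if meets_minimum(installed) else "too old"
        print(f"asreview {installed}: {status}")
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where `"0.10"` sorts before `"0.8"`.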

To help combat the COVID-19 crisis, the ASReview team has decided to release a package that provides the latest scientific datasets on COVID-19. These are integrated automatically into ASReview once we install the correct packages, so reviewers can start reviewing the latest scientific literature on COVID-19 as soon as possible!
Two versions of the CORD-19 dataset (publications relating to COVID-19) are made available in ASReview, as well as the COVID19 preprints dataset:
The datasets are immediately available after starting ASReview (`asreview oracle`). The datasets are selectable in Step 2 of the project initialization. For more information on the usage of ASReview, please have a look at the [Quick Tour](https://asreview.readthedocs.io/en/latest/quicktour.html).

- full CORD-19 dataset
- CORD-19 dataset with publications from December 2019 onwards
- COVID19 preprints dataset

The current datasets are based on **CORD-19 release 2020-06-02** and **COVID19 preprints version 9 (released 2020-05-24)**.
## Datasets

The datasets are updated in the ASReview plugin shortly after their release.
The following datasets are available:

## Installation and usage
- [CORD-19 dataset](#cord-19-dataset)
- [CORD-19 2020 dataset](#cord-19-2020-dataset)
- [COVID19 preprints dataset](#covid19-preprints-dataset)

The COVID-19 plug-in requires ASReview 0.8 or higher. Install ASReview by following the instructions in [Installation of ASReview](https://asreview.readthedocs.io/en/latest/installation.html).
:exclamation: The datasets are checked for updates every couple of hours, so the latest collections are always available in the ASReview COVID19 plugin and the ASReview software.

Install the extension with pip:
[![ASReview CORD19 datasets](https://github.com/asreview/asreview/blob/master/images/asreview-covid19-screenshot.png?raw=true)](https://github.com/asreview/asreview-covid19)

```bash
pip install asreview-covid19
```
### CORD-19 dataset
The [CORD-19 dataset](https://pages.semanticscholar.org/coronavirus-research) is a dataset with scientific publications on COVID-19 and coronavirus-related research (e.g. SARS, MERS, etc.) from PubMed Central, the WHO COVID-19 database of publications, the preprint servers bioRxiv, medRxiv and arXiv, and papers contributed by specific publishers (currently Elsevier). The dataset is compiled and maintained by a collaboration of the Allen Institute for AI, the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, Microsoft Research, and the National Library of Medicine of the National Institutes of Health. The full dataset contains metadata of more than **100K publications** on COVID-19 and coronavirus-related research. **The CORD-19 dataset receives daily updates and is directly available in the ASReview software.** The most recent versions of the dataset can be found here: https://ai2-semanticscholar-cord-19.s3-us-west-2.amazonaws.com/historical_releases.html

The datasets are immediately available after starting ASReview.
#### CORD-19 2020 dataset

```bash
asreview oracle
```
The CORD-19 dataset covers all coronavirus-related research, so it is not specific to SARS-CoV-2 or COVID-19. A subset of the CORD-19 dataset contains only publications from December 2019 onwards (i.e. publications relating to the current COVID-19 outbreak). This dataset is listed as `CORD-19 2020` in the software.
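The December 2019 cutoff described above can be illustrated with a small filter. This is only a sketch of the idea, not the plugin's actual procedure: it assumes the metadata has a `publish_time` column holding either `YYYY-MM-DD` or a bare year, treats a bare year as 1 January of that year, and runs on invented sample rows rather than real CORD-19 data.

```python
import csv
import io
from datetime import date

# Cutoff taken from the README text: publications from December 2019 onwards.
CUTOFF = date(2019, 12, 1)


def parse_publish_time(value):
    """Parse 'YYYY-MM-DD' or 'YYYY'; return None if the value is unusable."""
    value = value.strip()
    try:
        if len(value) == 4:
            # Assumption: a bare year counts as 1 January of that year.
            return date(int(value), 1, 1)
        return date.fromisoformat(value)
    except ValueError:
        return None


def filter_recent(rows, cutoff=CUTOFF):
    """Keep rows whose publish_time is on or after the cutoff."""
    recent = []
    for row in rows:
        published = parse_publish_time(row.get("publish_time") or "")
        if published is not None and published >= cutoff:
            recent.append(row)
    return recent


# Invented sample rows, for illustration only.
SAMPLE = """title,publish_time
Old SARS paper,2004-05-01
Early outbreak report,2019-12-31
Year-only record,2020
Undated record,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
recent = filter_recent(rows)
print([row["title"] for row in recent])
```

Rows without a usable date are dropped here; a real pipeline might prefer to keep them and flag them for manual inspection.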

The datasets are selectable in Step 2 of the project initialization. For more information on the usage of ASReview, please have a look at the [Quick Tour](https://asreview.readthedocs.io/en/latest/quicktour.html).
### COVID19 preprints dataset
The [COVID19 preprints dataset](https://github.com/nicholasmfraser/covid19_preprints) is created by [Nicholas Fraser](https://github.com/nicholasmfraser) and [Bianca Kramer](https://github.com/bmkramer), by collecting metadata of COVID19-related preprints from over 15 preprint servers with DOIs registered with Crossref or DataCite, and from arXiv. The dataset contains metadata of >10K preprints on COVID-19 and coronavirus-related research. All versions are archived on [Figshare](https://doi.org/10.6084/m9.figshare.12033672). The COVID19 preprints dataset receives weekly updates.

[![ASReview CORD19 datasets](https://github.com/asreview/asreview/blob/master/images/asreview-covid19-screenshot.png?raw=true)](https://github.com/asreview/asreview-covid19)
The most recent version of the dataset can be found here: [https://github.com/nicholasmfraser/covid19_preprints/blob/master/data/covid19_preprints.csv](https://github.com/nicholasmfraser/covid19_preprints/blob/master/data/covid19_preprints.csv).

## License, citation and contact

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3764749.svg)](https://doi.org/10.5281/zenodo.3764749) [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

The ASReview software and the plugin have an Apache 2.0 LICENSE. For the datasets, please see the license of the CORD-19 dataset https://pages.semanticscholar.org/coronavirus-research. The COVID19 preprints dataset has a [CC0 license](https://creativecommons.org/publicdomain/zero/1.0/).

Visit https://doi.org/10.5281/zenodo.3764749 to get the citation style of your preference.

This project is coordinated by Rens van de Schoot (@Rensvandeschoot) and Daniel Oberski (@daob) and is part of the research work conducted by the Department of Methodology & Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, The Netherlands. Maintainers are Jonathan de Bruin (@J535D165) and Raoul Schram (@qubixes).

24 changes: 18 additions & 6 deletions setup.py
@@ -2,15 +2,27 @@
from setuptools import setup, find_namespace_packages
from os import path
from io import open
import re

import versioneer


here = path.abspath(path.dirname(__file__))

# Get the long description from the README file
with open(path.join(here, 'README.md'), encoding='utf-8') as f:
    long_description = f.read()

def get_long_description():
    """Get project description based on README"""
    here = path.abspath(path.dirname(__file__))

    # Get the long description from the README file
    with open(path.join(here, 'README.md'), encoding='utf-8') as f:
        long_description = f.read()

    # remove emoji
    long_description = re.sub(r"\:[a-z_]+\:", "", long_description)

    return long_description


DEPS = {
    "config-create": "asreview-statistics",
@@ -23,19 +35,19 @@
    version=versioneer.get_version(),
    cmdclass=versioneer.get_cmdclass(),
    description='Covid-19 related datasets for ASReview',
    long_description=long_description,
    long_description=get_long_description(),
    long_description_content_type='text/markdown',
    url='https://github.com/asreview/asreview-covid19',
    author='Utrecht University',
    author_email='[email protected]',
    include_package_data=True,
    # package_data={'[covid19', ['config.json'])],

    classifiers=[
        # How mature is this project? Common values are
        # 3 - Alpha
        # 4 - Beta
        # 5 - Production/Stable
        'Development Status :: 3 - Alpha',
        'Development Status :: 4 - Beta',

        # Pick your license as you wish
        'License :: OSI Approved :: Apache Software License',
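
The emoji-stripping step in `get_long_description` can be tried in isolation. The sketch below reuses the regular expression from the diff; the wrapper name `strip_shortcodes` and the sample strings are hypothetical and only there for illustration.

```python
import re

# Same pattern as in setup.py: a colon, lowercase letters or underscores, a colon.
EMOJI_SHORTCODE = re.compile(r"\:[a-z_]+\:")


def strip_shortcodes(text):
    """Remove GitHub-style emoji shortcodes such as ':exclamation:'."""
    return EMOJI_SHORTCODE.sub("", text)


sample = ":exclamation: The datasets are checked for updates."
print(repr(strip_shortcodes(sample)))

# Shortcodes containing digits or symbols are not matched by this pattern.
print(repr(strip_shortcodes(":+1: looks good")))
```

Note that the pattern leaves the space that followed the shortcode in place, and that it would also remove any other lowercase word wrapped in colons, so it is best suited to text where such sequences only occur as emoji.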
