Skip to content

monarch-initiative/monarchkg-model-organism-gene-transform

Repository files navigation

Model Organism Gene Transform

| Documentation |

This is the project description.

Requirements

Setting Up a New Project -- Delete this section when completed

Upon creating a new project from the cookiecutter-monarch-ingest template, you can install and test the project:

cd Model Organism Gene Transform
make install
make test

There are a few additional steps to complete before the project is ready for use.

GitHub Repository

  1. Create a new repository on GitHub.

  2. Enable GitHub Actions to read and write to the repository (required to deploy the project to GitHub Pages).

    • in GitHub, go to Settings -> Action -> General -> Workflow permissions and choose read and write permissions
  3. Initialize the local repository and push the code to GitHub. For example:

    cd Model Organism Gene Transform
    git init
    git remote add origin https://github.com/<username>/<repository>.git
    git add -A && git commit -m "Initial commit"
    git push -u origin main

Transform Code and Configuration

  1. Edit the download.yaml, transform.py, transform.yaml, and metadata.yaml files to suit your needs.
  2. Add any additional dependencies to the pyproject.toml file.
  3. Adjust the contents of the tests directory to test the functionality of your transform.

Documentation

  1. Update this README.md file with any additional information about the project.
  2. Add any appropriate documentation to the docs directory.

Note: After the GitHub Actions for deploying documentation runs, the documentation will be automatically deployed to GitHub Pages.
However, you will need to go to the repository settings and set the GitHub Pages source to the gh-pages branch, using the /docs directory.

Once you have completed these steps, you can remove the Setting Up a New Project section from this README.md file.

Data Sources

Update this section to describe the source of the data for the ingest. Include information about the projects and groups that create or curate the data, which data files are used, and the specific sources and/or versions of those files. It is also valuable to document what model is used for the ingest (generally the Biolink Model) and what types of nodes and edges are created. Here is an example of how you might document this:

Data files for YOUR_SOURCE_DATA_TYPE are available from GROUP_OR_PROJECT through there portal at (include links where possible).

Source Files

This ingest relies on N data files from GROUP_OR_PROJECT and one additional data file for FILE_USAGE (often mapping) from OTHER_GROUP_OR_PROJECT.

  • FILENAME_1 - Describe the data in the file and give a basic description of how it's used. It's nice to include the URL's here as well as having them in the downloads.yaml later

Nodes and Edges

Use this section describe the nodes and edges generated from the ingest for instance

  • Gene Nodes - Description of which nodes are created and what data may be excluded from the ingest.
  • Gene → Disease - Similar description of the edges and which edges are created or how the data may be filtered.

Transform Code and Configuration

Metadata for the infest is in the metadata.yaml file and may require some adjustment depending on your configuration. Data files and locations are listed in the download.yaml file which is used to download all of the data sources before the transform. The transform.yaml file and python file transform.py contain the configuration and transformation code, respectively.

For more information, see the Koza documentation and kghub-downloader.

Dependencies are listed in pyproject.toml file. This project uses pytest for development testing located in the tests directory to test the functionality of your transform.

Documentation

The documentation for this ingest is in this README.md file and additional documentation is in the docs directory.

Note: After the GitHub Actions for deploying documentation runs, the documentation will be automatically deployed to GitHub Pages.

GitHub Actions

This project is set up with several GitHub Actions workflows. You should not need to modify these workflows unless you want to change the behavior. The workflows are located in the .github/workflows directory:

  • test.yaml: Run the pytest suite.
  • create-release.yaml: Create a new release once a week, or manually.
  • deploy-docs.yaml: Deploy the documentation to GitHub Pages (on pushes to main).
  • update-docs.yaml: After a release, update the documentation with node/edge reports.

Installation

cd Model Organism Gene Transform
make install
# or
poetry install

Note that the make install command is just a convenience wrapper around poetry install.

Once installed, you can check that everything is working as expected:

# Run the pytest suite
make test
# Download the data and run the Koza transform
make download
make run

Usage

This project is set up with a Makefile for common tasks.
To see available options:

make help

Download and Transform

Download the data for the model_organism_gene_transform transform:

poetry run model_organism_gene_transform download

To run the Koza transform for Model Organism Gene Transform:

poetry run model_organism_gene_transform transform

To see available options:

poetry run model_organism_gene_transform download --help
# or
poetry run model_organism_gene_transform transform --help

Testing

To run the test suite:

make test

This project was generated using monarch-initiative/cookiecutter-monarch-ingest.
Keep this project up to date using cruft by occasionally running in the project directory:

cruft update

For more information, see the cruft documentation

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published