Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PySCENIC #50445

Open
wants to merge 43 commits into
base: master
Choose a base branch
from
Open

Add PySCENIC #50445

wants to merge 43 commits into from

Conversation

LiliyaBioinf
Copy link

Describe your pull request here


Please read the guidelines for Bioconda recipes before opening a pull request (PR).

General instructions

  • If this PR adds or updates a recipe, use "Add" or "Update" appropriately as the first word in its title.
  • New recipes not directly relevant to the biological sciences need to be submitted to the conda-forge channel instead of Bioconda.
  • PRs require reviews prior to being merged. Once your PR is passing tests and ready to be merged, please issue the @BiocondaBot please add label command.
  • Please post questions on Gitter or ping @bioconda/core in a comment.

Instructions for avoiding API, ABI, and CLI breakage issues

Conda is able to record and lock (a.k.a. pin) dependency versions used at build time of other recipes.
This way, one can avoid that expectations of a downstream recipe with regards to API, ABI, or CLI are violated by later changes in the recipe.
If not already present in the meta.yaml, make sure to specify run_exports (see here for the rationale and comprehensive explanation).
Add a run_exports section like this:

build:
  run_exports:
    - ...

with ... being one of:

Case run_exports statement
semantic versioning {{ pin_subpackage("myrecipe", max_pin="x") }}
semantic versioning (0.x.x) {{ pin_subpackage("myrecipe", max_pin="x.x") }}
known breakage in minor versions {{ pin_subpackage("myrecipe", max_pin="x.x") }} (in such a case, please add a note that shortly mentions your evidence for that)
known breakage in patch versions {{ pin_subpackage("myrecipe", max_pin="x.x.x") }} (in such a case, please add a note that shortly mentions your evidence for that)
calendar versioning {{ pin_subpackage("myrecipe", max_pin=None) }}

while replacing "myrecipe" with either name if a name|lower variable is defined in your recipe or with the lowercase name of the package in quotes.

Bot commands for PR management

Please use the following BiocondaBot commands:

Everyone has access to the following BiocondaBot commands, which can be given in a comment:

@BiocondaBot please update Merge the master branch into a PR.
@BiocondaBot please add label Add the please review & merge label.
@BiocondaBot please fetch artifacts Post links to CI-built packages/containers.
You can use this to test packages locally.

Note that the @BiocondaBot please merge command is now depreciated. Please just squash and merge instead.

Also, the bot watches for comments from non-members that include @bioconda/<team> and will automatically re-post them to notify the addressed <team>.

@pcm32 pcm32 changed the title test new adding files Add PySCENIC Sep 3, 2024
@pcm32 pcm32 closed this Sep 4, 2024
@pcm32 pcm32 reopened this Sep 4, 2024
@pcm32
Copy link
Member

pcm32 commented Sep 11, 2024

This seems to be often failing (and then losing the logs) with:

11:52:46 BIOCONDA INFO (OUT) WARNING: No numpy version specified in conda_build_config.yaml.  Falling back to default numpy value of 1.22
11:52:46 BIOCONDA INFO (OUT) Adding in variants from internal_defaults
11:52:46 BIOCONDA INFO (OUT) Adding in variants from /opt/host-conda-bld/conda_build_config_0_-e_conda_build_config.yaml
11:52:46 BIOCONDA INFO (OUT) Adding in variants from /opt/host-conda-bld/conda_build_config_1_-e_bioconda_utils-conda_build_config.yaml
11:52:48 BIOCONDA INFO (OUT) Attempting to finalize metadata for pyscenic
##[warning]Free memory is lower than 5%; Currently used: 95.32%
##[warning]Free memory is lower than 5%; Currently used: 95.32%
##[warning]Free memory is lower than 5%; Currently used: 95.32%
##[warning]Free memory is lower than 5%; Currently used: 95.32%
##[warning]Free memory is lower than 5%; Currently used: 95.32%
##[warning]Free memory is lower than 5%; Currently used: 95.32%

on test linux...

@pcm32
Copy link
Member

pcm32 commented Sep 11, 2024

Getting past that and now sort of stuck in:

11:55:51 BIOCONDA INFO (OUT) Files containing CONDA_PREFIX
11:55:51 BIOCONDA INFO (OUT) -----------------------------
11:55:51 BIOCONDA INFO (OUT) bin/arboreto_with_multiprocessing.py (text): Patching
11:55:51 BIOCONDA INFO (OUT) bin/csv2loom (text): Patching
11:55:51 BIOCONDA INFO (OUT) bin/db2feather (text): Patching
11:55:51 BIOCONDA INFO (OUT) bin/gmt2regions (text): Patching
11:55:51 BIOCONDA INFO (OUT) bin/invertdb (text): Patching
11:55:51 BIOCONDA INFO (OUT) bin/pyscenic (text): Patching
11:55:58 BIOCONDA INFO (OUT) WARNING: Importing conda-verify failed.  Please be sure to test your packages.  conda install conda-verify to make this message go away.
11:56:02 BIOCONDA INFO (OUT) TEST START: /opt/conda/conda-bld/linux-64/pyscenic-0.12.1-py310hdbdd923_0.tar.bz2
11:56:04 BIOCONDA INFO (OUT) Renaming work directory '/opt/conda/conda-bld/pyscenic_1726055566916/work' to '/opt/conda/conda-bld/pyscenic_1726055566916/work_moved_pyscenic-0.12.1-py310hdbdd923_0_linux-64'
11:56:04 BIOCONDA INFO (OUT) shutil.move(work)=/opt/conda/conda-bld/pyscenic_1726055566916/work, dest=/opt/conda/conda-bld/pyscenic_1726055566916/work_moved_pyscenic-0.12.1-py310hdbdd923_0_linux-64)

@pcm32
Copy link
Member

pcm32 commented Sep 11, 2024

Stuck on the above now for 30 minutes... otherwise all looks good.

Comment on lines 13 to 15
- url: "https://files.pythonhosted.org/packages/source/a/arboreto/arboreto-0.1.6.tar.gz"
sha256: "32fdac5e8a3e0ef2e196b5827f067d815ac4e689d2fca0dc437f42abdeeb89ab"
folder: arboreto-0.1.6
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LiliyaBioinf arboreto is available from conda, and as such should be used as a conda dependency. Can you remove it from the sources and add under requirements.run please? Can you do the same with any of the above sources that might also be in conda (usually searching for conda + name of the package should allow you to find them if the exists as conda packages). Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and of course, if added to the requirements, then shouldn't go in the build.sh.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there was some discussion a while back to add multiprocessing_on_dill to conda-forge , but the upstream maintainers did not approve the PR to add the license. That would need to be resolved in some way so it can be added to conda-forge.

Copy link
Member

@pcm32 pcm32 Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe in the meantime multiprocessing_on_dill can be installed through pip so that the community can benefit soon from the pyscenic recipe.

@aliciaaevans
Copy link
Contributor

We have had some resource issues, such as running out of memory, with Azure recently. After moving all dependencies out of source (either using existing bioconda or conda-forge packages or by adding new recipes in separate approved/merged PRs), if you are still seeing the Azure linux-64 build hang/disconnect, let me know and I will take a look.

@pcm32
Copy link
Member

pcm32 commented Sep 12, 2024

@BiocondaBot please fetch artifacts

@pcm32
Copy link
Member

pcm32 commented Sep 12, 2024

Thanks @LiliyaBioinf for addressing the issues.

@BiocondaBot
Copy link
Collaborator

Package(s) built are ready for inspection:

Arch Package Zip File / Repodata CI Instructions
linux-64 pyscenic-0.12.1-py310hdbdd923_0.tar.bz2 LinuxArtifacts.zip Azure
showYou may also use conda to install after downloading and extracting the zip file. From the LinuxArtifacts directory: conda install -c ./packages <package name>
osx-64 pyscenic-0.12.1-py310h4a92bd6_0.tar.bz2 OSXArtifacts.zip Azure
showYou may also use conda to install after downloading and extracting the zip file. From the OSXArtifacts directory: conda install -c ./packages <package name>

Docker image(s) built:

Package Tag CI Install with docker
pyscenic 0.12.1--py310hdbdd923_0 Azure
showImages for Azure are in the LinuxArtifacts zip file above.gzip -dc LinuxArtifacts/images/pyscenic:0.12.1--py310hdbdd923_0.tar.gz | docker load

Copy link
Contributor

@aliciaaevans aliciaaevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the recipe, but I don't think this should be packing its dependencies into the recipe as sources. One of the reasons is that license for distribution needs to be accounted for.

sha256: "{{ sha256 }}"
folder: pyscenic-0.12.1

- url: "https://github.com/aertslab/ctxcore/archive/refs/tags/0.2.0.tar.gz"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctxcore is in bioconda, so you should be able to add that to the dependencies

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed!

- url: "https://github.com/aertslab/ctxcore/archive/refs/tags/0.2.0.tar.gz"
sha256: "a7ebf0f2625641b76a390993e12042e92fff7d0ac242c7fad5e3bff3ff3cd67a"
folder: ctxcore-0.2.0
- url: "https://files.pythonhosted.org/packages/source/m/multiprocessing_on_dill/multiprocessing_on_dill-3.5.0a4.tar.gz"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned above, the licensing for multiprocessing_on_dill has to be worked out. It should be added to conda-forge rather than bioconda because it's not bioinformatics related.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working out a license like that could take a long time, in the meantime this could be used like this and possibly improved once that is resolved. There are many packages in bioconda that recourse to pip installing dependencies that are not available in conda, for instance https://github.com/bioconda/bioconda-recipes/blob/def3a07b68330abb40f7650b3747b8bee189ab66/recipes/latch/build.sh

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I see your point. I will get some clarity on this from the core team.


extra:
recipe-maintainers:
- 179018299
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this number refer to?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was my github id, deleted it

@pcm32
Copy link
Member

pcm32 commented Sep 12, 2024

I suspect that you will need multiprocessing_on_dill from source, unless that someone has added it to conda. Thanks for switching ctxcore @LiliyaBioinf .

@daler
Copy link
Member

daler commented Sep 13, 2024

Chiming in here, to summarize it sounds like mulitprocessing_on_dill is not available on conda-forge because of a missing license in 2018, and an upstream author replied to a PR while on vacation in 2019 but probably didn't check afterwards.

The right way to fix this is to 1) submit a new conda-forge PR and 2) contact the upstream author(s) again.

(Side note, the abovementioned latch recipe is the only instance I know of that is installing separately from pip, and is probably a fluke -- my guess is that it got missed in the review process -- and should be fixed.)

@pcm32
Copy link
Member

pcm32 commented Sep 13, 2024

License aside, I have to say I have seen many recipes doing the pip trick when there are no conda packages along the years, here is another one I found quickly https://github.com/bioconda/bioconda-recipes/blob/master/recipes/iphop/build.sh .

I think it puts a large burden on contributors to having to create recipes recursively like this. What happens if multiprocessing_on_dill has 2 other deps that are not in conda? Are we asking the contributor to also have to deal with that? where do we stop?

@daler
Copy link
Member

daler commented Sep 16, 2024

@pcm32 after discussing with the @bioconda/core team, the consensus is that pip should not be used to install additional packages from within recipes. The existence of existing recipes that do this is an oversight, so we will be fixing them. We will also update the docs to clarify this policy.

Regarding recursively adding recipes -- yes, that is exactly what we need to do. If multiprocessing_on_dill has more dependencies, then they need to be added to conda as well. In fact, in the early days of bioconda (and conda-forge) the majority of the work was in creating all of those recursive dependencies. It's a lot rarer nowadays, but sometimes it still does need to happen like in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants