Problem
At my workplace, we are updating a workflow that uses eggnog-mapper from biocontainer.
We observed a huge performance regression in this tool when running it in diamond mode (emapper.py ... -m diamond).
Result quality also differs (far fewer hits with the conda version).
When using eggnog-mapper from pip, there is no issue.
When using diamond>=2.0.11,<2.1 with the conda version of eggnog-mapper, the issue disappears: I get the same run time and results as with the pip version (tested with both diamond 2.0.11 and 2.0.15).
The problem now is that anyone who uses the default eggnog-mapper installation from conda or biocontainer will suffer from these issues without knowing it (not a very carbon-friendly outcome).
Proposed solution
A quick solution is to constrain the diamond version to >=2.0.11,<2.1 in the eggnog-mapper recipe. However, this would prevent anyone from installing a newer version of diamond if they want to.
A better solution might be to set a preferred version range for diamond in the eggnog-mapper recipe, but I did not find anything about this kind of feature in the conda-build documentation.
Appendix
- conda version installation
conda create -y -n eggnog-2.1.13-conda eggnog-mapper=2.1.13
- pip version installation
# pip version requires biopython==1.76, psutil==5.7.0 and xlsxwriter==1.4.3
conda create -y -n eggnog-2.1.13-pip python=3.8 pip
conda run -n eggnog-2.1.13-pip python -m pip install eggnog-mapper==2.1.13
- fixed conda version installation
conda create -y -n eggnog-2.1.13-conda eggnog-mapper=2.1.13 "diamond>=2.0.11,<2.1"
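- checking a diamond version against the working range

A minimal sketch of how one could check whether a given diamond version falls inside the range that worked for me (>=2.0.11,<2.1). The function name `diamond_version_ok` is hypothetical and not part of eggnog-mapper or diamond, and the comparison assumes a `sort` that supports `-V` (as GNU coreutils does).

```shell
# Hypothetical helper: succeeds (exit 0) if the version passed as $1
# satisfies >=2.0.11,<2.1, and fails (exit 1) otherwise.
# Assumes `sort -V` (version sort) is available.
diamond_version_ok() {
  v="$1"
  # Lower bound: 2.0.11 must sort first (or be equal to $v).
  lowest=$(printf '%s\n%s\n' "2.0.11" "$v" | sort -V | head -n1)
  [ "$lowest" = "2.0.11" ] || return 1
  # Upper bound: 2.1 must sort last, and $v must not be 2.1 itself.
  highest=$(printf '%s\n%s\n' "2.1" "$v" | sort -V | tail -n1)
  [ "$highest" = "2.1" ] && [ "$v" != "2.1" ]
}

# Example: pass the version that conda resolved in your environment.
if diamond_version_ok "2.0.15"; then
  echo "diamond version is inside the working range"
fi
```

The version installed in an environment can be looked up with `conda list -n eggnog-2.1.13-conda diamond` and then passed to the function by hand.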