hicup failing when there is a period in a genome name (like a versioned accession) #7999

@scottcain

When running a Hi-C workflow from the IWC, hicup_digester fails when using a cached genome whose name contains a period. Using a cached genome is what a user would typically do (in fact, that is what is done for a user coming from BRC-Analytics).

Details

History: https://usegalaxy.org/u/scottcain/h/trypanosoma-cruzi-hic-diaz-viraque

Command executed:

ln -f -s '/jetstream2/scratch/main/jobs/77283260/inputs/dataset_ed6ecf21-7555-4047-9cf7-862ec1456d39.dat' 'dataset1.fq.gz' && \
 ln -f -s '/jetstream2/scratch/main/jobs/77283260/inputs/dataset_29cf0348-4f29-44e5-9235-716a7c154bd9.dat' 'dataset2.fq.gz' &&  \
 BOWTIE_PATH_BASH="$(which bowtie2)" && \
 hicup_digester --re1 '^GATC' --genome 'GCA_003177105.1' '/cvmfs/brc.galaxyproject.org/data/genomes/GCA_003177105.1/bowtie_index/v2/GCA_003177105.1/GCA_003177105.1'.fa && \
mv *Digest_* digest_file.txt && hicup --zip --threads ${GALAXY_SLOTS:-1} \
--digest digest_file.txt \
--index '/cvmfs/brc.galaxyproject.org/data/genomes/GCA_003177105.1/bowtie_index/v2/GCA_003177105.1/GCA_003177105.1' \
--bowtie2 $BOWTIE_PATH_BASH --keep   'dataset1.fq.gz' 'dataset2.fq.gz'

Note the period in GCA_003177105.1, the value passed to --genome. In the same hicup_digester call there appears to be a stray single quote near the end (just before .fa &&), but it is probably intentional: it closes the quoted index path, and the shell appends .fa to it.

Error message:

Option --genome may only be passed alphanumeric characters and underscore, please adjust
Please change configuration file and/or command-line parameters and/or installation accordingly
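
One possible workaround, sketched here rather than taken from the tool wrapper: rewrite the value passed to --genome so that it contains only the characters the error message allows, while leaving the index and FASTA paths untouched. This assumes the --genome value is used only as a label for the digest file, which should be checked against the HiCUP documentation. The function name sanitize_genome_name is hypothetical.

```shell
# Hypothetical helper (not part of HiCUP or the Galaxy wrapper):
# replace every character other than letters, digits, and underscore
# with an underscore, so the result passes the --genome check.
sanitize_genome_name() {
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9_' '_'
}

genome_label="$(sanitize_genome_name 'GCA_003177105.1')"
echo "$genome_label"   # GCA_003177105_1
```

The sanitized label would then replace 'GCA_003177105.1' only in the --genome argument; the paths under /cvmfs keep the original accession, period included.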
