Requirements

  • Docker (see https://docs.docker.com/install/ if it is not already installed on your system)

  • 64-bit Linux operating system. We validated MAGset with the following distributions:

    • Ubuntu version 18.04 and 20.04
    • CentOS version 7.6 and 8.3
    • Fedora version 33
    • Debian version 9 and 10 (on version 10 there is an open issue involving Docker and Debian; please follow the steps in docker/for-linux#58 (comment) to use MAGset)
  • About 12 GB of free disk space (mostly for the Docker image)
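
The requirements above can be checked before installing. The script below is a minimal sketch, not part of MAGset: the function names (`is_64bit`, `has_enough_space`, `preflight`) are made up for illustration, and it assumes the Docker image will be stored on the filesystem that holds the path you pass in (Docker's data root is typically `/var/lib/docker` on Linux).

```shell
#!/usr/bin/env bash
# Preflight sketch (hypothetical helper, not shipped with MAGset).

REQUIRED_GB=12   # "about 12 GB" from the requirements list

# Succeeds when the machine name (as printed by `uname -m`) is a common 64-bit one.
is_64bit() {
    case "$1" in
        x86_64|amd64|aarch64|arm64) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds when the available space (in KB) covers the required space (in GB).
has_enough_space() {
    local avail_kb=$1 required_gb=$2
    [ "$avail_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}

# Reports each requirement; returns non-zero if any check fails.
preflight() {
    local path=${1:-/} status=0 avail_kb

    if command -v docker >/dev/null 2>&1; then
        echo "ok: docker found"
    else
        echo "missing: docker (see https://docs.docker.com/install/)"
        status=1
    fi

    if [ "$(uname -s)" = "Linux" ] && is_64bit "$(uname -m)"; then
        echo "ok: 64-bit Linux"
    else
        echo "warning: this does not look like a 64-bit Linux system"
        status=1
    fi

    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is the free space.
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
    if has_enough_space "$avail_kb" "$REQUIRED_GB"; then
        echo "ok: at least ${REQUIRED_GB} GB free on $path"
    else
        echo "warning: less than ${REQUIRED_GB} GB free on $path"
        status=1
    fi

    return "$status"
}

preflight "${1:-/}" || echo "Some requirements are not met."
```

Pass a different path as the first argument (for example the Docker data root) if the image will not live on the root filesystem.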

Installing

  • Download the main script
    curl -OL https://github.com/LaboratorioBioinformatica/magset/releases/download/1.5.2/run-magset.sh
    or
    wget https://github.com/LaboratorioBioinformatica/magset/releases/download/1.5.2/run-magset.sh
  • Make the script executable
    chmod +x run-magset.sh

That's it! Please test your installation following the Quick start tutorial.

Memory and execution time

Execution time and memory usage vary with the data size, the input file format (GBFF or FASTA), and whether the negative GRIs are validated against the raw data (MAGcheck module).

Running the software with GBFF files increases memory use and execution time considerably, because the pipeline executes extra steps (pangenome and annotations) for this file type.

In general, 8 GB of memory and 4 threads are enough to compare 4 bacterial genomes in a reasonable time.
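
To compare a machine against this guideline, a small check like the following may help. It is only a sketch: the helper names are hypothetical, it reads `/proc/meminfo` and calls `nproc` (both Linux-specific), and it rounds the reported memory to the nearest GB because an "8 GB" machine usually reports slightly less than 8 GB in `MemTotal`.

```shell
#!/usr/bin/env bash
# Resource check sketch (hypothetical helper, not shipped with MAGset).

MIN_MEM_GB=8     # guideline from the text above
MIN_THREADS=4    # guideline from the text above

# Converts kilobytes to gigabytes, rounded to the nearest whole GB.
kb_to_gb_rounded() {
    echo $(( ($1 + 524288) / 1048576 ))
}

# Prints the MemTotal value (in kB) from a meminfo-formatted file.
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# Compares memory (GB) and thread count against the guideline values.
check_resources() {
    local mem_gb=$1 threads=$2
    if [ "$mem_gb" -ge "$MIN_MEM_GB" ] && [ "$threads" -ge "$MIN_THREADS" ]; then
        echo "ok: ${mem_gb} GB RAM, ${threads} threads"
    else
        echo "below guideline: ${mem_gb} GB RAM, ${threads} threads"
        return 1
    fi
}

# Only run the live check where the Linux-specific sources exist.
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
    check_resources "$(kb_to_gb_rounded "$(mem_total_kb)")" "$(nproc)" || true
fi
```

A "below guideline" result does not mean the run will fail; FASTA inputs in particular need far less memory than GBFF inputs.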

The tables below show examples of time and memory consumption, measured on Ubuntu 20.04 running in the cloud (DigitalOcean, Basic Plan: 8 GB / 4 CPUs / 160 GB SSD disk):

  • Data:
    • Genomes compared: 4 genomes of approximately 3MB each
    • MAGcheck raw data: Illumina paired-end reads, 50 GB

| | FASTA without MAGcheck | FASTA with MAGcheck | GBK without MAGcheck | GBK with MAGcheck |
|---|---|---|---|---|
| Time (hh:mm) | 00:05 | 00:45 | 01:00 | 01:45 |
| Memory | 600 MB | 600 MB | 3.5 GB | 3.5 GB |

  • Data:
    • Genomes compared: 10 genomes of approximately 3MB each
    • No raw data

| | FASTA without MAGcheck | GBK without MAGcheck |
|---|---|---|
| Time (hh:mm) | 00:35 | 04:00 |
| Memory | 850 MB | 4.5 GB |

Limitations

This software was built to compare no more than 10 genomes, always of the same species. Comparisons with more than 10 genomes are not officially supported; they can take a long time to execute or run into memory issues.
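
One way to catch an oversized comparison before starting a long run is to count the input files first. The sketch below assumes, purely for illustration, that all genomes sit in a single directory and use one of a few common extensions; both the directory layout and the extension list are assumptions, not something MAGset prescribes.

```shell
#!/usr/bin/env bash
# Genome-count sketch (hypothetical helper, not shipped with MAGset).

MAX_GENOMES=10   # supported limit stated above

# Counts files with common FASTA/GenBank extensions directly inside a directory.
count_genomes() {
    find "$1" -maxdepth 1 -type f \
        \( -name '*.fasta' -o -name '*.fa' -o -name '*.fna' \
           -o -name '*.gbk' -o -name '*.gbff' \) | wc -l | tr -d ' '
}

# Succeeds when the given count does not exceed the supported limit.
within_limit() {
    [ "$1" -le "$MAX_GENOMES" ]
}

# Run the check only when a directory is given as the first argument.
if [ -n "${1:-}" ]; then
    n=$(count_genomes "$1")
    if within_limit "$n"; then
        echo "ok: $n genomes"
    else
        echo "warning: $n genomes exceeds the supported limit of $MAX_GENOMES"
    fi
fi
```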