CATE (CUDA Accelerated Testing of Evolution)

A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large-scale genomic data.


Description

CATE is a CUDA-based solution that rapidly processes large-scale VCF files to conduct a series of six different tests on evolution.


🟠 Here we provide only a brief overview of CATE's usability.
🟢 Please refer to CATE's wiki for a more detailed understanding of its functionality and usability.


Prerequisites

  1. CUDA capable hardware
  2. Linux or UNIX-based kernel
  3. NVIDIA's CUDA toolkit (nvcc compiler)
  4. C++ compiler (gcc compiler)
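
Before compiling, it can help to confirm that the compilers listed above are reachable from the shell. The following is a minimal sketch, not part of CATE itself: the helper name missing_tools is hypothetical, and it assumes the CUDA toolkit exposes nvcc and GCC exposes g++ on your PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of CATE): print every command name
# given as an argument that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: nvcc comes from the CUDA toolkit, g++ from GCC.
missing=$(missing_tools nvcc g++)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing
else
    echo "nvcc and g++ found"
fi
```

This only checks that the tools are installed; it does not verify that a CUDA-capable GPU is present.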

How to INSTALL

CATE can be run from an on-device executable or via Google Colab.

For the Google Colab notebook please follow the link to CATE on Colab.

Otherwise, to install CATE on-device, compile the code with the nvcc compiler by executing the following in the terminal:

Download the repository:

git clone "https://github.com/theLongLab/CATE/"
cd CATE/

Load CUDA 11.3.0 or higher (this example uses environment modules):

module load cuda/11.3.0
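
On systems without environment modules, the toolkit version can be checked directly. The sketch below is an illustration only: version_ge is a hypothetical helper, it relies on GNU sort's -V option, and it assumes nvcc --version prints a line containing "release X.Y", as current toolkits do.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is >= version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v nvcc >/dev/null 2>&1; then
    # Assumption: the output contains e.g. "release 11.3, V11.3.58".
    release=$(nvcc --version | sed -n 's/.*release \([0-9][0-9.]*\).*/\1/p')
    if version_ge "$release" 11.3; then
        echo "CUDA $release meets the 11.3 minimum"
    else
        echo "CUDA $release is older than 11.3"
    fi
else
    echo "nvcc not found on PATH"
fi
```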

Finally, compile the project:

nvcc -std=c++17 *.cu *.cpp -o "CATE"

How to RUN

CATE is a command-line-based software. Its available functions include six different tests on evolution and a series of tools for editing and processing FASTA and VCF files.

The six tests on evolution are:

  1. Tajima’s D
  2. Fu and Li's D, D*, F, and F*
  3. Fay and Wu’s H and E
  4. McDonald–Kreitman test
  5. Fixation Index
  6. Extended Haplotype Homozygosity

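Currently, the program's executable is called:

Test_Main

If you compiled with the nvcc command given in the installation steps, the binary is named CATE instead, so substitute ./CATE for ./Test_Main in the examples.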

To run the software you need a JSON-style parameters file. An example is provided in the repository:

parameters.json.

The parameters file is used to specify all input and output locations as well as the gene list file locations. Each function's execution can be customized individually using the parameters file.

The typical syntax for program execution is as follows (the example below runs the Tajima's D function):

program_executable --function parameter_file

program_executable -f parameter_file

Example:

./Test_Main -t parameters.json
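
When scripting several runs, a small wrapper can catch a missing executable or parameter file before CATE is launched. This is a minimal sketch under stated assumptions: run_cate and its return codes are hypothetical and not part of CATE, and the wrapper only forwards the flag and parameter file shown in the syntax above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of CATE): check both paths, then run
#   <executable> <function flag> <parameter file>
# Returns 2 if the executable is unusable, 3 if the parameter file
# cannot be read, otherwise the exit status of the program itself.
run_cate() {
    exe=$1
    flag=$2
    params=$3
    if [ ! -x "$exe" ]; then
        echo "executable not found or not executable: $exe" >&2
        return 2
    fi
    if [ ! -r "$params" ]; then
        echo "cannot read parameter file: $params" >&2
        return 3
    fi
    "$exe" "$flag" "$params"
}

# Usage, mirroring the example above:
#   run_cate ./Test_Main -t parameters.json
```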

The HELP menu lists all available functions and how each one is executed. Access it by passing -h as the function, as shown below:

./Test_Main -h


How to Cite

CATE has been published in the journal Methods in Ecology and Evolution (MEE). If you find this framework or the software useful in your analyses, please cite the published article available in MEE, CATE: A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large scale genomic data.

To cite CATE's code please use the Zenodo release:

DOI

The details of the citation are listed below:

Perera, D., Reisenhofer, E., Hussein, S., Higgins, E., Huber, C. D., & Long, Q. (2023). CATE: A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large scale genomic data. Methods in Ecology and Evolution, 00, 1–15. https://doi.org/10.1111/2041-210X.14168.


MIT License

Copyright (c) 2022 The Long Lab

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.