Phylo

A high-performance Rust library for phylogenetic analysis and multiple sequence alignment under maximum likelihood and parsimony optimality criteria.

Current Functionality • Getting Started • Crate Features • Roadmap • Contributing • Related Projects • Support • Citation • Licence and Attributions

Current Functionality

  • Maximum Likelihood Phylogenetic Analysis: Efficient phylogenetic tree inference with SPR (subtree pruning and regrafting) moves under likelihood or parsimony cost functions;
  • Multiple Sequence Alignment (MSA): Multiple sequence alignment using the IndelMaP algorithm (paper, Python implementation);
  • Sequence Evolution Models: Support for various DNA (JC69, K80, TN93, HKY, GTR) and protein (WAG, HIVB, BLOSUM62) substitution models as well as the Poisson Indel Process (PIP) (paper) model;
  • High Performance: Optimised tree search with optional parallel processing capabilities.
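
To make the substitution models listed above more concrete, here is a minimal, standalone sketch of the closed-form K80 transition probabilities along a single branch. It does not use the phylo crate's API. The function name `k80_probs`, the struct `K80Probs`, and the choice of unnormalised rates `alpha` (transitions) and `beta` (each transversion) are assumptions made for illustration; the crate's own parametrisation and any rate normalisation may differ.

```rust
/// Probabilities of the three possible outcomes for one site under K80.
#[derive(Debug, Clone, Copy)]
struct K80Probs {
    /// Probability that the nucleotide is unchanged.
    same: f64,
    /// Probability of the single transition (A<->G or C<->T).
    transition: f64,
    /// Probability of each of the two transversions.
    transversion: f64,
}

/// Closed-form K80 probabilities for transition rate `alpha`,
/// per-transversion rate `beta`, and branch length `t` (in time units).
/// Illustrative helper only; not part of the phylo crate.
fn k80_probs(alpha: f64, beta: f64, t: f64) -> K80Probs {
    let e1 = (-4.0 * beta * t).exp();
    let e2 = (-2.0 * (alpha + beta) * t).exp();
    K80Probs {
        same: 0.25 + 0.25 * e1 + 0.5 * e2,
        transition: 0.25 + 0.25 * e1 - 0.5 * e2,
        transversion: 0.25 - 0.25 * e1,
    }
}

fn main() {
    let p = k80_probs(4.0, 1.0, 0.1);
    // One "same" outcome, one transition and two transversions make up a row.
    let row_sum = p.same + p.transition + 2.0 * p.transversion;
    println!("{p:?}, row sum = {row_sum}");
}
```

With `alpha` larger than `beta`, a transition is more probable than either transversion on any branch of positive length, and all four outcomes approach 1/4 as the branch grows long.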

Getting Started

Note: This crate is not yet published on crates.io. To use it directly from GitHub, add this to your Cargo.toml:

[dependencies]
phylo = { git = "https://github.com/acg-team/rust-phylo", package = "phylo" }

Once published on crates.io, you'll be able to use:

[dependencies]
phylo = "0.1.0"

Minimum Supported Rust Version: 1.82.0

MSRV detected using cargo-msrv.

Example

use phylo::likelihood::TreeSearchCost;
use phylo::optimisers::TopologyOptimiser;
use phylo::phylo_info::PhyloInfoBuilder;
use phylo::substitution_models::{SubstModel, SubstitutionCostBuilder, K80};
// `SprOptimiser` and `FakeGenerator` must also be in scope; their module
// paths are not shown in this example, so check the crate documentation.

fn main() -> std::result::Result<(), anyhow::Error> {
    // Note: this example uses test data from the repository.
    let info = PhyloInfoBuilder::new("./examples/data/K80.fasta").build()?;
    let k80 = SubstModel::<K80>::new(&[], &[4.0, 1.0]);
    let c = SubstitutionCostBuilder::new(k80, info).build()?;
    // Cost of the starting tree, before any topology changes.
    let unopt_cost = c.cost();
    let rng = FakeGenerator::default();
    let optimiser = TopologyOptimiser::new(c, SprOptimiser {}, &rng);
    let result = optimiser.run()?;
    assert_eq!(unopt_cost, result.initial_cost);
    // The search is expected to improve (increase) the cost.
    assert!(result.final_cost > result.initial_cost);
    assert!(result.iterations <= 100);
    // The initial tree has 9 nodes (5 leaves and 4 internal nodes), and so should the resulting tree.
    assert_eq!(result.cost.tree().len(), 9);
    Ok(())
}
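
The node count checked at the end of the example follows from simple arithmetic: a rooted, strictly binary tree with n leaves has n - 1 internal nodes and 2n - 1 nodes in total. The sketch below only illustrates that count. The function `rooted_binary_node_count` is hypothetical and not part of the crate, and it assumes the tree is rooted and strictly binary, which matches the 5 leaves and 4 internal nodes mentioned in the example.

```rust
/// Total number of nodes in a rooted, strictly binary tree with `leaves` tips.
/// Returns `None` for an empty tree. Hypothetical helper for illustration.
fn rooted_binary_node_count(leaves: usize) -> Option<usize> {
    if leaves == 0 {
        None
    } else {
        // n leaves plus n - 1 internal nodes.
        Some(2 * leaves - 1)
    }
}

fn main() {
    let leaves = 5;
    let total = rooted_binary_node_count(leaves).expect("tree must have at least one leaf");
    println!("{leaves} leaves -> {} internal, {total} total", total - leaves);
}
```

Because SPR moves rearrange the topology without adding or removing leaves, the count stays the same before and after the search.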

Crate Features

This crate supports several optional features:

  • par-regraft: Enable parallel regrafting operations using Rayon;
  • par-regraft-chunk: Enable chunked parallel regrafting;
  • par-regraft-manual: Enable manual parallel regrafting control;
  • precomputed-test-results: Speed up test runs with precomputed results (for local development).

Enable features in your Cargo.toml:

[dependencies]
phylo = { git = "https://github.com/acg-team/rust-phylo", package = "phylo", features = ["par-regraft"] }
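
As a rough illustration of how an optional Cargo feature can select between code paths at compile time, the standalone sketch below switches between a sequential and a threaded search using `cfg` attributes. Everything in it is hypothetical: `score` and `best_position` are invented stand-ins, the parallel variant uses scoped threads from the standard library rather than Rayon to stay dependency-free, and nothing here reflects how the crate actually implements `par-regraft`.

```rust
/// Hypothetical stand-in for the cost of regrafting at one candidate position.
fn score(position: u32) -> u32 {
    (position * 37) % 11
}

/// Sequential search, compiled when the `par-regraft` feature is off.
#[cfg(not(feature = "par-regraft"))]
fn best_position(candidates: &[u32]) -> Option<u32> {
    candidates.iter().copied().max_by_key(|&p| score(p))
}

/// Two-way parallel search, compiled only when the `par-regraft` feature is on.
#[cfg(feature = "par-regraft")]
fn best_position(candidates: &[u32]) -> Option<u32> {
    use std::thread;
    let (left, right) = candidates.split_at(candidates.len() / 2);
    let (a, b) = thread::scope(|s| {
        // Score the left half on a worker thread and the right half here.
        let handle = s.spawn(|| left.iter().copied().max_by_key(|&p| score(p)));
        let b = right.iter().copied().max_by_key(|&p| score(p));
        (handle.join().expect("worker thread panicked"), b)
    });
    // Combine the two partial results.
    a.into_iter().chain(b).max_by_key(|&p| score(p))
}

fn main() {
    let candidates = [1, 2, 3, 4, 5];
    println!("parallel build: {}", cfg!(feature = "par-regraft"));
    println!("best position: {:?}", best_position(&candidates));
}
```

Only one of the two `best_position` definitions is compiled in any given build, so callers use the same function name regardless of which features are enabled.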

Roadmap

This crate is new and under active development. The existing functionality is listed above; the following features are in progress or planned:

  • Simultaneous tree and alignment estimation under the PIP model (paper);
  • Maximum likelihood tree search using NNI moves under the TKF92 long indel model (paper);
  • Extension to the PIP model that includes long insertions (manuscript in preparation);
  • Ancestral state reconstruction using PIP (paper), TKF92 and IndelMaP (paper);
  • Randomised starting trees for tree inference;
  • Generalisation of the tree structure for easier use in other crates.

Other minor features/improvements are documented on the GitHub issues page.

Contributing

This is a new library that is currently in active development. Contributions are highly welcome!

API Stability: As this crate is in active development, the API may change between versions until we reach 1.0. We'll follow semantic versioning and document breaking changes in release notes.

New Contributors

Please read our contributor guide!

Current Contributors:

Related Projects

Support

For questions, bug reports, or feature requests, please visit the rust-phylo discussion page or open an issue on GitHub.

Citation

If you use this library in your research, please consider citing:

@software{phylo_rust,
  title = {Phylo: A Rust library for phylogenetic analysis},
  author = {Pečerska, Jūlija and Mrzik, Mattes and Iartsev, Dmitrii and Gil, Manuel and Anisimova, Maria},
  url = {https://github.com/acg-team/rust-phylo},
  year = {2025}
}

Licence and Attributions

Licence

This project is licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE);
  • MIT license (LICENSE-MIT);

at your option.

Benchmarking Datasets

Datasets for benchmarking were taken from:

  • Zhou, Xiaofan (2017). Single-gene alignments. figshare. Dataset. Link;
  • Zhou, Xiaofan (2017). Supermatrices. figshare. Dataset. Link.

The datasets are licensed under CC BY 4.0.

The datasets were modified by normalising invalid/unrecognised sequence characters since the exact sequences are less relevant for pure performance measurements.
