A high-performance Rust library for phylogenetic analysis and multiple sequence alignment under maximum likelihood and parsimony optimality criteria.
Current Functionality • Getting Started • Crate Features • Roadmap • Contributing • Related Projects • Support • Citation • Licence and Attributions
- Maximum Likelihood Phylogenetic Analysis: Efficient implementation of phylogenetic tree inference with SPR moves under likelihood or parsimony cost functions;
- Multiple Sequence Alignment (MSA): Support for Multiple Sequence Alignment using the IndelMaP algorithm (paper, python implementation);
- Sequence Evolution Models: Support for various DNA (JC69, K80, TN93, HKY, GTR) and protein (WAG, HIVB, BLOSUM62) substitution models as well as the Poisson Indel Process (PIP) (paper) model;
- High Performance: Optimised tree search with optional parallel processing capabilities.
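To make the idea of a substitution model concrete, here is a minimal standalone sketch of the simplest DNA model listed above, JC69. It uses the textbook closed-form transition probabilities and does not call this crate; the function names `jc69_transition_probs` and `jc69_pairwise_log_likelihood` are hypothetical and chosen only for illustration.

```rust
/// JC69 transition probabilities after a branch of length `t`
/// (measured in expected substitutions per site).
/// Returns (P(same nucleotide), P(one specific different nucleotide)).
fn jc69_transition_probs(t: f64) -> (f64, f64) {
    let decay = (-4.0 * t / 3.0).exp();
    let same = 0.25 + 0.75 * decay;
    let different = 0.25 - 0.25 * decay;
    (same, different)
}

/// Log-likelihood of two gap-free, equal-length sequences separated by a
/// branch of length `t` under JC69 (uniform base frequencies of 1/4).
/// Illustration only: no gaps or ambiguity codes are handled.
fn jc69_pairwise_log_likelihood(a: &str, b: &str, t: f64) -> f64 {
    assert_eq!(a.len(), b.len(), "sequences must have equal length");
    let (same, different) = jc69_transition_probs(t);
    a.bytes()
        .zip(b.bytes())
        .map(|(x, y)| {
            let p = if x == y { same } else { different };
            // Stationary frequency of the first base times the transition probability.
            (0.25 * p).ln()
        })
        .sum()
}
```

For a zero-length branch, `jc69_transition_probs(0.0)` returns `(1.0, 0.0)`, and for very long branches both values approach 0.25. The richer models in the list (K80, HKY, GTR and so on) add rate and frequency parameters on top of this basic scheme.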
Note: This crate is not yet published on crates.io. To use it directly from GitHub, add this to your Cargo.toml:
[dependencies]
phylo = { git = "https://github.com/acg-team/rust-phylo", package = "phylo" }
Once published on crates.io, you'll be able to use:
[dependencies]
phylo = "0.1.0"
Minimum Supported Rust Version: 1.82.0
MSRV detected using cargo-msrv.
use phylo::likelihood::TreeSearchCost;
use phylo::optimisers::TopologyOptimiser;
use phylo::phylo_info::PhyloInfoBuilder;
use phylo::substitution_models::{SubstModel, SubstitutionCostBuilder, K80};
// NOTE: `SprOptimiser` and `FakeGenerator` are used below and must also be
// brought into scope; their module paths are not shown in this example.

fn main() -> std::result::Result<(), anyhow::Error> {
    // Note: This example uses test data from the repository
    let info = PhyloInfoBuilder::new("./examples/data/K80.fasta").build()?;
    let k80 = SubstModel::<K80>::new(&[], &[4.0, 1.0]);
    let c = SubstitutionCostBuilder::new(k80, info).build()?;
    // Record the cost before any optimisation
    let unopt_cost = c.cost();
    let rng = FakeGenerator::default();
    // Run the SPR-based topology search
    let optimiser = TopologyOptimiser::new(c, SprOptimiser {}, &rng);
    let result = optimiser.run()?;
    assert_eq!(unopt_cost, result.initial_cost);
    // The search is expected to improve on the starting cost
    assert!(result.final_cost > result.initial_cost);
    assert!(result.iterations <= 100);
    // The initial tree has 9 nodes (5 leaves and 4 internal nodes), and so should the resulting tree.
    assert_eq!(result.cost.tree().len(), 9);
    Ok(())
}
This crate supports several optional features:
- par-regraft: Enable parallel regrafting operations using Rayon;
- par-regraft-chunk: Enable chunked parallel regrafting;
- par-regraft-manual: Enable manual parallel regrafting control;
- precomputed-test-results: Speed up test runs with precomputed results (for local development).
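As a rough illustration of what chunked parallel regrafting can look like in general, the sketch below scores a list of candidate regraft positions in chunks on separate threads and keeps the best one. It is not this crate's implementation: it uses scoped threads from the standard library rather than Rayon so that it stays dependency-free, and `score_candidate` is a hypothetical toy stand-in for the real cost of regrafting a pruned subtree at a given position.

```rust
use std::thread;

/// Hypothetical stand-in for the cost of regrafting at `position`.
/// This toy score is highest at position 7.
fn score_candidate(position: usize) -> f64 {
    -((position as f64) - 7.0).powi(2)
}

/// Scores candidates in chunks of `chunk_size`, one scoped thread per chunk,
/// and returns the best (position, score) pair, or `None` if there are no candidates.
fn best_candidate_parallel(candidates: &[usize], chunk_size: usize) -> Option<(usize, f64)> {
    let chunk_size = chunk_size.max(1); // `chunks` panics on a chunk size of zero
    thread::scope(|scope| {
        // Spawn one worker per chunk; each returns the best candidate in its chunk.
        let handles: Vec<_> = candidates
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|&pos| (pos, score_candidate(pos)))
                        .max_by(|a, b| a.1.total_cmp(&b.1))
                })
            })
            .collect();
        // Reduce the per-chunk winners to a single overall winner.
        handles
            .into_iter()
            .filter_map(|handle| handle.join().expect("scoring thread panicked"))
            .max_by(|a, b| a.1.total_cmp(&b.1))
    })
}
```

With candidates `0..20` and a chunk size of 4, the returned position is 7, since the toy score peaks there. Scoring each candidate independently is what makes this step easy to parallelise; how the crate's three `par-regraft*` features differ internally is not covered by this sketch.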
Enable features in your Cargo.toml:
[dependencies]
phylo = { git = "https://github.com/acg-team/rust-phylo", package = "phylo", features = ["par-regraft"] }
This crate is new and in active development. The existing functionality is listed above; the following features are currently being implemented or planned:
- Simultaneous tree and alignment estimation under the PIP model (paper);
- Maximum likelihood tree search using NNI moves under the TKF92 long indel model (paper);
- Extension to the PIP model that includes long insertions (manuscript in preparation);
- Ancestral state reconstruction using PIP (paper), TKF92 and IndelMaP (paper);
- Randomised starting trees for tree inference;
- Generalisation of the tree structure for easier use in other crates.
Other minor features/improvements are documented on the GitHub issues page.
This is a new library that is currently in active development. Contributions are highly welcome!
API Stability: As this crate is in active development, the API may change between versions until we reach 1.0. We'll follow semantic versioning and document breaking changes in release notes.
Please read our contributor guide!
- Jūlija Pečerska (GitHub, email);
- Mattes Mrzik (GitHub, email);
- Dmitrii Iartsev (GitHub, email);
- Merlin Maggi (GitHub);
- Luca Müller (GitHub);
- Kai Davidson (GitHub).
For questions, bug reports, or feature requests, please visit the rust-phylo discussion page and/or open an issue on GitHub.
If you use this library in your research, please consider citing:
@software{phylo_rust,
title = {Phylo: A Rust library for phylogenetic analysis},
author = {Pečerska, Jūlija and Mrzik, Mattes and Iartsev, Dmitrii and Gil, Manuel and Anisimova, Maria},
url = {https://github.com/acg-team/rust-phylo},
year = {2025}
}
This project is licensed under either of
- Apache Licence, Version 2.0 (LICENSE-APACHE or www.apache.org/licenses/LICENSE-2.0), or
- MIT Licence (LICENSE-MIT or opensource.org/licenses/MIT)
at your option.
Datasets for benchmarking were taken from:
- Zhou, Xiaofan (2017). Single-gene alignments. figshare. Dataset. Link;
- Zhou, Xiaofan (2017). Supermatrices. figshare. Dataset. Link.
The datasets are licensed under CC BY 4.0.
The datasets were modified by normalising invalid/unrecognised sequence characters since the exact sequences are less relevant for pure performance measurements.