Multi-threading, banding, other speed-ups #3255

Open
jaysunl opened this issue May 19, 2024 · 3 comments
Labels
question a user question how to do certain things

Comments


jaysunl commented May 19, 2024

Platform

  • SeqAn version: 3
  • Operating system: Linux raptor.ucsd.edu 5.4.0-149-generic #166-Ubuntu SMP Tue Apr 18 16:51:45 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Compiler: gcc (Ubuntu 13.1.0-8ubuntu1~20.04.2) 13.1.0

Question

Can someone explain how to use multi-threading to gain a significant speed-up? My multi-threaded version seems to be slower than the version without multi-threading. An example code snippet would help (ideally with FASTA files, but a vector example also works). I tried following the example in the docs and the speed didn't improve for me.

@jaysunl jaysunl added the question a user question how to do certain things label May 19, 2024
Contributor

rrahn commented May 21, 2024

Hi @jaysunl can you please specify what you are trying to do?
The best way to do this would be to give a minimal working example of what you are parallelizing and how you are doing it.
Best regards

Member

eseiler commented May 21, 2024

"I tried following the example in the docs and the speed didn't improve for me."

Looks like you forgot to reference the example you tried out?

@jaysunl jaysunl changed the title Multi-threading Multi-threading, banding, other speed-ups May 21, 2024
Author

jaysunl commented May 21, 2024

Yes, apologies, I tried these examples.
Multi-threading with callback:

    using namespace seqan3::literals;
    using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
    auto start = std::chrono::high_resolution_clock::now();

    std::vector<sequence_pair_t> sequences{100000, {"CAGGCATGAGCCACTACTCCTGTTTTTTAGAGGATATAGATAGAATGGATCCTGTGTCCCATAATAAATTAAGGGCAACTTGTCACACCCCTTCCATACAAAGACTGAATCAGCAGACACCACAGCCAAATCAGAGGGAAGGATGGCATGGGCTTGCTTGGTTAAGCAACAGAATAACAGCAATAATAACATAAATATAATTGCAATTTATGAGTTCTTGTTATTTGCCAGGTTCTGTAATTAATGCCATCATTAC"_dna4, 
    "AATACCTGTTTTTAGAGGTATAGTAATAGAGTAGATGTGCCTCCCATAATAAATAGGGCTACTTGTACAAATACCCACCTTCCAACAAAGGACCTAATCAGCAGACACAAGAGCCAAAGCAGAGCGAAGGAATGCACATGGGCTTAGCTTGTAAAGCAAAGAGTAACAGCAAAAAATCATAAATTAAATTTCCAATTTAGGTTCATTTCATTGCCAGGTATCGAATCAATGGCTGATATTACTATCTACTTTTTGT"_dna4}};

    auto alignment_config = seqan3::align_cfg::method_global{} 
                                | seqan3::align_cfg::scoring_scheme{
                                  seqan3::nucleotide_scoring_scheme{}}  
                                | seqan3::align_cfg::gap_cost_affine{} 
                                | seqan3::align_cfg::output_score{} 
                                | seqan3::align_cfg::output_alignment{}
                                | seqan3::align_cfg::parallel{4};
    std::mutex write_to_debug_stream{};
    auto const alignment_config_with_callback = alignment_config |
                                                seqan3::align_cfg::on_result{[&] (auto && result)
                                                {
                                                    std::lock_guard sync{write_to_debug_stream}; // critical section
                                                    //seqan3::debug_stream << result << '\n';
                                                }};
    seqan3::align_pairwise(sequences, alignment_config_with_callback);

and then multi-threading without callback:

    using namespace seqan3::literals;
    using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
    auto start = std::chrono::high_resolution_clock::now();

    std::vector<sequence_pair_t> sequences{100000, {"CAGGCATGAGCCACTACTCCTGTTTTTTAGAGGATATAGATAGAATGGATCCTGTGTCCCATAATAAATTAAGGGCAACTTGTCACACCCCTTCCATACAAAGACTGAATCAGCAGACACCACAGCCAAATCAGAGGGAAGGATGGCATGGGCTTGCTTGGTTAAGCAACAGAATAACAGCAATAATAACATAAATATAATTGCAATTTATGAGTTCTTGTTATTTGCCAGGTTCTGTAATTAATGCCATCATTAC"_dna4, 
    "AATACCTGTTTTTAGAGGTATAGTAATAGAGTAGATGTGCCTCCCATAATAAATAGGGCTACTTGTACAAATACCCACCTTCCAACAAAGGACCTAATCAGCAGACACAAGAGCCAAAGCAGAGCGAAGGAATGCACATGGGCTTAGCTTGTAAAGCAAAGAGTAACAGCAAAAAATCATAAATTAAATTTCCAATTTAGGTTCATTTCATTGCCAGGTATCGAATCAATGGCTGATATTACTATCTACTTTTTGT"_dna4}};

    auto alignment_config = seqan3::align_cfg::method_global{} 
                                | seqan3::align_cfg::scoring_scheme{
                                  seqan3::nucleotide_scoring_scheme{}}  
                                | seqan3::align_cfg::gap_cost_affine{} 
                                | seqan3::align_cfg::output_score{} 
                                | seqan3::align_cfg::output_alignment{}
                                | seqan3::align_cfg::parallel{4};
    seqan3::align_pairwise(sequences, alignment_config);

and then standard sequential procedure:

    using namespace seqan3::literals;
    using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
    auto start = std::chrono::high_resolution_clock::now();

    std::vector<sequence_pair_t> sequences{100000, {"CAGGCATGAGCCACTACTCCTGTTTTTTAGAGGATATAGATAGAATGGATCCTGTGTCCCATAATAAATTAAGGGCAACTTGTCACACCCCTTCCATACAAAGACTGAATCAGCAGACACCACAGCCAAATCAGAGGGAAGGATGGCATGGGCTTGCTTGGTTAAGCAACAGAATAACAGCAATAATAACATAAATATAATTGCAATTTATGAGTTCTTGTTATTTGCCAGGTTCTGTAATTAATGCCATCATTAC"_dna4, 
    "AATACCTGTTTTTAGAGGTATAGTAATAGAGTAGATGTGCCTCCCATAATAAATAGGGCTACTTGTACAAATACCCACCTTCCAACAAAGGACCTAATCAGCAGACACAAGAGCCAAAGCAGAGCGAAGGAATGCACATGGGCTTAGCTTGTAAAGCAAAGAGTAACAGCAAAAAATCATAAATTAAATTTCCAATTTAGGTTCATTTCATTGCCAGGTATCGAATCAATGGCTGATATTACTATCTACTTTTTGT"_dna4}};

    auto alignment_config = seqan3::align_cfg::method_global{} 
                                | seqan3::align_cfg::scoring_scheme{
                                  seqan3::nucleotide_scoring_scheme{}}  
                                | seqan3::align_cfg::gap_cost_affine{} 
                                | seqan3::align_cfg::output_score{} 
                                | seqan3::align_cfg::output_alignment{}; // notice: no parallel specification
    seqan3::align_pairwise(sequences, alignment_config);

But all three versions ran at about the same speed, and in some cases the parallel versions were actually slower. Increasing the number of alignments and the thread count did not change much either. Somewhat unrelated: a local alignment is sometimes slower than a global alignment, which seems odd to me. Banding also does not reduce the alignment time as much as I expected. Any tips?
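
The snippets above record `start` but do not show how the elapsed time is taken. A minimal sketch of one way to measure it, assuming a hypothetical helper `time_ms` and a stand-in workload `checksum` (both are illustrative names and not part of SeqAn3):

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of SeqAn3): runs `work` once and returns
// the elapsed wall-clock time in milliseconds.
template <typename work_t>
double time_ms(work_t && work)
{
    // steady_clock is monotonic, so it is not affected by system clock adjustments.
    auto const start = std::chrono::steady_clock::now();
    work();
    auto const stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Stand-in workload (not an alignment): sums all values so that the
// timed code produces a result the caller actually uses.
std::uint64_t checksum(std::vector<std::uint32_t> const & values)
{
    std::uint64_t sum = 0;
    for (auto const value : values)
        sum += value;
    return sum;
}
```

With the alignment code, the callable passed to `time_ms` would wrap the `seqan3::align_pairwise` call. If the returned result range is computed lazily, which is my understanding of the SeqAn3 documentation but is an assumption here, the callable should also iterate over the results so that the versions with and without a callback measure the same amount of work.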

3 participants