Anyway to dock many ligands with the same protein instead of loading the same protein everytime? #212

Open
sky1ove opened this issue Apr 10, 2024 · 2 comments

Comments

@sky1ove

sky1ove commented Apr 10, 2024

I'm testing a dataset of ligands against the same protein. Instead of loading the same protein PDB file every time, is there any way to load the PDB once and then just dock each new ligand?

@tornikeo

Nothing short of modifying the code a bit will work.

One way would be to "memoize" the function you suspect is the slowest, using joblib's Memory cache. Basically, find the function that causes the most delay and decorate it with @memory.cache (where memory is a joblib.Memory instance). AFAIK this is the simplest way to do what you want.
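A minimal sketch of that idea, assuming joblib is installed. The function preprocess_protein and its return value are hypothetical placeholders for whichever step turns out to be slow; they are not part of the project's code. joblib keys the cache on the function's arguments, so a second call with the same path is read back from disk instead of being recomputed.

```python
import tempfile

from joblib import Memory

# A temporary directory keeps this sketch self-contained; in practice you
# would point Memory at a persistent folder so the cache survives restarts.
cache_dir = tempfile.mkdtemp()
memory = Memory(cache_dir, verbose=0)

call_count = 0  # counts how often the function body really runs


@memory.cache
def preprocess_protein(pdb_path):
    """Hypothetical stand-in for the expensive protein preprocessing step."""
    global call_count
    call_count += 1
    return {"path": pdb_path, "n_residues": 0}


first = preprocess_protein("protein.pdb")   # computed and written to the cache
second = preprocess_protein("protein.pdb")  # loaded from the cache
```

Note that the cache is keyed on the path string, not on the file contents, so it would need clearing if the PDB file changes under the same name.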

Let me know if this helps.

@demian3b

demian3b commented May 3, 2024

What I do locally is memoize the ESM embeddings. ESM is the most expensive computation during target preprocessing, so computing each embedding only once cuts preprocessing time considerably.
It can be done by modifying the code like this:

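# Map each distinct protein input to its sequence, so that every
# protein is embedded only once.
unique_sequences = compute_unique(list_of_protein_input)

# Flatten multi-chain sequences (chains are separated by ":") into one
# batch, labelling each chain with its protein key plus the chain index.
labels, sequences = [], []
for protein_info, sequence in unique_sequences.items():
    s = sequence.split(":")
    sequences.extend(s)
    labels.extend([(*protein_info, j) for j in range(len(s))])

# Run ESM once over the unique chains, then regroup the per-chain
# embeddings by protein.
lm_embeddings = compute_ESM_embeddings(model, alphabet, labels, sequences)
unique_lm_embeddings = {}
for protein_info, sequence in unique_sequences.items():
    s = sequence.split(":")
    unique_lm_embeddings[protein_info] = [
        lm_embeddings[(*protein_info, j)] for j in range(len(s))
    ]

# Fan the embeddings back out to one entry per original input.
lm_embeddings = [
    unique_lm_embeddings[protein_info]
    for protein_info in list_of_protein_input
]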

Note that my code is based on a checkout of the v1.0 code.
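For anyone who wants to try the pattern in isolation, here is a self-contained sketch of the same deduplicate, compute once, fan out idea. It is not the project's code: embed_with_dedup and fake_embed are hypothetical names, and fake_embed merely stands in for the real ESM call so the example runs without a model.

```python
def embed_with_dedup(sequences, embed_fn):
    """Embed each distinct sequence once, then fan the results back out.

    `sequences` holds one entry per docking input; multi-chain proteins
    are ':'-separated, as in the snippet above. `embed_fn` is whatever
    function embeds a single chain.
    """
    cache = {}
    # dict.fromkeys drops duplicates while preserving first-seen order.
    for seq in dict.fromkeys(sequences):
        cache[seq] = [embed_fn(chain) for chain in seq.split(":")]
    # One entry per original input, sharing the cached embeddings.
    return [cache[seq] for seq in sequences]


calls = []  # records every chain that was really embedded


def fake_embed(chain):
    """Hypothetical stand-in for the expensive ESM embedding call."""
    calls.append(chain)
    return len(chain)


# Two inputs share the same two-chain protein; a third is different.
result = embed_with_dedup(["AAA:CC", "AAA:CC", "GG"], fake_embed)
```

Here the repeated protein is embedded only once, so calls ends up with three chains rather than five, while result still has one entry per input.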
