Skip to content

yuanx749/deep-vaccine

Repository files navigation

deep-vaccine

Predict multi-epitope vaccine subunit candidates using NLP.

Data

Immune Epitope Database (IEDB)

Construct datasets with data/preprocess.py (notebook format used by mainstream editors).

Environment

Usage

To train, python train.py, or use LSTM.ipynb on Colab. Models are saved in ./runs.

Predict a list of sequences using a model saved at path_to_model as follows:

from api import Predictor

seqs = """
PVAGAAIAAPVAGQQGPQRR
IAADFVEDQEVCKNYTGTVVGFASMVA
ADGAYRFLSGTAAVLAAAETAEAKAAAAAE
GDNLKGIVVIKDRNIGVLGENGSHMPDRCN
""".split()

predictor = Predictor(path_to_model)
predictor.predict_proba(seqs)

An application on the spike protein of SARS-CoV-2 is in example.py.

Misc.

  • Models: (CNN+) LSTM/GRU, Transformer
  • Different tokenizers and pooling
  • Visualize models, data, training: tensorboard --logdir=runs

If you find this helpful, please consider citing (bib).

About

Predict multi-epitope vaccine subunit candidates using NLP.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published