- 1 Programming languages
- 2 Important environment management software
- 3 Important packages
- 4 Applications
- 5 Quick references
- 6 Tools
###---Main goal---###
Learn the basic usage of each language
- understand data types, variables, and loops
- learn to use the terminal
Python Programming:
- Basics:
Variables, Data Types, and Operators
Control Flow (if statements, loops)
Functions and Modules
- Intermediate:
Lists, Dictionaries, and Sets
File Handling
Exception Handling
- Advanced:
Object-Oriented Programming (OOP)
Decorators and Generators
Virtual Environments
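As a rough illustration of the Python topics listed above, the sketch below touches a few of them in one small script: a function, a dictionary, a loop, exception handling, a decorator, and a generator. The names (`timed`, `gc_content`, `kmers`) and the example sequences are hypothetical and chosen only for illustration.

```python
import time
from functools import wraps


def timed(func):
    """Decorator: record how long the most recent successful call took."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_runtime = time.perf_counter() - start
        return result
    wrapper.last_runtime = None
    return wrapper


@timed
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("sequence is empty")
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def kmers(seq, k):
    """Generator: yield each overlapping k-mer of seq, one at a time."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


# Dictionary of hypothetical example sequences (name -> DNA string)
sequences = {"seq1": "ATGC", "seq2": "GGCC", "seq3": ""}

results = {}
for name, seq in sequences.items():
    try:
        results[name] = gc_content(seq)
    except ValueError as err:
        # Exception handling: record the failure and keep going
        results[name] = None
        print(f"skipping {name}: {err}")

print(results)
print(list(kmers("ATGCA", 3)))
```

The empty sequence is there on purpose, to show how `try`/`except` lets the loop continue after one bad input.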
- Python
- R
- shell
I highly recommend reading "Computing Skills for Biologists: A Toolbox"
- Python Programming website (https://www.w3schools.com/python/python_variables.asp)
- R Programming Website (https://daviddalpiaz.github.io/appliedstats/introduction-to-r.html)
- Computing Skills for Biologists: A Toolbox
###---Main goal---###
- learn to use platforms that support programming (such as an IDE)
- learn to manage environments and packages
- learn to use version control
- Anaconda (https://www.anaconda.com/products/individual)
- learn to use conda (https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html)
comments: Anaconda is a useful tool for managing Python packages and environments
- PyCharm (https://www.jetbrains.com/pycharm/)
- learn to create your first project (https://www.jetbrains.com/help/pycharm/creating-and-running-your-first-python-project.html)
- learn to create a conda env in PyCharm (https://www.jetbrains.com/help/pycharm/conda-support-creating-conda-virtual-environment.html#conda-requirements)
comments: PyCharm is a powerful IDE that supports several programming languages (Python, R, etc.). It is used for writing, testing, and debugging code.
- git and GitHub (https://product.hubspot.com/blog/git-and-github-tutorial-for-beginners)
- add version control (VCS) in PyCharm (https://www.jetbrains.com/help/pycharm/version-control-integration.html)
comments: version control is extremely important because it tracks every change you make to your files
###---Main goal---###
understand the main functions of each package
- Biopython (http://biopython.org/DIST/docs/tutorial/Tutorial.html)
- Pandas
- prody
- Igblast
- PyIR
- git
- snakemake
- numpy
- scikit-learn
- pytorch
- tensorflow
comments:
#1 Biopython is a very helpful package for biological applications, so I highly recommend going through its tutorial thoroughly
#2 I also provide some cheat sheets in the text_book file to assist with programming
- ggplot2 (http://www.sthda.com/english/wiki/ggplot2-essentials)
- dplyr
- ggpubr
###---Main goal---###
run and test some toy experiments
- H3N2 NA antigenic region DMS (https://github.com/Wangyiquan95/NA_EPI)
- H3N2 HA egg-passaging adaptation (https://github.com/Wangyiquan95/HA_egg_passage)
- SARS-CoV-2 cell culture-adaptive mutations (https://github.com/nicwulab/SARS-CoV-2_in_vitro_adaptation)
###---Main goal---###
- Fundamentals:
Supervised vs. Unsupervised Learning
Types of Machine Learning Algorithms (self-supervised, generative, etc.)
Training and Testing Data
- Common Models:
Transformer
GPT
BERT
GNN
Diffusion
- Evaluation and Optimization:
Metrics (Accuracy, Precision, Recall)
Weights & Biases, wandb (Visualization)
Hyperparameter Tuning
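To make the "Training and Testing Data" and "Metrics" items above concrete, here is a minimal sketch in plain Python. In practice a library such as scikit-learn would usually be used for this. The helper names (`train_test_split`, `binary_metrics`) and the label lists are hypothetical, written only for this illustration.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hold out 20% of ten hypothetical samples for testing
train, test = train_test_split(range(10), test_fraction=0.2)
print(len(train), len(test))

# Hypothetical true labels and model predictions
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(binary_metrics(y_true, y_pred))
```

Fixing the seed makes the split reproducible, which matters when comparing hyperparameter settings on the same held-out data.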
- An explainable language model for antibody specificity prediction (https://github.com/Wangyiquan95/HA1)
- Deep learning model for antigen identification (https://github.com/nicwulab/SARS-CoV-2_Abs)
- H3N2 NA antigenic region DMS regression (https://github.com/Wangyiquan95/NA_EPI)
###---Main goal---###
- goals:
Build a home website for running a deep learning model on a server (AWS EC2)
1. Set Up an AWS EC2 Instance
2. Install Necessary Software (conda)
3. Develop the Web Application (Flask)
4. Overview
flask --> gunicorn <--> nginx <--> requests
If you've never run ssh before, you will need to create a .ssh directory. Run
mkdir -p ~/.ssh && chmod 700 ~/.ssh
Create a config file inside the .ssh directory by
touch ~/.ssh/config
chmod 600 ~/.ssh/config
Add the server info to config (replace id with your own user name)
Host wulab
HostName nicwulab-linux.life.illinois.edu
User id
Port 22
To log in to the server, in the terminal, run
ssh wulab
To copy files from your computer to the (remote) server, run (change the paths accordingly)
scp ~/local/path wulab:/home/server/path
To download files from the server, run
scp -r wulab:/home/server/path ~/local/path
Download Miniconda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Install it by
bash Miniconda3-latest-Linux-x86_64.sh
Create a new env and activate it by
conda create --name Env_name python=3.9
conda activate Env_name
Remove an env by
conda remove --name myenv --all
Create a new env from a yml file by
conda env create -f environment.yml
Save a conda env as a yml file by
conda activate my_env
conda env export > path/to/environment.yml
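After activating an environment, it can help to confirm from inside Python which interpreter is actually running. The sketch below is one way to do that, assuming conda sets the `CONDA_DEFAULT_ENV` variable on activation (it will be `None` outside conda). `describe_environment` is a hypothetical helper name.

```python
import os
import sys


def describe_environment():
    """Collect basic facts about the Python interpreter in use."""
    return {
        "python_version": sys.version.split()[0],
        "executable": sys.executable,  # path of the running interpreter
        "prefix": sys.prefix,          # root directory of the environment
        # Assumption: set by `conda activate`; None when conda is not active
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If `executable` does not point inside the environment you just activated, the shell is probably still using a different Python.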
Initialize a git repo and connect it to a GitHub repo by
git init
git add README.md
git commit -m "README.md"
git branch -M main
git remote add origin <repository-url>
Stop tracking a file in git (the local copy is kept) and push the change by
git rm --cached file.csv
git commit -m "Removed files"
git push -u <remote> <branch>
Fetch and merge from GitHub by
git fetch <remote>
git merge <remote>/<branch>
Connect to the server and start Jupyter (run either the notebook or the lab command below) by
ssh wulab
jupyter notebook --no-browser --port=8080
jupyter lab --no-browser --port=8080
Open another local terminal and forward the port by
ssh -N -L 8080:localhost:8080 wulab
Copy the Jupyter lab URL that appears, and paste it into your web browser.
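If the browser cannot reach Jupyter, one quick check is whether anything is accepting connections on the forwarded local port. The sketch below uses only the standard library. `port_is_open` is a hypothetical helper, and port 8080 is taken from the commands above.

```python
import socket


def port_is_open(host="localhost", port=8080, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host could not be resolved
        return False


# True while the ssh tunnel (or a local Jupyter) is listening on 8080
print(port_is_open())
```

A `False` result usually means the `ssh -N -L` tunnel is not running, or that it was started with a different port.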
Create a systemd service file for Gunicorn:
sudo nano /etc/systemd/system/helloworld.service
Then add the following to the file.
[Unit]
Description=Gunicorn instance for a simple hello world app
After=network.target
[Service]
User=ubuntu
Group=www-data
WorkingDirectory=/home/ubuntu/helloworld
ExecStart=/home/ubuntu/helloworld/venv/bin/gunicorn -b localhost:8000 app:app
Restart=always
[Install]
WantedBy=multi-user.target
Then reload systemd, start the service, and enable it at boot:
sudo systemctl daemon-reload
sudo systemctl start helloworld
sudo systemctl enable helloworld
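The `app:app` argument in ExecStart tells Gunicorn to import a module named `app` and serve the WSGI callable named `app` inside it. The sketch below is a stand-in written with only the standard library rather than Flask. A Flask application object is itself a WSGI callable, so Gunicorn loads it the same way. The file name `app.py` and the response text are assumptions made for illustration.

```python
# app.py (hypothetical file placed in the service's WorkingDirectory)


def app(environ, start_response):
    """Minimal WSGI application: answer every request with plain text."""
    body = b"Hello, world!\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Send the status line and headers, then return the body as an iterable
    start_response("200 OK", headers)
    return [body]
```

With a real Flask project, the same `app:app` argument would point at the `Flask(...)` object created in `app.py`.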
Install Nginx
sudo apt-get install nginx
Start the Nginx service, then open the public IP address of your EC2 instance in a browser to see the default Nginx landing page
sudo systemctl start nginx
sudo systemctl enable nginx
Edit the default file in the sites-available folder.
sudo nano /etc/nginx/sites-available/default
Add the following code at the top of the file (below the default comments)
upstream flaskhelloworld {
    server 127.0.0.1:8000;
}
Add a proxy_pass to flaskhelloworld at location /
location / {
    proxy_pass http://flaskhelloworld;
}
Restart Nginx
sudo systemctl restart nginx
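Once Nginx has restarted, the chain can be checked end to end by requesting a page and looking at the HTTP status code. The sketch below uses only the standard library. `check_url` is a hypothetical helper, and the address shown is the Gunicorn bind address from the service file. To test through Nginx, replace it with the public IP address of your EC2 instance.

```python
import urllib.error
import urllib.request


def check_url(url, timeout=5.0):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 502
        return err.code
    except (urllib.error.URLError, OSError):
        # Nothing answered at all: refused, timed out, or unknown host
        return None


# Assumed address: Gunicorn as bound in the service file (localhost:8000)
print(check_url("http://127.0.0.1:8000"))
```

A 502 from the public address while port 8000 answers locally usually points at the Nginx upstream configuration rather than the application.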
coming soon ~