README.md: 35 additions & 0 deletions
@@ -108,6 +108,41 @@ In some cases using precomputed database can still be useful. For the following
If no index was created (`MMSEQS_NO_INDEX=1` was set), then `--db-load-mode` does not do anything and can be ignored.

### Generating MSAs on the GPU

[GPU-accelerated search for MMseqs2](https://www.biorxiv.org/content/10.1101/2024.11.13.623350v1) was recently introduced and is now supported in ColabFold. To use it, you need to adjust both the database setup and how you run `colabfold_search`.

#### GPU database setup

To set up the GPU databases, run the `setup_databases.sh` script with `GPU=1`:

```shell
GPU=1 ./setup_databases.sh /path/to/db_folder
```

This downloads and sets up the GPU databases in the specified folder. Note that we do not set `MMSEQS_NO_INDEX=1` here, because the GPU search uses the indices and keeps them in GPU memory.

#### GPU search with colabfold_search

To run the MSA search on the GPU, it is recommended, though not required, to start a GPU server before running the search. The server keeps the indices in GPU memory and is used to accelerate the search. To start a GPU server, run:
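
As a minimal sketch, assuming the `mmseqs gpuserver` subcommand and the database names `colabfold_envdb_202108_db` and `uniref30_2302_db` (check what `setup_databases.sh` actually created in your folder), one server per database could be started like this. The `--max-seqs` value is only an example, and `stop_gpu_servers` is a hypothetical helper added for illustration:

```shell
# Assumed, illustrative values; adjust them to your setup.
DB_DIR="/path/to/db_folder"

# Optional: restrict the servers to GPUs 0 and 1.
export CUDA_VISIBLE_DEVICES=0,1

# Start one GPU server per database in the background and record its PID.
# The database names below are assumptions, not taken from this section.
mmseqs gpuserver "$DB_DIR/colabfold_envdb_202108_db" --max-seqs 10000 &
PID1=$!
mmseqs gpuserver "$DB_DIR/uniref30_2302_db" --max-seqs 10000 &
PID2=$!

# Hypothetical helper: call it once all searches have finished.
stop_gpu_servers() {
  kill "$PID1" "$PID2"
}
```
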
By default, this server uses all available GPUs and splits the database evenly across them. To restrict the number of GPUs used, set the environment variable `CUDA_VISIBLE_DEVICES` to a specific GPU or set of GPUs, e.g., `CUDA_VISIBLE_DEVICES=0,1`. You can control how many sequences are loaded onto the GPU with the `--max-seqs` option. If your database is larger than the available GPU memory, the GPU server swaps the required data in and out of GPU memory, overlapping data transfer with computation. The GPU server runs in the background until you stop it explicitly by killing its processes with `kill $PID1` and `kill $PID2`.

You can then run `colabfold_search` with the `--gpu` and `--gpu-server` options enabled:
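
A minimal sketch of such a call, assuming that both flags take `1` to enable them and that the positional arguments are the query FASTA file, the database folder, and the output folder. All paths are illustrative placeholders:

```shell
# Assumed, illustrative paths; replace them with your own.
QUERY="input_sequences.fasta"   # FASTA file with the query sequences
DB_DIR="/path/to/db_folder"     # folder populated by setup_databases.sh
OUT_DIR="msas"                  # folder that will receive the MSAs

# Assumption: --gpu 1 enables the GPU search and --gpu-server 1 makes it
# use the GPU servers that are already running.
if command -v colabfold_search >/dev/null 2>&1; then
  colabfold_search --gpu 1 --gpu-server 1 "$QUERY" "$DB_DIR" "$OUT_DIR"
else
  echo "colabfold_search not found on PATH" >&2
fi
```
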

If you do not want to start a GPU server, you can also run the search with only the `--gpu` option enabled, but using the GPU server is generally faster. As with the GPU server, you can control which GPUs are used for the search via the `CUDA_VISIBLE_DEVICES` environment variable.

### Tutorials & Presentations
- ColabFold Tutorial presented at the Boston Protein Design and Modeling Club. [[video]](https://www.youtube.com/watch?v=Rfw7thgGTwI)[[slides]](https://docs.google.com/presentation/d/1mnffk23ev2QMDzGZ5w1skXEadTe54l8-Uei6ACce8eI).