README.md: 14 additions & 19 deletions
@@ -23,7 +23,7 @@ Although only the packages and data models necessary for its proper functioning
 ## Requirements

-Other than being able to run a Docker image, it is absolutely necessary that you have the [PredictProtein Database](http://www.rostlab.org/services/ppmi/download_file?format=gzip&file_to_download=db) downloaded.
+Other than being able to run a Docker image and [having a MySQL instance to connect to](#additional-result-data-using-an-external-mysql-instance), it is absolutely necessary that you have the [PredictProtein Database](http://www.rostlab.org/services/ppmi/download_file?format=gzip&file_to_download=db) downloaded.

 To download this database manually or using a terminal/CLI:
@@ -55,6 +55,7 @@ However, in order for predictprotein to properly run, and for you to have access
 * the PredictProtein database downloaded earlier
 * storage for computed results, which predictprotein will produce
+* an accessible and [configured MySQL server](#additional-result-data-using-an-external-mysql-instance)

 This requires using [Docker bind mounts](https://docs.docker.com/storage/bind-mounts/) that bind-mount a directory from the Docker host to the Docker predictprotein container, which is configured on the command line, when running a container for the first time.
 When bind-mounted using the `docker run` command as later documented, the following directories on the Docker host will contain the following data, which will persist even after the container is shut down or erased:

-* `/var/tmp/pp-data/config` - *(optional)* configuration files affecting how predictprotein runs inside of the container. Necessary if you plan on using [MySQL result storage](#additional-result-data-using-an-external-mysql-instance)
-* `/var/tmp/pp-data/method-data/loctree3 - *(optional)* data files used for loctree3 algorithm. Including this directory will override the already-included loctree3 data files.
-* `/var/tmp/pp-data/method-data/metastudent - *(optional)* data files used for metastudent algorithm. Including this directory will override the already-included metastudent data files.
+* `/var/tmp/pp-data/config` - **(required)** configuration files affecting how predictprotein runs inside of the container. You'll need to configure [MySQL result storage](#additional-result-data-using-an-external-mysql-instance)
+* `/var/tmp/pp-data/method-data/loctree3` - *(optional)* data files used for loctree3 algorithm. Including this directory will override the already-included loctree3 data files.
+* `/var/tmp/pp-data/method-data/metastudent` - *(optional)* data files used for metastudent algorithm. Including this directory will override the already-included metastudent data files.
 * `/var/tmp/pp-data/ppcache/ppcache-data` - **(required)** predictprotein cache (ppcache) where computed results are stored, indexed by computed hash
 * `/var/tmp/pp-data/ppcache/results-retrieve` - *(optional)* may be used, when bind-mounted, to retrieve a result set from the cache (see ppc_fetch)
 * `/var/tmp/pp-data/ppcache/rost_db` - **(required)** rost_db (internal to Rostlab) or PPMI databases
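
As a rough sketch, the directories marked **(required)** in the list above could be created on the Docker host as follows. `PP_DATA` is a hypothetical variable introduced here only for illustration, and the list visible in this hunk may not be complete. The optional directories are deliberately left commented out, since the list states that including `loctree3` or `metastudent` overrides the data files already included in the image, which presumably also applies to an empty directory.

```shell
# Sketch only: PP_DATA is an assumed helper variable, not something predictprotein reads.
PP_DATA="${PP_DATA:-/var/tmp/pp-data}"

# Directories the list marks as required.
mkdir -p \
  "$PP_DATA/config" \
  "$PP_DATA/ppcache/ppcache-data" \
  "$PP_DATA/ppcache/rost_db"

# Optional directories: create these only if you intend to supply your own data.
# mkdir -p "$PP_DATA/method-data/loctree3"
# mkdir -p "$PP_DATA/method-data/metastudent"
# mkdir -p "$PP_DATA/ppcache/results-retrieve"

ls -d "$PP_DATA"/config "$PP_DATA"/ppcache/*
```

Running it a second time is harmless, because `mkdir -p` does not fail on directories that already exist.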
@@ -131,7 +132,7 @@ The predictprotein cache location in the container may be changed by defining a
 ## Running predictprotein... (Examples Please!)

-The following are some examples to help you get an idea of how predictprotein works in its Docker-ized form, using the directory tree as described in [Create Docker host data and configuration file directories for predictprotein](#create-docker-host-data-and-configuration-file-directories-for-predictprotein)
+The following are some examples to help you get an idea of how predictprotein works in its Docker-ized form, using the directory tree as described in [Create Docker host data and configuration file directories for predictprotein](#create-docker-host-data-and-configuration-file-directories-for-predictprotein), and having a [configured MySQL server](#additional-result-data-using-an-external-mysql-instance).

 ### To get help about the command
 By default, running the Docker container will produce the help information for the predictprotein command, just like running `predictprotein --help`:
@@ -147,6 +148,7 @@ $ docker run --rm predictprotein
@@ -212,11 +215,11 @@ So, now you can run it just like we do, and reproduce the same results as [predi
 ## Changing Default Configuration Files

-By default, the Docker container will look in `/etc/docker-predictprotein` (inside its container), for the configuration files needed to run.
+By default, the Docker container will look in `/etc/docker-predictprotein` (inside its container), for the configuration files needed to run. However, you'll need to adjust settings for [configuring MySQL services](#additional-result-data-using-an-external-mysql-instance).

 **Note**: every time the Docker container is run, a check is done to make sure all of the necessary configuration files exist in their expected location within the container, `/etc/docker-predictprotein`. Missing configuration files are automatically created, copied from the Docker container directory `/var/tmp/config/` to `/etc/docker-predictprotein`.

-If you would like to access the configuration files, or to enable [MySQL services](#additional-result-data-using-an-external-mysql-instance), you'll need to bind-mount from your Docker host, to the Docker container at `/etc/docker-predictprotein`:
+To access the configuration files, for instance to configure [MySQL services](#additional-result-data-using-an-external-mysql-instance), you'll need to bind-mount a directory from your Docker host to the Docker container at `/etc/docker-predictprotein`:

 ```shell
 $ docker run \
@@ -234,8 +237,6 @@ In this example, the `/var/tmp/pp-data/config` directory is bind-mounted to `/et
 With this in mind, if you edit any of the configuration files, their state will be maintained regardless of whether the container is stopped or erased. Multiple running container instances of the Docker image may also bind-mount to this directory to use the same configuration.

-Lastly, if you then decide not to bind-mount the configuration directory to the Docker host, the configuration files in the container will not be hidden by the bind-mount, and therefore be used.
-
 ### Initializing or resetting configuration files to their defaults

 In the case that you would like to reset all configuration files to the defaults contained within the Docker image, no matter what configuration files already exist, you may execute the following Docker run command:
## Additional Result Data Using an External MySQL Instance

-In addition to the result files stored in the [predictprotein cache](#the-predictprotein-cache), predictprotein also can store additional result data in various tables within an external MySQL instance. This was not included within the Docker image in order to keep size and resource usage down to a minimum, while allowing the user the possibility of tuning their MySQL server to their liking.
+In addition to the result files stored in the [predictprotein cache](#the-predictprotein-cache), predictprotein also stores additional result data in various tables within an external MySQL instance. This was not included within the Docker image in order to keep size and resource usage down to a minimum, while allowing the user the possibility of tuning their MySQL server to their liking.

 ### Adjusting predictprotein configuration files

-You can enable the MySQL services predictprotein offers, by [changing values in two of the default configuration files](#changing-default-configuration-files), namely:
+Enable the MySQL services predictprotein offers by [changing values in one of the default configuration files](#changing-default-configuration-files), namely:

 * `ppcache-my.cnf` - MySQL connection information, following the same syntax as a user-defined my.cnf file
-* `ppcacherc` - the predictprotein cache configuration file

 In `ppcache-my.cnf`, you must enter the details necessary to connect to your MySQL instance.
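
For illustration, a minimal sketch of what such a file might contain, written from the Docker host. Every value here (host name, port, user, password) is a placeholder, `CONFIG_DIR` and `CNF` are hypothetical helper variables, and the use of a `[client]` group is an assumption based on the usual my.cnf conventions; check the default `ppcache-my.cnf` shipped with the image for the groups and keys it actually expects.

```shell
# Sketch only: all connection values below are placeholders, not defaults of predictprotein.
CONFIG_DIR="${CONFIG_DIR:-/var/tmp/pp-data/config}"
CNF="$CONFIG_DIR/ppcache-my.cnf"

mkdir -p "$CONFIG_DIR"

# Keep a copy of any existing file before overwriting it.
if [ -f "$CNF" ]; then
  cp "$CNF" "$CNF.bak"
fi

# Write a my.cnf-style option file (the [client] group is an assumption).
cat > "$CNF" <<'EOF'
[client]
host=mysql.example.org
port=3306
user=ppuser
password=change-me
EOF

cat "$CNF"
```

Because the file holds a password, consider restricting its permissions, as long as the user inside the container can still read it.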

-In `ppcacherc`, all you have to do is *uncomment* one line (i.e., don't change the path or filename):
-```
-# mysql_read_default_file=/etc/ppcache/my.cnf
-```
-
-**Note**: when the `mysql_read_default_file` setting is not defined, no MySQL connections are attempted by predictprotein. If it's defined, near the end of the predictprotein run, it will attempt to connect to the MySQL instance defined in the `ppcache-my.cnf` file on the Docker host, writing additional result data to the database tables.
+**Note**: near the end of its run, predictprotein will attempt to connect to the MySQL instance defined in the `ppcache-my.cnf` file on the Docker host and write additional result data to the database tables.

 ### Creating a dedicated MySQL user and database
@@ -286,7 +281,7 @@ In order to successfully write data to the database, the proper MySQL database t
 ```shell
 # mysql -u root -p ppres < /var/tmp/pp-data/config/ppres_tables_mysql.sql
 ```
-Now, as long as the `mysql_read_default_file` setting in `ppcacherc` is uncommented, additional predictprotein result data will be created in the `ppres` database of the configured MySQL instance.
+Now, additional predictprotein result data will be created in the `ppres` database of the configured MySQL instance.
# cache segment allocation - unlisted segments are allocated on/migrated to the first root; syntax: [root_number_1_based:segment_number segment_number ...];...
cache_segments=1:18 19 56 57 58 59 5a 5b 5c 5d 5e 5f 60 61 62 63 64 65 66 67 1a 79 7a 7b 7c 7d 7e 7f 80 81 82 83 84 85 86 87 88 89 8a 8b 1b 1c ac ad ae af b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd 1d cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df e0 e1 1e 1f 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 9c 9d 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 8c 8d 8e 8f 90 91 92 93 94 95 96 97 98 99 9a 9b be bf c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef f0 f1 9e 9f a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab f2 f3 f4 14 15 16 17 20 f5 f6 f7 f8 f9 fa fb fc fd fe ff 35 36 37 38 39 3a 3b 3c 3d 3e 3f 40 41 42 43 44 45;

-## Enable this only if you have a MySQL database,
-## and the proper tables and user permissions
-## set up. Otherwise, predictprotein will fail.
-## This setting will enable additional result
-## data to be written to the MySQL database
-## whose connection settings are contained
-## in the defined file.
+## Additional result data will be written to the MySQL database 'ppres',
+## whose connection settings are contained in the defined file.