Commit 842a197

Update README.md, config/etc/ppcacherc files
1 parent 51847c1 commit 842a197

2 files changed: +17 -27 lines changed

README.md

Lines changed: 14 additions & 19 deletions
@@ -23,7 +23,7 @@ Although only the packages and data models necessary for its proper functioning
 
 ## Requirements
 
-Other than being able to run a Docker image, it is absolutely necessary that you have the [PredictProtein Database](http://www.rostlab.org/services/ppmi/download_file?format=gzip&file_to_download=db) downloaded.
+Other than being able to run a Docker image and [having a MySQL instance to connect to](#additional-result-data-using-an-external-mysql-instance), it is absolutely necessary that you have the [PredictProtein Database](http://www.rostlab.org/services/ppmi/download_file?format=gzip&file_to_download=db) downloaded.
 
 To download this database manually or using a terminal/CLI:
 
@@ -55,6 +55,7 @@ However, in order for predictprotein to properly run, and for you to have access
 
 * the PredictProtein database downloaded earlier
 * storage for computed results, which predictprotein will produce
+* an accessible and [configured MySQL server](#additional-result-data-using-an-external-mysql-instance)
 
 This requires using [Docker bind mounts](https://docs.docker.com/storage/bind-mounts/) that bind-mount a directory from the Docker host to the Docker predictprotein container, which is configured on the command line, when running a container for the first time.
 
@@ -69,9 +70,9 @@ $ mkdir -p /var/tmp/pp-data/ppcache/{ppcache-data,results-retrieve,rost_db,seque
 
 When bind-mounted using the `docker run` command as later documented, the following directories on the Docker host will contain the following data, which will remain even after an erased or shutdown container:
 
-* `/var/tmp/pp-data/config` - *(optional)* configuration files affecting how predictprotein runs inside of the container. Necessary if you plan on using [MySQL result storage](#additional-result-data-using-an-external-mysql-instance)
-* `/var/tmp/pp-data/method-data/loctree3 - *(optional)* data files used for loctree3 algorithm. Including this directory will override the already-included loctree3 data files.
-* `/var/tmp/pp-data/method-data/metastudent - *(optional)* data files used for metastudent algorithm. Including this directory will override the already-included metastudent data files.
+* `/var/tmp/pp-data/config` - **(required)** configuration files affecting how predictprotein runs inside of the container. You'll need to configure [MySQL result storage](#additional-result-data-using-an-external-mysql-instance)
+* `/var/tmp/pp-data/method-data/loctree3` - *(optional)* data files used for loctree3 algorithm. Including this directory will override the already-included loctree3 data files.
+* `/var/tmp/pp-data/method-data/metastudent` - *(optional)* data files used for metastudent algorithm. Including this directory will override the already-included metastudent data files.
 * `/var/tmp/pp-data/ppcache/ppcache-data` - **(required)** predictprotein cache (ppcache) where computed results are stored, indexed by computed hash
 * `/var/tmp/pp-data/ppcache/results-retrieve` - *(optional)* may be used, when bind mountd, to retrieve a result set from the cache (see ppc_fetch)
 * `/var/tmp/pp-data/ppcache/rost_db` - **(required)** rost_db (internal to Rostlab) or PPMI databases
@@ -131,7 +132,7 @@ The predictprotein cache location in the container may be changed by defining a
 
 ## Running predictprotein... (Examples Please!)
 
-The following are some examples to help you get an idea of how predictprotein work in its Docker-ized form, using the directory tree as described in [Create Docker host data and configuration file directories for predictprotein](#create-docker-host-data-and-configuration-file-directories-for-predictprotein)
+The following are some examples to help you get an idea of how predictprotein work in its Docker-ized form, using the directory tree as described in [Create Docker host data and configuration file directories for predictprotein](#create-docker-host-data-and-configuration-file-directories-for-predictprotein), and having a [configured MySQL server](#additional-result-data-using-an-external-mysql-instance).
 
 ### To get help about the command
 By default, running the Docker container will produce the help information for the predictprotein command, just like running `predictprotein --help`:
@@ -147,6 +148,7 @@ $ docker run --rm predictprotein
 ```shell
 $ docker run \
 --mount type=bind,source=/var/tmp/pp-data/ppcache,target=/mnt/ppcache \
+--mount type=bind,source=/var/tmp/pp-data/config,target=/etc/docker-predictprotein \
 predictprotein \
 predictprotein \
 --sequence MFRTKRSALVRRLWRSRAPGGNSR \
@@ -175,6 +177,7 @@ Items to note:
 ```shell
 $ docker run \
 --mount type=bind,source=/var/tmp/pp-data/ppcache,target=/mnt/ppcache \
+--mount type=bind,source=/var/tmp/pp-data/config,target=/etc/docker-predictprotein \
 predictprotein \
 predictprotein \
 --seqfile /mnt/ppcache/sequence-submit/my_sequence.fasta \
@@ -212,11 +215,11 @@ So, now you can run it just like we do, and reproduce the same results as [predi
 
 ## Changing Default Configuration Files
 
-By default, the Docker container will look in `/etc/docker-predictprotein` (inside its container), for the configuration files needed to run.
+By default, the Docker container will look in `/etc/docker-predictprotein` (inside its container), for the configuration files needed to run. However, you'll need to adjust settings for [configuring MySQL services](#additional-result-data-using-an-external-mysql-instance)
 
 **Note**: every time the Docker container is run, a check is done to make sure all of the necessary configuration files exist in their expected location within the container, `/etc/docker-predictprotein`. Missing configuration files are automatically created, copied from the Docker container directory `/var/tmp/config/` to `/etc/docker-predictprotein`
 
-If you would like to access the configuration files, or to enable [MySQL services](#additional-result-data-using-an-external-mysql-instance), you'll need to bind-mount from your Docker host, to the Docker container at `/etc/docker-predictprotein`:
+In order to access the configuration files, to configure [MySQL services](#additional-result-data-using-an-external-mysql-instance), for instance, you'll need to bind-mount from your Docker host, to the Docker container at `/etc/docker-predictprotein`:
 
 ```shell
 $ docker run \
@@ -234,8 +237,6 @@ In this example, the `/var/tmp/pp-data/config` directory is bind-mounted to `/et
 
 With this in mind, if you edit any of the configuration files, their state will be maintained regardless if the container is stopped or erased. Multiple running container instances of the Docker image may also bind-mount to this directory to use the same configuration.
 
-Lastly, if you then decide not to bind-mount the configuration directory to the Docker host, the configuration files in the container will not be hidden by the bind-mount, and therefore be used.
-
 ### Initializing or resetting configuration files to their defaults
 
 In the case that you would like to reset all configuration files to the defaults contained within the Docker image, no matter what configuration files already exist, you may execute the following Docker run command:
@@ -248,23 +249,17 @@ $ docker run --rm -it predictprotein init
 
 ## Additional Result Data Using an External MySQL Instance
 
-In addition to the result files stored in the [predictprotein cache](#the-predictprotein-cache), predictprotein also can store additional result data in various tables within an external MySQL instance. This was not included within the Docker image in order to keep size and resource usage down to a minimum, while allowing the user the possiblity of tuning their MySQL server to their liking.
+In addition to the result files stored in the [predictprotein cache](#the-predictprotein-cache), predictprotein also stores additional result data in various tables within an external MySQL instance. This was not included within the Docker image in order to keep size and resource usage down to a minimum, while allowing the user the possiblity of tuning their MySQL server to their liking.
 
 ### Adjusting predictprotein configuration files
 
-You can enable the MySQL services predictprotein offers, by [changing values in two of the default configuration files](#changing-default-configuration-files), namely:
+Enable the MySQL services predictprotein offers, by [changing values in one of the default configuration files](#changing-default-configuration-files), namely:
 
 * `ppcache-my.cnf` - MySQL connection information, following the same syntax as a user-defined my.cnf file
-* `ppcacherc` - the predictprotein cache configuration file
 
 In `ppcache-my.cnf`, you must enter the details necessary to connect to your MySQL instance.
 
-In `ppcacherc`, all you have to do is *uncomment* one line (i.e.. don't change the path or filename):
-```
-# mysql_read_default_file=/etc/ppcache/my.cnf
-```
-
-**Note**: when the `mysql_read_default_file` setting is not defined, no MySQL connections are attempted by predictprotein. If it's defined, near the end of the predictprotein run, it will attempt to connect to the MySQL instance defined in the `ppcache-my.cnf` file on the Docker host, writing additional result data to the database tables.
+**Note**: Near the end of the predictprotein run, it will attempt to connect to the MySQL instance defined in the `ppcache-my.cnf` file on the Docker host, writing additional result data to the database tables.
 
 ### Creating a dedicated MySQL user and database
 
@@ -286,7 +281,7 @@ In order to successfully write data to the database, the proper MySQL database t
 ```shell
 # mysql -u root -p ppres < /var/tmp/pp-data/config/ppres_tables_mysql.sql
 ```
-Now, as long as the `mysql_read_default_file` setting in `ppcacherc` is uncommented, additional predictprotein result data will be created in the `ppres` database of the configured MySQL instance.
+Now, additional predictprotein result data will be created in the `ppres` database of the configured MySQL instance.
 
 ## License
 
config/etc/ppcacherc

Lines changed: 3 additions & 8 deletions
@@ -11,15 +11,10 @@ cache_root=/mnt/ppcache/ppcache-data
 # cache segment allocation - unlisted segments are allocated on/migrated to the first root; syntax: [root_number_1_based:segment_number segment_number ...];...
 cache_segments=1:18 19 56 57 58 59 5a 5b 5c 5d 5e 5f 60 61 62 63 64 65 66 67 1a 79 7a 7b 7c 7d 7e 7f 80 81 82 83 84 85 86 87 88 89 8a 8b 1b 1c ac ad ae af b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd 1d cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df e0 e1 1e 1f 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 9c 9d 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 8c 8d 8e 8f 90 91 92 93 94 95 96 97 98 99 9a 9b be bf c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef f0 f1 9e 9f a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab f2 f3 f4 14 15 16 17 20 f5 f6 f7 f8 f9 fa fb fc fd fe ff 35 36 37 38 39 3a 3b 3c 3d 3e 3f 40 41 42 43 44 45;
 
-## Enable this only if you have a MySQL database,
-## and the proper tables and user permissions
-## set up. Otherwise, predictprotein will fail.
-## This setting will enable additional result
-## data to be written to the MySQL database
-## whose connection settings are contained
-## in the defined file.
+## Additional result data will be writtento the MySQL database 'ppres',
+## whose connection settings are contained in the defined file.
 ## mysql_read_default_file=/path/to/my.cnf
-#mysql_read_default_file=/etc/ppcache/my.cnf
+mysql_read_default_file=/etc/ppcache/my.cnf
 
 # cacheuser - owner of cached files
 cache_user=ppcache
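
Because this change enables `mysql_read_default_file` unconditionally, every run now depends on a usable MySQL option file in the bind-mounted config directory. The following is a minimal pre-flight sketch, not part of the commit. It assumes the option file keeps its settings in a `[client]` group, as a standard my.cnf usually does, and that `host`, `user`, and `password` are the settings worth checking. The helper name `check_mycnf` and that key list are illustrative assumptions, not anything predictprotein provides.

```shell
#!/bin/sh
# Pre-flight sketch: confirm the MySQL option file looks usable before
# starting a predictprotein container.
# Assumptions (not taken from the commit): a [client] group is present,
# and host, user, and password are the settings to look for.

# check_mycnf FILE
# Returns 0 if FILE is readable, contains a [client] group header, and
# defines host, user, and password somewhere in the file.
# Returns 1 and prints a message on stderr otherwise.
check_mycnf() {
    cnf="$1"
    if [ ! -r "$cnf" ]; then
        echo "cannot read $cnf" >&2
        return 1
    fi
    if ! grep -q '^[[:space:]]*\[client\]' "$cnf"; then
        echo "no [client] group in $cnf" >&2
        return 1
    fi
    for key in host user password; do
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$cnf"; then
            echo "no '${key}' setting in $cnf" >&2
            return 1
        fi
    done
    return 0
}

# Possible use on the Docker host, with the directory layout from the README:
#   check_mycnf /var/tmp/pp-data/config/ppcache-my.cnf && docker run ...
```

The check only inspects the file; it does not open a connection, so a passing result does not guarantee that the server is reachable or that the `ppres` tables exist.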
