Commit 0a09fd6

+Docs
1 parent ac07d04 commit 0a09fd6

3 files changed: 336 additions, 0 deletions

INSTALL.md

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# ROCker

Accurately detecting functional genes in metagenomes.

## System requirements

1. [Ruby](https://www.ruby-lang.org/), with the [restclient](https://rubygems.org/gems/rest_client),
   [nokogiri](http://www.nokogiri.org/), and [json](https://rubygems.org/gems/json) packages. To
   install the required packages, execute:

   ```bash
   $> gem install rest_client
   $> gem install nokogiri
   $> gem install json
   ```

2. [R](http://www.r-project.org/), with the [pROC](http://cran.r-project.org/web/packages/pROC/index.html)
   package. To install the required package, execute:

   ```bash
   $> R
   R> install.packages('pROC');
   ```

3. Metagenome simulation software: [Grinder](http://sourceforge.net/projects/biogrinder/).

4. Local search software: [NCBI BLAST+](ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/),
   [DIAMOND](http://ab.inf.uni-tuebingen.de/software/diamond/), or any other software that produces results
   in tabular BLAST format without comments (use the `--search-cmd` and `--makedb-cmd` options
   to execute any other software).

5. Multiple alignment software: [MUSCLE](http://www.drive5.com/muscle/),
   [Clustal Omega](http://www.clustal.org/omega/), or any other software supporting FastA input and output
   (use the `--aligner-cmd` option to execute any other software).

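The requirements above can be checked from the shell before installing ROCker. The following is a minimal sketch and not part of ROCker: `missing_tools` is a hypothetical helper, and the executable names (`grinder`, `blastp`, `diamond`, `muscle`, `clustalo`) are assumptions about how these tools are named on your system. Only one search tool and one aligner are needed, so a reported name is a hint rather than an error.

```bash
# Print the name of every given executable that is not on the PATH.
# Returns non-zero if at least one is missing.
missing_tools() {
  local tool status
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Executable names are assumptions; adjust them to your installation.
missing_tools ruby gem R grinder blastp diamond muscle clustalo || true
```

An empty report means every listed executable was found. The Ruby gems and the pROC package are not covered by this check.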
## Installation

Install ROCker using [RubyGems](https://rubygems.org/gems/bio-rocker):

```bash
$> gem install bio-rocker
```

Or get the source from [GitHub](https://github.com/lmrodriguezr/rocker):

```bash
$> git clone https://github.com/lmrodriguezr/rocker.git
$> ./rocker/bin/rocker
```

## License

[Artistic License 2.0](http://www.perlfoundation.org/artistic_license_2_0).

## Authors

Luis H (Coto) Orellana, Luis M. Rodriguez-R & Konstantinos Konstantinidis, at the
[Kostas lab](http://enve-omics.gatech.edu/).

LICENSE.txt

Lines changed: 202 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,202 @@
The Artistic License 2.0

Copyright (c) 2000-2006, The Perl Foundation.

Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

This license establishes the terms under which a given free software
Package may be copied, modified, distributed, and/or redistributed.
The intent is that the Copyright Holder maintains some artistic
control over the development of that Package while still keeping the
Package available as open source and free software.

You are always permitted to make arrangements wholly outside of this
license directly with the Copyright Holder of a given Package. If the
terms of this license do not permit the full use that you propose to
make of the Package, you should contact the Copyright Holder and seek
a different licensing arrangement.

Definitions

"Copyright Holder" means the individual(s) or organization(s)
named in the copyright notice for the entire Package.

"Contributor" means any party that has contributed code or other
material to the Package, in accordance with the Copyright Holder's
procedures.

"You" and "your" means any person who would like to copy,
distribute, or modify the Package.

"Package" means the collection of files distributed by the
Copyright Holder, and derivatives of that collection and/or of
those files. A given Package may consist of either the Standard
Version, or a Modified Version.

"Distribute" means providing a copy of the Package or making it
accessible to anyone else, or in the case of a company or
organization, to others outside of your company or organization.

"Distributor Fee" means any fee that you charge for Distributing
this Package or providing support for this Package to another
party. It does not mean licensing fees.

"Standard Version" refers to the Package if it has not been
modified, or has been modified only in ways explicitly requested
by the Copyright Holder.

"Modified Version" means the Package, if it has been changed, and
such changes were not explicitly requested by the Copyright
Holder.

"Original License" means this Artistic License as Distributed with
the Standard Version of the Package, in its current version or as
it may be modified by The Perl Foundation in the future.

"Source" form means the source code, documentation source, and
configuration files for the Package.

"Compiled" form means the compiled bytecode, object code, binary,
or any other form resulting from mechanical transformation or
translation of the Source form.


Permission for Use and Modification Without Distribution

(1) You are permitted to use the Standard Version and create and use
Modified Versions for any purpose without restriction, provided that
you do not Distribute the Modified Version.


Permissions for Redistribution of the Standard Version

(2) You may Distribute verbatim copies of the Source form of the
Standard Version of this Package in any medium without restriction,
either gratis or for a Distributor Fee, provided that you duplicate
all of the original copyright notices and associated disclaimers. At
your discretion, such verbatim copies may or may not include a
Compiled form of the Package.

(3) You may apply any bug fixes, portability changes, and other
modifications made available from the Copyright Holder. The resulting
Package will still be considered the Standard Version, and as such
will be subject to the Original License.


Distribution of Modified Versions of the Package as Source

(4) You may Distribute your Modified Version as Source (either gratis
or for a Distributor Fee, and with or without a Compiled form of the
Modified Version) provided that you clearly document how it differs
from the Standard Version, including, but not limited to, documenting
any non-standard features, executables, or modules, and provided that
you do at least ONE of the following:

(a) make the Modified Version available to the Copyright Holder
of the Standard Version, under the Original License, so that the
Copyright Holder may include your modifications in the Standard
Version.

(b) ensure that installation of your Modified Version does not
prevent the user installing or running the Standard Version. In
addition, the Modified Version must bear a name that is different
from the name of the Standard Version.

(c) allow anyone who receives a copy of the Modified Version to
make the Source form of the Modified Version available to others
under

(i) the Original License or

(ii) a license that permits the licensee to freely copy,
modify and redistribute the Modified Version using the same
licensing terms that apply to the copy that the licensee
received, and requires that the Source form of the Modified
Version, and of any works derived from it, be made freely
available in that license fees are prohibited but Distributor
Fees are allowed.


Distribution of Compiled Forms of the Standard Version
or Modified Versions without the Source

(5) You may Distribute Compiled forms of the Standard Version without
the Source, provided that you include complete instructions on how to
get the Source of the Standard Version. Such instructions must be
valid at the time of your distribution. If these instructions, at any
time while you are carrying out such distribution, become invalid, you
must provide new instructions on demand or cease further distribution.
If you provide valid instructions or cease distribution within thirty
days after you become aware that the instructions are invalid, then
you do not forfeit any of your rights under this license.

(6) You may Distribute a Modified Version in Compiled form without
the Source, provided that you comply with Section 4 with respect to
the Source of the Modified Version.


Aggregating or Linking the Package

(7) You may aggregate the Package (either the Standard Version or
Modified Version) with other packages and Distribute the resulting
aggregation provided that you do not charge a licensing fee for the
Package. Distributor Fees are permitted, and licensing fees for other
components in the aggregation are permitted. The terms of this license
apply to the use and Distribution of the Standard or Modified Versions
as included in the aggregation.

(8) You are permitted to link Modified and Standard Versions with
other works, to embed the Package in a larger work of your own, or to
build stand-alone binary or bytecode versions of applications that
include the Package, and Distribute the result without restriction,
provided the result does not expose a direct interface to the Package.


Items That are Not Considered Part of a Modified Version

(9) Works (including, but not limited to, modules and scripts) that
merely extend or make use of the Package, do not, by themselves, cause
the Package to be a Modified Version. In addition, such works are not
considered parts of the Package itself, and are not subject to the
terms of this license.


General Provisions

(10) Any use, modification, and distribution of the Standard or
Modified Versions is governed by this Artistic License. By using,
modifying or distributing the Package, you accept this license. Do not
use, modify, or distribute the Package, if you do not accept this
license.

(11) If your Modified Version has been derived from a Modified
Version made by someone other than you, you are nevertheless required
to ensure that your Modified Version complies with the requirements of
this license.

(12) This license does not grant you the right to use any trademark,
service mark, tradename, or logo of the Copyright Holder.

(13) This license includes the non-exclusive, worldwide,
free-of-charge patent license to make, have made, use, offer to sell,
sell, import and otherwise transfer the Package with respect to any
patent claims licensable by the Copyright Holder that are necessarily
infringed by the Package. If you institute patent litigation
(including a cross-claim or counterclaim) against any party alleging
that the Package constitutes direct or contributory patent
infringement, then this Artistic License to you shall terminate on the
date that such litigation is filed.

(14) Disclaimer of Warranty:
THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS
IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR
NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL
LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

README.md

Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
# ROCker

Accurately detecting functional genes in metagenomes.

For installation instructions, see [INSTALL.md](./INSTALL.md). For license information, see
[LICENSE.txt](./LICENSE.txt).

## Using existing models

Once you have installed ROCker, the easiest way to use it is to search with pre-existing
models. We maintain a [list of precomputed models](http://enve-omics.ce.gatech.edu/rocker/models)
that you're free to use.

1. **Obtain the model** of interest, either by downloading it from our repository or by creating one yourself
   (see below).

2. **Execute ROCker search**. The minimum required parameters are:

   ```bash
   $> ROCker search -q input.fasta -k model.rocker -o output.blast
   ```

   Where `input.fasta` is the input metagenome in FastA format, `model.rocker` is the ROCker model,
   and `output.blast` is the output file to be created in tabular BLAST format. For additional
   supported options, execute `ROCker search -h`.

3. **If you have a pre-computed BLAST file**, you can execute instead:

   ```bash
   $> ROCker filter -x input.blast -k model.rocker -o output.blast
   ```

   Where `input.blast` is the input search to be filtered in tabular BLAST format, `model.rocker` is the
   ROCker model, and `output.blast` is the output file to be created in tabular BLAST format. For additional
   supported options, execute `ROCker filter -h`.

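Before filtering a search produced by another tool, it may help to confirm that the file really is comment-free tabular BLAST, and afterwards to see how many reads survived the filter. The sketch below is illustrative and not part of ROCker: `is_plain_tabular` and `count_queries` are hypothetical helpers, and the 12-column, tab-separated layout with the query ID in column 1 is an assumption based on the default tabular BLAST output, not a documented ROCker requirement.

```bash
# Succeed only if no line is a comment and every line has 12 tab-separated fields.
# The 12-column layout is an assumption (default tabular BLAST output).
is_plain_tabular() {
  awk -F '\t' 'BEGIN { bad = 0 }
    /^#/ || NF != 12 { bad = 1; exit }
    END { exit bad }' "$1"
}

# Count distinct query (read) identifiers, taken from column 1.
count_queries() {
  cut -f 1 "$1" | sort -u | awk 'END { print NR }'
}

# Hypothetical usage around `ROCker filter`:
#   is_plain_tabular input.blast || echo "input.blast is not plain tabular BLAST"
#   echo "kept $(count_queries output.blast) of $(count_queries input.blast) reads"
```

Counting distinct query IDs rather than lines avoids counting a read twice when it has several hits in the file.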
## Creating models

Assemble a good reference collection of the gene of interest. This is the most important step, and there are
some resources to help you. In general, we find the resources at [UniProt](http://uniprot.org/) very useful.

1. Create a list of **UniProt identifiers** (IDs and/or accessions) representing proteins of the family of
   interest, in a plain text file (one per line).

2. If you want to explicitly exclude certain proteins from the model (*e.g.*, if there are very similar proteins
   with distinct functional properties), create a similar list with those. We refer to this optional list as the
   **negative set**.

3. **Build the model files**. The minimum required parameters are:

   ```bash
   $> ROCker build -P positive.txt -o prep
   ```

   Where `positive.txt` is the list from step 1, and `prep` is the base name for the output files. You can also
   pass the negative set from step 2 using `-N` (or `-n`). For additional supported options, execute `ROCker build -h`.
   This is by far the most computationally expensive step, so you might want to consider using multiple threads (`-t`)
   or even re-using files in case the run fails (`--reuse-files` and `--nocleanup`). Also, consider setting the
   simulated read length to match that of your metagenomes (`-l`).

4. **Compile the model**. The minimum required parameters are:

   ```bash
   $> ROCker compile -a prep.aln -b prep.blast -k model.rocker
   ```

   Where `prep.aln` is the alignment generated in step 3 (manual curation is strongly encouraged), `prep.blast` is the
   reference BLAST generated in step 3, and `model.rocker` is the model to compile.

5. **Register your model** (optional). If you would like to share your model with the community, please
   [contact us](mailto:[email protected]). We'll need the final ROCker model and the reference BLAST, and will add your
   model to our [curated list](http://enve-omics.ce.gatech.edu/rocker/models).
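Identifier lists assembled by hand or exported from a spreadsheet often carry stray whitespace, Windows line endings, blank lines, or duplicates. The sketch below shows one way to tidy such a list into the one-identifier-per-line form described in step 1. `clean_id_list` and the file name `positive_raw.txt` are hypothetical; whether ROCker tolerates untidy lists on its own is not something this sketch assumes either way.

```bash
# Normalize an identifier list: strip carriage returns and surrounding
# whitespace, drop empty lines, and remove duplicates (first occurrence wins).
clean_id_list() {
  tr -d '\r' < "$1" |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
    awk 'NF && !seen[$0]++'
}

# Hypothetical usage before step 3:
#   clean_id_list positive_raw.txt > positive.txt
#   ROCker build -P positive.txt -o prep
```

The original order of the identifiers is preserved, which keeps the cleaned file easy to compare against the raw one.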
