Update RNAmining according to revisors comments
thaisratis committed Jun 1, 2021
1 parent 3f7b8ff commit 4c43928
Showing 10 changed files with 56 additions and 35 deletions.
9 changes: 5 additions & 4 deletions volumes/rnamining-front/about.php
@@ -14,7 +14,7 @@
<div class="row">
<div class="col-md-8 col-md-offset-2 col-xs-12 col-sm-12">
<h4 class="text-justify text-about" style="line-height:30px;text-indent:50px">
<p><span class="specialchar">RNAmining</span> is a web tool that allows nucleotides coding potential prediction. It takes a user-defined fasta sequences. This tool was implemented using machine learning algorithms. Machine learning is a subfield of computer science that developed from the study of pattern recognition and computational learning theories in artificial intelligence. This tool operate through a model obtained from training data analyzes and produces an inferred function, which can be used for mapping new examples.
<p><span class="specialchar">RNAmining</span> is a web tool that allows nucleotides coding potential prediction. It takes a user-defined fasta sequences. This tool was implemented using XGBoost machine learning algorithm. Machine learning is a subfield of computer science that developed from the study of pattern recognition and computational learning theories in artificial intelligence. This tool operate through a model obtained from training data analyzes and produces an inferred function, which can be used for mapping new examples.
</p>
</h4>

@@ -27,14 +27,15 @@

<h3 class="portfolio-text">How does the algorithm used work?</h3>
<h4 class="text-justify text-about" style="line-height:30px;text-indent:50px">
<p>The algorithm begins by reading the RNA sequences provided in the uploaded file. Thereafter, it is divided into two main parts: the preprocessing and the prediction. In preprocessing, we perfomed a tri-nucleotides frequency of each RNA sequence and then, normalize it according to the sequence's lenght. This process is save in a file, which is going to be used as input for the second part. In prediction, since the user provides the organism type (e.g. Homo sapiens), the tool selects a specific organism model trained by XGBoost and perform the prediction, which is shown in the platform and can be downloaded as a .zip file.</p>
<p>The algorithm begins by reading the RNA sequences provided in the uploaded file. Thereafter, it is divided into two main parts: the preprocessing and the prediction. In preprocessing, we perfomed a tri-nucleotides frequency of each RNA sequence and then, we normalized it according to the sequence's lenght. This process is save in a file, which is going to be used as input for the second part. In prediction, since the user provides the organism type (e.g. Homo sapiens), the tool selects a specific organism model trained by XGBoost and perform the prediction, which is shown in the platform and can be downloaded as a .zip file.</p>
</h4>

<h3 class="portfolio-text">How can RNAmining helps?</h3>
<h4 class="text-justify text-about" style="line-height:30px;text-indent:50px">
<p>Non-coding RNAs are untranslated RNA molecules, but are important players in the cellular regulation of organisms from different kingdom. Thus, the research interest on non-coding RNAs has increased dramatically in recent years. Its investigation is routine in every transcriptome or genome project, since any mutations or misregulation on them result in disorders such as: tumor formation (cancerous or other type), cardiovascular, neurological diseases and others human illness. Therefore, exists an important step in ncRNAs research which is the ability to distinguish coding/non-coding sequences. Thus, <span class="specialchar">RNAmining</span> was developed to perform nucleotides coding potential prediction.</p>
<p>Non-coding RNAs are untranslated RNA molecules, but are important players in the cellular regulation of organisms from different kingdom. Thus, the research interest on non-coding RNAs has increased dramatically in recent years. Its investigation is routine in every transcriptome or genome project, since any mutations or misregulation on them result in disorders such as: tumor formation (cancerous or other type), cardiovascular, neurological diseases and others human illness. Therefore, exists an important step in ncRNAs research which is the ability to distinguish coding/non-coding sequences.</p>

<p>Thus, <span class="specialchar">RNAmining</span> was built to enable easy access to coding potential prediction for non-programming researchers. Additionally, the results are very easy to interpret.</p>
<p>Thus, <span class="specialchar">RNAmining</span> was built to enable easy access to nucleotides coding potential prediction for non-programming researchers. Additionally, the results are very easy to interpret.</p>
<p>More information about RNAmining in: Ramos TAR, Galindo NRO, Arias-Carrasco R et al. RNAmining: A machine learning stand-alone and web server tool for RNA coding potential prediction [version 1; peer review: 2 approved with reservations]. F1000Research 2021, 10:323 (<a href= 'https://doi.org/10.12688/f1000research.52350.1'>https://doi.org/10.12688/f1000research.52350.1</a>)</p>
</h4>
</div>
</div>
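Note on the preprocessing step described in about.php: the page says each RNA sequence is turned into tri-nucleotide frequencies normalized by the sequence's length. A minimal Python sketch of that idea follows. It assumes overlapping windows, a fixed A/C/G/T alphabet with U mapped to T, and that windows containing other symbols are skipped; none of these details are confirmed by the commit, and the name `trinucleotide_frequencies` is hypothetical rather than taken from rnamining.py.

```python
from itertools import product
from typing import List

# All 64 trinucleotides over A/C/G/T, in a fixed order (assumed feature order).
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def trinucleotide_frequencies(sequence: str) -> List[float]:
    """Count overlapping 3-mers and divide each count by the sequence length."""
    # Assumption: RNA input (U) and DNA input (T) are treated alike.
    seq = sequence.upper().replace("U", "T")
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
    if not seq:
        return [0.0] * len(TRINUCLEOTIDES)
    # Normalize by the sequence length, as the about page describes.
    return [counts[k] / len(seq) for k in TRINUCLEOTIDES]
```

Each sequence then becomes one 64-value row, which is the kind of fixed-width input a trained classifier can consume.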
Binary file not shown.
19 changes: 12 additions & 7 deletions volumes/rnamining-front/assets/scripts/rnamining.py
@@ -40,7 +40,7 @@ def process_inputfile(filename, organism_name,output_folder):
return X


def process_outputfile(filename_path, predict, organism_name, prediction_type, output_folder):
def process_outputfile(filename_path, predict, proba, organism_name, prediction_type, output_folder):
"""
Description: function that generates the output file. First, it converts the predict classes so that the predictions
with value equal to 1 are renamed to coding and 0 to non-coding. Then, the funciton generates a file with a header for general information
@@ -57,17 +57,21 @@ def process_outputfile(filename_path, predict, organism_name, prediction_type, o
#The last instance
if(i==(len(predict)-1)):
if predict[i]==0:
out[i] = ids[i] + '\tnon-coding'
out[i] = ids[i] + '\tnon-coding\t'

else:
out[i] = ids[i] + '\tcoding'
out[i] = ids[i] + '\tcoding\t'
out[i] += str(max(proba[i]))
else:
#All instances
if predict[i]==0:
out[i] = ids[i] + '\tnon-coding\n'
out[i] = ids[i] + '\tnon-coding\t'

else:
out[i] = ids[i] + '\tcoding\n'
out[i] = ids[i] + '\tcoding\t'
out[i] += str(max(proba[i])) + '\n'



output_file = open(output_folder+'/predictions.txt', 'w')
output_file.writelines("RNAMining Predictions\n")
@@ -105,7 +109,8 @@ def predict(filename_path, organism_name, prediction_type, output_folder):
X = process_inputfile(filename_path, organism_name, output_folder)
model = pickle.load(open('models/' + 'coding_prediction/' + organism_name + '.pkl', 'rb'))
predict = model.predict(X)
process_outputfile(filename_path, predict, organism_name, prediction_type,output_folder)
proba = model.predict_proba(X)
process_outputfile(filename_path, predict, proba, organism_name, prediction_type,output_folder)

except NameError:
print('Please check if organism_name and prediction_type matches RNAMining documentation.')
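Note on the rnamining.py change: each output record now carries a third tab-separated field, `str(max(proba[i]))`, taken from `model.predict_proba(X)`. The sketch below shows the same record layout in a more compact form. It is an illustration only: `format_prediction_lines` is a hypothetical helper, not a function in the repository, and it assumes `ids`, `predict`, and `proba` have the same length.

```python
from typing import List, Sequence


def format_prediction_lines(ids: Sequence[str],
                            predict: Sequence[int],
                            proba: Sequence[Sequence[float]]) -> List[str]:
    """Return one 'id<TAB>label<TAB>probability' string per sequence."""
    lines = []
    for seq_id, label, class_probs in zip(ids, predict, proba):
        # Same mapping as the diff: 0 is non-coding, anything else is coding.
        name = "non-coding" if label == 0 else "coding"
        # The diff writes the largest per-class probability for the row.
        lines.append(f"{seq_id}\t{name}\t{max(class_probs)}")
    return lines


# Joining with "\n" puts a newline between records but not after the last
# one, which is what the separate "last instance" branch in the diff achieves.
body = "\n".join(format_prediction_lines(
    ["seq1", "seq2"], [0, 1], [[0.9, 0.1], [0.2, 0.8]]))
```

The largest probability matches the predicted class only when `predict` returns the most probable class, which is the usual behaviour for a binary classifier but is worth keeping in mind when reading the new column.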
48 changes: 30 additions & 18 deletions volumes/rnamining-front/download.php
@@ -14,15 +14,16 @@
<div class="row">
<div class="col-md-8 col-md-offset-2 col-xs-12 col-sm-12">

<h3 class="portfolio-text">Used databases</h3>
<h3 class="portfolio-text">Used databases</h3>
<h4 class="text-justify text-about" style="line-height:30px;text-indent:50px">
<p><span class="specialchar">RNAmining</span> was validated by a series of tests. all the FASTA sequences used in the creation process of this tool can be downloaded <a href = "../examples/Sequences.zip">here.</a></p>
</h4>



<h3 class="portfolio-text">Standalone Version</h3>
<h4 class="text-justify text-about" style="line-height:30px;text-indent:50px">
<p><span class="specialchar">RNAmining</span> was also developed in a standalone software. A complete tutorial of how to install and use it to perform coding potential prediction and RNA functional assignationis is described below.</p>
<p><span class="specialchar">RNAmining</span> was also developed in a standalone software. A complete tutorial of how to install and use it to perform coding potential prediction is described below.</p>
</h4>

<h3 class="portfolio-text">Dependencies</h3>
@@ -32,31 +33,42 @@
<li>Pandas Version >= 0.23.3</li>
<li>Scikit-learn Version >= 0.21.3</li>
<li>XGBoost Version >= 1.2.0</li>
<li>Biopython Version >= 1.78</li>
</ul>
</h4>

<h3 class="portfolio-text">How to run?</h3>
<h4 class="text-justify text-about" style="line-height:30px;text-indent:50px">
<p>Download the <a href="https://gitlab.com/integrativebioinformatics/RNAmining/-/tree/master/volumes/rnamining-front/assets/scripts/">RNAmining</a> files and run the commands explain there to perform files prediction!</p>
<p><span class="specialchar">RNAmining</span> is supported in <a href = "https://gitlab.com/integrativebioinformatics/RNAmining"> Docker version </a>and the user also can download the <a href="https://gitlab.com/integrativebioinformatics/RNAmining/-/tree/master/volumes/rnamining-front/assets/scripts/">RNAmining stand-alone version</a> through Gitlab. All the installation and run commands are explained in:</p>
</h4>
<h4 class="text-justify text-about" style="line-height:30px;">
<p>Run the following command to display all the parameters available to change in RNAmining:</p>
</h4>
<div style="white-space: nowrap; overflow-x: auto;">
<table>
<th>
python3 rnamining.py -h
</th>
</table>
</div>
</br>
<div class="container-fluid" style='margin: 20px 0'>
<div class="col-md-12 col-sm-12 col-xs-12 vcenter" style="text-align:center">
<h4> Docker and stand-alone versions with all commands: </p>
<a href="https://gitlab.com/integrativebioinformatics/RNAmining" target=_blank">
<img height=70 src="../assets/images/Logos/docker_image.png"/>Docker Version</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<a href="https://gitlab.com/integrativebioinformatics/RNAmining/-/tree/master/volumes/rnamining-front/assets/scripts/" target="_blank">
<img height=75 src="../assets/images/Logos/GitLab_Logo.png"/>&nbsp;&nbsp;Stand-alone Version</a>


</div>
</div>


<h4 class="text-justify text-about" style="line-height:30px;">
<p>If need more information see the file README.md in the .zip file!</p>


<h3 class="portfolio-text">Release history</h3>
<h4 class="text-justify text-about">
<ul style="list-style-type:circle">
<li><b>RNAmining v1.0.4</b> (Jun 01, 2021)</li>
<p> Inclusion of classification probabilities in the output file </p>
<li><b>RNAmining v1.0.3</b> (Dec 17, 2020)</li>
<p> Fix inconsistency in sequence's read.</p>
<li><b>RNAmining v1.0.2</b> (Nov 13, 2020)</li>
<p> New version using XGBoost models.</p>
</ul>
</h4>


</div>
</div>
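Note on the dependency list in download.php: the hunk shows minimum versions for Pandas, Scikit-learn, XGBoost, and Biopython (earlier list items may fall outside the hunk). A rough way to check a local environment against those minimums is sketched below. The PyPI distribution names in `MINIMUM_VERSIONS` are assumptions, and the helper names `version_tuple`, `meets_minimum`, and `check_dependencies` are hypothetical; nothing here ships with RNAmining.

```python
import re
from importlib import metadata
from typing import List, Tuple

# Minimum versions as listed in the hunk; distribution names are assumed.
MINIMUM_VERSIONS = {
    "pandas": "0.23.3",
    "scikit-learn": "0.21.3",
    "xgboost": "1.2.0",
    "biopython": "1.78",
}


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '1.78' into (1, 78, 0); only the first three numbers are used."""
    parts = [int(n) for n in re.findall(r"\d+", version)][:3]
    parts += [0] * (3 - len(parts))  # pad so '1.2' compares equal to '1.2.0'
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    return version_tuple(installed) >= version_tuple(minimum)


def check_dependencies() -> List[str]:
    """Return a list of human-readable problems; empty means all good."""
    problems = []
    for package, minimum in MINIMUM_VERSIONS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed (need >= {minimum})")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{package} {installed} is older than {minimum}")
    return problems
```

Calling `check_dependencies()` before running `python3 rnamining.py -h` gives a quick list of anything missing or outdated. The comparison is deliberately simple and ignores pre-release suffixes.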

1 change: 1 addition & 0 deletions volumes/rnamining-front/index.php
@@ -31,6 +31,7 @@
<!-- <img src="/assets/images/MDP_scheme_front.png" alt="MetaVolcano" class="img-responsive center-block"
style="width:50%; padding-top: 50px; padding-bottom: 50px"> -->
<p>See more <a href="/about">about</a> RNAmining and follow our <a href="/tutorial">tutorial</a> to learn how to use it.</p>
<p>More information about RNAmining in: <a href="https://doi.org/10.12688/f1000research.52350.1">https://doi.org/10.12688/f1000research.52350.1</a></p>
</div>
</div>
</div>
2 changes: 1 addition & 1 deletion volumes/rnamining-front/pages/example.php

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion volumes/rnamining-front/pages/footer.php
@@ -2,7 +2,7 @@
<div style='text-align:center;font-size:11px;'>
<hr>
<p> Copyright &copy; <span id='year'></span></p>
<p>Laboratory of Integrative Bioinformatics - University of Chile</p>
<p>Laboratory of Integrative Bioinformatics - Universidad de Chile &amp; Instituto Vandique</p>
<p>Bioinformatics Multidisciplinary Environment - University of Rio Grande do Norte</p>
<p>Artificial Intelligence Applications Laboratory - University of Paraiba</p>
</div>
10 changes: 6 additions & 4 deletions volumes/rnamining-front/results.php
@@ -5,7 +5,7 @@
<?php include("pages/head.php"); ?>


<script type="text/javascript" src="http://cdn.datatables.net/1.10.19/js/jquery.dataTables.min.js"></script>
<script type="text/javascript" src="https://cdn.datatables.net/1.10.24/js/jquery.dataTables.min.js"></script>
<link rel="stylesheet" type="text/css" href="http://cdn.datatables.net/1.10.19/css/jquery.dataTables.min.css">


@@ -84,8 +84,8 @@

<?php
echo '<thead><tr><th style = "text-align:center;">Sequence ID</th>';
echo '<th style = "text-align:center;">Coding Potential Classification</th></tr></thead>';

echo '<th style = "text-align:center;">Coding Potential Classification</th>';
echo '<th style = "text-align:center;">Classification Probabilities</th></tr></thead>';
echo '<tbody>';

foreach ($array as $i => $line) {
@@ -96,10 +96,12 @@

$sequence_id = $exploded[0];
$label = $exploded[1];
$probability = $exploded[2];


echo '<tr><td>'.$sequence_id .'</td>';
echo '<td>'.$label .'</td></tr>';
echo '<td>'.$label .'</td>';
echo '<td>'.$probability .'</td></tr>';
}
}
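Note on the results.php change: the page now reads a third field, `$exploded[2]`, from every prediction row. A predictions file written by an earlier release would only have two fields per row, so a reader may want to tolerate a missing probability. The Python sketch below mirrors the row parsing under that assumption; `parse_prediction_row` is a hypothetical helper, and the idea that header lines can be recognized by the absence of a `coding`/`non-coding` label is a guess about the file layout, not something the diff shows.

```python
from typing import Optional, Tuple

Row = Tuple[str, str, Optional[float]]


def parse_prediction_row(line: str) -> Optional[Row]:
    """Parse 'id<TAB>label[<TAB>probability]'; return None for other lines."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        return None  # header, blank line, or anything without a label
    sequence_id, label = fields[0], fields[1]
    if label not in ("coding", "non-coding"):
        return None
    probability = None
    if len(fields) > 2 and fields[2]:
        try:
            probability = float(fields[2])
        except ValueError:
            probability = None  # keep the row, drop the unreadable value
    return sequence_id, label, probability
```

Rows without a probability still yield an ID and a label, so a table built from them can show an empty third cell instead of failing.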
