A graph convolutional network-based lncRNA subcellular localization predictor
torch==1.9.1
dgl-cu102==0.7.2
scikit-learn==1.0.1
numpy==1.20.3
gensim==4.1.2
tqdm==4.62.3
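To confirm that an environment matches the versions listed above, a small check like the following may help. It is only a sketch, not part of the repository: the helper name check_versions is hypothetical, and it assumes Python 3.8 or later (for importlib.metadata). Local build suffixes such as "+cu102" are ignored when comparing.

```python
from importlib import metadata

# Pinned versions copied from the requirements list above.
PINNED = {
    "torch": "1.9.1",
    "dgl-cu102": "0.7.2",
    "scikit-learn": "1.0.1",
    "numpy": "1.20.3",
    "gensim": "4.1.2",
    "tqdm": "4.62.3",
}


def check_versions(pinned):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    Hypothetical helper, not part of the repository.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            # Drop a local build suffix such as "+cu102" before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version.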
You can train the model with the command below:
python train.py >output/log.txt 2>&1
You can also use the provided package to train your own model from Python.
First, import the package.
from data.LncRNADataset import *
from model.classifier import *
from utils.config import *
Second, set the training options by modifying the variables in utils/config.py, then create a configuration object with the command below:
params = config()
The variables in utils/config.py have the following meanings:
k is the length of the k-mers used as graph nodes.
d is the dimension of the node feature vectors, which are trained with the gensim library.
hidden_dim is the dimension of the hidden layers of the GCN.
n_classes is the number of sample categories (classes).
savePath is the folder where the trained model is saved.
device is the device used to build and train the model: "cpu" for the CPU, "cuda" for the GPU, or "cuda:0" for GPU 0.
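To make the role of k more concrete, the sketch below turns a sequence into a k-mer graph. It assumes a de Bruijn-style construction in which each distinct k-mer is a node and an edge links two k-mers that appear consecutively in the sequence; the function name kmer_graph is hypothetical, and the construction in data/LncRNADataset.py may differ in its details.

```python
from collections import Counter


def kmer_graph(sequence, k):
    """Build a k-mer graph from a sequence (illustrative, not the repository code).

    Nodes are the distinct k-mers; edges link k-mers that occur
    consecutively, weighted by how often that transition occurs.
    """
    seq = sequence.upper()
    # Slide a window of length k over the sequence, one base at a time.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    nodes = sorted(set(kmers))
    # Count each (k-mer, next k-mer) transition.
    edges = Counter(zip(kmers, kmers[1:]))
    return nodes, edges


nodes, edges = kmer_graph("ACGACGU", k=3)
print(nodes)
print(dict(edges))
```

With a four-letter alphabet the graph has at most 4**k nodes, so k controls the graph size. Under this assumption, each node would then carry a d-dimensional feature vector.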
Then you need to create the data object.
dataset = LncRNADataset(raw_dir='data/data.txt', save_dir=f'checkpointslgraph/k{params.k}_d{params.d}')
Finally, you can create the model object and start training.
model = GraphClassifier(in_dim=params.d, hidden_dim=params.hidden_dim, n_classes=params.n_classes, device=params.device)
model.cv_train(dataset, batchSize=params.batchSize, num_epochs=params.num_epochs, lr=params.lr, kFold=params.kFold, savePath=params.savePath, device=params.device)
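The kFold argument suggests that cv_train performs k-fold cross-validation. As a rough illustration of that idea only, the sketch below splits sample indices into folds in plain Python. The function k_fold_indices and its seed parameter are hypothetical; the splitting strategy actually used by cv_train (for example, whether it is stratified by class) is not shown here.

```python
import random


def k_fold_indices(n_samples, k_fold, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation.

    Illustrative helper, not the repository's implementation.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k_fold-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k_fold] for i in range(k_fold)]
    for i in range(k_fold):
        valid_idx = folds[i]
        # All remaining folds together form the training set.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, valid_idx


for fold, (train_idx, valid_idx) in enumerate(k_fold_indices(10, k_fold=5), start=1):
    print(f"fold {fold}: {len(train_idx)} training samples, {len(valid_idx)} validation samples")
```

Each sample is used for validation exactly once, and one model is trained per fold.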
To predict with the trained models, first import the package.
from model.lncRNA_lib import *
Then instantiate five model objects.
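# Map models that were saved on a GPU onto the CPU when loading.
mapLocation = {'cuda:0': 'cpu', 'cuda:1': 'cpu'}
GraphLncLoc = []
# Load the five saved models, fold1.pkl to fold5.pkl.
for i in range(1, 6):
    GraphLncLoc.append(lncRNALocalizer(f"checkpoints/Final_model/fold{i}.pkl", map_location=mapLocation))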
Finally, the predictions of the five models are combined by voting to produce the final prediction.
def lncRNA_loc_predict(lncRNA):
    return vote_predict(GraphLncLoc, lncRNA.upper())

if __name__ == "__main__":
    sequence = "ACC...UCU"
    print(lncRNA_loc_predict(sequence))
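The voting step can be pictured with the minimal sketch below. It assumes each model outputs a single class label and that the most frequent label wins. The function majority_vote and the example labels are hypothetical; the vote_predict function in model/lncRNA_lib.py may work differently, for example by averaging predicted probabilities.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first.

    Illustrative helper, not the repository's vote_predict.
    """
    counts = Counter(labels)
    # most_common(1) gives [(label, count)] for the top label.
    return counts.most_common(1)[0][0]


# Hypothetical labels predicted by five models for one sequence.
fold_predictions = ["Nucleus", "Cytoplasm", "Nucleus", "Nucleus", "Cytoplasm"]
print(majority_vote(fold_predictions))
```

Using an odd number of models reduces the chance of ties when there are only two competing labels.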
The file test_set.txt in the Independent_test_set folder is the test set used for comparison with other existing predictors.
Further details can be found in the paper and the code.
Min Li, Baoying Zhao, Rui Yin, Chengqian Lu, Fei Guo, Min Zeng. GraphLncLoc: long non-coding RNA subcellular localization prediction using graph convolutional networks based on sequence to graph transformation
This project is licensed under the MIT License; see the LICENSE.txt file for details.