
This repository contains code used in the following publication: Jain S, Jou JD, Georgiev IS, and Donald BR. A Critical Analysis of Computational Protein Design with Sparse Residue Interaction Graphs. PLoS Comp Biol. 2017, In press. For the latest version of OSPREY, please try http://www.cs.duke.edu/donaldlab/osprey.php instead.


donaldlab/OSPREY_SparseAStar


This repository contains code used in the following publication: Jain S, Jou JD, Georgiev IS, and 
Donald BR. A Critical Analysis of Computational Protein Design with Sparse Residue Interaction Graphs. 
PLoS Comp Biol. 2017, In press. We recommend that this be used to test the hypothesis 
and results of sparse residue interaction graphs as presented in this manuscript. In addition, our lab 
has also developed a novel and more efficient dynamic programming algorithm called Branch-Width 
Minimization* (BWM*) available at https://github.com/donaldlab/BWM for protein design with sparse residue 
interaction graphs. For new empirical designs, we recommend using the latest version of OSPREY 2.1 
available at http://www.cs.duke.edu/donaldlab/osprey.php that includes continuous sidechain and backbone 
flexibility, ensemble-based design, and new and faster methods to calculate energy functions. 

All the results on the 136 protein design problems and 6 retrospective protein design problems discussed 
in this manuscript are available from the Harvard Dataverse repository 
(https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/6C6IWA , doi:10.7910/DVN/6C6IWA).

OSPREY Protein Redesign Software Version 2.1 beta 
Copyright (C) 2001-2012 Bruce Donald Lab, Duke University

OSPREY is free software: you can redistribute it and/or modify it under the terms of the GNU
Lesser General Public License as published by the Free Software Foundation, either version 3 of
the License, or (at your option) any later version. The two parts of the license are attached below
(Sec. 3). There are additional restrictions imposed on the use and distribution of this open-source
code, including:
• The header from Sec. 1 must be included in any modification or extension of the code;
• Any publications, grant applications, or patents that use OSPREY must state that OSPREY
was used, with a sentence such as “We used the open-source OSPREY software [Ref] to
design....”
• Any publications, grant applications, or patents that use OSPREY must cite our papers. The
citations for the various different modules of our software are described in Sec. 2.


Section 1: Source Header

OSPREY Protein Redesign Software Version 2.1 beta
Copyright (C) 2001-2012 Bruce Donald Lab, Duke University
OSPREY is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of
the License, or (at your option) any later version.
OSPREY is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this library; if not, see:
<http://www.gnu.org/licenses/>.
There are additional restrictions imposed on the use and distribution
of this open-source code, including: (A) this header must be included
in any modification or extension of the code; (B) you are required to
cite our papers in any publications that use this code. The citation
for the various different modules of our software, together with a
complete list of requirements and restrictions are found in the
document license.pdf enclosed with this distribution.
Contact Info:
Bruce Donald
Duke University
Department of Computer Science
Levine Science Research Center (LSRC)
Durham
NC 27708-0129
USA
e-mail: www.cs.duke.edu/brd/
<signature of Bruce Donald>, Mar 1, 2012
Bruce Donald, Professor of Computer Science



Section 2: Citation Requirements

The citation requirements for the various different modules of our software are:
• For a general citation, please use:
C. Chen, I. Georgiev, A. C. Anderson, and B. R. Donald. Computational structure-based redesign
of enzyme activity. PNAS USA, 106(10):3764–3769, 2009.
P. Gainza, K. E. Roberts, I. Georgiev, R. H. Lilien, D. A. Keedy, C. Chen, F. Reza, A. C. Anderson,
D. C. Richardson, J. S. Richardson, and B. R. Donald. OSPREY: Protein design with ensembles,
flexibility, and provable algorithms. Methods in Enzymology, in press, 2012.
• iMinDEE:
P. Gainza, K.E. Roberts, and B.R. Donald. Protein Design using Continuous Rotamers. PLoS
Computational Biology, 8(1): e1002335. doi:10.1371/journal.pcbi.1002335, 2012.
• Protein:Protein Interactions:
K.E. Roberts, P.R. Kushing, P. Boisguerin, DR Madden, and B.R. Donald. Design of protein-protein
interactions with a novel ensemble-based scoring algorithm. Research in Computational
Molecular Biology, volume 6577 of Lecture Notes in Computer Science. Heidelberg: Springer
Berlin. pp. 361–376. 2011.
K.E. Roberts, P.R. Kushing, P. Boisguerin, DR Madden, and B.R. Donald. Computational Design
of a PDZ Domain Peptide Inhibitor that Rescues CFTR Activity. PLoS Computational Biology,
2012. In press.
• MinDEE:
I. Georgiev, R. Lilien, and B. R. Donald. The minimized dead-end elimination criterion and its
application to protein redesign in a hybrid scoring and search algorithm for computing partition
functions over molecular ensembles. J Comput Chem, 29(10):1527–42, 2008.
• BD:
I. Georgiev and B. R. Donald. Dead-end elimination with backbone flexibility. Bioinformatics,
23(13):i185–94, 2007. Proc. International Conference on Intelligent Systems for Molecular
Biology (ISMB), Vienna, Austria (2007).
• Brdee:
I. Georgiev, D. Keedy, J. S. Richardson, D. C. Richardson, and B. R. Donald. Algorithm for
backrub motions in protein design. Bioinformatics, 24(13):i196–204, 2008. Proc. International
Conference on Intelligent Systems for Molecular Biology (ISMB), Toronto, Canada (2008).
• DACS:
I. Georgiev, R. Lilien, and B. R. Donald. Improved pruning algorithms and divide-and-conquer
strategies for dead-end elimination, with application to protein design. Bioinformatics, 22(14):e174–
183, 2006. Proc. International Conference on Intelligent Systems for Molecular Biology (ISMB),
Fortaleza, Brazil (2006).
• K∗ (current implementation):
I. Georgiev, R. Lilien, and B. R. Donald. The minimized dead-end elimination criterion and its
application to protein redesign in a hybrid scoring and search algorithm for computing partition
functions over molecular ensembles. J Comput Chem, 29(10):1527–42, 2008.
To cite the general idea of K∗, you can also cite:
R. Lilien, B. Stevens, A. Anderson, and B. R. Donald. A novel ensemble-based scoring and
search algorithm for protein redesign, and its application to modify the substrate specificity of
the Gramicidin Synthetase A phenylalanine adenylation enzyme. J Comp Biol, 12(6–7):740–761,
2005.
• DEEPer:
M. A. Hallen, D. A. Keedy, and B. R. Donald. Dead-End elimination with perturbations (DEEPer):
A provable protein design algorithm with continuous sidechain and backbone flexibility. Proteins,
in press, 2012.
• Sparse A*: For protein designs with sparse residue interaction graphs, Sparse A* can in some cases
return the GMEC faster than traditional A*, including cases where traditional A* times out or runs
out of memory. If using Sparse A*, use the following citation:
Jain S, Jou JD, Georgiev IS, and Donald BR. A Critical Analysis of Computational Protein Design 
with Sparse Residue Interaction Graphs. PLoS Comp Biol. 2017, In press.
• BWM*: For protein designs with sparse residue interaction graphs, BWM* exploits the
sparseness of the graph to enumerate conformations efficiently, significantly faster than
traditional iMinDEE/A*. If using BWM*, use the following citation:
Jou JD, Jain S, Georgiev I, Donald BR. BWM*: A novel, provable ensemble-based dynamic programming
algorithm for sparse approximations of computational protein design. J Comput Biol. 2016.



Technical instructions specific to running Sparse A* follow.

The runs here are initial experiments to determine the setups for which doBranchDGMEC is most powerful. They should not be used for exact run-time comparisons; rather, they give a general idea of the computational bottlenecks and power of doBranchDGMEC and the related functions.

All runs were performed on hexa.cs.duke.edu, which has 8 dual-core AMD Opteron(tm) 885 processors at 2613.712 MHz and a total of 64 GB of RAM.

For a given system, the runs are performed using the following steps (the example is given with 1amu):

(DEE)
java -Xmx1024M KStar doDEE System.cfg DEE.cfg >! logDEE1amu.cfg


(GBD)
java -Xmx1024M KStar doDEE System.cfg DEEgg.cfg >! logGG1amu.cfg
cd BranchDecomposition
java -Xmx1024M -classpath .:jmatharray.jar BranchDecomposition ../gd_1amu ../gd_1amu_bd >! logBD1amu.out
		(gd_1amu_bd will contain the branch decomposition.)
cd ..
java -Xmx1024M KStar doBranchDGMEC System.cfg GD.cfg >! logGD1amu.out


For the doBranchDGMEC run, -Xmx may have to be increased significantly in some cases (e.g., 10000M-30000M).
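
The (GBD) steps above can be sketched as a small bash helper that prints the commands for a given system name, so they can be reviewed before being run. This is only an illustration under stated assumptions: the function name sparse_astar_cmds, its argument order, and the separate heap argument for the doBranchDGMEC step are hypothetical and are not part of this repository. The commands above use the csh/tcsh ">!" overwrite redirect; the sketch prints a plain ">" instead, as would be typed in bash.

```shell
#!/usr/bin/env bash
# Sketch only: prints (does not execute) the GBD commands for one system.
# Assumptions, not from the repository: the function name, the argument
# order, and the use of ">" in place of the csh-style ">!" redirect.
sparse_astar_cmds() {
    local sys="$1"                 # system name, e.g. 1amu
    local heap="${2:-1024M}"       # JVM heap for the DEE and decomposition steps
    local gmec_heap="${3:-$heap}"  # heap for doBranchDGMEC (may need 10000M-30000M)
    cat <<EOF
java -Xmx${heap} KStar doDEE System.cfg DEEgg.cfg > logGG${sys}.cfg
cd BranchDecomposition
java -Xmx${heap} -classpath .:jmatharray.jar BranchDecomposition ../gd_${sys} ../gd_${sys}_bd > logBD${sys}.out
cd ..
java -Xmx${gmec_heap} KStar doBranchDGMEC System.cfg GD.cfg > logGD${sys}.out
EOF
}
```

For example, "sparse_astar_cmds 1amu 1024M 20000M" prints the three java commands with the larger heap applied only to the doBranchDGMEC step. The file names follow the 1amu example above and are substituted purely by pattern, so they should be checked against the actual configuration files before use.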


Additional commands:

doBranchDGMEC 
genSparseGraph
prunedObjInfo - if DEE pruning objects have been computed previously, set to TRUE. Defaults to FALSE.
