+ + + +
+ +
+
+ +
+
+ +
+ +
+ +
+ + +
+ +
+ +
+ + + + + + + + + + + + + + + + + + + +
+ +
+ +
+
+ + + +
+

Investigation the binding Site

+ +
+ +
+
+ + + + +
+ +
+

Investigation the binding Site#

+

Learning Objectives

+
    +
  • Use NGLView to view the 3D structure of our protein and ligand.

  • +
  • Prepare molecule structures using PDQ2PQR and RDKit.

  • +
  • Analyze the interactions of the protein in the binding site using a 2D map and ProLIF.

  • +
+

Before we begin our docking calculations, we will likely want to investigate the binding site of our ligand of interest. +We will want to look at the binding pocket and the interactions of the ligand with the protein residues. +For the rest of our studies, we will choose the ligand 13U. +A trypsin structure where this ligand is bound is 2ZQ2.

+
+

Downloading the Structure#

+

First we will need to download our protein structure. We will download 2ZQ2, which is a trypsin structure with our ligand of interest bound.

+

We will use a similar strategy to our last notebook for getting the file. +We will use Python’s request module and a URL from the Protein Data Bank.

+
+
+
import os # for making directories
+import requests
+
+# make a directory for pdb files
+os.makedirs("pdb", exist_ok=True)
+
+pdb_id = "2zq2" # trypsin PDB file with ligand bound
+
+pdb_request = requests.get(f"https://files.rcsb.org/download/{pdb_id}.pdb")
+pdb_request.status_code
+
+
+
+
+
200
+
+
+
+
+

After downloading, we will write the text to a file.

+
+
+
with open(f"pdb/{pdb_id}.pdb", "w+") as f:
+    f.write(pdb_request.text)
+
+
+
+
+
+
+

View the structure#

+

Before we start to really work with our molecule, let’s investigate the structure. +We will use a library called MDAnalysis to first process our PDB, then visualize it with a library called NGLView. +In the cell below, we define some convenience functions for NGLView. +These are functions that the tutorial writers wrote for our protein ligand system.

+
+
+
import math
+
+def rotate_view(view, x=0, y=0, z=0, degrees=True):
+    radians = 1
+    if degrees: radians = math.pi / 180
+    view.control.spin([1, 0, 0], x*radians)
+    view.control.spin([0, 1, 0], y*radians)
+    view.control.spin([0, 0, 1], z*radians)
+
+def view_binding_site(protein, ligand):
+    """View binding site of 13U to trypsin.
+
+    Parameters
+    -----------
+    protein: mda.Universe
+        The protein as an MDAnalysis universe.
+
+    ligand: mda.Universe
+        The ligand as an MDAnalysis universe.
+    """
+    view = nv.show_mdanalysis(protein)
+    view.clear_representations()
+    view.add_representation("surface", colorScheme="hydrophobicity")
+    lig_view = view.add_component(ligand)
+    lig_view.center()
+    rotate_view(view, y=180, x=20)
+    return view
+    
+
+
+
+
+

MDAnalysis is a popular tool for processing molecular dynamics trajectories and other molecular structures. +The central object in MDAnalysis is called a “Universe”. In MDAnalysis terms, a Universe represents a molecular +system. We can load an MDAnalysis universe from a PDB file.

+
+
+
import MDAnalysis as mda
+import nglview as nv
+
+u = mda.Universe(f"pdb/{pdb_id}.pdb")
+
+
+
+
+
+
+

Text here about NGLView

+
+
+
view = nv.show_mdanalysis(u)
+view
+
+
+
+
+
+
+

This view looks a bit messy. MDAnalysis has a human readable selection syntax +that allows us to isolate parts of our structure. We will take our MDAnalysis Universe (the variable u) and use the select_atoms function. +Inside this function, we will fill in what we want to select.

+

We will create separate variables for the protein and ligand. We can select all protein residues in MDAnalysis using the word “protein” in the select_atoms function. Then, we will select our ligand using resname 13U. This corresponds to the residue name in the PDB we downloaded.

+
+
+
protein = u.select_atoms("protein")
+ligand = u.select_atoms("resname 13U")
+
+
+
+
+

We will use our helper function, defined above, to look at how the ligand is bound to the protein. +This helper function will use NGLView, like we did previously, but adds coloring the surface by hydrophobicity. +It also zooms in on the ligand.

+
+
+
view_binding_site(protein, ligand)
+
+
+
+
+
+
+

Upon viewing this structure, you will notice that our ligand seems to appear twice. +If you open the PDB file to investigate, you will see the following in the ligand section:

+
HETATM 1673  C14A13U A 501      18.144  -9.216  12.088  0.61 24.22           C  
+ANISOU 1673  C14A13U A 501     1755   4793   2654   1752    148   1233       C  
+HETATM 1674  C14B13U A 501      18.147  -8.840  11.672  0.39 24.46           C  
+ANISOU 1674  C14B13U A 501     2583   4283   2430   1765    353   1279       C  
+HETATM 1675  O32A13U A 501      18.209  -8.355  11.186  0.61 24.38           O  
+ANISOU 1675  O32A13U A 501     2354   5394   1514   2217    238    919       O
+
+
+

This PDB structure provides alternate locations for each ligand atom. In excerpt above, you will see C14A13U and C14B13U. These are alternate locations of the same atom. +By checking the documentation page for MDAnalysis selections, we can see that MDAnalysis is prepared for this scenario. We will want to use the altloc keyword. This keyword is described as:

+
+

altLoc alternative-location

+
+
+

a selection for atoms where alternative locations are available, which is often the case with high-resolution crystal structures e.g. resid 4 and resname ALA and altLoc B selects only the atoms of ALA-4 that have an altLoc B record.

+
+

We can alter our MDAnalysis selection syntax to isolate our ligands of interest.

+
+
+
protein = u.select_atoms("protein")
+ligand_A = u.select_atoms("resname 13U and altLoc A")
+ligand_B = u.select_atoms("resname 13U and altLoc B")
+
+
+
+
+

Now, we can use our viewing function to see the location of each ligand.

+
+
+
view_binding_site(protein, ligand_A)
+
+
+
+
+
+
+
+
+
view_binding_site(protein, ligand_B)
+
+
+
+
+
+
+

When we inspect the ligand in the binding site, we notice a few things. +First, the binding site has a large hydrophobic area on the surface. +If you zoom in on the binding pocket, you’ll also see that the benzene ring and amine groups are inside.

+
+
+

Making a Map of Ligand Contacts#

+

To get an even better idea of how our ligand is binding to the protein, we might choose to make a 2D map of ligand contacts with protein residues. +In this analysis, we’ll want to know how the ligand is interacting with the protein residues including if it is making hydrogen bonds, Van Der Waals interactions, etc.

+

We will use a library called ProLIF for this analysis. ProLIF is short for “Protein-Ligand Interaction Fingerprints” and it “ is a tool designed to generate interaction fingerprints for complexes made of ligands, protein, DNA or RNA molecules extracted from molecular dynamics trajectories, docking simulations and experimental structures.” (quote taken from ProLIF docs).

+

Before we use ProLIF, we first have to make sure our ligand and protein file are prepared properly. +Hydrogens are absent in most PDB files because they are not well resolved by methods like X-Ray crystallography. +We’ll need to add them back in in order to complete our analysis of the binding site.

+

This process can actually be quite involved, as we’ll see below.

+

We will start by saving new PDBs of our selections from MDAnalysis. +Then, we will add hydrogens.

+
+
+
protein.write(f"pdb/protein_{pdb_id}.pdb")
+ligand_A.write(f"pdb/ligand_A.pdb")
+
+
+
+
+
/usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/coordinates/PDB.py:1153: UserWarning: Found no information for attr: 'formalcharges' Using default value of '0'
+  warnings.warn("Found no information for attr: '{}'"
+
+
+
+
+
+

Structure Preparation#

+

Before we run the analysis, we need to make sure our protein and ligand have hydrogen atoms. +We will do this first for the protein.

+

We will use a specialized program called PDB2PQR that is made for working with biomolecules like proteins. +The advantage of using PDB2PQR is that it will check our protein for missing atoms and multiple occupancy in the protein.

+

We will use the command-line interface of this program. This means that you would usually type the command below into your terminal +You can run command line commands in the Jupyter notebook by putting a ! in front of the command.

+
+
+
! pdb2pqr --pdb-output=pdb/protein_h.pdb --pH=7.4 pdb/protein_2zq2.pdb protein.pqr
+
+
+
+
+
/usr/bin/sh: 1: pdb2pqr: not found
+
+
+
+
+
+
+
protein = mda.Universe("pdb/protein_H.pdb")
+
+
+
+
+
---------------------------------------------------------------------------
+FileNotFoundError                         Traceback (most recent call last)
+    [... skipping hidden 1 frame]
+
+Cell In[13], line 1
+----> 1 protein = mda.Universe("pdb/protein_H.pdb")
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:356, in Universe.__init__(self, topology, all_coordinates, format, topology_format, transformations, guess_bonds, vdwradii, fudge_factor, lower_bound, in_memory, in_memory_step, *coordinates, **kwargs)
+    355     self.filename = _check_file_like(topology)
+--> 356     topology = _topology_from_file_like(self.filename,
+    357                                         topology_format=topology_format,
+    358                                         **kwargs)
+    360 if topology is not None:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:118, in _topology_from_file_like(topology_file, topology_format, **kwargs)
+    115 if (err.errno is not None and
+    116     errno.errorcode[err.errno] in ['ENOENT', 'EACCES']):
+    117     # Runs if the error is propagated due to no permission / file not found
+--> 118     raise sys.exc_info()[1] from err
+    119 else:
+    120     # Runs when the parser fails
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:110, in _topology_from_file_like(topology_file, topology_format, **kwargs)
+    109     with parser(topology_file) as p:
+--> 110         topology = p.parse(**kwargs)
+    111 except (IOError, OSError) as err:
+    112     # There are 2 kinds of errors that might be raised here:
+    113     # one because the file isn't present
+    114     # or the permissions are bad, second when the parser fails
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/topology/PDBParser.py:210, in PDBParser.parse(self, **kwargs)
+    204 """Parse atom information from PDB file
+    205 
+    206 Returns
+    207 -------
+    208 MDAnalysis Topology object
+    209 """
+--> 210 top = self._parseatoms()
+    212 try:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/topology/PDBParser.py:247, in PDBParser._parseatoms(self)
+    246 last_wrapped_serial = 100000  # if serials wrap, start from here
+--> 247 with util.openany(self.filename) as f:
+    248     for line in f:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
+    136 try:
+--> 137     return next(self.gen)
+    138 except StopIteration:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:318, in openany(datasource, mode, reset)
+    276 """Context manager for :func:`anyopen`.
+    277 
+    278 Open the `datasource` and close it when the context of the :keyword:`with`
+   (...)
+    316 :func:`anyopen`
+    317 """
+--> 318 stream = anyopen(datasource, mode=mode, reset=reset)
+    319 try:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:398, in anyopen(datasource, mode, reset)
+    397 openfunc = read_handlers[ext]
+--> 398 stream = _get_stream(datasource, openfunc, mode=mode)
+    399 if stream is not None:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:440, in _get_stream(filename, openfunction, mode)
+    439 if errno.errorcode[err.errno] in ['ENOENT', 'EACCES']:
+--> 440     raise sys.exc_info()[1] from err
+    441 return None
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:434, in _get_stream(filename, openfunction, mode)
+    433 try:
+--> 434     stream = openfunction(filename, mode=mode)
+    435 except (IOError, OSError) as err:
+    436     # An exception might be raised due to two reasons, first the openfunction is unable to open the file, in this
+    437     # case we have to ignore the error and return None. Second is when openfunction can't open the file because
+    438     # either the file isn't there or the permissions don't allow access.
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/picklable_file_io.py:502, in bz2_pickle_open(name, mode)
+    501 bz_mode = mode.replace("t", "")
+--> 502 binary_file = BZ2Picklable(name, bz_mode)
+    503 if "t" in mode:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/picklable_file_io.py:283, in BZ2Picklable.__init__(self, name, mode)
+    282 self._bz_mode = mode
+--> 283 super().__init__(name, mode)
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/bz2.py:81, in BZ2File.__init__(self, filename, mode, compresslevel)
+     80 if isinstance(filename, (str, bytes, os.PathLike)):
+---> 81     self._fp = _builtin_open(filename, mode)
+     82     self._closefp = True
+
+FileNotFoundError: [Errno 2] No such file or directory: 'pdb/protein_H.pdb'
+
+The above exception was the direct cause of the following exception:
+
+FileNotFoundError                         Traceback (most recent call last)
+    [... skipping hidden 1 frame]
+
+Cell In[13], line 1
+----> 1 protein = mda.Universe("pdb/protein_H.pdb")
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:356, in Universe.__init__(self, topology, all_coordinates, format, topology_format, transformations, guess_bonds, vdwradii, fudge_factor, lower_bound, in_memory, in_memory_step, *coordinates, **kwargs)
+    355     self.filename = _check_file_like(topology)
+--> 356     topology = _topology_from_file_like(self.filename,
+    357                                         topology_format=topology_format,
+    358                                         **kwargs)
+    360 if topology is not None:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:118, in _topology_from_file_like(topology_file, topology_format, **kwargs)
+    115 if (err.errno is not None and
+    116     errno.errorcode[err.errno] in ['ENOENT', 'EACCES']):
+    117     # Runs if the error is propagated due to no permission / file not found
+--> 118     raise sys.exc_info()[1] from err
+    119 else:
+    120     # Runs when the parser fails
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:110, in _topology_from_file_like(topology_file, topology_format, **kwargs)
+    109     with parser(topology_file) as p:
+--> 110         topology = p.parse(**kwargs)
+    111 except (IOError, OSError) as err:
+    112     # There are 2 kinds of errors that might be raised here:
+    113     # one because the file isn't present
+    114     # or the permissions are bad, second when the parser fails
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/topology/PDBParser.py:210, in PDBParser.parse(self, **kwargs)
+    204 """Parse atom information from PDB file
+    205 
+    206 Returns
+    207 -------
+    208 MDAnalysis Topology object
+    209 """
+--> 210 top = self._parseatoms()
+    212 try:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/topology/PDBParser.py:247, in PDBParser._parseatoms(self)
+    246 last_wrapped_serial = 100000  # if serials wrap, start from here
+--> 247 with util.openany(self.filename) as f:
+    248     for line in f:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
+    136 try:
+--> 137     return next(self.gen)
+    138 except StopIteration:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:318, in openany(datasource, mode, reset)
+    276 """Context manager for :func:`anyopen`.
+    277 
+    278 Open the `datasource` and close it when the context of the :keyword:`with`
+   (...)
+    316 :func:`anyopen`
+    317 """
+--> 318 stream = anyopen(datasource, mode=mode, reset=reset)
+    319 try:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:398, in anyopen(datasource, mode, reset)
+    397 openfunc = read_handlers[ext]
+--> 398 stream = _get_stream(datasource, openfunc, mode=mode)
+    399 if stream is not None:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:440, in _get_stream(filename, openfunction, mode)
+    439 if errno.errorcode[err.errno] in ['ENOENT', 'EACCES']:
+--> 440     raise sys.exc_info()[1] from err
+    441 return None
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:434, in _get_stream(filename, openfunction, mode)
+    433 try:
+--> 434     stream = openfunction(filename, mode=mode)
+    435 except (IOError, OSError) as err:
+    436     # An exception might be raised due to two reasons, first the openfunction is unable to open the file, in this
+    437     # case we have to ignore the error and return None. Second is when openfunction can't open the file because
+    438     # either the file isn't there or the permissions don't allow access.
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/picklable_file_io.py:502, in bz2_pickle_open(name, mode)
+    501 bz_mode = mode.replace("t", "")
+--> 502 binary_file = BZ2Picklable(name, bz_mode)
+    503 if "t" in mode:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/picklable_file_io.py:283, in BZ2Picklable.__init__(self, name, mode)
+    282 self._bz_mode = mode
+--> 283 super().__init__(name, mode)
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/bz2.py:81, in BZ2File.__init__(self, filename, mode, compresslevel)
+     80 if isinstance(filename, (str, bytes, os.PathLike)):
+---> 81     self._fp = _builtin_open(filename, mode)
+     82     self._closefp = True
+
+FileNotFoundError: [Errno 2] No such file or directory: 'pdb/protein_H.pdb'
+
+The above exception was the direct cause of the following exception:
+
+FileNotFoundError                         Traceback (most recent call last)
+Cell In[13], line 1
+----> 1 protein = mda.Universe("pdb/protein_H.pdb")
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:356, in Universe.__init__(self, topology, all_coordinates, format, topology_format, transformations, guess_bonds, vdwradii, fudge_factor, lower_bound, in_memory, in_memory_step, *coordinates, **kwargs)
+    354 if not isinstance(topology, Topology) and not topology is None:
+    355     self.filename = _check_file_like(topology)
+--> 356     topology = _topology_from_file_like(self.filename,
+    357                                         topology_format=topology_format,
+    358                                         **kwargs)
+    360 if topology is not None:
+    361     self._topology = topology
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:118, in _topology_from_file_like(topology_file, topology_format, **kwargs)
+    111 except (IOError, OSError) as err:
+    112     # There are 2 kinds of errors that might be raised here:
+    113     # one because the file isn't present
+    114     # or the permissions are bad, second when the parser fails
+    115     if (err.errno is not None and
+    116         errno.errorcode[err.errno] in ['ENOENT', 'EACCES']):
+    117         # Runs if the error is propagated due to no permission / file not found
+--> 118         raise sys.exc_info()[1] from err
+    119     else:
+    120         # Runs when the parser fails
+    121         raise IOError("Failed to load from the topology file {0}"
+    122                         " with parser {1}.\n"
+    123                         "Error: {2}".format(topology_file, parser, err))
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/core/universe.py:110, in _topology_from_file_like(topology_file, topology_format, **kwargs)
+    108 try:
+    109     with parser(topology_file) as p:
+--> 110         topology = p.parse(**kwargs)
+    111 except (IOError, OSError) as err:
+    112     # There are 2 kinds of errors that might be raised here:
+    113     # one because the file isn't present
+    114     # or the permissions are bad, second when the parser fails
+    115     if (err.errno is not None and
+    116         errno.errorcode[err.errno] in ['ENOENT', 'EACCES']):
+    117         # Runs if the error is propagated due to no permission / file not found
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/topology/PDBParser.py:210, in PDBParser.parse(self, **kwargs)
+    203 def parse(self, **kwargs):
+    204     """Parse atom information from PDB file
+    205 
+    206     Returns
+    207     -------
+    208     MDAnalysis Topology object
+    209     """
+--> 210     top = self._parseatoms()
+    212     try:
+    213         bonds = self._parsebonds(top.ids.values)
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/topology/PDBParser.py:247, in PDBParser._parseatoms(self)
+    245 self._wrapped_serials = False  # did serials go over 100k?
+    246 last_wrapped_serial = 100000  # if serials wrap, start from here
+--> 247 with util.openany(self.filename) as f:
+    248     for line in f:
+    249         line = line.strip()  # Remove extra spaces
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
+    135 del self.args, self.kwds, self.func
+    136 try:
+--> 137     return next(self.gen)
+    138 except StopIteration:
+    139     raise RuntimeError("generator didn't yield") from None
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:318, in openany(datasource, mode, reset)
+    274 @contextmanager
+    275 def openany(datasource, mode='rt', reset=True):
+    276     """Context manager for :func:`anyopen`.
+    277 
+    278     Open the `datasource` and close it when the context of the :keyword:`with`
+   (...)
+    316     :func:`anyopen`
+    317     """
+--> 318     stream = anyopen(datasource, mode=mode, reset=reset)
+    319     try:
+    320         yield stream
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:398, in anyopen(datasource, mode, reset)
+    396 for ext in ('bz2', 'gz', ''):  # file == '' should be last
+    397     openfunc = read_handlers[ext]
+--> 398     stream = _get_stream(datasource, openfunc, mode=mode)
+    399     if stream is not None:
+    400         break
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:440, in _get_stream(filename, openfunction, mode)
+    435 except (IOError, OSError) as err:
+    436     # An exception might be raised due to two reasons, first the openfunction is unable to open the file, in this
+    437     # case we have to ignore the error and return None. Second is when openfunction can't open the file because
+    438     # either the file isn't there or the permissions don't allow access.
+    439     if errno.errorcode[err.errno] in ['ENOENT', 'EACCES']:
+--> 440         raise sys.exc_info()[1] from err
+    441     return None
+    442 if mode.startswith('r'):
+    443     # additional check for reading (eg can we uncompress) --- is this needed?
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/util.py:434, in _get_stream(filename, openfunction, mode)
+    432 """Return open stream if *filename* can be opened with *openfunction* or else ``None``."""
+    433 try:
+--> 434     stream = openfunction(filename, mode=mode)
+    435 except (IOError, OSError) as err:
+    436     # An exception might be raised due to two reasons, first the openfunction is unable to open the file, in this
+    437     # case we have to ignore the error and return None. Second is when openfunction can't open the file because
+    438     # either the file isn't there or the permissions don't allow access.
+    439     if errno.errorcode[err.errno] in ['ENOENT', 'EACCES']:
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/picklable_file_io.py:502, in bz2_pickle_open(name, mode)
+    499     raise ValueError("Only read mode ('r', 'rt', 'rb') "
+    500                      "files can be pickled.")
+    501 bz_mode = mode.replace("t", "")
+--> 502 binary_file = BZ2Picklable(name, bz_mode)
+    503 if "t" in mode:
+    504     return TextIOPicklable(binary_file)
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/site-packages/MDAnalysis/lib/picklable_file_io.py:283, in BZ2Picklable.__init__(self, name, mode)
+    281 def __init__(self, name, mode='rb'):
+    282     self._bz_mode = mode
+--> 283     super().__init__(name, mode)
+
+File /usr/share/miniconda/envs/biochemist-python/lib/python3.11/bz2.py:81, in BZ2File.__init__(self, filename, mode, compresslevel)
+     78     raise ValueError("Invalid mode: %r" % (mode,))
+     80 if isinstance(filename, (str, bytes, os.PathLike)):
+---> 81     self._fp = _builtin_open(filename, mode)
+     82     self._closefp = True
+     83     self._mode = mode_code
+
+FileNotFoundError: [Errno 2] No such file or directory: 'pdb/protein_H.pdb'
+
+
+
+
+

Adding hydrogens to our ligand is a little bit more difficult. +We can’t just use PDB2PQR in this case. +Our ligand might not have bond lengths such that the proper bond orders are always recognized, so we will want to make sure that we have a proper reference.

+

We will use the ideal ligand we downloaded as a reference and use a small molecule manipulation software called RDKit to match bond orders and add hydrogens.

+
+
+
from rdkit import Chem
+
+from rdkit.Chem.AllChem import AssignBondOrdersFromTemplate
+
+template = Chem.MolFromMol2File("ligands/13U_ideal.mol2")
+pdb_ligand = Chem.MolFromPDBFile(f"pdb/ligand_A.pdb")
+
+template = Chem.RemoveAllHs(template)
+
+
+
+
+
+
+
ligand = AssignBondOrdersFromTemplate(template, pdb_ligand)
+
+# Write the ligand to an SDF file
+Chem.MolToMolFile(ligand, "pdb/ligand.sdf")
+
+
+
+
+

Now, we need to make sure this structure has hydrogens for our analysis.

+
+
+
from openbabel import pybel
+
+# Use pybel to read the SDF, add hydrogens, and save as PDB
+mol = next(pybel.readfile("sdf", "pdb/ligand.sdf"))
+mol.addh()  # Add hydrogens
+mol.write("pdb", "pdb/ligand_h.pdb", overwrite=True)
+
+
+
+
+
+
+

Visualizing the Binding Site#

+

Now that we have our files with hydrogens prepared, we can make a map of the binding site.

+
+
+
import prolif as plf
+
+protein_h = mda.Universe(f"pdb/protein_h.pdb")
+ligand_h = mda.Universe(f"pdb/ligand_h.pdb")
+
+
+
+
+
+
+
protein_mol = plf.Molecule.from_mda(protein_h)
+
+
+
+
+
+
+
ligand_mol = plf.Molecule.from_mda(ligand_h)
+
+
+
+
+
+
+
fp = plf.Fingerprint()
+
+
+
+
+
+
+
lig_list = [ ligand_mol ] 
+
+interactions = fp.run_from_iterable(lig_list, protein_mol)
+
+
+
+
+
+
+
view = fp.plot_lignetwork(lig_list[0])
+view
+
+
+
+
+
+
+
+ + + + +
+ + + + + + + + +
+ + + + + + +
+
+ + +
+ + +