Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REST scaling for protein-ligand systems with the CLI #1197

Draft
wants to merge 8 commits into
base: 0.10.x
Choose a base branch
from

Conversation

ijpulidos
Copy link
Contributor

Description

Adds handling of max temperature parameter from input yaml file with CLI.

Motivation and context

We need to be able to specify the maximum temperature from input YAML file to be able to benefit from using REST in the small molecule simulations pipeline.

Resolves #1195

How has this been tested?

Need to come up with a way of testing that REST scaling is being done from the serialized objects. I still don't know what would be the best way to accomplish this.

Change log

Enabling ``max_temperature`` parameter in input yaml file for CLI. This makes REST available for the small molecule pipeline using ``perses-cli``.

@mikemhenry mikemhenry changed the base branch from main to 0.10.x July 11, 2023 21:02
@codecov
Copy link

codecov bot commented Aug 7, 2023

Codecov Report

Merging #1197 (7250cc8) into 0.10.x (f2f63a9) will decrease coverage by 0.02%.
The diff coverage is 70.37%.

@ijpulidos
Copy link
Contributor Author

I figured that a selection of is_protein was hardcoded when defining the REST region, but if I try to overcome this I'm just faced with the following issue

Traceback (most recent call last):
  File "/home/user/miniconda3/envs/perses-dev/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/user/miniconda3/envs/perses-dev/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/miniconda3/envs/perses-dev/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/user/workdir/repos/perses/perses/app/cli.py", line 109, in cli
    run(yaml_filename=yaml, override_string=override)
  File "/home/user/workdir/repos/perses/perses/app/setup_relative_calculation.py", line 820, in run
    setup_dict = run_setup(setup_options)
  File "/home/user/workdir/repos/perses/perses/app/setup_relative_calculation.py", line 639, in run_setup
    _validate_endstate_energies_for_htf(htf, top_prop, phase,
  File "/home/user/workdir/repos/perses/perses/app/setup_relative_calculation.py", line 1140, in _validate_endstate_energies_for_htf
    zero_state_error, one_state_error = validate_endstate_energies(topology_proposal,
  File "/home/user/workdir/repos/perses/perses/dispersed/utils.py", line 898, in validate_endstate_energies
    nonalch_zero, nonalch_one, alch_zero, alch_one = generate_endpoint_thermodynamic_states(hybrid_system, top_proposal, repartitioned_endstate)
  File "/home/user/workdir/repos/perses/perses/dispersed/utils.py", line 613, in generate_endpoint_thermodynamic_states
    check_system(system)
  File "/home/user/workdir/repos/perses/perses/dispersed/utils.py", line 64, in check_system
    force = forces['PeriodicTorsionForce']
KeyError: 'PeriodicTorsionForce'

For which I don't have a good clue of what could be causing it.

@ijpulidos ijpulidos changed the title Maximum temperature yaml parameter for REST with CLI REST scaling for preotein-ligand systems with the CLI Aug 8, 2023
n_replicas : int, optional
The number of replicas for replica exchange. If not specified, it will be set to `n_states`.
lambda_schedule : array-like, optional
The schedule of lambda values for the alchemical states. Default is a linear schedule from 0 to 1.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this to something like "Default is None, in which case a linear schedule from 0 to 1 will be used"

@@ -3220,8 +3213,10 @@ def _generate_rest_region(self):

# Retrieve neighboring atoms within self._rest_radius nm of the query atoms
traj = md.Trajectory(np.array(self._hybrid_positions), self._hybrid_topology)
solute_atoms = list(traj.topology.select("is_protein"))
solute_atoms = list(traj.topology.select("not water"))
Copy link
Contributor

@zhang-ivy zhang-ivy Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right way to change this function to work for small molecules. Specifying "not water" will include solvent ions, which we definitely don't want to include as part of the REST region.

We first need to determine what the desired behavior is. The current behavior of this function for small molecules is to automatically include all small molecule atoms in the REST region, and also include protein atoms within X angstroms (where X is user specified) of the small molecule. I think we discussed that you would run some experiments testing this approach, though this will require finding transformations where REST is known to improve sampling / speed up convergence.

One worry I have with this approach is that including all small molecule atoms in the REST region is not the best practices approach (Schrodinger papers typically choose the alchemically-changing (aka unique old and new) atoms and some nearby protein atoms) and there may be risk of small molecule unbinding. Therefore, I think the desired approach should be to modify these lines:

        # Retrieve the residue index of the residue that is being alchemified based on the first unique old atom
        atom_index = list(self._atom_classes['unique_old_atoms'])[0]
        hybrid_topology_atoms = list(self._hybrid_topology.atoms)
        alchemical_residue_index = hybrid_topology_atoms[atom_index].residue.index
        
       # Retrieve indices of all atoms in the alchemical residue
        hybrid_topology_residues = list(self._hybrid_topology.residues)
        mutated_res = hybrid_topology_residues[alchemical_residue_index]
        query_indices = [atom.index for atom in mutated_res.atoms]
        _logger.info(f"Generating rest_region with query indices: {query_indices}")

such that for small molecule transformations only, query_indices is all unique old/new atoms/core atoms. This may requiring introducing an argument to the function like is_small_molecule that tells the function whether its dealing with a small molecule transformation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the things that we noticed is that the solute_atoms after the selection is empty for solvent phase. But we probably want to have the option to do REST scaling in solvent phase as well, such that the neighboring part is not needed, but the molecule atoms are still scaled according to the REST protocol.

Copy link
Contributor

@zhang-ivy zhang-ivy Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes from discussion:

  • As noted above, there should be three code paths: 1) protein mutation, 2) small molecule complex phase, 3) small molecule solvent phase
  • Double check that "is protein" mdtraj selection only contains protein atoms (and does not interpret small molecule atoms as protein atoms). If indeed "is protein" only corresponds to protein atoms, the way that the function is written will fail to include small molecule atoms in the rest_atoms_all. Will need to make sure the relevant small molecule atoms are included in this.

@ijpulidos ijpulidos changed the title REST scaling for preotein-ligand systems with the CLI REST scaling for protein-ligand systems with the CLI Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

REST max temperature from CLI
2 participants