Skip to content

CHARMM36 supports disulfide bonds#2040

Merged
peastman merged 2 commits into
openmm:masterfrom
peastman:disulfide
Apr 17, 2018
Merged

CHARMM36 supports disulfide bonds#2040
peastman merged 2 commits into
openmm:masterfrom
peastman:disulfide

Conversation

@peastman

@peastman peastman commented Apr 6, 2018

Copy link
Copy Markdown
Member

Fixes #2039. I created the patch by hand. I added a test case that verifies it can create a System for a protein with disulfide bonds, but I haven't checked that it produces identical forces to CHARMM. @jchodera can you check that?

This fixes the problem in most cases, but the molecule a user discovered the problem in (5UIW) still does not work. It's an unusual case, and a full fix for it will take bigger changes than I feel safe doing in a patch release. The problem is that it contains two CYS residues that are adjacent to each other along the chain, and each of them forms a disulfide bond. The code for applying multi-residue patches ends up matching the patch to the two adjacent residues. They have the right types, after all, and they're bonded to each other. But then it's unable to match the two residues they form the disulfide bonds to and it gets an error. I'll try to come up with a fix for that in 7.3.

In the process of testing this, I also discovered another problem: addMembrane() doesn't work with CHARMM36 because it doesn't handle CustomNonbondedForces. I'll send another PR to fix that.

@peastman

peastman commented Apr 6, 2018

Copy link
Copy Markdown
Member Author

No, fixing addMembrane() is going to take bigger changes too, so that will also have to wait. The tabulated functions used by the CustomNonbondedForce are specific to that particular system, so you can't create a system and then add more particles to it (which is what addMembrane() does internally).

@jchodera

jchodera commented Apr 6, 2018

Copy link
Copy Markdown
Member

Fixes #2039. I created the patch by hand. I added a test case that verifies it can create a System for a protein with disulfide bonds, but I haven't checked that it produces identical forces to CHARMM. @jchodera can you check that?

Thanks! Working on these tests now.

the molecule a user discovered the problem in (5UIW) still does not work. The problem is that it contains two CYS residues that are adjacent to each other along the chain, and each of them forms a disulfide bond. The code for applying multi-residue patches ends up matching the patch to the two adjacent residues. They have the right types, after all, and they're bonded to each other. But then it's unable to match the two residues they form the disulfide bonds to and it gets an error. I'll try to come up with a fix for that in 7.3.

Good catch here. More generally, it seems like we should be using some sort of distance-based criteria to figure out how to properly apply two-residue patches that involve adding new bonds, since the bonds should be applied between the closest potential matching atoms. Alternatively, we can use information from the PDB structure about which residues should be disulfide bonded to restrict patch application.

Note that this structure has additional issues we need to be able to treat. In addition to the SSBOND directives:

SSBOND   1 CYS A   20    CYS A  269                          1555   1555  2.04  
SSBOND   2 CYS A  101    CYS A  178                          1555   1555  2.06  
SSBOND   3 CYS B   10    CYS B   34                          1555   1555  2.04  
SSBOND   4 CYS B   11    CYS B   50                          1555   1555  2.05  

this structure also has CYS residues coordinating zinc:

LINK         SG  CYS A1006                ZN    ZN A1101     1555   1555  2.45  
LINK         SG  CYS A1009                ZN    ZN A1101     1555   1555  2.18  
LINK         SG  CYS A1039                ZN    ZN A1101     1555   1555  2.29  
LINK         SG  CYS A1042                ZN    ZN A1101     1555   1555  2.29  

@rafwiewiora has some strategies to deal with this that we can hope to include in a future release of OpenMM.

@peastman

peastman commented Apr 6, 2018

Copy link
Copy Markdown
Member Author

That assumes you're starting from the original file that still has all that information. The file the user sent me had already been processed with PyMol and all that information was gone.

My thought is to identify all pairs of residues the patch could be applied to, then look for the combination that covers as many residues as possible. In this case, applying the patch to the two adjacent residues leaves the other two unmatched. But applying them in the proper way matches all four residues, so that would be selected.

@peastman peastman mentioned this pull request Apr 6, 2018
@jchodera

jchodera commented Apr 6, 2018

Copy link
Copy Markdown
Member

What about having PDBFile and friends preserve SSBOND information, and use that by default, but allow the pipeline to override this to try to guess how patches can be applied with a flag?

My thought is to identify all pairs of residues the patch could be applied to, then look for the combination that covers as many residues as possible. In this case, applying the patch to the two adjacent residues leaves the other two unmatched. But applying them in the proper way matches all four residues, so that would be selected.

Couldn't this cause really weird things to happen to residues that are nowhere near each other?

I'm also a bit confused here as to why the topology didn't contain bonds between the bonded residues in question. Should we only be applying patched residue combination parameter sets that have the same connectivity as the pair of residues we're trying to apply the parameters to?

@peastman

peastman commented Apr 6, 2018

Copy link
Copy Markdown
Member Author

We do only apply patches to pairs of residues that are directly bonded to each other. In this case, they are directly bonded because they're next to each other along the chain. That's why it's an unusual case.

@jchodera

jchodera commented Apr 6, 2018

Copy link
Copy Markdown
Member

But the patched residue specifies an S-S bond, and the two sequential Cys residues don't have an S-S bond in the topology that ForceField is passed. So how does this bond get applied? Is ForceField creating bonds that are not in the Topology, or is this happening in Modeller?

@peastman

peastman commented Apr 6, 2018

Copy link
Copy Markdown
Member Author

The patch doesn't say anything about S-S bonds. It just says that it applies to two CYS residues, and it lists a set of changes to make. You know that it was intended for creating disulfide bonds, but there's nothing in that patch definition that enforces that.

@jchodera

jchodera commented Apr 6, 2018

Copy link
Copy Markdown
Member

The patch doesn't say anything about S-S bonds. It just says that it applies to two CYS residues, and it lists a set of changes to make. You know that it was intended for creating disulfide bonds, but there's nothing in that patch definition that enforces that.

Ah! For matching residues, we require that all the elements, internal bonds, and external bonds match the template description of the residue. If we are considering a patch + residue combination, shouldn't we enforce the same conditions? A patched combination of two Cys templates would then contain an S-S bond that would have to exist in the topology in order for a match to occur. This would solve that problem fully while being consistent with how we treat individual residues.

@JoaoRodrigues

Copy link
Copy Markdown
Contributor

User chiming in :) The file I gave you is not 5UIW, but a model based on that structure, so is does not have any information on the header. What about about checking for CYS pairs that are within a certain distance threshold and with the SG atoms pointing towards (or generally towards) each other?

@peastman

peastman commented Apr 6, 2018

Copy link
Copy Markdown
Member Author

What about about checking for CYS pairs that are within a certain distance threshold and with the SG atoms pointing towards (or generally towards) each other?

That's basically what it does for identifying disulfide bonds when building the topology. Now we just need it to be smarter about figuring out where to apply patches!

Currently when deciding whether a patch can be applied to a set of residues, it just matches up atoms, bonds, and external bonds. But you're right that the patch definition contains more information than that. It specifies not just that an external bond should be added to each SG, but also that they should connect to each other. Enforcing that would remove the ambiguity.

@peastman

Copy link
Copy Markdown
Member Author

Any progress on testing this? It's the bottleneck to releasing 7.2.2.

@jchodera

Copy link
Copy Markdown
Member

Will hopefully finish today!

@jchodera

Copy link
Copy Markdown
Member

I can confirm that the addition of this patch doesn't break any of the existing protein tests:

Testing 1VII protein
----------------------------------------------------------------------------------------------------
OpenMM CHARMM reader total energy:       4077.812
ParmEd CHARMM reader total energy:       4077.812
OPENMM ffxml total energy:               4077.814
TOTAL ERROR:                                0.002
----------------------------------------------------------------------------------------------------
Testing 1VII solvated
----------------------------------------------------------------------------------------------------
OpenMM CHARMM reader total energy:    -161010.709
ParmEd CHARMM reader total energy:    -161010.646
OPENMM ffxml total energy:            -161010.646
TOTAL ERROR:                                0.062
----------------------------------------------------------------------------------------------------
Testing 7DFR solvated
----------------------------------------------------------------------------------------------------
OpenMM CHARMM reader total energy:    -303736.889
ParmEd CHARMM reader total energy:    -303736.764
OPENMM ffxml total energy:            -303735.014
TOTAL ERROR:                                1.875
----------------------------------------------------------------------------------------------------

I've also added a new test: barnase (1BNG, chain A) with a single disufilde bond, in vacuum. I used CHARMM-GUI to generate the CHARMM input files and then manually added the missing CONECT record to the PDB file (since CHARMM-GUI does not generate PDB-compliant files).

The PDB file is in the attached archive, along with a short test script that ensures createSystem() can be called:
test-DISU.zip

For some reason, it appears that there's a significant energy discrepancy introduced in this test protein (1BNG):

----------------------------------------------------------------------------------------------------
OpenMM CHARMM reader total energy:        657.946
ParmEd CHARMM reader total energy:        657.946
OPENMM ffxml total energy:                627.441
TOTAL ERROR:                              -30.505
----------------------------------------------------------------------------------------------------

The PR with these changes is here:
openmm/openmmforcefields#63

The test run is cd charmm ; python test_charmm.py --verbose.

It will take some time to automate the energy component breakdown to see which term is causing trouble, but here are the serialized System XML files in case a quick manual inspection can turn up the culprit:
1BNG.zip

@peastman

Copy link
Copy Markdown
Member Author

There's a small difference in the PeriodicTorsionForces (4222 vs. 4227 kJ/mol), and a larger difference in the nonbonded forces (-5824 vs. -5859 kJ/mol). For 1BNG protein.charmm.system.xml, all the nonbonded interactions are handled with a single NonbondedForce. For 1BNG protein.openmm.system.xml, it's split between a NonbondedForce, a CustomNonbondedForce, and a CustomBondForce. If the psf reader isn't creating a CustomNonbondedForce, that implies the system has no nbfix terms. Is that really the case?

@peastman

Copy link
Copy Markdown
Member Author

I see the problem. Just a silly mistake.

@jchodera

Copy link
Copy Markdown
Member

That did the trick!

Testing 1BNG protein
----------------------------------------------------------------------------------------------------
OpenMM CHARMM reader total energy:        657.946
ParmEd CHARMM reader total energy:        657.946
OPENMM ffxml total energy:                657.972
TOTAL ERROR:                                0.026
----------------------------------------------------------------------------------------------------

@peastman

Copy link
Copy Markdown
Member Author

Great, thanks!

@peastman peastman merged commit acc9e93 into openmm:master Apr 17, 2018
@peastman peastman deleted the disulfide branch April 17, 2018 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Disulfide bonds are broken in CHARMM36

3 participants