Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about free enegry uncertainties #304

Open
harlor opened this issue Apr 24, 2018 · 4 comments
Open

Question about free enegry uncertainties #304

harlor opened this issue Apr 24, 2018 · 4 comments

Comments

@harlor
Copy link

harlor commented Apr 24, 2018

I noticed that some people (for example: https://github.com/MobleyLab/alchemical-analysis/blob/37b21ebb6d0495990c43595d9ade10c779c3ab08/alchemical_analysis/alchemical_analysis.py#L650) just use the sum of the squares of uncertainties between intermediate states to calculate the uncertainty of the total difference in free energy. This requires the free energy differences to be uncorrelated correct?

But is this really the case? We know that we (can) have nonzero contributions in the covariance matrix (eq 12 https://aip.scitation.org/doi/10.1063/1.2978177).

In my random example this makes a huge difference in the uncertainty of the free energy difference:

using (eq 12) - np.sqrt(Theta[0, 0] - 2*Theta[0, -1] + Theta[-1, -1])
err: 0.072118

using sum of the squares - alchemical-analysis:
err: 0.032283

@harlor
Copy link
Author

harlor commented Apr 24, 2018

Btw. I can repoduce the alchemical-analysis value using:

var = 0.0
for i in range(K-1):
  var += Theta[i, i] - 2*Theta[i, i+1] + Theta[i+1, i+1]
print('err2: %f' % (np.sqrt(var) / B))

err2: 0.032283

@jchodera
Copy link
Member

You are correct that using the sum of squares of uncertainties between sequential pairs of intermediate states requires the data used to estimate those differences to be uncorrelated! Since practitioners typically use the data for an intermediate state i to estimate both (i-1, i) and (i,i+1) free energies, this condition is generally violated, and the uncertainty can be underestimated.

Pymbar should give a superior estimate of the uncertainty in this case because, as you point out, it can include the covariance not only in sequential pairs of free energy differences, but in the free energy estimate obtained from all the data jointly.

@harlor
Copy link
Author

harlor commented Apr 28, 2018

Thanks for your fast reply! You convinced me that it is a good idea to change this.

@SmithUoG
Copy link

If you have the computer resources available, I think that a preferable way to calculate the uncertainty (precision) in particular quantity z obtained using an approximate algorithm is to treat the algorithm the same way that experimentalists treat their experimental apparatus. This means that one would perform replicate independent "experiments" and take the final result as the mean value and its standard deviation as a measure of the uncertainty/precision. An "independent run" is one that, for example, uses different random number seeds wherever they are invoked in the algorithm.

So, I would rather calculate uncertainties this way (from whatever method I'm using, whether it be BAR, MBAR, TI, ...) and report its value. In our lab, we typically perform 10 such independent runs for a given quantity and report the uncertainty (the value "10" is both convenient and provides a not unreasonable approximation of a normal distribution by a t distribution). This applies to the calculation of free energy quantities, as well as to PVT properties (as a replacement for block averaging).

As an example, we have calculated (at 1 bar and 298.15K) 10 independent replicate Delta G(hyd) values for MEA in TIP3P water using the BAR method, and obtained an uncertainty of 0.09 kJ/mol. The reported individual BAR values for the uncertainty of each run ranged from 0.04 to 0.10.

For some discussion of this and related issues, see A. Nicholls, J Comput Aided Mol Des (2014) 28:887–918, and (2016) 30:103–126.

William R. Smith, Un. of Guelph

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants