Predicting ligand activity against a specific target has long been a hot topic in computational structural biology and drug discovery. Many techniques have been developed or are still being improved for this purpose: from optimized scoring functions to QSAR models; from MM/GBSA to MM/PBSA; from potential of mean force (PMF) to linear interaction energy (LIE); from thermodynamic integration (TI) to free energy perturbation (FEP). More recently, several tools for relative binding free energy calculation based on alchemical methods have been developed, including:
-Schrodinger FEP Mapper
-Amber free energy workflow (FEW)
-Gromacs PMX
-FESetup
-Lead Optimization Mapper (LOMAP)
The basic principle of these tools is the same: build a closed thermodynamic cycle of alchemical mutations between the ligands.
Because free energy is a state function, the free energy changes around such a ligand-mutation cycle should sum to zero:
dG1+dG2+dG3=0
This principle sounds good, and many developers hope that it can improve the accuracy of ligand binding free energy prediction. The "Mapper" concept was therefore introduced into both TI (e.g., Amber FEW) and FEP (e.g., Schrodinger FEP Mapper) calculations.
Mapper network generated by "LOMAP"
However, this new concept has not yet been validated by a large number of users, and its reliability is still uncertain. We recently carried out a systematic test of this technology on a specific target. For comparison, we also ran the widely used MM/PBSA calculation and compared the results.
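The cycle-closure condition can be checked numerically once the relative free energies along the edges of a cycle are known. The following is a minimal sketch, not the implementation used by any of the tools listed above: the function name `cycle_closure_error`, the edge dictionary layout, and the three dG values are all made up for illustration.

```python
def cycle_closure_error(edge_dg, cycle):
    """Sum the signed relative free energies around a closed ligand cycle.

    edge_dg maps (ligand_a, ligand_b) -> dG for mutating a into b.
    An edge traversed against its stored direction contributes -dG.
    """
    total = 0.0
    # Pair each ligand with the next one, wrapping back to the start.
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        if (a, b) in edge_dg:
            total += edge_dg[(a, b)]
        elif (b, a) in edge_dg:
            total -= edge_dg[(b, a)]
        else:
            raise KeyError(f"no edge between {a} and {b}")
    return total


# Made-up values in kcal/mol for a three-ligand cycle A -> B -> C -> A.
edges = {("A", "B"): -1.2, ("B", "C"): 0.5, ("C", "A"): 0.9}
closure = cycle_closure_error(edges, ["A", "B", "C"])
print(f"cycle closure error: {closure:.2f} kcal/mol")
```

A closure error far from zero suggests that at least one edge of the cycle is poorly converged; how large an error is acceptable is a judgment call that depends on the system.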
Result 1
A testing set of 10 compounds on a 4-GPU workstation:

                          Mapper-based methods    MM/PBSA
Efficiency                5 ns/day                50 ns/day
Productivity              0.6 compound/day        20 compounds/day
Total time for testing    18 days                 2 days
Figure legend | Upper: Mapper-based prediction; Lower: MM/PBSA calculation
As the plots above show, although the Mapper-based free energy calculation consumed far more CPU/GPU resources and took much longer to complete, the results did not improve at all: there is almost no correlation between the experimental data and the Mapper predictions. By contrast, the inexpensive MM/PBSA calculation finished much faster, and its accuracy on the testing set is quite good. Notably, the original Amber Mapper paper, published in Journal of Computational Chemistry 2013, 34, 965-973, also reported a poor correlation between Mapper-based free energy calculations and experimental data (R2 = 0.26-0.28 in its tests). There is clearly a lot of room for improvement in this concept.
In addition, we ran two more tests with different classes of compounds against the same target. MM/PBSA gave correlation coefficients of R2 = 0.65-0.85, whereas on the same testing sets the expensive Mapper-based free energy calculation reached only R2 = 0.1-0.2.
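For readers who want to reproduce this kind of comparison, here is a minimal sketch of how an R2 value between predicted and experimental binding free energies can be computed. It assumes that R2 here means the squared Pearson correlation coefficient, which the post does not state explicitly; the function name `r_squared` and the sample numbers are illustrative only.

```python
def r_squared(predicted, experimental):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    # Covariance and variances (the common 1/n factor cancels out).
    s_pe = sum((p - mean_p) * (e - mean_e)
               for p, e in zip(predicted, experimental))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_ee = sum((e - mean_e) ** 2 for e in experimental)
    if s_pp == 0 or s_ee == 0:
        raise ValueError("correlation is undefined for constant data")
    return s_pe * s_pe / (s_pp * s_ee)


# Made-up binding free energies in kcal/mol, for illustration only.
predicted = [-8.1, -7.4, -9.0, -6.2, -7.9]
experimental = [-8.5, -7.0, -9.3, -6.8, -7.5]
print(f"R2 = {r_squared(predicted, experimental):.2f}")
```

Because the squared Pearson coefficient is insensitive to offset and scale, a high value shows that a method ranks compounds well, not that its absolute energies are correct.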
Result 2
A set of 10 compounds is very small, and we may simply have been lucky to obtain a good correlation between the MM/PBSA predictions and the experimental data. To further validate the reliability of MM/PBSA, we increased the testing set to 46 compounds. Here are the results:
As the plot above shows, the correlation dropped markedly as the sample size increased: from the previous R2 = 0.72 to 0.34. We would also have liked to run the Mapper-based prediction on the larger set, but 46 compounds would take us about 80 days. Moreover, since correlations on large sets are usually much poorer than on small ones, the Mapper-based results are unlikely to be better than those on the small testing set shown in Result 1. It is also extremely difficult to generate a Mapper that connects 46 compounds in one big cycle. It is therefore impractical to perform Mapper-based free energy calculations on such a large set. Even if it could be done, the closure errors of the inner cycles would probably not be minimized, which could make the predictions even worse.
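The 80-day figure is consistent with a simple back-of-the-envelope estimate from the productivity measured in Result 1. The sketch below assumes that wall-clock time scales linearly with the number of compounds, which ignores the extra edges a larger Mapper network would need; the function name `estimated_days` is hypothetical.

```python
def estimated_days(n_compounds, compounds_per_day):
    """Wall-clock days needed, assuming linear scaling with set size."""
    if compounds_per_day <= 0:
        raise ValueError("throughput must be positive")
    return n_compounds / compounds_per_day


# Throughputs taken from the Result 1 table (4-GPU workstation).
mapper_days = estimated_days(46, 0.6)
mmpbsa_days = estimated_days(46, 20)
print(f"Mapper-based: about {mapper_days:.0f} days")
print(f"MM/PBSA:      about {mmpbsa_days:.1f} days")
```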
Result 3
Can we somehow improve the MM/PBSA results, especially when the sample is large? The answer is "YES"! We took each energy term of the MM/PBSA calculation as a descriptor and introduced a QSAR concept: 25 compounds were used as the "training set" and the rest as the "testing set". Using multiple linear regression, we obtained excellent results:
As we can see, the R2 for the "testing set" increased from 0.34 to 0.60, while the training-set and overall R2 are 0.94 and 0.78, respectively. That is a really remarkable improvement.
Interestingly, when we ran QSAR alone (without the MM/PBSA terms) using two popular commercial software packages, both failed: the training-set R2 was 0.3-0.5, but the testing-set R2 for both packages was only about 0.1 or zero, no matter how we tuned the parameters.
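The idea of regressing activity on the individual MM/PBSA energy terms can be sketched as follows. This is an illustration under stated assumptions, not the workflow used for the results above: the post does not say which terms were used, so the two descriptor columns (labelled here as van der Waals and electrostatic energy) are an assumption, the data are synthetic, and `fit_mlr` and `predict` are hypothetical helper names. The fit is ordinary least squares solved through the normal equations in pure Python.

```python
def fit_mlr(X, y):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(M[k][c]))
        if abs(M[piv][c]) < 1e-12:
            raise ValueError("descriptors are linearly dependent")
        M[c], M[piv] = M[piv], M[c]
        for k in range(p):
            if k != c:
                f = M[k][c] / M[c][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict(coef, X):
    """Apply fitted coefficients to new descriptor rows."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


# Synthetic training set: columns are assumed to be (vdW, electrostatic).
# Activities follow y = 1 + 2*vdW - 3*ele exactly, so the fit is exact.
X_train = [[1, 0], [0, 1], [1, 1], [2, 1], [1, 3]]
y_train = [3, -2, 0, 2, -6]
coef = fit_mlr(X_train, y_train)

# Held-out "testing set" compounds are predicted with the fitted model.
X_test = [[3, 2], [0, 0]]
y_pred = predict(coef, X_test)
print(coef, y_pred)
```

In practice the model would be fitted on the training compounds only and then judged by the R2 on the held-out testing compounds, as in the 25/21 split described above.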
Conclusion
Although Mapper-based alchemical free energy calculations are quite popular and heavily advertised these days, in our tests they did not actually improve ligand binding free energy predictions. By contrast, the inexpensive MM/PBSA method can work very well, at least for certain targets, though a benchmark should always be run for validation. Combining MM/PBSA with QSAR can improve the prediction accuracy dramatically.