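Using molecular docking simulation to find suitable compounds in a database has become an important part of virtual drug screening. However, the scoring functions used in molecular docking still leave much room for improvement. Current scoring functions all run into the problem of so-called outliers. Outlier complexes whose actual energy is much higher than the predicted energy are especially important in molecular docking. This paper proposes a scoring function based on a non-linear function model, together with an automatic outlier detection mechanism. On a dataset of 607 protein-ligand complexes, the non-linear scoring function achieves an RMSE (root-mean-squared error) of 2.13 kcal/mol, a better result than the 3.543 kcal/mol obtained by the AutoDock program on the same dataset. Applying automatic outlier detection further reduces the RMSE to 2 kcal/mol. As the results show, the new scoring function, aided by outlier detection, can provide more clues for future biochemical analysis.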
Virtual screening by molecular docking has become a crucial component of hit identification and lead optimization against very large compound libraries, but there is still much room for improvement in the design of scoring functions. The most common problem with existing scoring functions is the existence of “outliers”. Outliers in molecular docking can be very important and interesting, especially when the observed biological activity is higher than the activity predicted by the scoring function. This article proposes a non-linear scoring function along with an outlier detection mechanism. The evaluation compares it against the scoring function incorporated in the well-known AutoDock docking package. On a testing dataset of 607 protein-ligand complexes, the proposed non-linear scoring function has an RMSE (root-mean-squared error) of 2.13 kcal/mol, which is lower than that of the scoring function in AutoDock (3.453 kcal/mol). Moreover, with the proposed outlier detection mechanism, the RMSE improves to 2.0 kcal/mol. As a result, the proposed scoring function with outlier detection improves scoring quality and provides valuable clues for further biochemical analysis.
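To make the reported metric concrete, the following is a minimal sketch of how an RMSE over predicted and observed binding energies might be computed, and how it changes once flagged outliers are set aside. The abstract does not describe the outlier detection mechanism, so the fixed residual threshold in `split_outliers` is a hypothetical stand-in and not the proposed method; the function names and the numeric values are likewise illustrative assumptions rather than data from the study.

```python
import math


def rmse(predicted, observed):
    """Root-mean-squared error between predicted and observed energies (kcal/mol)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def split_outliers(predicted, observed, threshold):
    """Hypothetical rule (an assumption, not the paper's mechanism):
    flag complexes whose absolute residual exceeds `threshold`.
    Returns (kept_indices, flagged_indices)."""
    kept, flagged = [], []
    for i, (p, o) in enumerate(zip(predicted, observed)):
        if abs(p - o) > threshold:
            flagged.append(i)
        else:
            kept.append(i)
    return kept, flagged


# Made-up illustrative energies in kcal/mol, not values from the dataset.
predicted = [-7.0, -8.5, -6.0, -9.0]
observed = [-8.0, -8.0, -6.5, -3.0]

rmse_all = rmse(predicted, observed)

# Recompute the error on the complexes that were not flagged.
kept, flagged = split_outliers(predicted, observed, threshold=3.0)
rmse_kept = rmse([predicted[i] for i in kept], [observed[i] for i in kept])

print(f"RMSE (all complexes): {rmse_all:.3f} kcal/mol")
print(f"RMSE (outliers set aside): {rmse_kept:.3f} kcal/mol; flagged: {flagged}")
```

In this toy example a single complex with a large residual dominates the error, which is the kind of effect that motivates treating outliers separately instead of letting them distort the overall score.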