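Bioon report: According to an article published online in Nature Chemical Biology, a research team led by Dr. John A. Gerlt of the University of Illinois at Urbana-Champaign has developed a new approach for determining the structure and function of proteins whose amino acid sequences are known.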
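According to the report, this is the first time a computer program has been used to accurately predict a protein's function from its amino acid sequence. The "in silico" predictions made by Gerlt and colleagues were confirmed by laboratory enzyme assays and X-ray crystallography.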
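In outline, the method searches databases of proteins with known sequences for proteins whose amino acid sequences are highly homologous to that of the uncharacterized protein (in this study, the most similar characterized protein shared 35% sequence identity). The three-dimensional structure of the closest match is then used to analyze the function of the unknown protein. The approach makes it more efficient to identify the biological roles of proteins of unknown function.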
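The template-selection step described above can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length (with `-` marking gaps) and simply picks the candidate with the highest percent identity. The function names and the toy sequences are hypothetical and are not taken from the study, which would have relied on dedicated alignment and homology-modeling tools.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def closest_template(query: str, templates: dict) -> tuple:
    """Return (name, identity) for the template most similar to the query."""
    scored = {name: percent_identity(query, seq) for name, seq in templates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Toy, made-up sequences used purely for illustration.
query = "MKTAYIAKQR"
templates = {
    "template_A": "MKTAYLAKQR",
    "template_B": "MRSAY-AKHR",
    "template_C": "GGGGGGGGGG",
}

name, identity = closest_template(query, templates)
print(f"{name}: {identity:.1f}% identity")
```

In practice the chosen template's structure would then be passed to a homology-modeling program; that step is outside the scope of this sketch.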
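Using the structural data from this homology model, the team ran computational molecular docking experiments to quickly assess whether the unknown protein preferentially binds any of the candidate target molecules or substrates. Identifying the substrates that bind a protein of interest is essential to understanding its function. Gerlt said that instead of having to test 30,000 compounds in the laboratory to determine whether they are substrates, with this method one only needs to test about 10.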
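The idea of narrowing a large compound library to a handful of experimental candidates can be sketched as a simple ranking step. The example below assumes each compound already has a docking score and that lower (more negative) scores indicate more favorable predicted binding, a common but not universal convention. The ligand names and score values are made-up placeholders, not results from the study.

```python
import heapq


def shortlist(scores: dict, n: int = 10) -> list:
    """Return the n (ligand, score) pairs with the lowest docking scores.

    Assumes lower scores mean more favorable predicted binding.
    """
    return heapq.nsmallest(n, scores.items(), key=lambda item: item[1])


# Hypothetical docking scores for a tiny stand-in library.
docking_scores = {
    "ligand_001": -3.2,
    "ligand_002": -5.8,
    "ligand_003": -9.1,
    "ligand_004": -1.0,
    "ligand_005": -7.4,
}

# Pick the two best-scoring compounds to test at the bench.
for ligand, score in shortlist(docking_scores, n=2):
    print(ligand, score)
```

With a real library of tens of thousands of compounds, the same call with `n=10` would yield the short list of candidates to carry into enzymatic assays.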
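The researchers used the enolase superfamily as their model system. The superfamily has more than 3,000 members, most of whose functions are poorly understood. Enolases catalyze the breakdown of glucose and related compounds into other molecules needed for metabolism. They share a similar catalytic mechanism but act on different substrates, which makes their functions difficult to determine. (The new study found that one family of enolase proteins had been annotated incorrectly.) Gerlt and his colleagues expect that their computational method will make it possible to study the functions of these and other unknown proteins more efficiently.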
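Figure: The homology-modeled complex of the enolase BC0371 with its substrate N-succinyl-L-arginine (cyan) closely matches the structure determined by X-ray crystallography (yellow).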
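Note: "In silico" is a relatively new term in bioinformatics, coined by analogy with in vivo (in the living organism) and in vitro (outside the organism), that emphasizes the use of computers to solve biological problems. In silico biology is an interdisciplinary field combining biology and informatics, and is regarded as the logical successor to traditional in vitro and in vivo laboratory research.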
Original source:
Nature Chemical Biology 3, 486-491 (2007)
doi:10.1038/nchembio.2007.11
Prediction and assignment of function for a divergent N-succinyl amino acid racemase
Ling Song, Chakrapani Kalyanaraman, Alexander A Fedorov, Elena V Fedorov, Margaret E Glasner, Shoshana Brown, Heidi J Imker, Patricia C Babbitt, Steven C Almo, Matthew P Jacobson & John A Gerlt
Abstract
The protein databases contain many proteins with unknown function. A computational approach for predicting ligand specificity that requires only the sequence of the unknown protein would be valuable for directing experiment-based assignment of function. We focused on a family of unknown proteins in the mechanistically diverse enolase superfamily and used two approaches to assign function: (i) enzymatic assays using libraries of potential substrates, and (ii) in silico docking of the same libraries using a homology model based on the most similar (35% sequence identity) characterized protein. The results matched closely; an experimentally determined structure confirmed the predicted structure of the substrate-liganded complex. We assigned the N-succinyl arginine/lysine racemase function to the family, correcting the annotation (L-Ala-D/L-Glu epimerase) based on the function of the most similar characterized homolog. These studies establish that ligand docking to a homology model can facilitate functional assignment of unknown proteins by restricting the identities of the possible substrates that must be experimentally tested.