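Bioon report: According to an article published online in Nature Chemical Biology, a research team led by Dr. John A. Gerlt of the University of Illinois has developed a new approach for determining the structure and function of proteins whose amino acid sequences are known.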
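This is the first time a computer program has been used to accurately predict a protein's function from its amino acid sequence. The "in silico" predictions made by Gerlt and colleagues were confirmed by laboratory enzyme assays and X-ray crystallography.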
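The procedure is roughly as follows: search databases of proteins with known sequences for proteins whose amino acid sequences are highly homologous to that of the unknown protein, then use the three-dimensional structure of the closest match to analyze the function of the unknown protein. This makes it more efficient to identify the biological roles of proteins of unknown function.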
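The template-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the sequences have already been aligned (gaps written as `-`), and the function names and toy sequences (`QUERY`, `TEMPLATES`) are hypothetical. A real search would run an alignment tool against a sequence database.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def closest_template(query: str, templates: dict[str, str]) -> tuple[str, float]:
    """Return the name and percent identity of the template most similar to the query."""
    best = max(templates, key=lambda name: percent_identity(query, templates[name]))
    return best, percent_identity(query, templates[best])


# Hypothetical, pre-aligned toy sequences (not real proteins).
QUERY = "MKTAYIAKQR"
TEMPLATES = {
    "template_A": "MKTAYLAKQR",
    "template_B": "MRSGYIVKNR",
    "template_C": "MKT-YIGKQR",
}

name, identity = closest_template(QUERY, TEMPLATES)
print(f"closest template: {name} ({identity:.1f}% identity)")
```

The structure of the selected template would then serve as the basis for a homology model of the unknown protein. In the study itself, the most similar characterized protein shared only 35% sequence identity with the target.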
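Using structural data from this homology model, the team carried out computational molecular docking experiments to quickly assess whether the unknown protein preferentially binds any potential target molecule or substrate. Identifying the substrates that bind the protein of interest is essential for understanding its function. Gerlt said that instead of testing 30,000 compounds in the laboratory to determine whether they are substrates, with this approach you only need to test 10.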
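The narrowing from a large library to a handful of candidates amounts to ranking ligands by docking score and keeping the best few for experimental testing. The sketch below illustrates only that ranking step. It assumes a docking program has already produced one score per ligand and that lower scores mean more favorable predicted binding (a common convention, but an assumption here). The ligand names and scores are invented for illustration.

```python
import heapq


def shortlist(scores: dict[str, float], n: int = 10) -> list[str]:
    """Return the n ligands with the lowest (most favorable) docking scores."""
    # Iterating a dict yields its keys; scores.get maps each key to its score.
    return heapq.nsmallest(n, scores, key=scores.get)


# Hypothetical docking scores; lower is assumed to mean better predicted binding.
toy_scores = {
    "ligand_a": -9.2,
    "ligand_b": -4.1,
    "ligand_c": -7.5,
    "ligand_d": -1.0,
}

# Keep only the two best-scoring candidates for follow-up assays.
candidates = shortlist(toy_scores, n=2)
print(candidates)
```

With a library of 30,000 compounds and `n=10`, the same call would yield the kind of short candidate list Gerlt describes. Whether those candidates are true substrates still has to be confirmed by enzyme assays.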
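The researchers used the enolase superfamily as their model system. This superfamily has more than 3,000 members, most of whose functions are not well understood. Enolases catalyze the breakdown of glucose and related compounds into other molecules needed for metabolism; they share a similar catalytic mechanism but act on different substrates, which makes their functions difficult to determine. (The new study found that one family of enolase proteins had been misclassified.) Gerlt and his colleagues expect that the computational method they developed will make it possible to study the functions of these and other unknown proteins more efficiently.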
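Figure: The homology-modeled complex of the enolase superfamily protein BC0371 with its substrate N-succinyl-L-arginine (cyan) closely matches the structure determined by X-ray crystallography (yellow).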
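Note: "In silico" is a new term in bioinformatics, coined by analogy with in vivo (in the living organism) and in vitro (outside the organism), that emphasizes the use of computers to solve biological problems. In silico biology is an interdisciplinary field combining biology and informatics, and is a natural successor to traditional laboratory research carried out in vitro and in vivo.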
Original source:
Nature Chemical Biology 3, 486-491 (2007)
doi:10.1038/nchembio.2007.11
Prediction and assignment of function for a divergent N-succinyl amino acid racemase
Ling Song, Chakrapani Kalyanaraman, Alexander A Fedorov, Elena V Fedorov, Margaret E Glasner, Shoshana Brown, Heidi J Imker, Patricia C Babbitt, Steven C Almo, Matthew P Jacobson & John A Gerlt
Abstract
The protein databases contain many proteins with unknown function. A computational approach for predicting ligand specificity that requires only the sequence of the unknown protein would be valuable for directing experiment-based assignment of function. We focused on a family of unknown proteins in the mechanistically diverse enolase superfamily and used two approaches to assign function: (i) enzymatic assays using libraries of potential substrates, and (ii) in silico docking of the same libraries using a homology model based on the most similar (35% sequence identity) characterized protein. The results matched closely; an experimentally determined structure confirmed the predicted structure of the substrate-liganded complex. We assigned the N-succinyl arginine/lysine racemase function to the family, correcting the annotation (L-Ala-D/L-Glu epimerase) based on the function of the most similar characterized homolog. These studies establish that ligand docking to a homology model can facilitate functional assignment of unknown proteins by restricting the identities of the possible substrates that must be experimentally tested.