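Computational biologists at Carnegie Mellon University have developed an analytical technique for detecting the multiple genetic variations that contribute to complex disease syndromes, such as diabetes, asthma, and cancer, which are characterized by many clinical and molecular traits.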
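Rather than searching for one genetic variation at a time that causes a particular symptom or trait, as most traditional methods do, the Carnegie Mellon scientists use a statistical approach that lets them discover the genomic variations underlying entire networks of gene regulation or traits that give rise to complex diseases.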
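Professor Eric P. Xing and postdoctoral researcher Seyoung Kim report in the journal PLoS Genetics, published online today, that their graph-guided fused lasso (GFlasso) method outperforms other methods in detecting genetic variations associated with complex syndromes. In one test, GFlasso successfully detected a genetic variation already known to be involved in severe asthma, along with two additional genes not previously linked to the disease. Xing and Kim say that more research on these two variations is needed to confirm the link.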
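“We know that some of the most common and most serious diseases afflicting humans are not caused by a single genetic mutation but by a combination of many genetic and environmental factors,” said Xing, an associate professor of machine learning, language technologies, and computer science. “Complicating matters, most complex diseases have a large number of clinical traits, such as various symptoms, physical characteristics, and family history, and genome-wide gene expression profiling can reveal tens of thousands of disease-related molecular traits.”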
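Often, many of these traits are correlated. For example, high blood pressure and high body weight may share some of the same genetic factors. Xing said that if one tests each genetic variation against each trait one pair at a time, as traditional methods do, the number of tests is enormous, and information about the genetic factors shared by correlated traits is not properly used, which leads to a loss of statistical power. “So we are unlikely to uncover the root causes of diseases such as cancer, diabetes, and asthma one gene and one trait at a time,” he said. “Instead, we need tools such as GFlasso that let us look for associations between gene networks and clinical traits.”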
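Severe asthma, for example, has more than 50 clinical traits, some related to environment or activity level, some to wheezing and chest tightness, and others to lung physiology. In the PLoS Genetics paper, Xing and Kim note that some of these traits are highly correlated with one another, which suggests that they share a common genetic basis. Their technique exploits these highly correlated traits by analyzing them jointly. The approach also helps detect genetic variations that might otherwise be missed because their effect on any single trait is relatively subtle, yet that matter because they contribute to several correlated traits.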
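“This approach will provide a more comprehensive genetic and molecular view of complex diseases,” Xing said. “As a result, we can discover the genes behind disease processes, understand the role genes play in determining disease severity, and develop improved means of diagnosis.”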
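Xing is a member of Carnegie Mellon's Ray and Stephanie Lane Center for Computational Biology. As part of an ongoing study supported by the National Institutes of Health, he is working with colleagues at the University of Pittsburgh School of Medicine and Harvard Medical School to study severe asthma using GFlasso. (Bioon.com)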
Original source recommended by Bioon:
PLoS Genet 5(8): e1000587. doi:10.1371/journal.pgen.1000587
Statistical Estimation of Correlated Genome Associations to a Quantitative Trait Network
Seyoung Kim, Eric P. Xing*
School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
Many complex disease syndromes, such as asthma, consist of a large number of highly related, rather than independent, clinical or molecular phenotypes. This raises a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. In this study, we propose a new statistical framework called graph-guided fused lasso (GFlasso) to directly and effectively incorporate the correlation structure of multiple quantitative traits such as clinical metrics and gene expressions in association analysis. Our approach represents correlation information explicitly among the quantitative traits as a quantitative trait network (QTN) and then leverages this network to encode structured regularization functions in a multivariate regression model over the genotypes and traits. The result is that the genetic markers that jointly influence subgroups of highly correlated traits can be detected jointly with high sensitivity and specificity. While most of the traditional methods examined each phenotype independently and combined the results afterwards, our approach analyzes all of the traits jointly in a single statistical framework. This allows our method to borrow information across correlated phenotypes to discover the genetic markers that perturb a subset of the correlated traits synergistically. Using simulated datasets based on the HapMap consortium and an asthma dataset, we compared the performance of our method with other methods based on single-marker analysis and regression-based methods that do not use any of the relational information in the traits. We found that our method showed an increased power in detecting causal variants affecting correlated traits. 
Our results showed that, when correlation patterns among traits in a QTN are considered explicitly and directly during a structured multivariate genome association analysis using our proposed methods, the power of detecting true causal SNPs with possibly pleiotropic effects increased significantly without compromising performance on non-pleiotropic SNPs.
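The abstract describes a multivariate regression over genotypes and traits whose regularization is structured by a quantitative trait network. As a rough illustration only, the sketch below evaluates one objective of that kind in Python with NumPy: a squared-error loss, a lasso penalty on all coefficients, and a fusion penalty that pulls a marker's coefficients for two linked traits toward each other (with the sign flipped for negatively correlated traits). The function names, the correlation threshold used to build the network, and the choice of weighting each edge by the absolute correlation are assumptions made for this example; the paper itself should be consulted for the authors' exact definitions and optimization procedure.

```python
import numpy as np


def correlation_edges(Y, threshold):
    """Build a hypothetical trait network by thresholding trait correlations.

    Y is an (n_samples, n_traits) array. Returns a list of (m, l, r_ml)
    tuples for every trait pair whose absolute correlation exceeds threshold.
    """
    R = np.corrcoef(Y, rowvar=False)
    n_traits = R.shape[0]
    return [
        (m, l, R[m, l])
        for m in range(n_traits)
        for l in range(m + 1, n_traits)
        if abs(R[m, l]) > threshold
    ]


def gflasso_objective(X, Y, B, edges, lam, gamma):
    """Evaluate a graph-guided fused-lasso style objective (illustrative).

    X: (n_samples, n_markers) genotypes
    Y: (n_samples, n_traits) quantitative traits
    B: (n_markers, n_traits) regression coefficients
    edges: list of (m, l, r_ml) trait pairs with their correlation
    lam, gamma: weights of the lasso and fusion penalties
    """
    residual = Y - X @ B
    loss = np.sum(residual ** 2)

    # Lasso term: encourages most marker-trait coefficients to be zero.
    lasso = lam * np.sum(np.abs(B))

    # Fusion term: for each linked trait pair, penalize differences between
    # the two traits' coefficient columns, weighted by |r_ml| (an assumption).
    fusion = 0.0
    for m, l, r in edges:
        fusion += abs(r) * np.sum(np.abs(B[:, m] - np.sign(r) * B[:, l]))

    return loss + lasso + gamma * fusion
```

Under this objective, a coefficient matrix in which a marker affects two correlated traits equally incurs no fusion penalty, which is one way to read the abstract's statement that the method borrows information across correlated phenotypes. Minimizing such an objective requires a dedicated solver, which this sketch does not attempt.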