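Recently, the well-known international journal Human Molecular Genetics published online the latest research by Professor Li Jin and colleagues at the Institute of Computational Biology of the Shanghai Institutes for Biological Sciences, "A Systematic Characterization of Genes Underlying both Complex and Mendelian Diseases". The study systematically analyzed the characteristics of human disease-related genes and compared them comprehensively with other categories of genes. Its results are of theoretical significance for understanding the properties of human disease genes; the origin, genomic distribution, natural selection, and evolutionary mechanisms of pathogenic genetic variants; and the expression and regulatory network patterns of disease genes. They also offer reference value and guidance for the experimental design of complex disease studies.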
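Traditionally, genetic diseases are divided into rare Mendelian diseases and more common complex diseases (also called common diseases). Mendelian diseases are usually controlled by a single gene, have a very low incidence in the population, and show strong familial aggregation; examples include sickle cell anemia, albinism, color blindness, phenylketonuria, hemophilia, and brachydactyly. The single-gene variants that cause Mendelian diseases have strong effects, and these diseases follow Mendel's laws of inheritance as they are transmitted through pedigrees. Nearly all such disease genes have been identified through linkage analysis based on pedigree data. Complex diseases are usually controlled by multiple genes and have a high incidence in the population, which is why they are also called common diseases; examples include cancer, hypertension, diabetes, asthma, and schizophrenia. Complex diseases arise from the combined action of multiple factors. Their inheritance patterns are complex and do not follow typical Mendelian laws, each gene variant has only a weak effect, and they are strongly influenced by the overall genetic background of the population and the individual. These disease genes have mainly been identified through association studies. In recent years, numerous genome-wide association studies (GWAS) have uncovered many genes and genetic variants associated with complex diseases.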
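However, after comparing the most recent Mendelian disease gene database with the complex disease gene database, the study found that Mendelian disease genes and complex disease genes are not as sharply separated as the usual classification suggests. On the contrary, the two classes overlap substantially: many genes are associated with both types of disease (provisionally called "dual-association genes" here; the MC genes of the paper's abstract), and their number is about 8 times greater than expected under a statistical assumption of random overlap. To characterize these dual-association genes, the study classified known human genes by functional importance and relationship to disease. Besides dual-association genes, the categories were essential genes, monogenic disease genes (excluding genes associated with complex diseases), complex disease genes (excluding genes associated with monogenic diseases), and other genes, and the study compared them systematically. It found that dual-association genes and complex disease genes have both been subject to recent positive natural selection, whereas essential genes and monogenic disease genes have been subject to relatively strong negative selection. Analysis of between-species divergence data showed that essential genes are consistently the most conserved, which supports the view that essential genes have been under the strongest negative selection throughout their long evolutionary history. Dual-association genes rank second in conservation, suggesting that they too have been under relatively strong negative selection during evolution. The study also compared the gene categories in terms of gene expression patterns, gene structure, protein-protein interactions, and population differentiation. Based on these analyses, the study infers that many features of dual-association genes are related to their dual role in complex and monogenic diseases. This is the first study to systematically analyze the characteristics of dual-association genes, and its results also offer new insights into the other four gene categories. For example, the study found that many complex disease genes fall within copy number variation (CNV) regions, indicating that CNVs may play an important role in the genetic basis of many complex diseases. Enrichment analysis of CNVs across the gene categories also supports the conclusion that dual-association genes have been subject to both relatively strong positive selection and negative selection.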
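The comparison between the observed overlap and the number expected under random overlap can be illustrated with a small calculation. The following is a minimal sketch and not the authors' method: it assumes the expectation is obtained by treating the two gene lists as independent samples from one gene universe, and it uses a hypergeometric tail probability as one common way to judge such an excess. The function names and the toy counts are hypothetical and are not figures from the study.

```python
from math import comb


def expected_overlap(n_total, n_a, n_b):
    """Expected size of the intersection of two gene sets of sizes n_a and
    n_b drawn independently from a universe of n_total genes."""
    return n_a * n_b / n_total


def fold_enrichment(observed, n_total, n_a, n_b):
    """Ratio of the observed overlap to the overlap expected by chance."""
    return observed / expected_overlap(n_total, n_a, n_b)


def hypergeom_tail(observed, n_total, n_a, n_b):
    """P(overlap >= observed) when n_b genes are drawn at random, without
    replacement, from n_total genes of which n_a belong to the first set."""
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(n_total, n_b)


if __name__ == "__main__":
    # Toy numbers for illustration only (NOT taken from the paper):
    # a universe of 100 genes, 20 in set A, 10 in set B, 6 shared.
    print(expected_overlap(100, 20, 10))    # 2.0
    print(fold_enrichment(6, 100, 20, 10))  # 3.0
    print(hypergeom_tail(6, 100, 20, 10))
```

With real data, the three inputs would be the size of the gene universe and the sizes of the two disease gene lists; the result is sensitive to how the universe is defined, so that choice should be stated alongside any reported fold enrichment.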
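The work was carried out jointly by PhD students Wenfei Jin, Pengfei Qin, and Haiyi Lou under the supervision of Professor Li Jin and Researcher Shuhua Xu. It was supported by grants from the National Natural Science Foundation of China, the Science and Technology Commission of Shanghai Municipality, the Chinese Academy of Sciences, the Max Planck Society (Germany), and the K. C. Wong Education Foundation (Hong Kong), among others. (生物谷 Bioon.com)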
doi:10.1093/hmg/ddr599
A Systematic Characterization of Genes Underlying both Complex and Mendelian Diseases
Wenfei Jin, Pengfei Qin, Haiyi Lou, Li Jin* and Shuhua Xu*
Traditionally, genetic disorders have been classified as either Mendelian or complex diseases. This nosology has greatly benefited genetic counseling and the development of gene mapping strategies. However, based on two well-established databases, we identified that 54% (524 of 968) of the Mendelian diseases genes were also involved in complex diseases, and this kind of genes has not been systematically analyzed. Here, we classified human genes into five categories: Mendelian and complex diseases (MC) genes, Mendelian but not complex diseases (MNC) genes, complex but not Mendelian diseases (CNM) genes, essential genes and OTHER genes. Firstly, we found that MC genes were associated with more diseases and phenotypes, and were involved in more complex protein-protein interaction network than MNC or CNM genes on average. Secondly, MC genes encoded the longest proteins and had the highest transcript count among all gene categories. Especially, tissue specificity of MC genes was much higher than that of any other gene categories (P < 7.5×10⁻⁵), although their expression level was similar to that of essential genes. Thirdly, evidences from different aspects supported that MC genes have been subjected to both purifying and positive selection. Interestingly, functions of some human disease genes might be different from those of their orthologous genes in non-primate-mammalians since they were even less conserved than OTHER genes. The significant over-representation of CNVs in CNM genes suggested the important roles of CNVs in complex diseases. In brief, our study not only revealed the characteristics of MC genes, but also provided new insights into the other four gene categories.