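Researchers at Duke University have created the first map of imprinted genes across the human genome, and they say the key to their success was a form of artificial intelligence called machine learning, which served as a modern-day Rosetta stone. The study identified four times as many imprinted genes as had previously been recognized, and it will be featured on the cover of the December 3 issue of Genome Research.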
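Imprinted genes are genes whose alleles are expressed depending on whether they sit on the paternally derived or the maternally derived chromosome. Paternal and maternal imprinted genes differ, and when sperm and egg fuse, the imprinted genes of both parents must be present or development will be abnormal. Genomic imprinting is a special genetic phenomenon in which allele expression depends on the sex of the parent of origin, so it does not follow Mendel's laws of inheritance. Abnormal regulation of imprinting can cause certain hereditary diseases.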
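In classical genetics, a child inherits two copies of each gene, one from the father and one from the mother, and the active forms of both copies influence the child's development. When a gene is imprinted, however, one of the two copies is switched off by molecular regulation originating from the mother or the father, which means the child inherits information from only one copy of the gene. Such children are more vulnerable to environmental stress: if the single functional copy is damaged or lost, there is no backup to take its place.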
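Randy Jirtle, Ph.D., a geneticist in Duke's Departments of Radiation Oncology and Pathology, said, "Imprinted genes have always been a mystery, partly because they do not follow the conventional rules of inheritance. We hope this new roadmap will help us and other researchers discover more about how these genes affect our health."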
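Working with co-authors Alexander Hartemink and Philippe Luedi, Jirtle fed sequence data from two classes of genes, one known to be imprinted and one not, into a computer and used a program to find the differences between them. This machine learning analysis produced an algorithm that, like the original Rosetta stone, can decode seemingly incomprehensible data, in this case the specific DNA sequences that point to imprinted genes.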
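The two-class setup described above can be pictured with a very small sketch. This is not the Duke team's algorithm: the article does not specify their classifiers or features, so the sketch swaps in a plain nearest-centroid classifier, and the two features (GC fraction and CpG density), the function names, and the short DNA strings are all hypothetical and chosen only for illustration. They make no biological claim about what imprinted genes look like.

```python
from statistics import mean

def features(seq):
    """Turn a DNA string into two toy features (assumed, not from the paper):
    GC fraction and CpG dinucleotide density."""
    seq = seq.upper()
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    cpg = seq.count("CG") / (len(seq) - 1)
    return (gc, cpg)

def centroid(seqs):
    """Mean feature vector over one class of sequences."""
    feats = [features(s) for s in seqs]
    return tuple(mean(column) for column in zip(*feats))

def train(imprinted_seqs, other_seqs):
    """'Training' here is just storing one centroid per class."""
    return {
        "imprinted": centroid(imprinted_seqs),
        "not_imprinted": centroid(other_seqs),
    }

def predict(model, seq):
    """Assign the label whose centroid is closest in feature space."""
    f = features(seq)

    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(f, c))

    return min(model, key=lambda label: squared_distance(model[label]))

# Synthetic strings invented for this sketch; they are not real gene sequences.
known_imprinted = ["CGCGGCGCCGCG", "GCGCGCGGCCGC", "CCGCGGCGCGTA"]
known_not_imprinted = ["ATATTAAATTAT", "TTAAATATGATA", "AATTTACATATA"]

model = train(known_imprinted, known_not_imprinted)
print(predict(model, "GCGCGACGCGCC"))
print(predict(model, "TATAATTTAAGA"))
```

The point of the sketch is the workflow rather than the method: label two sets of genes, reduce each sequence to numeric features, learn what separates the sets, then score genes whose status is unknown.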
Hartemink said, "We cannot say with complete certainty that we have identified every imprinted gene, but we believe this is most of them."
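Jirtle has studied imprinting for many years. He said imprinting is an epigenetic event, meaning that a gene's function can change without any change to the DNA sequence. "Imprinted genes are vulnerable to environmental attack, even from what we eat and breathe. And importantly, epigenetic changes can be inherited. I don't think people have realized that yet."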
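Imprinted genes are estimated to make up about 1% of the human genome, and only some of them have been found so far. Using the study's new "Rosetta stone" approach, Jirtle and Hartemink identified 156 new imprinted genes. Two of them lie on chromosome 8, where no imprinted genes had been found before. The first, KCNK9, is highly active in the brain, is a known oncogene, and may be involved in bipolar disorder and epilepsy. The second, DLGAP2, is a candidate bladder cancer tumor suppressor.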
Original source:
Cover: Just as the discovery of the Rosetta Stone by Napoleon's troops in 1799 led to the deciphering of Egyptian hieroglyphics, computational machine learning techniques have recently been used to decipher the imprint status of a gene from nearby genomic sequence features. These techniques permit the genome-wide identification of human genes that have a high probability of being imprinted. These candidate imprinted genes are in turn linked to complex human conditions where parent-of-origin inheritance is involved. (Cover design by James V. Jirtle, Webwiz Design, www.webwizdesign.com. Photograph of the Rosetta Stone used with permission © The Trustees of the British Museum.)
Published online before print November 30, 2007, doi: 10.1101/gr.6584707
Genome Res. 17:1723-1730, 2007
Computational and experimental identification of novel human imprinted genes
Philippe P. Luedi¹, Fred S. Dietrich²,³, Jennifer R. Weidman⁴, Jason M. Bosko⁵, Randy L. Jirtle⁴,⁶, and Alexander J. Hartemink¹,⁵,⁶
1 Center for Bioinformatics and Computational Biology, Duke University, Durham, North Carolina 27708, USA; 2 Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina 27708, USA; 3 Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA; 4 Department of Radiation Oncology, Duke University Medical Center, Durham, North Carolina 27710, USA; 5 Department of Computer Science, Duke University, Durham, North Carolina 27708, USA
Imprinted genes are essential in embryonic development, and imprinting dysregulation contributes to human disease. We report two new human imprinted genes: KCNK9 is predominantly expressed in the brain, is a known oncogene, and may be involved in bipolar disorder and epilepsy, while DLGAP2 is a candidate bladder cancer tumor suppressor. Both genes lie on chromosome 8, not previously suspected to contain imprinted genes. We identified these genes, along with 154 others, based on the predictions of multiple classification algorithms using DNA sequence characteristics as features. Our findings demonstrate that DNA sequence characteristics, including recombination hot spots, are sufficient to accurately predict the imprinting status of individual genes in the human genome.
6 Corresponding authors.
E-mail [email protected] ; fax (919) 660-6519.
E-mail [email protected] ; fax (919) 684-5584.