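Researchers at Duke University have created the first map of imprinted genes in the human genome, and they say the key to their success was a form of artificial intelligence called machine learning, a modern-day Rosetta stone. The study turned up four times as many imprinted genes as had previously been identified, and it will appear on the cover of the December 3 issue of Genome Research.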
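An imprinted gene is one whose alleles are expressed differently depending on whether they sit on the chromosome inherited from the father or the one inherited from the mother. Paternal and maternal imprints differ, and when sperm and egg fuse, the imprinted genes of both parents must be present or development will be abnormal. Genomic imprinting is thus a special genetic phenomenon in which the expression of an allele depends on the sex of the parent it came from, and it does not follow Mendel's laws of inheritance. Misregulation of imprinting can cause a number of inherited diseases.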
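In classical genetics, a child inherits two copies of each gene, one from the father and one from the mother, and the active forms of both copies influence the child's development. With an imprinted gene, however, one of the two copies is switched off by molecular regulation from either the mother or the father. The child therefore effectively inherits the information of only one copy, which leaves him or her more vulnerable to environmental stress: if the single functional copy is damaged or lost, there is no backup to take its place.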
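"Imprinted genes have always been something of a mystery, in part because they do not follow the conventional rules of inheritance," said Dr. Randy Jirtle, a geneticist in Duke's departments of radiation oncology and pathology. "We hope this new roadmap will help us and other researchers find out more about how these genes affect our health."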
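Working with co-authors Alexander Hartemink and Philippe Luedi, Jirtle fed sequence data from two groups of genes, one known to be imprinted and one known not to be, into a computer and used a program to find the differences between them. This machine learning analysis produced an algorithm that, like the original Rosetta stone, can decode seemingly inscrutable data, in this case the specific DNA sequences that point to imprinted genes.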
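To make the idea of learning from two labeled groups of sequences concrete, here is a minimal sketch in Python. It is not the method used in the study, which the paper describes only as multiple classification algorithms applied to DNA sequence characteristics. The sketch swaps in a much simpler technique, a nearest-centroid classifier over 2-mer frequencies, and every sequence, label assignment, and function name in it is hypothetical. The toy sequences are made up and carry no biological meaning.

```python
from collections import Counter
from itertools import product

K = 2  # k-mer length; an arbitrary choice for this sketch
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_features(seq):
    """Return the relative frequency of every k-mer in seq, in a fixed order."""
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[k] for k in KMERS) or 1  # avoid division by zero
    return [counts[k] / total for k in KMERS]


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train(examples):
    """examples: list of (sequence, label). Returns {label: centroid}."""
    by_label = {}
    for seq, label in examples:
        by_label.setdefault(label, []).append(kmer_features(seq))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, seq):
    """Assign seq to the label whose centroid is nearest (squared Euclidean)."""
    x = kmer_features(seq)

    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(model, key=lambda label: dist(model[label]))


# Made-up toy sequences, NOT real genomic data. Which pattern is called
# "imprinted" here is arbitrary and makes no biological claim.
TRAINING = [
    ("CGCGCGGCGCCGCG", "imprinted"),
    ("GCGCGCCGCGGCGC", "imprinted"),
    ("ATATTAATATTATA", "not_imprinted"),
    ("TATAATTATAATAT", "not_imprinted"),
]

model = train(TRAINING)
print(predict(model, "CGCGGCGCGC"))
print(predict(model, "ATTATATAAT"))
```

The structure mirrors the workflow described above: turn each sequence into numeric features, summarize what each labeled group looks like, then classify an unseen sequence by which group it resembles more. A real analysis would use far richer features and validated training labels.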
"We cannot say with complete certainty that we have identified all imprinted genes, but we think this is most of them," Hartemink said.
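Jirtle, who has studied imprinting for many years, said imprinting is an epigenetic event, meaning that a gene's function can change without any change to the DNA sequence itself. "Imprinted genes are vulnerable to environmental assaults, even to what we eat and breathe. And importantly, epigenetic changes can be inherited. I don't think people realize that yet."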
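Imprinted genes are estimated to make up about 1% of the human genome, and only a fraction of them had been found until now. Using the new "Rosetta stone" approach, Jirtle and Hartemink identified 156 new imprinted genes. Two of them lie on chromosome 8, which had not previously been found to contain imprinted genes. One of the two, KCNK9, is highly active in the brain, is a known oncogene, and may be involved in bipolar disorder and epilepsy. The other, DLGAP2, is a candidate bladder cancer tumor suppressor.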
Original source:
Cover: Just as the discovery of the Rosetta Stone by Napoleon’s troops in 1799 led to the deciphering of Egyptian hieroglyphics, computational machine learning techniques have recently been used to decipher the imprint status of a gene from nearby genomic sequence features. These techniques permit the genome-wide identification of human genes that have a high probability of being imprinted. These candidate imprinted genes are in turn linked to complex human conditions where parent-of-origin inheritance is involved. (Cover design by James V. Jirtle, Webwiz Design, www.webwizdesign.com. Photograph of the Rosetta Stone used with permission © The Trustees of the British Museum.)
Published online before print November 30, 2007, 10.1101/gr.6584707
Genome Res. 17:1723-1730, 2007
Computational and experimental identification of novel human imprinted genes
Philippe P. Luedi1, Fred S. Dietrich2,3, Jennifer R. Weidman4, Jason M. Bosko5, Randy L. Jirtle4,6, and Alexander J. Hartemink1,5,6
1 Center for Bioinformatics and Computational Biology, Duke University, Durham, North Carolina 27708, USA; 2 Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina 27708, USA; 3 Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA; 4 Department of Radiation Oncology, Duke University Medical Center, Durham, North Carolina 27710, USA; 5 Department of Computer Science, Duke University, Durham, North Carolina 27708, USA
Imprinted genes are essential in embryonic development, and imprinting dysregulation contributes to human disease. We report two new human imprinted genes: KCNK9 is predominantly expressed in the brain, is a known oncogene, and may be involved in bipolar disorder and epilepsy, while DLGAP2 is a candidate bladder cancer tumor suppressor. Both genes lie on chromosome 8, not previously suspected to contain imprinted genes. We identified these genes, along with 154 others, based on the predictions of multiple classification algorithms using DNA sequence characteristics as features. Our findings demonstrate that DNA sequence characteristics, including recombination hot spots, are sufficient to accurately predict the imprinting status of individual genes in the human genome.
6 Corresponding authors.
E-mail [email protected] ; fax (919) 660-6519.
E-mail [email protected] ; fax (919) 684-5584.