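On August 30, Nature Genetics reported a new algorithm designed by researchers at the University of Washington. The algorithm has been shown to be effective at computing the copy number and sequence content of duplicated genomic sequences. The researchers named the method mrFAST, short for micro-read fast alignment search tool.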
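Segmental duplications in the human genome are thought to be associated with emotion and immunity. Diseases such as lupus, Crohn's disease, mental retardation, schizophrenia, color blindness, psoriasis, and age-related macular degeneration have all been linked to them. Duplicated segments often contain duplicated genes of unknown function, and the copy number of these segments differs from one individual to another. Determining the number, content, and location of duplicated segments is an important step toward understanding what changes in gene copy number mean for health.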
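Alkan said, "The new algorithm, which uses next-generation DNA sequencing technology, provides for the first time an accurate count of variable copy number in duplicated segments." Kidd explained, "It can count whether a person has one, two, three, or more copies of a gene." Many standard genome analyses leave out the duplicated segments of the human genome because these sequences are not unique. In fact, "this kind of computation is very difficult."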
Scientists had studied this question before, but none had computed specific copy numbers. For example, some results indicated that certain people can resist HIV through an increased gene copy number, but the size of that increase remained unknown.
The study was supported by the 1000 Genomes Project, in which multiple research institutions around the world take part, with samples drawn from the genomes of hundreds of people worldwide.
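Alkan and his team believe that copy-number variation contributes substantially to human diversity. The ability to measure the copy number of genomic segments accurately and systematically is important, particularly for mapping individual genomes and for understanding how the genome shapes an individual's characteristics.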
In their view, the next challenge is to determine how segmental duplications vary in sequence content and to resolve the structure of these dynamic, important regions of the human genome. (Bioon.com)
Original source recommended by Bioon:
Nature Genetics Published online: 30 August 2009 | doi:10.1038/ng.437
Personalized copy number and segmental duplication maps using next-generation sequencing
Can Alkan [1,2], Jeffrey M Kidd [1], Tomas Marques-Bonet [1,3], Gozde Aksay [1], Francesca Antonacci [1], Fereydoun Hormozdiari [4], Jacob O Kitzman [1], Carl Baker [1], Maika Malig [1], Onur Mutlu [5], S Cenk Sahinalp [4], Richard A Gibbs [6] & Evan E Eichler [1,2]
1 Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA.
2 Howard Hughes Medical Institute, Seattle, Washington, USA.
3 Institut de Biologia Evolutiva (UPF-CSIC), Barcelona, Catalonia, Spain.
4 School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada.
5 Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.
6 Baylor College of Medicine, Houston, Texas, USA.
Correspondence to: Evan E Eichler
Despite their importance in gene innovation and phenotypic variation, duplicated regions have remained largely intractable owing to difficulties in accurately resolving their structure, copy number and sequence content. We present an algorithm (mrFAST) to comprehensively map next-generation sequence reads, which allows for the prediction of absolute copy-number variation of duplicated segments and genes. We examine three human genomes and experimentally validate genome-wide copy number differences. We estimate that, on average, 73-87 genes vary in copy number between any two individuals and find that these genic differences overwhelmingly correspond to segmental duplications (odds ratio = 135; P < 2.2 × 10^-16). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology.
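
To illustrate how mapped reads can be turned into an absolute copy number, here is a minimal read-depth sketch in Python. It is not mrFAST and not the authors' pipeline. It assumes reads have already been mapped to every location where they could align and that the average read depth has been computed for each window. The function name `estimate_copy_number`, its parameters, and the depth values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def estimate_copy_number(window_depths, control_depths, control_copies=2):
    """Estimate an integer copy number for each window from read depth.

    window_depths:  average read depth in each window of interest
    control_depths: average read depth in regions assumed to be present
                    in `control_copies` copies (diploid by default)

    Illustrative assumption: depth scales linearly with copy number.
    """
    baseline = mean(control_depths)
    if baseline <= 0:
        raise ValueError("control regions must have a positive mean depth")

    # Depth contributed by a single copy of a sequence.
    depth_per_copy = baseline / control_copies

    # Round each window's depth ratio to the nearest whole copy number.
    return [round(depth / depth_per_copy) for depth in window_depths]


if __name__ == "__main__":
    # Made-up depths: control regions average 30x, i.e. 15x per copy.
    controls = [30, 28, 32]
    windows = [14.0, 31.0, 46.0, 61.0]
    print(estimate_copy_number(windows, controls))
```

The sketch reflects the idea in Kidd's description: a region whose depth is about 1.5 times the diploid baseline is counted as three copies. Real analyses typically also correct for biases such as GC content before rounding, which this example leaves out.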