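On August 30, Nature Genetics reported a new algorithm designed by researchers at the University of Washington. The algorithm proved effective at calculating the copy number and content of duplicated genomic sequences. The researchers named the method mrFAST, short for micro-read fast alignment search tool.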
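Segmental duplications in the human genome are thought to be associated with emotional and immune function. Diseases such as lupus, Crohn's disease, mental retardation, schizophrenia, color blindness, psoriasis, and age-related macular degeneration have all been linked to them. Duplicated segments often contain duplicated genes of unknown function, and the copy number of these segments differs from one individual to another. Determining the number, content, and location of duplicated segments is an important step toward understanding what changes in gene copy number mean for health.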
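Alkan said, "The new algorithm, which uses next-generation DNA sequencing technology, provides for the first time an accurate count of variable copy number in duplicated segments." Kidd explained, "It can count whether a person has one, two, three, or more copies of a gene." Many standard genome analyses leave out the duplicated segments of the human genome because these sequences are not unique. Indeed, "this calculation is very difficult."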
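To make the idea of counting copies concrete, here is a minimal sketch of a read-depth estimate. It is not the mrFAST implementation. It assumes that reads have already been placed at every location they match, that depth has been averaged over fixed windows, and that a set of control windows is known to be present at two copies. The function name `estimate_copy_number` and all depth values are hypothetical.

```python
from statistics import mean


def estimate_copy_number(window_depths, control_depths, control_copies=2):
    """Estimate an absolute copy number per window from read depth.

    window_depths:  mean read depth in each window of the region of interest.
    control_depths: mean read depth in windows assumed to be present at
                    `control_copies` copies (an assumption of this sketch).
    """
    if not control_depths:
        raise ValueError("need at least one control window")
    # Depth contributed by a single copy, calibrated on the control windows.
    per_copy_depth = mean(control_depths) / control_copies
    if per_copy_depth <= 0:
        raise ValueError("control depth must be positive")
    # Round the depth ratio to the nearest whole number of copies.
    return [round(depth / per_copy_depth) for depth in window_depths]


# Hypothetical depths, for illustration only.
control = [30, 29, 31, 30]
region = [15, 30, 46, 61, 0]
print(estimate_copy_number(region, control))  # [1, 2, 3, 4, 0]
```

A window with half the control depth is called as one copy and a window with twice the control depth as four. A real analysis would also need to correct for sequencing biases, which this sketch ignores.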
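Other scientists had worked on this problem before, but none had calculated specific copy numbers. For example, some results suggested that certain people can resist HIV through an increase in gene copy number, but the size of that increase remained unknown.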
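The study was supported by the 1000 Genomes Project, in which multiple research institutions around the world take part. The samples came from the genomes of several hundred people worldwide.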
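Alkan and his team believe that copy-number variation contributes substantially to human diversity. The ability to detect the copy number of genomic segments accurately and systematically is important, particularly for mapping individual genomes and for understanding how the genome shapes a person's character.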
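They believe the next challenge is to determine how segmental duplications vary in sequence content and to resolve the structure of these dynamic, important regions of the human genome. (Bioon.com)
Original source recommended by Bioon: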
Nature Genetics Published online: 30 August 2009 | doi:10.1038/ng.437
Personalized copy number and segmental duplication maps using next-generation sequencing
Can Alkan1,2, Jeffrey M Kidd1, Tomas Marques-Bonet1,3, Gozde Aksay1, Francesca Antonacci1, Fereydoun Hormozdiari4, Jacob O Kitzman1, Carl Baker1, Maika Malig1, Onur Mutlu5, S Cenk Sahinalp4, Richard A Gibbs6 & Evan E Eichler1,2
1 Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA.
2 Howard Hughes Medical Institute, Seattle, Washington, USA.
3 Institut de Biologia Evolutiva (UPF-CSIC), Barcelona, Catalonia, Spain.
4 School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada.
5 Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.
6 Baylor College of Medicine, Houston, Texas, USA.
Correspondence to: Evan E Eichler
Despite their importance in gene innovation and phenotypic variation, duplicated regions have remained largely intractable owing to difficulties in accurately resolving their structure, copy number and sequence content. We present an algorithm (mrFAST) to comprehensively map next-generation sequence reads, which allows for the prediction of absolute copy-number variation of duplicated segments and genes. We examine three human genomes and experimentally validate genome-wide copy number differences. We estimate that, on average, 73-87 genes vary in copy number between any two individuals and find that these genic differences overwhelmingly correspond to segmental duplications (odds ratio = 135; P < 2.2 × 10^-16). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology.