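On October 4, 2010 (Beijing time), a study by the University of California, Berkeley (USA), the University of Copenhagen (Denmark), and other collaborating institutions, titled "Sequencing of 200 human exomes reveals a large number of low-frequency non-synonymous mutations," was published in the international journal Nature Genetics.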
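The study carried out deep sequencing of the exomes (the exons of protein-coding genes) of 200 Danish individuals and found a large number of previously unknown single-nucleotide polymorphisms (SNPs), most of which occur at low frequency in the population. The study produced what was at the time the largest and highest-resolution genetic map of human exonic regions. With detailed data, it showed that low-frequency polymorphic sites in the population are enriched for variants that change the amino acid sequence of proteins, and that such variants are subject to natural selection in the population and may have functional effects on human health.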
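Several recent studies have pointed out that association analyses of complex diseases controlled by multiple genes, although feasible in theory and successful in practice at identifying many disease-associated genes, can explain only a small fraction of the heritability of complex diseases. This phenomenon, known as "missing heritability," is a major problem in current genomic research on complex diseases. The study is the first to demonstrate that polymorphic sites affecting human health and disease susceptibility tend to occur at low frequency in the population, but that such sites are numerous. The genotyping chips used in earlier association analyses of complex diseases measure only common polymorphic sites and cannot examine low-frequency ones, so they miss a large number of disease-associated sites, which gives rise to "missing heritability." The study not only points out the shortcomings of the current mainstream approach to disease research but also makes the disruptive proposal that disease association analyses should make full use of sequencing rather than genotyping. It is therefore regarded as a milestone in changing how scientists study complex diseases and in advancing human health and medical research.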
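The study is part of LUCAMP, a Sino-Danish collaborative project on diabetes-associated genes and variants. LUCAMP aims to use next-generation sequencing to sequence the exomes of 1,000 patients with visceral obesity and 1,000 healthy controls, and plans to identify new common and rare mutations associated with metabolic disease. (Bioon.com)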
English abstract recommended by Bioon:
Nature Genetics doi:10.1038/ng.680
Resequencing of 200 human exomes identifies an excess of low-frequency non-synonymous coding variants
Yingrui Li1,19, Nicolas Vinckenbosch2,19, Geng Tian1,19, Emilia Huerta-Sanchez2,3,19, Tao Jiang1,19, Hui Jiang1, Anders Albrechtsen4, Gitte Andersen5, Hongzhi Cao1, Thorfinn Korneliussen4, Niels Grarup5, Yiran Guo1, Ines Hellman6, Xin Jin1,7, Qibin Li1, Jiangtao Liu1, Xiao Liu1, Thomas Sparsø5, Meifang Tang1, Honglong Wu1, Renhua Wu1, Chang Yu1, Hancheng Zheng1,7, Arne Astrup8, Lars Bolund1,9,10, Johan Holmkvist5, Torben Jørgensen11,12, Karsten Kristiansen1,4, Ole Schmitz13,14, Thue W Schwartz15, Xiuqing Zhang1, Ruiqiang Li1,4, Huanming Yang1, Jian Wang1, Torben Hansen5,16, Oluf Pedersen5,17,18, Rasmus Nielsen2,3,4 & Jun Wang1,4
Targeted capture combined with massively parallel exome sequencing is a promising approach to identify genetic variants implicated in human traits. We report exome sequencing of 200 individuals from Denmark with targeted capture of 18,654 coding genes and sequence coverage of each individual exome at an average depth of 12-fold. On average, about 95% of the target regions were covered by at least one read. We identified 121,870 SNPs in the sample population, including 53,081 coding SNPs (cSNPs). Using a statistical method for SNP calling and an estimation of allelic frequencies based on our population data, we derived the allele frequency spectrum of cSNPs with a minor allele frequency greater than 0.02. We identified a 1.8-fold excess of deleterious, non-synonymous cSNPs over synonymous cSNPs in the low-frequency range (minor allele frequencies between 2% and 5%). This excess was more pronounced for X-linked SNPs, suggesting that deleterious substitutions are primarily recessive.
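To make the abstract's terms concrete, the following is a minimal Python sketch of how a minor allele frequency (MAF) can be computed for a sample of diploid individuals and how non-synonymous and synonymous SNPs can be compared within a frequency band such as 2% to 5%. It is an illustration only, not the statistical SNP-calling and frequency-estimation method used in the paper. The function names, the half-open band boundaries, and the toy SNP list are all assumptions made up for this example, and the toy numbers are not data from the study.

```python
from collections import Counter


def minor_allele_frequency(alt_allele_count, n_individuals):
    """MAF of a biallelic autosomal SNP among diploid individuals.

    alt_allele_count is the number of chromosomes carrying the
    alternative allele, out of 2 * n_individuals chromosomes.
    """
    total_chromosomes = 2 * n_individuals
    freq = alt_allele_count / total_chromosomes
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq, 1.0 - freq)


def nonsyn_to_syn_ratio(snps, n_individuals, low, high):
    """Ratio of non-synonymous to synonymous SNPs with low <= MAF < high.

    snps is an iterable of (alt_allele_count, effect) pairs, where
    effect is "non-synonymous" or "synonymous". Returns None when the
    band contains no synonymous SNP (ratio undefined).
    """
    counts = Counter()
    for alt_count, effect in snps:
        maf = minor_allele_frequency(alt_count, n_individuals)
        if low <= maf < high:  # half-open band: an assumption of this sketch
            counts[effect] += 1
    if counts["synonymous"] == 0:
        return None
    return counts["non-synonymous"] / counts["synonymous"]


# Hypothetical toy data for 200 individuals (400 chromosomes).
toy_snps = [
    (10, "non-synonymous"),
    (12, "non-synonymous"),
    (16, "synonymous"),
    (100, "synonymous"),
    (390, "non-synonymous"),  # alternative allele is the major allele here
    (200, "non-synonymous"),
]

if __name__ == "__main__":
    print(nonsyn_to_syn_ratio(toy_snps, n_individuals=200, low=0.02, high=0.05))
```

A raw count ratio like this does not by itself give the "1.8-fold excess" reported in the abstract; the sketch only shows the bookkeeping of assigning SNPs to a frequency band and comparing the two classes within it.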