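The human genome contains large-scale duplications and deletions that are associated with genomic variation and polymorphism, and gene copy number is an important factor in that variation. However, because the ability to determine genomic DNA copy number at high resolution has been limited, there has been no technique for scanning the whole genome for such copy number polymorphisms (CNPs). As a result, little is known about how much CNPs contribute to human genetic variation and polymorphism.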
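The July 23, 2004 issue of Science published a new study led by Michael Wigler at Cold Spring Harbor Laboratory. Using a new technique, representational oligonucleotide microarray analysis (ROMA), the study revealed striking differences between the normal-cell DNA of different people.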
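The researchers collected blood and a variety of tissue samples from individuals of different geographic origins. Using ROMA, they hybridized differentially labeled samples to a set of probes to measure the relative concentration of genomic DNA purified from each sample. In brief, they used Bgl II genomic representations, which greatly reduce the complexity of the sample; the oligonucleotide microarray probes were designed from analysis of the human genome sequence assembly, placed on the chip, and then further optimized in practice; and the hybridization data were analyzed with a hidden Markov model (HMM).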
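To make the HMM step concrete, here is a minimal sketch in Python. It assumes a three-state model (loss, normal, gain) with Gaussian emissions over per-probe log2 ratios, decoded with the Viterbi algorithm. The state means, noise level, switching probability, and function names are illustrative assumptions, not the parameters or code used in the study.

```python
import math

# Hypothetical state model: the means, spread, and switching probability
# below are illustrative assumptions, not values from the study.
STATES = ("loss", "normal", "gain")
MEANS = {"loss": -0.6, "normal": 0.0, "gain": 0.5}  # expected log2 ratio per state
SIGMA = 0.15    # assumed probe noise (standard deviation)
P_STAY = 0.98   # assumed probability of staying in the same state


def log_emission(x, state):
    """Log density of a Gaussian centred on the state's mean log2 ratio."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def log_transition(prev, cur):
    """Log probability of moving between states for adjacent probes."""
    if prev == cur:
        return math.log(P_STAY)
    return math.log((1.0 - P_STAY) / (len(STATES) - 1))


def viterbi(log_ratios):
    """Return the most likely state path for probes ordered along a chromosome."""
    if not log_ratios:
        return []
    # Uniform prior over the starting state.
    score = {s: math.log(1.0 / len(STATES)) + log_emission(log_ratios[0], s)
             for s in STATES}
    backpointers = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            new_score[cur] = (score[best_prev] + log_transition(best_prev, cur)
                              + log_emission(x, cur))
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse a state path into (start, end, state) runs; end is inclusive."""
    runs = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            runs.append((start, i - 1, path[start]))
            start = i
    return runs
```

Calling `segments(viterbi(log_ratios))` on probes ordered by genomic position collapses the per-probe calls into candidate intervals. With these assumed parameters, a single outlying probe is not enough to call a gain or loss, because the penalty for switching states outweighs one probe's evidence; several consecutive probes are needed.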
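They analyzed blood and tissue samples from 20 subjects of different geographic origins. They identified 221 copy number differences representing 76 unique copy number polymorphisms (CNPs), each a deletion or duplication of a large stretch of DNA, and observed copy number variation of 70 genes lying within CNP intervals. Of these 70 genes, some are involved in neurological function, some in the regulation of cell growth, and some in the regulation of metabolism, while others are known to be associated with disease.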
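The results of the study are striking, and ROMA is a powerful technique. Several features of ROMA give it a higher signal-to-noise ratio than hybridizing whole-genome DNA to BAC arrays. The researchers are continuing to refine ROMA in the hope of uncovering more information about large-scale polymorphism in the human genome.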
Large-Scale Copy Number Polymorphism in the Human Genome
The extent to which large duplications and deletions contribute to human genetic variation and diversity is unknown. Here, we show that large-scale copy number polymorphisms (CNPs) (about 100 kilobases and greater) contribute substantially to genomic variation between normal humans. Representational oligonucleotide microarray analysis of 20 individuals revealed a total of 221 copy number differences representing 76 unique CNPs. On average, individuals differed by 11 CNPs, and the average length of a CNP interval was 465 kilobases. We observed copy number variation of 70 different genes within CNP intervals, including genes involved in neurological function, regulation of cell growth, regulation of metabolism, and several genes known to be associated with disease.
Fig. 1. Genome-wide map of CNPs identified by ROMA. The position of all CNPs (excluding somatic differences) is shown. CNPs identified in multiple individuals (by Bgl II–ROMA) are indicated in yellow, and CNPs observed in only one individual are indicated in red. Additional CNPs identified by one Hind III–ROMA experiment are indicated in blue. Symbols denoting CNPs are not drawn to scale. Genome assembly gaps in pericentromeric and satellite regions are indicated by gray boxes. Genomic regions where recurring de novo rearrangements cause the developmental disorders Prader-Willi and Angelman syndromes, cat eye syndrome, DiGeorge/velocardiofacial syndrome, and spinal muscular atrophy are labeled A, B, C, and D, respectively.
Fig. 2. Validation of ROMA results by FISH. (A), (C), (E), and (G) show CNPs identified by ROMA and include the CNP identification number, the name of one gene located entirely within the interval, and the experiment name. (B), (D), (F), (H), and (I) show cytogenetic analyses of one or both individuals with probes that target the same CNP intervals. In all panels, the polymorphic probe is labeled red. In interphase cells [(B), (D), and (F)], a control probe (labeled green) was also included to confirm that cells were diploid. (B) CNP15 probe in GM11322 cells; (D) CNP56 probe in GM10470 cells; (F) CNP21 probe in GM10470 cells; (H) CNP32 probe in GM10540 cells; (I) CNP32 probe in SKN1 cells. In (I), one parental copy of chromosome 16 in SKN1 lacks the duplication (arrow).