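The human genome contains large-scale duplications and deletions that are associated with genomic variation and polymorphism, and gene copy number is an important factor. However, because the ability to determine genomic DNA copy number at high resolution has been limited, no technique existed for scanning the whole genome for such copy number polymorphisms (CNPs). As a result, little was known about how much CNPs contribute to human genetic variation and polymorphism.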
The July 23, 2004 issue of Science published a new study led by Michael Wigler of Cold Spring Harbor Laboratory. Using a new technique, representational oligonucleotide microarray analysis (ROMA), the study revealed striking differences between the DNA of normal cells from different people.
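The researchers collected blood and a variety of tissue samples from individuals of different geographic origins. They then used ROMA, with a set of probes and differentially labeled hybridization, to measure the relative concentration of chromosomal DNA purified from the samples. In brief, they used Bgl II genomic representations to greatly reduce sample complexity; the oligonucleotide microarray probes were designed from analysis of the human genome sequence assembly, placed on the chip, and further optimized through use; and the hybridization data were analyzed with a hidden Markov model (HMM).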
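To make the HMM step concrete, here is a minimal sketch of how a hidden Markov model can segment probe-level log ratios into copy number loss, neutral, and gain calls using the Viterbi algorithm. It is an illustration only and not the model used in the study: the three states, the emission means, the noise level (`SIGMA`), and the self-transition probability (`STAY`) are all assumed values chosen for the example.

```python
import math

STATES = ("loss", "neutral", "gain")
# Hypothetical emission means for log2(test/reference) ratios; illustrative only.
MEANS = {"loss": -0.6, "neutral": 0.0, "gain": 0.6}
SIGMA = 0.25  # assumed standard deviation of probe noise
STAY = 0.95   # assumed probability that adjacent probes share a state


def log_emission(x, state):
    """Log density of a Gaussian centred on the state's mean."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def log_transition(prev, cur):
    """Log probability of moving from prev to cur between adjacent probes."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1.0 - STAY) / (len(STATES) - 1))


def viterbi(log_ratios):
    """Return the most likely state for each probe, in genomic order."""
    if not log_ratios:
        return []
    # Uniform prior over states at the first probe.
    score = {s: -math.log(len(STATES)) + log_emission(log_ratios[0], s)
             for s in STATES}
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES,
                            key=lambda p: score[p] + log_transition(p, cur))
            new_score[cur] = (score[best_prev]
                              + log_transition(best_prev, cur)
                              + log_emission(x, cur))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse per-probe calls into (state, first_index, last_index) runs."""
    runs = []
    for i, s in enumerate(path):
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1], i)
        else:
            runs.append((s, i, i))
    return runs
```

Calling `segments(viterbi(ratios))` on a list of log ratios ordered by genomic position yields contiguous intervals. Because leaving a state is penalised, a single noisy probe tends to be absorbed into its neighbours, while a run of consistently elevated or depressed probes is reported as a candidate gain or loss.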
They analyzed blood and tissue samples from 20 subjects of different geographic origins. They identified 221 copy number differences representing 76 unique CNPs, which appear as deletions or duplications of large DNA segments. Seventy genes lay within these CNP intervals: some are involved in neurological function, some in the regulation of cell growth, some in the regulation of metabolism, and others are already known to be associated with disease.
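The results are striking, and ROMA is a powerful technique. Several features of ROMA give it a higher signal-to-noise ratio than hybridizing whole-genome DNA to BAC arrays. The researchers are continuing to refine ROMA in the hope of uncovering more information about large-scale polymorphism in the human genome.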
Large-Scale Copy Number Polymorphism in the Human Genome
The extent to which large duplications and deletions contribute to human genetic variation and diversity is unknown. Here, we show that large-scale copy number polymorphisms (CNPs) (about 100 kilobases and greater) contribute substantially to genomic variation between normal humans. Representational oligonucleotide microarray analysis of 20 individuals revealed a total of 221 copy number differences representing 76 unique CNPs. On average, individuals differed by 11 CNPs, and the average length of a CNP interval was 465 kilobases. We observed copy number variation of 70 different genes within CNP intervals, including genes involved in neurological function, regulation of cell growth, regulation of metabolism, and several genes known to be associated with disease.
Fig. 1. Genome-wide map of CNPs identified by ROMA. The position of all CNPs (excluding somatic differences) is shown. CNPs identified in multiple individuals (by Bgl II–ROMA) are indicated in yellow, and CNPs observed in only one individual are indicated in red. Additional CNPs identified by one Hind III–ROMA experiment are indicated in blue. Symbols denoting CNPs are not drawn to scale. Genome assembly gaps in pericentromeric and satellite regions are indicated by gray boxes. Genomic regions where recurring de novo rearrangements cause the developmental disorders Prader-Willi and Angelman syndromes, cat eye syndrome, DiGeorge/velocardiofacial syndrome, and spinal muscular atrophy are labeled A, B, C, and D, respectively.
Fig. 2. Validation of ROMA results by FISH. (A), (C), (E), and (G) show CNPs identified by ROMA and include the CNP identification number, the name of one gene located entirely within the interval, and the experiment name. (B), (D), (F), (H), and (I) show cytogenetic analyses of one or both individuals with probes that target the same CNP intervals. In all panels, the polymorphic probe is labeled red. In interphase cells [(B), (D), and (F)], a control probe (labeled green) was also included to confirm that cells were diploid. (B) CNP15 probe in GM11322 cells; (D) CNP56 probe in GM10470 cells; (F) CNP21 probe in GM10470 cells; (H) CNP32 probe in GM10540 cells; (I) CNP32 probe in SKN1 cells. In (I), one parental copy of chromosome 16 in SKN1 lacks the duplication (arrow).