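Researchers at the University of Missouri (MU) recently found identical DNA sequences in completely different regions of several plant genomes. Dmitry Korkin, an assistant professor of computer science, is the lead author of the paper. "No one had previously been able to complete a study on this scale," he said. The results were published in the journal PNAS.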
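After the White House Office of Science and Technology Policy announced the "Big Data Research and Development Initiative", the analysis of large volumes of data became a national priority. A multidisciplinary team at the University of Missouri has met the big-data challenge: using a groundbreaking computational algorithm, they found identical DNA sequences shared across different plant and animal species, and in doing so resolved a major biological question.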
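Gavin Conant, a co-author of the study and an assistant professor of animal science, said: "Our findings help explain some of the mysteries of plant evolution. Basic research on plant genomes provides raw material and improved techniques for drug and crop development."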
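Earlier studies had found long stretches of identical code in the DNA of different animals. Before this new MU study, however, computer programs were not powerful enough to find identical sequences in plant DNA, because these identical segments do not sit at the same positions in the genome.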
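In the study, the genomes of six animals (dog, chicken, human, mouse, macaque and rat) were compared with one another. Likewise, the genomes of six plants (Arabidopsis, soybean, rice, cottonwood, sorghum and grape) were compared with one another. These sequence comparisons ran on 48 computers, each capable of 1 million searches per hour, and took four weeks, for a total of about 32 billion searches.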
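The reported total can be sanity-checked with simple arithmetic. The sketch below assumes, as the article does not state explicitly, that all 48 machines ran around the clock for the full four weeks; the variable names are illustrative only.

```python
# Figures quoted in the article.
machines = 48
searches_per_machine_per_hour = 1_000_000

# Assumption (not stated in the article): continuous operation, 24 h/day for 4 weeks.
hours = 4 * 7 * 24

total_searches = machines * searches_per_machine_per_hour * hours
print(f"{total_searches:,}")  # 32,256,000,000
```

Under that assumption the product comes to roughly 32.3 billion, which is consistent with the article's rounded figure of 32 billion.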
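Although the researchers found that plant species, like animal species, share identical sequences, they said that these sequences evolved in different ways.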
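Conant said: "People might expect to see convergent evolution, but we do not think that is the case. Plants and animals are both complex multicellular organisms that must cope with many of the same environmental conditions, such as taking in air and water and dealing with changes in the weather, yet their genomes encode solutions to these challenges in different ways."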
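The MU team's work lays a foundation for future research into why plants and animals developed different genetic mechanisms and how those mechanisms work. Their basic research also lays the groundwork for discoveries that could improve human life. Beyond increasing the potential of genetic science to fight disease, the computer program used for the sequence analysis could itself aid the development of new drugs.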
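Korkin said: "The same algorithm can be used to find identical sequence patterns across an organism's entire set of proteins, which could help identify new targets for existing drugs or study those drugs' side effects." (Bioon.com)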
doi:10.1073/pnas.1121356109
Long identical multispecies elements in plant and animal genomes
Jeff Reneker, Eric Lyons, Gavin C. Conant, J. Chris Pires, Michael Freeling, Chi-Ren Shyu, and Dmitry Korkin
Ultraconserved elements (UCEs) are DNA sequences that are 100% identical (no base substitutions, insertions, or deletions) and located in syntenic positions in at least two genomes. Although hundreds of UCEs have been found in animal genomes, little is known about the incidence of ultraconservation in plant genomes. Using an alignment-free information-retrieval approach, we have comprehensively identified all long identical multispecies elements (LIMEs), which include both syntenic and nonsyntenic regions, of at least 100 identical base pairs shared by at least two genomes. Among six animal genomes, we found the previously known syntenic UCEs as well as previously undescribed nonsyntenic elements. In contrast, among six plant genomes, we only found nonsyntenic LIMEs. LIMEs can also be classified as either simple (repetitive) or complex (nonrepetitive), they may occur in multiple copies in a genome, and they are often spread across multiple chromosomes. Although complex LIMEs were found in both animal and plant genomes, they differed significantly in their composition and copy number. Further analyses of plant LIMEs revealed their functional diversity, encompassing elements found near rRNA and enzyme-coding genes, as well as those found in transposons and noncoding DNA. We conclude that despite the common presence of LIMEs in both animal and plant lineages, the evolutionary processes involved in the creation and maintenance of these elements differ in the two groups and are likely attributable to several mechanisms, including transfer of genetic material from organellar to nuclear genomes, de novo sequence manufacturing, and purifying selection.
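To make the definition of a LIME concrete, here is a minimal sketch of one simple way to seed a search for exactly identical segments shared by at least two genomes, regardless of where they sit. It is not the authors' alignment-free information-retrieval method, whose details the abstract does not give. The function name `shared_kmers`, the toy sequences, and the window size `k=8` (standing in for the 100 bp threshold) are all assumptions made for illustration. Reverse-complement matches are ignored.

```python
from collections import defaultdict


def shared_kmers(genomes, k):
    """Map each length-k substring to the genomes that contain it,
    keeping only substrings present in at least two genomes."""
    owners = defaultdict(set)
    for name, seq in genomes.items():
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)
    return {kmer: sorted(names) for kmer, names in owners.items() if len(names) >= 2}


# Made-up toy sequences; the shared segments occur in a different order in each,
# which mimics the nonsyntenic case described in the abstract.
toy = {
    "plant_a": "TTGACCGTAGGCTAACGT",
    "plant_b": "GGCTAACGTACCGTAGGA",
}
hits = shared_kmers(toy, k=8)
for kmer, names in sorted(hits.items()):
    print(kmer, names)
```

Every identical element of length at least `k` consists entirely of shared k-mers, so these hits can serve as seeds that a fuller implementation would extend and verify. Holding every k-mer in memory would not scale to whole genomes, so this is only a conceptual illustration.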