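The genome of an organism is composed of four nucleotides: A, T, G and C. Scientists have found that the frequencies of fixed-length nucleotide strings (for example, dinucleotides such as AT, AG, AC, TG, TC and GC) are largely consistent across different regions of a genome. Recently, Dr. Fengfeng Zhou, a research scientist at the Systems Biology Laboratory of the University of Georgia, found that mapping the occurrence frequencies of all fixed-length nucleotide strings in different regions of a genome to different colors in an image shows this feature intuitively and effectively (see the figure below).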
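This feature is called the barcode of a genome (or of a chromosome). Further analysis shows that the barcodes of different chromosomes of the same species are fairly similar to one another, whereas the barcodes of genomes (or chromosomes) of different species differ to some degree. The barcodes of eukaryotes, prokaryotes, chloroplasts and mitochondria can be separated very clearly. These properties can greatly improve classification studies in metagenomics. A genome (or chromosome) may also contain regions whose barcodes differ from the rest. Studies indicate that these regions may have been acquired from other species through mechanisms such as horizontal transfer.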
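The work was published in BMC Bioinformatics and was accessed more than 1,100 times in less than half a month after publication. Further application studies will be released soon; for details, see the author's homepage: http://csbl.bmb.uga.edu/~ffzhou/ (Bioon.com)
Original source recommended by Bioon: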
BMC Bioinformatics 2008, 9:546 doi:10.1186/1471-2105-9-546
Barcodes for genomes and applications
Fengfeng Zhou, Victor Olman and Ying Xu
Background
Each genome has a stable distribution of the combined frequency for each k-mer and its reverse complement measured in sequence fragments as short as 1000 bps across the whole genome, for 1<k<6. The collection of these k-mer frequency distributions is unique to each genome and termed the genome's barcode.
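The definition above can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: it splits a sequence into non-overlapping 1000 bp fragments and, for one value of k, records the combined relative frequency of each k-mer and its reverse complement. The function names, the choice of non-overlapping fragments, the normalization to relative frequencies, and the skipping of k-mers containing characters other than A, C, G and T are all assumptions made here for clarity.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Represent a k-mer and its reverse complement by one shared key."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers(k):
    """Sorted list of the distinct k-mer / reverse-complement classes."""
    return sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})


def fragment_frequencies(fragment, k, columns):
    """Relative frequency of each canonical k-mer within one fragment."""
    counts = Counter()
    for i in range(len(fragment) - k + 1):
        kmer = fragment[i:i + k]
        # Assumption: k-mers with ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in columns]


def barcode(sequence, k=4, fragment_length=1000):
    """One row per fragment, one column per canonical k-mer.

    Assumption: fragments are non-overlapping and a trailing partial
    fragment is dropped.
    """
    sequence = sequence.upper()
    columns = canonical_kmers(k)
    rows = []
    for start in range(0, len(sequence) - fragment_length + 1, fragment_length):
        fragment = sequence[start:start + fragment_length]
        rows.append(fragment_frequencies(fragment, k, columns))
    return rows
```

Each row of the returned matrix could then be mapped to a row of colors to obtain a barcode-style image; the plotting step is left out because the source does not specify the color scale.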
Results
We found that for each genome, the majority of its short sequence fragments have highly similar barcodes, while sequence fragments with different barcodes typically correspond to genes that are horizontally transferred or highly expressed. This observation has led to new and more effective ways of solving two challenging problems: the metagenome binning problem and the identification of horizontally transferred genes. Our barcode-based metagenome binning algorithm substantially improves the state of the art in terms of both binning accuracy and scope of applicability. Other attractive properties of genome barcodes include (a) the barcodes have different and identifiable characteristics for different classes of genomes, such as prokaryotes, eukaryotes, mitochondria and plastids, and (b) barcode similarities are generally proportional to the genomes' phylogenetic closeness.
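The observation that most fragments share a similar barcode while a few deviate suggests a simple outlier screen, sketched below. This is not the method used in the paper, whose details are not given here. It assumes the input is a list of per-fragment frequency vectors (one list of floats per fragment), takes the column-wise median as the "typical" barcode, and flags fragments whose Euclidean distance from it is unusually large. The function name `atypical_fragments` and the `z_cutoff` parameter are hypothetical, and flagged fragments are only candidates for further inspection.

```python
import math
import statistics


def atypical_fragments(rows, z_cutoff=2.0):
    """Return indices of fragments whose frequency vector is far from typical.

    rows: list of equal-length lists of floats, one per fragment.
    z_cutoff: hypothetical threshold on the z-score of the distances.
    """
    if not rows:
        return []
    n_cols = len(rows[0])
    # Column-wise median as a robust estimate of the genome-wide profile.
    typical = [statistics.median(row[j] for row in rows) for j in range(n_cols)]
    distances = [math.dist(row, typical) for row in rows]
    mean_d = statistics.fmean(distances)
    sd_d = statistics.pstdev(distances)
    if sd_d == 0:
        # All fragments are equally distant from the profile: nothing stands out.
        return []
    return [i for i, d in enumerate(distances) if (d - mean_d) / sd_d > z_cutoff]
```

A median is used instead of a mean so that a handful of divergent fragments does not pull the reference profile toward themselves; other distance measures or thresholds would be equally reasonable choices.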
Conclusions
These and other properties of genome barcodes make them a new and powerful tool for studying numerous genome and metagenome analysis problems.