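Recently, BGI released SOAPfuse, an algorithm for detecting gene fusions. A comprehensive evaluation on simulated data and on experimentally validated real data shows that the algorithm offers high accuracy, high sensitivity, high precision, and low resource consumption. It relies mainly on a partial exhaustion algorithm and a series of fine-grained filtering strategies to detect gene fusions quickly and precisely. The findings were published online in the journal Genome Biology.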
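Gene fusion is the phenomenon in which two genes at different chromosomal locations join to form a single chimeric gene. It is generally caused by chromosomal translocation, deletion, or inversion. Gene fusions play an important role in the development of cancer and can serve as targets for cancer diagnosis and treatment. As research on gene fusions has deepened, researchers have found that they occur not only in hematological malignancies but also in solid tumors.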
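Traditional methods for studying gene fusions suffer from low throughput, complex procedures, and poor suitability for large-scale sample screening. High-throughput RNA sequencing, by contrast, offers high throughput, low cost, high detection precision, and a broad detection range. Compared with whole-genome sequencing, it can find not only gene fusions caused by genomic rearrangement but also additional fusions that arise at the transcript level.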
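SOAPfuse first looks for candidate gene fusions among paired-end reads aligned to the genome and to annotated transcripts. It then applies a partial exhaustion algorithm and a series of fine-grained filtering strategies to remove false-positive fusions while retaining as many true fusions as possible. The algorithm also provides fusion breakpoint prediction and visualization, which is of great significance for clinical molecular subtyping and the development of new anticancer drugs. (生物谷 Bioon.com)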
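The candidate-finding step can be illustrated with a minimal Python sketch. It is not SOAPfuse's actual code: the function name `candidate_fusions`, the `min_span_reads` threshold, the input format, and the placeholder gene names are all assumptions made for illustration. The idea shown is simply that a read pair whose two ends align to different genes counts as evidence for a fusion between those genes.

```python
from collections import Counter


def candidate_fusions(pair_alignments, min_span_reads=2):
    """Count read pairs whose two ends map to different genes.

    pair_alignments: iterable of (gene_of_end1, gene_of_end2) tuples, where
    None means that end did not map to any annotated gene. Both the input
    format and the default threshold are hypothetical.
    """
    support = Counter()
    for gene1, gene2 in pair_alignments:
        if gene1 is None or gene2 is None or gene1 == gene2:
            continue  # unannotated or same-gene pair: no fusion evidence
        # Sort so that (A, B) and (B, A) count toward the same gene pair.
        support[tuple(sorted((gene1, gene2)))] += 1
    # Keep only gene pairs supported by enough spanning read pairs.
    return {pair: n for pair, n in support.items() if n >= min_span_reads}


# Toy input with placeholder gene names (not real data).
pairs = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_A"),
    ("GENE_C", "GENE_C"),
    ("GENE_D", "GENE_E"),
    ("GENE_F", None),
]
print(candidate_fusions(pairs))  # {('GENE_A', 'GENE_B'): 2}
```

A real pipeline would apply many further filters after this step; the support threshold here stands in for only the simplest of them.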
doi:10.1200/JCO.2012.46.9270
SOAPfuse: an algorithm for identifying fusion transcripts from paired-end RNA-Seq data
Wenlong Jia, Kunlong Qiu, Minghui He, Pengfei Song, Quan Zhou, Feng Zhou
We have developed a new method, SOAPfuse, to identify fusion transcripts from paired-end RNA-seq data. SOAPfuse applies an improved partial exhaustion algorithm to construct a library of fusion junction sequences, which can be used to efficiently identify fusion events, and employs a series of filters to nominate high-confidence fusion transcripts. Compared with other released tools, SOAPfuse achieves higher detection efficiency and consumes fewer computing resources. We applied SOAPfuse to RNA-seq data from two bladder cancer cell lines, and confirmed 15 fusion transcripts, including several novel events common to both cell lines. SOAPfuse is available at http://soap.genomics.org.cn/soapfuse.html.
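As a rough illustration of what a "library of fusion junction sequences" might look like, the following Python sketch enumerates every breakpoint combination inside two small windows of a candidate gene pair and then checks which junction sequences are crossed by a read. This is one plausible reading of "partial exhaustion" (exhaustive only within limited regions), not the published implementation. The function names, the window and `flank` parameters, the exact-match read test, and the toy sequences are all assumptions.

```python
def junction_library(up_seq, down_seq, up_window, down_window, flank=4):
    """Enumerate candidate junction sequences for one gene pair.

    up_window and down_window are half-open (start, end) ranges of
    breakpoint positions to try. Only these regions are enumerated,
    not the whole genes.
    """
    library = {}
    for i in range(*up_window):        # upstream part ends just before i
        if i < flank:
            continue
        for j in range(*down_window):  # downstream part starts at j
            if j + flank > len(down_seq):
                continue
            library[(i, j)] = up_seq[i - flank:i] + down_seq[j:j + flank]
    return library


def junction_hits(reads, library, flank=4):
    """Count reads that match a junction sequence exactly and cross the
    junction, i.e. cover at least one base on each side of it."""
    hits = {}
    for breakpoint, seq in library.items():
        count = 0
        for read in reads:
            # Offsets p where the read starts before the junction and
            # ends after it.
            offsets = range(max(0, flank - len(read) + 1), flank)
            if any(seq.startswith(read, p) for p in offsets):
                count += 1
        if count:
            hits[breakpoint] = count
    return hits


# Toy sequences (invented); the simulated fusion joins up_seq[:8] to down_seq[4:].
up_seq = "ACGTTGCAAGCTTAGC"
down_seq = "GGATCCTTAGGCATCG"
library = junction_library(up_seq, down_seq, (6, 10), (2, 6))
print(len(library))                         # 16
print(junction_hits(["GCACCT"], library))   # {(8, 4): 1}
```

Restricting enumeration to small windows keeps the library small (16 sequences here) while still allowing the breakpoint to be located at single-base resolution within those windows.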