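Scientists at the University of Queensland in Australia have devised a fast, reliable, and simple error-correction method that, much as a computer spell-checker catches misspelled words, detects DNA code errors in amplicon sequences produced during gene sequencing. The work was published in May of this year in Nature Methods, a journal in the Nature family.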
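The software implementing the new method is called "Acacia" and is particularly suited to analyzing amplicons, which are important fragments of microbial genes. After a sequencer reads the four-letter alphabet of DNA bases (A, C, T, and G) and spells out the genes of different organisms, Acacia analyzes the output. It applies likelihood-based statistical theory to specific runs of DNA bases, which are frequently inserted or deleted in error during sequencing. The method combines computer science, statistics, and biology, and so falls within bioinformatics.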
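To make the error type concrete: the insertions and deletions described above tend to occur inside runs of a repeated base (homopolymers), so a first step in reasoning about them is to locate those runs in a read. The following is a minimal sketch of that step only. It is not Acacia's algorithm, and the function names and the `min_length` threshold are hypothetical choices made for illustration.

```python
from itertools import groupby


def homopolymer_runs(read):
    """Return (base, run_length) pairs for each run of identical bases."""
    return [(base, len(list(group))) for base, group in groupby(read)]


def long_runs(read, min_length=3):
    """Return (start, base, length) for runs of at least min_length bases.

    min_length is an arbitrary illustrative threshold, not a value
    taken from the paper.
    """
    runs = []
    position = 0
    for base, length in homopolymer_runs(read):
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs


read = "ACGTTTTAGGGC"
print(homopolymer_runs(read))
print(long_runs(read))  # [(3, 'T', 4), (8, 'G', 3)]
```

A sequencer that miscounts the length of the `TTTT` run would report `TTT` or `TTTTT` instead, which is the kind of insertion-deletion error a corrector has to resolve.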
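At present, machine errors caused by long runs of A, C, G, or T often lead biologists to misidentify genes and to misjudge which microbial species may be present in samples from, for example, wastewater treatment plants, the ocean, or even our own gut. To correct these errors, scientists mainly rely on two error-correction software packages. Compared with these tools, Acacia not only offers clear advantages but is also easy to use. (Bioon.com)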
doi:10.1038/nmeth.1990
Fast, accurate error-correction of amplicon pyrosequences using Acacia
Lauren Bragg, Glenn Stone, Michael Imelfort, Philip Hugenholtz & Gene W Tyson
Microbial diversity metrics based on high-throughput amplicon sequencing are compromised by read errors. Roche 454 GS FLX Titanium pyrosequencing is currently the most widely used technology for amplicon-based microbial community studies, despite high homopolymer-associated insertion-deletion error rates [1, 2]. Currently, there are two software packages, AmpliconNoise [3] and Denoiser [4], that are commonly used to correct amplicon pyrosequencing errors. AmpliconNoise applies an approximate likelihood using empirically derived error distributions to remove pyrosequencing noise from reads. AmpliconNoise is highly effective at noise removal but is computationally intensive [3]. Denoiser is a faster algorithm that uses frequency-based heuristics rather than statistical modeling to cluster reads. Neither tool modifies individual reads; instead both select an 'error-free' read to represent reads in a given cluster.
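The idea of selecting one read to represent a cluster, rather than editing individual reads, can be illustrated with a toy sketch. This is not the AmpliconNoise or Denoiser algorithm: the clustering rule (reads that become identical once homopolymer runs are collapsed to a single base) and the function names are assumptions chosen only to keep the example short, and such a rule would also merge reads that genuinely differ in run length.

```python
from collections import Counter, defaultdict
from itertools import groupby


def collapse_homopolymers(read):
    """Collapse each run of identical bases to one base (toy cluster key)."""
    return "".join(base for base, _ in groupby(read))


def pick_representatives(reads):
    """Group reads by collapsed key and return the most abundant read per group."""
    clusters = defaultdict(Counter)
    for read in reads:
        clusters[collapse_homopolymers(read)][read] += 1
    # The most frequent read in each cluster stands in for the whole cluster.
    return {key: counts.most_common(1)[0][0] for key, counts in clusters.items()}


reads = ["ACGTTTA", "ACGTTTA", "ACGTTA", "ACGTTTTA", "GGCA"]
print(pick_representatives(reads))
# {'ACGTA': 'ACGTTTA', 'GCA': 'GGCA'}
```

Here the reads with two, three, and four `T` bases fall into one cluster, and the most abundant variant (`ACGTTTA`) is taken as its representative; the minority reads are left unmodified and simply mapped to it.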