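Under the guidance of Jun Yu, a researcher at the Beijing Institute of Genomics, Chinese Academy of Sciences, a group including graduate students Jiang Zhu and Fuhong He used EST data to systematically analyze the tissue specificity of human gene expression. The results were published in a recent issue of BMC Genomics.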
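The adult human body is made up of more than 200 cell types, each with its own cell-specific transcriptome composition. Tissue-specific (TS) genes are expressed only in particular tissues, where they carry out tissue-specific cellular functions, whereas housekeeping (HK) genes are expressed in all tissues to maintain basic cellular functions. Defining the "basal transcriptome" formed by HK genes is the foundation for understanding "cell-specific transcriptomes". Several microarray-based studies have already defined HK gene sets. Although each of these sets estimates roughly 500 human HK genes, the overlap among them is small. How many genes in the human transcriptome are HK genes, and which ones, remains an open question.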
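The Chinese Academy of Sciences researchers integrated the existing human EST data with a widely used and relatively complete microarray dataset, organized both by human tissue classification, and systematically compared the two datasets across the 18 most thoroughly studied human tissues. Using a set of HK genes identified from gene function annotation as a reference, the results show that both data types have their own limitations. EST sequencing of most human tissues has still not reached saturation, which limits gene detection and the study of tissue-specific expression patterns in those tissues. Microarray data have a lower gene detection rate on average and therefore underestimate the number of human HK genes. The study also shows that, among currently annotated genes, about 40% are broadly expressed across human tissues, while only 5% are expressed specifically in a particular tissue. Revealing tissue-specific expression patterns therefore requires more accurate and larger-scale transcriptome data. The study ultimately redefined the set of human HK genes. The number of human HK genes defined from EST data lies between 3,140 and 6,909, about ten times the size of the HK gene sets previously derived from microarray data. The work provides a new foundation for systematically analyzing the properties of HK and TS genes, and it provides an analytical framework and a preliminary data model for the institute's ongoing cell-based human transcriptome research (a "973" program project).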
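Follow-up research will build on the newly defined human HK gene set to systematically compare HK and TS genes in terms of gene structure, evolutionary rate, promoter structure, and other aspects, with the aim of demonstrating that HK genes differ from TS genes at every level and ultimately revealing the special role of HK genes in the composition of the transcriptome. (Source: Beijing Institute of Genomics, Chinese Academy of Sciences)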
Original source recommended by 生物谷 (Bioon):
(BMC Genomics), doi:10.1186/1471-2164-9-172, Jiang Zhu, Jun Yu
How many human genes can be defined as housekeeping with current expression data?
Jiang Zhu, Fuhong He, Shuhui Song, Jing Wang and Jun Yu
Abstract (provisional)
Background
Housekeeping (HK) genes are ubiquitously expressed in all tissue/cell types and constitute a basal transcriptome for the maintenance of basic cellular functions. Partitioning transcriptomes into HK and tissue-specific (TS) genes is fundamental for studying gene expression and cellular differentiation. Although many studies have aimed at large-scale and thorough categorization of human HK genes, a meaningful consensus has yet to be reached.
Results
We collected the two latest gene expression datasets (both EST and microarray data) from public databases and analyzed the gene expression profiles in 18 human tissues that have been well documented by both data types. Benchmarked by a manually curated HK gene collection (HK408), we demonstrated that present data from EST sampling was far from saturated, and the inadequacy has limited the gene detectability and our understanding of TS expressions. Due to a likely over-stringent threshold, microarray data showed a higher false-negative rate compared with EST data, leading to a significant underestimation of HK genes. Based on EST data, we found that 40.0% of the currently annotated human genes were universally expressed in at least 16 of 18 tissues, as compared to only 5.1% specifically expressed in a single tissue. Our current EST-based estimate on human HK genes ranged from 3,140 to 6,909 in number, a ten-fold increase in comparison with previous microarray-based estimates.
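The breadth criteria quoted above (detected in at least 16 of 18 tissues versus detected in a single tissue) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `classify_by_breadth`, the labels, the gene names, and the toy detection table are all hypothetical, and "detected" is assumed to mean that a gene has some EST evidence in a tissue.

```python
from typing import Dict, Set

N_TISSUES = 18   # number of tissues compared in the study
BROAD_MIN = 16   # "at least 16 of 18 tissues", as stated in the abstract


def classify_by_breadth(detected: Dict[str, Set[str]],
                        broad_min: int = BROAD_MIN) -> Dict[str, str]:
    """Label each gene by the number of tissues in which it is detected.

    `detected` maps a gene name to the set of tissues with expression
    evidence. The label names are illustrative assumptions.
    """
    labels = {}
    for gene, tissues in detected.items():
        breadth = len(tissues)
        if breadth >= broad_min:
            labels[gene] = "broad"            # candidate housekeeping gene
        elif breadth == 1:
            labels[gene] = "tissue-specific"  # detected in a single tissue
        elif breadth == 0:
            labels[gene] = "undetected"
        else:
            labels[gene] = "intermediate"
    return labels


# Hypothetical toy data: tissue and gene names are placeholders.
tissues = [f"tissue_{i}" for i in range(1, N_TISSUES + 1)]
toy = {
    "geneA": set(tissues),        # detected in all 18 tissues
    "geneB": set(tissues[:16]),   # detected in exactly 16 tissues
    "geneC": {"tissue_3"},        # detected in one tissue only
    "geneD": set(tissues[:7]),    # detected in 7 tissues
}
labels = classify_by_breadth(toy)
print(labels)
```

Because EST sampling is described as far from saturated, a gene absent from a tissue in such a table may simply be undersampled, so labels produced this way are best read as provisional.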
Conclusions
We concluded that a significant fraction of human genes, at least in the currently annotated data depositories, was broadly expressed. Our understanding of tissue-specific expression was still preliminary and required much more large-scale and high-quality transcriptomic data in future studies. The new HK gene list categorized in this study will be useful for genome-wide analyses on structural and functional features of HK genes.