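Multiple genes, gene-by-gene interactions, and gene-by-environment interactions are considered central to understanding most complex diseases, yet these interactions are very hard to pin down. Although there have been recent successes in identifying genetic variants for complex diseases, identifying gene-gene and gene-environment interactions remains difficult.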
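To overcome these difficulties, researchers from the Department of Epidemiology and Public Health at Yale University School of Medicine and from Jiangxi Normal University have proposed a forest-based approach together with a concept of variable importance. The results were published in the Proceedings of the National Academy of Sciences (PNAS).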
The study was led by Professor Heping Zhang, who received his Ph.D. from Stanford University in 1991 and is now a tenured professor of biostatistics at Yale University.
Original source:
Published online before print November 28, 2007
Proc. Natl. Acad. Sci. USA, 10.1073/pnas.0709868104
A forest-based approach to identifying gene and gene–gene interactions
Xiang Chen*, Ching-Ti Liu*, Meizhuo Zhang*, and Heping Zhang*
*Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520-8034; and Jiangxi Normal University, Jiangxi 330027, China
Communicated by Herman Chernoff, Harvard University, Cambridge, MA, October 18, 2007 (received for review July 30, 2007)
Abstract
Multiple genes, gene-by-gene interactions, and gene-by-environment interactions are believed to underlie most complex diseases. However, such interactions are difficult to identify. Although there have been recent successes in identifying genetic variants for complex diseases, it still remains difficult to identify gene–gene and gene–environment interactions. To overcome this difficulty, we propose a forest-based approach and a concept of variable importance. The proposed approach is demonstrated by simulation study for its validity and illustrated by a real data analysis for its use. Analyses of both real data and simulated data based on published genetic models show the effectiveness of our approach. For example, our analysis of a published data set on age-related macular degeneration (AMD) not only confirmed a known genetic variant (P value = 2E-6) for AMD, but also revealed an unreported haplotype surrounding single-nucleotide polymorphism (SNP) rs10272438 on chromosome 7 that was significantly associated with AMD (P value = 0.0024). These significance levels are obtained after the consideration for a large number of SNPs. Thus, the importance of this work is twofold: it proposes a powerful and flexible method to identify high-risk haplotypes and their interactions and reveals a potentially protective variant for AMD.
age-related macular degeneration | genomewide association | haplotype | single-nucleotide polymorphism | tree and forest methods
Appendix:
Heping Zhang
Associate Professor Biostatistics, Child Study, and Statistics
Department of Epidemiology and Public Health
Yale University School of Medicine
New Haven, CT 06520-8034
Office phone: 785-6272
Office location: LEPH 202
Research Interests
Recursive Partitioning (Trees and Splines) in Health Sciences
Nonparametric Analysis of Longitudinal (continuous and discrete) Data
Linkage and Association Analyses, Mapping Quantitative Trait Loci
FMR Imaging Analysis
Education
Ph.D. in Statistics, with minor in Computer Sciences
Stanford University, Stanford, 1991
Selected Publications and Related Software
Confidence regions in linear functional relationship. Annals of Statistics, 22, 49-66, 1994.
Maximum correlation and splines. Technometrics, 36, 196-201, 1994. Software (MASAL) is available.
Extreme discordant sib pairs for mapping quantitative trait loci in humans. Science, 268, 1584-1589, 1995. (with N. Risch) Software is available.
A tree-based method in prospective studies. Statistics in Medicine, 15, 37-50, 1996. (with T. Holford and M. Bracken) Software (RTREE) is available.
Multivariate adaptive splines for longitudinal data (MASAL). Journal of Computational and Graphical Statistics, 6, 74-91, 1997. Software is available.
Classification trees for multiple binary responses. Journal of the American Statistical Association, 93, 180-193, 1998. Software (CTMBR) is available.
Recursive Partitioning in the Health Sciences. Springer, New York, 1999. (with B. Singer) Survival tree program (STREE) is available.