Efficient and powerful familywise error control in genome-wide association studies using generalized linear models

Halle, K. K.; Bakke, Ø.; Djurovic, S.; Bye, A.; Ryeng, E.; Wisløff, U.; Andreassen, O. A.; Langaas, M.

doi:10.1111/sjos.12451

(Submitted on 18 Mar 2016 (v1), last revised 22 Dec 2016 (this version, v2))

Abstract: In genetic association studies, detecting phenotype-genotype association is a primary goal. We assume that the relationship between the data (phenotype, genetic markers and environmental covariates) can be modelled by a generalized linear model (GLM). The inclusion of environmental covariates makes it possible to account for important confounding factors, such as sex and population substructure. A multivariate score statistic, which under the complete null hypothesis of no phenotype-genotype association asymptotically has a multivariate normal distribution with a covariance matrix that can be estimated from the data, is used to test a large number of genetic markers for association with the phenotype. We stress the importance of controlling the familywise error rate (FWER), and use the asymptotic distribution of the multivariate score test statistic to find a local significance level for the individual tests. Using real data (from one study on schizophrenia and bipolar disorder and one on maximal oxygen uptake) and constructed correlated structures, we show that our method is a powerful alternative to the popular Bonferroni and Šidák methods. For GLMs without environmental covariates, we show that our method is an efficient alternative to permutation methods for multiple testing. Further, we show that if environmental covariates and genetic markers are uncorrelated, the estimated covariance matrix of the score test statistic can be approximated by the estimated correlation matrix for just the genetic markers. As byproducts of our method, an effective number of independent tests can be defined, and FWER-adjusted $p$-values can be calculated as an alternative to using a local significance level.
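The idea of a local significance level can be illustrated with a small sketch. It is not the authors' procedure: the abstract refers to the asymptotic multivariate normal distribution of the score statistic, whereas the code below swaps in plain Monte Carlo simulation and a toy AR(1) correlation structure (correlation `rho**|i-j|` between markers `i` and `j`), both of which are assumptions made here for illustration. The function names, the parameter values, and the ratio used for the effective number of tests are likewise hypothetical and may differ from the definitions in the paper.

```python
import random
from bisect import bisect_left
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simulate_max_abs_z(m, rho, n_sim, seed=0):
    """Sorted Monte Carlo draws of max_j |Z_j| for m correlated N(0, 1) statistics.

    Assumption: a toy AR(1) structure, corr(Z_i, Z_j) = rho ** |i - j|.
    """
    rng = random.Random(seed)
    innovation_sd = (1.0 - rho * rho) ** 0.5
    maxima = []
    for _sim in range(n_sim):
        z = rng.gauss(0.0, 1.0)
        largest = abs(z)
        for _marker in range(m - 1):
            # AR(1) recursion keeps unit variance and correlation rho per step.
            z = rho * z + innovation_sd * rng.gauss(0.0, 1.0)
            largest = max(largest, abs(z))
        maxima.append(largest)
    return sorted(maxima)


def local_level(sorted_maxima, alpha):
    """Two-sided per-test level whose cutoff gives P(max |Z_j| >= c) of about alpha."""
    n = len(sorted_maxima)
    cutoff = sorted_maxima[min(n - 1, int((1.0 - alpha) * n))]
    return 2.0 * (1.0 - STD_NORMAL.cdf(cutoff))


def fwer_adjusted_p(p_local, sorted_maxima):
    """Estimated probability that the largest |Z_j| reaches the cutoff implied by p_local."""
    cutoff = STD_NORMAL.inv_cdf(1.0 - p_local / 2.0)
    n = len(sorted_maxima)
    return (n - bisect_left(sorted_maxima, cutoff)) / n


# Illustrative values only; none of these numbers come from the paper.
m, rho, alpha = 50, 0.9, 0.05
maxima = simulate_max_abs_z(m, rho, n_sim=20000, seed=1)
alpha_loc = local_level(maxima, alpha)
print("Bonferroni level:", alpha / m)
print("Sidak level:     ", 1.0 - (1.0 - alpha) ** (1.0 / m))
print("Simulated level: ", alpha_loc)
# Assumed definition of an effective number of tests: alpha / alpha_loc.
print("Effective tests: ", alpha / alpha_loc)
print("FWER-adjusted p for p = 0.001:", fwer_adjusted_p(0.001, maxima))
```

With strongly correlated markers the simulated local level should come out above both the Bonferroni and Šidák levels, which is where the gain in power comes from. With `rho = 0` it should roughly agree with the Šidák level, up to simulation noise.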

Subjects:	Methodology (stat.ME); Applications (stat.AP)
DOI:	10.1111/sjos.12451
Cite as:	arXiv:1603.05938 [stat.ME]
	(or arXiv:1603.05938v2 [stat.ME] for this version)

Submission history

From: Mette Langaas
[v1] Fri, 18 Mar 2016 17:51:37 GMT (190kb,D)
[v2] Thu, 22 Dec 2016 12:09:23 GMT (51kb,D)
