Statistics > Applications
Title: Statistical methods for linguistic research: Foundational Ideas - Part I
(Submitted on 6 Jan 2016 (v1), last revised 7 Jan 2016 (this version, v2))
Abstract: We present the fundamental ideas underlying statistical hypothesis testing using the frequentist framework. We begin with a simple example that builds up the one-sample t-test from the beginning, explaining important concepts such as the sampling distribution of the sample mean and the iid assumption. Then we examine the p-value in detail, and discuss several important misconceptions about what a p-value does and does not tell us. This leads to a discussion of Type I and Type II error and power, and of Type S and Type M error. An important conclusion from this discussion is that one should aim to carry out appropriately powered studies. Next, we discuss two common issues we have encountered in psycholinguistics and linguistics: running experiments until significance is reached, and the "garden-of-forking-paths" problem discussed by Gelman and others, whereby the researcher attempts to find statistical significance by analyzing the data in different ways. The best way to use frequentist methods is to run appropriately powered studies, check model assumptions, clearly separate exploratory data analysis from confirmatory hypothesis testing, and always attempt to replicate results.
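As a rough illustration of the one-sample t-test mentioned in the abstract, the statistic divides the distance between the sample mean and the hypothesized mean by the estimated standard error of the mean, which is the standard deviation of the sampling distribution of the sample mean. The sketch below is not taken from the paper: the function name `one_sample_t`, the data values, and the null value `mu0` are all hypothetical, and it uses only the Python standard library.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for the null hypothesis that the population mean is mu0.

    Assumes the observations are independent and identically distributed.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)   # sample SD with n - 1 in the denominator
    if sd == 0:
        raise ValueError("sample has zero variance; t is undefined")
    se = sd / math.sqrt(n)          # estimated standard error of the mean
    return (mean - mu0) / se, n - 1


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t, df = one_sample_t(sample, mu0=2.0)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

Turning `t` into a p-value requires the t-distribution with `df` degrees of freedom, which the standard library does not provide, so that step is left out here.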
Submission history
From: Shravan Vasishth
[v1] Wed, 6 Jan 2016 10:27:38 GMT (147kb,D)
[v2] Thu, 7 Jan 2016 13:47:08 GMT (146kb,D)