Exploring P-values with Simulations in R

(This article was first published on R – Stable Markets, and kindly contributed to R-bloggers)

The recent flare-up in discussions on p-values inspired me to conduct a brief simulation study.

In particular, I wanted to illustrate how p-values vary with different effect sizes and sample sizes.

Here are the details of the simulation. I simulated $n$ draws of my independent variable $X$:

[Equation missing in source: the distribution of $X$]

where

[Equation missing in source]

For each $X$, I define a $Y$ as

[Equation missing in source: the definition of $Y$]

where

[Two equations missing in source]

In other words, for each effect size, $\beta$, the simulation draws $X$ and $Y$ with some error $\epsilon$. The following regression model is estimated and the p-value of $\beta$ is observed.

$$Y = \beta_0 + \beta X + \epsilon$$
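A single pass of this procedure might look roughly like the sketch below. The function name `simulate_p_value` and every default value (the mean and standard deviation of `x`, the intercept, and `sigma`) are illustrative assumptions, not the settings used in the original simulation.

```r
# One simulated data set and one regression, returning the slope's p-value.
# All defaults (x_mean, x_sd, intercept, sigma) are illustrative assumptions,
# not the values used in the original post.
simulate_p_value <- function(n, beta, x_mean = 100, x_sd = 20,
                             intercept = 10, sigma = 5) {
  x <- rnorm(n, mean = x_mean, sd = x_sd)      # independent variable
  eps <- rnorm(n, mean = 0, sd = sigma)        # error term
  y <- intercept + beta * x + eps              # outcome with true slope beta
  fit <- lm(y ~ x)                             # estimate the regression
  summary(fit)$coefficients["x", "Pr(>|t|)"]   # p-value of the slope
}

set.seed(1)
simulate_p_value(n = 15, beta = 0.25)
```

The p-value is read from the coefficient table returned by `summary()`, where the row `"x"` holds the slope and the column `"Pr(>|t|)"` holds its two-sided p-value.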

The drawing and the regression are done 1,000 times, so that for each combination of effect size and sample size the simulation yields 1,000 p-values. The average of these 1,000 p-values for each combination is plotted below.
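The repetition and averaging step can be sketched as follows. The grids of sample sizes and effect sizes, the distribution of `x`, the intercept, and `sigma` are all assumed for illustration. The example call also uses 200 repetitions rather than 1,000 so that it runs quickly.

```r
# Average p-value over repeated simulations for a grid of effect and sample sizes.
# Grid values, the distribution of x, the intercept, and sigma are assumptions.
mean_p_value <- function(n, beta, reps = 1000, sigma = 5) {
  p <- replicate(reps, {
    x <- rnorm(n, mean = 100, sd = 20)
    y <- 10 + beta * x + rnorm(n, mean = 0, sd = sigma)
    summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
  })
  mean(p)
}

set.seed(1)
sample_sizes <- c(5, 10, 15, 20, 25)
effect_sizes <- c(0.05, 0.15, 0.25)

# Rows are sample sizes, columns are effect sizes.
avg_p <- sapply(effect_sizes, function(b) {
  sapply(sample_sizes, function(n) mean_p_value(n, b, reps = 200))
})
dimnames(avg_p) <- list(n = sample_sizes, beta = effect_sizes)
round(avg_p, 3)
```

Each cell of `avg_p` corresponds to one point on one curve of the plot: a single average p-value for one pairing of sample size and effect size.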

Note that these results are for a fixed error variance. A higher error variance would typically shift these curves upward, meaning that for each effect size, the same sample size would yield a weaker signal.

[Figure: average p-value plotted against sample size, with one curve per effect size]

There are many take-aways from this plot.

First, for a given sample size, larger effect sizes are “detected” more easily. By detected, I mean found to be statistically significant using the .05 threshold. It’s possible to detect larger effect sizes (e.g. .25) with relatively low sample sizes (in this case <10). By contrast, if the effect size is small (e.g. .05), then a larger sample is needed to detect the effect (>10).
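"Detection" in this sense can also be measured directly as the share of simulated p-values that fall below .05. The sketch below is one way to do that. The helper `detection_rate` and all of its distribution parameters are assumptions made for illustration, so its numbers will not match the plot.

```r
# Share of simulated p-values below alpha ("detection rate") for one setting.
# Distribution parameters are illustrative assumptions.
detection_rate <- function(n, beta, reps = 500, alpha = 0.05, sigma = 5) {
  p <- replicate(reps, {
    x <- rnorm(n, mean = 100, sd = 20)
    y <- 10 + beta * x + rnorm(n, mean = 0, sd = sigma)
    summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
  })
  mean(p < alpha)
}

set.seed(1)
detection_rate(n = 8, beta = 0.25)   # larger effect, small sample
detection_rate(n = 8, beta = 0.05)   # smaller effect, same sample
detection_rate(n = 25, beta = 0.05)  # smaller effect, larger sample
```

Comparing the three calls shows the same pattern as the curves: at a fixed sample size the larger effect is flagged more often, and the smaller effect is flagged more often only as the sample grows.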

Second, this figure illustrates an oft-heard warning about p-values: always interpret them within the context of sample size. Lack of statistical significance does not imply lack of an effect. An effect may exist, but the sample size may be insufficient to detect it (or the variability in the data set is too high). On the other hand, just because a p-value signals statistical significance does not mean that the effect is actually meaningful. Consider an effect size of .00000001 (effectively 0). According to the chart, even the p-value of this effect size tends to 0 as the sample size increases, eventually crossing the statistical significance threshold.

[Figure: p-value plotted against sample size for an effect size of .00000001]
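The reason a negligible effect eventually becomes significant can be seen from the t-statistic of the slope in a simple linear regression. The notation below is introduced here for illustration and is not from the original post: $\hat\beta$ is the estimated slope, $\hat\sigma$ the residual standard error, and $s_X$ the sample standard deviation of $X$.

$$t = \frac{\hat\beta}{\mathrm{SE}(\hat\beta)}, \qquad \mathrm{SE}(\hat\beta) = \frac{\hat\sigma}{s_X\sqrt{n-1}} \quad\Longrightarrow\quad t = \frac{\hat\beta\, s_X\sqrt{n-1}}{\hat\sigma}$$

For a fixed nonzero true effect, $\hat\beta$, $s_X$, and $\hat\sigma$ settle near constants as $n$ grows, so $|t|$ grows roughly in proportion to $\sqrt{n}$ and the p-value shrinks toward 0. For example, if $\hat\beta\, s_X / \hat\sigma \approx 0.01$, then $|t|$ reaches about 2 near $n = 40{,}000$, even though the effect itself has not changed.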
