📰 Summary (use your own words)
What does imply causation? The gold standard is a double-blind controlled trial (or the [[AB Testing]] equivalent), but if we can't perform such an experiment, what do we do?
Explains the methodology 538 uses to simulate its game prediction models.
First ghost - it's either significant or noise
- The frequentist approach to statistical significance testing is sensitive to large sample sizes: with enough data, even a negligible effect becomes statistically significant
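
A minimal sketch of the large-sample issue, assuming a pooled two-proportion z-test and made-up conversion counts (the helper name `two_proportion_p_value` and all numbers are hypothetical, not from the source). The same 0.1-point lift is tested at two sample sizes:

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided normal tail: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: 10.0% vs 10.1% conversion in both scenarios.
small = two_proportion_p_value(100, 1_000, 101, 1_000)
large = two_proportion_p_value(1_000_000, 10_000_000, 1_010_000, 10_000_000)

print(f"n = 1,000 per arm:      p = {small:.3f}")
print(f"n = 10,000,000 per arm: p = {large:.2e}")
```

The effect size is identical in both calls; only the sample size changes, yet the second p-value falls far below any conventional threshold. Whether a 0.1-point lift matters is a practical question the p-value alone does not answer.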
The Bayesian AB testing approach mitigates the problem of peeking, but peeking still inflates the Type I error rate; controlling that rate is simply not a guarantee the Bayesian approach makes. Conversely, the frequentist approach does promise a Type I error rate through p-value testing, so it explicitly breaks that promise if we peek and act on the experiment early.
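
To make the frequentist half of that claim concrete, here is a small simulation sketch. It assumes an A/A test (no true effect) on a Gaussian metric with known unit variance, and every parameter (`n_sims`, `n_per_arm`, `peek_every`) is an illustrative choice rather than anything from the source. It compares the false-positive rate of one pre-planned final look against "declare a win at any interim look":

```python
import math
import random

def simulate_peeking(n_sims=400, n_per_arm=1000, peek_every=100,
                     z_crit=1.96, seed=0):
    """Run A/A tests and return (fixed-horizon, peeking) false-positive rates."""
    rng = random.Random(seed)
    fixed_hits = 0    # significant at the single final look
    peeking_hits = 0  # significant at ANY look, interim or final
    for _ in range(n_sims):
        sum_a = sum_b = 0.0
        z = 0.0
        significant_at_any_look = False
        for n in range(1, n_per_arm + 1):
            # Both arms draw from the same N(0, 1): the null is true.
            sum_a += rng.gauss(0, 1)
            sum_b += rng.gauss(0, 1)
            if n % peek_every == 0:
                # z-test for a difference in means with known unit variance
                z = (sum_b / n - sum_a / n) / math.sqrt(2 / n)
                if abs(z) > z_crit:
                    significant_at_any_look = True
        fixed_hits += abs(z) > z_crit  # z holds the final-look statistic
        peeking_hits += significant_at_any_look
    return fixed_hits / n_sims, peeking_hits / n_sims

fixed_rate, peeking_rate = simulate_peeking()
print(f"fixed horizon: {fixed_rate:.3f}, with peeking: {peeking_rate:.3f}")
```

The fixed-horizon rate should land near the nominal 5%, while the peeking rate comes out noticeably higher because each extra look is another chance to cross the threshold by luck. This sketch does not model a Bayesian stopping rule.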
- It can be an alternative to the frequentist approach
- A way to evaluate a model's design on a limited data set
- Usually, the goal of statistical tests is to show with a high degree of confidence that an empirically estimated statistic is similar to a theoretically derived statistic