7-years-of-statistics-in-25-minutes
📰 Summary (use your own words)
Goes over the core skills shared by master's and PhD statistics degrees, and then advanced skills as specializations.
✍️ Notes
Statistics is about understanding what future data will look like based on the data you have already collected.
Core
- Probabilities
- Data can be represented by a random variable $X$, and there are ways to characterize the data to help us grasp it
- A probability distribution, $f(X)$, describes which values are likely to appear in the future and which are not
- Expectation (mean) is the typical value we’d expect to see in the data
- Variance is the uncertainty, i.e. how far a random variable tends to be from its expectation (mean)
- The covariance between two random variables is also usually interesting because from it we can derive correlation, which is standardized covariance (see the sample-statistics sketch below)
- A dataset is multiple observations of the data; the key assumption we make in statistics is that they all come from the same underlying distribution
- Statistics are condensed ways of talking about a dataset
- A general statistic could just be a function of the random variable, $g(X)$
- Estimators are one type of statistic: an educated guess of the underlying distribution for the observed dataset
- Test statistics are another type, used for hypothesis tests
- Estimators
- Sample a population to learn about a characteristic (parameter)
- The estimator is trying to “guess” the parameter from the sample
- Maximum likelihood estimation (MLE) is how we can make that guess
- MLE
- Take the probability of the observed data, viewed as a function of the parameter, to construct a likelihood function
- Picking the parameter value that maximizes that likelihood is, in theory, a good guess
- For certain distributions the MLE has a closed form, found by taking the derivative of the log-likelihood and setting it to zero (worked out below for the normal distribution)
- For real-world applications you use a computer to solve for it numerically (see the numerical sketch at the end of these notes)
- Mathematical statistics
- Statistical programming
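
To make the expectation, variance, and covariance/correlation bullets concrete, here is a minimal sample-statistics sketch in Python. The numbers are made up purely for illustration, and NumPy is assumed to be available:

```python
import numpy as np

# Hypothetical paired observations of two random variables X and Y
# (made-up numbers, purely for illustration).
x = np.array([2.1, 3.4, 1.8, 4.0, 2.9, 3.7])
y = np.array([1.0, 2.2, 0.9, 2.8, 1.7, 2.5])

mean_x = x.mean()                    # expectation: the "typical" value we expect to see
var_x = x.var(ddof=1)                # sample variance: spread around the mean
cov_xy = np.cov(x, y)[0, 1]          # sample covariance between X and Y
corr_xy = cov_xy / (x.std(ddof=1) * y.std(ddof=1))  # correlation = standardized covariance

print(f"mean(X)={mean_x:.3f}  var(X)={var_x:.3f}")
print(f"cov(X,Y)={cov_xy:.3f}  corr(X,Y)={corr_xy:.3f}")
```

`np.corrcoef(x, y)[0, 1]` would return the same correlation directly; computing it from the covariance just mirrors the "standardized covariance" phrasing above.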
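
As a worked example of the closed-form MLE bullet, take the normal distribution (my choice of example; the video may use a different one). With i.i.d. observations $x_1, \dots, x_n$, the likelihood and log-likelihood are

$$
L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right),
\qquad
\ell(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 ,
$$

and setting the derivatives of $\ell$ to zero gives the closed-form estimates

$$
\frac{\partial \ell}{\partial \mu} = 0 \;\Rightarrow\; \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i,
\qquad
\frac{\partial \ell}{\partial \sigma^2} = 0 \;\Rightarrow\; \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2 .
$$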
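
And for the "use a computer to solve for it" bullet, a minimal numerical sketch, assuming SciPy is available and reusing the normal model above (the simulated data and starting values are arbitrary): maximize the log-likelihood by minimizing its negative with a general-purpose optimizer, then compare against the closed-form answer.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Simulated dataset; in a real application this would be the observed data.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)

def neg_log_likelihood(params):
    mu, log_sigma = params          # optimize log(sigma) so the scale stays positive
    sigma = np.exp(log_sigma)
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

result = minimize(neg_log_likelihood, x0=[0.0, 0.0], method="Nelder-Mead")
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])

# The numerical optimum should agree (up to tolerance) with the closed-form MLE.
print("numerical  :", mu_hat, sigma_hat)
print("closed form:", data.mean(), data.std(ddof=0))  # MLE of sigma^2 divides by n, hence ddof=0
```

Optimizing $\log\sigma$ instead of $\sigma$ is just a convenience to keep the scale parameter positive without a constrained optimizer.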