7-years-of-statistics-in-25-minutes

#statistics

📰 Summary (use your own words)

Covers the core and shell skills needed for master's and PhD degrees; the advanced skills are specializations.

✍️ Notes

Statistics is about using the data you have collected to understand what data is likely to appear in the future.

Core

  1. Probabilities
    • Data can be represented by a random variable $X$, and there are ways to characterize it that help us grasp the data

    • A probability distribution ($f(X)$) describes which values are likely to appear in the future and which are not

      • Expectation (mean) is the typical value we’d expect to see in the data
      • Variance is the uncertainty: how far a random variable tends to be from its expectation (mean)
        • The joint variability of two random variables (covariance) is also usually interesting because correlation, which is standardized covariance, can be derived from it
    • A dataset is multiple observations of the data; the key assumption we make in statistics is that they all come from the same underlying distribution

    • Statistics are condensed ways of talking about a dataset

      • A general statistic can be any function of the random variable, $g(X)$
      • Estimators are a type of statistic: educated guesses about the underlying function (distribution) behind the observed dataset
      • Test statistics are another type, used for hypothesis tests
    • Estimators

      • Sample a population to learn about a characteristic (parameter)
      • The estimator is trying to “guess” the parameter from the sample
      • Maximum likelihood estimation (MLE) is one way to make the estimator's guess
    • MLE

      • Take the probability of the data and construct a likelihood function
      • Picking the parameter value that maximizes the likelihood gives, in theory, a good guess
      • For certain distributions the MLE has a closed form, found by setting the derivative of the log-likelihood to zero and solving
      • In real-world applications you usually solve for it numerically with a computer
  2. Mathematical statistics

  3. Statistical programming
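
To make the expectation, variance, covariance, and correlation notes under Probabilities concrete, here is a minimal sketch in plain Python. It is not from the video: the function names and the toy `hours` and `score` data are made up for illustration, and it assumes the sample versions of these quantities with an n - 1 denominator.

```python
import math

def mean(xs):
    """Sample mean: the typical value of the observations."""
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance, using the n - 1 denominator (an assumption of this sketch)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def variance(xs):
    """Variance is the covariance of a variable with itself."""
    return covariance(xs, xs)

def correlation(xs, ys):
    """Correlation is covariance standardized by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))

# Hypothetical toy data, invented for this example
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 64.0, 68.0]

print("mean of hours:", mean(hours))
print("variance of hours:", variance(hours))
print("covariance:", covariance(hours, score))
print("correlation:", correlation(hours, score))
```

Because correlation divides out the scale of both variables, it always lands between -1 and 1, which makes it easier to compare across datasets than raw covariance.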

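The MLE notes under Probabilities can be sketched the same way. This example is an assumption of mine, not something from the video: it picks the exponential distribution because its log-likelihood, $n \log \lambda - \lambda \sum_i x_i$, has a closed-form maximizer $\hat{\lambda} = n / \sum_i x_i$, and then compares that against a simple numerical search. The helper names, the search bounds, and the simulated data are all hypothetical.

```python
import math
import random

def log_likelihood(rate, data):
    """Exponential log-likelihood: n * log(rate) - rate * sum(data)."""
    return len(data) * math.log(rate) - rate * sum(data)

def mle_closed_form(data):
    """Set the derivative n / rate - sum(data) to zero and solve for rate."""
    return len(data) / sum(data)

def mle_numeric(data, lo=1e-6, hi=100.0, iters=200):
    """Ternary search for the maximizer.

    Valid here because this log-likelihood is concave in rate;
    assumes the true maximizer lies inside [lo, hi].
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1, data) < log_likelihood(m2, data):
            lo = m1  # the maximum is to the right of m1
        else:
            hi = m2  # the maximum is to the left of m2
    return (lo + hi) / 2

# Simulated data with a known rate, so the guess can be sanity-checked
random.seed(0)
true_rate = 2.0
data = [random.expovariate(true_rate) for _ in range(5000)]

closed = mle_closed_form(data)
numeric = mle_numeric(data)
print("closed-form MLE:", closed)
print("numeric MLE:", numeric)
```

Both routes should agree closely and land near the true rate. For models without a closed form, only the numerical route is available, typically with a proper optimizer rather than this hand-rolled search.
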
Shell

Advanced