- ☀️Daily Log:
- Dealing with lots of zeros in data #data-science
- Data transformation
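- Use log, sqrt, or Box-Cox transforms to reduce skew; since log(0) is undefined and Box-Cox requires strictly positive values, use log(1 + x) or add a small positive constant first when zeros are present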
- Zero-inflated models (count)
- Designed to deal with situations where there is an “excessive” number of individuals with a count of 0
- Excess zeros often show up as overdispersion (the conditional variance is greater than the conditional mean); in that case a zero-inflated model such as the zero-inflated Poisson (ZIP) usually fits better than a plain Poisson model
- This model assumes there are two sorts of individuals: one group whose counts are generated by the standard Poisson regression model and another group (absolute zero group) who have zero probability of a count greater than 0
- The model typically includes a logistic regression component that predicts which group an individual belongs to
- The negative binomial model is another option for overdispersed counts; when there are also excess zeros, the zero-inflated negative binomial (ZINB) model combines both ideas
- Hurdle Model
- Two parts: a binomial model predicts whether a value is 0 or > 0, then a separate model (linear, Gamma, log-normal, or truncated normal) fits the observed non-zero values
- Tobit Model
- Assumes a normally distributed latent variable that is censored at zero, so any latent value at or below 0 is observed as 0
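- Sketch for the ZIP notes above: a minimal, standard-library-only illustration of the two-group mixture, not a fitted model. The names `zip_pmf` and `zip_mean_var` and the values `lam = 3.0`, `pi = 0.4` are hypothetical, chosen only to show that mixing in a structural-zero group raises P(Y = 0) and pushes the variance above the mean.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson.

    lam: rate of the Poisson group.
    pi:  probability of belonging to the structural-zero group.
    """
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # Zeros come from both the structural-zero group and the Poisson group.
        return pi + (1.0 - pi) * poisson
    # Positive counts can only come from the Poisson group.
    return (1.0 - pi) * poisson


def zip_mean_var(lam, pi):
    """Mean and variance of a zero-inflated Poisson."""
    mean = (1.0 - pi) * lam
    var = mean * (1.0 + pi * lam)
    return mean, var


# Illustrative (assumed) parameter values.
lam, pi = 3.0, 0.4

p0_poisson = math.exp(-lam)   # zero probability under a plain Poisson
p0_zip = zip_pmf(0, lam, pi)  # zero probability with zero inflation
mean, var = zip_mean_var(lam, pi)

print(f"P(Y=0): Poisson={p0_poisson:.3f}, ZIP={p0_zip:.3f}")
print(f"ZIP mean={mean:.2f}, variance={var:.2f}")
```

With `pi = 0` the function reduces to the ordinary Poisson pmf, which is a quick sanity check; fitting `lam` and `pi` from data with covariates would need a modelling library rather than this sketch.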
- Retrospective::
-
Daily Stoic::