Tools And Applications — Yuly Koshevnik Fundamentals Of Statistical Thinking:

| Concept | Recommended dataset | Task |
|---------|--------------------|------|
| Data summarization | iris or mtcars | Create a report with plots + summary stats |
| Sampling & CLT | NHANES (CDC data) | Take 1000 samples of size 30, plot sample means |
| Hypothesis testing | tips (seaborn) | Test if lunch vs. dinner bills differ |
| Regression | Boston housing | Predict MEDV from RM, LSTAT |
| Logistic regression | titanic | Predict survival by class, sex, age |
| A/B testing | Simulated web click data | Compare two conversion rates (prop.test in R) |

| Mistake | Koshevnik’s corrective |
|---------|------------------------|
| Using mean without checking outliers | Always use median + IQR for skewed data |
| Interpreting correlation as causation | Draw causal diagrams (DAGs) |
| p‑hacking (multiple tests) | Apply Bonferroni / FDR correction |
| Overfitting regression models | Use adjusted R² or cross‑validation |
| Ignoring assumption checks | Test normality, equal variance before t‑test/ANOVA |

6. Study Schedule (6‑week plan)

- Week 1–2 – Ch 1–3: Descriptive stats + probability
- Week 3 – Ch 4–5: Inference (CI, hypothesis tests)
- Week 4 – Ch 6–7: Regression & Bayesian intro
- Week 5 – Ch 8–9: Categorical data + DOE
- Week 6 – Ch 10–11 + capstone project
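
As a warm-up for the Sampling & CLT task in the first table (1000 samples of size 30), the following is a minimal sketch using only the Python standard library. It is not taken from the book: the `sample_means` helper is a hypothetical name, and the population is a simulated right-skewed (exponential) variable standing in for a real NHANES measurement, which you would load yourself. Plotting is left out; a histogram of `means` from any plotting library would complete the task.

```python
import random
import statistics


def sample_means(population, n_samples=1000, sample_size=30, seed=0):
    """Draw repeated random samples (with replacement) and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# Assumption: a simulated skewed population, NOT real NHANES data.
_rng = random.Random(42)
population = [_rng.expovariate(1.0) for _ in range(100_000)]

means = sample_means(population)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)
predicted_se = pop_sd / 30 ** 0.5  # CLT: SD of the sample mean is sigma / sqrt(n)

print(f"population mean       : {pop_mean:.3f}")
print(f"mean of sample means  : {statistics.fmean(means):.3f}")
print(f"SD of sample means    : {statistics.stdev(means):.3f}")
print(f"predicted SE          : {predicted_se:.3f}")
```

Even though the simulated population is strongly skewed, the sample means should cluster around the population mean with a spread close to the predicted standard error, which is the behaviour the task asks you to observe on real data.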