Regression analysis is a set of statistical methods for estimating the relationships between a dependent variable and one or more independent variables, forming the backbone of predictive modelling and causal inference across every quantitative field. Its power lies not just in fitting a line, but in providing standard errors, hypothesis tests, and diagnostics that quantify how much trust to place in each estimate. The critical insight most practitioners learn too late is that a model's coefficients are only as meaningful as the assumptions it rests on β every regression analysis should be accompanied by careful diagnostics before results are reported.
What This Cheat Sheet Covers
This topic spans 18 focused tables and 146 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: OLS Foundations β The Simple Linear Regression Model
The ordinary least squares estimator underpins almost every form of linear regression. Understanding how OLS works β what it minimises, what conditions make it optimal, and what its closed-form solution looks like β is the entry point to all more advanced regression methods.
| Technique | Example | Description |
|---|---|---|
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i | β’ The data-generating process relating the scalar response y_i to predictor x_iβ’ \varepsilon_i is the unobserved error term | |
\min_{\beta_0,\beta_1} \sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2 | β’ Minimises the sum of squared residuals β’ gives the line that is geometrically closest to all points in the vertical direction | |
\hat{\beta}_1 = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2} | β’ Closed-form solution obtained by setting the first-order conditions to zero β’ equals the sample covariance of x and y divided by the sample variance of x. | |
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} | Ensures the fitted line always passes through the point of means (\bar{x},\bar{y}). |