Econometrics is the application of statistical methods to economic data, enabling researchers to test theories, estimate relationships, and make causal inferences. It forms the empirical backbone of economics, finance, and policy analysis, bridging theoretical models and real-world data through regression analysis, hypothesis testing, and identification strategies. Ordinary Least Squares (OLS) serves as the foundational estimation method, but violations of its assumptions—endogeneity, heteroskedasticity, autocorrelation—demand corrections and alternative approaches. Modern econometrics emphasizes causal identification through instrumental variables, panel data methods, difference-in-differences, regression discontinuity, and other quasi-experimental designs that recover treatment effects from observational data. A deep working knowledge of diagnostics, robust inference, and model specification—including recent advances in double/debiased machine learning and staggered treatment designs—is essential for producing credible empirical research.
What This Cheat Sheet Covers
This topic spans 16 focused tables and 116 indexed concepts. Below is a complete table-by-table outline, from foundational concepts through advanced details.
Table 1: OLS Assumptions (Classical Linear Regression Model)
The CLRM assumptions form the bedrock of regression analysis; when all six hold, the Gauss-Markov theorem guarantees that OLS is the Best Linear Unbiased Estimator (BLUE). In practice, most assumption failures have a standard diagnostic test and a corresponding correction, making this table the starting checklist for any applied regression project.
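As a rough illustration of how these assumptions show up in a computation, here is a minimal sketch of closed-form OLS with one regressor and an intercept, in plain Python. The function names `ols_simple` and `residuals` and the toy data are hypothetical, chosen only for this example. It shows that a model can be nonlinear in the variable (here `x**2`) while staying linear in the coefficients, and that a regressor with no variation is perfectly collinear with the intercept, so the estimator is undefined.

```python
def ols_simple(x, y):
    """Closed-form OLS for y = b0 + b1*x + e (hypothetical helper, one regressor)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        # A constant regressor is an exact multiple of the intercept column,
        # so X'X is singular and the coefficients are not identified.
        raise ValueError("x has no variation: perfectly collinear with the intercept")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y, b0, b1):
    """Fitted residuals e_i = y_i - (b0 + b1*x_i)."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Linear in coefficients, not in variables: regress y on the transform x**2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
x_sq = [xi ** 2 for xi in x]
y_exact = [3.0 + 2.0 * v for v in x_sq]      # toy data with no noise
b0, b1 = ols_simple(x_sq, y_exact)
print(b0, b1)                                 # 3.0 2.0

# With noisy toy data, OLS residuals sum to zero and are orthogonal to the
# regressor by construction. This is a mechanical property of the fit and
# does NOT test whether E[e | X] = 0 holds in the population.
y_noisy = [5.5, 10.5, 21.5, 34.5, 53.0]
c0, c1 = ols_simple(x_sq, y_noisy)
e = residuals(x_sq, y_noisy, c0, c1)
resid_sum = sum(e)
resid_dot_x = sum(ei * v for ei, v in zip(e, x_sq))
```

Because the residual orthogonality above always holds in-sample, exogeneity has to be argued from the research design rather than read off the fitted residuals.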
| Assumption | Example | Description |
|---|---|---|
| Linearity in parameters | $y = \beta_0 + \beta_1 x_1 + \varepsilon$ | • Model is linear in the coefficients β, not necessarily in the variables • Nonlinear transforms of x (logs, polynomials) are allowed |
| Random sampling | IID draws from the population | • Observations are independently and identically distributed • Violated by cluster sampling and time-series data |
| No perfect multicollinearity | VIF < 10 (rule of thumb for near-collinearity) | • No regressor is an exact linear combination of the others • $X'X$ must be invertible |
| Zero conditional mean (exogeneity) | $E[\varepsilon \mid X] = 0$ | • Error has zero mean conditional on the regressors, so it is uncorrelated with them • The most critical assumption: violation makes OLS biased and inconsistent |