Statsmodels Cheat Sheet

Updated 2026-05-28

Statsmodels is Python's inference-focused statistics and econometrics library, sitting on top of NumPy, SciPy, pandas, and Patsy. It matters because it combines model estimation with the outputs practitioners actually need: standard errors, hypothesis tests, diagnostics, confidence intervals, and interpretable summaries. A useful mental model is that statsmodels is less about black-box prediction and more about specifying the model carefully, checking assumptions, and defending the conclusions after fitting. As of version 0.14, major additions include treatment-effect estimation, hurdle and truncated count models, multiple seasonal decomposition (MSTL), and extended copula support.

What This Cheat Sheet Covers

This topic spans 11 focused tables and 147 indexed concepts. The outline below lists each table in order, from foundational concepts through advanced details.

Table 1: Core APIs and Workflow
Table 2: Linear and Robust Regression Models
Table 3: GLM, Correlated Data, and Smooth Models
Table 4: Discrete, Count, and Limited Dependent Variable Models
Table 5: Hypothesis Tests, ANOVA, and Multiple Comparisons
Table 6: Regression Diagnostics and Inference Checks
Table 7: Time Series Diagnostics and Decomposition
Table 8: Univariate Forecasting and Structural Time Series
Table 9: Multivariate, State Space, and Regime-Switching Models
Table 10: Nonparametric, Survival, Copulas, and Custom Modeling
Table 11: Causal Inference and Missing Data

Table 1: Core APIs and Workflow

The two import surfaces, matrix-based `sm` and formula-based `smf`, determine how you build design matrices; almost every other call in the library flows from the results object returned by `.fit()`.

Method	Example	Description
statsmodels.api	`import statsmodels.api as sm`	Main import surface for matrix-based model classes, datasets, graphics, and statistical tools.
statsmodels.formula.api	`import statsmodels.formula.api as smf`	Formula interface using R-style formulas and Patsy design matrices.
add_constant	`X = sm.add_constant(X)`	Prepends a column of ones as an explicit intercept; matrix-based models do not add one by default, whereas formula-based models do.
from_formula	`sm.OLS.from_formula('y ~ x1 + x2', data=df)`	Alternate constructor that builds the design matrix from a formula and DataFrame.
fit	`res = sm.OLS(y, X).fit()`	Estimates the model and returns a results object with inference and diagnostics.
summary	`res.summary()`	Produces the standard text summary with coefficients, tests, fit metrics, and diagnostics.
params	`res.params`	Estimated coefficients in the fitted model.
