PyMC is a Python probabilistic programming library for Bayesian statistical modeling and inference, built on PyTensor for automatic differentiation. Unlike frequentist methods that seek single point estimates, Bayesian inference treats parameters as random variables and produces full posterior distributions that quantify uncertainty. PyMC specializes in MCMC sampling algorithms (particularly NUTS) that explore complex, high-dimensional parameter spaces, making it ideal for hierarchical models, GLMs, time series, and causal inference. A critical insight: always run prior predictive checks before fitting—poorly specified priors can dominate weak data and produce nonsensical posteriors, while well-chosen weakly informative priors regularize estimates and improve sampling efficiency.
What This Cheat Sheet Covers
This cheat sheet spans 14 focused tables and 71 indexed concepts. Below is a complete table-by-table outline, running from foundational concepts through advanced details.
Table 1: Model Context and Variable Declarations
| Concept | Example | Description |
|---|---|---|
| Model context manager | `with pm.Model() as model: x = pm.Normal('x', 0, 1)` | Defines the probabilistic model scope; all distributions declared inside are automatically registered as model variables |
| Data container | `X = pm.Data('X', data, mutable=True)` | Wraps observed data in a shared variable; `mutable=True` allows updating values for out-of-sample prediction without recompiling |
| Deterministic transform | `theta = pm.Deterministic('theta', pm.math.exp(log_theta))` | Creates a named deterministic transformation tracked in the trace; useful for derived quantities like odds ratios or predictions |