Probability theory provides the mathematical foundation for quantifying uncertainty across statistics, machine learning, and data science. Rooted in Kolmogorov's axioms, it formalizes how we assign probabilities to events, combine them through rules like independence and conditioning, and reason about random phenomena. At its core, probability answers: given what we know, what can we expect—and with what confidence? The theory's power lies not in computing single probabilities, but in chaining conditional relationships, transforming distributions, and leveraging limit theorems to move from finite samples to population-level insights. Mastering these fundamentals—from sample spaces to convergence concepts—unlocks rigorous modeling of real-world randomness.
What This Cheat Sheet Covers
This topic spans 15 focused tables and 96 indexed concepts. Below is a complete table-by-table outline, from foundational concepts through advanced details.
Table 1: Foundational Concepts and Sample Spaces
Before any probability can be assigned, you need the vocabulary of what's being measured — the full space of outcomes, the events carved out of it, and how those events relate through unions, intersections, and complements. These set-theoretic building blocks are the language every later rule is written in, so getting comfortable with partitions and mutual exclusivity here pays off everywhere downstream.
| Concept | Example | Description |
|---|---|---|
| Sample space | Coin flip: $\Omega = \{\text{H}, \text{T}\}$ | • Set of all possible outcomes of a random experiment • Denoted Ω or S |
| Event | $A = \{\text{H}\}$ (getting heads) | • Any subset of the sample space • Represents a collection of outcomes |
| Elementary event | Single outcome like $\{\text{H}\}$ | • Event containing exactly one outcome • Also called atomic event or sample point |
| Partition | $\{A_1, A_2, A_3\}$ with $A_i \cap A_j = \emptyset$ for $i \neq j$ | • Collection of disjoint events whose union equals Ω • Every outcome belongs to exactly one partition element |
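
The set-theoretic vocabulary in this table maps directly onto Python's built-in sets. The sketch below is illustrative only: the die-roll sample space, the event names `even` and `low`, and the helper `is_partition` are assumptions introduced here, not part of the cheat sheet. The helper also assumes the common convention that partition elements are non-empty, which the table does not state.

```python
from itertools import combinations

# Sample space for one roll of a six-sided die (illustrative choice).
omega = frozenset(range(1, 7))

# Events are subsets of the sample space.
even = frozenset({2, 4, 6})
low = frozenset({1, 2, 3})

union = even | low          # outcomes in at least one of the events
intersection = even & low   # outcomes in both events
complement = omega - even   # outcomes in omega that are not in `even`


def is_partition(parts, space):
    """Check that `parts` are non-empty, pairwise disjoint, and cover `space`.

    Hypothetical helper; the non-emptiness requirement is an assumed convention.
    """
    parts = [frozenset(p) for p in parts]
    if any(len(p) == 0 for p in parts):
        return False
    pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(parts, 2))
    covers_space = frozenset().union(*parts) == space
    return pairwise_disjoint and covers_space


print(is_partition([even, complement], omega))  # an event and its complement
print(is_partition([even, low], omega))         # overlapping events
```

An event and its complement always pass the check, while `even` and `low` fail it because they share the outcome 2 and together leave out 5.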