Statistics is the science of collecting, analyzing, interpreting, and presenting data to extract meaningful insights and inform decision-making. Rooted in probability theory and mathematical principles, it serves as the foundation for data science, scientific research, business intelligence, and evidence-based policy. The field divides into descriptive statistics (summarizing and visualizing data) and inferential statistics (drawing conclusions about populations from samples). A crucial mental model: uncertainty is inherent in data—statistics provides rigorous frameworks to quantify that uncertainty, assess the reliability of findings, and distinguish signal from noise.
What This Cheat Sheet Covers
This cheat sheet spans 32 focused tables and 262 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.
Table 1: Measures of Central Tendency
| Measure | Example | Description |
|---|---|---|
| Mean | `mean = sum(x) / n` | Arithmetic average of all values; sensitive to outliers and best suited for symmetric distributions. |
| Median | `median = sorted(x)[n//2]` | Middle value in sorted data (for even `n`, the average of the two middle values); robust to outliers and preferred for skewed distributions. |
| Mode | `mode = most_frequent(x)` | Most frequently occurring value; useful for categorical data and for identifying peaks in distributions. |
| Weighted mean | `sum(x * w) / sum(w)` | Average where each value has an assigned weight reflecting its importance or frequency. |
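
The measures in Table 1 can be sketched in Python with the standard-library `statistics` module. This is a minimal illustration, not a reference implementation: the sample `data` and the helper `weighted_mean` are hypothetical names introduced here, and `multimode` stands in for the table's `most_frequent(x)` pseudocode.

```python
from statistics import mean, median, multimode


def weighted_mean(values, weights):
    """Weighted average: sum(x * w) / sum(w). Hypothetical helper for illustration."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# Hypothetical sample with one large outlier (100).
data = [2, 3, 3, 5, 7, 100]

print(mean(data))       # 20: pulled upward by the outlier
print(median(data))     # 4.0: even n, so the two middle values (3 and 5) are averaged
print(multimode(data))  # [3]: list of all most-frequent values
print(weighted_mean([80, 90], [1, 3]))  # 87.5: the second value counts three times as much
```

The gap between the mean (20) and the median (4.0) on the same sample shows why the median is preferred when outliers or skew are present.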