ggplot2 is R's most widely used data visualization package, built on the Grammar of Graphics: a systematic framework that treats plots as compositions of independent layers (data, aesthetics, geometries, statistics, scales, coordinates, and facets). Originally created by Hadley Wickham, ggplot2 is the de facto standard for publication-quality graphics in R and a core part of the tidyverse. Unlike base R plotting, where you issue commands that say what to draw, ggplot2 has you declare how variables map to visual properties, so you can build complex visualizations incrementally. The key mental model: every plot is a layered combination of components added with +, where swapping one layer (e.g., geom_point() for geom_line()) transforms the visualization without rewriting the entire plot. As of version 4.0.0 (2025), ggplot2 is built on the S7 object system and introduces element_geom() for controlling default geom appearance through the theme system.
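The layering idea can be sketched in a few lines. This is a minimal illustration, assuming the built-in mtcars dataset; the object names (base, scatter, lines, layered) are made up for the example and are not part of ggplot2.

```r
library(ggplot2)

# Shared base: data plus global aesthetic mappings (mtcars ships with R)
base <- ggplot(mtcars, aes(x = wt, y = mpg))

# Swapping the geom changes the picture; the base stays untouched
scatter <- base + geom_point()
lines   <- base + geom_line()

# Layers stack: add a linear trend line on top of the points
layered <- scatter + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

Printing any of these objects (for example, typing layered at the console) renders the plot. Because base is reused, a change to the data or the mappings propagates to every variant.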
What This Cheat Sheet Covers
This topic spans 21 focused tables and 202 indexed concepts. Below is a table-by-table outline, running from foundational concepts through advanced details.
Table 1: Core Building Blocks
| Component | Example | Description |
|---|---|---|
| ggplot() | ggplot(data = df, aes(x, y)) | • Initializes a plot object • Specifies the data and global aesthetic mappings inherited by all layers. |
| aes() | aes(x = mpg, y = hp, color = cyl) | • Maps variables to aesthetics (x, y, color, fill, size, shape, alpha, linetype) • Placed in ggplot() for global mappings or in geom_*() for layer-specific ones. |
| geom_*() | geom_point(), geom_line() | • Defines the geometric object (the visual representation) • Each geom creates a new layer. |
| stat_*() | stat_summary(fun = mean) | • Applies a statistical transformation before plotting • Every geom has a default stat (e.g., geom_bar() uses stat_count()). |
| scale_*() | scale_x_continuous(), scale_color_manual() | • Controls how data values map to visual properties • Customizes axes, colors, sizes, limits, labels, and transformations. |
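
The five building blocks in Table 1 can be combined in a single plot. The sketch below is one possible arrangement, assuming the built-in mtcars dataset; the color values, axis limits, and the object name p are illustrative choices, not recommendations.

```r
library(ggplot2)

p <- ggplot(data = mtcars,
            aes(x = factor(cyl), y = mpg, color = factor(cyl))) +  # data + global aes
  geom_point(alpha = 0.6) +                                        # geom: raw observations
  stat_summary(fun = mean, geom = "point", shape = 18, size = 5) + # stat: per-group mean
  scale_y_continuous(name = "Miles per gallon", limits = c(10, 35)) +
  scale_color_manual(
    name = "Cylinders",
    values = c("4" = "steelblue", "6" = "darkorange", "8" = "firebrick")
  ) +
  labs(x = "Cylinders")
```

Both layers inherit the mappings declared in ggplot(), so the mean markers drawn by stat_summary() take the same per-cylinder colors as the raw points.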