R for Data Science and Tidyverse Cheat Sheet

Updated 2026-05-15

R for Data Science combines the R programming language with the tidyverse, a collection of packages built around a consistent grammar and workflow for data manipulation, visualization, and analysis. The tidyverse uses tidy data as its unifying structure (each observation is a row, each variable is a column) and emphasizes readable code through pipes and verb-based functions. The main packages are dplyr for data transformation, tidyr for reshaping, purrr for functional programming, ggplot2 for visualization, readr for fast I/O, stringr for text, lubridate for dates, and forcats for factors, with broom for tidying model output; all of them work with R Markdown and Quarto for reproducible reporting. Keep in mind that the native pipe |> (R ≥ 4.1) behaves slightly differently from magrittr's %>%: it does not provide the . placeholder (R ≥ 4.2 offers _ instead, which must be passed to a named argument), and its right-hand side must be a function call with parentheses (x |> f(), not x |> f).
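To make the native-pipe caveat concrete, here is a minimal sketch in base R only. It assumes R ≥ 4.1 and uses the built-in mtcars data set; the variable names roots and fit are chosen purely for illustration.

```r
# Base R only (assumes R >= 4.1 for |> and the \(x) lambda shorthand).
x <- c(1, 4, 9)

# The right-hand side must be a call: sqrt() works, a bare `sqrt` is a syntax error.
roots <- x |> sqrt()

# There is no implicit `.`; to send the piped value to an argument other than
# the first, wrap the step in an anonymous function and call it immediately.
fit <- mtcars |> (\(d) lm(mpg ~ wt, data = d))()

# With magrittr loaded, a rough equivalent would be:
#   mtcars %>% lm(mpg ~ wt, data = .)
```

The anonymous-function form is used here because it runs on R 4.1; on R ≥ 4.2 the same step could likely be written as mtcars |> lm(mpg ~ wt, data = _).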

What This Cheat Sheet Covers

This topic spans 31 focused tables and 242 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: dplyr Core Verbs for Row and Column Operations
Table 2: dplyr Grouping and Aggregation
Table 3: dplyr Joins for Combining Data Frames
Table 4: dplyr Advanced Column and Row Operations
Table 5: dplyr Select Helpers for Column Selection
Table 6: tidyr Reshaping Data with Pivots
Table 7: tidyr Nested and List-Column Data
Table 8: tidyr Missing Data Handling
Table 9: purrr Iteration and Mapping
Table 10: purrr List Manipulation and Selection
Table 11: purrr Functional Programming Utilities
Table 12: lubridate Parsing Date-Times
Table 13: lubridate Extracting Date-Time Components
Table 14: lubridate Date-Time Arithmetic and Manipulation
Table 15: lubridate Time Spans
Table 16: stringr Detection and Extraction
Table 17: stringr Modification and Replacement
Table 18: stringr String Manipulation and Assembly
Table 19: stringr Advanced String Operations
Table 20: forcats Factor Reordering
Table 21: forcats Factor Modification
Table 22: forcats Factor Utilities
Table 23: readr Data Import Functions
Table 24: readr Column Type Specifications
Table 25: readr Data Export Functions
Table 26: tibble Creation and Conversion
Table 27: tibble Manipulation and Inspection
Table 28: broom for Tidy Model Output
Table 29: Base R Statistical Functions with Formula Interface
Table 30: R Markdown and Quarto for Reproducible Reporting
Table 31: Pipe Operators - Native vs Magrittr

Table 1: dplyr Core Verbs for Row and Column Operations

These are the workhorse verbs you reach for in almost every analysis. Each one does a single, predictable thing to a data frame, and chaining them with the pipe is the heart of the dplyr grammar. Filter rows, select and transform columns, sort, deduplicate, and collapse to summaries; learn these and most everyday wrangling falls into place.

Verb | Example | Description
--- | --- | ---
filter | df %>% filter(age > 30, city == "NYC") | Keeps rows that satisfy logical conditions; multiple conditions combine with AND by default.
select | df %>% select(name, age, starts_with("val")) | Picks columns by name or helper; can rename inline (e.g., new = old).
mutate | df %>% mutate(total = price * quantity) | Creates new columns or modifies existing ones; expressions are vectorized over whole columns, so each row gets its own result.
summarise / summarize | df %>% summarise(avg = mean(value), n = n()) | Collapses rows into summary statistics; often combined with group_by().
arrange | df %>% arrange(desc(date), name) | Sorts rows by one or more columns; use desc() for descending order.
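The verbs above can be chained into a single pipeline. The sketch below assumes the dplyr package is installed; the orders data frame and its values are hypothetical, invented only so the result can be checked by hand.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration.
orders <- data.frame(
  name     = c("Ana", "Ben", "Cy", "Dee"),
  age      = c(34, 28, 45, 31),
  city     = c("NYC", "NYC", "LA", "NYC"),
  price    = c(10, 20, 5, 8),
  quantity = c(2, 1, 4, 3)
)

result <- orders %>%
  filter(age > 30, city == "NYC") %>%      # both conditions must hold: Ana, Dee
  mutate(total = price * quantity) %>%     # vectorized per row: 20, 24
  select(customer = name, age, total) %>%  # pick columns, renaming name inline
  arrange(desc(total))                     # largest total first: Dee, then Ana

# Without group_by(), summarise() collapses the whole table to one row.
overall <- orders %>%
  summarise(avg_age = mean(age), n = n())

# With group_by(), summarise() returns one row per group.
by_city <- orders %>%
  group_by(city) %>%
  summarise(avg_age = mean(age), n = n())
```

Each step takes a data frame and returns a data frame, which is why the order of steps reads top to bottom like a recipe.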
