OpenRefine Cheat Sheet


Updated 2026-05-28

OpenRefine (formerly Google Refine) is an open-source desktop application for cleaning, transforming, and extending messy datasets. Originally developed by Metaweb and later supported by Google before becoming an independent open-source project, OpenRefine presents a browser-based interface but runs locally on your computer, so your data stays on your machine unless you choose to call an external service. Its clustering algorithms find and merge near-duplicate entries, its reconciliation feature matches values against external services such as Wikidata and VIAF, and its complete undo/redo history makes every transformation reversible and reproducible. Two further strengths are the faceting and filtering system, which lets you slice data along several dimensions at once, and GREL (General Refine Expression Language), which handles complex transformations. The distinction between rows mode (where each row is independent) and records mode (where several rows are linked together as one record) is fundamental to working with multi-valued cells and to preserving relational structure during transformations.
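The rows versus records distinction can be illustrated outside OpenRefine. The sketch below is not OpenRefine code; it is a minimal Python model that assumes the usual records-mode convention in which a row with a blank cell in the first (key) column belongs to the record started by the nearest row above it. The function name `group_into_records` and the sample data are hypothetical.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def group_into_records(rows: List[Row], key_column: str) -> List[List[Row]]:
    """Group rows into records: a non-blank key cell starts a new record,
    a blank key cell attaches the row to the record above it."""
    records: List[List[Row]] = []
    for row in rows:
        key = row.get(key_column)
        starts_record = key is not None and key.strip() != ""
        # The very first row always opens a record, even if its key is blank.
        if starts_record or not records:
            records.append([row])
        else:
            records[-1].append(row)
    return records


# Hypothetical data: the second row has a blank key cell,
# so it is treated as part of the "Ada" record.
rows = [
    {"author": "Ada", "title": "Notes"},
    {"author": None, "title": "Letters"},
    {"author": "Grace", "title": "Compilers"},
]
records = group_into_records(rows, "author")
print(len(rows), "rows ->", len(records), "records")
```

In rows mode the same data would be treated as three independent rows; in records mode it behaves as two records, which is why operations on multi-valued cells can give different results depending on the mode.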

What This Cheat Sheet Covers

This topic spans 20 focused tables and 236 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Core Facet Types
Table 2: Clustering Methods – Key Collision
Table 3: Clustering Methods – Nearest Neighbor
Table 4: Common Transformations
Table 5: Cell Editing Operations
Table 6: Column Operations
Table 7: GREL String Functions
Table 8: GREL Array Functions
Table 9: GREL Control Structures
Table 10: GREL Boolean Functions
Table 11: GREL Math Functions
Table 12: Date and Time Functions
Table 13: GREL HTML and XML Parsing Functions
Table 14: Reconciliation Operations
Table 15: Import and Export Formats
Table 16: Undo/Redo and Operation History
Table 17: Records Mode Operations
Table 18: Advanced GREL Features and Variables
Table 19: Wikibase Integration
Table 20: Performance and Best Practices

Table 1: Core Facet Types

Facets are the primary way to explore and filter data in OpenRefine: they group the values in a column and let you narrow your working set before applying transformations. Choosing the facet type that matches your data's structure (text, number, date, or expression-based) determines which patterns you can find and which values you can bulk-edit.
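As a rough mental model, a text facet amounts to counting distinct values, selecting a subset of them, and optionally rewriting one value everywhere it occurs. The Python sketch below mimics that idea; it is not how OpenRefine is implemented, and the helper names (`text_facet`, `select`, `edit_choice`), the `(blank)` label, and the sample rows are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional

Row = Dict[str, Optional[str]]
BLANK = "(blank)"  # assumed label for empty cells


def text_facet(rows: Iterable[Row], column: str) -> Counter:
    """Count how often each distinct value appears in a column."""
    return Counter(row.get(column) or BLANK for row in rows)


def select(rows: Iterable[Row], column: str, choices: set, invert: bool = False) -> List[Row]:
    """Keep rows whose value is in `choices` (or not in it when invert=True)."""
    return [r for r in rows if ((r.get(column) or BLANK) in choices) != invert]


def edit_choice(rows: Iterable[Row], column: str, old: str, new: str) -> List[Row]:
    """Bulk edit: replace one facet value with another in every matching row."""
    return [dict(r, **{column: new}) if r.get(column) == old else r for r in rows]


# Hypothetical messy data with two spellings of the same city.
rows = [
    {"city": "Paris"},
    {"city": "paris"},
    {"city": "Paris"},
    {"city": None},
]
counts = text_facet(rows, "city")
cleaned = edit_choice(rows, "city", "paris", "Paris")
print(counts.most_common())
print(text_facet(cleaned, "city").most_common())
```

Narrowing with `select` before calling `edit_choice` corresponds to the workflow described above: filter to a working set first, then transform only what remains.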

Facet | Example | Description
Text Facet | Column → Facet → Text facet | Groups all unique text values with counts; lets you bulk-edit a value by hovering over it and clicking "edit", and include or exclude specific entries.
Numeric Facet | Column → Facet → Numeric facet | Creates a draggable range slider for filtering numbers and shows the value distribution as a histogram; checkboxes control whether non-numeric, blank, and error cells stay in the selection.
Timeline Facet | Column → Facet → Timeline facet | Visualizes date/time data on a draggable timeline slider; cells must be stored as dates (for example, converted with toDate()) rather than as text.
Scatterplot Facet | Column → Facet → Scatterplot facet | Plots two numeric columns as X/Y coordinates; drag a rectangle to select and filter the rows inside it.
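The numeric facet's behaviour can be approximated in a few lines of Python. This is only a sketch under stated assumptions: cells are classified as numeric, non-numeric, or blank (the error category is left out), text such as "12" counts as non-numeric until it is converted, and the range bounds are treated as inclusive for simplicity. The names `classify` and `numeric_facet` are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def classify(value: Any) -> str:
    """Classify a cell the way a numeric facet's checkboxes would group it."""
    if value is None or (isinstance(value, str) and value.strip() == ""):
        return "blank"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "numeric"
    return "non-numeric"


def numeric_facet(rows: List[Row], column: str, low: float, high: float,
                  include_non_numeric: bool = False,
                  include_blank: bool = False) -> List[Row]:
    """Keep numeric rows inside [low, high], plus any opted-in categories."""
    kept = []
    for row in rows:
        value = row.get(column)
        kind = classify(value)
        if kind == "numeric" and low <= value <= high:
            kept.append(row)
        elif kind == "non-numeric" and include_non_numeric:
            kept.append(row)
        elif kind == "blank" and include_blank:
            kept.append(row)
    return kept


# Hypothetical column mixing numbers, a number stored as text, and a blank.
rows = [{"price": 5}, {"price": 12.5}, {"price": "12"}, {"price": None}, {"price": 40}]
in_range = numeric_facet(rows, "price", 10, 20)
with_blanks = numeric_facet(rows, "price", 10, 20, include_blank=True)
print(in_range)
print(with_blanks)
```

The row holding the string "12" is excluded from the range selection, which mirrors why number-like text usually needs to be converted to a numeric type before a numeric facet becomes useful.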
