OpenRefine Cheat Sheet

Updated 2026-03-19

OpenRefine (formerly Google Refine) is an open-source desktop application for cleaning, transforming, and extending messy datasets. Originally developed by Metaweb and later supported by Google before becoming an independent open-source project, it runs locally on your computer and is operated through a browser-based interface, so your data stays on your machine unless you choose to send it to an external service such as a reconciliation endpoint. Its clustering algorithms find and merge near-duplicate entries, it supports reconciliation against external services like Wikidata and VIAF, and it keeps a complete undo/redo history that makes all transformations reversible and reproducible. Two further strengths are the faceting and filtering system, which lets you slice data along multiple dimensions simultaneously, and GREL (General Refine Expression Language) for complex data transformations. OpenRefine works either in rows mode (where each row is independent) or in records mode (where multiple rows are linked together into one record); understanding the difference is fundamental to mastering multi-valued cell operations and to preserving relational structure during transformations.
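The rows-versus-records distinction can be illustrated outside OpenRefine. The following is a minimal Python sketch, not OpenRefine's own code: it assumes the usual convention that a record starts at a row whose first (key) column is non-blank and that the following rows with a blank key cell belong to that record. The function name `group_into_records` and the sample data are hypothetical.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[str]]


def group_into_records(rows: Sequence[Row]) -> List[List[Row]]:
    """Group consecutive rows into records, keyed on the first column.

    Assumption: a non-blank first cell starts a new record; rows with a
    blank first cell are attached to the record above them.
    """
    records: List[List[Row]] = []
    for row in rows:
        key = row[0] if row else None
        has_key = key is not None and key.strip() != ""
        if has_key or not records:
            # New record (also used for a leading row that has no key).
            records.append([row])
        else:
            # Continuation row: belongs to the most recent record.
            records[-1].append(row)
    return records


# Hypothetical data: "Alice" has two values spread over two rows.
rows = [
    ["Alice", "red"],
    [None, "blue"],
    ["Bob", "green"],
]
records = group_into_records(rows)
for record in records:
    print(record[0][0], [row[1] for row in record])
```

In rows mode the three rows above would be treated independently; in records mode the first two would be filtered and transformed together as one record.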

What This Cheat Sheet Covers

This topic spans 17 focused tables and 127 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Core Facet Types
Table 2: Clustering Methods – Key Collision
Table 3: Clustering Methods – Nearest Neighbor
Table 4: Common Transformations
Table 5: Cell Editing Operations
Table 6: Column Operations
Table 7: GREL String Functions
Table 8: GREL Array Functions
Table 9: GREL Control Structures
Table 10: Date and Time Functions
Table 11: Reconciliation Operations
Table 12: Import and Export Formats
Table 13: Undo/Redo and Operation History
Table 14: Records Mode Operations
Table 15: Advanced GREL Features
Table 16: Wikibase Integration
Table 17: Performance and Best Practices

Table 1: Core Facet Types

| Facet | Example | Description |
| --- | --- | --- |
| Text Facet | Column → Facet → Text facet | Groups all unique text values in a column with counts; lets you bulk-edit a value in place, select multiple values, and include or exclude specific entries. |
| Numeric Facet | Column → Facet → Numeric facet | Creates a draggable range slider for filtering numbers; displays the minimum, maximum, and distribution; groups non-numeric, blank, and error cells separately so they can be included or excluded. |
| Timeline Facet | Column → Facet → Timeline facet | Visualizes date/time data on a draggable timeline slider; requires the cells to hold date values rather than text; useful for filtering temporal ranges. |
| Scatterplot Facet | Column → Facet → Scatterplot facet | Plots two numeric columns against each other as X/Y coordinates; lets you drag a rectangular region to filter to the points inside it. |
