© 2026 CheatGrid™. All rights reserved.

Databricks Cheat Sheet

Updated 2026-04-20

Databricks is a unified data intelligence platform built on Apache Spark that provides a collaborative environment for data engineering, data science, machine learning, and AI application development. It abstracts infrastructure complexity through managed clusters, serverless compute, and interactive notebooks, and adds enterprise features on top:

  • Unity Catalog for centralized governance across data and AI assets
  • Delta Lake for reliable ACID-transactional storage with UniForm/Iceberg interoperability
  • Lakeflow for declarative data pipelines and ingestion
  • Mosaic AI for model serving, vector search, and agent frameworks
  • Lakebase for PostgreSQL-compatible OLTP workloads
  • Built-in AI Functions (ai_query, ai_classify, ai_extract, ai_parse_document) that bring LLM capabilities directly into SQL

Understanding Databricks-specific capabilities, from magic commands and widgets to workflows, serverless compute, Declarative Automation Bundles, and predictive optimization, enables practitioners to build production-grade data, analytics, and AI systems efficiently.

What This Cheat Sheet Covers

This topic spans 21 focused tables and 191 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

  • Table 1: Notebook Magic Commands
  • Table 2: Databricks Utilities (dbutils)
  • Table 3: Compute Configuration
  • Table 4: Delta Lake Operations
  • Table 5: Unity Catalog
  • Table 6: Databricks SQL
  • Table 7: Workflows and Jobs
  • Table 8: Git Folders (Repos)
  • Table 9: MLflow & Model Serving
  • Table 10: Security and Access Control
  • Table 11: REST API & SDK
  • Table 12: Performance and Optimization
  • Table 13: Monitoring, System Tables & Debugging
  • Table 14: Data Formats and Connectors
  • Table 15: Databricks File System (DBFS)
  • Table 16: Structured Streaming & Auto Loader
  • Table 17: Lakeflow Declarative Pipelines (formerly DLT)
  • Table 18: AI Functions & Mosaic AI
  • Table 19: Lakebase (OLTP Database)
  • Table 20: Developer Tools & Deployment
  • Table 21: Lakeflow Connect (Data Ingestion)

Table 1: Notebook Magic Commands

| Command | Example | Description |
|---|---|---|
| %python | %python<br>print("Hello") | Switches the cell language to Python for the current cell only; useful in multi-language notebooks. |
| %sql | %sql<br>SELECT * FROM catalog.schema.table | Executes SQL queries directly in a cell. Results display as formatted tables and are accessible as the implicit _sqldf DataFrame in Python cells. |
| %scala | %scala<br>val x = 5 | Switches the cell language to Scala for Spark operations or JVM interop. |
| %r | %r<br>df <- data.frame(x=1:5) | Executes R code in the cell for statistical analysis or visualization. |
| %sh | %sh<br>ls -la /Volumes/catalog/schema/vol/ | Runs shell commands on the driver node; useful for file inspection, debugging, or system-level operations. |
| %fs | %fs ls /Volumes/main/default/my-volume/ | Executes file system commands (ls, cp, rm, head, etc.) as shorthand for the equivalent dbutils.fs calls. |
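
The two ideas in this table that are easiest to misread are that a language magic applies to one cell only, and that %fs is shorthand for a dbutils.fs call. The following is a minimal sketch of both in plain Python so it runs outside a notebook. It is a simplified model and not how Databricks implements magics: the function names `split_magic` and `fs_magic_to_dbutils` are hypothetical, and the %fs helper handles only positional path arguments and no flags.

```python
# Illustrative model only; not the Databricks implementation.
# split_magic and fs_magic_to_dbutils are hypothetical helper names.

LANGUAGE_MAGICS = {"%python": "python", "%sql": "sql", "%scala": "scala", "%r": "r"}


def split_magic(cell_source, default_language="python"):
    """Return (language, body) for one cell.

    A language magic on the first line changes the language for that cell
    only; a cell without one falls back to the notebook default.
    """
    first_line, _, rest = cell_source.lstrip().partition("\n")
    stripped = first_line.strip()
    token = stripped.split(maxsplit=1)[0] if stripped else ""
    if token in LANGUAGE_MAGICS:
        # Text after the magic on the same line is part of the cell body.
        inline = stripped[len(token):].strip()
        body = "\n".join(part for part in (inline, rest) if part)
        return LANGUAGE_MAGICS[token], body
    return default_language, cell_source


def fs_magic_to_dbutils(line):
    """Rewrite a simple '%fs <command> <path>...' line as a dbutils.fs call string."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "%fs":
        raise ValueError("expected a line of the form '%fs <command> [paths...]'")
    command, paths = parts[1], parts[2:]
    args = ", ".join(f'"{p}"' for p in paths)
    return f"dbutils.fs.{command}({args})"


# The %sql cell is routed to SQL; the next cell reverts to the default language.
print(split_magic("%sql\nSELECT * FROM catalog.schema.table"))
print(split_magic('print("Hello")'))

# The %fs example from the table, written as the call it abbreviates.
print(fs_magic_to_dbutils("%fs ls /Volumes/main/default/my-volume/"))
```

In an actual notebook, the last line corresponds to calling dbutils.fs.ls on the same Volumes path, which returns the listing as Python objects instead of rendering it as cell output.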
