Azure Synapse Analytics Cheat Sheet

Updated 2026-04-12

Azure Synapse Analytics is Microsoft's unified analytics platform, combining enterprise data warehousing and big data analytics in a single integrated service. Built on a massively parallel processing (MPP) architecture, it lets organizations ingest, prepare, manage, and analyze large volumes of data from diverse sources using SQL, Spark, and data integration pipelines. The service has three primary compute engines: dedicated SQL pools for structured data warehousing, serverless SQL pools for ad-hoc querying without infrastructure provisioning, and Apache Spark pools for big data processing. A critical concept to understand: a dedicated SQL pool spreads each table's data across 60 distributions, and minimizing data movement between those distributions is the single most important performance optimization. Every design decision around distribution keys, table joins, and query patterns should aim to keep related data co-located.
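The co-location point can be made concrete with a small simulation. This is a minimal sketch in plain Python, not Synapse code: the table contents are hypothetical, and SHA-256 stands in for the service's internal hash function, which will place rows differently. It counts how many rows of a fact table would have to move to reach their join partner under two choices of distribution key.

```python
import hashlib

NUM_DISTRIBUTIONS = 60  # fixed count for dedicated SQL pools, per the text above


def distribution_for(key) -> int:
    """Map a distribution-key value to a distribution id in 0..59.

    Assumption: SHA-256 is only a stand-in here; Synapse applies its own
    internal hash function, so real placements will differ.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS


def rows_to_move(fact_rows, fact_dist_col, dim_rows, dim_dist_col, join_col):
    """Count fact rows whose join partner sits on a different distribution."""
    dim_location = {
        row[join_col]: distribution_for(row[dim_dist_col]) for row in dim_rows
    }
    return sum(
        1
        for row in fact_rows
        if distribution_for(row[fact_dist_col]) != dim_location[row[join_col]]
    )


# Hypothetical tables: 1,000 customers and 5,000 sales rows.
customers = [{"CustomerId": c} for c in range(1000)]
sales = [{"SaleId": 100_000 + s, "CustomerId": s % 1000} for s in range(5000)]

# Both tables hash-distributed on the join column: every match is local.
co_located = rows_to_move(sales, "CustomerId", customers, "CustomerId", "CustomerId")

# Sales distributed on SaleId instead: matches usually sit elsewhere.
misaligned = rows_to_move(sales, "SaleId", customers, "CustomerId", "CustomerId")

print(f"rows to move, co-located keys: {co_located}")
print(f"rows to move, misaligned keys: {misaligned}")
```

When both tables are distributed on the join column, equal keys always hash to the same distribution, so the first count is zero. With mismatched keys, a row finds its partner locally only by chance, which is why the choice of distribution key drives join cost.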

What This Cheat Sheet Covers

This topic spans 14 focused tables and 76 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

  • Table 1: Core Compute Resources
  • Table 2: Table Distribution Strategies
  • Table 3: Indexing Strategies
  • Table 4: Performance Optimization Features
  • Table 5: Data Loading and External Access
  • Table 6: Security and Access Control
  • Table 7: Monitoring and Management
  • Table 8: Data Formats and File Handling
  • Table 9: Integration and Connectivity
  • Table 10: CI/CD and DevOps
  • Table 11: SQL Pool Architecture and Features
  • Table 12: Spark Pool Capabilities
  • Table 13: Lake Database
  • Table 14: Advanced Scenarios

Table 1: Core Compute Resources

Dedicated SQL Pool
Example:
    CREATE TABLE Sales (CustomerId INT NOT NULL)  -- column list shortened for illustration
    WITH (DISTRIBUTION = HASH(CustomerId));
Description:
  • Provisioned data warehouse whose compute is scaled in Data Warehouse Units (DWUs)
  • Uses an MPP architecture with 60 distributions for parallel query processing
  • Requires active management (pause/resume) to control costs

Serverless SQL Pool
Example:
    SELECT * FROM OPENROWSET(
        BULK 'data/*.parquet',
        FORMAT = 'PARQUET') AS data
Description:
  • On-demand query service that reads data directly from Azure Storage without loading it first
  • Pay-per-query pricing based on the amount of data processed
  • Ideal for ad-hoc exploration and data lake queries
  • Automatically provisioned with every workspace

Apache Spark Pool
Example:
    from pyspark.sql import SparkSession

    # Parentheses allow the builder chain to span several lines
    spark = (SparkSession.builder
             .appName("DataProc").getOrCreate())
    df = spark.read.parquet("path")
Description:
  • Managed Spark clusters for big data processing using Python, Scala, .NET, or SQL
  • Supports Spark 3.4 and 3.5 runtimes
  • Auto-pause capability after inactivity
  • Integrates with Delta Lake for ACID transactions
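The two SQL pool billing models in Table 1 (a provisioned pool that is paused to control costs, and pay-per-query based on data processed) can be compared with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, both rates are placeholders rather than Azure list prices, and storage charges, which continue while a dedicated pool is paused, are left out.

```python
TB = 1024 ** 4  # bytes in one terabyte (binary)

# Placeholder rates for illustration only; look up real prices for your region.
HOURLY_RATE = 10.0   # hypothetical compute cost per hour at some DWU level
PRICE_PER_TB = 5.0   # hypothetical serverless price per TB processed


def dedicated_pool_compute_cost(hourly_rate: float, hours_running: float) -> float:
    """Compute-only cost of a dedicated pool: charged for the hours it runs."""
    return hourly_rate * hours_running


def serverless_query_cost(price_per_tb: float, bytes_processed: int) -> float:
    """Cost of serverless queries: charged on the volume of data processed."""
    return price_per_tb * bytes_processed / TB


# Dedicated pool left running for a 30-day month.
always_on = dedicated_pool_compute_cost(HOURLY_RATE, 24 * 30)

# Same pool paused outside 10 working hours on 22 working days.
business_hours = dedicated_pool_compute_cost(HOURLY_RATE, 10 * 22)

# Ad-hoc serverless queries that scan 2 TB in total over the month.
ad_hoc = serverless_query_cost(PRICE_PER_TB, 2 * TB)

print(f"always on:      {always_on:.2f}")
print(f"business hours: {business_hours:.2f}")
print(f"serverless:     {ad_hoc:.2f}")
```

Under these made-up numbers, pausing cuts dedicated compute spend in proportion to idle hours, while serverless cost depends only on bytes scanned. That is why column pruning and partition filters matter for serverless queries.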
