ELK and OpenSearch Stack Cheat Sheet

Updated 2026-05-22

The ELK Stack (Elasticsearch, Logstash, and Kibana) is a widely deployed open-source platform for centralized log management, full-text search, and observability, used to ingest and analyze machine-generated data at scale. OpenSearch is an Apache-licensed fork of Elasticsearch 7.10, maintained by Amazon and the open-source community, offering a parallel feature set including its own dashboards, security plugin, and alerting framework. Understanding the stack as a whole is essential, from how Elasticsearch distributes data across nodes and shards, through ILM policies and data tiers, to Logstash/Beats pipelines and Kibana visualizations, because every architectural decision (shard count, tier placement, pipeline design) compounds: a poor choice early in the lifecycle multiplies storage costs and query latency at scale.

What This Cheat Sheet Covers

This topic spans 21 focused tables and 197 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Elasticsearch Cluster Architecture
Table 2: Index Lifecycle Management (ILM) and Data Tiers
Table 3: Elasticsearch Mappings and Field Types
Table 4: Text Analysis — Analyzers, Tokenizers, and Token Filters
Table 5: Elasticsearch Query DSL
Table 6: Elasticsearch Aggregations
Table 7: Ingest Pipelines and Processors
Table 8: Logstash Pipelines
Table 9: Beats — Lightweight Data Shippers
Table 10: Kibana Visualizations and Dashboards
Table 11: Elastic Common Schema (ECS)
Table 12: Security — X-Pack and OpenSearch Security Plugin
Table 13: Machine Learning and Anomaly Detection
Table 14: Alerting — Watcher and OpenSearch Alerting
Table 15: Snapshots and Backup with S3
Table 16: Monitoring the Elastic Stack
Table 17: OpenSearch — Fork Differences and OpenSearch Dashboards
Table 18: Integration with OpenTelemetry and Fluent Bit
Table 19: Common Failure Modes and Operational Gotchas
Table 20: Capacity Planning and Scaling
Table 21: Comparing Elastic Stack to Alternatives

Table 1: Elasticsearch Cluster Architecture

Elasticsearch distributes every index across primary shards and their replicas, placing them on nodes in a cluster. Getting the node roles and shard topology right before the first write is critical: the primary shard count is fixed when an index is created and can only be changed by building a new index (split, shrink, or reindex), whereas the replica count can be adjusted at any time.

Concept	Example	Description
Cluster	`GET _cluster/health`	• One or more nodes sharing the same `cluster.name` • all nodes discover each other and share cluster state
Node	`node.roles: [master, data_hot]`	• A single Elasticsearch process • each node can hold one or more roles
Primary shard	`PUT my-index {"settings":{"number_of_shards":3}}`	• Basic unit of storage • each index is split into N primary shards at creation time • the count cannot be changed in place afterwards; a different count requires a new index via the split or shrink API or a reindex
Replica shard	`"number_of_replicas": 1`	• Copy of a primary shard on a different node • provides fault tolerance and increases read throughput
Master node	`node.roles: [master]`	• Manages cluster state, index creation, shard allocation • at least 3 master-eligible nodes recommended for production to avoid split-brain
Data node	`node.roles: [data_hot]`	• Stores shards and handles search/indexing requests • sub-roles `data_hot`, `data_warm`, `data_cold`, `data_frozen` map to data tiers
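
The shard and master-node rules in the table reduce to simple arithmetic. The following is a minimal Python sketch of that arithmetic, not an official sizing tool. The function names (`index_settings`, `total_shard_copies`, `min_data_nodes`, `master_quorum`, `tolerated_master_failures`) are hypothetical helpers introduced here for illustration. The sketch assumes that a replica is never allocated on the same node as another copy of the same shard, and that master election needs a majority of the master-eligible nodes.

```python
import json


def index_settings(primaries: int, replicas: int) -> dict:
    """Build a request body for `PUT <index>` (hypothetical helper)."""
    if primaries < 1 or replicas < 0:
        raise ValueError("need at least 1 primary shard and a non-negative replica count")
    return {
        "settings": {
            "number_of_shards": primaries,    # fixed once the index exists
            "number_of_replicas": replicas,   # can be updated later
        }
    }


def total_shard_copies(primaries: int, replicas: int) -> int:
    # Every primary has `replicas` extra copies.
    return primaries * (1 + replicas)


def min_data_nodes(replicas: int) -> int:
    # Assumption: copies of the same shard never share a node,
    # so one shard with N replicas needs N + 1 data nodes.
    return replicas + 1


def master_quorum(master_eligible: int) -> int:
    # Assumption: a majority of master-eligible nodes must agree.
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible: int) -> int:
    # How many master-eligible nodes can be lost while a majority remains.
    return master_eligible - master_quorum(master_eligible)


if __name__ == "__main__":
    print(json.dumps(index_settings(3, 1)))
    print("shard copies:", total_shard_copies(3, 1))
    print("data nodes needed for all replicas:", min_data_nodes(1))
    for n in (1, 2, 3, 5):
        print(n, "master-eligible ->", tolerated_master_failures(n), "failure(s) tolerated")
```

Under these assumptions, two master-eligible nodes tolerate no more failures than one, which is why three is the usual minimum for production.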
