The ELK Stack (Elasticsearch, Logstash, and Kibana) is one of the most widely deployed platforms for centralized log management, full-text search, and observability, used to ingest and analyze machine-generated data at scale. OpenSearch is an Apache 2.0-licensed fork of Elasticsearch 7.10.2, maintained by Amazon and the open-source community, offering a parallel feature set including its own dashboards, security plugin, and alerting framework. Understanding the stack as a whole is essential, from how Elasticsearch distributes data across nodes and shards, through ILM lifecycle policies and data tiers, to Logstash/Beats pipelines and Kibana visualizations. Every architectural decision (shard count, tier placement, pipeline design) compounds: a poor choice early in the lifecycle multiplies storage costs and query latency at scale.
What This Cheat Sheet Covers
This topic spans 21 focused tables and 197 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Elasticsearch Cluster Architecture
Elasticsearch splits every index into primary shards, each with zero or more replicas, and places them on the nodes of a cluster. Getting the node roles and shard topology right before the first write is critical: the primary shard count is fixed at index creation and can only be changed by reindexing or through the split and shrink APIs, whereas the replica count can be adjusted on a live index.
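
To make the cost of a wrong primary shard count concrete, the sketch below models document routing as `hash(routing_key) % number_of_shards`. This is a simplified model: Elasticsearch hashes the routing value (the document `_id` by default) with Murmur3 and applies an extra routing factor, while `zlib.crc32` is used here only as an illustrative stand-in. The function names `shard_for` and `moved_documents` are hypothetical.

```python
import zlib


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Pick a primary shard for a document (simplified routing model).

    zlib.crc32 is a stand-in hash for illustration; Elasticsearch itself
    uses Murmur3 on the routing value.
    """
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def moved_documents(doc_ids, old_shards: int, new_shards: int):
    """Return the documents whose target shard changes with the shard count."""
    return [
        doc_id
        for doc_id in doc_ids
        if shard_for(doc_id, old_shards) != shard_for(doc_id, new_shards)
    ]


doc_ids = [f"doc-{i}" for i in range(1000)]
moved = moved_documents(doc_ids, old_shards=3, new_shards=4)
print(f"{len(moved)} of {len(doc_ids)} documents would land on a different shard")
```

Because the shard number depends on the shard count, changing the count would send lookups for existing documents to the wrong shard, which is why the data has to be rewritten (reindex, split, or shrink) rather than the setting simply edited.
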
| Concept | Example | Description |
|---|---|---|
| Cluster | `GET _cluster/health` | • One or more nodes sharing the same `cluster.name` • all nodes discover each other and share cluster state |
| Node | `node.roles: [master, data_hot]` | • A single Elasticsearch process • each node can hold one or more roles |
| Primary shard | `PUT my-index {"settings":{"number_of_shards":3}}` | • Basic unit of storage • each index is split into N primary shards at creation time; the count cannot be changed afterwards without reindexing or using the split/shrink APIs |
| Replica shard | `"number_of_replicas": 1` | • Copy of a primary shard on a different node • provides fault tolerance and increases read throughput |
| Master node | `node.roles: [master]` | • Manages cluster state, index creation, shard allocation • at least 3 master-eligible nodes recommended for production to avoid split-brain |
| Data node | `node.roles: [data_hot]` | • Stores shards and handles search/indexing requests • sub-roles `data_hot`, `data_warm`, `data_cold`, `data_frozen` map to data tiers |
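
The arithmetic behind the shard, replica, and master rows can be sketched in a few lines. This is a back-of-the-envelope model, not an Elasticsearch API: the helper names are hypothetical, and the master quorum is modeled as a simple majority of master-eligible nodes, which glosses over details of how the voting configuration is managed.

```python
def total_shard_copies(primaries: int, replicas: int) -> int:
    """Total shards the cluster must host: each primary plus its replicas."""
    return primaries * (1 + replicas)


def min_data_nodes_for_green(replicas: int) -> int:
    """Copies of the same shard are never placed on the same node,
    so every replica needs a node of its own in addition to the primary's."""
    return replicas + 1


def master_quorum(master_eligible: int) -> int:
    """Simple-majority model of the votes needed to elect a master."""
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - master_quorum(master_eligible)


# Settings from the table: 3 primary shards, 1 replica, 3 master-eligible nodes.
print(total_shard_copies(primaries=3, replicas=1))    # 6
print(min_data_nodes_for_green(replicas=1))           # 2
print(tolerated_master_failures(master_eligible=3))   # 1
print(tolerated_master_failures(master_eligible=2))   # 0
```

Under this model, two master-eligible nodes tolerate no more failures than one, which is the reasoning behind the recommendation of at least three.
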