String Algorithms and Pattern Matching Cheat Sheet

Updated 2026-05-19

String algorithms and pattern matching form the backbone of text search engines, compilers, bioinformatics pipelines, and network intrusion detection systems. The central challenge is locating one or multiple patterns inside a potentially enormous text without inspecting every character at every position. Most efficient algorithms work by preprocessing the pattern — building jump tables, automata, or hash structures — so mismatches cause large skips rather than single-character retreats. A practitioner's most important decision is whether to match a single exact pattern (KMP or Boyer-Moore), multiple patterns simultaneously (Aho-Corasick), approximate patterns (Bitap/Levenshtein), or to index the text for repeated queries (suffix array or suffix automaton).

What This Cheat Sheet Covers

This topic spans 14 focused tables and 97 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1 — Single-Pattern Exact Matching: Algorithm OverviewTable 2 — KMP: Prefix Function and LPS ArrayTable 3 — Z-AlgorithmTable 4 — Boyer-Moore: Heuristics and RulesTable 5 — Rabin-Karp and String HashingTable 6 — Aho-Corasick: Multi-Pattern MatchingTable 7 — Trie (Prefix Tree) OperationsTable 8 — Suffix Array and LCP ArrayTable 9 — Suffix Tree and Suffix AutomatonTable 10 — Palindrome Detection AlgorithmsTable 11 — Subsequence and Edit Distance AlgorithmsTable 12 — Anagram and Permutation DetectionTable 13 — String Compression and Canonical TransformsTable 14 — Approximate String Matching

Table 1 — Single-Pattern Exact Matching: Algorithm Overview

Algorithm	Preprocessing / Complexity	Key Idea & Notes
Naïve / Brute-Force	None · $O(nm)$ worst	Slide pattern one character at a time; compare each position. Simple but slow on repetitive text (e.g., `aaaa…ab` vs `aab`).
KMP (Knuth-Morris-Pratt)	$O(m)$ prefix table · $O(n+m)$ search	Precompute failure function so mismatches jump without re-examining characters. Never backtracks in text.
Z-Algorithm	$O(m+n)$ Z-array	Concatenate `P + $ + T`; Z-array entry = length of longest match starting at that position. Match found when `Z[i] == m`.
Boyer-Moore	$O(m + \sigma)$ tables · $O(n/m)$ best	Scans pattern right-to-left; two heuristics (bad-char + good-suffix) give sub-linear best case.

Table 1 — Single-Pattern Exact Matching: Algorithm Overview

Algorithm	Preprocessing / Complexity	Key Idea & Notes
Naïve / Brute-Force	None · $O(nm)$ worst	Slide pattern one character at a time; compare each position. Simple but slow on repetitive text (e.g., `aaaa…ab` vs `aab`).
KMP (Knuth-Morris-Pratt)	$O(m)$ prefix table · $O(n+m)$ search	Precompute failure function so mismatches jump without re-examining characters. Never backtracks in text.
Z-Algorithm	$O(m+n)$ Z-array	Concatenate `P + $ + T`; Z-array entry = length of longest match starting at that position. Match found when `Z[i] == m`.
Boyer-Moore	$O(m + \sigma)$ tables · $O(n/m)$ best	Scans pattern right-to-left; two heuristics (bad-char + good-suffix) give sub-linear best case.