String algorithms and pattern matching form the backbone of text search engines, compilers, bioinformatics pipelines, and network intrusion detection systems. The central challenge is locating one or multiple patterns inside a potentially enormous text without inspecting every character at every position. Most efficient algorithms work by preprocessing the pattern β building jump tables, automata, or hash structures β so mismatches cause large skips rather than single-character retreats. A practitioner's most important decision is whether to match a single exact pattern (KMP or Boyer-Moore), multiple patterns simultaneously (Aho-Corasick), approximate patterns (Bitap/Levenshtein), or to index the text for repeated queries (suffix array or suffix automaton).
What This Cheat Sheet Covers
This topic spans 14 focused tables and 97 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1 β Single-Pattern Exact Matching: Algorithm Overview
| Algorithm | Preprocessing / Complexity | Key Idea & Notes |
|---|---|---|
None Β· O(nm) worst | Slide pattern one character at a time; compare each position. Simple but slow on repetitive text (e.g., aaaaβ¦ab vs aab). | |
O(m) prefix table Β· O(n+m) search | Precompute failure function so mismatches jump without re-examining characters. Never backtracks in text. | |
O(m+n) Z-array | Concatenate P + $ + T; Z-array entry = length of longest match starting at that position. Match found when Z[i] == m. | |
O(m + \sigma) tables Β· O(n/m) best | Scans pattern right-to-left; two heuristics (bad-char + good-suffix) give sub-linear best case. |