String algorithms and pattern matching form the backbone of text search engines, compilers, bioinformatics pipelines, and network intrusion detection systems. The central challenge is locating one or multiple patterns inside a potentially enormous text without inspecting every character at every position. Most efficient algorithms work by preprocessing the pattern — building jump tables, automata, or hash structures — so mismatches cause large skips rather than single-character retreats. A practitioner's most important decision is whether to match a single exact pattern (KMP or Boyer-Moore), multiple patterns simultaneously (Aho-Corasick), approximate patterns (Bitap/Levenshtein), or to index the text for repeated queries (suffix array or suffix automaton).
What This Cheat Sheet Covers
This topic spans 14 focused tables and 97 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1 — Single-Pattern Exact Matching: Algorithm Overview
The starting point for the whole field: finding one fixed pattern inside a text. The naïve scan is the baseline everything improves on, and the rest — KMP, Z, Boyer-Moore, Horspool, Rabin-Karp, Two-Way — each preprocess the pattern so a mismatch lets you skip ahead instead of crawling one character at a time. This table is the map; the algorithms that matter most get their own detailed table later.
| Algorithm | Preprocessing / Complexity | Key Idea & Notes |
|---|---|---|
None · O(nm) worst | • Slide pattern one character at a time • compare each position • Simple but slow on repetitive text (e.g., aaaa…ab vs aab). | |
O(m) prefix table · O(n+m) search | • Precompute failure function so mismatches jump without re-examining characters • Never backtracks in text | |
O(m+n) Z-array | • Concatenate P + $ + T• Z-array entry = length of longest match starting at that position • Match found when Z[i] == m. | |
O(m + \sigma) tables · O(n/m) best | • Scans pattern right-to-left • two heuristics (bad-char + good-suffix) give sub-linear best case |