{Agorithm}
Algorithms/KMP String Search
Stringadvancedpattern-matchinglinear

KMP String Search

Find pattern in text in O(n+m) using a failure function.

How it works

The Knuth-Morris-Pratt (KMP) algorithm solves the string search problem in O(n + m) time by preprocessing the pattern into a *failure function* (also called the LPS — Longest Proper Prefix which is also Suffix — array). When a mismatch occurs during search, instead of restarting the pattern from the beginning, KMP uses the failure function to jump the pattern pointer back only as far as necessary. This avoids the O(nm) worst-case of naïve search entirely.

The two-phase approach first builds the LPS array by comparing the pattern against itself in O(m), then performs the main search sweep in O(n). Each character of the text is visited at most twice, giving a provably linear overall complexity regardless of the input. KMP is particularly efficient when the pattern contains many repeated substrings, as the failure function captures all reusable prefix information. It is foundational in text processing tools such as grep, and forms the basis of multi-pattern algorithms like Aho-Corasick.

Complexity

Best case
O(n+m)
Average case
O(n+m)
Worst case
O(n+m)
Space
O(m)

Visualizer

Generating steps…

Where it's used in the real world

grep / sed

Text file pattern searching

POSIX-compliant implementations of grep use KMP or its derivatives for fixed-string (-F) searches, guaranteeing linear time even on adversarial inputs crafted to defeat naïve backtracking.

Bioinformatics pipelines

DNA sequence matching

Genome sequencing tools scan billions of base pairs for short primer or adapter sequences. KMP's O(n + m) guarantee makes it viable for these massive inputs where quadratic algorithms would be unusable.

Aho-Corasick / network IDS

Intrusion detection signature matching

Network intrusion detection systems (Snort, Suricata) use Aho-Corasick, a multi-pattern extension of KMP, to match thousands of attack signatures against packet payloads simultaneously in linear time.

Implementation

function buildLPS(pattern: string): number[] {
  const m = pattern.length;
  const lps = new Array<number>(m).fill(0);
  let len = 0;
  let i = 1;

  while (i < m) {
    if (pattern[i] === pattern[len]) {
      lps[i++] = ++len;
    } else if (len !== 0) {
      len = lps[len - 1];
    } else {
      lps[i++] = 0;
    }
  }
  return lps;
}

function kmpSearch(text: string, pattern: string): number[] {
  const n = text.length;
  const m = pattern.length;
  const lps = buildLPS(pattern);
  const matches: number[] = [];
  let i = 0; // text pointer
  let j = 0; // pattern pointer

  while (i < n) {
    if (text[i] === pattern[j]) {
      i++;
      j++;
    }
    if (j === m) {
      matches.push(i - j);
      j = lps[j - 1];
    } else if (i < n && text[i] !== pattern[j]) {
      j = j !== 0 ? lps[j - 1] : (i++, 0);
    }
  }
  return matches;
}

// Usage
console.log(kmpSearch("ABABCABABABC", "ABABC")); // [0, 6]

Related algorithms