{Agorithm}
Algorithms/Longest Common Subsequence
DPintermediatestringtabulation

Longest Common Subsequence

Find the longest subsequence present in both sequences.

How it works

Longest Common Subsequence (LCS) finds the longest sequence of characters that appear in the same relative order in both input strings — but not necessarily contiguously.

The algorithm builds an (m+1) × (n+1) table where dp[i][j] stores the LCS length of the first i characters of s1 and the first j characters of s2.

Recurrence: - If s1[i-1] === s2[j-1]: dp[i][j] = dp[i-1][j-1] + 1 (extend the common prefix) - Otherwise: dp[i][j] = max(dp[i-1][j], dp[i][j-1]) (best without current char)

Base cases: dp[0][j] = dp[i][0] = 0 (empty string has LCS 0 with anything).

The answer sits at dp[m][n]. To recover the actual subsequence (not just its length), trace back through the table from dp[m][n] to dp[0][0].

LCS is closely related to Edit Distance: edit distance equals m + n − 2 × LCS(s1, s2) when only insertions and deletions are allowed.

Complexity

Best case
O(mn)
Average case
O(mn)
Worst case
O(mn)
Space
O(mn)

Visualizer

Generating steps…

Where it's used in the real world

Version control — diff/patch tools (git diff, GNU diff)

Computing line-level diffs between file revisions

Git's diff engine models each revision as a sequence of lines and finds their LCS. Lines inside the LCS are unchanged; lines outside form the insertions and deletions shown in the unified diff output. This keeps patches minimal and human-readable.

Bioinformatics — sequence alignment pipelines

Aligning DNA, RNA, or protein sequences to find conserved regions

Conserved subsequences across species indicate functional or evolutionary significance. Tools such as BLAST use LCS-style DP as a core primitive before applying gap penalties and scoring matrices (e.g. BLOSUM62) for full Smith-Waterman or Needleman-Wunsch alignment.

Academic integrity / plagiarism detection

Measuring token-level similarity between student submissions

Systems like MOSS tokenise source code and compute LCS-based similarity scores. A high LCS ratio between two submissions — after normalising identifiers — is a strong signal of code copying even when variable names or formatting have been changed.

Implementation

function lcs(s1: string, s2: string): number {
  const m = s1.length;
  const n = s2.length;
  const dp: number[][] = Array.from({ length: m + 1 }, () =>
    Array(n + 1).fill(0)
  );

  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      if (s1[i - 1] === s2[j - 1]) {
        dp[i][j] = dp[i - 1][j - 1] + 1;
      } else {
        dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
      }
    }
  }

  return dp[m][n];
}

// Usage
console.log(lcs("ABCBDAB", "BDCABA")); // 4

Related algorithms