The Longest Common Subsequence (LCS) is a classic computer science problem that involves finding the longest subsequence present in two sequences (strings, lists, or arrays) that appears in the same order. It's important to note that the elements of the LCS need not be contiguous; rather, they only need to maintain their relative ordering.
For example, consider the two sequences:
ABCBDAB
BDCAB
The longest common subsequence for these two strings is BCAB
or BDAB
, both with a length of 4. However, ACAB
is not a valid subsequence because it does not maintain the character order of both sequences.
The LCS problem has various applications, particularly in:
The recursive approach is the simplest to understand. You compare each character of both strings:
Here's how it can be implemented:
def lcs_recursive(X, Y, m, n): if m == 0 or n == 0: return 0 if X[m-1] == Y[n-1]: return 1 + lcs_recursive(X, Y, m-1, n-1) else: return max(lcs_recursive(X, Y, m, n-1), lcs_recursive(X, Y, m-1, n))
Although this method is easy to follow, it has an exponential time complexity O(2^(m+n))
, making it inefficient for larger strings.
Dynamic programming offers a more efficient way to solve the LCS problem, reducing the time complexity to O(m * n)
and using O(m * n)
space for storing intermediate results.
The idea is to use a 2D table to store the lengths of the longest common subsequence of substrings. The value in the cell dp[i][j]
represents the length of LCS of X[0...i-1]
and Y[0...j-1]
.
Dynamic Programming Steps:
dp
of size (m+1) x (n+1)
with all zeros.Here’s the Python code demonstrating this approach:
def lcs_dynamic(X, Y): m = len(X) n = len(Y) dp = [[0] * (n + 1) for _ in range(m + 1)] for i in range(1, m + 1): for j in range(1, n + 1): if X[i - 1] == Y[j - 1]: dp[i][j] = dp[i - 1][j - 1] + 1 else: dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]) return dp[m][n]
While the typical dynamic programming solution requires O(m * n)
space, we can optimize it to O(min(m, n))
by keeping only the current and previous rows in memory. Here’s how to do that:
def lcs_optimized(X, Y): m, n = len(X), len(Y) if m < n: X, Y, m, n = Y, X, n, m previous = [0] * (n + 1) current = [0] * (n + 1) for i in range(1, m + 1): for j in range(1, n + 1): if X[i - 1] == Y[j - 1]: current[j] = previous[j - 1] + 1 else: current[j] = max(previous[j], current[j - 1]) previous, current = current, previous # Shift the arrays return previous[n]
In this article, we explored the concept of Longest Common Subsequence and various approaches to solve it, from the naive recursive method to more sophisticated dynamic programming techniques. This knowledge is crucial for tackling string manipulation problems in coding interviews and practical applications in technology. Whether working with text comparisons, bioinformatics, or version control, a strong grasp of the LCS problem will empower you in your development endeavors.
15/11/2024 | DSA
08/12/2024 | DSA
13/10/2024 | DSA
16/11/2024 | DSA
15/11/2024 | DSA
16/11/2024 | DSA
06/12/2024 | DSA
23/09/2024 | DSA
06/12/2024 | DSA
23/09/2024 | DSA
23/09/2024 | DSA