added levenshtein-distance algo #1639

Merged
merged 1 commit into from
Nov 4, 2024
71 changes: 71 additions & 0 deletions Dynamic Programming/Levenshtein Distance/README.md
@@ -0,0 +1,71 @@
# Levenshtein Distance in C

## Problem Statement
The Levenshtein distance between two strings is the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into the other. This algorithm is commonly used in spell checkers, DNA sequencing, and natural language processing.

### Examples

#### Example 1
**Input:**
s1 = "kitten" s2 = "sitting"
**Output:**
3

**Explanation:**
The minimum edits to transform "kitten" to "sitting" are:
1. Substitute "k" with "s"
2. Substitute "e" with "i"
3. Append "g"

#### Example 2
**Input:**
s1 = "flaw" s2 = "lawn"
**Output:**
2

**Explanation:**
The minimum edits to transform "flaw" to "lawn" are:
1. Delete "f" (giving "law")
2. Append "n" (giving "lawn")

---

## Approach
This solution uses Dynamic Programming (DP) to calculate the Levenshtein distance between two strings efficiently.

### DP Recurrence Relation
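Define `dp[i][j]` as the Levenshtein distance between the first `i` characters of `s1` and the first `j` characters of `s2`.

The base cases are `dp[i][0] = i` (delete all `i` characters of the `s1` prefix) and `dp[0][j] = j` (insert all `j` characters of the `s2` prefix). For `i, j >= 1`: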

1. **If characters match** (`s1[i-1] == s2[j-1]`):
- `dp[i][j] = dp[i-1][j-1]`
2. **If characters do not match**:
- `dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1])`

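where each term describes the last edit applied to `s1`:
- `dp[i-1][j] + 1` represents a deletion of `s1[i-1]`.
- `dp[i][j-1] + 1` represents an insertion of `s2[j-1]`.
- `dp[i-1][j-1] + 1` represents a substitution of `s1[i-1]` with `s2[j-1]`.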
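
To make the recurrence concrete, here is a minimal sketch that transcribes it directly as a recursive function. The names `lev_prefix` and `min_of_three` are hypothetical and do not appear in `program.c`; the base cases mirror the table initialization used there. This version recomputes subproblems and takes exponential time, so it is only meant to illustrate the definition, not to replace the table-based solution.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest of three ints (hypothetical helper for this sketch). */
static int min_of_three(int a, int b, int c) {
    int smallest = a;
    if (b < smallest) smallest = b;
    if (c < smallest) smallest = c;
    return smallest;
}

/* Distance between the first i characters of s1 and the first j
 * characters of s2, written exactly as the recurrence states it.
 * Exponential time: for illustration only. */
int lev_prefix(const char *s1, const char *s2, int i, int j) {
    assert(s1 != NULL && s2 != NULL && i >= 0 && j >= 0);

    if (i == 0) return j; /* base case: insert the j characters of s2 */
    if (j == 0) return i; /* base case: delete the i characters of s1 */

    if (s1[i - 1] == s2[j - 1]) {
        /* Characters match: no edit needed for this position. */
        return lev_prefix(s1, s2, i - 1, j - 1);
    }

    return 1 + min_of_three(
        lev_prefix(s1, s2, i - 1, j),      /* deletion */
        lev_prefix(s1, s2, i, j - 1),      /* insertion */
        lev_prefix(s1, s2, i - 1, j - 1)); /* substitution */
}
```

Calling `lev_prefix("kitten", "sitting", 6, 7)` evaluates `dp[6][7]` for Example 1. The DP table exists to cache these calls so that each `(i, j)` pair is computed only once.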

### Complexity
- **Time Complexity:** $O(n \times m)$, where `n` is the length of `s1` and `m` is the length of `s2`.
- **Space Complexity:** $O(n \times m)$, as we use a 2D array of size `(n + 1) x (m + 1)` to store the distances.
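
Each row of the table depends only on the row above it, so when only the final distance is needed, the memory can in principle be reduced to two rows of `m + 1` entries. The sketch below shows one way that might look. It is a variant, not what `program.c` does, and the function name `levenshtein_two_rows` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Two-row variant of the DP. Returns the distance, or -1 if
 * memory allocation fails. Hypothetical sketch, not program.c. */
int levenshtein_two_rows(const char *s1, const char *s2) {
    assert(s1 != NULL && s2 != NULL);
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);

    int *prev = malloc((len2 + 1) * sizeof *prev); /* row i - 1 */
    int *curr = malloc((len2 + 1) * sizeof *curr); /* row i */
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1;
    }

    /* Row 0: distance from the empty prefix of s1. */
    for (size_t j = 0; j <= len2; j++) {
        prev[j] = (int)j;
    }

    for (size_t i = 1; i <= len1; i++) {
        curr[0] = (int)i; /* distance to the empty prefix of s2 */
        for (size_t j = 1; j <= len2; j++) {
            if (s1[i - 1] == s2[j - 1]) {
                curr[j] = prev[j - 1]; /* match, no cost */
            } else {
                int best = prev[j];                         /* deletion */
                if (curr[j - 1] < best) best = curr[j - 1]; /* insertion */
                if (prev[j - 1] < best) best = prev[j - 1]; /* substitution */
                curr[j] = best + 1;
            }
        }
        /* The row just filled becomes the previous row. */
        int *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    int distance = prev[len2]; /* after the swap, prev is the last row */
    free(prev);
    free(curr);
    return distance;
}
```

The time complexity is unchanged, while the extra space drops to $O(m)$. The trade-off is that the full table is no longer available, so the sequence of edits cannot be reconstructed afterwards.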

---

## Code
The full code is available in `program.c`.

## Running the Code

### Prerequisites
Ensure you have a C compiler installed, such as GCC.

### Instructions
1. Clone this repository.
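2. Compile the C file:

   ```bash
   gcc program.c -o program
   ```

3. Run the compiled program:

   ```bash
   ./program
   ```

The program prints the distance between the hard-coded strings `kitten` and `sitting`, which is 3.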
65 changes: 65 additions & 0 deletions Dynamic Programming/Levenshtein Distance/program.c
@@ -0,0 +1,65 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Return the minimum of three values
int min(int a, int b, int c) {
    int smallest = a;
    if (b < smallest) smallest = b;
    if (c < smallest) smallest = c;
    return smallest;
}

// Calculate the Levenshtein distance between s1 and s2.
// Returns -1 if memory allocation fails.
int levenshteinDistance(const char *s1, const char *s2) {
    int len1 = (int)strlen(s1);
    int len2 = (int)strlen(s2);

    // Create a (len1 + 1) x (len2 + 1) table to store distances
    int **dp = malloc((size_t)(len1 + 1) * sizeof *dp);
    if (dp == NULL) {
        return -1;
    }
    for (int i = 0; i <= len1; i++) {
        dp[i] = malloc((size_t)(len2 + 1) * sizeof *dp[i]);
        if (dp[i] == NULL) {
            // Free the rows allocated so far before reporting failure
            while (i > 0) {
                free(dp[--i]);
            }
            free(dp);
            return -1;
        }
    }

    // Initialize base cases
    for (int i = 0; i <= len1; i++) {
        dp[i][0] = i; // Delete all i characters of s1 to reach the empty string
    }
    for (int j = 0; j <= len2; j++) {
        dp[0][j] = j; // Insert all j characters of s2 starting from the empty string
    }

    // Fill the dp table
    for (int i = 1; i <= len1; i++) {
        for (int j = 1; j <= len2; j++) {
            if (s1[i - 1] == s2[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1]; // Characters match, no cost
            } else {
                dp[i][j] = min(
                    dp[i - 1][j] + 1,    // Deletion
                    dp[i][j - 1] + 1,    // Insertion
                    dp[i - 1][j - 1] + 1 // Substitution
                );
            }
        }
    }

    int distance = dp[len1][len2];

    // Free allocated memory
    for (int i = 0; i <= len1; i++) {
        free(dp[i]);
    }
    free(dp);

    return distance;
}

int main(void) {
    const char *s1 = "kitten";
    const char *s2 = "sitting";

    int distance = levenshteinDistance(s1, s2);
    if (distance < 0) {
        fprintf(stderr, "Memory allocation failed\n");
        return 1;
    }

    printf("Levenshtein distance between '%s' and '%s' is %d\n", s1, s2, distance);

    return 0;
}