The Complete Guide to Levenshtein Distance
Whether you are cleaning a database, building a spell-checker, or auditing code for near-duplicate functions, Levenshtein distance gives you a precise, numeric answer to the question: "how different are these two strings, and exactly which characters changed?"
How to Use This Tool
Type or paste your first string into the "Source String" box and your second string into the "Target String" box. Results update in real time as you type, so there is no button to click. The Edit Distance score shows the raw integer count of minimum operations. The Similarity percentage normalizes that score against the length of the longer string so you can compare pairs of different lengths on the same scale.
The Transformation Path below the scores shows a character-level map. Characters in red were deleted from the source; characters in green were inserted to reach the target; characters in yellow were substituted. Unchanged characters appear in grey. Use the toggles to control whether the comparison is case-sensitive (on by default) and whether whitespace should be stripped before comparing (off by default).
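The two toggles amount to a preprocessing step applied to both strings before the distance is computed. The sketch below is one plausible reading, not this tool's actual code: the function name `normalize` and its parameters are hypothetical, and it assumes "stripped" means that all whitespace is removed rather than only leading and trailing whitespace.

```python
def normalize(text: str, case_sensitive: bool = True,
              strip_whitespace: bool = False) -> str:
    """Apply the two comparison toggles to one input string.

    Defaults mirror the description above: case-sensitive comparison on,
    whitespace stripping off.
    """
    if strip_whitespace:
        # Assumption: remove every whitespace character, not just the ends.
        text = "".join(text.split())
    if not case_sensitive:
        # casefold() is a stronger lower() intended for caseless matching.
        text = text.casefold()
    return text


print(normalize("Hello World", case_sensitive=False))
print(normalize(" a b\tc\n", strip_whitespace=True))
```

Both strings should pass through the same settings, otherwise the resulting distance mixes two different comparison rules.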
What the Edit Distance Score Actually Means
An edit distance of 0 means the strings are identical under the current settings. A distance of 1 means a single typo - one wrong, missing, or extra character - separates them. Distances in the range of 1 to 3 are typically considered "close enough to be the same" for fuzzy matching purposes, though the right threshold depends entirely on your use case. A distance equal to the length of the longer string is the maximum possible value: no character could be kept in place, so the transformation costs as much as rewriting the string from scratch. The strings may still contain the same letters in a different order; "ab" and "ba" have a distance of 2.
The similarity percentage (1 - distance / max length) is more intuitive for communication. 95% similar is easy to grasp; an edit distance of 3 on a 60-character string is less so. Use the raw distance for algorithms and the percentage for dashboards and human review queues.
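As a minimal sketch of that formula, the helper below turns a distance into a percentage. The name `similarity_percent` is hypothetical, and treating two empty strings as 100% similar is an assumption made here to avoid dividing by zero.

```python
def similarity_percent(distance: int, source: str, target: str) -> float:
    """Return (1 - distance / max length) expressed as a percentage."""
    longest = max(len(source), len(target))
    if longest == 0:
        # Assumption: two empty strings count as identical.
        return 100.0
    return (1 - distance / longest) * 100


# A distance of 3 against a 60-character string, rounded for display.
print(round(similarity_percent(3, "a" * 60, "a" * 57), 1))
```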
Why Not Just Compare Character by Character?
A naive positional comparison breaks the moment a single insertion or deletion shifts everything after it. Compare "colour" and "color": positions 5 and 6 are both "wrong" according to positional comparison, even though only one letter was removed. The earlier the change occurs, the worse it gets: an extra character at the start of a string pushes every later character out of alignment. Levenshtein distance recognizes that a single deletion transforms "colour" into "color" and returns a distance of 1. This makes it far more robust for real-world text where insertions and deletions are as common as outright substitutions.
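To make the contrast concrete, here is a small sketch of the naive approach. The function name `positional_mismatches` is hypothetical; it pads the shorter string and counts every position where the characters disagree.

```python
from itertools import zip_longest


def positional_mismatches(a: str, b: str) -> int:
    """Count positions where a and b differ, padding the shorter string."""
    # zip_longest fills the missing positions with None, which never
    # equals a real character, so length differences count as mismatches.
    return sum(1 for x, y in zip_longest(a, b) if x != y)


print(positional_mismatches("colour", "color"))    # 2
print(positional_mismatches("abcdef", "xabcdef"))  # 7
```

In both pairs the Levenshtein distance is 1, because one deletion or one insertion accounts for the whole difference.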
How the Wagner-Fischer Algorithm Works
The algorithm creates a grid with (source length + 1) rows and (target length + 1) columns. The first row is filled with 0, 1, 2, 3... (the cost of inserting each target character, starting from an empty source). The first column is filled the same way (the cost of deleting each source character to reach an empty target). Each remaining cell is filled with the minimum of three options: the cell above plus 1 (deletion), the cell to the left plus 1 (insertion), or the diagonal cell plus 0 if the characters match or plus 1 if they do not (substitution). The value in the bottom-right corner is the Levenshtein distance.
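The grid fill can be sketched as follows. This is an illustrative version under the description above, not this tool's source code; the function names are hypothetical. Rows index prefixes of the source and columns index prefixes of the target.

```python
def levenshtein_matrix(source: str, target: str) -> list[list[int]]:
    """Build the full Wagner-Fischer grid for source -> target."""
    rows, cols = len(source) + 1, len(target) + 1
    grid = [[0] * cols for _ in range(rows)]

    for i in range(rows):
        grid[i][0] = i  # delete i source characters to reach an empty target
    for j in range(cols):
        grid[0][j] = j  # insert j target characters into an empty source

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            grid[i][j] = min(
                grid[i - 1][j] + 1,         # deletion (cell above)
                grid[i][j - 1] + 1,         # insertion (cell to the left)
                grid[i - 1][j - 1] + cost,  # match or substitution (diagonal)
            )
    return grid


def levenshtein(source: str, target: str) -> int:
    """The distance is the bottom-right cell of the grid."""
    return levenshtein_matrix(source, target)[-1][-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("colour", "color"))    # 1
```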
To generate the Transformation Path, this tool then backtracks from the bottom-right cell to the top-left, at each step identifying whether a match, substitution, insertion, or deletion was chosen. This backtrack trace is what powers the color-coded character display you see in the Operation Map.
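A backtrack of this kind might look like the sketch below. It is a hypothetical illustration, not this tool's implementation: the function name, the operation labels, and the tie-breaking order (diagonal first, then deletion, then insertion) are all assumptions. When several minimal paths exist, a different tie-breaking order produces a different but equally short path.

```python
def edit_operations(source: str, target: str) -> list[tuple[str, str, str]]:
    """Return (operation, source_char, target_char) steps from source to target."""
    rows, cols = len(source) + 1, len(target) + 1
    grid = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        grid[i][0] = i
    for j in range(cols):
        grid[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            grid[i][j] = min(grid[i - 1][j] + 1,
                             grid[i][j - 1] + 1,
                             grid[i - 1][j - 1] + cost)

    # Walk from the bottom-right cell back to the top-left cell.
    ops = []
    i, j = len(source), len(target)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if source[i - 1] == target[j - 1] else 1
            if grid[i][j] == grid[i - 1][j - 1] + cost:
                name = "match" if cost == 0 else "substitute"
                ops.append((name, source[i - 1], target[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and grid[i][j] == grid[i - 1][j] + 1:
            ops.append(("delete", source[i - 1], ""))
            i -= 1
        else:
            ops.append(("insert", "", target[j - 1]))
            j -= 1

    ops.reverse()  # steps were collected end-to-start
    return ops


for step in edit_operations("colour", "color"):
    print(step)
```

Each step maps naturally onto one colored character in a display: "delete" entries carry only a source character, "insert" entries carry only a target character, and the number of non-match steps equals the edit distance.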
Practical Applications
Spell checkers use edit distance to rank candidate corrections by closeness to the misspelled word. Database deduplication tools use it to find records that differ only by a typo or abbreviation. DNA sequencing software uses it (and related algorithms) to align genetic sequences. Search engines use it to handle "did you mean?" suggestions. Version control systems use edit-distance-derived metrics for displaying code changes. Any time you need to answer "are these two things the same, just slightly garbled?" edit distance is the right starting point.
Performance Note for Large Inputs
The Wagner-Fischer algorithm has time and memory complexity of O(n × m), where n and m are the string lengths. For strings up to 1,000 characters each, computation is near-instantaneous in a modern browser. Beyond that, you will see a warning in the character counter and the algorithm may take a perceptible moment to run. For very long inputs such as entire files or large code blocks, a line-level diff tool is usually more appropriate - see the Diff Checker tool for that use case.
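The memory figure applies when the full grid is kept, which the backtrack for a transformation path requires. If only the distance score is needed, a well-known variant keeps just two rows at a time. The sketch below shows that variant; it is a swapped-in technique for illustration, not a description of what this tool does, and the function name is hypothetical.

```python
def levenshtein_two_rows(source: str, target: str) -> int:
    """Compute the distance with O(min(n, m)) memory; no path is recoverable."""
    # The distance is symmetric, so put the shorter string along the row.
    if len(target) > len(source):
        source, target = target, source

    previous = list(range(len(target) + 1))  # row for the empty source prefix
    for i, s_char in enumerate(source, start=1):
        current = [i]  # first column: delete i source characters
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


print(levenshtein_two_rows("kitten", "sitting"))  # 3
```

Running time is still proportional to n × m, so this helps with memory on long inputs but not with speed.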