How to Compare Text and Find Differences: A Developer's Guide

Comparing two pieces of text to find what changed is one of the most fundamental tasks in software development. Whether you are reviewing a pull request, tracking configuration changes, debugging unexpected output or editing documentation, text diff tools show you exactly what was added, removed or modified. Understanding how diffs work will make you more efficient at code review, debugging and collaboration.

What is a Text Diff?

A text diff (short for "difference") is the result of comparing two blocks of text and identifying the changes between them. The output highlights which lines or characters were added, removed or left unchanged. Diff tools have been a core part of computing since the early 1970s, when the original diff utility was written for Unix. Today, diffs are everywhere. They power version control systems like Git, enable code review workflows and help teams collaborate on shared documents.

How Diff Algorithms Work

Most diff tools are built on an algorithm that solves the longest common subsequence (LCS) problem, or an equivalent formulation of it. The LCS represents the parts that did not change. Everything else is either an addition or a deletion.

Longest Common Subsequence (LCS)

The classic LCS algorithm uses dynamic programming to find the longest sequence of elements (usually lines) that appear in both texts in the same order, but not necessarily consecutively. For two texts of length m and n, the basic algorithm runs in O(m * n) time and space. This works well for small to medium-sized files but becomes slow and memory-hungry for very large inputs: two files of 100,000 lines each would need a table of 10 billion cells.
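The dynamic-programming approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular diff tool; the function name `lcs` and the sample lines are made up for the example.

```python
def lcs(a, b):
    """Return one longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # table[i][j] = length of the LCS of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover the subsequence
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


old_lines = ["line one", "line two", "line three"]
new_lines = ["line one", "line 2", "line three"]
print(lcs(old_lines, new_lines))  # ['line one', 'line three']
```

Lines of the old text that are missing from the result are deletions, and lines of the new text that are missing from it are additions. The full (m + 1) by (n + 1) table is what gives the O(m * n) space cost.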

Myers Diff Algorithm

The Myers algorithm, published by Eugene Myers in 1986, is one of the most widely used diff algorithms today. Git uses it by default. Instead of computing the full LCS matrix, Myers searches for the shortest edit script (the minimum number of insertions and deletions needed to transform one text into the other). It runs in O(N * D) time, where N is the combined length of the two texts and D is the size of that edit script. This makes it very fast when the two texts are similar, which is the common case in version control.
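To make the idea concrete, here is a simplified sketch of the greedy forward search at the core of Myers' approach. It only returns the length of the shortest edit script, not the script itself, and it omits the refinements real tools add. The function name `shortest_edit_distance` is an assumption made for this example.

```python
def shortest_edit_distance(a, b):
    """Return the minimum number of insertions plus deletions turning a into b."""
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] = furthest x (index into a) reached so far on diagonal k = x - y
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]      # move down: take an element from b (insertion)
            else:
                x = v[k - 1] + 1  # move right: drop an element from a (deletion)
            y = x - k
            # Follow the run of matching elements for free
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d


old_lines = ["line one", "line two", "line three"]
new_lines = ["line one", "line 2", "line three"]
print(shortest_edit_distance(old_lines, new_lines))  # 2
```

The outer loop stops as soon as some path reaches the end of both inputs, so nearly identical texts finish after very few rounds regardless of their length.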

Types of Diff Output

Diff tools present changes in several different formats, each suited to different workflows.

  • Unified diff. The most common format in development. Shows both files in a single view with + for additions and - for deletions. This is what git diff produces by default.
  • Side-by-side diff. Shows the original text on the left and the modified text on the right. This makes it easier to visually compare corresponding sections. Many GUI tools and code review platforms offer this view.
  • Inline diff. Highlights changes at the character or word level within each line. This is useful when only a small part of a line changed, like a variable name or a number.
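  • Context diff. An older format that prints the affected region of the original file and of the modified file as two separate blocks, each with a few lines of context. It uses ! to mark changed lines and is more verbose than a unified diff, because the surrounding lines appear twice.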
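As a rough illustration of how the unified and context formats differ, the following sketch renders the same change both ways with Python's standard difflib module. The file names and sample lines are placeholders chosen for the example.

```python
import difflib

old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "BETA\n", "gamma\n"]

# Unified format: one block, changed lines prefixed with - and +
unified = list(difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt"))

# Context format: separate old and new blocks, changed lines prefixed with !
context = list(difflib.context_diff(old, new, fromfile="old.txt", tofile="new.txt"))

print("".join(unified))
print("".join(context))
```

Printing both shows why the unified format became the default in most workflows: it conveys the same change in fewer lines.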

Reading Diff Output

Understanding diff notation is essential for every developer. Here is what a typical unified diff looks like:

@@ -1,3 +1,3 @@
 function greet(name) {
-  return "Hello, " + name;
+  return `Hello, ${name}!`;
 }

  • Lines starting with - indicate text that was removed from the original.
  • Lines starting with + indicate text that was added in the new version.
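  • Lines starting with a space (no + or -) are context lines that appear in both versions and help you orient yourself in the file.
  • The @@ header describes each chunk of changes (a "hunk"). The numbers after - give the starting line and line count in the original, and the numbers after + give the same for the new version.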
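The hunk header can also be read programmatically. The sketch below parses one with a regular expression; the function name `parse_hunk_header` and the sample header are illustrative, and the code assumes the usual unified-format convention that an omitted count means 1.

```python
import re

# Matches headers such as "@@ -12,5 +12,6 @@"; the counts are optional
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return the start line and line count for both sides of a hunk."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count = match.groups()
    return {
        "old_start": int(old_start),
        "old_count": int(old_count) if old_count is not None else 1,
        "new_start": int(new_start),
        "new_count": int(new_count) if new_count is not None else 1,
    }


print(parse_hunk_header("@@ -12,5 +12,6 @@"))
# {'old_start': 12, 'old_count': 5, 'new_start': 12, 'new_count': 6}
```

In this sample, the hunk covers five lines starting at line 12 of the original and six lines starting at line 12 of the new version, so the change added one line overall.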

Common Use Cases

Text diff tools are useful across many different workflows in development and beyond.

  • Code review. Reviewing pull requests is the most common use case. Diffs let reviewers focus on exactly what changed rather than reading entire files.
  • Content editing. Writers and editors use diffs to track revisions in documentation, blog posts and technical specifications.
  • Configuration changes. Comparing configuration files (YAML, JSON, environment variables) before and after a deployment helps catch unintended changes.
  • Debugging. When output changes unexpectedly, comparing the expected output against the actual output quickly reveals the discrepancy.
  • Database migrations. Comparing SQL schemas or seed data files helps ensure migrations are correct before running them in production.
  • Legal and compliance. Tracking changes in contracts, policies or terms of service documents where every word matters.

Text Diff in the Command Line

The command line offers powerful built-in tools for comparing text. Here are the most commonly used commands.

The diff Command

The Unix diff command compares two files line by line. The -u flag produces unified output:

diff -u original.txt modified.txt

Git Diff

Git provides a rich set of diff commands for comparing commits, branches and staged changes:

git diff                  # Show unstaged changes in the working directory
git diff --staged         # Show staged changes ready to commit
git diff main..feature    # Compare the tips of two branches
git diff HEAD~3           # Compare the working tree with the commit three before HEAD
git diff --word-diff      # Show word-level changes within lines

Diff in Code

Sometimes you need to compute diffs programmatically. Here are examples in popular languages.

JavaScript (diff library)

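// Using the "diff" npm package (npm install diff)
import { diffLines } from "diff";

const oldText = "line one\nline two\nline three";
const newText = "line one\nline 2\nline three";

// Each part is a run of consecutive lines that were added, removed or unchanged
const changes = diffLines(oldText, newText);

changes.forEach(part => {
  const prefix = part.added ? "+" : part.removed ? "-" : " ";
  // part.value can span several lines, so prefix each line separately
  part.value.replace(/\n$/, "").split("\n").forEach(line => {
    console.log(prefix + line);
  });
});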

Python (difflib)

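import difflib

# Each list item is one line, including its trailing newline
old = ["line one\n", "line two\n", "line three\n"]
new = ["line one\n", "line 2\n", "line three\n"]

# unified_diff returns a generator that yields the output line by line
diff = difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt")
print("".join(diff))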

Best Practices for Comparing Text

Follow these guidelines to get the most accurate and useful diff results.

  • Normalize whitespace. Trailing spaces, tabs versus spaces and different line endings (CRLF vs LF) can create noisy diffs. Trim and normalize before comparing when the whitespace is not meaningful.
  • Use word-level diffs for prose. Line-level diffs work well for code, but for natural language text, word-level or character-level diffs are much more readable.
  • Ignore irrelevant changes. Many diff tools let you ignore whitespace changes, case differences or specific patterns. Use these options to focus on what matters.
  • Keep changes small. Smaller, focused changes produce cleaner diffs that are easier to review. This applies to commits, pull requests and document revisions alike.
  • Provide context. When sharing diffs, include enough surrounding lines so the reader understands where the change occurs. Most tools default to three lines of context.
  • Sort before comparing. If the order of lines does not matter (like a list of dependencies), sort both texts first. This prevents false positives from reordering.
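A few of these guidelines are easy to apply in code. The sketch below shows one possible way to normalize whitespace, compare order-insensitive lists and produce a word-level diff for prose. The helper names `normalize` and `word_diff` are made up for this example, and the [-removed-] {+added+} markers are simply a notation chosen here for readability.

```python
import difflib


def normalize(text):
    """Unify line endings and strip trailing whitespace from every line."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return [line.rstrip() for line in lines]


def word_diff(old, new):
    """Return a word-level diff with removals as [-x-] and additions as {+x+}."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(" ".join(old_words[i1:i2]))
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


# Same dependencies, different order and line endings: no real difference
deps_a = "requests\nflask\n"
deps_b = "flask\r\nrequests\r\n"
print(sorted(normalize(deps_a)) == sorted(normalize(deps_b)))  # True

print(word_diff("the quick brown fox", "the quick red fox"))
# the quick [-brown-] {+red+} fox
```

Sorting is only appropriate when line order carries no meaning. For source code or prose, normalize whitespace but keep the original order.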

When to Use an Online Diff Tool

Command-line tools are powerful, but sometimes a visual diff tool is faster and more intuitive. Online diff checkers are ideal when you need to quickly compare two snippets without setting up a local environment, when you want a color-coded visual output, or when you are working with non-technical team members who are not comfortable with the terminal. They are also useful for one-off comparisons where installing a dedicated diff application would be overkill.
