2026-04-23

Regex Lookahead and Lookbehind: Zero-Width Assertions Explained

Lookahead and lookbehind are the two regex features that separate beginners from developers who can actually solve complex matching problems. They let you check what comes before or after a position in the string without consuming any characters. Once you understand them, password validators, currency formatters and complex find-and-replace operations go from frustrating to straightforward. This guide covers all four lookaround variants, shows the common pitfalls, and walks through real patterns you can use in JavaScript and Python today.

What Zero-Width Means

Normal regex tokens consume characters. If your pattern matches abc, the engine moves three positions forward and the next token starts from position 3. Lookarounds are different. They check a condition at the current position but do not advance the cursor. That is what zero-width assertion means, and it is the single most useful property of lookarounds.

The practical consequence is that you can enforce a condition on one part of the string while matching a different part. You can require that a digit is preceded by a dollar sign without capturing the dollar sign. You can find a word that is followed by a specific suffix without including the suffix in the result.
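A small JavaScript sketch can make the zero-width property visible. It relies on the standard lastIndex property of a global RegExp, which records where the last match ended; the sample strings and variable names are only illustrative.

```javascript
// A consuming pattern moves the cursor past what it matched.
const consuming = /abc/g;
consuming.test("abcdef");
const afterConsuming = consuming.lastIndex; // 3

// A lookahead succeeds at the same spot but matches zero characters.
const assertion = /(?=abc)/g;
assertion.test("abcdef");
const afterAssertion = assertion.lastIndex; // 0

// Enforce a condition on one part (the $) while matching another (the digits).
const price = "$42".match(/(?<=\$)\d+/)[0]; // "42"

console.log(afterConsuming, afterAssertion, price);
```

The dollar sign has to be present for the match to succeed, yet it never appears in the result.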

The Four Lookaround Types

There are exactly four combinations: look forward or backward, positive or negative.

Positive lookahead (?=...) - match only if what follows is ...
Negative lookahead (?!...) - match only if what follows is not ...
Positive lookbehind (?<=...) - match only if what precedes is ...
Negative lookbehind (?<!...) - match only if what precedes is not ...

The ?= reads as "followed by", the ?! reads as "not followed by", and adding < flips the direction to "preceded by".
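As a quick side-by-side, here is a minimal sketch that runs all four variants against one made-up sample string. The string and the variable names are assumptions chosen to keep the results easy to check by eye.

```javascript
const sample = "x1 y2 x3";

// Lookahead: look at what follows a letter.
const followedBy2 = sample.match(/[a-z](?=2)/g);    // ["y"]
const notFollowedBy2 = sample.match(/[a-z](?!2)/g); // ["x", "x"]

// Lookbehind: look at what precedes a digit.
const precededByX = sample.match(/(?<=x)\d/g);      // ["1", "3"]
const notPrecededByX = sample.match(/(?<!x)\d/g);   // ["2"]

console.log(followedBy2, notFollowedBy2, precededByX, notPrecededByX);
```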

Positive Lookahead

Suppose you want to match the word foo but only if it is followed by bar. Without lookahead, you would have to capture both and then trim. With lookahead:

/foo(?=bar)/

Applied to foobar foobaz, the engine matches only the first foo. The bar is checked but not consumed, so the next match attempt starts right after foo.
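The following sketch shows the same behaviour in JavaScript. The second half uses an extra alternative, bar, purely as an illustration that the lookahead left those characters available for the next match.

```javascript
const input = "foobar foobaz";

// Only the foo that is followed by bar matches; bar is not part of the match.
const hits = [...input.matchAll(/foo(?=bar)/g)].map((m) => [m[0], m.index]);
console.log(hits); // [["foo", 0]]

// Because bar was not consumed, a second alternative can still match it.
const withLookahead = "foobar".match(/foo(?=bar)|bar/g);
console.log(withLookahead); // ["foo", "bar"]

// A consuming version swallows bar as part of the first match.
const withoutLookahead = "foobar".match(/foobar|bar/g);
console.log(withoutLookahead); // ["foobar"]
```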

Negative Lookahead

Flip the assertion. Match foo only when it is not followed by bar.

/foo(?!bar)/

Against foobar foobaz this matches only the second foo. Negative lookahead is extremely common for exclusion rules like "find all numbers except phone numbers" or "find words that do not end in -ing".
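Here is a short sketch of both ideas. The second pattern is one possible way to express "words that do not end in -ing" with a negative lookahead placed at the start of the word; the sample sentence is invented for the example.

```javascript
const input = "foobar foobaz";

// Only the foo that is NOT followed by bar matches (the one at index 7).
const hits = [...input.matchAll(/foo(?!bar)/g)].map((m) => [m[0], m.index]);
console.log(hits); // [["foo", 7]]

// At each word start, reject the word if it ends in "ing", then consume it.
const sentence = "running fast and singing loud";
const notIng = sentence.match(/\b(?!\w*ing\b)\w+\b/g);
console.log(notIng); // ["fast", "and", "loud"]
```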

Positive and Negative Lookbehind

Lookbehinds check what precedes the current position. They are perfect for capturing numbers with currency prefixes, matching text inside quotes, or stripping prefixes.

// Prices preceded by a dollar sign
/(?<=\$)\d+(\.\d{2})?/g

// Words not preceded by "not "
/(?<!not )\bimportant\b/g

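Lookbehind support was historically patchy, but as of 2026 most major engines support it: JavaScript (ES2018+), Python (re module, fixed-width patterns only), PCRE, Java and .NET. The exceptions are engines built for guaranteed linear-time matching: Go's regexp package (RE2 syntax) and Rust's regex crate support neither lookahead nor lookbehind. If you need to support older Safari (versions before 16.4), test carefully.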
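The two patterns above can be tried as follows. The sample strings are invented, and supportsLookbehind is a hypothetical helper name: it is one common way to feature-detect lookbehind, using the RegExp constructor so that an unsupported engine throws at runtime rather than failing to parse the whole script.

```javascript
// Hypothetical helper: returns false on engines that reject lookbehind syntax.
function supportsLookbehind() {
  try {
    new RegExp("(?<=a)b");
    return true;
  } catch {
    return false;
  }
}

// Prices preceded by a dollar sign; the bare 5 is skipped.
const receipt = "Total: $19.99 plus 5 items and $7";
const prices = receipt.match(/(?<=\$)\d+(\.\d{2})?/g);
console.log(prices); // ["19.99", "7"]

// Only the first "important" qualifies; the second follows "not ".
const review = "This is important, that is not important";
const positions = [...review.matchAll(/(?<!not )\bimportant\b/g)].map((m) => m.index);
console.log(positions); // [8]
```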

Real-World Pattern: Password Validation

The classic lookahead use case. You want to enforce multiple independent rules on a password: at least one lowercase letter, at least one uppercase letter, at least one digit, at least one special character, and a minimum length. Instead of a monstrous alternation, stack several lookaheads at the start of the pattern.

const strongPassword =
  /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{12,}$/;

strongPassword.test("Weakpass1");      // false (no special character, and only 9 characters)
strongPassword.test("StrongPass1!");   // true

Each lookahead is a zero-width check from the start of the string. They run in sequence, none of them consume characters, and finally .{12,} consumes the full password. The beauty is that each rule is independent and you can add or remove lookaheads without touching the others.
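Because the rules are independent, they can also be kept in a table and tested one at a time, which makes it possible to report which rule failed. The sketch below is one way to do that, not the only one: the rules array, the failedRules helper and the choice to express the minimum length as a fifth lookahead are all assumptions made for this example.

```javascript
// Hypothetical rule table: each entry is one independent lookahead.
const rules = [
  { name: "lowercase", check: /(?=.*[a-z])/ },
  { name: "uppercase", check: /(?=.*[A-Z])/ },
  { name: "digit", check: /(?=.*\d)/ },
  { name: "special", check: /(?=.*[!@#$%^&*])/ },
  { name: "length", check: /(?=.{12,}$)/ },
];

// Stack every lookahead after ^, then consume the whole string.
const strongPassword = new RegExp(
  "^" + rules.map((rule) => rule.check.source).join("") + ".*$"
);

// Run each lookahead on its own to find out which rules are not met.
function failedRules(password) {
  return rules
    .filter((rule) => !new RegExp("^" + rule.check.source).test(password))
    .map((rule) => rule.name);
}

console.log(strongPassword.test("StrongPass1!")); // true
console.log(failedRules("StrongPass1!"));         // []
console.log(failedRules("Weakpass1"));            // ["special", "length"]
```

Adding or removing a rule is then a one-line change to the table.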

Real-World Pattern: Thousands Separator

Inserting commas into large numbers is a classic interview question that is almost trivial with lookahead. The insight is that you want to insert a comma at every position inside the number that is followed by one or more complete groups of three digits, with no digits left over after them.

"1234567890".replace(/\B(?=(\d{3})+(?!\d))/g, ",");
// "1,234,567,890"

The \B asserts a non-word-boundary, which prevents a comma from being inserted before the first digit. The lookahead says "what follows is one or more groups of three digits, and then no more digits". It consumes nothing, so replace inserts a comma at each match position.
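One thing to watch for: applied directly to a string with a fractional part, the same pattern also groups the digits after the decimal point. A minimal sketch of a wrapper that avoids this by formatting only the integer part is shown below; addThousands is a hypothetical helper name, and it assumes plain decimal input without exponents or existing separators.

```javascript
// Hypothetical helper: group only the digits before the decimal point.
function addThousands(value) {
  const [integerPart, fraction] = String(value).split(".");
  const grouped = integerPart.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
  return fraction === undefined ? grouped : `${grouped}.${fraction}`;
}

// The raw pattern also inserts a comma inside the fraction.
console.log("1234.5678".replace(/\B(?=(\d{3})+(?!\d))/g, ",")); // "1,234.5,678"

console.log(addThousands("1234.5678")); // "1,234.5678"
console.log(addThousands(1234567890));  // "1,234,567,890"
console.log(addThousands(-1234));       // "-1,234"
console.log(addThousands(999));         // "999"
```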

Real-World Pattern: Match Inside Quotes

Extract text between quotation marks without including the quotes in the match.

/(?<=")[^"]+(?=")/g

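For a single quoted string this is cleaner than using capture groups because the match is exactly the text inside the quotes, with no post-processing needed. Be careful when the input contains several quoted strings: the quotes are never consumed, so the text between a closing quote and the next opening quote also satisfies both assertions. The input "a" and "b" produces three matches (a, then " and ", then b). In that case a consuming pattern with a capture group, such as /"([^"]*)"/g, is the safer choice. For production code, remember to handle escaped quotes as well.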
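It is worth checking how the pattern behaves when one string holds several quoted segments. The sketch below compares it with a capture-group alternative; the sample strings are invented and the alternative pattern is a suggestion rather than part of the original recipe.

```javascript
const lookaround = /(?<=")[^"]+(?=")/g;

// One quoted segment: the match is exactly the inner text.
const single = 'He said "hello" loudly'.match(lookaround);
console.log(single); // ["hello"]

// Two quoted segments: the text between them is also surrounded by quotes.
const multi = 'say "a" and "b"';
console.log(multi.match(lookaround)); // ["a", " and ", "b"]

// Consuming the quotes and capturing the inside pairs them up correctly.
const paired = [...multi.matchAll(/"([^"]*)"/g)].map((m) => m[1]);
console.log(paired); // ["a", "b"]
```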

Python Syntax

Python's re module uses identical lookaround syntax, but lookbehind in the standard re module must be fixed-width. For variable-width lookbehind, use the third-party regex package.

import re
# Positive lookahead
re.findall(r"\d+(?=\s*USD)", "100 USD and 50 EUR")
# ['100']

# Positive lookbehind (fixed width)
re.findall(r"(?<=\$)\d+", "$100 and 50")
# ['100']

JavaScript Variable-Width Lookbehind

JavaScript (ES2018+) supports variable-width lookbehind out of the box, so patterns like (?<=\$\s*)\d+, which matches digits that follow a dollar sign and any amount of whitespace, work fine in modern browsers and Node.js. This is one of the few areas where JavaScript regex is more flexible than Python's default re.
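As an illustration, the sketch below uses a lookbehind whose width depends on the input: a dollar sign followed by zero or more whitespace characters. The pattern and the sample string are assumptions chosen for this example.

```javascript
// The lookbehind spans one character for "$100" and two for "$ 250".
const text = "$100, $ 250 and 50";
const amounts = text.match(/(?<=\$\s*)\d+/g);
console.log(amounts); // ["100", "250"]
```

The trailing 50 is skipped because no dollar sign precedes it.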

Common Pitfalls

Forgetting that lookarounds are zero-width. (?=foo)foo matches foo, not foofoo. The lookahead does not advance.
Using lookahead when you want a capture group. If you need the following text in your result, use a normal group. Lookahead is for conditions, not capture.
Backtracking cost. Complex lookarounds, especially nested ones or ones applied to very long inputs, can be slow. Benchmark before using them in hot paths. For short inputs this is almost never a problem.
Go/RE2 does not support lookarounds (neither lookahead nor lookbehind) or backreferences. This is a deliberate design choice that guarantees linear-time matching, and Rust's regex crate makes the same trade-off. If your regex needs to run in Go, you may have to work around it with code.
Multiline and dotall flags matter. The . does not match newlines by default, so if a lookahead or lookbehind uses . to span lines, enable the s (dotall) flag. If it relies on ^ or $ matching at line breaks, enable the m (multiline) flag.
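Three of these pitfalls are easy to reproduce in a few lines. The sketch below uses invented sample strings and is meant only as a quick check you can paste into a console.

```javascript
// Zero-width: the lookahead checks for foo, then the literal foo consumes it once.
const zeroWidth = "foofoo".match(/(?=foo)foo/)[0];
console.log(zeroWidth); // "foo"

// Condition versus capture: use a group when the following text belongs in the result.
const conditionOnly = "foobar".match(/foo(?=bar)/);
const captured = "foobar".match(/foo(bar)/);
console.log(conditionOnly[0]);         // "foo"
console.log(captured[0], captured[1]); // "foobar" "bar"

// Dotall: without the s flag, . stops at the newline inside the lookahead.
const withoutS = /start(?=.*end)/.test("start\nend");
const withS = /start(?=.*end)/s.test("start\nend");
console.log(withoutS, withS); // false true
```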

Quick Reference

(?=X)    positive lookahead   - followed by X
(?!X)    negative lookahead   - not followed by X
(?<=X)   positive lookbehind  - preceded by X
(?<!X)   negative lookbehind  - not preceded by X
