Email Regex Validation: How to Validate Email Addresses in JavaScript and Python

Email validation is one of the most common tasks in web development. Every signup form, contact page and newsletter subscription needs to verify that the user provided a properly formatted email address. Getting it wrong means bounced messages, corrupted databases and frustrated users. Regular expressions are the go-to tool for this job, but the details matter more than most developers realize.

Why Email Validation Matters

Validating email addresses at the point of entry prevents a cascade of downstream problems. Here is why it deserves more attention than a quick copy-paste from Stack Overflow.

  • Data quality. Invalid emails pollute your database and skew analytics. Marketing teams end up with inflated subscriber counts that never convert.
  • Deliverability. Sending to malformed addresses increases your bounce rate. High bounce rates trigger spam filters, which can damage your sender reputation across all outbound email.
  • User experience. Catching a typo before the form submits saves the user from waiting for a confirmation email that never arrives.
  • Security. Malformed input can be a vector for injection attacks if your backend processes email addresses without proper sanitization.

The Simple Email Regex Pattern Explained

The most commonly used basic email regex looks like this:

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

Let's break this down piece by piece:

  • ^ asserts the start of the string.
  • [a-zA-Z0-9._%+-]+ matches the local part (before the @). It allows letters, digits, dots, underscores, percent signs, plus signs and hyphens. The + requires at least one character.
  • @ matches the literal at-sign separator.
  • [a-zA-Z0-9.-]+ matches the domain name. It allows letters, digits, dots and hyphens.
  • \. matches the literal dot before the TLD. The backslash is needed because an unescaped dot matches any character.
  • [a-zA-Z]{2,} matches the top-level domain, requiring at least two letters. This covers .com, .io, .museum and newer TLDs like .technology.
  • $ asserts the end of the string.

This pattern is good enough for most web forms. It catches obvious typos and rejects clearly invalid input, while being simple enough to understand and maintain.
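The breakdown above can also be kept next to the pattern itself. The following is a minimal Python sketch, assuming the re.VERBOSE flag is acceptable in your codebase; the name EMAIL_PATTERN is illustrative and not part of any library.

```python
import re

# Same pattern as above, written with re.VERBOSE so each piece carries its own comment.
EMAIL_PATTERN = re.compile(
    r"""
    ^                   # start of the string
    [a-zA-Z0-9._%+-]+   # local part: one or more allowed characters
    @                   # literal at-sign
    [a-zA-Z0-9.-]+      # domain name: letters, digits, dots, hyphens
    \.                  # literal dot before the TLD
    [a-zA-Z]{2,}        # TLD: at least two letters
    $                   # end of the string
    """,
    re.VERBOSE,
)

for candidate in ["user@example.com", "user@example", "user@example.c"]:
    # fullmatch() requires the entire string to match the pattern.
    print(candidate, EMAIL_PATTERN.fullmatch(candidate) is not None)
```

Only the first candidate should pass: the second has no dot in the domain and the third has a one-letter TLD.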

Common Email Regex Patterns by Strictness

Basic Pattern

The minimal check. It only verifies that there is something before and after the @, with a dot in the domain:

.+@.+\..+

This is too permissive for production use. It will match strings with spaces, special characters and other invalid input. Use it only for the loosest sanity check.
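To make the permissiveness concrete, here is a small Python sketch that runs the basic pattern against a few deliberately bad strings. The sample inputs are made up for illustration.

```python
import re

# The basic pattern: something, an @, something, a dot, something.
BASIC = re.compile(r".+@.+\..+")

samples = [
    "user@example.com",   # a normal address
    "a b@c d.e f",        # spaces everywhere
    "a@b@c.d",            # two at-signs
    "<script>@x.y",       # markup characters
    "user.example.com",   # no at-sign at all
]

for sample in samples:
    print(repr(sample), BASIC.fullmatch(sample) is not None)
```

Everything except the last sample is accepted, which is why this pattern is only suitable as a first sanity check.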

Intermediate Pattern

This strikes a good balance between accuracy and simplicity. It is the pattern most projects should use:

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

RFC 5322 Compliant Pattern

The official email specification (RFC 5322) defines an extremely permissive syntax. The full regex to match it is impractical for real use, but here is a closer approximation that handles more edge cases:

^[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$

This pattern allows the full set of valid characters in the local part (including !, #, $ and others), prevents consecutive dots and requires valid domain label formatting. In most real-world applications, the intermediate pattern above is sufficient.
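The difference between the two patterns is easiest to see side by side. The sketch below compares them in Python on a handful of illustrative inputs. Note that the stricter pattern here includes the underscore in the local part, which RFC 5322 permits; the names INTERMEDIATE and STRICTER are assumptions made for this example.

```python
import re

INTERMEDIATE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Closer approximation of RFC 5322: dot-separated atoms in the local part,
# and domain labels that must start and end with a letter or digit.
STRICTER = re.compile(
    r"[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}"
)

samples = [
    "user..name@example.com",  # consecutive dots in the local part
    ".user@example.com",       # leading dot in the local part
    "user@-example.com",       # domain label starting with a hyphen
    "o'brien@example.com",     # apostrophe, valid per the RFC
]

for sample in samples:
    print(
        sample,
        "intermediate:", INTERMEDIATE.fullmatch(sample) is not None,
        "stricter:", STRICTER.fullmatch(sample) is not None,
    )
```

The first three samples pass the intermediate pattern but fail the stricter one, while the apostrophe address does the opposite.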

Email Validation in JavaScript

JavaScript's built-in RegExp makes email validation straightforward. Here is a reusable function:

function isValidEmail(email) {
  const pattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
  return pattern.test(email);
}

// Usage
console.log(isValidEmail("user@example.com"));     // true
console.log(isValidEmail("jane.doe+work@corp.io")); // true
console.log(isValidEmail("missing-at-sign.com"));   // false
console.log(isValidEmail("@no-local-part.com"));    // false
console.log(isValidEmail("user@.com"));             // false

The .test() method returns a boolean, making it ideal for form validation. If you also need to extract the match or capture groups, use .match() or .exec() instead.

For a more robust version, you can add length checks and normalize the input:

function validateEmail(email) {
  if (typeof email !== "string") return false;

  const trimmed = email.trim().toLowerCase();

  if (trimmed.length === 0 || trimmed.length > 254) return false;

  const localPart = trimmed.split("@")[0];
  if (localPart && localPart.length > 64) return false;

  const pattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
  return pattern.test(trimmed);
}

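RFC 5321 limits the local part to 64 characters, and its 256-character limit on the full path works out to a maximum of 254 characters for the address itself. Adding these checks catches addresses that match the pattern but will be rejected by mail servers. Lowercasing is a pragmatic choice: the local part is formally case-sensitive, although in practice nearly all providers treat it as case-insensitive.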

Email Validation in Python

Python's re module provides the same regex capabilities. Here is the equivalent validation function:

import re

def is_valid_email(email: str) -> bool:
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    # fullmatch() requires the whole string to match. With re.match(), "$"
    # would also accept an address followed by a trailing newline.
    return bool(re.fullmatch(pattern, email))

# Usage
print(is_valid_email("developer@codetools.run"))  # True
print(is_valid_email("test+tag@sub.domain.com"))  # True
print(is_valid_email("spaces in@email.com"))      # False
print(is_valid_email("double..dot@email.com"))    # True (basic regex allows this)

Note that re.fullmatch() already anchors the pattern at both ends, so the ^ and $ anchors are technically redundant. However, including them makes the pattern portable across languages and easier to read. Avoid re.match() for this check: in Python, $ also matches just before a trailing newline, so "user@example.com\n" would pass.

For production Python projects, consider using the email-validator library instead of rolling your own regex. It handles internationalized addresses, checks deliverability, and normalizes output:

from email_validator import validate_email, EmailNotValidError

try:
    result = validate_email("user@example.com", check_deliverability=True)
    clean_email = result.normalized
    print(f"Valid: {clean_email}")
except EmailNotValidError as e:
    print(f"Invalid: {e}")
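If a third-party dependency is not an option, the length checks from the JavaScript section can be ported using only the standard library. This is a minimal sketch, not a replacement for the library above; the function name validate_email_basic is made up for this example, and it trims whitespace but leaves the case untouched.

```python
import re

EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def validate_email_basic(email: object) -> bool:
    """Format check plus the 254/64 length limits described earlier."""
    if not isinstance(email, str):
        return False

    trimmed = email.strip()
    if not trimmed or len(trimmed) > 254:
        return False

    # Everything before the last @ is the local part.
    local_part, _, _domain = trimmed.rpartition("@")
    if len(local_part) > 64:
        return False

    return EMAIL_PATTERN.fullmatch(trimmed) is not None


print(validate_email_basic("  user@example.com  "))
print(validate_email_basic("a" * 65 + "@example.com"))
print(validate_email_basic(None))
```

The padded address passes after trimming, while the 65-character local part and the non-string input are rejected.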

Edge Cases That Break Simple Regex

The email specification is famously permissive. Here are real edge cases that the basic regex either wrongly accepts or wrongly rejects.

Plus Addressing (Sub-addressing)

Addresses like user+newsletters@gmail.com are perfectly valid. Gmail, Outlook and many other providers support the + character for filtering. Our intermediate pattern handles this correctly, but some overly strict patterns reject it.

Internationalized Domain Names (IDN)

Domain names can contain non-ASCII characters. In DNS they are represented in an ASCII-compatible Punycode form, so an address like user@münchen.de is valid once the domain is converted to that form (xn--mnchen-3ya.de). An ASCII-only regex will reject the Unicode form, so convert the domain before matching.
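One way to handle this is to convert the domain before applying the regex. The sketch below uses Python's built-in "idna" codec, which implements the older IDNA 2003 rules; for newer rules a dedicated library may be a better fit. The helper name to_ascii_address is an assumption made for this example.

```python
import re

ASCII_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def to_ascii_address(address: str) -> str:
    """Convert only the domain to its ASCII-compatible (Punycode) form."""
    local_part, separator, domain = address.rpartition("@")
    if not separator:
        raise ValueError("address has no @")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local_part}@{ascii_domain}"


unicode_address = "user@münchen.de"
converted = to_ascii_address(unicode_address)

print(converted)
print(ASCII_PATTERN.fullmatch(unicode_address) is not None)  # Unicode form
print(ASCII_PATTERN.fullmatch(converted) is not None)        # converted form
```

The Unicode form fails the ASCII pattern, and the converted form passes it.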

Quoted Local Parts

The RFC allows the local part to be quoted, enabling characters that are normally forbidden. For example, "john doe"@example.com and "very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"@strange.example.com are technically valid. No simple regex handles these, and in practice, no major email provider creates addresses like this.

IP Address Domains

Email addresses can use an IP address instead of a domain name, like user@[192.168.1.1]. These are valid per the spec but extremely rare in consumer-facing applications. Most validation patterns intentionally exclude them.
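If you do need to accept these, the bracketed part can be checked separately from the regex. The following is a rough sketch using Python's ipaddress module. It assumes the RFC 5321 convention that IPv6 literals carry an "IPv6:" prefix, and the function name is hypothetical.

```python
import ipaddress


def is_ip_literal_domain(domain: str) -> bool:
    """Return True for domains such as [192.168.1.1] or [IPv6:::1]."""
    if not (domain.startswith("[") and domain.endswith("]")):
        return False

    literal = domain[1:-1]
    try:
        if literal[:5].lower() == "ipv6:":
            ipaddress.IPv6Address(literal[5:])
        else:
            ipaddress.IPv4Address(literal)
    except ValueError:
        # Raised by the ipaddress module for malformed addresses.
        return False
    return True


for candidate in ["[192.168.1.1]", "[999.1.1.1]", "[IPv6:::1]", "example.com"]:
    print(candidate, is_ip_literal_domain(candidate))
```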

Consecutive Dots

An address like user..name@example.com is invalid per RFC 5322, but both the basic and the intermediate patterns accept it. If you need to catch it, add a negative lookahead, (?!.*\.\.), immediately after the ^ anchor. It rejects any string that contains two dots in a row, in the local part or in the domain.
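As a quick illustration, here is the intermediate pattern with that lookahead added, sketched in Python. The constant name NO_DOUBLE_DOTS is made up for this example.

```python
import re

# (?!.*\.\.) fails the match as soon as two consecutive dots appear anywhere.
NO_DOUBLE_DOTS = re.compile(
    r"^(?!.*\.\.)[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
)

for candidate in [
    "user.name@example.com",
    "user..name@example.com",
    "user@example..com",
]:
    print(candidate, NO_DOUBLE_DOTS.fullmatch(candidate) is not None)
```

Only the first address passes. The lookahead does not catch a leading or trailing dot in the local part, which would need a separate check.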

Why Regex Alone Is Not Enough

Regex can only check the format of an email address. It cannot tell you whether the address actually exists or whether it can receive mail. A comprehensive validation strategy layers multiple checks.

  • MX record lookup. Querying DNS for the domain's MX records confirms that the domain is configured to receive email. A domain with no MX records is a strong warning sign, although RFC 5321 lets senders fall back to the domain's A or AAAA record, so a missing MX record alone does not prove that mail is undeliverable.
  • SMTP verification. Opening a connection to the mail server and issuing an RCPT TO command can check if the specific mailbox exists. However, many servers disable this to prevent enumeration attacks.
  • Disposable email detection. Services like Mailinator and Guerrilla Mail provide throwaway addresses. Maintaining a blocklist of disposable domains helps filter them out.
  • Confirmation email. The most reliable validation is sending a real email with a unique confirmation link. If the user clicks it, you know the address works. This is the gold standard for email verification.

The practical approach is to combine a regex check on the client side for instant feedback with a confirmation email on the server side for definitive validation.
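The layering can be sketched as a short server-side pipeline. In this Python example the DNS step is passed in as a function so the sketch stays self-contained. The check_address helper, the domain_accepts_mail callback and the two-entry blocklist are all illustrative assumptions; a real blocklist would be far longer and kept up to date.

```python
import re
from typing import Callable

EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Illustrative only: a tiny sample of disposable-address domains.
DISPOSABLE_DOMAINS = frozenset({"mailinator.com", "guerrillamail.com"})


def check_address(address: str, domain_accepts_mail: Callable[[str], bool]) -> str:
    """Run the cheap checks first and return a short status string."""
    if EMAIL_PATTERN.fullmatch(address) is None:
        return "bad format"

    domain = address.rpartition("@")[2].lower()
    if domain in DISPOSABLE_DOMAINS:
        return "disposable domain"

    # In production this callback would perform a DNS lookup for the domain.
    if not domain_accepts_mail(domain):
        return "domain does not accept mail"

    return "ok, send confirmation email"


def fake_dns_check(domain: str) -> bool:
    """Stand-in for a DNS lookup: pretend one domain has no mail setup."""
    return domain != "no-mail.example"


for address in [
    "user@example.com",
    "not-an-email",
    "temp@mailinator.com",
    "user@no-mail.example",
]:
    print(address, "->", check_address(address, fake_dns_check))
```

Passing the lookup in as a parameter keeps the format and blocklist checks testable without network access.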

HTML5 Built-in Email Validation

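Before reaching for a custom regex, remember that HTML5 provides native email validation through the <input type="email"> element. Browsers validate the input against the pattern defined in the HTML standard, which deliberately deviates from RFC 5322: it rejects quoted local parts and comments, for example, and accepts dotless domains such as user@localhost.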

<!-- Basic HTML5 email validation -->
<form>
  <label for="email">Email:</label>
  <input
    type="email"
    id="email"
    name="email"
    required
    placeholder="you@example.com"
  />
  <button type="submit">Subscribe</button>
</form>

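<!-- With a custom regex pattern attribute. Hyphens inside [...] are escaped
     because current browsers compile the pattern with the "v" flag, which
     rejects an unescaped literal hyphen in a character class. -->
<input
  type="email"
  pattern="[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}"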
  title="Enter a valid email address"
  required
/>

The type="email" attribute gives you free validation, a mobile-optimized keyboard and built-in error messages. The optional pattern attribute lets you layer your own regex on top for stricter rules. This is often the best starting point for simple forms because it requires zero JavaScript.

Best Practices for Email Validation

The following guidelines summarize the practical approach described in this article:

  • Keep the regex simple. The intermediate pattern (^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$) covers the vast majority of real email addresses. Over-engineering the pattern causes more problems than it solves.
  • Validate on both client and server. Client-side regex provides instant feedback. Server-side validation is the real gate. Never trust the client alone.
  • Normalize before validating. Trim whitespace and convert to lowercase. This prevents false negatives from accidental spaces or capitalization.
  • Check length limits. The total address must not exceed 254 characters (RFC 5321), and the local part must not exceed 64 characters.
  • Do not block plus addressing. Rejecting the + character frustrates power users who rely on it for email filtering. It is valid and should be accepted.
  • Send a confirmation email. For any workflow where you need to actually reach the user, this is the only way to confirm the address works.
  • Show clear error messages. Instead of a generic "invalid email" message, tell the user what is wrong. "Missing @ symbol" or "domain appears invalid" helps them fix the issue faster.
  • Do not try to fully implement RFC 5322 in regex. The full spec allows absurd edge cases that no real email provider uses. A pragmatic pattern that covers real-world addresses is far more useful than a technically complete one that is impossible to maintain.
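The advice about clear error messages can be sketched as a function that returns a specific reason instead of a bare true or false. This is one possible shape in Python; the function name and the message wording are illustrative assumptions.

```python
import re
from typing import Optional

EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def describe_email_problem(email: str) -> Optional[str]:
    """Return a user-facing message, or None if the address looks fine."""
    trimmed = email.strip()
    if not trimmed:
        return "Please enter an email address."
    if "@" not in trimmed:
        return "Missing @ symbol."

    local_part, _, domain = trimmed.rpartition("@")
    if not local_part:
        return "Missing the part before the @ symbol."
    if "." not in domain:
        return "Domain appears invalid: it needs a dot, as in example.com."
    if EMAIL_PATTERN.fullmatch(trimmed) is None:
        return "The address contains characters that are not allowed."
    return None


for candidate in ["missing-at-sign.com", "user@example", "user name@example.com", "user@example.com"]:
    print(repr(candidate), "->", describe_email_problem(candidate))
```

The checks run from the most obvious mistake to the least, so the user sees the message most likely to help them fix the input.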
