Email Syntax Validation
Email syntax validation is the check that confirms an email address conforms to the format defined in RFC 5321 and RFC 5322. Before touching DNS or an SMTP server, the system first asks a simpler question: could this string even be a valid email address?
Structure of an email address
The format is local-part@domain. Left of the @ is the local part (the mailbox name); right of it is the domain. The rules for each half differ significantly.
The local part may contain ASCII letters, digits, dots, hyphens, underscores, and the special characters ! # $ % & ' * + / = ? ^ ` { | } ~. In its usual unquoted form, a dot cannot appear first or last, and two consecutive dots are forbidden. The local part is capped at 64 characters.
The domain follows DNS naming rules: each label (the segment between dots) may contain only letters, digits, and hyphens, may not start or end with a hyphen, and may not exceed 63 characters. The full address, including the @ sign, must stay within 254 characters.
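The length and dot rules above can be sketched as a small Python function. This is an illustration only: the name check_structure and the wording of the messages are assumptions made for this example, and the function deliberately ignores which characters are allowed.

```python
def check_structure(address: str) -> list[str]:
    """Return a list of rule violations; an empty list means the structure looks fine."""
    problems = []
    if len(address) > 254:
        problems.append("address longer than 254 characters")

    # Split on the last @ so the domain is whatever follows it.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        problems.append("missing local part, @ sign, or domain")
        return problems

    if len(local) > 64:
        problems.append("local part longer than 64 characters")
    if local.startswith(".") or local.endswith(".") or ".." in local:
        problems.append("misplaced dot in local part")

    for label in domain.split("."):
        if not 1 <= len(label) <= 63:
            problems.append("domain label is empty or longer than 63 characters")
        elif label.startswith("-") or label.endswith("-"):
            problems.append("domain label starts or ends with a hyphen")
    return problems
```

Returning a list of problems rather than a single boolean makes it easier to tell the user what to fix.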
How the check works
The most common tool is a regular expression (regex). Minimal patterns just look for an @ and a dot in the domain. More thorough ones enforce allowed characters, length limits, and domain structure.
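To make the difference concrete, here is a sketch of both kinds of pattern in Python. The names MINIMAL and STRICTER are made up for this example, and the stricter pattern is one possible practical subset, not a complete RFC 5322 implementation. It also requires at least one dot in the domain, which is a practical choice rather than an RFC requirement.

```python
import re

# Minimal: one @, no whitespace, and a dot somewhere in the domain.
MINIMAL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Stricter: length limits, dot-separated runs of allowed characters in the
# local part, and letter/digit/hyphen labels in the domain.
STRICTER = re.compile(
    r"(?=.{1,254}\Z)"        # whole address at most 254 characters
    r"(?=[^@]{1,64}@)"       # local part at most 64 characters
    r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@"
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+"  # labels followed by a dot
    r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"         # final label
)


def is_valid(pattern: re.Pattern, address: str) -> bool:
    """True if the whole string matches the pattern."""
    return pattern.fullmatch(address) is not None
```

With these definitions, is_valid(MINIMAL, "user..name@mail.com") is True while is_valid(STRICTER, "user..name@mail.com") is False, which shows how much a minimal pattern lets through.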
Fully reproducing RFC 5322 in a single regex is not practical. The standard permits quoted strings in the local part, comments in parentheses (which may nest), and IP address literals in place of a hostname. Real addresses almost never use any of that, so validators deliberately cover a practical subset of the rules.
Some systems replace regex with a parser — a sequential walk through the string according to a formal grammar. Parsers are easier to test and debug than dense regex patterns. For lightweight client-side checks, though, regex remains the standard choice.
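A parser of this kind might look like the following Python sketch. It assumes a simplified grammar (dot-separated runs of allowed characters, an @, then dot-separated domain labels) and skips quoted strings, comments, and IP literals. The names parse_address and ParseError are hypothetical.

```python
import string

ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+/=?^_`{|}~-")
LDH = set(string.ascii_letters + string.digits + "-")


class ParseError(ValueError):
    """Raised with a message that names the failing position."""


def parse_address(text: str) -> tuple[str, str]:
    """Walk the string once and return (local_part, domain)."""
    pos = 0

    def read_run(allowed, what):
        # Consume one or more allowed characters starting at pos.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] in allowed:
            pos += 1
        if pos == start:
            raise ParseError(f"expected {what} at position {start}")
        return text[start:pos]

    def read_dotted(allowed, what):
        # Consume run ("." run)*, so empty segments are rejected.
        nonlocal pos
        parts = [read_run(allowed, what)]
        while pos < len(text) and text[pos] == ".":
            pos += 1
            parts.append(read_run(allowed, what))
        return parts

    local = ".".join(read_dotted(ATEXT, "local-part character"))
    if pos >= len(text) or text[pos] != "@":
        raise ParseError(f"expected '@' at position {pos}")
    pos += 1
    labels = read_dotted(LDH, "domain character")
    if pos != len(text):
        raise ParseError(f"unexpected character {text[pos]!r} at position {pos}")
    for label in labels:
        if label.startswith("-") or label.endswith("-") or len(label) > 63:
            raise ParseError(f"invalid domain label {label!r}")
    if len(local) > 64 or len(text) > 254:
        raise ParseError("length limit exceeded")
    return local, ".".join(labels)
```

Because every failure carries a position, a test can assert exactly where and why an input was rejected, which is harder to get out of a single regex.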
Common format errors
- Spaces inside the address: user @mail.com
- Consecutive dots: user..name@mail.com
- Missing domain: user@
- Non-ASCII domain without Punycode encoding: user@münchen.de
- Disallowed characters: user<name>@mail.com
- Address exceeding 254 characters
Why syntax checking matters
Syntax is the cheapest check there is — no network round-trip, no external service. Rejecting clearly broken input here means no DNS query gets fired for a string like “hello”, and no SMTP connection gets opened for something that could never be an address.
Syntax validation catches structural typos — a missing @, doubled dots, an accidental space — that no amount of DNS or SMTP probing would ever fix. It does not guarantee the address exists, but it does guarantee the format is coherent.
On subscription forms, client-side syntax validation (JavaScript) gives immediate feedback. The user sees the error before the form even submits, which trims the share of malformed addresses that make it into the database in the first place.
Limitations
A syntactically valid address can still be completely useless. The string zzzzz@zzzzz.zzz passes every format check, yet no such domain exists. That is why syntax validation is only the first step, followed by DNS, MX lookup, SMTP probing, and further checks.
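The ordering of cheap checks before expensive ones can be sketched as a list of stages that stops at the first failure. Everything here is hypothetical: run_pipeline, looks_like_address, and the stage names are invented for illustration and do not describe any particular product's implementation.

```python
from typing import Callable, Iterable, Optional, Tuple

Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(address: str, stages: Iterable[Stage]) -> Optional[str]:
    """Return the name of the first failing stage, or None if all stages pass."""
    for name, check in stages:
        if not check(address):
            return name  # later, more expensive stages never run
    return None


def looks_like_address(address: str) -> bool:
    """Very rough syntax gate: something@something.with.a.dot, no spaces."""
    local, sep, domain = address.rpartition("@")
    return bool(sep and local and "." in domain and " " not in address)
```

Calling run_pipeline("hello", [("syntax", looks_like_address), ("dns", some_dns_check)]) returns "syntax" without ever invoking some_dns_check, where some_dns_check stands for whatever network lookup the real system would perform.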
Strict rules can overcorrect. An address with a + sign like user+tag@gmail.com is valid per RFC 5322, but hand-rolled regex patterns often reject it. Internationalized domain names (IDNs) have a similar problem: the addresses are legitimate, but the domain needs Punycode conversion before any ASCII-only pattern will match it.
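The Punycode step can be sketched with Python's built-in idna codec. The helper name to_ascii_domain is an assumption for this example. Note that the standard-library codec implements the older IDNA 2003 rules, so production code may prefer the third-party idna package, which implements IDNA 2008.

```python
def to_ascii_domain(address: str) -> str:
    """Convert the domain to its ASCII (Punycode) form and leave the local part as is."""
    local, _, domain = address.rpartition("@")
    # The built-in codec converts each non-ASCII label to its xn-- form.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


print(to_ascii_domain("user@münchen.de"))      # user@xn--mnchen-3ya.de
print(to_ascii_domain("user+tag@gmail.com"))   # unchanged: already ASCII
```

Running the conversion before the ASCII pattern means the same regex can serve both plain and internationalized domains.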
Practical examples
The HTML5 attribute type="email" triggers built-in browser validation. It catches basic formatting issues but accepts technically valid strings like a@b that are useless in practice.
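The reason a@b passes becomes clear from the pattern the HTML Standard publishes for a valid e-mail address. Below is a Python transcription of that pattern, offered as an approximation of browser behaviour and not as an exact model of any specific browser. The name HTML_EMAIL is made up for this example.

```python
import re

# Transcribed from the HTML Standard's "valid e-mail address" pattern.
# Note that the domain may be a single label, and that dots in the local
# part are not restricted.
HTML_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@"
    r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def html_accepts(address: str) -> bool:
    """True if the whole string matches the transcribed pattern."""
    return HTML_EMAIL.fullmatch(address) is not None
```

Under this pattern html_accepts("a@b") and html_accepts("user..name@mail.com") are both True, while html_accepts("user @mail.com") is False, so server-side checks still have work to do.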
Server-side libraries such as email-validator (Python) and validator.js (Node) apply stricter rules: they split the address into parts, check lengths and character sets, and handle encoding. Most of them also normalise — lowercasing the domain, trimming leading and trailing whitespace — before returning a result.
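The normalisation step can be illustrated with a few lines of standard-library Python. This is a sketch of the idea, not the implementation used by either library, and the function name normalise is an assumption. The local part is left untouched because it is technically case-sensitive under RFC 5321, even though most providers treat it as case-insensitive.

```python
def normalise(address: str) -> str:
    """Trim surrounding whitespace and lowercase only the domain."""
    trimmed = address.strip()
    local, sep, domain = trimmed.rpartition("@")
    if not sep:
        # No @ at all: nothing to normalise beyond trimming.
        return trimmed
    return f"{local}@{domain.lower()}"


print(normalise("  User.Name@Example.COM \n"))  # User.Name@example.com
```

Normalising before storage also helps deduplication, since addresses that differ only in domain case or stray whitespace collapse to one form.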
uChecker runs syntax validation as the first stage of its multi-step pipeline. Addresses with format errors are rejected instantly; the rest move on to DNS, MX, SMTP, and additional checks. Upload a list or connect via API and get results back in seconds.
