It misses the very common mistake of typing a comma instead of a dot.
Otherwise, yeah, most people would be better served by a library that detects domain typos, such as https://github.com/mailcheck/mailcheck, than by spending time on regexes.
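For the comma-instead-of-dot case specifically, a minimal sketch of a "did you mean" check might look like the following. The function name `suggest_comma_fix` is hypothetical, and this is not how mailcheck works; it only illustrates catching that one typo in the domain part.

```python
from typing import Optional


def suggest_comma_fix(address: str) -> Optional[str]:
    """Return a suggested correction if the domain contains a comma, else None.

    Hypothetical helper: it only looks at the part after the last "@" and
    assumes any comma there was meant to be a dot.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or "," not in domain:
        # No "@" at all, or nothing to fix in the domain.
        return None
    return f"{local}@{domain.replace(',', '.')}"


print(suggest_comma_fix("alice@example,com"))  # alice@example.com
print(suggest_comma_fix("alice@example.com"))  # None
```

The suggestion is best shown to the user for confirmation rather than applied silently.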
I invite you to try my test suite at https://github.com/daurnimator/lpeg_patterns/blob/master/spe...
One important aspect of this that I often see people forget: this check is designed for ASCII, but the domain name, at least, can be non-ASCII. User interfaces should support IDN domain labels and convert them to Punycode (A-labels) before validating, and if you store A-labels (which you probably do), convert them back to Unicode form (U-labels) when presenting them to users.
(Alas, still doesn’t support non-ASCII in the local part, which isn’t supported everywhere but is, I believe, fairly widely supported now. See https://github.com/whatwg/html/issues/4562 plus https://en.wikipedia.org/wiki/Email_address_internationaliza... for a little more background on what it is.)
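The A-label round trip for domains described above can be sketched with Python's built-in `idna` codec. The function names are hypothetical. Note the assumption: the standard-library codec implements the older IDNA 2003 rules, so a production system may prefer the third-party `idna` package, which implements IDNA 2008.

```python
def domain_to_a_label(domain: str) -> str:
    """Convert a possibly non-ASCII domain to its ASCII (Punycode) form.

    Uses the stdlib "idna" codec (IDNA 2003); raises UnicodeError on
    labels that cannot be encoded.
    """
    return domain.encode("idna").decode("ascii")


def domain_to_u_label(domain: str) -> str:
    """Convert a stored A-label domain back to Unicode for display."""
    return domain.encode("ascii").decode("idna")


stored = domain_to_a_label("bücher.example")
print(stored)                     # xn--bcher-kva.example
print(domain_to_u_label(stored))  # bücher.example
```

Validation and storage would operate on the A-label form, with the U-label form used only at the presentation layer.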
It's complicated, but thankfully you don't have to reinvent the wheel:
https://github.com/JoshData/python-email-validator (my project)
The README covers a lot of ground: internationalized domain names, internationalized local parts, SMTPUTF8, Unicode normalization, not performing SMTP checks, not permitting obsolete email syntax, and missing UCS-4 support in Python 2.7.