Japanese/Chinese email addresses?

As per RFC 5322 ("Internet Message Format"), section 3.4.1 ("Addr-Spec Specification"), you can't use non-US-ASCII characters such as those you've listed. However, characters such as...

! # $ % & ' * + - / = ? ^ _ ` { | } ~

...are legal, as well as the full stop/period character, as long as it is not the first or last character of the local part and there is never more than one in a row.
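To make the rule concrete, here is a minimal Python sketch of a check for the unquoted ("dot-atom") form of the local part, that is, the part before the @. The names ATEXT, DOT_ATOM and is_dot_atom_local_part are made up for this illustration, and the sketch deliberately ignores the quoted-string form that RFC 5322 also allows, so treat it as an approximation rather than a full validator.

```python
import re

# Characters RFC 5322 allows in an unquoted local part ("atext"):
# ASCII letters, digits, and the punctuation listed above.
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"

# One or more runs of atext separated by single dots, so a dot can never
# be first, last, or doubled. (Hypothetical helper names, for illustration.)
DOT_ATOM = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def is_dot_atom_local_part(local: str) -> bool:
    """Return True if `local` is a valid unquoted local part."""
    return DOT_ATOM.fullmatch(local) is not None


print(is_dot_atom_local_part("john.doe"))   # single dots between runs are fine
print(is_dot_atom_local_part("john..doe"))  # two dots in a row are rejected
print(is_dot_atom_local_part("山田"))        # non-ASCII is rejected under RFC 5322
```

Note that this only covers the part before the @; the domain has its own rules.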

For more information see the above RFC and indeed the Wikipedia article on email addresses, specifically the "syntax" section.

UPDATE

There's also a newer, albeit experimental, RFC 5336 (now obsoleted by RFC 6531), which extends SMTP so that email addresses, including the now-legitimate internationalized domain names, can contain UTF-8 characters.


You must be very careful when you try to match or validate email addresses with a regex. In some cases you will reject email addresses that are in fact valid. Basically, it's:

Show me one regex and I'll show you one valid email address that it doesn't match.

For that reason, when I check email addresses I use a very simple regex like .+@.+\..+ (user part is anything, host part has at least one dot). Anything more elaborate tends to produce false positives and false negatives.
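As a rough illustration of that loose check, here is a Python sketch. The pattern used, .+@.+\..+, is one way to spell "anything, an @, then a host with at least one dot"; the names LOOSE_EMAIL and looks_like_email are made up for this example. The same pattern should carry over to other regex engines, though that is an assumption I have not tested here.

```python
import re

# Deliberately loose: something, an "@", then a host containing at least
# one dot with something on both sides of it. No character restrictions,
# so non-ASCII addresses pass too. (Hypothetical names, for illustration.)
LOOSE_EMAIL = re.compile(r".+@.+\..+")


def looks_like_email(address: str) -> bool:
    """Return True if `address` passes the loose sanity check."""
    return LOOSE_EMAIL.fullmatch(address) is not None


print(looks_like_email("user@example.com"))  # plain ASCII address
print(looks_like_email("山田@例え.jp"))        # Japanese local part and domain
print(looks_like_email("no-at-sign"))        # fails: no "@" at all
```

This only catches obvious typos. It says nothing about whether the mailbox actually exists.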

It's better not to match email addresses at all (only check trivial stuff like the presence of "@") and to send opt-in emails instead.

Tags: PHP, Unicode, Regex