Regex split email address

Some of the previous answers are wrong, as a valid email address can, in fact, include more than one @ symbol by containing it within dot-delimited, quoted text. See the following example:

$email = 'a."b@c"[email protected]';
echo (filter_var($email, FILTER_VALIDATE_EMAIL) ? 'V' : 'Inv'), 'alid email format.';

Valid email format.


Multiple delimited blocks of text and a multitude of @ symbols can exist. Both of these examples are valid email addresses:

$email = 'a."b@c".d."@"[email protected]';
$email = '/."@@@@@@"./@a.b';

Based on Michael Berkowski's explode answer, this email address would look like this:

$email = 'a."b@c"[email protected]';
$parts = explode('@', $email);
$user = $parts[0];
$domain = '@' . $parts[1];

User: a."b
Domain: @c".d


Anyone using this solution should beware of potential abuse. The split produces plausible-looking parts even when the string as a whole is not a valid address, so accepting an email address based on these outputs and then inserting $email into a database could have negative implications.

$email = 'a."b@c".d@INSERT BAD STUFF HERE';

The results of the functions below are only accurate so long as filter_var is used for validation first.
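
To illustrate the warning, here is a minimal sketch that runs the unvalidated explode split on the string above and then checks it with filter_var. The helper name naive_split is hypothetical and exists only for this demonstration.

```php
<?php
// Hypothetical helper: the explode-based split with no validation at all.
function naive_split($email) {
    $parts = explode('@', $email);
    return array('user' => $parts[0], 'domain' => '@' . $parts[1]);
}

$email = 'a."b@c".d@INSERT BAD STUFF HERE';

// The naive split returns two plausible-looking pieces and silently
// drops everything after the second @.
$split = naive_split($email);
echo 'User: ', $split['user'], "\n";     // User: a."b
echo 'Domain: ', $split['domain'], "\n"; // Domain: @c".d

// filter_var looks at the whole string and rejects it.
$valid = filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
var_dump($valid); // bool(false)
```

The split alone says nothing about the rest of the string, which is why the validation has to come first.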

From the left:

Here is a simple non-regex, non-exploding solution for finding the first @ that is not contained within delimited and quoted text. Nested delimited text is considered invalid based on filter_var, so finding the proper @ is a very simple search.

if(filter_var($email, FILTER_VALIDATE_EMAIL)) {
    $a = '"'; // quote character
    $b = '.'; // dot delimiter
    $c = '@'; // separator being searched for
    $d = strlen($email);
    $contained = false;
    for($i = 0; $i < $d; ++$i) {
        if($contained) {
            if($email[$i] === $a && $email[$i + 1] === $b) {
                $contained = false;
                ++$i;
            }
        }
        elseif($email[$i] === $c)
            break;
        elseif($email[$i] === $b && $email[$i + 1] === $a) {
            $contained = true;
            ++$i;
        }
    }
    $local = substr($email, 0, $i);
    $domain = substr($email, $i);
}

Here is the same code tucked inside a function.

function parse_email($email) {
    if(!filter_var($email, FILTER_VALIDATE_EMAIL)) return false;
    $a = '"';
    $b = '.';
    $c = '@';
    $d = strlen($email);
    $contained = false;
    for($i = 0; $i < $d; ++$i) {
        if($contained) {
            if($email[$i] === $a && $email[$i + 1] === $b) {
                $contained = false;
                ++$i;
            }
        }
        elseif($email[$i] === $c)
            break;
        elseif($email[$i] === $b && $email[$i + 1] === $a) {
            $contained = true;
            ++$i;
        }
    }
    return array('local' => substr($email, 0, $i), 'domain' => substr($email, $i));
}

In use:

$email = 'a."b@c".x."@".d.e@f.g';
$email = parse_email($email);
if($email !== false)
    print_r($email);
else
    echo 'Bad email address.';

Array ( [local] => a."b@c".x."@".d.e [domain] => @f.g )

$email = 'a."b@c".x."@".d.e@f.g@';
$email = parse_email($email);
if($email !== false)
    print_r($email);
else
    echo 'Bad email address.';

Bad email address.


From the right:

After doing some testing of filter_var and researching what is acceptable as a valid domain name (hostnames separated by dots), I created this function for better performance. In a valid email address, the last @ should be the true @, as the @ symbol should never appear in the domain of a valid email address.

if(filter_var($email, FILTER_VALIDATE_EMAIL)) {
    $domain = strrpos($email, '@');
    $local = substr($email, 0, $domain);
    $domain = substr($email, $domain);
}

As a function:

function parse_email($email) {
    if(!filter_var($email, FILTER_VALIDATE_EMAIL)) return false;
    $a = strrpos($email, '@');
    return array('local' => substr($email, 0, $a), 'domain' => substr($email, $a));
}
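
As a rough illustration of why searching from the right matters, the sketch below compares strpos and strrpos on an address with a quoted @ in the local part. The address (in particular the e.f domain) is an assumed example, and validation is skipped here only to keep the comparison short.

```php
<?php
// Assumed example address with an @ inside quoted text.
$email = 'a."b@c".d@e.f';

$first = strpos($email, '@');  // position of the @ inside the quotes
$last  = strrpos($email, '@'); // position of the separating @

var_dump($first); // int(4)
var_dump($last);  // int(9)

// Splitting at the last @ keeps the quoted block inside the local part.
$local  = substr($email, 0, $last);
$domain = substr($email, $last);
echo $local, "\n";  // a."b@c".d
echo $domain, "\n"; // @e.f
```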

Or using explode and implode:

if(filter_var($email, FILTER_VALIDATE_EMAIL)) {
    $local = explode('@', $email);
    $domain = '@' . array_pop($local);
    $local = implode('@', $local);
}

As a function:

function parse_email($email) {
    if(!filter_var($email, FILTER_VALIDATE_EMAIL)) return false;
    $email = explode('@', $email);
    $domain = '@' . array_pop($email);
    return array('local' => implode('@', $email), 'domain' => $domain);
}

If you would still like to use regex, splitting the string starting from the end of a valid email address is the safest option.

/(.*)(@.*)$/

(.*) Greedily matches anything, so the next group starts at the last @ symbol.
(@.*) Matches anything that begins with an @ symbol.
$ End of the string.
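
The greedy behaviour of the first group is what makes this pattern split at the last @. As a small sketch, the following contrasts it with a lazy variant of the same pattern; the lazy pattern and the sample address are assumptions added for comparison only.

```php
<?php
// Assumed example address with an @ inside quoted text.
$email = 'a."b@c".d@e.f';

// Greedy (.*) takes as much as it can, so (@.*) begins at the last @.
preg_match('/(.*)(@.*)$/', $email, $greedy);
echo $greedy[1], "\n"; // a."b@c".d
echo $greedy[2], "\n"; // @e.f

// Lazy (.*?) stops at the first @, which is the wrong split here.
preg_match('/(.*?)(@.*)$/', $email, $lazy);
echo $lazy[1], "\n"; // a."b
echo $lazy[2], "\n"; // @c".d@e.f
```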

if(filter_var($email, FILTER_VALIDATE_EMAIL)) {
    $local = preg_split('/(.*)(@.*)$/', $email, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
    $domain = $local[1];
    $local = $local[0];
}

As a function:

function parse_email($email) {
    if(!filter_var($email, FILTER_VALIDATE_EMAIL)) return false;
    $email = preg_split('/(.*)(@.*)$/', $email, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
    return array('local' => $email[0], 'domain' => $email[1]);
}

Or

if(filter_var($email, FILTER_VALIDATE_EMAIL)) {
    preg_match('/(.*)(@.*)$/', $email, $matches);
    $local = $matches[1];
    $domain = $matches[2];
}

As a function:

function parse_email($email) {
    if(!filter_var($email, FILTER_VALIDATE_EMAIL)) return false;
    preg_match('/(.*)(@.*)$/', $email, $matches);
    return array('local' => $matches[1], 'domain' => $matches[2]);
}
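
As a quick sanity check, the sketch below runs the strrpos, explode/implode and preg_match variants side by side and confirms that they produce the same split. The function names and sample addresses are hypothetical and chosen only for this comparison.

```php
<?php
// Hypothetical names; each mirrors one "from the right" variant.
function split_strrpos($email) {
    $at = strrpos($email, '@');
    return array(substr($email, 0, $at), substr($email, $at));
}

function split_explode($email) {
    $parts = explode('@', $email);
    $domain = '@' . array_pop($parts);
    return array(implode('@', $parts), $domain);
}

function split_regex($email) {
    preg_match('/(.*)(@.*)$/', $email, $m);
    return array($m[1], $m[2]);
}

$samples = array('user@example.com', 'first.last@sub.example.org');
$all_agree = true;
foreach ($samples as $email) {
    // Validate first, as with every function in this answer.
    if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
        continue;
    }
    $expected = split_strrpos($email);
    if (split_explode($email) !== $expected || split_regex($email) !== $expected) {
        $all_agree = false;
    }
}
var_dump($all_agree); // bool(true)
```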

$parts = explode('@', "[email protected]");

$user = $parts[0];
// Stick the @ back onto the domain since it was chopped off.
$domain = "@" . $parts[1];