Cannot get the body of email with Gmail PHP API

Let's do a little experiment. I've sent two messages to myself. One with an attachment, and one without.

Request:

GET https://www.googleapis.com/gmail/v1/users/me/messages?maxResults=2

Response:

{
 "messages": [
  {
   "id": "14fe21fd6b3fb46f",
   "threadId": "14fe21fd6b3fb46f"
  },
  {
   "id": "14fe21f9341ed73c",
   "threadId": "14fe21f9341ed73c"
  }
 ],
 "nextPageToken": "08943597140129624594",
 "resultSizeEstimate": 3
}

I only ask for the payload, since that is where all the relevant parts are:

fields = payload

GET https://www.googleapis.com/gmail/v1/users/me/messages/14fe21fd6b3fb46f?fields=payload

GET https://www.googleapis.com/gmail/v1/users/me/messages/14fe21f9341ed73c?fields=payload
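
The two requests differ only in the message id, so the URL can be assembled from the id and the query parameters. A minimal sketch, assuming Node.js; `messageUrl` is a hypothetical helper written for this example, not part of any Google client library:

```javascript
// Base URL copied from the requests above.
const BASE = 'https://www.googleapis.com/gmail/v1/users/me/messages';

// Hypothetical helper: builds the GET URL for one message.
// `params` holds optional query parameters such as { fields: 'payload' }.
function messageUrl(id, params = {}) {
  const url = `${BASE}/${encodeURIComponent(id)}`;
  const query = new URLSearchParams(params).toString();
  return query ? `${url}?${query}` : url;
}

console.log(messageUrl('14fe21fd6b3fb46f', { fields: 'payload' }));
console.log(messageUrl('14fe21f9341ed73c', { fields: 'payload' }));
```

The request itself still needs an OAuth access token, which is left out here.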

Mail without attachment:

{
 "payload": {
  "parts": [
   {
    "partId": "0",
    "mimeType": "text/plain",
    "filename": "",
    "headers": [
     {
      "name": "Content-Type",
      "value": "text/plain; charset=UTF-8"
     }
    ],
    "body": {
     "size": 22,
     "data": "aGVjaz8gTm8gYXR0YWNobWVudD8NCg=="
    }
   },
   {
    "partId": "1",
    "mimeType": "text/html",
    "filename": "",
    "headers": [
     {
      "name": "Content-Type",
      "value": "text/html; charset=UTF-8"
     }
    ],
    "body": {
     "size": 43,
     "data": "PGRpdiBkaXI9Imx0ciI-aGVjaz8gTm8gYXR0YWNobWVudD88L2Rpdj4NCg=="
    }
   }
  ]
 }
}

Mail with attachment:

{
 "payload": {
  "parts": [
   {
    "mimeType": "multipart/alternative",
    "filename": "",
    "headers": [
     {
      "name": "Content-Type",
      "value": "multipart/alternative; boundary=001a1142e23c551e8e05200b4be0"
     }
    ],
    "body": {
     "size": 0
    },
    "parts": [
     {
      "partId": "0.0",
      "mimeType": "text/plain",
      "filename": "",
      "headers": [
       {
        "name": "Content-Type",
        "value": "text/plain; charset=UTF-8"
       }
      ],
      "body": {
       "size": 9,
       "data": "V293IG1hbg0K"
      }
     },
     {
      "partId": "0.1",
      "mimeType": "text/html",
      "filename": "",
      "headers": [
       {
        "name": "Content-Type",
        "value": "text/html; charset=UTF-8"
       }
      ],
      "body": {
       "size": 30,
       "data": "PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K"
      }
     }
    ]
   },
   {
    "partId": "1",
    "mimeType": "image/jpeg",
    "filename": "feelthebern.jpg",
    "headers": [
     {
      "name": "Content-Type",
      "value": "image/jpeg; name=\"feelthebern.jpg\""
     },
     {
      "name": "Content-Disposition",
      "value": "attachment; filename=\"feelthebern.jpg\""
     },
     {
      "name": "Content-Transfer-Encoding",
      "value": "base64"
     },
     {
      "name": "X-Attachment-Id",
      "value": "f_ieq3ev0i0"
     }
    ],
    "body": {
     "attachmentId": "ANGjdJ_2xG3WOiLh6MbUdYy4vo2VhV2kOso5AyuJW3333rbmk8BIE1GJHIOXkNIVGiphP3fGe7iuIl_MGzXBGNGvNslwlz8hOkvJZg2DaasVZsdVFT_5JGvJOLefgaSL4hqKJgtzOZG9K1XSMrRQAtz2V0NX7puPdXDU4gvalSuMRGwBhr_oDSfx2xljHEbGG6I4VLeLZfrzGGKW7BF-GO_FUxzJR8SizRYqIhgZNA6PfRGyOhf1s7bAPNW3M9KqWRgaK07WTOYl7DzW4hpNBPA4jrl7tgsssExHpfviFL7yL52lxsmbsiLe81Z5UoM",
     "size": 100446
    }
   }
  ]
 }
}

These responses correspond to the $parts in your code. As you can see, if you are lucky, $parts[0]['body']->data will give you what you want, but most of the time it will not.
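
To make that concrete, here is a small sketch that runs the same first-part lookup against trimmed copies of the two payloads above (headers and the attachment id are left out for brevity). `firstPartData` is a hypothetical name that mirrors $parts[0]['body']->data:

```javascript
// Trimmed copy of the "mail without attachment" payload.
const withoutAttachment = {
  parts: [
    { partId: '0', mimeType: 'text/plain', body: { size: 22, data: 'aGVjaz8gTm8gYXR0YWNobWVudD8NCg==' } },
    { partId: '1', mimeType: 'text/html', body: { size: 43, data: 'PGRpdiBkaXI9Imx0ciI-aGVjaz8gTm8gYXR0YWNobWVudD88L2Rpdj4NCg==' } }
  ]
};

// Trimmed copy of the "mail with attachment" payload.
const withAttachment = {
  parts: [
    {
      mimeType: 'multipart/alternative',
      body: { size: 0 },
      parts: [
        { partId: '0.0', mimeType: 'text/plain', body: { size: 9, data: 'V293IG1hbg0K' } },
        { partId: '0.1', mimeType: 'text/html', body: { size: 30, data: 'PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K' } }
      ]
    },
    { partId: '1', mimeType: 'image/jpeg', filename: 'feelthebern.jpg', body: { size: 100446 } }
  ]
};

// Same lookup as $parts[0]['body']->data in the PHP code.
function firstPartData(payload) {
  return payload.parts[0].body.data;
}

console.log(firstPartData(withoutAttachment)); // aGVjaz8gTm8gYXR0YWNobWVudD8NCg==
console.log(firstPartData(withAttachment));    // undefined, the first part is only a container
console.log(withAttachment.parts[0].parts[0].body.data); // V293IG1hbg0K, one level deeper
```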

There are generally two approaches to this problem. You could implement the following algorithm (you are much better at PHP than me, but this is the general outline of it):

  1. Traverse the payload.parts and check if it contains a part that has the body you are looking for (either text/plain or text/html). If it does, you are done searching. If you were parsing a mail like the one above with no attachment, this would be enough.
  2. Do step 1 again, but this time with the parts found inside the parts you just checked, recursively. You will eventually find your part. If you were parsing a mail like the one above with an attachment, this would eventually find you your body.

The algorithm could look something like the following (example in JavaScript):

var response = {
 "payload": {
  "parts": [
   {
    "mimeType": "multipart/alternative",
    "filename": "",
    "headers": [
     {
      "name": "Content-Type",
      "value": "multipart/alternative; boundary=001a1142e23c551e8e05200b4be0"
     }
    ],
    "body": {
     "size": 0
    },
    "parts": [
     {
      "partId": "0.0",
      "mimeType": "text/plain",
      "filename": "",
      "headers": [
       {
        "name": "Content-Type",
        "value": "text/plain; charset=UTF-8"
       }
      ],
      "body": {
       "size": 9,
       "data": "V293IG1hbg0K"
      }
     },
     {
      "partId": "0.1",
      "mimeType": "text/html",
      "filename": "",
      "headers": [
       {
        "name": "Content-Type",
        "value": "text/html; charset=UTF-8"
       }
      ],
      "body": {
       "size": 30,
       "data": "PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K"
      }
     }
    ]
   },
   {
    "partId": "1",
    "mimeType": "image/jpeg",
    "filename": "feelthebern.jpg",
    "headers": [
     {
      "name": "Content-Type",
      "value": "image/jpeg; name=\"feelthebern.jpg\""
     },
     {
      "name": "Content-Disposition",
      "value": "attachment; filename=\"feelthebern.jpg\""
     },
     {
      "name": "Content-Transfer-Encoding",
      "value": "base64"
     },
     {
      "name": "X-Attachment-Id",
      "value": "f_ieq3ev0i0"
     }
    ],
    "body": {
     "attachmentId": "ANGjdJ_2xG3WOiLh6MbUdYy4vo2VhV2kOso5AyuJW3333rbmk8BIE1GJHIOXkNIVGiphP3fGe7iuIl_MGzXBGNGvNslwlz8hOkvJZg2DaasVZsdVFT_5JGvJOLefgaSL4hqKJgtzOZG9K1XSMrRQAtz2V0NX7puPdXDU4gvalSuMRGwBhr_oDSfx2xljHEbGG6I4VLeLZfrzGGKW7BF-GO_FUxzJR8SizRYqIhgZNA6PfRGyOhf1s7bAPNW3M9KqWRgaK07WTOYl7DzW4hpNBPA4jrl7tgsssExHpfviFL7yL52lxsmbsiLe81Z5UoM",
     "size": 100446
    }
   }
  ]
 }
};

// In e.g. a plain text message, the payload is the only part.
var parts = [response.payload];

while (parts.length) {
  var part = parts.shift();
  if (part.parts) {
    parts = parts.concat(part.parts);
  }

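  // Guard against parts without inline data (e.g. attachments).
  if (part.mimeType === 'text/html' && part.body && part.body.data) {
    // body.data is URL-safe base64: restore '+' and '/' before atob(),
    // then turn the resulting byte string into UTF-8 text.
    var decodedPart = decodeURIComponent(escape(atob(part.body.data.replace(/-/g, '+').replace(/_/g, '/'))));
    console.log(decodedPart);
  }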
}
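
The same traversal can also be written recursively, with a fallback to text/plain when there is no text/html part. This is only a sketch, assuming Node.js (it uses Buffer instead of atob); `decodeBase64Url`, `findPart` and `extractBody` are names made up for this example:

```javascript
// Decode Gmail's URL-safe base64 into a UTF-8 string.
function decodeBase64Url(data) {
  const standard = data.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(standard, 'base64').toString('utf8');
}

// Depth-first search for the first part of the given MIME type that carries data.
// The payload itself is treated as a part, so single-part messages work too.
function findPart(part, mimeType) {
  if (part.mimeType === mimeType && part.body && part.body.data) {
    return part;
  }
  for (const child of part.parts || []) {
    const found = findPart(child, mimeType);
    if (found) {
      return found;
    }
  }
  return null;
}

// Prefer HTML, fall back to plain text, return null when neither exists.
function extractBody(payload) {
  const part = findPart(payload, 'text/html') || findPart(payload, 'text/plain');
  return part ? decodeBase64Url(part.body.data) : null;
}

// Trimmed version of the "mail with attachment" payload from above.
const payload = {
  parts: [
    {
      mimeType: 'multipart/alternative',
      body: { size: 0 },
      parts: [
        { mimeType: 'text/plain', body: { size: 9, data: 'V293IG1hbg0K' } },
        { mimeType: 'text/html', body: { size: 30, data: 'PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K' } }
      ]
    },
    { mimeType: 'image/jpeg', filename: 'feelthebern.jpg', body: { size: 100446 } }
  ]
};

console.log(extractBody(payload)); // <div dir="ltr">Wow man</div>
```

Recursion keeps the depth of the MIME tree out of the calling code, so a message nested three or four levels deep needs no extra loops.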

The far easier option is to just get the raw data of the mail and let an already written library do the work for you:

Request:

format = raw
fields = raw

GET https://www.googleapis.com/gmail/v1/users/me/messages/14fe21fd6b3fb46f?format=raw&fields=raw

Response:

{
 "raw": "TUlNRS1WZXJzaW9uOiAxLjANClJlY2VpdmVkOiBieSAxMC4yOC45OS4xOTYgd2l0aCBIVFRQOyBGcmksIDE4IFNlcCAyMDE1IDEzOjIzOjAxIC0wNzAwIChQRFQpDQpEYXRlOiBGcmksIDE4IFNlcCAyMDE1IDIyOjIzOjAxICswMjAwDQpEZWxpdmVyZWQtVG86IGVtdGhvbGluQGdtYWlsLmNvbQ0KTWVzc2FnZS1JRDogPENBRHNaTFJ5eGk2UGt0MHZnUS1iZHd1N2FNLWNHRmZKcEwrRHYyb3ZKOGp4SGN4VWhfQUBtYWlsLmdtYWlsLmNvbT4NClN1YmplY3Q6IFdoYXQgZGENCkZyb206IEVtaWwgVGhvbGluIDxlbXRob2xpbkBnbWFpbC5jb20-DQpUbzogRW1pbCBUaG9saW4gPGVtdGhvbGluQGdtYWlsLmNvbT4NCkNvbnRlbnQtVHlwZTogbXVsdGlwYXJ0L2FsdGVybmF0aXZlOyBib3VuZGFyeT0wMDFhMTE0NjhmMTY1YzUwNDUwNTIwMGI0YzYxDQoNCi0tMDAxYTExNDY4ZjE2NWM1MDQ1MDUyMDBiNGM2MQ0KQ29udGVudC1UeXBlOiB0ZXh0L3BsYWluOyBjaGFyc2V0PVVURi04DQoNCmhlY2s_IE5vIGF0dGFjaG1lbnQ_DQoNCi0tMDAxYTExNDY4ZjE2NWM1MDQ1MDUyMDBiNGM2MQ0KQ29udGVudC1UeXBlOiB0ZXh0L2h0bWw7IGNoYXJzZXQ9VVRGLTgNCg0KPGRpdiBkaXI9Imx0ciI-aGVjaz8gTm8gYXR0YWNobWVudD88L2Rpdj4NCg0KLS0wMDFhMTE0NjhmMTY1YzUwNDUwNTIwMGI0YzYxLS0="
}

The biggest drawback of the second method is that if you get the message raw, you will download all the attachment data right away, which might be far too much data for your use case.
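
Whichever library ends up parsing the mail, the raw field first has to be base64url-decoded into the original message text. A rough sketch of that step, assuming Node.js; the sample message is a small stand-in built for this example rather than the response above, and `decodeRaw` and `splitMessage` are hypothetical helpers:

```javascript
// A tiny stand-in message, encoded the way the API returns "raw" (URL-safe base64).
const sample = 'Subject: What da\r\n' +
  'Content-Type: text/plain; charset=UTF-8\r\n' +
  '\r\n' +
  'heck? No attachment?\r\n';
const raw = Buffer.from(sample, 'utf8')
  .toString('base64')
  .replace(/\+/g, '-')
  .replace(/\//g, '_');

// Turn the "raw" field back into the message text.
function decodeRaw(rawField) {
  const standard = rawField.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(standard, 'base64').toString('utf8');
}

// The header block ends at the first empty line; everything after it is the body.
function splitMessage(text) {
  const index = text.indexOf('\r\n\r\n');
  if (index === -1) {
    return { head: text, body: '' };
  }
  return { head: text.slice(0, index), body: text.slice(index + 4) };
}

const message = splitMessage(decodeRaw(raw));
console.log(message.head.split('\r\n')); // the two header lines
console.log(message.body);
```

This is not a MIME parser: for a multipart message the body still contains the boundaries and every encoded part, which is exactly the work worth leaving to a library.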

I'm not good at PHP, but this looks promising if you want to go with the second solution! Good luck!


UPDATE: You might want to check my second answer below this one for more complete code.

Finally, I got it working today, so here's the complete code for finding the body - thanks to @Tholle:

// Authentication things above
/*
 * Decode the body.
 * @param : encoded body  - or null
 * @return : the body if found, else FALSE;
 */
function decodeBody($body) {
    $rawData = $body;
    $sanitizedData = strtr($rawData,'-_', '+/');
    $decodedMessage = base64_decode($sanitizedData);
    if(!$decodedMessage){
        $decodedMessage = FALSE;
    }
    return $decodedMessage;
}

$client = getClient();
$gmail = new Google_Service_Gmail($client);

$list = $gmail->users_messages->listUsersMessages('me', ['maxResults' => 1000]);

try{
    while ($list->getMessages() != null) {

        foreach ($list->getMessages() as $mlist) {

            $message_id = $mlist->id;
            $optParamsGet2['format'] = 'full';
            $single_message = $gmail->users_messages->get('me', $message_id, $optParamsGet2);
            $payload = $single_message->getPayload();

            // With no attachment, the payload might be directly in the body, encoded.
            $body = $payload->getBody();
            $FOUND_BODY = decodeBody($body['data']);

            // If we didn't find a body, let's look for the parts
            if(!$FOUND_BODY) {
                $parts = $payload->getParts();
                foreach ($parts  as $part) {
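                    // Every part has a body object, so test for actual data;
                    // otherwise a multipart container would end the search too early.
                    if($part['body'] && $part['body']->data) {
                        $FOUND_BODY = decodeBody($part['body']->data);
                        break;
                    }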
                    // Last try: if we didn't find the body in the first parts, 
                    // let's loop into the parts of the parts (as @Tholle suggested).
                    if($part['parts'] && !$FOUND_BODY) {
                        foreach ($part['parts'] as $p) {
                            // replace 'text/html' by 'text/plain' if you prefer
                            if($p['mimeType'] === 'text/html' && $p['body']) {
                                $FOUND_BODY = decodeBody($p['body']->data);
                                break;
                            }
                        }
                    }
                    if($FOUND_BODY) {
                        break;
                    }
                }
            }
            // Finally, print the message ID and the body
            print_r($message_id . " : " . $FOUND_BODY);
        }

        if ($list->getNextPageToken() != null) {
            $pageToken = $list->getNextPageToken();
            $list = $gmail->users_messages->listUsersMessages('me', ['pageToken' => $pageToken, 'maxResults' => 1000]);
        } else {
            break;
        }
    }
} catch (Exception $e) {
    echo $e->getMessage();
}

As you can see, my problem was that the body is sometimes not in payload->parts but directly in payload->body! (I also added the loop for nested parts.)
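
The two cases can be told apart by checking whether payload.body actually carries data. A small sketch of that check on two hand-made payloads (both are assumptions for illustration, shaped like the API responses earlier in the thread); `bodyLocation` is a hypothetical name:

```javascript
// Hypothetical single-part message: no "parts" at all, the text is in payload.body.
const singlePart = {
  mimeType: 'text/plain',
  body: { size: 9, data: 'V293IG1hbg0K' }
};

// Hypothetical multipart message: payload.body is empty, the text is in payload.parts.
const multiPart = {
  mimeType: 'multipart/alternative',
  body: { size: 0 },
  parts: [
    { mimeType: 'text/plain', body: { size: 9, data: 'V293IG1hbg0K' } }
  ]
};

// Report where the message text has to be looked up.
function bodyLocation(payload) {
  if (payload.body && payload.body.data) {
    return 'payload.body';
  }
  if (Array.isArray(payload.parts) && payload.parts.length > 0) {
    return 'payload.parts';
  }
  return 'none';
}

console.log(bodyLocation(singlePart)); // payload.body
console.log(bodyLocation(multiPart));  // payload.parts
```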

Hope this helps somebody else.


For those who are interested, I greatly improved my last answer: it now works with text/html (falling back to text/plain if necessary) and turns the linked images into base64 data that loads automatically when the body is printed as full HTML!

The code is far from perfect and is way too long to explain in detail, but it works for me.

Feel free to take it and adapt it (maybe correct/improve it if necessary).

// Authentication things above
/*
 * Decode the body.
 * @param : encoded body  - or null
 * @return : the body if found, else FALSE;
 */
function decodeBody($body) {
    $rawData = $body;
    $sanitizedData = strtr($rawData,'-_', '+/');
    $decodedMessage = base64_decode($sanitizedData);
    if(!$decodedMessage){
        $decodedMessage = FALSE;
    }
    return $decodedMessage;
}

$client = getClient();
$gmail = new Google_Service_Gmail($client);

$list = $gmail->users_messages->listUsersMessages('me', ['maxResults' => 1000]);

try{
    while ($list->getMessages() != null) {

        foreach ($list->getMessages() as $mlist) {

            $message_id = $mlist->id;
            $optParamsGet2['format'] = 'full';
            $single_message = $gmail->users_messages->get('me', $message_id, $optParamsGet2);
            $payload = $single_message->getPayload();
            $parts = $payload->getParts();
            // With no attachment, the payload might be directly in the body, encoded.
            $body = $payload->getBody();
            $FOUND_BODY = FALSE;
            // If we didn't find a body, let's look for the parts
            if(!$FOUND_BODY) {
                foreach ($parts  as $part) {
                    if($part['parts'] && !$FOUND_BODY) {
                        foreach ($part['parts'] as $p) {
                            if($p['parts'] && count($p['parts']) > 0){
                                foreach ($p['parts'] as $y) {
                                    if(($y['mimeType'] === 'text/html') && $y['body']) {
                                        $FOUND_BODY = decodeBody($y['body']->data);
                                        break;
                                    }
                                }
                            } else if(($p['mimeType'] === 'text/html') && $p['body']) {
                                $FOUND_BODY = decodeBody($p['body']->data);
                                break;
                            }
                        }
                    }
                    if($FOUND_BODY) {
                        break;
                    }
                }
            }
            // let's save all the images linked to the mail's body:
            if($FOUND_BODY && count($parts) > 1){
                $images_linked = array();
                foreach ($parts  as $part) {
                    if($part['filename']){
                        array_push($images_linked, $part);
                    } else{
                        if($part['parts']) {
                            foreach ($part['parts'] as $p) {
                                if($p['parts'] && count($p['parts']) > 0){
                                    foreach ($p['parts'] as $y) {
                                        if(($y['mimeType'] !== 'text/html') && $y['body']) {
                                            array_push($images_linked, $y);
                                        }
                                    }
                                } else if(($p['mimeType'] !== 'text/html') && $p['body']) {
                                    array_push($images_linked, $p);
                                }
                            }
                        }
                    }
                }
                // special case for the wdcid...
                preg_match_all('/wdcid(.*)"/Uims', $FOUND_BODY, $wdmatches);
                if(count($wdmatches)) {
                    $z = 0;
                    foreach($wdmatches[0] as $match) {
                        $z++;
                        if($z > 9){
                            $FOUND_BODY = str_replace($match, 'image0' . $z . '@', $FOUND_BODY);
                        } else {
                            $FOUND_BODY = str_replace($match, 'image00' . $z . '@', $FOUND_BODY);
                        }
                    }
                }
                preg_match_all('/src="cid:(.*)"/Uims', $FOUND_BODY, $matches);
                if(count($matches)) {
                    $search = array();
                    $replace = array();
                    // let's transform the CIDs into base64 attachments
                    foreach($matches[1] as $match) {
                        foreach($images_linked as $img_linked) {
                            foreach($img_linked['headers'] as $img_lnk) {
                                if( $img_lnk['name'] === 'Content-ID' || $img_lnk['name'] === 'Content-Id' || $img_lnk['name'] === 'X-Attachment-Id'){
                                    if ($match === str_replace('>', '', str_replace('<', '', $img_lnk->value)) 
                                            || explode("@", $match)[0] === explode(".", $img_linked->filename)[0]
                                            || explode("@", $match)[0] === $img_linked->filename){
                                        $search = "src=\"cid:$match\"";
                                        $mimetype = $img_linked->mimeType;
                                        $attachment = $gmail->users_messages_attachments->get('me', $mlist->id, $img_linked['body']->attachmentId);
                                        $data64 = strtr($attachment->getData(), array('-' => '+', '_' => '/'));
                                        $replace = "src=\"data:" . $mimetype . ";base64," . $data64 . "\"";
                                        $FOUND_BODY = str_replace($search, $replace, $FOUND_BODY);
                                    }
                                }
                            }
                        }
                    }
                }
            }
            // If we didn't find the body in the last parts, 
            // let's loop for the first parts (text-html only)
            if(!$FOUND_BODY) {
                foreach ($parts  as $part) {
                    if($part['body'] && $part['mimeType'] === 'text/html') {
                        $FOUND_BODY = decodeBody($part['body']->data);
                        break;
                    }
                }
            }
            // With no attachment, the payload might be directly in the body, encoded.
            if(!$FOUND_BODY) {
                $FOUND_BODY = decodeBody($body['data']);
            }
            // Last try: if we didn't find the body in the last parts, 
            // let's loop for the first parts (text-plain only)
            if(!$FOUND_BODY) {
                foreach ($parts  as $part) {
                    if($part['body']) {
                        $FOUND_BODY = decodeBody($part['body']->data);
                        break;
                    }
                }
            }
            if(!$FOUND_BODY) {
                $FOUND_BODY = '(No message)';
            }
            // Finally, print the message ID and the body
            print_r($message_id . ": " . $FOUND_BODY);
        }

        if ($list->getNextPageToken() != null) {
            $pageToken = $list->getNextPageToken();
            $list = $gmail->users_messages->listUsersMessages('me', ['pageToken' => $pageToken, 'maxResults' => 1000]);
        } else {
            break;
        }
    }
} catch (Exception $e) {
    echo $e->getMessage();
}
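
The CID replacement is the least obvious step in the code above, so here it is in isolation as a short sketch. It assumes the inline images have already been collected into a Map keyed by Content-ID (without the angle brackets); that Map, the sample HTML and the names `inlineCidImages` and `toStandardBase64` are all made up for this example:

```javascript
// Gmail returns attachment data as URL-safe base64; data: URIs need the standard alphabet.
function toStandardBase64(data) {
  return data.replace(/-/g, '+').replace(/_/g, '/');
}

// Replace every src="cid:..." reference with a data: URI.
// `images` is a Map from Content-ID to { mimeType, data } (data in standard base64).
function inlineCidImages(html, images) {
  return html.replace(/src="cid:([^"]+)"/gi, (match, cid) => {
    const image = images.get(cid);
    if (!image) {
      return match; // unknown reference: leave it untouched
    }
    return `src="data:${image.mimeType};base64,${image.data}"`;
  });
}

// Made-up sample data for illustration.
const images = new Map([
  ['ii_abc123', { mimeType: 'image/png', data: toStandardBase64('iVBORw0KGgo-_w') }]
]);
const html = '<img src="cid:ii_abc123"><img src="cid:missing">';

console.log(inlineCidImages(html, images));
// <img src="data:image/png;base64,iVBORw0KGgo+/w"><img src="cid:missing">
```

A replacer function is used instead of a replacement string so that characters in the base64 data are never interpreted as special replacement patterns.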

Cheers.
