Platform Event Apex trigger limits exception - where does it go?

A very unsatisfying answer from Support:

As discussed in the call, the Update that i got from my Product Team is that if a Transaction is executed under the Process Automated User and has failed out due to an Apex Time out limit then there is not gona [sic] be an email sent out as the Automated User doesn't have the level of access to send an email.

And a follow-up from Support:

As discussed in the call, the Automated Process User has many process that are runned [sic] under him and if there is a failure because of that then the organisation is gonna [sic] get many emails if there is anything failing out.

You can't run a production system in which a transaction fails (while running as the Automated Process User) due to an uncatchable limits exception and nobody gets notified.

Possibly related (same root cause?):

  • Automated Process User can't send VF emails with embedded components that have VF controllers (also this one)
  • Not all merge fields work correctly in VF email templates

UPDATE - Spring '19 may have a partial solution to this (props to @danielballinger for pointing this out on Twitter):

The new Apex Unexpected Exception event type in the EventLogFile object captures information about unexpected exceptions in Apex code execution. The standard way to obtain exception information is from generated email. However, you now have the option of analyzing the EventLogFile object for Apex exception information, including stack traces.

This is most likely an extra-cost item, as EventLogFile is part of Shield.

https://releasenotes.docs.salesforce.com/en-us/spring19/release-notes/rn_security_em_eventlogfile_apex_unexpected_exception.htm
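
If EventLogFile is available in your org, a quick way to check whether these log files are being generated is a query from Anonymous Apex. This is a minimal sketch, assuming the event type's API name is ApexUnexpectedException and that the running user has permission to view event log files. The exception details themselves are in the CSV content of the LogFile field, which this sketch does not parse.

```apex
// Sketch: list the most recent Apex Unexpected Exception log files.
// Assumes event monitoring is enabled and this event type is licensed in the org.
List<EventLogFile> logFiles = [
    SELECT Id, LogDate, LogFileLength
    FROM EventLogFile
    WHERE EventType = 'ApexUnexpectedException'
    ORDER BY LogDate DESC
    LIMIT 5
];

for (EventLogFile f : logFiles) {
    // Only metadata is shown here; the stack traces are inside the LogFile CSV.
    System.debug(f.LogDate + ' : ' + f.LogFileLength + ' bytes');
}
```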


My plan to work around this limitation would be roughly as below.

Configuration

  1. Use a logging object to track each Platform Event you process in your trigger.
    • Make sure this logging object has no required fields and no validation rules, so the insert itself cannot fail.
  2. Add a Text (18) field to track Job_Id__c.
    • You can later use this field to query AsyncApexJob.
  3. Add a Text field to track the Job_Status__c.
  4. Add a TextArea field to track the Job_Error_Message__c.

Code Changes

  1. Move your core logic to a Queueable.
  2. From your subscriber trigger, fire this async job so any steps which can fail will take place in a separate transaction.
  3. From your subscriber trigger, insert a record into your log object.
  4. On your log record, set the Job_Id__c field.
  5. Set up a scheduled batch which iterates over any log records whose Job_Status__c is null or one of the non-terminal AsyncApexJob statuses ('Holding', 'Queued', 'Preparing', 'Processing').
    • Match each Job_Id__c up to the corresponding AsyncApexJob.
    • Map Status to Job_Status__c.
    • Map ExtendedStatus to Job_Error_Message__c.
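
The code changes above could be sketched roughly as follows. This is not a tested implementation. My_Event__e, Event_Log__c, MyEventSubscriber, EventWorker and EventLogStatusBatch are hypothetical names; only the three Job_* fields come from the configuration steps. The sketch enqueues one Queueable per trigger batch rather than one per event, to stay within the per-transaction limit on enqueued jobs, and it assumes the event records can be passed to the Queueable as-is. 'Queued' is included in the status filter because it is a standard AsyncApexJob status for jobs that have not started yet.

```apex
// --- Trigger (separate file) ---
trigger MyEventSubscriber on My_Event__e (after insert) {
    // Run the risky work in its own transaction.
    Id jobId = System.enqueueJob(new EventWorker(Trigger.new));

    // One log row per event, stamped with the job that will process it.
    List<Event_Log__c> logs = new List<Event_Log__c>();
    for (My_Event__e evt : Trigger.new) {
        logs.add(new Event_Log__c(Job_Id__c = (String) jobId));
    }
    insert logs;
}

// --- Queueable (separate file) ---
public class EventWorker implements Queueable {
    private final List<My_Event__e> events;

    public EventWorker(List<My_Event__e> events) {
        this.events = events;
    }

    public void execute(QueueableContext ctx) {
        // Core logic moved out of the trigger goes here.
    }
}

// --- Scheduled batch (separate file) ---
public class EventLogStatusBatch implements Database.Batchable<SObject>, Schedulable {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        // Pick up log rows whose job has not reached a terminal status yet.
        return Database.getQueryLocator(
            'SELECT Id, Job_Id__c FROM Event_Log__c ' +
            'WHERE Job_Id__c != null AND (Job_Status__c = null OR Job_Status__c IN ' +
            '(\'Holding\', \'Queued\', \'Preparing\', \'Processing\'))');
    }

    public void execute(Database.BatchableContext bc, List<Event_Log__c> logs) {
        Set<Id> jobIds = new Set<Id>();
        for (Event_Log__c log : logs) {
            jobIds.add((Id) log.Job_Id__c);
        }

        Map<Id, AsyncApexJob> jobs = new Map<Id, AsyncApexJob>(
            [SELECT Id, Status, ExtendedStatus FROM AsyncApexJob WHERE Id IN :jobIds]);

        for (Event_Log__c log : logs) {
            AsyncApexJob job = jobs.get((Id) log.Job_Id__c);
            if (job == null) {
                continue; // Job record not found; leave the log row untouched.
            }
            log.Job_Status__c = job.Status;
            log.Job_Error_Message__c = job.ExtendedStatus;
        }
        update logs;
    }

    public void finish(Database.BatchableContext bc) {}

    // Schedulable entry point: start a fresh batch run.
    public void execute(SchedulableContext sc) {
        Database.executeBatch(new EventLogStatusBatch());
    }
}
```

To run the status sync hourly, something like System.schedule('Event log status sync', '0 0 * * * ?', new EventLogStatusBatch()); from Anonymous Apex should work. You can then report or alert on log records where Job_Status__c is 'Failed'.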