Magento 1.9.1 sending multiple emails

This issue is most likely related to the new Magento Email Queue system, which leaves orphan records in the recipients table. If that is your case, here is a fix.

The new Magento Email Queue system manages two tables: core_email_queue and core_email_queue_recipients. The former holds the email Messages, and the latter holds the Recipients of those messages.

The core_email_queue table is cleaned out as emails in the Magento Email Queue are sent. This cleaning is performed by a cron job called core_email_queue_clean_up, which is defined in the app/code/core/Mage/Core/etc/config.xml config file. The code that performs the cleaning is the removeSentMessages function in the Mage_Core_Model_Resource_Email_Queue class:

/**
 * Remove already sent messages
 *
 * @return Mage_Core_Model_Resource_Email_Queue
 */
public function removeSentMessages()
{
    $this->_getWriteAdapter()->delete($this->getMainTable(), 'processed_at IS NOT NULL');
    return $this;
}

The above code is executed once a day by the cron task.

However, the core_email_queue_recipients table (the one that holds email Recipients, and that is linked to the core_email_queue table by the message_id field) is not cleaned together with the core_email_queue table (the one that holds email Messages). This leaves orphan records in the Recipients table whenever the Messages table is cleaned out.

The issue described here arises when the core_email_queue table (Messages) is reset and the auto-increment message_id field on this table starts again from 1.

Because the core_email_queue_recipients table (Recipients) has not been cleaned accordingly, when new emails are added to the Magento Email Queue, new records are created on the core_email_queue table (with message_id starting again from 1), and at the same time new records are created on the core_email_queue_recipients table with these same ids (starting again from 1).

The problem is that these ids may already exist in the Recipients table as orphan records (left over from previous email messages). The new message ids then appear more than once in the core_email_queue_recipients table. In the end, each email Message is linked to its own Recipients by the message_id, but it is also wrongly linked to earlier recipients that were assigned the same message_id by previous emails.

Thus, when the recipients of a given message are looked up, the query returns the correct recipient plus any stale recipients that share the same message_id, and the email is sent to all of them.
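To make the collision concrete, here is a minimal sketch that uses plain PHP arrays in place of the two tables. The email addresses and the findRecipients helper are made up for illustration; this is not Magento code.

```php
<?php
// Hypothetical in-memory stand-in for core_email_queue_recipients.
// The first row is an orphan left behind by an earlier message with id 1.
$recipients = array(
    array('message_id' => 1, 'email' => 'old-customer@example.com'),
);

// The Messages table was emptied and its counter restarted, so the next
// message is assigned message_id 1 again and gets its own recipient row.
$newMessageId = 1;
$recipients[] = array('message_id' => $newMessageId, 'email' => 'new-customer@example.com');

// Hypothetical lookup: select recipients by message_id only.
function findRecipients(array $rows, $messageId)
{
    $emails = array();
    foreach ($rows as $row) {
        if ($row['message_id'] === $messageId) {
            $emails[] = $row['email'];
        }
    }
    return $emails;
}

// Both the stale and the new recipient match the new message.
$found = findRecipients($recipients, $newMessageId);
print_r($found);
```

The lookup has no way to tell the stale row from the new one, because message_id is the only link between the two tables.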

Fortunately the fix for this issue is easy to perform.

All that's needed is to remove the orphaned and repeated message ids from the core_email_queue_recipients table, and to make sure that whenever a Message is deleted from the core_email_queue table, its corresponding Recipients are deleted from the core_email_queue_recipients table at the same time.

The best way to achieve this is to create a foreign key that links these records and deletes them on cascade (but you need to do some cleaning before you can create it).

This is the procedure to fix the issue:

1) Execute the following two SQL queries to clean orphan records and repeated message ids out of the core_email_queue_recipients table:

DELETE FROM core_email_queue_recipients WHERE message_id NOT IN (SELECT message_id FROM core_email_queue);
DELETE FROM core_email_queue_recipients WHERE recipient_id < (SELECT recipient_id FROM (SELECT recipient_id FROM core_email_queue_recipients ORDER BY message_id ASC, recipient_id DESC LIMIT 1) AS r);

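The first query deletes orphan records, that is, recipient rows whose message_id no longer exists in core_email_queue. The second one takes the lowest message_id still present, finds that message's newest recipient row (the highest recipient_id), and deletes every recipient row with a smaller recipient_id. Those are the old records that are no longer valid.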

2) Create a foreign key on the core_email_queue_recipients table to delete Recipients records on cascade. The SQL query to create this foreign key is:

ALTER TABLE core_email_queue_recipients ADD FOREIGN KEY(message_id) REFERENCES core_email_queue(message_id) ON DELETE CASCADE;

By using this new foreign key, no orphan records will be left on the core_email_queue_recipients table when cleaning the core_email_queue table, and no duplicated messages to wrong recipients will be sent in the future.
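The effect of the cascade can be tried out in isolation. The following is a sketch only: it assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for MySQL, with simplified versions of the two tables. The sample row values are made up.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL, and the table
// definitions are simplified versions of the Magento ones.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('PRAGMA foreign_keys = ON'); // SQLite enforces foreign keys only when asked

$db->exec('CREATE TABLE core_email_queue (
    message_id INTEGER PRIMARY KEY,
    processed_at TEXT
)');
$db->exec('CREATE TABLE core_email_queue_recipients (
    recipient_id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL
        REFERENCES core_email_queue(message_id) ON DELETE CASCADE,
    recipient_email TEXT
)');

// One sent message with one recipient.
$db->exec("INSERT INTO core_email_queue (message_id, processed_at) VALUES (1, '2015-01-01 00:00:00')");
$db->exec("INSERT INTO core_email_queue_recipients (message_id, recipient_email) VALUES (1, 'customer@example.com')");

// Same condition as removeSentMessages(): delete messages already sent.
$db->exec('DELETE FROM core_email_queue WHERE processed_at IS NOT NULL');

// With the cascade in place, the recipient row goes away with its message.
$remaining = (int) $db->query('SELECT COUNT(*) FROM core_email_queue_recipients')->fetchColumn();
echo "Recipient rows left: $remaining\n";
```

Without the ON DELETE CASCADE clause, the recipient row would survive the delete, which is the orphan situation described above.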


I'm not sure if you found the solution in the meantime, but I recently ran into the same problem on a client's server.

The symptoms were indeed the same: the cron job core_email_queue_send_all was sending the same emails multiple times, and each time the same exception you found was added to the exception log.

The cron job was sending the same emails multiple times because the processed_at field did not get saved in the core_email_queue table for the corresponding messages.

I added some logging to the code and looked into how the query for saving the core_email_queue message was built. It turned out to be missing the SET part (which should have contained the columns to be updated):

UPDATE `core_email_queue` SET  WHERE (message_id='3')

In Magento, before building the database queries, the columns used in the query get checked against the column descriptions for the respective table inside the Mage_Core_Model_Resource_Abstract::_prepareDataForTable method by calling:

$fields = $this->_getWriteAdapter()->describeTable($table);

In order not to execute the DESCRIBE query each time, Magento caches the DDL info for the tables. Inside the Varien_Db_Adapter_Pdo_Mysql::describeTable method the cache is first checked:

public function describeTable($tableName, $schemaName = null)
{
    $cacheKey = $this->_getTableName($tableName, $schemaName);
    $ddl = $this->loadDdlCache($cacheKey, self::DDL_DESCRIBE);
    if ($ddl === false) {
        $ddl = array_map(
            array(
                $this,
                'decorateTableInfo'
            ),
            parent::describeTable($tableName, $schemaName)
        );
        /**
        * Remove bug in some MySQL versions, when int-column without default value is described as:
        * having default empty string value
        */
        $affected = array('tinyint', 'smallint', 'mediumint', 'int', 'bigint');
        foreach ($ddl as $key => $columnData) {
            if (($columnData['DEFAULT'] === '') && (array_search($columnData['DATA_TYPE'], $affected) !== FALSE)) {
                $ddl[$key]['DEFAULT'] = null;
            }
        }
        $this->saveDdlCache($cacheKey, self::DDL_DESCRIBE, $ddl);
    }

    return $ddl;
}

I found that the columns returned from the cache for the core_email_queue table were not the expected ones. Instead, they were sometimes: data, lifetime, expire and priority.
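This explains the empty SET clause. The sketch below is a simplified stand-in for the column filtering step, not Magento's actual _prepareDataForTable code; the prepareDataForTable function and the sample values are hypothetical.

```php
<?php
// Simplified stand-in for the column filtering step; not Magento's actual code.
function prepareDataForTable(array $data, array $describedColumns)
{
    // Keep only the keys that the table description lists as columns.
    return array_intersect_key($data, $describedColumns);
}

// Data the model wants to save (made-up values).
$messageData = array('message_id' => 3, 'processed_at' => '2015-01-01 00:00:00');

// What a correct DDL cache entry would list (column details omitted).
$goodDdl = array('message_id' => array(), 'processed_at' => array());
// What the corrupted cache entry listed instead.
$badDdl = array('data' => array(), 'lifetime' => null, 'expire' => 0, 'priority' => 8);

$withGoodDdl = prepareDataForTable($messageData, $goodDdl);
$withBadDdl  = prepareDataForTable($messageData, $badDdl);

var_dump(count($withGoodDdl)); // both fields survive the filter
var_dump(count($withBadDdl));  // nothing survives, so the UPDATE has nothing to SET
```

None of the model's fields match the bogus column names, so every field is dropped and processed_at is never written.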

This pointed to a cache problem and I found that Zend_Cache_Core was saving data into the DDL cache files, sometimes by calling Zend_Cache_Backend_File::save() directly and sometimes by calling Zend_Cache_Backend_TwoLevels::save().

The two levels cache from Zend uses the _prepareData method to build a serialized array to store the data and metadata information:

private function _prepareData($data, $lifetime, $priority)
{
    $lt = $lifetime;
    if ($lt === null) {
        $lt = 9999999999;
    }
    return serialize(array(
        'data' => $data,
        'lifetime' => $lifetime,
        'expire' => time() + $lt,
        'priority' => $priority
    ));
}

In the end, the issue was that the cron (which sent the emails) was called from the command line. Comparing a request from the browser with a command line request, I found that Mage_Core_Model_Cache::getBackendOptions was returning different options. This server was set up to use the APC cache, but when the cron was running, ini_get('apc.enabled') was false.

On this server there were two PHP configuration files: fpm/php.ini, where apc.enabled was 1, and cli/php.ini, where apc.enabled was 0. The Magento instance run from the command line was therefore unable to use the APC cache, so it did not use a two-level cache, and as a result it did not know how to read the data from the cache files correctly.

The fpm Magento instance used APC and the two-level cache, and saved DDL data into the var/cache folder wrapped in an array with the data, lifetime, expire and priority keys. When the cron ran and read a DDL cache file, it took that wrapper array at face value and concluded, for each table, that its columns were data, lifetime, expire and priority.
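The mismatch can be reproduced without Magento or Zend. This is an illustrative sketch: it mimics the wrapper array shown in _prepareData(), and the column details, lifetime and priority values are made up.

```php
<?php
// Illustrative only: mimics the wrapper array built by _prepareData().
$ddl = array(
    'message_id'   => array('DATA_TYPE' => 'int'),
    'processed_at' => array('DATA_TYPE' => 'timestamp'),
);

// Writer with the two-level cache: the payload is wrapped before it is stored.
$payload = serialize($ddl);
$fileContents = serialize(array(
    'data'     => $payload,
    'lifetime' => 7200,
    'expire'   => time() + 7200,
    'priority' => 8,
));

// Reader without the two-level cache: treats the stored value as the payload itself.
$seenByCli = unserialize($fileContents);
print_r(array_keys($seenByCli));

// Reader with the two-level cache: unwraps the 'data' entry first.
$wrapper = unserialize($fileContents);
$seenByFpm = unserialize($wrapper['data']);
print_r(array_keys($seenByFpm));
```

The first reader ends up with the wrapper's keys as "columns", while the second one recovers the real column list.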

Changing apc.enabled to 1 in the cli/php.ini configuration file did the trick and solved the issue.
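To check whether two PHP environments disagree in the same way, a small script like the one below can be run once from the command line and once through the web server, and the results compared. The file name check.php is just a placeholder.

```php
<?php
// Run once with "php check.php" and once through the web server, then compare.
$report = array(
    'sapi'        => php_sapi_name(),        // e.g. "cli" or "fpm-fcgi"
    'ini_file'    => php_ini_loaded_file(),  // path of the php.ini in use, or false
    'apc.enabled' => ini_get('apc.enabled'), // false when this PHP does not know the setting
);
var_dump($report);
```

If the two runs report different ini files or different apc.enabled values, the command line and the web server are likely building different cache backends.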

If you are interested in reading more about how I debugged this issue, you can have a look at the more detailed explanation I wrote as a blog post: http://ayg.ro/magento-cron-twolevel-cache-issue-pdoexception-sqlstate-42s22-and-sqlstate-42000