WordPress - Faster way to wp_insert_post & add_post_meta in bulk

I had similar problems some time ago with a custom CSV import, but I ended up using some custom SQL for the bulk insert. I hadn't seen this answer at the time:

Optimize post insert and delete for bulk operations?

which suggests using wp_defer_term_counting() to enable or disable term counting.

Also if you check out the source for the WordPress importer plugin, you will see these functions just before the bulk import:

wp_defer_term_counting( true );
wp_defer_comment_counting( true );

and then after the bulk insert:

wp_defer_term_counting( false );
wp_defer_comment_counting( false );

So this might be something to try out ;-)

Importing posts as draft instead of publish will also speed things up, as the slow process of finding a unique slug for each one is skipped. One could e.g. publish them later in smaller steps, but note that this kind of approach would need to mark the imported posts somehow, so we don't just publish any drafts later! This would need careful planning and most likely some custom coding.
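
Here's a rough sketch of that approach. It assumes a WordPress context, and the meta key `_my_csv_import` is a hypothetical marker of my own choosing, not anything from core:

```php
<?php
// During the import: insert as draft (skips the unique-slug lookup)
// and mark the post so we can find it again later.
$post_id = wp_insert_post( array(
    'post_title'  => $title,
    'post_status' => 'draft',
) );
add_post_meta( $post_id, '_my_csv_import', 1 );

// Later, publish only the marked drafts, in small batches:
$batch = get_posts( array(
    'post_status'    => 'draft',
    'meta_key'       => '_my_csv_import',
    'posts_per_page' => 100,
    'fields'         => 'ids',
) );
foreach ( $batch as $id ) {
    // The unique slug is generated at this point instead.
    wp_update_post( array( 'ID' => $id, 'post_status' => 'publish' ) );
    delete_post_meta( $id, '_my_csv_import' );
}
```

Running the publishing loop repeatedly (e.g. via WP-CLI or a cron event) spreads the slug-generation cost out over time.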

If there are e.g. a lot of similar post titles (same post_name) to be imported, then wp_unique_post_slug() can become slow, due to the query loop it runs to find an available slug. This can generate a huge number of db queries.

Since WordPress 5.1 the pre_wp_unique_post_slug filter is available to avoid the loop iteration for the slug. See core ticket #21112. Here's an example:

add_filter( 'pre_wp_unique_post_slug', 
    function( $override_slug, $slug, $post_id, $post_status, $post_type, $post_parent ) {
        // Set a unique slug value to short-circuit the slug iteration loop.
        // $override_slug = ...

        return $override_slug;
    }, 10, 6
);

If one tries e.g. $override_slug = _truncate_post_slug( $slug, 200 - ( strlen( $suffix ) + 1 ) ) . "-$suffix" with $suffix as $post_id, we would notice that $post_id is always 0 for new posts, as expected. There are various other ways to generate unique numbers in PHP, like uniqid( '', true ). But use this filter with care to make sure your slugs really are unique. We could e.g. run a group count query on post_name afterwards to be sure.
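
That verification query could look something like this. A sketch only, assuming a WordPress context with the $wpdb global available:

```php
<?php
global $wpdb;

// Find any published post_name that is used more than once.
$dupes = $wpdb->get_results( "
    SELECT post_name, COUNT(*) AS total
    FROM {$wpdb->posts}
    WHERE post_status = 'publish'
    GROUP BY post_name
    HAVING total > 1
" );

if ( ! empty( $dupes ) ) {
    // Collisions found: handle them, e.g. by re-running
    // wp_unique_post_slug() on the affected posts.
}
```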

Another option would be to use WP-CLI to avoid timeouts. See e.g. my answer posted for Creating 20,000 Posts or Pages using a .csv file?

Then we can run our custom PHP import script import.php with the WP-CLI command:

wp eval-file import.php

Also avoid importing a large number of hierarchical post types, as the current wp-admin UI doesn't handle it well. See e.g. Custom post type - posts list - white screen of death

Here's the great tip from @otto:

Before bulk inserts, disable the autocommit mode explicitly:

$wpdb->query( 'SET autocommit = 0;' );

After the bulk inserts, run:

$wpdb->query( 'COMMIT;' );

I also think it would be a good idea to do some housekeeping afterwards, like:

$wpdb->query( 'SET autocommit = 1;' );

I've not tested this on MyISAM but this should work on InnoDB.

As mentioned by @kovshenin, this tip won't work for MyISAM, since that storage engine doesn't support transactions.

I had to add this:

    remove_action( 'do_pings', 'do_all_pings', 10 );

Keep in mind that this skips do_all_pings, which processes pingbacks, enclosures, trackbacks, and other pings (https://developer.wordpress.org/reference/functions/do_all_pings/). My understanding from looking at the code is that pending pingbacks/trackbacks/enclosures will still be processed once you take this remove_action line out again, but I'm not completely sure.

Update: I also added

    define( 'WP_IMPORTING', true );

Beyond that I'm using:

    wp_defer_term_counting( true );
    wp_defer_comment_counting( true );
    $wpdb->query( 'SET autocommit = 0;' );

    /* Inserting 100,000 posts at a time,
       including assigning a taxonomy term and adding meta keys
       (i.e. a `foreach` loop with each iteration containing:
       `wp_insert_post`, `wp_set_object_terms`, `add_post_meta`.) */

    $wpdb->query( 'COMMIT;' );
    wp_defer_term_counting( false );
    wp_defer_comment_counting( false );

You will need to insert the post to get your ID, but the $wpdb->postmeta table is very simple in structure. You could probably use a straight INSERT INTO statement, like this one from the MySQL docs: INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);

In your case...

$ID = 1; // from your wp_insert_post
$values = "($ID,2,3),($ID,5,6),($ID,8,9)"; // build from your 97 columns; I'd use a loop of some kind
$wpdb->query( "INSERT INTO {$wpdb->postmeta} (post_id,meta_key,meta_value) VALUES {$values}" );

That won't deal with any encoding, serialization, escaping, error checking, duplications, or anything else, but I'd expect it to be faster (though I haven't tried).
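
If you do want the escaping and serialization back, one way is to build the multi-row insert through $wpdb->prepare() and send it in chunks. A sketch only, assuming a WordPress context and a $meta array of [ post_id, meta_key, meta_value ] triples of your own making:

```php
<?php
global $wpdb;

// Insert in chunks of 500 rows to keep the query size bounded.
foreach ( array_chunk( $meta, 500 ) as $chunk ) {
    $placeholders = array();
    $values       = array();
    foreach ( $chunk as $row ) {
        $placeholders[] = '(%d,%s,%s)';
        $values[]       = $row[0];
        $values[]       = $row[1];
        // add_post_meta() serializes arrays/objects, so roughly mimic that:
        $values[]       = maybe_serialize( $row[2] );
    }
    $sql = "INSERT INTO {$wpdb->postmeta} (post_id,meta_key,meta_value) VALUES "
         . implode( ',', $placeholders );
    $wpdb->query( $wpdb->prepare( $sql, $values ) );
}
```

This still skips the meta caches and hooks that add_post_meta() would fire, so it's for one-off imports, not general use.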

I would not do this on a production site without thorough testing, and if I only had to do it once or twice, I'd use the core functions and take a long lunch while things import.