WordPress - Remove taxonomy slug from a custom hierarchical taxonomy permalink

UPDATE

Since writing this, WordPress core has added the 'do_parse_request' hook, which allows URL routing to be handled elegantly without the need to extend the WP class. I covered the topic in depth in my 2014 Atlanta WordCamp talk entitled "Hardcore URL Routing"; the slides are available at the link.

ORIGINAL ANSWER

URL design has been important to me for well over a decade; I even wrote a blog about it several years back. And while WordPress in sum is a brilliant bit of software, unfortunately its URL rewrite system is just short of brain dead (IMHO, of course. :) ) Anyway, glad to see people caring about URL design!

The answer I'm going to provide is a plugin I'm calling WP_Extended that is a proof of concept for this proposal on Trac. (Note that the proposal started as one thing and evolved into another, so you have to read the entire thing to see where it was headed.)

Basically the idea is to subclass the WP class, override the parse_request() method, and then assign an instance of the subclass to the global $wp variable. Then within parse_request() you inspect the URL path segment by segment instead of using a list of regular expressions that must match the URL in its entirety.

So to state it explicitly: this technique inserts logic ahead of the default parse_request(), which checks for URL-to-RegEx matches. The new logic first looks for taxonomy term matches. It replaces ONLY parse_request() and leaves the rest of the WordPress URL routing system intact, including and especially the use of the $query_vars variable.

For your use-case this implementation only compares URL path segments with taxonomy terms, since that's all you need. It inspects taxonomy terms respecting parent-child term relationships, and when it finds a match it assigns the URL path (minus leading and trailing slashes) to $wp->query_vars['category_name'], to $wp->query_vars['tag'], or to $wp->query_vars['taxonomy'] and $wp->query_vars['term'], bypassing the parse_request() method of the WP class.

On the other hand if the URL path does not match a term from a taxonomy you've specified it delegates URL routing logic to the WordPress rewrite system by calling the parse_request() method of the WP class.
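The segment-by-segment matching described above can be sketched in plain PHP, without WordPress loaded. This is a simplified illustration, not the plugin's own loop: the function name example_match_term_path() and the array-based term records are assumptions made for the example, and it is stricter than the plugin in that it rejects the whole path as soon as one segment fails to match a child of the previously matched term.

```php
<?php
/**
 * Hypothetical helper: match URL path segments against a flat list of terms,
 * requiring each matched term's parent to be the previously matched term.
 * Returns the matched slugs in order, or an empty array when the path
 * is not a valid parent-to-child chain of terms.
 */
function example_match_term_path(array $segments, array $terms) {
    $parent_id = 0;      // top-level terms have parent 0
    $matched   = array();
    foreach ($segments as $segment) {
        $found = null;
        foreach ($terms as $term) {
            if ($term['slug'] === $segment && $term['parent'] === $parent_id) {
                $found = $term;
                break;
            }
        }
        if ($found === null) {
            return array(); // segment is not a child of the previous term
        }
        $matched[] = $found['slug'];
        $parent_id = $found['term_id']; // the next segment must be a child of this term
    }
    return $matched;
}

// Made-up sample data standing in for the result of get_terms().
$terms = array(
    array('term_id' => 1, 'slug' => 'hardware', 'parent' => 0),
    array('term_id' => 2, 'slug' => 'laptops',  'parent' => 1),
    array('term_id' => 3, 'slug' => 'software', 'parent' => 0),
);

$path     = '/hardware/laptops/';
$segments = explode('/', trim($path, '/'));
echo implode('/', example_match_term_path($segments, $terms)), "\n"; // prints "hardware/laptops"
```

A path such as /software/laptops/ returns an empty array here, because 'laptops' is not a child of 'software'; in the plugin that is the case where routing would fall through to the regular rewrite system.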

To use WP_Extended for your use-case you'll need to call the register_url_route() function from within your theme's functions.php file like so:

add_action('init','init_forum_url_route');
function init_forum_url_route() {
  register_url_route(array('taxonomy'=>'forum'));
}

Here is the source code for the plugin:

<?php
/*
Filename: wp-extended.php
Plugin Name: WP Extended for Taxonomy URL Routes
Author: Mike Schinkel
*/
function register_url_route($args=array()) {
  if (isset($args['taxonomy']))
    WP_Extended::register_taxonomy_url($args['taxonomy']);
}
class WP_Extended extends WP {
  static $taxonomies = array();
  static function on_load() {
    add_action('setup_theme',array(__CLASS__,'setup_theme'));
  }
  static function register_taxonomy_url($taxonomy) {
    self::$taxonomies[$taxonomy] = get_taxonomy($taxonomy);
  }
  static function setup_theme() { // Setup theme is 1st code run after WP is created.
    global $wp;
    $wp = new WP_Extended();  // Replace the global $wp
  }
  function parse_request($extra_query_vars = '') {
    $path = $_SERVER['REQUEST_URI'];
    $domain = str_replace('.','\.',$_SERVER['SERVER_NAME']);
    //$root_path = preg_replace("#^https?://{$domain}(/.*)$#",'$1',WP_SITEURL);
    $root_path = $_SERVER['HTTP_HOST'];

    if (substr($path,0,strlen($root_path))==$root_path)
      $path = substr($path,strlen($root_path));
    list($path) = explode('?',$path);
    $path_segments = explode('/',trim($path,'/'));
    $taxonomy_term = array();
    $parent_id = 0;
    foreach(self::$taxonomies as $taxonomy_slug => $taxonomy) {
      $terms = get_terms($taxonomy_slug);
      foreach($path_segments as $segment_index => $path_segment) {
        foreach($terms as $term_index => $term) {
          if ($term->slug==$path_segments[$segment_index]) {
            if ($term->parent!=$parent_id) { // Make sure we test parents
              $taxonomy_term = array();
            } else {
              $parent_id = $term->term_id; // Capture parent ID for verification
              $taxonomy_term[] = $term->slug; // Collect slug as path segment
              unset($terms[$term_index]); // No need to scan it again
            }
            break;
          }
        }
      }
      if (count($taxonomy_term))
        break;
    }
    if (count($taxonomy_term)) {
      $path = implode('/',$taxonomy_term);
      switch ($taxonomy_slug) {
        case 'category':
          $this->query_vars['category_name'] = $path;
          break;
        case 'post_tag':
          $this->query_vars['tag'] = $path;
          break;
        default:
          $this->query_vars['taxonomy'] = $taxonomy_slug;
          $this->query_vars['term'] = $path;
          break;
      }
    } else {
      parent::parse_request($extra_query_vars); // Delegate to WP class
    }
  }
}
WP_Extended::on_load();

P.S. CAVEAT #1

Although I think this technique works brilliantly for a given site, it should NEVER be used in a plugin distributed on WordPress.org for others to use. If it is at the core of a software package based on WordPress then that might be okay. Otherwise this technique should be limited to improving the URL routing for a specific site.

Why? Because only one plugin can use this technique. If two plugins try to use it they will conflict with each other.

As an aside, this strategy can be expanded to generically handle practically every use-case pattern that could be required, and that's what I intend to implement as soon as I find either the spare time or a client who can sponsor the time it would take to build fully generic implementations.

CAVEAT #2

I wrote this to override parse_request(), which is a very large function, and it is quite possible that I missed a property or two of the global $wp object that I should have set. So if something acts wonky let me know and I'll be happy to research it and revise the answer if need be.

Anyway...


Simple, really.

Step 1: Stop using the rewrite parameter at all. We're going to roll our own rewrites.

'rewrite' => false,

Step 2: Set verbose page rules. This forces normal Pages to have their own rules instead of being a catch-all at the bottom of the rule list.

Step 3: Create some rewrite rules to handle your use cases.

Step 4: Manually force the rewrite rules to flush. Easiest way: go to Settings->Permalinks and click the Save button. I prefer this over a plugin activation method for my own usage, since I can force the rules to flush whenever I change things around.

So, code time:

function test_init() {
    // create a new taxonomy
    register_taxonomy(
        'forum',
        'post',
        array(
            'query_var' => true,
            'public'=>true,
            'label'=>'Forum',
            'rewrite' => false,
        )
    );

    // force verbose rules. This makes every Page have its own rule instead of being a
    // catch-all, which we're going to use for the forum taxo instead
    global $wp_rewrite;
    $wp_rewrite->use_verbose_page_rules = true;

    // two rules to handle feeds
    add_rewrite_rule('(.+)/feed/(feed|rdf|rss|rss2|atom)/?$','index.php?forum=$matches[1]&feed=$matches[2]');
    add_rewrite_rule('(.+)/(feed|rdf|rss|rss2|atom)/?$','index.php?forum=$matches[1]&feed=$matches[2]');

    // one rule to handle paging of posts in the taxo
    add_rewrite_rule('(.+)/page/?([0-9]{1,})/?$','index.php?forum=$matches[1]&paged=$matches[2]');

    // one rule to show the forum taxo normally
    add_rewrite_rule('(.+)/?$', 'index.php?forum=$matches[1]');
}

add_action( 'init', 'test_init' );

Remember that after adding this code, you need to have it active when you go flush the permalink rules (by Saving the page on Settings->Permalinks)!

After you've flushed the rules and saved to the database, then /whatever should go to your forum=whatever taxonomy page.

Rewrite rules really aren't that difficult if you understand regular expressions. I use this code to help me when debugging them:

function test_foot() {
    global $wp_rewrite;
    echo '<pre>';
    var_dump($wp_rewrite->rules);
    echo '</pre>';
}
add_action('wp_footer','test_foot');

This way, I can see the current rules at a glance on my page. Just remember that given any URL, the system starts at the top of the rules and goes down through them until it finds one that matches. The match is then used to rewrite the query into a more normal looking ?key=value set. Those keys get parsed into what goes into the WP_Query object. Simple.
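To make the top-down matching concrete, here is a minimal sketch in plain PHP of "first rule that matches wins". It is a simplified stand-in for what the rewrite system does, not WordPress's actual code: the function name example_resolve_rules() is hypothetical, and the two rules are copied from the forum example above.

```php
<?php
/**
 * Hypothetical, simplified model of rule resolution.
 * $rules maps a regex (without delimiters) to a query template that
 * uses $matches[N] placeholders. Rules are tried in array order and
 * the first one that matches is used.
 */
function example_resolve_rules($request_path, array $rules) {
    foreach ($rules as $pattern => $query) {
        if (preg_match('#^' . $pattern . '#', $request_path, $matches)) {
            // Replace each $matches[N] placeholder with the captured group.
            return preg_replace_callback(
                '/\$matches\[(\d+)\]/',
                function ($m) use ($matches) {
                    $index = (int) $m[1];
                    return isset($matches[$index]) ? urlencode($matches[$index]) : '';
                },
                $query
            );
        }
    }
    return null; // no rule matched
}

$rules = array(
    '(.+)/page/?([0-9]{1,})/?$' => 'index.php?forum=$matches[1]&paged=$matches[2]',
    '(.+)/?$'                   => 'index.php?forum=$matches[1]',
);

echo example_resolve_rules('hardware/page/2', $rules), "\n"; // index.php?forum=hardware&paged=2
echo example_resolve_rules('hardware', $rules), "\n";        // index.php?forum=hardware
```

Order matters: if the catch-all '(.+)/?$' rule came first, it would also match 'hardware/page/2' and the paging rule would never be reached.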

Edit: Side note, this method will probably only work if your normal permalink structure does not start with a catch-all such as %category%. You need to start it with a static string or a numeric tag, like %year%. This is to prevent the normal post rules from catching your URL before it gets to your rules.


You will not be able to do this using WP_Rewrite alone, since it can't distinguish between term slugs and post slugs.

You have to also hook into 'request' and prevent the 404, by setting the post query var instead of the taxonomy one.

Something like this:

function fix_post_request( $request ) {
    $tax_qv = 'forum';
    $cpt_name = 'post';

    if ( !empty( $request[ $tax_qv ] ) ) {
        $slug = basename( $request[ $tax_qv ] );

        // if this would generate a 404
        if ( !get_term_by( 'slug', $slug, $tax_qv ) ) {
            // set the correct query vars
            $request[ 'name' ] = $slug;
            $request[ 'post_type' ] = $cpt_name;
            unset( $request[$tax_qv] );
        }
    }

    return $request;
}
add_filter( 'request', 'fix_post_request' );

Note that the taxonomy has to be defined before the post type.

This would be a good time to point out that having a taxonomy and a post type with the same query var is a Bad Idea.

Also, you won't be able to reach posts that have the same slug as one of the terms.
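The last point can be seen with a small standalone sketch of the same decision logic. This is only an illustration under stated assumptions: example_resolve_request() is a hypothetical function, and the $term_slugs array stands in for the get_term_by() lookup so that the code runs without WordPress.

```php
<?php
/**
 * Hypothetical stand-in for the 'request' filter above.
 * $term_slugs replaces the get_term_by() lookup with a plain list of slugs.
 */
function example_resolve_request(array $request, array $term_slugs, $tax_qv = 'forum', $cpt_name = 'post') {
    if (!empty($request[$tax_qv])) {
        $slug = basename($request[$tax_qv]);
        // No such term: treat the slug as a post name instead.
        if (!in_array($slug, $term_slugs, true)) {
            $request['name']      = $slug;
            $request['post_type'] = $cpt_name;
            unset($request[$tax_qv]);
        }
    }
    return $request;
}

// Made-up sample data: two existing term slugs.
$term_slugs = array('hardware', 'software');

// 'hello-world' is not a term, so the request is rewritten to a post lookup.
print_r(example_resolve_request(array('forum' => 'hello-world'), $term_slugs));

// 'hardware' is a term, so the request is left alone. A post whose slug is
// also 'hardware' can never be reached this way.
print_r(example_resolve_request(array('forum' => 'hardware'), $term_slugs));
```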