Magento EE 1.13 catalog routing

(Thought I'd posted an answer similar to Alan's, but I hadn't. Sitting here in LocalStorage. But, I can tag onto his answer with an interesting solution theory.)

The CMS router adds itself to the Front Controller instance by observing the controller_front_init_routers event after the Admin and Standard routers are added. With a little config XML, it would be possible to switch this to the controller_front_init_before event, thereby adding the CMS router first, meaning its match()ing logic will run before the others.

To test this theory, drop the following into app/etc/local.xml:

<global>
    <events>
        <!-- fire observer for different event -->
        <controller_front_init_before>
            <observers>
                <cms>
                    <class>Mage_Cms_Controller_Router</class>
                    <method>initControllerRouters</method>
                </cms>
            </observers>
        </controller_front_init_before>
        <!-- disable the original observer (assumed to be declared under <global> in the CMS module's config.xml) -->
        <controller_front_init_routers>
            <observers>
                <cms>
                    <type>disabled</type>
                </cms>
            </observers>
        </controller_front_init_routers>
    </events>
</global>

See if this solves the problem.

Incidentally, the CMS router will adjust the request path in the same way as the URL rewrite model.
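
To make the ordering idea concrete, here is a minimal standalone sketch (plain PHP, no Magento classes) of how the event an observer listens to decides where its router lands in the list. The function name buildRouterOrder is hypothetical, and the step sequence is an assumption about front controller init: controller_front_init_before fires, the configured Admin and Standard routers are added, controller_front_init_routers fires, and the default router is appended last.

```php
<?php
/**
 * Illustration only, not Magento code: returns the router order that
 * results when the CMS observer is attached to the given event.
 */
function buildRouterOrder($cmsObserverEvent)
{
    $routers = array();

    // Events fired during init, in order, each followed by the routers
    // the core adds right after it (assumed sequence, see lead-in).
    $steps = array(
        'controller_front_init_before'  => array('admin', 'standard'),
        'controller_front_init_routers' => array('default'),
    );

    foreach ($steps as $event => $coreRouters) {
        if ($event === $cmsObserverEvent) {
            $routers[] = 'cms'; // the observer adds the CMS router here
        }
        foreach ($coreRouters as $code) {
            $routers[] = $code;
        }
    }

    return $routers;
}

// Stock configuration: admin, standard, cms, default
echo implode(', ', buildRouterOrder('controller_front_init_routers')), "\n";
// With the config change above: cms, admin, standard, default
echo implode(', ', buildRouterOrder('controller_front_init_before')), "\n";
```

Since routers are consulted in the order they were added, the second ordering is what would let the CMS router see a request before the Standard router does.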


Over the years I've seen a lot of interesting implementations (some good, some bad) by SEO specialists who didn't have a Magento background. It sounds like you may be running into problems with some custom code you don't understand. The high-level answer to your question may be "Contact the person who wrote the SEO code and/or installed the extension you don't understand", or find a Magento consultant to take a look and quickly dissect it for you.

Your question, even with its clarification, is still too confusing. Literally speaking, no, there's no setting in Magento to "force category routing to be more strict". I'm going to explain, in broad terms, how category routing works vs. CMS routing in a standard Magento system. This will (hopefully) give you enough information to ask a new question in terms we can understand. Also, I've written extensively on Magento's request dispatch before, so if you're interested in the nitty-gritty details I'd start there.

Category Routing

There is, strictly speaking, no "category" routing in Magento. On a site with SEO Friendly URLs turned off, a category listing page looks like this.

http://magento.example.com/catalog/category/view/id/8

When SEO friendly URLs are on, Magento (in an indexing process) creates one or more entries in the core_url_rewrite table for that category. The request_path column is the important one here. When Magento is deciding how to handle a particular URL, it will first look in this table. If the current URL matches the request_path, Magento will change its internal representation of the URL so it looks like the target_path column.

So, in the sample data, there's a row that looks like this

*************************** 1. row ***************************
url_rewrite_id: 17
      store_id: 1
   category_id: 8
    product_id: NULL
       id_path: category/8
  request_path: electronics/cell-phones.html
   target_path: catalog/category/view/id/8
     is_system: 1
       options: NULL
   description: NULL
1 row in set (0.01 sec)

When Magento sees the URL http://magento.example.com/electronics/cell-phones.html, it matches this row because the request_path column is electronics/cell-phones.html. It then changes its internal representation of the URL to the target_path (catalog/category/view/id/8). Then, Magento handles the URL normally.

That's probably a bit much to follow if you're not used to it, but the important thing to take away is that the system that decides how to handle the URL doesn't care that it's a category URL; it just cares that there's an entry in the core_url_rewrite table. This same table is used for product name URLs. Many SEO extensions and custom code solutions use this table as well.
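
As a rough sketch of that lookup, the table can be pictured as a map from request_path to target_path. The function name applyRewrite is hypothetical and this is not Magento's implementation; it also ignores store_id and the other columns for brevity.

```php
<?php
/**
 * Illustration only: swap a request path for its target path when a
 * rewrite row matches; otherwise leave the path untouched.
 */
function applyRewrite(array $rewrites, $requestPath)
{
    $requestPath = ltrim($requestPath, '/');

    return isset($rewrites[$requestPath]) ? $rewrites[$requestPath] : $requestPath;
}

// request_path => target_path, taken from the sample row above
$rewrites = array(
    'electronics/cell-phones.html' => 'catalog/category/view/id/8',
);

echo applyRewrite($rewrites, '/electronics/cell-phones.html'), "\n"; // catalog/category/view/id/8
echo applyRewrite($rewrites, '/about-us'), "\n";                     // about-us (no row, unchanged)
```

Note that nothing in the lookup knows or cares whether the row belongs to a category, a product, or an extension.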

CMS Page Routing

After Magento finishes referencing the core_url_rewrite table, what happens next is

  1. It checks for any Admin application page matches (manage products, manage categories, etc.)

  2. It checks for any Frontend application page matches (product listing page, the above mentioned category listing page, etc.)

  3. If numbers 1 & 2 contain no matches, it then looks for a CMS page match.

Magento doesn't use the core_url_rewrite table for CMS pages. Instead, if it reaches step number 3, it tries to match the URL with the URL Key set on the CMS page object. (It would be more accurate to say that when Magento is looking for a CMS page match, it's operating on a URL already modified by the core_url_rewrite process, but things are already confusing enough.)

The important takeaway here is: CMS matching happens only after a category page match has failed.
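
The three steps above amount to a first-match-wins loop. Here is a minimal sketch of that loop; findHandlingRouter and the matcher closures are hypothetical, and the simple prefix and list checks are stand-ins for the much more involved matching real routers do.

```php
<?php
/**
 * Illustration only: ask each router in turn and return the code of
 * the first one whose matcher accepts the path.
 */
function findHandlingRouter(array $routers, $path)
{
    foreach ($routers as $code => $matches) {
        if (call_user_func($matches, $path)) {
            return $code;
        }
    }

    return 'default'; // no router claimed the path
}

// Hypothetical matchers, listed in the order of the steps above.
$routers = array(
    'admin'    => function ($path) { return strpos($path, 'admin/') === 0; },
    'standard' => function ($path) { return strpos($path, 'catalog/') === 0; },
    'cms'      => function ($path) { return in_array($path, array('about-us', 'home'), true); },
);

echo findHandlingRouter($routers, 'catalog/category/view/id/8'), "\n"; // standard
echo findHandlingRouter($routers, 'about-us'), "\n";                   // cms
echo findHandlingRouter($routers, 'no-such-page'), "\n";               // default
```

The CMS matcher only runs for about-us because the two matchers ahead of it both declined the path.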

It sounds like you may have external processes modifying the core_url_rewrite table, or may have a custom router object added to your system that does extra routing, or maybe even a non-Magento system doing things to change URLs.

I'm afraid there's no quick and easy answer for your situation.


What I'm looking for is a URL structure similar to the previous Magento versions (i.e. category/subcategory), without losing the benefits of background indexing that 1.13 gives.

This is a problem I've been looking at, which so far doesn't seem to have a great solution. We have some deeply nested categories, for example:

Cat A
    Cat B
        Cat C
            Cat D

Prior to 1.13, the category URL would have been generated as www.domain.com/cat-a/cat-b/cat-c/cat-d/, but now it is generated as www.domain.com/cat-d. If you have multiple "Cat D"s, though, it could be generated as something like www.domain.com/catalog/category/view/s/cat-d/id/132/.

I've been tinkering with different ideas for addressing this. One thing I'm trying right now is to modify the loadByRequestPath method of Enterprise_UrlRewrite_Model_Resource_Url_Rewrite to look for a full path first, before falling back to the default behavior. I did that by adding this method:

protected function tryLoadByFullPath($object, $paths)
{
    if (count($paths) > 1) {
        $_path = implode('/', $paths);

        $select = $this->_getReadAdapter()->select()
            ->from(array('m' => $this->getMainTable()))
            ->where('m.request_path = ?', $_path);

        $result = $this->_getReadAdapter()->fetchRow($select);

        if ($result) {
            $object->setData($result);
            $this->unserializeFields($object);
            $this->_afterLoad($object);

            return true;
        }
    }

    return false;
}

and then adding this code to the top of loadByRequestPath():

if ($this->tryLoadByFullPath($object, $paths)) {
    return $this;
}

It appears to work at first glance, although I haven't tested it thoroughly yet. The downside is that the url_key has to be set manually to the full path for every category, so you would have to set the URL key for Cat D to "cat-a/cat-b/cat-c/cat-d". That's obviously not ideal.
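
To show what that manual step involves, here is a small standalone sketch that derives the full-path key from a list of category names. The helper buildFullPathUrlKey is hypothetical and its slug rule is a simplification, not Magento's own URL key formatting; actually saving the key on each category is left out.

```php
<?php
/**
 * Illustration only: build a full-path URL key from category names,
 * ordered from the top-level ancestor down to the category itself.
 */
function buildFullPathUrlKey(array $categoryNames)
{
    $segments = array();

    foreach ($categoryNames as $name) {
        // Lowercase, then collapse each run of other characters into a hyphen
        $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($name));
        $segments[] = trim($slug, '-');
    }

    return implode('/', $segments);
}

echo buildFullPathUrlKey(array('Cat A', 'Cat B', 'Cat C', 'Cat D')), "\n"; // cat-a/cat-b/cat-c/cat-d
```

Something along these lines could be run once per category to fill in the keys, though every key would then need to be rebuilt whenever a category is moved or renamed.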

Anyway, that's probably not very helpful, but maybe someone has a better take on this approach.