Remove .php extension with .htaccess

Gumbo's answer in the Stack Overflow question How to hide the .html extension with Apache mod_rewrite should work fine.

Re 1) Change the .html to .php

Re a.) Yup, that's possible, just add #tab to the URL.

Re b.) That's possible using QSA (Query String Append), see below.

This should also work in a sub-directory path:

RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule !.*\.php$ %{REQUEST_FILENAME}.php [QSA,L]

Apache mod_rewrite

What you're looking for is mod_rewrite,

Description: Provides a rule-based rewriting engine to rewrite requested URLs on the fly.

Generally speaking, mod_rewrite works by matching the requested document against specified regular expressions, then performs URL rewrites internally (within the Apache process) or externally (in the clients browser). These rewrites can be as simple as internally translating example.com/foo into a request for example.com/foo/bar.

The Apache docs include a mod_rewrite guide and I think some of the things you want to do are covered in it. Detailed mod_rewrite guide.

Force the www subdomain

I would like it to force "www" before every URL, so its not domain.example but www.domain.example/page

The rewrite guide includes instructions for this under the Canonical Hostname example.

Remove trailing slashes (Part 1)

I would like to remove all trailing slashes from pages

I'm not sure why you would want to do this as the rewrite guide includes an example for the exact opposite, i.e., always including a trailing slash. The docs suggest that removing the trailing slash has great potential for causing issues:

Trailing Slash Problem

Description:

Every webmaster can sing a song about the problem of the trailing slash on URLs referencing directories. If they are missing, the server dumps an error, because if you say /~quux/foo instead of /~quux/foo/ then the server searches for a file named foo. And because this file is a directory it complains. Actually it tries to fix it itself in most of the cases, but sometimes this mechanism need to be emulated by you. For instance after you have done a lot of complicated URL rewritings to CGI scripts etc.

Perhaps you could expand on why you want to remove the trailing slash all the time?

Remove .php extension

I need it to remove the .php

The closest thing to doing this that I can think of is to internally rewrite every request document with a .php extension, i.e., example.com/somepage is instead processed as a request for example.com/somepage.php. Note that proceeding in this manner would would require that each somepage actually exists as somepage.php on the filesystem.

With the right combination of regular expressions this should be possible to some extent. However, I can foresee some possible issues with index pages not being requested correctly and not matching directories correctly.

For example, this will correctly rewrite example.com/test as a request for example.com/test.php:

RewriteEngine  on
RewriteRule ^(.*)$ $1.php

But will make example.com fail to load because there is no example.com/.php

I'm going to guess that if you're removing all trailing slashes, then picking a request for a directory index from a request for a filename in the parent directory will become almost impossible. How do you determine a request for the directory 'foobar':

example.com/foobar

from a request for a file called foobar (which is actually foobar.php)

example.com/foobar

It might be possible if you used the RewriteBase directive. But if you do that then this problem gets way more complicated as you're going to require RewriteCond directives to do filesystem level checking if the request maps to a directory or a file.

That said, if you remove your requirement of removing all trailing slashes and instead force-add trailing slashes the "no .php extension" problem becomes a bit more reasonable.

# Turn on the rewrite engine
RewriteEngine  on
# If the request doesn't end in .php (Case insensitive) continue processing rules
RewriteCond %{REQUEST_URI} !\.php$ [NC]
# If the request doesn't end in a slash continue processing the rules
RewriteCond %{REQUEST_URI} [^/]$
# Rewrite the request with a .php extension. L means this is the 'Last' rule
RewriteRule ^(.*)$ $1.php [L]

This still isn't perfect -- every request for a file still has .php appended to the request internally. A request for 'hi.txt' will put this in your error logs:

[Tue Oct 26 18:12:52 2010] [error] [client 71.61.190.56] script '/var/www/test.peopleareducks.com/rewrite/hi.txt.php' not found or unable to stat

But there is another option, set the DefaultType and DirectoryIndex directives like this:

DefaultType application/x-httpd-php
DirectoryIndex index.php index.html

Update 2013-11-14 - Fixed the above snippet to incorporate nicorellius's observation

Now requests for hi.txt (and anything else) are successful, requests to example.com/test will return the processed version of test.php, and index.php files will work again.

I must give credit where credit is due for this solution as I found it Michael J. Radwins Blog by searching Google for php no extension apache.

Remove trailing slashes

Some searching for apache remove trailing slashes brought me to some Search Engine Optimization pages. Apparently some Content Management Systems (Drupal in this case) will make content available with and without a trailing slash in URLs, which in the SEO world will cause your site to incur a duplicate content penalty. Source

The solution seems fairly trivial, using mod_rewrite we rewrite on the condition that the requested resource ends in a / and rewrite the URL by sending back the 301 Permanent Redirect HTTP header.

Here's his example which assumes your domain is blamcast.net and allows the the request to optionally be prefixed with www..

#get rid of trailing slashes
RewriteCond %{HTTP_HOST} ^(www.)?blamcast\.net$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]

Now we're getting somewhere. Lets put it all together and see what it looks like.

Mandatory www., no .php, and no trailing slashes

This assumes the domain is foobar.example and it is running on the standard port 80.

# Process all files as PHP by default
DefaultType application/x-httpd-php
# Fix sub-directory requests by allowing 'index' as a DirectoryIndex value
DirectoryIndex index index.html

# Force the domain to load with the www subdomain prefix
# If the request doesn't start with www...
RewriteCond %{HTTP_HOST}   !^www\.foobar\.com [NC]
# And the site name isn't empty
RewriteCond %{HTTP_HOST}   !^$
# Finally rewrite the request: end of rules, don't escape the output, and force a 301 redirect
RewriteRule ^/?(.*)         http://www.foobar.example/$1 [L,R,NE]

#get rid of trailing slashes
RewriteCond %{HTTP_HOST} ^(www.)?foobar\.com$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]

The 'R' flag is described in the RewriteRule directive section. Snippet:

redirect|R [=code] (force redirect) Prefix Substitution with http://thishost[:thisport]/ (which makes the new URL a URI) to force a external redirection. If no code is given, a HTTP response of 302 (MOVED TEMPORARILY) will be returned.

Final Note

I wasn't able to get the slash removal to work successfully. The redirect ended up giving me infinite redirect loops. After reading the original solution closer I get the impression that the example above works for them because of how their Drupal installation is configured. He mentions specifically:

On a normal Drupal site, with clean URLs enabled, these two addresses are basically interchangeable

In reference to URLs ending with and without a slash. Furthermore,

Drupal uses a file called .htaccess to tell your web server how to handle URLs. This is the same file that enables Drupal's clean URL magic. By adding a simple redirect command to the beginning of your .htaccess file, you can force the server to automatically remove any trailing slashes.

Tags:

.Htaccess