The right way of using index.html

The reason we use index.html or home.html or derivatives thereof is that the webserver software itself looks for that file name and serves it. For example:

This is INVALID: (www-directory)

/var/www/
|_blog.html
|_blog/
  |_math.html
  |_page2.html
  |_page3.html
  |_(...)

With directory listings enabled, a request for http://www.site.com/ will in fact get served as a page listing the folders and files (not what you want), because there is no index file in the root. You can try this structure, then also create an index.html file next to blog.html: the server will show index.html, and it will not serve blog.html unless you explicitly request http://www.site.com/blog.html. This is why http://www.google.com/ shows the page without you having to specify http://www.google.com/index.html.

This is VALID:

/var/www/
|_index.html (renamed blog.html to index.html)
|_blog/
  |_math.html
  |_page2.html
  |_page3.html
  |_(...)

This will serve what used to be your blog.html file (now index.html) AS THE HOMEPAGE, instead of listing all the folders/files in that directory.

The webserver software has (in the config) a list of file names that will be served as the homepage or the main page of a folder, checked in order. (In my experience, index.html takes precedence over index.php, so if you have both index.html and index.php in a folder, index.html is what the public will see.) Of course that can all be changed, and you can even set blog.html to be recognized as an "index".
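To make that precedence concrete, here is a minimal Python sketch of the lookup. It is not the actual server code; the function name pick_index and the default list of names are my own assumptions for illustration. The real list lives in your server config.

```python
def pick_index(files, index_names=("index.html", "index.php")):
    """Return the first configured index name present in the folder, else None."""
    present = set(files)
    for name in index_names:  # earlier names take precedence
        if name in present:
            return name
    return None


print(pick_index(["index.php", "index.html", "math.html"]))  # index.html
print(pick_index(["blog.html", "math.html"]))  # None
print(pick_index(["blog.html"], index_names=("index.html", "blog.html")))  # blog.html
```

The last call mirrors adding blog.html to the configured list: once it is in the list, it can act as the index of its folder.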

Addressing your comment:

"This trick would change the address of my blog from www.xxx.com/blog.html into www.xxx.com/blog/."

This would be done by moving blog.html entirely into /blog/ and renaming it to index.html.

Your new structure would be:

/var/www/
|_blog/
  |_index.html (renamed from blog.html)
  |_math.html
  |_page2.html
  |_page3.html
  |_(...)

With this structure, http://www.site.com/blog/ should show the contents of your old blog.html. We renamed it to index.html so the webserver software picks it up as the index of your /blog/ directory.
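If you would rather script the move than do it by hand, something like the following Python sketch would work. The helper name make_directory_index is made up for this example, and it runs against a throwaway temporary directory rather than the real /var/www/, so treat it as a starting point only.

```python
import tempfile
from pathlib import Path


def make_directory_index(web_root, page="blog.html", folder="blog"):
    """Move web_root/page to web_root/folder/index.html and return the new path."""
    root = Path(web_root)
    target_dir = root / folder
    target_dir.mkdir(exist_ok=True)  # /blog/ may already exist
    target = target_dir / "index.html"
    if target.exists():
        raise FileExistsError(f"{target} already exists; not overwriting it")
    (root / page).rename(target)
    return target


# Try it on a throwaway directory first.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "blog.html").write_text("<h1>My blog</h1>")
    new_path = make_directory_index(tmp)
    print(new_path.relative_to(tmp).as_posix())  # blog/index.html
```

Keep in mind that if the page used relative links such as blog/math.html, those need to become math.html after the move.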

You're also free now to put an index.html file into the root of your site, http://www.site.com/(index.html), with links to /blog/ and whatever else you wish.

Specifically answering your questions in short statements:

  1. Is it a good practice to have the index.html file in every subfolder or is it intended to be only in the root folder?

    Yes, it is good practice to have one in every subfolder, because it prevents people from seeing what files are in your directories. You can also prevent directory listings with a .htaccess file containing Options -Indexes.

  2. Are there any disadvantages or problems that may occur when using the second, "index in every folder" method?

    None that I can think of.

  3. Which one of the two ways of structuring the website described above would you prefer?

    I usually have an index.html or index.php file in the root, subfolders based on category (such as forum or news or login etc.) and then some sort of index inside each of those.


The technical term for index.html is Directory Index for Apache and Default Document for IIS. The other Apache directive of interest is the Options directive. As indicated in the documentation, when Options Indexes is set:

If a URL which maps to a directory is requested, and there is no DirectoryIndex (e.g., index.html) in that directory, then mod_autoindex will return a formatted listing of the directory.

When I set up a website that is not using a content management system, my preferred setup is to have one content page per directory. That page is the directory index (default document) for the directory. All links on the site only link to the directory and end with a trailing slash (e.g., http://example.com/blog/ instead of http://example.com/blog/index.html or ./blog/ instead of ./blog/index.html). The trailing slash is important to avoid what is commonly referred to as a courtesy redirect. (If the trailing slash is omitted, everything still resolves correctly, but the server first redirects the client to the URL with the slash, so the number of HTTP requests and thus bandwidth increase.)
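The cost of the courtesy redirect can be illustrated with a small Python sketch. This is a toy model, not real server code: the function names respond and requests_needed are hypothetical, and I am assuming the redirect is a 301 to the same path with a slash appended.

```python
def respond(path, directories):
    """Return (status, location) for a request path.

    `directories` is the set of URL paths that map to folders, written
    without the trailing slash (e.g. {"/blog"}).
    """
    if path in directories:
        # Directory requested without the slash: redirect first.
        return 301, path + "/"
    return 200, path


def requests_needed(path, directories):
    """Count the HTTP requests a client makes until it gets a 200."""
    count = 1
    status, location = respond(path, directories)
    while status == 301:
        count += 1
        status, location = respond(location, directories)
    return count


dirs = {"/blog"}
print(requests_needed("/blog", dirs))   # 2
print(requests_needed("/blog/", dirs))  # 1
```

Linking to /blog/ directly saves the extra round trip on every click.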

My primary motivation for the above methodology is twofold. First, it facilitates switching the technology used on the website. For example, I can change a page from index.html to index.php without breaking any links or search engine listings. Second, the file extension of a content page is "noise"; removing the file extension from the URL results in shorter and hopefully more readable URLs.

As for other file types:

  • All CSS files reside in a css directory in the root of the website.
  • All image files reside in an image directory or subdirectory thereof in the root of the website.
  • All JavaScript files reside in a scripts directory in the root of the website.
  • All Flash and other movie files reside in a video directory or subdirectory thereof in the root of the website.

On an Apache server, I disable Options Indexes for the abovementioned directories. On both Apache and IIS servers, I do not specify a directory index (default document) for the abovementioned directories. Thus, a request for any of the directories results in an HTTP 403 error.
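The three possible outcomes for a directory request (index page, generated listing, or 403) can be summed up in a short Python sketch. The function directory_response and its parameters are names I made up for illustration; a real server makes this decision from its own configuration.

```python
def directory_response(files, index_names=("index.html",), listing_enabled=False):
    """Return (status, body) for a request that maps to a directory."""
    for name in index_names:
        if name in files:
            return 200, name  # serve the directory index
    if listing_enabled:
        return 200, sorted(files)  # auto-generated listing of the folder
    return 403, None  # no index and listings disabled


css = ["print.css", "main.css"]
print(directory_response(css))  # (403, None)
print(directory_response(css, listing_enabled=True))  # (200, ['main.css', 'print.css'])
print(directory_response(["index.html", "math.html"]))  # (200, 'index.html')
```

The first call corresponds to the css, image, scripts and video directories described above: no index file and no listing, so the visitor gets a 403.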