How to ditch the file name extension in URLs?

#1

Hello

I’d like to be able to refer to URLs on my web site without using file name extensions, e.g. so that people connect to a page on my site using a URL such as www.mydomain.com/coolpage instead of www.mydomain.com/coolpage.html. Does anyone know how I can configure an .htaccess file (or files) to allow this?

Dreamhost tech support said that it can be done through .htaccess files. However, they noted that they did not have the exact code to do this.

According to the article below, “Cool URIs don’t change” by Tim Berners-Lee, this can be done through “content negotiation.”

http://www.w3.org/Provider/Style/URI
http://www.w3.org/Provider/Style/URI#remove

Here’s an excerpt from that article:

"If you are using, for example, Apache, you can set it up to do content negotiation. You keep the file extension (such as .png) on the file (eg mydog.png), but refer to the web resource without it. Apache then checks the directory for all files with that name and any extension, and it can also pick the best one out of a set (e.g. GIF and PNG). (You do not have to put different types of file in different directories, in fact the content negotiation won’t work if you do.)

  • Set up your server to do content negotiation
  • Make references always to the URI without the extension

References which do have the extension on will still work but will not allow your server to select the best of currently available and future formats."

I don’t know much about server configuration, Apache, .htaccess files, etc., so the Apache docs on content negotiation I found at apache.org were as clear as mud to me. Moreover, I couldn’t find anything that specifically described how to do what “Cool URIs don’t change” advocates.

Thanks


#2

To get that effect I’ve always just used a directory with the name I want to access, without a trailing .something, and put an index.html file in it. The server should automatically redirect the visitor’s browser to the URL with the trailing / added, which seems to work pretty well. No luck if you’ve got a lot of pages you’d rather keep in the same directory, though, and you don’t want the hassle of generating all those directories.


#3

I use content negotiation myself, in order to serve English and Spanish versions of the pages in a bilingual site based on browser configuration. I don’t use the extensionless versions of URLs myself, because the site was in operation using *.html links before I set up content negotiation, and besides, aesthetically, the links just don’t look right to me without the .html at the end. (Of course, links to directories don’t have the .html at the end, but they do have a trailing slash; one of my pet peeves is people leaving the slash off, forcing an unnecessary server redirection!) However, I’ve tested that, with the .htaccess lines enabling content negotiation in place, extensionless links do in fact work on my site even though I don’t actually link to the pages that way.

Here’s what’s in my .htaccess file:

Options +MultiViews
AddLanguage en-US .en
AddLanguage es-MX .es
LanguagePriority en-US es-MX

The important line is the first one, which enables content negotiation where file extensions designate the content type and content language. Files can have multiple extensions, like “foo.html.en” and “foo.html.es” for English and Spanish versions of a page. The page can be linked to as “foo.html” or just “foo”, and will result in the English or Spanish version based on browser preference, or you can link to “foo.html.en” to specifically get the English version.

A directory’s default index file can be named “index.html.en” or “index.html.es” (or “index.html.[anything else]”) to allow those to have multiple variants as well.
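For anyone trying to picture what the server does with a request for “foo”, here is a rough Python sketch of the lookup. It is only a simplified model, not Apache’s actual algorithm: the function names and the file list are made up for illustration, and the two tables simply mirror the AddLanguage and LanguagePriority lines above.

```python
# Simplified model of a MultiViews-style lookup. NOT Apache's real algorithm.
# The tables mirror the AddLanguage / LanguagePriority lines; all function
# names here are hypothetical.

LANG_EXT = {"en": "en-US", "es": "es-MX"}   # file extension -> language tag
LANGUAGE_PRIORITY = ["en-US", "es-MX"]      # server-side fallback order


def primary(tag):
    """Return the primary subtag, so that "es" can match "es-MX"."""
    return tag.split("-")[0].lower()


def file_language(name):
    """Return the language tag implied by a file's extensions, or None."""
    for ext in name.split(".")[1:]:
        if ext in LANG_EXT:
            return LANG_EXT[ext]
    return None


def negotiate(files, requested, accepted_languages):
    """Pick the file to serve for `requested` from the names in one directory."""
    if requested in files:
        return requested            # exact match, nothing to negotiate
    # Every file that is the requested name plus one or more extensions.
    found = sorted(f for f in files if f.startswith(requested + "."))
    if not found:
        return None                 # the server would answer 404
    # Try the browser's languages first, then the server's fallback order.
    for lang in list(accepted_languages) + LANGUAGE_PRIORITY:
        for name in found:
            tag = file_language(name)
            if tag is not None and primary(tag) == primary(lang):
                return name
    return found[0]                 # no language information: first candidate


files = ["foo.html.en", "foo.html.es", "coolpage.html"]
print(negotiate(files, "foo", ["es"]))
print(negotiate(files, "foo.html", ["fr"]))
print(negotiate(files, "coolpage", []))
```

In this model a Spanish-preferring browser asking for “foo” gets foo.html.es, a browser asking for “foo.html” with no usable language falls back to the first entry in the priority list, and “coolpage” resolves to coolpage.html because it is the only file with that base name.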

– Dan


#4

[quote]However, I’ve tested that, with the .htaccess lines
enabling content negotiation in place, extensionless
links do in fact work on my site even if I don’t
actually link to the pages that way.

Here’s what’s in my .htaccess file:

Options +MultiViews
[/quote]

[quote]The important line is the first one, which enables
content negotiation where file extensions designate the
content type and content language. Files can have
multiple extensions, like “foo.html.en” and
“foo.html.es” for English and Spanish versions of a
page.

[/quote]

Sorry, I’m not clear on this, so let’s see if I’m understanding you correctly. If I add to a directory an .htaccess file that contains only the line…

Options +MultiViews

…then content negotiation will be enabled, which means one doesn’t need to include the file name extension when either linking to the page or surfing to it? I.e., a visitor only needs to type in www.mydomainname/mysubdirectory/mypage to connect to mypage.html in mysubdirectory, as long as I have an .htaccess file in that directory?

If so, does the .htaccess file affect sub-directories or only the specific directory in which it resides?

Thanks,

Anthony


#5

A .htaccess file affects the directory it’s in and all subdirectories beneath it, unless an .htaccess file further down overrides the setting.
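To make the inheritance concrete, here is a small Python sketch that lists which .htaccess files would come into play for a given directory, from the top of the site down through every level of subdirectory. It assumes only directories at or below the document root are considered, and the paths and function name are hypothetical; settings in files lower in the list take precedence over the ones above them.

```python
from pathlib import PurePosixPath


def htaccess_chain(doc_root, directory):
    """List the .htaccess paths that apply to `directory`, outermost first.

    Assumption: only directories at or below `doc_root` are considered.
    """
    root = PurePosixPath(doc_root)
    # relative_to raises ValueError if `directory` is outside `doc_root`.
    rel = PurePosixPath(directory).relative_to(root)
    chain = [root / ".htaccess"]
    current = root
    for part in rel.parts:
        current = current / part
        chain.append(current / ".htaccess")
    return [str(p) for p in chain]


# Hypothetical layout: the site lives in /home/user/mydomain.com
for path in htaccess_chain("/home/user/mydomain.com",
                           "/home/user/mydomain.com/mysubdirectory/deeper"):
    print(path)
```

So an Options +MultiViews line in the top-level .htaccess would carry down to mysubdirectory and to anything below it, unless one of the deeper files changes it.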

– Dan


#6

I’m still not clear on something. If I add a .htaccess file which contains only this line…

Options +MultiViews

…then will content negotiation be enabled? Specifically, does this mean that one doesn’t need to include the file name extension when either linking to the page or surfing to it? And will the URL www.mydomain.com/mypage direct a visitor to www.mydomain.com/mypage.html?

Anthony


#7

Trying it would be one quick way to find out :>

I seem to recall it being a little more complex than that, but it’s been a while since we’ve had a customer asking about this, so my memory may well be faulty.


#8

If it’s a little more complex than that why not fill me in on the details?

Thanks,

Anthony


#9

Because I don’t remember them.