Aliases, htdocs, ErrorDocument, HTTP errors


#1

The short version of my question is whether it is possible to get the Apache directive ErrorDocument to use a file above the web root (/home/username/example/wee.php) for HTTP errors like 404, etc. From what I read, it sounds like it should be possible. I can get it to work on my local machine through Aliases, and the Apache documentation implies that there are methods besides Alias to make this work. But Aliases don’t seem to be an option on Dreamhost, and I don’t think mod_rewrite alone would work either. I know I could use a file beneath the document root that includes the file above, but I would like to avoid that if possible. At this point it’s not so much a matter of saving time but solving the puzzle and filling the gaps in my knowledge.

Now for the long version… This is a reeeeeally long question, I appreciate anyone’s attempt to help!

I had a hard time deciding how I should manage these errors (404, 500, …), and now that I have finally decided, I am running into problems.

Let me first describe how I decided to set it up. I have several sites hosted on a shared Dreamhost account. In the folder structure that I see, everything of mine on the server is under /home/username, and for example, site1.com’s web root is at

/home/username/site1.com

I am creating a generic error handler (php script) for errors like 404 not found, 500, etc. that I want to store above the web roots of my sites at

/home/username/error_handler/index.php

so that I can use an .htaccess file at /home/username/.htaccess which includes something like the following:

ErrorDocument 404 /error_handler/index.php
ErrorDocument 500 /error_handler/index.php

…and many more

When these errors occur on any of my sites, I want it to be directed to

/home/username/error_handler/index.php

This is the problem I’m having a hard time figuring out. The ErrorDocument directives above will actually cause Apache to look for

/home/username/site1.com/error_handler/index.php

Anyway, the errors should be redirected to my error handling php script. The script will use $_SERVER['REDIRECT_STATUS'] to get the error code, then use $_SERVER['REDIRECT_URL'] and $_SERVER['HTTP_HOST'] to decide what to do. It will check if an error handler specific to that site exists (for example: site1.com/errors/404.php). If this custom page doesn’t exist, it will output a generic message that is slightly more user-friendly and styled, and perhaps will include some contact info for me depending on the error.

Doing it this way lets me funnel all these errors through this 1 php script. I can log the errors however I like or send email notifications if I want. It also lets me set up the ErrorDocument Apache directives once for all my sites instead of having to do it for every site. It will also continue to work without modification when I move the site around, since I already have a system that scans the folder structure to figure out where my site roots are when they really aren’t at the web root technically speaking. This may not be possible with other solutions like using mod_rewrite for all 404 problems, which I know is common. Or if it is possible, it may be very difficult to do. Plus, I have already done that work, so it will be easy for me to adapt.

When I am working on sites for which I don’t have a domain name yet (or sites where the domain name is already in use at the moment), I store them temporarily in site1.com/dev/site3.com for example. Moving the site to site3.com eventually would cause me to have to update the htaccess files if I had one for each site. Changing the domain name would do the same.

Ex: a site stored at site1.com/dev/site3.com would have this in its htaccess file:

ErrorDocument 404 /site1.com/dev/site3.com/error/404.php

And it would have to be changed to this:

ErrorDocument 404 /site3.com/error/404.php

Obviously, this isn’t a huge amount of work, but I already manage a lot of sites and I will probably be making more every year, 95% of which will be hosted on my shared DreamHost account. And most of them get moved at least once. So setting up something automatic will save me some effort in the long run.

I already have a system set up for managing site-relative links on all my sites. These links will work whether the site exists in a subdirectory of an existing site, or in their own domain. They also work without change in a local development server despite a difference in the web root location. For example, on the live server, the site-relative http link /img/1.jpg would resolve to the file /home/username/site1.com/img/1.jpg while on my local development server it would resolve to C:\xampp\htdocs\img\1.jpg, despite what I consider the logical site root being at C:\xampp\htdocs\site1.com. I love this system, and it is what gave me the idea to set up something that would work automatically like I expected it to, based on the file structure I used.

So, if I could get it to work, I think this seems like a pretty good system. But I am still very new to apache configuration, mod_rewrite, etc. It’s possible there is a much easier and better way to do this. If you know of one, please let me know.

Anyway, all that aside, I can’t get it working. The easiest thing would be if I could have the ErrorDocument directive send the requests to folders above the web root. But the path is a URL path relative to the document root. Using the following in /home/username/.htaccess,

ErrorDocument 404 /error_handler/index.php

a request for a non-existent resource causes Apache to look for the file at

site1.com/error_handler/index.php

So I thought I should set up a redirection (on all my sites) that would redirect those URLs to /home/username/error_handler. I tried a few things and couldn’t get any of them to work.

Alias seemed like the simplest solution, but it can only be set in the main server configuration (it is not allowed in .htaccess), so it is read when the server is started. On my local server, it worked fine using:

Alias /error_handler C:\xampp\htdocs\error_handler2

I pointed the Alias at a different local folder (error_handler2) to test that the Alias was functioning properly. (On the local server, the URL path specified by the ErrorDocument directive already points to the right folder, since my local web root is technically C:\xampp\htdocs and the error handler I want to use is stored locally at C:\xampp\htdocs\error_handler\index.php.)

Dreamhost has a web client that can create what I am guessing is an Alias. When I tried to redirect the folder error_handler on site1.com to /home/username/error_handler, it would seem to work right if I typed site1.com/error_handler in the browser. But if I typed site1.com/test1234 (nonexistent), it would say there was a 404 error trying to use the error handler. Also, even if I could get it to work, I would have to log in through the web client and point and click (and wait several minutes for the server to restart) every time I wanted to set this up for a new site.

So I tried getting it to work with mod_rewrite, which seems like the most flexible solution. My first attempt looked something like this (stored in /home/username/site1.com/.htaccess for now, though it would eventually be at /home/username/.htaccess):

RewriteEngine On
RewriteRule ^error_handler/index.php$ /home/username/error_handler/index.php

The plain English version of what I was trying to do above is to send requests on any of my sites for error_handler/index.php to /home/username/error_handler/index.php. My misunderstanding was that I assumed the substitution would be treated as a file path if that path exists. But I missed that the documentation says “(or, in the case of using rewrites in a .htaccess file, relative to your document root)”. So instead of rewriting to /home/username/error_handler/index.php, it’s actually trying to rewrite to /home/username/site1.com/home/username/error_handler/index.php.

I tried including Options +FollowSymLinks because in the Apache documentation it says this:

To enable the rewrite engine in this context [per-directory re-writes in htaccess], you need to set “RewriteEngine On” and “Options FollowSymLinks” must be enabled. If your administrator has disabled override of FollowSymLinks for a user’s directory, then you cannot use the rewrite engine. This restriction is required for security reasons.

I searched around for a while and I couldn’t find anything about how Dreamhost handles this (probably because I don’t know where to look).

I experimented with RewriteBase because in the Apache documentation it says this:

"This directive is required when you use a relative path in a substitution in per-directory (htaccess) context unless either of the following conditions are true:

The original request, and the substitution, are underneath the DocumentRoot (as opposed to reachable by other means, such as Alias)."

Since this is supposed to be a URL path, in my case it should be RewriteBase /, since all my redirects will be from site1.com/error_handler. I also tried RewriteBase /home/username and RewriteRule ^error_handler/index.php$ error_handler/index.php. However, RewriteBase is a URL path relative to the document root, so I would still need something like an Alias. The implication in the quote from the documentation above is that it is possible to use mod_rewrite to send requests to content above the web root. One of the many things I don’t know is what the ‘other means’ besides Alias might be. I believe Alias might not be an option on Dreamhost. At least I couldn’t make sense of it.


#2

I think you are barking up the wrong tree. You can try using symlinks, like /error/handler.php -> /home/username/error_handler.php (this is set up through shell access using the ‘ln’ command), or you can use includes, i.e. <?php include '/home/username/error_handler.php'; ?>, or perhaps server-parsed html includes. I don’t know offhand if you can execute symlinks.
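For what it’s worth, here is a minimal sketch of the symlink idea. The helper name link_handler is made up for illustration, and the demo runs in a throwaway temp directory rather than the real /home/username; the assumption is a shared error_handler folder sitting next to the site roots. Whether Apache will actually follow the link depends on the FollowSymLinks / SymLinksIfOwnerMatch settings the host allows.

```shell
#!/bin/sh
# Sketch: link a shared error handler folder into one site's web root.
# link_handler is a hypothetical helper name, not an existing command.
link_handler() {
    handler_dir=$1   # directory holding the shared index.php
    site_root=$2     # web root of one site
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a file, do not follow it
    ln -sfn "$handler_dir" "$site_root/error_handler"
}

# Demonstration in a temp directory (stand-in for /home/username).
demo=$(mktemp -d)
mkdir -p "$demo/error_handler" "$demo/site1.com"
echo '<?php echo "shared handler";' > "$demo/error_handler/index.php"

link_handler "$demo/error_handler" "$demo/site1.com"
ls -l "$demo/site1.com"
```

On the real account the call would presumably be link_handler /home/username/error_handler /home/username/site1.com, repeated for each site root.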


#3

“I know I could use a file beneath the document root that includes the file above, but I would like to avoid that if possible. At this point it’s not so much a matter of saving time but solving the puzzle and filling the gaps in my knowledge.”

I was just hoping to set up something that would work across all my existing and future sites without having to do anything manually site by site. So I would like to avoid creating a php file in each site root that does nothing but include the same shared file, if there is a way to access that file directly. I have never used symlinks before. Am I right that it’s something I would have to do for every domain? Would I be creating a symlink in the web root of each site? If that’s the case, it’s no better or worse than the includes. I should have some time tomorrow to try out symlinks.


#4

I’ve used a small subdomain for custom error handling in the past (for logging bad bots).

~/.htaccess

[code]ErrorDocument 404 http://error.domain.com/error_handler.php
ErrorDocument 500 http://error.domain.com/error_handler.php

etc…[/code]

The access logs for the error subbie only have records of any errors.


#5

Thanks for the reply. I hadn’t thought of that, but it doesn’t work well for me in this application. From the apache doc:

“Note that when you specify an ErrorDocument that points to a remote URL (ie. anything with a method such as http in front of it), Apache will send a redirect to the client to tell it where to find the document, even if the document ends up being on the same server. This has several implications, the most important being that the client will not receive the original error status code, but instead will receive a redirect status code. This in turn can confuse web robots and other clients which try to determine if a URL is valid using the status code. In addition, if you use a remote URL in an ErrorDocument 401, the client will not know to prompt the user for a password since it will not receive the 401 status code. Therefore, if you use an ErrorDocument 401 directive then it must refer to a local document.”

I thought perhaps the server would retain the information about the error code and the original request URL, but I don’t think it does. Perhaps sending a redirect status code to the client causes the next request to be seen as a new request, thus wiping the variables I was using. My error handling script needs to be able to tell the original error code and the request it came from to respond appropriately. Perhaps there is another way to get these values, but in my script I used $_SERVER['REDIRECT_STATUS'] to get the error code and $_SERVER['REDIRECT_URL'] to tell what the client’s request was. Using your method (external URLs in ErrorDocument) causes both of those variables to be ‘undefined indexes’. It also changes the URL for the user, which I wouldn’t want it to do.

What I am trying to achieve works great on the local server using something like this:

<Directory C:\xampp\htdocs>
ErrorDocument 404 /errors
</Directory>

Alias /errors C:\xampp\htdocs\error_handler\index.php

Technically I could skip the alias on my local server, because the url in the ErrorDocument is relative to my web root, which is technically C:\xampp\htdocs. But on my remote server, the web root will be below the error_handler folder, so I need something like Alias. I just can’t use Alias like that on the Dreamhost server. I don’t know if there is something besides alias that would have a similar function.


#6

The “Remap Sub-Directory” tab in the panel creates Alias directives on the backend.


#7

I tried symlinks today as Atropos7 suggested. It seemed like it would be slightly easier to automate creating those symlinks than creating those php files, so that would be a slight upgrade from using php files with includes if it worked. However, the same thing happens with symlinks that happens with the ‘Alias’ created through the Dreamhost web interface. I created a symlink at /home/username/site1.com/error_handler/index.php that points to /home/username/error_handler/index.php. When I access it directly in the browser (site1.com/error_handler/index.php), it reaches the file as expected. But if I attempt to access site1.com/test123 (a nonexistent resource), I get this: “404 Not Found error was encountered while trying to use an ErrorDocument to handle the request”. I tried using Options +FollowSymLinks or +SymLinksIfOwnerMatch and it didn’t seem to make a difference.

So really the only thing that works so far on the Dreamhost server is to create php files below the web root that include the file above. So still no automatic solution, at least on the shared Dreamhost server. Alias makes it easy on the local server. Any other ideas?
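Since the include stubs are the one thing that works, creating them could at least be scripted. Here is a rough sketch. write_stubs is a made-up helper name, and it assumes that every directory directly under the home folder with a dot in its name (site1.com, site2.com, …) is a site root, which may not hold for every account. The demo runs in a temp directory rather than the real /home/username.

```shell
#!/bin/sh
# Sketch: write a one-line PHP stub into every site root so that
# "ErrorDocument 404 /error_handler/index.php" resolves inside each site.
# write_stubs is a hypothetical helper. Assumption: site roots are the
# directories directly under the home folder whose names contain a dot.
write_stubs() {
    home_dir=$1
    for site_root in "$home_dir"/*.*/; do
        [ -d "$site_root" ] || continue   # skip the unexpanded pattern if nothing matched
        mkdir -p "${site_root}error_handler"
        # The stub only pulls in the shared handler stored above the web root.
        printf '<?php require %s;\n' "'$home_dir/error_handler/index.php'" \
            > "${site_root}error_handler/index.php"
    done
}

# Demonstration in a temp directory (stand-in for /home/username).
demo=$(mktemp -d)
mkdir -p "$demo/error_handler" "$demo/site1.com" "$demo/site2.com" "$demo/logs"

write_stubs "$demo"
cat "$demo/site1.com/error_handler/index.php"
```

Because the stub is reached through an internal ErrorDocument request and not a client redirect, the shared script should still see REDIRECT_STATUS and REDIRECT_URL. The script would need to be rerun whenever a site is added.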
[hr]

I couldn’t get that to work with the ErrorDocument directive.

“Dreamhost has a web client that can create what I am guessing is an Alias. When I tried to redirect the folder error_handler on site1.com to /home/username/error_handler, it would work right if I typed site1.com/error_handler/index.php in the browser. But if I typed site1.com/test1234 (nonexistent), it would say there was a 404 error trying to use the error handler to handle the request.”

The above is using ErrorDocument 404 /error_handler/index.php. I tried several varieties, but I could only get it to work with direct requests.

But even if I can get it working, it’s still not the automated solution I was hoping for. It’s actually easier to use php files with includes than it is to go through the DH web client. Is an automated solution impossible on the shared server? I’m beginning to think it is.