Prompting to download webpage


#1

This problem has been occurring since the 16th, as far as I’m aware. My site is at http://www.pixeljunkies.com. Here is a copy of the relevant portions of my email to Dreamhost:


Basically, the server is randomly prompting to download the page rather than displaying it. I could give a screenshot, but I am sure you know what I am talking about. It’s just as if you clicked a zip file or MP3 file, except it’s happening on regular links on my site.

I can’t point you to a specific page to see it on, because it is happening randomly and intermittently. The file type shown in the download dialogue that comes up is sometimes “PHP” and sometimes “application/octet-stream”. All my pages are being parsed as PHP.

This may be just a coincidence, but I think that when the server is in a fit where it is doing this often, performance seems to go down even when it isn’t happening, with pages taking longer to come up. The timer on my PHP scripts still reports low times when this happens, so PHP may not actually be the culprit.

There is nothing showing up in my error logs.

PLEASE do not respond saying you went to my site, clicked a few links, and didn’t see the problem. The last time I had to deal with Dreamhost tech support it was an awful experience; they kept saying “looks OK here” until the problem went away.

This problem is occurring randomly and intermittently. It didn’t occur for me at all on Saturday, yet other users said it did. Chances are that if you go to the site and click a few links it won’t happen to you, but IT IS HAPPENING.

Dreamhost responded saying that it sounds like a problem with my .htaccess file, but also that my .htaccess file looks fine.

I responded asking if that is all they are going to do. They replied that they will forward my request to Level 2 support and that I should also post here, so here I am.

My .htaccess contains several things. I’m not at home right now, so I can’t post the whole thing, but I can later if requested. It contains several little things, like error document links, but mainly mod_rewrite rules for search-engine-friendly links.

The .htaccess file had been working fine for months. I was trying to think of anything that could have changed recently, and I came up with two things:

The first is that this started occurring around the time Dreamhost had the power outage that caused some trouble. Maybe something is misconfigured on the server? Maybe it actually has nothing to do with the .htaccess file?

The second is that I did recently add this to my .htaccess file:

RewriteCond %{REQUEST_URI} .css$
RewriteRule ^(.+)$ css-ssc.php?css=%{REQUEST_URI}&%{QUERY_STRING}

This is supposed to fire only on requests that end in “.css”. I don’t see how this could be causing the server to randomly prompt for download on a link like “www.pixeljunkies.com/forum”, but that is what is occurring.
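(For anyone reading along: the unescaped dot in that condition means it matches any URI whose last four characters are any byte followed by “css”, not just a literal “.css” extension. A tighter version, sketched here but not tested on Dreamhost, escapes the dot and adds [L] so rule processing stops once it matches:)

```
RewriteCond %{REQUEST_URI} \.css$
RewriteRule ^(.+)$ css-ssc.php?css=%{REQUEST_URI}&%{QUERY_STRING} [L]
```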

I’d appreciate any help anyone can offer. The problem is still occurring. The above condition is the only new functionality I’ve added to the site recently, but I don’t see how that rule could affect other pages on the site that do not contain “.css” in their name.

Does anyone know any other good web hosts aside from Dreamhost? I have sites at work on Rapidsite, but they are very expensive. I love Dreamhost, except that the two times I’ve had to use their support they have been completely unhelpful.

Thanks in advance again for any help. This has me pulling my hair out.


#2

I’m guessing here, but I can dimly remember someone suffering a similar experience when I was hosted somewhere else. If I remember correctly, the error was eventually traced to the user’s FTP client. Sometimes, but not all the time, they had been uploading their pages in binary mode by accident. The FTP client was failing to automatically switch in to ASCII mode. To test this, you could deliberately upload the page in binary mode and see if it reproduces the same error.
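A quick local way to see what a wrong-mode transfer can do to a file: an ASCII-mode transfer involving a Windows machine can rewrite Unix line endings as CR/LF, and the damage shows up immediately in checksums and in an octal dump. This is just a local simulation of the effect, not anything Dreamhost-specific:

```shell
# Simulate the line-ending damage a wrong-mode (ASCII) transfer can cause
printf 'line one\nline two\n' > /tmp/unix.txt
sed 's/$/\r/' /tmp/unix.txt > /tmp/mangled.txt   # a CR appended to each line

# Checksums differ, which is the quickest tell when comparing a local
# copy against the uploaded one
md5sum /tmp/unix.txt /tmp/mangled.txt

# The stray carriage returns show up as \r in an octal dump
od -c /tmp/mangled.txt | grep -c '\\r'
```

Comparing checksums of the local file and a re-downloaded copy would tell you in seconds whether the transfer is mangling anything.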

I may be way off base here, but I can’t think of anything else it could be.


Simon Jessey
Keystone Websites | si-blog


#3

I’ve experienced the problem on your site once tonight, and it appears the response headers were not sent (which would force a browser to either treat the response as binary data to be safe, or try to guess the type from the filename in the URL).

It only happened once, though, out of dozens of hits.

:cool: Perl / MySQL / HTML+CSS


#4

Well, that wouldn’t explain why it only happens sometimes, and rarely at that. Plus, it tends to happen in bursts: it might not happen to me for an hour, then happen several times in the space of 5 minutes. Or that could be completely random.

I also tried uploading two similar files, one in binary mode and one in ASCII mode:

http://www.pixeljunkies.com/tests/binary.php
http://www.pixeljunkies.com/tests/ascii.php

And they both seem to work fine.

Thanks for trying though.

OFFTOPIC:

I’ve used your fix for AdSense in XHTML before (http://keystonewebsites.com/articles/adsense.php). It works well, but I sent you an email (which you may not have gotten) explaining that it interferes with Google’s code that detects multiple ad units: if you have multiple ad units using your method, they will all act as if they were the first ad unit. I think you should note this on your page so publishers can be aware of the pitfall.


#5

Hmm, next time it prompts me to download the file I will actually do it and see if it is all there as it should be.

Do you have ANY idea what would cause PHP to not send the proper headers? I’ve tried Googling it and can’t come up with anything. My pages are sent with the following headers:

header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s') . ' GMT');
header('Cache-Control: no-store, no-cache, must-revalidate');
header('Cache-Control: post-check=0, pre-check=0', false);
header('Pragma: no-cache');
header('Content-Type: text/html; charset=utf-8');
header('Vary: Accept');

and all the pages of the forum I use (which the problem also occurs on) only send cache control headers.

I got another email from Dreamhost basically saying “it’s a little bit beyond the scope of our support system to debug your entire site”, which I find silly. They seem sure this is a mod_rewrite problem.

Does anyone have any idea if/how/why mod_rewrite could be causing this problem (even on pages that match no rewrite conditions/rules)?

Thanks again for any and all help!

Next is the contents of my .htaccess file for reference. In the meantime, I’ll try commenting out the only new rule I’ve added recently:

<Files .htaccess>
order allow,deny
deny from all
</Files>

<Files .htpasswd>
order allow,deny
deny from all
</Files>

DirectoryIndex index.php

# parse javascript as php

AddType application/x-httpd-php .js

# no getting directory indexes

Options -Indexes
Options +FollowSymlinks

# make pretty links

RewriteEngine On
RewriteRule ^comic/([0-9]+)/ comic.php?comic=$1 [NC]
RewriteRule ^comic/([0-9]+) comic.php?comic=$1 [NC]
RewriteRule ^comic/latest/ comic.php?comic=latest [NC]
RewriteRule ^comic/latest comic.php?comic=latest [NC]
RewriteRule ^comic/ comic.php [NC]
RewriteRule ^comic$ comic.php [NC]
RewriteRule ^accessibility/ accessibility.php [NC]
RewriteRule ^accessibility accessibility.php [NC]
RewriteRule ^archive/author/([0-9]+)/ archive.php?author=$1 [NC]
RewriteRule ^archive/author/([0-9]+) archive.php?author=$1 [NC]
RewriteRule ^archive/postauthor/([0-9]+)/ archive.php?postauthor=$1 [NC]
RewriteRule ^archive/postauthor/([0-9]+) archive.php?postauthor=$1 [NC]
RewriteRule ^archive/post/([0-9]+)/ archive.php?post=$1 [NC]
RewriteRule ^archive/post/([0-9]+) archive.php?post=$1 [NC]
RewriteRule ^archive/ archive.php [NC]
RewriteRule ^archive archive.php [NC]
RewriteRule ^art/home/ art/home.php [NC]
RewriteRule ^art/home art/home.php [NC]
RewriteRule ^change-style/ change-style.php [NC]
RewriteRule ^change-style change-style.php [NC]
RewriteRule ^privacy/ privacy.php [NC]
RewriteRule ^privacy privacy.php [NC]
RewriteRule ^home/ index.php [NC]
RewriteRule ^home index.php [NC]
RewriteRule ^donate/ donate.php [NC]
RewriteRule ^donate donate.php [NC]
RewriteRule ^rss/comics.xml http://feeds.feedburner.com/PixelJunkiesComics [NC]
RewriteRule ^rss/posts.xml http://feeds.feedburner.com/PixelJunkiesPosts [NC]
RewriteRule ^rss/newtopics.xml http://feeds.feedburner.com/PixelJunkiesForumNewTopics [NC]
RewriteRule ^rss/activetopics.xml http://feeds.feedburner.com/PixelJunkiesForumActiveTopics [NC]
RewriteRule ^rss/ rss.php [NC]
RewriteRule ^rss rss.php [NC]

# CSS rules

RewriteCond %{REQUEST_URI} .css$
RewriteRule ^(.+)$ css-ssc.php?css=%{REQUEST_URI}&%{QUERY_STRING}

# make friendly error pages

ErrorDocument 400 /error.php?error=400
ErrorDocument 401 /error.php?error=401
ErrorDocument 403 /error.php?error=403
ErrorDocument 404 /error.php?error=404
ErrorDocument 405 /error.php?error=405
ErrorDocument 406 /error.php?error=406
ErrorDocument 407 /error.php?error=407
ErrorDocument 408 /error.php?error=408
ErrorDocument 409 /error.php?error=409
ErrorDocument 410 /error.php?error=410
ErrorDocument 411 /error.php?error=411
ErrorDocument 412 /error.php?error=412
ErrorDocument 413 /error.php?error=413
ErrorDocument 414 /error.php?error=414
ErrorDocument 415 /error.php?error=415
ErrorDocument 416 /error.php?error=416
ErrorDocument 417 /error.php?error=417
ErrorDocument 500 /error.php?error=500
ErrorDocument 501 /error.php?error=501
ErrorDocument 502 /error.php?error=502
ErrorDocument 503 /error.php?error=503
ErrorDocument 504 /error.php?error=504
ErrorDocument 505 /error.php?error=505
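As a side note for anyone reading along: each pair of rules above that differs only by the trailing slash could be collapsed with an optional /? and an end anchor, plus [L] so later rules are skipped once one matches. A sketch (untested against this site, and slightly stricter than the originals, which are not end-anchored and so also match longer paths):

```
RewriteRule ^comic/([0-9]+)/?$ comic.php?comic=$1 [NC,L]
RewriteRule ^comic/latest/?$ comic.php?comic=latest [NC,L]
RewriteRule ^archive/author/([0-9]+)/?$ archive.php?author=$1 [NC,L]
```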


#6

I believe it has to do with gzip encoding. I saved the content to disk, and it was binary data. But halfway through the file was a set of HTTP headers, and one of them indicated the Content-Encoding was gzip.

So apparently the server is sending out gzip-compressed content, then a set of HTTP headers, then gzip-compressed content again. Naturally, a browser won’t know what to do if it doesn’t receive the headers first.

gzip encoding for PHP scripts is something I’ve seen people get wrong before: saying the content is compressed when it really isn’t, compressing it twice because they don’t realize PHP was going to compress it anyway, or coming up with bad code that seems to work in Internet Explorer but breaks in Mozilla browsers. If I were you, I would definitely check the PHP scripts and make sure they are not trying to compress the content themselves. Otherwise, I’d suggest checking what kind of PHP you are running (PHP-Apache or PHP-CGI, or a version you built yourself) and seeing if there are known bugs in its gzip handling.
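You can reproduce and spot this symptom offline. Every gzip stream starts with the magic bytes 1f 8b, so a saved response that mixes compressed data with a stray header block is easy to dissect (a sketch using synthetic data, not the actual saved page):

```shell
# Reconstruct the symptom: gzip data, then HTTP headers, then gzip data again
printf 'hello\n' | gzip -c > /tmp/part1.gz
printf 'world\n' | gzip -c > /tmp/part2.gz
{ cat /tmp/part1.gz
  printf 'Content-Type: text/html; charset=utf-8\r\nContent-Encoding: gzip\r\n\r\n'
  cat /tmp/part2.gz
} > /tmp/sample.bin

# A gzip stream starts with the magic bytes 1f 8b
head -c 2 /tmp/sample.bin | od -An -t x1

# Plain-text headers buried inside the binary show up with grep -a
grep -a -o 'Content-Encoding: gzip' /tmp/sample.bin
```

Running the same two checks against a real saved page would confirm whether it starts with a gzip stream and whether header text is embedded mid-file.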

:cool: Perl / MySQL / HTML+CSS


#7

I saved a page before I noticed your reply and observed the same thing. I have saved it here for people to look at:

http://www.pixeljunkies.com/tests/sample.html

So, I’ve been thinking this through, and here are my thoughts.

I am indeed using ob_start('ob_gzhandler'); in my PHP script to gzip the file, and my header() calls do occur in my script after this. However, no content is output until after my headers, PHP doesn’t send its headers until the first content is output, and the page worked fine for six months set up like that. So I highly doubt that the gzhandler occurring before my header()s is the problem, though I would gladly switch them around to test it.

Here’s where it gets interesting. The page I linked to above, sample.html, is the .php file I was prompted to download when visiting a page on my site. The page, however, was in my passworded stats section. For stats I am using BBClone (http://bbclone.de/). So, I searched through all of BBClone’s files, and it IS NOT using ob_gzhandler.

This got me thinking: why the hell is it gzipped content? Looking at the headers appearing in the middle of sample.html, I searched BBClone’s source to see how it sends its headers, and they are not the same.

Hmmm. So, if you remember from before, the only change I’ve made in months is the following addition to my .htaccess:

RewriteCond %{REQUEST_URI} .css$
RewriteRule ^(.+)$ css-ssc.php?css=%{REQUEST_URI}&%{QUERY_STRING}

which routes all my CSS through a PHP file for processing. I opened up my css-ssc.php script and looked at how it sends its headers: again, no match to sample.html.

However, I have a script in /scripts/header.php, that I include at the top of all my pages in my site (except pages from external scripts, like the forum or stats).

In my header.php, I specify the following headers:

header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s') . ' GMT');
header('Cache-Control: no-store, no-cache, must-revalidate');
header('Cache-Control: post-check=0, pre-check=0', false);
header('Pragma: no-cache');
header('Content-Type: text/html; charset=utf-8');
header('Vary: Accept');

which appear to match exactly the headers in the middle of sample.html.

So, this is absolutely confusing to me. I am in no way including my header.php in the stats package, which generated the page that sample.html comes from. How could this be happening? Does this really have anything to do with mod_rewrite like Dreamhost says?

Thanks for your help, I feel good that I am at least getting somewhere.

My plan for now is to save any more pages that I get prompted to download and analyze them more. I can’t get the server to do it right now, even though I just tried like two hundred pages.

Perhaps then I’ll have some more information. Maybe I’ll try posting on other forums, like the SitePoint forums.


#8

Alright, I got a forum page to spit out the error and have uploaded it here:

http://www.pixeljunkies.com/tests/viewforum.html

It’s the same sort of thing, although in this case the headers do match what the forum software is sending:

header('Expires: Thu, 21 Jul 1977 07:30:00 GMT'); // When yours truly first set eyes on this world! :slight_smile:
header('Last-Modified: '.gmdate('D, d M Y H:i:s').' GMT');
header('Cache-Control: post-check=0, pre-check=0', false);
header('Pragma: no-cache'); // For HTTP/1.0 compatibility

And yes, the forum is being sent gzipped.

Are we sure these binary pages are gzipped? I’ve tried renaming them .gzip or .gz and extracting them, with no luck. Is the gzipping used by PHP different from the archive format?
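For what it’s worth, ob_gzhandler produces the same gzip format that gunzip understands (it can fall back to raw deflate when the browser only advertises deflate, which gunzip will not accept), but renaming the whole saved file to .gz fails whenever other bytes precede the gzip stream. A sketch of carving out the stream by finding the 1f 8b magic number first, using a synthetic file and assuming GNU grep:

```shell
# Demo file: stray header bytes, then a real gzip stream (mimicking the saved pages)
{ printf 'X-Stray-Header: junk\r\n\r\n'; printf 'hello\n' | gzip -c; } > /tmp/saved.bin

# gunzip refuses the whole file because it does not start with the gzip magic,
# so find the byte offset of the first 1f 8b pair (GNU grep: -b gives the offset)
offset=$(grep -abo $'\x1f\x8b' /tmp/saved.bin | head -n1 | cut -d: -f1)

# Carve from that offset onward (tail -c +N is 1-based) and decompress
tail -c +$((offset + 1)) /tmp/saved.bin | gunzip
```

Applied to one of the real saved pages, this would tell you whether the binary chunks are valid gzip streams at all.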

And I am getting no closer to an answer. Why is this occurring only sometimes? Why didn’t it occur for six months? Did Dreamhost change something on the server? Is it mod_rewrite related? Where do I go from here?

Do I try turning off gzipping? Do I try modifying my mod_rewrite rules (which are essentially required by some of my pages)?

Verifying that the issue is fixed will take some time, as I’ve had ~1.5 days go by without experiencing it.

My mind!


#9

I’m just throwing an idea out, but perhaps it is the CSS rule. If everything was OK until you added it (around the time of DH’s power outage, too), then it may well be the cause. You could also rearrange your code so it sets headers before it starts gzipping; who knows, it might be trying to gzip first. Or turn gzip off.

Anyway, it might only be happening occasionally because that’s when your browser is fetching another copy of the CSS instead of using its cache. So if there is a problem with your CSS rule, every once in a while the browser will get redirected and receive a ‘bad’ copy of the CSS file. Also, is your CSS file being gzipped as well? I’m not sure browsers can cope with gzipped CSS.
Just throwing out ideas =)


#10

Thanks for the brainstorming.

I think I might end up commenting out the CSS rewrite rule just as a shot in the dark, but I really don’t think it could be causing it.

It’s hard to know if adding them is what caused it, because this error is so unpredictable and random, and it will be hard to know when it is truly fixed.

If it was messing up on the CSS file, then the CSS file would be the one prompted for download, not the PHP file. Unless I know way less about the internet than I think.

I had already considered that the occasional occurrence of this issue might be related to caching, but clearing my cache completely does not increase the odds of it happening.

My CSS is gzipped; that is one of the things my CSS-handling PHP script does, and browsers cope with it perfectly:
http://www.fiftyfoureleven.com/sandbox/weblog/2004/jun/the-definitive-css-gzip-method/

Thanks for your ideas, I really do appreciate it.

If I can’t find anyone else who can give me an idea of what to tackle with this, I will probably try:

  1. Removing the CSS rewrite rule.
  2. Rearranging my scripts so all headers are sent before gzipping.
  3. Stopping the use of gzip completely.

In that order, one step at a time for a few days each to see if it makes a difference.


#11

As far as I can tell, that only happens in the event that you are using the same kind of ad unit each time. For example, the ads on this page are different.


Simon Jessey
Keystone Websites | si-blog


#12

As an aside, note that DH has mod_gzip installed, so you really don’t need to compress your output yourself.

kchrist@dreamhost:~>$ lynx -head -dump http://www.pixeljunkies.com/
HTTP/1.1 200 OK
Date: Fri, 23 Sep 2005 16:15:57 GMT
Server: Apache/1.3.33 (Unix) DAV/1.0.3 mod_fastcgi/2.4.2 mod_gzip/1.3.26.1a PHP/4.3.10 mod_ssl/2.8.22 OpenSSL/0.9.7e

If you want useful replies, ask smart questions.


#13

I am not familiar with mod_gzip. Do I need to do anything to enable it, or does it gzip content automatically? And how does it decide what to gzip? By MIME type? Would it gzip something sent as application/xhtml+xml or text/css?


#14

Okay, maybe this will reveal something. I have switched off the CSS mod_rewrite rules, seeing how Dreamhost is so sure this is a mod_rewrite problem. I won’t know for days whether this fixes it, because the problem is so intermittent.

However, before I turned it off I was prompted to download the following page:

http://www.pixeljunkies.com/forum

which is supposed to just 301-redirect to:

http://www.pixeljunkies.com/forum/

So I downloaded it, and this is what it gave me:
http://www.pixeljunkies.com/tests/forum.html

If you look at the source, you can see the whole 301 page is there, but there is still all this binary data and these headers before it.

Now, I don’t have any ErrorDocument directives for 301 responses. This page does not pass through a single one of my scripts, and does not match a single one of my rewrite rules. Can this tell us anything about the problem?


#15

Well, as I mention in a thread of my own, there seems to be something wrong lately with the ASCII vs. binary mode setting when FTPing to/from Dreamhost, so maybe that has something to do with your problem.

You don’t mention what browser you’re using; that makes a difference, since some (e.g., Mozilla/Firefox) are meticulously standards-compliant in that they’ll do whatever the MIME type header says, while others (MSIE and Opera) will second-guess the MIME headers in some cases and do what they think you really want. Both approaches can cause problems in some cases; the standards-compliant behavior fails in cases where the site is misconfigured and sends the wrong MIME type, while the second-guessing can produce unpredictable failures when the browser guesses wrong.

– Dan


#16

The mod_gzip site is at http://www.schroepl.net/projekte/mod_gzip/index.htm

How it works with Apache is described at http://www.schroepl.net/projekte/mod_gzip/install.htm, under the heading “The integration of mod_gzip into Apache’s evaluation of a request”

An example configuration file with the rules is at http://www.schroepl.net/projekte/mod_gzip/config.htm

We may need to ask support if DreamHost is using different rules than shown here.
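For reference, a typical mod_gzip 1.3 rule set looks something like the following. This is an illustrative sketch based on the example configuration linked above; DreamHost’s actual rules may well differ, which is exactly why the text/css and application/xhtml+xml question needs an answer from support:

```
mod_gzip_on Yes
mod_gzip_dechunk Yes
mod_gzip_minimum_file_size 300
mod_gzip_item_include mime ^text/html$
mod_gzip_item_include mime ^text/plain$
mod_gzip_item_exclude mime ^image/
```

Whether a given Content-Type gets compressed depends entirely on which mod_gzip_item_include/exclude rules the host configured.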

:cool: Perl / MySQL / HTML+CSS