Traffic Report


#1

Before we switched to Dreamhost, I had been using a free Bravenet counter to measure our traffic. We run a fairly large multi-section webzine, so a simple hit counter on the front page wasn’t doing a good job. I expected its number to be below our real traffic figures, and now that we are on Dreamhost, we have been keeping an eye on their stat reports. But the difference between the Bravenet front-page counter and the “Distinct Hosts served” figure on the Dreamhost report is vast. I understand that this figure reports how many individual hosts have visited the site, and that it is usually lower than the number of actual human visitors because of proxy servers from AOL and the like. I also understand that it covers the whole site and not just a single page, so it should be higher thanks to folks who find us through Google searches and links to individual stories. But the number is still big enough for me to question my understanding of it. What do you folks on the board use in the Dreamhost stats to measure your traffic?


#2

A lot of the net is behind proxy servers.

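When someone hits a page from MSNTV, for instance, the request for the HTML typically comes through one proxy server and the request for the graphic at the top of the page through another, so a single visitor can show up as more than one host in your logs.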

Of course, when someone else on MSNTV (the former WebTV) hits your site an hour later, your site never finds out about it. They are served the copy that MSNTV cached, unless you prevent caching by attaching a cookie or using some sort of cache control header or meta tag.
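
For anyone wondering what "some sort of cache control header" might look like in practice, here is a minimal sketch in Python. The header names are standard HTTP, but the tiny WSGI app around them (app, NO_CACHE_HEADERS, the page body) is made up for illustration, and whether any particular proxy actually honors these headers is not something this can promise.

```python
# Standard HTTP response headers that ask caches not to reuse the page.
# Proxies are not obliged to behave, so treat this as a request, not a guarantee.
NO_CACHE_HEADERS = [
    ("Cache-Control", "no-store, no-cache, must-revalidate"),
    ("Pragma", "no-cache"),   # for old HTTP/1.0 caches
    ("Expires", "0"),         # an invalid date is treated as already expired
]


def app(environ, start_response):
    """Hypothetical WSGI app that serves one page with caching disabled."""
    body = b"<html><body>Front page</body></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ] + NO_CACHE_HEADERS
    start_response("200 OK", headers)
    return [body]
```

The trade-off is that every view then has to come back to your server, which makes the logs more complete but the pages slower and the bandwidth bill bigger.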

AOL users sit behind caching proxy servers, too, but unlike MSNTV, AOL’s proxies consider cookies to be part of the file, and they will give the cookies to parties they weren’t intended for. Never overestimate the intelligence of AOL or Microsoft.

And then there are corporate firewalls. A good deal of my traffic appears to be bored workers on 3-hour coffee breaks. Corporate IT generally has trouble staying out of its own way when it comes to the net; they’re more concerned with keeping the sound card in the CEO’s PC functioning correctly.

Since raw log files are readily available, it’s no big deal to write a Perl script to parse them and generate whatever statistic you want. Within limits, those statistics may even be useful to you and to others.
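
As a rough idea of what such a script does, here is a sketch in Python rather than Perl. It assumes the raw logs are in the usual Apache common/combined layout, which may not match what your host writes. The names (LOG_LINE, summarize, SAMPLE_LINES) and the sample log lines are invented for the example. It counts distinct hosts across the whole site and, separately, successful hits on the front page, which are the two numbers being compared in the original question.

```python
import re
from collections import Counter

# Assumed layout: host ident user [time] "METHOD path protocol" status bytes ...
# Anything after the byte count (referrer, user agent) is ignored.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def summarize(lines):
    """Count distinct hosts and successful (status 200) hits per path."""
    hosts = set()
    hits_by_path = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        hosts.add(match.group("host"))
        if match.group("status") == "200":
            hits_by_path[match.group("path")] += 1
    return {
        "distinct_hosts": len(hosts),
        "front_page_hits": hits_by_path["/"],
        "hits_by_path": hits_by_path,
    }


# Invented sample data, only to show the shape of the input.
SAMPLE_LINES = [
    '10.0.0.1 - - [01/Jan/2005:10:00:00 -0800] "GET / HTTP/1.1" 200 5120',
    '10.0.0.2 - - [01/Jan/2005:10:00:01 -0800] "GET /images/banner.gif HTTP/1.1" 200 20480',
    '10.0.0.3 - - [01/Jan/2005:10:05:00 -0800] "GET /stories/review.html HTTP/1.0" 200 7300',
    '10.0.0.1 - - [01/Jan/2005:10:06:00 -0800] "GET /missing.html HTTP/1.1" 404 310',
]

report = summarize(SAMPLE_LINES)
print(report["distinct_hosts"], "distinct hosts,",
      report["front_page_hits"], "front page hits")
```

To run it on real data you would pass an open log file instead of SAMPLE_LINES. Note that in the sample the first two lines could easily be one person behind two proxy servers, and the script has no way to know that.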

Just remember, the map is not the territory.