Best Practices/New Site

I am not going live yet, probably not for a (long) while. If I am right, my site will grow in popularity.

I am planning to:

  • Validate pages for proper HTML and CSS
  • No frames
  • Opt-in confirmation per the TOS
  • Parse all user submitted info (no tags, etc.)
  • Test on several browsers (Firefox, IE, ??)
  • Collect and review statistics regarding bandwidth, disk, and SQL usage

What else should I do so I build a “clean” site, and also have a heads-up if and when my site no longer “plays well with others”?


You may also want to add accessibility to the list: basically, ensure the site plays nice with screen readers, text browsers, and anything else that visually impaired visitors rely on.

Choose which HTML and CSS standards you want to adhere to. There is a BIG difference between HTML 4.01 and XHTML 1.0 Transitional or XHTML 1.1. The stricter variants are “cleaner”, but rely on CSS for nearly all styling, which can be hard to accomplish if you are used to HTML 4 or nonvalidating webpages.

Firefox and IE are good candidates to test on, though I’d recommend you test Opera and Safari as well, if possible.

“Parse all user submitted info” can mean many things. Try to keep it in a default-deny state, i.e. deny everything unless it’s specifically allowed (that goes for tags, attributes, shell characters, SQL characters, etc.). If that’s not possible, be sure to escape wherever needed.
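To make the default-deny idea a bit more concrete, here is a minimal sketch in Python. The username policy, the `comments` table, and the function names are all made up for illustration; the point is only that input is rejected unless it matches an allowlist, and that user text is bound as a query parameter rather than pasted into the SQL.

```python
import re
import sqlite3

# Assumed policy for this sketch: letters, digits and underscore, 3 to 20 characters.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")

def is_valid_username(value: str) -> bool:
    """Default-deny: accept only input that matches the allowlist in full."""
    return USERNAME_RE.fullmatch(value) is not None

def save_comment(conn: sqlite3.Connection, username: str, comment: str) -> bool:
    """Store a comment, or return False if the username is not allowed."""
    if not is_valid_username(username):
        return False
    # The ? placeholders keep user text out of the SQL statement itself.
    conn.execute(
        "INSERT INTO comments (username, body) VALUES (?, ?)",
        (username, comment),
    )
    return True

# Hypothetical schema, in memory so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (username TEXT, body TEXT)")
```

The comment body is stored as-is here; it still has to be escaped when it is written back into a page.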

Make it a double opt-in, and be sure to throttle repeated requests on the website, if you have to have that at all. Opt-in is useless if I can use your site to spam some email address tens of times a second, and the problem is even worse if you allow “custom” messages. It’s probably a very good idea to avoid “send this to a friend” kind of buttons altogether; if they REALLY want to send it to their friends, they can bloody well open their mail program or instant messenger. Just make sure the address bar properly reflects where you are at all times.
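Throttling can be as simple as counting recent requests per key (an email address or an IP, say). The sketch below is one possible in-memory version; the `Throttle` class and its limits are invented for illustration, and a real site would more likely keep the counters in its database or a shared cache.

```python
import time
from collections import defaultdict, deque
from typing import Optional

class Throttle:
    """Allow at most max_requests per key within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

# Example policy (an assumption): at most 2 confirmation mails per address per hour.
confirmation_throttle = Throttle(max_requests=2, window_seconds=3600)
```

Calling `confirmation_throttle.allow(address)` before sending a confirmation mail would then refuse the third request for the same address within the hour.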

Consider adding RSS feeds. They have many uses, and your users will thank you for it.

Avoid intrusive and annoying ads and marketing. It’s probably a good idea to CLEARLY mark advertisements (by separating them from your own content in style and/or location) and to avoid inundating readers with ads. Really, really avoid pointlessly splitting articles into 20 or 30 parts just to get more ad views, and avoid “intelligent links” kind of ads (if I encounter those on any page, it’s pretty much automatically closed; no need to read anything that’s intended purely to drive traffic and not to inform).
Yes, these may reduce your click rate. Too bad. Better a much larger and loyal readership than a few cheap clicks.

Don’t forget a proper robots.txt. Think about whether you want search engines to index everything, archivers to archive everything, etc.
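One way to sanity-check a robots.txt before publishing it is to run it through Python's standard `urllib.robotparser`. The rules below are only an example: the `/admin/` and `/drafts/` paths are invented, and `ia_archiver` stands in for whichever archiver agent you care about.

```python
from urllib.robotparser import RobotFileParser

# Example rules: everyone stays out of /admin/ and /drafts/,
# and one archiver is asked to stay away entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /drafts/

User-agent: ia_archiver
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_fetch(agent: str, path: str) -> bool:
    """Return True if the rules above let this agent request the path."""
    return parser.can_fetch(agent, path)
```

Keep in mind that robots.txt is only a request to well-behaved crawlers; it does not protect anything.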

If possible, avoid URLs with session IDs. Those are much better placed in cookies or POST arguments; sticking them into the URL will just lead to people copying links, session ID included, into mails and instant messages.
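As a rough illustration of the cookie route, this sketch builds a `Set-Cookie` header with Python's standard `http.cookies` module. The cookie name `session_id` and the helper function are assumptions; most frameworks do this for you.

```python
import secrets
from http.cookies import SimpleCookie

def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie header so the session ID never appears in the URL."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True  # not readable from page scripts
    cookie["session_id"]["secure"] = True    # only sent over HTTPS
    return cookie.output(header="Set-Cookie:")

# A random, hard-to-guess ID; the length is an arbitrary choice for this sketch.
session_id = secrets.token_urlsafe(32)
header = session_cookie_header(session_id)
```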

Be up to date on current security best practices. For instance, you should probably know what Cross Site Scripting is.
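For Cross Site Scripting in particular, the basic defence is to escape user-supplied text at the moment it is written into a page. A minimal sketch with Python's standard `html.escape` (the `render_comment` helper and its markup are invented for illustration):

```python
import html

def render_comment(author: str, body: str) -> str:
    """Escape user-supplied text before it is placed into HTML."""
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(body))

# A script tag submitted as a comment comes out as harmless text.
rendered = render_comment("eike", "<script>alert(1)</script>")
```

This only covers text placed between tags or inside quoted attributes; values that end up in URLs, scripts, or stylesheets need their own escaping rules.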

Have a plan for when things go wrong. The more complex your site, the harder it is. Things that can easily go wrong:

  • security breach (have backups and be able to shut down (parts of) the site if needed, and inform your visitors about the problem, possibly with an ETA).
  • performance issues; if you get hit by slashdot, digg, and msnbc at once and all your pages are highly dynamic and regenerated on every page hit, you will hit a problem. It helps to be able to move to a lightweight mode, or have certain parts of the site cached as a static version. There are some independent site performance measurement companies out there that regularly test the response-time of your site. They can help with that.
  • Be sure to have a working abuse@ address if your site contains user-generated content. Be sure to read mail to that account in a timely manner. If you send out mail, make damn sure that you accept abuse complaints via email and not via a web form. If you have mail functions for users on the site, let them know in EVERY mail how to unsubscribe – and don’t make them re-enter their email-address on your site. Either provide an in-band means (via email unsubscribe messages), or specially coded URLs to do this without entering their email address. Spammers love to do that the other way around to harvest more (active) mail addresses.
  • Legal issues. Be sure about the legality of your site and what may go wrong, legally. It might be a good idea to have a DMCA contact, for instance, and an easily reachable and servable postal address (this may be taken care of by a well-maintained WHOIS entry). If you are not from the US, be sure to know your local laws; for instance, privacy laws in many EU countries are a LOT stricter than the (almost nonexistent) US ones. It’s probably a good idea to know who to call when you get into a sticky legal mess.

If your site becomes very popular, have a plan for the future. Once you outgrow a shared hosting account, it would be good for the site to be flexible enough to be moved to a dedicated server or a server farm without too many problems.

As for SQL usage: it helps to analyze your SQL statements. The number of them performed or their concurrency alone do not really matter as much; if your query creates HUGE intermediary tables or takes an extremely long time to run due to missing or useless indices, you should probably fix that. Sometimes, two queries can be performed a lot faster than one query that tries to do two things at once (with fancy GROUP BY and HAVING clauses, for instance).
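Most databases can tell you how they intend to run a query (MySQL has `EXPLAIN`, for example). The sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module purely because it runs anywhere; the `posts` table and index name are invented. It shows the same query before and after adding an index.

```python
import sqlite3

# Hypothetical table, in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")

def query_plan(sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan for a query; the last column holds the description."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

QUERY = "SELECT title FROM posts WHERE author = ?"

# Without an index on author, SQLite has to scan the whole table.
before = query_plan(QUERY, ("eike",))

# With the index, the plan switches to an index search.
conn.execute("CREATE INDEX idx_posts_author ON posts (author)")
after = query_plan(QUERY, ("eike",))
```

The exact wording of the plan differs between SQLite versions, but `before` reports a scan of `posts` while `after` names `idx_posts_author`.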

Make sure to look at your page in different resolutions: 800x600 for smaller screens, 1024x768 (still extremely common), and 19??x14?? for some laptops. Don’t be afraid to look at widescreens as well. Your target audience might use PDAs and cellphones, too.

Don’t forget graceful degradation issues. If the browser being used does not support CSS all that well, or has trouble understanding some HTML constructs, make sure the page is still readable. It doesn’t have to look perfect, but it should still be usable. A good way to test some of this is to look at it in a text-only browser such as lynx (or links, if you want ecmascript and css support) or play with some browser settings that disable or override CSS or style issues.

If you decide to restructure your site or move to different engines, make an effort to keep old URLs valid. Users who follow a link placed on another site years ago and can’t find the linked-to material end up frustrated, which doesn’t help anybody, especially if the material “just” moved to a different place (mod_rewrite can help with that, in many cases).
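If the redirects end up in application code rather than in mod_rewrite, the idea is the same: a table of old URL patterns and their new homes, answered with a permanent (301) redirect. A small sketch, with both URL schemes invented for illustration:

```python
import re
from typing import Optional, Tuple

# Hypothetical old -> new URL patterns after a site restructure.
REDIRECT_RULES = [
    (re.compile(r"^/article\.php\?id=(\d+)$"), r"/articles/\1"),
    (re.compile(r"^/news/(\d{4})/(\d+)\.html$"), r"/archive/\1/\2"),
]

def resolve_legacy_url(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_location) for a known old URL, or None if it is not one."""
    for pattern, target in REDIRECT_RULES:
        if pattern.match(path):
            return 301, pattern.sub(target, path)
    return None
```

A 301 also tells search engines to update their index, which a temporary redirect would not.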

There are some decent analytics out there; be sure to sign up for the Google Webmaster tools to see how they index your site, and what search terms get people to your site. If you want analytics of your viewership that go beyond access log parsing, Google Analytics or similar are decent; though avoid more than one of those services on a page.

Finally, figure out what your users want from your site, and make those things easy (if it fits with your vision of the site). The more satisfied your users are with their user experience, the more likely they are going to come back or recommend it to others.

Just some random best-practice thoughts :slight_smile:

Nice post.

If you use Firefox, the Web Developer plug-in allows you to disable things like CSS and JavaScript with a few easy clicks.

One common practice that might not look right in that test would be some of the layouts geared towards search engines, even though they look right to users (with CSS enabled, of course).

For example, using CSS for positioning a header, footer, side menu and main content section–but putting the body of the page first in the source. Depending on your layout, the difference might not be a big deal.

Personally, I often do it that way and don’t worry about it, since just about everyone will see it the right way. Those that can’t will probably just have to scroll a little for the menu.

That was a great post…with a lot of useful and valuable information. Thanks for taking the time to post that! :slight_smile:


Great post eike, thanks for taking the time to write it. It gave me a couple of ideas that I need to consider.
