Block for Google Code Search?

I got this message from a Joomla forum admin. Would be interested in your thoughts.

It has come to our attention that Google has released a new product,
Google Code Search, that is capable of indexing and crawling through
archive files stored in the public directories of web servers. We are
reporting this as a security advisory because we have discovered that some
site administrators are storing archives / backups of their website in the
web root. Because of this, Google Code Search is able to crawl the
archives and read unparsed PHP files as if they were plain text. This has
resulted in the disclosure of some sensitive information including MySQL
passwords and SMTP credentials.

Read more at [,101880.0.html]

This is just common sense, really. You should not store anything in a publicly accessible web directory that you do not want seen.

If you don’t store this stuff in a directory that is accessible from the web, Google Code Search becomes a moot point. :wink:

Given that many Joomla! users have precious little understanding of how web serving actually works (I mean, that is at least a small part of the whole point of such CMSs), it is probably good that the Joomla enthusiasts warn of this, but the warning really should not be necessary, as such things shouldn’t be stored that way in any event.

Google Code Search does not reveal anything that couldn’t be revealed without its use - it just makes it easier :wink:

Rules to follow:

  1. Don’t place it in a web-accessible directory if you don’t want it to be reachable from the web.

  2. Make sure there is an index.html (or other page served by default) in every web-accessible directory whose contents you don’t want browsed.

  3. Use Apache authentication (.htaccess), or another protection mechanism, if you want to restrict public, or robot, access to a web-accessible directory.

There are a myriad of other things you can consider, such as the use of robots.txt, other .htaccess restrictions, rewrite rules, etc. - but the three above are the ones you really need.
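
For anyone who wants to check rules 1 and 2 on their own server, here is a rough Python sketch, not a definitive tool. It walks a web root and reports files that look like archives or backups, plus directories with no default index page. The suffix list, the index file names, and the function name audit_web_root are all my own assumptions, so adjust them to match your backups and your server's configuration.

```python
import os

# Assumed archive/backup suffixes; extend this to match how you name backups.
ARCHIVE_SUFFIXES = (".zip", ".tar", ".tgz", ".gz", ".bz2", ".sql", ".bak")

# Assumed default index pages; match these to what your server actually serves.
INDEX_NAMES = {"index.html", "index.htm", "index.php"}


def audit_web_root(web_root):
    """Return (archives, unindexed_dirs) found under web_root.

    archives: paths of files whose names end in an archive/backup suffix.
    unindexed_dirs: directories that contain none of the INDEX_NAMES files.
    """
    archives = []
    unindexed_dirs = []
    for dirpath, _dirnames, filenames in os.walk(web_root):
        # Rule 1: flag anything that looks like an archive or backup.
        for name in filenames:
            if name.lower().endswith(ARCHIVE_SUFFIXES):
                archives.append(os.path.join(dirpath, name))
        # Rule 2: flag directories without a default index page.
        if not INDEX_NAMES.intersection(filenames):
            unindexed_dirs.append(dirpath)
    return sorted(archives), sorted(unindexed_dirs)
```

Calling audit_web_root("/var/www/html") (a hypothetical path) returns the two lists. It only looks at file names on disk, so it says nothing about rule 3 or about what your server configuration actually allows.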


The simple (and obvious) answer to that is “Don’t store sensitive information in a web-accessible location”. You shouldn’t be doing this anyway, whether Google Code Search exists or not.

Very good advice.

There are so many other crawlers out there these days, not all of them innocent, that sensitive data should never be stored in a web-accessible area. The existence of Google Code Search doesn’t change this.


If you have EVER made the mistake of having your archives stored in a web-accessible directory, after correcting that problem, make sure you change your Joomla administrator password immediately. While Google has gone to the minimal trouble of hiding sensitive data identified with the keyword “password”, there are other glaring issues.

Example results from Google Code Search

$mosConfig_password = '...password obscured...';

Anyone who knows where to find this sort of info, and how to crack simple MD5 hashes, can easily gain access to other “_secret” information stored in the same place. If this doesn’t make any sense to you, just CHANGE YOUR PASSWORD. I am not trying to make life easier for would-be 1337 |-|8x0r5.
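
If you are not sure whether an archive you once left in the web root contained this kind of setting, something like the following Python sketch can list the suspicious variable names without printing their values. It is only an illustration: the regular expression (PHP variables whose names mention password, secret, or smtp) and the function names are my own assumptions, and it handles zip and tar archives only.

```python
import re
import tarfile
import zipfile

# Assumed pattern: PHP assignments to variables whose names mention
# "password", "secret", or "smtp". Only the variable name is captured.
CREDENTIAL_PATTERN = re.compile(
    rb"\$(\w*(?:password|secret|smtp)\w*)\s*=", re.IGNORECASE
)


def iter_archive_members(path):
    """Yield (member_name, content_bytes) for regular files in a zip or tar."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield info.filename, zf.read(info)
    elif tarfile.is_tarfile(path):
        with tarfile.open(path) as tf:
            for member in tf:
                if member.isfile():
                    yield member.name, tf.extractfile(member).read()


def find_exposed_settings(path):
    """Return sorted (member_name, variable_name) pairs found in .php members."""
    hits = set()
    for name, content in iter_archive_members(path):
        if not name.lower().endswith(".php"):
            continue
        for match in CREDENTIAL_PATTERN.finditer(content):
            hits.add((name, match.group(1).decode("ascii")))
    return sorted(hits)
```

Any hit means the corresponding password or secret should be treated as disclosed and changed, whatever the value was.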