One way to do this would be to search your access log for the scraper's user agent string using grep. For example:
grep "FooScraper" logs/yourdomain.com/http/access.log
(...assuming your domain is "yourdomain.com", you're looking for requests made by a client that identifies itself as "FooScraper", and the scraping happened that day. There are older log files in that same directory you can look inside of, too, and you can even view failed requests inside of error.log.)
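If you want a summary rather than a wall of raw log lines, something like the following might help. It's only a rough sketch: scraper_paths is a made-up helper name, and it assumes your access log is in the common Apache "combined" format, where the requested path is the 7th whitespace-separated field. Check a line of your own log before trusting the output.

```shell
# Hypothetical helper (not a DreamHost tool): list the paths requested by a
# given user agent, with a hit count for each, most-requested first.
# Assumption: Apache "combined" log format, so the request path is field 7.
scraper_paths() {
    agent="$1"
    shift
    # -F treats the agent as a literal string, -h drops file name prefixes
    # so the field numbering stays the same when several files are given.
    grep -h -F -- "$agent" "$@" | awk '{ print $7 }' | sort | uniq -c | sort -rn
}
```

You'd call it as: scraper_paths "FooScraper" logs/yourdomain.com/http/access.log (you can pass more than one log file after the agent name). If the older logs in that directory are compressed, plain grep won't read them, so you'd need to decompress them first or use zgrep instead.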
Honestly, the best way to do it is to simply try viewing those files using a web browser. If you can't get to the information in question, neither can a scraper. Scrapers are really just 'dumb' web clients, and have no more ability to view inaccessible files than any other client.
- Jeff @ DreamHost
- DH Discussion Forum Admin