Automatic Mirroring?


I’m trying to figure out how I can make it so one page on my site is a copy of a page on another one of my sites (not hosted with Dreamhost), without using a link or redirect. Basically I need to set it up so that the page on the external site is automatically copied to my Dreamhost site once a day or so. I was thinking of using a cron job to do this, but unfortunately I don’t have any experience with cron, so I don’t know how to set it up or even if it’s possible. Anyone have any tips? I am comfortable with a UNIX command line, and have shell access, it’s just that I’ve never used cron before at all. If cron is not the best/easiest way to do this, is there a better way?


It’s not terribly difficult to use cron; the bigger question is probably which tool to use to fetch the page. IMO cron is the best/easiest way.

What I do:

  1. Edit a text file with cron entries.
  2. Upload the text file.
  3. Then in the shell, execute the following: “crontab textfile” (this replaces any entries you already have installed).
  4. If you need to check what the current entries are, execute “crontab -l”.
  5. And to save them to a file, “crontab -l > textfile”.
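Because “crontab textfile” overwrites whatever is installed, it can help to start from the current entries and only append a line if it isn’t there yet. A rough sketch of that idea; the helper name add_cron_entry, the file name mycron.txt and the mirror.sh path are all made up for illustration:

```shell
#!/bin/sh
# add_cron_entry FILE ENTRY
# Append ENTRY to FILE unless an identical line is already present.
add_cron_entry() {
    file=$1
    entry=$2
    touch "$file"
    # -F fixed string, -x match the whole line, -q no output
    if ! grep -Fxq -e "$entry" "$file"; then
        printf '%s\n' "$entry" >> "$file"
    fi
}

# Typical use (left commented out so pasting this does not touch your crontab):
#   crontab -l > mycron.txt 2>/dev/null    # step 5: dump the current entries
#   add_cron_entry mycron.txt '0 2 * * * $HOME/bin/mirror.sh'
#   crontab mycron.txt                     # step 3: install (replaces everything)
#   crontab -l                             # step 4: check the result
```

Running the helper twice with the same entry leaves a single line in the file, so the steps can be repeated without piling up duplicates.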

Cron entries are space-delimited parameters. The first five are minute, hour, day of month, month, and day of week. Everything after that is the command(s) to execute.
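To make the field layout concrete, here is a small shell sketch that splits one entry into its six parts. The entry itself, including the mirror.sh path, is a made-up example and not something from your setup:

```shell
#!/bin/sh
# Hypothetical entry: run ~/bin/mirror.sh at 02:00 every day.
entry='0 2 * * * $HOME/bin/mirror.sh'

# read puts the first five words into the first five names and
# everything that is left over (the command) into the last one.
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF

echo "minute=$minute hour=$hour day_of_month=$day_of_month"
echo "month=$month day_of_week=$day_of_week"
echo "command=$command"
```

A “*” in a time field means “every”, so this entry fires at minute 0 of hour 2 on every day of every month.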

So you might want something like this:


0 2 * * * cd ~/mirror && wget -p --convert-links

This would run every day at 2 AM (server time). The wget command here is from the documentation (the URL of the page to fetch goes at the end); it saves the HTML document and its linked resources (CSS, images) in the same directory, ~/mirror/.
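If the one-liner starts to grow, it may be easier to put the fetch into a small script and point cron at that instead. A minimal sketch, not a tested setup: the function name mirror_page and the idea of passing the URL as an argument are my own assumptions, and the URL is whatever page you want copied.

```shell
#!/bin/sh
# mirror_page URL [DIR]
# Fetch URL plus its page requisites into DIR (default: ~/mirror).
mirror_page() {
    url=$1
    dir=${2:-$HOME/mirror}
    if [ -z "$url" ]; then
        echo "usage: mirror_page URL [DIR]" >&2
        return 2
    fi
    mkdir -p "$dir" || return 1
    # Run in a subshell so the caller's working directory is left alone.
    # -q quiet, -p fetch images/CSS, --convert-links rewrite links for local use
    ( cd "$dir" && wget -q -p --convert-links "$url" )
}

# In a script called from cron you would end with something like:
#   mirror_page "$PAGE_URL"     # PAGE_URL is a placeholder you set yourself
```

The -q flag keeps wget silent on success, which matters because cron normally mails any output a job produces.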



rsync is another option for mirroring. This has the advantage that it won’t transmit unmodified files every day.

For example, I use the following command to create a remote backup:

rsync -e ssh -avcp --delete
~/ ~/

(Careful with --delete: it removes any files in the destination directory that don’t exist in the source.) The above example also assumes you have key-based ssh authentication set up.
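Since --delete is destructive, one way to check a command before trusting it to cron is rsync’s dry-run flag (-n), which lists what would be transferred or deleted without changing anything. A small sketch; preview_sync is a made-up helper name, and the two directories are placeholders for your own source and destination:

```shell
#!/bin/sh
# preview_sync SRC DST
# Show what "rsync -a --delete" would do from SRC to DST, without doing it.
# DST can be a local directory or a remote user@host:path.
preview_sync() {
    # -a archive mode, -v list each file, -n dry run (nothing is changed)
    rsync -avn --delete "$1/" "$2/"
}

# Example (placeholders):
#   preview_sync ~/mirror /path/to/destination
```

Files that would be removed show up in the output as lines starting with “deleting”. Once the list looks right, drop the -n to do the real sync.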