I’m trying to figure out how I can make it so one page on my site is a copy of a page on another one of my sites (not hosted with Dreamhost), without using a link or redirect. Basically I need to set it up so that the page on the external site is automatically copied to my Dreamhost site once a day or so. I was thinking of using a cron job to do this, but unfortunately I don’t have any experience with cron, so I don’t know how to set it up or even if it’s possible. Anyone have any tips? I am comfortable with a UNIX command line, and have shell access, it’s just that I’ve never used cron before at all. If cron is not the best/easiest way to do this, is there a better way?
It’s not terribly difficult to use cron; you’ll probably spend more effort deciding which tool to use to fetch the page. IMO cron is the best/easiest way.
What I do:
- Edit a text file with cron entries.
- Upload the text file.
- Then in the shell, execute the following: “crontab textfile” (this replaces any crontab you already have installed).
- If you need to check what the current entries are, execute “crontab -l”
- And to save them to a file, “crontab -l > textfile”
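The steps above can be sketched as a short shell session. The helper name add_cron_entry and the file name mycron.txt are made up for illustration, and the crontab calls are left commented out so that running the sketch does not touch a real crontab.

```shell
#!/bin/sh
# Hypothetical helper (not part of cron): append an entry to a local
# crontab file unless an identical line is already present.
add_cron_entry() {
    file=$1
    entry=$2
    touch "$file"
    # -F fixed string, -x whole-line match, -q no output
    grep -Fxq -e "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}

# Typical sequence, commented out so the sketch has no side effects:
# crontab -l > mycron.txt        # save the current entries
# add_cron_entry mycron.txt '0 2 * * * cd ~/mirror && wget -p --convert-links http://www.server.com/dir/page.html'
# crontab mycron.txt             # install the file (replaces the current crontab)
# crontab -l                     # confirm what is installed
```

Keeping the entries in a file means you can re-run the install step at any time without ending up with duplicate lines.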
Cron entries are space-delimited fields. The first five are minute, hour, day of month, month, and day of week. Everything after the fifth field is the command (or commands) to execute.
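To make the field order concrete, here is a small sketch that splits one entry into its parts. The function name describe_cron_entry is hypothetical; it only prints the fields and does not validate them the way cron would.

```shell
#!/bin/sh
# Hypothetical helper: print the six fields of one cron entry.
describe_cron_entry() {
    # read splits on whitespace; the last variable receives the rest of
    # the line, so a command containing spaces stays in one piece.
    printf '%s\n' "$1" | {
        read -r minute hour dom month dow command
        printf 'minute=%s hour=%s day_of_month=%s month=%s day_of_week=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow"
        printf 'command=%s\n' "$command"
    }
}

describe_cron_entry '0 2 * * * cd ~/mirror && wget -p --convert-links http://www.server.com/dir/page.html'
```

A * in a field means "every value", so the entry above matches minute 0 of hour 2 on every day.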
So you might want something like this:
0 2 * * * cd ~/mirror && wget -p --convert-links http://www.server.com/dir/page.html
This would run every day at 2 AM (server time). The wget command here is from the documentation; it saves the HTML document and its linked resources (CSS, images) under the same directory, ~/mirror/www.server.com/dir.
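If the inline command gets unwieldy, one option is to put it in a small script and have cron call that instead. The sketch below is only an illustration: the function names and the default ~/mirror directory are assumptions, and -q (quiet) is added so cron does not mail you wget's progress output every night.

```shell
#!/bin/sh
# Hypothetical helper: the directory (relative to where wget runs) in
# which `wget -p` stores a page, i.e. host name plus the URL's directory.
mirror_target_dir() {
    path=${1#*://}           # drop the scheme, e.g. "http://"
    dirname "$path"
}

# Hypothetical wrapper that cron could call instead of an inline command.
fetch_page() {
    url=$1
    dest=${2:-$HOME/mirror}  # assumed mirror directory
    mkdir -p "$dest" || return 1
    # Subshell so the caller's working directory is left unchanged.
    ( cd "$dest" && wget -q -p --convert-links "$url" )
}

mirror_target_dir http://www.server.com/dir/page.html   # prints www.server.com/dir
```

Calling fetch_page http://www.server.com/dir/page.html would then leave the page under ~/mirror/www.server.com/dir, matching the layout described above.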
rsync is another option for mirroring. This has the advantage that it won’t transmit unmodified files every day.
For example, I use the following command to create a remote backup:
rsync -e ssh -avcp --delete
(Careful with --delete: it removes any files in the destination directory that do not exist in the source.) The above example also assumes you have SSH key-based authentication set up.
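Because --delete is destructive, it can be worth previewing a run first. This is a minimal sketch using rsync's -n (dry run) flag; the function name preview_sync and the paths in the usage comment are made up, and -c and -p from the command above are left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper: show what a mirroring run would do, changing nothing.
# -a archive mode, -v verbose, -n dry run (same as --dry-run).
preview_sync() {
    src=$1
    dest=$2
    rsync -e ssh -avn --delete "$src" "$dest"
}

# Example with made-up paths; the trailing slash on the source means
# "the contents of this directory":
# preview_sync ~/mirror/ user@backup.example.com:mirror/
```

Files that would be removed show up in the output as lines starting with "deleting"; once the list looks right, run the same command without -n.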