How to prevent server timeout? Generating large XML file


#1

Greetings,

I am trying to generate an XML product feed for around 200,000 items on my site. This script is to be run once per week.

I have shared hosting on Dreamhost and apparently the max_execution_time is set to 300 seconds. However, my script times out at around 120 seconds.

Is there a way or trick to prevent the timeout and allow the script to complete its job? The size of the file is currently 360MB, which is close to the finished size I was expecting.

Thanks
Kind regards


#2

I now have a VPS and I’m still having this problem. I need to build a sitemap index for 400,000 items, with 50,000 items per sitemap. Each 50,000-item batch is created by the script one at a time during a cron job. I’ve used:

ini_set('max_execution_time', 300);
ini_set('max_input_time', 300);

These do nothing to override the 120-second timeout that causes a 500 error. Where is this 120-second limit being set?
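To check what actually takes effect at runtime, I can dump the values from inside the script (just a debugging sketch):

[php]
// Debugging: confirm what PHP is actually running with.
var_dump(ini_get('max_execution_time'));
var_dump(ini_get('max_input_time'));
var_dump(php_sapi_name()); // cli vs cgi-fcgi/fpm-fcgi changes which timeouts apply
[/php]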

Thanks
Kind regards


#3

Don’t run long-running tasks through the web server. It’s not well suited for that.

Since you’re already doing this through cron, switching it to run the script directly will be easy — just change the command for the cronjob to e.g.

/usr/local/php54/bin/php /home/username/example.com/scriptname.php
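For example, a weekly crontab entry running it directly might look like this (the schedule and path are only examples):

0 3 * * 0 /usr/local/php54/bin/php /home/username/example.com/scriptname.php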

#4

Thanks for pointing me in a new direction; I have all my cronjobs set up to run through the web server. One reason I’m using the “wget” cronjob command is so that the server “acts” like a browser and follows a redirect to the next batch of the script, which prevents timeouts and tries to get the whole job done in one daily cronjob. I found that wget will only follow 20 redirects before I have to think of some other way of running the script… the solution you gave is something new to me…

Here is an example of each header("Location: … "); redirect at the end of each script iteration:

http://www.mysite.com/cronjobs/script.php
http://www.mysite.com/cronjobs/script.php?lastid=350000
http://www.mysite.com/cronjobs/script.php?lastid=300000
http://www.mysite.com/cronjobs/script.php?lastid=250000
http://www.mysite.com/cronjobs/script.php?lastid=200000
http://www.mysite.com/cronjobs/script.php?lastid=150000
http://www.mysite.com/cronjobs/script.php?lastid=100000
http://www.mysite.com/cronjobs/script.php?lastid=50000

Will the php command (/usr/local/php54/bin/php /home/username/example.com/scriptname.php) follow the header redirects at the end of the script like the “wget” command does? If not, how can I modify the script to keep working through the batches without timing out in the middle of an iteration?

Thanks
Kind regards


#5

I’m assuming that the “right” way to do this sitemap script is to replace the redirect at the end of each iteration:
[php]header("Location: http://www.example.com/cronjob/script.php?lastid=$lastid");[/php]

with:
[php]exec('/usr/local/php54/bin/php /home/username/example.com/scriptname.php '.$lastid);[/php]

and then use the “$argv” variable instead of $_GET during the next iteration? This seems to be working fine, but I was wondering if this is the proper way.
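At the top of the script, the next iteration then picks the value up roughly like this (a sketch; I’m assuming lastid comes in as the first command-line argument):

[php]
// Take the batch offset from the command line when run via the PHP binary,
// and fall back to the query string when the script is hit through the web server.
if (isset($argv[1])) {
    $lastid = (int) $argv[1];
} elseif (isset($_GET['lastid'])) {
    $lastid = (int) $_GET['lastid'];
} else {
    $lastid = 0;
}
[/php]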

Thanks
Kind regards


#6

You shouldn’t need to break it up at all when you’re running the script directly, unless it leaks a lot of resources while running or something (and in that case, you should fix it). There’s no inherent time limit on cron jobs.

Don’t use exec() recursively for this. I know it sounds like a good idea, but it makes each script wait for the next one to finish before it exits. This will end up using a lot of memory.

If you must break the script up, write a top-level “wrapper” script to exec() each chunk in sequence. That way, once each chunk finishes, it’s done with. (This’ll also make it easier to debug.)
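A minimal sketch of such a wrapper, assuming the batch script takes lastid as its first argument and you step through 400,000 rows in 50,000-row chunks (the paths and totals are only examples):

[php]
// Wrapper: run each chunk as a separate process, one after another.
// Each child exits and frees its memory before the next one starts.
$php    = '/usr/local/php54/bin/php';
$script = '/home/username/example.com/scriptname.php';

for ($lastid = 0; $lastid < 400000; $lastid += 50000) {
    $output = array(); // exec() appends to $output, so reset it each time
    exec($php . ' ' . escapeshellarg($script) . ' ' . $lastid, $output, $status);
    if ($status !== 0) {
        echo "Chunk starting at $lastid failed with exit code $status\n";
        break;
    }
}
[/php]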


#7

Thanks for the response. I’ve been working on this for a few days trying to make the script work better. I can see over SSH that the recursive exec() calls aren’t going to work in the long term: each exec() creates a new process using about 15MB of memory, and they all stay running at the same time until either the script finishes or memory is exhausted…

Unfortunately, the entire reason I had to break the script up into chunks is the memory leaks in the while loops. I have quite a few scripts like this because I can’t figure out how to “clear” the memory after each loop. Memory usage just keeps accumulating even after I “unset” and “NULL” the variables used in the while loop.
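Heavily simplified, each batch loop has roughly this shape (the query, the field names and the build_item_xml() helper are placeholders here, not my real code):

[php]
// Simplified placeholder for one 50,000-row batch; $db is an open mysqli connection.
$result = mysqli_query($db, "SELECT id, name, price FROM products WHERE id > $lastid ORDER BY id LIMIT 50000");
$xml = '';
while ($row = mysqli_fetch_assoc($result)) {
    $xml .= build_item_xml($row); // the string keeps growing for the whole batch
    unset($row);
    $row = null;
}
mysqli_free_result($result);
file_put_contents($sitemapfile, $xml, FILE_APPEND);
[/php]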

The only hint I’ve found is to use some kind of object-oriented programming style with classes. Since I’ve done everything in a procedural style, this is all kind of new to me. I’ve been trying to find a solution to this crappy memory leak problem for a few years and have given up many times. Do you know how to fix this, or can you point me in the right direction?

Thanks
Kind regards

UPDATE:

I redesigned the script back into one large script without splitting it into chunks. It seems to be working MUCH better and much faster now that I run it with the “/usr/local/php54/bin/php” command. Maybe this behaves differently from running it on the web server? The memory usage doesn’t seem to accumulate like it used to.
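For reference, the other thing that keeps memory flat for a file this size is writing each item straight to the file as it’s fetched, instead of building the whole document in a string first. A rough sketch with XMLWriter (the file name, connection and fields are placeholders):

[php]
// Sketch: stream the feed straight to disk so the full document never sits in memory.
// Assumes $db is an open mysqli connection; names are placeholders.
$writer = new XMLWriter();
$writer->openURI('/home/username/example.com/feeds/products.xml');
$writer->startDocument('1.0', 'UTF-8');
$writer->startElement('products');

$result = mysqli_query($db, 'SELECT id, name, price FROM products ORDER BY id');
while ($row = mysqli_fetch_assoc($result)) {
    $writer->startElement('product');
    $writer->writeElement('id', $row['id']);
    $writer->writeElement('name', $row['name']);
    $writer->writeElement('price', $row['price']);
    $writer->endElement(); // </product>
    $writer->flush();      // push the buffer out to the file
}
mysqli_free_result($result);

$writer->endElement(); // </products>
$writer->endDocument();
$writer->flush();
[/php]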