While it might be rejected on Jeopardy for not being in the form of a question, there is indeed a question here IMO. It looks like a request for suggestions on importing raw log files into an RDBMS that will serve as a data warehouse for log aggregation and reporting.
First, just because I like to be aware of stuff like this: Transact-SQL (T-SQL) is the proper name for Microsoft's flavor of programmatic data scripting, and Oracle's version is called PL/SQL. MySQL's equivalent never got a catchy brand name; its stored-program syntax just follows the SQL/PSM standard.
Data loads of the raw log files can be accomplished with manual scripting, but SQL Server provides pretty sophisticated utilities for this sort of loading (bcp, BULK INSERT, Integration Services packages) since it is such a common task. MySQL's equivalent is the LOAD DATA INFILE statement, which is far more direct than the workaround I've used before: pulling the raw log into an OpenOffice spreadsheet, getting it into a holding database, and using that to generate MySQL import scripts.
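Here's a minimal sketch of what that can look like, assuming MySQL with local_infile enabled, a table like the one sketched further down, and a made-up file path. The user variables are there because the bracketed timestamp contains a space, so it arrives split across two fields:

LOAD DATA LOCAL INFILE '/home/you/logs/access.log.2006-11-21'
INTO TABLE access_log
FIELDS TERMINATED BY ' ' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
-- @tz catches the trailing '-0800]' fragment and is simply discarded
(host, ident, auth_user, @ts, @tz, request, status, @bytes, referer, user_agent)
SET time_stamp = STR_TO_DATE(@ts, '[%d/%b/%Y:%H:%i:%s'),
    bytes_sent = NULLIF(@bytes, '-'),  -- Apache logs '-' when zero bytes were sent
    website_id = 1;                    -- whichever row in your website table applies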
You can get a description of the columns contained in the access log by checking out the wiki article on setting up AWStats. You would just need a table to hold the various details in their own columns (there's a sketch below), as well as a PK and an FK to a table describing your websites.
LogFormat="%host %other %other %time1 %methodurl %code %bytesd %refererquot %uaquot"
184.108.40.206 - - [21/Nov/2006:00:52:33 -0800] "GET /robots.txt HTTP/1.1" 301 247 "-" "Exabot/3.0"
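To make that concrete, here's a rough MySQL schema for those columns; all the names are my own invention, and the source_file column is a convenience I'll come back to below:

CREATE TABLE website (
    website_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    domain     VARCHAR(255) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE access_log (
    log_id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- the PK
    website_id  INT UNSIGNED NOT NULL,                             -- the FK
    host        VARCHAR(45),        -- %host: client IP
    ident       VARCHAR(64),        -- %other: identd, almost always '-'
    auth_user   VARCHAR(64),        -- %other: HTTP auth user, usually '-'
    time_stamp  DATETIME,           -- %time1
    request     VARCHAR(2048),      -- %methodurl: "GET /robots.txt HTTP/1.1"
    status      SMALLINT UNSIGNED,  -- %code
    bytes_sent  INT UNSIGNED,       -- %bytesd ('-' loads as NULL)
    referer     VARCHAR(2048),      -- %refererquot
    user_agent  VARCHAR(512),       -- %uaquot
    source_file VARCHAR(255),       -- which daily log file this row came from
    FOREIGN KEY (website_id) REFERENCES website (website_id)
) ENGINE=InnoDB;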
There might be some other questions in your post, but like rlparker I'm having difficulty identifying them. Maybe I'll dork out and work on something that might help you. Keep it up, though; this kind of exercise is good for keeping yourself useful in the industry.
There is no real reason to concatenate your log files first; just load them sequentially. That will make it easier to keep the data separated in case of mistakes. Also, you're not going to be able to delete the daily log files with any scripts, since they're owned by root… at least on my machine they are.
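That's also where the source_file column in the schema above earns its keep: add SET source_file = 'access.log.2006-11-22' (or whatever the day's file is called) to each day's LOAD DATA statement, and backing out one bad import becomes a single statement:

-- removes everything imported from that one file, nothing else
DELETE FROM access_log WHERE source_file = 'access.log.2006-11-22';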