MySQL Indexing


This is in reply to a message posted by Dreamhost support earlier today.

[quote]We found the problem after much looking.
A single user who had a 110MB table was performing a query similar to the following:

SELECT * FROM images WHERE imageid=‘3563’;

However, imageid was not indexed. Therefore, everytime the above query was performed (many times a second), the server had to scan through the entire 110MBs of the table to find the single record it was looking for. That was pushing the load up to 100 (where 1 is normal).

A simple
CREATE INDEX imageid_index ON images (imageid(10));
fixed the problem, and dropped the load back down to 1.

(She also indicated that her site had recently been featured on a large search engine.)
I’m ery interested in this, as my tables are growing by the day. I have indexed many of my tables before, but never figured out how to find out how much speed I have saved. I see here that you have given usage figures. How would I go about doing this for my unoptimized and optimized tables?


Wiliam Stephens

Web Developer

Hi Wil -

The ‘load’ mentioned was that of the server itself - this is a good metric when you notice something is slowing the entire server down, but not for gauging the speed of a single query or table structure.

Probably the best way to measure the actual savings on a per -table basis is to manually (from the command line MySQL client) run your queries, and note the time given. Doing this after indexing, I’ve been able to cut down the speed of certain selects to 1/100th of what they were before. Most results will be far less obvious, but it can still help.

  • Jeff @ DreamHost
  • DH Discussion Forum Admin

just a note… i did ask jason (another admin, who deals with a lot of the mysql stuff, and made the announcement in question) to clarify, but he was having some problems signing up for the forum / posting. hopefully he’ll post a response soon, and if not, i’ll try to get him to write something up and i’ll post it for him.

Great, thanks.


Web Developer

Sorry about that, finally got a user account and am posting like a madman now! Whee!

Jeff was very correct about noting the amount of time a particular query takes. This can be very indicative of how good your queries are.

A very simple rule of thumb is to try to index most columns that appear in “WHERE” clauses. For example, in the query below:

SELECT * FROM images WHERE imageid=‘3563’ and dateid=‘2002-01-02’;

You have two columns that you’re searching on, imageid and dateid. You would likely want to index both of those separately:

CREATE INDEX imageid_index ON images (imageid(10));
CREATE INDEX dateid_index ON images (dateid(10));

(You generally don’t want to index more than the first 10 characters of a field, hence the “10” above).

But that’s just a rule of thumb.
The best way to determine how good your queries are is to use the explain statement. Just put the word EXPLAIN in front of your query and let the magic begin:

EXPLAIN SELECT * FROM images WHERE imageid=‘3563’ and dateid=‘2002-01-02’;

Mysql does an excellent job of tutorializing EXPLAIN:

Basically, you’d want to see that a key is actually being used (ie, the “key” field is not NULL, but something like imageid), and you want to see that the “rows” field is --much less than-- the total number of rows in your table.

For example, if your table “images” had a total of 50325 rows, and you had indexed your table properly, then

EXPLAIN SELECT * FROM images WHERE imageid=‘3563’ and dateid=‘2002-01-02’;

might show something like 28 in the “rows” field. That basically means that mysql had to examine 28 rows to find the correct information that you had asked for. This is MUCH better than if you had not indexed your tables properly and mysql had to examine all 50325 rows to find the information you were looking for.

That’s it! Let me know if that was helpful and understandable or not! =)