- How can I speed up bulk loading or repairing of MyISAM tables?
There are two different methods that can be used by MySQL when it creates new indexes for MyISAM tables. MySQL will automatically decide which method to use based on the settings of a few parameters. You can view which one MySQL is using by running the SHOW PROCESSLIST statement while the operation is running and looking at the output.
These optimizations can only occur when you are doing the following operations:
- REPAIR TABLE
- ALTER TABLE ... ENABLE KEYS after doing an ALTER TABLE ... DISABLE KEYS and adding data between the two statements (such as done by mysqldump --opt)
- ALTER TABLE that requires an index rebuild
- OPTIMIZE TABLE
- LOAD DATA INFILE into an empty table
- Using myisamchk to do any of the above
- Any other operation that requires an index to be rebuilt and sets a write lock on the table
The first method MySQL can use is called Repair by keycache. This is the method that MySQL will use by default. It is also required when working with a UNIQUE or PRIMARY KEY due to the fact that MySQL needs to verify that all of the data is UNIQUE.
Repair by keycache will use the key_buffer in order to generate the indexes. So key_buffer variable should generally be set as large as possible while this operation is running. However, it will work even when the system has very little resources.
Using the key cache method is generally much slower than the second method, so it only should be used when required, such as when dealing with a UNIQUE or PRIMARY KEY or when the system has very little resources of which to take advantage.
The second method is called Repair by sort. This method is much faster than the Repair by keycache method. However, it will require more resources in order to function. This method will use a separate buffer in order to generate the indexes called myisam_sort_buffer_size. You will generally want to set this to as large as possible before doing the import, even up to 90% of system memory if only doing the load. This variable can be set without having to restart the server and can be done on a per session basis before the load occurs. Below is an example of how this might be done:
SET SESSION myisam_sort_buffer_size = 500*1024*1024;
In this example, the buffer size is set to 500MB. After performing this change, you could then do the operation and take advantage of the memory available. Of course, if normal database operations are ongoing, you would need to take into account the memory that's in use.
The second set of variables that are significant is the myisam_max_sort_file_size and the myisam_max_extra_sort_file_size. myisam_max_sort_file_size is the maximum size of the temporary file MySQL is allowed to use while re-creating a MyISAM index when using the sort method. Also, if the size of the temporary file needed to use the sort method exceeds key_buffer + myisam_max_extra_sort_file_size the slower key cache based method will be used.
The last setting that can be important is myisam_repair_threads. If you set this variable to a value larger than 1, then MySQL will create multiple indexes at a time. By default, MySQL will only use a single thread in order to do the index creation. This means that on an SMP system that it will only use a single CPU. With this turned on, MySQL will instead create a single thread for each index on the table. This will allow multiple CPU's to be used. One should only turn this on if there are multiple CPU's on a system that you want to use for a repair.
Note: Multi-threaded repair is still alpha quality code at the time of writing and was added in 4.0.13.