As mentioned before, disk seeks are a big performance bottleneck. This problem becomes more and more apparent when the data grows so large that effective caching becomes impossible. For large databases, where you access data more or less randomly, you can be sure that you will need at least one disk seek to read and a couple of disk seeks to write things. To minimise this problem, use disks with low seek times.
Increase the number of available disk spindles (and thereby reduce the seek overhead) by either symlinking files to different disks or striping the disks.
Using symbolic links | For MyISAM tables, you symlink the index file and/or datafile from their usual location in the data directory to another disk (which may also be striped). This improves both the seek and read times, assuming the disk is not used for other purposes as well. See Symbolic links. |
Striping | Striping means that you have many disks and put the first block on the first disk, the second block on the second disk, and the Nth block on the (N mod number_of_disks) disk, and so on. This means that if your normal data size is less than the stripe size (or perfectly aligned), you will get much better performance. Striping is very dependent on the operating system and the stripe size, so benchmark your application with different stripe sizes. See Custom Benchmarks. The speed difference for striping is very dependent on the parameters: depending on how you set the striping parameters and the number of disks, you may get differences of orders of magnitude. Note that you have to choose to optimize for either random or sequential access. |
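The round-robin placement described for striping can be sketched with a little shell arithmetic. This is only an illustration under assumed conventions: blocks and disks are numbered from 0, and the helper name stripe_disk is hypothetical rather than part of any volume manager.

```shell
#!/bin/sh
# stripe_disk BLOCK NDISKS
# Hypothetical helper: print the 0-based disk that holds a 0-based block
# under simple round-robin striping (block N lands on disk N mod NDISKS).
stripe_disk() {
    echo $(( $1 % $2 ))
}

# Show where the first eight blocks land on a four-disk stripe set.
block=0
while [ "$block" -lt 8 ]; do
    echo "block $block -> disk $(stripe_disk "$block" 4)"
    block=$(( block + 1 ))
done
```

Consecutive blocks land on different disks, which is why a request that fits inside one stripe touches a single disk while a longer sequential read is spread over several.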
For reliability you may want to use RAID 0+1 (striping + mirroring), but in this case you will need 2*N drives to hold N drives of data. This is probably the best option if you have the money for it! However, you may also have to invest in some volume-management software to handle it efficiently.
A good option is to vary the RAID level according to how critical a type of data is. For example, store semi-important data that can be regenerated on a RAID 0 disk, but store really important data such as host information and logs on a RAID 0+1 or RAID N disk. RAID N can be a problem if you have many writes, because of the time required to update the parity bits.
On Linux, you can get much more performance (up to 100% under load is not uncommon) by using hdparm to configure your disk's interface! The following should be quite good hdparm options for MySQL (and probably many other applications):
hdparm -m 16 -d 1
Note that performance and reliability when using the above depend on your hardware, so we strongly suggest that you test your system thoroughly after using hdparm. Please consult the hdparm man page for more information. If hdparm is not used wisely, filesystem corruption may result, so back up everything before experimenting!
You may also set the parameters for the filesystem that the database uses:
If you don't need to know when files were last accessed (which is not really useful on a database server), you can mount your filesystems with the -o noatime option. That skips updates to the last access time in inodes on the filesystem, which avoids some disk seeks.
On many operating systems you can mount the disks with the -o async option to set the filesystem to be updated asynchronously. If your computer is reasonably stable, this should give you more performance without sacrificing too much reliability. (This flag is on by default on Linux.)
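To see whether a filesystem is already mounted with noatime, you can inspect its mount options. The following is a minimal sketch that assumes a Linux system exposing /proc/mounts; the helper name has_noatime is hypothetical, and the mount point in the closing comment is a placeholder.

```shell
#!/bin/sh
# has_noatime OPTIONS
# Hypothetical helper: succeed if a comma-separated mount option string
# (the fourth field of a /proc/mounts line) contains the noatime option.
has_noatime() {
    case ",$1," in
        *,noatime,*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Report, for every mounted filesystem, whether noatime is set (Linux only).
if [ -r /proc/mounts ]; then
    while read -r device mountpoint fstype options rest; do
        if has_noatime "$options"; then
            echo "$mountpoint: noatime is set"
        else
            echo "$mountpoint: noatime is not set"
        fi
    done < /proc/mounts
fi

# To change an already mounted filesystem (as root), a command such as
#   mount -o remount,noatime /path/to/mountpoint
# can be used; the mount point shown here is a placeholder.
```

The option string is wrapped in commas before matching so that similar names such as nodiratime are not mistaken for noatime.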
You can move tables and databases from the database directory to other locations and replace them with symbolic links to the new locations. You might want to do this, for example, to move a database to a file system with more free space or to increase the speed of your system by spreading your tables across different disks.
The recommended way to do this is to just symlink databases to a different disk and only symlink tables as a last resort.
On Unix, the way to symlink a database is to first create a directory on some disk where you have free space and then create a symlink to it from the MySQL database directory.
shell> mkdir /dr1/databases/test
shell> ln -s /dr1/databases/test mysqld-datadir
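The two steps above can be tried safely in a scratch directory before touching a real server. This sketch assumes nothing about your installation: every path is created under a temporary sandbox, and the variable names datadir and target are stand-ins for the MySQL data directory and a directory on another disk.

```shell
#!/bin/sh
# All paths live in a throwaway sandbox; they are stand-ins, not real
# MySQL locations.
sandbox=$(mktemp -d)
datadir="$sandbox/datadir"            # plays the role of the MySQL data directory
target="$sandbox/dr1/databases/test"  # plays the role of a directory on another disk

mkdir -p "$datadir"

# Step 1: create the directory on the disk that has free space.
mkdir -p "$target"

# Step 2: link it into the data directory under the database name.
ln -s "$target" "$datadir/test"

# The data directory now shows "test" as a link to the other location.
ls -l "$datadir"
```

Anything created under the link name ends up physically in the target directory, which is what lets the database live on a different disk.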
MySQL does not support linking one directory to multiple databases. Replacing a database directory with a symbolic link will work fine as long as you don't make a symbolic link between databases. Suppose you have a database db1 under the MySQL data directory, and then make a symlink db2 that points to db1:
shell> cd /path/to/datadir
shell> ln -s db1 db2
Now, for any table tbl_a in db1, there also appears to be a table tbl_a in db2. If one thread updates db1.tbl_a and another thread updates db2.tbl_a, there will be problems.
If you really need this, you must change the following code in mysys/mf_format.c:
if (flag & 32 || (!lstat(to,&stat_buff) && S_ISLNK(stat_buff.st_mode)))
to
if (1)
On Windows you can use internal symbolic links to directories by compiling MySQL with -DUSE_SYMDIR. This allows you to put different databases on different disks. See Windows symbolic links.
Before MySQL 4.0 you should not symlink tables unless you are very careful with them. The problem is that if you run ALTER TABLE, REPAIR TABLE, or OPTIMIZE TABLE on a symlinked table, the symlinks will be removed and replaced by the original files. This happens because these statements work by creating a temporary file in the database directory and replacing the original file with the temporary file when the statement operation is complete.
You should not symlink tables on systems that don't have a fully working realpath() call. (At least Linux and Solaris support realpath().) You can check whether your system supports symbolic links by issuing SHOW VARIABLES LIKE 'have_symlink'.
In MySQL 4.0, symlinks are fully supported only for MyISAM tables. For other table types, you will probably get strange problems if you try to use symbolic links on files in the operating system with any of the preceding statements.
The handling of symbolic links for MyISAM tables in MySQL 4.0 works the following way:
In the data directory you will always have the table definition file, the datafile, and the index file. The datafile and index file can be moved elsewhere and replaced in the data directory by symlinks. The definition file cannot.
You can symlink the datafile and the index file to different directories independently of each other.
The symlinking can be done at the operating system level (if mysqld is not running) or with SQL by using the DATA DIRECTORY and INDEX DIRECTORY options to CREATE TABLE. See CREATE TABLE.
myisamchk will not replace a symlink with the datafile or index file. It works directly on the file a symlink points to. Any temporary files will be created in the same directory where the datafile or index file is located.
When you drop a table that is using symlinks, both the symlink and the file the symlink points to are dropped. This is a good reason why you should not run mysqld as root or allow users to have write access to the MySQL database directories.
If you rename a table with ALTER TABLE RENAME and you don't move the table to another database, the symlinks in the database directory are renamed to the new names and the datafile and index file are renamed accordingly.
If you use ALTER TABLE RENAME to move a table to another database, the table is moved to the other database directory and the old symlinks and the files to which they pointed are deleted. (In other words, the new table will not be symlinked.)
If you are not using symlinks, you should use the --skip-symlink option to mysqld to ensure that no one can use mysqld to drop or rename a file outside of the data directory.
SHOW CREATE TABLE doesn't report if the table has symbolic links prior to MySQL 4.0.15. This is also true for mysqldump, which uses SHOW CREATE TABLE to generate CREATE TABLE statements.
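The file layout described in the points above can be illustrated at the operating system level. The following is a sketch only: it works on empty placeholder files in a temporary sandbox, and the table name tbl1 and the directory names disk1 and disk2 are assumptions made for the example.

```shell
#!/bin/sh
# Sandbox stand-ins for the database directory and two other disks.
sandbox=$(mktemp -d)
dbdir="$sandbox/datadir/db1"
datadisk="$sandbox/disk1/data"
indexdisk="$sandbox/disk2/index"
mkdir -p "$dbdir" "$datadisk" "$indexdisk"

# Empty placeholder files standing in for a MyISAM table named tbl1.
touch "$dbdir/tbl1.frm" "$dbdir/tbl1.MYD" "$dbdir/tbl1.MYI"

# Move the datafile and the index file to different directories...
mv "$dbdir/tbl1.MYD" "$datadisk/tbl1.MYD"
mv "$dbdir/tbl1.MYI" "$indexdisk/tbl1.MYI"

# ...and leave symlinks behind in the database directory.
ln -s "$datadisk/tbl1.MYD" "$dbdir/tbl1.MYD"
ln -s "$indexdisk/tbl1.MYI" "$dbdir/tbl1.MYI"

# The definition file stays a regular file; only .MYD and .MYI are links.
ls -l "$dbdir"
```

On a real server this would be done only while mysqld is not running. The SQL route names the directories when the table is created instead, along the lines of CREATE TABLE tbl1 (id INT) DATA DIRECTORY='/disk1/data' INDEX DIRECTORY='/disk2/index', where the column and both paths are again placeholders.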
Things that are not yet supported:
ALTER TABLE ignores the DATA DIRECTORY and INDEX DIRECTORY table options.
BACKUP TABLE and RESTORE TABLE don't respect symbolic links.
The frm file must never be a symbolic link (as said previously, only the data and index files can be symbolic links). Doing this (for example, to make synonyms) will produce incorrect results. Suppose you have a database db1 under the MySQL data directory, a table tbl1 in this database, and in the db1 directory you make a symlink tbl2 that points to tbl1:
shell> cd /path/to/datadir/db1
shell> ln -s tbl1.frm tbl2.frm
shell> ln -s tbl1.MYD tbl2.MYD
shell> ln -s tbl1.MYI tbl2.MYI
Now if one thread reads db1.tbl1 and another thread updates db1.tbl2, there will be problems: the query cache will be fooled (it will believe tbl1 has not been updated and so will return out-of-date results), and ALTER statements on tbl2 will also fail.
Beginning with MySQL Version 3.23.16, the mysqld-max and mysqld-max-nt servers in the MySQL distribution are compiled with the -DUSE_SYMDIR option. This allows you to put a database directory on a different disk by setting up a symbolic link to it. This is similar to the way that symbolic links work on Unix, though the procedure for setting up the link is different.
On Windows, you make a symbolic link to a MySQL database by creating a file that contains the path to the destination directory. Save the file in the data directory using the filename db_name.sym, where db_name is the database name.
For example, if the MySQL data directory is C:\mysql\data and you want to have database foo located at D:\data\foo, you should create the file C:\mysql\data\foo.sym that contains the pathname D:\data\foo\. After that, all tables created in the database foo will be created in D:\data\foo. The D:\data\foo directory must exist for this to work. Also, note that the symbolic link will not be used if a directory with the database name exists in the MySQL data directory. This means that if you already have a database directory named foo in the data directory, you must move it to D:\data before the symbolic link will be effective. (To avoid problems, the server should not be running when you move the database directory.)
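The contents of such a .sym file can be sketched as follows. This assumes a POSIX-style shell is available on the machine and uses a temporary directory as a stand-in for C:\mysql\data; on a real system any text editor does the same job. Whether the server cares about a trailing newline is not stated above, so treat that detail as an assumption of the sketch.

```shell
#!/bin/sh
# Sandbox stand-in for C:\mysql\data; on a real system this would be the
# actual MySQL data directory.
datadir=$(mktemp -d)

# Write the destination path into foo.sym. Single quotes and the %s format
# keep the backslashes literal.
printf '%s\n' 'D:\data\foo\' > "$datadir/foo.sym"

# Show what the server would find in the file.
cat "$datadir/foo.sym"
```

The file name before .sym is the database name, and the single line in the file is the directory where that database's tables are stored.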
Because of the speed penalty incurred when opening every table, this feature is not enabled by default even if you have compiled MySQL with support for it. To enable symlinks, put the following entry in your my.cnf or my.ini file:
[mysqld]
symbolic-links
In MySQL 4.0, symbolic links are enabled by default. If you don't need them, you can disable them with the skip-symbolic-links option.