2.3. Storage Considerations

A database does only a couple of things: it reads a lot of data and it writes a lot of data. It produces and consumes I/O, and with few exceptions, most of those I/O operations are small, random reads and writes. A well-tuned application (they do exist) will access most of the data in the most efficient way possible. This means extensive use of indexes, which translates into random IOPS, or I/Os per second.
Disk drives are physical media and are at the mercy of the laws of physics. A disk drive (or spindle) must deliver as many IOPS as possible to make it a good candidate for database use. This usually means a high RPM and support for SCSI. Modern SAS (Serial Attached SCSI) drives replaced the parallel SCSI bus with a cheaper serial bus. Modern SATA (Serial ATA) replaced the ribbon cable in your PC with a much cheaper cable. SAS drives tend to be higher RPM, support tagged command queuing, and usually have the best IOPS/spindle. However, disk drive technology changes often, so insist on the highest IOPS/spindle/$, regardless of the technology. It is not possible to buy too many spindles.
The storage layer must absolutely preserve the persistency of the data, so the data is still there when the lights go out. Be aware of which hardware actually fails in a typical no-single-point-of-failure configuration. Drives fail, the grid fails, power supplies fail, in that order. Most other components outlast the lifetime of the deployed cluster.

Note

Drive technology has not kept up with CPU and memory technology, and much of this has to do with basic physics. A recent trend is the use of Flash technology in a disk form factor (Solid State Drives or SSD). The other trend is the use of large Flash RAM cards (connected by 8-16 lanes of PCI-e) to operate as a coherent write cache, either in the storage array or somewhere between you and the physical disks. Both Flash cards and SSDs are very fast, but they must be just as persistent as disk drives. Since Red Hat Cluster Suite Oracle HA requires shared storage (in either case), the storage vendor tends to have both options. Either can work well for a given workload, but it is always the workload adaptability that will determine the success of these technologies (or any disk technology).

Note

There seem to be more RAID options than ever before. A simple thing to remember for databases is that, on average, a 144GB 15K drive is the same speed as a 36GB 15K, so if you factor for IOPS throughput, you don’t need to worry about space.
RAID5 is often used as a speed/space compromise, but it is very slow, especially for random writes, which databases do a lot. Sometimes the RAID controllers can hide this effect, but not well, and not forever. Another common algorithm uses one or more parity drives (most notably NetApp and HP EVA), and this option is a much better alternative to RAID5.
For database performance, the gold standard is RAID10 (a stripe of mirrored drives), which can tolerate the loss of up to 50% of the spindles (as long as no mirrored pair loses both of its members) and keep running at full performance. It might seem like a “waste” of space, but you are purchasing IOPS/spindle/$; the size of the drive is not relevant to database performance.
Various RAID options can create extra I/O in order to maintain persistency, so the actual number of IOPS available to the database (payload IOPS) tends to be less than the raw IOPS of the spindles, and is a function of the selected RAID algorithm.
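As a rough illustration of how the RAID algorithm reduces payload IOPS, the following Python sketch applies a per-level write penalty to the raw IOPS of the spindles. The penalty values (1 for RAID0, 2 for RAID10, 4 for RAID5, 6 for RAID6) are commonly quoted rules of thumb and are an assumption here, not figures from this guide; the function and table names are hypothetical. Controller caches and vendor-specific parity schemes will change the real numbers.

```python
# Rule-of-thumb back-end I/Os generated per host write (assumed values,
# not taken from this guide; real arrays and caches behave differently).
WRITE_PENALTY = {"RAID0": 1, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def payload_iops(raw_iops, write_fraction, raid_level):
    """Estimate the host-visible IOPS for a given read/write mix.

    raw_iops       -- sum of the IOPS of all spindles in the set
    write_fraction -- share of host I/Os that are writes (0.0 to 1.0)
    raid_level     -- key into WRITE_PENALTY
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0.0 and 1.0")
    penalty = WRITE_PENALTY[raid_level]
    read_fraction = 1.0 - write_fraction
    # Each host read costs one back-end I/O; each host write costs `penalty`.
    return raw_iops / (read_fraction + write_fraction * penalty)


if __name__ == "__main__":
    # Hypothetical workload: 1000 raw IOPS, half of the host I/Os are writes.
    for level in ("RAID0", "RAID10", "RAID5"):
        print(level, round(payload_iops(1000, 0.5, level)))
```

Under these assumptions, the same set of spindles delivers noticeably fewer payload IOPS on RAID5 than on RAID10 once the workload contains a significant share of random writes.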
Shared or non-shared file systems tend to be block-based file systems that are constructed on a set of physical or logical LUNs, as in the case of Red Hat’s Logical Volume Manager (LVM), or the clustered equivalent for the shared GFS install, CLVMD. An example of a file-based file system would be the NFS file system. This guide assumes that LUNs are presented for formatting into the appropriate file system type for either Enterprise Edition HA or RAC.

Note

There are three main factors in calculating IOPS (I/Os per second):
  • Rotational Speed – AKA spindle speed (RPM)
  • Average Latency – Time for the sector being accessed to rotate under a r/w head
  • Average Seek – Time it takes for the hard drive's r/w head to position itself over the track to be read or written
IOPS per drive is calculated as 1/(Avg. Latency + Avg. Seek), with both times expressed in seconds
Total IOPS = IOPS per drive * Total number of drives
For example, let's say we want to find the total IOPS in our storage subsystem and we have the following storage:
4 X 1TB 10kRPM SAS (RAID 0)
Avg. Latency = 3ms
Avg. Seek = 4.5ms
1/(.003 + .0045) = 133 IOPS
Total IOPS = 4 * 133 IOPS = 532 IOPS
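The calculation above can be reproduced with a short Python sketch. It uses the 3 ms latency and the 4.5 ms (.0045 s) seek time from the calculation line; the function names are hypothetical and chosen only for illustration. The per-drive figure is truncated to a whole number, as in the example, and the total assumes RAID 0, where no I/O is spent on redundancy.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in ms."""
    return 0.5 * 60_000.0 / rpm


def drive_iops(avg_latency_ms, avg_seek_ms):
    """IOPS of a single drive: 1 / (latency + seek), with times in seconds."""
    service_time_s = (avg_latency_ms + avg_seek_ms) / 1000.0
    return 1.0 / service_time_s


def total_iops(drive_count, avg_latency_ms, avg_seek_ms):
    """Total IOPS of a stripe (RAID 0) of identical drives."""
    per_drive = int(drive_iops(avg_latency_ms, avg_seek_ms))  # truncate
    return drive_count * per_drive


if __name__ == "__main__":
    latency = rotational_latency_ms(10_000)   # 10k RPM drive
    print(latency)                            # average latency in ms
    print(int(drive_iops(latency, 4.5)))      # per-drive IOPS
    print(total_iops(4, latency, 4.5))        # four drives in RAID 0
```

The average latency follows directly from the spindle speed, which is why a 10k RPM drive is listed with 3 ms: one revolution takes 6 ms, and on average the sector is half a revolution away.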