3.2. Storage Topology

Storage layout is very workload-dependent, and some rudimentary knowledge of the workload is necessary. Historically, database storage has been provisioned by space, not speed. In the rare case where performance is considered at all, topology bandwidth (MB/sec) is used as the metric; this is the wrong performance metric for databases. All but the largest data warehouses require thousands of IOPS to perform well, and IOPS come only from high numbers of spindles provisioned underneath the file system.
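As a rough sizing sketch (the per-spindle figure below is an assumption for illustration; consult vendor specifications), a 15K RPM drive sustains on the order of 180 random IOPS, so a workload needing 20,000 IOPS requires over 100 spindles:
    # Back-of-the-envelope spindle count; 180 IOPS per 15K spindle is an
    # assumed figure, not vendor data.
    target_iops=20000
    iops_per_spindle=180
    echo $(( (target_iops + iops_per_spindle - 1) / iops_per_spindle ))  # prints 112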
The easiest way to configure an array for both performance and reliability is to use a RAID set size of 8-12 (depending on the RAID algorithm). Many RAID sets can be combined to produce a single large volume. It is recommended that you then stripe LUNs of the specific sizes you need off this high-IOPS volume. This is often called the "block of cheese" model, where every stripe, independent of size, has full access to the IOPS capacity of the large, single volume. This is the easiest way to produce high-performance LUNs for a database.
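The same effect can be sketched host-side with LVM, striping a volume group across several array LUNs so that every logical volume carved from it shares the full IOPS capacity of the whole set; the device names and sizes below are hypothetical:
    # Hypothetical device names; four LUNs presented by the array.
    pvcreate /dev/mapper/mpath0 /dev/mapper/mpath1 /dev/mapper/mpath2 /dev/mapper/mpath3
    vgcreate oradata_vg /dev/mapper/mpath0 /dev/mapper/mpath1 /dev/mapper/mpath2 /dev/mapper/mpath3
    # -i 4 stripes across all four LUNs; -I 64 uses a 64KB stripe element.
    lvcreate -i 4 -I 64 -L 200G -n oradata_lv oradata_vg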
Acquire as many 15K spindles as is practical or affordable. Resist the temptation to use large, low-RPM drives (e.g., SATA). Resist the temptation to use drive technology (including controllers and arrays) that does not support tagged queuing (i.e., most SATA). Tagged queuing is critical to sustained high IOPS rates. In the SATA world it is called NCQ (Native Command Queuing); in the FCP/SAS world it is called Tagged Queuing. It is usually implemented at the shelf level; insist on it.
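On a running system, you can check whether a device is actually queuing commands by reading its queue depth from sysfs (the device name here is illustrative); a depth of 1 means commands are not being queued:
    # A queue_depth of 1 indicates the device is not queuing commands.
    cat /sys/block/sda/device/queue_depth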
Contrary to some detailed studies, in general a 15K 72GB drive has better performance than a 10K 300GB drive. Outer-track optimizations cannot be relied upon over the lifecycle of the application, nor can they be relied upon with many storage array allocation algorithms. If you could ensure that only the outer tracks were used, then larger-capacity drives would seek less. Small, high-RPM drives are difficult to buy, but they will always have the best IOPS price/performance ratio.
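The gap follows directly from drive mechanics. A rough per-spindle estimate is 1000 ms divided by the sum of average seek time and average rotational latency (the seek figures below are assumptions; real numbers vary by model):
    # Rough per-spindle IOPS: 1000 / (avg seek + avg rotational latency).
    # Seek times (3.5 ms and 4.7 ms) are assumed figures.
    awk 'BEGIN {
      printf "15K: ~%d IOPS\n", 1000 / (3.5 + 60000 / 15000 / 2)  # ~181
      printf "10K: ~%d IOPS\n", 1000 / (4.7 + 60000 / 10000 / 2)  # ~129
    }'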
Software, or host-based, RAID is less reliable than array-based RAID, especially during reconstruction and load balancing. Host-based RAID operations compete for resources and can compromise throughput on the database server.
Many storage topologies include FCP switch infrastructure, which can be used to isolate the I/O traffic to the array. We recommend that the storage array HCAs and the four ports of the two host HBAs all be placed in one zone. For more information on HBA configuration, see Section 4.2.2, “Multipath Configuration”.
We do not recommend multi-purposing a storage array. Many customers buy very large arrays and place multiple Oracle databases (including dev and test) on one array. This is ill-advised, and the write-back cache policies in the array (which will become the bottleneck) are difficult to tune. Relative to the cost of Oracle and the critical nature of most Oracle databases to their respective enterprises, the storage is free; dedicate the storage if possible. Oracle workloads are voracious and unpredictable consumers of arrays.

3.2.1. Storage Allocation

Red Hat Cluster Suite requires a single 64MB LUN for quorum disk support. It is recommended that the qdisk feature be used for Oracle Cold Failover.

Warning

The qdisk feature is mandatory for RAC/GFS clusters.
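The quorum disk is initialized once with the mkqdisk utility; the device path and label below are examples:
    # Initialize the 64MB quorum LUN (device path and label are examples).
    mkqdisk -c /dev/mapper/qdisk -l ora_qdisk
    # List visible quorum disks to verify.
    mkqdisk -L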
RAC/GFS clusters require Oracle Clusterware to be installed, and they require five 384MB LUNs (two for the registry, three for quorum). It is recommended that three Clusterware voting (quorum) disks be configured, but a single, externally (array) redundant Clusterware vote disk is fully supported.
In either the HA or RAC/GFS install, the LUNs will be used to create file systems. Oracle supports AIO and DIO for both EXT3 and GFS; this provides raw-device performance. In our configuration, the performance of any given LUN is the same; the size of the LUN does not affect performance. However, the size of the LUN may affect file system performance if large numbers of files are placed in many directories. Most Oracle databases use a relatively low number of datafiles in a file system, but this is at the discretion of the DBA and is determined by the ongoing operational requirements of the database. Tablespaces consist of datafiles and contain base tables and indexes. Tables and indexes are usually in separate tablespaces (if you are lucky), and the datafiles are usually created to be as large as possible. In some cases, tablespaces and datafiles are intentionally created small, with AUTOEXTEND disabled. This generates alerts that notify DBAs of dynamic growth requests in the database. No two shops have the same policy towards AUTOEXTEND.
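For shops that follow the small-datafile policy, a tablespace can be created with autoextension disabled so that growth requests surface as alerts rather than silent extensions; the names and sizes below are illustrative:
    # Illustrative tablespace, path, and size only.
    sqlplus / as sysdba <<'EOF'
    CREATE TABLESPACE app_data
      DATAFILE '/u02/oradata/PROD/app_data01.dbf' SIZE 2G
      AUTOEXTEND OFF;
    EOF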
Redo logs, UNDO tablespaces, and redo archive logs often get their own file systems. Redo log file systems are normally sensitive to write latency, and can be impacted by an archive log switch (ARCHIVELOG is usually enabled for production databases).

Note

During a log switch, the previously closed log is copied to the archive destination, usually without throttling, and this can impact transaction commit response times. One of the simplest ways to mitigate this effect is to place the archive log destination on a DIO-enabled NFS mount and force the network connection to 100TX; this is the easiest way to throttle archive log copies. Customers often use NFS as an archive log destination, so this can be as simple as a NIC re-configuration request.
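Forcing the NIC speed is a one-line ethtool change; the interface name here is an example:
    # Throttle archive log copies by forcing the NIC to 100 Mb/s full duplex.
    ethtool -s eth1 speed 100 duplex full autoneg off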
A LUN (and subsequent file system) should be allocated for ORACLE_HOME. This file system should not contain any database files; the LUN must hold only the product home and spare capacity for trace files. It could be as small as 8GB.
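A minimal sketch of preparing that LUN (the device name and mount point are examples):
    # Example device and mount point for a small, dedicated product home.
    mkfs.ext3 /dev/mapper/orahome
    mkdir -p /u01/app/oracle
    mount /dev/mapper/orahome /u01/app/oracle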

Note

For RAC/GFS, Oracle Clusterware Home (ORA_CRS_HOME) cannot be located on a clustered GFS mount point.

Note

Oracle and virtualization tend to make very strange bedfellows. Oracle database applications are voracious consumers of hardware resources and rarely share well with other applications, and often not even with the host OS. Oracle is a fully portable OS that is completely implemented in user space. It is best to dedicate the hardware to Oracle, and this goes for the storage array too. EMC invented “virtualized” storage years ago with the concept of busting up a single big disk into four pieces, or Hypers. These Hypers combine to create a Meta LUN. This looks like a highly efficient utilization of storage, but it misses the point -- a 15K drive busted up into four pieces does not serve four times the IOPS. If you run several instances of Oracle on a virtualized server and several copies of Oracle databases on a virtualized storage array, your life will be much harder (and very likely shorter).