Benefits of multiple SCSI controllers for databases [RHEL 7/8]

We are going through some design discussions for implementing various databases (Mongo, Cassandra, etc.) on RHEL 7/8 on ESX 7. One of the design decisions is how many SCSI controllers to use. There is a perceived benefit to multiple SCSI controllers, but no one is able to articulate how or why this might be true. I have done some research on this and found hints that, in the case of Windows, each virtual SCSI controller gets its own I/O queue.

(1) What, if any, are the benefits of multiple SCSI controllers for databases, and why (beyond a simple "it's faster")?

(2) How much of this is dependent on the operating system or operating system version?

(3) How much of this is dependent on the choice of SCSI controller and driver? If the hints about multiple I/O queues are true, then my guess is that this is a function of the driver.

Thanks in advance for any insight or guidance.

Responses

Good question, this piqued my interest too.

Most sources I could find talked about the difference between the emulated SCSI controllers and the paravirt controller, with the paravirt being 10% to 30% more efficient depending on workload. That's the expected result.
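If you want to confirm which controller type a RHEL 7/8 guest is actually seeing, the driver name is visible in sysfs. A rough check (host numbering and device names will vary per VM):

    # List the virtual SCSI controllers the guest sees
    lspci | grep -i scsi

    # Show which driver each SCSI host is bound to
    # (vmw_pvscsi = paravirtual, mpt* = the emulated LSI Logic controllers)
    cat /sys/class/scsi_host/host*/proc_name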

This blog talks about multiple queues being beneficial but, like you said, doesn't quantify further:

A reasonably good answer can be found in this VMware blog and the whitepaper it links to:

We can infer that having multiple IO queues in the VM is probably only beneficial if the hypervisor can also spread those queues across physical HBAs. If you're shoving the IO of 4 virtual HBAs into one physical HBA, you're probably not going to see a benefit, as that single physical HBA becomes the bottleneck.
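It's worth checking what the host actually has behind it before assuming the spread helps. On the ESXi host, something like the following lists the physical storage adapters and their drivers (exact output columns vary by ESXi build):

    # On the ESXi host: list physical HBAs / storage adapters
    esxcli storage core adapter list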

I imagine you'd probably need sufficient VCPUs and physical CPUs to be able to run each queue concurrently too. To give a silly example, there's probably not much advantage to stuffing 4 SCSI controllers into a 2-core VM.

The white paper provides the testing method of interest in the "Performance of a Single VM" section. They attach 10x 6.4 GiB disks per controller and use Iometer to drive a database-like workload (2 workers per controller, 8 KiB IO size, 100% random reads and writes, 16 outstanding IOs), and find that IOPS scale fairly linearly with each controller added, without much increase in latency.
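If you'd rather use fio on RHEL than Iometer, a job along these lines should approximate that profile. The device names and the 50/50 read/write split are my assumptions (adjust rwmixread to match your workload), and note that pointing fio at raw devices overwrites whatever is on them:

    [global]
    ioengine=libaio
    direct=1
    # 100% random, mixed reads and writes
    rw=randrw
    # assumed read/write split - tune to your workload
    rwmixread=50
    # 8 KiB IO size, 16 outstanding IOs
    bs=8k
    iodepth=16
    # 2 workers per disk here; adjust to mirror "2 workers per controller"
    numjobs=2
    runtime=300
    time_based=1
    group_reporting=1

    # one job per test disk - /dev/sdb and /dev/sdc are placeholders
    # WARNING: this destroys the data on those devices
    [disk-on-ctrl0]
    filename=/dev/sdb

    [disk-on-ctrl1]
    filename=/dev/sdc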

You can use that whitepaper to inform your testing method too.

Identify your workload pattern. Either run a repeatable production workload or a prod-like test program, or configure an IO test program like Iometer or fio to mimic your workload.
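For that "identify your workload pattern" step, watching the production disks for a while gives you the numbers to feed into fio (IOPS, read/write mix, request size, queue depth). On RHEL this is just sysstat; column names differ slightly between the RHEL 7 and RHEL 8 versions of iostat:

    # Sample extended per-device stats every 5 seconds during a busy period
    # r/s, w/s   -> IOPS and read/write mix
    # areq-sz    -> average request size (avgrq-sz on RHEL 7, in sectors)
    # aqu-sz     -> average queue depth (avgqu-sz on RHEL 7)
    iostat -xmt 5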

Personally I like a mixture of these test methods. A repeatable prod workload is great because you can pick a very simple metric like "total runtime", but the prod program might not report useful stats like IOPS and latency. Artificial test programs can provide more detailed reporting, but it can be challenging to make them behave like the actual workload.

Consider your amount of load now, plus how the workload is expected to grow over the lifetime of the system.

From there, try the same test with 1/2/3/4 SCSI controllers in the VMware VM and see what sort of results you get.
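When you scale the test from 1 to 4 controllers, it's also worth confirming in the guest which disk hangs off which virtual controller, so your test jobs are really spread across them. Device names here are just examples:

    # lsscsi (from the lsscsi package) shows [host:channel:target:lun] per disk;
    # the first number is the SCSI host, which maps to a virtual controller
    lsscsi

    # Or trace a single disk back through the PCI path of its controller
    readlink -f /sys/block/sdb/device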

Maybe you'll also see linear scaling of performance with controllers, maybe your workload isn't heavy enough to benefit from multiple controllers, or maybe you'll benefit from 2 controllers but not 4.