Kdump recommendations for large servers

Database servers in particular are now reaching memory sizes where full kernel dumps are no longer feasible.

Are there any recommendations on how to configure kernel dumps on such systems?

Even with the dump level set to 31, kernel dumps can still require a lot of space...
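
For context, the dump level is set via the core_collector line in /etc/kdump.conf; a typical -d 31 setup looks roughly like the sketch below (the compression flag and target path are just examples, not an exact configuration):

    # /etc/kdump.conf -- illustrative only
    path /var/crash                        # local dump target directory
    core_collector makedumpfile -d 31 -c   # strip the most page types, zlib-compress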

Up to what memory size should kernel dumps still be set up?
Do you only set it up when requested by Red Hat, or do you enable it on all servers?

Responses

The usual recommendation from us here in support is to size the dump target at "RAM + 1%".

We have had some instances where -d31 stripped out pages we needed to diagnose, rendering the vmcore useless. After some internal discussion we decided that -d1 was the most reliable way to get a useful vmcore, which is why it is the setting suggested in the kdump knowledgebase solution.
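
To make that concrete, switching from -d 31 to -d 1 is a one-line change to core_collector in /etc/kdump.conf; the sketch below is illustrative (the compression flag is an example), with the dump-level bits noted for reference:

    # /etc/kdump.conf -- illustrative sketch
    # dump level bits: 1=zero pages, 2=non-private cache, 4=private cache,
    #                  8=user-process data, 16=free pages
    # -d 31 excludes all of the above; -d 1 excludes only zero-filled pages
    core_collector makedumpfile -d 1 -c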

Note that you don't necessarily have to upload that whole vmcore to us. You could try reducing the size locally with makedumpfile, then upload the smaller core. If that works, goodo. If we need more pages, we can re-run with a lower dump level (stripping fewer pages, or none at all) and work from a bigger file.
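
As a rough sketch of that local re-filtering step (the paths are placeholders, and depending on the original vmcore format you may also need the matching vmlinux from the kernel-debuginfo package):

    # re-filter an existing vmcore to a higher dump level and compress it
    makedumpfile -c -d 31 /var/crash/<host>-<date>/vmcore vmcore.filtered
    # for a plain ELF-format vmcore you may also need: -x <path to matching debuginfo vmlinux>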

Back to your original question. One option could be to provision a large (say, one or two terabyte) shared dump target and point a large number of systems at it. It's unlikely that all your boxes will panic at the same time, so even though there are collectively many terabytes of RAM out there, any one system that does crash can use the dump target with room to spare.
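
A minimal sketch of what the client side of such a shared NFS dump target could look like (the hostname and export path are made up; kdump writes each crash into its own per-host, per-date subdirectory, so many machines can share one export):

    # /etc/kdump.conf -- hypothetical shared-target sketch
    nfs dumphost.example.com:/export/crash
    path /crashdumps                      # directory below the NFS export (example)
    core_collector makedumpfile -d 1 -c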

I am very interested to see how others are dealing with this as well.