9.5. Viewing Storage Node Alerts
- The JVM heap hits its maximum threshold and causes performance degradation.
- The storage node begins using too much disk space on its system.
- High heap usage, which can lead to out-of-memory errors and performance degradation. A dampening rule is in place to prevent alerts for momentary memory spikes.
- High disk usage, which can lead to problems with compaction and other routine operations. Compaction is particularly important because it merges data files on the disk into a single file, which frees disk space and improves read performance. If a compaction operation fails, performance can degrade. A dampening rule is in place to prevent alerts for momentary usage spikes. An alert is fired for high disk usage if any one of the following conditions is met:
- The size of the storage node data exceeds 50% of total disk space.
- The overall amount of disk space used exceeds 75% of the total disk space (regardless of how much disk space the storage node is using).
- The ratio of free disk space to storage node data is less than 1.5. This is calculated by dividing the amount of free disk space by the disk space used by the storage node. If there is 50 MB of free space and the storage node is using 35 MB of disk, then the ratio is 50/35, or approximately 1.43. That is below 1.5 and would trigger an alert.
- Snapshot failure, meaning a local routine backup operation has failed.
- Maintenance operation failure, meaning either a deploy or undeploy operation for a node failed. Any underlying causes, like an unavailable resource, can be addressed and then the operation can be re-run.
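The three high disk usage conditions listed above can be expressed as a short check. The following Python sketch is illustrative only and is not the product's implementation: the `DiskUsage` class, the `high_disk_usage_reasons` function, and the 100 MB total disk size in the example are assumptions introduced here, and the dampening rule is not modeled.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds taken from the three conditions listed above.
MAX_NODE_DATA_FRACTION = 0.50   # storage node data vs. total disk space
MAX_TOTAL_USED_FRACTION = 0.75  # all used space vs. total disk space
MIN_FREE_TO_DATA_RATIO = 1.5    # free space divided by storage node data


@dataclass
class DiskUsage:
    """Hypothetical container for disk measurements, all in the same unit (for example, MB)."""
    total: float      # total disk space
    used: float       # space used by everything on the disk
    node_data: float  # space used by the storage node data

    @property
    def free(self) -> float:
        return self.total - self.used


def high_disk_usage_reasons(usage: DiskUsage) -> list[str]:
    """Return the conditions that are met; an empty list means no condition is met."""
    reasons = []
    if usage.node_data > MAX_NODE_DATA_FRACTION * usage.total:
        reasons.append("node data exceeds 50% of total disk space")
    if usage.used > MAX_TOTAL_USED_FRACTION * usage.total:
        reasons.append("overall usage exceeds 75% of total disk space")
    # Guard against division by zero when the node holds no data yet.
    if usage.node_data > 0 and usage.free / usage.node_data < MIN_FREE_TO_DATA_RATIO:
        reasons.append("free-to-data ratio is below 1.5")
    return reasons


# The example from the text: 50 MB free, 35 MB of node data.
# The 100 MB total is an assumption made only so the other checks can run.
example = DiskUsage(total=100, used=50, node_data=35)
print(high_disk_usage_reasons(example))
# ['free-to-data ratio is below 1.5']
```

In this example only the ratio condition is met (50/35 is about 1.43), which is enough on its own because any single condition qualifies as high disk usage.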
Table 9.1. Storage Resources for Alerts
|Alert||Parent Resource||Resource Type||Area to Address|
|High Heap Usage||Cassandra Server JVM||Memory Subsystem||Edit the heap sizes in the storage node JVM configuration|
|High Disk Usage||Database Management Services||Storage Service||Increase the disk space for the system hosting the node|
|Snapshot Failure||Database Management Services||Storage Service|
|Maintenance Operation Failure||Storage Node||Unavailable storage nodes in the cloud (which prevent updates)|
- Click the Administration tab in the top navigation bar.
- In the Topology area on the left, select the Storage Nodes item.
- The Nodes tab shows the number of unacknowledged alerts for each node.
- To view the list of alerts, open the Cluster Alerts tab. Every alert is listed with a description of the condition that triggered it, the affected resource, and the time of the alert.