Number of entries in Distributed Cache
Hi,
We are trying to find the total number of unique entries across a distributed cache.
We are unable to find any API support for this. All the size() methods return the total number of entries in the local cache.
This statistic is an important application-level requirement.
Is there any way to identify the total number of unique entries in a distributed cache?
Regards,
Vadivel G
Responses
Unfortunately there is no easy way to get the exact number of unique entries in the cluster, so we have to calculate it manually.
If you use a shared cache store, you can probably get the number of entries from it easily.
To calculate it, we need to get the key set from every node and merge them to exclude duplicate entries, using the MapReduce framework (library mode) or HotRod (server mode). With large key sets this can throw an OutOfMemoryError partway through, so the count cannot always be computed in a straightforward way; this is one of the reasons we can't provide such an API. In that case we might have to store the key sets on disk (probably slow) or use a technique like a Bloom filter (http://en.wikipedia.org/wiki/Bloom_filter) to detect duplicates with limited memory.
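The merge step itself can be sketched in plain Java. This is only an illustration of the deduplication idea, not an Infinispan API: the class UniqueKeyCounter and the method countUnique are hypothetical names, and how the key sets are fetched from each node (MapReduce, HotRod, or otherwise) is left out. It also holds every key in memory, so it has the OutOfMemoryError risk described above.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Infinispan.
public class UniqueKeyCounter {

    // Merges the per-node key sets into one set and returns the number of
    // distinct keys. Keys held by several owners are counted only once.
    public static <K> int countUnique(Collection<? extends Set<K>> perNodeKeySets) {
        Set<K> merged = new HashSet<>();
        for (Set<K> keys : perNodeKeySets) {
            merged.addAll(keys);
        }
        return merged.size();
    }

    public static void main(String[] args) {
        // Two made-up nodes whose key sets overlap on "b" and "c".
        List<Set<String>> nodes = List.of(
                Set.of("a", "b", "c"),
                Set.of("b", "c", "d"));
        System.out.println(countUnique(nodes)); // prints 4
    }
}
```

For key sets too large to hold in one HashSet, the merged set is the part that would have to be replaced by a disk-backed structure or a Bloom filter.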
The size() API, or the JMX attribute numberOfEntries on the Infinispan cache "component=Statistics" MBean, returns the total number of entries on that node. It includes backup entries and L1 cache entries, so the entries are not unique, but it helps to know the approximate number of entries.
If you disable the L1 cache, you can calculate the exact number of entries by getting size/numberOfEntries from all nodes, summing them, and dividing by numOwners. For example, in a 4-node cluster with size/numberOfEntries=20 on each node and numOwners=2, the number of entries in the cluster is 20 * 4 / 2 = 40.
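The arithmetic above can be written as a small helper. This is a minimal sketch that assumes the per-node values have already been collected (for example from size() or numberOfEntries on each node) and that L1 is disabled; ClusterEntryEstimate and totalEntries are hypothetical names, not Infinispan APIs.

```java
// Hypothetical helper, not part of Infinispan.
public class ClusterEntryEstimate {

    // Sums the per-node entry counts and divides by numOwners, since each
    // entry is stored on numOwners nodes when L1 is disabled.
    public static long totalEntries(long[] perNodeSizes, int numOwners) {
        if (numOwners <= 0) {
            throw new IllegalArgumentException("numOwners must be positive");
        }
        long sum = 0;
        for (long size : perNodeSizes) {
            sum += size;
        }
        return sum / numOwners;
    }

    public static void main(String[] args) {
        // The example from the post: 4 nodes, 20 entries each, numOwners=2.
        long[] sizes = {20, 20, 20, 20};
        System.out.println(totalEntries(sizes, 2)); // prints 40
    }
}
```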
Hope that helps,
Takayoshi
