16.5. Grouping and Aggregation Operations

The Infinispan Query DSL has the ability to group query results according to a set of grouping fields and construct aggregations of the results from each group by applying an aggregation function to the set of values. Grouping and aggregation can only be used with projection queries.
The set of grouping fields is specified by calling the method groupBy(field) multiple times. The order of grouping fields is not relevant.
All non-grouping fields selected in the projection must be aggregated using one of the grouping functions described below.

Example 16.5. Grouping Books by author and counting them

Query query = queryFactory.from(Book.class)
    .select(Expression.property("author"), Expression.count("title"))
    .having("title").like("%engine%")
    .groupBy("author")
    .toBuilder()
    .build();

// results.get(0)[0] will contain the first matching entry's author
// results.get(0)[1] will contain the first matching entry's title
List<Object[]> results = query.list();
Aggregation Operations

The following aggregation operations may be performed on a given field:

  • avg() - Computes the average of a set of Numbers, represented as a Double. If there are no non-null values the result is null instead.
  • count() - Returns the number of non-null rows as a Long. If there are no non-null values the result is 0 instead.
  • max() - Returns the greatest value found in the specified field, with a return type equal to the field in which it was applied. If there are no non-null values the result is null instead.

    Note

    Values in the given field must be of type Comparable, otherwise an IllegalStateException will be thrown.
  • min() - Returns the smallest value found in the specified field, with a return type equal to the field in which it was applied. If there are no non-null values the result is null instead.

    Note

    Values in the given field must be of type Comparable, otherwise an IllegalStateException will be thrown.
  • sum() - Computes and returns the sum of a set of Numbers, with a return type dependent on the indicated field's type. If there are no non-null values the result is null instead.
    The following table indicates the return type based on the specified field.

    Table 16.1. Sum Return Type

    Field Type Return Type
    Integral (other than BigInteger) Long
    Floating Point Double
    BigInteger BigInteger
    BigDecimal BigDecimal
Projection Query Special Cases

The following cases items describe special use cases with projection queries:

  • A projection query in which all selected fields are aggregated and none is used for grouping is legal. In this case the aggregations will be computed globally instead of being computed per each group.
  • A grouping field can be used in an aggregation. This is a degenerated case in which the aggregation will be computed over a single data point, the value belonging to current group.
  • A query that selects only grouping fields but no aggregation fields is legal.
Evaluation of grouping and aggregation queries

Aggregation queries can include filtering conditions, like usual queries, which may be optionally performed before and after the grouping operation.

All filter conditions specified before invoking the groupBy method will be applied directly to the cache entries before the grouping operation is performed. These filter conditions may refer to any properties of the queried entity type, and are meant to restrict the data set that is going to be later used for grouping.
All filter conditions specified after invoking the groupBy method will be applied to the projection that results from the grouping operation. These filter conditions can either reference any of the fields specified by groupBy or aggregated fields. Referencing aggregated fields that are not specified in the select clause is allowed; however, referencing non-aggregated and non-grouping fields is forbidden. Filtering in this phase will reduce the amount of groups based on their properties.
Ordering may also be specified similar to usual queries. The ordering operation is performed after the grouping operation and can reference any fields that are allowed for post-grouping filtering as described earlier.