Chapter 5. Score Calculation

5.1. Score Terminology

5.1.1. What is a Score?

Every initialized Solution has a score. The score is an objective way to compare two solutions. The solution with the higher score is better. The Solver aims to find the Solution with the highest Score of all possible solutions. The best solution is the Solution with the highest Score that Solver has encountered during solving, which might be the optimal solution.

Planner cannot automatically know which Solution is best for your business, so you need to tell it how to calculate the score of a given Solution according to your business needs. If you forget or are unable to implement an important business constraint, the solution is probably useless:

optimalWithIncompleteConstraints

Defining constraints in Planner is very flexible through the following score techniques:

  • Score signum (positive or negative): maximize or minimize a constraint type
  • Score weight: put a cost/profit on a constraint type
  • Score level (hard, soft, …​): prioritize a group of constraint types
  • Pareto scoring

5.1.2. Score Constraint Signum (Positive or Negative)

All score techniques are based on constraints. A constraint can be a simple pattern (such as Maximize the apple harvest in the solution) or a more complex pattern. A positive constraint is a constraint you want to maximize. A negative constraint is a constraint you want to minimize.

positiveAndNegativeConstraints

The image above illustrates that the optimal solution always has the highest score, regardless if the constraints are positive or negative.

Most planning problems have only negative constraints and therefore have a negative score. In that case, the score is the sum of the weight of the negative constraints being broken, with a perfect score of 0. This explains why the score of a solution of four queens is the negative of the number of queen pairs which can attack each other.

Negative and positive constraints can be combined, even in the same score level.

Note

Do not presume that your business knows all its score constraints in advance. Expect score constraints to be added or changed after the first releases.

When a constraint activates (because the negative constraint is broken or the positive constraint is fulfilled) on a certain planning entity set, it is called a constraint match.

5.1.3. Score Constraint Weight

Not all score constraints are equally important. If breaking one constraint is equally bad as breaking another constraint x times, then those two constraints have a different weight (but they are in the same score level). For example in vehicle routing, you can make one “unhappy driver” constraint match count as much as two “fuel tank usage” constraint matches:

scoreWeighting

Score weighting is often used in use cases where you can put a price tag on everything. In that case, the positive constraints maximize revenue and the negative constraints minimize expenses, so together they maximize profit. Alternatively, score weighting is also often used to create social fairness. For example, a nurse, who requests a free day, pays a higher weight on New Years eve than on a normal day.

Putting a good weight on a constraint can be a difficult analytical decision, because it is about making choices and tradeoffs with other constraints. However, a non-accurate weight is less damaging than mediocre algorithms:

scoreTradeoffInPerspective

The weight of a constraint match can be dynamically based on the planning entities involved. For example in cloud balance, the weight of the soft constraint match for an active Computer is the cost of that Computer (which differs per computer).

Note

Furthermore, it is often useful to allow the end-user to recalibrate constraint weights in the user interface, as demonstrated in the exam timetabling example with the InstitutionParametrization class.

5.1.4. Score Constraint Level (hard, soft, …​)

Sometimes a score constraint outranks another score constraint, no matter how many times the other is broken. In that case, those score constraints are in different levels. For example, a nurse cannot do 2 shifts at the same time (due to the constraints of physical reality), this outranks all nurse happiness constraints.

Most use cases have only two score levels, hard and soft. Two scores are compared lexicographically. The first score level gets compared first. If those differ, the others score levels are ignored. For example, a score that breaks 0 hard constraints and 1 000 000 soft constraints is better than a score that breaks 1 hard constraint and 0 soft constraints.

scoreLevels

If there are two (or more) score levels, for example a hard and soft level, then a score is feasible if no hard constraints are broken.

Note

By default, Planner will always assign all planning variables a planning value. If there is no feasible solution, this means the best solution will be unfeasible. To instead leave some of the planning entities unassigned, apply overconstrained planning.

For each constraint, you need to pick a score level, a score weight and a score signum. For example: -1soft which has score level of soft, a weight of 1 and a negative signum. Do not use a big constraint weight when your business actually wants different score levels. That hack, known as score folding, is broken:

scoreFoldingIsBroken
Note

Your business might tell you that your hard constraints all have the same weight, because they cannot be broken (so the weight does not matter). This is not true because if no feasible solution exists for a specific dataset, the least infeasible solution allows the business to estimate how many business resources they are lacking. For example in cloud balancing, how many new computers to buy.

Furthermore, it will likely create a score trap. For example in cloud balance if a Computer has seven CPU too little for its Processes, then it must be weighted seven times as much as if it had only one CPU too little.

Three or more score levels are supported. For example: a company might decide that profit outranks employee satisfaction (or vice versa), while both are outranked by the constraints of physical reality.

Note

To model fairness or load balancing, there is no need to use lots of score levels (even though Planner can handle many score levels).

5.1.5. Pareto Scoring (AKA Multi-objective Optimization Scoring)

Far less common is the use case of pareto optimization, which is also known under the more confusing term multi-objective optimization. In pareto scoring, score constraints are in the same score level, yet they are not weighted against each other. When two scores are compared, each of the score constraints are compared individually and the score with the most dominating score constraints wins. Pareto scoring can even be combined with score levels and score constraint weighting.

Consider this example with positive constraints, where we want to get the most apples and oranges. Since it is impossible to compare apples and oranges, we can not weight them against each other. Yet, despite that we can not compare them, we can state that two apples are better then one apple. Similarly, we can state that two apples and one orange are better than just one orange. So despite our inability to compare some Scores conclusively (at which point we declare them equal), we can find a set of optimal scores. Those are called pareto optimal.

paretoOptimizationScoring

Scores are considered equal far more often. It is left up to a human to choose the better out of a set of best solutions (with equal scores) found by Planner. In the example above, the user must choose between solution A (three apples and one orange) and solution B (one apple and six oranges). It is guaranteed that Planner has not found another solution which has more apples or more oranges or even a better combination of both (such as two apples and three oranges).

To implement pareto scoring in Planner, implement a custom ScoreDefinition and Score (and replace the BestSolutionRecaller). Future versions will provide out-of-the-box support.

Note

A pareto Score's compareTo method is not transitive because it does a pareto comparison. For example: having two apples is greater than one apple. One apple is equal to one orange. Yet, two apples are not greater than one orange (but actually equal). Pareto comparison violates the contract of the interface java.lang.Comparable's compareTo method, but Planner’s systems are pareto comparison safe, unless explicitly stated otherwise in this documentation.

5.1.6. Combining Score Techniques

All the score techniques mentioned above, can be combined seamlessly:

scoreComposition

5.1.7. Score interface

A score is represented by the Score interface, which naturally extends Comparable:

public interface Score<...> extends Comparable<...> {
    ...
}

The Score implementation to use depends on your use case. Your score might not efficiently fit in a single long value. Planner has several built-in Score implementations, but you can implement a custom Score too. Most use cases tend to use the built-in HardSoftScore.

scoreClassDiagram

The Score implementation (for example HardSoftScore) must be the same throughout a Solver runtime. The Score implementation is configured in the solver configuration as a ScoreDefinition:

  <scoreDirectorFactory>
    <scoreDefinitionType>HARD_SOFT</scoreDefinitionType>
    ...
  </scoreDirectorFactory>

5.1.8. Avoid Floating Point Numbers in Score Calculation

Avoid the use of float and double for score calculation. Use BigDecimal instead.

Floating point numbers (float and double) cannot represent a decimal number correctly. For example: a double cannot hold the value 0.05 correctly. Instead, it holds the nearest representable value. Arithmetic (including addition and subtraction) with floating point numbers, especially for planning problems, leads to incorrect decisions:

scoreWeightType

Additionally, floating point number addition is not associative:

System.out.println( ((0.01 + 0.02) + 0.03) == (0.01 + (0.02 + 0.03)) ); // returns false

This leads to score corruption.

Decimal numbers (BigDecimal) have none of these problems.

Note

BigDecimal arithmetic is considerably slower than int, long or double arithmetic. In experiments we have seen the average calculation count get divided by 5.

Therefore, in some cases, it can be worthwhile to multiply all numbers for a single score weight by a plural of ten (for example 1000), so the score weight fits in an int or long.