Chapter 12. PMML support in Red Hat Process Automation Manager

Red Hat Process Automation Manager includes consumer conformance support for the following PMML model types:

  • Regression models
  • Scorecard models
  • Tree models
  • Mining models

For a list of all PMML model types, including those not supported in Red Hat Process Automation Manager, see the DMG PMML specification.

Red Hat Process Automation Manager offers two PMML implementations: PMML legacy and PMML trusty.

Important

The PMML legacy implementation is deprecated as of Red Hat Process Automation Manager 7.10.0 and will be replaced by the PMML trusty implementation in a future Red Hat Process Automation Manager release.

Red Hat Process Automation Manager does not include a built-in PMML model editor, but you can use an XML or PMML-specific authoring tool to create PMML models and then integrate the PMML models in your decision services in Red Hat Process Automation Manager. You can import PMML files into your project in Business Central (Menu → Design → Projects → Import Asset) or package the PMML files as part of your project knowledge JAR (KJAR) file without Business Central.

For more information about including assets such as PMML files with your project packaging and deployment method, see Packaging and deploying a Red Hat Process Automation Manager project.

You can migrate a PMML service to a Red Hat build of Kogito microservice. For more information about migrating to Red Hat build of Kogito microservices, see Migrating to Red Hat build of Kogito microservices.

12.1. PMML trusty support and naming conventions in Red Hat Process Automation Manager

When you add a PMML file to a project in Red Hat Process Automation Manager, multiple assets are generated. The tree and scorecard models are translated to rules, and regression and mining models are translated to Java classes. Each type of PMML model generates a different set of assets, but all PMML model types generate at least the following set of assets:

  • A root package whose name is derived from the PMML file name
  • In the root package, a Java factory class that is used to instantiate the model
  • A subpackage specific to the model whose name is derived from the model name
  • For rule models, two rule-mapper classes that are used to instantiate the rule network
  • For mining models, child model packages and classes that are nested in the parent model

Note

Currently, only one model for each PMML file is allowed. Also, extensions are temporarily not supported.

The following are naming conventions for generated PMML packages and classes:

  • The root package name is the name of the original PMML file in lowercase and without spaces, for example, sampleregression.
  • The name of the generated factory Java class is the PMML file name with its first letter in uppercase and Factory added to it, in the format fileName+"Factory", for example, SampleRegressionFactory.
  • The subpackage name of a model is the name of the original model in lowercase and without spaces, for example, compoundnestedpredicatescorecard.
  • The names of the generated data classes are determined by the model type:

    • Rule models: A top-level PMMLRuleMappersImpl class is generated that includes references to the PMMLRuleMapperImpl classes that are nested in the subpackages.
    • Mining models:

      • The name of the created segmentation subpackage is the name of the original model in lowercase, without spaces, and with segmentation added to it in the format modelName+"segmentation", for example, mixedminingsegmentation.
      • In the segmentation subpackage, a segmentation Java class is created that contains the references to the nested models. The name of the created segmentation Java class is the model name with Segmentation added to it in the format modelName+"Segmentation", for example, MixedMiningSegmentation.
      • For each segment, a specific subpackage is created. The name of the segment-specific subpackage is the original model name in lowercase with segment and a progressive integer starting from 0 added to it in the format modelName+"segment"+integer, for example, mixedminingsegment0 and mixedminingsegment1.
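The naming conventions above can be sketched as a few string transformations. The following Java class is a hypothetical helper written only for illustration; it is not part of any Red Hat Process Automation Manager API. It assumes that the .pmml extension is not part of the file name and that spaces are the only characters removed, which is all that the conventions state.

```java
import java.util.Locale;

// Hypothetical helper that mirrors the documented PMML trusty naming conventions.
// It is an illustration only and not a product API.
final class PmmlTrustyNames {

    // Lowercase the name and drop spaces, as the conventions describe.
    static String toPackageName(String name) {
        return name.replace(" ", "").toLowerCase(Locale.ROOT);
    }

    // Assumption: a trailing ".pmml" extension is not part of the "file name".
    static String baseName(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".pmml")
                ? fileName.substring(0, fileName.length() - ".pmml".length())
                : fileName;
    }

    // Root package: file name in lowercase and without spaces.
    static String rootPackage(String fileName) {
        return toPackageName(baseName(fileName));
    }

    // Factory class: file name with first letter in uppercase plus "Factory".
    // Assumption: spaces are removed, because Java class names cannot contain them.
    static String factoryClass(String fileName) {
        String base = baseName(fileName).replace(" ", "");
        return Character.toUpperCase(base.charAt(0)) + base.substring(1) + "Factory";
    }

    // Model subpackage: model name in lowercase and without spaces.
    static String modelSubpackage(String modelName) {
        return toPackageName(modelName);
    }

    // Mining models: modelName+"segmentation" (package) and modelName+"Segmentation" (class).
    static String segmentationPackage(String modelName) {
        return toPackageName(modelName) + "segmentation";
    }

    static String segmentationClass(String modelName) {
        return modelName.replace(" ", "") + "Segmentation";
    }

    // Mining models: one subpackage per segment, numbered from 0.
    static String segmentPackage(String modelName, int index) {
        return toPackageName(modelName) + "segment" + index;
    }

    public static void main(String[] args) {
        System.out.println(rootPackage("SampleRegression.pmml"));   // sampleregression
        System.out.println(factoryClass("SampleRegression.pmml"));  // SampleRegressionFactory
        System.out.println(segmentationPackage("MixedMining"));     // mixedminingsegmentation
        System.out.println(segmentPackage("MixedMining", 0));       // mixedminingsegment0
    }
}
```

A helper like this can be useful when you need to predict the name of a generated package or class, for example to locate it in a build output directory. The generated names remain the authoritative reference.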

Known limitations of PMML trusty implementation

The following list shows elements that are not implemented for PMML trusty:

  • Target element is not implemented
  • Extension element is not implemented
  • MiningSchema or MiningField elements that are not implemented include:

    • importance
    • outliers
    • lowValue
    • highValue
    • invalidValueTreatment
    • invalidValueReplacement
  • OutputField elements that are not implemented include:

    • Decisions
    • Value
    • ruleFeature
    • algorithm
    • isMultiValued
    • segmentId
    • isFinalResult
  • TransformationDictionary or LocalTransformation expressions that are not supported include:

    • NormContinuous
    • NormDiscrete
    • MapValues
    • TextIndex
    • Aggregate
    • Lag
  • ModelStats and ModelExplanation elements are not implemented in any model type, including regression, tree, scorecard, and mining
  • verification element is not implemented in tree, scorecard, and mining models
  • VariableWeight element is not implemented in mining models
  • Tree model elements that are not implemented include:

    • IsMissing or IsNotMissing
    • Surrogate in CompoundPredicate
    • missingValuePenalty
    • splitCharacteristic
    • isScorable
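Before you add a model to a project, it can help to check the PMML file for constructs from the list above. The following is a minimal sketch that uses only the JDK XML parser. The class name, method name, and the choice of which entries to check are assumptions made for illustration: the sketch looks only for entries in the list that are XML elements, and it does not check attributes such as invalidValueTreatment or isScorable.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical pre-flight check; not part of Red Hat Process Automation Manager.
final class PmmlTrustyPreflight {

    // Element names copied from the limitations list (elements only, no attributes).
    static final Set<String> UNSUPPORTED_ELEMENTS = Set.of(
            "Target", "Extension", "NormContinuous", "NormDiscrete", "MapValues",
            "TextIndex", "Aggregate", "Lag", "ModelStats", "ModelExplanation",
            "VariableWeight");

    // Returns the sorted names of listed elements that occur anywhere in the document.
    static Set<String> findUnsupported(String pmmlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness makes getLocalName() return the bare element name.
            factory.setNamespaceAware(true);
            // Reject DOCTYPE declarations to avoid external entity processing.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(pmmlXml.getBytes(StandardCharsets.UTF_8)));

            Set<String> found = new TreeSet<>();
            NodeList all = doc.getElementsByTagName("*"); // every element in the document
            for (int i = 0; i < all.getLength(); i++) {
                String name = all.item(i).getLocalName();
                if (UNSUPPORTED_ELEMENTS.contains(name)) {
                    found.add(name);
                }
            }
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse PMML document", e);
        }
    }
}
```

An empty result means only that none of the listed elements were found. It does not guarantee that the model is fully supported, because unsupported attributes and predicates are outside the scope of this sketch.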

12.2. PMML legacy support and naming conventions in Red Hat Process Automation Manager

When you add a PMML file to a project in Red Hat Process Automation Manager, multiple assets are generated. Each type of PMML model generates a different set of assets, but all PMML model types generate at least the following set of assets:

  • A DRL file that contains all of the rules associated with your PMML model
  • At least two Java classes:

    • A data class that is used as the default object type for the model type
    • A RuleUnit class that is used to manage data sources and rule execution

If a PMML file has MiningModel as the root model, multiple instances of each of these files are generated.

The following are naming conventions for generated PMML legacy packages, classes, and rules:

  • If no package name is given in a PMML model file, then the default package name org.kie.pmml.pmml_4_2 is prefixed to the model name for the generated rules in the format "org.kie.pmml.pmml_4_2"+modelName.
  • The package name for the generated RuleUnit Java class is the same as the package name for the generated rules.
  • The name of the generated RuleUnit Java class is the model name with RuleUnit added to it in the format modelName+"RuleUnit".
  • Each PMML model has at least one data class that is generated. The package name for these classes is org.kie.pmml.pmml_4_2.model.
  • The names of generated data classes are determined by the model type, prefixed with the model name:

    • Regression models: One data class named modelName+"RegressionData"
    • Scorecard models: One data class named modelName+"ScoreCardData"
    • Tree models: Two data classes, the first named modelName+"TreeNode" and the second named modelName+"TreeToken"
    • Mining models: One data class named modelName+"MiningModelData"

Note

The mining model also generates all of the rules and classes that are within each of its segments.
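The class-name conventions for the PMML legacy implementation can be illustrated with a short Java sketch. The class, enum, and method names below are hypothetical and exist only for this example. The sketch covers the RuleUnit and data class names and deliberately leaves out the default rule package name, because the exact separator between org.kie.pmml.pmml_4_2 and the model name is not spelled out above.

```java
import java.util.List;

// Hypothetical helper that mirrors the documented PMML legacy class names.
// It is an illustration only and not a product API.
final class PmmlLegacyNames {

    // Package that contains the generated data classes, as stated above.
    static final String DATA_PACKAGE = "org.kie.pmml.pmml_4_2.model";

    enum ModelType { REGRESSION, SCORECARD, TREE, MINING }

    // RuleUnit class: modelName+"RuleUnit".
    static String ruleUnitClass(String modelName) {
        return modelName + "RuleUnit";
    }

    // Data classes: the suffix depends on the model type; tree models get two classes.
    static List<String> dataClasses(String modelName, ModelType type) {
        switch (type) {
            case REGRESSION:
                return List.of(modelName + "RegressionData");
            case SCORECARD:
                return List.of(modelName + "ScoreCardData");
            case TREE:
                return List.of(modelName + "TreeNode", modelName + "TreeToken");
            case MINING:
                return List.of(modelName + "MiningModelData");
            default:
                throw new IllegalArgumentException("Unknown model type: " + type);
        }
    }

    public static void main(String[] args) {
        // "SampleTree" is an invented model name used only for this example.
        System.out.println(ruleUnitClass("SampleTree"));
        for (String name : dataClasses("SampleTree", ModelType.TREE)) {
            System.out.println(DATA_PACKAGE + "." + name);
        }
    }
}
```

For a model named SampleTree, this sketch yields SampleTreeRuleUnit, SampleTreeTreeNode, and SampleTreeTreeToken, with the two data classes in the org.kie.pmml.pmml_4_2.model package.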

12.2.1. PMML extensions in Red Hat Process Automation Manager

The PMML legacy specification supports Extension elements that extend the content of a PMML model. You can use extensions at almost every level of a PMML model definition, and as the first and last child in the main element of a model for maximum flexibility. For more information about PMML extensions, see the DMG PMML Extension Mechanism.

To optimize PMML integration, Red Hat Process Automation Manager supports the following additional PMML extensions:

  • modelPackage: Designates a package name for the generated rules and Java classes. Include this extension in the Header section of the PMML model file.
  • adapter: Designates the type of construct (bean or trait) that is used to contain input and output data for rules. Insert this extension in the MiningSchema or Output section (or both) of the PMML model file.
  • externalClass: Used in conjunction with the adapter extension in defining a MiningField or OutputField. This extension contains a class with an attribute name that matches the name of the MiningField or OutputField element.