Chapter 13. Common Use Cases

13.1. Support for Processing Huge Messages

Smooks supports the following types of processing for huge messages:
  • One-to-one transformation: This is the process of transforming a huge message from its source format (for example, XML) to a huge message in a target format (EDI, CSV, XML, and so on).
  • Splitting and routing: Splitting of a huge message into smaller (more consumable) messages in any format (EDI, XML, Java etc.) and routing of those smaller messages to a number of different destination types (File, JMS, Database).
  • Persistence: Persisting the components of the huge message to a Database, from where they can be more easily queried and processed. Within Smooks, we consider this to be a form of splitting and routing (routing to a Database).
All of the above can be done declaratively, that is, without writing any code. Each type of processing can also be handled in a single pass over the source message, with splitting and routing performed in parallel, including routing to multiple destinations of different types and in different formats.

Note

When processing huge messages with Smooks, make sure you are using the SAX filter for better performance.
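
The reason a SAX-style filter suits huge messages can be sketched with the plain JDK SAX parser. This is not Smooks code: the class name SaxStreamingSketch and the sample message are assumptions made for illustration. The handler reacts to each element as it streams past and never builds a tree of the whole message, so memory use does not grow with message size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical illustration: counts elements as they stream past, without building a DOM. */
final class SaxStreamingSketch {

    static int countElements(String xml, String elementName) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Only a counter is kept; the element itself is discarded immediately.
                if (qName.equals(elementName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse message", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<order id=\"332\"><order-items>"
                + "<order-item id=\"1\"/><order-item id=\"2\"/>"
                + "</order-items></order>";
        System.out.println(countElements(xml, "order-item")); // prints 2
    }
}
```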

13.2. Transforming Huge Messages with FreeMarker

To process a huge message by transforming it into a single message of another format, you can apply multiple FreeMarker templates to the Source message Event Stream and output it to a Smooks.filterSource Result stream. You can do this in one of two ways:
  • Using FreeMarker and NodeModels for the model.
  • Using FreeMarker and a Java Object model for the model. The model can be constructed from data in the message, using the Javabean Cartridge.

13.3. Huge Messages and NodeModels

When a message is huge, you must identify its multiple NodeModels so that the runtime memory footprint is as low as possible. You cannot process the message using a single model because the full message is too big to hold in memory. In the case of the order message, there are two models: one for the main order data and one for the order-item data.
The most data that will be in memory at any one time is the main order data plus one of the order-items. Because the NodeModels are nested, Smooks makes sure that the order data NodeModel never contains any of the data from the order-item NodeModels. Also, as Smooks filters the message, the order-item NodeModel is overwritten for every order-item (that is, the order-items are not collected).
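
The idea of one long-lived order model plus one order-item model that is overwritten per item can be sketched with the JDK StAX reader. This is an analogy only and does not use Smooks NodeModels: the class name NestedModelSketch, the use of plain maps as "models", and the sample message are all assumptions. The returned list stands in for the result stream; the item model itself is never accumulated.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical illustration of a header model plus a per-item model that is reused. */
final class NestedModelSketch {

    static List<String> process(String xml) {
        Map<String, String> orderModel = new HashMap<>();     // header-level data only
        Map<String, String> orderItemModel = new HashMap<>(); // overwritten for every order-item
        List<String> output = new ArrayList<>();              // stands in for the result stream
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("order")) {
                        orderModel.put("id", reader.getAttributeValue(null, "id"));
                    } else if (name.equals("order-item")) {
                        orderItemModel.clear(); // drop the previous item before holding the next
                        orderItemModel.put("id", reader.getAttributeValue(null, "id"));
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("order-item")) {
                    // At the end of each item, both models are available to produce output.
                    output.add(orderModel.get("id") + "/" + orderItemModel.get("id"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read message", e);
        }
        return output;
    }

    public static void main(String[] args) {
        String xml = "<order id=\"332\"><order-items>"
                + "<order-item id=\"1\"/><order-item id=\"2\"/>"
                + "</order-items></order>";
        System.out.println(process(xml)); // prints [332/1, 332/2]
    }
}
```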

13.4. Configuring Smooks to Capture Multiple NodeModels

  1. To configure Smooks to capture multiple NodeModels for use by the FreeMarker templates, you should configure the DomModelCreator visitor. It should be targeted at the root node of each model. Note again that Smooks also makes this available to SAX filtering (the key to processing huge messages).
    This is the Smooks configuration for creating the NodeModels for the message:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" 
                          xmlns:core="http://www.milyn.org/xsd/smooks/smooks-core-1.3.xsd"
                          xmlns:ftl="http://www.milyn.org/xsd/smooks/freemarker-1.1.xsd">
     
        <!--
        Filter the message using the SAX Filter (i.e. not DOM, so no
        intermediate DOM for the "complete" message - there are "mini" DOMs
        for the NodeModels below)....
        -->
        <core:filterSettings type="SAX" defaultSerialization="false" />
     
        <!--
        Create 2 NodeModels. One high level model for the "order"
        (header etc) and then one for the "order-item" elements...
        -->
        <resource-config selector="order,order-item">
            <resource>org.milyn.delivery.DomModelCreator</resource>
        </resource-config>
     
        <!-- FreeMarker templating configs to be added below... -->
    
  2. Next, apply the following FreeMarker templates:
    • A template to output the order header details, up to but not including the order items.
    • A template for each of the order items, to generate the item elements in the salesorder.
    • A template to close out the message.
    With Smooks, you can implement this by defining two FreeMarker templates: one to cover points one and three (combined) above, and a second to cover the item elements.
  3. Apply the first FreeMarker template. It is targeted at the order-items element and looks like this:
    <ftl:freemarker applyOnElement="order-items">
            <ftl:template><!--<salesorder>
        <details>
            <orderid>${order.@id}</orderid>
            <customer>
                <id>${order.header.customer.@number}</id>
                <name>${order.header.customer}</name>
            </customer>
        </details>
        <itemList>
        <?TEMPLATE-SPLIT-PI?> 
        </itemList>
    </salesorder>-->
            </ftl:template>
        </ftl:freemarker>
    
    The ?TEMPLATE-SPLIT-PI? processing instruction tells Smooks where to split the template, outputting the first part of the template at the start of the order-items element, and the other part at the end of the order-items element. The item element template (the second template) will be output in between.
  4. Apply the second FreeMarker template. This outputs the item elements at the end of every order-item element in the source message:
    <ftl:freemarker applyOnElement="order-item">
            <ftl:template><!-- <item>
        <id>${.vars["order-item"].@id}</id>
        <productId>${.vars["order-item"].product}</productId>
        <quantity>${.vars["order-item"].quantity}</quantity>
        <price>${.vars["order-item"].price}</price>
    </item>-->
            </ftl:template>
        </ftl:freemarker>
    </smooks-resource-list>
    
    Because the second template fires on the end of the order-item elements, it effectively generates output into the location of the ?TEMPLATE-SPLIT-PI? processing instruction in the first template. Note that the second template could have also referenced data in the order NodeModel.
  5. Apply a closing template of your choice.

    Note

    This approach to performing a one-to-one transformation of a huge message works because the only objects in memory at any one time are the order header details and the current order-item details (in the Virtual Object Model). It cannot work if the transformation always requires full access to all the data in the source message, for example if the message needs to have all the order items reversed in order (or sorted). In such a case, however, you do have the option of routing the order details and items to a database and then using the database's storage, query and paging features to perform the transformation.
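
The effect of the ?TEMPLATE-SPLIT-PI? processing instruction described above can be pictured in a few lines of plain Java. This is a sketch under stated assumptions and not how Smooks implements it: the class name TemplateSplitSketch and the render method are hypothetical, and the item output is passed in as a list for brevity, whereas a streaming filter would write each item as it is produced.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of splitting a wrapper template around per-item output. */
final class TemplateSplitSketch {
    private static final String SPLIT_PI = "<?TEMPLATE-SPLIT-PI?>";

    static String render(String wrapperTemplate, List<String> renderedItems) {
        int split = wrapperTemplate.indexOf(SPLIT_PI);
        if (split < 0) {
            throw new IllegalArgumentException("Template has no " + SPLIT_PI + " marker");
        }
        // Part before the marker: written at the start of the order-items element.
        String head = wrapperTemplate.substring(0, split);
        // Part after the marker: written at the end of the order-items element.
        String tail = wrapperTemplate.substring(split + SPLIT_PI.length());

        StringBuilder out = new StringBuilder(head);
        for (String item : renderedItems) {
            out.append(item); // written at the end of each order-item element
        }
        return out.append(tail).toString();
    }

    public static void main(String[] args) {
        String wrapper = "<salesorder><itemList>" + SPLIT_PI + "</itemList></salesorder>";
        System.out.println(render(wrapper, Arrays.asList("<item>1</item>", "<item>2</item>")));
        // prints <salesorder><itemList><item>1</item><item>2</item></itemList></salesorder>
    }
}
```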

13.5. Message Splitting Requirements

You can process huge messages by splitting them into smaller messages that can be processed independently. Splitting and routing is sometimes also needed with smaller messages (message size may be irrelevant) where, for example, order items in an order message need to be split out and routed (based on content) to different departments or partners for processing. Under these conditions, the message formats required at the different destinations may also vary, as shown in the examples below:
  • destination1: requires XML via the file system.
  • destination2: requires Java objects via a JMS Queue.
  • destination3: picks the messages up from a table in a Database.
  • destination4: requires EDI messages via a JMS Queue.
You can perform multiple splitting and routing operations to multiple destinations (of different types) in a single pass over a message.

13.6. Streaming Split Messages Through Smooks

As you stream the message through Smooks:
  • Repeatedly create a standalone message (split) for the fragment to be routed.
  • Repeatedly bind the split message into the bean context under a unique beanId.
  • Repeatedly route the split message to the required endpoint (whether it be a file, DB, JMS or ESB).
These operations happen for each instance of the split message found in the source message, for example, for each orderItem in an order message.
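
The three repeated steps above can be sketched as a simple loop. This is plain Java rather than Smooks code: the class name SplitBindRouteSketch, the map standing in for the bean context, and the Consumer standing in for a router are all assumptions. The point to notice is that the bean context holds only the current split message, because each new one replaces the last under the same beanId.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical illustration of the split, bind and route cycle. */
final class SplitBindRouteSketch {

    static void splitAndRoute(List<String> fragments, String beanId,
                              Map<String, Object> beanContext, Consumer<String> router) {
        for (String fragment : fragments) {
            // 1. Create a standalone split message for the fragment.
            String splitMessage = fragment.trim();
            // 2. Bind it into the bean context; this replaces the previous split message.
            beanContext.put(beanId, splitMessage);
            // 3. Route whatever is currently bound under the beanId.
            router.accept((String) beanContext.get(beanId));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> beanContext = new HashMap<>();
        List<String> queue = new ArrayList<>(); // stands in for a file, DB, JMS or ESB endpoint
        splitAndRoute(Arrays.asList("<order-item id=\"1\"/>", "<order-item id=\"2\"/>"),
                "orderItem", beanContext, queue::add);
        System.out.println(queue.size() + " routed, " + beanContext.size() + " held in context");
        // prints 2 routed, 1 held in context
    }
}
```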

13.7. Methods for Creating Split Messages

  • A basic (untransformed/unenriched) fragment split and bind. This serializes a message fragment (repeatedly) to its XML form and stores it in the bean context as a String.
  • A more complex approach using the Java Binding and Templating Cartridges, where you configure Smooks to extract data from the source message and bind it into the bean context (using jb:bean configs) and then (optionally) apply templates to create the split messages. This has the following advantages:
    • Allows for transformation of the split fragments, that is, not just XML as with the basic option.
    • Allows for enrichment of the message.
    • Allows for more complex splits, with the ability to merge data from multiple source fragments into each split message, for example not just the orderItem fragments but the order header info too.
    • Allows for splitting and routing of Java Objects as the Split messages (for example, over JMS).

13.8. Serializing Messages

  1. To split and route fragments of a message, use the basic frag:serialize and *:router components (jms:router, file:router and so on) from the Routing Cartridge. The frag:serialize component has its own configuration in the http://www.milyn.org/xsd/smooks/fragment-routing-1.2.xsd namespace.
  2. Use the example below for serializing the contents of a SOAP message body and storing it in the bean context under the beanId of soapBody:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:frag="http://www.milyn.org/xsd/smooks/fragment-routing-1.2.xsd">
     
        <frag:serialize fragment="Envelope/Body" bindTo="soapBody" childContentOnly="true"/>
     
    </smooks-resource-list>
    
  3. Use this code to execute it:
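    import javax.xml.transform.stream.StreamSource;
    import org.milyn.Smooks;
    import org.milyn.payload.JavaResult;
     
    // configStream and soapMessageStream are InputStreams supplied by the caller.
    Smooks smooks = new Smooks(configStream);
    JavaResult javaResult = new JavaResult();
     
    smooks.filterSource(new StreamSource(soapMessageStream), javaResult);
     
    String bodyContent = javaResult.getBean("soapBody").toString().trim();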
    
  4. To do this programmatically, use this code:
    Smooks smooks = new Smooks();
     
    smooks.addVisitor(new FragmentSerializer().setBindTo("soapBody"), "Envelope/Body");
     
    JavaResult javaResult = new JavaResult();
    smooks.filterSource(new StreamSource(soapMessageStream), javaResult);
     
    String bodyContent = javaResult.getBean("soapBody").toString().trim();
    

13.9. Routing Split Messages Example

The following is a quick example, showing the configuration for routing split messages (this time order-item fragments) to a JMS destination for processing:
<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:frag="http://www.milyn.org/xsd/smooks/fragment-routing-1.2.xsd" xmlns:jms="http://www.milyn.org/xsd/smooks/jms-routing-1.2.xsd">
 
    <!-- Create the split messages for the order items... -->
    <frag:serialize fragment="order-items/order-item" bindTo="orderItem" />
 
    <!-- Route each order-item split message to the orderItem JMS processing queue... -->
    <jms:router routeOnElement="order-items/order-item" beanId="orderItem" destination="orderItemProcessingQueue" />
 
</smooks-resource-list>

Note

The jms:router could be replaced by any of the other routers. For example, if you are using JBoss ESB, you could use the esbr:routeBean configuration to route the split message to any ESB endpoint.

13.10. File-based Routing

File-based routing is performed via the file:outputStream configuration from the http://www.milyn.org/xsd/smooks/file-routing-1.1.xsd configuration namespace. You can combine the following Smooks functionality to split a message out into smaller messages on the file system.

13.11. File-based Routing Components

Table 13.1. File-based Routing Components

Component | Description
The Javabean Cartridge | Extracts data from the message and holds it in variables in the bean context. You could also use DOM NodeModels for capturing the order and order-item data to be used as the templating data models.
file:outputStream | This configuration from the Routing Cartridge is used for managing file system streams (naming, opening, closing, throttling creation and so on).
Templating Cartridge (FreeMarker Templates) | Used for generating the individual split messages from data bound in the bean context by the Javabean Cartridge (see the first row above). The templating result is written to the file output stream (see the second row above).

13.12. Huge Message Processing

In this example, a huge order message is split and the individual order-item details are routed to file. The split messages contain data from the order header and root elements:

<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"
                      xmlns:core="http://www.milyn.org/xsd/smooks/smooks-core-1.3.xsd"
                      xmlns:jb="http://www.milyn.org/xsd/smooks/javabean-1.4.xsd"
                      xmlns:file="http://www.milyn.org/xsd/smooks/file-routing-1.1.xsd"
                      xmlns:ftl="http://www.milyn.org/xsd/smooks/freemarker-1.1.xsd">
 
        <!--
        Filter the message using the SAX Filter (i.e. not DOM, so no
        intermediate DOM, so we can process huge messages...
        -->
        <core:filterSettings type="SAX" />
 
        <!-- Extract and decode data from the message. Used in the freemarker template (below).
               Note that we could also use a NodeModel here... -->
        <!-- (1) -->
        <jb:bean beanId="order" class="java.util.Hashtable" createOnElement="order">
            <jb:value property="orderId" decoder="Integer" data="order/@id"/>
            <jb:value property="customerNumber" decoder="Long" data="header/customer/@number"/>
            <jb:value property="customerName" data="header/customer"/>
            <jb:wiring property="orderItem" beanIdRef="orderItem"/>
        </jb:bean>
        <!-- (2) -->
        <jb:bean beanId="orderItem" class="java.util.Hashtable" createOnElement="order-item">
            <jb:value property="itemId" decoder="Integer" data="order-item/@id"/>
            <jb:value property="productId" decoder="Long" data="order-item/product"/>
            <jb:value property="quantity" decoder="Integer" data="order-item/quantity"/>
            <jb:value property="price" decoder="Double" data="order-item/price"/>
        </jb:bean>
 
        <!-- Create/open a file output stream. This is written to by the freemarker template (below).. -->
        <!-- (3) -->
        <file:outputStream openOnElement="order-item" resourceName="orderItemSplitStream">
            <file:fileNamePattern>order-${order.orderId}-${order.orderItem.itemId}.xml</file:fileNamePattern>
            <file:destinationDirectoryPattern>target/orders</file:destinationDirectoryPattern>
            <file:listFileNamePattern>order-${order.orderId}.lst</file:listFileNamePattern>
 
            <file:highWaterMark mark="10"/>
        </file:outputStream>
 
        <!--
        Every time we hit the end of an <order-item> element, apply this freemarker template,
        outputting the result to the "orderItemSplitStream" OutputStream, which is the file
        output stream configured above.
        -->
        <!-- (4) -->
        <ftl:freemarker applyOnElement="order-item">
            <ftl:template>target/classes/orderitem-split.ftl</ftl:template>
            <ftl:use>
                <!-- Output the templating result to the "orderItemSplitStream" file output stream... -->
                <ftl:outputTo outputStreamResource="orderItemSplitStream"/>
            </ftl:use>
        </ftl:freemarker>
 
</smooks-resource-list>
Configurations (1) and (2) above define the Java bindings for extracting the order header information (config #1) and the order-item information (config #2). When processing a huge message, make sure you only have the current order item in memory at any one time. The Smooks Javabean Cartridge manages all this for you, creating and recreating the orderItem beans as the order-item fragments are being processed.
The file:outputStream configuration (config #3) manages the generation of the files on the file system. As the configuration shows, the file names can be dynamically constructed from data in the bean context. It can also throttle the creation of the files via the highWaterMark configuration parameter. This helps you manage file creation so as not to overwhelm the target file system.
Configuration (4) defines the FreeMarker templating resource used to write the split messages to the OutputStream created by the file:outputStream (config #3). Note how config #4 references the file:outputStream resource. Note also that the template shown here is written in NodeModel syntax (for example, ${order.@id}), so it assumes the order and order-item NodeModels have been captured as described in Section 13.4; to read from the Hashtable beans bound by config #1 and config #2 instead, the template would reference the bound property names, for example ${order.orderId} and ${orderItem.itemId}. The FreeMarker template is as follows:
<orderitem id="${.vars["order-item"].@id}" order="${order.@id}">
    <customer>
        <name>${order.header.customer}</name>
        <number>${order.header.customer.@number}</number>
    </customer>
    <details>
        <productId>${.vars["order-item"].product}</productId>
        <quantity>${.vars["order-item"].quantity}</quantity>
        <price>${.vars["order-item"].price}</price>
    </details>
</orderitem>
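
The throttling role of highWaterMark described above can be sketched as a polling wait. This is an assumption-laden illustration, not the Smooks implementation: it assumes the mark is the maximum number of pending files tolerated in the destination directory, and the class name HighWaterMarkSketch, the awaitCapacity method, and the poll and timeout parameters are all hypothetical.

```java
import java.io.File;
import java.util.function.IntSupplier;

/** Hypothetical illustration: wait until the pending count drops below a high water mark. */
final class HighWaterMarkSketch {

    /**
     * Returns true once pendingCount is below highWaterMark, or false if the
     * timeout elapses (or the thread is interrupted) before that happens.
     */
    static boolean awaitCapacity(IntSupplier pendingCount, int highWaterMark,
                                 long pollMillis, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (pendingCount.getAsInt() >= highWaterMark) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis); // give the consumer time to pick files up
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        File dir = new File("target/orders");
        // Count the files currently waiting in the destination directory (0 if it does not exist).
        IntSupplier filesInDir = () -> {
            String[] names = dir.list();
            return names == null ? 0 : names.length;
        };
        System.out.println(awaitCapacity(filesInDir, 10, 100L, 1000L));
    }
}
```

A producer would call awaitCapacity before opening each new file, so that a slow consumer slows the split down instead of letting files pile up.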

13.13. JMS Routing

JMS routing is performed via the jms:router configuration from the http://www.milyn.org/xsd/smooks/jms-routing-1.2.xsd configuration namespace. The following is an example jms:router configuration that routes an orderItem_xml bean to a JMS Queue named smooks.exampleQueue:

<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"
                      xmlns:core="http://www.milyn.org/xsd/smooks/smooks-core-1.3.xsd"
                      xmlns:jms="http://www.milyn.org/xsd/smooks/jms-routing-1.2.xsd"
                      xmlns:ftl="http://www.milyn.org/xsd/smooks/freemarker-1.1.xsd">
 
        <!--
        Filter the message using the SAX Filter (i.e. not DOM, so no
        intermediate DOM, so we can process huge messages...
        -->
        <core:filterSettings type="SAX" />
 
        <!-- (1) -->
        <resource-config selector="order,order-item">
            <resource>org.milyn.delivery.DomModelCreator</resource>
        </resource-config>
 
        <!-- (2) -->
        <jms:router routeOnElement="order-item" beanId="orderItem_xml" destination="smooks.exampleQueue">
            <jms:message>
                <!-- Need to use special FreeMarker variable ".vars" -->
                <jms:correlationIdPattern>${order.@id}-${.vars["order-item"].@id}</jms:correlationIdPattern>
            </jms:message>
            <jms:highWaterMark mark="3"/>
        </jms:router>
 
        <!-- (3) -->
        <ftl:freemarker applyOnElement="order-item">
            <!--
            Note in the template that we need to use the special FreeMarker variable ".vars"
            because of the hyphenated variable names ("order-item"). See http://freemarker.org/docs/ref_specvar.html.
            -->
            <ftl:template>/orderitem-split.ftl</ftl:template>
            <ftl:use>
                <!-- Bind the templating result into the bean context, from where
                it can be accessed by the JMSRouter (configured above). -->
                <ftl:bindTo id="orderItem_xml"/>
            </ftl:use>
        </ftl:freemarker>
 
</smooks-resource-list>
In this case, we route the result of a FreeMarker templating operation to the JMS Queue (that is, as a String). We could also have routed a full Object Model, in which case it would be routed as a Serialized ObjectMessage.
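
Routing a full Object Model as a serialized object presupposes that the model survives Java serialization. The sketch below checks this with JDK classes only, using a Hashtable of the kind bound elsewhere in this chapter. It does not touch JMS or Smooks; the class name SerializedModelSketch and the sample values are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Hashtable;

/** Hypothetical illustration: round-trip an object model through Java serialization. */
final class SerializedModelSketch {

    static Object roundTrip(Serializable model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(model); // fails if any value in the model is not Serializable
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Model is not serializable", e);
        }
    }

    public static void main(String[] args) {
        Hashtable<String, Object> orderItem = new Hashtable<>();
        orderItem.put("itemId", 1);
        orderItem.put("quantity", 2);
        orderItem.put("price", 8.9);
        System.out.println(orderItem.equals(roundTrip(orderItem))); // prints true
    }
}
```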

13.14. Routing to a Database

  1. To route order and order-item data to a database, define a set of Java bindings that extract the order and order-item data from the data stream:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" 
                          xmlns:jb="http://www.milyn.org/xsd/smooks/javabean-1.4.xsd">
     
        <!-- Extract the order data... -->
        <jb:bean beanId="order" class="java.util.Hashtable" createOnElement="order">
            <jb:value property="orderId" decoder="Integer" data="order/@id"/>
            <jb:value property="customerNumber" decoder="Long" data="header/customer/@number"/>
            <jb:value property="customerName" data="header/customer"/>
        </jb:bean>
     
        <!-- Extract the order-item data... -->
        <jb:bean beanId="orderItem" class="java.util.Hashtable" createOnElement="order-item">
            <jb:value property="itemId" decoder="Integer" data="order-item/@id"/>
            <jb:value property="productId" decoder="Long" data="order-item/product"/>
            <jb:value property="quantity" decoder="Integer" data="order-item/quantity"/>
            <jb:value property="price" decoder="Double" data="order-item/price"/>
        </jb:bean>
     
    </smooks-resource-list>
    
  2. Next, define a datasource configuration and a number of db:executor configurations that use that datasource to insert the data bound into the Java Object model into the database. This is the datasource configuration (namespace http://www.milyn.org/xsd/smooks/datasource-1.3.xsd) for retrieving a direct database connection:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:ds="http://www.milyn.org/xsd/smooks/datasource-1.3.xsd">
     
        <ds:direct bindOnElement="#document"
                datasource="DBExtractTransformLoadDS"
                driver="org.hsqldb.jdbcDriver"
                url="jdbc:hsqldb:hsql://localhost:9201/milyn-hsql-9201"
                username="sa"
                password=""
                autoCommit="false" />
     
    </smooks-resource-list>
    
  3. Alternatively, you can use a JNDI datasource to retrieve the database connection:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:ds="http://www.milyn.org/xsd/smooks/datasource-1.3.xsd">
     
        <!-- This JNDI datasource can handle JDBC and JTA transactions, or
               it can leave the transaction management to another external component.
               An external component could be another Smooks visitor, the EJB transaction manager,
               or you can do it yourself. -->
        <ds:JNDI
            bindOnElement="#document"
            datasource="DBExtractTransformLoadDS"
            datasourceJndi="java:/someDS"
            transactionManager="JTA"
            transactionJndi="java:/mockTransaction"
            targetProfile="jta"/>
     
    </smooks-resource-list>
    
  4. The datasource schema describes and documents how you can configure the datasource. This is the db:executor configuration (namespace http://www.milyn.org/xsd/smooks/db-routing-1.1.xsd):
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"
                          xmlns:db="http://www.milyn.org/xsd/smooks/db-routing-1.1.xsd">
     
        <!-- Assert whether it's an insert or update. Need to do this just before we do the insert/update... -->
        <db:executor executeOnElement="order-items" datasource="DBExtractTransformLoadDS" executeBefore="true">
            <db:statement>select OrderId from ORDERS where OrderId = ${order.orderId}</db:statement>
            <db:resultSet name="orderExistsRS"/>
        </db:executor>
     
        <!-- If it's an insert (orderExistsRS.isEmpty()), insert the order before we process the order items... -->
        <db:executor executeOnElement="order-items" datasource="DBExtractTransformLoadDS" executeBefore="true">
            <condition>orderExistsRS.isEmpty()</condition>
            <db:statement>INSERT INTO ORDERS VALUES(${order.orderId}, ${order.customerNumber}, ${order.customerName})</db:statement>
        </db:executor>
     
        <!-- And insert each orderItem... -->
        <db:executor executeOnElement="order-item" datasource="DBExtractTransformLoadDS" executeBefore="false">
            <condition>orderExistsRS.isEmpty()</condition>
            <db:statement>INSERT INTO ORDERITEMS VALUES (${orderItem.itemId}, ${order.orderId}, ${orderItem.productId}, ${orderItem.quantity}, ${orderItem.price})</db:statement>
        </db:executor>
     
        <!-- Ignoring updates for now!! -->
     
    </smooks-resource-list>
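
The control flow of the three db:executor configurations above can be traced with an in-memory sketch. No database or Smooks API is involved: the class name OrderLoadFlowSketch and the collections standing in for the ORDERS and ORDERITEMS tables are assumptions. The sketch assumes, as the configuration suggests, that the existence check runs once before the order items are processed and that its result is reused as the condition for every later insert.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the check, insert order, insert items sequence. */
final class OrderLoadFlowSketch {
    final Map<Integer, String> orders = new LinkedHashMap<>(); // stands in for ORDERS
    final List<int[]> orderItems = new ArrayList<>();          // stands in for ORDERITEMS

    /** Returns the number of order-item rows inserted. */
    int load(int orderId, String customerName, List<Integer> itemIds) {
        // Executor 1 (before order-items): look the order up and keep the result.
        boolean orderExistsRsIsEmpty = !orders.containsKey(orderId);

        // Executor 2 (before order-items): insert the order only if it was absent.
        if (orderExistsRsIsEmpty) {
            orders.put(orderId, customerName);
        }

        // Executor 3 (after each order-item): same condition, evaluated against the
        // result captured by executor 1, so it stays true for a newly inserted order.
        int inserted = 0;
        for (int itemId : itemIds) {
            if (orderExistsRsIsEmpty) {
                orderItems.add(new int[] {itemId, orderId});
                inserted++;
            }
        }
        return inserted;
    }

    public static void main(String[] args) {
        OrderLoadFlowSketch db = new OrderLoadFlowSketch();
        System.out.println(db.load(332, "Joe", Arrays.asList(1, 2))); // prints 2
        System.out.println(db.load(332, "Joe", Arrays.asList(1, 2))); // prints 0 (updates ignored)
    }
}
```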