Chapter 4. Consuming Input Data

4.1. Stream Readers

A stream reader is a class that implements the XMLReader interface (or the SmooksXMLReader interface). Smooks uses a stream reader to generate a stream of SAX events from the source message data stream. By default, Smooks obtains its XMLReader by calling XMLReaderFactory.createXMLReader(). To read non-XML data sources, configure a specialist XML reader.
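As a rough illustration of what "a stream of SAX events" means, the sketch below parses a small XML message with a plain JDK XMLReader and records each event. It does not use Smooks; the class name SaxEventSketch and the event labels are made up for this example, and the reader is obtained through SAXParserFactory because XMLReaderFactory is deprecated in recent JDKs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Smooks.
class SaxEventSketch {

    // Records SAX events as readable strings such as "start:person".
    static class Recorder extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        // Character data can arrive in several chunks, so it is buffered
        // and emitted as one event when the next element boundary is seen.
        private void flushText() {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                events.add("text:" + value);
            }
            text.setLength(0);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            flushText();
            events.add("start:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            flushText();
            events.add("end:" + qName);
        }
    }

    // Parses the XML text and returns the recorded events in document order.
    static List<String> events(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            Recorder recorder = new Recorder();
            reader.setContentHandler(recorder);
            reader.parse(new InputSource(new StringReader(xml)));
            return recorder.events;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(events("<person><firstname>Tom</firstname></person>"));
    }
}
```

A specialist reader for a non-XML format produces the same kind of event sequence, only from a different input syntax.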

4.2. XMLReader Configurations

This is an example of how to configure the XML reader with handlers, features and parameters:
<reader class="com.acme.ZZZZReader">
    <handlers>
        <handler class="com.X" />
        <handler class="com.Y" />
    </handlers>
    <features>
        <setOn feature="http://a" />
        <setOn feature="http://b" />
        <setOff feature="http://c" />
        <setOff feature="http://d" />
    </features>
    <params>
        <param name="param1">val1</param>
        <param name="param2">val2</param>
    </params>
</reader>

4.3. Setting Features on the XML Reader

  • By default, Smooks reads XML data. To set features on the default XML reader, omit the class name from the configuration:
    <reader>
        <features>
            <setOn feature="http://a" />
            <setOn feature="http://b" />
            <setOff feature="http://c" />
            <setOff feature="http://d" />
        </features>
    </reader>
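    For comparison, the sketch below sets features directly on a JDK XMLReader. It assumes that setOn and setOff correspond to calling XMLReader.setFeature with true and false respectively; that mapping is an assumption, not something this guide states. The two feature URIs are standard SAX2 features chosen for illustration, and the class name ReaderFeatureSketch is hypothetical.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of Smooks.
class ReaderFeatureSketch {

    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String EXTERNAL_ENTITIES = "http://xml.org/sax/features/external-general-entities";

    // Creates a reader with one feature switched on and one switched off.
    static XMLReader configuredReader() {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(NAMESPACES, true);         // comparable to a setOn entry
            reader.setFeature(EXTERNAL_ENTITIES, false); // comparable to a setOff entry
            return reader;
        } catch (Exception e) {
            throw new IllegalStateException("Could not configure reader", e);
        }
    }

    // Reports the current state of a feature on the given reader.
    static boolean isOn(XMLReader reader, String feature) {
        try {
            return reader.getFeature(feature);
        } catch (Exception e) {
            throw new IllegalStateException("Unknown feature: " + feature, e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = configuredReader();
        System.out.println(NAMESPACES + " = " + isOn(reader, NAMESPACES));
        System.out.println(EXTERNAL_ENTITIES + " = " + isOn(reader, EXTERNAL_ENTITIES));
    }
}
```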
    

4.4. Configuring the CSV Reader

  1. Use the http://www.milyn.org/xsd/smooks/csv-1.2.xsd configuration namespace to configure the reader.
    Here is a basic configuration:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.2.xsd">
     
        <!--
        Configure the CSV reader to parse the message into a stream of SAX events.
        -->
        <csv:reader fields="firstname,lastname,gender,age,country" separator="|" quote="'" skipLines="1" />
     
    </smooks-resource-list>
    
  2. You will see the following event stream:
    <csv-set>
        <csv-record>
            <firstname>Tom</firstname>
            <lastname>Fennelly</lastname>
            <gender>Male</gender>
            <age>21</age>
            <country>Ireland</country>
        </csv-record>
        <csv-record>
            <firstname>Tom</firstname>
            <lastname>Fennelly</lastname>
            <gender>Male</gender>
            <age>21</age>
            <country>Ireland</country>
        </csv-record>
    </csv-set>
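    To make the mapping from a CSV line to a csv-record element concrete, here is a minimal plain-Java sketch. It is not the Smooks implementation: the class CsvRecordSketch and its methods are hypothetical, and the quote handling is deliberately simplified (a quote character only toggles quoted mode).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Smooks.
class CsvRecordSketch {

    // Splits one line on the separator, ignoring separators inside quotes.
    static List<String> split(String line, char separator, char quote) {
        List<String> values = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (char c : line.toCharArray()) {
            if (c == quote) {
                quoted = !quoted;
            } else if (c == separator && !quoted) {
                values.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        values.add(current.toString());
        return values;
    }

    // Escapes the characters that are not allowed in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wraps each value in an element named after the matching field.
    static String toRecordXml(String fieldsSpec, String line, char separator, char quote) {
        String[] fields = fieldsSpec.split(",");
        List<String> values = split(line, separator, quote);
        StringBuilder xml = new StringBuilder("<csv-record>");
        for (int i = 0; i < fields.length && i < values.size(); i++) {
            xml.append('<').append(fields[i]).append('>')
               .append(escape(values.get(i)))
               .append("</").append(fields[i]).append('>');
        }
        return xml.append("</csv-record>").toString();
    }

    public static void main(String[] args) {
        System.out.println(toRecordXml("firstname,lastname,gender,age,country",
                "Tom|Fennelly|Male|21|Ireland", '|', '\''));
    }
}
```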
    

4.5. Defining Configurations

  1. To define fields in XML configurations you must use a comma-separated list of names in the fields attribute.
  2. Make sure the field names follow the same naming rules as XML element names:
    • they can contain letters, numbers, and other characters
    • they cannot start with a number or punctuation character
    • they cannot start with the letters xml (or XML or Xml, etc)
    • they cannot contain spaces
  3. Set the rootElementName and recordElementName attributes so you can modify the csv-set and csv-record element names. The same rules apply for these names.
  4. You can define string manipulation functions on a per-field basis. These functions are executed before the data is converted into SAX events. Define them after the field name, separating the two with a question mark:
    
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.2.xsd">
     
        <csv:reader fields="lastname?trim.capitalize,country?upper_case" />
     
    </smooks-resource-list>
    
    
  5. To get Smooks to ignore fields in a CSV record, specify the $ignore$ token as the field's configuration value. To ignore several fields, follow the $ignore$ token with a number (for example, use $ignore$3 to ignore the next three fields). Use $ignore$+ to ignore all of the fields to the end of the CSV record.
    
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.2.xsd">
     
        <csv:reader fields="firstname,$ignore$2,age,$ignore$+" />
     
    </smooks-resource-list>
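    The effect of the $ignore$ token can be sketched in plain Java as follows. This only mimics the behaviour described in step 5 and is not taken from the Smooks source; the class IgnoreTokenSketch and its select method are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Smooks.
class IgnoreTokenSketch {

    static final String IGNORE = "$ignore$";

    // Pairs field names with record values, skipping values as directed
    // by $ignore$, $ignore$N and $ignore$+ entries in the fields list.
    static Map<String, String> select(String fieldsSpec, String[] values) {
        Map<String, String> selected = new LinkedHashMap<>();
        int index = 0;
        for (String field : fieldsSpec.split(",")) {
            if (index >= values.length) {
                break;
            }
            if (field.startsWith(IGNORE)) {
                String count = field.substring(IGNORE.length());
                if (count.equals("+")) {
                    break; // ignore everything up to the end of the record
                }
                // A bare $ignore$ skips one value; $ignore$N skips N values.
                index += count.isEmpty() ? 1 : Integer.parseInt(count);
            } else {
                selected.put(field, values[index++]);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        String[] record = {"Tom", "Fennelly", "Male", "21", "Ireland"};
        System.out.println(select("firstname,$ignore$2,age,$ignore$+", record));
    }
}
```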
    
    

4.6. Binding CSV Records to Java Objects

  1. The following steps show how to bind CSV records to Java objects. This example uses CSV records that describe people:
    Tom,Fennelly,Male,4,Ireland
    Mike,Fennelly,Male,2,Ireland
    
  2. Define the classes that each record is bound to:
    
    public class Person {
        private String firstname;
        private String lastname;
        private String country;
        private Gender gender;
        private int age;
    }
     
    public enum Gender {
        Male, 
        Female;
    }
    
    
  3. Use the following configuration, modifying it to suit your task:
    
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.2.xsd">
     
        <csv:reader fields="firstname,lastname,gender,age,country">
            <!-- Note how the field names match the property names on the Person class. -->
            <csv:listBinding beanId="people" class="org.milyn.csv.Person" />
        </csv:reader>
     
    </smooks-resource-list>
    
  4. To execute the configuration, use this code:
    
    Smooks smooks = new Smooks(configStream);
    JavaResult result = new JavaResult();
     
    smooks.filterSource(new StreamSource(csvStream), result);
     
    List<Person> people = (List<Person>) result.getBean("people");
    
  5. You can create Maps from the CSV record set:
    
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.2.xsd">
     
        <csv:reader fields="firstname,lastname,gender,age,country">
            <csv:mapBinding beanId="people" class="org.milyn.csv.Person" keyField="firstname" />
        </csv:reader>
     
    </smooks-resource-list>
    
    
  6. The configuration above produces a map of person instances, keyed by the firstname value of each person. This is how it is executed:
    
    Smooks smooks = new Smooks(configStream);
    JavaResult result = new JavaResult();
     
    smooks.filterSource(new StreamSource(csvStream), result);
     
    Map<String, Person> people = (Map<String, Person>) result.getBean("people");
     
    Person tom = people.get("Tom");
    Person mike = people.get("Mike");
    
    
    Virtual models are also supported, so you can define the class attribute as a java.util.Map and bind the CSV field values to map instances which are, in turn, added to a list or map.
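    The idea of a virtual model can be pictured with a plain-Java sketch in which each record becomes a map of field names to values, and the records are collected into an outer map under a key field. This is only an illustration of the resulting data shape, not the Smooks binding code; VirtualModelSketch and bindToMap are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Smooks.
class VirtualModelSketch {

    // Turns each CSV line into a map of field name to value and stores it
    // in an outer map under the value of the key field.
    static Map<String, Map<String, String>> bindToMap(String fieldsSpec, String csv, String keyField) {
        String[] fields = fieldsSpec.split(",");
        Map<String, Map<String, String>> records = new LinkedHashMap<>();
        for (String line : csv.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] values = line.trim().split(",");
            Map<String, String> record = new LinkedHashMap<>();
            for (int i = 0; i < fields.length && i < values.length; i++) {
                record.put(fields[i], values[i]);
            }
            records.put(record.get(keyField), record);
        }
        return records;
    }

    public static void main(String[] args) {
        String csv = "Tom,Fennelly,Male,4,Ireland\nMike,Fennelly,Male,2,Ireland";
        System.out.println(bindToMap("firstname,lastname,gender,age,country", csv, "firstname"));
    }
}
```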

4.7. Configuring the CSV Reader for Record Sets

  1. To configure a Smooks instance with a CSV reader to read a person record set, use the code below. It will bind the records to a list of person instances.
    
    Smooks smooks = new Smooks();
     
    smooks.setReaderConfig(new CSVReaderConfigurator("firstname,lastname,gender,age,country")
                      .setBinding(new CSVBinding("people", Person.class, CSVBindingType.LIST)));
     
    JavaResult result = new JavaResult();
    smooks.filterSource(new StreamSource(csvReader), result);
     
    List<Person> people = (List<Person>) result.getBean("people");
    
    

    Note

    Configuring the Java binding is optional. The Smooks instance could instead (or additionally) be configured programmatically to use other visitor implementations to process the CSV record set.
  2. To bind the CSV's records to a list or map of a Java type that reflects the data in your CSV records, use the CSVListBinder or CSVMapBinder classes:
    
    // Note: The binder instance should be cached and reused...
    CSVListBinder binder = new CSVListBinder("firstname,lastname,gender,age,country", Person.class);
     
    List<Person> people = binder.bind(csvStream);
    
    CSVMapBinder:
    
    // Note: The binder instance should be cached and reused...
    CSVMapBinder binder = new CSVMapBinder("firstname,lastname,gender,age,country", Person.class, "firstname");
     
    Map<String, Person> people = binder.bind(csvStream);
    
    
    If you need more control over the binding process, use the lower-level APIs.

4.8. Configuring the Fixed-Length Reader

  1. To configure the fixed-length reader, use the http://www.milyn.org/xsd/smooks/fixed-length-1.3.xsd configuration namespace as shown below:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:fl="http://www.milyn.org/xsd/smooks/fixed-length-1.3.xsd">
     
        <!--
        Configure the fixed-length reader to parse the message into a stream of SAX events.
        -->
        <fl:reader fields="firstname[10],lastname[10],gender[1],age[2],country[2]" skipLines="1" />
     
    </smooks-resource-list>
    
    
    Here is an example input file:
    #HEADER
    Tom       Fennelly  M 21 IE
    Maurice  Zeijen     M 27 NL
    
    Here is the event stream that will be generated:
    
    <set>
        <record>
            <firstname>Tom       </firstname>
            <lastname>Fennelly  </lastname>
            <gender>M</gender>
            <age> 21</age>
            <country>IE</country>
        </record>
        <record>
            <firstname>Maurice  </firstname>
            <lastname>Zeijen     </lastname>
            <gender>M</gender>
            <age>27</age>
            <country>NL</country>
        </record>
    </set>
    
  2. Define the string manipulation functions as shown below:
    
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:fl="http://www.milyn.org/xsd/smooks/fixed-length-1.3.xsd">
     
        <!--
        Configure the fixed length reader to parse the message into a stream of SAX events.
        -->
        <fl:reader fields="firstname[10]?trim,lastname[10]?trim.capitalize,gender[1],age[2],country[2]" skipLines="1" />
     
    </smooks-resource-list>
    
    
  3. You can also ignore fields by using the $ignore$ token:
    
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:fl="http://www.milyn.org/xsd/smooks/fixed-length-1.3.xsd">
     
        <fl:reader fields="firstname,$ignore$[2],age,$ignore$[10]" />
     
    </smooks-resource-list>
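    The way a fields definition such as firstname[10]?trim slices a fixed-length line can be sketched in plain Java. This is an illustration under simplifying assumptions, not the Smooks parser: the class FixedLengthSketch is hypothetical, every field is assumed to declare a length, and only the trim function is applied (other functions are skipped).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Smooks.
class FixedLengthSketch {

    // name[length] optionally followed by ?function.function
    private static final Pattern FIELD =
            Pattern.compile("([^\\[,?]+)\\[(\\d+)\\](?:\\?([\\w.]+))?");

    // Cuts the line into consecutive slices of the declared lengths.
    static Map<String, String> parse(String fieldsSpec, String line) {
        Map<String, String> record = new LinkedHashMap<>();
        int offset = 0;
        for (String spec : fieldsSpec.split(",")) {
            Matcher m = FIELD.matcher(spec.trim());
            if (!m.matches()) {
                throw new IllegalArgumentException("Bad field definition: " + spec);
            }
            int length = Integer.parseInt(m.group(2));
            int end = Math.min(offset + length, line.length());
            String value = offset < end ? line.substring(offset, end) : "";
            offset += length;
            if (m.group(1).equals("$ignore$")) {
                continue; // consume the characters but emit nothing
            }
            if (m.group(3) != null) {
                for (String function : m.group(3).split("\\.")) {
                    if (function.equals("trim")) {
                        value = value.trim();
                    }
                }
            }
            record.put(m.group(1), value);
        }
        return record;
    }

    public static void main(String[] args) {
        // Build a line whose columns are exactly 10, 10, 1, 3 and 2 characters wide.
        String line = String.format("%-10s%-10s%s%3s%s", "Tom", "Fennelly", "M", "21", "IE");
        System.out.println(parse(
                "firstname[10]?trim,lastname[10]?trim,gender[1],age[3]?trim,country[2]", line));
    }
}
```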
    
    

4.9. Configuring Fixed-Length Records

  1. To bind fixed-length records to a person, see the configuration below. In this example we will use these sample records:
    Tom       Fennelly  M 21 IE
    Maurice  Zeijen     M 27 NL
    
    This is how you bind them to a person:
    public class Person {
        private String firstname;
        private String lastname;
        private String country;
        private String gender;
        private int age;
    }
    
    
  2. Configure the reader as shown below:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"  xmlns:fl="http://www.milyn.org/xsd/smooks/fixed-length-1.3.xsd">
     
        <fl:reader fields="firstname[10]?trim,lastname[10]?trim,gender[1],age[3]?trim,country[2]">
            <!-- Note how the field names match the property names on the Person class. -->
            <fl:listBinding beanId="people" class="org.milyn.fixedlength.Person" />
        </fl:reader>
     
    </smooks-resource-list>
    
    
  3. Execute it as shown:
    Smooks smooks = new Smooks(configStream);
    JavaResult result = new JavaResult();
     
    smooks.filterSource(new StreamSource(fixedLengthStream), result);
     
    List<Person> people = (List<Person>) result.getBean("people");
    
    
  4. Optionally, use this configuration to create maps from the fixed-length record set:
    
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"  xmlns:fl="http://www.milyn.org/xsd/smooks/fixed-length-1.3.xsd">
     
        <fl:reader fields="firstname[10]?trim,lastname[10]?trim,gender[1],age[3]?trim,country[2]">
            <fl:mapBinding beanId="people" class="org.milyn.fixedlength.Person" keyField="firstname" />
        </fl:reader>
     
    </smooks-resource-list>
    
  5. The configuration above produces a map of person instances, keyed by the firstname value of each person. Execute it as shown:
    Smooks smooks = new Smooks(configStream);
    JavaResult result = new JavaResult();
     
    smooks.filterSource(new StreamSource(fixedLengthStream), result);
     
    Map<String, Person> people = (Map<String, Person>) result.getBean("people");
     
    Person tom = people.get("Tom");
    Person maurice = people.get("Maurice");
    
    Virtual Models are also supported, so you can define the class attribute as a java.util.Map and bind the fixed-length field values to map instances, which are in turn added to a list or a map.

4.10. Configuring the Fixed-Length Reader Programmatically

  1. Use this code to configure the fixed-length reader to read a person record set, binding the record set into a list of person instances:
    Smooks smooks = new Smooks();
     
    smooks.setReaderConfig(new FixedLengthReaderConfigurator("firstname[10]?trim,lastname[10]?trim,gender[1],age[3]?trim,country[2]")
                      .setBinding(new FixedLengthBinding("people", Person.class, FixedLengthBindingType.LIST)));
     
    JavaResult result = new JavaResult();
    smooks.filterSource(new StreamSource(fixedLengthStream), result);
     
    List<Person> people = (List<Person>) result.getBean("people");
    
    Configuring the Java binding is not mandatory. You can instead programmatically configure the Smooks instance to use other visitor implementations to carry out various forms of processing on the fixed-length record set.
  2. To bind fixed-length records directly to a list or map of a Java type that reflects the data in your fixed-length records, use either the FixedLengthListBinder or the FixedLengthMapBinder classes:
    // Note: The binder instance should be cached and reused...
    FixedLengthListBinder binder = new FixedLengthListBinder("firstname[10]?trim,lastname[10]?trim,gender[1],age[3]?trim,country[2]", Person.class);
     
    List<Person> people = binder.bind(fixedLengthStream);
    
    FixedLengthMapBinder:
    
    // Note: The binder instance should be cached and reused...
    FixedLengthMapBinder binder = new FixedLengthMapBinder("firstname[10]?trim,lastname[10]?trim,gender[1],age[3]?trim,country[2]", Person.class, "firstname");
     
    Map<String, Person> people = binder.bind(fixedLengthStream);
    
    If you need more control over the binding process, use the lower-level APIs.

4.11. EDI Processing

  1. To enable EDI processing in Smooks, use the http://www.milyn.org/xsd/smooks/edi-1.2.xsd configuration namespace.
  2. Modify this configuration to suit your needs:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:edi="http://www.milyn.org/xsd/smooks/edi-1.2.xsd">
        <!--
        Configure the EDI Reader to parse the message stream into a stream of SAX events.
        -->
        <edi:reader mappingModel="edi-to-xml-order-mapping.xml" validate="false"/>
    </smooks-resource-list>
    

4.12. EDI Processing Terms

  • mappingModel: This defines the EDI mapping model configuration for converting the EDI message stream to a stream of SAX events that can be processed by Smooks.
  • validate: This attribute turns the data-type validation in the EDI parser on and off. (Validation is on by default.) To avoid redundancy, turn data-type validation off on the EDI reader if the EDI data is being bound to a Java object model (using Java bindings such as jb:bean).

4.13. EDI to SAX

The EDI to SAX event mapping process is based on a mapping model supplied to the EDI reader. This model must always use the http://www.milyn.org/schema/edi-message-mapping-1.2.xsd schema. From this schema, you can see that segment groups are supported, including groups within groups, repeating segments and repeating segment groups.
The medi:segment element supports two optional attributes, minOccurs and maxOccurs, each of which defaults to one. Use these attributes to control the cardinality of a segment. A maxOccurs value of -1 indicates that the segment can repeat any number of times in that location of the EDI message (that is, it is unbounded).
You can add segment groups by using the segmentGroup element. A segment group is matched to the first segment in the group. Segment groups can contain nested segmentGroup elements, but the first element in a segmentGroup must be a segment. segmentGroup elements support minOccurs and maxOccurs cardinality. They also support an optional xmlTag attribute which, if present, causes the XML generated by a matched segment group to be inserted into an element named after the xmlTag attribute value.

4.14. EDI to SAX Event Mapping

When mapping EDI to SAX events, segments are matched in either of these ways:
  • by an exact match on the segment code (segcode).
  • by a regex pattern match on the full segment, where the segcode attribute defines the regex pattern (for instance, segcode="1A\*a.*").
Two further attributes control whether values may be omitted:
  • required: field, component and sub-component configurations support a "required" attribute, which flags that field, component or sub-component as requiring a value. By default, values are not required.
  • truncatable: segment, field and component configurations support a "truncatable" attribute. For a segment, this means that parser errors are not generated when that segment omits trailing fields that are not "required". The same applies to trailing components of a field and trailing sub-components of a component. By default, segments, fields and components are not truncatable.
So, a field, component or a sub-component can be present in a message in one of the following states:
  • present with a value (required="true")
  • present without a value (required="false")
  • absent (required="false" and truncatable="true")
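The two segment-matching rules described in this section can be sketched in plain Java as follows. This is a simplified illustration, not the Smooks matching code: SegmentMatchSketch is a hypothetical class, the field delimiter is passed in explicitly, and the regex is applied to the whole segment with String.matches.

```java
// Hypothetical helper, not part of Smooks.
class SegmentMatchSketch {

    // Returns true when the segment code equals segcode exactly, or when
    // segcode, read as a regex, matches the full segment text.
    static boolean matches(String segcode, String segment, char fieldDelimiter) {
        int end = segment.indexOf(fieldDelimiter);
        String code = end < 0 ? segment : segment.substring(0, end);
        if (code.equals(segcode)) {
            return true; // exact match on the segment code
        }
        // Note: an invalid regex in segcode would raise PatternSyntaxException here.
        return segment.matches(segcode);
    }

    public static void main(String[] args) {
        System.out.println(matches("HDR", "HDR*1*0*59.97", '*'));    // exact match
        System.out.println(matches("1A\\*a.*", "1A*abc*x", '*'));    // regex match
        System.out.println(matches("HDR", "CUS*user1", '*'));        // no match
    }
}
```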

4.15. Segment Definitions

It is possible to reuse segment definitions. Below is a configuration that demonstrates the import feature:
    
<?xml version="1.0" encoding="UTF-8"?>
<medi:edimap xmlns:medi="http://www.milyn.org/schema/edi-message-mapping-1.2.xsd">
 
    <medi:import truncatableSegments="true" truncatableFields="true" truncatableComponents="true" resource="example/edi-segment-definition.xml" namespace="def"/>
 
    <medi:description name="DVD Order" version="1.0"/>
 
    <medi:delimiters segment="&#10;" field="*" component="^" sub-component="~" escape="?"/>
 
    <medi:segments xmltag="Order">
        <medi:segment minOccurs="0" maxOccurs="1" segref="def:HDR" segcode="HDR" xmltag="header"/>
        <medi:segment minOccurs="0" maxOccurs="1" segref="def:CUS" segcode="CUS" xmltag="customer-details"/>
        <medi:segment minOccurs="0" maxOccurs="-1" segref="def:ORD" segcode="ORD" xmltag="order-item"/>
    </medi:segments>
 
</medi:edimap> 
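
To show what the medi:delimiters settings in the configuration above mean in practice, the sketch below splits a segment into fields and a field into components, honouring the escape character. It is a plain-Java illustration rather than the Smooks EDI parser; EdiDelimiterSketch is a hypothetical class and the sample segment is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Smooks.
class EdiDelimiterSketch {

    // Splits text on the delimiter. An escaped character (escape + char) is
    // copied through unchanged so that lower levels can still see the escape.
    static List<String> split(String text, char delimiter, char escape) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == escape && i + 1 < text.length()) {
                current.append(c).append(text.charAt(++i));
            } else if (c == delimiter) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    // Removes each escape character, keeping the character it protected.
    static String unescape(String text, char escape) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == escape && i + 1 < text.length()) {
                out.append(text.charAt(++i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> fields = split("CUS*user1*Harry^Fletcher*SD", '*', '?');
        System.out.println(fields);
        System.out.println(split(fields.get(2), '^', '?'));
    }
}
```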

4.16. Segment Terms

Segments and segments containing child segments can be separated into another file for easier future reuse.
  • segref: This contains a namespace:name referencing the segment to import.
  • truncatableSegments: This overrides the truncatableSegments specified in the imported resource mapping file.
  • truncatableFields: This overrides the truncatableFields specified in the imported resource mapping file.
  • truncatableComponents: This overrides the truncatableComponents specified in the imported resource mapping file.

4.17. The Type Attribute

The example below demonstrates support for the type attribute.
 
<medi:edimap xmlns:medi="http://www.milyn.org/schema/edi-message-mapping-1.2.xsd">
 
    <medi:description name="Segment Definition DVD Order" version="1.0"/>
 
    <medi:delimiters segment="&#10;" field="*" component="^" sub-component="~" escape="?"/>
 
    <medi:segments xmltag="Order">
 
        <medi:segment segcode="HDR" xmltag="header">
            <medi:field xmltag="order-id"/>
            <medi:field xmltag="status-code" type="Integer"/>
            <medi:field xmltag="net-amount" type="BigDecimal"/>
            <medi:field xmltag="total-amount" type="BigDecimal"/>
            <medi:field xmltag="tax" type="BigDecimal"/>
            <medi:field xmltag="date" type="Date" typeParameters="format=yyyyHHmm"/>
        </medi:segment>
 
    </medi:segments>
 
</medi:edimap>
You can use the type system for several purposes, including:
  • field validation
  • Edifact Java Compilation
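
Field validation by type can be pictured with the following plain-Java sketch, which decodes a raw field value according to a type name and treats a decoding failure as a validation error. It is not the Smooks type system: TypedFieldSketch is a hypothetical class, only three type names are handled, and the yyyyMMdd date format used in the example is an assumption chosen for illustration.

```java
import java.math.BigDecimal;
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Hypothetical helper, not part of Smooks.
class TypedFieldSketch {

    // Decodes the value according to the type name. The format argument is
    // only used (and required) for the Date type.
    static Object decode(String value, String type, String format) {
        try {
            switch (type) {
                case "Integer":
                    return Integer.valueOf(value.trim());
                case "BigDecimal":
                    return new BigDecimal(value.trim());
                case "Date":
                    SimpleDateFormat parser = new SimpleDateFormat(format);
                    parser.setLenient(false);
                    return parser.parse(value.trim());
                default:
                    return value; // untyped fields stay as strings
            }
        } catch (NumberFormatException | ParseException e) {
            throw new IllegalArgumentException("'" + value + "' is not a valid " + type, e);
        }
    }

    // A value is valid when it can be decoded as the declared type.
    static boolean isValid(String value, String type, String format) {
        try {
            decode(value, type, format);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("59.97", "BigDecimal", null));
        System.out.println(isValid("abc", "Integer", null));
        System.out.println(isValid("20100412", "Date", "yyyyMMdd"));
    }
}
```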

4.18. The EDIReaderConfigurator

  • Use the EDIReaderConfigurator to programmatically configure the Smooks instance to use the EDIReader as shown in the code below:
    Smooks smooks = new Smooks();
     
    // Create and initialise the Smooks config for the parser...
    smooks.setReaderConfig(new EDIReaderConfigurator("/edi/models/invoice.xml"));
     
    // Use the smooks as normal
    smooks.filterSource(....);
    

4.19. The Edifact Java Compiler

The Edifact Java Compiler simplifies the process of going from EDI to Java. It generates the following:
  • a Java object model for a given EDI mapping model.
  • a Smooks Java binding configuration to populate the Java Object model from an instance of the EDI message described by the EDI mapping model.
  • a factory class to use the Edifact Java Compiler to bind EDI data to the Java object model.

4.20. Edifact Java Compiler Example

The Edifact Java Compiler allows you to write simple Java code such as the following:
// Create an instance of the EJC generated Factory class.  This should normally be cached and reused...
OrderFactory orderFactory = OrderFactory.getInstance();
 
// Bind the EDI message stream data into the EJC generated Order model...
Order order = orderFactory.fromEDI(ediStream);
 
// Process the order data...
Header header = order.getHeader();
Name name = header.getCustomerDetails().getName();
List<OrderItem> orderItems = order.getOrderItems();

4.21. Executing the Edifact Java Compiler

  • To execute the Edifact Java Compiler through Maven, add the plug-in to your POM file:
    <build>
        <plugins>
            <plugin>
                <groupId>org.milyn</groupId>
                <artifactId>maven-ejc-plugin</artifactId>
                <version>1.2</version>
                <configuration>
                    <ediMappingFile>edi-model.xml</ediMappingFile>
                    <packageName>com.acme.order.model</packageName>
                </configuration>
                <executions>
                    <execution><goals><goal>generate</goal></goals></execution>
                </executions>
            </plugin>
        </plugins>
    </build>
    

4.22. Maven Plug-in Parameters for the Edifact Java Compiler

The plug-in has three configuration parameters:
  • ediMappingFile: the path to the EDI mapping model file within the Maven project. (This is optional. The default is src/main/resources/edi-model.xml.)
  • packageName: the Java package into which the generated Java artifacts (the Java object model and the factory class) are placed.
  • destDir: the directory in which the generated artifacts are created and compiled. (This is optional. The default is target/ejc.)

4.23. Executing the Edifact Java Compiler with Ant

  • Create and execute the EJC task as shown below:
    <target name="ejc">
     
        <taskdef resource="org/milyn/ejc/ant/anttasks.properties">
            <classpath><fileset dir="/smooks-1.2/lib" includes="*.jar"/></classpath>
        </taskdef>
     
        <ejc edimappingmodel="src/main/resources/edi-model.xml"
             destdir="src/main/java"
             packagename="com.acme.order.model"/>
     
        <!-- Ant as usual from here on... compile and jar the source... -->
     
    </target>
    

4.24. UN/EDIFACT Message Interchanges

The easiest way to learn more about the Edifact Java Compiler is to check out the EJC example.
Smooks provides out-of-the-box support for UN/EDIFACT message interchanges by way of these means:
  • pre-generated EDI mapping models generated from the official UN/EDIFACT message definition ZIP directories. These allow you to convert a UN/EDIFACT message interchange into a more readily consumable XML format.
  • pre-generated Java bindings for easy reading and writing of UN/EDIFACT interchanges using pure Java

4.25. Using UN/EDIFACT Interchanges with the edi:reader

  • To consume UN/EDIFACT interchanges, configure the unedifact:reader as shown below:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:unedifact="http://www.milyn.org/xsd/smooks/unedifact-1.4.xsd">
     
        <unedifact:reader mappingModel="urn:org.milyn.edi.unedifact:d03b-mapping:v1.4" ignoreNewLines="true" />
     
    </smooks-resource-list>
    
    The mappingModel attribute defines a URN that refers to the mapping model ZIP set's Maven artifact, which is used by the reader.

4.26. Configuring Smooks to Consume a UN/EDIFACT Interchange

  1. To programmatically configure Smooks to consume a UN/EDIFACT interchange (via, for instance, a UNEdifactReaderConfigurator), use the code below:
    Smooks smooks = new Smooks();
     
    smooks.setReaderConfig(new UNEdifactReaderConfigurator("urn:org.milyn.edi.unedifact:d03b-mapping:v1.4"));
    
  2. Make sure the following are on the containing application's classpath:
    • the requisite EDI mapping models
    • the Smooks EDI cartridge
  3. Your configuration may require some Maven dependencies. See the example below:
    <dependency>
        <groupId>org.milyn</groupId>
        <artifactId>milyn-smooks-edi</artifactId>
        <version>1.4</version>
    </dependency>
     
    <!-- Required Mapping Models -->
    <dependency>
        <groupId>org.milyn.edi.unedifact</groupId>
        <artifactId>d93a-mapping</artifactId>
        <version>v1.4</version>
    </dependency>
    <dependency>
        <groupId>org.milyn.edi.unedifact</groupId>
        <artifactId>d03b-mapping</artifactId>
        <version>v1.4</version>
    </dependency>
    
  4. Once an application has added an EDI mapping model ZIP set to its classpath, you can configure Smooks to use this model by simply referencing the Maven artifact using a URN as the unedifact:reader configuration's mappingModel attribute value:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:unedifact="http://www.milyn.org/xsd/smooks/unedifact-1.4.xsd">
     
        <unedifact:reader mappingModel="urn:org.milyn.edi.unedifact:d03b-mapping:v1.4" ignoreNewLines="true" />
     
    </smooks-resource-list>
    

4.27. The mappingModel

The mappingModel attribute can define multiple, comma-separated EDI mapping model URNs. This enables the UN/EDIFACT reader to process interchanges that contain multiple UN/EDIFACT messages defined in different directories.
Mapping model ZIP sets are available for all of the UN/EDIFACT directories. Obtain them from the Maven SNAPSHOT and Central repositories and add them to your application by using standard Maven dependency management.

4.28. Configuring the mappingModel

  1. To add the D93A mapping model ZIP set to your application classpath, add the following dependency to your application's POM file:
    <!-- The mapping model ZIP set for the D93A directory... -->
    <dependency>
        <groupId>org.milyn.edi.unedifact</groupId>
        <artifactId>d93a-mapping</artifactId>
        <version>v1.4</version>
    </dependency>
    
  2. Configure Smooks to use this ZIP set by adding the unedifact:reader configuration to your Smooks configuration as shown below:
    <unedifact:reader mappingModel="urn:org.milyn.edi.unedifact:d93a-mapping:v1.4" />
    
    Note how you configure the reader using a URN derived from the Maven artifact's dependency information.
  3. You can also add multiple mapping model ZIP sets to your application's classpath. To do so, add all of them to your unedifact:reader configuration by comma-separating the URNs.
  4. Pre-generated Java binding model sets are provided with the tool (there is one per mapping model ZIP set). Use these to process UN/EDIFACT interchanges using a very simple, generated factory class.
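
The relationship between a Maven dependency and the mappingModel value can be sketched as follows. The urn:groupId:artifactId:version layout is inferred from the examples in this chapter; MappingModelUrnSketch and its methods are hypothetical helpers, not a Smooks API.

```java
// Hypothetical helper, not part of Smooks.
class MappingModelUrnSketch {

    // Builds the URN for a mapping model artifact from its Maven coordinates.
    static String toUrn(String groupId, String artifactId, String version) {
        return "urn:" + groupId + ":" + artifactId + ":" + version;
    }

    // Splits a URN back into groupId, artifactId and version.
    static String[] toCoordinates(String urn) {
        String[] parts = urn.split(":");
        if (parts.length != 4 || !parts[0].equals("urn")) {
            throw new IllegalArgumentException("Unexpected URN: " + urn);
        }
        return new String[] {parts[1], parts[2], parts[3]};
    }

    // Joins several URNs into one comma-separated mappingModel value.
    static String mappingModelAttribute(String... urns) {
        return String.join(",", urns);
    }

    public static void main(String[] args) {
        String d93a = toUrn("org.milyn.edi.unedifact", "d93a-mapping", "v1.4");
        String d03b = toUrn("org.milyn.edi.unedifact", "d03b-mapping", "v1.4");
        System.out.println(mappingModelAttribute(d93a, d03b));
    }
}
```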

4.29. Processing a D03B UN/EDIFACT Message Interchange

  1. To process a D03B UN/EDIFACT message interchange, follow the example below:
    
    Reading:
    
    // Create an instance of the EJC generated factory class... cache this and reuse !!!
    D03BInterchangeFactory factory = D03BInterchangeFactory.getInstance();
     
    // Deserialize the UN/EDIFACT interchange stream to Java...
    UNEdifactInterchange interchange = factory.fromUNEdifact(ediInStream);
     
    // Need to test which interchange syntax version.  Supports v4.1 at the moment...
    if(interchange instanceof UNEdifactInterchange41) {
        UNEdifactInterchange41 interchange41 = (UNEdifactInterchange41) interchange;
     
        for(UNEdifactMessage41 message : interchange41.getMessages()) {
            // Process the messages...
     
            Object messageObj = message.getMessage();
     
            if(messageObj instanceof Invoic) {
                // It's an INVOIC message....
                Invoic invoic = (Invoic) messageObj;
                ItemDescription itemDescription = invoic.getItemDescription();
                // etc etc....
            } else if(messageObj instanceof Cuscar) {
                // It's a CUSCAR message...
            }
            // Add further else-if branches for the other message types...
        }
    }
    
    
    Writing:
    
    factory.toUNEdifact(interchange, ediOutStream);
    
    
  2. Use Maven to add the ability to process a D03B message interchange by adding the binding dependency for that directory (you can also use pre-generated UN/EDIFACT Java object models distributed via the Maven SNAPSHOT and Central repositories):
    <dependency>
        <groupId>org.milyn.edi.unedifact</groupId>
        <artifactId>d03b-binding</artifactId>
        <version>v1.4</version>
    </dependency>
    

4.30. Processing JSON Data

  1. To process JSON data, you must configure a JSON reader:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:json="http://www.milyn.org/xsd/smooks/json-1.1.xsd">
     
        <json:reader/>
     
    </smooks-resource-list>
    
  2. Set the XML names of the root and array elements by using the following configuration options:
    • rootName: this is the name of the root element. The default is json.
    • arrayElementName: this is the name of the element that wraps each array item. The default is element.
  3. You may wish to use characters in the key name that are not allowed in an XML element name. The reader offers several solutions to this problem. It can replace white space and illegal characters, and it can add a prefix to key names that start with a number. You can also use it to replace one key name with a completely different one. The following sample configuration shows you how to do this:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:json="http://www.milyn.org/xsd/smooks/json-1.1.xsd">
     
        <json:reader keyWhitspaceReplacement="_" keyPrefixOnNumeric="n" illegalElementNameCharReplacement=".">
            <json:keyMap>
                <json:key from="some key">someKey</json:key>
                <json:key from="some&amp;key" to="someAndKey" />
            </json:keyMap>
        </json:reader>
     
    </smooks-resource-list>
    
    • keyWhitspaceReplacement: this is the replacement character for white spaces in a JSON map key. By default this is not defined, so the reader does not automatically search for white spaces.
    • keyPrefixOnNumeric: this is the prefix character to add if the JSON node name starts with a number. By default, this is not defined, so the reader does not search for element names that start with a number.
    • illegalElementNameCharReplacement: if illegal characters are encountered in a JSON element name then they are replaced with this value.
  4. You can also configure these optional settings:
    • nullValueReplacement: this is the replacement string for JSON null values. The default is an empty string.
    • encoding: this is the default encoding of any JSON message InputStream processed by the reader. The default encoding is UTF-8.

      Note

      This feature is deprecated. Instead, you should now manage the character encoding of the JSON stream source by supplying a java.io.Reader to the Smooks.filterSource() method.
  5. To configure Smooks programmatically to read JSON data, use the JSONReaderConfigurator class:
    Smooks smooks = new Smooks();
     
    smooks.setReaderConfig(new JSONReaderConfigurator()
            .setRootName("root")
            .setArrayElementName("e"));
     
    // Use Smooks as normal...
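
The rootName and arrayElementName options from step 2 are easiest to picture with a small example. The following Java sketch is not the Smooks JSON reader. It is a hypothetical stand-alone helper (JsonNamingSketch and toXml are invented names) that walks an already-parsed JSON structure, represented here as Map and List values, and names the generated elements in the way the options describe. It assumes that the keys are already valid XML names and it does not escape text content.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: this is NOT the Smooks implementation.
final class JsonNamingSketch {

    // Hypothetical helper: serializes a parsed JSON value using the given
    // root element name and array (sequence) element name.
    static String toXml(Object value, String rootName, String arrayElementName) {
        StringBuilder out = new StringBuilder();
        write(out, rootName, value, arrayElementName);
        return out.toString();
    }

    private static void write(StringBuilder out, String name, Object value,
                              String arrayElementName) {
        out.append('<').append(name).append('>');
        if (value instanceof Map) {
            // Each map key becomes a child element named after the key.
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                write(out, String.valueOf(entry.getKey()), entry.getValue(), arrayElementName);
            }
        } else if (value instanceof List) {
            // Each array item becomes a child element named arrayElementName.
            for (Object item : (List<?>) value) {
                write(out, arrayElementName, item, arrayElementName);
            }
        } else if (value != null) {
            // Scalars are written as text; null is written as empty content.
            out.append(value);
        }
        out.append("</").append(name).append('>');
    }

    public static void main(String[] args) {
        Map<String, Object> message = new LinkedHashMap<>();
        message.put("name", "Joe");
        message.put("tags", Arrays.asList("a", "b"));
        System.out.println(toXml(message, "json", "element"));
    }
}
```

With the names json and element, this sketch turns the sample message into <json><name>Joe</name><tags><element>a</element><element>b</element></tags></json>. Passing "root" and "e" instead mirrors the setRootName and setArrayElementName calls shown in step 5.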
    

4.31. Using Characters Not Allowed in XML when Processing JSON Data

To use characters in a key name that are not allowed in an XML element name, configure the reader to replace white spaces and illegal characters and to add a prefix to key names that start with a number. You can also use it to replace one key name with a completely different one. The following sample configuration shows how to do this:
<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:json="http://www.milyn.org/xsd/smooks/json-1.1.xsd">
 
    <json:reader keyWhitspaceReplacement="_" keyPrefixOnNumeric="n" illegalElementNameCharReplacement=".">
        <json:keyMap>
            <json:key from="some key">someKey</json:key>
            <json:key from="some&amp;key" to="someAndKey" />
        </json:keyMap>
    </json:reader>
 
</smooks-resource-list>
  • keyWhitspaceReplacement: this is the replacement character for white spaces in a JSON map key. By default this is not defined, so the reader does not automatically search for white spaces.
  • keyPrefixOnNumeric: this is the prefix character to add if the JSON node name starts with a number. By default, this is not defined, so the reader does not search for element names that start with a number.
  • illegalElementNameCharReplacement: if illegal characters are encountered in a JSON element name then they are replaced with this value.
These settings are optional:
  • nullValueReplacement: this is the replacement string for JSON null values. The default is an empty string.
  • encoding: this is the default encoding of any JSON message InputStream processed by the reader. The default encoding is UTF-8.

    Note

    This feature is deprecated. Instead, you should now manage the character encoding of the JSON stream source by supplying a java.io.Reader to the Smooks.filterSource() method.
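
The combined effect of the key map and the three replacement options described in this section can be pictured with the following Java sketch. It is not the Smooks implementation: KeyNameSketch and toElementName are hypothetical names, the order in which the rules are applied is an assumption, and the test for "illegal" characters is deliberately simplified to anything other than ASCII letters, digits, underscore, hyphen and period.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;

// Illustrative sketch only: this is NOT the Smooks implementation.
final class KeyNameSketch {

    // Hypothetical helper: turns a JSON key into an XML-safe element name.
    // A null option means "not defined", so that rule is skipped.
    static String toElementName(String key, Map<String, String> keyMap,
                                String whitespaceReplacement,
                                String prefixOnNumeric,
                                String illegalCharReplacement) {
        // Assumption: an explicit key mapping wins over the generic rules.
        String mapped = keyMap.get(key);
        if (mapped != null) {
            return mapped;
        }
        String name = key;
        if (whitespaceReplacement != null) {
            name = name.replaceAll("\\s", Matcher.quoteReplacement(whitespaceReplacement));
        }
        if (prefixOnNumeric != null && !name.isEmpty() && Character.isDigit(name.charAt(0))) {
            name = prefixOnNumeric + name;
        }
        if (illegalCharReplacement != null) {
            // Simplified rule: real XML name rules allow more characters.
            name = name.replaceAll("[^A-Za-z0-9_.\\-]",
                    Matcher.quoteReplacement(illegalCharReplacement));
        }
        return name;
    }

    public static void main(String[] args) {
        Map<String, String> keyMap = new LinkedHashMap<>();
        keyMap.put("some key", "someKey");
        keyMap.put("some&key", "someAndKey");

        for (String key : new String[] {"some key", "some&key", "first name", "1st", "a&b"}) {
            System.out.println(key + " -> " + toElementName(key, keyMap, "_", "n", "."));
        }
    }
}
```

With the option values from the sample configuration ("_", "n" and "."), this sketch maps "first name" to first_name, "1st" to n1st and "a&b" to a.b, while the two keys listed in the key map are replaced outright.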

4.32. Configuring YAML Streams

Procedure 4.1. Task

  1. Configure your reader to process YAML files as shown:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:yaml="http://www.milyn.org/xsd/smooks/yaml-1.4.xsd">
     
        <yaml:reader/>
     
    </smooks-resource-list>
    
  2. A YAML stream can contain multiple documents. The reader handles this by adding a document element as a child of the root element. An XML-serialized YAML stream with one empty YAML document looks like this:
    <yaml>
        <document>
        </document>
    </yaml>
    
  3. Configure Smooks programmatically to read YAML data by using the YamlReaderConfigurator class:
    Smooks smooks = new Smooks();
     
    smooks.setReaderConfig(new YamlReaderConfigurator()
            .setRootName("root")
            .setDocumentName("doc")
            .setArrayElementName("e")
            .setAliasStrategy(AliasStrategy.REFER_RESOLVE)
            .setAnchorAttributeName("anchor")
            .setAliasAttributeName("alias"));
     
    // Use Smooks as normal...
    

4.33. Supported Result Types

Smooks can work with standard JDK StreamResult and DOMResult result types, as well as these specialist ones:
  • JavaResult: use this result type to capture the contents of the Smooks Java Bean context.
  • ValidationResult: use this result type to capture validation outputs.
  • StringResult: use this simple result type when writing tests. It is a StreamResult extension that wraps a StringWriter.

4.34. Using Characters Not Allowed in XML when Processing YAML Data

  • You can use characters in a key name that are not allowed in an XML element name. The reader offers several solutions to this problem. It can replace white spaces and illegal characters, and it can add a prefix to key names that start with a number. You can also configure it to replace one key name with a completely different one, as shown below:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:yaml="http://www.milyn.org/xsd/smooks/yaml-1.4.xsd">
     
        <yaml:reader keyWhitspaceReplacement="_" keyPrefixOnNumeric="n" illegalElementNameCharReplacement=".">
            <yaml:keyMap>
                <yaml:key from="some key">someKey</yaml:key>
                <yaml:key from="some&amp;key" to="someAndKey" />
            </yaml:keyMap>
        </yaml:reader>
     
    </smooks-resource-list>
    

4.35. Options for Replacing XML in YAML

  • keyWhitspaceReplacement: This is the replacement character for white spaces in a YAML map key. By default this is not defined.
  • keyPrefixOnNumeric: Add this prefix if the YAML node name starts with a number. By default this is not defined.
  • illegalElementNameCharReplacement: If illegal characters are encountered in a YAML element name, they are replaced with this value. By default this is not defined.

4.36. Anchors and Aliases in YAML

The YAML reader can handle anchors and aliases via three different strategies. Define your strategy of choice via the aliasStrategy configuration option. This option can have one of the following values:
  • REFER: The reader creates reference attributes on the element that has an anchor or an alias. The element with the anchor obtains the id attribute containing the name from the anchor as the attribute value. The element with the alias gets the ref attribute also containing the name of the anchor as the attribute value. You can define the anchor and alias attribute names by setting the anchorAttributeName and aliasAttributeName properties.
  • RESOLVE: The reader resolves the value or the data structure of an anchor when its alias is encountered. This means that the SAX events of the anchor are repeated as child events of the alias element. When a YAML document contains many anchors, or anchors with substantial data structures, this can lead to memory problems.
  • REFER_RESOLVE: This is a combination of REFER and RESOLVE. The anchor and alias attributes are set but the anchor value or data structure is also resolved. This option is useful when the name of the anchor has a business meaning.
The YAML reader uses the REFER strategy by default.

4.37. Java Object Graph Transformation

  1. Smooks can transform one Java object graph into another. To do this, it uses the SAX processing model, which means no intermediate object model is constructed. Instead, the source Java object graph is turned directly into a stream of SAX events, which are used to populate the target Java object graph.
    If you use the HTML Smooks Report Generator tool, you will see that the event stream produced by the source object model is as follows:
    <example.srcmodel.Order>
        <header>
            <customerNumber>
            </customerNumber>
            <customerName>
            </customerName>
        </header>
        <orderItems>
            <example.srcmodel.OrderItem>
                <productId>
                </productId>
                <quantity>
                </quantity>
                <price>
                </price>
            </example.srcmodel.OrderItem>
        </orderItems>
    </example.srcmodel.Order>
    
  2. Aim the Smooks Java bean resources at this event stream. The Smooks configuration for performing this transformation (smooks-config.xml) is as follows:
    <?xml version="1.0"?>
    <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:jb="http://www.milyn.org/xsd/smooks/javabean-1.4.xsd">
     
        <jb:bean beanId="lineOrder" class="example.trgmodel.LineOrder" createOnElement="example.srcmodel.Order">
            <jb:wiring property="lineItems" beanIdRef="lineItems" />
            <jb:value property="customerId" data="header/customerNumber" />
            <jb:value property="customerName" data="header/customerName" />
        </jb:bean>
     
        <jb:bean beanId="lineItems" class="example.trgmodel.LineItem[]" createOnElement="orderItems">
            <jb:wiring beanIdRef="lineItem" />
        </jb:bean>
     
     
        <jb:bean beanId="lineItem" class="example.trgmodel.LineItem" createOnElement="example.srcmodel.OrderItem">
            <jb:value property="productCode" data="example.srcmodel.OrderItem/productId" />
            <jb:value property="unitQuantity" data="example.srcmodel.OrderItem/quantity" />
            <jb:value property="unitPrice" data="example.srcmodel.OrderItem/price" />
        </jb:bean>
     
    </smooks-resource-list>
    
  3. The source object model is provided to Smooks via an org.milyn.delivery.JavaSource object. Create this object by passing the source model's root object to the constructor. The resulting JavaSource object is passed to the Smooks#filterSource method. Here is the resulting code:
    protected LineOrder runSmooksTransform(Order srcOrder) throws IOException, SAXException {
        Smooks smooks = new Smooks("smooks-config.xml");
        ExecutionContext executionContext = smooks.createExecutionContext();
     
        // Transform the source Order to the target LineOrder via a
        // JavaSource and JavaResult instance...
        JavaSource source = new JavaSource(srcOrder);
        JavaResult result = new JavaResult();
     
        // Configure the execution context to generate a report...
        executionContext.setEventListener(new HtmlReportGenerator("target/report/report.html"));
     
        smooks.filterSource(executionContext, source, result);
     
        return (LineOrder) result.getBean("lineOrder");
    }
    

4.38. String Manipulation on Input Data

The CSV and fixed-length readers allow you to execute string manipulation functions on the input data before the data is converted into SAX events. The following functions are available:
  • upper_case: this returns the upper case version of the string.
  • lower_case: this returns the lower case version of the string.
  • cap_first: this returns the string with the very first word capitalized.
  • uncap_first: this returns the string with the very first word un-capitalized. It is the opposite of cap_first.
  • capitalize: this returns the string with all words capitalized.
  • trim: this returns the string without leading and trailing white-spaces.
  • left_trim: this returns the string without leading white-spaces.
  • right_trim: this returns the string without trailing white-spaces.
You can chain functions by separating them with a period (.). Here is an example: trim.upper_case
How you define the functions per field depends on the reader you are using.
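
To show what a chained expression such as trim.upper_case does to a field value, here is a minimal Java sketch. It is not the code the Smooks readers use: StringFunctionSketch and apply are hypothetical names, only some of the functions listed above are included, and cap_first is simplified to upper-casing the first character of the string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustrative sketch only: this is NOT the Smooks implementation.
final class StringFunctionSketch {

    private static final Map<String, UnaryOperator<String>> FUNCTIONS = new LinkedHashMap<>();

    static {
        FUNCTIONS.put("upper_case", String::toUpperCase);
        FUNCTIONS.put("lower_case", String::toLowerCase);
        FUNCTIONS.put("trim", String::trim);
        FUNCTIONS.put("left_trim", s -> s.replaceAll("^\\s+", ""));
        FUNCTIONS.put("right_trim", s -> s.replaceAll("\\s+$", ""));
        // Simplification: upper-cases the first character only.
        FUNCTIONS.put("cap_first",
                s -> s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1));
    }

    // Applies the functions named in the chain from left to right.
    static String apply(String chain, String input) {
        String value = input;
        for (String name : chain.split("\\.")) {
            UnaryOperator<String> function = FUNCTIONS.get(name);
            if (function == null) {
                throw new IllegalArgumentException("Unknown function: " + name);
            }
            value = function.apply(value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(apply("trim.upper_case", "  smith "));
        System.out.println(apply("lower_case.cap_first", "JOHN"));
    }
}
```

In this sketch the functions run from left to right, so trim.upper_case first removes the surrounding white space from "  smith " and then converts the result to SMITH.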