Chapter 2. Defining Logical Data Units

Abstract

When describing a service in a WSDL contract complex data types are defined as logical units using XML Schema.

2.1. Introduction to Logical Data Units

When defining a service, the first thing you must consider is how the data used as parameters for the exposed operations is going to be represented. Unlike applications that are written in a programming language that uses fixed data structures, services must define their data in logical units that can be consumed by any number of applications. This involves two steps:

  1. Breaking the data into logical units that can be mapped into the data types used by the physical implementations of the service
  2. Combining the logical units into messages that are passed between endpoints to carry out the operations

This chapter discusses the first step. Chapter 3, Defining Logical Messages Used by a Service discusses the second step.

2.2. Mapping data into logical data units

Overview

The interfaces used to implement a service define the data representing operation parameters as XML documents. If you are defining an interface for a service that is already implemented, you must translate the data types of the implemented operations into discreet XML elements that can be assembled into messages. If you are starting from scratch, you must determine the building blocks from which your messages are built, so that they make sense from an implementation standpoint.

Available type systems for defining service data units

According to the WSDL specification, you can use any type system you choose to define data types in a WSDL contract. However, the W3C specification states that XML Schema is the preferred canonical type system for a WSDL document. Therefore, XML Schema is the intrinsic type system in Apache CXF.

XML Schema as a type system

XML Schema is used to define how an XML document is structured. This is done by defining the elements that make up the document. These elements can use native XML Schema types, like xsd:int, or they can use types that are defined by the user. User defined types are either built up using combinations of XML elements or they are defined by restricting existing types. By combining type definitions and element definitions you can create intricate XML documents that can contain complex data.

When used in WSDL XML Schema defines the structure of the XML document that holds the data used to interact with a service. When defining the data units used by your service, you can define them as types that specify the structure of the message parts. You can also define your data units as elements that make up the message parts.

Considerations for creating your data units

You might consider simply creating logical data units that map directly to the types you envision using when implementing the service. While this approach works, and closely follows the model of building RPC-style applications, it is not necessarily ideal for building a piece of a service-oriented architecture.

The Web Services Interoperability Organization’s WS-I basic profile provides a number of guidelines for defining data units and can be accessed at http://www.ws-i.org/Profiles/BasicProfile-1.1-2004-08-24.html#WSDLTYPES. In addition, the W3C also provides the following guidelines for using XML Schema to represent data types in WSDL documents:

  • Use elements, not attributes.
  • Do not use protocol-specific types as base types.

2.3. Adding data units to a contract

Overview

Depending on how you choose to create your WSDL contract, creating new data definitions requires varying amounts of knowledge. The Apache CXF GUI tools provide a number of aids for describing data types using XML Schema. Other XML editors offer different levels of assistance. Regardless of the editor you choose, it is a good idea to have some knowledge about what the resulting contract should look like.

Procedure

Defining the data used in a WSDL contract involves the following steps:

  1. Determine all the data units used in the interface described by the contract.
  2. Create a types element in your contract.
  3. Create a schema element, shown in Example 2.1, “Schema entry for a WSDL contract”, as a child of the type element.

    The targetNamespace attribute specifies the namespace under which new data types are defined. Best practice is to also define the namespace that provides access to the target namespace. The remaining entries should not be changed.

    Example 2.1. Schema entry for a WSDL contract

    <schema targetNamespace="http://schemas.iona.com/bank.idl"
            xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:xsd1="http://schemas.iona.com/bank.idl"
            xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
  4. For each complex type that is a collection of elements, define the data type using a complexType element. See Section 2.5.1, “Defining data structures”.
  5. For each array, define the data type using a complexType element. See Section 2.5.2, “Defining arrays”.
  6. For each complex type that is derived from a simple type, define the data type using a simpleType element. See Section 2.5.4, “Defining types by restriction”.
  7. For each enumerated type, define the data type using a simpleType element. See Section 2.5.5, “Defining enumerated types”.
  8. For each element, define it using an element element. See Section 2.6, “Defining elements”.

2.4. XML Schema simple types

Overview

If a message part is going to be of a simple type it is not necessary to create a type definition for it. However, the complex types used by the interfaces defined in the contract are defined using simple types.

Entering simple types

XML Schema simple types are mainly placed in the element elements used in the types section of your contract. They are also used in the base attribute of restriction elements and extension elements.

Simple types are always entered using the xsd prefix. For example, to specify that an element is of type int, you would enter xsd:int in its type attribute as shown in Example 2.2, “Defining an element with a simple type”.

Example 2.2. Defining an element with a simple type

<element name="simpleInt" type="xsd:int" />

Supported XSD simple types

Apache CXF supports the following XML Schema simple types:

  • xsd:string
  • xsd:normalizedString
  • xsd:int
  • xsd:unsignedInt
  • xsd:long
  • xsd:unsignedLong
  • xsd:short
  • xsd:unsignedShort
  • xsd:float
  • xsd:double
  • xsd:boolean
  • xsd:byte
  • xsd:unsignedByte
  • xsd:integer
  • xsd:positiveInteger
  • xsd:negativeInteger
  • xsd:nonPositiveInteger
  • xsd:nonNegativeInteger
  • xsd:decimal
  • xsd:dateTime
  • xsd:time
  • xsd:date
  • xsd:QName
  • xsd:base64Binary
  • xsd:hexBinary
  • xsd:ID
  • xsd:token
  • xsd:language
  • xsd:Name
  • xsd:NCName
  • xsd:NMTOKEN
  • xsd:anySimpleType
  • xsd:anyURI
  • xsd:gYear
  • xsd:gMonth
  • xsd:gDay
  • xsd:gYearMonth
  • xsd:gMonthDay

2.5. Defining complex data types

Abstract

XML Schema provides a flexible and powerful mechanism for building complex data structures from its simple data types. You can create data structures by creating a sequence of elements and attributes. You can also extend your defined types to create even more complex types.

In addition to building complex data structures, you can also describe specialized types such as enumerated types, data types that have a specific range of values, or data types that need to follow certain patterns by either extending or restricting the primitive types.

2.5.1. Defining data structures

Overview

In XML Schema, data units that are a collection of data fields are defined using complexType elements. Specifying a complex type requires three pieces of information:

  1. The name of the defined type is specified in the name attribute of the complexType element.
  2. The first child element of the complexType describes the behavior of the structure’s fields when it is put on the wire. See the section called “Complex type varieties”.
  3. Each of the fields of the defined structure are defined in element elements that are grandchildren of the complexType element. See the section called “Defining the parts of a structure”.

For example, the structure shown in Example 2.3, “Simple Structure” is defined in XML Schema as a complex type with two elements.

Example 2.3. Simple Structure

struct personalInfo
{
  string name;
  int age;
};

Example 2.4, “A complex type” shows one possible XML Schema mapping for the structure shown in Example 2.3, “Simple Structure” The structure defined in Example 2.4, “A complex type” generates a message containing two elements: name and age.

.

Example 2.4. A complex type

<complexType name="personalInfo">
  <sequence>
    <element name="name" type="xsd:string" />
    <element name="age" type="xsd:int" />
  </sequence>
</complexType>

Complex type varieties

XML Schema has three ways of describing how the fields of a complex type are organized when represented as an XML document and passed on the wire. The first child element of the complexType element determines which variety of complex type is being used. Table 2.1, “Complex type descriptor elements” shows the elements used to define complex type behavior.

Table 2.1. Complex type descriptor elements

ElementComplex Type Behavior

sequence

All of a complex type’s fields can be present and they must be in the order in which they are specified in the type definition.

all

All of the complex type’s fields can be present but they can be in any order.

choice

Only one of the elements in the structure can be placed in the message.

If the structure is defined using a choice element, as shown in Example 2.5, “Simple complex choice type”, it generates a message with either a name element or an age element.

Example 2.5. Simple complex choice type

<complexType name="personalInfo">
  <choice>
    <element name="name" type="xsd:string"/>
    <element name="age" type="xsd:int"/>
  </choice>
</complexType>

Defining the parts of a structure

You define the data fields that make up a structure using element elements. Every complexType element should contain at least one element element. Each element element in the complexType element represents a field in the defined data structure.

To fully describe a field in a data structure, element elements have two required attributes:

  • The name attribute specifies the name of the data field and it must be unique within the defined complex type.
  • The type attribute specifies the type of the data stored in the field. The type can be either one of the XML Schema simple types, or any named complex type that is defined in the contract.

In addition to name and type, element elements have two other commonly used optional attributes: minOcurrs and maxOccurs. These attributes place bounds on the number of times the field occurs in the structure. By default, each field occurs only once in a complex type. Using these attributes, you can change how many times a field must or can appear in a structure. For example, you can define a field, previousJobs, that must occur at least three times, and no more than seven times, as shown in Example 2.6, “Simple complex type with occurrence constraints”.

Example 2.6. Simple complex type with occurrence constraints

<complexType name="personalInfo">
  <all>
    <element name="name" type="xsd:string"/>
    <element name="age" type="xsd:int"/>
    <element name="previousJobs" type="xsd:string:
             minOccurs="3" maxOccurs="7"/>
  </all>
</complexType>

You can also use the minOccurs to make the age field optional by setting the minOccurs to zero as shown in Example 2.7, “Simple complex type with minOccurs set to zero”. In this case age can be omitted and the data will still be valid.

Example 2.7. Simple complex type with minOccurs set to zero

<complexType name="personalInfo">
  <choice>
    <element name="name" type="xsd:string"/>
    <element name="age" type="xsd:int" minOccurs="0"/>
  </choice>
</complexType>

Defining attributes

In XML documents, attributes are contained in the element’s tag. For example, in the complexType element in the code below, name is an attribute. To specify an attribute for a complex type, you define an attribute element in the complexType element definition. An attribute element can appear only after the all, sequence, or choice element. Specify one attribute element for each of the complex type’s attributes. Any attribute elements must be direct children of the complexType element.

Example 2.8. Complex type with an attribute

<complexType name="personalInfo">
  <all>
    <element name="name" type="xsd:string"/>
    <element name="previousJobs" type="xsd:string"
             minOccurs="3" maxOccurs="7"/>
  </all>
  <attribute name="age" type="xsd:int" use="required" />
</complexType>

In the previous code, the attribute element specifies that the personalInfo complex type has an age attribute. The attribute element has these attributes:

  • name — A required attribute that specifies the string that identifies the attribute.
  • type — Specifies the type of the data stored in the field. The type can be one of the XML Schema simple types.
  • use — An optional attribute that specifies whether the complex type is required to have this attribute. Valid values are required or optional. The default is that the attribute is optional.

In an attribute element, you can specify the optional default attribute, which lets you specify a default value for the attribute.

2.5.2. Defining arrays

Overview

Apache CXF supports two methods for defining arrays in a contract. The first is define a complex type with a single element whose maxOccurs attribute has a value greater than one. The second is to use SOAP arrays. SOAP arrays provide added functionality such as the ability to easily define multi-dimensional arrays and to transmit sparsely populated arrays.

Complex type arrays

Complex type arrays are a special case of a sequence complex type. You simply define a complex type with a single element and specify a value for the maxOccurs attribute. For example, to define an array of twenty floating point numbers you use a complex type similar to the one shown in Example 2.9, “Complex type array”.

Example 2.9. Complex type array

<complexType name="personalInfo">
  <element name="averages" type="xsd:float" maxOccurs="20"/>
</complexType>

You can also specify a value for the minOccurs attribute.

SOAP arrays

SOAP arrays are defined by deriving from the SOAP-ENC:Array base type using the wsdl:arrayType element. The syntax for this is shown in Example 2.10, “Syntax for a SOAP array derived using wsdl:arrayType”. Ensure that the definitions element declares xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/".

Example 2.10. Syntax for a SOAP array derived using wsdl:arrayType

<complexType name="TypeName">
  <complexContent>
    <restriction base="SOAP-ENC:Array">
      <attribute ref="SOAP-ENC:arrayType"
                 wsdl:arrayType="ElementType<ArrayBounds>"/>
    </restriction>
  </complexContent>
</complexType>

Using this syntax, TypeName specifies the name of the newly-defined array type. ElementType specifies the type of the elements in the array. ArrayBounds specifies the number of dimensions in the array. To specify a single dimension array use []; to specify a two-dimensional array use either [][] or [,].

For example, the SOAP Array, SOAPStrings, shown in Example 2.11, “Definition of a SOAP array”, defines a one-dimensional array of strings. The wsdl:arrayType attribute specifies the type of the array elements, xsd:string, and the number of dimensions, with [] implying one dimension.

Example 2.11. Definition of a SOAP array

<complexType name="SOAPStrings">
  <complexContent>
    <restriction base="SOAP-ENC:Array">
      <attribute ref="SOAP-ENC:arrayType"
                 wsdl:arrayType="xsd:string[]"/>
    </restriction>
  </complexContent>
</complexType>

You can also describe a SOAP Array using a simple element as described in the SOAP 1.1 specification. The syntax for this is shown in Example 2.12, “Syntax for a SOAP array derived using an element”.

Example 2.12. Syntax for a SOAP array derived using an element

<complexType name="TypeName">
  <complexContent>
    <restriction base="SOAP-ENC:Array">
      <sequence>
        <element name="ElementName" type="ElementType"
                 maxOccurs="unbounded"/>
      </sequence>
    </restriction>
  </complexContent>
</complexType>

When using this syntax, the element’s maxOccurs attribute must always be set to unbounded.

2.5.3. Defining types by extension

Like most major coding languages, XML Schema allows you to create data types that inherit some of their elements from other data types. This is called defining a type by extension. For example, you could create a new type called alienInfo, that extends the personalInfo structure defined in Example 2.4, “A complex type” by adding a new element called planet.

Types defined by extension have four parts:

  1. The name of the type is defined by the name attribute of the complexType element.
  2. The complexContent element specifies that the new type will have more than one element.

    Note

    If you are only adding new attributes to the complex type, you can use a simpleContent element.

  3. The type from which the new type is derived, called the base type, is specified in the base attribute of the extension element.
  4. The new type’s elements and attributes are defined in the extension element, the same as they are for a regular complex type.

For example, alienInfo is defined as shown in Example 2.13, “Type defined by extension”.

Example 2.13. Type defined by extension

<complexType name="alienInfo">
  <complexContent>
    <extension base="xsd1:personalInfo">
      <sequence>
        <element name="planet" type="xsd:string"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

2.5.4. Defining types by restriction

Overview

XML Schema allows you to create new types by restricting the possible values of an XML Schema simple type. For example, you can define a simple type, SSN, which is a string of exactly nine characters. New types defined by restricting simple types are defined using a simpleType element.

The definition of a type by restriction requires three things:

  1. The name of the new type is specified by the name attribute of the simpleType element.
  2. The simple type from which the new type is derived, called the base type, is specified in the restriction element. See the section called “Specifying the base type”.
  3. The rules, called facets, defining the restrictions placed on the base type are defined as children of the restriction element. See the section called “Defining the restrictions”.

Specifying the base type

The base type is the type that is being restricted to define the new type. It is specified using a restriction element. The restriction element is the only child of a simpleType element and has one attribute, base, that specifies the base type. The base type can be any of the XML Schema simple types.

For example, to define a new type by restricting the values of an xsd:int you use a definition like the one shown in Example 2.14, “Using int as the base type”.

Example 2.14. Using int as the base type

<simpleType name="restrictedInt">
  <restriction base="xsd:int">
    ...
  </restriction>
</simpleType>

Defining the restrictions

The rules defining the restrictions placed on the base type are called facets. Facets are elements with one attribute, value, that defines how the facet is enforced. The available facets and their valid value settings depend on the base type. For example, xsd:string supports six facets, including:

  • length
  • minLength
  • maxLength
  • pattern
  • whitespace
  • enumeration

Each facet element is a child of the restriction element.

Example

Example 2.15, “SSN simple type description” shows an example of a simple type, SSN, which represents a social security number. The resulting type is a string of the form xxx-xx-xxxx. <SSN>032-43-9876<SSN> is a valid value for an element of this type, but <SSN>032439876</SSN> is not.

Example 2.15. SSN simple type description

<simpleType name="SSN">
  <restriction base="xsd:string">
    <pattern value="\d{3}-\d{2}-\d{4}"/>
  </restriction>
</simpleType>

2.5.5. Defining enumerated types

Overview

Enumerated types in XML Schema are a special case of definition by restriction. They are described by using the enumeration facet which is supported by all XML Schema primitive types. As with enumerated types in most modern programming languages, a variable of this type can only have one of the specified values.

Defining an enumeration in XML Schema

The syntax for defining an enumeration is shown in Example 2.16, “Syntax for an enumeration”.

Example 2.16. Syntax for an enumeration

<simpleType name="EnumName">
  <restriction base="EnumType">
    <enumeration value="Case1Value"/>
    <enumeration value="Case2Value"/>
    ...
    <enumeration value="CaseNValue"/>
  </restriction>
</simpleType>

EnumName specifies the name of the enumeration type. EnumType specifies the type of the case values. CaseNValue, where N is any number one or greater, specifies the value for each specific case of the enumeration. An enumerated type can have any number of case values, but because it is derived from a simple type, only one of the case values is valid at a time.

Example

For example, an XML document with an element defined by the enumeration widgetSize, shown in Example 2.17, “widgetSize enumeration”, would be valid if it contained <widgetSize>big</widgetSize>, but it would not be valid if it contained <widgetSize>big,mungo</widgetSize>.

Example 2.17. widgetSize enumeration

<simpleType name="widgetSize">
  <restriction base="xsd:string">
    <enumeration value="big"/>
    <enumeration value="large"/>
    <enumeration value="mungo"/>
  </restriction>
</simpleType>

2.6. Defining elements

Elements in XML Schema represent an instance of an element in an XML document generated from the schema. The most basic element consists of a single element element. Like the element element used to define the members of a complex type, they have three attributes:

  • name — A required attribute that specifies the name of the element as it appears in an XML document.
  • type — Specifies the type of the element. The type can be any XML Schema primitive type or any named complex type defined in the contract. This attribute can be omitted if the type has an in-line definition.
  • nillable — Specifies whether an element can be omitted from a document entirely. If nillable is set to true, the element can be omitted from any document generated using the schema.

An element can also have an in-line type definition. In-line types are specified using either a complexType element or a simpleType element. Once you specify if the type of data is complex or simple, you can define any type of data needed using the tools available for each type of data. In-line type definitions are discouraged because they are not reusable.