Chapter 383. Zip File DataFormat

Available as of Camel version 2.11

The Zip File Data Format is a message compression and decompression format. Messages can be marshalled (compressed) to Zip files containing a single entry, and Zip files containing a single entry can be unmarshalled (decompressed) to the original file contents. This data format supports ZIP64, as long as Java 7 or later is being used.
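To make the single-entry behavior concrete, the following is a minimal sketch that uses only java.util.zip from the JDK. It is not the camel-zipfile implementation; the class SingleEntryZip and its method names are hypothetical and chosen purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of camel-zipfile.
final class SingleEntryZip {

    // "Marshal": wrap the payload in a Zip archive that holds exactly one entry.
    static byte[] zip(String entryName, byte[] payload) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(payload);
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // "Unmarshal": read the first entry and return its original contents.
    static byte[] unzip(byte[] archive) {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            if (zis.getNextEntry() == null) {
                throw new IllegalArgumentException("archive contains no entries");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            // read() returns -1 at the end of the current entry
            while ((n = zis.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling SingleEntryZip.unzip(SingleEntryZip.zip("test.txt", bytes)) should return a byte array equal to the input, which mirrors the marshal/unmarshal round trip described above.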

383.1. ZipFile Options

The Zip File dataformat supports 4 options, which are listed below.

Name | Default | Java Type | Description
usingIterator | false | Boolean | If the zip file has more than one entry, setting this option to true lets you work with the Splitter EIP and split the data using an iterator in streaming mode.
allowEmptyDirectory | false | Boolean | If the zip file has more than one entry, setting this option to true lets you get the iterator even if the directory is empty.
preservePathElements | false | Boolean | If the file name contains path elements, setting this option to true allows the path to be maintained in the zip file.
contentTypeHeader | false | Boolean | Whether the data format should set the Content-Type header with the type from the data format, if the data format is capable of doing so. For example, application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSON.

383.2. Spring Boot Auto-Configuration

The component supports 5 options, which are listed below.

Name | Description | Default | Type
camel.dataformat.zipfile.allow-empty-directory | If the zip file has more than one entry, setting this option to true lets you get the iterator even if the directory is empty. | false | Boolean
camel.dataformat.zipfile.content-type-header | Whether the data format should set the Content-Type header with the type from the data format, if the data format is capable of doing so. For example, application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSON. | false | Boolean
camel.dataformat.zipfile.enabled | Enable zipfile dataformat. | true | Boolean
camel.dataformat.zipfile.preserve-path-elements | If the file name contains path elements, setting this option to true allows the path to be maintained in the zip file. | false | Boolean
camel.dataformat.zipfile.using-iterator | If the zip file has more than one entry, setting this option to true lets you work with the Splitter EIP and split the data using an iterator in streaming mode. | false | Boolean


383.3. Marshal

In this example we marshal a regular text/XML payload to a compressed payload using Zip file compression, and send it to an ActiveMQ queue called MY_QUEUE.

from("direct:start")
    .marshal().zipFile()
    .to("activemq:queue:MY_QUEUE");

The name of the Zip entry inside the created Zip file is based on the incoming CamelFileName message header, which is the standard message header used by the file component. Additionally, the outgoing CamelFileName message header is automatically set to the value of the incoming CamelFileName message header, with the ".zip" suffix. So for example, if the following route finds a file named "test.txt" in the input directory, the output will be a Zip file named "test.txt.zip" containing a single Zip entry named "test.txt":

from("file:input/directory?antInclude=**/*.txt")
    .marshal().zipFile()
    .to("file:output/directory");

If there is no incoming CamelFileName message header (for example, if the file component is not the consumer), then the message ID is used by default, and since the message ID is normally a unique generated ID, you will end up with filenames like ID-MACHINENAME-2443-1211718892437-1-0.zip. If you want to override this behavior, then you can set the value of the CamelFileName header explicitly in your route:

from("direct:start")
    .setHeader(Exchange.FILE_NAME, constant("report.txt"))
    .marshal().zipFile()
    .to("file:output/directory");

This route would result in a Zip file named "report.txt.zip" in the output directory, containing a single Zip entry named "report.txt".
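The naming rule described above can be sketched in plain Java. This is an illustration only, not code from camel-zipfile: ZipNaming and its methods are hypothetical names, and the assumption that the Zip entry is also named after the message ID in the fallback case is inferred from the text rather than stated by it.

```java
// Hypothetical helper that mirrors the naming rule described in the text.
final class ZipNaming {

    // Entry name: the incoming CamelFileName header if present,
    // otherwise the message ID (assumed fallback for the entry as well).
    static String entryName(String camelFileName, String messageId) {
        return camelFileName != null ? camelFileName : messageId;
    }

    // Outgoing CamelFileName: the entry name with the ".zip" suffix appended.
    static String outputFileName(String camelFileName, String messageId) {
        return entryName(camelFileName, messageId) + ".zip";
    }
}
```

With a header value of report.txt the sketch yields report.txt.zip; with no header it yields the message ID followed by .zip.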

383.4. Unmarshal

In this example we unmarshal a Zip file payload from an ActiveMQ queue called MY_QUEUE to its original format, and forward it for processing to the UnZippedMessageProcessor.

from("activemq:queue:MY_QUEUE")
    .unmarshal().zipFile()
    .process(new UnZippedMessageProcessor());

If the Zip file has more than one entry, set the usingIterator option of ZipFileDataFormat to true and use the splitter to process each entry:

ZipFileDataFormat zipFile = new ZipFileDataFormat();
zipFile.setUsingIterator(true);

from("file:src/test/resources/org/apache/camel/dataformat/zipfile/?consumer.delay=1000&noop=true")
    .unmarshal(zipFile)
    .split(body(Iterator.class)).streaming()
        .process(new UnZippedMessageProcessor())
    .end();

Alternatively, you can use the ZipSplitter directly as the splitter expression:

from("file:src/test/resources/org/apache/camel/dataformat/zipfile?consumer.delay=1000&noop=true")
    .split(new ZipSplitter()).streaming()
        .process(new UnZippedMessageProcessor())
    .end();
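The idea behind iterating over a multi-entry Zip file in streaming mode can be illustrated with the JDK alone. The sketch below is not how ZipSplitter or the usingIterator option is implemented; ZipEntryWalker, forEachEntry and sampleArchive are hypothetical names. It only shows that each entry can be handed to a callback before the next entry is read.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of camel-zipfile.
final class ZipEntryWalker {

    // Visits every file entry in archive order. Each name and payload is passed
    // to the handler before the next entry is read, so only one entry's bytes
    // are held in memory at a time.
    static void forEachEntry(InputStream zipStream, BiConsumer<String, byte[]> handler) {
        try (ZipInputStream zis = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directory entries carry no payload
                }
                ByteArrayOutputStream body = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zis.read(chunk)) != -1) {
                    body.write(chunk, 0, n);
                }
                handler.accept(entry.getName(), body.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small in-memory archive from (name, content) pairs so the
    // walker can be tried out without touching the file system.
    static byte[] sampleArchive(String... namesAndContents) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
            for (int i = 0; i + 1 < namesAndContents.length; i += 2) {
                zos.putNextEntry(new ZipEntry(namesAndContents[i]));
                zos.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

In the routes above, the role of the handler is played by UnZippedMessageProcessor, which receives one split message per Zip entry.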

383.5. Aggregate

Note

Please note that this aggregation strategy requires eager completion check to work properly.

In this example we aggregate all text files found in the input directory into a single Zip file that is stored in the output directory. 

from("file:input/directory?antInclude=**/*.txt")
    .aggregate(constant(true), new ZipAggregationStrategy())
        .completionFromBatchConsumer().eagerCheckCompletion()
        .to("file:output/directory");

The outgoing CamelFileName message header is created using java.io.File.createTempFile, with the ".zip" suffix. If you want to override this behavior, then you can set the value of the CamelFileName header explicitly in your route:

from("file:input/directory?antInclude=**/*.txt")
    .aggregate(constant(true), new ZipAggregationStrategy())
        .completionFromBatchConsumer().eagerCheckCompletion()
        .setHeader(Exchange.FILE_NAME, constant("reports.zip"))
        .to("file:output/directory");

383.6. Dependencies

To use Zip files in your Camel routes, you need to add a dependency on camel-zipfile, which implements this data format.

If you use Maven you can just add the following to your pom.xml, substituting the version number for the latest & greatest release (see the download page for the latest versions).

<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-zipfile</artifactId>
  <version>x.x.x</version>
  <!-- use the same version as your Camel core version -->
</dependency>

383.7. Zipkin Component

Available as of Camel 2.18

The camel-zipkin component is used for tracing and timing incoming and outgoing Camel messages using zipkin.

Events (spans) are captured for incoming and outgoing messages being sent to/from Camel.

Note

camel-zipkin is planned to be refactored in Camel 2.22.0 to not use zipkin-scribe but use the default http transport. This work may cause backwards incompatibility.

To give the captured events meaningful names, you need to configure which Camel endpoints map to zipkin service names.

The mapping can be configured using:

  • route id - A Camel route id
  • endpoint url - A Camel endpoint url

For both kinds you can match using wildcards and regular expressions, using the rules from Intercept.

To match all Camel messages, use * as the pattern and map it to a single service name.

If no mapping has been configured, Camel will fall back to using endpoint URIs as service names.
However, it is recommended to configure service mappings so you can use human-readable names instead of Camel endpoint URIs.

Camel will auto-configure a span reporter if one has not been explicitly configured and the hostname and port of a zipkin collector have been configured as environment variables:

  • ZIPKIN_COLLECTOR_HTTP_SERVICE_HOST - The http hostname
  • ZIPKIN_COLLECTOR_HTTP_SERVICE_PORT - The port number

or

  • ZIPKIN_COLLECTOR_THRIFT_SERVICE_HOST - The Scribe (Thrift RPC) hostname
  • ZIPKIN_COLLECTOR_THRIFT_SERVICE_PORT - The port number

This makes it easy to use camel-zipkin on container platforms, where the platform runs your application in a Linux container and provides service configurations as environment variables.
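As a rough illustration of this kind of environment-based discovery, the sketch below derives an HTTP collector URL from the two HTTP variables listed above. It is not the camel-zipkin auto-configuration code: ZipkinCollectorEnv and httpEndpoint are hypothetical names, and the /api/v2/spans path is an assumption taken from the example endpoint URL used elsewhere in this chapter.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of camel-zipkin.
final class ZipkinCollectorEnv {

    // Returns the collector URL when both HTTP variables are set, otherwise empty.
    // The "/api/v2/spans" suffix is an assumption for illustration.
    static Optional<String> httpEndpoint(Map<String, String> env) {
        String host = env.get("ZIPKIN_COLLECTOR_HTTP_SERVICE_HOST");
        String port = env.get("ZIPKIN_COLLECTOR_HTTP_SERVICE_PORT");
        if (host == null || host.isEmpty() || port == null || port.isEmpty()) {
            return Optional.empty();
        }
        // Integer.parseInt rejects a non-numeric port with NumberFormatException.
        return Optional.of("http://" + host + ":" + Integer.parseInt(port) + "/api/v2/spans");
    }
}
```

In an application this could be called as ZipkinCollectorEnv.httpEndpoint(System.getenv()); passing the map explicitly keeps the lookup easy to test.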

383.8. Options

Option | Default | Description
rate | 1.0f | Configures a rate that decides how many events should be traced by zipkin. The rate is expressed as a percentage (1.0f = 100%, 0.5f is 50%, 0.1f is 10%).
spanReporter |  | Mandatory: The reporter to use for sending zipkin span events to the zipkin server.
serviceName |  | To use a global service name that matches all Camel events.
clientServiceMappings |  | Sets the client service mappings that match Camel events to the given zipkin service name. The content is a Map<String, String> where the key is a pattern and the value is the service name. The pattern uses the rules from Intercept.
serverServiceMappings |  | Sets the server service mappings that match Camel events to the given zipkin service name. The content is a Map<String, String> where the key is a pattern and the value is the service name. The pattern uses the rules from Intercept.
excludePatterns |  | Sets exclude pattern(s) that disable tracing with zipkin for Camel messages that match the pattern. The content is a Set<String> where each element is a pattern. The pattern uses the rules from Intercept.
includeMessageBody | false | Whether to include the Camel message body in the zipkin traces. This is not recommended for production usage, or when having big payloads. You can limit the size by configuring the max debug log size.
includeMessageBodyStreams | false | Whether to include message bodies that are stream based in the zipkin traces. This requires enabling stream caching on the routes or globally on the CamelContext. This is not recommended for production usage, or when having big payloads. You can limit the size by configuring the max debug log size.

383.9. Spring Boot Auto-Configuration

The component supports 10 options, which are listed below.

Name | Description | Default | Type
camel.zipkin.client-service-mappings | Sets client service mapping(s) that match Camel events to the given zipkin service name. The key is the pattern, the value is the service name. |  | Map
camel.zipkin.endpoint | Sets the POST URL for zipkin's v2 api (http://zipkin.io/zipkin-api/#/), usually "http://zipkinhost:9411/api/v2/spans". |  | String
camel.zipkin.exclude-patterns | Sets exclude pattern(s) that disable tracing with zipkin for Camel messages that match the pattern. |  | Set
camel.zipkin.host-name | Sets the hostname if sending spans to a remote zipkin scribe (thrift RPC) collector. |  | String
camel.zipkin.include-message-body | Whether to include the Camel message body in the zipkin traces. This is not recommended for production usage, or when having big payloads. You can limit the size by configuring the camel.springboot.log-debug-max-chars option. | false | Boolean
camel.zipkin.include-message-body-streams | Whether to include message bodies that are stream based in the zipkin traces. This is not recommended for production usage, or when having big payloads. You can limit the size by configuring the camel.springboot.log-debug-max-chars option. | false | Boolean
camel.zipkin.port | Sets the port if sending spans to a remote zipkin scribe (thrift RPC) collector. | 0 | Integer
camel.zipkin.rate | Configures a rate that decides how many events should be traced by zipkin. The rate is expressed as a percentage (1.0f = 100%, 0.5f is 50%, 0.1f is 10%). | 1 | Float
camel.zipkin.server-service-mappings | Sets server service mapping(s) that match Camel events to the given zipkin service name. The key is the pattern, the value is the service name. |  | Map
camel.zipkin.service-name | To use a global service name that matches all Camel events. |  | String

383.10. Example

To enable camel-zipkin, you first need to configure a ZipkinTracer:

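import org.apache.camel.zipkin.ZipkinTracer;
import zipkin2.reporter.AsyncReporter;
import zipkin2.reporter.okhttp3.OkHttpSender;

ZipkinTracer zipkin = new ZipkinTracer();
// Configure a sender that posts spans to the zipkin collector over HTTP
//   (the dependency is io.zipkin.reporter2:zipkin-sender-okhttp3)
OkHttpSender sender = OkHttpSender.create("http://127.0.0.1:9411/api/v2/spans");
// Wrap the sender in an asynchronous reporter, which controls how often spans are sent
zipkin.setSpanReporter(AsyncReporter.create(sender));
// and then add zipkin to the CamelContext
zipkin.init(camelContext);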

The configuration above will trace all incoming and outgoing messages in Camel routes. 

To use ZipkinTracer in XML, all you need to do is define the span reporter and zipkin tracer beans. Camel will automatically discover and use them.

  <!-- configure how to report spans to a Zipkin collector
          (the dependency is io.zipkin.reporter2:zipkin-reporter-spring-beans) -->
  <bean id="http" class="zipkin2.reporter.beans.AsyncReporterFactoryBean">
    <property name="sender">
      <bean id="sender" class="zipkin2.reporter.beans.OkHttpSenderFactoryBean">
        <property name="endpoint" value="http://localhost:9411/api/v2/spans"/>
      </bean>
    </property>
    <!-- wait up to half a second for any in-flight spans on close -->
    <property name="closeTimeout" value="500"/>
  </bean>

  <!-- setup zipkin tracer -->
  <bean id="zipkinTracer" class="org.apache.camel.zipkin.ZipkinTracer">
    <property name="serviceName" value="dude"/>
    <property name="spanReporter" ref="http"/>
  </bean>

383.10.1. ServiceName

To map Camel endpoints to human-friendly logical names, you can add mappings:

  • ServiceName *

You can configure a global service name that all events fall back to, such as:

zipkin.setServiceName("invoices");

This will use the same service name for all incoming and outgoing zipkin traces. If your application uses different services, you should map them to more fine-grained client and server service mappings.

383.10.2. Client and Server Service Mappings

  • ClientServiceMappings
  • ServerServiceMappings

If your application hosts a service that others can call, you can map the Camel route endpoint to a server service mapping. For example, suppose your Camel application has the following route:

from("activemq:queue:inbox")
  .to("http:someserver/somepath");

If you want to treat that as a server service, you can add the following mapping:

zipkin.addServerServiceMapping("activemq:queue:inbox", "orders");

Then when a message is consumed from that inbox queue, it becomes a zipkin server event with the service name 'orders'.

Now suppose that the call to http:someserver/somepath is also a service, which you want to map to a client service name, which can be done as:

zipkin.addClientServiceMapping("http:someserver/somepath", "audit");

Then in the same Camel application you have mapped incoming and outgoing endpoints to different zipkin service names.

You can use wildcards in the service mapping. To match all outgoing calls to the same HTTP server you can do:

zipkin.addClientServiceMapping("http:someserver*", "audit");

383.11. Mapping rules

The service name mapping for the server is resolved using the following rules:

  1. Is there an exclude pattern that matches the endpoint uri of the from endpoint? If yes, then skip.
  2. Is there a match in the serverServiceMapping that matches the endpoint uri of the from endpoint? If yes, then use the found service name.
  3. Is there a match in the serverServiceMapping that matches the route id of the current route? If yes, then use the found service name.
  4. Is there a match in the serverServiceMapping that matches the original route id where the exchange started? If yes, then use the found service name.
  5. No service name was found, so the exchange is not traced by zipkin.

The service name mapping for the client is resolved using the following rules:

  1. Is there an exclude pattern that matches the endpoint uri of the from endpoint? If yes then skip.
  2. Is there a match in the clientServiceMapping that matches the endpoint uri of endpoint where the message is being sent to? If yes, then use the found service name
  3. Is there a match in the clientServiceMapping that matches the route id of the current route? If yes, then use the found service name
  4. Is there a match in the clientServiceMapping that matches the original route id where the exchange started? If yes, then use the found service name
  5. No service name was found, the exchange is not traced by zipkin
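The server-side rules above can be read as a simple lookup chain. The following is a minimal sketch of that chain in plain Java, not the camel-zipkin implementation. ServiceNameResolver and its methods are hypothetical names, and the matcher is a simplified stand-in for the Intercept rules (exact match, then a trailing * wildcard, then a regular expression), which is an assumption. The fallback mode used when no mappings are configured at all is not modeled.

```java
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of camel-zipkin.
final class ServiceNameResolver {

    // Simplified matcher (assumption): exact match, trailing '*' wildcard, then regex.
    static boolean matches(String pattern, String value) {
        if (value == null) {
            return false;
        }
        if (pattern.equals(value)) {
            return true;
        }
        if (pattern.endsWith("*")
                && value.startsWith(pattern.substring(0, pattern.length() - 1))) {
            return true;
        }
        try {
            return Pattern.matches(pattern, value);
        } catch (PatternSyntaxException e) {
            return false; // not a valid regular expression, so no match
        }
    }

    // Returns the service name of the first mapping whose pattern matches, or null.
    static String lookup(Map<String, String> mappings, String value) {
        for (Map.Entry<String, String> mapping : mappings.entrySet()) {
            if (matches(mapping.getKey(), value)) {
                return mapping.getValue();
            }
        }
        return null;
    }

    // Applies rules 1 to 5 for a server event; null means "not traced".
    static String resolveServer(Set<String> excludePatterns,
                                Map<String, String> serverMappings,
                                String fromUri, String routeId, String originalRouteId) {
        for (String exclude : excludePatterns) {
            if (matches(exclude, fromUri)) {
                return null;                                    // rule 1
            }
        }
        String name = lookup(serverMappings, fromUri);          // rule 2
        if (name == null) {
            name = lookup(serverMappings, routeId);             // rule 3
        }
        if (name == null) {
            name = lookup(serverMappings, originalRouteId);     // rule 4
        }
        return name;                                            // rule 5 when still null
    }
}
```

The client-side rules follow the same shape, with the client mappings and the uri of the endpoint the message is being sent to taking the place of the server mappings and the from endpoint in rule 2.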

383.11.1. No client or server mappings

If there has been no configuration of client or server service mappings, CamelZipkin runs in a fallback mode, and uses endpoint uris as the service name.

In the example above, this would mean the service names would be defined as if you add the following code yourself:

zipkin.addServerServiceMapping("activemq:queue:inbox", "activemq:queue:inbox");
zipkin.addClientServiceMapping("http:someserver/somepath", "http:someserver/somepath");

This is not a recommended approach, but it gets you up and running quickly without doing any service name mappings. However, when you have multiple systems across your infrastructure, you should consider mapping to human-readable service names instead of using the Camel endpoint URIs.

383.12. camel-zipkin-starter

If you are using Spring Boot then you can add the camel-zipkin-starter dependency, and turn on zipkin by annotating the main class with @CamelZipkin. You can then configure camel-zipkin in the application.properties file where you can configure the hostname and port number for the Zipkin Server, and all the other options as listed in the options table above.

You can find an example of this in camel-example-zipkin.