Chapter 109. MongoDB GridFS

Camel MongoDB GridFS component

Available as of Camel 2.17
Maven users will need to add the following dependency to their pom.xml for this component:
<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-mongodb-gridfs</artifactId>
    <version>2.17.0.redhat-630187</version>
    <!-- use the same version as your Camel core version -->
</dependency>

URI format

Camel versions 2.19 and higher:
mongodb-gridfs:connectionBean?database=databaseName&bucket=bucketName[&moreOptions...]
Camel versions 2.17 and 2.18:
gridfs:connectionBean?database=databaseName&bucket=bucketName[&moreOptions...]
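
The two URI formats differ only in the scheme name. As a minimal sketch, the string can be assembled as shown below. GridFsUriSketch and endpointUri are hypothetical names used for illustration only and are not part of Camel; the bean name mongoBean and the database name tickets are borrowed from the examples in this chapter.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Camel): builds a GridFS endpoint URI string.
final class GridFsUriSketch {

    // Scheme names taken from the URI formats listed above.
    static final String SCHEME_2_19_PLUS = "mongodb-gridfs";
    static final String SCHEME_2_17_2_18 = "gridfs";

    static String endpointUri(String scheme, String connectionBean, Map<String, String> options) {
        // The database option is required by the endpoint, so reject a URI without it.
        if (!options.containsKey("database")) {
            throw new IllegalArgumentException("the database option is required");
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> option : options.entrySet()) {
            query.add(option.getKey() + "=" + option.getValue());
        }
        return scheme + ":" + connectionBean + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("database", "tickets");
        options.put("bucket", "fs");
        System.out.println(endpointUri(SCHEME_2_19_PLUS, "mongoBean", options));
        System.out.println(endpointUri(SCHEME_2_17_2_18, "mongoBean", options));
    }
}
```

The sketch does no URL encoding, so it only suits option values that are already URI-safe.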

Endpoint options

GridFS endpoints support the following options, depending on whether they are acting as a Producer or as a Consumer. (For consumers, options also vary based on consumer type.)
Name Default Value Description Producer Consumer
database none Required. Specifies the name of the database to which to bind this endpoint. All operations will be executed against this database. Y Y
bucket fs Specifies the name of the GridFS bucket within the specified database. Defaults to the GridFS.DEFAULT_BUCKET value. Y Y
operation create
Specifies the ID of the operation this endpoint will execute. Valid values are:
  • query operations: findOne, listAll, count
  • write operation: create
  • delete operation: remove
Y N
query none Used in conjunction with queryStrategy options to create the query used to search for new files. N Y
queryStrategy TimeStamp
Specifies the strategy used to find new files. Valid values are:
  • TimeStamp
    Processes files that are uploaded after the consumer starts.
  • PersistentTimestamp
    Like TimeStamp, but persists the last timestamp used to a collection, so on restart, the consumer can resume where it left off.
  • FileAttribute
    Finds files missing the attribute specified by fileAttributeName. After processing, the attribute specified by fileAttributeName is added to the file.
  • TimestampAndFileAttribute
    Finds files that are newer than TimeStamp and missing the attribute specified by fileAttributeName.
  • PersistentTimestampAndFileAttribute
    Like TimestampAndFileAttribute, but persists the last timestamp used to a collection.
N Y
persistentTSCollection camel-timestamps Used in conjunction with PersistentTimestamp. Specifies the collection in which the timestamp is stored. N Y
persistentTSObject camel-timestamp
Used in conjunction with PersistentTimestamp. Specifies the ID of the timestamp object.
This allows each consumer to have its own timestamp ID stored in a common collection.
N Y
fileAttributeName camel-processed
Used in conjunction with FileAttribute. Specifies the name of the attribute to use.
When a file is about to be processed, the specified attribute is set to processing. When file processing has finished, the specified attribute is set to done.
N Y
delay 500 (ms) Specifies the interval, in milliseconds, between subsequent polls of GridFS for new files. N Y
initialDelay 1000 (ms) Specifies the delay, in milliseconds, before polling GridFS the first time for new files. N Y

Configuration of database in Spring XML

The following Spring XML creates a bean defining the connection to a MongoDB instance.
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean id="mongoBean" class="com.mongodb.Mongo">
        <constructor-arg name="host" value="${mongodb.host}" />
        <constructor-arg name="port" value="${mongodb.port}" />
    </bean>
</beans>

Sample route

The following route, defined in Spring XML, executes the findOne operation against the default GridFS bucket of the configured database.
<route>
  <from uri="direct:start" />
  <!-- using bean 'mongoBean' defined above -->
  <to uri="mongodb-gridfs:mongoBean?database=${mongodb.database}&amp;operation=findOne" />
  <to uri="direct:result" />
</route>

MongoDB operations - producer endpoints

  • count
    Returns the total number of files in the GridFS bucket as an Integer in the OUT message body:
    // from("direct:count").to("mongodb-gridfs:mongoBean?database=tickets&operation=count");
    Integer result = template.requestBody("direct:count", "irrelevantBody", Integer.class);
    assertTrue("Result is not of type Integer", result instanceof Integer);
    You can use a filename header to count only the files matching the specified file name:
    Map<String, Object> headers = new HashMap<String, Object>();
    headers.put(Exchange.FILE_NAME, "filename.txt");
    Integer count = template.requestBodyAndHeaders("direct:count", "irrelevantBody", headers, Integer.class);
  • listAll
    Returns a Reader that lists all of the file names and their IDs in a tab-separated stream:
    // from("direct:listAll").to("mongodb-gridfs:mongoBean?database=tickets&operation=listAll");
    Reader result = template.requestBody("direct:listAll", "irrelevantBody", Reader.class);
    Example output:
    filename1.txt     1252314321
    filename2.txt     2897651254
  • findOne
    Using Exchange.FILE_NAME from incoming headers, finds a matching file in the GridFS system, sets the body to an InputStream of the content, and provides metadata as headers:
    // from("direct:findOne").to("mongodb-gridfs:mongoBean?database=tickets&operation=findOne");
    Map<String, Object> headers = new HashMap<String, Object>();
    headers.put(Exchange.FILE_NAME, "filename.txt");
    InputStream result = template.requestBodyAndHeaders("direct:findOne", "irrelevantBody", headers, InputStream.class);
  • create
    Creates a new file in the GridFS database, using Exchange.FILE_NAME from the incoming headers for the file name and the body content as an InputStream for the file contents:
    // from("direct:create").to("mongodb-gridfs:mongoBean?database=tickets&operation=create");
    Map<String, Object> headers = new HashMap<String, Object>();
    headers.put(Exchange.FILE_NAME, "filename.txt");
    InputStream stream = ...the data for the file...
    template.requestBodyAndHeaders("direct:create", stream, headers);
    
  • remove
    Removes a file from the GridFS database:
    // from("direct:remove").to("mongodb-gridfs:mongoBean?database=tickets&operation=remove");
    Map<String, Object> headers = new HashMap<String, Object>();
    headers.put(Exchange.FILE_NAME, "filename.txt");
    template.requestBodyAndHeaders("direct:remove", "", headers);
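
The Reader returned by listAll still has to be turned into something a route can work with. The following is a minimal sketch that assumes each line has the form file name, tab, ID, as described above. GridFsListingSketch and parse are hypothetical names, not part of the component. The result is keyed by ID rather than by file name, because GridFS can hold several files with the same name.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Camel): parses the tab-separated listAll output.
final class GridFsListingSketch {

    // Returns a map of file ID -> file name, in the order the lines were read.
    static Map<String, String> parse(Reader listing) {
        Map<String, String> fileNamesById = new LinkedHashMap<>();
        new BufferedReader(listing).lines().forEach(line -> {
            // Split on the last tab so a file name containing a tab stays intact.
            int tab = line.lastIndexOf('\t');
            if (tab > 0) {
                String fileName = line.substring(0, tab);
                String id = line.substring(tab + 1).trim();
                fileNamesById.put(id, fileName);
            }
            // Lines without a tab (for example blank lines) are skipped.
        });
        return fileNamesById;
    }

    public static void main(String[] args) {
        Reader sample = new StringReader("filename1.txt\t1252314321\nfilename2.txt\t2897651254\n");
        parse(sample).forEach((id, name) -> System.out.println(id + " -> " + name));
    }
}
```

In a route, the Reader obtained from the direct:listAll request would be passed to parse in place of the StringReader used here.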
    

GridFS Consumer

The MongoDB GridFS consumer polls GridFS periodically for new files to process. Two parameters, delay and initialDelay, control this behavior. delay specifies how long the background thread sleeps between polling attempts (default is 500 ms). initialDelay specifies how long the consumer waits after starting before it polls GridFS for the first time (default is 1000 ms), which is useful when the backend service needs a little more time to become available.
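
The interplay of the two delays can be illustrated with plain java.util.concurrent. This is only a sketch of the timing, not the component's actual implementation: PollingDelaySketch and runPolls are hypothetical names, and a ScheduledExecutorService stands in for the consumer's background thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical timing sketch: first poll after initialDelay, later polls delay apart.
final class PollingDelaySketch {

    // Runs pollAction the requested number of times and returns, for each run,
    // the number of milliseconds elapsed since scheduling began.
    static List<Long> runPolls(long initialDelayMs, long delayMs, int polls, Runnable pollAction) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(polls);
        List<Long> offsetsMs = new ArrayList<>();
        long start = System.nanoTime();
        try {
            scheduler.scheduleWithFixedDelay(() -> {
                if (remaining.getCount() == 0) {
                    return; // requested number of polls already completed
                }
                offsetsMs.add(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start));
                try {
                    pollAction.run();
                } catch (RuntimeException e) {
                    // An uncaught exception would cancel all later runs, so log and continue.
                    System.err.println("poll failed: " + e);
                } finally {
                    remaining.countDown();
                }
            }, initialDelayMs, delayMs, TimeUnit.MILLISECONDS);
            remaining.await(); // block until the requested number of polls has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return offsetsMs;
    }

    public static void main(String[] args) {
        // Uses the defaults from the options table: initialDelay=1000, delay=500.
        List<Long> offsets = runPolls(1000, 500, 3, () -> System.out.println("polling GridFS"));
        System.out.println("poll offsets in ms: " + offsets);
    }
}
```

With the default values, the first poll runs no earlier than 1000 ms after startup, and each later poll starts at least 500 ms after the previous one finishes.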
Several strategies are available to the consumer for determining which files within the grid have not yet been processed:
  • TimeStamp (default): On startup, the consumer uses the current time as the starting point. Only files added after the consumer started are processed. All files in the grid that pre-date consumer startup are ignored. After polling, the consumer updates its timestamp with the timestamp of the most recently processed file.
  • PersistentTimestamp: On startup, the consumer queries the collection specified by persistentTSCollection for the object specified by persistentTSObject and uses it as the starting timestamp. If that object does not exist, the consumer uses the current time and creates the object. Whenever a file has been processed, the timestamp in the collection is updated.
  • FileAttribute: Instead of using timestamps, the consumer queries GridFS for files that lack the attribute specified by fileAttributeName. When the consumer starts to process the file, this attribute is added to the file in GridFS.
    Usage example:
    from("mongodb-gridfs:mongoBean?database=tickets&queryStrategy=FileAttribute").process(...);
  • TimestampAndFileAttribute: Combining the two strategies, the consumer finds files newer than TimeStamp that lack the attribute specified by fileAttributeName. During file processing, the missing attribute is added to the file in GridFS.
  • PersistentTimestampAndFileAttribute: Combining the two strategies, the consumer finds files newer than the persisted timestamp that lack the attribute specified by fileAttributeName. During file processing, the missing attribute is added to the file in GridFS, and the timestamp in the collection is updated.
    Usage example:
    from("mongodb-gridfs:mongoBean?database=myData&queryStrategy=PersistentTimestamp"
          + "&persistentTSCollection=CamelTimestamps&persistentTSObject=myDataTS").process(...);
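
The selection rules behind these strategies can be modeled in memory. The sketch below is a simplified illustration under stated assumptions, not the component's code: QueryStrategySketch, StoredFile, and poll are hypothetical names, and a plain list stands in for the GridFS bucket. The persistent variants select files the same way; they differ only in keeping the last timestamp in a MongoDB collection instead of a field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the consumer's file selection strategies.
final class QueryStrategySketch {

    // Stand-in for a GridFS file: name, upload time, and free-form attributes.
    static final class StoredFile {
        final String name;
        final long uploadMillis;
        final Map<String, String> attributes = new HashMap<>();

        StoredFile(String name, long uploadMillis) {
            this.name = name;
            this.uploadMillis = uploadMillis;
        }
    }

    private final boolean useTimestamp;
    private final boolean useFileAttribute;
    private final String fileAttributeName;
    private long lastTimestamp; // a persistent strategy would store this in a collection

    QueryStrategySketch(boolean useTimestamp, boolean useFileAttribute,
                        String fileAttributeName, long startTimestamp) {
        this.useTimestamp = useTimestamp;
        this.useFileAttribute = useFileAttribute;
        this.fileAttributeName = fileAttributeName;
        this.lastTimestamp = startTimestamp;
    }

    // One polling pass: returns the names of the files selected for processing.
    List<String> poll(List<StoredFile> grid) {
        List<String> processed = new ArrayList<>();
        long newest = lastTimestamp;
        for (StoredFile file : grid) {
            if (useTimestamp && file.uploadMillis <= lastTimestamp) {
                continue; // not newer than the last processed timestamp
            }
            if (useFileAttribute && file.attributes.containsKey(fileAttributeName)) {
                continue; // already marked by an earlier pass
            }
            if (useFileAttribute) {
                file.attributes.put(fileAttributeName, "processing");
            }
            processed.add(file.name); // stands in for sending the file through the route
            if (useFileAttribute) {
                file.attributes.put(fileAttributeName, "done");
            }
            newest = Math.max(newest, file.uploadMillis);
        }
        lastTimestamp = newest;
        return processed;
    }

    public static void main(String[] args) {
        List<StoredFile> grid = new ArrayList<>();
        grid.add(new StoredFile("old.txt", 100L));
        grid.add(new StoredFile("new.txt", 300L));
        QueryStrategySketch timestampOnly = new QueryStrategySketch(true, false, "camel-processed", 200L);
        System.out.println(timestampOnly.poll(grid));
    }
}
```

In this model, a TimeStamp consumer started at time 200 selects only new.txt, whereas a FileAttribute consumer selects both files on its first pass and none on the second.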

See also