Chapter 20. Fabric Maven Proxies

Abstract

Container hosts often have limited or no access to the Internet, which can make it difficult for Fabric containers to download and install Maven artifacts. This problem can be mitigated using a Maven proxy, which serves as a central cache of Maven artifacts for the Fabric containers. Managed containers try to download from the Maven proxy, before trying to download from the Internet. This chapter explains how the Maven proxy works and how to customize the configuration of the Maven proxy to suit your network environment.

20.1. Introduction to Fabric Maven Proxies

Overview

A container can be a Maven proxy if it is running the fabric profile. The fabric profile, as well as any profile that inherits from the fabric profile, contains the fabric-maven-bundle, which enables a container to be a Maven proxy. Containers that can be Maven proxies include:
  • Fabric servers
  • Containers that are joined to a cluster with fabric:join and that keep the assigned fabric profile
  • Containers that are provisioned with a profile that inherits from the fabric profile
Each container that has the fabric profile registers itself in Zookeeper as a Maven proxy. When a container is being provisioned, its fabric-agent bundle obtains the list of Maven proxies from Zookeeper and prepends their URIs to the container's io.fabric8.agent/org.ops4j.pax.url.mvn.repositories list. When the container needs a Maven artifact, it uses the entire list of Maven proxies. Consequently, you should not use too many containers that run the fabric profile and are therefore Maven proxies. For example, if there are 100 containers and one of them cannot resolve a particular artifact, it is probable that none of the other containers can resolve it either but they could all be consulted.
If an environment includes a host that cannot access a Maven proxy, downloading Maven artifacts to that host and using Maven artifacts on that host might not work correctly.

Maven proxy

A Maven proxy is a HTTP Web server that behaves very like a standard Maven repository, such as Maven Central.
The purpose of the Maven proxy is to serve Maven artifacts on the local network. It has its own local cache of Maven artifacts, which it can serve up quickly. But if necessary, the Maven proxy can also download artifacts from remote repositories (in a proxy role). This architecture offers a number of advantages:
  • The Maven proxy builds up a large cache over time, which can be served up quickly to other containers in the Fabric.
  • It is not necessary for every container to download Maven artifacts from remote repositories—the Maven proxy performs this service for the other containers.
  • In a network with limited Internet access, you can arrange to deploy the Maven proxy on a host with Internet access, while the other containers in the fabric are deployed on hosts without Internet access.

Managed container

A managed container is a regular Fabric container (not part of the Fabric ensemble), whose contents are managed by a Fabric8 agent.

Fabric8 agent

The Fabric8 agent is responsible for ensuring that the bundles deployed in the container are consistent with what is specified in this container's Fabric profiles. Whenever necessary, the Fabric8 agent will contact the Maven proxy to download new Maven artifacts for deploying inside the container.

Resolving a Maven artifact

The Fabric8 agent attempts to locate a Maven artifact by searching the following locations, in order:
  1. Default repositories, which are local repositories for the Fabric agent.
  2. Maven proxy.
  3. Remote Maven repositories.
For a more detailed outline of this process, see Section 20.2, “How a Managed Container Resolves Artifacts”.

Maven proxy endpoint discovery

Before the Fabric8 agent can connect to the Maven proxy, it needs to discover the list of available Maven proxy URLs. The discovery mechanism is based on the Apache Zookeeper registry: by querying Zookeeper, the Fabric8 agent can discover the list of available Maven proxy URLs (that is, the Fabric8 agent fetches all the children of the /fabric/registry/maven/proxy/download path in Zookeeper).

No replication

Within the Maven proxy cluster, there is no automatic replication of artifacts between different Maven proxies in the cluster. You will notice the effects of this, when one of the Maven proxies becomes unavailable, so that clients are obliged to contact the next Maven proxy in the list:
  • Any client that does not have the complete list of Maven proxy URLs would need to be reconfigured manually to use one of the remaining available Maven proxies.
  • If you have been automatically uploading artifacts to the Maven proxy as part of your build process (see Section 20.7, “Automated Deployment”), you will need to reconfigure the upload URL.
  • It is likely that the next Maven proxy has a much smaller cache of Maven artifacts than the original one. This could result in noticeable delays, because many previously cached artifacts have to be downloaded again.

Managing the Maven artifact data

Although Fabric does not support replication of the local Maven caches, there are some strategies you can adopt to compensate for this. The Maven proxy caches its artifacts in its local Maven repository (normally in ${karaf.home}/data/repository). You could simply do a manual copy of the contents of the local Maven repository from one Maven proxy host to another. Or for a more sophisticated approach, you could try storing the local Maven repository on a networked file system.