OpenShift v2 Proxies and Troubleshooting Common Errors

OpenShift's routing layer is a set of complex proxy servers (services) that developers and administrators need to understand in order to properly build and deploy applications. The best way to build that understanding is to start with the simple components (HTTP and HTTPS) and follow a request from the client to your application. The graphic below shows the key components that OpenShift uses to route this type of traffic.

OpenShift Ports and Proxies

  • Note: You will notice that DNS is used heavily in OpenShift, so it's important that application and host name resolution is working as expected.

In the example given above, clients use DNS to resolve the application's host name to the node's IP address, and requests to that host name pass through the HTTP[s] proxy that the node uses to route (most) incoming requests.

The HTTP[s] port proxy is simply an httpd server running on the node that uses mod_rewrite, or httpd's built-in vhost and mod_proxy components, to make an internal connection to the applications running on the node's internal network interfaces.

In short, this is how your application can run on an internal IP (127.5.170.129) and port (8080), yet still be reached on the node's public IP and the default http[s] ports, as sketched after the notes below.

  • Note: OpenShift terminates SSL at the HTTP[s] proxy layer, so any connection to port 443 is terminated there and forwarded over HTTP to the application.
  • Note: Because the HTTP[s] proxy makes a new connection to the application for each request, you are limited to (roughly, depending on configuration) 20530 connections from the HTTP[s] proxy to your application.
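
For example, once you ssh into a gear you can verify this mapping yourself. The sketch below is illustrative and assumes a PHP cartridge (the OPENSHIFT_PHP_* variable names differ per cartridge type); the first request hits the application directly on its internal interface, the second goes through the node's HTTP[s] proxy:

    # curl -ILs http://$OPENSHIFT_PHP_IP:$OPENSHIFT_PHP_PORT/
    # curl -ILs http://$OPENSHIFT_APP_DNS/

Both should return the same HTTP status line if the proxy and the application are healthy.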

With an understanding of basic OpenShift application routing, you can start to examine how http[s] routing works for scaled applications.

The same routing path is taken for scaled applications; however, with a scaled application the gear reached at this point in the routing path is HAProxy. HAProxy is configured to route to a pre-determined IP and PORT combination, and for that to work the traffic has to pass through another of the OpenShift proxies, the OpenShift Port-Proxy.

With scaled applications, the node IP and a public proxy port are published using the V2 cartridge events, and the HAProxy application creates a route to this endpoint. In the depiction above this is the port proxy, where Gear 1 would be HAProxy and Gear 3 is your Web Framework cartridge.
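
You can see the published values from inside a gear by inspecting its environment. A minimal sketch, where app is your application's name as in the other rhc examples in this article; the exact variable names (for example OPENSHIFT_PHP_PROXY_PORT) depend on the cartridges installed:

    # rhc ssh app "env | grep -e PROXY_PORT -e HAPROXY_IP"

HAProxy routes to the node's public address and this proxy port rather than to the gear's internal 127.x.x.x address, which is why the OpenShift Port-Proxy has to be in the path.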

  • Note: The connection from HAProxy to the Web Framework is direct! This eliminates the need to pass through the HTTP[s] proxy that has already been traversed.
    • This also applies to the health checks that HAProxy performs on the gears of scaled applications. Understanding that HAProxy performs health checks on gear (application) contexts is important: if these checks fail you can see 502 or 503 errors when trying to access your application, because the proxy disables the route to the application (i.e. it puts the gear in maintenance mode). A quick way to check HAProxy's view of the gears is shown below.
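
A minimal sketch of that check, assuming the default /haproxy-status/ page that the v2 HAProxy cartridge exposes on your application's DNS name (APP_DNS is the same placeholder used later in this article):

    # curl -ILs http://APP_DNS/haproxy-status/

Loading that page in a browser shows each gear's backend entry and whether HAProxy currently considers it UP or DOWN, which is useful when you are chasing 502/503 errors.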

With an understanding of how the HTTP[s] proxy and the OpenShift Port-Proxy work together to service HTTP and HTTPS requests for scaled applications, you can begin to look at how OpenShift removes the single point of failure that a single HAProxy represents. To understand this fully you need to look at how OpenShift uses External Load Balancers and Routing Listeners to route requests to multiple HAProxy servers (while still allowing the normal routing for scaled applications to be followed).

With the External Load Balancer in play a service (or multiple services or hardware appliances) acts as the entry point for your HA DNS address.

  • Note: The HA DNS address is an extra DNS entry, on top of the one provided for your application, that routes to the External Load Balancer.
  • Note: Idling with an External Load Balancer is often disabled, or accounted for by not directly accessing or routing to the gear, as seen in Knowledge Base Article 1310683 and in most routing implementations.

Once a request hits this endpoint, the service (depending on how it was configured by the Routing Listener) will route the request directly to one of many HAProxy gears.
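
A quick way to confirm the request path here is to compare what the HA DNS name and the regular application DNS name resolve to. A minimal sketch, where HA_APP_DNS and APP_DNS are placeholders for those two host names:

    # dig +short HA_APP_DNS
    # dig +short APP_DNS

The HA name should resolve to the External Load Balancer, while the regular application name resolves to the node hosting the head gear; if the HA name does not point at the load balancer, requests are bypassing it entirely.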

If you run into issues with a request at any point in this process, it is important to test and review the logs to see where and why the request may not have made it to the correct endpoint or, if it did make it to the right endpoint, why that endpoint failed to respond correctly.

In most cases you start by getting the internal IP and PORT of your gear and testing whether the Web Framework is working as you would expect:

    # echo $(rhc ssh app "env | grep -e [O]PENSHIFT_.*_IP -e [O]PENSHIFT_.*_PORT" | sort | awk -F= '{print $2}') | sed 's/ /:/'
  • Use these values to test that HTTP access can be made from the gear to itself (xxx.xxx.xxx.xxx=IP_ADDRESS, yyyy=PORT)
    # rhc ssh app "curl -ILs $(echo $(rhc ssh test "env | grep -e [O]PENSHIFT_.*_IP -e [O]PENSHIFT_.*_PORT" | sort | awk -F= '{print $2}') | sed 's/ /:/')"
  • Note: The examples above are provided to show you what types of IP/PORT combinations you need to curl; if you run them exactly as written they may fail in your environment. A simpler two-step version follows below.
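
A simpler two-step version of the same check, assuming a PHP cartridge and the same app placeholder; run the first command, then substitute the printed IP and PORT values into the second:

    # rhc ssh app "env | grep -e OPENSHIFT_PHP_IP -e OPENSHIFT_PHP_PORT"
    # rhc ssh app "curl -ILs http://xxx.xxx.xxx.xxx:yyyy/"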

It's also important to test whether you can reach the gear externally:

    # rhc ssh app "curl -ILs \$OPENSHIFT_APP_DNS"
  • Note: If you have a scaled application you will also need to make sure that your gear can communicate with the proxy and then back to itself:

         # rhc ssh app "curl -ILs \$OPENSHIFT_HAPROXY_IP:\$OPENSHIFT_PHP_PROXY_PORT"
    

Ensuring that DNS points to the correct location is also of high importance:

    # dig APP_DNS

When reviewing logs to see how the different endpoints are responding, consider the following locations (examples of how to view them follow the list):

  • $OPENSHIFT_LOG_DIR - Web Framework Logs / HAProxy Log
  • /var/log/httpd/openshift_log - HTTP[s] Routing Log
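
You can stream or read the gear-side logs from the client with the rhc tools; for example (the haproxy.log file only exists for scaled applications, and its exact name can vary by release):

    # rhc tail app
    # rhc ssh app "ls \$OPENSHIFT_LOG_DIR"
    # rhc ssh app "tail -n 50 \$OPENSHIFT_LOG_DIR/haproxy.log"

The /var/log/httpd/openshift_log file lives on the node itself, so reviewing it typically requires administrator access to the node rather than a gear login.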

These logs (the haproxy log in the case of a scaled application) are important to review because, if you are seeing 502 or 503 errors when trying to access your application, it could be that the proxy is disabling the routes to the application (i.e. putting the gear in maintenance mode) because it is failing health checks.
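
When you suspect failed health checks, a quick way to confirm it is to look for HAProxy's up/down transitions in that log. A minimal sketch, assuming the haproxy.log name used above:

    # rhc ssh app "grep -i -e 'is down' -e 'is up' \$OPENSHIFT_LOG_DIR/haproxy.log | tail"

HAProxy records "is DOWN" lines when a gear fails its checks, and a matching "is UP" line once it recovers and the route is re-enabled.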

In some situations you need to restart services to get components of the application up and working. To do this you can use RHC, or you can ssh into the gear and restart the cartridge with commands like:

    # gear restart --cart nodejs

However there are times when this does not work:

    # gear restart --cart nodejs
    Stopping NodeJS cartridge
    Mon Sep 14 2015 08:39:13 GMT-0400 (EDT): Stopping application 'quantified' ...
    Warning: Application 'quantified' unable to stop. Use force-stop to kill.
    Starting NodeJS cartridge
    Application is already running

The warning above indicates that the nodejs gear is not responding to the commands issued in the cartridge's "stop" script, so the nodejs supervisor process has to be killed forcibly. It is important to identify the appropriate process that needs to be restarted, so first run:

    # ps -f -u GEAR_UUID

  • Identify the correct PID and forcefully stop the process:

    # kill -9 PID

Then restart the cartridge using the gear restart command shown above.
