When OpenShift starts a container, it uses an arbitrarily assigned user ID. This feature helps to ensure that if an application from within a container manages to break out to the host, it won’t be able to interact with other processes and containers owned by other users, in other projects.
If the process needs to alter file permissions or retrieve user information, this security feature will cause problems for the container. For example, the container process may need to run whoami or look up its
The solution is group permissions. Even though OpenShift runs the container processes as a random UID, the user belongs to the root group.
[lab@master-0 ~]$ ps -o pid,uid,uname,gid,cmd -C sleep
    PID   UID USER     GID CMD
3840163  1001 quicklab   0 sleep 10m
Hence, typically the advice has been to allow the root group to read/write files and directories, by changing the group ownership of those files to root.
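In an image build step, that advice usually comes down to two commands: chgrp -R 0 and chmod -R g=u. The following is a minimal sketch of that pattern, not the exact commands from any particular image. The function name grant_group_access and the /opt/app-root path in the usage comment are hypothetical, and the optional group argument is only there so the function can be tried without root.

```shell
#!/bin/sh
# Give a group the same access to a directory tree as the owning user.
# Usage: grant_group_access DIR [GROUP]
# GROUP defaults to 0 (the root group), which is the group the random
# OpenShift UID belongs to.
grant_group_access() {
    dir=$1
    group=${2:-0}
    # Hand the tree to the group, then copy the user permission bits
    # onto the group permission bits (g=u).
    chgrp -R "$group" "$dir" &&
    chmod -R g=u "$dir"
}

# Hypothetical usage at image build time (as root):
#   grant_group_access /opt/app-root
```

Note that this is meant for application directories only. Applying the same pattern to system files such as /etc/passwd is the mistake this article describes.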
However, it has been found that this advice has also been applied to the /etc/passwd file, allowing the random user information to be added to it.
Product Security were recently notified of a vulnerability in the form of a container privilege escalation. When a Dockerfile alters the permissions of its own /etc/passwd file, any user within that container can escalate to root.
For example, take the following Dockerfile from the Jenkins image:
RUN /usr/local/bin/install-jenkins-core-plugins.sh /opt/openshift/base-plugins.txt && \
    rmdir /var/log/jenkins && \
    chmod 664 /etc/passwd && \
    chmod -R 775 /etc/alternatives && \
    chmod -R 775 /var/lib/alternatives && \
    chmod -R 775 /usr/lib/jvm && \
    chmod 775 /usr/bin && \
    chmod 775 /usr/lib/jvm-exports && \
    chmod 775 /usr/share/man/man1 && \
    mkdir -p /var/lib/origin && \
    chmod 775 /var/lib/origin && \
    chown -R 1001:0 /opt/openshift && \
As discussed previously, chmod 664 /etc/passwd not only allows the random user’s data to be added to the /etc/passwd file, but also allows any process from within that container to modify it.
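A quick way to check whether an image is affected is to look at the permission bits of its /etc/passwd. The sketch below is one possible check, assuming GNU stat (for the -c %a format) is available in the image; the function name is hypothetical.

```shell
#!/bin/sh
# Succeeds (exit status 0) when FILE is writable by its group or by others.
# Usage: is_group_or_world_writable FILE
is_group_or_world_writable() {
    # %a prints the octal mode, e.g. 664 (GNU stat).
    mode=$(stat -c %a "$1") || return 2
    # Prefix with 0 so the shell reads the mode as octal, then test the
    # group-write (020) and other-write (002) bits.
    [ $(( 0$mode & 022 )) -ne 0 ]
}

# Hypothetical usage inside a container:
#   if is_group_or_world_writable /etc/passwd; then
#       echo "WARNING: /etc/passwd can be modified by non-owners" >&2
#   fi
```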
For example, in a container whose Dockerfile includes the above snippet, which alters the file permissions of /etc/passwd, it’s possible to do the following:
Log in to the container:
$ oc rsh test_container
Add a new root user (UID 0) to /etc/passwd:
$ echo test::0:0:/root:/bin/bash >> /etc/passwd
Switch to the new root user:
$ su test
Confirm UID is 0 and we are now root:
sh-4.2# id
uid=0(root) gid=0(root) groups=0(root)
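An extra UID 0 entry like the one added in the steps above is easy to spot by scanning a passwd file for accounts whose third field is 0. This is only an illustrative sketch; the function name is hypothetical and the check is not part of any OpenShift tooling.

```shell
#!/bin/sh
# Print the name of every account in a passwd-format file whose UID is 0.
# Usage: list_uid0_accounts [FILE]   (FILE defaults to /etc/passwd)
list_uid0_accounts() {
    # Fields are colon-separated: name:password:UID:GID:...
    awk -F: '$3 == 0 { print $1 }' "${1:-/etc/passwd}"
}

# Hypothetical usage: anything other than "root" deserves a closer look.
#   list_uid0_accounts /etc/passwd
```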
If user namespaces were used within OpenShift, the impact of this would be reduced, as the user would only be root in a namespace separate from the host. However, in current OpenShift versions, we can confirm that the container process is in fact running as root (UID 0) both within the container and on the host.
Normally, when a sleep command is run within a container, the process on the node (or host) looks similar to:
$ ps -o uid,uname,gid,cmd -C sleep
       UID USER     GID CMD
1000590000 1000590+   0 sleep 1d
However, if the above technique is used and the same sleep is called, it can now be seen that the process is in fact running as root:
$ ps -o uid,uname,gid,cmd -C sleep
UID USER GID CMD
  0 root   0 sleep 10m
When working with containers, an important question to always ask is: would you do this if you were installing the application on the host? If the answer is no, then it shouldn’t be done in a container either.
Additionally, OpenShift (and likewise Kubernetes) does not currently support user namespaces. This means that a process run as root from within a container has the equivalent permissions of root on the host.
It’s not as bad as it sounds.
By default, OpenShift runs containers under the restricted SCC. Specifically, this means that SETUID and SETGID are two capabilities which are dropped in the container, so containers running with the restricted SCC cannot use the setuid and setgid system calls to change their UID or GID and escalate their privileges. Furthermore, because OpenShift employs a multi-layered approach to security (cgroups, SELinux, seccomp, etc.), any attempt to access other containers or host resources (a full container escape) is also restricted.
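One way to see this restriction from inside a running container is to inspect the effective capability mask of the current process. The sketch below assumes the restriction is enforced by dropping the Linux capabilities CAP_SETGID (bit 6) and CAP_SETUID (bit 7), and that /proc is mounted; the function names are hypothetical.

```shell
#!/bin/sh
# Print the effective capability mask (hexadecimal) of the current shell.
current_capeff() {
    while read -r key value; do
        if [ "$key" = "CapEff:" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "/proc/$$/status"
    return 1
}

# Succeeds when bit number BIT is set in the hexadecimal mask MASK.
# Usage: has_cap MASK BIT
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Report whether CAP_SETGID and CAP_SETUID are present in MASK
# (defaults to the mask of the current shell).
report_setid_caps() {
    mask=${1:-$(current_capeff)}
    for pair in 6:CAP_SETGID 7:CAP_SETUID; do
        bit=${pair%%:*}
        name=${pair#*:}
        if has_cap "$mask" "$bit"; then
            echo "$name: present"
        else
            echo "$name: dropped"
        fi
    done
}
```

Running report_setid_caps with no argument in a pod shell reports on that shell; passing a mask copied from another process's /proc status file reports on that process instead.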
However, it’s still not desirable as it unnecessarily increases the attack surface.
Moving forward: How we are fixing this:
The two main solutions for containers which need to retrieve user information are:
- Rely on CRI-O or
- Use nss_wrapper
The OpenShift runtime CRI-O (from OpenShift 4.2 onward) now inserts the random user for the container into /etc/passwd, completely removing the requirement to insert the random user manually. Additionally, in future versions of CRI-O, the $WORKDIR of the container user will also be assigned, helping Java-based images (thanks Akram, see below).
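An entrypoint script can confirm that the runtime has provided an entry for the running UID before falling back to anything else. The following is a rough sketch of such a check against a passwd-format file; the function name is hypothetical, and the exact entry that CRI-O writes is not assumed here.

```shell
#!/bin/sh
# Print the user name that a passwd-format file maps UID to.
# Fails (non-zero exit status) when no entry uses that UID.
# Usage: name_for_uid UID [FILE]   (FILE defaults to /etc/passwd)
name_for_uid() {
    awk -F: -v uid="$1" '
        $3 == uid { print $1; found = 1; exit }
        END       { exit !found }
    ' "${2:-/etc/passwd}"
}

# Hypothetical usage in an entrypoint:
#   if ! name_for_uid "$(id -u)" >/dev/null; then
#       echo "no passwd entry for UID $(id -u)" >&2
#   fi
```

Where getent is available, getent passwd "$(id -u)" answers the same question through the normal name service lookup.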
Given that support for versions before OpenShift 4.2 may also be required, nss_wrapper can be utilized. It allows a local, unprivileged passwd file to be specified, so the container can map the required user information to a random UID without having to modify the container’s /etc/passwd file directly. A great example of this fix can be found in this pull-request.
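The nss_wrapper approach can be sketched as an entrypoint helper that writes a private copy of the passwd file and points the library at it. This is an illustration under assumptions, not the code from the pull request mentioned above: the function name, the /tmp/passwd location, and the "default" user name are hypothetical, while LD_PRELOAD, NSS_WRAPPER_PASSWD and NSS_WRAPPER_GROUP are the variables nss_wrapper itself reads.

```shell
#!/bin/sh
# Build a private passwd file that contains an entry for the given UID.
# The system file SRC is only read, never modified.
# Usage: build_private_passwd SRC DEST UID GID NAME HOME
build_private_passwd() {
    src=$1 dest=$2 uid=$3 gid=$4 name=$5 home=$6
    cat "$src" > "$dest" || return 1
    # Append an entry only when no existing line already uses this UID.
    if ! awk -F: -v uid="$uid" \
            '$3 == uid { found = 1 } END { exit !found }' "$dest"; then
        printf '%s:x:%s:%s:%s user:%s:/sbin/nologin\n' \
            "$name" "$uid" "$gid" "$name" "$home" >> "$dest"
    fi
}

# Hypothetical usage in an entrypoint (requires nss_wrapper in the image):
#   build_private_passwd /etc/passwd /tmp/passwd \
#       "$(id -u)" "$(id -g)" default "$HOME"
#   export LD_PRELOAD=libnss_wrapper.so
#   export NSS_WRAPPER_PASSWD=/tmp/passwd
#   export NSS_WRAPPER_GROUP=/etc/group
#   exec "$@"
```

Because the private file lives in a location the random UID already owns, /etc/passwd can keep its default, read-only permissions.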
Further container guidelines can also be found here
Product Security are working to fix the problem with our own containers and continuing to work with our engineers so that simple bugs like this don’t creep in in future. There are still a lot of containers to fix, but don’t worry, we’re almost there.
Thanks to the original reporter, Joseph LaMagna-Reiter (SPR Inc.), and to Akram Ben Aissi from Red Hat for researching alternative methods.
Akram has also pointed out that, for Java-based processes, -Duser.home must still be specified for compatibility until CRI-O is updated.
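One possible way to pass that property without editing every java command line is the JAVA_TOOL_OPTIONS environment variable, which the JVM reads at startup. The helper below is a small sketch of that idea; the function name is hypothetical, and whether a given image should use JAVA_TOOL_OPTIONS or its own options variable is an assumption to verify per image.

```shell
#!/bin/sh
# Print the given JVM options with -Duser.home=HOME appended,
# unless a -Duser.home setting is already present.
# Usage: with_user_home "EXISTING_OPTIONS" HOME
with_user_home() {
    opts=$1 home=$2
    case " $opts " in
        *" -Duser.home="*) printf '%s\n' "$opts" ;;
        *) printf '%s\n' "${opts:+$opts }-Duser.home=$home" ;;
    esac
}

# Hypothetical usage in an entrypoint:
#   JAVA_TOOL_OPTIONS=$(with_user_home "${JAVA_TOOL_OPTIONS:-}" "$HOME")
#   export JAVA_TOOL_OPTIONS
```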