Kubernetes privilege escalation and access to sensitive information in OpenShift products and services - CVE-2018-1002105

Public Date: December 3, 2018, 5:00 pm
Status: Resolved
Impact: Critical


A flaw has been detected in Kubernetes that allows privilege escalation and access to sensitive information in OpenShift products and services. This issue has been assigned CVE-2018-1002105 and has a security impact of Critical.

All 3.x versions of OpenShift Container Platform allow compromise of pods (multiple running container instances) running on a compute node to which a pod is scheduled with normal user privilege. This access could include all secrets, pods, environment variables, running pod/container processes, and persistent volumes.

Additionally, on OpenShift Container Platform versions 3.6 and higher, this vulnerability allows cluster-admin level access to any API hosted by an aggregated API server. This includes the ‘metrics-server’ and ‘servicecatalog’ services as possible targets. Cluster-admin level access to the service catalog allows creation of brokered services by an unauthenticated user with escalated privileges in any namespace and on any node. This could allow an attacker to deploy malicious code or alter existing brokered services.

For OpenShift Dedicated environments, a regular user with pod exec/attach/portforward permissions can gain cluster-level administrative privileges on any compute node that can run that pod. This includes exec access to all running workloads, all current secrets, logs, and more.

Background Information

The OpenShift API Server is included in all OpenShift Container Platform installations and handles all the administration tasks for the cluster. Administrators and developers don’t usually call the API directly, but use the ‘oc’ binary, which calls the API server.

The API server provides various functionality, including the following oc commands:

  • oc exec
  • oc port-forward
  • oc rsh

It does this by acting as a reverse proxy to the kubelet running on the compute nodes. When connecting to the kubelet in order to fulfill any of the above commands, it opens a websocket connection which connects stdin, stdout, or stderr to the administrator or developer’s original call. For more information, read the Remote Commands section of the OpenShift Container Platform 3.11 architecture guide.

The API Server also acts as a reverse proxy when implementing the API Aggregation feature of Kubernetes. API aggregation enables the installation of additional Application Programming Interfaces (APIs) into the core API Server. Those additional APIs are referred to as API extensions by upstream Kubernetes. API Extensions allow an architect to extend the features of OpenShift Container Platform 3.

Acknowledgements

Red Hat would like to thank the Kubernetes Product Security Team for reporting this issue. Upstream acknowledges Darren Shepherd as the original reporter.

Additional References

Kubernetes Announce List

Video: Kubernetes Privilege Escalation Flaw Explained by Red Hat

Blog: The Kubernetes privilege escalation flaw: Innovation still needs IT security expertise

Blog: Understanding the critical Kubernetes privilege escalation flaw in OpenShift 3


Impacted Products

Red Hat Product Security has rated CVE-2018-1002105 as having a security impact of Critical.

The following Red Hat Product versions are impacted:

  • Red Hat OpenShift Container Platform 3.x
  • Red Hat OpenShift Online
  • Red Hat OpenShift Dedicated

Attack Description and Impact

The OpenShift API server is a component which handles the API requests for OpenShift Container Platform 3. The OpenShift API server had a vulnerability in the ‘UpgradeAwareHandler’ type, which acts as a reverse proxy to the compute nodes and to other services added as API extensions. A flaw in the reverse proxy prevented it from closing connections to the downstream service when there was an error, which allowed the user making the API call to escalate their privileges.

There are two ways to use this vulnerability to attack OpenShift Container Platform, OpenShift Online, or OpenShift Dedicated. The first involves abusing pod exec, attach, or portforward privileges granted to a normal user; the second involves attacking the API extensions feature, which provides the service catalog and access to additional features in OpenShift Container Platform 3.6 and later.

If a user has pod exec/attach/portforward privileges for any pod, their privileges can be escalated to cluster-admin, and any API call to a compute node’s kubelet API can be made. A call to the kubelet API allows the user to exec into any container running on the same node as the pod they have privileges for, including privileged containers with read/write access to the host filesystem.

The second attack doesn’t require any privileges and exploits the API extension feature used by ‘metrics-server’ and ‘servicecatalog’ in OpenShift Container Platform, OpenShift Online, and OpenShift Dedicated. It allows an unauthenticated user to gain cluster-admin privileges to any API extension deployed to the cluster, including ‘servicecatalog’. Cluster-admin access to ‘servicecatalog’ allows creation of service brokers in any namespace and on any node.

Diagnose your vulnerability

To check the version of OpenShift Container Platform installed, send an HTTP request to the API server, such as the following (note: the URL should be the same as the web console URL):

   curl https://openshift.example.com:8443/version/openshift | grep gitVersion

*The port is configuration-specific; 8443 is the default.
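For illustration, here is a simulated run of that filter over an assumed response body (the JSON shape and field values are examples only, not output from any real cluster):

```shell
# Simulated /version/openshift response piped through the same grep as above.
# All values here are assumptions for illustration.
cat <<'EOF' | grep gitVersion
{
  "major": "3",
  "minor": "11",
  "gitVersion": "v3.11.43",
  "gitCommit": "647ac05",
  "buildDate": "2018-11-20T00:00:00Z"
}
EOF
```

The line of interest is the one containing gitVersion; compare its value against the fixed versions listed below.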

Alternatively, if you are logged in with the ‘oc’ command, you can check the version with:

   oc version

Any versions of OpenShift Container Platform older than those listed below are vulnerable:

  • v3.11.43-1
  • v3.10.72-1
  • v3.9.51-1
  • v3.8.44-1
  • v3.7.72-1
  • v3.6.173.0.140-1
  • v3.5.5.31.80-1
  • v3.4.1.44.57-1
  • v3.3.1.46.45-1
  • v3.2.1.34-2
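The comparison against the fixed release for your minor version can also be scripted with sort -V. This is a sketch only, using v3.11.20 as an assumed example of an installed version:

```shell
# Example values; substitute the gitVersion reported by your own cluster and
# the fixed release for the matching minor version from the list above.
installed="3.11.20"
fixed="3.11.43"

# sort -V orders version strings numerically; if the installed version sorts
# first and is not equal to the fixed release, it is older, hence vulnerable.
oldest=$(printf '%s\n%s\n' "$installed" "$fixed" | sort -V | head -n1)
if [ "$oldest" = "$installed" ] && [ "$installed" != "$fixed" ]; then
  echo "vulnerable"
else
  echo "patched"
fi
```

For the example values above, this prints "vulnerable".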

Take Action

Customers running affected versions of Red Hat products are strongly recommended to update as soon as errata are available.

OpenShift Online (Starter and Pro) has been remediated. OpenShift Dedicated customers should speak with their support contact to confirm the status and schedule of these fixes.

Updates for Affected Products

Product                               Package       Advisory/Update
OpenShift Container Platform v3.11    kubernetes    RHSA-2018:3537
OpenShift Container Platform v3.10    kubernetes    RHSA-2018:3549
OpenShift Container Platform v3.9     kubernetes    RHSA-2018:2908
OpenShift Container Platform v3.8     kubernetes    RHSA-2018:3551
OpenShift Container Platform v3.7     kubernetes    RHSA-2018:2906
OpenShift Container Platform v3.6     kubernetes    RHSA-2018:3598
OpenShift Container Platform v3.5     kubernetes    RHSA-2018:3624
OpenShift Container Platform v3.4     kubernetes    RHSA-2018:3752
OpenShift Container Platform v3.3     kubernetes    RHSA-2018:3754
OpenShift Container Platform v3.2     kubernetes    RHSA-2018:3742


Mitigation

Fixes for OpenShift Container Platform versions 3.2 and higher have been shipped to resolve the flaw.

Users who update to the latest versions do not need to apply further mitigations. However, if the updates cannot be applied, mitigating the issue is recommended. 

Mitigate the pod exec/attach/port-forward attack (OCP 3.2->3.11)

Change the default ‘admin’ and ‘edit’ cluster roles, which have pod attach/exec/port-forward permissions:

$ oc edit clusterrole admin

Add the ‘rbac.authorization.kubernetes.io/autoupdate’ annotation with a value of false, and remove the pod attach/exec/portforward permissions. Example:

apiVersion: v1
kind: ClusterRole
metadata:
  annotations:
    openshift.io/description: A user that has edit rights within the project and can
      change the project's membership.
    rbac.authorization.kubernetes.io/autoupdate: "false"
  creationTimestamp: 2018-11-21T01:34:04Z
  name: admin
  resourceVersion: "148192"
  selfLink: /oapi/v1/clusterroles/admin
  uid: 87dd0306-ed2d-11e8-9456-fa163e65ec84
rules:
- apiGroups:
  - ""
  attributeRestrictions: null
  resources:
  - pods
  - pods/proxy
  verbs:
   ...

Do the same for the ‘edit’ role which also has these permissions.

For a single user in a project:

Check the roles assigned to the user. In this case, the admin role is assigned to the quicklab user:
 
$ oc project myproj
$ oc adm policy who-can get pods/exec
Namespace: test
Verb:      get
Resource:  pods/exec
 
Users:  quicklab
        system:admin
        system:serviceaccount:kube-system:generic-garbage-collector
        system:serviceaccount:kube-system:namespace-controller
 
Groups: system:cluster-admins
        system:masters
 
 
$ oc get rolebindings
NAME                    ROLE                    USERS      GROUPS                          SERVICE ACCOUNTS   SUBJECTS
admin           /admin          quicklab                                                      
system:deployers        /system:deployer                                                   deployer           
system:image-builders   /system:image-builder                                              builder            
system:image-pullers    /system:image-puller               system:serviceaccounts:myproj
$ oc export clusterrole admin > admin-mitigate.yml

Edit admin-mitigate.yml by adding a ‘rbac.authorization.kubernetes.io/autoupdate’ annotation with a value of false, changing the name and removing the pod attach/exec/portforward permissions. Example:

apiVersion: v1
kind: ClusterRole
metadata:
  annotations:
    openshift.io/description: A user that has edit rights within the project and can
      change the project's membership.
    rbac.authorization.kubernetes.io/autoupdate: "false"
  creationTimestamp: null
  name: admin-mitigate
rules:
- apiGroups:
  - ""
  attributeRestrictions: null
  resources:
  - pods
  - pods/proxy
  verbs:
   ...

Create a new role with the new configuration:

$ cat admin-mitigate.yml | oc create -f -

Assign this role to any users who you don’t want to have those permissions:

$ oc adm policy remove-role-from-user admin quicklab
role "admin" removed: "quicklab"
$ oc adm policy add-role-to-user admin-mitigate quicklab
role "admin-mitigate" added: "quicklab"

Mitigating the API Extension attack (OCP 3.6 -> 3.11)

To mitigate the API extension attack until a cluster upgrade to a patched version can be completed, it is possible to delete the affected API extensions. Please review each service carefully to ensure this will not cause a loss of critical functionality; where it would, alternate protections such as network-level restrictions to known sources may be more appropriate.

Get a list of services which need to be disabled:

$ oc get apiservices -o=custom-columns=NAME:.metadata.name,SERVICE:.spec.service,STATUS:.status.conditions[0].type,VALUE:.status.conditions[0].status | grep -v '<nil>'

For example, on OCP 3.11 the services might include:

NAME                            SERVICE                                              STATUS      VALUE
v1beta1.servicecatalog.k8s.io   map[name:apiserver namespace:kube-service-catalog]   Available   True
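To make the filter concrete, here is a simulated run over assumed sample output; grep -v '&lt;nil&gt;' drops every row whose SERVICE column is &lt;nil&gt;, so only proxied API extensions (and the header) remain:

```shell
# Simulated `oc get apiservices` output (rows are illustrative assumptions).
cat <<'EOF' | grep -v '<nil>'
NAME                            SERVICE                                              STATUS      VALUE
v1.apps                         <nil>                                                Available   True
v1beta1.servicecatalog.k8s.io   map[name:apiserver namespace:kube-service-catalog]   Available   True
EOF
```

Here the v1.apps row is filtered out and the service catalog row survives, marking it as a service that needs to be disabled.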

Disable the services listed in the output, for example:

$ oc delete svc apiserver -n kube-service-catalog

For this example, to re-enable the service catalog, run the openshift-service-catalog playbook. Please refer to the product documentation to re-enable other services.

$ ansible-playbook [-i /path/to/inventory] \
    /usr/share/ansible/openshift-ansible/playbooks/openshift-service-catalog/config.yml

Comments


The mitigations apply to unpatched versions of Kubernetes and OpenShift. Users who have already updated to the latest versions do not need to apply further mitigations. However, if the updates cannot be applied, mitigating the issue is recommended.

While the instructions for editing the admin and edit ClusterRoles are correct for upstream Kubernetes, OpenShift versions 3.9, 3.10, and 3.11 use aggregated roles rather than static definitions. Unless the roles are replaced, changes are not persisted, as the aggregationRule replaces the clusterRole with the roles selected by matchLabels.

[root@master1 ~]# oc get clusterroles -l rbac.authorization.k8s.io/aggregate-to-admin
NAME
system:aggregate-to-admin
system:openshift:aggregate-to-admin

By default, changes to these clusterRoles will be reconciled and reverted. Rather than edit system:aggregate-to-admin, it may be preferable to replace the clusterRole aggregate with a fixed custom clusterRole by removing the aggregationRule: block:

  1. Use oc edit clusterrole admin to edit the aggregate.
  2. Remove the aggregationRule: block.
  3. Remove pods/attach, pods/exec, and pods/portforward.
  4. Disable role reconciliation by setting openshift.io/reconcile-protect: "true" or rbac.authorization.kubernetes.io/autoupdate=false.
  5. Remove the label kubernetes.io/bootstrapping: rbac-defaults.
  6. Write and quit to save the changes.
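Applied to the admin clusterRole, the end state looks roughly like the following fragment (the rule contents are illustrative only, not a complete role definition):

```yaml
# Illustrative fragment: aggregationRule and bootstrapping label removed,
# reconciliation disabled, and pods/attach, pods/exec, pods/portforward dropped.
apiVersion: v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "false"
  name: admin
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - pods/log
  verbs:
  - get
  - list
  - watch
```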

These steps may be repeated as necessary for the edit clusterRole.

Once the fixes have been applied, the admin clusterRole may be restored to its previous definition by restoring the aggregationRule block (overriding the local definition on save and replacing the custom definition with the aggregate of the system clusterRoles). For reference, the original block is below. The difference between the admin and edit clusterRoles is the name of the role reference.

aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.authorization.k8s.io/aggregate-to-admin: "true"

Hopefully this helps those who have not been able to apply the fixes and need to mitigate the vulnerability temporarily, but whose edits did not appear to persist because aggregate clusterRoles are in use.

Thank you for the clarification about the mitigation only applying to unpatched installations. A note to this effect has been added to the Resolve tab.

Comment updated to include OpenShift 3.9 as one of the versions using an aggregationRule to combine policies.

OpenShift 3.7 and earlier do not use aggregate cluster roles and these may be edited; however, openshift.io/reconcile-protect: "false" may need to be changed to true to prevent the system from replacing the cluster role when a master is restarted until the patches have been applied to fix the issue.

If, for whatever reason, the choice has been made to mitigate the attack rather than apply updates to fix the issue, please make sure you check the "Resolve" tab of this article and follow the steps to mitigate the API Extension attack as well. While these components are not installed on all clusters, the mitigation includes information on how to determine if an extension is installed and how to disable it until such time as the package updates may be applied that fix both issues.

When upgrading to 3.10: revert this change, even if some rolebindings cannot be fulfilled.

The diagnose tab should be

   curl https://openshift.example.com:8443/version/openshift | grep gitVersion

instead of

   curl https://openshift.example.com/version/openshift | grep gitVersion

Thanks for the comment. The port is configuration dependent.

well 8443 is the default ... :D

In the Mitigating the API Extension attack (OCP 3.6 -> 3.11) steps, if nothing is returned from: oc get apiservices -o=custom-columns=NAME:.metadata.name,SERVICE:.spec.service,STATUS:.status.conditions[0].type,VALUE:.status.conditions[0].status | grep -v ''

(everything returned is nil in the SERVICE column)

Does that mean we are not affected?

Thanks for the question. It's not clear if this page is removing the word nil (in angled brackets) from between the apostrophes following | grep -v

To your question, the "oc get apiservices ..." command lists all apiservices present. The grep removes any that are not proxied, leaving behind a list of vulnerable services which need to be disabled. If none are returned by the command then your cluster is not vulnerable to this part of the exploit.

The pages use Markdown; including backquotes should prevent removal of tags that appear to be HTML (<nil>). I believe what was meant is as follows:

oc get apiservices -o=custom-columns=NAME:.metadata.name,SERVICE:.spec.service,STATUS:.status.conditions[0].type,VALUE:.status.conditions[0].status | grep -v '<nil>'

If the command only returns the following there is no need for additional action:

NAME                                   SERVICE   STATUS      VALUE

However, if you see any items listed under the column headers with a SERVICE that is not <nil>, that indicates an API extension to that service is installed.

NAME                           SERVICE                                             STATUS       VALUE
v1beta1.servicecatalog.k8s.io  map[name:apiserver namespace:kube-service-catalog]  Available    True

The Service Catalog is an API extension used by the Template Service Broker and the Ansible Template Broker. This feature was in technical preview in OpenShift 3.6 and generally available in OpenShift 3.7 and later. Other software may be delivered as API extensions as well. If disabling API extensions would disable functionality that may be required, please consider updating to the latest z-stream versions within your current minor release that fix these vulnerabilities instead.

The Diagnose tab states "Any versions of OpenShift Container Platform older than those listed below are vulnerable: v3.11.43-1". If I run the curl or the oc version command, I get "gitVersion": "v3.11.43" (missing the -1), but the packages I have installed (or rather, those packaged within the Docker images in my containerized environment) are the ones listed in the Resolve tab:

root # oc rsh master-api-hostname
sh-4.2# rpm -qa|grep atomic-openshift
atomic-openshift-clients-3.11.43-1.git.0.647ac05.el7.x86_64
atomic-openshift-3.11.43-1.git.0.647ac05.el7.x86_64

The -1 (and git.0.647ac05) are part of the package release and build metadata. While we identify specific package version and release numbers, it is sufficient to ensure the clusters are running packages and images at version 3.11.43 or higher.

OpenShift 3.11 only had two public releases. The initial release was version 3.11.16-1 on October 11, 2018. Version 3.11.43-1 was the first errata released on November 19, 2018. There were no official releases in-between.

so then diagnose tab should read 3.11.43?

The Diagnose tab should read 3.11.43, correct. However, the actual RPM name had -1, which led to the confusion; see the atomic-openshift-master RPM linked to the 3.11 errata on the Resolve tab.