Navigating Kubernetes API deprecations and removals


Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

Kubernetes follows a fairly strict API versioning policy, which has resulted in a number of API deprecations for v1beta1 and v2beta1 APIs over several releases. According to the policy, beta API versions must be supported for 9 months or 3 releases (whichever is longer) after deprecation, at which point they can be removed.

Workloads, tools, or other components running on or interacting with the cluster that still use removed APIs will begin to fail. Therefore, administrators must evaluate their cluster for any APIs in use that will be removed and migrate the affected components to the appropriate new API version.

After this is done, the administrator can provide the administrator acknowledgment and proceed with updates that include API removals.
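The acknowledgment itself is a small patch to the admin-acks config map in the openshift-config namespace. The sketch below uses the documented ack key for the Kubernetes 1.22 removals when updating from 4.8 to 4.9; the key differs for other update paths, so check the release notes for your versions before using it:

```shell
# Version-specific ack key for the Kubernetes 1.22 API removals (4.8 -> 4.9).
PATCH='{"data":{"ack-4.8-kube-1.22-api-removals-in-4.9":"true"}}'
# Against a live cluster (requires cluster-admin) you would run:
#   oc -n openshift-config patch cm admin-acks --type=merge -p "$PATCH"
# Sanity-check the patch payload locally before applying it:
echo "$PATCH" | jq -r '.data | keys[0]'
```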

Migrating applications using removed APIs

For more information on migrating from removed Kubernetes APIs, see the following:

Evaluating your cluster for removed APIs

There are several methods to help administrators identify where APIs that will be removed are in use. However, OpenShift Container Platform cannot identify all instances, especially workloads that are idle or external tools that are used. It is the responsibility of the administrator to properly evaluate all workloads and other integrations for instances of removed APIs.

Reviewing alerts to identify uses of APIs that will be removed

OpenShift Container Platform provides two alerts that fire when an API is in use that will be removed in the next release:

  • APIRemovedInNextReleaseInUse - for APIs that will be removed in the next OpenShift Container Platform release.
  • APIRemovedInNextEUSReleaseInUse - for APIs that will be removed in the next OpenShift Container Platform Extended Update Support (EUS) release.

If either of these alerts are firing in your cluster, review the alerts and take action to clear the alerts by migrating manifests and API clients to use the new API version.

These alerts, meant for the overall monitoring of the cluster, do not provide information that helps determine which workloads are using the APIs that will be removed. Additionally, these alerts are tuned to not be overly sensitive so as to avoid causing alerting fatigue on a production system. You can use the APIRequestCount API to get more information about which APIs are in use and which workloads are using APIs that will be removed. Additionally, some APIs might not trigger the above alerts, yet may still be captured by APIRequestCount.

Using APIRequestCount to identify uses of APIs that will be removed

You can use the APIRequestCount API to track API requests and review whether any of them are using one of the removed APIs.

IMPORTANT: A bug in OCP 4.8 releases prior to 4.8.25 (BZ 2021629) affected APIRequestCount; it was fixed by errata RHBA-2021:5209. Upgrade to OCP 4.8.25 or a newer 4.8.z release before checking APIRequestCount.

Note: You must have access to the cluster as a user with the cluster-admin role in order to use the APIRequestCount API.

Run the following command and examine the REMOVEDINRELEASE column of the output to identify APIs that will be removed in a future release but are currently in use:

$ oc get apirequestcounts

Example output:

NAME                                        REMOVEDINRELEASE   REQUESTSINCURRENTHOUR   REQUESTSINLAST24H
cloudcredentials.v1.operator.openshift.io                      32                      111
ingresses.v1.networking.k8s.io                                 28                      110
ingresses.v1beta1.extensions                1.22               16                      66
ingresses.v1beta1.networking.k8s.io         1.22               0                       1
installplans.v1alpha1.operators.coreos.com                     93                      167
...

You can also use -o jsonpath to filter the results:

$ oc get apirequestcounts -o jsonpath='{range .items[?(@.status.removedInRelease!="")]}{.status.removedInRelease}{"\t"}{.status.requestCount}{"\t"}{.metadata.name}{"\n"}{end}'

Example output:

1.22    11      certificatesigningrequests.v1beta1.certificates.k8s.io
1.22    356     ingresses.v1beta1.extensions
1.22    0       ingresses.v1beta1.networking.k8s.io
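If you prefer jq over jsonpath, the same filter can be expressed against the JSON output. The sketch below runs against a saved, abbreviated sample; the sample data is illustrative only. On a live cluster, pipe the output of oc get apirequestcounts -o json in instead:

```shell
# Illustrative, abbreviated sample of 'oc get apirequestcounts -o json'.
cat > /tmp/apirequestcounts.json <<'EOF'
{"items":[
  {"metadata":{"name":"ingresses.v1beta1.extensions"},
   "status":{"removedInRelease":"1.22","requestCount":356}},
  {"metadata":{"name":"ingresses.v1.networking.k8s.io"},
   "status":{"requestCount":699}}
]}
EOF
# Keep only APIs scheduled for removal, mirroring the jsonpath filter above.
jq -r '.items[]
       | select(.status.removedInRelease != null)
       | "\(.status.removedInRelease)\t\(.status.requestCount)\t\(.metadata.name)"' \
   /tmp/apirequestcounts.json
```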

Using APIRequestCount to identify which workloads are using the APIs that will be removed

You can examine the APIRequestCount resource for a given API version to help identify which workloads are using the API.

IMPORTANT: A bug in OCP 4.8 releases prior to 4.8.25 (BZ 2021629) affected APIRequestCount; it was fixed by errata RHBA-2021:5209. Upgrade to OCP 4.8.25 or a newer 4.8.z release before checking APIRequestCount.

Note: You must have access to the cluster as a user with the cluster-admin role in order to use the APIRequestCount API.

Run the following command and examine the username and userAgent fields to help identify the workloads that are using the API:

$ oc get apirequestcounts <resource>.<version>.<group> -o yaml

For example:

$ oc get apirequestcounts ingresses.v1beta1.networking.k8s.io -o yaml

You can also use -o jsonpath to extract the username and userAgent values from an APIRequestCount resource:

$ oc get apirequestcounts ingresses.v1beta1.networking.k8s.io \
  -o jsonpath='{range .status.last24h..byUser[*]}{..byVerb[*].verb}{","}{.username}{","}{.userAgent}{"\n"}{end}' \
  | sort -k 2 -t, -u | column -t -s ","

Note: Replace ingresses.v1beta1.networking.k8s.io in the command with the name of each deprecated API that is still in use.
Note: The above command shows whether the selected API was used in the last 24 hours. If applications were changed or upgraded to use the newer APIs, you can replace last24h with currentHour to check whether the API is still being used within the last hour.

Example output:

VERBS  USERNAME                        USERAGENT
watch  bob                             oc/v4.8.11
watch  system:kube-controller-manager  cluster-policy-controller/v0.0.0
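The per-user counts are nested in the resource status as last24h → byNode → byUser → byVerb. The jq sketch below walks that structure over a saved, abbreviated sample (the sample data is illustrative); on a live cluster, use oc get apirequestcounts <resource>.<version>.<group> -o json instead of the file:

```shell
# Illustrative, abbreviated APIRequestCount status in JSON form.
cat > /tmp/arc-sample.json <<'EOF'
{"status":{"last24h":[
  {"byNode":[{"byUser":[
    {"username":"bob","userAgent":"oc/v4.8.11",
     "byVerb":[{"verb":"watch","requestCount":5}]},
    {"username":"system:kube-controller-manager",
     "userAgent":"cluster-policy-controller/v0.0.0",
     "byVerb":[{"verb":"watch","requestCount":3}]}
  ]}]}
]}}
EOF
# Walk last24h -> byNode -> byUser, printing verb, username, and userAgent --
# the same fields the jsonpath query above extracts.
jq -r '.status.last24h[].byNode[]?.byUser[]?
       | "\(.byVerb[].verb)\t\(.username)\t\(.userAgent)"' /tmp/arc-sample.json | sort -u
```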

IMPORTANT NOTE: You can safely ignore the following entries that appear in the results:

  • The system:serviceaccount:kube-system:generic-garbage-collector and system:serviceaccount:kube-system:namespace-controller users might appear in the results because they walk through all registered APIs searching for resources to remove.
  • The system:kube-controller-manager and system:cluster-policy-controller users might appear in the results because they walk through all resources while enforcing various policies.
  • If OpenShift GitOps is installed in the cluster, the system:serviceaccount:openshift-gitops:openshift-gitops-argocd-application-controller user might appear in the results (refer to KCS 6635361 for additional information).
  • If OpenShift Pipelines is installed in the cluster, the openshift-pipelines-operator userAgent might appear in the results (refer to KCS 6821411 for additional information).
  • In OSD and ROSA clusters, or if Velero is installed in the cluster, the system:serviceaccount:openshift-velero:velero and system:serviceaccount:openshift-oadp:velero users might appear in the results (refer to KCS 6351332 for additional information).
  • The system:serviceaccount:openshift-ovn-kubernetes:ovn-kubernetes-controller may appear as using endpointslices.v1beta1.discovery.k8s.io in 4.10, while this API is removed in 4.12. This is not a problem because OVN-Kubernetes switches to v1 endpointslices API in 4.11 and an EUS upgrade from 4.10 to 4.12 must have its control plane traverse 4.11.
  • The system:serviceaccount:openshift-network-operator may appear as using poddisruptionbudgets.v1beta1.policy in 4.10, while this API is removed in 4.12. This is not a problem because cluster-network-operator switches to v1 poddisruptionbudgets API in 4.11 and an EUS upgrade from 4.10 to 4.12 must have its control plane traverse 4.11.
  • The system:serviceaccount:openshift-monitoring:kube-state-metrics may appear as using horizontalpodautoscalers.v2beta2.autoscaling. This can be safely ignored because all 4.12 versions switch to the v2 API as per the resolution of OCPBUGS-4112.
  • The system:serviceaccount:prod-rhsso:rhsso-operator may appear using poddisruptionbudgets.v1beta1.policy when trying to upgrade to 4.12. If the version of RHSSO is up-to-date, this can be safely ignored as explained in Red Hat Single Sign-On (RH SSO) Operator is using deprecated 'poddisruptionbudgets.v1beta1.policy' API.
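When reviewing the per-user output, it can help to filter out the known-ignorable users listed above. The sketch below does this with grep over sample comma-separated lines (the sample input is illustrative; on a live cluster, pipe the output of the jsonpath command above in instead, and extend the IGNORE list as needed for your environment):

```shell
# Known-ignorable users from the list above (extend as needed).
IGNORE='system:serviceaccount:kube-system:generic-garbage-collector|system:serviceaccount:kube-system:namespace-controller|system:kube-controller-manager|system:cluster-policy-controller'
# Illustrative input in the verb,username,userAgent format produced above.
printf '%s\n' \
  'watch,bob,oc/v4.8.11' \
  'watch,system:kube-controller-manager,cluster-policy-controller/v0.0.0' \
  'list,system:serviceaccount:kube-system:namespace-controller,kube-controller-manager/v1.23.5' \
  | grep -Ev ",($IGNORE),"
```

Only lines whose username field is not on the ignore list survive, leaving the entries that actually need migration work.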

13 Comments

Can users specifically use

kubectl-convert -f <file> --output-version <group>/<version>

that is documented below for this? And if yes, we should specifically document that in this KB.

https://kubernetes.io/docs/reference/using-api/deprecation-guide/#migrate-to-non-deprecated-apis

Thank you! I found it helpful to output the API name in addition to the VERB, USERNAME, and USERAGENT. I'm sure this could be improved, but this is what I came up with in a hurry.

$ for i in $(oc get apirequestcounts -o jsonpath='{range .items[?(@.status.removedInRelease!="")]}{.metadata.name}{"\n"}{end}'); do
    oc get apirequestcounts $i -o jsonpath='{range .status.last24h..byUser[*]}{..byVerb[*].verb}{","}{"'$i',"}{.username}{","}{.userAgent}{"\n"}{end}' \
       | sort -k 2 -t, -u
  done | column -t -s, -NVERBS,API,USERNAME,USERAGENT

VERBS        API                                                               USERNAME                                                        USERAGENT
delete       cronjobs.v1beta1.batch                                            system:serviceaccount:openshift-storage:rook-ceph-system        rook/v0.0.0
get          customresourcedefinitions.v1beta1.apiextensions.k8s.io            system:serviceaccount:multicluster-engine:hive-operator         hive-operator/v0.0.0
get          flowschemas.v1beta1.flowcontrol.apiserver.k8s.io                  system:serviceaccount:openshift-cluster-version:default         cluster-version-operator/v0.0.0
list watch   horizontalpodautoscalers.v2beta2.autoscaling                      user@example.com                                                Mozilla/5.0
watch        horizontalpodautoscalers.v2beta2.autoscaling                      system:serviceaccount:openshift-monitoring:kube-state-metrics   v2.5.0
patch watch  horizontalpodautoscalers.v2beta2.autoscaling                      system:serviceaccount:openshift-operators:quay-operator         manager/v0.0.0
watch        podsecuritypolicies.v1beta1.policy                                system:kube-controller-manager                                  kube-controller-manager/v1.24.6+5658434
get update   podsecuritypolicies.v1beta1.policy                                system:serviceaccount:metallb-system:default                    manager/v0.0.0
list watch   podsecuritypolicies.v1beta1.policy                                system:serviceaccount:open-cluster-management:search-collector  main/v0.0.0
get          prioritylevelconfigurations.v1beta1.flowcontrol.apiserver.k8s.io  system:serviceaccount:openshift-cluster-version:default         cluster-version-operator/v0.0.0

If the above command doesn't work on your Linux machine, try this:

( echo 'VERBS,API,USERNAME,USERAGENT' ; for i in $(oc get apirequestcounts -o jsonpath='{range .items[?(@.status.removedInRelease!="")]}{.metadata.name}{"\n"}{end}'); do     oc get apirequestcounts $i -o jsonpath='{range .status.last24h..byUser[*]}{..byVerb[*].verb}{","}{"'$i',"}{.username}{","}{.userAgent}{"\n"}{end}'        | sort -k 2 -t, -u;   done )  | column -s, -t

The compliance operator v0.1.34 does API calls to flowschemas.v1alpha1.flowcontrol.apiserver.k8s.io with this user info:

VERBS  USERNAME                                                           USERAGENT
get    system:serviceaccount:openshift-compliance:api-resource-collector  compliance-operator/v0.0.0

Engineering confirmed to me that this was fixed in https://github.com/ComplianceAsCode/content/commit/51d206cf3847b5a3b240842a51ac6ea046705786, although I don't have the exact version of the operator which contains this fix. Upgrading to the latest compliance operator version should resolve this.

You can safely ignore the following entries that appear in the results: [ snip long list of OpenShift components that use deprecated APIs ]

This doesn't address an important point: OpenShift generates an alert when deprecated APIs are used. And there's no way to "ignore" a single component's use of a deprecated API. The entire alert has to be silenced. So you'll either learn to ignore the warnings or you'll no longer get alerts related to any use of the specified API.

Either option removes the value of having usage alerts in the first place.

It's possible that the engineers who created the alerts should have heeded this advice. Fortunately it's only the openshift-gitops-argocd-application-controller user that calls the API, for example, and not the service account used by the GitOps server itself. This is probably true for Pipelines as well, as it normally creates a service account in each namespace to run the Tasks.

But however it's implemented, a policy of not shipping software components that trigger alerts built into the same product would be strongly preferred.

The current situation is like a car with several components that cause the check-engine light to stay on all of the time. It's beside the point that the check-engine light can safely be ignored if it's on due to specific expected conditions (engine running; use of left turn signal, etc.). The burden of constantly running diagnostics to learn why the check-engine light is on and comparing it against a list of acceptable triggers has been left to the driver. Either the subsystems should stop triggering the check-engine light or the light's activation should be modified to ignore such causes.

Hello,

I guess the following is also missing in the "safely ignore" list:

$ oc get apirequestcounts horizontalpodautoscalers.v2beta2.autoscaling -o jsonpath='{range .status.last24h..byUser[*]}{..byVerb[*].verb}{","}{.username}{","}{.userAgent}{"\n"}{end}'   | sort -k 2 -t, -u | column -t -s, -NVERBS,USERNAME,USERAGENT
VERBS       USERNAME                                                       USERAGENT
watch       system:serviceaccount:openshift-monitoring:kube-state-metrics  v2.3.0
list watch  system:serviceaccount:openshift-monitoring:kube-state-metrics  v2.5.0

That's correct. The v2beta2 autoscaling API was replaced by the v2 one in 4.12.0 as per the resolution of OCPBUGS-4112. Now updating the solution.

Thank you very much!

Yet another one (seen on a 4.11 cluster with MetalLB installed)

$ oc get apirequestcounts podsecuritypolicies.v1beta1.policy -o jsonpath='{range .status.last24h..byUser[*]}{..byVerb[*].verb}{","}{.username}{","}{.userAgent}{"\n"}{end}' | sort -k 2 -t, -u | column -t -s, -NVERBS,USERNAME,USERAGENT

VERBS       USERNAME                                      USERAGENT
watch       system:kube-controller-manager                kube-controller-manager/v1.24.6+deccab3
get update  system:serviceaccount:metallb-system:default  manager/v0.0.0

I have been checking, and it seems that metallb indeed still has references to podsecuritypolicies in the CSV, and the ones created by metallb-operator are not cleaned up.

I am going to try to spawn a new 4.13 cluster and see whether installing the metallb-operator leads to any attempt to mistakenly access this API. I might need to open a bug with the results.

Ok. I checked a bit better and the only references are in RBAC items authorizing this access, but I don't see a real attempt to access the deprecated APIs in the recent versions.

So the metallb warnings seem safe to ignore (at least, I couldn't find a case where they aren't). If they did cause any trouble, that would have to be investigated in a support case.

Below is a jq command that should exclude all of the entries documented as "ignorable", as well as calls from "system:serviceaccount:metallb-system:default" (which was recently reported by another commenter) and "system:serviceaccount:openshift-storage:rook-ceph-system" (which I've observed myself).

Note that it only includes entries for APIs which the cluster reports as being scheduled for removal (.status.removedInRelease exists). You can be more specific by changing select(.status.removedInRelease) to select(.status.removedInRelease == "1.25") (for example).

oc get apirequestcounts -o json | jq -r '[
  .items[]
  | select(.status.removedInRelease)
  | .metadata.name as $api
  | {name: .metadata.name, removedInRelease: .status.removedInRelease}
    + (.status.last24h[] | select(has("byNode")) | .byNode[] | select(has("byUser")) | .byUser[] | {username,userAgent,"verb": .byVerb[].verb})
  | select(
      (
        .username == "system:serviceaccount:openshift-gitops:openshift-gitops-argocd-application-controller"
        or .username == "system:serviceaccount:kube-system:generic-garbage-collector"
        or .username == "system:serviceaccount:kube-system:namespace-controller"
        or .username == "system:kube-controller-manager"
        or .username == "system:cluster-policy-controller"
        or .username == "system:serviceaccount:openshift-velero:velero"
        or .username == "system:serviceaccount:openshift-oadp:velero"
        or .username == "system:serviceaccount:metallb-system:default"
        or .username == "system:serviceaccount:openshift-storage:rook-ceph-system"
        or .userAgent == "openshift-pipelines-operator/v0.0.0"
        or (
          .username == "system:serviceaccount:openshift-ovn-kubernetes:ovn-kubernetes-controller"
          and $api == "endpointslices.v1beta1.discovery.k8s.io"
        )
        or (
          .username == "system:serviceaccount:openshift-network-operator"
          and $api == "poddisruptionbudgets.v1beta1.policy"
        )
        or (
          .username == "system:serviceaccount:openshift-monitoring:kube-state-metrics"
          and $api == "horizontalpodautoscalers.v2beta2.autoscaling"
        )
      )
      | not
    )
]
| group_by( {name, removedInRelease, username, userAgent} )
| map(first + {verb: map(.verb) | unique})
| .[] | [.name, .username, .userAgent, (.verb | join(",")), .removedInRelease]
| join("\t")' | column --table-columns NAME,USERNAME,USERAGENT,VERB,REMOVEDINRELEASE --table

I learned a bit of jq as I put this together, so there's almost certainly a more concise way to get it done.

Sample output:

NAME                                          USERNAME  USERAGENT    VERB        REMOVEDINRELEASE
endpointslices.v1beta1.discovery.k8s.io       me        oc/4.12.0    list        1.25
horizontalpodautoscalers.v2beta2.autoscaling  you       Mozilla/5.0  list,watch  1.26
poddisruptionbudgets.v1beta1.policy           me        oc/4.12.0    list        1.25

Even if you see that deprecated APIs are still being requested, this does not mean there is an issue; for example, some operators systematically discover all available APIs.

Taking as an example the API ingresses.v1beta1.extensions, which is deprecated in favor of networking.k8s.io/v1 (from here), we can verify that the new API is already in use:

  • Let's see which components are requesting the deprecated ingresses.v1beta1.extensions API:
$ oc get apirequestcounts ingresses.v1beta1.extensions  -o yaml | grep "userAgent:" | sort | uniq
          userAgent: cluster-policy-controller/v0.0.0
          userAgent: kube-controller-manager/v1.21.1+a620f50
  • Check if the same components are also requesting the new ingresses.v1.networking.k8s.io API:
$ oc get apirequestcounts ingresses.v1.networking.k8s.io   -o yaml | grep "userAgent:" | sort | uniq
          userAgent: cluster-policy-controller/v0.0.0
          userAgent: kube-controller-manager/v1.21.1+a620f50
          userAgent: kube-state-metrics/v2.0.0
          userAgent: openshift-controller-manager/v0.0.0
  • Also, as we can see, the new API is requested many more times than the deprecated one:
$ oc get apirequestcount ingresses.v1beta1.networking.k8s.io ingresses.v1.networking.k8s.io      
NAME                                  REMOVEDINRELEASE   REQUESTSINCURRENTHOUR   REQUESTSINLAST24H
ingresses.v1.networking.k8s.io                           2                       699
ingresses.v1beta1.networking.k8s.io   1.22               1                       4

P.S.: omc tool supports apirequestcount resources, so you can also run the above commands using the must-gather from the customer.

Hello! I don't understand this part:

$ oc get apirequestcounts cronjobs.v1beta1.batch -o jsonpath='{range .status.last24h..byUser[*]}{..byVerb[*].verb}{","}{.username}{","}{.userAgent}{"\n"}{end}' | sort -k 2 -t, -u | column -t -s ","

watch  system:serviceaccount:openshift-vertical-pod-autoscaler:vpa-admission-controller  admission-controller/v0.0.0
watch  system:serviceaccount:openshift-vertical-pod-autoscaler:vpa-recommender           recommender/v0.0.0
watch  system:serviceaccount:openshift-vertical-pod-autoscaler:vpa-updater               updater/v0.0.0

So, when it shows a serviceaccount, is there no need to change anything because it is only making RBAC calls?

$ oc get apirequestcount cronjobs.v1beta1.batch cronjobs.v1.batch
NAME                     REMOVEDINRELEASE   REQUESTSINCURRENTHOUR   REQUESTSINLAST24H
cronjobs.v1beta1.batch   1.25               34                      1104
cronjobs.v1.batch                           581                     18583

The process is very confusing.

How can there not be an easier, official way to identify which objects are using the APIs to be removed?

Thanks!