Model registry fails to connect to MySQL database due to an authentication plugin mismatch

Solution Verified - Updated -

Environment

  • Red Hat OpenShift AI version 2.22 or earlier.
  • Model Registry component enabled.
  • MySQL database using MySQL v8.0 - v8.4, particularly instances deployed using the OpenShift Container Platform (OCP) MySQL template or other standard online deployment examples.

Issue

When you set up a model registry in Red Hat OpenShift AI (RHOAI), the model registry pods fail to start, and the grpc-container and rest-container show authentication errors in the logs.

This issue often occurs when deploying MySQL using common online templates or examples, which may not use the authentication plugin required by OpenShift AI.

See the Diagnostic Steps section for more information about confirming the cause of your issue.

Resolution

To resolve this issue, you must manually change the authentication plugin for the model registry database user to mysql_native_password.

  1. Log in to the MySQL instance as a user with privileges to alter other users, for example, the root user.

  2. Execute the following ALTER USER command. This command changes the specified user's authentication method. Replace and
    with the actual username and password for your model registry database.

ALTER USER '<username>'@'%' IDENTIFIED WITH mysql_native_password BY '<password>';
  1. Verify the change. You can run the following command to confirm that the plugin was updated:
SELECT user, host, plugin FROM mysql.user;

The output should now show the mysql_native_password plugin for your user:

+------------------+-----------+-----------------------+
| user             | host      | plugin                |
+------------------+-----------+-----------------------+
| <username>       | %         | mysql_native_password |
...
+------------------+-----------+-----------------------+
  1. After altering the user, restart the failing model registry pod. For example, for a model registry instance called model_registry_example running in the rhoai-model-registries namespace, you can run the following command to delete and recreate the pod:
oc delete deployment model_registry_example -n rhoai-model-registries

The recreated pod should connect to the database successfully.

Note:
No further privilege changes, such as GRANT ALL PRIVILEGES or FLUSH PRIVILEGES, should be necessary to connect successfully.

Root Cause

This issue often occurs when deploying MySQL using common online templates or examples, which may not use the authentication plugin required by OpenShift AI.

The grpc-container log shows a fatal error indicating that the MySQL database was not initialized, and that the connection failed:

E0624 17:37:29.099947 1 mysql_metadata_source.cc:174] MySQL database was not initialized. Please ensure your MySQL server is running. Also, this error might be caused by starting from MySQL 8.0, mysql_native_password used by MLMD is not supported as a default for authentication plugin.
...
F0624 17:37:29.100133 1 metadata_store_server_main.cc:617] Check failed: absl::OkStatus() == status (OK vs. INTERNAL: mysql_real_connect failed: errno: , error: [mysql-error-info='']) MetadataStore cannot be created with the given connection config.

The rest-container log shows repeated errors trying to connect to the MLMD service:

W0624 17:39:51.036969 1 mlmd.go:85] Retrying connection to MLMD service (attempt 25/25): error setting up context type kf.RegisteredModel: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:9090: connect: connection refused"

These errors might occur if you are using a MySQL version between 8.0 and 8.4, where mysql_native_password is expected to be the default authentication plugin.

Diagnostic Steps

Check the pod logs
In the logs of the failing model registry pod, look for the grpc-container. The presence of the mysql_real_connect failed error indicates a database connection issue.

Verify the MySQL authentication plugin
Many deployment examples and templates for MySQL version 8.0 and newer, including the default OpenShift Container Platform template, set caching_sha2_password as the default authentication plugin. This conflicts with the requirement that mysql_native_password is the authentication plugin for MySQL.

  1. Log in to your MySQL database pod.
  2. Run the following command to check the authentication plugin for your database user:
SELECT user, host, plugin FROM mysql.user;
  1. Examine the command output. If the plugin for your user is set to caching_sha2_password (or any plugin other than mysql_native_password), then the root cause of the pod failure is a mismatch between the authentication plugin required by the model registry's MLMD component (mysql_native_password) and the one configured in the MySQL database.
+------------------+-----------+-----------------------+
| user             | host      | plugin                
+------------------+-----------+-----------------------+ 
| <username>       | %         | caching_sha2_password |
| root             | %         | caching_sha2_password |
...
+------------------+-----------+-----------------------+

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

Comments