Mitigation of Web Cache Poisoning in the Python urllib library (CVE-2021-23336)


1. Background information

Certain Python versions in Red Hat Enterprise Linux (RHEL) and Red Hat Software Collections (RHSCL) are vulnerable to web cache poisoning (CVE-2021-23336) through the urllib query string parsing functions. The attack uses a vector called parameter cloaking.

When parsing query strings using the urllib.parse.parse_qsl and urllib.parse.parse_qs functions, Python urllib previously always separated parameters on both ampersands (&) and semicolons (;). A reverse proxy, if used in front of a web application that uses urllib, can also be configured to parse query strings for the purpose of excluding certain parameters when computing cache keys (to determine if requests are identical and can get the same response from the cache). If such a reverse proxy was configured to separate parameters only on ampersand (&), this could lead to different interpretation of the request between the proxy and the Python web application. A malicious request could cause the proxy to ignore parameters processed by the web application and cause a response to be cached and returned to other clients accessing the application.
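The mismatch described above can be illustrated with a minimal sketch. It does not use urllib itself; the two splitting functions only model the proxy and the previous urllib behavior, and the query string, the ignored utm_content parameter, and the cache_key helper are hypothetical names chosen for illustration.

```python
import re


def split_only_on_ampersand(query):
    """Model of a proxy that treats only '&' as a separator."""
    return [tuple(part.partition("=")[::2]) for part in query.split("&") if part]


def split_on_ampersand_and_semicolon(query):
    """Model of the previous urllib behavior: both '&' and ';' separate."""
    return [tuple(part.partition("=")[::2]) for part in re.split(r"[&;]", query) if part]


def cache_key(query, ignored=("utm_content",)):
    """Hypothetical cache key: every parameter except the ignored ones."""
    return tuple(p for p in split_only_on_ampersand(query) if p[0] not in ignored)


# The second "id" is cloaked inside a parameter that the proxy ignores.
query = "id=1&utm_content=x;id=evil"

proxy_view = split_only_on_ampersand(query)
app_view = split_on_ampersand_and_semicolon(query)
key = cache_key(query)

print(proxy_view)  # [('id', '1'), ('utm_content', 'x;id=evil')]
print(app_view)    # [('id', '1'), ('utm_content', 'x'), ('id', 'evil')]
print(key)         # (('id', '1'),)
```

In this model the proxy builds its cache key from id=1 only, while the application also sees id=evil. A response influenced by the cloaked parameter could therefore be cached under the key of the harmless request.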

The following supported versions of Python are affected in RHEL 8:

  • Python 3.6 provided by the python36 package from the python36:3.6 module stream and by the platform-python package
  • Python 3.8 provided by the python38 package in the python38:3.8 module stream
  • Python 2.7 provided by the python2 package in the python27:2.7 module stream

The following supported versions of Python are affected in RHSCL:

  • Python 3.8 provided by the rh-python38-python package
  • Python 2.7 provided by the python27-python package

2. Upstream resolution

Python upstream developers addressed the issue by changing the default separator for the urllib.parse.parse_qsl and urllib.parse.parse_qs functions to an ampersand (&) only, whereas previously both ampersand (&) and semicolon (;) acted as separators. The default separator can be overridden in Python code by passing the new separator parameter to the urllib.parse.parse_qsl and urllib.parse.parse_qs functions, but it is impossible to revert to the previous default of accepting both & and ; as separators.

This change was implemented in upstream Python versions 3.9.2, 3.8.8, 3.7.10, 3.6.13, and in the pre-release version 3.10.0a6.
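If you are unsure whether a given Python 3 interpreter already includes the change, one possible check is to look for the separator parameter in the function signature. This is only a sketch: the helper name supports_separator_parameter is hypothetical, and the check applies to Python 3, where the functions live in the urllib.parse module.

```python
import inspect
import urllib.parse


def supports_separator_parameter():
    """Return True if parse_qs accepts the 'separator' keyword argument."""
    parameters = inspect.signature(urllib.parse.parse_qs).parameters
    return "separator" in parameters


if supports_separator_parameter():
    print("This interpreter includes the separator parameter.")
else:
    print("This interpreter does not include the separator parameter.")
```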

3. Red Hat Enterprise Linux and Red Hat Software Collections resolution

The urllib.parse.parse_qsl and urllib.parse.parse_qs functions in Python versions released previously in RHEL and RHSCL used both ampersand (&) and semicolon (;) as query parameter separators, and thus are vulnerable to CVE-2021-23336.

To address this security vulnerability, the upstream change of the default separator to an ampersand (&) has been backported to Python in the following advisories:

Product    Python version    Changed in
RHEL 8     Python 3.6        RHSA-2021:1633
RHEL 8     Python 3.8        RHSA-2021:4162
RHEL 8     Python 2.7        RHSA-2021:4151
RHSCL      Python 3.8        RHSA-2021:3254
RHSCL      Python 2.7        RHSA-2021:3252

Note that Python 3.9.2, available with RHEL 8.4 in the RHEA-2021:1919 advisory, already includes the new default separator (&) and follows the upstream resolution.

The change of the default separator is potentially backwards incompatible, therefore Red Hat provides a way to configure the behavior in Python packages where the default separator has been changed. In addition, the affected urllib parsing functions issue a warning if they detect that a customer’s application has been affected by the change.

3.1 Warning when your application has been affected

In the updated Python packages, the urllib.parse.parse_qsl and urllib.parse.parse_qs functions issue the following warning if they encounter a semicolon (;) in the input query string:

The default separator of urllib.parse.parse_qsl and parse_qs was changed to '&' to avoid a web cache poisoning issue (CVE-2021-23336).  By default, semicolons no longer act as query field separators.  See https://access.redhat.com/articles/5860431 for more details.

To suppress the warning, use one of the approaches described in sections 3.2 and 3.3.

3.2 Specifying the default separator when calling the affected urllib parsing functions

With the updated Python packages, when calling the urllib.parse.parse_qsl and urllib.parse.parse_qs functions, you can explicitly set the separator parameter to:

  • An ampersand (&)
  • A semicolon (;)
  • Any other single character

For example:

import urllib.parse
# Split on ampersands only; the semicolon stays part of the value of "b".
urllib.parse.parse_qs("a=1&b=2;c=3", separator="&")

Specifying the separator parameter will suppress the warning.

Note that when using the separator parameter with the urllib parsing functions, it is impossible to select the legacy behavior of using both & and ; as separators.
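To make the effect of the separator parameter concrete, the following sketch parses the same query string with each of the two separators. It assumes a Python 3 interpreter in which the separator parameter is available.

```python
from urllib.parse import parse_qs

query = "a=1&b=2;c=3"

# Split on ampersands only: the semicolon stays inside the value of "b".
by_ampersand = parse_qs(query, separator="&")

# Split on semicolons only: the ampersand stays inside the value of "a".
by_semicolon = parse_qs(query, separator=";")

print(by_ampersand)  # {'a': ['1'], 'b': ['2;c=3']}
print(by_semicolon)  # {'a': ['1&b=2'], 'c': ['3']}
```

Neither call yields the three separate parameters a, b, and c, which is the result the previous default would have produced.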

3.3 Configuring the default query separator for existing applications

For cases where it is not feasible to modify application code, Python packages where the default separator has been changed provide several ways to configure the default query separator and suppress the warning. Each of these approaches lets you choose one of the following three options:

  • & - will explicitly set the new default separator to ampersand
  • ; - will set the default separator to a semicolon
  • legacy - this string will revert the new default to the previous behavior of splitting on both ampersand and semicolon

Note that these configuration options are available only for the updated Python packages where the default separator has been changed, not for Python 3.9, which was already released with the new default and follows the upstream resolution.

3.3.1. Configuration file

Create the /etc/python/urllib.cfg file with the following content. Note that you might have to create the /etc/python/ directory first.

[parse_qs]
PYTHON_URLLIB_QS_SEPARATOR=legacy

This will globally configure all the affected Python versions (3.6, 3.8, 2.7) on your system to the selected choice: legacy (shown in the example), &, or ;.
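To check which value a configuration file of this format contains, a small helper along the following lines could be used. This is a sketch only: the read_configured_separator function is hypothetical, it relies on the standard configparser module, and it makes no claim about how the Python packages themselves read the file. The demonstration writes a temporary file rather than touching /etc/python/.

```python
import configparser
import os
import tempfile


def read_configured_separator(path="/etc/python/urllib.cfg"):
    """Return the configured separator, or None if nothing is configured."""
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        return None  # the file is missing or unreadable
    return parser.get("parse_qs", "PYTHON_URLLIB_QS_SEPARATOR", fallback=None)


# Demonstration against a temporary copy of the example configuration.
with tempfile.TemporaryDirectory() as directory:
    example_path = os.path.join(directory, "urllib.cfg")
    with open(example_path, "w") as handle:
        handle.write("[parse_qs]\nPYTHON_URLLIB_QS_SEPARATOR=legacy\n")
    found = read_configured_separator(example_path)
    missing = read_configured_separator(os.path.join(directory, "absent.cfg"))

print(found)    # legacy
print(missing)  # None
```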

3.3.2. Environment variable

You can configure the separator using an environment variable, for example:

PYTHON_URLLIB_QS_SEPARATOR='legacy' ./your_application.py

Alternatively, you can export the variable so that it is visible to any new process in your environment:

export PYTHON_URLLIB_QS_SEPARATOR='legacy'
...
./your_application.py

You can use any of the three options: legacy (shown in the example), &, or ;. The environment variable takes priority over the configuration file.
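If the application is started from another Python program rather than from a shell, the variable can be passed in the environment of the child process. The following sketch assumes a hypothetical run_with_separator helper. The child command used here only prints the variable to show that it was passed on. The variable affects query parsing only in the updated Python packages described in this article.

```python
import os
import subprocess
import sys


def run_with_separator(argv, separator="legacy"):
    """Run a command with PYTHON_URLLIB_QS_SEPARATOR set in its environment."""
    env = dict(os.environ)
    env["PYTHON_URLLIB_QS_SEPARATOR"] = separator
    return subprocess.run(argv, env=env, check=True,
                          capture_output=True, text=True)


# Stand-in for ./your_application.py: a child that prints the variable.
child_code = "import os; print(os.environ['PYTHON_URLLIB_QS_SEPARATOR'])"
result = run_with_separator([sys.executable, "-c", child_code], "legacy")
print(result.stdout.strip())  # legacy
```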

3.3.3. Configuration in Python

You can configure the default separator in the code of your Python application, or globally in the sitecustomize.py file in your site-packages directory. Include this code:

import urllib.parse
urllib.parse._default_qs_separator = "legacy"

You can use any of the three options: legacy (shown in the example), &, or ;. Configuration in Python takes priority over both the environment variable and the configuration file.
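The priority order of the three configuration methods can be summarized in a short model. This is not the code used in the Python packages; the effective_separator function and its argument names are hypothetical and only restate the precedence described in this article, with an ampersand as the default when nothing is configured.

```python
def effective_separator(python_setting=None, env_value=None, config_value=None):
    """Model of the documented precedence: Python code, then the
    environment variable, then the configuration file."""
    for candidate in (python_setting, env_value, config_value):
        if candidate is not None:
            return candidate
    return "&"  # new default when nothing is configured


print(effective_separator(config_value="legacy"))                 # legacy
print(effective_separator(env_value=";", config_value="legacy"))  # ;
print(effective_separator(python_setting="&", env_value=";"))     # &
print(effective_separator())                                      # &
```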

4. Fedora resolution

The main Python 3 interpreter and all alternate Python versions 3.6 and higher have been updated in all supported Fedora releases, starting with Fedora 32. In all the updated Python packages, the default separator has been changed to an ampersand (&). In Python 3, the default separator can be changed only by passing the new separator parameter when calling the urllib.parse.parse_qsl and urllib.parse.parse_qs functions in Python code.

Python 3.5 in Fedora has not been updated and remains affected by CVE-2021-23336, because it is scheduled to be removed from Fedora 35 onwards.

Because of the discontinued upstream development of Python 2, the same conservative approach as in RHEL 8 has been used for Python 2.7 in all supported Fedora releases. The default separator has been changed to an ampersand (&). It can be changed by passing the new separator parameter when calling the affected urllib parsing functions or by using the RHEL configuration mechanism, and a warning is issued when an application has been affected.
