Certificate verification in Python standard library HTTP clients

Background Information

The Python standard library includes multiple modules that provide HTTP client functionality, including httplib, urllib, urllib2, and xmlrpclib. While these modules support HTTPS connections, they traditionally performed no verification of certificates presented by HTTPS servers, and offered no way to easily enable such verification. This could allow Man-In-The-Middle (MITM) attackers to easily hijack HTTPS connections from Python clients to eavesdrop or modify transferred data.

This lack of certificate verification was well known and usually worked around in relevant use cases by having verification implemented in applications or by using different HTTP client libraries that performed certificate verification. The package management tools in Red Hat Enterprise Linux can be used as an example: the Yum package manager used in Red Hat Enterprise Linux 5, 6, and 7 uses the python-pycurl module, a wrapper around the curl/libcurl library, which performs certificate verification; the up2date package manager as used in Red Hat Enterprise Linux 4 and earlier implemented certificate verification using the m2crypto module.

Even though this limitation was well known, many application authors were not aware of it or assumed all expected checks were performed. That led to reports of several security flaws over time, and the assignment of CVE-2014-9365 for the lack of certificate verification in the Python standard library HTTP clients.

Resolution

Upstream Resolution

Python upstream developers decided to address the problem by enabling certificate verification by default. The change was implemented via Python Enhancement Proposal PEP 476 (Enabling certificate verification by default for stdlib http clients), and applied to both the current development branch in version 3.4.3 and the legacy maintenance branch in version 2.7.9. This was a controversial change for the legacy 2.7 branch, as it is known to cause backwards compatibility issues. Deployments that intentionally or unintentionally rely on the old behaviour, which lacks any verification, are expected to break when the updated Python version is used. As a consequence, this may prevent users from adopting patched Python versions.

In an attempt to address these compatibility issues and to provide a smoother transition to the newer safer defaults, Red Hat worked with Python community members to define mechanisms to allow users and administrators to control whether certificate verification should be performed without requiring modification of individual applications. Those mechanisms are described in PEP 493 (HTTPS verification migration tools for Python 2.7).

Red Hat Enterprise Linux 7 Resolution

The Python version included in Red Hat Enterprise Linux 7 was based on upstream version 2.7.5 and hence did not perform certificate verification. The support for PEP 476 (along with the required PEP 466 (Network Security Enhancements for Python 2.7.x)) was first added via RHSA-2015:2101 released as part of Red Hat Enterprise Linux 7.2.

The RHSA-2015:2101 update adds support for PEP 476; however, for backwards compatibility reasons, it disables certificate verification by default. It also implements support for the cert-verification.cfg configuration file described in the "Backporting PEP 476 to earlier Python versions" section of PEP 493. With this support, administrators can enable certificate verification by default.

The support for PEP 493 in Red Hat Enterprise Linux 7 was further extended via RHSA-2016:2586 released as part of Red Hat Enterprise Linux 7.3. The update adds the following features:

  • Environment based configuration, as described in the "Feature: environment based configuration" section of PEP 493. The ssl module now checks the PYTHONHTTPSVERIFY environment variable - if set, its value overrides the settings from cert-verification.cfg. The value of 0 disables certificate verification and any other value enables it. This feature can be used by end users to enable or disable verification for a specific Python program, or a specific invocation of a Python program, without needing to modify the program's source code.

  • Configuration API, as described in the "Feature: Configuration API" section of PEP 493. The ssl._https_verify_certificates() function can be used to enable or disable certificate verification at runtime. This API can be used by program authors to ensure their programs run with verification enabled or disabled regardless of the default system setting.

Certificate verification was enabled by default via RHSA-2017:1868 released as part of Red Hat Enterprise Linux 7.4. Deployments that require certificate verification to remain disabled can change the default via the cert-verification.cfg configuration file. Refer to the "Controlling certificate verification" section below for further details.

Red Hat Software Collections Resolution

The Python version used in the rh-python34 collection is based on upstream version 3.4.2. However, PEP 476 support was backported to this version and included since its first release via RHEA-2015:1058, released as part of the Red Hat Software Collections 2.0, with certificate verification enabled by default. It does not include any support for PEP 493 and therefore it is not possible to disable verification by default via the cert-verification.cfg configuration file. The troubleshooting tips noted below are applicable to this version too.

The Python version included in the python27 collection was originally based on upstream version 2.7.5 and later updated to upstream version 2.7.8, and hence did not perform certificate verification. The support for PEP 476 was first added via RHSA-2016:1166 released as part of Red Hat Software Collections 2.2.

The RHSA-2016:1166 update adds support for PEP 476; however, for backwards compatibility reasons, it disables certificate verification by default. It also implements support for the cert-verification.cfg configuration file described in the "Backporting PEP 476 to earlier Python versions" section of PEP 493. The python27 collection stores the cert-verification.cfg configuration file in the /opt/rh/python27/root/etc/python/ directory rather than the /etc/python/ directory. With this support, administrators can enable certificate verification by default.

The support for PEP 493 in the python27 collection was further extended via RHSA-2017:1162 released as part of Red Hat Software Collections 2.4. That erratum updates Python to version 2.7.13, and therefore enables certificate verification by default. It also adds support for additional features defined in PEP 493: "Feature: environment based configuration" and "Feature: Configuration API". Refer to the "Red Hat Enterprise Linux 7 Resolution" section above for further details about these features.

The Python version used in the python33 collection does not implement PEP 476 or PEP 493 and their support is not expected to be added in future updates.

Controlling and troubleshooting certificate verification

Controlling certificate verification

The Python packages with PEP 476 and PEP 493 support as shipped with Red Hat products allow system administrators to set whether certificate verification should be enabled or disabled by default via an INI-style configuration file: /etc/python/cert-verification.cfg. In this configuration file, the default for HTTP clients in the Python standard library is set using the verify option in the [https] section. The section may look like this:

[https]
verify=enable

Valid values are enable (verification is enabled by default), disable (verification is disabled by default), and platform_default (use the platform-specific default hard-coded in the ssl module). Users are encouraged to test their applications with enable and to use disable only if verification causes problems in their environments, and only until those problems can be resolved (e.g. by ensuring that the certificate authority (CA) used by their systems is configured as trusted, or by modifying applications that should continue running with verification disabled). When the platform_default value is used, the actual default may change as future Python package updates with a different hard-coded default are released.
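
To check which default is currently configured on a system, the file can be read with the standard configparser module. The following is a minimal sketch for inspection only: the helper name read_verify_setting is hypothetical, and treating a missing file or an unrecognised value as None is an assumption of this example, not a description of how the ssl module itself handles those cases.

```python
try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2

# The three values documented for the verify option
VALID_VALUES = ("enable", "disable", "platform_default")


def read_verify_setting(path="/etc/python/cert-verification.cfg"):
    """Return the [https] verify value, or None if it is absent or not recognised."""
    parser = configparser.RawConfigParser()
    # read() silently skips files that do not exist
    parser.read(path)
    try:
        value = parser.get("https", "verify")
    except (configparser.NoSectionError, configparser.NoOptionError):
        return None
    return value if value in VALID_VALUES else None


if __name__ == "__main__":
    print("Configured default: %s" % read_verify_setting())
```

For the python27 collection, the path argument would be the file under /opt/rh/python27/root/etc/python/ instead.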

When using Python versions that support the PYTHONHTTPSVERIFY environment variable, that variable can be used to set the verification default for specific program invocations. Typical use cases include:

  • If the global default configured on the system via cert-verification.cfg is to not perform certificate verification, and some program needs to run with verification enabled (e.g. to test if the program functions correctly before the global default is changed to enable), the Python interpreter can be invoked in the following way to enable verification:

    $ PYTHONHTTPSVERIFY=1 python /path/to/python-program.py
    
  • If the global default configured on the system via cert-verification.cfg is to perform certificate verification, and some program needs to run with verification disabled, the Python interpreter can be invoked in the following way to disable verification:

    $ PYTHONHTTPSVERIFY=0 python /path/to/python-program.py
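
The same per-invocation override can be applied when one Python program launches another. The sketch below is illustrative only: the helper name run_with_verification is hypothetical, and the variable only has an effect when the child interpreter is a Python build that supports PYTHONHTTPSVERIFY.

```python
import os
import subprocess


def run_with_verification(argv, verify):
    """Run argv with PYTHONHTTPSVERIFY set to "1" (verify) or "0" (do not verify)."""
    # Copy the current environment so that only this child process is affected
    env = dict(os.environ)
    env["PYTHONHTTPSVERIFY"] = "1" if verify else "0"
    # Returns the exit status of the child process
    return subprocess.call(argv, env=env)
```

For example, run_with_verification(["python", "/path/to/python-program.py"], verify=True) corresponds to the first command line shown above.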
    

Troubleshooting certificate verification

Once Python is configured to perform certificate verification for HTTPS client connections, some connections may fail because of failed verification. The following short program can be used to demonstrate the most common errors that can be encountered.

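#!/usr/bin/env python
# Usage: python urllib2-test.py URL

import sys

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3

# Request the URL given on the command line;
# urlopen() raises an exception if certificate verification fails
req = urllib2.Request(sys.argv[1], headers={'User-Agent': 'Mozilla/5.0'})
urllib2.urlopen(req)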

The most common error occurs when connecting to an HTTPS server which presents a certificate issued by an unknown CA:

$ python urllib2-test.py https://cdn.redhat.com
...
urllib2.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)>

By default, the Python ssl module uses the system CA certificate bundle - /etc/pki/tls/certs/ca-bundle.crt - shipped as part of the ca-certificates package. Inside corporate intranets, servers commonly use certificates issued by an internal corporate CA rather than by a public Internet CA. Any affected programs should be configured to use the internal CA certificate to be able to successfully verify certificates of such servers. The following methods that do not require any program modifications can be used to make them trust certificates from the corporate CA:

  • Add the CA's certificate to the system certificate bundle. Consult the update-ca-trust(8) manual page for further information on how to add new certificates to the bundle. Note that once it is added, other programs using the system certificate bundle will also trust any certificate issued by that CA. Therefore, this should only be done for a trusted CA.

  • Use the program's environment to specify additional trusted CA certificates for a specific program invocation. The environment variables SSL_CERT_FILE and SSL_CERT_DIR can be used to specify an additional trusted CA certificate or certificates. For example, save the CA certificate (in PEM format) to a file and set its path as the value of SSL_CERT_FILE:

    $ SSL_CERT_FILE=/etc/rhsm/ca/redhat-uep.pem python urllib2-test.py https://cdn.redhat.com/
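
When it is unclear which CA locations a given interpreter actually consults, the ssl.get_default_verify_paths() function can be used to inspect them. The following is a small diagnostic sketch; the helper name ca_locations is hypothetical, and the values reported depend on the OpenSSL build and on the environment of the process.

```python
import ssl


def ca_locations():
    """Return the CA file and directory the ssl module would use by default."""
    paths = ssl.get_default_verify_paths()
    return {
        # Resolved locations; None if the file or directory does not exist
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Names of the environment variables that override the locations
        "cafile_env": paths.openssl_cafile_env,
        "capath_env": paths.openssl_capath_env,
    }


if __name__ == "__main__":
    for name, value in sorted(ca_locations().items()):
        print("%-10s %s" % (name, value))
```

Running this once with and once without SSL_CERT_FILE set shows whether the override is picked up.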
    

Another problem that can be encountered is when a target HTTPS server has a certificate issued by one of the trusted CAs, but it is issued for a different host name. An error such as this one is reported in such a case:

$ python urllib2-test.py https://ev-www.redhat.com.edgekey.net
...
ssl.CertificateError: hostname 'ev-www.redhat.com.edgekey.net' doesn't match either of 'www.redhat.com', 'redhat.com'

One of the host names recorded in the server certificate should be used when connecting to such a server. If none of the listed names can be used, the server should have a proper certificate generated, or certificate verification needs to be disabled in clients connecting to it.
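
A program that needs to report these two failures separately can inspect the exception it receives. The sketch below is a heuristic only: the helper name describe_failure is hypothetical, and it relies on matching the message text, which is an assumption because the exact exception class and wording differ between Python versions.

```python
try:
    from urllib.error import URLError  # Python 3
except ImportError:
    from urllib2 import URLError  # Python 2


def describe_failure(exc):
    """Return a short, heuristic description of an HTTPS verification failure."""
    # urlopen() wraps lower-level errors in URLError; the cause is in .reason
    cause = exc.reason if isinstance(exc, URLError) else exc
    text = str(cause)
    # Check for a host name problem first, as some Python versions report it
    # inside a CERTIFICATE_VERIFY_FAILED message
    if "hostname" in text.lower():
        return "host name mismatch"
    if "CERTIFICATE_VERIFY_FAILED" in text:
        return "untrusted or unknown CA"
    return "other error"
```

A caller would wrap urlopen() in a try block, catch Exception, and pass the caught exception to describe_failure() to decide which of the remedies described above applies.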

Modifying Python programs to control certificate verification

The text above describes methods for controlling certificate verification without modifying Python programs - using the cert-verification.cfg configuration file and environment variables. Python programs can also be modified to apply their own settings for certificate verification regardless of system defaults.

  • Programs directly using ssl.wrap_socket() can specify a file with trusted CA certificates using the ca_certs parameter. This is an alternative to using the SSL_CERT_FILE and SSL_CERT_DIR environment variables. See the ssl module documentation for details.

  • As an alternative to directly using ssl.wrap_socket(), programs can create an SSLContext to store the configuration and data needed by TLS/SSL connections, and use its wrap_socket() method. An SSLContext can be configured with the location of trusted CA certificates (similar to ssl.wrap_socket() described above), but can also be configured to disable server host name checks. See the ssl module documentation for details.

  • HTTP/HTTPS client modules inside the Python standard library now accept SSLContext to allow customization of their default settings for TLS/SSL connections, including certificate verification. See the httplib and urllib2 module documentation for details.

  • The ssl._create_unverified_context() function can be used to create an unverified SSLContext - a context which disables all certificate verification. Such a context can be passed to the httplib or urllib2 modules to disable verification for individual connections, or set as the default context for all subsequent HTTPS connections. See the "Opting Out" section of PEP 476 for details and code examples.

  • When using a Python version that implements the "Configuration API" defined in PEP 493, the ssl._https_verify_certificates() function can be used to control whether a verified or unverified context is used by default for all subsequent HTTPS connections.
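
The options above can be sketched roughly as follows. This is an illustration under stated assumptions, not a recommended implementation: the helper names make_context, fetch and set_process_default are hypothetical, and ssl._https_verify_certificates() is looked up defensively because it only exists in interpreters that implement the PEP 493 configuration API.

```python
import ssl

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2 with PEP 476 support


def make_context(verify=True, cafile=None):
    """Build an SSLContext that verifies certificates, or one that does not."""
    if verify:
        # Verifies the certificate chain and the host name; cafile can point
        # to a PEM file with an internal CA certificate
        return ssl.create_default_context(cafile=cafile)
    # Disables all certificate verification for connections using this context
    return ssl._create_unverified_context()


def fetch(url, verify=True, cafile=None):
    """Fetch a URL using a per-connection context instead of the system default."""
    return urlopen(url, context=make_context(verify, cafile)).read()


def set_process_default(verify):
    """Set the default for all later HTTPS connections, if the API is available."""
    setter = getattr(ssl, "_https_verify_certificates", None)
    if setter is None:
        return False  # this interpreter does not provide the PEP 493 API
    setter(verify)
    return True
```

Passing a context per connection keeps the change local to one call, whereas set_process_default() affects every later HTTPS connection in the process, so the former is usually the narrower choice.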
