Serialization is a feature of programming languages that allows the state of in-memory objects to be represented in a standard format, which can be written to disk or transmitted across a network. Java includes powerful serialization capabilities as a core feature of the language. Any class that implements the java.io.Serializable interface can be serialized and deserialized, with Java handling the plumbing automatically. Serialization is now widely used in Java applications as a mechanism for transferring objects over the network, for example via RESTful interfaces. When an application deserializes arbitrary content supplied by an attacker, a variety of unexpected security impacts can result. This article is the first part of a two-part series, focusing on security issues related to binary deserialization (that is, Java's native serialization format).
Any classes that implement the java.io.Serializable interface can be serialized and deserialized automatically. Classes can override the default serialization and deserialization process by implementing special methods. For deserialization, there are two methods:
void readObject(java.io.ObjectInputStream in)
which is called when an instance of the class is deserialized. The serialized representation of the instance should be read from the in stream.
Object readResolve()
which is called after an instance of the class is deserialized, and can be used to replace the object read from the stream. An example use of this method is to enforce a singleton pattern by replacing any deserialized object with the singleton instance.
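As a rough illustration of these two hooks, the sketch below shows a singleton that implements both. The class names Registry and RegistryDemo, and the roundTrip helper, are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical singleton, for illustration only.
final class Registry implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Registry INSTANCE = new Registry();

    private Registry() { }

    // Invoked by ObjectInputStream while an instance is being deserialized.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restore fields using the default mechanism
    }

    // Invoked after deserialization; the returned object replaces the one just read.
    private Object readResolve() {
        return INSTANCE;
    }
}

final class RegistryDemo {
    // Serializes an object to a byte array and reads it back.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object copy = roundTrip(Registry.INSTANCE);
        System.out.println(copy == Registry.INSTANCE); // true: readResolve swapped in the singleton
    }
}
```

Without readResolve, the round trip would yield a second Registry instance, because deserialization does not go through the private constructor.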
Implementing these special deserialization methods can lead to security issues. If an application deserializes arbitrary user-supplied content, an attacker can provide serialized instances of any serializable class on the application's classpath, with values of the attacker's choosing assigned to the member variables. Access modifiers such as private offer no protection: the attacker controls the serialized bytes, and so can set arbitrary values for private member variables. Security issues may arise when the logic in the readObject and readResolve methods does not take this into account.
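The point about private members can be made concrete with a small sketch. The Percentage class and the ForgedStreamDemo helpers below are hypothetical. The forge method assumes the layout that the default serialization format uses for a class with a single int field and no custom writeObject: the field value occupies the final four bytes of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

// Hypothetical class: the constructor enforces 0..100, but deserialization does not.
final class Percentage implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int value;

    Percentage(int value) {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("out of range: " + value);
        }
        this.value = value;
    }

    int value() { return value; }
}

final class ForgedStreamDemo {
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // Overwrites the trailing four bytes, which hold the int field of this simple class.
    static byte[] forge(byte[] legitimate, int newValue) {
        byte[] forged = legitimate.clone();
        ByteBuffer.wrap(forged, forged.length - 4, 4).putInt(newValue);
        return forged;
    }

    public static void main(String[] args) throws Exception {
        byte[] forged = forge(serialize(new Percentage(50)), 1000);
        Percentage p = (Percentage) deserialize(forged);
        System.out.println(p.value()); // 1000: the constructor check never ran
    }
}
```

The constructor is bypassed entirely, so an invariant that looks airtight in the source code does not hold for deserialized instances.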
For this reason, it is generally considered insecure for an application to deserialize arbitrary user-supplied content. There is even a Common Weakness Enumeration ID assigned to this category of flaw: CWE-502: Deserialization of Untrusted Data.
Example: Apache Commons FileUpload (CVE-2013-2186)
Let's examine a recent example of this type of issue. Apache Commons FileUpload is a component that simplifies handling of file uploads in Java-based web applications. It contains a DiskFileItem class, which is serializable and implements a custom readObject method. The readObject method creates a temporary file on the disk, and writes the content represented by the DiskFileItem object to that temporary file. The file is created with the following call:
tempFile = new File(tempDir, tempFileName);
The value of tempDir is read from the repository private member variable of the class, and prior to patching this flaw, no security checks were applied to this value. This exposed a poison null byte vulnerability (CWE-626: Null Byte Interaction Error). An attacker could provide a serialized instance of DiskFileItem whose repository variable holds a full file path terminated by a null character.
The values of tempDir and tempFileName would be concatenated inside the File class. When creating the file on disk, the full file path would be passed down to native libraries, which interpret a null character as a string terminator. Everything after the null character, including the intended temporary file name, would therefore be ignored, and a file would be created using the full path specified in the repository variable. As a result, an attacker could write arbitrary files to any location writable by the user running the target application. Newer versions of the JDK mitigate poison null byte attacks automatically, but at the time of reporting this flaw, patched JDKs were not widely deployed.
Where Lies the Security Flaw?
Note that exploitation of the DiskFileItem flaw relies on an application performing deserialization of untrusted data, with DiskFileItem on the classpath. Does the flaw lie in the application performing deserialization of untrusted data, which in isolation is not a security concern? Or does it lie in DiskFileItem, which is not vulnerable unless an application is performing deserialization of untrusted data? This question does not yet have a consensus answer in the security community. The Red Hat Security Response Team's view is that a vulnerable serializable class and an application that deserializes untrusted data each expose a security flaw. We therefore assigned CVE-2013-2186 to the DiskFileItem flaw. This view is not shared by the Apache Commons security team, who viewed the fix as a hardening measure and held that only an application performing deserialization of untrusted data would expose an actual security flaw.
To mitigate deserialization flaws in serializable classes, any custom deserialization code must take into account the fact that attacker-controlled serialized instances may contain arbitrary values for all member variables, regardless of their access modifier.
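A defensive readObject along these lines might look like the following minimal sketch. It is not the actual Commons FileUpload patch. TempItem, its repository field, and the TempItemDemo helpers are hypothetical, and the forged stream is simulated by corrupting the private field through reflection before serializing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;

// Hypothetical class holding a directory path, for illustration only.
final class TempItem implements Serializable {
    private static final long serialVersionUID = 1L;
    private String repository; // private, yet attacker-controlled in a forged stream

    TempItem(String repository) {
        this.repository = requireSafePath(repository);
    }

    String repository() { return repository; }

    // Shared check: reject missing paths and paths containing a null character.
    private static String requireSafePath(String path) {
        if (path == null || path.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("invalid repository path");
        }
        return path;
    }

    // Re-validate once the fields are restored; never assume the constructor ran.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        try {
            requireSafePath(repository);
        } catch (IllegalArgumentException e) {
            throw new InvalidObjectException(e.getMessage());
        }
    }
}

final class TempItemDemo {
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // Simulates an attacker-crafted stream by overwriting the private field first.
    static byte[] forgedBytes(String badPath) throws Exception {
        TempItem item = new TempItem("/tmp");
        Field field = TempItem.class.getDeclaredField("repository");
        field.setAccessible(true);
        field.set(item, badPath);
        return serialize(item);
    }

    public static void main(String[] args) throws Exception {
        try {
            deserialize(forgedBytes("/etc/target\0"));
        } catch (InvalidObjectException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Routing the constructor and readObject through the same check keeps the two code paths from drifting apart. A real class would likely need further checks specific to how the value is used.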
Mitigating security impacts in applications that intentionally deserialize untrusted data is more complex. Typically, an application will only intend to deserialize instances of a small number of defined classes. However, if the application deserializes untrusted data and only then performs type-checking on the deserialized instances, it is too late: the custom deserialization methods have already been executed. Look-ahead deserialization is a validation technique that allows the content of a serialized stream to be type-checked prior to actual deserialization. This technique is an effective mitigation and is well documented by Pierre Ernst of IBM.
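One common way to sketch look-ahead deserialization is to subclass ObjectInputStream and override resolveClass, which is consulted for each class descriptor in the stream before any instance of that class is created. The example below is a minimal sketch under that assumption, not a reproduction of the implementation referenced above. The names AllowListObjectInputStream and LookAheadDemo are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.Set;

// Rejects any class that is not on the allow-list before it is instantiated.
final class AllowListObjectInputStream extends ObjectInputStream {
    private final Set<String> allowed;

    AllowListObjectInputStream(InputStream in, Set<String> allowed) throws IOException {
        super(in);
        this.allowed = allowed;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        if (!allowed.contains(desc.getName())) {
            throw new InvalidClassException(desc.getName(),
                    "class not allowed for deserialization");
        }
        return super.resolveClass(desc);
    }
}

final class LookAheadDemo {
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    static Object readAllowed(byte[] bytes, Set<String> allowed)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new AllowListObjectInputStream(
                new ByteArrayInputStream(bytes), allowed)) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Set<String> allowed = Collections.singleton("java.util.ArrayList");

        ArrayList<String> names = new ArrayList<>();
        names.add("alice");
        System.out.println(readAllowed(serialize(names), allowed)); // [alice]

        try {
            readAllowed(serialize(new Date()), allowed);
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.classname); // rejected: java.util.Date
        }
    }
}
```

The allow-list must also cover serializable superclasses and the classes of any nested objects, since each appears as its own class descriptor in the stream. On Java 9 and later, java.io.ObjectInputFilter provides built-in filtering along similar lines.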
The ultimate mitigation is to not deserialize untrusted data in the first place, and, where possible, this mitigation should be applied. For example, restlet is a library for building RESTful APIs, which provides a variety of encoding formats, including XML, JSON, and binary serialization. REST APIs built using restlet that accept binary serialization as an encoding format will deserialize arbitrary user-supplied data. This flaw was assigned CVE-2013-4271. To address this flaw, restlet removed support for binary serialization as an encoding format.
Java serialization is a powerful feature, but when used as a format for transporting untrusted, user-supplied data, it can lead to potentially critical security issues. Whether responsibility for these issues lies with vulnerable serializable classes, or with applications deserializing untrusted data, remains an open question in the security community. In the interim, considering these issues to be security flaws in both components is a safe approach.
In the second part of this series, we will look at using XML as a serialization format, and the security issues which can arise from that approach.