Position Independent Executable (PIE) Performance

December 12, 2012 | Red Hat

This article was originally published on the Red Hat Customer Portal. The information may no longer be current.

Position Independent Executables (PIE) use address randomization as an exploit mitigation technique against return-oriented programming (ROP) attacks. In my previous post I discussed the effects that PIE has on ELF binaries and how they are executed. In this entry I will discuss how I gathered information about program startup times and share some of my findings.

The Linux loader has a great feature that gives you some insight into the actions it takes when a program is executed. I used this feature to measure the impact PIE has on application startup times. I chose startup time because the time spent in the linker resolving symbols is largely out of the programmer's control.

To collect statistics about the loader's performance, you can prefix program execution with LD_DEBUG=statistics. This makes the loader print detailed runtime statistics about its own work. Consider the following example:

$ LD_DEBUG=statistics ./pie-example 
     21180: 
     21180: runtime linker statistics:
     21180:  total startup time in dynamic loader: 714700 clock cycles
     21180:     time needed for relocation: 6958 clock cycles (.9%)
     21180:                number of relocations: 0
     21180:     number of relocations from cache: 0
     21180:       number of relative relocations: 1
     21180:    time needed to load objects: 274946 clock cycles (38.4%)

From this output you can gain an interesting insight into the differences between PIE and standard executables. In my original paper I examined the impact that PIE had on different types of applications, focusing specifically on the time spent in the loader during program startup. I did this by collecting thousands of samples of the statistics output by LD_DEBUG in single-user mode.

One of the commands I looked at was 'sudo'. It is executed regularly and has the setuid bit set, so it serves as a good example. The difference I found in the time spent in the loader, measured in clock cycles, is shown in the figure below:

Figure 1.

The results indicate a clear time shift of approximately 0.5 µs between the standard and PIE versions of sudo. Another key difference is the number of relative relocations that occur in each application, as shown below:

Figure 2.

In my testing this resulted in an average overhead of 16% during the program's startup phase. Given that the test system runs at around 0.357 nanoseconds per clock cycle, the overhead converts to roughly 0.1985 milliseconds, a figure that I would not lose any sleep over; however, some of you might. The key difference here is the number of relative relocations performed by each version of the program. To reduce the overhead we need to reduce this figure.

Revisiting the original sample application, if we want to reduce the number of relative relocations we need to figure out where they are occurring. Examining the ELF binary gives some indication of the cause:

$ readelf -r pie-example 
Relocation section '.rela.dyn' at offset 0x370 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000002005f0  000000000008 R_X86_64_RELATIVE                    0000000000200620
Relocation section '.rela.plt' at offset 0x388 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200610  000200000007 R_X86_64_JUMP_SLO 0000000000000000 quit + 0
000000200618  000300000007 R_X86_64_JUMP_SLO 0000000000000000 puts + 0

The R_X86_64_RELATIVE type indicates a relative relocation; here its addend is 0x200620. The symbol at this address can be found by searching the disassembly of the binary:

$ objdump -D pie-example | grep 200620 -m 1
0000000000200620 <message>:

This relocation is for the message string that we are printing to the screen. The way that the message variable has been declared is actually important when it comes to relative relocations. So what happens if we declare the string differently?

[sourcecode language="cpp"]
...
/* A global, writable pointer to a read-only string literal. */
const char *message = "Hello World";

int main(int argc, const char *argv[], const char *envp[])
{
    puts(message);
    exit(0);
}
[/sourcecode]

The declaration above results in an additional relocation because message is now a global pointer to a read-only string. The pointer itself is stored in the .data segment, so a relocation is needed to fill in the string's address when the program is loaded.

$ readelf -r pie-example 
Relocation section '.rela.dyn' at offset 0x370 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200608  000000000008 R_X86_64_RELATIVE                    0000000000200638
000000200638  000000000008 R_X86_64_RELATIVE                    0000000000000440

Changing the string declaration to static const char message[], to a local variable, or to a preprocessor macro will result in no relative relocations for the PIE implementation. This is because the string "Hello World" will be placed in the read-only data section (.rodata) of the binary. As the name suggests, this content cannot be changed or written to. As a result, the location of the message string can be decided at compile/link time rather than at runtime.

Most programs are likely to have room for this kind of optimization. If you are interested in learning about other optimizations in this area, I strongly recommend Uli Drepper's paper "How to write shared libraries".

I would argue, however, that in most cases such tedious levels of optimization are not necessary. In my testing the performance overhead in program startup ranged from 0.1985 milliseconds to 11 milliseconds, which is minimal when compared with the protection that PIE gives you against attacks based on return-oriented programming. I hope you enjoy my paper and consider using PIE in your project. In future posts I will investigate some of the other security features that GCC provides for hardening executables.
