Shared libraries and execute permissions

In the few discussions you can find on the web about shared-libraries and execute-permissions, you can find a range of various opinions but not a lot about what goes when you execute a library.

The first thing to consider is how the execute-permission interacts with the dynamic loader. When mapping a library, the dynamic-loader doesn't care about file-permissions; it cares about mapping specific internal parts of the .so. Specifically, it wants to map the PT_LOAD segments as-per the permissions specified by the program-header. A random example:

 $ readelf --segments /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0

Elf file type is DYN (Shared object file)
Entry point 0x77c00
There are 7 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000001b5ea4 0x00000000001b5ea4  R E    200000
  LOAD           0x00000000001b6b60 0x00000000003b6b60 0x00000000003b6b60
                 0x0000000000028dc4 0x000000000002c998  RW     200000

The permissions to load the code and data segments are given the Flags output. Code has execute-permissions, and data has write-permissions. These flags are mapped into flags for the mmap call, which the loader then uses to map the various segments of the file into memory.

So, do you actually need execute-permissions on the underlying file to mmap that segment as executable? No, because you can read it. If I can read it, then I can copy the segment to another area of memory I have already mapped with PROT_EXEC and execute it there anyway.

Googling suggests that some systems do require execute-permissions on a file if you want to directly mmap pages from it with PROT_EXEC (and if you dig into the kernel source, there's an ancient system call uselib that looks like it comes from a.out days, given it talks about loading libraries at fixed addresses, that also wants this). This doesn't sound like a terrible hardening step; I wouldn't be surprised if some hardening patches require it. Maximum compatability and historical-features such as a.out also probably explains why gcc creates shared libraries with execute permissions by default.

Thus, should you feel like it, you can run a shared-library. Something trivial will suffice:

int function(void) {
      return 100;
}
$ gcc -fPIC -shared -o libfoo.so foo.c

$ ./libfoo.so
Segmentation fault

This is a little more interesting (to me anyway) to dig into. At a first pass, why does this even vaguely work? That's easy -- an ELF file is an ELF file, and the kernel is happy to map those PT_LOAD segments in and jump to the entry point for you:

$ readelf --segments ./libfoo.so

Elf file type is DYN (Shared object file)
Entry point 0x570
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x000000000000070c 0x000000000000070c  R E    200000
  LOAD           0x0000000000000710 0x0000000000200710 0x0000000000200710
                 0x0000000000000238 0x0000000000000240  RW     200000

What's interesting here is that the shared-library has an entry point (e_entry) at all. ELF defines it as:

This member gives the virtual address to which the system first transfers control, thus starting the process. If the file has no associated entry point, this member holds zero.

First things first, where did that entry point come from? The ld manual tells us that the linker will set the entry point based upon the following hierarchy:

  • the -e entry command-line option;
  • the ENTRY(symbol) command in a linker script;
  • the value of a target specific symbol
  • the address of the first byte of the .text section, if present;
  • The address 0.

We know we're not specifying an entry point. The ENTRY command is interesting; we can check our default-link script to see if that is specified:

$ ld --verbose | grep ENTRY
ENTRY(_start)

Interesting. Obviously we didn't specify a _start; so do we have one? A bit of digging leads to the crt files, for C Run-Time. These are little object files automatically linked in by gcc that actually do the support-work to get a program to the point that main is ready to run.

So, if we go and check out the crt files, one can find a definition of _start in crt1.o

$ nm /usr/lib/x86_64-linux-gnu/crt1.o
crt1.o:
0000000000000000 R _IO_stdin_used
0000000000000000 D __data_start
                 U __libc_csu_fini
                 U __libc_csu_init
                 U __libc_start_main
0000000000000000 T _start
0000000000000000 W data_start
                 U main

But do we have that for our little shared-library? We can get a feel for what gcc is linking in by examining the output of -dumpspecs. Remembering gcc is mostly just a driver that calls out to other things, a``specs`` file is what gcc uses to determine which arguments pass around to various stages of a compile:

$ gcc -dumpspecs
...
*startfile:
%{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}}
 crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}

The format isn't really important here (of course you can read about it); but the gist is that various flags, such as -static or -pie get passed different run-time initailisation helpers to link-in. But we can see that if we're creating a shared library we won't be getting crt1.o. We can double-confirm this by checking the output of gcc -v (cut down for clarity).

$ gcc -v -fPIC -shared -o libfoo.so foo.c
Using built-in specs.
 ...
/usr/lib/gcc/x86_64-linux-gnu/4.4.5/collect2 -shared -o libfoo.so
/usr/lib/gcc/x86_64-linux-gnu/4.4.5/../../../../lib/crti.o
/usr/lib/gcc/x86_64-linux-gnu/4.4.5/crtbeginS.o
/tmp/ccRpsQU3.o
/usr/lib/gcc/x86_64-linux-gnu/4.4.5/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/4.4.5/../../../../lib/crtn.o

So this takes us further down ld's entry-point logic to pointing to the first bytes of .text, which is where the entry-point comes from. So that solves the riddle of the entry point.

There's one more weird thing you notice when you run the library, which is the faulting address in kern.log:

libfoo.so[8682]: segfault at 1 ip 0000000000000001 sp 00007fffcd63ec48 error 14 in libfoo.so[7f54c51fa000+1000]

The first thing is decoding error; 14 doesn't seem to have any relation to anything. Of course everyone has the Intel 64 Architecture Manual (or mm/fault.c that also mentions the flags) to decode this into 1110 which means "no page found for a user-mode write access with reserved-bits found to be set" (there's another post in all that someday!).

So why did we segfault at 0x1, which is an odd address to turn up? Let's disassemble what actually happens when this starts.

00000000000004a0 <call_gmon_start>:
 4a0:   48 83 ec 08             sub    $0x8,%rsp
 4a4:   48 8b 05 2d 03 20 00    mov    0x20032d(%rip),%rax        # 2007d8 <_DYNAMIC+0x190>
 4ab:   48 85 c0                test   %rax,%rax
 4ae:   74 02                   je     4b2 <call_gmon_start+0x12>
 4b0:   ff d0                   callq  *%rax
 4b2:   48 83 c4 08             add    $0x8,%rsp
 4b6:   c3                      retq

We're moving something in rax and testing it; if true we call that value, otherwise skip and retq. In this case, objdump is getting a bit confused telling us that 2007d8 is related to _DYNAMIC; in fact we can check the relocations to see it's really the value of __gmon_start__:

$ readelf --relocs ./libfoo.so

Relocation section '.rela.dyn' at offset 0x3f0 contains 4 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200810  000000000008 R_X86_64_RELATIVE                    0000000000200810
0000002007d8  000200000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
0000002007e0  000300000006 R_X86_64_GLOB_DAT 0000000000000000 _Jv_RegisterClasses + 0
0000002007e8  000400000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0

Relocation section '.rela.plt' at offset 0x450 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200808  000400000007 R_X86_64_JUMP_SLO 0000000000000000 __cxa_finalize + 0

Thus call_gmon_start, rather unsurprisingly, checks the value of __gmon_start__ and calls it if it is set. Presumably this is set as part of profiling and called during library initialisation -- but it is clearly not an initialiser by itself. The retq ends up popping a value off the stack and jumping to it, which in this case just happens to be 0x1 -- which we can confirm with gdb by putting a breakpoint on the first text address and examining the stack-pointer:

(gdb) x/2g $rsp
0x7fffffffe7d8:        0x0000000000000000      0x0000000000000001

So that gives us our ultimate failure.

Of course, if you're clever, you can get around this and initalise yourself correctly and actually make your shared-library do something when executed. The canonical example of this is libc.so itself:

$ /lib/x86_64-linux-gnu/libc-2.13.so
GNU C Library (Debian EGLIBC 2.13-37) stable release version 2.13, by Roland McGrath et al.
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
...

You can trace through how this actually does work in the same way as we traced through why the trivial example doesn't work.

If you wondering my opinion on executable-bits for shared-libraries; I would not give them execute permissions. I can't see it does anything but open the door to confusion. However, understanding exactly why the library segfaults the way it does actually ends up being a fun little tour around various parts of the toolchain!