Load-time relocation of shared libraries
August 25th, 2011 at 2:47 pm
This article’s aim is to explain how a modern operating system makes it possible to use shared libraries with load-time relocation. It focuses on the Linux OS running on 32-bit x86, but the general principles apply to other OSes and CPUs as well.
Note that shared libraries have many names – shared libraries, shared objects, dynamic shared objects (DSOs), dynamically linked libraries (DLLs – if you’re coming from a Windows background). For the sake of consistency, I will try to just use the name "shared library" throughout this article.
Loading executables
Linux, similarly to other OSes with virtual memory support, loads executables to a fixed memory address. If we examine the ELF header of some random executable, we’ll see an Entry point address:
$ readelf -h /usr/bin/uptime
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
[...] some header fields
Entry point address: 0x8048470
[...] some header fields
This is placed by the linker to tell the OS where to start executing the executable’s code [1]. And indeed if we then load the executable with GDB and examine the address 0x8048470, we’ll see the first instructions of the executable’s .text segment there.
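To make the header field concrete, here is a minimal sketch of pulling the entry point out of the first bytes of a 32-bit little-endian ELF file by hand. The helper name elf32_entry_point is hypothetical (it is not part of any library); the only facts it relies on are the ELF32 header layout, where e_entry sits at byte offset 24, right after e_ident, e_type, e_machine and e_version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of e_entry in a 32-bit ELF header: 16 bytes of e_ident,
   then e_type (2 bytes), e_machine (2 bytes) and e_version (4 bytes). */
enum { E_ENTRY_OFFSET = 24 };

/* Hypothetical helper: extracts the entry point from the first bytes of a
   32-bit little-endian ELF file. Returns 0 on success, -1 if the buffer is
   too short or is not an ELF32 little-endian header. */
int elf32_entry_point(const uint8_t *hdr, size_t len, uint32_t *entry)
{
    assert(hdr != NULL && entry != NULL);

    if (len < (size_t)E_ENTRY_OFFSET + 4)
        return -1;
    /* Magic bytes: 0x7f 'E' 'L' 'F' */
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    /* e_ident[4] == 1 means ELF32, e_ident[5] == 1 means little-endian */
    if (hdr[4] != 1 || hdr[5] != 1)
        return -1;

    /* Assemble the 4-byte little-endian value byte by byte */
    *entry = (uint32_t)hdr[E_ENTRY_OFFSET]
           | ((uint32_t)hdr[E_ENTRY_OFFSET + 1] << 8)
           | ((uint32_t)hdr[E_ENTRY_OFFSET + 2] << 16)
           | ((uint32_t)hdr[E_ENTRY_OFFSET + 3] << 24);
    return 0;
}
```

Fed the first 28 bytes of the uptime binary shown above, this should produce the same 0x8048470 that readelf prints. In real code, the Elf32_Ehdr structure from elf.h is the more natural way to do this.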
What this means is that the linker, when linking the executable, can fully resolve all internal symbol references (to functions and data) to fixed and final locations. The linker does some relocations of its own [2], but eventually the output it produces contains no additional relocations.
Or does it? Note that I emphasized the word internal in the previous paragraph. As long as the executable needs no shared libraries [3], it needs no relocations. But if it does use shared libraries (as do the vast majority of Linux applications), symbols taken from these shared libraries need to be relocated, because of how shared libraries are loaded.
Load-time relocation in action
To see the load-time relocation in action, I will use our shared library from a simple driver executable. When running this executable, the OS will load the shared library and relocate it appropriately.
Curiously, due to the address space layout randomization feature which is enabled in Linux, relocation is relatively difficult to follow, because every time I run the executable, the libmlreloc.so shared library gets placed at a different virtual memory address [9].
This is a rather weak deterrent, however. There is a way to make sense of it all. But first, let’s talk about the segments our shared library consists of:
$ readelf --segments libmlreloc.so
Elf file type is DYN (Shared object file)
Entry point 0x3b0
There are 6 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00000000 0x00000000 0x004e8 0x004e8 R E 0x1000
LOAD 0x000f04 0x00001f04 0x00001f04 0x0010c 0x00114 RW 0x1000
DYNAMIC 0x000f18 0x00001f18 0x00001f18 0x000d0 0x000d0 RW 0x4
NOTE 0x0000f4 0x000000f4 0x000000f4 0x00024 0x00024 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
GNU_RELRO 0x000f04 0x00001f04 0x00001f04 0x000fc 0x000fc R 0x1
Section to Segment mapping:
Segment Sections...
00 .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .eh_frame
01 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
02 .dynamic
03 .note.gnu.build-id
04
05 .ctors .dtors .jcr .dynamic .got
To follow the myglob symbol, we’re interested in the second segment listed here. Note a couple of things:
- In the section to segment mapping at the bottom, segment 01 is said to contain the .data section, which is the home of myglob
- The VirtAddr column specifies that the second segment starts at 0x1f04 and has size 0x10c, meaning that it extends until 0x2010 and thus contains myglob which is at 0x200C.
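The range check in the second bullet can be sketched in a few lines of C. The function name segment_contains is made up for illustration; the numbers in the usage note come straight from the readelf output above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if addr lies in the half-open range [vaddr, vaddr + size). */
bool segment_contains(uint32_t vaddr, uint32_t size, uint32_t addr)
{
    /* A segment must not wrap around the 32-bit address space */
    assert(size <= UINT32_MAX - vaddr);
    /* Written as a subtraction so the comparison itself cannot overflow */
    return addr >= vaddr && addr - vaddr < size;
}
```

With the values from the listing, segment_contains(0x1f04, 0x10c, 0x200c) holds, while 0x2010 (the first address past the segment's file contents) is rejected.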
Now let’s use a nice tool Linux gives us to examine the load-time linking process – the dl_iterate_phdr function, which allows an application to inquire at runtime which shared libraries it has loaded, and more importantly – take a peek at their program headers.
So I’m going to write the following code into driver.c:
#define _GNU_SOURCE
#include <link.h>
#include <stdlib.h>
#include <stdio.h>


static int header_handler(struct dl_phdr_info* info, size_t size, void* data)
{
    printf("name=%s (%d segments) address=%p\n",
            info->dlpi_name, info->dlpi_phnum, (void*)info->dlpi_addr);
    for (int j = 0; j < info->dlpi_phnum; j++) {
         printf("\t\t header %2d: address=%10p\n", j,
             (void*) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr));
         printf("\t\t\t type=%u, flags=0x%X\n",
                 info->dlpi_phdr[j].p_type, info->dlpi_phdr[j].p_flags);
    }
    printf("\n");
    return 0;
}


extern int ml_func(int, int);


int main(int argc, const char* argv[])
{
    dl_iterate_phdr(header_handler, NULL);

    int t = ml_func(argc, argc);
    return t;
}
header_handler implements the callback for dl_iterate_phdr. It will get called for all libraries and report their names and load addresses, along with all their segments. After that, main invokes ml_func, which is taken from the libmlreloc.so shared library.
To compile and link this driver with our shared library, run:
gcc -g -c driver.c -o driver.o
gcc -o driver driver.o -L. -lmlreloc
Running the driver stand-alone we get the information, but for each run the addresses are different. So what I’m going to do is run it under gdb [10], see what it says, and then use gdb to further query the process’s memory space:
$ gdb -q driver
Reading symbols from driver...done.
(gdb) b driver.c:31
Breakpoint 1 at 0x804869e: file driver.c, line 31.
(gdb) r
Starting program: driver
[...] skipping output
name=./libmlreloc.so (6 segments) address=0x12e000
header 0: address= 0x12e000
type=1, flags=0x5
header 1: address= 0x12ff04
type=1, flags=0x6
header 2: address= 0x12ff18
type=2, flags=0x6
header 3: address= 0x12e0f4
type=4, flags=0x4
header 4: address= 0x12e000
type=1685382481, flags=0x6
header 5: address= 0x12ff04
type=1685382482, flags=0x4
[...] skipping output
Breakpoint 1, main (argc=1, argv=0xbffff3d4) at driver.c:31
31 }
(gdb)
Since driver reports all the libraries it loads (even implicitly, like libc or the dynamic loader itself), the output is lengthy and I will just focus on the report about libmlreloc.so. Note that the 6 segments are the same segments reported by readelf, but this time relocated into their final memory locations.
Let’s do some math. The output says libmlreloc.so was placed at virtual address 0x12e000. We’re interested in the second segment, which as we’ve seen in readelf is at offset 0x1f04. Indeed, we see in the output it was loaded to address 0x12ff04. And since myglob is at offset 0x200c in the file, we’d expect it to now be at address 0x13000c.
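The same arithmetic as a tiny C sketch. runtime_address is a hypothetical name; it mirrors what the driver's callback does when it adds dlpi_addr to p_vaddr.

```c
#include <assert.h>
#include <stdint.h>

/* Runtime address of something that sits at link-time address vaddr inside
   a shared library mapped at load_base. */
uint32_t runtime_address(uint32_t load_base, uint32_t vaddr)
{
    /* The sum must still fit in the 32-bit address space */
    assert(vaddr <= UINT32_MAX - load_base);
    return load_base + vaddr;
}
```

Plugging in the load address 0x12e000 gives 0x12ff04 for the start of the second segment (0x1f04) and 0x13000c for myglob (0x200c).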
So, let’s ask GDB:
(gdb) p &myglob
$1 = (int *) 0x13000c
Excellent! But what about the code of ml_func which refers to myglob? Let’s ask GDB again:
(gdb) set disassembly-flavor intel
(gdb) disas ml_func
Dump of assembler code for function ml_func:
0x0012e46c <+0>: push ebp
0x0012e46d <+1>: mov ebp,esp
0x0012e46f <+3>: mov eax,ds:0x13000c
0x0012e474 <+8>: add eax,DWORD PTR [ebp+0x8]
0x0012e477 <+11>: mov ds:0x13000c,eax
0x0012e47c <+16>: mov eax,ds:0x13000c
0x0012e481 <+21>: add eax,DWORD PTR [ebp+0xc]
0x0012e484 <+24>: pop ebp
0x0012e485 <+25>: ret
End of assembler dump.
As expected, the real address of myglob was placed in all the mov instructions referring to it, just as the relocation entries specified.
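Here is a rough sketch of the kind of patching a loader has to do for an R_386_32 entry: read the 32-bit word at the relocated offset (the addend, zero in our case), add the symbol's runtime address, and store the result back. The function apply_r_386_32 and its helpers are hypothetical names and work on a plain byte buffer; this is not how the real dynamic loader is structured, only the arithmetic it has to perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word from p. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store v at p as a 32-bit little-endian word. */
static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* R_386_32: new word = symbol address + addend, where the addend is the
   value already stored in the word being relocated. */
void apply_r_386_32(uint8_t *image, size_t offset, uint32_t sym_addr)
{
    assert(image != NULL);
    uint32_t addend = read_le32(image + offset);
    write_le32(image + offset, sym_addr + addend);
}
```

Applied to the five bytes a1 00 00 00 00 at offset 1, with myglob's runtime address 0x13000c, the buffer becomes a1 0c 00 13 00, which is the mov eax,ds:0x13000c GDB shows.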
Relocating function calls
So far this article demonstrated relocation of data references – using the global variable myglob as an example. Another thing that needs to be relocated is code references – in other words, function calls. This section is a brief guide on how this gets done. The pace is much faster than in the rest of this article, since I can now assume the reader understands what relocation is all about.
Without further ado, let’s get to it. I’ve modified the code of the shared library to be the following:
int myglob = 42;
int ml_util_func(int a)
{
return a + 1;
}
int ml_func(int a, int b)
{
int c = b + ml_util_func(a);
myglob += c;
return b + myglob;
}
ml_util_func was added and it’s being used by ml_func. Here’s the disassembly of ml_func in the linked shared library:
000004a7 <ml_func>:
4a7: 55 push ebp
4a8: 89 e5 mov ebp,esp
4aa: 83 ec 14 sub esp,0x14
4ad: 8b 45 08 mov eax,DWORD PTR [ebp+0x8]
4b0: 89 04 24 mov DWORD PTR [esp],eax
4b3: e8 fc ff ff ff call 4b4 <ml_func+0xd>
4b8: 03 45 0c add eax,DWORD PTR [ebp+0xc]
4bb: 89 45 fc mov DWORD PTR [ebp-0x4],eax
4be: a1 00 00 00 00 mov eax,ds:0x0
4c3: 03 45 fc add eax,DWORD PTR [ebp-0x4]
4c6: a3 00 00 00 00 mov ds:0x0,eax
4cb: a1 00 00 00 00 mov eax,ds:0x0
4d0: 03 45 0c add eax,DWORD PTR [ebp+0xc]
4d3: c9 leave
4d4: c3 ret
What’s interesting here is the instruction at address 0x4b3 – it’s the call to ml_util_func. Let’s dissect it:
e8 is the opcode for call. The argument of this call is the offset relative to the next instruction. In the disassembly above, this argument is 0xfffffffc, or simply -4. Since the next instruction is at 0x4b8, the call currently targets 0x4b4, which is the address of its own argument. This clearly isn’t right – but let’s not forget about relocation. Here’s what the relocation section of the shared library looks like now:
$ readelf -r libmlreloc.so
Relocation section '.rel.dyn' at offset 0x324 contains 8 entries:
Offset Info Type Sym.Value Sym. Name
00002008 00000008 R_386_RELATIVE
000004b4 00000502 R_386_PC32 0000049c ml_util_func
000004bf 00000401 R_386_32 0000200c myglob
000004c7 00000401 R_386_32 0000200c myglob
000004cc 00000401 R_386_32 0000200c myglob
[...] skipping stuff
If we compare it to the previous invocation of readelf -r, we’ll notice a new entry added for ml_util_func. This entry points at address 0x4b4 which is the argument of the call instruction, and its type is R_386_PC32. This relocation type is more complicated than R_386_32, but not by much.
It means the following: take the value at the offset specified in the entry, add the address of the symbol to it, subtract the address of the offset itself, and place it back into the word at the offset. Recall that this relocation is done at load-time, when the final load addresses of the symbol and the relocated offset itself are already known. These final addresses participate in the computation.
What does this do? Basically, it’s a relative relocation, taking its location into account and thus suitable for arguments of instructions with relative addressing (which the e8 call is). I promise it will become clearer once we get to the real numbers.
I’m now going to build the driver code and run it under GDB again, to see this relocation in action. Here’s the GDB session, followed by explanations:
$ gdb -q driver
Reading symbols from driver...done.
(gdb) b driver.c:31
Breakpoint 1 at 0x804869e: file driver.c, line 31.
(gdb) r
Starting program: driver
[...] skipping output
name=./libmlreloc.so (6 segments) address=0x12e000
header 0: address= 0x12e000
type=1, flags=0x5
header 1: address= 0x12ff04
type=1, flags=0x6
header 2: address= 0x12ff18
type=2, flags=0x6
header 3: address= 0x12e0f4
type=4, flags=0x4
header 4: address= 0x12e000
type=1685382481, flags=0x6
header 5: address= 0x12ff04
type=1685382482, flags=0x4
[...] skipping output
Breakpoint 1, main (argc=1, argv=0xbffff3d4) at driver.c:31
31 }
(gdb) set disassembly-flavor intel
(gdb) disas ml_util_func
Dump of assembler code for function ml_util_func:
0x0012e49c <+0>: push ebp
0x0012e49d <+1>: mov ebp,esp
0x0012e49f <+3>: mov eax,DWORD PTR [ebp+0x8]
0x0012e4a2 <+6>: add eax,0x1
0x0012e4a5 <+9>: pop ebp
0x0012e4a6 <+10>: ret
End of assembler dump.
(gdb) disas /r ml_func
Dump of assembler code for function ml_func:
0x0012e4a7 <+0>: 55 push ebp
0x0012e4a8 <+1>: 89 e5 mov ebp,esp
0x0012e4aa <+3>: 83 ec 14 sub esp,0x14
0x0012e4ad <+6>: 8b 45 08 mov eax,DWORD PTR [ebp+0x8]
0x0012e4b0 <+9>: 89 04 24 mov DWORD PTR [esp],eax
0x0012e4b3 <+12>: e8 e4 ff ff ff call 0x12e49c <ml_util_func>
0x0012e4b8 <+17>: 03 45 0c add eax,DWORD PTR [ebp+0xc]
0x0012e4bb <+20>: 89 45 fc mov DWORD PTR [ebp-0x4],eax
0x0012e4be <+23>: a1 0c 00 13 00 mov eax,ds:0x13000c
0x0012e4c3 <+28>: 03 45 fc add eax,DWORD PTR [ebp-0x4]
0x0012e4c6 <+31>: a3 0c 00 13 00 mov ds:0x13000c,eax
0x0012e4cb <+36>: a1 0c 00 13 00 mov eax,ds:0x13000c
0x0012e4d0 <+41>: 03 45 0c add eax,DWORD PTR [ebp+0xc]
0x0012e4d3 <+44>: c9 leave
0x0012e4d4 <+45>: c3 ret
End of assembler dump.
(gdb)
The important parts here are:
- In the printout from driver we see that the first segment (the code segment) of libmlreloc.so has been mapped to 0x12e000 [11]
- ml_util_func was loaded to address 0x0012e49c
- The address of the relocated offset is 0x0012e4b4
- The call in ml_func to ml_util_func was patched to place 0xffffffe4 in the argument (I disassembled ml_func with the /r flag to show raw hex in addition to disassembly), which is interpreted as the correct offset to ml_util_func.
Obviously we’re most interested in how the last item (the patched call argument) was done. Again, it’s time for some math. Interpreting the R_386_PC32 relocation entry mentioned above, we have:
Take the value at the offset specified in the entry (0xfffffffc), add the address of the symbol to it (0x0012e49c), subtract the address of the offset itself (0x0012e4b4), and place it back into the word at the offset. Everything is done assuming 32-bit two’s complement, of course. The result is 0xffffffe4, as expected.
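The computation can be double-checked with a short C sketch. Both function names are made up for illustration. Unsigned 32-bit arithmetic in C wraps around modulo 2^32, which gives exactly the two's complement behavior needed here.

```c
#include <stdint.h>

/* R_386_PC32: value already stored at the relocated word (addend), plus the
   runtime address of the symbol, minus the runtime address of the word
   itself. */
uint32_t r_386_pc32(uint32_t addend, uint32_t sym_addr, uint32_t place)
{
    return addend + sym_addr - place;
}

/* Target of a 5-byte "e8 rel32" call located at insn_addr: the displacement
   is relative to the address of the next instruction. */
uint32_t call_target(uint32_t insn_addr, uint32_t rel32)
{
    return insn_addr + 5 + rel32;
}
```

r_386_pc32(0xfffffffc, 0x0012e49c, 0x0012e4b4) evaluates to 0xffffffe4, and call_target(0x0012e4b3, 0xffffffe4) lands on 0x0012e49c, the address of ml_util_func. The initial -4 is there because the relocated word sits 4 bytes before the next instruction, which is what the CPU measures the displacement from.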
Extra credit: Why was the call relocation needed?
This is a "bonus" section that discusses some peculiarities of the implementation of shared library loading in Linux. If all you wanted was to understand how relocations are done, you can safely skip it.
When trying to understand the call relocation of ml_util_func, I must admit I scratched my head for some time. Recall that the argument of call is a relative offset. Surely the offset between the call and ml_util_func itself doesn’t change when the library is loaded – they both are in the code segment which gets moved as one whole chunk. So why is the relocation needed at all?
Here’s a small experiment to try: go back to the code of the shared library, add static to the declaration of ml_util_func. Re-compile and look at the output of readelf -r again.
Done? Anyway, I will reveal the outcome – the relocation is gone! Examine the disassembly of ml_func – there’s now a correct offset placed as the argument of call – no relocation required. What’s going on?
When tying global symbol references to their actual definitions, the dynamic loader has some rules about the order in which shared libraries are searched. The user can also influence this order by setting the LD_PRELOAD environment variable.
There are too many details to cover here, so if you’re really interested you’ll have to take a look at the ELF standard, the dynamic loader man page and do some Googling. In short, however, when ml_util_func is global, it may be overridden in the executable or another shared library, so when linking our shared library, the linker can’t just assume the offset is known and hard-code it [12]. It makes all references to global symbols relocatable in order to allow the dynamic loader to decide how to resolve them. This is why declaring the function static makes a difference – since it’s no longer global or exported, the linker can hard-code its offset in the code.
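For reference, the variant of the library used in the experiment above would look something like this. The only change from the earlier listing is the static keyword on ml_util_func, which gives the function internal linkage, so it is no longer exported from the shared library.

```c
int myglob = 42;

/* static: visible only inside this translation unit, so it cannot be
   overridden by the executable or by another shared library. */
static int ml_util_func(int a)
{
    return a + 1;
}

int ml_func(int a, int b)
{
    int c = b + ml_util_func(a);
    myglob += c;
    return b + myglob;
}
```

The behavior of ml_func is unchanged; what differs is only whether the call to ml_util_func needs a relocation entry, which readelf -r on the rebuilt library will show.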
Conclusion
Load-time relocation is one of the methods used in Linux (and other OSes) to resolve internal data and code references in shared libraries when loading them into memory. These days, position independent code (PIC) is a more popular approach, and some modern systems (such as x86-64) no longer support load-time relocation.
Still, I decided to write an article on load-time relocation for two reasons. First, load-time relocation has a couple of advantages over PIC on some systems, especially in terms of performance. Second, load-time relocation is IMHO simpler to understand without prior knowledge, which will make PIC easier to explain in the future. (Update 03.11.2011: the article about PIC was published)
Regardless of the motivation, I hope this article has helped to shed some light on the magic going on behind the scenes of linking and loading shared libraries in a modern OS.

[1] For some more information about this entry point, see the section "Digression – process addresses and entry point" of this article.
[2] Link-time relocation happens in the process of combining multiple object files into an executable (or shared library). It involves quite a lot of relocations to resolve symbol references between the object files. Link-time relocation is a more complex topic than load-time relocation, and I won’t cover it in this article.
[3] This can be made possible by compiling all your libraries into static libraries (with ar combining object files instead of gcc -shared), and providing the -static flag to gcc when linking the executable – to avoid linkage with the shared version of libc.
[4] ml simply stands for "my library". Also, the code itself is absolutely non-sensical and only used for purposes of demonstration.
[5] Also called "dynamic linker". It’s a shared object itself (though it can also run as an executable), residing at /lib/ld-linux.so.2 (the last number is the SO version and may be different).
[6] If you’re not familiar with how x86 structures its stack frames, this would be a good time to read this article.
[7] You can provide the -l flag to objdump to add C source lines into the disassembly, making it clearer what gets compiled to what. I’ve omitted it here to make the output shorter.
[8] I’m looking at the left-hand side of the output of objdump, where the raw memory bytes are. a1 00 00 00 00 means mov to eax with operand 0x0, which is interpreted by the disassembler as ds:0x0.
[9] So ldd invoked on the executable will report a different load address for the shared library each time it’s run.
[10] Experienced readers will probably note that I could ask GDB about i shared to get the load-address of the shared library. However, i shared only mentions the load location of the whole library (or, even more accurately, its entry point), and I was interested in the segments.
[11] What, 0x12e000 again? Didn’t I just talk about load-address randomization? It turns out the dynamic loader can be manipulated to turn this off, for purposes of debugging. This is exactly what GDB is doing.
[12] Unless it’s passed the -Bsymbolic flag. Read all about it in the man page of ld.
August 25th, 2011 at 23:29
Thanks for this excellent article, even though I’m not very much into such low level, I always enjoy understanding how “the world” (of a computer) works.
I read your articles with a lot of interest, thanks !
Elias
August 26th, 2011 at 05:44
Elias, thanks for the kind feedback.
August 26th, 2011 at 06:14
Second the previous comment. Great set of articles time and time again.
Thanks
Narayan
August 26th, 2011 at 11:17
I really enjoyed reading the article, very well explained.
August 26th, 2011 at 13:57
Modern, lol. Shared libraries were a shitty idea from Sun to save disk space. It is time for them to just die.
August 26th, 2011 at 15:49
maht,
I did not use the word modern to refer to shared libraries, but rather to the OSes discussed here. Also, I disagree that shared libraries should “just die”.
libc on my Ubuntu weighs over 1.5MB. Sharing this in memory among processes (and on disk among all installed programs – of which there are thousands, so do the math…) is a nice saving. Shared libraries also enable modularization of programs, dynamic plugins and incremental updates of a large application (see the recent buzz on how Google Chrome updates itself, for example).
August 27th, 2011 at 00:15
When relocations take place, is the .text portion of the shared library’s in-memory image modified? E.g. all references in ml_func for myglob are changed from 0x0 to 0x200C. If so, this means the same shared lib has to be loaded into memory multiple times (once for each unrelated process) because each one will want a different base address to be relocated to. It would be nice if every time a newly running process needs the shared lib it could just have the already in-memory copy mmapped into the new process’s virtual address space.
Obviously child processes and threads would inherit the same shared library page since they are created after final load-time linking could occur (and in general get a copy of everything until they do an exec()).
Perhaps PIC lets you share libs across unrelated processes, and I’m just unintentionally reading ahead in your course material?
Thanks for the article.
Jeremy
August 27th, 2011 at 06:32
Jeremy Impson,
Indeed the text segment has to be changed and this is one of the big disadvantages of load-time relocation, for the exact reason you noted.
And yes, I plan to cover this in the PIC article (because PIC doesn’t have this problem).
August 30th, 2011 at 23:16
Firstly, thanks for the article. You write very concisely, which is usually not the case with tech people in my opinion. This is refreshing. One question though, and perhaps this is just my own disconnect, but how does the dynamic linker know when it sees an instruction such as:
a1 00 00 00 00 mov eax,ds:0x0
that a relocation is needed and to check the relocation table for further instructions? Does an opcode such as mov followed by 0’s tell the dynamic linker to perform a reloc table lookup here and patch in the real value (absolute address)?
Thanks.
August 31st, 2011 at 05:52
Freddie,
Thanks for your feedback. Please take a look at the text starting with:
I think it explains how things work. If not, let me know.
August 31st, 2011 at 19:25
Eliben, thanks for your response. I re-read the section you mentioned. I am understanding the 0x0 to be a placeholder until the linker can update those 0’s with an absolute address. Is that correct? My question is really when are those 0’s updated, as soon as the linker loads the shared library? Is there machine code (in the loader perhaps?) that iterates through each entry in the rel.dyn table and resolves and patches in the absolute address at load time? Does this happen before program execution starts?
August 31st, 2011 at 22:33
Freddie,
The dynamic loader handles these relocation entries when the shared library is loaded.
September 1st, 2011 at 06:57
Thank you again. I imagine that ld-.so goes through the .dyn.rel table and fills these in as soon as the needed library is loaded into memory. For instance, if the relocation is a slot in the GOT, say of type GLOB_DAT, it will fill the offset specified with the correct address of that variable or constant. Is that correct? I have seen plenty of examples of this with the PLT and GOT for procedure calls but am a little unclear on how it does that for data elements. Is there a way to step through the linker code doing that at load time? Please feel free to email me offline if this is no longer appropriate for your post. Again, I really appreciate your articulate writing on this subject.
September 1st, 2011 at 08:00
Freddie,
I plan to write more articles on this issue. Specifically, loading of PIC code which is relevant for PLT and GOT. So stay tuned
September 5th, 2011 at 22:34
Why does gcc compile to:
(a3 ...) mov ds:0x0,eax
(a1 ...) mov eax,ds:0x0
Surely the latter is unneeded?
September 6th, 2011 at 07:54
Damion,
Sure, but this is an un-optimized build we’re seeing here (-O0). With all optimizations off, it’s easy to notice very inefficient code generated by the compiler.
September 7th, 2011 at 21:05
This may be nitpicking, but this actually tells the OS where to jump to once the executable is fully loaded, not where to load its code to. The virtual memory address that the OS should load the .text section to is in its section header (here a 64bit executable):
It seems that, on modern Linuxes at least, the entry point indeed often coincides with the start of the .text section, but it doesn’t need to.
And it’s pretty easy to construct a working binary where this isn’t the case, without any dirty tricks:
When executing ./test, it correctly jumps over the int3 instruction and executes the endless loop instead of throwing a trace trap.
Doesn’t distract from the great explanation on load time relocation, though.
September 8th, 2011 at 05:41
Julien,
Thanks – you’re right, of course. I actually covered the entry point issue in some detail here: http://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints/ – so it’s just a typo which I’ll fix.
September 19th, 2011 at 20:12
Excellent. Today is the first time I have hit your blog and am already a big fan of yours
October 5th, 2011 at 17:13
Hi
May I translate this article into traditional Chinese? I am studying the operations of linkers and loaders. This article helps me a lot and I think it will also help other enthusiasts in Taiwan understand the magic.
October 5th, 2011 at 22:19
Mars,
Absolutely! Thanks for letting me know – and share a link once you have the translation.
October 27th, 2011 at 13:57
Hi Eliben
I got this translation done…finally.
Please check
http://reborn2266.blogspot.com/2011/10/load-time-relocation-of-shared.html
Again, great article!!
October 27th, 2011 at 15:02
Mars,
I only understand the code snippets, but it looks great
November 5th, 2011 at 18:16
I get the following error when trying to follow the article:
November 5th, 2011 at 19:14
sol,
Good question
You must be using an x64 machine, where gcc forces PIC for shared libraries by default. I’ve written an article on PIC (also focusing on x86) – look for the link in the Conclusion section of this one.
Stay tuned for my future article on PIC for x64. In the meantime, you can compile with -fno-PIC -mcmodel=large -shared, though the assembly you will see is a bit different from what I present here.
November 8th, 2011 at 07:19
Recall that the program/executable is not relocatable (except with the -pie option), and thus its data addresses have to be bound at link time.
December 3rd, 2011 at 03:25
Great article! Thanks for sharing such information.
Keep it up!
January 19th, 2012 at 21:58
Thanks for taking the time to write this article.
I’m a little confused by the second extra credit section. The shared library is loaded to different places in the program’s address space on different program invocations. If the program references data in the shared library, that data is moved into the program’s data section, enabling the linker to know its address. When the program makes calls to functions in the shared library, though, the addresses of those function can’t be known at link time. If the loader has to patch addresses for function calls into the shared library, why not have it patch addresses for data references as well?
Am I misunderstanding something, or is this just a case of things being the way they are?
January 19th, 2012 at 22:16
It’s funny how often it happens that, no matter how much you think about something before you send it, after you send it you realize you should have thought about it more.
If the operating system wanted to share a copy of a shared library in system memory between multiple applications, each application would need to have its own copy of global variables from the library. Thus the requirement to move data, but not functions, into the program’s data section.
Of course, if the loader is not putting the library at the same location in different programs’ address spaces, each program will need its own copy of the library in system memory since the library will have to be relocated differently for each program.
Right?
January 20th, 2012 at 06:25
Paul,
Addresses of functions in shared libraries are already bound at link-time in the executable (unless these libraries are loaded with dlopen), since the linker knows where in the process’s address space each library will be loaded. Code in the shared library is relocated (or PIC), so there’s no problem with internal code references there. Accessing the same functions from the executable can just use relative calls, no relocation required. Data can’t be accessed in this way (on x86), hence this trick.
For a clearer view, examine a disassembly and the relocation table of an executable referencing a shared lib’s function and global var.
June 20th, 2012 at 17:53
Really Great Article on this topic! Well explained. Thanks!
June 23rd, 2012 at 06:42
Awesome!!! The shared library problem has bothered me for a long time, and now I have finally found a way out here via your excellent explanation. I have to say, it is exciting to know how it works!
To make sure I fully understand, I put two points; if incorrect, please point out.
1. Every library, both static and dynamic, is already assigned address range in the program’s address space when the program/executable is linked. The program knows address of all global variable and functions even though some of them are not presented in the program binary.
2. When the library is loaded, they will be loaded to their specific address ranged and relocation happens inside of the library, particularly the relocation for its global variables in the library.
The question is, when will the shared library be loaded? Should all needed shared libraries be loaded when the program starts?
June 23rd, 2012 at 12:37
Shoufu,
I suggest you re-read the article carefully – I think you’ll find the answers to your questions there.
November 15th, 2012 at 18:53
How are the shared libraries shared between different executables and their invocations if a global variable in a shared library is used in these executables and hence has to be allocated in the program’s address space? Does this mean that the different instances of the global variable have to be in the same address in different executables? Otherwise, how could the shared library’s reference to the variable be the same for every executable and thus shared between the executables?
November 15th, 2012 at 18:57
It seems that it is unrealistic to require the global variable in the shared library to be in the same address in different executables’ address space if the global variable is referenced in them.
November 16th, 2012 at 15:31
Y,
You can share the read-only sections of the SO between executables. For example the code section(s).
November 16th, 2012 at 18:53
Thanks for the reply. I have a lot of fun reading this article. I also glanced a bit into the beginning of the PIC article. It looks like if the code section contains relocated global variables that are also referenced by the executable (outside of the SO), then the code section is not shareable either. Is that right?
December 13th, 2012 at 05:35
Excellent Article! I must say it is very well written and really helpful.
January 4th, 2013 at 15:20
What I didn’t understand is how the ml_func is relocated. I mean the call to ml_func in driver.c.
February 18th, 2013 at 21:58
Thanks a lot for the article!
And I have a question, in the “Extra credit #2”, there is “If we examine ml_func in GDB, we’ll see the correct reference made to myglob:
0x0012e48e : a1 18 a0 04 08 mov eax,ds:0x804a018”
However, ml_func is in the shared library. There should be no absolute addressing in its text.
The access to myglob within ml_func still goes through GOT of the shared library, but this time, the GOT entry points to a location in the program’s address space (rather than the shared library’s), right?
February 19th, 2013 at 05:15
@yujun,
This article is about load-time relocation. For PIC, read http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/
March 6th, 2013 at 05:26
Very nice article. TFS. A minor typo here, “gcc -o driver driver.o -L. -lmreloc” should be “-lmlreloc”. Also, when relocating functions, in this case ml_util_func(), the offset where the relocation has to be applied contains 0xfffffffc. This is because the offset is calculated relative to EIP, and since EIP points to the address of the next instruction to be executed, we have to subtract 4 from the difference we calculate. Just in case someone was wondering why we have -4 at the offset where relocation has to be applied!
March 19th, 2013 at 11:50
YES. Thank you. This is exactly what I needed to know, explained perfectly.
April 26th, 2013 at 05:57
Thank you very much! Very nice.
In the last section (Referencing shared library data from the executable) you say that the shared library’s reference to myglob will be modified to myglob’s address in the executable’s address space?