The last few weeks have been spent researching initialisation functions, constructors, and how to write and call them.
Now that libkernel and libc are separate entities, they both have some initialisation code that needs to be run. At the very least, these two are going to have to resolve the kernel system call functions so that they can make syscalls. Libc can cheat at this because it provides the start() function that is called first when the program is loaded, so it can take this opportunity to set itself up (this is fairly common amongst C libraries). Libkernel has a slightly tougher job. I'd rather not have the program need to explicitly initialise it, and libc shouldn't assume that libkernel is there, so how do we do it?
Well, GCC already has a way of doing this which it uses for C++ static constructors and for a few rare cases in C, but the details of it are spread around so let's try to make sense of it for our specific situation.
The first port of call is to RTFM, which in this case is
http://gcc.gnu.org/onlinedocs/gccint/Initialization.html
Unfortunately, this article is heavy going, talks in very abstract terms and isn't easy to follow. Searching around for some of the terms used, especially the #defines in the article, doesn't help either.
The few sources I was able to find with information on how this works generally suggest that the ELF file contains a section called .ctors which contains a list of addresses of constructor functions which are to be called at the beginning of the program (and a section called .dtors for destructors to be called at the end of the program). It makes sense that this section should be created by the compiler itself (as the linker won't know which functions need to be in it), so the linker must just combine the pointers from all the different object files.
Step 1 : Compiling
Let's have a look at this first hand. We know that C++ does this in some cases, so let's try this:
int main()
{
asm("int $1");
}
volatile int RandomRoll()
{
asm("int $2");
return 4;
}
int mTest = RandomRoll();
Which can be compiled and linked into a.out with:
g++ hello.cpp -nostdlib
(ignoring the linker warning about the undefined entry symbol.)
This file needs some initialisation in order to work properly, specifically that the program must call RandomRoll() to set the value of mTest before main() is called, so we expect to find some clues in here. The assembly interrupts aren't expected to be run, they are there to help us identify the code when we start disassembling it.
Having a look at the output file with objdump allows us to see what's going on.
> objdump a.out -h
a.out: file format elf32-i386
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000052  08048074  08048074  00000074  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .eh_frame     00000098  080480c8  080480c8  000000c8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .ctors        00000004  08049160  08049160  00000160  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  3 .bss          00000004  08049164  08049164  00000164  2**2
                  ALLOC
  4 .comment      00000011  00000000  00000000  00000164  2**0
                  CONTENTS, READONLY
Ok, that's good, we have a section called .ctors with a size of four bytes which contains data. This is exactly the right amount to hold an address on a 32-bit platform.
Back to objdump to make certain we understand this (I've skipped the data from some of the sections as they weren't relevant):
> objdump a.out -sS
a.out: file format elf32-i386
Contents of section .ctors:
8049160 aa800408 ....
Disassembly of section .text:
08048074 <main>:
8048074: 55 push %ebp
8048075: 89 e5 mov %esp,%ebp
8048077: cd 01 int $0x1
8048079: b8 00 00 00 00 mov $0x0,%eax
804807e: 5d pop %ebp
804807f: c3 ret
08048080 <_Z10RandomRollv>:
8048080: 55 push %ebp
8048081: 89 e5 mov %esp,%ebp
8048083: cd 02 int $0x2
8048085: b8 04 00 00 00 mov $0x4,%eax
804808a: 5d pop %ebp
804808b: c3 ret
0804808c <_Z41__static_initialization_and_destruction_0ii>:
804808c: 55 push %ebp
804808d: 89 e5 mov %esp,%ebp
804808f: 83 7d 08 01 cmpl $0x1,0x8(%ebp)
8048093: 75 13 jne 80480a8 <_Z41__static_initialization_and_destruction_0ii+0x1c>
8048095: 81 7d 0c ff ff 00 00 cmpl $0xffff,0xc(%ebp)
804809c: 75 0a jne 80480a8 <_Z41__static_initialization_and_destruction_0ii+0x1c>
804809e: e8 dd ff ff ff call 8048080 <_Z10RandomRollv>
80480a3: a3 64 91 04 08 mov %eax,0x8049164
80480a8: 5d pop %ebp
80480a9: c3 ret
080480aa <_GLOBAL__sub_I_main>:
80480aa: 55 push %ebp
80480ab: 89 e5 mov %esp,%ebp
80480ad: 83 ec 08 sub $0x8,%esp
80480b0: c7 44 24 04 ff ff 00 movl $0xffff,0x4(%esp)
80480b7: 00
80480b8: c7 04 24 01 00 00 00 movl $0x1,(%esp)
80480bf: e8 c8 ff ff ff call 804808c <_Z41__static_initialization_and_destruction_0ii>
80480c4: c9 leave
80480c5: c3 ret
There's a lot going on here and I'm not sure I wrote all of it. Let's have a look.
We can see our main() and RandomRoll() functions (the RandomRoll name is mangled, which is how C++ encodes a function's signature into its symbol name; not important right now) but we also have two new functions, _GLOBAL__sub_I_main() and _Z41__static_initialization_and_destruction_0ii(). _GLOBAL__sub_I_main() seems to call _Z41__static_init_blah_blah(), which in turn seems to call our RandomRoll() function, so without getting into it too deeply, I'd be happy to assume that _GLOBAL__sub_I_main() is our initialisation function.
At the top of the disassembly is the contents of the .ctors section, which contains that one 32-bit value. Interpreting the little-endian encoding, it is 0x080480aa, which is exactly the memory address of _GLOBAL__sub_I_main().
\ o /
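For anyone who wants to check the byte-swapping by machine rather than by eye, here is a minimal sketch. The names decode_le32 and ctors_dump are made up for this illustration; the four bytes are the ones objdump printed for .ctors above.

```c
#include <stdint.h>

/* Rebuild a 32-bit value from four bytes stored least-significant first,
   which is how i386 lays out the pointers in .ctors. */
static uint32_t decode_le32(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Hypothetical copy of the .ctors dump, in the order objdump showed it. */
static const uint8_t ctors_dump[4] = { 0xaa, 0x80, 0x04, 0x08 };
```

Calling decode_le32(ctors_dump) gives 0x080480aa, the address of the initialisation function in the disassembly.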
At this point, my Google searching failed me. For the life of me I can't figure out how to coerce the C compiler to create a .ctors section and populate it in the way that C++ does (answers on a postcard), however, I have a plan.
At the end of the day, the value that C++ put into the .ctors section is just your average common-or-garden initialised pointer variable, but in a special place, so what about this:
void test_init()
{
asm("int $3");
}
void* __attribute__ ((section (".ctors"))) test_init_ref = test_init;
I have my function, and I have a global variable which points to my function. I've used some GCC specific runes to say that I want my variable to live in a section called .ctors. Let's build this and see if it gives us the desired result (some sections removed for brevity):
>gcc hello.c -nostdlib
>objdump a.out -sS
a.out: file format elf32-i386
Contents of section .ctors:
80490b4 74800408 t...
Disassembly of section .text:
08048074 <test_init>:
8048074: 55 push %ebp
8048075: 89 e5 mov %esp,%ebp
8048077: cc int3
8048078: 5d pop %ebp
8048079: c3 ret
Success! We can see our function, and we can see the entry in .ctors which points to it, just as we saw in C++. It's not the prettiest of solutions, but it will work for now.
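A possibly tidier alternative, which is not the approach used in the rest of this post: GCC documents a constructor function attribute that asks the toolchain to register a function to run before main(). Which section the pointer ends up in (.ctors or .init_array) depends on the compiler version and target, so the startup code would need to walk whichever one is actually emitted. The names init_ran and test_init_attr below are invented for this sketch.

```c
/* Hypothetical flag so we can observe that the constructor ran. */
static int init_ran = 0;

/* GCC/Clang extension: ask the toolchain to register this function
   so that the startup code calls it before main(). */
__attribute__((constructor))
static void test_init_attr(void)
{
    init_ran = 1;
}
```

On a hosted system with a normal C runtime, init_ran is already 1 by the time main() starts. In a freestanding setup like this one, nothing calls the function until the startup code walks the constructor list itself.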
Step 2 : Running
While we've got all of that working, it's only half of the battle: we still need to actually call these constructors from our program.
For the most part, all our program needs to know is where the .ctors array is and how long it is. It can then iterate over the constructors and run each one. (These are function pointers after all, so they will always be relocated as required by the loader. If relocation didn't update function pointers, the program would have more to worry about than constructors.)
One of the other searches I was running while doing this was to investigate how other systems dealt with this problem. While that search yielded nothing of note, it did give me the idea to check out the default GCC link script. Currently my kernel uses a custom link script, but the applications still use the default one, where I found something interesting:
.ctors :
{
/* gcc uses crtbegin.o to find the start of
the constructors, so we make sure it is
first. Because this is a wildcard, it
doesn't matter if the user does not
actually link against crtbegin.o; the
linker won't look for a file to match a
wildcard. The wildcard also means that it
doesn't matter which directory crtbegin.o
is in. */
KEEP (*crtbegin.o(.ctors))
KEEP (*crtbegin?.o(.ctors))
/* We don't want to include the .ctor section from
the crtend.o file until after the sorted ctors.
The .ctor section from the crtend file contains the
end of ctors marker and it must be last */
KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .ctors))
KEEP (*(SORT(.ctors.*)))
KEEP (*(.ctors))
}
So, my link-script-fu is not great, but I can understand enough of this to see that crtbegin.o and crtend.o have special handling when it comes to the .ctors section. Anything for the .ctors section from crtbegin.o goes at the beginning of the output, and anything from crtend.o goes at the end, effectively book-ending the array. (I'll admit I didn't know what crtbegin and crtend were for, but GCC complained that I didn't have them when I created libc, so I put in empty ones.)
This information, coupled with what we just did for shuffling variables into the .ctors section gives me an idea.
I can put a variable in crtend.c like this:
int __attribute__((section(".ctors"))) __libc_internal_ctor_end = 0;
... and then put this in crtbegin.c:
extern int __libc_internal_ctor_end; // <<-- in crtend.o
int __attribute__((section(".ctors"))) __libc_internal_ctor_start = 0;
void call_ctors()
{
// Loop between __libc_internal_ctor_start and __libc_internal_ctor_end
}
The linker will put these two variables at the beginning and end of the .ctors list, and then the call_ctors() function can use them to work out where all the other constructors are. Note that the content of those two variables is completely irrelevant; only their locations in memory are useful.
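Here is a rough sketch of what the loop between the two markers could look like. It is a simulation rather than the real crtbegin.c: an ordinary array (fake_ctors) stands in for the linked .ctors section so the idea can be tried on any hosted compiler. It also assumes every slot is pointer-sized, which is true for the int markers on 32-bit x86. The names ctor_fn, call_ctors_between, fake_ctors, ctor_a, ctor_b and call_log are all invented for the example.

```c
#include <stddef.h>

typedef void (*ctor_fn)(void);

/* Call every constructor stored strictly between the two marker slots.
   Only the markers' addresses matter; their contents are never read. */
static void call_ctors_between(const ctor_fn *start, const ctor_fn *end)
{
    for (const ctor_fn *slot = start + 1; slot < end; ++slot) {
        if (*slot != NULL)   /* defensive: skip an empty entry */
            (*slot)();
    }
}

/* Records the order in which the pretend constructors run. */
static int call_log = 0;
static void ctor_a(void) { call_log = call_log * 10 + 1; }
static void ctor_b(void) { call_log = call_log * 10 + 2; }

/* Stand-in for the linked .ctors section, book-ended the way the
   link script arranges crtbegin.o and crtend.o. */
static const ctor_fn fake_ctors[] = {
    NULL,      /* plays the role of __libc_internal_ctor_start */
    ctor_a,
    ctor_b,
    NULL,      /* plays the role of __libc_internal_ctor_end */
};
```

Calling call_ctors_between(&fake_ctors[0], &fake_ctors[3]) runs ctor_a and then ctor_b. In the real thing the two arguments would be the addresses of the start and end marker variables, cast to the slot type.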
Step 3 : Testing
This blog post is way longer than it was going to be, so I'll be brief and say what I usually say. It failed a bunch of times because I made some stupid mistakes, but after much hammering, it all works.
crt0.s now calls:
- the C Library initialisation function,
- the constructors function in crtbegin.c, and
- the program's main() function.
I also found out along the way that when a library (such as libkernel.a) is linked in, the linker only includes those object files which are actually needed. This gave a minor testing headache because I expected my init() function and its .ctors reference to appear just because the library was linked, but they didn't. That was easily fixed though.