2013-11-23

2013-11-23 Constructors and Library Initialisation

The last few weeks have been spent researching initialisation functions, constructors, and how to write and call them.

Now that libkernel and libc are separate entities, they both have some initialisation code that needs to be run.  At the very least, these two are going to have to resolve the kernel system call functions so that they can make syscalls.  Libc can cheat at this because it provides the start() function that is called first when the program is loaded, so it can take this opportunity to set itself up (this is fairly common amongst C libraries).  Libkernel has a slightly tougher job.  I'd rather not have the program need to explicitly initialise it, and libc shouldn't assume that libkernel is present, so how do we do it?

Well, GCC already has a way of doing this which it uses for C++ static constructors and for a few rare cases in C, but the details of it are spread around so let's try to make sense of it for our specific situation.

The first port of call is to RTFM, which in this case is http://gcc.gnu.org/onlinedocs/gccint/Initialization.html .  Unfortunately, this article is very heavy going, talks in very abstract terms and isn't easy to follow.  Searching around for some of the terms used, especially the #defines in the article, doesn't help either.

The few sources I was able to find with information on how this works generally suggest that the ELF file contains a section called .ctors which contains a list of addresses of constructor functions which are to be called at the beginning of the program (and a section called .dtors for destructors to be called at the end of the program).  It makes sense that this section should be created by the compiler itself (as the linker won't know which functions need to be in it), so the linker must just combine the pointers from all the different object files.

Step 1 : Compiling


Let's have a look at this first hand.  We know that C++ does this in some cases, so let's try this:

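int main()
{
 // Marker only: lets us spot main() in the disassembly.
 asm("int $1");
}

volatile int RandomRoll()
{
 // Marker only: lets us spot RandomRoll() in the disassembly.
 asm("int $2");
 return 4;
}

// A global with a non-constant initialiser: RandomRoll() must run before main().
int mTest = RandomRoll();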

Which can be compiled and linked into an executable (a.out) with:
g++ hello.cpp -nostdlib
(ignoring the linker's warning about the undefined entry symbol.)

This file needs some initialisation in order to work properly, specifically that the program must call RandomRoll() to set the value of mTest before main() is called, so we expect to find some clues in here.  The assembly interrupts aren't expected to be run, they are there to help us identify the code when we start disassembling it.

Having a look at the resulting a.out with objdump allows us to see what's going on.

> objdump a.out -h

a.out:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000052  08048074  08048074  00000074  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .eh_frame     00000098  080480c8  080480c8  000000c8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .ctors        00000004  08049160  08049160  00000160  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  3 .bss          00000004  08049164  08049164  00000164  2**2
                  ALLOC
  4 .comment      00000011  00000000  00000000  00000164  2**0
                  CONTENTS, READONLY



Ok, that's good, we have a section called .ctors with a size of four bytes which contains data.  This is exactly the right amount to hold an address on a 32-bit platform.

Back to objdump to make certain we understand this (I've skipped the data from some of the sections as they weren't relevant):

> objdump a.out -sS

a.out:     file format elf32-i386

Contents of section .ctors:
 8049160 aa800408                             ....

Disassembly of section .text:

08048074 <main>:
 8048074:       55                      push   %ebp
 8048075:       89 e5                   mov    %esp,%ebp
 8048077:       cd 01                   int    $0x1
 8048079:       b8 00 00 00 00          mov    $0x0,%eax
 804807e:       5d                      pop    %ebp
 804807f:       c3                      ret

08048080 <_Z10RandomRollv>:
 8048080:       55                      push   %ebp
 8048081:       89 e5                   mov    %esp,%ebp
 8048083:       cd 02                   int    $0x2
 8048085:       b8 04 00 00 00          mov    $0x4,%eax
 804808a:       5d                      pop    %ebp
 804808b:       c3                      ret

0804808c <_Z41__static_initialization_and_destruction_0ii>:
 804808c:       55                      push   %ebp
 804808d:       89 e5                   mov    %esp,%ebp
 804808f:       83 7d 08 01             cmpl   $0x1,0x8(%ebp)
 8048093:       75 13                   jne    80480a8 <_Z41__static_initialization_and_destruction_0ii+0x1c>
 8048095:       81 7d 0c ff ff 00 00    cmpl   $0xffff,0xc(%ebp)
 804809c:       75 0a                   jne    80480a8 <_Z41__static_initialization_and_destruction_0ii+0x1c>
 804809e:       e8 dd ff ff ff          call   8048080 <_Z10RandomRollv>
 80480a3:       a3 64 91 04 08          mov    %eax,0x8049164
 80480a8:       5d                      pop    %ebp
 80480a9:       c3                      ret

080480aa <_GLOBAL__sub_I_main>:
 80480aa:       55                      push   %ebp
 80480ab:       89 e5                   mov    %esp,%ebp
 80480ad:       83 ec 08                sub    $0x8,%esp
 80480b0:       c7 44 24 04 ff ff 00    movl   $0xffff,0x4(%esp)
 80480b7:       00
 80480b8:       c7 04 24 01 00 00 00    movl   $0x1,(%esp)
 80480bf:       e8 c8 ff ff ff          call   804808c <_Z41__static_initialization_and_destruction_0ii>
 80480c4:       c9                      leave
 80480c5:       c3                      ret

There's a lot going on here and I'm not sure I wrote all of it.  Let's have a look.

We can see our main() and RandomRoll() functions (not sure why the name is munged, not important right now) but we also have two new functions, _GLOBAL__sub_I_main() and _Z41__static_initialization_and_destruction_0ii().  _GLOBAL__sub_I_main() seems to call _Z41__static_init_blah_blah(), which in turn seems to call our RandomRoll() function and store the result at 0x8049164 (the lone four-byte variable in .bss, presumably mTest), so without getting into it too deeply, I'd be happy to assume that _GLOBAL__sub_I_main() is our initialisation function.

At the top of the disassembly is the contents of the .ctors section, which contains that one 32-bit value.  Interpreting the little-endian encoding, it is 0x080480aa, which is exactly the address of _GLOBAL__sub_I_main().
\ o /
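
As a quick sanity check of that byte-swapping, here is a minimal C sketch that decodes the four bytes objdump printed for .ctors.  The names read_le32 and ctors_entry are made up for this example; they are not part of the toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode four bytes stored least-significant-first (little-endian),
   which is how they appear in objdump's dump of the .ctors section. */
uint32_t read_le32(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* The bytes "aa 80 04 08" from the .ctors dump above. */
static const unsigned char ctors_entry[4] = { 0xaa, 0x80, 0x04, 0x08 };
```

Feeding ctors_entry to read_le32() gives the address used in the text; the same function works for any other four-byte entry in the section.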

At this point, my Google searching failed me.  For the life of me I can't figure out how to coerce the C compiler to create a .ctors section and populate it in the way that C++ does (answers on a postcard), however, I have a plan.

At the end of the day, the value that C++ put into the .ctors section is just your average common-or-garden initialised pointer variable, but in a special place, so what about this:

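void test_init(void)
{
 /* Marker only: lets us spot test_init() in the disassembly. */
 asm("int $3");
}

/* An ordinary initialised function pointer, placed in .ctors by hand. */
void (*test_init_ref)(void) __attribute__ ((section (".ctors"))) = test_init;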

I have my function, and I have a global variable which points to my function.  I've used some GCC-specific runes to say that I want my variable to live in a section called .ctors.  Let's build this and see if it gives us the desired result (some sections removed for brevity):

>gcc hello.c -nostdlib
>objdump a.out -sS

a.out:     file format elf32-i386

Contents of section .ctors:
 80490b4 74800408                             t...

Disassembly of section .text:

08048074 <test_init>:
 8048074:       55                      push   %ebp
 8048075:       89 e5                   mov    %esp,%ebp
 8048077:       cc                      int3
 8048078:       5d                      pop    %ebp
 8048079:       c3                      ret

Success!  We can see our function, and we can see the entry in .ctors which points to it, just as we saw in C++.  It's not the prettiest of solutions, but it will work for now.
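
For comparison, GCC also documents a "constructor" function attribute that asks the compiler to register a function in the constructor table itself.  Which section the pointer ends up in depends on how the toolchain is configured (.ctors on some, .init_array on others), and it still relies on the startup code walking that table, so treat the following as a sketch rather than a drop-in replacement for the hand-placed pointer.  The names early_init, early_init_count and early_init_ran are made up for the example.

```c
#include <assert.h>

static int early_init_count = 0;

/* The `constructor` attribute asks GCC to emit a pointer to this function
   into the constructor table (.ctors or .init_array, depending on the
   toolchain), so no hand-written pointer variable is needed. */
__attribute__((constructor))
static void early_init(void)
{
    early_init_count++;
}

/* Returns non-zero once the startup code has run early_init(). */
int early_init_ran(void)
{
    assert(early_init_count <= 1);  /* it should run at most once */
    return early_init_count == 1;
}
```

On a hosted system with a normal C runtime, early_init_ran() is already true by the time main() starts.  With -nostdlib nothing walks the table yet, which is what Step 2 is about.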

Step 2 : Running


While we've gotten that all working, that's only half of the battle.  We still need to actually call all these from our program.

For the most part, all our program needs to know is where the .ctors array is and how long it is.  It can then iterate over all the constructors and run them.  (These are function pointers after all, so they will always be relocated as required by the loader.  If the relocation didn't update function pointers, the program has more to worry about than constructors.)

One of the other searches I was running while doing this was to investigate how other systems dealt with this problem.  While that search yielded nothing of note, it did give me the idea to check out the default link script (the one ld prints with --verbose).  Currently my kernel uses a custom link script, but the applications still use the default one, where I found something interesting:

  .ctors          :
  {
    /* gcc uses crtbegin.o to find the start of
       the constructors, so we make sure it is
       first.  Because this is a wildcard, it
       doesn't matter if the user does not
       actually link against crtbegin.o; the
       linker won't look for a file to match a
       wildcard.  The wildcard also means that it
       doesn't matter which directory crtbegin.o
       is in.  */
    KEEP (*crtbegin.o(.ctors))
    KEEP (*crtbegin?.o(.ctors))
    /* We don't want to include the .ctor section from
       the crtend.o file until after the sorted ctors.
       The .ctor section from the crtend file contains the
       end of ctors marker and it must be last */
    KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .ctors))
    KEEP (*(SORT(.ctors.*)))
    KEEP (*(.ctors))
  }

So, my link-script-fu is not great, but I can understand enough of this to see that crtbegin.o and crtend.o have special handling when it comes to the .ctors section.  Anything for the .ctors section from crtbegin.o goes at the beginning of the output, and anything from crtend.o goes at the end, effectively book-ending the array.  (I'll admit I didn't know what crtbegin and crtend were for, but GCC complained that I didn't have them when I created libc, so I put in empty ones.)

This information, coupled with what we just did for shuffling variables into the .ctors section gives me an idea.

I can put a variable in crtend.c like this:
int __attribute__((section(".ctors"))) __libc_internal_ctor_end = 0;

... and then put this in crtbegin.c:
extern int __libc_internal_ctor_end;  // <<-- in crtend.o
int __attribute__((section(".ctors"))) __libc_internal_ctor_start = 0;

void call_ctors()
{
 // Loop between __libc_internal_ctor_start and __libc_internal_ctor_end
}

The linker will put these two variables at the beginning and end of the .ctors list, and then the call_ctors() function can use them to work out where all the other constructors are.  Note that the content of those two variables is completely irrelevant; only their locations in memory are useful.
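
A minimal sketch of that loop, assuming each entry in the list is exactly one function pointer wide.  To keep it runnable without a custom link script, the "section" here is an ordinary array whose first and last slots stand in for the two marker variables.  The names fake_ctors, ctor_a, ctor_b, call_ctors_between and run_fake_ctors are hypothetical, not taken from my libc.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ctor_fn)(void);

static int calls[2];          /* records the order the constructors ran in */
static size_t ncalls = 0;

static void ctor_a(void) { calls[ncalls++] = 1; }
static void ctor_b(void) { calls[ncalls++] = 2; }

/* Stand-in for the linked .ctors section: slot 0 plays the role of
   __libc_internal_ctor_start and the last slot __libc_internal_ctor_end.
   Their contents are never read, only their addresses are used. */
static ctor_fn fake_ctors[] = { NULL, ctor_a, ctor_b, NULL };

/* Call every entry strictly between the two markers. */
void call_ctors_between(ctor_fn *start_marker, ctor_fn *end_marker)
{
    assert(start_marker < end_marker);
    for (ctor_fn *p = start_marker + 1; p < end_marker; ++p)
        (*p)();
}

/* Run the sketch and return how many constructors were called. */
size_t run_fake_ctors(void)
{
    ncalls = 0;
    call_ctors_between(&fake_ctors[0], &fake_ctors[3]);
    return ncalls;
}
```

In the real crtbegin.c the two arguments would be the addresses of __libc_internal_ctor_start and __libc_internal_ctor_end, cast to a function-pointer type.  That only lines up if the markers are the same size as a pointer, which an int is on 32-bit x86.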

Step 3 : Testing


This blog post is way longer than it was going to be, so I'll be brief and say what I usually say.  It failed a bunch of times because I made some stupid mistakes, but after much hammering, it all works.

crt0.s now calls, in this order:

  • the C Library initialisation function,
  • the constructors function in crtbegin.c, and
  • the program's main() function.
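
The same ordering written out in C, purely as an illustration (the real thing is assembly in crt0.s).  Every name here is a hypothetical stand-in, and each stub just records a letter so the order can be checked.

```c
#include <assert.h>

static char trace[4];   /* one letter per step, NUL-terminated */
static int  nsteps = 0;

static void record(char step)
{
    assert(nsteps < 3);
    trace[nsteps++] = step;
    trace[nsteps] = '\0';
}

/* Hypothetical stand-ins for the three things crt0.s calls. */
static void libc_init_stub(void)  { record('L'); }
static void call_ctors_stub(void) { record('C'); }
static int  main_stub(void)       { record('M'); return 0; }

/* The start-up sequence: library init, then constructors, then main(). */
int start_sketch(void)
{
    nsteps = 0;
    libc_init_stub();    /* 1. C library initialisation */
    call_ctors_stub();   /* 2. constructors (call_ctors() in crtbegin.c) */
    return main_stub();  /* 3. the program's main() */
}
```

The point of the ordering is that libc is ready before any constructor runs, and every constructor has run before main() sees the world.
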
The really good article I found the other day ( https://blogs.oracle.com/ksplice/entry/hello_from_a_libc_free ) also helped me to confirm at least some of what this was supposed to do, but only after I knew what I was looking for.

I also found out along the way that when a library (such as libkernel.a) is linked in, the linker only includes those object files which are actually needed.  This gave a minor testing headache because I expected my init() function and its .ctors reference to appear just because the library was linked, but they didn't.  That was easily fixed though.
