How to Manually Link a C Program                                   February 2023

    These are instructions for manually linking object files to produce an
    executable. Usually, GCC links C programs by automatically invoking ld in
    response to a command such as gcc a.o b.o -o output. But this hides some
    interesting details about how programs are linked, so for some fun, let's
    link the object files ourselves.

    First, in an empty directory, create main.c:

        #include <stdio.h>
        
        int
        main(int argc, char **argv)
        {
          printf("hello world\n");
          return 0;
        }

    Now compile the program to main.o. The -c flag tells GCC not to run the
    linker:

        gcc -c main.c

    Let's try to (naively) link the program. ld is the GNU linker, and it's what
    GCC uses to link:

        ld main.o -o output

    We get the error messages:

        ld: warning: cannot find entry symbol _start; defaulting to 000000000...
        ld: main.o: in function `main':
        main.c:(.text+0x1a): undefined reference to `puts'

    By default, ld uses the _start symbol as the entry point of an ELF file,
    but our C program only has a main function. The solution is to link with the
    C runtime startup object file. This provides a _start symbol which will,
    among other things, call the main function. It is possible to avoid linking
    with the C runtime startup code by defining a _start symbol in your program
    source, but that's beyond the scope of this guide.

    We are also informed of an undefined reference to `puts'. This is because we
    haven't linked with the standard C library, which provides the function
    behind our printf call. I don't know why we are told undefined reference to
    `puts' and not undefined reference to `printf'... Interesting, let's call
    it homework.

    Let's run the linker again to reflect what we have learnt: link main.o, the
    C runtime startup code and the C standard library.

        ld main.o /lib/Scrt1.o -lc -o output

    On your system, the path to the C runtime startup object (Scrt1.o, the
    variant used for position-independent executables) may be different. To
    find out, run gcc main.o -v 2>&1 | grep crt. This asks GCC to link for you
    in verbose mode, which prints to standard error, so that you can see which
    C runtime startup files it uses.

    The command should run without error. But when we try to execute ./output:

        bash: ./output: No such file or directory

    Weird, ./output definitely exists. Let's have a look at the ELF headers:

        $ readelf ./output -e
        ELF Header:
          Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
          Class:                             ELF64
        ...
        INTERP         0x00000000000002e0 0x00000000004002e0 0x00000000004002e0
                         0x000000000000000f 0x000000000000000f  R      0x1
              [Requesting program interpreter: /lib/ld64.so.1]
        ...

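    Aha! That program interpreter doesn't look right! Look at the headers of a
    working binary on your system: readelf $(which bash) -e. You should see that
    it uses an interpreter like /lib64/ld-linux-x86-64.so.2. As far as I
    understand, /lib/ld64.so.1 is just the default that ld falls back to when
    no --dynamic-linker option is given, and that file doesn't exist on most
    systems. The kernel can't load the missing interpreter, which is where the
    confusing "No such file or directory" comes from. GCC normally passes the
    right path for us. Now let's construct our final link command: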

        $ ld main.o /lib/Scrt1.o -lc --dynamic-linker=/lib64/ld-linux-x86-64.so.2 -o output
        $ ./output
        hello world

    Hooray, it works!
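
    If the interpreter path is different on your system, it can be read from a
    working binary instead of typed by hand. This is a rough sketch, assuming
    readelf, sed and a dynamically linked bash are installed. The variable name
    interp is only for illustration.

```shell
# Print the program headers of a known-good binary and keep only the path
# from the "[Requesting program interpreter: ...]" line.
interp=$(readelf -l "$(command -v bash)" |
  sed -n 's/.*Requesting program interpreter: \(.*\)]/\1/p')
echo "$interp"
```

    The result can be passed to ld as --dynamic-linker="$interp".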

    Further reading: How ELF binaries get run.