Thursday, 28 November 2013

I've learned the hard way - TDD in C, 2nd round

A while ago I wrote about how to insert "spy" code in C, that is, how to override an existing function definition at link time with your own implementation, mainly in order to write unit tests.

After practicing for a while I have to admit that this approach works but is unpredictable and tricky to set up. It is unpredictable because I found it hard to guess the exact order in which the linker resolves the needed symbols (in this case function symbols), and tricky when you are trying to get legacy code with many, sometimes interdependent, libraries under test (which is what I am currently doing).

After some recent frustration I gave the search engine another go and ended up with my new favorite toy: the --wrap option of ld, the GNU linker.

Here is what the ld man page says about it:
--wrap=symbol
    Use a wrapper function for symbol.  Any undefined reference 
    to symbol will be resolved to "__wrap_symbol".  Any undefined
    reference to "__real_symbol" will be resolved to symbol.

    This can be used to provide a wrapper for a system function.  
    The wrapper function should be called "__wrap_symbol".  
    If it wishes to call the system function, it should 
    call "__real_symbol".

    Here is a trivial example:

            void *
            __wrap_malloc (size_t c)
            {
              printf ("malloc called with %zu\n", c);
              return __real_malloc (c);
            }

    If you link other code with this file using --wrap malloc, 
    then all calls to "malloc" will call the function "__wrap_malloc" 
    instead.  The call to "__real_malloc" in "__wrap_malloc" will call 
    the real "malloc" function.

    You may wish to provide a "__real_malloc" function as well, so that 
    links without the --wrap option will succeed.  If you do this, you 
    should not put the definition of "__real_malloc" in the same file 
    as "__wrap_malloc"; if you do, the assembler may resolve the call 
    before the linker has a chance to wrap it to "malloc".

This functionality can of course also be used to replace a function that stands in our way during unit testing, which is what I'm going to show you now.

I'm using the same test source as last time. Here are our example files:
// file fav_music.h
void tellFavoriteMusic( void );


// file program.c
#include "fav_music.h"

int main() {
   tellFavoriteMusic(); 
}


// file fav_music_indie.c
#include <stdio.h>

void tellFavoriteMusic() {
    printf("I like Indie!!!\n");
}


// file fav_music_soul.c
#include <stdio.h>

void __wrap_tellFavoriteMusic() {
    printf("I like Soul!!!\n");
}

Notice that the name of function tellFavoriteMusic in fav_music_soul.c has been changed to __wrap_tellFavoriteMusic.

When the three files are compiled and linked together, tellFavoriteMusic in fav_music_indie.c is executed by main(), as expected. (For clarity I keep compiling and linking as two separate commands, although gcc could do both in one go.)
$ gcc -c program.c fav_music_indie.c fav_music_soul.c
$ ls *.o
fav_music_indie.o  fav_music_soul.o  program.o
$ gcc -o program.out fav_music_indie.o fav_music_soul.o program.o
$ ./program.out
I like Indie!!!
This is what we expected: all function names are unique, and the one that main() names is executed.

Now let the magic happen and tell the linker that it should replace a function call to tellFavoriteMusic with a call to __wrap_tellFavoriteMusic:
$ gcc -Wl,--wrap=tellFavoriteMusic -o program.out fav_music_indie.o fav_music_soul.o program.o
$ ./program.out
I like Soul!!!
program.c still calls tellFavoriteMusic, but this time its replacement __wrap_tellFavoriteMusic is called.
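
For unit testing, the replacement usually does more than print something else: it records what happened so that a test can check it. The following is only a minimal sketch of such a spy, meant to be linked with -Wl,--wrap=tellFavoriteMusic as above. Only the name __wrap_tellFavoriteMusic comes from the example; the counter and the spy_* helper functions are hypothetical names made up for illustration.

```c
#include <assert.h>

/* Hypothetical counter: how often the code under test asked for
 * tellFavoriteMusic(). */
static int tellFavoriteMusic_calls = 0;

/* With -Wl,--wrap=tellFavoriteMusic the linker resolves every undefined
 * reference to tellFavoriteMusic to this function. Instead of printing,
 * the spy only remembers that it was called. */
void __wrap_tellFavoriteMusic(void) {
    tellFavoriteMusic_calls++;
}

/* Hypothetical helpers a test case could use. */
void spy_reset(void) {
    tellFavoriteMusic_calls = 0;
}

int spy_call_count(void) {
    return tellFavoriteMusic_calls;
}

/* Aborts the test run if the call count is not the expected one. */
void spy_expect_calls(int expected) {
    assert(tellFavoriteMusic_calls == expected);
}
```

A test would call spy_reset(), run the code under test (which calls tellFavoriteMusic as usual) and finish with spy_expect_calls(1). Without the --wrap option the spy is simply never reached, so the same object file can stay in a normal build.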

A note on the parameter syntax -Wl,--wrap=tellFavoriteMusic: since we're calling the linker ld via the GNU compiler, we need to tell gcc to pass the --wrap parameter through to the linker. Here is what the gcc manual says:
-Wl,option
    Pass option as an option to the linker.  If option contains 
    commas, it is split into multiple options at the commas.  
    You can use this syntax to pass an argument to the option.  
    For example, -Wl,-Map,output.map passes -Map output.map 
    to the linker.  When using the GNU linker, you can also 
    get the same effect with -Wl,-Map=output.map.
Although it looks similar, -Wl has nothing to do with the common warning switches -W...

Other approaches
While browsing the Internet I stumbled across some other approaches for replacing one function with another at link time. My favorite ones use the tool objcopy, which is part of the binutils and is used to modify object files (the result of a compiler run).

Mark function to overwrite as "weak"
Here the idea is to use the --weaken-symbol parameter of objcopy on the object file of the original function. When the function to be replaced is marked as weak, the linker should prefer a second implementation with the same name that does not carry that flag, and the multiple definition error should not appear.
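
To illustrate what "weak" means, here is a small sketch of the source-level counterpart. It is not what objcopy does to an existing object file; it assumes you can edit the original source and that the compiler is GCC or Clang, whose weak attribute is a compiler extension. The function favoriteMusic and the test function are hypothetical and return a string instead of printing so that the result can be checked.

```c
#include <assert.h>
#include <string.h>

/* Weak default implementation (GCC/Clang extension). If another object
 * file in the link defines a normal (strong) favoriteMusic(), the linker
 * picks that one and no multiple definition error is reported. */
__attribute__((weak)) const char *favoriteMusic(void) {
    return "Indie";
}

/* Hypothetical test: with no strong definition linked in,
 * the weak default is the one that gets called. */
void test_weak_default_is_used(void) {
    assert(strcmp(favoriteMusic(), "Indie") == 0);
}
```

A test build could then add an object file with an ordinary definition of favoriteMusic that returns "Soul", and that definition would win at link time.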

Temporarily remove the original function definition from the object file
With this approach you generate a temporary object file which does not contain the conflicting original implementation of the function. I assume objcopy could do that (objdump only displays the contents of object files and cannot modify them).

I haven't tried the last two ideas myself since I'm happy with the --wrap solution. However, some of you might find them helpful as a starting point for your own experiments.

Monday, 18 November 2013

Multi line strings in C

I'm still trying to find some time to finish my second article about the Putty sources (first one is here).

In the meantime I want to share a helpful little trick on how to handle multi line strings in C with you. See this example:
//multi line strings and multi line printf examples
#include <stdio.h>

int main( void ) {

    // multi line string definition
    const char* string1 = "i am "
                          "a multiline string "
                          "which the compiler assembles "
                          "to one string. No need for "
                          "concatenation.";

    printf( "An ordinary string on one line: %s\n", string1 );


    // multi line string inside printf
    const char* string2 = "a string";

    printf( "Multiline string works also "
            "inside a printf command. "
            "This is helpful if the string part of "
            "the command is too long to fit your maximum "
            "column size. Of course you can use format "
            "specifiers like %s.\n", string2 );
}
The output is:
An ordinary string on one line: i am a multiline string which the compiler assembles to one string. No need for concatenation.
Multiline string works also inside a printf command. This is helpful if the string part of the command is too long to fit your maximum column size. Of course you can use format specifiers like a string.
The examples are (hopefully) self-explanatory. The first one simply defines a multi-line string which will be printed on one line. The second example demonstrates that the same syntax works for the format string of the printf family: the compiler joins adjacent string literals before the function is ever called, so printf only sees one string. Neat, innit?
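
Because the joining is done by the compiler, it works anywhere a string literal is allowed, and it also combines with macros that expand to string literals. The sketch below shows that, plus the one trap I know of: a forgotten comma in an array of strings silently merges two entries. The names APP_NAME, banner, colors and the test function are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical macro that expands to a string literal. */
#define APP_NAME "demo"

/* Adjacent literals, including the one from the macro, become one string. */
static const char *const banner = "Welcome to " APP_NAME ", "
                                  "built from adjacent literals.";

/* Trap: the comma after "green" is missing on purpose, so the array
 * has two elements, "red" and "greenblue", not three. */
static const char *const colors[] = { "red", "green" "blue" };

static size_t color_count(void) {
    return sizeof colors / sizeof colors[0];
}

/* Small self-check in the spirit of the TDD posts. */
static void test_literal_concatenation(void) {
    assert(strcmp(banner,
                  "Welcome to demo, built from adjacent literals.") == 0);
    assert(color_count() == 2);
    assert(strcmp(colors[1], "greenblue") == 0);
}
```

Calling test_literal_concatenation() from a main() passes silently, which confirms that the array really ends up with only two entries.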