Thursday, 19 September 2013

The C Module Pattern

Update Feb. 2015: As I have since learned, the "thing" described below has an official name. It is called an "Abstract Data Type" and is nowadays the way one should design one's C code. See "C Interfaces and Implementations: Techniques for Creating Reusable Software" for more on that.

I mentioned before that I am currently reading the book Test Driven Development for Embedded C. An unexpected but very welcome outcome of this study is a much clearer idea of how to structure C modules. Considering the amount of C code which is around in this world, it is astonishing that there apparently is no common pattern for this task.

The approach proposed by James Grenning, the author of the book, is close to what we consider object oriented. However, it is far simpler than GNOME's GObject interpretation of OO.

The module pattern is all about a clear separation of modules to support loose coupling, predictable module and function names, and standardized module constructors and destructors.

There are two variations of the pattern, which build on top of each other. The first one is the single instance module, which is presumably the most common one. If more than one instance of a module is required at the same time, the multiple instance module variant is the one to choose. However, they don't differ that much.

Common Rules

Both variants have a couple of things in common.

Dependency Inversion Via Interfaces


This might sound odd in the world of C, but the language has all we need to follow the dependency inversion principle: a module does not depend directly on another module but on its interface.

In C terms, that is a client module which uses only the functions and constants provided by the header file of the module it depends on. The client doesn't care about the implementation of a function. It relies only on the declarations in the header.

This is already common practice when we use libraries like the C standard library - we don't care how the functions declared in stdlib.h are implemented, we simply use them.
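As a minimal sketch of this idea, the listing below puts into one file what would normally be three: a header, a client and an implementation. The module names TemperatureSensor and Thermostat, their functions and the fake reading of 18 degrees are all made up for illustration. The point is only that the client function compiles against the declarations and never sees how the reading is produced.

```c
#include <assert.h>
#include <stdbool.h>

// --- TemperatureSensor.h: the interface (hypothetical module) ---
void TemperatureSensor_Create( void );
int  TemperatureSensor_ReadCelsius( void );
void TemperatureSensor_Destroy( void );

// --- Thermostat.c: the client, written against the declarations only ---
bool Thermostat_HeaterShouldRun( int targetCelsius ) {
    return TemperatureSensor_ReadCelsius() < targetCelsius;
}

// --- TemperatureSensor.c: one possible implementation ---
static bool created = false;
static int  fakeReading = 0;    // stand-in for real hardware access

void TemperatureSensor_Create( void ) {
    fakeReading = 18;
    created = true;
}

int TemperatureSensor_ReadCelsius( void ) {
    assert( created );          // reading before Create() is a usage error
    return fakeReading;
}

void TemperatureSensor_Destroy( void ) {
    created = false;
}
```

Swapping the fake implementation for one that talks to real hardware would not require touching the client at all, as long as the three declarations stay the same.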

 

Information Hiding Inside The Module


To hide module-internal variables and functions, they are marked with the C keyword static. Also, private functions are declared (forward declaration) at the top of the module's .c file, not in the header. This thinking was new to me but makes much more sense: the interface/header contains only the module's communication with the outside world, nothing else.
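The following sketch illustrates the rule with a hypothetical LedBank module (the name, the 16-LED range and the helper functions are assumptions made for this example). The public functions are what would go into the header; the static variable and the two static helpers stay in the .c file and cannot be named from any other file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// --- LedBank.h (hypothetical): only what clients may use ---
void LedBank_Create( void );
void LedBank_TurnOn( int ledNumber );
bool LedBank_IsOn( int ledNumber );
void LedBank_Destroy( void );

// --- LedBank.c ---
enum { FIRST_LED = 1, LAST_LED = 16 };

// private helpers are declared at the top of the .c file, never in the header
static bool     isValidLed( int ledNumber );
static uint16_t bitFor( int ledNumber );

// private state: 'static' gives it internal linkage
static uint16_t ledsImage;

void LedBank_Create( void )  { ledsImage = 0; }
void LedBank_Destroy( void ) { ledsImage = 0; }

void LedBank_TurnOn( int ledNumber ) {
    if ( isValidLed( ledNumber ) ) {
        ledsImage |= bitFor( ledNumber );
    }
}

bool LedBank_IsOn( int ledNumber ) {
    return isValidLed( ledNumber ) && ( ledsImage & bitFor( ledNumber ) ) != 0;
}

static bool isValidLed( int ledNumber ) {
    return ledNumber >= FIRST_LED && ledNumber <= LAST_LED;
}

static uint16_t bitFor( int ledNumber ) {
    assert( isValidLed( ledNumber ) );  // callers inside the module check first
    return (uint16_t)( 1u << ( ledNumber - FIRST_LED ) );
}
```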

Always Constructors And Destructors


This was also a new concept for me which I loved from the first minute. Users of the module always initialize the module with a Create function and clean up the module's data later with a Destroy function. Again, this rule always applies, even if for the moment one of the functions has only a stub implementation.

What we gain here is a clear, predictable way of opening and closing the communication with the module. Now it is much harder to forget to free data, since this is what the module's destructor will usually do for us.
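To make the pairing concrete, here is a small sketch of a hypothetical MessageLog module (name, functions and the truncation behaviour are assumptions for this example). Create is the only place that allocates, Destroy is the only place that frees, so a client that calls both in pairs cannot leak the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// --- MessageLog.h (hypothetical) ---
void        MessageLog_Create( size_t capacity );
void        MessageLog_Append( const char* text );
const char* MessageLog_GetText( void );
void        MessageLog_Destroy( void );

// --- MessageLog.c ---
static char*  buffer = NULL;
static size_t bufferSize = 0;

void MessageLog_Create( size_t capacity ) {
    buffer = calloc( capacity + 1, 1 );     // +1 for the terminating '\0'
    assert( buffer != NULL );               // sketch only: no error reporting
    bufferSize = capacity + 1;
}

void MessageLog_Append( const char* text ) {
    assert( buffer != NULL );
    size_t used = strlen( buffer );
    // append only what still fits, silently dropping the rest
    strncat( buffer, text, bufferSize - 1 - used );
}

const char* MessageLog_GetText( void ) {
    assert( buffer != NULL );
    return buffer;
}

void MessageLog_Destroy( void ) {
    free( buffer );     // the one place where the module's memory is released
    buffer = NULL;
    bufferSize = 0;
}
```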

Module And Function Naming


The following rules are simple and effective. Modules have a meaningful name, like Database.c, which lets us assume that this module deals with the database. So far so good. As the next step, the module's public functions use the name of the module as their prefix. The function GetOrderData() thus becomes Database_GetOrderData(). With this notation it is easy to see where GetOrderData was implemented.

Remark One


This rule has a downside: if you give your functions meaningful names to avoid additional comments (as you should as a clean coder) and you've got a lot of parameters (which is not good style anyway, but unfortunately harder to get around in C than, e.g., in JavaScript with its object literals), then your function's signature might get quite long and is likely to break the 80-character line width rule.

In that case I split the function signature across multiple lines (see examples below). As I tend to be obsessed with clean, verbose code, I don't like that, but I keep following this rule anyway since in my opinion the clear structure I gain outweighs this downside.

Remark Two


Now it is getting slightly esoteric ;-) but since I strongly believe that code style matters, let's take a closer look at the function names:

I usually prefer the Java style camel case notation where methods and functions start lower case whereas classes and interfaces begin with a capital letter. For our C function naming rule here there are two possible ways to go:
  • Database_getOrderData()
  • Database_GetOrderData()
The first one transfers the Java notation into our naming rule. However, I decided to go for the latter, which I believe is a bit quicker to grasp when you read the code. This is also the notation proposed by James Grenning, author of Test Driven Development for Embedded C.

Single Instance Module

For this example there won't be an implementation; only the public interface (the header file) is presented.

// RecordCollection.h - Single Instance

#ifndef D_RecordCollection_H
#define D_RecordCollection_H

void RecordCollection_Create( void );
void RecordCollection_Add( const char* artist, const char* title );
void RecordCollection_PrintContainsArtist( const char* artist );
void RecordCollection_Destroy( void );

#endif  /* D_RecordCollection_H */

The module RecordCollection contains a constructor and a destructor function (RecordCollection_Create() and RecordCollection_Destroy()). Besides that, there is a function to add a new record to the collection and one function to display whether or not an artist is present in the collection.

The interface does not reveal how RecordCollection is organized internally. We don't know (and we don't want to know) if the module is using a struct to store its internal data or maybe something else. As its users, the only thing we get is a simple instruction on how to work with that module.
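The post deliberately leaves the implementation open. Purely as an illustration, one way it could look is sketched here. Everything behind the four public functions is an assumption: the fixed limits MAX_RECORDS and MAX_TEXT, the Record struct and the private helper containsArtist are invented for this sketch. Since all storage is static, Destroy only has to reset the counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// --- RecordCollection.h (single instance), repeated so the sketch compiles alone ---
void RecordCollection_Create( void );
void RecordCollection_Add( const char* artist, const char* title );
void RecordCollection_PrintContainsArtist( const char* artist );
void RecordCollection_Destroy( void );

// --- RecordCollection.c ---
enum { MAX_RECORDS = 32, MAX_TEXT = 64 };   // arbitrary limits for this sketch

typedef struct {
    char artist[MAX_TEXT];
    char title[MAX_TEXT];
} Record;

static bool containsArtist( const char* artist );

// the single instance: file-scope state, invisible to clients
static Record records[MAX_RECORDS];
static int    recordCount;

void RecordCollection_Create( void ) {
    recordCount = 0;
}

void RecordCollection_Add( const char* artist, const char* title ) {
    assert( artist != NULL && title != NULL );
    if ( recordCount == MAX_RECORDS ) {
        return;     // collection full: silently ignored in this sketch
    }
    Record* slot = &records[recordCount++];
    // snprintf truncates overlong text and always terminates the string
    snprintf( slot->artist, sizeof slot->artist, "%s", artist );
    snprintf( slot->title,  sizeof slot->title,  "%s", title );
}

void RecordCollection_PrintContainsArtist( const char* artist ) {
    puts( containsArtist( artist ) ? "Yepp, got it." : "No, not found." );
}

void RecordCollection_Destroy( void ) {
    recordCount = 0;    // nothing was allocated, so there is nothing to free
}

static bool containsArtist( const char* artist ) {
    for ( int i = 0; i < recordCount; i++ ) {
        if ( strcmp( records[i].artist, artist ) == 0 ) {
            return true;
        }
    }
    return false;
}
```

A client would call RecordCollection_Create() once, then Add and PrintContainsArtist as often as needed, and finally RecordCollection_Destroy().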

Multiple Instance Module

The previously introduced single instance module has one drawback - we can't use more than one at a time. Sticking with the analogy of our example, I can't have a Long Player (LP, 12 inch vinyl) and a Singles (7 inch) object at the same time. With the single instance module it is all one.

To keep separate lists of our vinyl we have to convert our RecordCollection module to a multiple instance module. For that purpose our interface looks like this:

// RecordCollection.h - Multiple Instance

#ifndef D_RecordCollection_H
#define D_RecordCollection_H

#include <glib.h>  /* GHashTable */

GHashTable* RecordCollection_Create( void );
void RecordCollection_Add( GHashTable* collection,
                           const char* artist,
                           const char* title );
void RecordCollection_PrintContainsArtist( GHashTable* collection,
                                           const char* artist );
void RecordCollection_Destroy( GHashTable* collection );

#endif  /* D_RecordCollection_H */

On creation, the constructor RecordCollection_Create() returns a pointer to a GHashTable object somewhere in memory. The destructor RecordCollection_Destroy() in turn accepts a pointer to this object and frees the memory it occupies.

The remaining two user functions have almost the same signature as their counterparts in the single instance example - except for the newly added first argument, which passes the current instance of our RecordCollection to the function.

To finish this section, here is a simple implementation of RecordCollection plus a client program using it. After the code example I finish the post with a discussion of the methodology presented, so bear with me.

// RecordCollection.c

#include <glib.h>
#include <stdio.h>
#include "RecordCollection.h"

// declaration of private function inside module,
// not visible in the interface (header)
static gboolean containsArtist( GHashTable* collection,
                                const char* artist );

GHashTable* RecordCollection_Create( void ) {
    // keys and values are copied with g_strdup() in RecordCollection_Add(),
    // so the table releases them with the matching g_free()
    GHashTable* collection = g_hash_table_new_full( g_str_hash,
                                                    g_str_equal,
                                                    g_free,
                                                    g_free );
    return collection;
}

// Public Functions
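void RecordCollection_Add( GHashTable* collection,
                           const char* artist,
                           const char* title ) {

    // the table is keyed by artist: adding a second title for the
    // same artist replaces the first one
    g_hash_table_insert( collection,
                         g_strdup( artist ),
                         g_strdup( title ) );
}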

void RecordCollection_PrintContainsArtist( GHashTable* collection, 
                                           const char* artist ) {

    if ( containsArtist( collection, artist ) ) {
        printf( "Yepp, got it.\n" );
    }
    else {
        printf( "No, not found.\n" );
    }   
}

void RecordCollection_Destroy( GHashTable* collection ) {
    g_hash_table_destroy ( collection );
}

// Private Functions
static gboolean containsArtist( GHashTable* collection, 
                                const char* artist ) {

    return g_hash_table_contains( collection, artist );
}

// simple application of RecordCollection
// with multiple instances

#include <glib.h>
#include "RecordCollection.h"

int main( void ) {
   GHashTable* myLPs = RecordCollection_Create();
   GHashTable* mySingles = RecordCollection_Create();

   RecordCollection_Add( myLPs, "Marvin Gaye", 
                                "What's Going On" );

   RecordCollection_Add( myLPs, "Baltic Fleet", 
                                "Towers" );

   RecordCollection_Add( mySingles, "Josh Rouse", 
                                    "Winter in the Hamptons" );

   RecordCollection_Add( mySingles, "Team 4", 
                                    "Ich zeig den Weg" );

   // Yepp 
   RecordCollection_PrintContainsArtist( myLPs, "Marvin Gaye");

   // No
   RecordCollection_PrintContainsArtist( myLPs, "Team 4");

   // Yepp
   RecordCollection_PrintContainsArtist( mySingles, "Team 4");

   RecordCollection_Destroy( myLPs );
   RecordCollection_Destroy( mySingles );
}
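The multiple instance header above exposes GHashTable* to every client, so the choice of container is part of the interface. A common variation, not used in this post, is to declare only an incomplete struct type in the header and keep its layout in the .c file. The sketch below shows that idea under a few assumptions: it swaps GLib for a plain singly linked list so that it compiles without dependencies, and the Record struct and the helpers containsArtist and copyString are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// --- RecordCollection.h: clients see an incomplete type only ---
typedef struct RecordCollection RecordCollection;

RecordCollection* RecordCollection_Create( void );
void RecordCollection_Add( RecordCollection* collection,
                           const char* artist,
                           const char* title );
void RecordCollection_PrintContainsArtist( const RecordCollection* collection,
                                           const char* artist );
void RecordCollection_Destroy( RecordCollection* collection );

// --- RecordCollection.c: the only file that knows the layout ---
typedef struct Record {
    char* artist;
    char* title;
    struct Record* next;
} Record;

struct RecordCollection {
    Record* first;      // singly linked list, newest record first
};

static bool  containsArtist( const RecordCollection* collection,
                             const char* artist );
static char* copyString( const char* text );

RecordCollection* RecordCollection_Create( void ) {
    RecordCollection* collection = malloc( sizeof *collection );
    assert( collection != NULL );   // sketch only: no error reporting
    collection->first = NULL;
    return collection;
}

void RecordCollection_Add( RecordCollection* collection,
                           const char* artist,
                           const char* title ) {
    Record* record = malloc( sizeof *record );
    assert( record != NULL );
    record->artist = copyString( artist );
    record->title  = copyString( title );
    record->next   = collection->first;
    collection->first = record;
}

void RecordCollection_PrintContainsArtist( const RecordCollection* collection,
                                           const char* artist ) {
    puts( containsArtist( collection, artist ) ? "Yepp, got it."
                                               : "No, not found." );
}

void RecordCollection_Destroy( RecordCollection* collection ) {
    Record* record = collection->first;
    while ( record != NULL ) {
        Record* next = record->next;
        free( record->artist );
        free( record->title );
        free( record );
        record = next;
    }
    free( collection );
}

static bool containsArtist( const RecordCollection* collection,
                            const char* artist ) {
    for ( const Record* r = collection->first; r != NULL; r = r->next ) {
        if ( strcmp( r->artist, artist ) == 0 ) {
            return true;
        }
    }
    return false;
}

static char* copyString( const char* text ) {
    char* copy = malloc( strlen( text ) + 1 );
    assert( copy != NULL );
    strcpy( copy, text );
    return copy;
}
```

With this header the client program stays the same apart from the variable type: myLPs and mySingles become RecordCollection* instead of GHashTable*, and the container could later be replaced without recompiling any client logic against a new type.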

Discussion

The multiple instance module is as close as we get to objects and object methods in plain C. To get a feeling for how my RecordCollection might have looked in Java, I've sketched out an interface and an example usage of the RecordCollection object, leaving out its implementation:

// possible interface definition in Java

public interface RecordCollectionInterface
{
    void add( String artist, String title );
    void printContainsArtist( String artist );
}

// creating and using the RecordCollection object in Java
// fortunately the garbage collector takes care of the destruction

RecordCollection myLPs = new RecordCollection();
myLPs.add("Marvin Gaye", "What's Going On");
myLPs.printContainsArtist("Marvin Gaye");

You can compare for yourself, but in Java (or any other OO language) you can basically achieve the same functionality with less code - however, that's no news. I won't enter the performance discussion, though.

But still, for me the good news is that there are ways to write C modules which come quite close to the behavior of objects and object methods in OO languages - of course leaving inheritance completely out. I can accept the additional syntactical effort required. I mean, did you ever try to run Java on your Arduino?
