Sunday, 26 February 2017

Poor Man's Microservice Configuration Using Environment Variables

tl;dr: This post shows a simple, tech-stack-neutral way to provide a configuration file for a microservice.

These days a common approach to microservice deployment is to ship each service as a standalone binary package. The uber JARs of the Java world are a prominent example. This simplifies operations - particularly when you are in a pre-Docker environment.
All it takes to run such a microservice is a single Linux command line. As an old Linux guy I am delighted by this back-to-basics approach.

A must-have feature for this kind of process is the ability to configure the service via an external configuration file. Here you often find the usual suspects: YAML, JSON or even INI.

This advice of the 12-factor manifesto made a long-existing config option prominent again: good old environment variables.

This post shows a demo microservice which consists of three files:

  • demoservice-starter.sh - starts demoservice.py and provides it with its configuration
  • demoservice.cfg - the configuration file for this service consisting of shell variable definitions
  • demoservice.py - the actual microservice (just a modified Flask "Hello World")

The service is started like this:
$ ./demoservice-starter.sh demoservice.cfg &

Explanation, from the outside in:


demoservice-starter.sh 
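The original post embedded the script inline; since the explanation below refers to its line numbers, here is a minimal sketch of what it might look like, laid out so the line numbers match the description (the gunicorn call and the DEMOSERVICE_PORT variable name are assumptions, not taken from the original):
#!/bin/bash
set -ae                          # -a: export all variables, -e: stop on error

trap 'kill -TERM "$PID"' TERM    # forward TERM to the gunicorn child

source "$1"                      # read the config file given as argument
gunicorn -b "0.0.0.0:${DEMOSERVICE_PORT}" demoservice:app &
PID=$!                           # remember the PID of the gunicorn process
wait "$PID"                      # block until gunicorn exits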


Line 6 reads in the config file provided as a command-line argument. Technically, the content of demoservice.cfg is parsed and executed by the shell (it is "sourced").
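Assuming the variables the demo service uses later on, demoservice.cfg might look like this (the names and values are made up for illustration):
# demoservice.cfg - plain shell variable definitions
DEMOSERVICE_PORT=8080
DEMOSERVICE_GREETING="Hello"
DEMOSERVICE_AUDIENCE="production"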

Remarkable here is that the environment variables created in line 6 are visible only to demoservice-starter.sh and its children, not to the rest of the Linux system. In contrast to a user or system "profile" file containing global environment variables, this is a decentralized, scoped approach to providing environment variables.

Line 7 then starts the microservice. In this case I use Gunicorn as the server for my little Flask application. The "&" sends the gunicorn process to the background so that the script execution continues.

Line 8 stores the process ID of the just-started process. This PID variable is needed twice. On line 9 we enter a "wait" state until the gunicorn process exits. This is basically a more elegant version of an endless sleep loop - the latter works as well but requires more code ;)

To stop the microservice we just kill demoservice-starter.sh. However, the shell does not kill our gunicorn child process automatically.

To retrofit this behaviour we have to quickly discuss what kill actually does. This is what the man page says:

kill - send a signal to a process. The default signal for kill is TERM.

So when we kill demoservice-starter.sh we actually just send the TERM signal to the script. What we need to do now is to forward this signal to our child gunicorn process.

This is what line 4 does: When the script receives a TERM signal it kills the gunicorn process which then lets our "wait" command continue to the end of the script.

A quick note on line 2: here we enable two features of the bash shell. "-a" automatically makes variables defined in the script available to child processes. Without "-a" we would need to prepend each variable in demoservice.cfg with an "export" statement.

The other feature, enabled by "-e", is "stop the script on error". This is also very useful for build scripts since it saves you from manually checking exit codes every time.


demoservice.py

There is not much to say here. When the route path "/" of our demo service is accessed via HTTP GET, we use Python's "os.getenv" function to read the environment variables and echo their content. Noteworthy here is the use of Python's literal string interpolation (f-strings) on line 10, which was introduced in version 3.6.
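The original post embedded the file; here is a minimal sketch of what it might look like, reusing the variable names assumed in the config example above and laid out so that the f-string ends up on line 10, as referenced:
import os
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    greeting = os.getenv("DEMOSERVICE_GREETING", "Hello")
    audience = os.getenv("DEMOSERVICE_AUDIENCE", "world")
    return f"{greeting}, {audience}!"  # literal string interpolation, Python 3.6+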

Wednesday, 9 November 2016

More active on twitter recently

Just a quick one: I started using Twitter more often. You can find me here: https://twitter.com/@maik_toepfer

Finished Haskell Course

I've finished the Haskell intro course and stored my notes on GitHub. I wouldn't say that I am a professional Haskell programmer but I gained the insights I was looking for. I also used the course to read more about monads and the contexts in which they are applied. In my notes I added links to two sites I found especially useful.

The FutureLearn platform works similarly to openSAP. Here, too, content is released weekly; I had the feeling that the discussion forums are used even more intensively, which was a good thing.

Here is the link to the certificate earned: Link

And again, while the course was running I found out about the next course: The Internet Of Things, provided by King's College London. MOOC fever apparently never stops.

Monday, 19 September 2016

Haskell Course Has Started

Just received the mail that the Haskell course is taking off.

I am really looking forward to demystifying the monads - now and forever.

Tuesday, 30 August 2016

Monday, 29 August 2016

In MOOC Fever - Haskell @ Future Learn

I just enrolled in the online Haskell class which starts on September 21st.

Here is the link: https://www.futurelearn.com/courses/functional-programming-haskell

I eventually want to understand what these "monads" are all about.

Saturday, 23 July 2016

Finished "Design For Non Designers" MOOC - My Second openSAP Course

While I was in the middle of the OpenUI5 course, SAP's learning platform openSAP announced a new course: "Design for Non-Designers". That sounded interesting because, let's face it, the majority of the posts here are about really techie stuff - not about typography and the like.

The course was a mix of an introduction to "Design Thinking" and basic design concepts. I learned about personas and storyboards, and that prototyping on different levels is both important and fun at the same time. The time was a good investment.

I was glad that I could eventually get a grip on "Design Thinking". I had already done a one-day workshop at a conference last year but that was a bit pointless.

My course notes are on GitHub, so if you are interested feel free to have a look. And yes, of course, there is also a record of achievement.


Saturday, 2 July 2016

Finished OpenUI5 course

In the last months OpenUI5, the open-source version of SAP's own HTML5 framework, has become more and more popular in the enterprise world - although compared to AngularJS it is still a niche product.

However, if your common task is to build "boring business web apps", then OpenUI5 might be something for you. It's a one-stop shop: it supports coding following the MVC pattern, internationalization (i18n), hundreds of controls (widgets), lots of high-quality icons and so on.

Their idea is that the main dependency of your business app is OpenUI5 and nothing else.

I spent the last weeks with a massive open online course (MOOC) named "Developing Web Apps with SAPUI5". Here 30k people from around the world learned how to program OpenUI5.

I am glad to announce that I have finished the course and, thanks to the extra task I did, I belong to the top 5% - yes!


Tuesday, 12 April 2016

Avoiding Temporal Coupling - Part 2/2

In part one of this post I briefly introduced the problems which come with temporal coupling and "passing a block" as a technique to overcome those problems. In this last part of the post I want to demonstrate "passing a block" in pure C.

Let's start with a basic implementation:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

typedef void (*FileCommandFunc)(FILE*);

int withFile(const char* fileName, const char* fileAccessMode,
             FileCommandFunc func) {
    FILE *myFile = fopen(fileName, fileAccessMode);
    if (NULL == myFile) {
        fprintf(stderr, "Error opening file '%s': %s\n",
                fileName, strerror(errno));
        return EXIT_FAILURE;
    }
 
    func(myFile);

    fclose(myFile);
    return EXIT_SUCCESS;
}

void printFirstLine(FILE* file) {
    char buffer[1000] = {0};
    if (fgets(buffer, sizeof(buffer), file)!=NULL) {
        printf("%s", buffer); 
    }
}

void printWholeFile(FILE* file) {
    char buffer[1000] = {0};
    while (fgets(buffer, sizeof(buffer), file)!=NULL) {
        printf("%s", buffer); 
    }
}

int main() {
    withFile("demo.txt", "r", printFirstLine);
    withFile("demo.txt", "r", printWholeFile);
}
Function withFile takes three arguments: the name of the file to be opened, the access mode (read, write, append...) and a file command. The latter is a pointer to a function which accepts a file object as its only argument.

printFirstLine and printWholeFile are two example commands. Both employ fgets to traverse the file line by line.

In main we see "passing a block" in action. File "demo.txt" is opened in read mode and the actions implemented in the command functions are executed against the file. Error handling and closing of the file handle takes place inside withFile - the user of the function doesn't have to bother with that.

When coding this first draft I realized that my solution is quite limited. Inside the command you can only work with the file passed in and with global variables. Other variables simply aren't accessible.

With the GNU Compiler Collection gcc there is the nested functions  feature which nicely works around this limitation:
int main() {
    const char* header = "Content of demo.txt";

    void printWholeFile(FILE* file) {
        puts(header);
        char buffer[1000] = {0};
        while (fgets(buffer, sizeof(buffer), file)!=NULL) {
            printf("%s", buffer); 
        }
    }

    withFile("demo.txt", "r", printWholeFile);
}
I admit the example is a little bit contrived since you could easily print the header before the call of withFile, but there you go. The idea is simple: since printWholeFile is now defined inside main, printWholeFile has access to all global variables and to the local variables of main as well.

The downside of this is that it only works with gcc. Here is a more general implementation which uses a technique you find, for example, in the glibc comparison functions.
The command function accepts a second argument which is a void pointer. This allows us to pass anything into the command. We only have to cast it to the right type before using it. This is basically a deliberate opt-out of C's type checking.

Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

// withFile.h
typedef void (*FileCommandFunc)(FILE*, void*);

typedef struct {
    const char* name;
    const char* accessMode;
    FileCommandFunc func;
    void* commandArgs;
} withFileArgs;

// withFile.c
int withFile(withFileArgs args) {
    FILE *myFile = fopen(args.name, args.accessMode);
    if (NULL == myFile) {
        fprintf(stderr, "Error opening file '%s': %s\n",
                args.name, strerror(errno));
        return EXIT_FAILURE;
    }
 
    args.func(myFile, args.commandArgs);

    fclose(myFile);
    return EXIT_SUCCESS;
}

// production code

// the commands
void printFirstLine(FILE* file, void* commandArgs) {
    const char* header = commandArgs;

    puts(header);
    char buffer[1000] = {0};
    if (fgets(buffer, sizeof(buffer), file)!=NULL) {
        printf("%s", buffer); 
    }
}

void printWholeFile(FILE* file, void* notUsed) {
    char buffer[1000] = {0};
    while (fgets(buffer, sizeof(buffer), file)!=NULL) {
        printf("%s", buffer); 
    }
}

// using the commands
int main() {
    withFile((withFileArgs){.name="demo.txt", 
                            .accessMode="r", 
                            .func=printFirstLine, 
                            .commandArgs="Content of demo.txt"});

    withFile((withFileArgs){.name="demo.txt", 
                            .accessMode="r", 
                            .func=printWholeFile});
}
For convenience I've also replaced the growing number of arguments of withFile with something you would call a configuration object in JavaScript; in C it is the beautiful marriage of designated initializers and compound literals, both introduced in C99.

Also note that in function printFirstLine the conversion from the void pointer commandArgs to the const char pointer header is implicit - no explicit cast is needed on the right-hand side of the assignment.

Monday, 11 April 2016

We Love Code! - A Pop-Cultural Ode to Programming

I really do enjoy living in this city of Leipzig. It has everything you would expect from a big city and is still as manageable as the village life of old. And so it came about that "We Love Code! - Das kleine 101 des Programmierens" fell into my hands. "Natalie co-wrote that - you know, my former flatmate," said our overnight guest.

Fun to read (image source)

Yes, I remembered. Social media, HTML, CSS, later Ruby, and plenty of curiosity about the rest. Three years later the Code Girls Natalie Sontopski and Julia Hoffmann present a pleasantly lightweight introduction to the world of programming.

The two authors take the reader by the hand and explain with verve that you can be hopeless at maths (like me) and still be a passable programmer, that programming languages have as many facets of flavour as the earthly variations of spaghetti with tomato sauce, and that C programmers are tough guys. Since at least 50% of this blog deals with more or less obscure C programming techniques, I admittedly felt flattered by that last point.

The book also makes an important contribution to idol-building around the important veterans of programming history. Ada Lovelace, Grace Hopper ("It's easier to ask forgiveness than it is to get permission") and others get pages dedicated to them. Bravo. Pop culture quotes and references diligently - the third generation of Joy Division t-shirt wearers now populates the indie clubs. It is high time we nerds wore our heroes on our chests too. Care for an example?

Image source

And last but not least: it is a beautiful, lavishly printed book. Who wants to give their loved ones a USB stick with an epub file anyway? Perhaps with an animated GIF as the greeting card.

There are things that cannot be digitized. A beautiful book is one of them. (Image source)

Verdict: if you have partners, friends or relatives who find the world you live in fascinating but don't really understand any of it, We Love Code! is a recommendable introduction - best bought directly from the publisher.

Monday, 4 April 2016

Avoiding Temporal Coupling - Part 1/2

Every now and then we try to watch Uncle Bob's cleancoders videos. In episode 4, "Function Structure", the issue of temporal coupling is discussed. Temporal coupling is something we encounter quite often: to work with a database you first need to connect to it. Next you do your work with the database and as the last step you disconnect. The same applies to working with a file: first you open it, then you perform your actions on the file handle/object and when you're done you close it.

The order of these actions is important - you can't call the database disconnect method before you call the connect method. Also, those temporal couplings are often hidden in the background. Quite often some global init method created the database connection you are using for you - at least that is what you believe. When shutting down the application you hope that there is another magic method which closes the connection.

In order to deal better with temporal coupling Uncle Bob suggests a technique called "passing a block". The idea is to allocate and release resources at the point where you need them. The following pseudo code illustrates this approach:
withResource(resource, command):
    allocate(resource)
    command(resource)
    release(resource)
Python supports this pattern out of the box with its context managers. Java 7 and later has a similar construct (try-with-resources).
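For comparison, this is roughly how the pseudo code above collapses when using Python's context manager - opening and closing the file is handled by the with statement:
# the body of the "with" block plays the role of the passed-in command
with open("demo.txt") as demo_file:
    print(demo_file.readline(), end="")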

When looking at the pattern I immediately thought that implementing "passing a block" in plain C isn't too hard.  Function pointers are your friend here. Wait for the second part of this post to see the implementation of withFile, a function which first opens a file, executes your desired actions and then closes this file.

Tuesday, 8 March 2016

Professional Online Whiteboarding

My job involves explaining and discussing all sorts of things with people who are almost always situated remotely. I particularly like sketching and drawing - that's what we software people do at the end of the day: planning, discussing and implementing boxes and the arrows between them.

Shortcomings Of Builtin Tools
For a while I tried to use the built-in whiteboard tools together with the mouse as a drawing device. This was inconvenient for a couple of reasons:
  • drawn lines were usually quite jagged and looked unnatural
  • Microsoft Lync (nowadays Skype for Business) had some lag when I tried the whiteboard there
  • I never managed to freehand-draw nicely with the mouse

Inspired By A Worldwide Math Trainer
In the end I never used the built-in drawing facilities because I felt so limited. To overcome this I had a look at what other people were doing. I ended up with a setup similar to the Khan Academy's:
Note: The Wacom Bamboo series of tablets is discontinued and has been succeeded by the Intuos series.

Since I wasn't sure whether this whole tablet drawing thing would work out for me, I went for a cheap used Wacom Bamboo One tablet which I bought for 15 € off eBay.

A Cheap Wacom Bamboo Tablet Is All You Need To Get Started
Using The Drawing Tools
Software-wise I kicked off with SmoothDraw, which is not a bad choice since it is free and doesn't require admin rights for its installation.

These days I use the free edition of Autodesk Sketchbook more often. Compared to SmoothDraw, the same (limited) drawings of mine look much nicer here. The editing tools are also way better in Sketchbook. For example, there is a lasso selection which lets you draw around an area and then easily move, rotate or resize it.

Pressing TAB toggles between the blank canvas and the toolbars. This is nice when presenting: to get the audience's full attention I usually present inside the blank canvas. Only when I need to change a tool or setting and can't remember the hotkey do I quickly press TAB to get all the menus, and then TAB again to return to the blank canvas.

Talking about hotkeys or keyboard shortcuts: To become a fluent presenter it is advisable to know how to change between tools and colors by only using the keyboard. When I present, I draw with my right hand and use the left hand for pressing the hotkeys.

Airbrush For Visual Quick Wins
Besides the pressure-sensitive pen tip I really do love the airbrush tool. Once I am done with my boxes and arrows I highlight the important parts with a subtle colored airbrush. This makes the end result much more appealing and it is so simple to do.


Post Production With Gimp
No matter what drawing tool I am using, I never use the full size of the canvas. Actually, I like to get started with a large canvas just to be sure I don't need to limit myself later on.

When I want to share the results of a drawing session I usually export the image to PNG and then import it into Gimp. There I crop the canvas to the actual size of the drawing. I also downscale the image to a size usable for online purposes.

For this I zoom the image to a satisfying size (in percent). I then go to "Image > Scale Image", change the size unit from px to percent and enter the desired percentage I previously found out by zooming into the image.

When I am done I usually use the "Export to original image" option (can't remember the right name) to overwrite my png with a version of the right size.

Conclusion
I started my drawing experiment over a year ago. When I got the tablet I spent some evenings at home listening to good music and drawing (not only) boxes and arrows. This is also my advice for you: drawing with a tablet takes some time to get used to. To have enough self-confidence for your freehand-drawing presentations you had better spend some time practising - which, by the way, is good fun anyway.

For me these 15 € together with the evenings of good music and drawing practice have paid off more than once. In a world of over-designed PowerPoint presentations it makes a difference when you build up a topic just with the power of a pen.

Saturday, 20 February 2016

Software Craftsmanship In Leipzig Is Picking Up Pace

Last September I wrote about my attempt to host a local meetup for the software craftsmen in my home town of Leipzig, Germany. After a slow start, things have recently been going well.

This week we had Alex of Grossweber over, who gave a very compact but at the same time well-structured introduction to git and advanced topics like interactive rebase. Up to this point my git knowledge was very basic, but after this evening my understanding of the topic is much clearer. Good work, Alex.

git Session with Alex @ Makerspace Leipzig, 2016/02/17
We started to publish our events on two platforms. Besides our native Softwerkskammer page we started using meetup.com as well. I suppose being a meetup.com event helped us to constantly increase our attendance. Last week we were almost 30 people and I had to limit the subscriptions to the event. A couple of months back we were always below 10 people - how quickly things change...


Thursday, 18 February 2016

"pragma once" as a better alternative for guard clauses

An include guard is a very popular and incredibly useful hack:

... an #include guard, sometimes called a macro guard, is a particular construct used to avoid the problem of double inclusion when dealing with the include directive. (Link)

Example:
// person.h
#ifndef PERSON_H
#define PERSON_H

typedef struct {
    char* first_name;
    char* last_name;
    int age;
} person;
 
#endif /* PERSON_H*/
The idea is to let the C preprocessor evaluate the guarded code only if the symbol PERSON_H is not yet defined. Since line 3 defines PERSON_H as the very first step, the person struct is guaranteed to be seen only once per translation unit.

It is simple macro programming but so popular that an IDE like Eclipse CDT auto-generates the include guard for you. You can even choose different naming schemes.

But this technique also has some downsides:
  • three lines of extra code
  • potential name clashes if there is another person.h in an included project
The latter one can be worked around with improved naming schemes like adding the path to the symbol name (#define MY_PROJECT_SRC_PERSON_H) or using a simple random number (#define DF454FSKWDLD) but stop - this hack is getting worse and worse.

Luckily, for quite some time now there has been a solution called #pragma once. Rewritten, the above example looks like this:
// person.h
#pragma once

typedef struct {
    char* first_name;
    char* last_name;
    int age;
} person;
The ‘#pragma’ directive is the method specified by the C standard for providing additional information to the compiler, beyond what is conveyed in the language itself. [Link].

And this is what #pragma once does:
#pragma once is a non-standard but widely supported preprocessor directive designed to cause the current source file to be included only once in a single compilation. [Link]

The Wikipedia page which provides this quote also gives a list of compilers which support this feature. If you're not forced to compile under Solaris Studio you're fine.

Sunday, 24 January 2016

Printing boolean values in C

Since the bool type in C is nothing more than an integer, a naive printout will just produce a number. This little trick gives you a text representation of the boolean value:
bool status = true;
printf("The status is %s", status ? "TRUE" : "FALSE");

The ternary operator is evaluated first. Since the value of status is true the string TRUE is the second argument of our printf.

So the output is:
The status is TRUE

Thursday, 21 January 2016

Infrastructure As Code - Some Lessons Learned

I used the last days of 2015 to automate the installation of our C development environment. Here are some of the ingredients:
Due to this long list of required tools and plugins setting up our development environment is quite complex. After asking for some clarification on the right tool for the job I went for Vagrant.

10 days later the result was as expected. I am now able to say vagrant up and (when running for the first time) a base box is downloaded from an internal repository and Vagrant then runs all the shell scripts I've written to install the above. This is called the provisioning step and it takes place only once.

Coding the infrastructure, I found myself dealing with some of the issues I had so far only known from ordinary coding.

External Dependencies


When installing software which is not provided in a nice repository (the Oracle client and the Eclipse ProC extension, for example) I had to decide whether the install scripts should download some version of the software from the internet or whether I should add a specific version to my Vagrant project and keep it there.

I went for the latter. To reduce external dependencies (= download links on the internet) I keep the required archives and binaries locally under version control. If I want to update the software, I need to manually download the newer version and replace the older one with it.

Inside my scripts I always tried to use wildcards when it came down to file names. The goal was that a version update does not require an update of the provisioning shell scripts. A simple overwrite of the old version with the new version should do.

ECLIPSE_PURE_SDK="/vagrant/files/eclipse-SDK-*-linux-*_64.tar.gz"
...
tar xvzf ${ECLIPSE_PURE_SDK}


Feedback Loop


The general approach to codifying my infrastructure setup was very similar to the way I usually go forward:
  1. write new code or correct existing code
  2. let it compile (optional, only required for compiled languages of course)
  3. execute
  4. find the error
  5. start from beginning
Translated into the world of Vagrant this is:
  1. write a new installation task or correct an erroneous existing one
  2. let Vagrant provision (=execute) the installation tasks
  3.  find the error in the installation tasks
  4. start from beginning
Particularly the second step, provisioning the Vagrant box, was painfully slow - it took about 5-7 minutes to finish. Going forward in small incremental steps, this means a lot of forced 5-7 minute breaks.

This time I accepted these waiting times. For the next bigger infrastructure coding job I will definitely try out one of the configuration management tools (Ansible, Puppet, Chef...). All of them give me something really helpful that I was lacking this time - idempotence:

... operations [...], that can be applied multiple times without changing the result beyond the initial application. (Wikipedia)

My imaginary updated Vagrant cycle would then look like this:
  1. write a new installation task or correct an erroneous existing one
  2. execute all installation tasks - only the new or updated ones run
  3. find the error in the installation tasks
  4. start from beginning
This should save me a lot of time since only the changed configuration tasks are executed.
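Until then, individual provisioning steps can at least be made roughly idempotent by hand by guarding each installation - a minimal sketch reusing the Eclipse SDK example from above (the target directory /opt/eclipse is made up):
# only unpack the Eclipse SDK if it is not installed yet -
# re-running the provisioning script then skips this step
if [ ! -d /opt/eclipse ]; then
    tar xvzf ${ECLIPSE_PURE_SDK} -C /opt
fi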

Friday, 27 November 2015

Thoughts On "SE-Radio Episode 242: Dave Thomas on Innovating Legacy Systems"

In episode 242 Software Engineering Radio interviewed Dave Thomas about how to deal with legacy systems. I liked the show so much that I had to do a sketchnote:

Controversial And Very Inspiring At The Same Time - SE Radio 242 with Dave Thomas

Actually I am a faithful follower of Working Effectively With Legacy Code: isolate the piece of code you want to change (dependency breaking), write tests for it and then modify the code using TDD. Over time I got quite good at it - even in C. However, it's a lot of effort - even when you're trained.

Dave said "Unit tests are a waste of time, focus on acceptance tests" (end-to-end tests). The problem with end-to-end tests is that they are even harder to set up. Instead of mocking the objects around you, you have to provide all the external dependencies or at least good replacements: test databases, test middleware, test clients...
Anyway, once you've managed all that and written your first end-to-end test, things get a lot easier. Covering "unhappy paths" with tests is then actually quite simple - drop a central database table, switch off the middleware, send faulty messages to your application and check what's going on.

With all this virtualization (Docker being the latest hype) and infrastructure as code (Puppet, Chef, ...) we now have good tools to write end-to-end tests which are repeatable, automated and maintainable.
Surely this was not as simple in 2004 when "Working Effectively With Legacy Code" came out.

Dave's statements reminded me of the Golden Master approach, which is quite similar. However, the initial end-to-end test there is only meant to provide a basic safety net on the way towards unit test coverage. The latter is the actual goal of "Golden Master" testing.

So yes, maybe going from outside to inside is nowadays a better way of creating a safety net. I am still not convinced to ditch unit testing of old code completely but this is as always something you have to try out.

Wednesday, 25 November 2015

Something For The Toolshelf - Code Analysis Tools Used For Security Analysis Of Truecrypt

Recently the Bundesamt für Sicherheit in der Informationstechnik (BSI), an authority of the German government, released a security analysis of Truecrypt. The analysis was carried out by the Fraunhofer-Institut für Sichere Informationstechnologie (SIT) in Darmstadt, Germany. The institute is part of the Fraunhofer Society - a research organization spread across Germany.

From a software engineering perspective I was curious what approach the researchers took to evaluate the code base.

 

GOTO

Apparently the Truecrypt authors also liked their gotos. The study on goto (my translation):

To implement exception handling the usage of goto is generally accepted since the language C does not offer an own feature for that. New research concludes that meanwhile programmers are predominantly using goto in a sensible way.

Die Verwendung von goto wird jedoch im Allgemeinen zur Umsetzung einer Ausnahmebehandlung akzeptiert, da die Sprache C kein eigenes Konstrukt hierfür kennt. Neuere Untersuchungen haben ergeben, dass Programmierer mittlerweile die goto-Anweisung überwiegend nur noch in sinnvoller Weise verwenden. (original)

On that topic the study quotes An empirical study of goto in C, a paper which was pre-released in February 2015 and which was the subject of my previous post.

 

Complexity Of The Source Code

To measure complexity the authors of the study employed a tool called Lizard, which can deal with a bunch of languages including C, C++, Java, Python and JavaScript.

Here is the feature list taken from the GitHub page of Lizard:
  • the nloc (lines of code without comments),
  • CCN (cyclomatic complexity number),
  • token count of functions.
  • parameter count of functions.

As their measure of complexity the study uses the cyclomatic complexity:

As a measure for the complexity of the flow of control especially the cyclomatic complexity is being used. Values higher than 15 are an indicator for potential refactoring. Values above 30 are usually accompanied by flawed code. (my translation)

Als Maß für die Kontrollflusskomplexität wird insbesondere die zyklomatische Komplexität verwendet. Werte größer 15 sind ein Indiz dafür, dass Refaktorierung sinnvoll ist. Werte über 30 gehen oft mit fehlerhaftem Code einher. (original)

Code Duplicates

To find identical pieces of source code the authors of the study use Duplo, a duplicate finder for C and C++. With its default settings the tool considers three or more identical lines of code as duplicates.

 

Static Code Analysis

For this kind of analysis three tools were used: Coverity, Cppcheck and the Clang Static Analyzer. The interesting point here is that there were almost no overlaps in the errors found by the three tools - which brings me to the conclusion that it is a sensible investment to integrate more than one static analyzer into the continuous integration chain.
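A possible way to wire two of the freely available tools into a CI job is a small shell step that fails the build on findings - just a sketch, assuming a make-based C project with its sources in src/ (directory and flag choices are illustrative):
#!/bin/bash
set -e

# analyzer 1: Cppcheck - a non-zero exit code fails the build
cppcheck --enable=warning,style --error-exitcode=1 src/

# analyzer 2: Clang Static Analyzer - scan-build wraps the normal build
scan-build --status-bugs make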

Monday, 23 November 2015

Rehabilitating C's goto

I admit it - I regularly write gotos. Actually almost all of my non-pure functions see at least one goto, and always for the same reason: handling errors and cleaning up resources. I already wrote about the technique 1 1/2 years ago.

Example For Error Handling And Cleanup using goto [1]

In my eyes the usage of goto for cleanup and error handling is a good thing. The flow of application logic is not unnecessarily cluttered with local error handling. Instead the function is divided into two parts: The upper part which contains the application logic and the lower part which contains the error handling and the cleanup of resources.

However, using these gotos always left me feeling like I was doing something in a gray zone: there is an old ban from the '60s (Letters to the Editor: Go To Statement Considered Harmful, Dijkstra, 1968), but without talking too much about it in public, C programmers still carry on writing goto.

The paper An Empirical Study of goto in C Code, released as a pre-print in February 2015, now takes an interesting second look at this old ban.

The international group of researchers involved in the paper analyzed 2 million lines of C code collected from 11K GitHub repositories. I leave the reading of the entire paper up to you and jump directly to the important part of the conclusion:

...far from being maintenance nightmares, most usages of goto follow disciplined, well-designed usage patterns, that are handled by specific constructs in more modern languages. 
The most common pattern are error-handling and cleanup, for which exception handling constructs exist in most modern languages, but not in C. Even properties such as several goto statements jumping to the same label have benefits, such as reducing code duplication, in spite of the coordinate problem described by Dijkstra.

That sounds like good news to me - I can finally exit the gray zone.

Sunday, 25 October 2015

Slides For "Not Your Father's C - C Application Development in 2015"

Before going on a too-short vacation I attended the Developer Open Space conference in my home town of Leipzig. As the conference name suggests, this was an Open Space conference where the participants themselves create the agenda for the day.

I held a session on modern C development which was a high-level summary of my past posts here. The slides are now online: