Writing libraries for fun and profit

A post by Lennart Poettering on Linux software library design sparked a parallel discussion on Google+ about library design. Although it is aimed at kernel developers venturing into userspace, libabc contains some excellent tips for C library design and would be a great template for writing a library of your own.

Here are some additional points brought up during the side-discussion:

Autotools
While Autotools is by far the most widely used build configuration system for Unix, it suffers from being a rather archaic system of shell scripts and is extremely unfriendly to non-Unix systems. CMake offers an interesting alternative with support for build configuration, project generation and cross-compilation. Project generation isn't perfect (the result is going to look odd compared to a native Visual Studio project), but it's a great step forward in getting your library to run on multiple platforms.

Calling exit(), abort() and using assert() inside a library
There's quite a bit of contention on this topic. Everyone seems to agree that calling exit() inside a library is a really bad idea. Whether to use abort() and assert() seems to be more up in the air. It's possible to use abort() as a rather crude exception system (via an evil combination of a SIGABRT signal handler and longjmp), but in general it's a bad idea. On POSIX systems, signals don't play very well with threads, and jumping out of a handler completely messes up the standard function-stack flow of your program (this is especially bad for C++). Thus a SIGABRT is almost always fatal. An argument for using abort() is flagging unexpected behavior: things that shouldn't ever happen unless there is a serious going-to-crash-your-code-150-lines-later kind of bug. What do you do when malloc fails? Do you let your code run wild until it crashes, or crash it right at the source of the problem? On the other hand, try explaining to your manager why the whole application stack crashed because of a minor issue. Asserts offer a kind of middle ground that only blows up in debug builds, since assert() compiles to nothing when NDEBUG is defined.
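One way to picture the middle ground is a function that reports recoverable failures to the caller and keeps assert() for internal invariants only. This is a minimal sketch, not anything taken from libabc: the `abc_buffer` type, the `abc_status` codes and both functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes for an imaginary "abc" library. */
enum abc_status { ABC_OK = 0, ABC_ERR_INVALID = -1, ABC_ERR_NOMEM = -2 };

/* Hypothetical type: a heap-allocated copy of a string. */
struct abc_buffer {
    char  *data;
    size_t len;
};

/* Copies `src` into a newly allocated buffer.
 * Recoverable failures are returned to the caller; nothing here calls
 * exit() or abort(). */
int abc_buffer_init(struct abc_buffer *buf, const char *src)
{
    if (buf == NULL || src == NULL)
        return ABC_ERR_INVALID;      /* caller mistake: report it */

    size_t len = strlen(src);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return ABC_ERR_NOMEM;        /* let the application decide what to do */

    memcpy(copy, src, len + 1);      /* len + 1 copies the terminating NUL too */
    buf->data = copy;
    buf->len = len;

    /* Internal invariant: checked in debug builds, compiled out by NDEBUG. */
    assert(buf->data[buf->len] == '\0');
    return ABC_OK;
}

/* Releases the copy; safe to call with NULL. */
void abc_buffer_free(struct abc_buffer *buf)
{
    if (buf == NULL)
        return;
    free(buf->data);
    buf->data = NULL;
    buf->len = 0;
}
```

With this shape the application chooses whether an out-of-memory condition is fatal, while a broken invariant inside the library still fails loudly during development.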

Avoid callbacks in your API/Use iterators
This confused me a little bit at first until I realised that this applied to using callbacks for accessing the contents of a collection or abstract data type rather than being against the concept of callbacks in general. Here's an example of how both styles of interface can be implemented. Callbacks can still be very handy for handling asynchronous events.
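The two styles of collection access can be sketched side by side. This is an illustrative sketch only: `abc_list`, `abc_list_foreach`, `abc_list_iter` and the two `sum_with_*` helpers are hypothetical names, and the collection is a plain array wrapper chosen to keep the example short.

```c
#include <stddef.h>

/* Hypothetical collection used only for illustration. */
struct abc_list {
    const int *items;
    size_t     count;
};

/* Style 1: callback traversal. The library drives the loop, and the caller
 * has to smuggle its state through a void pointer. */
typedef void (*abc_visit_fn)(int value, void *userdata);

void abc_list_foreach(const struct abc_list *list, abc_visit_fn fn, void *userdata)
{
    for (size_t i = 0; i < list->count; i++)
        fn(list->items[i], userdata);
}

/* Style 2: iterator traversal. The caller drives the loop and can stop
 * early or interleave other work without extra plumbing. */
struct abc_list_iter {
    const struct abc_list *list;
    size_t                 pos;
};

struct abc_list_iter abc_list_iter_begin(const struct abc_list *list)
{
    struct abc_list_iter it = { list, 0 };
    return it;
}

/* Stores the next element in *out and returns 1, or returns 0 at the end. */
int abc_list_iter_next(struct abc_list_iter *it, int *out)
{
    if (it->pos >= it->list->count)
        return 0;
    *out = it->list->items[it->pos++];
    return 1;
}

/* The same sum written against each interface. */
static void add_to_sum(int value, void *userdata)
{
    *(int *)userdata += value;
}

int sum_with_callback(const struct abc_list *list)
{
    int sum = 0;
    abc_list_foreach(list, add_to_sum, &sum);
    return sum;
}

int sum_with_iterator(const struct abc_list *list)
{
    int sum = 0, value;
    struct abc_list_iter it = abc_list_iter_begin(list);
    while (abc_list_iter_next(&it, &value))
        sum += value;
    return sum;
}
```

The iterator version keeps the loop body and its local variables in one place, whereas the callback version needs a separate function and a cast just to add numbers.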

C++ is bad for library interfaces
C++ is a pretty amazing and flexible language. It also has some really horrible features that can come back and bite you in really horrid ways. There are three main issues with using C++ for writing library interfaces:
  1. Private member variables are declared in the class definition, so they end up in the public header and are part of the class's binary layout, completely messing up information hiding. The Pimpl idiom is one very common solution, but it does make your classes more convoluted.
  2. Name mangling in C++ causes all sorts of problems, and it differs between compilers. Even with the same compiler, it's possible for two compilation units to generate the same mangled name (think two static libraries), and there's no guarantee that the compiler will even detect this horrible case. And if you want to dynamically load a shared library (using dlopen), you have to use `extern "C"` to expose an unmangled C-style interface to your library anyway (though you can use a factory/interface pattern to still expose C++ classes).
  3. Versioning and ABI stability. Due to both of the points above, it's harder to maintain ABI compatibility of C++ interfaces. C++ is rather notorious for the fragile binary interface problem. Adding or changing private member variables will break the ABI even if you haven't broken API compatibility. There's also the issue that symbol hiding in C++ is much, much harder than in C (you can't just make your functions static).
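For contrast, here is roughly how a plain C interface sidesteps those three points: an opaque struct hides the layout, an `extern "C"` guard keeps names unmangled for C++ callers, and `static` keeps helpers out of the exported symbols. It is only a sketch under stated assumptions: `abc_ctx`, its functions and the 0 to 7 level range are all hypothetical, and the header and implementation are shown in one file purely for brevity.

```c
#include <stdlib.h>

/* ---- public header: what users of the hypothetical library see ---- */
#ifdef __cplusplus
extern "C" {             /* unmangled names when included from C++ */
#endif

struct abc_ctx;          /* opaque: the layout is not part of the ABI */

struct abc_ctx *abc_ctx_new(void);
void            abc_ctx_free(struct abc_ctx *ctx);
int             abc_ctx_get_level(const struct abc_ctx *ctx);
void            abc_ctx_set_level(struct abc_ctx *ctx, int level);

#ifdef __cplusplus
}
#endif

/* ---- private implementation: would live in the .c file ---- */
struct abc_ctx {
    int level;
    /* Fields can be added here without breaking existing callers,
     * because callers only ever hold a pointer. */
};

/* static gives this helper internal linkage, so it is never exported.
 * The 0..7 range is an arbitrary choice for this example. */
static int clamp_level(int level)
{
    if (level < 0)
        return 0;
    if (level > 7)
        return 7;
    return level;
}

/* Returns a zero-initialised context, or NULL if allocation fails. */
struct abc_ctx *abc_ctx_new(void)
{
    return calloc(1, sizeof(struct abc_ctx));
}

void abc_ctx_free(struct abc_ctx *ctx)
{
    free(ctx);           /* free(NULL) is a no-op */
}

int abc_ctx_get_level(const struct abc_ctx *ctx)
{
    return ctx->level;
}

void abc_ctx_set_level(struct abc_ctx *ctx, int level)
{
    ctx->level = clamp_level(level);
}
```

Because callers never see `sizeof(struct abc_ctx)`, the struct can grow between releases without forcing anyone to recompile, which is the same effect the Pimpl idiom aims for in C++.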

Good library design is a real art. One of my favourite points I've read over the years is that libraries should be hard to misuse. If you can build an API that is impossible to misuse, it will be easier to use, and there is a much lower risk of developers forming the wrong mental model of your library.


Other useful links (in no particular order)