Smooth 2D collisions, part 1: simple disk-segment

TL;DR: The full code is at and is 80 lines long.
There’s also an interactive standalone SDL demo to play with.

We consider a disk of radius RAY,
moving inside a set of solid segments,
as shown below.

This requires very little math knowledge (only basic vector algebra), and is rewarding and fun!

Collision response

We start with collision response:

The function slideMove is the entry point to our “physics” code.
It calls the collision detection code, collideWithSegments,
and provides basic collision response.

It takes as input a starting position pos, and a “wanted” move vector delta.

It also takes the list of segments we want to test collisions against.

When it returns, pos is updated with the new non-colliding position.

void slideMove(Vec2& pos, Vec2 delta, span<Segment> segments)
{
  // move to new position ...
  pos += delta;

  // ... then fix it, if needed
  for(int i = 0; i < 5; ++i)
  {
    auto const collision = collideWithSegments(pos, segments);

    if(collision.depth == 0)
      break; // all collisions are resolved

    // fixup position: push the disk out of the segment
    pos += collision.N * collision.depth;
  }
}

The idea is dead simple: we force the new position, and patch it, until it’s not colliding anymore.
Or until we reach the arbitrary limit on iteration count.
We’re using discrete collision detection, so simultaneous collisions are probable.
We might need a couple of iterations to handle them.

Please note that we might miss collisions if objects move too fast.
I won’t go into this problem here; it’s out of scope for today.

Note that nothing in this function actually depends on segments.
This code just passes them through; it’s agnostic to what they’re made of.
Also, nothing depends on the fact that we’re dealing with a disk.

Now, we need a routine to actually detect – and measure – collisions.

Collision detection

Disk/segment collision

The following code computes one collision between a disk and a segment.
The disk has a radius of RAY.

We first compute the segment’s closest point to the disk center.
delta represents the vector going from the closest point to the disk center.
If the length of delta is more than RAY, the disk is too far: we exit with no collision.

Otherwise, we return the following collision:

  • the collision depth is the disk radius, minus the length of delta.
  • the collision normal is the normalization of delta.

Collision collideDiskWithSegment(Vec2 diskCenter, Segment seg)
{
  auto const delta = diskCenter - closestPointOnSegment(diskCenter, seg);

  if(delta * delta > RAY * RAY)
    return Collision {};

  auto const dist = magnitude(delta);
  auto const N = delta * (1.0 / dist);
  return Collision { RAY - dist, N };
}

Please note we’re never using the segment’s normal.
This is what allows a smooth collision response (i.e. sliding)
when the disk touches the extremity of a segment.

Nearest point on a segment

So we’re left with one last problem to solve:
getting the point from a given segment [a;b] that’s closest to a given point P.

Let L be the infinite axis going through a and b, oriented from a to b.
Let I be the orthogonal projection of P on L.
Let tangent be the vector going from a to b.

  • If the scalar product (P – a) * tangent is negative, this means that I doesn’t lie on [a;b], it’s “before” a.
    In this case, the nearest point to P is a.
  • If the scalar product (P – b) * tangent is positive, this means that I doesn’t lie on [a;b], it’s “after” b.
    In this case, the nearest point to P is b.
  • Otherwise I lies on [a;b]: compute it, and return it.

(Note that we avoid the costly call to normalize in most cases here).

// returns the point from the segment 'seg' which is the closest to 'P'
Vec2 closestPointOnSegment(Vec2 P, Segment seg)
{
  auto const tangent = seg.b - seg.a;

  if((P - seg.a) * tangent <= 0)
    return seg.a; // 'P' is before 'a' on the line (ab)

  if((P - seg.b) * tangent >= 0)
    return seg.b; // 'P' is after 'b' on the line (ab)

  auto const T = normalize(tangent);
  auto const relativePos = P - seg.a;
  return seg.a + T * (T * relativePos);
}

Dealing with simultaneous collisions

There’s still one thing we need to deal with.
What if the disk intersects two segments at the same time?
Do we simply return the first one we find?

Actually, doing this doesn’t work here.
If you do this, your disk is going to get stuck on the vertices shared by adjacent segments.

Let’s consider the following situation:
the floor is perfectly flat and horizontal,
but composed of two segments A and B.

Now suppose the disk moves from its previous, non-colliding position
to the colliding position C.

Now we’re colliding with both A and B.
The problem is, we don’t get the same collision normal, depending on whether we choose to collide with A or B.
Let J and K be the closest points to C, on segments A and B.

If we decide to collide with A, then we compute a collision normal as the normalization of the JC vector.
This vector is perfectly vertical,
which leads to a perfectly vertical upward repulsion.
This is what we want.

If we decide to collide with B, then we compute a collision normal as the normalization of the KC vector.
This vector is mostly vertical, but also has a slight horizontal component.
If the disk is slowly sliding along the ground, from left to right,
this horizontal component can cancel out the disk’s horizontal motion.
In practice, the disk gets stuck when it touches K.

To avoid this, we always start by resolving the collision with the deepest penetration:

Collision collideWithSegments(Vec2 pos, span<Segment> segments)
{
  Collision deepest;

  for(auto seg : segments)
  {
    auto const collision = collideDiskWithSegment(pos, seg);

    if(collision.depth > deepest.depth)
      deepest = collision;
  }

  return deepest;
}

If you want to see for yourself why this selection is needed,
simply neutralize the above condition,
in order to always reassign deepest,
and then launch the demo program.
Now try to slowly slide upwards along the rightmost vertical wall.
If you slide slowly enough, you will get stuck halfway.

I must confess, I’m not sure if this is the correct way of doing things.
It might be possible to create a situation
where taking the shallowest penetration would be the right answer.
However, I wasn’t able to find one, and this strategy seems to work in practice.

Putting it all together

The full source code is in the github repository, in the “disk_vs_segment” branch:

This code is simple and robust; unlike continuous collision detection,
it deals gracefully with invalid initial configurations,
and will smoothly resolve any already-penetrating state.

At this point, you should be able to write the physics
for a 2D polygon-based top-down shooter, like Doom!

In the next post, I’ll explain how to extend this collision code
to handle axis-aligned bounding boxes.

Stay tuned!

Fancy tools can’t tame over-complicated designs

Bugs are only the tip of the iceberg … and over-complicated designs eat fancy tools for breakfast.

The other day, I was playing with reversing challenges (crackmes), and I had a funny thought.

The principle of crackmes is simple: you’re given a small executable file, for which you don’t have the source code. When launched, it asks for some secret password. Your goal is to find the password, generally by the analysis of the given executable binary.

Binary reversing requires many tools, amongst them a debugger (generally gdb/Ollydbg/x64dbg), with every conceivable type of conditional breakpoint, time travel, highly precise execution traces, etc.

And I realized a funny thing: to understand the workings of a closed-source program that was obfuscated on purpose, I was using nearly the same tools and techniques that people use to understand their own code.

“The amount of energy necessary to refute bullshit is an order of magnitude bigger than to produce it.” (Alberto Brandolini)

“Everyone knows that debugging is twice as hard as writing a program in the first place.” (Brian Kernighan)

Suppose we play the following game: I must write a program, following some known spec, and you must find all bugs in it. However, I’m an evil coder, and my goal is to sneak in bugs that you won’t catch.

You can’t win at this game. Sneaking in a bug will take me “one order of magnitude” less energy than what you will need to detect it. To be able to beat me at this game, you might need, at least, a team of 10 skilled QA engineers, probably with access to the source code.

I think there’s a deep idea behind this; probably something related to entropy; the idea that it’s always easier to make a mess than to clean it up afterwards. And, probably, easier “by one order of magnitude”.

Tools work exclusively on the “mess cleaning” side. I’m not talking only about debuggers; I’m also talking about sanitizers, static analyzers, and, to some extent, fuzzers and integration tests.

These tools are thus fundamentally condemned to require horribly sophisticated techniques/heuristics, in order to “untangle” what the programmer wrote.

Relying on this “mess cleaning” side to understand the program you’re working on is an extremely slow approach. Sometimes, it’s the only possible one (e.g. malware analysis).

But more often, it’s simply the default approach most developers think of (on this specific point, one could argue that Visual Studio’s great debugger harmed the software industry).

Finding bugs is all well and good, but it’s “one order of magnitude” less efficient than having a codebase that’s not full of traps; a codebase that minimizes your probability of writing code that introduces new mistakes.

Let’s suppose that one day, you have a magical compiler that is able to tell you, instantly and with certainty, whether you’ve just introduced a bug.

Bad software designers would still end up in some dead end: at this point, the design would have become so complicated that the vast majority of attempts at modification would introduce a new mistake. The magical compiler would catch them early, and the bad software designer would have to undo them, and try again.

They would become trapped in their own over-complicated design. Fragile code would have been transformed into rigid code; and the goal would have been missed.

(You don’t need a magical compiler to experience this situation, a badly written suite of unit tests is enough – and it’s possible that many of our codebases already are in such dead-ends, it’s just that we can’t directly see it).

Thus, even the best imaginable tool doesn’t solve the issue. This is because catching all my mistakes early is only the tip of the iceberg. I also want to be able to add features in a reasonable time.

In one word, I want my code to be correct, and flexible.

And flexibility isn’t a refinement of correctness. Quite the opposite, in fact.

Non-intrusive test methods are exactly the tools that act on the “mess cleaning” side I was describing earlier.

Let’s suppose you put a 9-year-old in front of a word processor, and you ask him to write a 300-page novel.

Now, you read the result. The first thing you notice is that it’s full of spelling mistakes, and ambiguous wordings.

Now, you say to yourself: “ok, so now, I just have to fix all the mistakes”, and you launch your favorite spellchecker.

Interestingly, after the spellchecking + corrections, the text is still very poor, and you don’t understand why. Yet, you’ve fixed all the mistakes!

You read the text again, and this time, you notice that some very subtle spelling mistakes have made their way through the spellchecker. So you think: I need a better spellchecker!

If you can find a spellchecker able to tell you what corrections you need to make to transform your initial text into Shakespeare, good for you; your tool is very powerful (and can probably replace you).

But you don’t necessarily want Shakespeare; Dan Brown would be okay!

So you decide to ignore the remaining spelling mistakes and start digging into semantics. And then, you realize that the text is full of inconsistencies, like characters using information they’re not supposed to know. Or things happening simultaneously in several places. Or causality violations.

And then, you say to yourself: I know what I need: a tool to check the consistency of a story! And by chance, it’s exactly the subject of a research paper published 3 months ago, describing a heuristic to detect inconsistencies in written stories. Great!

Except that it doesn’t detect all inconsistencies, and worse: it doesn’t tell you how to fix the ones it detects.

So you start patching one by one all the errors reported by your new tool. Two weeks of tedious rewording later, the tool doesn’t report anything anymore.

You end up with a patched text. Each paragraph is now nearly consistent, but still, the full text isn’t. By focusing on mistakes detected by the tool, you have, without knowing it, applied fixes to very local issues, and ignored global ones.

And this way of fixing has simply moved the errors (inconsistencies) out of the detection zone of your new fancy tool. Even worse: it’s now even more difficult to modify your story without creating new inconsistencies.

In one word, you just made your novel more difficult to validate, and more rigid. And in the process, you have blinded a state of the art error-detection tool. And your novel still doesn’t even come near Dan Brown.

I think we all agree that this clearly isn’t a good way to write a novel. A better way would probably be to teach your writer how to write better.

Don’t get me wrong: I’m not saying we should trash our error checking tools, absolutely not! I’m saying that we should see them more as educational tools than as validation tools.

This means that it is your 9-year-old, not you, who must analyze and understand each reported mistake: not in order to fix his text, but in order to fix his writing technique.

It’s intrusive; it’s feedback to the “mess making” side from the “mess cleaning” side, which allows the writer to understand how not to make a mess (and by the way, in my opinion, it’s the biggest benefit of TDD).

With software development, it’s roughly the same story.

There’s some “writer” process, which corresponds to the modification of your source code, and a “reviewer” process, which corresponds to the analysis and understanding of what you have written.

Of course, in practice, both processes are intertwined (except for adepts of “Debug-Later Programming”, as described in ).

There are lots of ways you can “analyze” what you’ve written. Anything that gives you feedback about your code counts. This includes: unit test failures, poor code coverage, static analysis failures, bug reports from customers, build times, high cyclomatic complexity, integration test failures, high defect rate, etc.

This is how one learns to keep the code simple.

If your testing methodology doesn’t have feedback to modify the way you write code, you’re doing it in a suboptimal way.

I postulate that having this feedback is the only viable way to write software.

Bugs are only the tip of the iceberg. You don’t want to waste your time attacking them one by one. You have to attack instead the underwater factory that’s producing those bugs at a faster-than-you-can-handle rate.

And it has a name, it’s called software engineering.

LLVM, LDC, and Emscripten: an introduction

More details about the toolchain

For those unfamiliar with the tools described in the other blogpost “Running a D game in the browser”, here are some short descriptions of what they are, and how they work.


LLVM

LLVM is a compiler project, whose aim is to have a modular architecture and a hacker-friendly codebase. The goal is to allow people to develop custom tools around the compilation process.

The LLVM project revolves around its intermediate representation format, also known as “LLVM bitcode”. Most compilers use some kind of IR internally, but they don’t generally make it possible for the user to mess with that IR.

Not having this flexibility with the compiler intermediate representation prevents lots of cool things. As a quick example, this IR could be used for caching, instead of using preprocessed files like ccache does, thus allowing a language-agnostic compilation cache.

And it also makes it a lot harder to write custom frontends or custom backends. If the set of backends isn’t available as a library, and the intermediate representation can’t be put into a file, then it means you have to integrate your custom backend in the main compiler executable, which isn’t always pretty if the original compiler isn’t modular enough.

LLVM has made a strategic design choice here. LLVM frontends can be built separately from LLVM, and they allow the user to generate LLVM bitcode from her source file, instead of generating machine code for a specific target.

This bitcode format is mostly target-architecture agnostic. This doesn’t mean compiled programs are architecture agnostic too: for example, C programs can explicitly introduce architecture-dependent things by using directives like #ifdef _X86_ or the sizeof operator. Actually, this bitcode format could be seen as some kind of flattened C, i.e. without control-flow structures. There are structs, there are function pointers, there are integer types, etc.

Once a program has been compiled to LLVM bitcode:

  • it’s freed from the specificities of the language it was written in
  • it has not yet been polluted with the specificities of any target architecture

At this point, you can compile it to executable code for a real architecture (with the LLVM compiler, ‘llc’), or you can interpret it (with the LLVM interpreter, ‘lli’).

Technically, a program in LLVM bitcode is a file whose extension is “.bc” (binary LLVM bitcode), or “.ll” (the human-readable version of LLVM bitcode, which looks like assembly language). LLVM provides tools to convert between the two representations.


Some LLVM frontends: clang and ldc

The C++ frontend targeting LLVM is called “clang” (pronounced “Klang!”). clang can output LLVM bitcode by adding “-emit-llvm” to the command line.

The LLVM frontend for the D programming language is called “ldc”. It’s a clever assemblage of the official D frontend from dmd (the reference D compiler) and code generation logic that emits LLVM bitcode. ldc can output LLVM bitcode by adding “-output-ll” or “-output-bc” to the command line.

At this point, I should mention that, although lots of good things can be said about LLVM and its frontends, there is also a lot of clumsiness that can sometimes get painful to work with. The command lines of nearly all LLVM tools, including the frontends, are messy and heterogeneous; the tools can’t seem to agree on a common syntax. For example, to specify the output file, sometimes it’s “-o <file>”, sometimes it’s “-of<file>”, sometimes it’s “-o=<file>”. The way of specifying target architectures on the command line is horribly counterintuitive.

The LLVM frontends are pluggable, which means they can be built out of the LLVM tree; LLVM doesn’t need to know about them. For example, if you’re on Debian, you can just “apt-get install llvm-3.8-dev” and start writing your custom frontend for your custom toy language.

The LLVM backends, implementing the targets, on the other hand, aren’t pluggable. Indeed, the frontends need to know about them, as each frontend executable somehow embeds the list of available backends.

In a perfect world, words like “x86”, “arm”, “linux”, “windows” should never appear in the frontend code. They should be dynamically provided by the backend, so the frontend can, for example, #define the appropriate symbols (_WIN32, __linux__, etc.) so the source code can potentially depend on it.

In a perfect world, backends would be compiled as plugins, and writing your own backend should be as easy as writing your own frontend. In one word, to add a new backend, you shouldn’t have to modify the LLVM code itself, nor should you have to even recompile it, and you shouldn’t also have to recompile any frontend.

This point is a serious design weakness, and it’s the root of nearly all the difficulties I had to deal with in this project.


asm.js

asm.js is a highly-optimizable subset of Javascript. It mostly relies on enforcing static typing through standard Javascript constructs. So, asm.js code is technically valid Javascript code, and most modern browsers will recognize asm.js code as such, and apply special optimizations before feeding it to their Javascript VM.

If you haven’t seen the Emscripten-powered demo, “BananaBread”, go see it now:

Then you will understand why this technique is a game changer: now we can play first-person shooters in our browsers!

asm.js is not exactly the future, though. Technically, it’s a hack, an attempt to change the way browsers work, to make them a decent target platform for non-web static languages, like C++, Rust … or D.

A non-hacky solution, WebAssembly, is currently being developed, and its support is making its way into our browsers.

However, at the moment, I consider WebAssembly not mature enough: I’m going to stick to asm.js here, and to its main compiler: Emscripten.


Emscripten

Emscripten is a hell of a cool idea. In one word, it’s a translator from LLVM bitcode to asm.js.

In theory, once a program is converted to LLVM IR, it doesn’t matter what language it was written in: it could be C, C++, Rust, or anything that compiles to LLVM bitcode (I’m disregarding the runtime environment here). So Emscripten should work with any language having an LLVM frontend (and whose semantics don’t differ too much from C++), including … the D programming language.

To add some difficulties to the issue, as LLVM backends aren’t pluggable, Emscripten actually comes with its own forked versions of LLVM and clang (namely “fastcomp” and “fastcomp-clang”), against which ldc and other LLVM tools generally cannot be compiled, because of incompatible API versions.

Emscripten also consists of a great runtime environment, where a lot of I/O APIs (SDL, SDL_mixer, OpenGL…) have been re-implemented.

This is where most of the magic occurs, actually. It doesn’t matter how a function call like “SDL_SetVideoMode(640, 480, 32, 0)” gets translated to its Javascript equivalent.

What really matters here is how the callee is implemented (hint: in Javascript). Compiling SDL to Javascript using the same method would just push the problem further: the SDL code itself calls other APIs (winmm, DirectDraw, X11 …) that you obviously don’t have in a browser.


Running a D game in the browser

TL;DR: After two years of experimentation, I finally managed to run D code inside my browser. The conversion chain is D-to-LLVM-to-C-to-ASMJS, and uses ldc, Emscripten, and an unexpectedly useful tool from the Julia community.

Demo here:

First attempt with LDC

My initial idea was to compile ldc against emscripten-fastcomp (the Emscripten fork of LLVM), so I would get a D compiler that would target Javascript (spoiler: it doesn’t work).

First, I had to get it to build. LLVM APIs are incompatible between major releases, and the ldc build system relies on the version number reported by llvm-config. However, as the emscripten-fastcomp API is halfway between two LLVM major releases, this version number makes almost no sense, and I had to modify ldc to get it to build.

At this point, however, the only thing I could get when trying to compile D code to Javascript was assertion errors. It turns out the generated bitcode doesn’t pass the “PNaCl legalization passes”. I have no clue what this means; however, I do understand that the bitcode generated by ldc is different in nature from the one generated by clang (most probably, it uses parts of the bitcode format that clang doesn’t use).

After months of trial and error, I finally gave up and, with a heavy heart, went back to C++. I made two toy projects using the SDL and vanilla Emscripten.


Writing these two projects was an incredibly instructive experience. It allowed me to fully realize the inherent limitations of running inside a browser. This was also the occasion to learn the OpenGL core profile, which was, at the time, the only way to do 3D graphics from Emscripten.

The source code for these two projects is available at the following Bazaar repositories:

$ bzr checkout

$ bzr checkout

A new hope: the “resurrected” LLVM C backend

So, compiling C++ to Javascript works. If I could translate my D programs to C++, then I could feed them to Emscripten without having to mess too much with the bitcode files, and without having to mess at all with fastcomp internals.

It turns out LLVM currently has a “C++ backend”: however, it produces C++ code making LLVM API calls, whose execution results in the construction of LLVM data structures representing your program. Obviously, this isn’t what we want here.

LLVM used to have a real C backend, translating LLVM bitcode to standalone C code. This is exactly what we need. Alas, this project was unmaintained, and it was discontinued by the LLVM developers mid 2012, with the release of LLVM 3.1.

The LLVM C backend seems to have been resurrected by Julia developers in the summer of 2014. The project is named llvm-cbe and is available on github. And it’s great, because it means I can now translate a D program back to C code … and then feed it to Emscripten!

Well, in theory. In practice, things didn’t come out very smoothly.

The C backend strikes back

Firstly, the LLVM C backend in its current state requires LLVM 3.7 exactly. And of course, you can’t compile it against fastcomp (the emscripten LLVM fork). So we’re gonna have to deal with two LLVM toolchains here, let’s hope everything will be compatible.

Secondly, the LLVM C backend sometimes generates invalid C code, i.e. code which doesn’t compile. Once again, it seems that its authors restricted themselves to the bitcode subset targeted by clang (although, with a lot of effort, it’s possible to craft a C++ file whose compilation with clang will produce evil bitcode, and this evil bitcode will lead llvm-cbe into producing invalid C code).

I had to fork llvm-cbe and fix some issues myself, hoping my pull requests will be accepted.

The return of Emscripten

At this point, I’m able to translate some D programs to standalone-gcc-compilable C code. Which means the only thing left is to feed it to Emscripten.

So, to summarize the whole working compilation pipeline:

  • I generate LLVM bitcode for all my D source files with LDC (compiled against LLVM 3.7)
  • for practical reasons, I use llvm-link to merge all my bitcode files into one single bitcode file.
  • I compile this single bitcode file back to C source code, using llvm-cbe (also compiled against LLVM 3.7)
  • I feed the resulting C file to Emscripten (which internally uses its own fork of LLVM 3.9)
  • (Optionally, I feed the resulting C file to gcc, to generate a native app. It’s fun to be able to do that, and it’s a lot easier to debug)

Now comes the fun part: writing some D code which uses this toolchain. What I ended up with is a minimalistic real-time game using the SDL.


You can play the demo at:

The source code for the demo, the build scripts, and the toolchain deployment scripts are available at:

Limitations, and the future

The D subset

At the moment, I’m only targeting a tiny subset of the D programming language. The language comes with a runtime, featuring a garbage collector. It’s my first journey to the land of D-asm.js, and there’s no way I’m bringing the D runtime with me – especially if I need to port it to Javascript first.

Although most D programs make heavy use of the runtime, with some effort, programs that don’t require it can be written. Just tell your linker to omit any default libraries, and your code is guaranteed not to make any calls to the runtime.

This is what I’m aiming for at this point: having the runtime feature-level of C, which means no GC, no dynamic arrays, no runtime type information, no “new” operator, and no exceptions.

Due to an error in llvm-cbe (related to global variable declaration order), I don’t have support for classes yet.

On the other hand, I still benefit from some great parts of D that happen at compile time: mixins, templates, traits, and of course CTFE.

I also don’t have Phobos, the D standard library. Most of it relies on the D runtime, or does API calls to the underlying OS. Technically, these parts would have to be rewritten in Javascript (and maybe one day integrated to Emscripten).


The toolchain

As you may have seen if you played with the deployment scripts, the toolchain is not very pretty. It requires 3 directories to be added to your PATH, amongst them two different versions of an LLVM toolchain.

An obvious enhancement would be to use the WebAssembly backend of upstream LLVM, and then to use Emscripten+Binaryen to convert the generated WebAssembly to asm.js. This would allow me to ditch llvm-cbe, emscripten-fastcomp and emscripten-fastcomp-clang, resulting in a lighter and more consistent toolchain.

At the moment, I’ve only given this idea a quick (one weekend) try, but the incompatible version requirements of the different tools are driving me crazy. llvm-cbe requires LLVM 3.7 exactly, and doesn’t provide a CMakeLists.txt; however, the autoconf-based build system of LLVM is deprecated in favour of cmake, and recent versions of LLVM (3.9 at the time of writing) don’t provide a ./configure script anymore.

However, I kind of like the idea of being able to convert anything back to C code. C compilers are going to be around for a long time; C as a target platform is a far safer bet than any other format, including LLVM bitcode.

Your ideas

My goal is to continue writing my programs in D, in the most portable way, without fear of potential future platform restrictions. The technique I just described is too fragile at this point: it works around many incompatibilities and limitations. However, I believe this proof-of-concept paves the way for non-hacky enhancements.

If you have any idea on how to improve this technique, please let me know. Don’t hesitate to fork the project on github, to make suggestions, or to ask me questions, or even to suggest enhancements to the current flappy-bird-frustrating gameplay of the demo!



B.S. is for build system

I have recently stumbled upon this insightful (TM) article:
The Convergence Of Build Systems And Package Managers

It’s a nice synthesis of the design issues we currently have with software build systems and package managers; especially their current tendency to go monolithic – and thus, to reinforce the frontiers between programming languages.

At this point in time (2016), each new programming language seems to require its own build system and/or package manager.

Perl has cpan, Python has easy_install, Javascript has npm, D has dub, Rust has cargo …

Clearly, something is wrong here. The programming language shouldn’t be the central abstraction around which my whole project revolves. I want to be able to mix any two languages in the same project, and have only ONE build system to invoke to build everything. And I mean in parallel, of course.

We can’t seem to get the build system architecture right. There really are lots of them, which means we might not have found the solution yet.

GNU Make, which is language agnostic, is definitely one popular standard, but has lots of limitations. Some of them are superficial (unusual cabalistic syntax, text-based parser), some others are a lot more worrying: automatic dependency generation, ‘remake’ pass, no clear difference between file targets and phony ones, the sole existence of ‘make clean’, the need to manage temporary files yourself, etc.

A lot of build systems try to lure us into thinking that all the needed information can be deduced from the source files. It’s a fallacy!
Additional information, supplied at the build system level, will always be needed: think about OS dependent source files, generated source files, additional link libraries, DSL source files, pluggable factories (link-time registration), preprocessor defines, optimization flags, etc…

Some of those issues can be mitigated with compiler hacks like #pragma comment(lib, "winmm.lib") (good luck with portability, or library search directories) … or text mixins (see ) (good luck with automatic dependency generation).
My take on the issue is that we’re heading the wrong way: we’re trying to fit into source files information which really doesn’t belong there.

Our source files most probably already contain too much information.

A big part of the problem: build systems have to dig into source files to compute part of the build configuration (especially the dependency graph). This is a violation of layered architecture. It’s a lot harder to be programming-language agnostic if you need to dig into source files. You can mitigate the language-agnosticism issue with more hacks like “gcc -MM” (which only outputs GNU Make syntax) – but what about generated files? Especially if they’re going to be generated… in a separate “output” directory.

My point is the current compiler interface we all use might be wrong:

ObjectFile compile(SourceFile src)

What we might need is, at compiler command-line level, something more like:
ObjectFile compile(SourceFile src, SourceFile[] imports)

The array of “imports” files would be dynamically translated by the compiler into “import/#include/use” directives.

In plain old command-line English, it goes like this:

$ gcc main.cpp --import lib.h --import client.h

You already know this technique. It’s called dependency injection!

This is a significant change in our programming habits. It requires additional discipline over our usage of ‘import’ directives – discipline we already respect most of the time, by grouping our ‘import’ directives at the top of our source files. C programmers might feel a lot more restricted (they will no longer be able to include the same file several times with different symbol #defines).

It greatly reduces friction when it comes to testing classes; no need for complicated dependency injection frameworks, just change your compiler command line to ‘import’ the mock versions of its dependencies.

This technique also brings you, by construction, the whole DAG of your source file dependencies. No need for a first build pass to print a dependency graph, or for funny things like “dynamic dependency discovery”. After all, this graph IS static.


Generating mazes, part 2: adding keys and structure

In a previous post, I described how to generate a ‘logic’ maze, i.e. a graph representing color-connected rooms. These mazes had no walls, nor corridors. In order to have something displayable, I had to work around spatial layout issues by using a trick (starting from a planar graph whose nodes already had a position – in short, triangulating a bunch of random dots).

I also left aside the problem of key placement. It turns out having locked doors in your maze takes on a whole new dimension once you have the possibility to unlock them!

At this point, our mazes have no walls, nor keys. These two issues are completely unrelated; however, I’m going to describe how I solved them both.

First, the walls. The trick is to start with what I call a “grid-graph”: nothing more than the graph of connections between adjacent cells in a grid. Here’s an example:


Like we did before, we dig our maze by extracting a random spanning tree from this graph. However, there’s a slight twist here. First, by convention, the root node is always the top-left one. Second, in order to have corridors, I decided to favor keeping “horizontal” edges in my spanning tree. This way, the connections between cells are mostly horizontal, causing horizontal corridors to be generated. This kind of maze could be used, for example, in a 2D side-scroller.


OK, but where do the walls come from? I store them in a 2D matrix of cells, which I call a CellMap. Each cell holds two booleans: connectedLeft and connectedUp. They encode whether the cell is connected – i.e. has no wall – to its left and upper neighbors.

Then, I need to be able to get the CellMap corresponding to a given grid-graph. The trick here is to have a bidirectional mapping between graph nodes and cell positions, implying a bidirectional mapping between graph edges and walls.

The position of a node is simply encoded in its node index, like this:

auto nodeX = nodeId % MAP_WIDTH;
auto nodeY = nodeId / MAP_WIDTH;

Once we have a spanning tree, we can apply what we already did in a previous post: from the tree, extract one sub-tree, whose size is roughly one arbitrary fixed fraction of the whole tree. Then, close the door by coloring the edge connecting this sub-tree to its parent node.


Now that we have closed a red door, it’s the best time to drop the red key: we just have to avoid dropping it behind the red door. Remember, the root of the spanning tree is the top-left node. Although all gray cells would be theoretically acceptable locations for our red key, I chose to drop it in the deepest accessible leaf of the tree.

You can see below the red cross, representing … the red key.


So far, so good, we have isolated the “red” zone. Now, let’s repeat the process for the next zone. First, delimiting it:


Then, dropping the corresponding key somewhere deep:


Please keep in mind – it should be obvious at this point – that we’re locking zones in reverse order. The player will first need to get the green key, go through the green door, get the red key, then go through the red door.

One more thing: the simplistic key placement heuristic I described above might select a leaf-cell already holding a key. In reality, the generator computes the list of leaf-cells, and selects the deepest one which doesn’t currently hold a key.

Skipping to the end: here are the 6 locked zones.


Just for fun, here’s the same algorithm, applied to a bigger map:


We now have zones, walls, and keys! What’s still missing? Cycles. I explained in my previous post how to handle them; I just didn’t implement them in this demo.

The next step will be visual: my goal is to generate mazes suitable for 2D video games, so I’m going to need to mess around with tiles and graphics, to achieve something like this:


As always, the whole source code for the generator is available on Launchpad here.

Reversing dependencies with pluggable factories

What if you could extend your application without modifying existing code, but only by adding new code? In software design circles, this is called the open-closed principle:

software entities (functions, classes, modules, applications, etc.) should be open for extension, but closed for modification.

At first, adding new behaviour to an existing code base without modifying it seems impossible. I’m going to describe here a design pattern called “pluggable factories”, which allows us to do exactly that.

TL;DR : think about plugins.

Let’s suppose we’re dealing here with an SVG rendering program. This program takes an SVG vector input, renders it to a frame buffer, then writes it to disk in the BMP format. The high-level code may look like this:

// main.cpp
#include "lib_bmp.h"

int main(int argc, char** argv)
  auto input = argv[1];
  auto output = argv[2];

  auto svg = loadSvg(input);
  auto framebuffer = render(svg);
  writeBmp(framebuffer, output);

  return 0;

At first, this seems fine, until you decide that you need other output formats. Then the code starts to look like this:

// main.cpp
#include "lib_png.h"
#include "lib_gif.h"
#include "lib_bmp.h"

int main(int argc, char** argv)
  auto input = argv[1];
  auto output = argv[2];
  auto format = getExtension(output);

  auto svg = loadSvg(input);
  auto framebuffer = render(svg);

  if(format == "png")
    writePng(framebuffer, output);
  else if(format == "gif")
    writeGif(framebuffer, output);
  else if(format == "bmp")
    writeBmp(framebuffer, output);
  else
    fatal("Unknown format");

  return 0;

Clearly, something is wrong here. This code doesn’t conform to the open-closed principle, as it’s not immune to the addition of new formats. Each new format requires modifying the code of the main workflow; thus, we will never be able to “close” this workflow code. That’s a pity, because the output format has become an annoying detail: what our application really does is “render SVG files to raster picture files”, regardless of the concrete output format.

One can mitigate the issue by extracting a new function, ‘writePicture’:

// main.cpp
#include "lib_png.h"
#include "lib_gif.h"
#include "lib_bmp.h"

int main(int argc, char** argv)
  auto input = argv[1];
  auto output = argv[2];
  auto format = getExtension(output);

  auto svg = loadSvg(input);
  auto framebuffer = render(svg);
  writePicture(framebuffer, format, output);

  return 0;

void writePicture(Framebuffer pic, string format, string output)
  if(format == "png")
    writePng(pic, output);
  else if(format == "gif")
    writeGif(pic, output);
  else if(format == "bmp")
    writeBmp(pic, output);
  else
    fatal("Unknown format");

At least, the ‘main’ function now has a clean workflow: read input, render it, write it. It’s becoming readable again.

However, it still depends on the availability of ‘writePicture’, which in turn depends on the availability of … all the format functions. In short, although our ‘main’ workflow seems isolated from the low-level details, it still can’t be re-used, nor tested, without them.

The “dependency inversion principle” tells us that abstractions should not depend on concretions. In other words: don’t depend on something more volatile than you.

Currently, our workflow depends on concrete picture formats. We need to reverse this dependency: the concrete picture formats should depend on the workflow, and the workflow should depend on… nothing.
It might not sound familiar, but I guarantee you it is; just have a look at the following code, it’s basic object-oriented design (i.e. polymorphism):

// main.cpp
int main(int argc, char** argv)
  auto input = argv[1];
  auto output = argv[2];
  auto writer = createPictureWriterForFormat(getExtension(output));
  workflow(input, output, writer.get());
  return 0;

// workflow.h

struct IPictureWriter
  virtual ~IPictureWriter() {};
  virtual void writePicture(Framebuffer pic, string outputPath) = 0;

void workflow(string input, string output, IPictureWriter* format);

// workflow.cpp

void workflow(string input, string output, IPictureWriter* format)
  auto svg = loadSvg(input);
  auto framebuffer = render(svg);
  format->writePicture(framebuffer, output);

// output_formats.cpp
#include "workflow.h"
#include "output_format_png.h"
#include "output_format_gif.h"
#include "output_format_bmp.h"

unique_ptr<IPictureWriter> createPictureWriterForFormat(string format)
  if(format == "png")
    return make_unique<PngWriter>();
  else if(format == "gif")
    return make_unique<GifWriter>();
  else if(format == "bmp")
    return make_unique<BmpWriter>();
  else
    fatal("Unknown format");

// output_format_png.h
#include "workflow.h"
#include "lib_png.h"

class PngWriter : IPictureWriter
  void writePicture(Framebuffer pic, string path)
    writePng(pic, path);

// output_format_gif.h

#include "workflow.h"
#include "lib_gif.h"

class GifWriter : public IPictureWriter
  void writePicture(Framebuffer pic, string path)
    writeGif(pic, path);

// output_format_bmp.h
#include "workflow.h"
#include "lib_bmp.h"

class BmpWriter : public IPictureWriter
  void writePicture(Framebuffer pic, string path)
    writeBmp(pic, path);

It’s a lot better now. Don’t be scared by the number of lines: that’s certainly more text, but exactly the same amount of executable code. The only thing we did was separate this executable code into properly-labelled boxes.

Our SVG-to-raster conversion workflow can now handle new formats without having to be modified.

However, one thing still isn’t quite right: the “createPictureWriterForFormat” function. It’s what we call a “dependency magnet”: during the lifecycle of the application, its dependency count is only going to grow, as more formats get added.

Don’t get me wrong: we certainly need to have, somewhere in our application, a list of output formats. However, as soon as such a list is present in one source file, we have created a dependency magnet. What if we could build such a list at link time?

Enter the pluggable factories.

The principle is very simple: each output format will use the global construction mechanism to register itself to the application. The result looks like this:

// output_formats.cpp
#include "workflow.h"

static map<string, WriterCreationFunc> g_formats;

void registerFormat(string format, WriterCreationFunc func)
  g_formats[format] = func;

unique_ptr<IPictureWriter> createPictureWriterForFormat(string format)
  auto i_func = g_formats.find(format);
  if(i_func == g_formats.end())
    fatal("unknown format");

  return i_func->second();

// output_format_bmp.cpp
#include "workflow.h"
#include "output_formats.h"
#include "lib_bmp.h"

class BmpWriter : public IPictureWriter
  void writePicture(Framebuffer pic, string path)
    writeBmp(pic, path);

REGISTER(BmpWriter, "bmp")

As you can see, the growing list of #includes has now disappeared from “output_formats.cpp”. How does it work?

The macro “REGISTER” is actually defined like this:

#define REGISTER(FormatWriterClass, format) \
  static auto g_registered = registerMe<FormatWriterClass>(format)

So far, so good: the global initialization of “g_registered” causes “registerMe” to be called. Then, how does “registerMe” work?

Let’s see the whole file “output_formats.h”:

// output_formats.h
#include <functional>
#include <string>
#include <memory>

#include "workflow.h" // for IPictureWriter

using namespace std;

typedef function<unique_ptr<IPictureWriter>()> WriterCreationFunc;

void registerFormat(string format, WriterCreationFunc func);

template<typename FormatWriterClass>
int registerMe(string format)
  auto creationFunc = []()
    return make_unique<FormatWriterClass>();
  registerFormat(format, creationFunc);
  return 0; // this value is actually unused

#define REGISTER(FormatWriterClass, format) \
  static auto g_registered = registerMe<FormatWriterClass>(format)

Don’t be scared by the lambda: the exact same effect can be achieved using a factory class instead (and please don’t tell me that I shouldn’t do “using namespace std” in a header file – that’s not the point here).

The registerMe function simply wraps the proper instantiation of make_unique (i.e. the one which uses a concrete format writer class) into a “WriterCreationFunc”, which we put into a map.

And that’s it; we can now add a new format by simply creating a new source file, and adding it to the linking process:

// output_format_tga.cpp
#include "workflow.h"
#include "output_formats.h"
#include "extra/libs/fast_tga.h"

class MyTgaWriter : public IPictureWriter
  void writePicture(Framebuffer pic, string path)
    fast_tga::writeTga(pic, path);

REGISTER(MyTgaWriter, "tga");

Our application can now write .tga files, although no other source file contains or refers to anything related to TGA; that’s open-closed at its best!

Some comments: this is only the basic technique, and it may need to be refined. Some compilers might issue a warning, as the static variable “g_registered” is unused. Also, using a simple global map isn’t going to work as-is, because of the order of construction: the map might get constructed after some “g_registered” instances. All these problems can be solved with technical tweaks that I’m not going to describe in this post. After all, this post is about abstracting details away! 🙂

Prototyping procedural maze generation algorithms using fibers

I’m fascinated by procedural generation algorithms, especially random maze generation. This morning, I was working on a graph-based dungeon generator.

Most maze generation algorithms I know of mix the “maze logic” (i.e. the graph corresponding to the maze, including doors and keys) with the “maze layout” (the spatial layout of the rooms and corridors).
Actually, most of them are “maze layout” based.

Here, the idea is to start with the “maze logic”, without worrying too much about the “maze layout”. Spatial placement of rooms and corridors, according to a given “maze logic”, will be the subject of another blog post. However, to keep things simple, I only work here with planar graphs.

Here’s how it works:

  1. create a planar graph
  2. compute a random spanning tree for this graph
  3. let’s say you want to subdivide your maze into 4 zones (= 3 doors). The number of nodes per zone (NPZ) is: total node count / 4.
  4. Repeat 3 times:
    1. for each tree node N, compute the total count of its directly or indirectly reachable children; in other words, the size, in nodes, of the subtree whose root is node N, excluding nodes behind doors.
    2. find the tree node N whose subtree size is the nearest to NPZ. Put a door here.
  5. Add some random edges (from the original planar graph) to create some cycles, but beware of not adding edges that would connect different zones.

Fine-tuning a maze generator requires the ability to see what happens between the steps. This means I’m going to need to stop the generation in the middle of the process, and display the state of the generation at the end of each step.

Here’s what the result looks like:

First, create a random planar graph. I do this by Delaunay-triangulating a bunch of random positions.


Then, extract a random spanning tree. The red node represents the root of the tree.


Then, compute the number of descendants of each node.


Find a node having a number of descendants nearly equal to 1/4 of the node count. This is the red-circled one, whose subtree holds exactly 20 nodes.


Next, repeat the operation.


Repeat a third – and last – time. The maze is now divided into nearly equally-sized areas.


In the beginning, my state-machine implementation looked like this:

class MazeGenerator
  this()
    m_state = STATE_INIT;

  void oneStep()
    switch(m_state)
      case STATE_INIT:
        createPlanarGraph();
        m_state = STATE_PLANARGRAPH;
        break;
      case STATE_PLANARGRAPH:
        computeSpanningTree();
        m_state = STATE_SPANNINGTREE;
        break;
      case STATE_SPANNINGTREE:
        createDoors();
        m_state = STATE_DONE;
        break;

  STATE m_state;

  void createPlanarGraph();
  void computeSpanningTree();
  void createDoors();

However, I got the feeling something was wrong, especially when I started wanting to visualize how each individual door was chosen.
Using an enum/switch introduces great friction when adding new steps to the generation algorithm: dividing a step into two substeps is painful, and having a variable number of steps is inconvenient.
It seemed like a good opportunity to play with fibers. Here’s how it goes:

class MazeGenerator
  this()
    m_fiber = new Fiber(&generate);

  void oneStep()
    m_fiber.call();

  void generate()
    createPlanarGraph();
    Fiber.yield();
    computeSpanningTree();
    Fiber.yield();
    createDoors();

  Fiber m_fiber;

  void createPlanarGraph();
  void computeSpanningTree();
  void createDoors();

A lot simpler, isn’t it?

The whole maze generator sandbox project is available on Launchpad:

The source file corresponding to the very technique described in this article is here.