Conway’s Game of Life

This week we take a short break from 3D programming topics and go into gaming! Well, sort of…

A few weeks ago I published on my GitHub page a CUDA implementation of Conway’s Game of Life. The code is pretty simple, well in tune with the simplicity of the game.

The implementation can be found here:

If you are not familiar with the game, Conway’s Game of Life is a 0-player game where cells live and die on an infinite 2D grid. The life/death rules are the following, according to Wikipedia:

Every cell interacts with its eight neighbours, which are the cells that are horizontally, vertically, or diagonally adjacent. At each step in time, the following transitions occur:

  1. Any live cell with fewer than two live neighbours dies, as if caused by under-population.
  2. Any live cell with two or three live neighbours lives on to the next generation.
  3. Any live cell with more than three live neighbours dies, as if by overcrowding.
  4. Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.

Conway’s game is an excellent fit for a GPU: it involves analyzing every cell in the 2D grid and, best of all, each cell’s next state depends only on the previous state of its neighbors and never on their current state.

This means that we can spawn a GPU thread for every single cell in the board and calculate the next state in parallel.
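To make the update rule concrete, here is a minimal CPU-side sketch in Python. It is not the CUDA code from the project: the function name `life_step` and the choice to treat cells outside the board as dead are assumptions made for illustration. The body of the inner loop is roughly the work a single GPU thread would do for its cell.

```python
def life_step(board):
    """Return the next generation of a board given as a list of rows of 0/1.

    Assumption: cells outside the board count as dead.
    """
    rows, cols = len(board), len(board[0])

    def live_neighbours(r, c):
        count = 0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue  # skip the cell itself
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    count += board[rr][cc]
        return count

    # The new board is written separately, so every cell only ever reads
    # the previous generation.
    new_board = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            n = live_neighbours(r, c)
            if board[r][c] == 1:
                new_board[r][c] = 1 if n in (2, 3) else 0
            else:
                new_board[r][c] = 1 if n == 3 else 0
    return new_board
```

Because each cell reads only `board` and writes only its own slot in `new_board`, the loop iterations are independent of one another, which is what makes a one-thread-per-cell mapping possible.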

In the published implementation, the board size is 64×64 cells, so we are effectively spawning 4,096 GPU threads to solve every iteration. We do this for one million generations.

The project has been released under a GPLv3 license, so feel free to download, build it, run it, modify it and share it with others under its terms.

If you are looking for a fun weekend project, the game could definitely use a UI. I’ll give you extra points if you can draw it using OpenGL without ever having to copy the board back from GPU memory into system memory ;-)


Bump Mapping a Simple Surface

In my last post, I started discussing Bump Mapping and showed a mechanism through which we can generate a normal map from any diffuse texture. At the time, I signed off by mentioning next time I would show you how to apply a bump map on a trivial surface. This is what we are going to do today.

Bump Mapping a Door texture. Diffuse and Normal maps taken from the Duke3D HRP. Rendered using Vortex 3D Engine. (HTML5 video, a compatibility GIF version can be found here.)

In the video above you can see the results of the effect we are trying to achieve. This video was generated from a GIF file created using the Vortex Engine. It clearly shows the dramatic lighting effect bump mapping achieves.

Although it may seem as if this image is composed of a detailed mesh of the boss’ head, it is in fact just two triangles. If you could see the image from the side, you’d see it’s completely flat! The illusion of curvature and depth is generated by applying per-pixel lighting on a bump mapped surface.

How does bump mapping work? Our algorithm will take as input two images: the diffuse map and the bump map. The diffuse map is just the colors of each pixel in the image, whereas the bump map consists of an encoded set of per-pixel normals that we will use to affect our lighting equation.

Here’s the diffuse map of the Boss door:

Diffuse map of the boss door. Image taken from the Duke3D HRP project.

And here’s the bump map:

Bump map of the boss door. Image taken from the Duke3D HRP project.

I’m using these images taken from the excellent Duke3D High Resolution Pack (HRP) for educational purposes. Although we could’ve generated the bump map using the technique from my previous post, this especially-tailored bump map will provide better results.

Believe it or not, there are no more input textures used! The final image was produced by applying the technique on these two. This is the reason I think bump mapping is such a game changer. This technique alone can significantly up the quality and realism of the images our renderers produce.

It is especially shocking when we compare the diffuse map with the final bump-mapped image. Even if we applied per-pixel lighting to the diffuse map in our rendering pipeline, the results would be nowhere close to what we can achieve with bump mapping. Bump mapping really makes this door “pop out” of its surface.

Bump Mapping Theory

The theory I develop in these sections is heavily based on the books Mathematics for 3D Game Programming and Computer Graphics from Eric Lengyel and More OpenGL from David Astle. You should check those books for the definitive reference on bump mapping. Here, I try to explain the concepts in simple terms.

So far, you might have noticed I’ve been mentioning that the bump map consists of the “deformed” normals that we should use when applying the lighting equation to the scene. But I haven’t mentioned how these normals are actually introduced into our lighting equations.

Remember from my previous post how we mentioned that normals are stored in the RGB image? Remember that normals close to (0,0,1) looked blueish? Well, that is because normals are stored in a coordinate system that corresponds to the image. This means that, unfortunately, we can’t just take each normal N and plug it into our lighting equation. If we call L the vector that takes each 3D point (corresponding to each fragment) to the light source, the problem here is that L and N are in different coordinate systems.

L is, of course, in camera or world space, depending on where you like doing your lighting math. But where is N? N is defined in terms of the image. That’s neither of those spaces.

Where is it then? Well, N is actually in its own coordinate system, which authors refer to as “tangent space”.

In order to apply per-pixel lighting using the normals coming from the bump map, we’ll have to bring all vectors to the same coordinate system. For bump mapping, we usually bring the L vector into tangent space instead of bringing all the normals back into camera/world space. It seems more convenient and should produce the same results.

Once L has been transformed, we will retrieve N from the bump map and use the Lambert equation between these two to calculate the light intensity at the fragment.

From World Space to Tangent Space

How can we convert from camera space to tangent space? Tangent space is not by itself defined in terms of anything that we can map to our mesh. So, we will have to use one additional piece of information to determine the relationship between these two spaces.

Given that our meshes are composed of triangles, we will assume the bump map is to be mapped on top of each triangle. The orientation will be given by the direction of the texture coordinates of the vertices that comprise the triangle.

This means that if we have a triangle that has an edge: (-1,1)(1,1) with texture coordinates: (0,1)(1,1), a horizontal vector (1,0) represents a vector tangent to the vertices that is aligned with the horizontal texture coordinates. We will call this the tangent.

Now, we need two more vectors in order to define the coordinate system. Well, the other vector we can use is the normal of the triangle. This vector is, by definition, perpendicular to the surface and will be perpendicular to the tangent.

The final vector we will use to define the coordinate system has to be perpendicular to both the normal and the tangent, so we can calculate it using a cross product. There is an ongoing debate whether this vector should be called the “bitangent” or the “binormal” vector. According to Eric Lengyel the term “binormal” makes no sense from a mathematical standpoint, so we will refer to it as the “bitangent”.

Now that we have three vectors that define the tangent space, we can create a transformation matrix that takes vector L and puts it in the same coordinate system as the normals for that specific triangle. Doing this for every triangle will allow applying bump mapping to the whole mesh.
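As a rough illustration of that transformation, the following Python sketch builds the bitangent from the normal and the tangent and then expresses a vector in the resulting basis. The helper names are hypothetical, and computing the bitangent as normal × tangent (a right-handed basis) is an assumption; the sketch also assumes the tangent and normal are unit length and perpendicular to each other.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def to_tangent_space(v, tangent, normal):
    """Express v in the basis (tangent, bitangent, normal).

    Equivalent to multiplying v by the matrix whose rows are the three
    basis vectors. Assumes an orthonormal tangent/normal pair.
    """
    bitangent = cross(normal, tangent)
    return (dot(v, tangent), dot(v, bitangent), dot(v, normal))
```

With this convention, a light vector that points straight along the surface normal always ends up as (0, 0, 1) in tangent space, matching the unperturbed normal stored in the bump map.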

Responsibility – who does what

Although we can compute the bitangent and the transform matrix in our vertex shader, we will have to supply the tangent vectors as input to our shader program. Tangent vectors need to be calculated using the CPU, but (thankfully) only once. Once we have them, we supply them as an additional vertex array.

Calculating the tangent vectors is trivial for a simple surface like our door, but can become very tricky for an arbitrary mesh. The book Mathematics for 3D Game Programming and Computer Graphics provides a very convenient algorithm to do so, and is widely cited in other books and the web.

For our door, composed of the vertices:

(-1.0, -1.0, 0.0)
(1.0, -1.0, 0.0)
(1.0, 1.0, 0.0)
(-1.0, 1.0, 0.0)

Tangents will be:

(1.0, 0.0, 0.0)
(1.0, 0.0, 0.0)
(1.0, 0.0, 0.0)
(1.0, 0.0, 0.0)
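For reference, here is a sketch of how a per-triangle tangent can be derived from positions and texture coordinates by solving the two edge equations Q1 = s1·T + t1·B and Q2 = s2·T + t2·B for T. The function name is hypothetical, and the texture coordinates used for the door in the usage note are an assumption, since the post does not list them.

```python
import math


def triangle_tangent(p0, p1, p2, uv0, uv1, uv2):
    """Unit tangent of a triangle, aligned with the horizontal texture axis."""
    # Edge vectors in object space.
    q1 = tuple(b - a for a, b in zip(p0, p1))
    q2 = tuple(b - a for a, b in zip(p0, p2))
    # Matching deltas in texture space.
    s1, t1 = uv1[0] - uv0[0], uv1[1] - uv0[1]
    s2, t2 = uv2[0] - uv0[0], uv2[1] - uv0[1]

    det = s1 * t2 - s2 * t1
    if det == 0:
        raise ValueError("degenerate texture mapping")

    # T = (t2 * Q1 - t1 * Q2) / det
    tangent = tuple((t2 * a - t1 * b) / det for a, b in zip(q1, q2))
    length = math.sqrt(sum(c * c for c in tangent))
    return tuple(c / length for c in tangent)
```

Assuming the door corners map to texture coordinates (0,0), (1,0), (1,1) and (0,1), both of its triangles yield the tangent (1.0, 0.0, 0.0) listed above.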

Once we have the bitangents and the transformation matrix, we rotate L in the vertex shader, and pass it down to the fragment shader as a varying attribute, interpolating it over the surface of the triangle.

Our fragment shader can just take L, retrieve (and decode) N from the bump map texture and apply the Lambert equation on both of them. The rest of the fragment shading algorithm need not be changed if we are not applying specular highlights.
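The per-fragment work can be sketched on the CPU as follows. This is Python rather than shader code, and the function names are made up for illustration; it assumes the normal was stored with the usual linear mapping from [-1, 1] to [0, 255] and that the light vector is already in tangent space.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def decode_normal(rgb):
    """Map 8-bit colour channels from [0, 255] back to [-1, 1]."""
    return normalize(tuple(c / 255.0 * 2.0 - 1.0 for c in rgb))


def lambert(rgb, light_dir):
    """Diffuse intensity for one fragment: max(0, N . L)."""
    n = decode_normal(rgb)
    l = normalize(light_dir)  # interpolation denormalizes L, so renormalize
    return max(0.0, sum(a * b for a, b in zip(n, l)))
```

A blueish texel such as (128, 128, 255) decodes to a normal very close to (0, 0, 1), so it is fully lit by a light pointing along +z and unlit by one pointing the opposite way.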

In Conclusion

Bump mapping is an awesome technique that greatly improves the lighting in our renderers at limited additional cost. Its implementation is not without challenges, however.

Here are the steps necessary for applying the technique:

  • After loading the geometry, compute the per-vertex tangent vectors in the CPU.
  • Pass down the per-vertex tangents as an additional vertex array to the shader, along with the normal and other attributes.
  • In the vertex shader, compute the bitangent vector.
  • Compute the transformation matrix.
  • Compute vector L and transform it into tangent space using this matrix.
  • Interpolate L over the triangle as part of the rasterization process.
  • In the fragment shader, normalize L.
  • Retrieve the normal from the bump map by sampling the texture. Decode the RGB values into a vector.
  • Apply the Lambert equation using N and the normalized L.
  • Finish shading the triangle as usual.

On the bright side, since this technique doesn’t require any additional shading stages, it can be implemented in both OpenGL and OpenGL ES 2.0 and run on most of today’s mobile devices.

In my next post I will show bump mapping applied to a 3D model. Stay tuned!

Bump Map Generation

Bump mapping is a texture-based technique that allows improving the lighting model of a 3D renderer. I’m a big fan of bump mapping; I think it’s a great way to really make the graphics of a renderer pop at no additional geometry processing cost.

Bump mapping example. Notice the improved illusion of depth generated by the technique. Image taken from

Much has been written about this technique, as it’s widely used in lots of popular games. The basic idea is to perturb normals used for lighting at the per-pixel level, in order to provide additional shading cues to the eye.

The beauty of this technique is that it doesn’t require any additional geometry for the model, just a new texture map containing the perturbed normals.

This post covers the topic of bump map generation, taking as input nothing but a diffuse texture. It is based on the techniques described in the books “More OpenGL” by Dave Astle and “Mathematics for 3D Game Programming and Computer Graphics” by Eric Lengyel.

Let’s get started! Here’s the Imp texture that I normally use in my examples. You might remember the Imp from my Shadow Mapping on iPad post.

Diffuse texture map of the Imp model.

The idea is to generate the bump map from this texture. In order to do this, what we are going to do is analyze the diffuse map as if it were a heightmap that describes a surface. Under this assumption, the bump map will be composed of the surface normals at each point (pixel).

So, the question is, how do we obtain a heightmap from the diffuse texture? We will cheat. We will convert the image to grayscale and hope for the best. At least this way we will be taking into account the contribution of each color channel for each pixel we process.

Let’s call H the heightmap and D the diffuse map. Converting an image to grayscale can be easily done programmatically using the following equation:

  \forall (i,j) \in [0, width(D)) \times [0, height(D)): \quad H_{i,j} = 0.30 \cdot red(D_{i,j}) + 0.59 \cdot green(D_{i,j}) + 0.11 \cdot blue(D_{i,j})

As we apply this formula to every pixel, we obtain a grayscale image (our heightmap), shown in the next figure:

A grayscale conversion of the Imp diffuse texture.
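The grayscale conversion can be sketched in a few lines of Python. This is not the script mentioned later in the post; the function name is hypothetical, and the default weights (0.30, 0.59, 0.11) are the commonly used luma coefficients, chosen here as an assumption because they sum to 1 and so keep the result in the same range as the input channels.

```python
def to_heightmap(diffuse, weights=(0.30, 0.59, 0.11)):
    """Convert rows of (r, g, b) pixels into rows of grayscale heights."""
    wr, wg, wb = weights
    return [[wr * r + wg * g + wb * b for (r, g, b) in row]
            for row in diffuse]
```

Each pixel is processed independently, so the same expression could be applied to a whole image at once with an array library if speed matters.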

Now that we have our heightmap, we will study how the grayscale values vary in the horizontal (s) and vertical (t) directions. This is a very rough approximation of the surface derivative at each point and will allow approximating the normal later.

If H_{i,j} is the grayscale value stored in the heightmap at the point (i,j), then we approximate the derivatives s and t like so:

  s_{i,j} = (1, 0, H_{i+1,j}-H_{i-1,j})  \\  t_{i,j} = (0, 1, H_{i, j+1}-H_{i,j-1})


s and t are two vectors tangent to the heightmap surface at point (i,j). What we can now do is take their cross product to find a vector perpendicular to both. This vector will be the normal of the surface at point (i,j) and is, therefore, the vector we were looking for. We will store it in the bump map texture.

  N = \frac{s \times t}{||s \times t||}


After applying this logic to the entire heightmap, we obtain our bump map.
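Here is a sketch of that loop in Python. For s = (1, 0, a) and t = (0, 1, b), the cross product s × t works out to (-a, -b, 1), which the code normalizes directly. The function name is hypothetical, and clamping samples at the image border is an assumption; the post does not say how edges are handled.

```python
import math


def heightmap_to_normals(H):
    """Compute a unit normal per pixel from a heightmap stored as H[y][x]."""
    height, width = len(H), len(H[0])

    def sample(x, y):
        # Assumption: clamp out-of-range coordinates to the nearest edge.
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return H[y][x]

    normals = []
    for y in range(height):
        row = []
        for x in range(width):
            ds = sample(x + 1, y) - sample(x - 1, y)  # z component of s
            dt = sample(x, y + 1) - sample(x, y - 1)  # z component of t
            # s x t = (-ds, -dt, 1); divide by its length to normalize.
            length = math.sqrt(ds * ds + dt * dt + 1.0)
            row.append((-ds / length, -dt / length, 1.0 / length))
        normals.append(row)
    return normals
```

A perfectly flat heightmap produces (0, 0, 1) everywhere, while a ramp that rises to the right tilts the normals towards -x.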

We must be careful when storing a normalized vector in a texture. Because vector components will be in the [-1,1] range, but values we can store in the bitmap need to be in the [0, 255] range, we will have to convert between both value ranges to store our data as color.
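A minimal sketch of such a conversion, assuming a plain linear mapping of each component from [-1, 1] to [0, 255] with rounding to the nearest integer (the function name is hypothetical):

```python
def encode_normal(n):
    """Map a unit normal with components in [-1, 1] to 8-bit RGB values."""
    def to_byte(c):
        value = int((c * 0.5 + 0.5) * 255.0 + 0.5)  # scale, then round
        return min(255, max(0, value))              # guard against overshoot
    return tuple(to_byte(c) for c in n)
```

Under this mapping the unperturbed normal (0, 0, 1) becomes (128, 128, 255).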

A linear conversion produces an image like the following:

Bump map generated from the Imp’s diffuse map, ready to be fed into the video card.

Notice the prominence of blue, which represents normals close to the (unperturbed) (0,0,1) vector. Vertical normals end up being stored as blueish colors after the linear conversion.

We are a bit more interested in the darker areas, however. This is where the normals are more perturbed and will make the Phong equation subtly affect shading, expressing “discontinuities” in the surface that the eye will interpret as “wrinkles”.

Other colors will end up looking like slopes and/or curves.

In all fairness, the image is a bit more grainy than I would’ve liked. We can apply a bilinear filter on it to make it smoother. We could also apply a scale to the s and t vectors to control how steep calculated normals will be.

However, since we are going to be interpolating rotated vectors during the rasterization process, these images will be good enough for now.

I’ve written a short Python script that implements this logic and applies it on any diffuse map. It is now part of the Vortex Engine toolset.

In my next post I’m going to discuss how to implement the vertex and fragment shaders necessary to apply bump mapping on a trivial surface. Stay tuned!

WebGL for OpenGL ES programmers

I’ve been meaning to look into WebGL for a while now. Coming from an OpenGL (and then an OpenGL ES 2.0) programming background, I figured it should be relatively “easy” to get up to speed with some basic primitive drawing.

Luckily, I was not disappointed: WebGL’s specification was heavily based on OpenGL ES’ and knowledge can be easily transferred between the two. In this post I outline the main differences and similarities between these two standards.

Screenshot of “WebGL Test” – a test program built to determine if WebGL is supported on a browser – Click on the image for the live version.

I was surprised to learn that WebGL, as an API, is even slimmer than OpenGL ES 2.0. OpenGL ES 2.0 had already done away with many features from ES 1.1, so WebGL, being even smaller, really feels minimal. This is not a bad thing at all, but may make the learning curve a little steeper for developers just getting started with the *GL APIs.

In order to try WebGL, I decided to create a simple test application that determines if your browser supports it. A screenshot of the application can be seen above. The live version can be accessed by clicking on it or clicking here.

Some of the main things that struck me from WebGL while building this application were:

  • Javascript is the only binding. This might sound obvious, but it’s worth mentioning. WebGL development is done in Javascript (unless you are Notch).
  • No in-system memory Vertex Arrays: usage of VBOs is mandatory. It is the only way to submit geometry to the GPU. I think this decision makes a lot of sense, considering that if data were kept in system RAM as a Javascript array, copying to the GPU every frame may be prohibitively expensive. One of the best practices in OpenGL is to cache data in the GPU’s RAM and WebGL makes it mandatory.
  • Javascript types: WebGL provides several Javascript objects/wrappers that help use the API. Some function calls have been changed from the ES 2.0 spec to accommodate Javascript conventions. The texImage2D function (glTexImage2D in ES 2.0), in particular, has a very different signature: it does not take a plain Javascript array of bytes as texture data, only typed arrays such as Uint8Array or DOM sources. Javascript Image objects help here.
  • Data must be loaded into WebGL using helper types like Float32Array, which tightly packs vertex data into consecutive memory. This is mandatory for populating VBOs.
  • You will have to deal with interleaved array data and feel comfortable counting bytes to compute strides and offsets. It’s the only way to keep the number of VBOs reasonable and is also one of the best practices for working with OpenGL and WebGL.
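The byte counting mentioned in the last point can be sketched outside the browser. The Python snippet below packs a hypothetical vertex layout (three position floats followed by four color floats, an assumption for illustration) into one tightly packed buffer, much like a Float32Array would hold it, and derives the stride and offsets that a call such as vertexAttribPointer expects.

```python
import struct

FLOAT_SIZE = 4       # bytes per 32-bit float
POSITION_FLOATS = 3  # x, y, z
COLOR_FLOATS = 4     # r, g, b, a

# Distance in bytes between the start of two consecutive vertices.
STRIDE = (POSITION_FLOATS + COLOR_FLOATS) * FLOAT_SIZE  # 28
# Where each attribute starts inside a vertex.
POSITION_OFFSET = 0
COLOR_OFFSET = POSITION_FLOATS * FLOAT_SIZE             # 12


def interleave(positions, colors):
    """Pack per-vertex positions and colors into one interleaved buffer."""
    buffer = bytearray()
    for position, color in zip(positions, colors):
        buffer += struct.pack("<3f", *position)
        buffer += struct.pack("<4f", *color)
    return bytes(buffer)
```

The color of vertex k then lives at byte `k * STRIDE + COLOR_OFFSET`, which is the arithmetic the GPU performs when it walks the buffer.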

On the other hand, just like in ES 2.0:

  • There is no fixed-function pipeline. The T&L pipeline has to be coded.
  • Shaders are mandatory. The current types are vertex and fragment shaders.
  • Old data upload functions, such as immediate mode and display lists, are not supported.
  • There is no matrix stack, nor matrix helper functions. Be prepared to roll your own and try to leverage shaders as much as possible to avoid expensive computations in Javascript.


All things considered, I had fun programming WebGL. While developing the application, I found that most issues I encountered were not caused by WebGL, but rather by “surprises” in the way the Javascript programming language works.

I find WebGL, with its fast iteration cycles (just change the code, save and refresh the browser window), a reasonable tool for prototyping 3D applications and quickly trying out ideas.

The joy of not requiring the user to install any plugins and being able to present 3D data to them right in the browser is the icing on the cake and makes it a very interesting tool for people working in the 3D field.

Stay tuned for more WebGL goodness coming soon!

MD2 Library 2.0

MD2 Library 2.0 has been out for a while now (download here), but I haven’t had the time to update this blog! It’s a free download for all iPad users, and, at the time of writing, all iOS versions are supported (from 3.2 up to 7).

MD2 Library 2.0, powered by Vortex 3D Engine 2.0.

The App has been revamped to use the latest version of my custom 3D Renderer: Vortex 3D Engine, bringing new features to the table, including:

  • Per-pixel lighting with specular highlights.
  • Realtime Shadows (on iOS ≥4).
  • Antialiasing (on iOS ≥4).
  • User experience enhancements.
  • General bug fixes.

I took advantage of this overdue update to vastly improve the internal architecture of the App. The latest features in the Vortex Engine enable a much better user experience from a simpler codebase that leverages a simplified resource management scheme.

Head to iTunes to install for free or, if you have version 1.1 installed, just open up the App Store to update the App.

Update to MD2 Library coming soon

I’ve been working on and off on MD2 Library during my free time. MD2 Library is a showcase iPad App for my 3D Engine, Vortex. The Vortex 3D Engine is a cross-platform render engine available for iOS, Mac and Linux, with support for Android and Windows coming soon.

A capture of the MD2 Library App running on the iPad Simulator in Landscape mode.

MD2 Library 2.0 is powered by Vortex 3D Engine 2.0, which brings a number of cool new features to the table, including:

  • Per-pixel lighting model with specular highlights.
  • Realtime shadows (via shadow mapping).
  • Antialiasing.

MD2 Library is and will continue to be a free download from the Apple App Store. If you’ve installed version 1.1, you should be getting the update soon. Stay tuned!

Writing a Mac OS X Screensaver

A screensaver can be seen as a zero-player game used mostly for entertainment or amusement when the computer is idle.

A Mac OS X screensaver is a system plugin. It is loaded dynamically by the Operating System after a given idle time has elapsed, or embedded into a configuration window within System Preferences.

What is a system plugin? It means we basically write a module that conforms to a given interface and receives callbacks from the OS to perform an operation: in this case, drawing a view.

A custom screensaver that uses OpenGL to render a colorful triangle.

Writing a Mac OS X screensaver is surprisingly easy. A special class from the ScreenSaver framework, called ScreenSaverView, provides the callbacks we need to override in order to render our scene. All work related to packing the executable code into a system component is handled by Xcode automatically.

We can render our view using either CoreGraphics or OpenGL. In this sample, I’m going to use OpenGL to draw the scene.

Initialization and Lifecycle Management

We start off by creating a View that extends ScreenSaverView:

#import <ScreenSaver/ScreenSaver.h>

@interface ScreensaverTestView : ScreenSaverView

@property (nonatomic, retain) NSOpenGLView* glView;

- (NSOpenGLView *)createGLView;

@end


Let’s move on to the implementation.

In the init method, we create our OpenGL Context (associated to its own view). We’ll also get the cleanup code out of the way.

- (id)initWithFrame:(NSRect)frame isPreview:(BOOL)isPreview
{
    self = [super initWithFrame:frame isPreview:isPreview];
    if (self)
    {
        // Create the OpenGL view and animate at 30 frames per second.
        self.glView = [self createGLView];
        [self addSubview:self.glView];
        [self setAnimationTimeInterval:1/30.0];
    }
    return self;
}

- (NSOpenGLView *)createGLView
{
    // Request hardware acceleration only; the list is zero-terminated.
    NSOpenGLPixelFormatAttribute attribs[] = {
        NSOpenGLPFAAccelerated,
        0
    };
    NSOpenGLPixelFormat* format = [[NSOpenGLPixelFormat alloc] initWithAttributes:attribs];
    NSOpenGLView* glview = [[NSOpenGLView alloc] initWithFrame:NSZeroRect pixelFormat:format];
    NSAssert(glview, @"Unable to create OpenGL view!");
    [format release];
    return [glview autorelease];
}

- (void)dealloc
{
    [self.glView removeFromSuperview];
    self.glView = nil;
    [super dealloc];
}

The above code is self-explanatory.

Notice how we tell the video driver what kind of OpenGL configuration it should allocate for us; in this case, we only request hardware acceleration. We won’t allocate a depth buffer because there is no need for it (yet).

Rendering Callbacks

Now, let’s move on to implementing the rendering callbacks for our screensaver. Most of the methods here will just forward the events to the super class, but we’ll customize the animateOneFrame method in order to do our rendering.

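// Requires #import <OpenGL/gl.h> at the top of the file.

- (void)startAnimation
{
    [super startAnimation];
}

- (void)stopAnimation
{
    [super stopAnimation];
}

- (void)drawRect:(NSRect)rect
{
    [super drawRect:rect];
}

- (void)animateOneFrame
{
    [self.glView.openGLContext makeCurrentContext];
    glClearColor(0.5f, 0.5f, 0.5f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT);

    static float vertices[] = {
        1.0f, -1.0f, 0.0f,
        0.0f, 1.0f, 0.0f,
        -1.0f, -1.0f, 0.0f
    };
    static float colors[] = {
        1.0f, 0.0f, 0.0f,
        1.0f, 0.0f, 1.0f,
        0.0f, 0.0f, 1.0f
    };

    // Client-side vertex arrays must be enabled before drawing.
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, vertices);
    glColorPointer(3, GL_FLOAT, 0, colors);

    glDrawArrays(GL_TRIANGLES, 0, 3);

    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_VERTEX_ARRAY);
    glFlush();

    [self setNeedsDisplay:YES];
}

- (void)setFrameSize:(NSSize)newSize
{
    [super setFrameSize:newSize];
    [self.glView setFrameSize:newSize];
}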

We place our rendering logic in the animateOneFrame method. Here, we define our geometry in terms of vertices and colors and submit it as vertex arrays to OpenGL.

Implementing the setFrameSize: method is very important. This method is called when our screensaver starts and we must use it to adjust our views’ dimensions so we can render on the whole screen.

Actionsheet Methods

Mac OS X screensavers may have an associated actionsheet. The actionsheet can be used to let the user customize the experience or configure necessary attributes.

- (BOOL)hasConfigureSheet
{
    return NO;
}

- (NSWindow*)configureSheet
{
    return nil;
}

Testing our Screensaver

Unfortunately, we can’t run our screensaver directly from Xcode. Because it’s a system plugin, we need to move its bundle to a specific system folder so Mac OS X can register it. In order to install the screensaver just for ourselves, we place the bundle in the $HOME/Library/Screen\ Savers directory.

Once copied, we need to open System Preferences (if it was already open, we need to close it first). Our screensaver will be available in the “Desktop & Screen Saver” group, under the “Other” category.


Screensaver writing for Mac OS X is surprisingly easy! With the full power of desktop OpenGL and C++ at our disposal, we can create compelling experiences that delight users and bystanders.

As usual, there are some caveats when developing OS X screensavers. You can read about them here.

Happy coding!

More on Objective-C Blocks

In 2011 I first blogged about Objective-C blocks, a game changing language construct that allows defining callable functions on-the-fly. In this post, we delve into some advanced properties of blocks in the Objective-C language.

1. Blocks capture their enclosing scope

Consider the following code snippet:

#import <Foundation/Foundation.h>

int main(int argc, char* argv[])
{
	@autoreleasepool
	{
		int capture_me = 10;

		// The block captures the value of capture_me when it is created.
		int (^squared)(void) = ^(void){
			return capture_me * capture_me;
		};

		printf("%d\n", squared());
	}

	return 0;
}

In the above example, we create a block that captures local variable “capture_me” and store it into a variable called “squared”. When we invoke the “squared” block, it will access the captured variable’s value, square it and return it to the caller.

This is a great feature that allows referencing local variables from deep within a complex operation’s stack. As Miguel de Icaza points out, however, we need to be careful with this feature to avoid producing hard to maintain code.

As you may have guessed, the code above correctly prints value “100”.

2. Blocks can modify captured variables

Now, consider this snippet. We will change our block not to return the squared variable, but rather to capture a reference to the local variable and store the squared value, overriding the original.

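#import <Foundation/Foundation.h>

int main(int argc, char* argv[])
{
	@autoreleasepool
	{
		__block int modify_me = 10;

		void (^squared)(void) = ^(void){
			modify_me *= modify_me;
		};

		// Invoke the block so the captured variable is actually squared.
		squared();

		printf("%d\n", modify_me);
	}

	return 0;
}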

The __block keyword signals that variable “modify_me” is captured as a reference by the Block, allowing it to be modified from within its body.

Just like before, this code still prints “100”. If we were to call the “squared” block a second time, we would square the variable again, yielding “10,000”.

3. Blocks are Objective-C Objects allocated on the stack

Unlike any other object instance in Objective-C, blocks are objects that are allocated on the stack. This means blocks need to be treated as a special case when we want to store them for later usage.

As a general rule of thumb: you should never retain a block. If it is to survive the stack frame where it was defined, you must copy it, so the runtime can place it on the heap.

If you forget and accidentally retain a block on the stack it might lead to runtime errors. The Xcode analyzer, thankfully, detects this problem.


If there were a feature I could have added to the Java programming language (when developing Android apps), it would be, without a doubt, support for blocks or, in general, lambda expressions.

Objective-C blocks are a powerful feature that must be handled with care. When used correctly, they have the power to let us improve our code to make it more streamlined. When used incorrectly, they can lead to unreadable code and/or hard-to-debug memory-management bugs.

If you are interested in learning more about blocks in the Objective-C programming language, this article is a great resource and here’s the official Apple documentation.

Happy coding!

C++11 Enum Classes

With the release of the C++11 standard, C++ finally obtained its own enum type declarations. Dubbed “enum classes”, these new enum types define a namespace for the discrete values they contain. This sets them apart from classic C-style enums, which define their values in the enclosing scope. Enum classes can also be forward declared, helping improve compilation times by reducing transitive header inclusion.

C-style enums

So, what was the problem with C-style enums? Consider this classic C enum defined at file scope:

enum ProjectionType { PERSPECTIVE, ORTHOGONAL };

Constants PERSPECTIVE and ORTHOGONAL are defined in the global namespace, meaning that every reference to these names is taken to be a value of this enum. Generic names like these invite trouble: two enums that declare the same constant in different headers will clash as soon as both headers are pulled into the same compilation unit.

A solution to this problem in a language that does not have namespaces, like C, is to prefix each constant with something that identifies the type, so as to prevent possible name clashes.

This means our constants would become PROJECTION_TYPE_PERSPECTIVE and PROJECTION_TYPE_ORTHOGONAL. Needless to say, all caps might not be ideal from a code readability standpoint, as they can easily make a modern C++ codebase look like an old C-style macro-plagued program.

The pre-2011 C++ approach

In C++, we do have namespaces, so we can wrap our enums in namespace declarations to help organize our constants:

namespace ProjectionType
{
    enum Enum { Perspective, Orthogonal };
}

Now, this is better. With this small change, our constants can be referenced as: ProjectionType::Perspective and ProjectionType::Orthogonal. The problem here is the fact that doing this every time for every enum can get a little tedious. Furthermore, our datatype is now called ProjectionType::Enum, which is not that pretty. Can we do better?

The C++11 solution

The ISO Committee decided to take this problem on by introducing the new concept of “enum classes”. Enum classes are just like C-style enums, with the advantage that they define a containing namespace (with the same name as the enum type) for the constants they declare.

enum class ProjectionType { Perspective, Orthogonal };

Notice we declare an enum class by adding the class keyword right after the enum keyword. This statement, which would cause a syntax error in the C++98 standard, is how we declare enum classes in C++11. It must be accepted by all conforming compilers.

Using this declaration, our constants can now be accessed as ProjectionType::Perspective and ProjectionType::Orthogonal, with the added advantage that our type is called ProjectionType.

C-style enums vs enum classes

Because C++ is a superset of C, we still have access to C-style enums in C++11-conforming compilers. You should, however, favor enum classes over C-style enums for all source files that are C++ code.

The Mandelbrot Project

I’ve published the source code of the program I wrote for my tech talk at the 2011 PyDay conference. It’s a Python script and a companion C library that calculates and draws the Mandelbrot set.

The objective of the tech talk was to show how to speed up Python programs using the power of native code.

A render of the Mandelbrot set as performed by the script. Computations were performed in C.

What’s interesting about this program is that, although the core was written completely in Python, I wrote two compute backends for it: one in Python and one in C. The C code is interfaced with using the ctypes module.
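For readers unfamiliar with the computation, here is a sketch of the kind of escape-time kernel a pure Python backend might run. It is not taken from the published project: the function names, the default coordinate ranges and the escape radius of 2 are assumptions for illustration.

```python
def escape_iterations(c, max_iter):
    """Count iterations of z = z*z + c before |z| exceeds 2.

    Returns max_iter if the point never escapes within the budget,
    in which case it is treated as belonging to the set.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def mandelbrot_grid(width, height, max_iter,
                    x_range=(-2.0, 1.0), y_range=(-1.0, 1.0)):
    """Evaluate the kernel at the centre of every pixel of a width x height grid."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    grid = []
    for row in range(height):
        y = y_min + (y_max - y_min) * (row + 0.5) / height
        grid.append([
            escape_iterations(
                complex(x_min + (x_max - x_min) * (col + 0.5) / width, y),
                max_iter)
            for col in range(width)
        ])
    return grid
```

The inner loop is pure arithmetic executed once per pixel and per iteration, which is the kind of hot spot that benefits most from being moved to native code.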

The results of running the program are shown in the screenshot above. If you are interested in trying it, the full source code is hosted on GitHub, here: I’ve licensed it under the GPLv3, so you can download it, run it, test it and modify it.

As one would anticipate, the C implementation runs much faster than the Python one, even when taking into account the marshaling of objects from Python to C and back. Here’s the chart I prepared for the conference showing the specific numbers from my tests.

These tests compare the run times at different numbers of iterations. Note that the chart uses a logarithmic scale.

Comparison of the Python + C implementation vs a pure Python one. Scale is Logarithmic.

As you can see, Python programs can be significantly sped up using ctypes, especially when we are dealing with compute-intensive operations.

It might be possible to optimize the Python implementation to some extent, and now that the source code is available under the GPL, you are encouraged to! I would always expect well-written C code to outperform the Python implementation, but I would like to learn about your results if you happen to give it a go.

Happy hacking!