C++11 Iterators and Lambdas (Revisited)

Last week I published an article where I tested C++11 support in Apple’s version of the clang compiler (Apple clang 4.0).

As you may remember, after conducting some tests, I was left with the impression that clang’s implementation was far behind what I was expecting, with some notable features missing, like initializer lists and some iterator constructs. This week I decided to revisit what had been done by studying the clang documentation.

After some reading, it turns out that C++11 support is beyond what I originally thought. The problem is that (at this point) C++11 support is still an “opt-in” compiler option.

Language support is turned off by default and so is the standard library. That’s right, clang is shipping with not one but two C++ standard libraries: the C++ 2003 legacy library (libstdc++) and the new C++11 library (libc++). You need to tell clang that you want it to link against the new library, something I was not doing before!

With this in mind, let’s develop a new C++11 program:

#include <vector>
using std::vector;

#include <algorithm>
using std::for_each;

#include <iterator>
using std::begin;
using std::end;

#include <iostream>
using std::cout;
using std::endl;

int main(int argc, char* argv[])
{
	// Initializer List goodness : )
	vector<float> v = { 0.0f, 1.0f, 2.0f, 3.0f, 4.0f };

	// for_each loop Goodness using new iterators and a lambda:
	for_each(begin(v), end(v), [](float f) { cout << f << " "; });
	cout << endl;

	// New "auto" keyword semantics:
	auto f = v[v.size()-1];
	cout << "An element: " << f << endl;

	return 0;
}

Now, many things have changed since our last program. We can now access lots of goodies from the new standard, including (as noted in the code): initializer lists, std::for_each, std::begin, std::end, lambda expressions and the new “auto” keyword semantics. They are all supported and work as expected.

To build this program, assuming you’re using Apple’s clang 4.0 or the Open Source clang version >= 3.2, use the following command:

$ clang++ -std=c++11 -stdlib=libc++ main.cpp

Notice how I explicitly invoke clang++ as opposed to clang and how I must specify the standard library version to use. libc++ is the LLVM project’s implementation of the new C++11 standard library.

If you have been following with your text editor and compiler, the output generated by running this program should be something like:

$ ./a.out
0 1 2 3 4 
An element: 4

This now feels much better, much more in tune with what one would expect from a project as important as LLVM.

In my next article I will continue exploring C++11’s implementation in clang as I test more advanced features of the language such as shared pointers and threading. Stay tuned!

Playing with C++11 (using Clang)

I have been following the new C++ standard for quite some time (even before it was named C++11), so I’m very glad to see support for it reaching mainstream compilers such as clang, g++ and Visual C++.

Although this article provides a good comparison of the state of the C++11 implementation in these three compilers, I wanted to give Clang on OS X a spin to see for myself how much of the new standard is actually supported in “real life” today.

I tried writing a short program that tested different features. I wasn’t expecting everything to work, but I was surprised to learn that some features I was taking for granted are not yet available in clang 3.2 or Apple clang 4.0.

Here’s my test program. It stores the numbers from 0 to 99 in a vector and then prints them to the console.


#include <algorithm>
#include <iostream>
#include <vector>

int main(int argc, char* argv[])
{
	// Create and populate a container:
	
	const int n = 100;
	std::vector<float> v(n);
	for (auto i = 0; i < n; i++)
	{
		v[i] = i;
	}
	
	// Iterate over v, applying a lambda expression on
	// each element.
	
	std::for_each(v.begin(), v.end(), [](float f) {
				std::cout << f << " ";
			});

	std::cout << std::endl;

	return 0;
}

Some new C++11 features used in this program are:

  1. std::for_each for iterating over a vector (the algorithm itself predates C++11; what is new is being able to pass it a lambda).
  2. New "auto" keyword semantics.
  3. Lambda Expressions (used here for printing elements).

Some features I wanted to try but are not yet supported in Clang 3.2, nor in Apple's Clang 4.0 are:

  1. Initializer Lists. Some people claim they are still not supported by Apple, but I couldn't get them to work on an open source build of Clang either. Granted, I used 3.2. I should try building a newer clang, perhaps even from trunk.
  2. std::begin() and std::end(), which should come with the updated Standard Library. It seems the STL implementation is not complete yet.

If I have time to try different compiler configurations, I might post the results here. All in all I'm happy with support reaching the mainstream and hope that I can start writing C++11 code in my next assignment.

Dynamic Method Injection (in Objective-C)

When asked what I like the most from the Objective-C programming language, I often refer to its dynamic underpinnings.

Although Objective-C is a superset of C, its designers chose to implement the object model dynamically. This provides a strong contrast with C++, which, although a superset of C as well, is a statically typed language.

In this post I will show how you can use the Objective-C runtime to dynamically add a method to an (Objective-C) class at runtime.

Let’s start by creating a Dummy class that has no methods whatsoever. I will put everything in the same file for the sake of simplicity, but in real-life code, you’d want to have separate files for the interface and the implementation.

#import <Foundation/Foundation.h>

@interface Dummy : NSObject
@end

@implementation Dummy
@end

Now, let’s implement a method to be dynamically added to Dummy. Interestingly enough, methods added this way need to be implemented as C functions.

Here’s our implementation:

void newMethod(id self, SEL _cmd)
{
  printf("Hello from a dynamically added method! (self=%p)\n", self);
}

Now, let’s go to the main function and see how we can inject “newMethod” into the Dummy class. We will need to import the Objective-C runtime.

#import <objc/runtime.h>

int main(int argc, char* argv[])
{
  NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
  
  // Add method to Dummy class (args explained below)
  class_addMethod([Dummy class], @selector(printHello), (IMP)newMethod, "v@:");
  
  Dummy* instance = [[Dummy alloc] init];
  [instance printHello];
  [instance release];
  
  [pool release];
  return 0;
}

When running this program, an output similar to this will be produced:

Hello from a dynamically added method! (self=0x10ad143a0)

So, let’s see how the method was added.

The heavy lifting here is done by the class_addMethod function. This function, coming from the Objective-C runtime, allows you to register a new method in a class.

This feat requires a pointer to the function that implements the method (“newMethod”), but lets you assign any name you want to it (I chose “printHello”). Notice that’s the message I send to the Dummy instance.

The strangest parameter is perhaps the last one, “v@:”. This is a code describing the return type and the argument types of the “newMethod” function. All valid type codes can be found in the Apple reference, but to make things easier, “v@:” means that the function returns void (v) and receives an object (@) and a selector (:).

Light Scattering

When we introduced Programmable Pipeline support to Vortex Engine, we claimed that better visual effects could now be introduced into the renderer. With the introduction of Render-to-Texture functionality into Vortex 2.0, coupled with the power of Shaders, we can now make good on our promise.

Light Scattering (also known as “God Rays”) is a prime example of what can be achieved with Shaders and Render-to-Texture capabilities.

In the following image, a room consisting of an art gallery with tall pillars is depicted. We want to convey the effect of sunlight coming from outside, illuminating the inner nave. Enter Light Scattering:

A Light Scattering algorithm is used for improving the visual experience. The scene is rendered using Vortex 2.0. Room courtesy of RealityFrontier.

 

It can be seen in the picture above how the effect, although subtle, brings more life to the rendered scene. There is still much room to improve the visual results, though, particularly in light of the results of Kenny Mitchell’s article in GPU Gems 3.

There is also room for optimization. Currently, the scene is rendered in realtime at an average of 187 frames per second, producing 1024×1024 images on an NVIDIA GeForce GTX 465. Moving the algorithm to mobile devices, although easy from a coding perspective, might require extra work to achieve a high frame rate on the embedded GPU.

Here are three more captures from different angles.


 

Implementing sqrt

I was reading some post about interview questions of 2011 and came across one that stated “find the square root of a number”.

Assuming we can’t use the sqrtf function of the standard C math library, let’s see how we can calculate the square root of a number n.

Given n, we know that its square root is a number x that satisfies:

\sqrt{n} = x

Let’s work on this equation a little. Raise both sides to the second power:

n = x^2

Move n to the other side of the equality:

0 = x^2 - n

If we found the roots of this last equation somehow, we would have found the square root of n. We can do this by using the Newton-Raphson iteration.

The Newton-Raphson iteration states that we can find the root of an equation by applying the following formula iteratively (the subscript counts iterations; it is unrelated to the number n whose root we want):

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}

Where f’(x) is the derivative of function f(x). We will approximate the derivative with a finite difference, based on the definition of the derivative at a point, using a small step h (we could also note that the derivative could be trivially calculated; this method is more general).

f'(x) \approx \frac{f(x+h) - f(x)}{h}
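For our particular function, f(x) = x^2 - n, it is easy to work out how good this approximation is. Expanding the finite difference:

```latex
\frac{f(x+h) - f(x)}{h}
  = \frac{\left((x+h)^2 - n\right) - \left(x^2 - n\right)}{h}
  = \frac{2xh + h^2}{h}
  = 2x + h
```

The exact derivative is 2x, so the approximation is off by exactly h. This changes the size of each step slightly, but the iteration still only stops moving where f(x) = 0, that is, at the square root we are after.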

The error of the Newton-Raphson iteration is given by:

|x_{n+1} - x_n|

Starting with a hardcoded seed value, we will perform this iteration in a loop until the error is less than a given value. I have chosen to iterate until the error is less than 1×10^(-11): 0.00000000001.

Let us see what a tentative “pythonesque” pseudocode for this loop could be:

def sqrt(n):
    f = function(x*x - n)
    h = 0.01 # step for the derivative
    err = 1e-11 # tolerance
    x = 1 # seed
    xant = 0
    do:
        f1 = (f.eval(x+h) - f.eval(x)) / h
        xant = x
        x = x - f.eval(x)/f1
    while abs(x - xant) > err
    return x

Assuming we have a symbolic function type, that loop does not seem too difficult. In order to code this in C, since the equation is always the same, I will hardcode it as a plain function.

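#include <math.h> // fabs

typedef double real; // change to float for single precision

// The function whose root we are looking for: f(x) = x^2 - n.
real f(real x, real n)
{
    return x*x - n;
}

// Named custom_sqrt so that it does not clash with libm's sqrt.
real custom_sqrt(real n)
{
    real err = 0.00000000001f; // stop when two successive estimates are this close
    real h = 0.01f;            // step for the finite-difference derivative

    real x = 1.0f; // seed
    real xant = 0.0f;

    do
    {
        xant = x;
        real df = (f(x+h, n) - f(x, n))/h;
        x = x - f(x, n)/df;
    }
    while (fabs(x - xant) > err); // fabs, not abs: abs() works on ints

    return x;
}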

Here are the results of running our custom square root function, compared to the standard version provided with the C programming language:

[ale@syaoran sqrt]$ ./sqrt 1.0
Custom sqrt of: 1 = 1
libm sqrt of: 1 = 1

[ale@syaoran sqrt]$ ./sqrt 2.0
Custom sqrt of: 2 = 1.41421
libm sqrt of: 2 = 1.41421

[ale@syaoran sqrt]$ ./sqrt 4.0
Custom sqrt of: 4 = 2
libm sqrt of: 4 = 2

[ale@syaoran sqrt]$ ./sqrt 16.0
Custom sqrt of: 16 = 4
libm sqrt of: 16 = 4

[ale@syaoran sqrt]$ ./sqrt 32.0
Custom sqrt of: 32 = 5.65685
libm sqrt of: 32 = 5.65685

[ale@syaoran sqrt]$ ./sqrt 100.0
Custom sqrt of: 100 = 10
libm sqrt of: 100 = 10

[ale@syaoran sqrt]$ ./sqrt 1000000.0
Custom sqrt of: 1e+06 = 1000
libm sqrt of: 1e+06 = 1000

A short detour along the way…

I wanted to improve the Stencil Shadow Volumes code a little bit and enable it for the Programmable Pipeline in Vortex. However, I had to take a small detour to fix an issue related to mobile device support.

It turns out that OpenGL ES, the 3D library that Vortex uses to render its graphics on mobile devices, does not support rendering indexed geometry for indices larger than 16 bits. Keep this in mind when developing for mobile devices such as the iPhone or iPad.

I can completely understand the reasoning behind this decision. 32-bit indices could be considered too much data to submit to the GPU in a mobile device. Furthermore, they are not strictly necessary, as they could be replaced (if needed) by splitting the geometry into two groups defined by 16-bit indices.

The solution I devised, which is now part of Vortex 2.0, is to allow the user to specify the data size for the indices when defining the geometry. This provides the flexibility to use 32, 16 or 8 bit indices. You can even have several geometric objects with different index sizes in a scene.

The advantage of leveraging this mechanism is that now it is very easy to fine-tune the number of bytes used for index representation to improve performance. For example, using 16-bit indices instead of 32-bit indices would make no difference for representing models composed of up to 65536 vertices, while requiring a copy of just half the number of bytes to the GPU.

In the extreme case of 8-bit indices we would be constrained to only 256 vertices, but we would be sending only one fourth of the data to the GPU.

This was mostly plumbing work, so no new picture this time. Stay tuned for more updates!

Stencil Shadow Volumes

I’ve built Stencil Shadow Volumes into the Vortex Engine.

Shadows are a very interesting feature to implement in a renderer, as they provide important visual cues that help depict the relationship between objects in a scene. Notice in the following image how the shadow tells our brains that the Knight is standing on the floor (as opposed to hovering over it).

A Knight, lit by a green light, casts a shadow on a tiled floor.

The implementation is at this point no more than a prototype and requires significantly more testing. However, since the visual results are very appealing, I wanted to share some of the images.

I personally have a bias towards using Shadow Volumes instead of Shadow Maps; I think the former algorithm leaves out much of the guesswork that Shadow Maps require. Furthermore, Shadow Volumes are not subject to some of the limitations of Shadow Maps, such as the map’s resolution.

Another point for Shadow Volumes is the fact that they provide a natural way to implement translucent shadows: shadows that are not completely black but rather transparent. The following image corresponds to the same scene but with two minor changes: the light has been changed to white and the floor texture is different. Notice how we can see the floor texture through the Knight’s translucent shadow.

The same knight, lit by a white light this time, casts a translucent shadow on an industrial floor. Notice the shadow "translucency".

I hope we can have Shadow Volumes available for both rendering pipelines as part of Vortex 2.0.

Edit: Thanks to Gabriel G. for noting the term “soft shadows” was being used incorrectly.

Objective-C Blocks

Lately, I have been playing with Objective-C blocks.

These are easy-to-use constructs that enable defining functions “on the fly” when programming in Objective-C and are very similar to Python’s lambdas.

The syntax might look a little weird at first, but it is not that different from working with function pointers in C.

Here’s a simple example of a function f that receives a block b as a parameter and calls it. The only restriction imposed on b by f is that it must be a block that takes an int argument and returns an int.

#include <stdio.h>

// f is a function that receives a block:
void f(int (^b)(int))
{
	b(0); //call the block
}

int main(int argc, char* argv[])
{
	// Define and pass a block to f():
	f(^(int p){ printf("%d\n", p); return 0; });

	return 0;
}

The output of this program is just the int supplied by f to the block b.

Let’s try a more interesting example. Here, as inspired by this post, we use blocks to define a general-purpose for_each function that receives a list of objects and a block and then applies the block on each element of the list.

In order to define a general-purpose for_each function, we take advantage of Objective-C’s dynamic typing and name our types id.

#import <Foundation/Foundation.h>

void for_each(NSArray* array, void (^b)(id))
{
	for (id obj in array)
	{
		b(obj);
	}
}

Now, let’s develop a short program to test the for_each. This program creates a list of strings and uses the for_each function to output all of its contents.

int main(int argc, char* argv[])
{
	// Default autorelease pool:
	NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
	
	// Create an array. The contents could be anything,
	// as long as the block can handle them:
	NSArray* array = 
	[NSArray arrayWithObjects:@"a", @"b", @"c", @"d", @"e", nil];

	// Print every string instance:
	for_each(array, 
		^(id str){ printf("%s\n", [str UTF8String]); });

	[pool release];
	
	return 0;
}

Suppose now that we wanted to print out strings in uppercase. No problem, we can keep all the same structure, we just need to change the block:

	for_each(array, 
		^(id str){ 
			printf("%s\n", [[str uppercaseString] UTF8String]); 
		});

So, what happens if we place an object of a type different from an NSString instance in the array? Well, the for_each is very general and doesn’t know anything about the array’s elements or the block’s internals. Thus, if there is a problem, it will be triggered inside the block code.

Imagine we attempt to send uppercaseString to an object that does not recognize that message. If this happens, an “unrecognized selector” exception will be raised at runtime and, since nothing catches it, abort() will be called, terminating our program.

As we move into dynamic coding, the code becomes more flexible, but we must be more careful not to trigger runtime errors in our programs. It’s important that we develop our blocks to be consistent with our data structures.

Exploring the Android NDK

I have been testing the Android development tools. From what I have learned, the tools are separated into two main products: the Android Software Development Kit (SDK) and the Android Native Development Kit (NDK).

The SDK was the first development toolkit for Android, and it only allowed applications to be written using the Java programming language. The NDK was released some time later as a toolchain to enable developers to write parts of their applications using native programming languages (C and C++).

One of the first programs I developed to get a feel for the Android SDK was an OpenGL ES App that painted the background using a color gradient. I wrote it a couple of months ago, but today was the first time I ran it on a real Android device.

The resulting image can be seen in the following picture:

Android OpenGL ES Test

Other than trying the SDK, what I really wanted to do was to find out how hard it would be to rewrite part of the App in C and then integrate both parts. It turns out adding components built using the NDK is not very hard (for pure C code), so I decided to try moving all the rendering code to a plain C function.

I started a new project and coded all the rendering logic in a C function that I called “render”. Then, the NDK was used to compile the C code into a JNI-compliant shared library and, finally, I wrote a simple Java wrapper that calls into the shared library’s render function to do all the drawing.

The wrapper is responsible for creating the Android “Activity”, setting up an OpenGL ES context, and calling the native C function. The native C function clears the background and does all the drawing.

Getting all the JNI requirements in place in order to have the Java runtime recognize the native library and call a native method was not too hard, but it was not trivial either. It is definitely much more complicated than calling native libraries from Objective-C or even Python. After a few tries, the bond was made:

Android OpenGL ES rendering from C code.

Clearly this is a very simple example where the C code could be tailored to fit JNI’s requirements from the start. I expect porting an existing C++ codebase to be much more difficult. However, I am looking forward to delving further into Android’s development tools.