Designing the Editor Architecture

Last week I used the (very little) free time that I had to work on the internal architecture of the Editor and how it’s going to interact with the Vortex Engine.

In general terms, the plan is to have all UI interactions be well-defined and go to a Front Controller object that’s going to be responsible for driving the engine. This Front Controller, by definition, will be a one-stop shop for the entire implementation behind the UI and it will also, at a later stage, provide higher-granularity control of the engine.

Vortex Framebuffer Object Support: a knight is rendered on a texture that is then mapped on a cube. All rendering is done on the GPU, avoiding expensive copies to RAM.

Other components I’ve been designing include an undo/redo stack (which is super important for an editor application) and a scripting API. It’s still early for both these components, but I think it’s better if the design supports these from early on as opposed to trying to tack them on to the Editor at a later stage.
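To make the undo/redo idea a bit more concrete, here is a minimal sketch of one common way to build such a stack, using the Command pattern. This is not the Editor's actual design: the names Command, MoveCommand and UndoStack are hypothetical and only serve as illustration.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical interface: every editor action knows how to apply
// itself and how to revert itself.
class Command
{
    public:
        virtual ~Command() {}
        virtual void execute() = 0;
        virtual void undo() = 0;
};

// Hypothetical example command: offsets a value by a delta.
class MoveCommand : public Command
{
    public:
        MoveCommand(float& target, float delta)
            : _target(target), _delta(delta) {}
        void execute() override { _target += _delta; }
        void undo() override { _target -= _delta; }

    private:
        float& _target;
        float _delta;
};

class UndoStack
{
    public:
        // Runs a command and records it. Running a new command
        // discards any pending redo history.
        void run(std::unique_ptr<Command> cmd)
        {
            cmd->execute();
            _done.push_back(std::move(cmd));
            _undone.clear();
        }

        // Reverts the most recent command. Returns false if there is none.
        bool undo()
        {
            if (_done.empty()) return false;
            _done.back()->undo();
            _undone.push_back(std::move(_done.back()));
            _done.pop_back();
            return true;
        }

        // Re-applies the most recently undone command, if any.
        bool redo()
        {
            if (_undone.empty()) return false;
            _undone.back()->execute();
            _done.push_back(std::move(_undone.back()));
            _undone.pop_back();
            return true;
        }

    private:
        std::vector<std::unique_ptr<Command>> _done;
        std::vector<std::unique_ptr<Command>> _undone;
};
```

If every UI interaction reaches the engine as a command object like this, a single controller can own the stack, and a scripting API could create the same commands the UI does.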

Finally, last week I took the time to bootstrap a higher OpenGL version on Windows. The Editor now has access to full OpenGL on this platform. This is a significant milestone that opens the door to bringing Vortex’s advanced rendering techniques to Windows, such as Framebuffer Objects (FBOs), as depicted in the image above.

I’ve only got a short update for this week. Stay tuned for more to come : )

Vortex Editor .plan

Not too long ago, I started working on an Editor for the Vortex Engine. I have been toying with the idea for years and I finally decided to get started. Not only because it is going to be an interesting challenge, but also because I feel it’s a good way to improve the development workflow when using the engine.

A very early screenshot depicting a scene with a Box entity. No lighting, no mipmapping, no AA. “Crate” texture image courtesy of lighthouse3d.com

Using the Engine Today

Let’s take a look at the way I can build an App today with the Vortex Engine. First, I would create a new Application (be it a Linux, Mac or iOS App). Then, I would link against the engine, and then finally, I would create a scene through the Vortex API manually.

Now, while this approach certainly works and even counts as one of Vortex’s strengths by allowing you to integrate the engine into any App without taking over the application loop, it does become cumbersome to create the scene programmatically.

The reason is that this process usually amounts to repeating a series of steps for every scene in the App:

  1. Start by taking a first stab in the dark.
  2. Build and run the App.
  3. Realize that you want to change the scene layout.
  4. Go back to the code, change it.
  5. Rebuild and re-run the App.
  6. Repeat from step 3 until you’re satisfied with the results.

The idea of the Editor is to tackle this problem head-on. With the Editor, you will be able to “see” the scene you are building, tweak it visually and then save it as a package that can be loaded by the engine.

Bringing Vortex to Windows

Starting a new project for the Editor raises the question of which platforms this App should run on. The Editor will be a desktop App, so ideally, it would work on all three major desktop platforms: Windows, Linux and Mac.

Now, there is no point in making a new renderer for the Editor, as we want the scene we see in it to be as close as possible to what the final user Apps will render. What this means is that the Editor needs to run the engine.

Portability has always been one of the key tenets of the Vortex Engine, so this is the perfect opportunity to bring the engine to Windows, a platform it’s never run on before.

Bringing to Windows a codebase that was born on Linux and then expanded to support Mac and iOS is the ultimate test for source-code level portability. Once finished, the end result will be a flexible codebase that is also more adherent to the standard.

So far, the two main challenges in building the engine on Windows have been: adapting the codebase for building under the MSVC compiler and Windows’ barebones support for OpenGL.

Building on MSVC

Although Vortex is standard C++ and it builds with both GCC and Clang, building it with MSVC required a few changes here and there to conform better to its front end.

This also meant reconsidering some dependencies of the engine to allow for a non-POSIX environment. Thankfully, the move to C++11 has already helped replace some UNIX-specific functions with now-standard equivalents.
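As an illustration of the kind of replacement C++11 makes possible, here is a small sketch of timing and sleeping written against the standard library only. Which functions the engine actually replaced is not stated here, so treat nowMilliseconds and sleepMilliseconds as hypothetical helpers standing in for typical POSIX calls such as gettimeofday() and usleep().

```cpp
#include <chrono>
#include <thread>

// Milliseconds elapsed since an arbitrary fixed point in time.
// steady_clock never goes backwards, which makes it suitable for
// measuring frame times.
long long nowMilliseconds()
{
    using namespace std::chrono;
    return duration_cast<milliseconds>(
        steady_clock::now().time_since_epoch()).count();
}

// Blocks the calling thread for at least the requested time.
void sleepMilliseconds(int ms)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}
```

The same source builds with GCC, Clang and MSVC, with no platform-specific #ifdef blocks.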

OpenGL on Windows

Regarding OpenGL support, the windowing toolkit I’ve chosen to implement the UI in has proven to be more of a problem than a solution. At the time of writing, and mostly because I’m trying to hit a high velocity building the Editor, I haven’t taken the time to bootstrap anything beyond OpenGL 1.1 support.

This would be a problem; however, Vortex’s Dual Pipeline support, as I first described in this post from 2011, has proven essential by allowing the engine to scale down to OpenGL 1.1.

Dual Pipeline support: a comparison of the rendering pipelines available in Vortex Engine. The image on the left represents the Fixed Pipeline. The image on the right represents the Programmable Pipeline.

The plan is to move forward with the basic Editor functionality and then drop in the programmable pipeline renderer later in the game, retiring the fixed one.

It’s quite amazing to see the fixed pipeline renderer, written about 6 years ago, running unmodified on a completely new platform that it has never been tested on before. This is the true virtue of OpenGL.

In Closing

So far work is progressing nicely. As the image above shows, I have a simple proof-of-concept of the engine running inside the Editor skeleton under Windows. This is the foundation on which I will continue building the Editor App.

Stay tuned for more!

Conway’s Game of Life

This week we take a short break from 3D programming topics and go into gaming! Well, sort of…

A few weeks ago I published on my GitHub page a CUDA implementation of Conway’s Game of Life. The code is pretty simple, well in tune with the simplicity of the game.

The implementation can be found here: https://github.com/alesegovia/game-of-life.

If you are not familiar with the game, Conway’s Game of Life is a 0-player game where cells live and die on an infinite 2D grid. The life/death rules are the following, according to Wikipedia:

Every cell interacts with its eight neighbours, which are the cells that are horizontally, vertically, or diagonally adjacent. At each step in time, the following transitions occur:

  1. Any live cell with fewer than two live neighbours dies, as if caused by under-population.
  2. Any live cell with two or three live neighbours lives on to the next generation.
  3. Any live cell with more than three live neighbours dies, as if by overcrowding.
  4. Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.

Conway’s game is excellent for implementing on a GPU, as it involves analyzing the cells in the 2D grid and, best of all, each cell’s next state depends only on the previous state of its neighbors and never on their current state.

This means that we can spawn a GPU thread for every single cell in the board and calculate the next state in parallel.

In the published implementation, the board size is 64×64 cells, so we are effectively spawning 4,096 GPU threads to solve every iteration. We do this for one million generations.
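To show why the problem parallelizes so well, here is a plain C++ sketch of a single generation. It is not the published CUDA code: Board, nextState and step are hypothetical names, and treating cells outside the board as dead is an assumption of this sketch. The body of nextState is the work a single GPU thread would perform for its cell.

```cpp
#include <cstddef>
#include <vector>

// Board stored row-major; 1 = live cell, 0 = dead cell.
struct Board
{
    std::size_t width;
    std::size_t height;
    std::vector<unsigned char> cells;

    Board(std::size_t w, std::size_t h)
        : width(w), height(h), cells(w * h, 0) {}

    // Cells outside the board count as dead (assumption of this sketch).
    int at(long x, long y) const
    {
        if (x < 0 || y < 0 ||
            x >= static_cast<long>(width) ||
            y >= static_cast<long>(height))
            return 0;
        return cells[static_cast<std::size_t>(y) * width +
                     static_cast<std::size_t>(x)];
    }
};

// Next state of one cell. It only reads the previous board, so every
// cell can be computed independently of all the others.
unsigned char nextState(const Board& prev, long x, long y)
{
    int neighbours = 0;
    for (long dy = -1; dy <= 1; ++dy)
        for (long dx = -1; dx <= 1; ++dx)
            if (dx != 0 || dy != 0)
                neighbours += prev.at(x + dx, y + dy);

    if (prev.at(x, y))
        return (neighbours == 2 || neighbours == 3) ? 1 : 0;
    return (neighbours == 3) ? 1 : 0;
}

// Computes a full generation into a second buffer. On a GPU, the two
// loops disappear and each (x, y) pair becomes one thread.
Board step(const Board& prev)
{
    Board next(prev.width, prev.height);
    for (std::size_t y = 0; y < prev.height; ++y)
        for (std::size_t x = 0; x < prev.width; ++x)
            next.cells[y * prev.width + x] =
                nextState(prev, static_cast<long>(x), static_cast<long>(y));
    return next;
}
```

Writing into a separate buffer is what guarantees that no cell ever sees a neighbour's current state, only its previous one.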

The project has been released under a GPLv3 license, so feel free to download, build it, run it, modify it and share it with others under its terms.

If you are looking for a fun weekend project, the game could definitely use a UI. I’ll give you extra points if you can draw it using OpenGL without ever having to copy the board back from GPU memory into system memory ;-)

Enjoy!

C++11 Enum Classes

With the release of the C++11 standard, C++ finally obtained its own enum type declarations. Dubbed “enum classes”, these new enum types define a namespace for the discrete values they contain. This sets them apart from classic C-style enums, which define their values in the enclosing scope. Enum classes can also be forward declared, helping improve compilation times by reducing transitive header inclusion.

C-style enums

So, what was the problem with C-style enums? Consider this classic C enum defined at file scope:

enum ProjectionType
{
    PERSPECTIVE,
    ORTHOGONAL
};

Constants PERSPECTIVE and ORTHOGONAL are defined in the global namespace, meaning that all references to these names will be considered a value belonging to this enum. Using general names will surely lead to chaos, as two enums defined in different headers can easily cause type ambiguities when pulling both headers together in a compilation unit.

A solution to this problem in a language that does not have namespaces, like C, is to prefix each constant with something that identifies the type, as to prevent possible name clashes.

This means our constants would become PROJECTION_TYPE_PERSPECTIVE and PROJECTION_TYPE_ORTHOGONAL. Needless to say, all caps might not be ideal from a code readability standpoint, as they can easily make a modern C++ codebase look like an old C-style macro-plagued program.

The pre-2011 C++ approach

In C++, we do have namespaces, so we can wrap our enums in namespace declarations to help organize our constants:

namespace ProjectionType
{
    enum Enum
    {
        Perspective,
        Orthogonal
    };
}

Now, this is better. With this small change, our constants can be referenced as: ProjectionType::Perspective and ProjectionType::Orthogonal. The problem here is the fact that doing this every time for every enum can get a little tedious. Furthermore, our datatype is now called ProjectionType::Enum, which is not that pretty. Can we do better?

The C++11 solution

The ISO Committee decided to take this problem on by introducing the new concept of “enum classes”. Enum classes are just like C-style enums, with the advantage that they define a containing namespace (with the same name as the enum type) for the constants they declare.

enum class ProjectionType
{
    Perspective,
    Orthogonal
};

Notice we declare an enum class by adding the class keyword right after the enum keyword. This statement, which would cause a syntax error in the C++98 standard, is how we declare enum classes in C++11. It must be accepted by all conforming compilers.

Using this declaration, our constants can now be accessed as ProjectionType::Perspective and ProjectionType::Orthogonal, with the added advantage that our type is called ProjectionType.
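The following sketch puts the scoping and forward-declaration properties together. CameraMode, describe and toInt are hypothetical names added for illustration; only ProjectionType comes from the example above.

```cpp
#include <string>

// Forward declaration: allowed for enum classes because their
// underlying type is fixed (int unless stated otherwise).
enum class ProjectionType;

std::string describe(ProjectionType type);

enum class ProjectionType
{
    Perspective,
    Orthogonal
};

// A second enum can reuse a constant name without any clash, since
// each enum class scopes its own constants.
enum class CameraMode
{
    Orthogonal,
    Free
};

std::string describe(ProjectionType type)
{
    switch (type)
    {
        case ProjectionType::Perspective: return "perspective";
        case ProjectionType::Orthogonal:  return "orthogonal";
    }
    return "unknown";
}

// Enum classes do not convert to int implicitly; the cast must be
// spelled out.
int toInt(ProjectionType type)
{
    return static_cast<int>(type);
}
```

Note that writing int i = ProjectionType::Orthogonal; without the cast is a compile error, which is one more way enum classes catch mistakes that C-style enums let through.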

C-style enums vs enum classes

Because C++ remains largely compatible with C, we still have access to C-style enums in C++11-conforming compilers. You should, however, favor enum classes over C-style enums for all source files that are C++ code.

Posted in C++

Function Objects using Variadic Templates in C++11

One of the biggest features introduced in C++11 is lambda expressions. Lambda expressions (or “lambdas”) are a powerful mechanism that allows defining functions “on-the-fly”. These can be stored in function objects and then be used by an object that doesn’t want to reveal its internal implementation, as well as to perform tasks on a parallel thread.

I was wondering how function objects could be implemented using other constructs of the language, mostly to see how much the language had to be changed in order to accommodate them. In this post I will show how to implement function objects using nothing but C++11’s variadic templates.

Variadic templates were introduced in our previous post in this C++11 series. This post will help pave the way for us to move into the more advanced topic of lambda expressions and std::function objects.

Function Pointers in C

As far as I can tell, C has always had functions as first-class citizens. Although their syntax is vastly different from the way one declares variables of other types, you could always define a variable that can be assigned a pointer to a function.

#include <stdio.h>
void print_number(int number)
{
    printf("Argument is: %d\n", number);
}

int main(int argc, char* argv[])
{
    void (*f)(int);  // f is a variable that can 
                     // store a pointer to a 
                     // function that receives 
                     // an int and returns nothing.
    f = print_number; // legal.

    f(1024);         // calls print_number.  

    return 0;
}

Because C is a statically-typed language, we must be very careful with the types we declare f with, otherwise the assignment may be deemed invalid by the compiler.

Another problem is that the syntax can be cumbersome to remember and handle. Notice how the name of the variable is declared within the associated types.

Function Objects in C++

Unless you were using boost, C++ never had its own version of function objects until C++11. Programmers had to rely on the traditional C construct or write functors, classes that implement operator() and allow mimicking a function’s behavior.

Unlike C function pointers, functors are more readable, but we need to declare a class and implement operator() inside it. It’s much more verbose.

C++11 introduced the standard std::function type, which allows encapsulating a function declared by lambda expressions (or by other means). In production code, you will most certainly want to use std::function. In this post, however, out of academic interest only, we’re going to exercise C++11’s variadic templates by implementing a generic function wrapper object called alg::function.
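For reference, this is roughly what the standard facility looks like in use. The helper names twice, Adder and apply are hypothetical; std::function itself is declared in the <functional> header.

```cpp
#include <functional>

// A plain function.
int twice(int x) { return 2 * x; }

// A functor: a class that implements operator().
struct Adder
{
    int offset;
    int operator()(int x) const { return x + offset; }
};

// Accepts any callable whose signature is compatible with int(int):
// a function pointer, a functor or a lambda.
int apply(const std::function<int(int)>& f, int value)
{
    return f(value);
}
```

All three kinds of callable go through the same parameter type: apply(twice, 21), apply(Adder{10}, 5), and apply([](int x) { return x * x; }, 4) all compile. The wrapper built in the rest of this post only covers the first case, plain function pointers.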

alg::function

Using variadic templates, it’s very easy to implement a simple function wrapper capable of handling any function type. I’m going to implement my solution using C function pointers. Remember: in C++11 production code, you will never do this.

namespace alg
{
    template <typename R, typename... T>
    class function
    {
        public:
            function(R (*f)(T...)) { _f = f; }
            R operator()(T... args) {return _f(args...);}
		
        private:
            R (*_f)(T...);

    };
}

And that’s all there is to it, really. Let’s go over the code.

I declare the class alg::function as a template that is to be instantiated with two parameters: a return type dubbed “R” and a list of types which I will refer to as “T”. The list of types can be empty; the return type is mandatory.

alg::function objects will have a single private member that is declared to be a pointer to a C function that returns a value of type R and receives the list of argument types T.

The only two additional methods that we need are: a public constructor that receives the function to wrap, storing it, and operator(), which will let us call the function. Notice how easily we can unpack the arguments when we call _f.

Instancing Examples

Let’s see it in action!

These samples illustrate the basic idea without complex data structures. It is certainly possible to pass alg::function objects around as the first-class citizens that they are, as well as passing-in and returning arbitrary C++ objects and structs from these wrappers.

The only limitation is the fact that we don’t support const parameters. Adding support for them would require tweaking the alg::function class a bit more. We leave this as an exercise to the reader.

A single function that receives an int and prints it:

void fun(int arg)
{
    std::cout << "hello from fun! (arg=" 
              << arg 
              << ")"
              << std::endl;
}

int main(int argc, char* argv[])
{
    alg::function<void, int> f(fun);

    f(0); // prints "hello from fun! (arg=0)"
}

A function that receives no parameters:

void fun2()
{
    std::cout << "hello from fun 2!" << std::endl;
}

int main(int argc, char* argv[])
{
    alg::function<void> f2(fun2);

    f2();  // prints "hello from fun 2!"
}

A function that receives two std::string instances and prints them together:

void fun4(std::string msg1, std::string msg2)
{
    std::cout << "hello from fun 4! (concat=" 
              << msg1 << msg2 
              << ")" 
              << std::endl;
}

int main(int argc, char* argv[])
{
    alg::function<void, std::string, std::string> f4(fun4);

    f4("hello, ", "world!"); // prints "hello from fun 4! (concat=hello, world!)"
}

Conclusion

Variadic templates offer a lot in terms of flexibility and allow us to extend the language even further than it was possible before. Remember, these examples are just food for thought. In C++11 production code, you will most certainly want to use std::function objects and lambda expressions.

See you next time!

Posted in C++

libsdl-1.2 support for OpenGL 3.2

This week we take a break from the C++ saga to talk a little about OpenGL. I’ve forked libsdl-1.2 and added support for creating OpenGL 3.2 forward-compatible contexts. This is something that could be deemed helpful until libsdl 2.0 is released.

You can find the source code on my GitHub page, at: github.com/alesegovia. So far, only the Mac platform is supported, as it’s the only Operating System I currently have access to. I’ll hopefully be able to add Linux support as soon as I can get hold of a Linux box with a suitable video card.

Creating an OpenGL 3.2 compatible context is very simple. Once you have downloaded, compiled and installed libsdl-1.2-gl, you just need to create your window using the new SDL_OPENGLCORE flag.

This sample program creates an OpenGL 3.2 context, displays the OpenGL version number and exits:

#include <SDL.h>
#include <stdio.h>
#include <OpenGL/gl.h>

int main(int argc, char* argv[])
{
    SDL_Init(SDL_INIT_VIDEO);

    SDL_Surface* pSurface = SDL_SetVideoMode(600, 600, 32, SDL_OPENGL|SDL_OPENGLCORE);

    printf("GL Version:%s\n", glGetString(GL_VERSION));

    SDL_Quit();

    return 0;
}

You need to be running Mac OS X Lion or higher in order to be able to create OpenGL 3.2 contexts. If you are running Snow Leopard or your video card does not support OpenGL 3.2, you might get a Compatibility profile and your OpenGL version might be stuck on 2.1.

Also note that Mac OS X reports the OpenGL version to be 2.1 unless you specifically create forward-compatible OpenGL contexts, so if you need to know whether your Mac supports OpenGL 3.2, you can look your system configuration up in this great table maintained by Apple.

If you find this useful, let me know in the comments. Enjoy!

Variadic Templates in C++ 11 (Part 2 of 2)

Last week we started discussing C++11’s new Variadic Templates.

In that article I showed you how to declare a function template with a variadic parameter list that returns the number of template arguments it was instantiated with. Today, we build on those concepts to implement a printf-like function that is typesafe, doesn’t require parsing format characters and supports non-POD data types. Let’s get started.

The first thing I want to do is rewrite the example from last week. This will ease introducing this week’s concepts. Let’s create a new function g that does the same that f did, but receives any number of arguments via template instantiation.

template <class... T>
size_t g(T... args)
{
  return sizeof...(args);
}

g does exactly the same as f; the only difference is that we can now call it using the following syntax:

int main(int argc, char* argv[])
{
  cout << g(); // prints "0"
  cout << g(1); // prints "1"
  cout << g(1,2,3,4,5,6,7); // prints "7"
}

It is, in a way, like a variadic function, but written in C++11 style instead of plain C.

It’s important that you understand how g works. If you don’t, please review the code again and try making changes to it.

Assuming that’s out of the way, let’s delve into today’s article.

We want to develop a print function that can receive any number of parameters of any type. So we can call:


A a; // Create some object...

print(1);
print(1, 2.0f);
print(1, 2.0f, "Hello");
print(1, 2.0f, "Hello", a);

// And so on...

Now, you might be guessing that one way to implement this function would be to have a “base case” that can print one argument and then have it called with each supplied argument.

template <class T>
void print(const T& msg)
{
  cout << msg << " ";
}

That’s a really good idea and, last time, we saw how to determine how many arguments a Variadic Template has been instantiated with (using the sizeof... operator).

Unfortunately, when it comes to Variadic Templates, we can’t iterate over the arguments based on the count. We need to find another way.

In an idea that I regard largely borrowed from functional programming, what if we could separate the list of arguments in the “head” element and the “rest” of the list?

Well, if that was the case, then we could call our base-case print function with the head and then call the general-case function with the rest of the list. Eventually, the list will be empty and we will have printed all elements!

We can do exactly that using C++11’s new template syntax:


// This is our base-case for the print function:

template <class T>
void print(const T& msg)
{
  cout << msg << " ";
}


// And this is the recursive case:

template <class A, class... B>
void print(A head, B... tail)
{
  print(head);
  print(tail...);
}

Let’s try this with a simple program:

int main(int argc, char* argv[])
{

  print(1, "\n");
  print(1, 2.0f, "\n");
  print(1, 2.0f, "Hello", "\n");

  return 0;
}

The output should be something like:

 1 
 1 2 
 1 2 Hello

And that’s all there is to it. The C++ compiler will take care of generating the appropriate code to instantiate our functions and to split the parameter list so we can print the arguments one by one. Notice how this is much easier than having to mess with variadic functions, while being typesafe too.
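For comparison, here is a sketch of a different technique that avoids recursion altogether: expanding the parameter pack inside a braced initializer list, whose elements are guaranteed to be evaluated left to right. This is not the approach used in this article. The names printTo and joinArgs are hypothetical, and the stream is passed in explicitly so the result can be inspected as a string.

```cpp
#include <ostream>
#include <sstream>
#include <string>

template <class... T>
void printTo(std::ostream& out, const T&... args)
{
    // One array element is produced per argument; evaluating it
    // streams the argument as a side effect. The leading 0 keeps the
    // array non-empty when the pack is empty.
    int dummy[] = { 0, ((void)(out << args << " "), 0)... };
    (void)dummy; // silence "unused variable" warnings
}

// Convenience wrapper that collects the output into a string.
template <class... T>
std::string joinArgs(const T&... args)
{
    std::ostringstream out;
    printTo(out, args...);
    return out.str();
}
```

With these definitions, joinArgs(1, 2.0f, "Hello") yields the string "1 2 Hello " (note the trailing space). The recursive head/tail version is arguably easier to read, which is why it makes a better introduction.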

The biggest challenge is assimilating the new syntax, but there are lots of good references around like this article from IBM, where I got the idea of writing a print function, or even the Variadic Template proposal for the ISO committee, another great resource.

Stay tuned for more C++11 goodness next week!

Posted in C++

Variadic Templates in C++11 (Part 1 of 2)

This week we are continuing with the C++11 saga, moving on to a new feature called “Variadic Templates”.

Inspired by the concept of Variadic Functions in C, the idea behind Variadic Templates is to allow a template to be instantiated with any number of arguments.

In this article I’ll cover the new syntax and present a simple example: a function declared with a Variadic Template that returns the number of arguments it was instantiated with.

Previous parts of this saga included an overview of new iterators and lambda expressions as well as Move Semantics, which was treated in two parts: Part 1 and Part 2.

Let’s get started!

First of all, C++11 had to extend the C++ syntax for templates in order to support Variadic Templates. The new syntax allows annotating a parameter with an ellipsis (...) to denote that we may expect to receive zero or more parameters in the given place.

This means that the following declaration is now valid C++:

template <class... T>
void f()
{
}

f is a function that can be expanded with zero or more template parameters.

Now that we know how to declare a Variadic Template, let’s write a simple program that prints the number of arguments the function f template has been expanded with.

#include <iostream>
using std::cout;
using std::endl;

template <class... T>
size_t f()
{
    return sizeof...(T);
}

int main(int argc, char* argv[])
{
    cout << "sizeof f<int>: " << f<int>() << "\n";
    cout << "sizeof f<int, float>: " << f<int, float>() << "\n";
    cout << "sizeof f<int, float, char>: " << f<int, float, char>() << endl;
    return 0;
}

Compiling this program with clang and running it produces the following output:

sizeof f<int>: 1
sizeof f<int, float>: 2
sizeof f<int, float, char>: 3

Here, each line of the output shows the number of parameters the function template was instantiated with. As you probably noticed, the new sizeof... operator takes the packed template arguments and returns their count. If we called f with an empty template argument list (f<>() or simply f()), the number printed would be 0.
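Since sizeof... is evaluated at compile time, the count can also be checked without running anything. The sketch below uses two hypothetical helpers, countTypes and countArgs, to show both the explicit form and the form where the pack is deduced from the call arguments.

```cpp
#include <cstddef>

// Counts explicitly supplied template arguments.
template <class... T>
constexpr std::size_t countTypes()
{
    return sizeof...(T);
}

// Counts call arguments; the pack T is deduced from them.
template <class... T>
constexpr std::size_t countArgs(const T&...)
{
    return sizeof...(T);
}

// All of these are verified by the compiler.
static_assert(countTypes<>() == 0, "empty pack");
static_assert(countTypes<int, float, char>() == 3, "three types");
static_assert(countArgs(1, 'a', 2.0) == 3, "three deduced arguments");
```

If any of the counts were wrong, the program would fail to compile rather than print an unexpected number.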

Now, you might be wondering what Variadic Templates might be used for. In next week’s article I will show you how to write a typesafe printf-like function in C++11 style. Stay tuned!

Posted in C++

Move Semantics in C++11 (Part 2 of 2)

Last week we talked about the problem with passing objects around in the 2003 C++ standard, and mentioned how the new move semantics in C++11 could help us avoid expensive object copies.

In this article I continue where I left off last week and show an updated version of the code that implements move semantics.

Now, the new code might be generated by a C++11 compiler automatically, just like default copy and assignment operators are generated, but I will show you how to implement these yourself.

Normally, we would just go about adding the new functions to our program, and that’s the way it should be for new C++11 codebases. Chances are, however, that today you will probably be adding support to an existing project that might need to be built for a platform that doesn’t have a readily available C++11 compiler.

You wouldn’t want to break compatibility with that platform, so something you can do is to conditionally compile move semantics into your program. This way, you can have the best of both worlds: a speedy implementation for programs compiled with a state-of-the-art compiler and a (slower, but working) fallback for all other target platforms.

So, bearing this in mind, let’s add a new move constructor and a new move assignment operator to A, but protected by a MOVE_CTOR macro:

#include <iostream>
using std::cout;
using std::endl;

class A
{
	public:
		A();
		A(const A& other);
		A& operator=(const A& other);

#ifdef MOVE_CTOR
		A(A&& other);
		A& operator=(A&& other);
#endif //MOVE_CTOR

		float _a;
};

A::A() : _a(0.0f)
{
	cout << "Running A()" << endl;
}

A::A(const A& other)
{
	cout << "Running A(const A&)" << endl;
	_a = other._a;
}

A& A::operator=(const A& other)
{
	cout << "Running operator=(const A&)" << endl;
	_a = other._a;
	return *this;
}

#ifdef MOVE_CTOR

A::A(A&& other)
{
	cout << "Move-constructing A" << endl;
	_a = other._a;
	other._a = 0.0f;
}

A& A::operator=(A&& other)
{
	if (&other != this)
	{
		cout << "Move-assigning A" << endl;
		_a = other._a;
		other._a = 0.0f;
	}

	return *this;
}

#endif //MOVE_CTOR

A getAnA()
{
	A a;
	a._a = 1.0f;
	return a;
}

int main(int argc, char* argv[])
{
	cout << "==Declaring A a==" << endl;
	A a; 
	cout << "==Assigning a=getAnA()==" << endl;
	a = getAnA();
	cout << "==done==" << endl;
	cout << "a._a = "<< a._a << endl;
	return 0;
}

To build this source file with support for move semantics, use the following command. Older compilers can opt-out of building C++11-specific code.

clang++ -std=c++11 -stdlib=libc++ -DMOVE_CTOR main.cpp

If everything's alright, here's the output that should be produced from running this program:

:@~/devel/C++/move_ctor11$ ./a.out 
==Declaring A a==
Running A()
==Assigning a=getAnA()==
Running A()
Move-assigning A
==done==
a._a = 1

Here we can immediately notice how we are avoiding running the (expensive) copy assignment operator and we substitute it with a lightweight move assignment that takes the internal data of the source object.

Think how this could help you avoid deep-copying matrices, trees, memory pages, and even how it would help you code safer by preventing having to sacrifice const-correctness and immutable object state in your code as hacky ways to gain performance. C++11 really does feel like a new language.
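Class A only holds a float, so the move is no cheaper than the copy there. The sketch below shows where the saving comes from when an object owns heap memory. Buffer is a hypothetical class, not part of the example above: copying it duplicates every element, while moving it only transfers a pointer.

```cpp
#include <cstddef>

class Buffer
{
    public:
        // Allocates size floats, all initialized to zero.
        explicit Buffer(std::size_t size)
            : _size(size), _data(new float[size]()) {}

        ~Buffer() { delete[] _data; }

        // Copy: allocates a new block and duplicates every element.
        Buffer(const Buffer& other)
            : _size(other._size), _data(new float[other._size])
        {
            for (std::size_t i = 0; i < _size; ++i)
                _data[i] = other._data[i];
        }

        // Move: steals the pointer and leaves the source empty.
        Buffer(Buffer&& other) noexcept
            : _size(other._size), _data(other._data)
        {
            other._size = 0;
            other._data = nullptr;
        }

        Buffer& operator=(const Buffer& other)
        {
            if (&other != this)
            {
                float* copy = new float[other._size];
                for (std::size_t i = 0; i < other._size; ++i)
                    copy[i] = other._data[i];
                delete[] _data;
                _data = copy;
                _size = other._size;
            }
            return *this;
        }

        Buffer& operator=(Buffer&& other) noexcept
        {
            if (&other != this)
            {
                delete[] _data;       // release our own block first
                _size = other._size;
                _data = other._data;  // take ownership of theirs
                other._size = 0;
                other._data = nullptr;
            }
            return *this;
        }

        std::size_t size() const { return _size; }
        float* data() { return _data; }
        const float* data() const { return _data; }

    private:
        std::size_t _size;
        float* _data;
};
```

Resetting the source to an empty state matters: its destructor still runs, and deleting a null pointer is harmless, whereas leaving the stolen pointer in place would free the same block twice.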

Stay tuned for more C++11 goodness next week!

Posted in C++

Move Semantics in C++11 (Part 1 of 2)

Continuing with the C++11 saga, this week I’ve been playing with move semantics. In this post I share my findings regarding using it on Apple’s Clang version 4.0.

Move semantics are a new mechanism built into C++ that allow us to “move” data from one object to another. They help avoid full object copies that, today, could occur under different scenarios.

The copy involved in passing in values to a function is a good example that can usually be avoided by receiving parameters as const references instead of as values. Other scenarios, like returning an object from a function, are, however, harder to avoid.

Move semantics let us write special functions that can “steal” a source object’s inner data and assign it to a new object. It could help you avoid an expensive copy operation when you know the source object is not going to be used anymore.

I’ve written a simple example to illustrate the problem. Suppose we have class A and we want to have a function that creates an A instance, then customizes it and finally returns it to the caller.

#include <iostream>
using std::cout;
using std::endl;

// A's interface:

class A
{
	public:
		// Default ctor:
		A();
		
		// Copy ctor:
		A(const A& other);

		// Copy-assignment operator:
		A& operator=(const A& other);

		// Inner data, should be private...
		float _a;
};

// A's implementation:

A::A() : _a(0.0f)
{
	cout << "Running A()" << endl;
}

A::A(const A& other)
{
	cout << "Running A(const A&)" << endl;
	_a = other._a;
}

A& A::operator=(const A& other)
{
	cout << "Running operator=(const A&)" << endl;
	_a = other._a;
	return *this;
}

// Helper function to create and customize an A instance:

A getAnA()
{
	A a;
	a._a = 1.0f;
	return a;
}

// Program entry point:

int main(int argc, char* argv[])
{
	cout << "==Declaring A a==" << endl;
	A a; 
	cout << "==Assigning a=getAnA()==" << endl;
	a = getAnA();
	cout << "==done==" << endl;
	cout << "a._a = "<< a._a << endl;
	return 0;
}

Running this code yields the following output:

:@~/devel/C++/move_ctor11$ ./a.out 
==Declaring A a==
Running A()
==Assigning a=getAnA()==
Running A()
Running operator=(const A&)
==done==
a._a = 1

The problem here lies in line 6 of the output. When the instance is to be returned from function getAnA(), it lives in the called function's stack, so we need to copy it into the caller's stack space in order to return it. The code that does the copy is generated automatically by the C++ compiler and will incur a performance penalty.

How slow is it? Well, it depends. For a big object (for instance a memory page, a huge matrix or other gigantic data structure) the performance penalty can be significant, especially when we really start to pass objects around.

This copy, however, could be avoided completely by leveraging the fact that the source instance is going to go away. Move semantics allow us to have the destination A steal (or move) the source A's inner data.

Next week I'll show you how you can implement move semantics to speed up object passing. Stay tuned!

Posted in C++