On Polymorphism in C++

C++, like many other object-oriented languages, provides many facilities for implementing our object-oriented designs.

In this post I talk about a feature that is not frequently discussed, but that must be taken into account when implementing a class hierarchy that leverages polymorphism as part of its design.

Let’s suppose we have a base class (conveniently named “Base”) and a derived class (conveniently called “Derived”) that extends “Base” through Public Inheritance.

If you don’t know what “Public Inheritance” means, all you really need to know for the purpose of this article is that there are three different kinds of inheritance in C++: public, protected and private. Public Inheritance is akin to inheritance in other programming languages such as Java, Python, Objective-C and C#.

Going back to our example, I’ve coded the Base and Derived classes’ interfaces in the following snippet:

#include <iostream>
using std::cout;
using std::endl;

class Base
{
	public:
		Base();
		virtual ~Base();
		virtual void printClassName() = 0; // =0 means "abstract"
};

class Derived : public Base
{
	public:
		Derived();
		virtual ~Derived();
		virtual void printClassName();

};

Now, for the implementation, I want every object whose class derives from “Base” to print its class name upon creation. So, what I am going to do is add a call to printClassName() in Base’s constructor.

Base::Base()
{
	this->printClassName();
}

Base::~Base()
{
}

Derived::Derived() : Base()
{
}

Derived::~Derived()
{
}

void Derived::printClassName()
{
	cout << "Derived" << endl;
}

This is the program entry point:

// program entry point
int main()
{
	Derived* d = new Derived(); //print "Derived"
	delete d;
	return 0;
}

If you know your patterns, you’ll probably have noticed by now that this example uses the same mechanism as the Factory Method pattern: the Base class relies on a virtual method to obtain derived-class-specific behavior.

The problem here is that, when we try to compile this program, we get the following error:

[ale@syaoran factory]$ g++ main.cpp -Wall
Undefined symbols:
  "Base::printClassName()", referenced from:
      Base::Base()  in ccNxv98U.o
      Base::Base()  in ccNxv98U.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

The build fails at link time: the linker (ld, invoked by GCC) reports that Base’s constructor references “Base::printClassName()”, an abstract method for which no implementation exists.

Why does this happen? After all, nobody should be looking for an implementation of an abstract method. The reason lies in how object construction works in C++.

When an object whose class is derived from another class is instantiated, what happens in C++ is that, in order to create this object, an inner instance of its base class is created first. In our example, when calling Derived’s constructor, Base’s constructor is called first.

This implies that the derived part of the object does not exist until the inner base object has been created.
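The order described above can be observed with a small sketch. The names used here (constructionLog, OrderBase, OrderDerived, recordLifetime) are made up for this illustration and are not part of the example program; the sketch only records which constructor and destructor bodies run, and in what order.

```cpp
#include <string>
#include <vector>

// Hypothetical log used only by this sketch.
static std::vector<std::string> constructionLog;

struct OrderBase {
    OrderBase() { constructionLog.push_back("Base constructor"); }
    virtual ~OrderBase() { constructionLog.push_back("Base destructor"); }
};

struct OrderDerived : public OrderBase {
    OrderDerived() { constructionLog.push_back("Derived constructor"); }
    ~OrderDerived() override { constructionLog.push_back("Derived destructor"); }
};

// Creates and destroys one OrderDerived and returns the recorded events.
std::vector<std::string> recordLifetime() {
    constructionLog.clear();
    {
        OrderDerived d; // the base part is built first, then the derived part
    }                   // destruction runs in the reverse order
    return constructionLog;
}
```

The returned log lists the Base constructor before the Derived constructor, and the destructors in the opposite order.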

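In turn, this means that while the base class’ constructor is executing, the object is still treated as a “Base”: virtual dispatch (implemented through the vtable) cannot reach Derived’s overrides yet. The compiler therefore binds the call to Base’s own printClassName(), which has no definition, and that triggers the link error. Strictly speaking, calling a pure virtual method from a constructor is undefined behavior, so you should not count on always getting a diagnostic.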
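One way to see this for yourself is to ask for the object’s dynamic type from inside the base constructor. The following is a minimal sketch with hypothetical names (TypeBase, TypeDerived, wasBaseDuringConstruction, isBaseNow); it relies on the fact that typeid applied to *this inside a constructor reports the class of that constructor.

```cpp
#include <typeinfo>

struct TypeBase {
    bool wasBaseDuringConstruction;

    TypeBase() : wasBaseDuringConstruction(false) {
        // While this constructor runs, *this is treated as a TypeBase,
        // even if a TypeDerived is the object being built.
        wasBaseDuringConstruction = (typeid(*this) == typeid(TypeBase));
    }
    virtual ~TypeBase() {}

    // Same check, but meant to be called after construction has finished.
    bool isBaseNow() const { return typeid(*this) == typeid(TypeBase); }
};

struct TypeDerived : public TypeBase {};
```

For a fully constructed TypeDerived object, wasBaseDuringConstruction is true while isBaseNow() returns false: the object only “becomes” a TypeDerived once the base constructor has finished.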

The case would be even worse if the method were not abstract (but still virtual). The program would then build fine, but at runtime you would find out that the base class’ implementation is being called instead of the derived class’.
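Here is a small sketch of that silent failure. The class and member names (NamedBase, NamedDerived, nameSeenInConstructor, className) are assumptions made for this illustration; instead of printing, the constructor stores the result of the virtual call so it can be inspected afterwards.

```cpp
#include <string>

struct NamedBase {
    std::string nameSeenInConstructor;

    NamedBase() {
        // Virtual call made while only the NamedBase part exists:
        // it resolves to NamedBase::className(), never to an override.
        nameSeenInConstructor = className();
    }
    virtual ~NamedBase() {}

    virtual std::string className() const { return "Base"; }
};

struct NamedDerived : public NamedBase {
    std::string className() const override { return "Derived"; }
};
```

After constructing a NamedDerived, nameSeenInConstructor holds "Base", while calling className() on the finished object returns "Derived". Nothing fails at build time, which is what makes this variant harder to spot.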

How can you fix this problem?

Fortunately, the solution is simple. All you have to do is adopt the following golden rule:

Never call virtual methods on a partially-created object.

In order to fix our program, what I’m going to do first is to remove the printClassName call from Base’s constructor.

So, where should the call go now? That depends on the application. Since I would rather not change my design, I’m going to split object creation into two phases: construction and initialization.

Construction will be carried out by the constructors, and initialization will be carried out by a special init() method that we can call once all the base objects have been created.

This separation will allow us to work safely, as all constructors will have executed when init() is called.

After applying these changes, the declaration of the Base class gains a new init() method:

#include <iostream>
using std::cout;
using std::endl;

class Base
{
	public:
		Base();
		virtual ~Base();
		virtual void init(); // new: two-phase initialization
		virtual void printClassName() = 0; // =0 means "abstract"
};

And the base class’ implementation will define init() so that it calls our virtual method:

void Base::init()
{
    printClassName();
}

At this point the program should build fine, provided main() calls d->init() right after constructing the object. When run, it should produce the following result:

[ale@syaoran factory]$ ./a.out
Derived

As a final note, it would be possible to hide the call to init() inside the Derived class’ constructor, but this only works as long as Derived is the last class in the hierarchy.

As soon as we add a “Derived2” class that inherits from Derived, we have to move the call to init() into Derived2’s constructor, and so forth.
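One possible way around that, offered here only as a sketch and not as part of the original design, is to keep init() out of every constructor and let a small helper perform both phases. The create() function template and the initialized() flag are hypothetical additions, and className() returns a string (rather than printing directly) to keep the sketch short.

```cpp
#include <iostream>
#include <memory>
#include <string>

class Base {
public:
    Base() : initialized_(false) {}
    virtual ~Base() {}

    // Phase two: called by create() after all constructors have run.
    void init() {
        std::cout << className() << std::endl;
        initialized_ = true;
    }
    bool initialized() const { return initialized_; }
    virtual std::string className() const = 0;

private:
    bool initialized_;
};

class Derived : public Base {
public:
    std::string className() const override { return "Derived"; }
};

class Derived2 : public Derived {
public:
    std::string className() const override { return "Derived2"; }
};

// Hypothetical helper: constructs a T, then runs init() exactly once.
template <typename T>
std::unique_ptr<T> create() {
    std::unique_ptr<T> object(new T());
    object->init();
    return object;
}
```

Calling create<Derived2>() prints the name of the most derived class no matter how deep the hierarchy grows, so no constructor has to change when a new subclass is added. The trade-off is that callers must go through create() instead of using new directly.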

It would be interesting to see whether this problem arises in other programming languages. This will be left as an exercise for the reader.

You can find more information on abstract (pure virtual) methods and problems related to them here.