Vortex V3 Renderer

The past couple of weeks have been super busy at work, but I’ve still managed to get the ball rolling with the new renderer for the Vortex Engine.

This is the first time in years that I've set out to write a completely new renderer for Vortex. The new renderer is a clean-room implementation of the rendering logic, designed specifically for the engine's new Entity-Component system.

Current Rendering Systems in Vortex

Let’s start by taking a look at the current rendering systems in the Vortex Engine. Ever since 2011, Vortex has had two rendering systems: a Fixed Pipeline rendering system and a Programmable Pipeline rendering system.

Dual Pipeline support: a Comparison of the Rendering Pipelines available in Vortex Engine. The image on the left represents the Fixed Pipeline. The image on the right represents the Programmable Pipeline.

Both of these rendering systems are pretty robust: they have been used to develop and launch successful apps on the iOS App Store, and they have proven reliable and portable, allowing the programmer to target Linux, Windows, Mac OS X and Android, as well as iOS.

The problem with these renderers is that they were designed with Vortex’s Scenegraph-based API in mind. This means that these renderers do not know anything about Entities or Components, but rather, they work at the Node level.

Moving forward, the direction for the Vortex Engine is to provide an Entity Component interface and move away from the Scenegraph-based API. This means that glue code has to be developed to allow the traditional renderers to draw the Entity-Component hierarchy.

So… why is this a problem?

Why a new Renderer?

As Vortex V3 now provides a brand new Entity-Component hierarchy for expressing scenes, glue code had to be developed in order to leverage the legacy renderers in the Vortex Editor. In the beginning this was not a major problem; however, as the Entity-Component system matures, it has become ever more difficult to maintain compatibility with the legacy renderers.

PBR Materials in Unreal Engine 4. Image from ArtStation’s Pinterest.

Another factor is the incredible pace at which rendering practice has continued to develop in these past few years. Nowadays, almost all modern mobile devices support OpenGL ES 2.0 and even 3.0, and PBR rendering has gone from a distant possibility to a very real technique on mobile hardware. Supporting PBR on the legacy renderers would require a significant rewrite of their core logic.

Finally, from a codebase standpoint, both renderers were implemented more than 5 years ago, back when C++11 was just starting to get adopted and compiler support was very limited. This does not mean that the legacy renderers’ codebases are obsolete by any means, but by leveraging modern C++ techniques, they could be cleaned up significantly.

From all of this, it is clear that a new clean-room implementation of the renderer is needed.

Designing a New Renderer

The idea is for the new renderer to be able to work with the Entity-Component hierarchy natively without a translation layer. It should be able to traverse the hierarchy and determine exactly what needs to be rendered for the current frame.

Once the objects to be rendered have been determined, a new and much richer material interface would determine exactly how to draw each object according to its properties.

Just like with the Vortex 2.0 renderer, this new renderer should fully support programmable shaders, but through a simplified API that requires less coding and allows drawing much more interesting objects and visual effects.
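
To make the idea more concrete, below is a minimal sketch of the kind of frame traversal described above. It is written in Python purely for brevity, and every class and method name in it is hypothetical; it only illustrates the intended division of labor between the Entity-Component hierarchy and the material interface, not the actual Vortex V3 code.

def collect_renderables(entity, out):
    # Walk the Entity-Component hierarchy and gather anything drawable.
    mesh = entity.get_component("Mesh")
    material = entity.get_component("Material")
    if mesh is not None and material is not None:
        out.append((mesh, material))
    for child in entity.children:
        collect_renderables(child, out)

def render_frame(root_entity, camera):
    renderables = []
    collect_renderables(root_entity, renderables)
    for mesh, material in renderables:
        material.bind(camera)  # the material decides how to draw
        mesh.draw()            # the mesh provides what to draw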

Choosing a backing API

Choosing a rendering API used to be a simple decision: pick DirectX for Windows-only code or OpenGL (ES) for portable code. The landscape has changed significantly in the past few years, however, and there are now a plethora of APIs we can choose from to implement hardware-accelerated graphics.

This year alone, the Khronos Group released the Vulkan specification, a new API that tackles some of the problems inherent to OpenGL, as seen in the following image.

Comparison of the OpenGL and Vulkan APIs of the Khronos Group. Slide Copyright (C) 2016 Khronos Group.

Now, both Vulkan and Metal are very appealing low-level APIs that provide a fine degree of control over the hardware, but each has a limitation: Metal is Apple-specific, while Vulkan is cross-platform but not available on Apple devices.

DirectX 12 is Windows 10 only, which rules it out right off the bat (for this project at least). DirectX 11 is a good option but, again, Windows only.

This leaves OpenGL and OpenGL ES as the two remaining options. I've decided to settle on Core OpenGL 3.3 at this time. I think it's an API that exposes enough modern concepts to allow implementing a sophisticated renderer while also remaining fully compatible with Windows, iOS and everything in-between.

I don't rule out implementing a dedicated Metal or Vulkan backend for Vortex in the future, and nothing in the Engine design should prevent this from happening; however, at this time, we have to start on a platform that's available everywhere.

Using Core OpenGL 3.3 will also allow reusing the battle-tested shader API in Vortex. This component has several years of service under its belt and I'd venture to say that all of its bugs have been found and fixed.

Other than this particular component, I’m also reusing the material interface (but completely overhauling it) and developing a new RetainedMesh class for better handling mesh data streaming to the GPU.

Closing Thoughts

Writing a comprehensive renderer is no weekend task. A lot of components must be carefully designed and built to fit together. The margin for error is minimal: a problem in any component that touches the video card can potentially make the entire system fail.

It is, at the same time, one of the most satisfying tasks that I can think of as a software engineer. Once you see it come to life, it's more than the sum of its parts: it's a platform for rendering incredible dream worlds on a myriad of platforms.

I will take my time developing this new renderer, enjoying the process along the way.

Stay tuned for more! : )

OpenGL from a 10,000ft view

This month marks 10 years since I started learning and using OpenGL, and what a ride it has been! I started off with basic OpenGL 1.1 back at university under the guidance of my mentor and ex-Googler Gabriel Gambetta, then moved on to the programmable pipeline by teaching myself how to write shaders, and then rode the wave of the mobile revolution with OpenGL ES on the iPhone.

OpenGL Logo. Copyright (C) Khronos Group.

As part of this process, I've also had the privilege of teaching OpenGL to others at one of the most important private universities back home. This pushed me to learn ever more about the API and improve my skills.

Rather than doing a retrospective post to commemorate the date, I thought about doing something different. In this post I'm going to explain how OpenGL works from a 10,000ft view. I will lay out the main concepts of how vertex and triangle data gets converted into pixels on the screen and, in the process, explain how a video card works. Let's get started!

What is OpenGL

At the most basic level, OpenGL can be seen as a C API that allows a program to talk to the video driver and request operations or commands to be performed on the system’s video card.

Titan X GPU by NVIDIA. Image courtesy of TechPowerup.com

So what is a video card? A video card (or GPU) is a special-purpose parallel computer, excellent at executing a list of instructions on multiple data at the same time. A video card has its own processors, its own memory and it’s good at performing one particular set of operations (namely, linear algebra) very very fast.

What OpenGL gives us is access to this device through a client/server metaphor where our program is the client that “uploads” commands and data to the video driver. The video driver, which is the server in this metaphor, buffers this data and, when the time is right, it will send it to the video card to execute it.

Our program’s “instance” inside the video driver is known as the OpenGL Context. The Context is our program’s counterpart in the video card and it holds all the data we’ve uploaded (including compiled shader programs) as well as a large set of global variables that control the graphics pipeline configuration. These variables comprise the OpenGL State and they’re the reason OpenGL is usually seen as a State Machine.
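
To make the State Machine idea concrete, here is a tiny PyOpenGL sketch (the same binding used in the sample program later in this post). It assumes a valid context has already been created; each call simply writes or reads a piece of global state held by the Context:

from OpenGL.GL import *

# Each of these calls flips a global switch stored in the OpenGL Context
# and affects every subsequent draw call until it is changed again.
glEnable(GL_DEPTH_TEST)                             # turn on depth testing
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA)   # configure blending
glClearColor(0.2, 0.2, 0.2, 1.0)                    # set the clear color

# State can also be queried back from the Context at any time:
print(glGetString(GL_VERSION))       # the driver's version string
print(glIsEnabled(GL_DEPTH_TEST))    # GL_TRUE after the glEnable above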

The Graphics Pipeline (Simplified)

Remember how I mentioned that the video card excels at performing a limited set of operations very very fast? Well, what are these operations?

The way the video card works is that data is fed into it through OpenGL and then it goes through a series of fixed steps that operate on it to generate, ultimately, a series of pixels. It is the job of the video card to determine which pixels should be painted and using which color. And that’s really all the video card does at the end of the day: paint pixels with colors.

The following image, taken from Wikipedia, shows a simplified view of the data processing pipeline that OpenGL defines.

A simplified view of the OpenGL pipeline. Source: Wikipedia

In this image, imagine the data coming in from the user program; the diagram shows what happens to this data inside the video card.

Here:

  1. Per-Vertex Operations: these are operations applied to the supplied vertex data. This data can be coming in hot from main system memory or be already stored in video card memory. Here is where vertices are transformed from the format they were originally specified in into something we can draw on the screen. The most common scenario here is to take a piece of geometry from Object Space (the coordinate system the artist used), place it in front of a virtual “camera”, apply a perspective projection and adjust its coordinates (a short sketch of this transform chain appears right after this list). In the old days, here is where all transformation and lighting would take place. Nowadays, when a shader program is active, this stage is implemented by the user’s Vertex Shader.
  2. Primitive Assembly: here’s where OpenGL actually assembles whatever vertex data we supplied into its basic primitives (Triangle, Triangle Strip, Triangle Fan, Lines or Points, among others). This is important for the next step.
  3. Rasterization: is the process of taking a primitive and generating discrete pixels. It amounts to, given a shape, determining what pixels said shape covers. If the conditions are right, texture memory can be sampled here to speed up the texturing process.
  4. Per-Fragment Operations: are operations performed on would-be pixels. If there is a shader program active, this is implemented by the user in the fragment shader. Texture mapping operations take place here, as well as (usually) shading and any other operations that the user can control. After this stage, a number of operations take place based on the State Machine. These operations include depth testing, alpha testing and blend operations.
  5. Framebuffer: finally, this is the image we are rendering our pixels to. It is normally the screen, but we can also define a texture or a Render Target object that we could then sample to implement more complex effects. Shadow Mapping is a great example of this.
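
Before we look at a full program, here is a tiny sketch of the transform chain from step 1. It uses numpy and leaves the matrices as identity placeholders (a real program would build them from the camera and from the object's placement in the world); when a shader program is active, this is essentially the math the vertex shader performs:

import numpy as np

model      = np.identity(4)   # Object Space -> World Space
view       = np.identity(4)   # World Space  -> Camera Space
projection = np.identity(4)   # Camera Space -> Clip Space (perspective)

def transform_vertex(position):
    # position is (x, y, z) in Object Space; returns clip-space coordinates.
    p = np.array([position[0], position[1], position[2], 1.0])
    return projection @ view @ model @ p

print(transform_vertex((0.0, 1.0, 0.0)))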

Sample OpenGL Program

Having taken a (very quick) look at OpenGL, let’s see what a simple OpenGL program might look like.

We are going to draw a colored triangle on the screen using a very simple script that shows the basic interaction between our program and the video card through OpenGL.

A colored triangle drawn by a simple program exercising the OpenGL API.

I’m using Python because I find that its super simple syntax helps put the focus on OpenGL. OpenGL is a C API, however, and in production code we tend to use C or C++. There are also bindings available for Java and C#, but, mind you, these just marshal the calls into C and invoke the API directly.

This script can be divided into roughly three parts: initializing the window and OpenGL context, declaring the data we are going to feed to the video card, and a simple event loop. Don’t worry, I’ll break it down in the next section.

#!/opt/local/bin/python2.6
import pygame
from OpenGL.GL import *

def main():
	# Boilerplate code to get a window with a valid
	# OpenGL Context
	w, h = 600, 600
	pygame.init()
	pygame.display.set_caption("Simple OpenGL Example")
	scr = pygame.display.set_mode((w,h), pygame.OPENGL|pygame.DOUBLEBUF)
	
	glClearColor(0.2, 0.2, 0.2, 0.0)

	# Data that we are going to feed to the video card:
	vertices = [ \
		-1.0, -1.0, 0.0, \
		1.0, -1.0, 0.0,  \
		0.0, 1.0, 0.0 ]

	colors = [ \
		1.0, 0.0, 0.0, 1.0, \
		0.0, 1.0, 0.0, 1.0, \
		0.0, 0.0, 1.0, 1.0 ]
			

	# Here's the game loop, all our program does is
	# draw to a buffer, then show that buffer to the
	# user and read her input.
	done = False
	while not done:
		# Clear the framebuffer
		glClear(GL_COLOR_BUFFER_BIT)
		
		# Supply the video driver a pointer to our
		# data to be drawn:

		glVertexPointer(3, GL_FLOAT, 0, vertices)
		glEnableClientState(GL_VERTEX_ARRAY)

		glColorPointer(4, GL_FLOAT, 0, colors)
		glEnableClientState(GL_COLOR_ARRAY)

		# Now that all data has been set, we tell
		# OpenGL to draw it, and which primitive
		# our data describes. This will be used
		# at the primitive assembly stage.
		glDrawArrays(GL_TRIANGLES, 0, 3)

		# Clean up
		glDisableClientState(GL_COLOR_ARRAY)
		glDisableClientState(GL_VERTEX_ARRAY)
		
		# Show the framebuffer
		pygame.display.flip()

		# Process input:
		for evt in pygame.event.get():
			if evt.type == pygame.QUIT:
				done = True

if __name__ == "__main__":
	main()

If you’re familiar with OpenGL, you’ll notice I’m using mostly OpenGL 1.1 here. I find it’s a simple way to show the basic idea of how data is fed into the video card. Production-grade OpenGL will no doubt prefer to buffer data on the GPU and leverage shaders and other advanced rendering techniques to efficiently render a scene composed of thousands of triangles.

Also note that the data is in Python list objects and, therefore, the PyOpenGL binding is doing a lot of work behind the scenes here to convert it into the float arrays we need to supply to the video card.

In production code we would never do this; however, doing anything more efficient would require fiddling with pointer syntax that would undoubtedly make the code harder to read.
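
For reference, a more production-like setup might look like the sketch below: the vertex data is packed tightly with numpy, uploaded once into a Vertex Buffer Object that lives in GPU memory, and described with glVertexAttribPointer. This is only a sketch; it assumes a current context and an active shader program expecting a vertex attribute at location 0, and a Core profile context would additionally require a Vertex Array Object.

import numpy as np
from OpenGL.GL import *

vertices = np.array([-1.0, -1.0, 0.0,
                      1.0, -1.0, 0.0,
                      0.0,  1.0, 0.0], dtype=np.float32)

# Create a buffer in GPU memory and copy the vertex data into it once.
vbo = glGenBuffers(1)
glBindBuffer(GL_ARRAY_BUFFER, vbo)
glBufferData(GL_ARRAY_BUFFER, vertices.nbytes, vertices, GL_STATIC_DRAW)

# Describe the layout of the buffer: 3 floats per vertex, tightly packed.
glEnableVertexAttribArray(0)
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, None)

# Drawing now reads directly from video card memory.
glDrawArrays(GL_TRIANGLES, 0, 3)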

Putting it all together

Now, if you’re unfamiliar with OpenGL code, let’s see how our program is handled by the Graphics Pipeline.

		# Supply the video driver a pointer to our
		# data to be drawn:

		glVertexPointer(3, GL_FLOAT, 0, vertices)
		glEnableClientState(GL_VERTEX_ARRAY)

		glColorPointer(4, GL_FLOAT, 0, colors)
		glEnableClientState(GL_COLOR_ARRAY)

We start off by providing an array of vertices and colors to OpenGL, as well as a description of how this data is to be interpreted. Our calls to glVertexPointer and glColorPointer (in real life you would use glVertexAttribPointer instead) tell OpenGL how our numbers are to be interpreted. In the case of the vertex array, we say that each vertex is composed of 3 floats.

glEnableClientState is a function that tells OpenGL that it’s safe to read from the supplied array at the time of drawing.

		# Now that all data has been set, we tell
		# OpenGL to draw it, and which primitive
		# our data describes. This will be used
		# at the primitive assembly stage.
		glDrawArrays(GL_TRIANGLES, 0, 3)

glDrawArrays is the actual function that tells OpenGL to draw, and what to draw. In this case, we are telling it to draw triangles out of the data we’ve supplied.

After this call, vertex data will go through the per-vertex operations stage and then be handed off to the primitive assembly, which will effectively interpret the vertices as forming part of one (or more) triangles.

Next, the rasterization stage will determine which pixels on the framebuffer would be covered by our triangle and emit these pixels, which will then go to the per-fragment operations stage. The rasterization stage is also responsible for interpolating vertex data over the triangle; this is why we get a color gradient spanning the area of the triangle: it’s simply the interpolation of the colors at the three vertices.

This is all happening inside the video card, in parallel with our event loop, which is why there is no source code here to show for it.

		# Show the framebuffer
		pygame.display.flip()

Finally, after everything is said and done, the video card writes the resulting pixels on the framebuffer, and we then make it visible to the user by flipping the buffers.

In Closing and Future Thoughts

We’ve barely scratched the surface of what OpenGL is and can do. OpenGL is a big API that has been around for 20+ years and has kept adding new features as video card and video game companies continue to push for ever more realistic graphics.

Now, while 20+ years of backwards compatibility allow running old code almost unmodified on modern systems, design decisions accumulated over time tend to obscure the optimal path to performance, as well as to impose restrictions on applications that would benefit from more direct control of the video card.

Vulkan logo. ™ Khronos Group.

These points, made by the Khronos Group itself, have led to the design and development of a new graphics API standard called Vulkan. Vulkan is a break from the past that provides a slimmed-down API more suitable for modern-day hardware, in particular for multi-threaded and mobile applications.

OpenGL, however, is not going away any time soon, and the plan for the Khronos Group, at least for the time being, appears to be to offer both APIs side by side and let developers choose the one better suited to the problem at hand.

Additionally, with Apple focusing on Metal and Microsoft on DX12, OpenGL (in particular OpenGL ES 2.0) remains the only truly cross-platform API that can target almost every relevant device on the planet, be it an iPhone, an Android phone, a Windows PC, GNU/Linux or Mac.

Finally, the large body of knowledge surrounding 20+ years of OpenGL being around, coupled with OpenGL’s relative “simplicity” when compared to a lower-level API such as Vulkan, may make it a more interesting candidate for students learning their first hardware-accelerated 3D API.

As time marches on, OpenGL remains a strong contender, capable of pushing anything from AAA games (like Doom) to modern-day mobile 3D graphics and everything in-between. It is an API that has stood the test of time, and will continue to do so for many years to come.

The GLSL Shader Editor

This week we take a break from work in the Vortex Editor to revisit an older personal project of mine: the GLSL Shader Editor, a custom editor for OpenGL shaders.

The UI of my custom shader editor.

The idea of the editor was to allow very fast shader iteration times by providing an area where a shader program could be written and then, by simply pressing Cmd+B (Ctrl+B on Windows), the shader source would be compiled and hot-loaded into the running application.

This hot-loading allowed seeing the results of the new shading instantly, without having to stop the app and without even having to save the shader source files, which made for very fast turnaround when experimenting with shader ideas.

As the image above shows, the UI was divided in two main areas: an Edit View and a Render View.

The Edit View consisted of a tabbed component with two text areas. These text areas (named "Vertex" and "Fragment") are where you could write your custom vertex and fragment shaders respectively. The contents of these two would be the shader source that would be compiled and linked into a shader program.

The shader program would be compiled by pressing Cmd+B and, if no errors were found, then it would be hot-loaded and used to shade the model displayed in the Render View.

The status bar (showing “OK” in the image) would display any shader compilation errors as reported by the video driver.
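
Under the hood this is the standard GLSL compilation API; the sketch below (PyOpenGL syntax rather than the editor's actual C++ code) shows where that status text comes from: if compilation fails, the driver's info log is retrieved and can be shown to the user.

from OpenGL.GL import *

def try_compile(source, shader_type):
    # shader_type is GL_VERTEX_SHADER or GL_FRAGMENT_SHADER.
    shader = glCreateShader(shader_type)
    glShaderSource(shader, source)
    glCompileShader(shader)
    if glGetShaderiv(shader, GL_COMPILE_STATUS) != GL_TRUE:
        log = glGetShaderInfoLog(shader)   # the driver's error report
        glDeleteShader(shader)
        return None, log
    return shader, "OK"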

The application had a number of built-in primitives and it also allowed importing models in the OBJ format. It was developed on Ubuntu Linux and supported MS Windows and OS X on a number of different video cards.

Application Features

  • Built entirely in C++.
  • Supports Desktop OpenGL 2.0.
  • Qt GUI.
  • Supported platforms: (Ubuntu) Linux, MS Windows, OSX.
  • Diverse set of visual primitives and OBJ import support.
  • Very efficient turn-around times by hot-loading the shader dynamically – no need to save files!
  • GLSL syntax highlighting.
  • Docked, customizable UI.

Interestingly, this project was developed at around the same time that I got started with the Vortex Engine; therefore, it does not use any of Vortex’s code. This means that all shader compiling and loading, as well as all rendering, was developed from scratch for this project.

I’ve added a project page for this application (under Personal Projects in the blog menu). I’ve also redesigned the menu to list all different personal projects that I’ve either worked on or that I’m currently working on, so please feel free to check it out.

Next week, we’ll be going back to the Vortex Editor! Stay tuned for more!

Bump Mapping a Simple Surface

In my last post, I started discussing Bump Mapping and showed a mechanism through which we can generate a normal map from any diffuse texture. At the time, I signed off by mentioning next time I would show you how to apply a bump map on a trivial surface. This is what we are going to do today.


Bump Mapping a Door texture. Diffuse and Normal maps taken from the Duke3D HRP. Rendered using Vortex 3D Engine. (HTML5 video, a compatibility GIF version can be found here.)

In the video above you can see the results of the effect we are trying to achieve. This video was generated from a GIF file created using the Vortex Engine. It clearly shows the dramatic lighting effect bump mapping achieves.

Although it may seem as if this image is composed of a detailed mesh of the boss’ head, it is in fact just two triangles. If you could see the geometry from the side, you’d see it’s completely flat! The illusion of curvature and depth is created by applying per-pixel lighting on a bump-mapped surface.

How does bump mapping work? Our algorithm takes two images as input: the diffuse map and the bump map. The diffuse map is just the color of each pixel in the image, whereas the bump map consists of an encoded set of per-pixel normals that we will use to affect our lighting equation.

Here’s the diffuse map of the Boss door:

Diffuse map of the boss door. Image taken from the Duke3D HRP project.

And here’s the bump map:

Bump map of the boss door. Image taken from the Duke3D HRP project.

I’m using these images taken from the excellent Duke3D High Resolution Pack (HRP) for educational purposes. Although we could have generated the bump map using the technique from my previous post, this specially-tailored bump map provides better results.

Believe it or not, there are no more input textures used! The final image was produced by applying the technique on these two. This is the reason I think bump mapping is such a game changer. This technique alone can significantly up the quality and realism of the images our renderers produce.

It is especially shocking when we compare the diffuse map with the final bump-mapped image. Even if we applied per-pixel lighting to the diffuse map in our rendering pipeline, the results would be nowhere close to what we can achieve with bump mapping. Bump mapping really makes this door “pop out” of its surface.

Bump Mapping Theory

The theory I develop in these sections is heavily based on the books Mathematics for 3D Game Programming and Computer Graphics by Eric Lengyel and More OpenGL by David Astle. You should check those books for the definitive reference on bump mapping. Here, I try to explain the concepts in simple terms.

So far, you might have noticed I’ve been mentioning that the bump map consists of the “deformed” normals that we should use when applying the lighting equation to the scene. But I haven’t mentioned how these normals are actually introduced into our lighting equations.

Remember from my previous post how we mentioned that normals are stored in the RGB image? Remember that normals close to (0,0,1) looked blueish? Well, that is because normals are stored in a coordinate system that corresponds to the image. This means that, unfortunately, we can’t just take each normal N and plug it into our lighting equation. If we call L the vector that takes each 3D point (corresponding to each fragment) to the light source, the problem here is that L and N are in different coordinate systems.

L is, of course, in camera or world space, depending on where you like doing your lighting math. But where is N? N is defined in terms of the image. That’s neither of those spaces.

Where is it then? N is actually in its own coordinate system, which authors refer to as “tangent space”.

In order to apply per-pixel lighting using the normals coming from the bump map, we’ll have to bring all vectors into the same coordinate system. For bump mapping, we usually bring the L vector into tangent space instead of bringing all the normals back into camera/world space. It is more convenient (the transform can be done once per vertex rather than once per fragment) and produces the same results.

Once L has been transformed, we will retrieve N from the bump map and use the Lambert equation between these two to calculate the light intensity at the fragment.

From World Space to Tangent Space

How can we convert from camera space to tangent space? Tangent space is not by itself defined in terms of anything that we can map to our mesh. So, we will have to use one additional piece of information to determine the relationship between these two spaces.

Given that our meshes are composed of triangles, we will assume the bump map is to be mapped on top of each triangle. The orientation will be given by the direction of the texture coordinates of the vertices that comprise the triangle.

This means that if we have a triangle with an edge running from (-1,1) to (1,1), and texture coordinates running from (0,1) to (1,1) along that edge, then the horizontal vector (1,0) is tangent to the surface and aligned with the horizontal texture coordinate direction. We will call this vector the tangent.

Now, we need two more vectors in order to define the coordinate system. Well, the other vector we can use is the normal of the triangle. This vector is, by definition, perpendicular to the surface and will be perpendicular to the tangent.

The final vector we will use to define the coordinate system has to be perpendicular to both, the normal and the tangent, so we can calculate it using a cross product. There is an ongoing debate whether this vector should be called the “bitangent” or the “binormal” vector. According to Eric Lengyel the term “binormal” makes no sense from a mathematical standpoint, so we will refer to it as the “bitangent”.

Now that we have three vectors that define tangent space, we can create a transformation matrix that takes vector L and puts it in the same coordinate system as the normals for that specific triangle. Doing this for every triangle allows applying bump mapping over the entire surface.
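
In code, that change of basis boils down to three dot products, because the rows of the matrix are simply T, B and N. The sketch below uses plain Python instead of shader code, purely to illustrate the math; it assumes T, B and N are normalized and expressed in the same space as L:

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def to_tangent_space(L, T, B, N):
    # Multiplying L by the 3x3 matrix whose rows are T, B and N is the
    # same as projecting L onto each of the three basis vectors.
    return (dot(L, T), dot(L, B), dot(L, N))

# For a flat surface like our door, T=(1,0,0), B=(0,1,0), N=(0,0,1), so
# tangent space happens to coincide with object space and L is unchanged.
print(to_tangent_space((0.3, 0.4, 0.87), (1, 0, 0), (0, 1, 0), (0, 0, 1)))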

Responsibility – who does what

Although we can compute the bitangent and the transform matrix in our vertex shader, we will have to supply the tangent vectors as input to our shader program. Tangent vectors need to be calculated using the CPU, but (thankfully) only once. Once we have them, we supply them as an additional vertex array.

Calculating the tangent vectors is trivial for a simple surface like our door, but can become very tricky for an arbitrary mesh. The book Mathematics for 3D Game Programming and Computer Graphics provides a very convenient algorithm to do so, and is widely cited in other books and the web.
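
As an illustration, below is a minimal Python sketch of the standard construction for a single triangle, built from the triangle's edge vectors and the corresponding texture-coordinate deltas. A full implementation (like the one in the book) also accumulates and averages the tangents of triangles that share a vertex and re-orthogonalizes them against the vertex normal; none of that is shown here.

def triangle_tangent(p0, p1, p2, uv0, uv1, uv2):
    # Edge vectors of the triangle and their texture-coordinate deltas.
    e1 = [p1[i] - p0[i] for i in range(3)]
    e2 = [p2[i] - p0[i] for i in range(3)]
    du1, dv1 = uv1[0] - uv0[0], uv1[1] - uv0[1]
    du2, dv2 = uv2[0] - uv0[0], uv2[1] - uv0[1]
    r = 1.0 / (du1 * dv2 - du2 * dv1)
    t = [(e1[i] * dv2 - e2[i] * dv1) * r for i in range(3)]
    length = (t[0]**2 + t[1]**2 + t[2]**2) ** 0.5
    return [c / length for c in t]

# Lower-left triangle of the door quad: yields the expected (1.0, 0.0, 0.0).
print(triangle_tangent((-1, -1, 0), (1, -1, 0), (1, 1, 0),
                       (0, 0), (1, 0), (1, 1)))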

For our door, composed of the vertices:

(-1.0, -1.0, 0.0)
(1.0, -1.0, 0.0)
(1.0, 1.0, 0.0)
(-1.0, 1.0, 0.0)

Tangents will be:

(1.0, 0.0, 0.0)
(1.0, 0.0, 0.0)
(1.0, 0.0, 0.0)
(1.0, 0.0, 0.0)

Once we have the bitangents and the transformation matrix, we rotate L in the vertex shader, and pass it down to the fragment shader as a varying attribute, interpolating it over the surface of the triangle.

Our fragment shader can just take L, retrieve (and decode) N from the bump map texture and apply the Lambert equation on both of them. The rest of the fragment shading algorithm need not be changed if we are not applying specular highlights.
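
Put in plain terms, the per-fragment work amounts to the few lines below. This is a Python sketch of the decode-and-Lambert step rather than the actual fragment shader, but the operations are the same:

def decode_normal(rgb):
    # Map color components in [0, 1] back to a normal in [-1, 1], then
    # renormalize to undo any loss from 8-bit quantization.
    n = [2.0 * c - 1.0 for c in rgb]
    length = (n[0]**2 + n[1]**2 + n[2]**2) ** 0.5
    return [c / length for c in n]

def lambert(N, L):
    # Diffuse intensity; both vectors are in tangent space, L normalized.
    return max(0.0, N[0]*L[0] + N[1]*L[1] + N[2]*L[2])

# A "flat" bump map texel (0.5, 0.5, 1.0) decodes to roughly (0, 0, 1):
N = decode_normal((0.5, 0.5, 1.0))
print(lambert(N, (0.0, 0.0, 1.0)))   # light straight ahead -> intensity 1.0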

In Conclusion

Bump mapping is an awesome technique that greatly improves the lighting in our renderers at a limited additional cost. Its implementation is not without its challenges, however.

Here are the steps necessary for applying the technique:

  • After loading the geometry, compute the per-vertex tangent vectors in the CPU.
  • Pass down the per-vertex tangents as an additional vertex array to the shader, along with the normal and other attributes.
  • In the vertex shader, compute the bitangent vector.
  • Compute the transformation matrix.
  • Compute vector L and transform it into tangent space using this matrix.
  • Interpolate L over the triangle as part of the rasterization process.
  • In the fragment shader, normalize L.
  • Retrieve the normal from the bump map by sampling the texture. Decode the RGBA values into a vector.
  • Apply the Lambert equation using N and the normalized L.
  • Finish shading the triangle as usual.

On the bright side, since this technique doesn’t require any additional shading stages, it can be implemented in both OpenGL and OpenGL ES 2.0 and run on most of today’s mobile devices.

In my next post I will show bump mapping applied to a 3D model. Stay tuned!

WebGL for OpenGL ES programmers

I’ve been meaning to look into WebGL for a while now. Coming from an OpenGL (and then an OpenGL ES 2.0) programming background, I figured it should be relatively “easy” to get up to speed with some basic primitive drawing.

Luckily, I was not disappointed: WebGL’s specification was heavily based on OpenGL ES’s, and knowledge transfers easily between the two. In this post I outline the main differences and similarities between these two standards.

WebGL

Screenshot of “WebGL Test” – a test program built to determine if WebGL is supported on a browser – Click on the image for the live version.

I was surprised to learn that WebGL, as an API, is even slimmer than OpenGL ES 2.0. OpenGL ES 2.0 had already done away with many features from ES 1.1, so WebGL, being even smaller, really feels minimal. This is not a bad thing at all, but it may make the learning curve a little steeper for developers just getting started with the *GL APIs.

In order to try WebGL, I decided to create a simple test application that determines if your browser supports it. A screenshot of the application can be seen above. The live version can be accessed by clicking on it or clicking here.

Some of the main things that struck me about WebGL while building this application were:

  • Javascript is the only binding. This might sound obvious, but it’s worth mentioning. WebGL development is done in Javascript (unless you are Notch).
  • No in-system memory Vertex Arrays: usage of VBOs is mandatory. It is the only way to submit geometry to the GPU. I think this decision makes a lot of sense, considering that if data were kept in system RAM as a Javascript array, copying to the GPU every frame may be prohibitively expensive. One of the best practices in OpenGL is to cache data in the GPU’s RAM and WebGL makes it mandatory.
  • Javascript types: WebGL provides several Javascript objects/wrappers that help use the API. Some function calls have been changed from the ES 2.0 spec to accommodate Javascript conventions. The glTexImage2D function, in particular, has a very different signature and seems unable to accept a raw array of bytes as texture data. Javascript Image objects help here.
  • Data must be loaded into WebGL using helper types like Float32Array, which tightly packs vertex data into consecutive memory. This is mandatory for populating VBOs.
  • You will have to deal with interleaved array data and feel comfortable counting bytes to compute strides and offsets. It’s the only way to keep the number of VBOs reasonable and is also one of the best practices for working with OpenGL and WebGL.

On the other hand, just like in ES 2.0:

  • There is no fixed-function pipeline. The T&L pipeline has to be coded.
  • Shaders are mandatory. The current types are vertex and fragment shaders.
  • Old data upload functions, such as immediate mode and display lists, are not supported.
  • There is no matrix stack, nor matrix helper functions. Be prepared to roll your own and try to leverage shaders as much as possible to avoid expensive computations in Javascript.

Conclusion

All things considered, I had fun programming WebGL. While developing the application, I found that most issues I encountered were not caused by WebGL, but rather by “surprises” in the way the Javascript programming language works.

I find WebGL, with its fast iteration cycles (just change the code, save and refresh the browser window), a reasonable tool for prototyping 3D applications and quickly trying out ideas.

The joy of not requiring the user to install any plugins and being able to present 3D data to them right in the browser is the icing on the cake and makes it a very interesting tool for people working in the 3D field.

Stay tuned for more WebGL goodness coming soon!

Writing a Mac OS X Screensaver

A screensaver can be seen as a zero-player game used mostly for entertainment or amusement when the computer is idle.

A Mac OS X screensaver is a system plugin. It is loaded dynamically by the Operating System after a given idle time has elapsed, or embedded into a configuration window within System Preferences.

What is a system plugin? It means we basically write a module that conforms to a given interface and receives callbacks from the OS to perform an operation; in this case, drawing a view.

A custom screensaver that uses OpenGL to render a colorful triangle.

Writing a Mac OS X screensaver is surprisingly easy. A special class from the ScreenSaver framework, called ScreenSaverView, provides the callbacks we need to override in order to render our scene. All work related to packing the executable code into a system component is handled by Xcode automatically.

We can render our view using either CoreGraphics or OpenGL. In this sample, I’m going to use OpenGL to draw the scene.

Initialization and Lifecycle Management

We start off by creating a View that extends ScreenSaverView:

#import <ScreenSaver/ScreenSaver.h>

@interface ScreensaverTestView : ScreenSaverView

@property (nonatomic, retain) NSOpenGLView* glView;

- (NSOpenGLView *)createGLView;

@end

Let’s move on to the implementation.

In the init method, we create our OpenGL Context (associated with its own view). We’ll also get the cleanup code out of the way.

- (id)initWithFrame:(NSRect)frame isPreview:(BOOL)isPreview
{
    self = [super initWithFrame:frame isPreview:isPreview];
    if (self)
    {
        self.glView = [self createGLView];
        [self addSubview:self.glView];
        [self setAnimationTimeInterval:1/30.0];
    }
    return self;
}

- (NSOpenGLView *)createGLView
{
	NSOpenGLPixelFormatAttribute attribs[] = {
		NSOpenGLPFAAccelerated,
		0
	};
	
	NSOpenGLPixelFormat* format = [[NSOpenGLPixelFormat alloc] initWithAttributes:attribs];
	NSOpenGLView* glview = [[NSOpenGLView alloc] initWithFrame:NSZeroRect pixelFormat:format];
	
	NSAssert(glview, @"Unable to create OpenGL view!");
	
	[format release];
	
	return [glview autorelease];
}

- (void)dealloc
{
	[self.glView removeFromSuperview];
	self.glView = nil;
	[super dealloc];
}

The above code is self-explanatory.

Notice how we tell the video driver what kind of OpenGL configuration it should allocate for us; in this case, we only request hardware acceleration. We won’t allocate a depth buffer because there is no need for it (yet).

Rendering Callbacks

Now, let’s move on to implementing the rendering callbacks for our screensaver. Most of the methods here will just forward the events to the super class, but we’ll customize the animateOneFrame method in order to do our rendering.

- (void)startAnimation
{
    [super startAnimation];
}

- (void)stopAnimation
{
    [super stopAnimation];
}

- (void)drawRect:(NSRect)rect
{
    [super drawRect:rect];
}

- (void)animateOneFrame
{
	[self.glView.openGLContext makeCurrentContext];
	glClearColor(0.5f, 0.5f, 0.5f, 1.0f);
	glClear(GL_COLOR_BUFFER_BIT);
	
	static float vertices[] = {
		1.0f, -1.0f, 0.0f,
		0.0f, 1.0f, 0.0f,
		-1.0f, -1.0f, 0.0f
		
	};
	
	static float colors[] = {
		1.0f, 0.0f, 0.0f,
		1.0f, 0.0f, 1.0f,
		0.0f, 0.0f, 1.0f
	};
		
	glVertexPointer(3, GL_FLOAT, 0, vertices);
	glEnableClientState(GL_VERTEX_ARRAY);
	glColorPointer(3, GL_FLOAT, 0, colors);
	glEnableClientState(GL_COLOR_ARRAY);

	glDrawArrays(GL_TRIANGLES, 0, 3);

	glDisableClientState(GL_COLOR_ARRAY);
	glDisableClientState(GL_VERTEX_ARRAY);
	
	glFlush();
	[self setNeedsDisplay:YES];
    return;
}

- (void)setFrameSize:(NSSize)newSize
{
	[super setFrameSize:newSize];
	[self.glView setFrameSize:newSize];
}

We place our rendering logic in the animateOneFrame method. Here, we define our geometry in terms of vertices and colors and submit it as vertex arrays to OpenGL.

Implementing the setFrameSize: method is very important. This method is called when our screensaver starts and we must use it to adjust our views’ dimensions so we can render on the whole screen.

Configure Sheet Methods

Mac OS X screensavers may have an associated configure sheet. The sheet can be used to let the user customize the experience or configure necessary attributes.

- (BOOL)hasConfigureSheet
{
    return NO;
}

- (NSWindow*)configureSheet
{
    return nil;
}

Testing our Screensaver

Unfortunately, we can’t run our screensaver straight from Xcode. Because it’s a system plugin, we need to move its bundle to a specific system folder so Mac OS X can register it. In order to install the screensaver just for ourselves, we place the bundle in the $HOME/Library/Screen\ Savers directory.

Once copied, we need to open System Preferences (if it was already open, we need to close it first). Our screensaver will be available in the “Desktop & Screen Saver” pane, under the “Other” category.

Conclusion

Screensaver writing for Mac OS X is surprisingly easy! With the full power of desktop OpenGL and C++ at our disposal, we can create compelling experiences that delight users and bystanders.

As usual, there are some caveats when developing OS X screensavers. You can read about them here.

Happy coding!

libsdl-1.2 support for OpenGL 3.2

This week we take a break from the C++ saga to talk a little about OpenGL. I’ve forked libsdl-1.2 and added support for creating OpenGL 3.2 forward-compatible contexts. This is something that could be deemed helpful until libsdl 2.0 is released.

You can find the source code on my GitHub page, at: github.com/alesegovia. So far, only the Mac platform is supported, as it’s the only Operating System I currently have access to. I’ll hopefully be able to add Linux support as soon as I can get hold of a Linux box with a suitable video card.

Creating an OpenGL 3.2 compatible context is very simple. Once you have downloaded, compiled and installed libsdl-1.2-gl, you just need to create your window using the new SDL_OPENGLCORE flag.

This sample program creates an OpenGL 3.2 context, displays the OpenGL version number and exits:

#include <SDL.h>
#include <stdio.h>
#include <OpenGL/gl.h>

int main(int argc, char* argv[])
{
    SDL_Init(SDL_INIT_VIDEO);

    SDL_Surface* pSurface = SDL_SetVideoMode(600, 600, 32, SDL_OPENGL|SDL_OPENGLCORE);

    printf("GL Version:%s\n", glGetString(GL_VERSION));

    SDL_Quit();

    return 0;
}

You need to be running Mac OS X Lion or higher in order to be able to create OpenGL 3.2 contexts. If you are running Snow Leopard or your video card does not support OpenGL 3.2, you might get a Compatibility profile and your OpenGL version might be stuck on 2.1.

Also note that Mac OS X reports the OpenGL version to be 2.1 unless you specifically create forward-compatible OpenGL contexts, so if you need to know whether your Mac supports OpenGL 3.2, you can look your system configuration up in this great table maintained by Apple.

If you find this useful, let me know in the comments. Enjoy!

Light Scattering

When we introduced Programmable Pipeline support to the Vortex Engine, we claimed that better visual effects could now be added to the renderer. With the arrival of Render-to-Texture functionality in Vortex 2.0, coupled with the power of shaders, we can now make good on that promise.

Light Scattering (also known as “God Rays”) is a prime example of what can be achieved with Shaders and Render-to-Texture capabilities.

The following image depicts a room consisting of an art gallery with tall pillars. We want to convey the effect of sunlight coming in from outside, illuminating the inner nave. Enter Light Scattering:

A Light Scattering algorithm is used for improving the visual experience. The scene is rendered using Vortex 2.0. Room courtesy of RealityFrontier. (Click to Enlarge)

It can be seen in the picture above how the effect, although subtle, brings more life to the rendered scene. There is still much room to improve the visual results, though, particularly by following Kenny Mitchell’s article in GPU Gems 3.

There is also room for optimization. Currently, the scene is rendered in realtime at an average of 187 frames per second, producing 1024×1024 images on a NVIDIA GeForce GTX465. Moving the algorithm to mobile devices, although easy from a coding perspective, might require extra work to achieve a high frame rate on the embedded GPU.

Here are three more captures from different angles.

(Click to Enlarge)

(Click to Enlarge)

(Click to Enlarge)

A short detour along the way…

I wanted to improve the Stencil Shadow Volumes code a little bit and enable it for the Programmable Pipeline in Vortex, however, I had to take a small detour to fix an issue related to mobile device support.

It turns out that OpenGL ES, the 3D library that Vortex uses to render its graphics on mobile devices, does not support rendering indexed geometry for indices larger than 16 bits. Keep this in mind when developing for mobile devices such as the iPhone or iPad.

I can completely understand the reasoning behind this decision. 32-bit indices could be considered too much data to submit to the GPU on a mobile device. Furthermore, they are not strictly necessary, as they can be replaced (if needed) by splitting the geometry into smaller groups addressed by 16-bit indices.

The solution I devised, which is now part of Vortex 2.0, is to allow the user to specify the data size for the indices when defining the geometry. This provides the flexibility to use 32-, 16- or 8-bit indices. You can even have several geometric objects with different index sizes in the same scene.

The advantage of this mechanism is that it is now very easy to fine-tune the number of bytes used to represent indices in order to improve performance. For example, using 16-bit indices instead of 32-bit indices makes no difference for models composed of fewer than 65,536 vertices, while requiring the copy of only half as many bytes to the GPU.

In the extreme case of 8-bit indices we would be constrained to only 256 vertices, but we would be sending only one fourth of the data to the GPU.
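
As a simple illustration of the trade-off (this is not Vortex's actual API, which is C++), picking the smallest index type that can address a mesh's vertices might look like this:

import numpy as np

def index_dtype(vertex_count):
    # Pick the smallest index type able to address every vertex.
    if vertex_count <= 256:
        return np.uint8      # 1 byte per index
    if vertex_count <= 65536:
        return np.uint16     # 2 bytes per index
    return np.uint32         # 4 bytes; not supported by OpenGL ES

indices = np.array([0, 1, 2, 2, 3, 0], dtype=index_dtype(4))
print(indices.dtype, indices.nbytes, "bytes of index data for this mesh")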

This was mostly plumbing work, so no new picture this time. Stay tuned for more updates!