Deferred Rendering – Part I

This week, work went into completing the initial implementation of a Deferred Renderer for Vortex. Every feature we had been incorporating into the Engine so far was building up to this moment and so, this time around, I finished writing all the components necessary to allow rendering geometry to the G-Buffer and then doing a simple light pass.

Initial implementation of deferred rendering in the Vortex V3 Engine.

Initial implementation of deferred rendering in the Vortex V3 Engine.

The image above shows the current functionality. Here, a box is created and its material is set to the “Geometry Pass” built-in shader. Other engines usually refer to this shader with another name, but ultimately, the purpose is always the same: populate the G-Buffer with data.

Once we’ve assigned the shader to the material, we attach a diffuse texture to it. This will be the object’s albedo value that we will write into the G-Buffer together with the corresponding position and normal data. Multi-Render Target support in Vortex is fundamental to efficiently fill the G-Buffer in only one render pass.

Next, we create a light entity. We mentioned the new light interface in Vortex from February. Light entities are gathered by the V3 renderer and, depending on their type, they are drawn directly on the framebuffer. The light shader is selected depending on the light type and it will access the readily available G-Buffer data to shade the scene.

The Postprocessing Underpinnings in the new renderer turned out to be essential for testing new ideas and debugging the implementation process every step of the way.

I’m quite happy with the results. Now that we have a complete vertical slice of the deferred renderer, we can iteratively add new features on top of it to expand its capabilities. From the top of my head, adding support for more light types and transparent meshes are two major features that I want to tackle in the upcoming weeks.

G-Buffer

This week I started working on the G-Buffer implementation for the deferred renderer.

A test of the G-Buffer. Colors correspond to the coordinates of the vertices in world space.

A test of the G-Buffer. Colors correspond to the coordinates of the vertices in world space.

The G-Buffer pass consists in the first half of any deferred shading algorithm. The idea is that, instead of drawing shaded pixels directly on the screen, we will store geometric information for all our opaque objects in a “Geometry Buffer” (G-Buffer for short). This information will be used later in a lighting pass.

In sharp contrast to forward rendering, where the output of our rendering is the already-lit pixels, lighting calculations here are deferred to a later pass, hence the term deferred rendering.

The image above shows a simple test for the G-Buffer implementation in Vortex. In this test, we draw a number of boxes, with each vertex being colored according to their position in world space as read from the G-Buffer.

Notice how as vertices move left to right, we gain red (x translates to red). Similarly, as vertices move from bottom to top, we gain green (y translates to green). We cannot see it in this picture, but as vertices move from the back to the front, we also gain blue (z translates to blue).

Through this test we can also verify that moving the camera does not change the colors at all. This helps validate that our position data are indeed in world space.

This is going to be all for today. Next week, work will continue in the G-Buffer implementation. Stay tuned for more!

Multi-Render Targets

This week, work went into testing Multi-Render Target support. Usually abbreviated MRT, Multi-Render Targets allow our shaders’ output to be written to more than one texture in a single render pass.

A test of Multi-Render Targets in Vortex 3. Texels from the left half of the box come from a texture, whereas texels from the right half come from another.

A test of Multi-Render Targets in Vortex 3. Texels from the left half of the box come from a texture, whereas texels from the right half come from another.

MRT is the foundation of every modern renderer, as it allows building complex visuals without requiring several passes over the scene. With MRT, we can simply specify all the textures that a render pass will be writing to and then, from the shader code, write to each out variable.

MRT is standard in both Core OpenGL and OpenGL ES 3.0, so opting into it will not preclude the renderer from running on mobile hardware. There were only a couple of minor adjustments that had to be done to our current shaders in order to support MRT.

In particular, OpenGL 3.3 changed the way that fragment shaders write to multiple attachments. Prior to OpenGL 3.3, we would write to the GLSL built-in variable glFragData[i] to specify the output we were writing to. Starting in OpenGL 3.3, we explicitly specify the layout description for our out variables in the fragment shader using the seemingly weird syntax: layout(location = i) out vec3 attachment_i; and then writing to that variable directly.

In order to achieve this, we had to increase the GLSL version to 330. There was the option to stay at version 150 and use a GLSL extension, but we are trying to stick to standard out-of-the-box OpenGL as much as possible, so this was not an option.

A test of the Multi-Render Target functionality in Core OpenGL 3.3

A test of the Multi-Render Target functionality in Core OpenGL 3.3

In order to test MRT I designed a simple test. The output can be seen above. In these images, the shader used to render the box writes the red and green channels to two different texture attachments.

The blit pass that draws the framebuffer to the screen samples both textures and uses pixels coming from the red texture to the left half of the screen and pixels from the green texture for the right half. This generates the visual effect of the box being painted in two colors.

I’m very happy with the results of this test. MRT is a very approachable feature and there is no reason not to use it if you are targeting recent hardware.

The next steps will be to clean up the internal Framebuffer API even more to make MRT support more flexible, and to start working on implementing the G-Buffer. As usual, stay tuned for more!

Postprocessing Underpinnings

These past few weeks the majority of work went into establishing the underpinnings for frame postprocessing in the new V3 renderer.

A postprocessing effect that renders the framebuffer contents in grayscale.

A postprocessing effect that renders the framebuffer contents in grayscale.

Vortex 2.0 was the first version of the engine to introduce support for custom shaders and, although this opened the door to implement postprocessing effects, the API was cumbersome to use this way.

In general, the process would boil down to having two separate scene graphs and manually controlling the render-to-texture process. This would spill many engine details to user programs and was prone to breaking if the engine changed. This would also mean engine users would have to write hundreds of lines of code.

With V3 I want to make render-to-texture the default render mode. This means the engine will never render directly into the default framebuffer but, rather, we always render to an FBO object that we can then postprocess.

Architecting the renderer this way provides the opportunity to implement a myriad of effects that will up the visuals significantly while also keeping the nitty gritty details hidden under the hood.

The image above shows a postprocessed scene where the framebuffer contents were desaturated while being blit onto the screen, producing a grayscale image.

For the upcoming weeks, work will focus on building upon this functionality to develop the components necessary to support more advanced rendering in V3. Stay tuned for more!

Wrapping up 2016: Lessons Learned from working on Vortex

2016 marks a year where a lot of work went into both the Vortex Engine and the Vortex Editor. Working on both these projects during my free time has been great to continue to hone my C++ and OpenGL skills. In this post, I am going to do a quick retrospective on the work done and present a few lessons learned.

Lessons on making such an UI-heavy application

Rendering a deserialized scene using the new renderer.

Rendering a deserialized scene using the new renderer.

The way I decided to approach this work was to divide Vortex into two separate projects: the Vortex Engine project and a brand new Editor built on top of it. The Engine itself has been on-going since 2010.

Today, both projects have reached a level of maturity where there we could clearly have two engineers working full-time on them.

The Editor is definitely one of the driving forces that pushes the Engine into supporting more and more visual features. This is to provide the user more power of expression. The amount of work required for exposing these features to the outside world, however, is something that I did not expect.

Let’s walk through a simple example: selecting an Entity and editing it so we can change its properties. In order to do this we must:

  1. Provide the user a means to select an Entity.
  2. React to this, introspecting the selected Entity and its components.
  3. Build and display a custom UI for each component in the Entity.
  4. For each component, render its UI widgets and preload them the component’s UI.
  5. Provide the user the means to change these properties.
  6. Have the changes reflect in the 3D scene immediately.

This system can quickly grow into thousands of lines of code. Even if the code does not have the strenuous performance requirements of the rendering loop, we still need to develop responsive code with a good architecture that allows building more features on top of it.

The rewards from this effort are huge, however. The Editor UI is the main point of interaction of the user with Vortex and it’s the way that she can tell the Engine what she wants it to do. Good UI and, more importantly, good UX, are key in making the Editor enjoyable to the user.

Lessons on going with C++11

C++ Logo

I decided to finally do the jump and update the codebase from C++98 to C++11 and the min-spec for running the renderer from OpenGL ES 2.0 to Core OpenGL 3.3.

Going to C++11 was the right choice. I find that C++11 allows for more power of expression when developing a large C++ codebase and it provides several utilities for making the code look more clean.

There are a few takeaways from using C++11, however, that I think may not be as clear for people just getting started with this version of the language.

Lessons on C++11 enum classes

I like enum classes a lot and I tend to use them as much as possible. There were several places through the legacy Vortex Engine code where old C-style structs and/or static const values that were used to declare configuration parameters did not look too clean. C++ enum classes helped wrap these while also keeping their enclosing namespace clean.

The only limitation I found was using enum classes for bitmasks. enum class members can definitely be cast to make them behave as expected. Doing this however is heavy-handing the type system and some may argue it does away with the advantages of having it.

Additionally, if you’re trying to implicitly cast a binary operator expression involving an enum class value into a bool, you are going to find a roadblock:

I like doing if (mask & kParameterFlag), as I find it more clear to read than having a mandatory comparison against zero at the end, and C++11 enum classes do not provide that option for me.

Lessons on C++11 Weak Pointers

C++11 Shared Pointers (std::shared_ptr and std::weak_ptr) are great for an application like the Editor, where Entity references (pointers) need to be passed around.

Imagine this situation: we have an Entity with a few components that is selected. We have several UI components that are holding pointers to all these objects. Now, if the user decides to delete this entity or remove one of its components, how do we prevent the UI code from dereferencing dangling pointers? We would have to hunt down all references and null them.

Using C++11’s std::weak_ptr, we know that a pointer to a destroyed Entity or Component will fail to lock. This means that the UI controllers can detect situations where they are pointing to deleted data and graciously handle it.

Lessons on C++11 Shared Pointers

Like any other C++ object, smart pointers passed by value will be copied. Unlike copying a raw pointer, however, copying a smart pointer is an expensive operation.

Smart pointers need to keep a reference count to know when it’s okay to delete the managed object and, additionally, C++ mandates that this bookkeeping be performed in a thread-safe fashion. This means that a mutex will have to be locked and unlocked for every smart pointer we create and destroy.

If not careful, your CPU cycles will be spent copying pointers around and you may have a hard time when you try to scale your Engine down to run on a mobile device.

I investigated this issue and found this amazing write up by Herb Sutter: https://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters/. The idea is to avoid passing shared pointers by copy to any function that does not intend to keep a long-term reference to the object.

Don’t be afraid of calling std::shared_ptr::get() for passing a pointer to a short, pure function that performs temporary work and only opt into smart pointers when you want to signal to the outside world that you want to share the ownership of the passed in pointer.

Lessons on using Core OpenGL

OpenGL Logo. Copyright (C) Khronos Group.

OpenGL Logo. Copyright (C) Khronos Group.

Choosing to go for a specific minimum version of Core OpenGL helps root out all the questions that pop up every time you use anything outside OpenGL 1.1 and wonder if you should implement the ARB / EXT variants as well.

Core OpenGL 3.3 makes it easier to discover the optimal GPU usage path, as you are now required to use VBOs, VAOs, Shaders and other modern Video Card constructs. It also has the added advantage that it will make it so that legacy OpenGL calls (which should not be used anyways) will not work.

Differences in OpenGL, however, are still pervasive enough so that code that is tested and verified to work on Windows may not work at all on OSX due to differences in the video driver. Moving the codebase to a mobile device will again prove a challenge.

The lesson here is to never assume that your OpenGL code works. Always test on all the platforms you claim to support.

In Closing

These were some of the most prominent lessons learned from working on Vortex this year. I am looking forward to continuing to work on these projects through 2017!

I think we’ve only scratched the surface of what the new renderer’s architecture can help build and I definitely want to continue developing the renderer to support more immersive experiences, as well as the Editor so it exposes even more features to the user!

Thank you for joining in though the year and, as usual, stay tuned for more!

Reaching Feature Parity

Last week work continued on different parts of the Editor. As the new renderer falls into place, I’ve been going through the Editor code, finding scaffolding and other legacy code pieces that were built to make the Editor originally work with the 2011 renderer, but that were disabled as part of the renderer update process.

Rendering a deserialized scene using the new renderer.

Rendering a deserialized scene using the new renderer.

As I was working on the Editor, I was careful to have a clearly defined separation between Editor code and Engine code. This helped keep the impact of swapping out the renderer for a new one limited to a single C++ file.

Changing the renderer, however, did bring a fair share of feature regressions to the Editor, as some semantics had changed in terms of where entities need to be registered to have them drawn and how the UI controllers inspect their components.

Today, after several fixes here and there, I’ve been able to load a scene that had been serialized using the old Editor and have its entities displayed in the 3D World. This is good confirmation that the Editor is reaching a stable point with feature parity to what we had a few months ago when we decided the rewrite the renderer.

Working on both the Editor and the Engine is and has been an amazing experience. In next week’s post, as a way to wrap up year, I’m going to break down a few of the lessons learned. Stay tuned for more!

New RenderCamera Work

This week I reimplemented the camera logic from scratch.

The previous camera controls in Vortex date from around 2010 and, even though it worked and helped release a number of Apps using it, it had some limitations that made it difficult to build new features on top of it.

I was interested, in particular, in providing a mechanism through which camera animations could be controlled from native code and scripts, rather than just responding to user input directly.

The new vtx::RenderCamera is implemented as a Component that can be attached to an Entity directly, basically providing the potential to make any entity in the scene a camera. Once the active camera is selected, this is passed down to the new Rendering System to draw the scene from the camera’s point of view.

This allows configuring a scene with several “vantage point” cameras in order to dynamically switch them around to make different cuts. At the same time, because cameras are attached to entities, entity movement provides a natural means to pan and rotate cameras through the scene.

I’m very happy with the result of this rework, as it provides exactly all the functionality we had before on top of a solid, expandable foundation.

Unfortunately, we still don’t have a way to draw a camera gizmo in the Editor, so we have no screenshot for this week. Next week, however, I’m planning on continue working on some visual features, so definitely stay tuned for more!

Shader Texture Mapping and Interleaved Arrays

Work continues on the new Rendering System for the Vortex Engine.

This week was all about implementing basic texture mapping using GL Core. The following image shows our familiar box, only this time, it’s being textured from a shader.

Texture Mapping via Interleaved Arrays and Shaders in the Vortex Engine.

Texture Mapping via Interleaved Arrays and Shaders in the Vortex Engine.

A number of changes had to go into the Renderer in order to perform texture mapping. These touched almost all the layers of the Engine and Editor.

  1. First, I wrote a “Single Texture” shader in GLSL to perform the perspective transform of the Entity’s mesh, interpolate its UV texture coordinates and sample a texture.
  2. Second, I had to change the way the retained mesh works in order to be able to send texture coordinate data to the video card (more on this later).
  3. Finally, I had to modify the Editor UI to allow selecting which shader is to be used when rendering an Entity.

So, regarding how to submit mesh texture coordinate data to the video card, because in OpenGL Core we use Vertex Buffer Objects (VBOs), it was clear that UVs had to be sent (and retained) in GPU memory too.

There are two ways to achieve this in OpenGL. One way consists in creating several VBOs so that each buffer holds an attribute of the vertex. Position data is stored in its own buffer, texture coordinates are stored their own buffer, per-vertex colors take a third buffer and so on and so forth.

There’s nothing wrong with doing things this way and it definitely works, however, there is one consideration to take into account: when we scatter our vertex data into several buffers, then every frame the GPU will have to collect all this data at render time as part of vertex processing.

I am personally not a big fan of this approach. I prefer interleaving the data myself once and then sending it to the video card in a way that’s already prepared for rendering. OpenGL is pretty flexible in this regard and it lets you interleave all the data you need in any format you may choose.

I’ve chosen to store position data first, then texture coordinate data and, finally, color data (XYZUVRGBA). The retained mesh class will be responsible for tightly packing vertex data into this format.

Once data is copied over to video memory, setting up the vertex attrib pointers can be a little tricky. This is where interleaved arrays become a more challenging than separate attribute buffers. An error here will cause the video card to read garbage memory and possibly segfault in the GPU. This is not a good idea. I’ve seen errors like this bring down entire operating systems and it’s not a pretty picture.

A sheet of paper will help calculate the byte offsets. It’s important to write down the logic and then manually test it using pen and paper. The video driver must never read outside the interleaved array.

Once ready, OpenGL will take care of feeding the vertex data into our shader inputs, where we will interpolate the texture coordinates and successfully sample the bound texture, as shown in the image above.

Now that we have a working foundation where we can develop custom shaders for any Entity in the scene it’s time to start cleaning up resource management. Stay tuned for more!

First Steps for the V3 Renderer

This week, work continued on the new V3 Renderer.

First steps of the V3 renderer.

First steps for the V3 renderer.

In the image above, we can see that the renderer is starting to produce some images.

It might not look like much at the moment and, indeed, there’s still work left to reach parity with the legacy fixed-pipeline renderer. Nonetheless, being able to render the floor grid and the PK Knight model is an important step that validates that the core Entity-Component traversal, the Core OpenGL rendering code, the shaders and the new material and retained mesh systems are interacting correctly.

As with any rendering project where you start from scratch, until the moment where the basic foundation comes together, you have no option but to rely on your code, a piece of paper and a bunch of scaffolding code. Everything has to be built “in the dark”, without being able to see anything on the screen.

Now that we’ve established this foundation, however, we can continue to build the V3 Renderer with visual feedback, which should help tremendously.

Next step: on to basic texture mapping!

Stay tuned for more!

Vortex V3 Renderer

The past couple of weeks have been super busy at work, but I’ve still managed to get the ball rolling with the new renderer for the Vortex Engine.

This is the first time in years I’ve decided to actually write a completely new renderer for Vortex. This new renderer is a complete clean-room implementation of the rendering logic, designed specifically for the new Entity-Component system in the engine.

Current Rendering Systems in Vortex

Let’s start by taking a look at the current rendering systems in the Vortex Engine. Ever since 2011, Vortex has had two rendering systems: a Fixed Pipeline rendering system and a Programmable Pipeline rendering system.

Dual Pipeline support: a Comparison of the Rendering Pipelines available in Vortex Engine. The image on the left represent the Fixed Pipeline. The image on the right represents the Programmable Pipeline.

Dual Pipeline support: a Comparison of the Rendering Pipelines available in Vortex Engine. The image on the left represent the Fixed Pipeline. The image on the right represents the Programmable Pipeline.

Both these rendering systems are pretty robust. Both have been used to develop and launch successful apps in the iOS App Store and they have proven to be reliable and portable, allowing the programmer to target Linux, Windows, Mac OS X and Android, as well as iOS.

The problem with these renderers is that they were designed with Vortex’s Scenegraph-based API in mind. This means that these renderers do not know anything about Entities or Components, but rather, they work at the Node level.

Moving forward, the direction for the Vortex Engine is to provide an Entity Component interface and move away from the Scenegraph-based API. This means that glue code has to be developed to allow the traditional renderers to draw the Entity-Component hierarchy.

So… why is this a problem?

Why a new Renderer?

As Vortex V3 now provides a brand new Entity-Component hierarchy for expressing scenes, glue code had to be developed in order to leverage the legacy renderers in the Vortex Editor. In the beginning this was not a major problem, however, as the Entity-Component system matures, it’s become ever more difficult to maintain compatibility with the legacy renderers.

PBR Materials in Unreal Engine 4. Image from ArtStation's Pinterest.

PBR Materials in Unreal Engine 4. Image from ArtStation’s Pinterest.

Another factor is the incredible pace at which the rendering practice has continued to develop in these past few years. Nowadays, almost all modern mobile devices have support for OpenGL ES 2.0 and even 3.0, and PBR rendering has gone from a distant possibility to a very real technique for mobile devices. Supporting PBR rendering on these legacy renderers would require a significant rewrite of their core logic.

Finally, from a codebase standpoint, both renderers were implemented more than 5 years ago, back when C++11 was just starting to get adopted and compiler support was very limited. This does not mean that the legacy renderers’ codebases are obsolete by any means, but by leveraging modern C++ techniques, they could be cleaned up significantly.

From all of this, it is clear that a new clean-room implementation of the renderer is needed.

Designing a New Renderer

The idea is for the new renderer to be able to work with the Entity-Component hierarchy natively without a translation layer. It should be able to traverse the hierarchy and determine exactly what needs to be rendered for the current frame.

Once the objects to be renderer have been determined, then, a new and much richer material interface would determine exactly how to draw each object according to its properties.

Just like with the Vortex 2.0 renderer, this new renderer should fully support programmable shaders, but through a simplified API that requires less coding and allows drawing much more interesting objects and visual effects.

Choosing a backing API

Choosing a rendering API used to be a simple decision: pick DirectX for Windows-only code or OpenGL (ES) for portable code. The landscape has changed significantly in the past few years, however, and there are now a plethora of APIs we can choose from to implement hardware-accelerated graphics.

This year alone, the Khronos Group released the Vulkan specification, a new API that tackles some of the problems inherent to OpenGL, as seen in the following image.

Comparison of the OpenGL and Vulkan APIs of the Khronos Group. Slide Copyright (C) 2016 Khronos Group.

Comparison of the OpenGL and Vulkan APIs of the Khronos Group. Slide Copyright (C) 2016 Khronos Group.

Now, both Vulkan and Metal are very appealing low-level APIs that provide a fine degree of control over the hardware, but they are limited in the sense that while Metal is Apple specific, Vulkan is cross-platform but not available on Apple devices.

DirectX 12 is Windows 10 only and that discards it right off the bat (for this project at least). DirectX 11 is a good option but, again, Windows only.

This leaves OpenGL and OpenGL ES as the two remaining options. I’ve decided to settle for Core OpenGL 3.3 at this time. I think it’s an API that exposes enough modern concepts to allow implementing a sophisticated renderer while also remaining fully compatible with Windows, iOS and everything in-between.

I don’t discard the possibility in the future of implementing a dedicated Metal or Vulkan backend for Vortex, and nothing in the Engine design should prevent this from happening, however, at this time, we have to start on a platform that’s available everywhere.

Using Core OpenGL 3.3 will also allow reusing the battle-tested shader API in Vortex. This component has several years of service under its belt and I’d risk to say that all of its bugs have been found and fixed.

Other than this particular component, I’m also reusing the material interface (but completely overhauling it) and developing a new RetainedMesh class for better handling mesh data streaming to the GPU.

Closing Thoughts

Writing a comprehensive renderer is no weekend task. A lot of components must be carefully designed and built to fit together. The room for error is minimal, and any problem in any component that touches anything related to the Video Card can potentially make the entire system fail.

It is, at the same time, one of the most satisfying tasks that I can think of as a software engineer. Once you see it come to life, it’s more than a sum of its parts: it’s a platform for rendering incredible dream worlds on a myriad of platforms.

I will take my time developing this new renderer, enjoying the process along the way.

Stay tuned for more! : )