The original outline shader (used in Battle World to draw cinematic lighting on objects) would take the dot product of a vertex normal and the camera look direction and use that to select which vertices need to be highlighted. This worked well for high-polygon models, but there were some issues.
I decided to use the engine's new shader system to attempt highlighting as a post-processing effect. Instead of using vertex normals, the engine renders an off-screen depth buffer to a texture, then scans it in a post-processing shader to find the steepest changes in depth and highlight a 4-pixel edge.
(Credit to Tommy Tallian for the TF2 low-poly models seen here, used only for testing)
This solved the problem of simple models lighting incorrectly; however, it brings some new problems to the table.
I plan to use this technique only for cinematic cut-scenes, so that when the camera is close to an object during a cinematic, the object can be lit with this method.
I can also adapt the shader to draw pencil-styled outlines, which could be useful for something in the future.
And the best use of it is that it can highlight objects in the same way the Left 4 Dead games highlight allies, where you can see their outline through the world. That could be useful for highlighting treasure on the field.
The biggest reason for re-factoring the rendering engine is to make room for the new shader system.
Today I managed to set up the 3D skybox system once again and added a few render buffers, particularly one used for post-processing: a handy technique for mangling your rendered scene so it can look pretty.
One such feature is anti-aliasing, and the currently popular technique for image enhancement is fast approximate anti-aliasing (FXAA).
Anti-aliasing is the technique of smoothing out a 3D render so the jagged pixel edges on the screen are less noticeable; FXAA is a quick approximation of what that smoothing should look like.
Here is Battle World’s FXAA (click to open full size).
On the left is no anti-aliasing, on the right is FXAA. Look at the corners where the darker ceiling meets the lighter walls.
FXAA is often praised for its performance: it comes with no memory cost (unless you count the frame buffer, but I have that there anyway for other special effects) and can be slapped over any scene. In fact, you can see I only lose ~4 frames per second in this scene.
I have experienced problems with FXAA on mobile devices. Their GPUs are fairly weak compared to the AMD HD 6970 in my primary development machine, so the speed hit is more apparent on devices such as the iPhone, iPad and Tegra 3 chips. So far only the Adreno 320 can churn out over 60 frames with FXAA, and that is probably due to its unique chip layout.
So what’s going on? In FXAA the processing is offloaded directly onto the GPU’s fragment shader and runs across the entire resolution of the buffer, which is quite a lot of work. On a 1920x1080 buffer the GPU has to traverse 2,073,600 fragments (fragments are essentially pixels), and reading a pixel is the slowest action of this process. FXAA reads pixel data 9 times for each pixel; that’s 18,662,400 read operations! Good job PC GPUs are powerful enough to handle this.
One improvement to FXAA would be to somehow offload the calculations to the vertex shader: there are only 6 vertices for the frame buffer’s quad (2 pairs overlap), far fewer than the 2 million-plus fragments to traverse. At the moment, though, there are no obvious parts of FXAA’s operations that can be moved to the vertex shader.
Here’s a download of the shader as a GLSL function. Remember to define the precision settings, then just drop this function above your screen shader and call it as “gl_FragColor = vec4( fxaa( INPUT_TEXTURE, TEXTURE_UV_VARYING, INVERSE_SCREEN_RESOLUTION ), 1.0 );”
INVERSE_SCREEN_RESOLUTION is simply 1.0 / screenResolution for each axis (width and height); it is the rough size of one fragment on the OpenGL window.
The original source kept these as constants:
float FXAA_SPAN_MAX = 8.0;
float FXAA_REDUCE_MUL = 1.0/8.0;
float FXAA_REDUCE_MIN = (1.0/128.0);
But I highly recommend you evaluate them and move the results directly into the shader source once you’ve experimented with changing their values.
Just plug those 3 lines at the top of the shader file to put them to use.
I’m forever learning new things, and as such I am constantly going back to improve my old code, which brings performance boosts and bug fixes all around.
A lot of people call this my curse, as I end up revising engines over and over until I cannot make them shine any better. This is why Battle World’s engine has been under development since 2008: I have constantly changed architectures and engine design, and have refactored every time I learn something new.
This screenshot is from 2012, back when the game was still 2D on a 3D plane.
The best change the game has made so far is going fully 3D. The old system used a sector format, where areas are built up from shapes that connect to each other and describe the layout of the map from a top-down view.
Very similar to Doom’s map format.
That’s why I went fully 3D: I was already mimicking Doom’s map format, so why not use the real thing?
I chose UDMF, a modernised, semi-human-readable version of Doom’s map format. The best part is that it already has powerful editors, which meant I didn’t need to work on any tools.
The engine version in this screenshot suffered from poor performance and lacked optimisation of the UDMF sectors, which started the path to the new code-base that is currently under its first refactor.