Have you ever been in a situation where you’re trying to figure out which postal center is closer to your house? Have you ever just wished there was some kind of map that would tell you all the answers?! I sure haven’t; however, if you have, then the answer is Voronoi diagrams. These diagrams take a set of known points and generate a set of regions, each representing the area closest to one of the input points. That might sound a bit confusing, but watch the video to see what I mean. You may want to pause it, pick any random spot on the video screen, then try to find the closest black point to it; you’ll see how the colored regions come into play.
Hierarchical Occlusion Culling is a technique used to eliminate objects that are hidden behind other, larger objects. Though it sounds like it would be a great performance boost, the technique is quite heavy itself, and its cost may outweigh its benefit (especially in games, less so in complex CAD). The idea is to mark select larger objects as occluders (objects which tend to obstruct visibility). These objects are then rendered out to a separate Boolean buffer of the pixels they cover. In my demo this buffer is represented in black and white, with black representing minimal or no coverage, and white representing complete coverage. This render is then scaled down like a mipmap until it reaches a minimum stopping size (4×4 in my demo). We call this series of buffers the “hierarchical occlusion maps”, or HOM. Since the scaling down uses filtering, pixels will no longer be simply black or white, but gray, representing a percentage of occlusion.
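The downsampling step can be sketched in a few lines. This is my own illustration (not the demo code), assuming a float coverage buffer with even dimensions, where each HOM level averages 2×2 blocks of the level below:

```cpp
#include <cassert>
#include <vector>

// One downsample step of the hierarchical occlusion map: each output pixel
// is the average of a 2x2 block, so values drift from pure 0/1 coverage
// toward grayscale "percentage occluded". Assumes w and h are even.
std::vector<float> homDownsample(const std::vector<float>& src, int w, int h) {
    std::vector<float> dst((w / 2) * (h / 2));
    for (int y = 0; y < h / 2; ++y)
        for (int x = 0; x < w / 2; ++x) {
            float sum = src[(2 * y) * w + 2 * x]     + src[(2 * y) * w + 2 * x + 1]
                      + src[(2 * y + 1) * w + 2 * x] + src[(2 * y + 1) * w + 2 * x + 1];
            dst[y * (w / 2) + x] = sum * 0.25f;
        }
    return dst;
}
```

Applying this repeatedly until the buffer reaches the stopping size produces the whole HOM pyramid.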
Another buffer, known as the Depth Estimation Buffer (DEB), is then rendered out. For each occluder, the point furthest from the viewer is found and its depth is stored. From there, a screen-space AABB is made around the occluder and rendered to the DEB with the float value of that furthest depth.
The next step is to render the objects. First, provided the occluders are actual objects, render them using only frustum culling. Next, when rendering all the non-occluders, we start with frustum culling, then move on to the occlusion check. By creating a screen-space AABB around the object being rendered, we can loop through all the pixels touched by the AABB in the smallest level of the HOM and determine whether the object’s screen space is completely covered by occluders, completely uncovered, or neither. If it is neither (only partially covered), we traverse up the HOM until we can decide one way or the other, or until we reach the last HOM level (at which point we just draw the object). Interestingly enough, by making these checks a little less rigid, we can introduce approximate occlusion culling. If we ever determine complete coverage, the next step is to check the DEB. Recall that for each occluder, the DEB holds a large rectangle whose pixels all store the occluder’s furthest depth. That means if an object completely shares screen space with an occluder and is completely behind the depth given by the DEB, it must not be visible. So we find the point on the object closest to the viewer, scan the pixels covered by the object’s screen-space AABB, and compare that closest depth to the depth read from the DEB. If it always turns out to be behind, the object is not visible.
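As a rough sketch of that test (with my own names, a single HOM level instead of the full hierarchy, and depth increasing away from the viewer), an object is culled only if every texel its screen-space AABB touches is fully covered and its nearest depth lies behind the stored DEB depth:

```cpp
#include <cassert>
#include <vector>

struct Rect { int x0, y0, x1, y1; };   // inclusive texel bounds in this HOM level

// Conservative visibility test against one HOM level plus the DEB.
bool isOccluded(const std::vector<float>& hom, const std::vector<float>& deb,
                int w, Rect r, float objectNearestDepth) {
    for (int y = r.y0; y <= r.y1; ++y)
        for (int x = r.x0; x <= r.x1; ++x) {
            if (hom[y * w + x] < 1.0f)           // not fully covered: can't cull here
                return false;                    // (the real version traverses up the HOM)
            if (objectNearestDepth <= deb[y * w + x])
                return false;                    // object may poke in front of the occluder
        }
    return true;                                 // fully covered and fully behind
}
```

Relaxing the `< 1.0f` threshold to, say, `< 0.95f` is one way the “less rigid” approximate culling mentioned above could be realized.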
Apologies for such a lengthy explanation, but it’s a long process. Hope you enjoy the demo below!
The particular implementation of frustum culling I did was Hierarchical Frustum Culling with plane caching. This technique has become one of my favorites for frustum culling static objects, since it performs really well. The basic idea is to build an AABB tree, where the root is an AABB surrounding the entire scene and the children split the parent into smaller pieces. The basic algorithm then goes: test whether an AABB is entirely inside, entirely outside, or intersecting the frustum. If it’s entirely inside, draw all the children. If it’s entirely outside, don’t bother traversing it any further. If it’s intersecting, recursively run the same test on the children AABBs. Where plane caching comes in is absolutely magical. A maximum of 6 plane tests are needed to determine if an AABB is within a frustum. However, if a test against plane P finds an AABB to be entirely outside, we can cache P so that the next time we test that AABB we start with P. This caching mechanism works off of temporal coherence; in other words, if something is off to the right this frame, it will more than likely still be off to the right next frame. One interesting result is that if you look away from the scene entirely, it takes only one plane test to throw everything out (of course… how interesting of a game would that be :P).
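A sketch of the plane-cached AABB test might look like the following. This is my own illustration, assuming plane normals point inward (toward the frustum interior) and using the standard “p-vertex” trick for the outside test:

```cpp
#include <cassert>

struct Vec3  { float x, y, z; };
struct Plane { Vec3 n; float d; };      // plane equation: dot(n, p) + d = 0, n inward
struct AABB  { Vec3 mn, mx; };

// Test an AABB against 6 frustum planes, starting at the cached plane.
// If a plane rejects the box we store its index, so a box that stays
// outside keeps failing on the very first test next frame (temporal coherence).
bool aabbInFrustum(const AABB& b, const Plane planes[6], int& cachedPlane) {
    for (int i = 0; i < 6; ++i) {
        int p = (cachedPlane + i) % 6;
        const Plane& pl = planes[p];
        // Pick the box corner furthest along the plane normal (the "p-vertex").
        Vec3 v = { pl.n.x >= 0 ? b.mx.x : b.mn.x,
                   pl.n.y >= 0 ? b.mx.y : b.mn.y,
                   pl.n.z >= 0 ? b.mx.z : b.mn.z };
        if (pl.n.x * v.x + pl.n.y * v.y + pl.n.z * v.z + pl.d < 0) {
            cachedPlane = p;            // entirely outside this plane: remember it
            return false;
        }
    }
    return true;
}
```

A full hierarchical version also classifies “entirely inside” (by testing the opposite “n-vertex”) so it can stop testing children, but the caching mechanic is entirely contained in the snippet above.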
Raytracing on the CPU is a slow and intense process; however, there’s a lot we can do to help it out when it comes to spatial partitioning. When dealing with static objects in a scene, one of the dominant spatial partitioning methods is the KD-Tree. The KD-Tree splits up space using axis-aligned splitting planes, storing the split position and axis at each branch of the tree. Though KD-Trees are typically used to store points and answer nearest-point queries, the KD-Tree implemented here stores triangles at the leaf level and makes splits along the edges of triangles. To determine where to make the splits, the KD-Tree uses the Surface Area Heuristic (SAH). SAH was made with raytracing in mind: the probability of a random ray hitting a node is roughly proportional to the node’s surface area, so large nodes should contain few triangles. Using SAH over a naïve split (like splitting halfway in) yields a tremendous performance gain and allows for nearly realtime raytraced scenes, as can be seen below.
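The heuristic itself is just a weighted cost function over candidate splits; here is a sketch with illustrative cost constants (the actual constants are tuning parameters, not part of the heuristic):

```cpp
#include <cassert>

// Surface Area Heuristic cost for a candidate KD-split: the chance a ray
// hits a child is taken to be proportional to its surface area, so large,
// easy-to-hit nodes should hold few triangles.
float sahCost(float areaLeft, float areaRight, float areaParent,
              int trisLeft, int trisRight,
              float traverseCost = 1.0f, float intersectCost = 4.0f) {
    return traverseCost
         + (areaLeft  / areaParent) * trisLeft  * intersectCost
         + (areaRight / areaParent) * trisRight * intersectCost;
}
```

The builder evaluates this at each candidate split plane and keeps the minimum; note how it loves splits that carve off large empty regions, which is exactly what the naïve halfway split misses.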
GPU Raytracing is going to be the future. The number of effects that can be achieved through raytracing, as well as how easy it is to code up, makes it incredibly appealing (with the only downside being the rendering cost, obviously). I put this simple demo together in DirectX 9 to demonstrate basic GPU raytracing with absolutely no spatial partitioning. The quality of the video isn’t perfect, but in the actual demo the surfaces of the spheres are infinitely detailed, as no triangles are needed to represent them; the sphere is its own geometric primitive! The next step for this demo is to integrate SAH KD-Trees on the GPU. I attempted to do this without any sort of GPGPU language, but it proved massively difficult.
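The “sphere as its own primitive” idea boils down to solving one quadratic per ray. A CPU-side sketch of the same intersection the pixel shader performs (my own illustration, assuming a normalized ray direction):

```cpp
#include <cassert>
#include <cmath>

struct V3 { float x, y, z; };

// Analytic ray-sphere intersection: no triangles, so the surface really is
// "infinitely detailed". Solves |o + t*d - c|^2 = r^2 for the nearest t > 0.
// Returns the hit distance t, or -1 on a miss.
float raySphere(V3 o, V3 d, V3 c, float r) {
    V3 oc = { o.x - c.x, o.y - c.y, o.z - c.z };
    float b  = oc.x * d.x + oc.y * d.y + oc.z * d.z;           // half the linear term
    float cc = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - r * r;
    float disc = b * b - cc;
    if (disc < 0) return -1.0f;                                 // ray misses the sphere
    float t = -b - std::sqrt(disc);                             // near root first
    if (t < 0) t = -b + std::sqrt(disc);                        // origin inside the sphere
    return t < 0 ? -1.0f : t;
}
```

The surface normal at the hit point is simply `(hit - c) / r`, which is why lighting on an analytic sphere looks perfectly smooth at any zoom level.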
3D Studio Max is a great tool for designing cinematic or game asset content (though arguably Maya is better). Before I go on… forgive me, I am not an artist! This video was made using a lot of different features in 3ds Max (splines, post-render process effects, geometric editing, displacement maps, cool modifiers, etc.). Knowing 3ds Max and Maya is what really helped me work with the FBX file format and help artists get integrated into our game engine easily.
Cube map reflection and refraction is still, to date, a popular technique for emulating realism. The technique implemented here actually renders to the six faces of a cube map each frame, resulting in non-static behavior. The result was pretty cool, but it still had a few visual issues due to the fact that the cube map is rendered from a single, infinitely small point. Refraction actually turned out to be the most believable, probably because reflection is more intuitive to humans.
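The cube map lookup itself reduces to the standard reflection formula (refraction is analogous via Snell’s law). A sketch of the reflected lookup direction, assuming a normalized surface normal:

```cpp
#include <cassert>

struct Vec { float x, y, z; };

// The cube-map lookup direction for a mirror surface: reflect the incoming
// view vector v about the surface normal n (n assumed normalized).
Vec reflectDir(Vec v, Vec n) {
    float d = v.x * n.x + v.y * n.y + v.z * n.z;   // v . n
    return { v.x - 2 * d * n.x,
             v.y - 2 * d * n.y,
             v.z - 2 * d * n.z };                  // v - 2(v.n)n
}
```

Sampling the cube map with this direction is what produces the mirror image; the “single point” artifacts come from the cube map having been rendered from the object’s center rather than from the actual reflection point.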
In my experience with graphics, the most dramatic change introduced to a game’s look has to be shadows. Shadows change the perception of the viewer and add a lot of depth to the scene (literally, it’s much easier to infer depth!). In Bear Pile, we added shadows partially to fix some game design issues that we had with jet-packing bears. Players would complain about not knowing when a bear was going to land, or, since the bears sometimes flew quite high, about not even knowing there was a flying bear. This issue was mitigated with shadows, and in my opinion it even intensified the action.
Percentage Closer Filtering (PCF) shadows are a variation of shadow maps that involves blurring the samples. PCF is more than just a screen space blur, as the blur respects how far the object is from the viewer, and furthermore PCF can be extended to support more realistic variable umbrae and penumbrae. The basic idea is that when sampling the shadow map, instead of just sampling at one position, many samples can be taken around the same spot and an average can be determined.
Because these extra samples are often taken in a predictable way, noticeable artifacts may arise due to the uniformity. One technique used to mitigate this is to randomize samples taken so that the human eye won’t notice. This is the technique we used in Bear Pile, and it’s used in other popular games like Crysis.
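Here is a miniature of the idea, my own sketch with the jitter offsets passed in; a real implementation does this in the pixel shader, typically with a per-pixel rotated Poisson disk:

```cpp
#include <cassert>
#include <vector>
#include <utility>

// PCF in miniature: instead of one hard comparison against the shadow map,
// average several comparisons at (jittered) offsets around the sample point.
// The shadow map here is a plain depth grid; smaller depth = closer to light.
float pcfShadow(const std::vector<float>& shadowMap, int w, int h,
                int x, int y, float fragDepth,
                const std::vector<std::pair<int,int>>& offsets) {
    float lit = 0.0f;
    for (auto& o : offsets) {
        int sx = x + o.first, sy = y + o.second;
        if (sx < 0) sx = 0; if (sx >= w) sx = w - 1;   // clamp to map edges
        if (sy < 0) sy = 0; if (sy >= h) sy = h - 1;
        if (fragDepth <= shadowMap[sy * w + sx]) lit += 1.0f;  // this tap is lit
    }
    return lit / offsets.size();   // 0 = fully shadowed, 1 = fully lit
}
```

The fractional result is what produces the soft edge: pixels near a shadow boundary get values between 0 and 1 instead of a hard step, and randomizing the offsets per pixel turns any residual banding into unobjectionable noise.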
In the below video I show off my implementation of shadow mapping and PCF. It’s hard to tell that I’m adjusting the PCF filter because the video is so compressed and grainy, but while I’m running the demo I change the shadows dynamically from hard to soft shadows.
Ambient Occlusion is one of those terms that graphics geeks like to throw around, partially because it’s a big word, but also partially because it’s so damned pretty. The video below was from a basic Ambient Occlusion tracer I wrote to pre-compute lighting values per vertex. The better extension to this would be to shoot rays from actual pixels on a texture map (the only restriction is that the texture must have a one-to-one mapping to triangles, or new UV coordinates must be generated). Ambient Occlusion is by far one of my favorite effects because, though it’s supposed to be an approximation of global illumination, it also has its own almost cartoony or soft look. I must apologize: the video was recorded using some particularly bad screen-cap software.
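At its core, a per-vertex AO tracer just counts how many hemisphere rays escape the scene. A sketch with the actual ray cast abstracted behind a callback (my own simplification; the tracer’s real casts go through the scene geometry):

```cpp
#include <cassert>
#include <functional>

// Per-vertex ambient occlusion in its simplest form: shoot numRays rays over
// the hemisphere above the vertex and record the fraction that escape.
// rayEscapes(i) answers "did hemisphere ray i reach the sky unblocked?".
float ambientOcclusion(int numRays, const std::function<bool(int)>& rayEscapes) {
    int open = 0;
    for (int i = 0; i < numRays; ++i)
        if (rayEscapes(i)) ++open;
    return (float)open / numRays;   // 1 = fully open, 0 = fully occluded
}
```

The resulting value is baked per vertex (or, in the texture-map extension, per texel) and multiplied into the ambient lighting term at render time.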
Surface detail with respect to lighting is a long-studied problem with tons of research behind it. Techniques like detail mapping, parallax mapping, tessellation, and displacement mapping all exist primarily to solve this problem and make a surface come to life. One of the most standard methods of introducing lighting detail to surfaces is normal mapping: the technique of skewing a surface’s per-pixel normals (used in computing lighting terms) to emulate the effect of light moving over small surface cavities. The effect has quite a dramatic impact on the look and feel of a game, though it is often heavily abused. Below is my implementation of the normal mapping technique working along with the Phong lighting model.
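The core of the technique is a change of basis: the normal stored in the texture lives in tangent space and must be rotated into the surface’s frame before lighting. A sketch of my own, assuming an orthonormal tangent/bitangent/normal (TBN) frame:

```cpp
#include <cassert>

struct N3 { float x, y, z; };

// Normal mapping core: a tangent-space normal sampled from the texture
// (remapped from [0,1] to [-1,1]) is rotated into the surface frame by the
// TBN basis. The result replaces the interpolated vertex normal in lighting.
N3 perturbNormal(N3 t, N3 b, N3 n, N3 texel01) {
    N3 ts = { texel01.x * 2 - 1, texel01.y * 2 - 1, texel01.z * 2 - 1 };
    return { ts.x * t.x + ts.y * b.x + ts.z * n.x,
             ts.x * t.y + ts.y * b.y + ts.z * n.y,
             ts.x * t.z + ts.y * b.z + ts.z * n.z };
}
```

This is also why a “flat” normal map is that familiar uniform blue: the texel (0.5, 0.5, 1) decodes to (0, 0, 1), leaving the surface normal unperturbed.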
Phong lighting is one of the simplest and most intuitive methods for shading objects, and it yields surprisingly good-looking results (especially when you crank the specular up to 11 :P). The Phong lighting model is still, to date, a very popular lighting model in games. The video shown here is an implementation of Phong lighting used with three different light types: point, spot, and directional. The spot lights are the most restrictive, as they attenuate by both distance and angle (only a cone of light is produced). The point light is less restrictive, but still typically attenuates by distance. Last is the directional light, a point light assumed to be so far away that its rays are effectively parallel with each other. The directional light is the simplest in terms of calculations, as it does not attenuate at all, and it provides a simple way of emulating sunlight.
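Reduced to its two dot products, the model is tiny. A scalar sketch with hypothetical coefficient names (`ka`/`kd`/`ks` for the ambient, diffuse, and specular weights); the point/spot attenuation factors described above would simply scale the result:

```cpp
#include <cassert>
#include <cmath>

// Scalar Phong: ambient + N.L diffuse + (R.V)^shininess specular, where
// N is the surface normal, L the light direction, R the reflected light
// direction, and V the view direction (all normalized).
float phong(float NdotL, float RdotV, float shininess,
            float ka, float kd, float ks) {
    float diff = NdotL > 0 ? NdotL : 0;                               // clamp backfacing light
    float spec = (NdotL > 0 && RdotV > 0) ? std::pow(RdotV, shininess) : 0;
    return ka + kd * diff + ks * spec;
}
```

Cranking `shininess` up is the “specular to 11” knob: higher exponents shrink the highlight into a tighter, shinier spot.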
This simple demo shows off a lot of different aspects of the 3D pipeline, including the underlying mathematics for vertex transformation, triangle rasterization, clipping, frustum and back-face culling, depth buffering, and perspective division. Furthermore, in this demo the tank is controlled by hierarchical transformations, thus allowing pieces to be seemingly “connected”.
One problem worth mentioning is that the frustum culling algorithm is faulty. The algorithm culls any triangle whose vertices all lie outside the frustum. Though this might sound correct, it is very possible to construct a triangle whose vertices all lie outside the frustum while the triangle is still partially visible… just imagine a really big triangle! This is exactly why triangles appear to pop out when they reach the edge of the screen. One simple fix would be to make an AABB around each triangle and only cull when the AABB is entirely outside the frustum.
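A closely related conservative fix (slightly different from the per-triangle AABB idea, but equally correct) is to cull only when all three vertices are outside the same frustum plane. A sketch using per-vertex outcode masks, one bit per plane:

```cpp
#include <cassert>

// Each vertex carries a 6-bit mask of which frustum planes it lies outside.
// If the three masks share a bit, the whole triangle is beyond that plane
// and can never be visible; otherwise it may straddle the frustum edge
// (the "really big triangle" case) and must be kept.
bool cullTriangle(unsigned m0, unsigned m1, unsigned m2) {
    return (m0 & m1 & m2) != 0;
}
```

In the big-triangle case each vertex is outside a *different* plane, so the masks share no bit and the triangle survives, which is exactly the case the faulty test got wrong.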
Triangle rasterization was one of my first great steps toward understanding partial derivatives and programming graphical algorithms using only integers. Having already done DirectX by the time I made this, I felt like the assignment was pointless. Furthermore, I never really saw the benefit of using integers instead of floats, since on the PC the performance difference was negligible (it seemed like just a legacy thing). What really changed my opinion, however, was my VX6900 phone. No form of triangle rasterization existed on the phone, so in order to do any sort of 3D I had to port it myself. Once I did, I found out something else: how slow floating point can really be on embedded systems. It turned out to be a great lesson in why things are done the way they are done.
The below videos include triangle rasterization, Gouraud shading (color interpolation), and texturing. One neat thing I added to my implementation was the concept of shaders (as simple function pointers), which really helped me easily add things like interpolated depth and nearest neighbor texturing.
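The integer-only inner loop can be sketched with edge functions. This is my own illustration (not the assignment code): a pixel is inside the triangle when all three edge functions agree in sign, and everything stays in `int`, which is exactly what made this approach viable on an FPU-less phone:

```cpp
#include <cassert>
#include <vector>
#include <algorithm>

struct P2 { int x, y; };

// Twice the signed area of triangle (a, b, p); positive when p is to the
// left of edge a->b for counter-clockwise winding.
int edgeFn(P2 a, P2 b, P2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Scan the triangle's bounding box and emit covered pixels. Returns the
// pixel count. (A shader callback per pixel would slot in where push_back is.)
int rasterize(P2 a, P2 b, P2 c, std::vector<P2>& out) {
    int minX = std::min({a.x, b.x, c.x}), maxX = std::max({a.x, b.x, c.x});
    int minY = std::min({a.y, b.y, c.y}), maxY = std::max({a.y, b.y, c.y});
    for (int y = minY; y <= maxY; ++y)
        for (int x = minX; x <= maxX; ++x) {
            P2 p = { x, y };
            if (edgeFn(a, b, p) >= 0 && edgeFn(b, c, p) >= 0 && edgeFn(c, a, p) >= 0)
                out.push_back(p);   // all three half-spaces agree: pixel covered
        }
    return (int)out.size();
}
```

Interpolating depth, color, or UVs falls out of the same edge functions: normalizing the three values gives the barycentric weights for each covered pixel.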
This next demo shows off IK working with the animation system and some simple pathing. If you read the post below, the IK in this demo was implemented simply as a node whose inputs are a goal position, the end effector, and the animation node we’d like to combine IK with. The IK also implements constraints and priorities; however, I personally am quite inexperienced at determining the constraints for a human body. For the pathing, I let the user define walking positions, but I use a cubic spline space curve to make the motion look smooth. The cubic spline is continuous, minimizes curvature, and is guaranteed to pass through all the walking positions, which makes it a great candidate here. One problem with the cubic spline is that distance along the spline is nonlinear in the interpolation parameter, so I used a look-up table to sample the spline by distance rather than by interpolation parameter. For efficiency, I only break up the cubic spline in areas that experience more curvature (i.e. straight parts of the spline don’t need to be broken up). To make the animation of the character look a lot better, I synced the animation pace with the character’s motion speed so that his feet would sync up with the ground. I also made sure the character locked to the ground by using a ray trace directly downward to the environment (the ray intersection is done using an accelerated SAH KD-Tree that was preprocessed by the FBX exporter). Here’s a video of the demo:
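The look-up table idea can be sketched as follows. This is a uniform-sampling version of my own (the demo’s curvature-adaptive sampling is omitted): sample the spline, accumulate chord lengths, then invert the table to map distance back to the parameter t:

```cpp
#include <cassert>
#include <cmath>
#include <vector>
#include <functional>

struct ArcTable {
    std::vector<float> t, dist;   // parallel arrays: parameter and cumulative length
};

// Sample a 2D parametric curve (fx(t), fy(t)) for t in [0,1] and accumulate
// chord lengths into a distance table.
ArcTable buildTable(const std::function<float(float)>& fx,
                    const std::function<float(float)>& fy, int samples) {
    ArcTable tab;
    float px = fx(0), py = fy(0), total = 0;
    tab.t.push_back(0); tab.dist.push_back(0);
    for (int i = 1; i <= samples; ++i) {
        float u = (float)i / samples;
        float x = fx(u), y = fy(u);
        total += std::sqrt((x - px) * (x - px) + (y - py) * (y - py));
        tab.t.push_back(u); tab.dist.push_back(total);
        px = x; py = y;
    }
    return tab;
}

// Invert the table: given a distance along the curve, return the parameter t,
// linearly interpolating within the containing segment.
float paramAtDistance(const ArcTable& tab, float d) {
    for (size_t i = 1; i < tab.t.size(); ++i)
        if (tab.dist[i] >= d) {
            float f = (d - tab.dist[i-1]) / (tab.dist[i] - tab.dist[i-1]);
            return tab.t[i-1] + f * (tab.t[i] - tab.t[i-1]);
        }
    return 1.0f;
}
```

Stepping the character by `speed * dt` units of distance each frame, then evaluating the spline at `paramAtDistance`, is what keeps his velocity constant even where the curve bends.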
This demo shows off the art pipeline that I wrote for our engine. In particular, it demonstrates animations and the animation graph, including special features such as animation blending. The models were created, rigged, and animated in 3D Studio Max (by our lovely artists) and imported into our engine via the FBX format and my own model importer / converter. From there we load up the animations and animate the instances of our characters using an animation graph. The animation graph is essentially a graph of nodes, updated each frame, that produces a final skeletal pose as the result. For example, one of our nodes is an animation node, which lets you select an animation and play it (the processing phase of that node simply interpolates key frames and outputs the final bone data). Another node is the blend node, which takes two node inputs (which could be anything), processes them, and blends the results together via a blend factor. Below is a video of the demo in action: