
Optimizing VXGI with Clipmaps and Reflective Shadow Maps

Specialization Project at TGA 2023

Background

Global illumination is the simulation of light transport. Most lighting models only produce direct illumination, which is what you get by shining a spotlight on a surface. Instead of actually simulating light bouncing off surfaces, it is common to fake indirect light with a uniform ambient term or rough estimations like IBLs. Early in the production of Project 7 it was decided that some form of global illumination would be needed, or would at least help greatly with atmosphere building. My first naive attempt was implementing reflective shadow maps, which worked fine for local indirect illumination but was not good enough for what we needed. Eventually I decided on implementing voxel cone traced global illumination, or VXGI for short, which approximates world radiance through voxels, voxels being volumetric pixels. You write direct radiance into these voxels and then solve for indirect radiance in a second pass with a technique called cone tracing. This method not only gives a pretty good approximation of diffuse global illumination but also specular reflections, and can be used to produce a myriad of different effects like ambient occlusion and soft shadows.
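To make the second pass a bit more concrete, here is a minimal CPU-side sketch of how a single cone is usually marched, with samples composited front to back. It is not the shader from this project: `VoxelSample`, `VoxelSampler` and `TraceCone` are hypothetical names, and the sampler callback stands in for whatever filtered voxel lookup the engine provides.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>

// Radiance (r, g, b) and opacity (a) fetched from the voxel structure.
struct VoxelSample { float r, g, b, a; };

// Hypothetical sampler: returns filtered voxel data at a distance along the
// cone, for a footprint of the given diameter.
using VoxelSampler = std::function<VoxelSample(float distance, float diameter)>;

// Marches one cone and composites the samples front to back.
VoxelSample TraceCone(const VoxelSampler& sampleVoxels,
                      float halfAngleRadians,
                      float voxelSize,
                      float maxDistance)
{
    assert(voxelSize > 0.0f && maxDistance > 0.0f);

    VoxelSample result{0.0f, 0.0f, 0.0f, 0.0f};
    float distance = voxelSize; // start one voxel out to avoid self-sampling

    while (distance < maxDistance && result.a < 0.99f)
    {
        // The cone footprint grows linearly with distance, but never below one voxel.
        const float diameter =
            std::max(voxelSize, 2.0f * std::tan(halfAngleRadians) * distance);

        const VoxelSample s = sampleVoxels(distance, diameter);

        // Front-to-back compositing: nearer samples occlude farther ones.
        const float weight = (1.0f - result.a) * s.a;
        result.r += weight * s.r;
        result.g += weight * s.g;
        result.b += weight * s.b;
        result.a += weight;

        distance += diameter * 0.5f; // step size proportional to the footprint
    }
    return result;
}
```

The accumulated alpha doubles as an occlusion estimate, which is why the same trace can also drive effects like ambient occlusion and soft shadows. Wide cones give diffuse bounce light, narrow cones give specular reflections.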

Motivation

Immediately after implementing a minimum viable product of VXGI, I realized it was a performance hit on lower-end computers. There were also issues with camera movement: whenever the scene had to be revoxelized, the global illumination would start flickering and it looked pretty rough. I tried to mask it with temporal blending and smarter revoxelization, and after countless hours spent debugging dead ends, it hit me that the problem probably had to do with the reliance on mipmaps. The idea being that the smallest change in radiance in any voxel during revoxelization would propagate through every level of the mipmap chain.

Flickering occurring during voxel offset

My goal with this project is to get rid of the flickering, improve performance and shrink the memory footprint. First I will use an optimization idea called clipmaps, presented by James McLaren of Q-Games in his GDC talk about the technology behind The Tomorrow Children. After that I will try another, less documented optimization idea, originally proposed by Intel, which uses reflective shadow maps to reduce memory usage.

Implementation

The first step was to implement clipmaps. This was fairly straightforward, and I pretty quickly had voxelization and debug drawing that used clipmaps instead of my previous reliance on mips. I mostly had to spread the voxelization passes over multiple frames and introduce an index that told all the voxelization shaders which voxel clipmap to write to. On the first frame it would write to the first clipmap, on the second frame to the second one, and so on. Each clipmap covers twice the extent of the previous one, but all of them have the same resolution, which means the voxel reach grows as you go up in clipmaps, with the trade-off of lower quality. This lower quality however works perfectly for cone tracing, as blending between multiple overlapping clipmaps essentially gives me the same trilinear filter effect as tracing through lower mips did previously. There was no actual global illumination yet, since the previous cone-tracing step was based on tracing through mip levels instead of clipmaps.
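The clipmap layout and the per-frame update index can be sketched roughly like this. The six levels and the 64-voxel resolution are the values used later in this write-up, while `baseExtent` (the world-space edge length of the first clipmap) and all function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Values used in this project: six clipmaps, each 64^3 voxels.
constexpr std::uint32_t kClipmapCount = 6;
constexpr std::uint32_t kClipmapResolution = 64;

// World-space edge length of a clipmap level. baseExtent is the assumed edge
// length of level 0; every level doubles the previous one.
inline float ClipmapExtent(float baseExtent, std::uint32_t level)
{
    assert(level < kClipmapCount);
    return baseExtent * static_cast<float>(1u << level);
}

// World-space size of one voxel in a level. The resolution is shared by all
// levels, so the voxels get coarser as the extent grows.
inline float ClipmapVoxelSize(float baseExtent, std::uint32_t level)
{
    return ClipmapExtent(baseExtent, level) / static_cast<float>(kClipmapResolution);
}

// Round-robin index handed to the voxelization shaders: one clipmap per frame.
inline std::uint32_t ClipmapToUpdate(std::uint64_t frameIndex)
{
    return static_cast<std::uint32_t>(frameIndex % kClipmapCount);
}
```

With a hypothetical `baseExtent` of 16 units, level 0 has 0.25-unit voxels and level 5 covers 512 units with 8-unit voxels, at the same cost per level.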

Every cascading clipmap is twice the size of the previous one

After reimplementing cone-traced diffuse bounce light, I noticed visible cuts and seams where the clipmaps would cascade. At first I tried blending between clipmaps, but it didn't really fix anything, so I remedied it by biasing each sample with the voxel size over the distance traveled.

Visible seams where the clipmaps cascade


No more visible seams

With this fixed I went on to reimplement cone-traced specular bounce, which gave some pretty reflective surfaces. Because of the adaptive resolution of the clipmaps, the reflection could now be very detailed up close, which actually led one of the graphics artists in my project group to include a mirror in our Project 7 game!

Having implemented clipmaps, how big of a performance improvement did it give? Please note that the following statistics are taken from Nsight Graphics running on a 4GB RTX 3060 graphics card.

Since every clipmap update is deferred to a frame of its own and I use six clipmaps, the full voxel grid can be up to six frames behind. This however also means I only offset and revoxelize a maximum of 64 * 64 * 64, or 262144 voxels, every frame. The previous solution was 196 * 196 * 196 in resolution, or 7529536 voxels every frame.

This means I now touch roughly 28.7 times fewer voxels per frame than before, a reduction of about 96.5%. Not only does this give us an incredible performance gain of almost 6 ms on a 4GB RTX 3060, it also gives us practically unlimited reach for the global illumination! Previously the global illumination would only reach as far as the voxel grid did, which meant we had to balance the voxel grid resolution and voxel size for optimal performance and light reach. In terms of memory, I have to store six different voxel grids, one per clipmap, which means I actually have to keep 1572864 voxels in memory instead of the aforementioned 262144. Compared to the 7529536 voxels of the previous solution, this is still about 4.8 times less, a memory reduction of about 79%. This alone actually put our VXGI within frame budget, since it now ran smoothly on the slowest computer in our group, which had a 3GB GTX 1060 graphics card. Most importantly of all, it got rid of the flickering.
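The voxel counts above can be double-checked with a few lines of arithmetic. This is only a small sketch with hypothetical helper names; it counts voxels, not bytes, so it says nothing about the actual texture formats.

```cpp
#include <cassert>
#include <cstdint>

// Number of voxels in a cubic grid of the given resolution.
inline std::uint64_t VoxelCount(std::uint64_t resolution)
{
    return resolution * resolution * resolution;
}

// How many times smaller `after` is than `before`.
inline double ReductionFactor(std::uint64_t before, std::uint64_t after)
{
    assert(after > 0);
    return static_cast<double>(before) / static_cast<double>(after);
}

// Percentage of `before` that was removed; this can never reach 100.
inline double ReductionPercent(std::uint64_t before, std::uint64_t after)
{
    assert(before > 0);
    return 100.0 * (1.0 - static_cast<double>(after) / static_cast<double>(before));
}
```

Going from `VoxelCount(196)` to `VoxelCount(64)` voxels per frame is a factor of about 28.7, or a reduction of about 96.5%. Going from `VoxelCount(196)` to `6 * VoxelCount(64)` voxels held in memory is a factor of about 4.8, or a reduction of about 79%.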

The remainder of this project is now going to explore the usage of reflective shadow maps to store radiance, which would mean I only need to write occlusion data in the voxel texture, which can be represented with a single byte per voxel. 

Voxelization without clipmaps

Voxelization with clipmaps

Voxelization and deferred indirect diffuse without clipmaps

Voxelization and deferred indirect diffuse with clipmaps

Specular reflections without clipmaps, low resolution

Specular reflections with clipmaps, better resolution

Reflective Shadow Maps

The general idea behind reflective shadow maps is pretty simple: we add a "flux map" to the shadow map for a light, which is just the scene albedo seen from the light, with some attenuation. With this flux map you can then calculate screen-space global illumination, and this is what I had implemented in the engine before VXGI. The optimization proposed by Intel in their paper uses this flux map, but instead of solving the GI in screen space, you solve it with cone tracing as you would in VXGI. Unlike in VXGI, where you save radiance data in every voxel, you only save occlusion and then extrapolate the radiance data from the reflective shadow map.

The difference between a normal radiance voxel map and a thin voxel map with RSM optimization

One of the problems I can already foresee is that this would scale horribly with light complexity. The more lights you add, the more reflective shadow maps are allocated and the more radiance data you have to traverse for every voxel trace, so both memory and performance degrade with the number of lights that cast indirect light. This means the optimization is probably only really useful for games that solely require bounce light from the sun or sky. Another issue with not saving radiance per voxel is that you can no longer write emission into the voxels, which means emissive surfaces no longer cast indirect light, something we had previously put to very good use.

Courtesy of Philip and his very emissive face

Since I already had reflective shadow maps implemented from an earlier global illumination attempt, I mostly just had to remove the radiance from the voxels. After that there was a whole week of pitch black darkness, as it turned out extrapolating the radiance data from the reflective shadow maps was far more difficult than I originally anticipated.

Eventually I got something working, and I could see some radiance in my voxels again, albeit very buggy and only working on a single clipmap. This also meant I had nothing to cone-trace through and with the little time I had left I decided to call it quits and compare the thin voxel map I had now against the voxel map I had previously, to see what the difference in memory usage was.

Voxel map texture before RSM optimization, holds radiance per voxel which can be seen in the color variation

This was the radiance voxel texture before the reflective shadow map optimization. If you look closely, you can see radiance data written in each pixel of the texture. The total texture size is 8.00 MB, and two of these textures are created to achieve temporal blending, so the total memory consumption is around 16.00 MB.

Voxel map texture after RSM optimization, only occlusion is written in each voxel.

Here you can see the same texture with the reflective shadow map optimization, which means I only need to write occlusion data in each voxel, and that can be represented with a single byte. You can no longer see radiance in each voxel, but the memory consumption is now down to 1024 KB, or 1.0 MB, compared to the previous 8 MB. That shrinks the texture to 1/8 of its original size. On top of that, there is no longer any data to temporally blend between, so I don't even have to allocate a second duplicate texture, which brings the total down to around 1/16 of the original size.

Assume your reflective shadow map is 1024 by 1024 in resolution, which is pretty small for directional lights, with proper sRGB packing at four bytes per texel. This puts your RSM alone at around 4096 KB, or 4 MB. Do the same for any other light, or increase the shadow map resolution for your directional light to 2048 by 2048, and you've already surpassed the memory consumption of the original sRGB voxel map. This is why I can only imagine this optimization being useful for games that primarily rely on indirect light from a skylight or directional light. If your graphics card supports bitmap textures, such as the DirectX format DXGI_FORMAT_R1_UNORM, or if you simply pack and create the occlusion voxel map in a smarter way, you could lower the memory consumption even further. Maybe then it could be worth it, but this is something I never got the time to try.

But with my naive implementation and with the type of game we were making, a game that doesn't even have a directional light, and is solely dependent on multiple spot and point lights, this optimization would not be worth it.
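The memory estimate above can be sketched as follows. The helper names are hypothetical, the flux map is assumed to use four bytes per texel, any extra depth or normal targets an RSM needs are not counted, and the voxel count of 1048576 is simply inferred from the 1024 KB occlusion texture at one byte per voxel.

```cpp
#include <cassert>
#include <cstdint>

// Bytes for an uncompressed 2D texture without mips.
inline std::uint64_t Texture2DBytes(std::uint64_t width,
                                    std::uint64_t height,
                                    std::uint64_t bytesPerTexel)
{
    return width * height * bytesPerTexel;
}

// Bytes for a voxel grid packed at a given number of bits per voxel,
// rounded up to whole bytes.
inline std::uint64_t PackedVoxelBytes(std::uint64_t voxelCount,
                                      std::uint64_t bitsPerVoxel)
{
    assert(bitsPerVoxel > 0);
    return (voxelCount * bitsPerVoxel + 7) / 8;
}
```

Under these assumptions a 1024 by 1024 flux map takes 4194304 bytes (4 MB) and a 2048 by 2048 one takes 16777216 bytes (16 MB), which already matches the two original radiance textures combined. Packing the occlusion at one bit per voxel instead of one byte would take the 1 MB texture down to 131072 bytes, or 128 KB.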

Conclusion

While I'm happy with how the clipmap optimization turned out, I still feel like I could have done more in terms of shrinking memory usage. Reflective shadow maps didn't turn out to be a viable solution for me, at least not with my naive and clumsy attempt. There is another memory-related optimization that a lot of VXGI implementations seem to do, which is using octrees to increase voxel detail in areas with varying radiance and not write data where it's not needed. This technique seemed a bit daunting, so I didn't even consider implementing it. If I could do this again, I would probably try to implement that instead of using reflective shadow maps.
