Thesis Overview

This post is a one-page summary of the thesis.


Practical Unified Volumetric Rendering

Volumetric rendering at Fatshark and in Stingray is currently limited to screen-space light shafts, distance fog and local fog volumes with analytic sampling. Fog interacts poorly with other systems, the light shafts are limited to on-screen light sources, and local fog volumes cannot be rendered over one another. There are also problems with rendering transparency inside fog.

The thesis explores a volumetric ray-marching approach to unified volumetric rendering, meant to provide a more general and higher-quality volumetric solution than those mentioned above. The volumetric rendering algorithm should handle global and local participating media and could, in theory, replace the current fog solutions. Further, the thesis explores to what extent the clustered shading pipeline can be utilized for rendering unified volumetric effects, and which other optimizations are important to reach real-time production quality.

=> “Is ray marching suitable for real-time volumetric lighting from local lights in single-scattering participating media?”


Local Lights Volumetric Ray Marching

The basis of the thesis is volumetric ray marching to approximate the radiative transfer equation. The solution should handle heterogeneous media under the assumption of single scattering and an average phase lobe. The implementation uses a frustum-aligned voxel buffer and integrates lighting from local lights culled via clustered shading and from cascaded sun shadow maps.

  • Sun Ray Marching
  • Local Lights
  • Heterogeneous media
  • Phase, scattering, transmittance
  • Single scattering
  • Indirect lighting
  • Froxel filtering
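
As a complement to the feature list, the core single-scattering loop can be sketched on the CPU. This is a minimal sketch, not the Stingray implementation: the medium is assumed homogeneous with constant extinction `sigma_t` and scattering `sigma_s`, the phase function is Henyey-Greenstein (a common choice of average lobe), and `visibility` is a hypothetical stand-in for the shadow-map and clustered-light lookups.

```python
import math

def hg_phase(cos_theta, g):
    # Henyey-Greenstein phase function, used here as the "average lobe"
    denom = 1.0 + g * g - 2.0 * g * cos_theta
    return (1.0 - g * g) / (4.0 * math.pi * denom ** 1.5)

def ray_march(ray_length, num_steps, sigma_t, sigma_s, light_radiance,
              cos_theta, g, visibility=lambda t: 1.0):
    """Single-scattering integration along one view ray.

    Accumulates in-scattered light attenuated by Beer-Lambert
    transmittance; `visibility` stands in for a shadow lookup.
    """
    dt = ray_length / num_steps
    transmittance = 1.0
    scattered = 0.0
    for i in range(num_steps):
        t = (i + 0.5) * dt  # sample at the middle of each step
        in_scatter = sigma_s * hg_phase(cos_theta, g) * light_radiance * visibility(t)
        scattered += transmittance * in_scatter * dt
        transmittance *= math.exp(-sigma_t * dt)
    return scattered, transmittance

# example: a 50 m ray with 256 steps in a thin medium
result, trans = ray_march(50.0, 256, 0.05, 0.04, 10.0, 0.5, 0.2)
```

In a heterogeneous medium, `sigma_t` and `sigma_s` would instead be fetched per step from the froxel buffer.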

Low Resolution Rendering

The thesis includes low-resolution volumetric rendering to reduce fragment workload. The upsampling methods tried are naive, bilinear, bilateral and nearest-depth. Different methods of depth downsampling are also considered: min, max and checkered, with or without temporal jittering.

Evaluation variables: on/off, subsampling resolution, upsampling method, downsampling method
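
To illustrate the nearest-depth idea, here is a 1D Python sketch (the names, the min-downsampled depth buffer and the 3-neighbour search are illustrative assumptions, not the thesis implementation):

```python
def nearest_depth_upsample(lowres_fog, lowres_depth, fullres_depth, scale):
    """Nearest-depth upsampling, reduced to 1D for brevity.

    For each full-resolution pixel, pick the low-res neighbour whose
    downsampled depth is closest to the pixel's own depth. This avoids
    fog bleeding across depth discontinuities (silhouette edges).
    """
    out = [0.0] * len(fullres_depth)
    for x in range(len(fullres_depth)):
        lx = x // scale
        # candidate low-res neighbours around the pixel's own cell
        candidates = [c for c in (lx - 1, lx, lx + 1) if 0 <= c < len(lowres_fog)]
        best = min(candidates, key=lambda c: abs(lowres_depth[c] - fullres_depth[x]))
        out[x] = lowres_fog[best]
    return out

# an edge between near (depth 1) and far (depth 10) geometry
full_depth = [1, 1, 1, 1, 10, 10, 10, 10]
low_depth = [1, 1, 10, 10]          # depth downsampled at scale 2
low_fog = [0.2, 0.2, 0.9, 0.9]      # low-res fog amounts
upsampled = nearest_depth_upsample(low_fog, low_depth, full_depth, 2)
```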

Temporal reprojection

This optimization works by reprojecting data from previous frames.

Evaluation variables: on/off, depth-similarity, velocity-similarity
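
A minimal sketch of the blend with a depth-similarity test, for a single froxel (the constants `alpha` and `depth_eps` are illustrative assumptions):

```python
def temporal_blend(current, history, depth, prev_depth,
                   alpha=0.1, depth_eps=0.05):
    """Temporal reprojection blend with a depth-similarity test.

    If the reprojected depth differs too much from the current depth,
    the history sample is rejected (disocclusion); otherwise current
    and history are blended exponentially.
    """
    if abs(depth - prev_depth) / max(depth, 1e-6) > depth_eps:
        return current
    return alpha * current + (1.0 - alpha) * history

# history accepted: depths match
v1 = temporal_blend(1.0, 0.0, 10.0, 10.0)
# history rejected: large depth change (disocclusion)
v2 = temporal_blend(1.0, 0.0, 10.0, 2.0)
```

A velocity-similarity test would work the same way, comparing motion vectors instead of depths.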


Dithering

Different dithering methods are explored, including interleaved sampling, Bayer matrix dithering, sub-pixel jittering and temporal dithering. To hide artefacts introduced by dithering, separable bilateral Gaussian blur filters will be used.

Evaluation variables: on/off, interleaved group size, blur procedure
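
For reference, the Bayer matrix used for ordered dithering can be built with the classic recursive construction M' = [[4M, 4M+2], [4M+3, 4M+1]]; this sketch is generic, not tied to the engine:

```python
def bayer_matrix(n):
    """Build a 2^n x 2^n Bayer matrix with values 0 .. size*size - 1."""
    m = [[0]]
    for _ in range(n):
        size = len(m)
        new = [[0] * (2 * size) for _ in range(2 * size)]
        for y in range(size):
            for x in range(size):
                v = 4 * m[y][x]
                new[y][x] = v                       # top-left: 4M
                new[y][x + size] = v + 2            # top-right: 4M + 2
                new[y + size][x] = v + 3            # bottom-left: 4M + 3
                new[y + size][x + size] = v + 1     # bottom-right: 4M + 1
        m = new
    return m

b2 = bayer_matrix(1)
b4 = bayer_matrix(2)
# per-pixel threshold in [0, 1): (b4[y % 4][x % 4] + 0.5) / 16.0,
# e.g. used to offset each pixel's ray start to trade banding for noise
```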


Step Sizes

The exact step sizes used in the ray-marching step strongly affect the result. I will implement several important step-size schemes:

  • Uniform View Space
  • Exponential View Space
  • Uniform Froxel Space
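
The three schemes above can be sketched as slice-depth distributions. This is a sketch under assumptions: “uniform froxel space” is interpreted here as uniform in post-projection (reciprocal) depth, which matches a typical froxel grid, and the near/far values are arbitrary:

```python
def slice_depths(num_slices, near, far, scheme):
    """View-space depth of each slice boundary under three schemes."""
    depths = []
    for i in range(num_slices + 1):
        t = i / num_slices
        if scheme == "uniform_view":
            # equal view-space spacing
            d = near + t * (far - near)
        elif scheme == "exponential_view":
            # exponential spacing: more slices near the camera
            d = near * (far / near) ** t
        elif scheme == "uniform_froxel":
            # uniform in reciprocal (post-projection) depth
            d = 1.0 / ((1.0 - t) / near + t / far)
        else:
            raise ValueError(scheme)
        depths.append(d)
    return depths

uni = slice_depths(64, 0.1, 100.0, "uniform_view")
expo = slice_depths(64, 0.1, 100.0, "exponential_view")
```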

I will also explore ideas for adaptively choosing step sizes or using supersampling. Can sampling be adaptively refined according to the local lights in a cluster? What kind of bail-out methods are there?


Memory Alignment

I will also explore access patterns in the voxel data structures, or at least discuss which layouts are theoretically optimal. Can the way the froxels are laid out in memory affect performance notably? I will also discuss compute shader implementations and how memory alignment could matter more if compute shaders are used for the froxel data.
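
As a small illustration of why layout matters, here are two hypothetical flattened index functions for a froxel grid; marching along depth for a fixed screen position has a very different stride in each:

```python
def index_slice_major(x, y, z, w, h, d):
    """Slice-major layout: a full (x, y) slice is contiguous per depth z.
    Marching along z for fixed (x, y) strides by w*h elements."""
    return z * (w * h) + y * w + x

def index_column_major(x, y, z, w, h, d):
    """Column-major layout: all depth samples of one froxel column are
    contiguous, so marching along z is a unit-stride access."""
    return (y * w + x) * d + z

w, h, d = 160, 90, 64  # illustrative froxel grid dimensions
# stride between consecutive z samples of the same screen position
stride_a = index_slice_major(0, 0, 1, w, h, d) - index_slice_major(0, 0, 0, w, h, d)
stride_b = index_column_major(0, 0, 1, w, h, d) - index_column_major(0, 0, 0, w, h, d)
```

Which layout wins in practice depends on the dominant access pattern and on how the hardware caches texture versus buffer reads, which is exactly what the thesis intends to discuss.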

Anti Aliasing

Since the thesis concerns practical volumetric rendering, I also address anti-aliasing. Specifically, I will discuss how to ensure the volumetrics are compatible with FXAA and TAA. Low-resolution rendering in particular will introduce aliasing.



Evaluation

The evaluation will use naive ray marching with many steps as ground truth. Rendering passes with 256 steps yields a very good but very costly result. The real-time algorithm will reduce its time cost with each optimization, and its quality is compared to the ground truth by means of per-pixel difference visualization. Both Lords of the Fallen and Killzone use this approach. It might be possible to motivate the 256-step ground truth via the Nyquist theorem and the theoretical maximum frequency of froxel and shadow-map variations.
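
The quality comparison itself is simple; a sketch over flat pixel lists (the mean error as a single summary number is my assumption, not a metric the thesis commits to):

```python
def per_pixel_difference(image, reference):
    """Absolute per-pixel difference against a ground-truth render,
    plus the mean error as a single summary number."""
    diff = [abs(a - b) for a, b in zip(image, reference)]
    return diff, sum(diff) / len(diff)

ref = [0.0, 0.5, 1.0, 0.25]   # ground truth (256-step render)
img = [0.1, 0.5, 0.9, 0.25]   # optimized real-time render
diff, err = per_pixel_difference(img, ref)
```

The `diff` image is what gets visualized; identical pixels show up as zero.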

Performance will be measured as average microseconds in frame breakdowns, looking at each pass of the algorithm. For instance, the low-resolution rendering will contain one downsampling pass, the ray-march pass and an upsampling pass.


Each main feature/optimization will be evaluated in terms of performance and quality.