I've decided to try to implement the realtime visualization part (i.e. the one that takes already-produced gaussian splat "model" file) in Unity.
The original paper code has a purely CUDA-based realtime renderer; other
people have done their own implementations (e.g. WebGPU at cvlab-epfl, Taichi at wanmeihuali, etc.).
Code in here so far is cribbed together from reading the paper (as well as earlier literature on EWA splatting), looking at the official CUDA implementation, and so on. Current state:
The code does not use the "tile-based splat rasterizer" bit from the paper; it just draws each gaussian splat as a screenspace aligned rectangle that covers the extents of it.
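The rectangle extents follow from the splat's projected 2D covariance: the largest eigenvalue gives the variance along the gaussian's major axis, and cutting off at roughly 3 sigma covers nearly all of its weight. A minimal sketch of that computation (illustrative Python, not the project's actual code; function name is made up):

```python
import math

def splat_rect_radius(cov2d, sigma_cutoff=3.0):
    """Half-extent (in pixels) of a screen-aligned square covering a
    2D gaussian.

    cov2d is the symmetric 2x2 screen-space covariance given as
    (a, b, c) for the matrix [[a, b], [b, c]]. The larger eigenvalue
    is the variance along the major axis; a 3-sigma cutoff covers
    ~99.7% of the gaussian's weight.
    """
    a, b, c = cov2d
    mid = 0.5 * (a + c)
    # eigenvalues of a symmetric 2x2 matrix: mid +/- sqrt(mid^2 - det)
    disc = math.sqrt(max(mid * mid - (a * c - b * b), 0.0))
    lambda_max = mid + disc
    return sigma_cutoff * math.sqrt(lambda_max)
```

For an axis-aligned gaussian with variances 4 and 1 this yields a half-extent of 6 pixels (3 × sqrt(4)).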
Splat color accumulation is done by rendering front-to-back, with a blending mode that results in the same accumulated color as their tile-based renderer.
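Front-to-back accumulation uses the "under" operator: each splat adds its color weighted by its alpha and by the transmittance left over from everything drawn in front of it. That is mathematically the same result as classic back-to-front "over" compositing, which is what the tile-based renderer effectively computes. A quick scalar sanity check (illustrative only):

```python
def composite_front_to_back(splats):
    """Accumulate (color, alpha) pairs, nearest first, with the
    'under' operator: color += T * a_i * c_i, then T *= (1 - a_i)."""
    color, transmittance = 0.0, 1.0
    for c, a in splats:
        color += transmittance * a * c
        transmittance *= 1.0 - a
    return color

def composite_back_to_front(splats):
    """Classic 'over' blending, farthest first:
    color = a_i * c_i + (1 - a_i) * color."""
    color = 0.0
    for c, a in reversed(splats):
        color = a * c + (1.0 - a) * color
    return color
```

Both functions produce identical results for any splat list, which is why a single front-to-back pass with the right blend mode can replace back-to-front sorting plus over blending.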
Splat sorting is done with an AMD FidelityFX-derived radix sort, or (on DX11) with a GPU bitonic sort lifted from the Unity HDRP codebase.
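GPU radix sorts operate on unsigned integer keys, so each splat's view-space depth has to be turned into a sortable bit pattern first. For non-negative IEEE-754 floats the raw bits are already monotonically ordered, which gives a cheap key. A sketch of that trick (an assumption about a typical key encoding, not necessarily what this project does):

```python
import struct

def depth_to_sort_key(depth):
    """Reinterpret a non-negative float32 depth as a uint32 sort key.

    For non-negative IEEE-754 floats, the bit pattern increases
    monotonically with the value, so integer-sorting these keys
    orders splats front to back.
    """
    assert depth >= 0.0
    return struct.unpack("<I", struct.pack("<f", depth))[0]
```

Negative depths (behind the camera) would need the usual sign-bit fixup, but those splats are typically culled before sorting anyway.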
This is not a fast implementation yet!
Usage
Within Unity (2022.3), there's a Scene.unity that has a GaussianSplatRenderer script attached to it.
The project defaults to DX12 on Windows, since then it can use a faster GPU sorting routine. DX11 should also work, at the expense of some performance.
Metal and Vulkan also use the faster sorting approach.
You need to point it to a "model" directory. The model directory is expected to contain cameras.json and
point_cloud/iteration_7000/point_cloud.ply inside of it.
Since the models are quite large, I have not included any in this GitHub repo. The original paper GitHub page has a link to
a 14GB zip of their models.
Press play.
The gaussian splat renderer component inspector has a slider to snap the game view camera to one of the cameras from the model directory.
Or you can just move the game/scene view camera however you please.
There are also various controls in the script to debug or visualize the data.
⚠️ Note: this is all a toy, it is not robust, it does not handle errors gracefully, it does not interact or composite well with the "rest of rendering", it is not fast, etc. etc. I told you so!
Wishlist that I may or may not do at some point:
Make rendering faster (actual tiled compute shader rasterizer)
Look at ways to make the data sets smaller (both on-disk and in-memory)
Integrate better with "the rest" of rendering that might be in the scene (BiRP)
Maybe look at making it work in URP/HDRP? Not sure yet
Make sorting faster (bitonic -> FidelityFX radix sort)
Write-ups
My own blog posts about all this (so far... not that many!):