JonathonLuiten / Dynamic3DGaussians
- Thursday, 19 October 2023, 00:00:03
Official implementation of our approach for modelling the dynamic 3D world as a set of 3D Gaussians that move & rotate over time. This extends Gaussian Splatting to dynamic scenes, with accurate novel-view synthesis and dense 3D 6-DOF tracking.
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis
Jonathon Luiten 1, 2,
Georgios Kopanas 3,
Bastian Leibe 2,
Deva Ramanan 1
1 Carnegie Mellon University, 2 RWTH Aachen University, 3 Inria & Universite Cote d’Azur, France
International Conference on 3D Vision (3DV), 2024
# Install this repo (pytorch)
git clone
conda env create --file environment.yml
conda activate dynamic_gaussians
# Install rendering code (cuda)
git clone
cd diff-gaussian-rasterization-w-depth
python install
pip install .
cd Dynamic3DGaussians
wget # Download pretrained models
python # See code for visualization options
cd Dynamic3DGaussians
wget # Download training data
I tried really hard to make this code clean and useful to build upon. In my opinion it is now much nicer than the original code it was built upon. Everything is relatively 'functional', and I tried to remove redundant classes and modules wherever possible. Almost all of the code is in a few core functions, with the overall training loop clearly laid out. There are only a few other helper functions, divided between two files (depending on license). I have split all useful variables into two dicts: 'params' (those updated with gradient descent) and 'variables' (those not updated by gradient descent). There is also a custom visualization codebase built using Open3D (used for the cool visuals on the website) that is entirely in a single file. Please let me know if there is any way you think the code could be cleaner.
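To make the 'params' / 'variables' split concrete, here is a minimal pure-Python sketch. The key names (`means`, `log_scales`, `scene_radius`, `steps_taken`), the toy loss and the `sgd_step` helper are all hypothetical and are not taken from this codebase, which is built on PyTorch; the sketch only illustrates the idea that one dict receives gradient updates and the other does not.

```python
# Hypothetical sketch: key names and the toy loss are illustrative assumptions,
# not the actual contents of this repository.

def toy_loss_grads(params):
    """Gradient of the toy loss sum(p ** 2) over every parameter, i.e. 2 * p."""
    return {name: [2.0 * p for p in values] for name, values in params.items()}


def sgd_step(params, grads, lr):
    """Return a new params dict after one plain gradient descent step."""
    return {
        name: [p - lr * g for p, g in zip(values, grads[name])]
        for name, values in params.items()
    }


# 'params': everything that is updated with gradient descent.
params = {"means": [1.0, -2.0, 0.5], "log_scales": [0.1]}

# 'variables': state that is carried along but never gets a gradient update.
variables = {"scene_radius": 3.0, "steps_taken": 0}

for _ in range(10):
    grads = toy_loss_grads(params)
    params = sgd_step(params, grads, lr=0.1)
    variables["steps_taken"] += 1  # bookkeeping only, no gradient involved

print(params["means"], variables)
```

Keeping the two dicts separate means the optimizer can be handed everything in 'params' without having to filter out bookkeeping state.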
Before a recent commit there was a bug in this code. This has now been fixed, and the code now appears to be working bug-free. On Oct 17 I also updated the pretrained model link to the most recent working code version. Please make sure to pull the latest version of the code AND redownload the pretrained models.
This codebase contains some significant changes from the results presented in the currently public version of the paper. Both this codebase and the corresponding paper are work in progress and likely to change in the near future. Until I find time to update the paper (ETA Dec 15th), the code here is the more up-to-date public-facing version of the two.
Please let me know if there are any other differences between the paper and the code so that I can include them here and remember to include them in a future version of the paper.
So far we have released two parts of the code: training and visualization. There are three further parts to be released in the future when I find time to clean them up (ETA Dec 15):
Happy to work together to make this code better. If you want to contribute, either open an issue / pull request or send me an email.
I do a number of dumb things which slow the code down A LOT. If someone is motivated, improving these could significantly speed up training time.
In this codebase we provide an Open3D-based dynamic visualizer. This makes adding 3D effects (like the track trajectories) really easy, although it definitely makes visualization slower than it should be. E.g. the code renders the scene at 800 FPS, but using Open3D to display it on the screen (and add camera controls etc.) slows it down to ~30 FPS.
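To put those two frame rates in perspective, the short calculation below converts them to per-frame times. It assumes both figures are steady-state averages and that the display cost is simply added on top of the render cost for each frame; these are simplifying assumptions for illustration, not measurements from this codebase.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


render_ms = frame_time_ms(800)        # rendering alone
display_ms = frame_time_ms(30)        # rendering plus display and camera controls
overhead_ms = display_ms - render_ms  # time attributable to the display path
overhead_fraction = overhead_ms / display_ms

print(f"render: {render_ms:.2f} ms, displayed frame: {display_ms:.2f} ms")
print(f"display overhead: {overhead_ms:.2f} ms ({overhead_fraction:.1%} of each frame)")
```

Under these assumptions, rendering accounts for about 1.25 ms of a roughly 33 ms frame, so nearly all of the time per displayed frame goes to the display path rather than to the Gaussian renderer.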
I have seen lots of cool renderers for Gaussians for static scenes. It would be cool to make my dynamic scenes work on these.
In particular, I have seen various things that (a) somehow run on my phone and old laptop (e.g. here and here), (b) run on VR headsets (e.g. here and here), and (c) run in commonly used tools like Unity (e.g. here).
Dylan made a helpful list that can be found here
The current FG/BG segmentations I use are REALLY bad. I made them very quickly with simple background subtraction, using a background image (an image with no objects) for each camera, plus some smoothing. The poor quality of these segmentation masks causes a noticeable degradation of the results, especially around the feet of people / near the floor. It should be very easy to get much better segmentation masks (e.g. using pretrained networks), but I think it also probably isn't too hard to get the method to work without segmentation masks at all.
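For readers unfamiliar with the idea, here is a minimal pure-Python sketch of background subtraction followed by smoothing. It is not the code used to produce the masks for this project: the grayscale integer images, the threshold of 30 and the 3x3 majority-vote smoothing are all assumptions chosen to keep the example small.

```python
def foreground_mask(frame, background, threshold=30):
    """Mark a pixel as foreground when it differs enough from the empty-scene image."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def smooth_mask(mask):
    """3x3 majority vote: keep a pixel only if most of its neighbourhood is foreground."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = 1 if 2 * sum(window) > len(window) else 0
    return out


# Toy 5x5 grayscale images: a uniform background and a frame with one object.
background = [[100] * 5 for _ in range(5)]
frame = [row[:] for row in background]
for y in range(1, 4):
    for x in range(1, 4):
        frame[y][x] = 200  # a 3x3 "object"
frame[0][4] = 180          # one isolated noisy pixel

raw = foreground_mask(frame, background)
smoothed = smooth_mask(raw)
for row in smoothed:
    print(row)
```

In this toy case the smoothing removes the isolated noisy pixel but also erodes the corners of the object, which hints at why masks built this way tend to be unreliable near thin structures and object boundaries.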
There are A LOT of cool things still to be done building upon Dynamic 3D Gaussians. If you're doing so (especially for research projects), feel free to reach out if you want to discuss (email / issue / twitter).
The code in this repository is licensed under the MIT licence, except for the parts covered by the Inria license described below.
However, to run, this code relies on the CUDA rasterizer code from here, as well as various code that has been taken or adapted from here. These are required for this project, and a much more restrictive license from Inria applies to them, which can be found here. It requires express permission (licensing agreements) from Inria for use in any commercial application, but the code is otherwise freely distributed for research and experimentation.
title={Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis},
author={Luiten, Jonathon and Kopanas, Georgios and Leibe, Bastian and Ramanan, Deva},