PlenopticDreamer: Coherent Multi‑View Video Synthesis

Research Paper

Authors: Xiao Fu,

Topics: computer-vision, multimodal • Difficulty: advanced • Upvotes: 6

Organization: Hugging Face

Published: 2026-01-09 • Added: 2026-01-09

Key Insights

Introduces a camera‑guided retrieval module that pulls relevant latent frames from a pre‑built spatio‑temporal memory, ensuring consistent geometry across different viewpoints.
Employs progressive training (stage‑wise spatial, then temporal fine‑tuning) to stabilize training and improve temporal coherence without sacrificing spatial detail.
Uses synchronized generative hallucination, where the generator receives both the current view’s pose and the retrieved multi‑view context, enabling faithful re‑rendering of dynamic scenes.
Demonstrates that jointly optimizing view consistency and motion continuity yields higher visual fidelity than treating each view or frame independently.
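
The camera-guided retrieval idea in the first insight can be illustrated with a toy sketch. The summary does not say how camera similarity is measured or how the memory is stored, so everything here is an assumption made for illustration: the `pose_distance` metric (rotation angle plus weighted translation gap), the `retrieve_latents` helper, the `(pose, latent)` memory layout, and the top-k selection rule are hypothetical and are not taken from the paper.

```python
import numpy as np

def rotation_angle(R_a, R_b):
    """Geodesic angle (radians) between two 3x3 rotation matrices."""
    cos = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def pose_distance(pose_a, pose_b, trans_weight=1.0):
    """Assumed metric: rotation angle plus weighted translation distance."""
    R_a, t_a = pose_a
    R_b, t_b = pose_b
    return rotation_angle(R_a, R_b) + trans_weight * float(np.linalg.norm(t_a - t_b))

def retrieve_latents(query_pose, memory, k=2):
    """Return the k stored latents whose cameras are closest to query_pose.

    memory is a list of (pose, latent) pairs, with pose = (R, t).
    """
    order = sorted(range(len(memory)),
                   key=lambda i: pose_distance(query_pose, memory[i][0]))
    return [memory[i][1] for i in order[:k]]

def yaw(deg):
    """Rotation about the y-axis by deg degrees."""
    a = np.deg2rad(deg)
    return np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

# Toy memory: cameras at yaw 0, 30, and 90 degrees, each with a placeholder
# latent vector filled with its own yaw value so results are easy to read.
origin = np.zeros(3)
memory = [((yaw(d), origin), np.full((4,), float(d))) for d in (0, 30, 90)]

# Query a camera at yaw 20: the 30-degree view is nearest, then the 0-degree view.
context = retrieve_latents((yaw(20), origin), memory, k=2)
```

In this sketch the retrieved `context` would be passed to the generator together with the query pose, mirroring the "pose plus retrieved multi-view context" conditioning described in the third insight. A real system would presumably operate on video latents and a learned or tuned similarity rather than this hand-written distance.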

Abstract

PlenopticDreamer enables consistent multi-view video re-rendering through synchronized generative hallucinations, leveraging camera-guided retrieval and progressive training mechanisms for improved temporal coherence and visual fidelity.

Source: [HuggingFace](https://huggingface.co/papers/2601.05239) | [arXiv](https://arxiv.org/abs/2601.05239)