An advanced, real-time stereo rendering engine for 2D-to-3D conversion in VR and stereoscopic displays.
VisionDepth3D introduces multiple first-of-their-kind techniques in the 3D conversion community, while also integrating established practices into a single GPU-optimized pipeline.
- The `shape_depth_for_pop` algorithm is unique to VisionDepth3D:
  - Percentile stretch of the depth range.
  - Recentering on subject depth.
  - Symmetric gamma curve for controlled “pop”.
  - Enables tunable cinematic or VR-style stereo with consistent subject emphasis.
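The three steps above can be sketched as follows. This is an illustrative reading, not the project's implementation; the function name matches the README, but the parameter names and defaults are assumptions:

```python
import torch

def shape_depth_for_pop(depth, subject_depth=0.5, lo_pct=0.05, hi_pct=0.95, gamma=1.5):
    """Illustrative sketch of pop-control depth shaping (defaults are assumed).

    1. Percentile stretch: map the [p5, p95] depth range to [0, 1].
    2. Recenter so the tracked subject sits at mid-depth.
    3. Symmetric gamma curve around the center for controlled "pop".
    """
    lo = torch.quantile(depth, lo_pct)
    hi = torch.quantile(depth, hi_pct)
    d = ((depth - lo) / (hi - lo + 1e-6)).clamp(0, 1)      # percentile stretch
    d = (d - subject_depth + 0.5).clamp(0, 1)              # recenter on subject
    centered = d - 0.5
    # Same gamma curve applied on both sides of the center -> symmetric "pop"
    return 0.5 + torch.sign(centered) * (centered.abs() * 2).pow(gamma) / 2
```

Because the curve is symmetric about the subject plane, foreground and background pop are shaped equally, which keeps the subject emphasis stable.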
- Dynamically estimates subject depth (center-weighted histograms + percentiles).
- Locks the zero-parallax plane to the subject with exponential smoothing.
- Prevents subject “drift” and stabilizes the screen plane without manual keyframing.
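A simplified sketch of the idea: the README describes center-weighted histograms plus percentiles; here a center crop stands in for the weighting, with the median as the subject percentile. Names and the smoothing factor are assumptions:

```python
import torch

def estimate_subject_depth(depth, prev=None, alpha=0.15):
    """Sketch: estimate subject depth from the frame center, then EMA-smooth
    it so the zero-parallax plane stays locked without manual keyframing."""
    h, w = depth.shape[-2:]
    center = depth[..., h // 4 : 3 * h // 4, w // 4 : 3 * w // 4]
    est = torch.quantile(center, 0.5)            # center-weighted percentile
    if prev is not None:
        est = alpha * est + (1 - alpha) * prev   # exponential smoothing
    return est
```

The EMA is what prevents frame-to-frame "drift": a single noisy depth estimate can only move the screen plane by a fraction `alpha` per frame.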
- Computes normalized variance of depth in the central region:
  `parallax_scale = compute_dynamic_parallax_scale(depth_tensor)`
- Adapts stereo strength automatically — gentle for flat shots, expansive for landscapes.
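One plausible shape for that function, based on the description above (the saturation constant and crop are assumptions, not the project's code):

```python
import torch

def compute_dynamic_parallax_scale(depth, base_scale=1.0, k=4.0):
    """Sketch: normalized depth variance in the central region drives stereo
    strength; tanh saturation keeps extreme scenes bounded."""
    h, w = depth.shape[-2:]
    center = depth[..., h // 4 : 3 * h // 4, w // 4 : 3 * w // 4]
    # Variance normalized by squared mean -> scale-invariant "depth busyness"
    norm_var = center.var() / (center.mean().clamp(min=1e-6) ** 2)
    return base_scale * torch.tanh(k * norm_var)
```

A flat wall yields near-zero variance and therefore gentle parallax; a deep landscape yields high variance and a wider stereo range.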
- Novel use of depth gradients to suppress parallax shifts near thin, detailed edges:
  `edge_mask = torch.sigmoid((grad_mag - edge_threshold) * feather_strength * 5)`
- Prevents ghosting and halos without AI inpainting.
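The `grad_mag` input to that formula can be computed with Sobel filters. The sketch below assumes that; the threshold and feather defaults are illustrative values, not the project's:

```python
import torch
import torch.nn.functional as F

def edge_suppression_mask(depth, edge_threshold=0.05, feather_strength=2.0):
    """Sketch: approximate depth gradients with Sobel filters, then build the
    soft mask from the README formula. High values near thin, high-contrast
    edges mark where the parallax shift should be damped."""
    sobel_x = torch.tensor([[-1., 0., 1.],
                            [-2., 0., 2.],
                            [-1., 0., 1.]]).view(1, 1, 3, 3)
    sobel_y = sobel_x.transpose(2, 3)
    d = depth.view(1, 1, *depth.shape[-2:])
    gx = F.conv2d(d, sobel_x, padding=1)
    gy = F.conv2d(d, sobel_y, padding=1)
    grad_mag = torch.sqrt(gx ** 2 + gy ** 2)
    edge_mask = torch.sigmoid((grad_mag - edge_threshold) * feather_strength * 5)
    return edge_mask.view(*depth.shape[-2:])
```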
- Combines distance transforms with EMA smoothing to “round” depth on subjects.
- Prevents shimmer in hair, fingers, and soft edges.
- Creates natural curvature without requiring heavy segmentation pipelines.
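A minimal sketch of the rounding idea. The real pipeline uses distance transforms; here repeated erosion (min-pooling) approximates the interior distance. Function names and the rounding factor are assumptions:

```python
import torch
import torch.nn.functional as F

def sculpt_subject_depth(depth, subject_mask, rounding=0.3, iters=8):
    """Sketch: approximate a distance transform by repeatedly eroding the
    subject mask, then bow the subject's depth forward so it reads as
    rounded rather than a flat cutout. (In the pipeline, the bulge would
    also be EMA-smoothed across frames to prevent shimmer.)"""
    m = subject_mask.float().view(1, 1, *subject_mask.shape[-2:])
    dist = torch.zeros_like(m)
    cur = m
    for _ in range(iters):
        # min-pool = erosion; each surviving step adds one "pixel" of distance
        cur = -F.max_pool2d(-cur, 3, stride=1, padding=1)
        dist = dist + cur
    bulge = rounding * (dist / iters).view(*subject_mask.shape[-2:])
    return depth + bulge   # interior pixels pulled forward the most
```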
- Subject-aware floating window automatically detects window violations.
- `FloatingWindowTracker` + `FloatingBarEaser` smooth jitter and clamp drift.
- First real-time system that applies cinematic floating windows automatically.
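The easing behavior can be illustrated with a few lines (a sketch of the concept, not the `FloatingBarEaser` implementation; parameter names and values are assumed):

```python
def update_floating_window(violation, prev_bar=0.0, ease=0.2, max_bar=0.08):
    """Sketch: on a window violation (foreground content crossing a frame
    edge), ease a black bar inward toward max_bar; otherwise ease it back
    out. The easing both smooths jitter and clamps drift."""
    target = max_bar if violation else 0.0
    return prev_bar + ease * (target - prev_bar)
```

Because the bar only ever moves a fraction of the way toward its target each frame, single-frame detection glitches cannot make it pop visibly.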
- Depth-of-field blur controlled by a focal depth tracker that adapts to scene motion.
- Busy shots → faster focus shifts; still shots → stable focus.
- Simulates real cinematography rules dynamically during 3D rendering.
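The motion-adaptive focus rule can be sketched like this (the slow/fast rates are assumed values):

```python
def track_focal_depth(subject_depth, prev_focus, motion_level, slow=0.05, fast=0.4):
    """Sketch: blend the focus-tracking EMA rate by scene motion, so busy
    shots shift focus quickly while still shots hold it steady."""
    m = min(max(motion_level, 0.0), 1.0)       # clamp motion to [0, 1]
    alpha = slow + (fast - slow) * m           # motion-aware smoothing rate
    return prev_focus + alpha * (subject_depth - prev_focus)
```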
- Detects warping gaps via gradient magnitude and fills with blended original + blurred content.
- Lightweight alternative to neural inpainting — seamless and invisible.
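One plausible reading of this step, sketched below: low local gradient in the warped eye is used as a proxy for stretched gaps, which are then filled with a half-original, half-blurred blend. The threshold, kernel size, and blend weights are assumptions:

```python
import torch
import torch.nn.functional as F

def heal_warp_gaps(warped, original, gap_threshold=0.02, blur_ks=9):
    """Sketch: detect low-texture 'stretch' gaps in the warped eye via
    horizontal gradient magnitude, then fill them with a blend of the
    original frame and a blurred copy. Inputs are (C, H, W) tensors."""
    gray = warped.mean(dim=0, keepdim=True).unsqueeze(0)   # (1, 1, H, W)
    gx = gray[..., :, 1:] - gray[..., :, :-1]
    grad_mag = F.pad(gx.abs(), (0, 1))                     # keep width
    gap_mask = (grad_mag < gap_threshold).float()
    blurred = F.avg_pool2d(original.unsqueeze(0), blur_ks,
                           stride=1, padding=blur_ks // 2)
    fill = 0.5 * original + 0.5 * blurred.squeeze(0)
    mask = gap_mask.squeeze(0)                             # broadcasts over C
    return warped * (1 - mask) + fill * mask
```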
- Depth-weighted continuous parallax shifting – smooth stereo gradients instead of discrete layers.
- GPU tensor grid warping – CUDA-optimized `grid_sample` per-eye rendering.
- Scene-aware dampening – adjusts disparity for flat vs. complex scenes.
- Temporal percentile EMA normalization – stabilizes depth scale across frames.
- Depth-based DOF (multi-level Gaussian pyramid) – established technique, enhanced in VD3D with motion-aware focus.
- Black bar detection & aspect handling – auto-crop and cinematic aspect preservation.
- Color grading & sharpening – GPU-accelerated saturation, contrast, brightness, and safe sharpening.
- Multi-format 3D output – Half-SBS, Full-SBS, VR, anaglyph, interlaced.
- FFmpeg streaming codec pipeline – CPU (libx264/x265/AV1), NVIDIA NVENC, AMD AMF, Intel QSV with CRF/CQ control.
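The depth-weighted `grid_sample` warp at the core of the list above can be sketched as follows (a CPU-runnable illustration; on CUDA tensors the same calls run GPU-accelerated; parameter names are assumed):

```python
import torch
import torch.nn.functional as F

def render_eye(frame, depth, parallax_scale=0.03, eye=-1.0):
    """Sketch: shift each pixel horizontally in proportion to its depth via
    grid_sample, producing a continuous stereo gradient rather than discrete
    layers. frame is (C, H, W), depth is (H, W) in [0, 1]; eye = -1/+1."""
    _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    # Disparity in normalized coords, signed around the mid-depth screen plane
    shift = eye * parallax_scale * (depth - 0.5)
    grid = torch.stack((xs + shift, ys), dim=-1).unsqueeze(0)  # (1, H, W, 2)
    warped = F.grid_sample(frame.unsqueeze(0), grid, mode="bilinear",
                           padding_mode="border", align_corners=True)
    return warped.squeeze(0)
```

Calling it with `eye=-1.0` and `eye=+1.0` yields the left and right views; a flat depth map at 0.5 leaves the frame unshifted, matching the zero-parallax plane.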
| Category | Component |
|---|---|
| Original | Pop Control Depth Shaping |
| | Subject-Aware Zero-Parallax Plane (EMA stabilized) |
| | Dynamic Parallax Scaling by Scene Variance |
| | Edge-Aware Gradient Shift Suppression |
| | Matte Sculpting + Temporal Stabilization |
| | Floating Window with Temporal Easing |
| | Motion-Aware DOF Focal Tracking |
| | Gradient-Based Healing of Occlusion Gaps |
| Supporting | Depth-Weighted Parallax Shifting |
| | Temporal Percentile EMA Depth Normalization |
| | GPU Tensor Grid Warping |
| | Scene-Aware Dampening |
| | DOF via Gaussian Pyramids |
| | Aspect Handling + Black Bar Detection |
| | GPU Color Grading + Sharpening |
| | Multi-Codec Pipeline + Multi-Format Export |
VisionDepth3D combines these original contributions with proven practices to form a holistic, real-time, GPU-accelerated 2D→3D pipeline.
📄 Licensed under: VisionDepth3D Custom Use License (No Derivatives)
🔗 Project: https://github.com/VisionDepth/VisionDepth3D