Project Overview
A production-quality volumetric cloud rendering system built from scratch in DirectX 11, demonstrating advanced graphics programming techniques used in AAA games like Horizon Zero Dawn and Red Dead Redemption 2. The system features GPU compute shader raymarching, spatial octree acceleration, procedural noise generation, and physically based light scattering, all running in real time.
🔹 Technologies Used: C++17, DirectX 11, HLSL Compute Shaders, Structured Buffers, 3D Textures, ImGui
Key Features
Throughout development, I focused on implementing production-level graphics programming techniques:
GPU Compute Raymarching: Full-screen raymarching executed entirely on the GPU via compute shaders, processing each pixel independently with adaptive step sizing for optimal performance.
Spatial Octree Acceleration: Hierarchical octree structure enabling O(log n) density lookups and empty-space skipping, reducing ray steps by up to 60% in sparse regions.
Physically-Based Scattering: Henyey-Greenstein phase function for anisotropic scattering, Beer-Lambert extinction, and powder effect approximation for realistic cloud illumination.
Procedural 3D Noise: Multi-octave Perlin and Worley noise textures combined for organic cloud shapes with temporal animation and wind simulation.
Shadow Pass System: Orthographic light-space raymarching for volumetric self-shadowing with transmittance accumulation.
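The scattering terms named in the feature list can be sketched on the CPU as plain C++ functions. This is a minimal illustration rather than the project's shader code: the function names are hypothetical, and the powder term uses one commonly cited form (1 - exp(-2d)), which is an assumption about how the approximation is implemented here.

```cpp
#include <cmath>

// Henyey-Greenstein phase function.
// cosTheta: cosine of the angle between the light and view directions.
// g: anisotropy in (-1, 1); g > 0 favors forward scattering, g = 0 is isotropic.
inline float henyeyGreenstein(float cosTheta, float g) {
    const float kPi = 3.14159265358979f;
    const float g2 = g * g;
    const float denom = 1.0f + g2 - 2.0f * g * cosTheta;
    return (1.0f - g2) / (4.0f * kPi * denom * std::sqrt(denom));
}

// Beer-Lambert law: fraction of light that survives a given optical depth.
inline float beerLambert(float opticalDepth) {
    return std::exp(-opticalDepth);
}

// Powder approximation (assumed form): darkens thin regions where little
// in-scattering has built up, and approaches 1 as optical depth grows.
inline float powder(float opticalDepth) {
    return 1.0f - std::exp(-2.0f * opticalDepth);
}
```

With g = 0 the phase function reduces to the isotropic value 1/(4π), which is a convenient sanity check when porting the formula to HLSL.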
Technical Architecture
🔹 System Overview
CPU (Host)
SkyVolume Manager
├── Cloud Region Definitions (center, radii, type)
├── Octree Construction & Subdivision Logic
├── Density Field Generation (CPU-side sampling)
└── GPU Resource Management
GPU Resources
├── StructuredBuffer<OctreeNode> - Spatial acceleration data
├── Texture3D<float> Density - 256³ density field
├── Texture3D<float> Perlin - 256³ FBM noise
├── Texture3D<float> Worley - 256³ cellular noise
├── RWTexture2D<float4> Output - Final cloud render target
└── RWTexture2D<float4> Shadow - Volumetric shadow map
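To illustrate how an octree can be flattened into an array suitable for a StructuredBuffer<OctreeNode>, here is a minimal C++ sketch. The node layout, field names, and helper functions are all hypothetical; the project's actual OctreeNode may differ. It assumes each interior node stores the index of the first of eight contiguous children, so a point lookup walks one node per level.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical flat node layout (plain floats and ints, so it could be
// mirrored by an HLSL struct of the same shape).
struct OctreeNode {
    float   center[3];
    float   halfSize;
    float   maxDensity;  // 0 means the whole node is empty and can be skipped
    int32_t firstChild;  // index of the first of 8 contiguous children, -1 for a leaf
};

// Descends from the root (index 0) to the leaf containing point p.
// Cost is one step per tree level.
inline int findLeaf(const std::vector<OctreeNode>& nodes, const float p[3]) {
    int index = 0;
    while (nodes[index].firstChild >= 0) {
        const OctreeNode& n = nodes[index];
        int octant = 0;
        if (p[0] >= n.center[0]) octant |= 1;
        if (p[1] >= n.center[1]) octant |= 2;
        if (p[2] >= n.center[2]) octant |= 4;
        index = n.firstChild + octant;
    }
    return index;
}

// A ray marcher can jump across any node that reports zero density.
inline bool canSkip(const OctreeNode& node) {
    return node.maxDensity <= 0.0f;
}

// Test helper: a root cube centered at the origin, split once into 8 leaves.
inline std::vector<OctreeNode> buildSingleSplit(float halfSize, const float leafDensity[8]) {
    std::vector<OctreeNode> nodes(9);  // value-initialized to zero
    nodes[0].halfSize = halfSize;
    nodes[0].firstChild = 1;
    const float h = halfSize * 0.5f;
    for (int i = 0; i < 8; ++i) {
        OctreeNode& leaf = nodes[1 + i];
        leaf.center[0] = (i & 1) ? h : -h;
        leaf.center[1] = (i & 2) ? h : -h;
        leaf.center[2] = (i & 4) ? h : -h;
        leaf.halfSize = h;
        leaf.maxDensity = leafDensity[i];
        leaf.firstChild = -1;
        if (leafDensity[i] > nodes[0].maxDensity) nodes[0].maxDensity = leafDensity[i];
    }
    return nodes;
}
```

Storing children contiguously keeps the node small (one index instead of eight) and lets the octant bits select the child with a single add.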
🔹 Component Separation
SkyVolume - Cloud region management, octree construction, CPU/GPU synchronization
CloudVolumeShader.hlsl - Main raymarching, density sampling, lighting integration
CloudVolumeShadowShader.hlsl - Light-space transmittance calculation
SkyShader.hlsl - Atmospheric gradient and sun rendering
GodRays.hlsl - Volumetric light shaft post-processing
Raymarching Pipeline
🔹 GPU Compute Dispatch
The cloud rendering executes as a full-screen compute shader dispatched in 16×16 thread groups.
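As a rough sketch of the host-side arithmetic, the number of thread groups is the render-target size divided by 16, rounded up so that partial tiles at the right and bottom edges are still covered. The helper names below are hypothetical, and the sketch assumes the shader declares [numthreads(16, 16, 1)].

```cpp
#include <cstdint>

// Assumed to match [numthreads(16, 16, 1)] in the compute shader.
constexpr uint32_t kThreadGroupSize = 16;

struct DispatchSize { uint32_t x, y, z; };

// Integer division rounded up, so a partial tile still gets a thread group.
constexpr uint32_t divRoundUp(uint32_t value, uint32_t divisor) {
    return (value + divisor - 1) / divisor;
}

// Thread-group counts for a full-screen pass over a width x height target.
constexpr DispatchSize cloudDispatchSize(uint32_t width, uint32_t height) {
    return { divRoundUp(width, kThreadGroupSize),
             divRoundUp(height, kThreadGroupSize),
             1u };
}
```

The three counts would then be passed to ID3D11DeviceContext::Dispatch. Because edge groups can overhang the target (1080 rows need 68 groups, covering 1088), the shader should skip threads whose coordinates fall outside the output dimensions.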
🔹 Ray Generation
Each thread reconstructs a world-space ray from its screen coordinates using the inverse view-projection matrix.
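The reconstruction can be sketched in C++ as follows. This is an illustration under stated assumptions, not the shader itself: the types and function names are hypothetical, the matrix is row-major with column vectors (v' = M * v), clip-space depth runs from 0 to 1 as in Direct3D, and the inverse view-projection matrix is assumed to be computed elsewhere.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };  // row-major, column-vector convention (assumption)
struct Ray  { Vec3 origin; Vec3 direction; };

// Transforms the homogeneous point (x, y, z, 1) and applies the perspective divide.
inline Vec3 transformPoint(const Mat4& M, float x, float y, float z) {
    float r[4];
    for (int i = 0; i < 4; ++i)
        r[i] = M.m[i][0] * x + M.m[i][1] * y + M.m[i][2] * z + M.m[i][3];
    return { r[0] / r[3], r[1] / r[3], r[2] / r[3] };
}

// Builds a world-space ray through the center of pixel (px, py).
inline Ray generateRay(const Mat4& invViewProj, int px, int py, int width, int height) {
    // Pixel center to normalized device coordinates in [-1, 1];
    // y is flipped because texture rows grow downward.
    const float ndcX = ((px + 0.5f) / width) * 2.0f - 1.0f;
    const float ndcY = 1.0f - ((py + 0.5f) / height) * 2.0f;

    // Unproject the same pixel on the near (z = 0) and far (z = 1) planes.
    const Vec3 nearP = transformPoint(invViewProj, ndcX, ndcY, 0.0f);
    const Vec3 farP  = transformPoint(invViewProj, ndcX, ndcY, 1.0f);

    const Vec3 d = { farP.x - nearP.x, farP.y - nearP.y, farP.z - nearP.z };
    const float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    return { nearP, { d.x / len, d.y / len, d.z / len } };
}
```

Unprojecting two depths and subtracting avoids needing the camera position as a separate constant; a shader that already has the camera position can instead unproject a single point and subtract the eye.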
🔹 Adaptive Step Sizing
Several adaptive stepping strategies keep the per-ray step count down; they are described under Challenges & Solutions.
Challenges & Solutions
🔹 Challenge: Performance with Dense Raymarching
Naive fixed-step raymarching required 500+ steps per ray, causing GPU bottlenecks at high resolutions.
Solution: Multi-Tier Adaptive Stepping
Octree empty-space skipping — Jump entire nodes with zero density
Distance-based scaling — Larger steps at greater distances
Density-based scaling — Larger steps in low-density regions
Early termination — Stop when transmittance drops below 1%
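The distance scaling, density scaling, and early termination listed above can be combined in a loop like the one sketched below. All names and constants are hypothetical and chosen only for illustration; octree node skipping is left out to keep the sketch short, and the density is supplied by a caller-provided function rather than a 3D texture.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>

// Illustrative tuning values, not the project's actual settings.
struct MarchParams {
    float baseStep         = 1.0f;   // step length at zero distance in dense cloud
    float distanceScale    = 0.01f;  // step growth per unit of distance travelled
    float lowDensityBoost  = 2.0f;   // step multiplier where density is zero
    float minTransmittance = 0.01f;  // stop once less than 1% of light gets through
    float extinction       = 1.0f;   // extinction coefficient per unit density
};

// Larger steps far from the camera and in thin regions.
inline float adaptiveStep(const MarchParams& p, float distance, float density) {
    const float distanceFactor = 1.0f + p.distanceScale * distance;
    const float d = std::min(std::max(density, 0.0f), 1.0f);
    // Linear blend from lowDensityBoost (d = 0) down to 1 (d = 1).
    const float densityFactor = p.lowDensityBoost + (1.0f - p.lowDensityBoost) * d;
    return p.baseStep * distanceFactor * densityFactor;
}

struct MarchResult { float transmittance; int steps; };

// Marches from t = 0 to tMax, attenuating with Beer-Lambert at each step.
inline MarchResult march(const MarchParams& p, float tMax,
                         const std::function<float(float)>& densityAt) {
    float t = 0.0f;
    float transmittance = 1.0f;
    int steps = 0;
    while (t < tMax && transmittance > p.minTransmittance) {
        const float density = densityAt(t);
        const float step = std::min(adaptiveStep(p, t, density), tMax - t);
        transmittance *= std::exp(-p.extinction * density * step);
        t += step;
        ++steps;
    }
    return { transmittance, steps };
}
```

In this sketch a ray through empty space over 100 units takes well under the 100 steps a fixed unit step would need, while a ray through uniformly dense cloud stops after a handful of steps once transmittance falls below the threshold.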
🔹 Challenge: View-Dependent vs View-Independent Calculations
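The initial shadow pass included phase functions and powder effects. Both depend on the view direction, but the shadow map is computed once in light space and shared by every camera ray, so baking them in produced incorrect self-shadowing.
Solution: Separate View-Dependent and View-Independent Work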
Main Pass (View-Dependent)
Henyey-Greenstein phase
Powder effect
In-scattering calculation
Final color composition
Shadow Pass (View-Independent)
Optical depth accumulation
Beer's law transmittance
First/last hit distances
Total density integration
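A view-independent shadow pass of this kind can be sketched as a single accumulation along a light ray. The struct, field names, and sampling scheme below are hypothetical: densities are assumed to be pre-sampled at uniform steps, and hit distances are taken as the near edge of the first non-empty segment and the far edge of the last one.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// What one light-space texel might store (assumed layout).
struct ShadowSample {
    float opticalDepth;   // density integrated along the light ray
    float transmittance;  // Beer's law: exp(-extinction * opticalDepth)
    float firstHit;       // distance to the first non-empty segment, -1 if none
    float lastHit;        // distance to the end of the last non-empty segment, -1 if none
};

// densities[i] is the density of the segment [i * stepSize, (i + 1) * stepSize).
inline ShadowSample accumulateShadow(const std::vector<float>& densities,
                                     float stepSize, float extinction) {
    ShadowSample s = { 0.0f, 1.0f, -1.0f, -1.0f };
    for (std::size_t i = 0; i < densities.size(); ++i) {
        const float density = densities[i];
        if (density <= 0.0f) continue;  // empty segment contributes nothing
        const float nearEdge = static_cast<float>(i) * stepSize;
        if (s.firstHit < 0.0f) s.firstHit = nearEdge;
        s.lastHit = nearEdge + stepSize;
        s.opticalDepth += density * stepSize;
    }
    // No phase function or powder term here: those depend on the view direction.
    s.transmittance = std::exp(-extinction * s.opticalDepth);
    return s;
}
```

The main pass can then read the stored transmittance and apply the phase function and powder term per camera ray, keeping the shadow data valid for any view.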
Future Work
Temporal Reprojection: Reuse shadow calculations across frames to reduce redundant computation
Deep Shadow Maps: Replace current shadow pass with exponential/variance shadow maps for smoother self-shadowing
Multiple Scattering Approximation: Add secondary bounce estimation for more realistic cloud interiors
LOD System: Automatic quality scaling based on distance and screen coverage
Compute Shader Optimization: Implement wavefront occupancy optimization and shared memory caching