Nightbloom Engine

Custom Vulkan Renderer with Real-Time Shader Graph Editor

Project Overview

Nightbloom is a learning-focused graphics engine that demonstrates low-level Vulkan API usage, GPU synchronization patterns, and shader tooling development. The standout feature is a visual shader graph editor that generates GLSL, compiles to SPIR-V at runtime, and hot-reloads shaders without restarting the application.

🔹 Technologies Used: C++17, Vulkan 1.4, VMA, GLSL, SPIR-V, ImGui, GLM.

Key Features

Throughout the development of the Nightbloom Engine, I focused on several core graphics programming concepts:

  • Real-Time Shader Node Editor: Visual node-based shader authoring similar to Unity ShaderGraph or Unreal's Material Editor. Create shaders by connecting nodes and see the results instantly.

  • Live SPIR-V Compilation: Edit shader graphs → Generate GLSL → Compile to SPIR-V → Hot-reload pipeline. No engine restart required.

  • Frames-in-Flight Architecture: Double-buffered command recording lets the CPU prepare frame N+1 while the GPU renders frame N, maximizing throughput.

  • VMA Memory Management: Integrated Vulkan Memory Allocator with staging buffer pools, persistent mapping for uniforms, and efficient device-local uploads.

Shader Node Editor

🔹 Visual shader creation:

• Texture sampling with channel separation (R, G, B, A outputs)

• Math operations (Multiply, Add, Mix) with automatic type promotion

• Time-based animation nodes (Time, Sin, Cos)

• Topological sorting ensures correct evaluation order

• Type-aware connections (float automatically broadcasts to vec4)
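
As an illustration of the type-aware connection rule, the sketch below resolves the output type of a two-input math node and wraps a scalar expression so it matches a wider pin. `PinType`, `promote`, `glslName`, and `broadcast` are hypothetical names, not taken from the engine, and rejecting mismatched vector widths is an assumption about how such an editor might behave.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical pin types for this sketch (not the engine's own enum).
enum class PinType { Float, Vec2, Vec3, Vec4 };

// Output type of a component-wise two-input node such as Multiply or Add.
// A scalar adopts the other operand's width; two different vector widths
// have no implicit conversion in GLSL, so the connection is rejected.
inline std::optional<PinType> promote(PinType a, PinType b) {
    if (a == b) return a;
    if (a == PinType::Float) return b;
    if (b == PinType::Float) return a;
    return std::nullopt;
}

// GLSL spelling of each pin type.
inline std::string glslName(PinType t) {
    switch (t) {
        case PinType::Float: return "float";
        case PinType::Vec2:  return "vec2";
        case PinType::Vec3:  return "vec3";
        case PinType::Vec4:  return "vec4";
    }
    return "float";
}

// Wraps a scalar expression in a vector constructor, e.g. vec4(x),
// which replicates the scalar into every component.
inline std::string broadcast(const std::string& expr, PinType from, PinType to) {
    if (from == to) return expr;
    assert(from == PinType::Float && "only scalars are broadcast in this sketch");
    return glslName(to) + "(" + expr + ")";
}
```

For example, connecting a Time node (float) to a vec4 input would turn the expression `node3_out0` into `vec4(node3_out0)` in the generated GLSL.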


🔹 How It Works:

1. User creates and connects nodes visually

2. Graph is topologically sorted by dependencies

3. Each node emits its GLSL code snippet

4. Combined into complete vertex/fragment shaders

5. glslc compiles GLSL → SPIR-V

6. New pipeline created and swapped in live
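
Steps 2 to 4 can be sketched with Kahn's algorithm followed by one emitted line per node. This is a minimal illustration, not the engine's code: the `Node` struct is hypothetical, every value is treated as a `float`, each node has a single output pin, and operations are written as function calls.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical node: `op` is a GLSL function or leaf expression,
// `inputs` holds the indices of the upstream nodes feeding it.
struct Node {
    std::string op;
    std::vector<int> inputs;
};

// Kahn's algorithm: returns node indices with every input before its user.
// Returns an empty vector if the graph contains a cycle.
inline std::vector<int> topoSort(const std::vector<Node>& nodes) {
    const int n = static_cast<int>(nodes.size());
    std::vector<int> pending(n, 0);          // unresolved inputs per node
    std::vector<std::vector<int>> users(n);  // reverse edges
    for (int i = 0; i < n; ++i) {
        for (int src : nodes[i].inputs) {
            assert(src >= 0 && src < n);
            users[src].push_back(i);
            ++pending[i];
        }
    }
    std::queue<int> ready;
    for (int i = 0; i < n; ++i)
        if (pending[i] == 0) ready.push(i);

    std::vector<int> order;
    while (!ready.empty()) {
        const int id = ready.front();
        ready.pop();
        order.push_back(id);
        for (int user : users[id])
            if (--pending[user] == 0) ready.push(user);
    }
    if (static_cast<int>(order.size()) != n) order.clear();  // cycle detected
    return order;
}

// Emits one GLSL statement per node in dependency order.
inline std::string emitBody(const std::vector<Node>& nodes) {
    std::string glsl;
    for (int id : topoSort(nodes)) {
        const Node& node = nodes[id];
        std::string rhs = node.op;
        if (!node.inputs.empty()) {
            rhs += "(";
            for (std::size_t k = 0; k < node.inputs.size(); ++k) {
                if (k > 0) rhs += ", ";
                rhs += "node" + std::to_string(node.inputs[k]) + "_out0";
            }
            rhs += ")";
        }
        glsl += "float node" + std::to_string(id) + "_out0 = " + rhs + ";\n";
    }
    return glsl;
}
```

With a Sin node (index 0) fed by a Time node (index 1, here the assumed uniform `u_time`), the sort yields 1 then 0, so the Time variable is declared before the Sin node reads it.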

System Architecture

🔹 Separation of Components:

• FrameSyncManager - Fences and semaphores for CPU/GPU synchronization

• ResourceManager - Buffer, texture, and shader lifecycle

• CommandRecorder - Command buffer recording and draw execution

• VulkanDescriptorManager - Descriptor set layouts, pools, and bindings

• VulkanPipelineAdapter - Abstracts pipeline creation behind generic interface

Frame Synchronization Model

Implementation

  • Vulkan requires explicit synchronization between CPU and GPU. The engine
    implements a double-buffered frames-in-flight model:

  • Fences (CPU ↔ GPU): CPU waits on a fence before reusing a frame slot's
    resources, ensuring the GPU has finished with them.

  • Semaphores (GPU ↔ GPU): Image acquisition signals a semaphore that the graphics queue submission waits on before writing to the image. Render completion signals another semaphore that presentation waits on.

  • Key Insight: imageAvailable semaphores are sized to MAX_FRAMES_IN_FLIGHT (2), while renderFinished semaphores are sized to swapchain image count (3). This handles swapchain images being acquired out of order.

🔹Code Sample: Frame Execution
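
A minimal sketch of the indexing scheme described above, written without Vulkan headers so it stands alone. Plain indices stand in for `VkFence` and `VkSemaphore` handles, the `FrameSync` and `FrameSyncSelection` names are hypothetical, and the Vulkan calls appear only in comments to show where each object would be used.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t MAX_FRAMES_IN_FLIGHT = 2;

// Which sync objects one frame uses (indices into the respective arrays).
struct FrameSyncSelection {
    std::uint32_t inFlightFence;   // per frame slot
    std::uint32_t imageAvailable;  // per frame slot
    std::uint32_t renderFinished;  // per swapchain image
};

class FrameSync {
public:
    explicit FrameSync(std::uint32_t swapchainImageCount)
        : imageCount_(swapchainImageCount) {
        assert(swapchainImageCount > 0);
    }

    // Typical order of operations for one frame:
    //   1. vkWaitForFences(inFlightFence[slot])        CPU waits for the slot
    //   2. vkAcquireNextImageKHR(..., imageAvailable[slot]) -> imageIndex
    //   3. vkResetFences, record commands, then vkQueueSubmit:
    //        wait   imageAvailable[slot]
    //        signal renderFinished[imageIndex], inFlightFence[slot]
    //   4. vkQueuePresentKHR waits on renderFinished[imageIndex]
    //   5. advance() to the next frame slot
    FrameSyncSelection select(std::uint32_t acquiredImageIndex) const {
        assert(acquiredImageIndex < imageCount_);
        return {slot_, slot_, acquiredImageIndex};
    }

    void advance() { slot_ = (slot_ + 1) % MAX_FRAMES_IN_FLIGHT; }

    std::uint32_t slot() const { return slot_; }

private:
    std::uint32_t imageCount_;
    std::uint32_t slot_ = 0;
};
```

Indexing `renderFinished` by the acquired image rather than by the frame slot means the semaphore that presentation waits on is only reused once that same image has been acquired again, regardless of the order in which the swapchain hands images out.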

Shader Compilation Pipeline

🔹Implementation

1. Graph Analysis

- Topological sort orders nodes by dependencies

- Type resolution propagates types through connections

- Float × Vec4 automatically promotes to Vec4 output

2. GLSL Generation

- Each node emits its code snippet

- Variables named node{id}_out{pin} to avoid collisions

- Uniforms and push constants automatically declared

3. SPIR-V Compilation

- Invokes glslc.exe to compile GLSL to SPIR-V

- Error messages captured and displayed in editor

4. Pipeline Hot-Reload

- New VkShaderModules created from SPIR-V

- New VkPipeline created with updated shaders

- Old pipeline destroyed after GPU finishes (fence sync)
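
For step 3, one way to invoke the compiler and capture its diagnostics is to launch glslc through a pipe and read its output. The sketch below assumes glslc is on the PATH; `Stage`, `glslcCommand`, `runAndCapture`, and `CompileResult` are hypothetical names, and the engine may launch the process differently.

```cpp
#include <array>
#include <cassert>
#include <cstdio>
#include <string>

#ifdef _WIN32
#define NB_POPEN _popen
#define NB_PCLOSE _pclose
#else
#define NB_POPEN popen
#define NB_PCLOSE pclose
#endif

enum class Stage { Vertex, Fragment };

// Builds the glslc command line. -fshader-stage names the stage explicitly,
// so the input file does not need a .vert/.frag extension. "2>&1" folds
// stderr into stdout so error messages arrive through the pipe.
inline std::string glslcCommand(Stage stage, const std::string& glslPath,
                                const std::string& spvPath) {
    assert(!glslPath.empty() && !spvPath.empty());
    const char* stageName = (stage == Stage::Vertex) ? "vertex" : "fragment";
    return std::string("glslc -fshader-stage=") + stageName +
           " \"" + glslPath + "\" -o \"" + spvPath + "\" 2>&1";
}

struct CompileResult {
    bool ok;          // true when the process exited with status 0
    std::string log;  // compiler output, suitable for display in the editor
};

// Runs a command and collects everything it prints.
inline CompileResult runAndCapture(const std::string& command) {
    CompileResult result{false, {}};
    std::FILE* pipe = NB_POPEN(command.c_str(), "r");
    if (!pipe) {
        result.log = "failed to launch compiler process";
        return result;
    }
    std::array<char, 256> chunk{};
    while (std::fgets(chunk.data(), static_cast<int>(chunk.size()), pipe)) {
        result.log += chunk.data();
    }
    result.ok = (NB_PCLOSE(pipe) == 0);
    return result;
}
```

A caller would pass the result of `glslcCommand(...)` to `runAndCapture(...)`, show `log` in the editor when `ok` is false, and otherwise load the `.spv` file to create the new `VkShaderModule`.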

🔹Code Sample: Frame Execution
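
During frame execution, the hot-reload step needs the old pipeline to outlive any frame that still references it. The sketch below shows one common way to do that: a retire queue keyed by a frame counter. `RetireQueue` is a hypothetical helper, not the engine's class, and the destroy callback stands in for a call such as `vkDestroyPipeline`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Holds destruction callbacks until the GPU is known to have finished
// the last frame that could reference the resource.
class RetireQueue {
public:
    // `lastUsedFrame` is the frame counter value of the last submission
    // that recorded commands using the old resource.
    void retire(std::uint64_t lastUsedFrame, std::function<void()> destroy) {
        assert(destroy);
        pending_.push_back({lastUsedFrame, std::move(destroy)});
    }

    // Call after a successful fence wait; `completedFrame` is the newest
    // frame whose fence has signaled.
    void collect(std::uint64_t completedFrame) {
        auto it = pending_.begin();
        while (it != pending_.end()) {
            if (it->lastUsedFrame <= completedFrame) {
                it->destroy();
                it = pending_.erase(it);
            } else {
                ++it;
            }
        }
    }

    std::size_t size() const { return pending_.size(); }

private:
    struct Entry {
        std::uint64_t lastUsedFrame;
        std::function<void()> destroy;
    };
    std::vector<Entry> pending_;
};
```

When a new pipeline is swapped in at frame 6, the old one would be retired with `lastUsedFrame = 5` and destroyed the first time `collect` is called with a completed frame of 5 or later.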

Results & Showcase

What I Would Improve Next

While the core functionality is complete, there are several areas I would expand on if I continued developing this project:

  • Depth Buffer: The renderer currently draws a single object. Adding a depth attachment to the render pass would enable correct multi-object rendering with proper occlusion.

  • Bindless Textures: The current implementation updates descriptor sets per draw call. Implementing descriptor indexing would place all textures in a single bindless array, eliminating the per-draw descriptor updates and scaling to thousands of objects.

  • Async Shader Compilation: glslc currently blocks the main thread during compilation. Moving this to a worker thread with a callback system would prevent frame hitches when editing shaders.

  • Compute Shader Support: Adding compute pipeline infrastructure would enable GPU-driven effects like particle systems, culling, and eventually volumetric rendering.

  • Graph Serialization: Shader graphs currently reset on restart. Implementing JSON save/load would allow persistent material authoring and asset sharing.

  • Multi-Pass Rendering: Extending the render pass system would support shadow maps, deferred rendering, or post-processing chains.

  • Material System: Building on top of the shader graph would create reusable material instances with exposed parameters for artists.

These improvements would transform the project from a learning tool into a practical asset viewer that could slot into a real studio workflow.