Major reorg

Zachary Levy
2026-04-30 18:49:38 -07:00
parent fd64bc01bf
commit 87d4c9a0b5
16 changed files with 2293 additions and 2259 deletions
+260 -236
@@ -5,151 +5,79 @@ import "core:math"
import "core:mem"
import sdl "vendor:sdl3"
// Adaptive downsample design (Flutter-style).
// This file hosts the backdrop subsystem: any visual effect that samples the current
// framebuffer as input. Today the only implemented effect is Gaussian blur (frosted glass);
// future effects (refraction, mirror, etc.) will live here too.
//
// The bracket picks a downsample factor per-sigma-group, not as a global constant. The choice
// is driven by Flutter's `CalculateScale` formula in
// impeller/entity/contents/filters/gaussian_blur_filter_contents.cc (originally from Skia's
// GrBlurUtils): downsample so that the sigma in working-resolution pixels stays in the
// 2..4 range. This keeps the kernel reach wide enough to hide high-frequency artifacts from
// the bilinear upsample at the composite, while keeping the kernel's discrete tap count
// small (≤3σ reach → ≈12 paired taps).
// The file is split into two top-level sections:
//
// The full table, in physical pixels (sigma_logical * dpi_scaling):
// 1. Shared backdrop infrastructure — bracket coordination, source_texture lifecycle,
// sub-batch scanners. These are general to any backdrop effect: every backdrop effect
// needs a snapshot of the framebuffer (source_texture) and needs to participate in the
// bracket render-pass-boundary scheduling. When a second effect is added, its
// per-effect resources go in their own section like the Gaussian blur one below; this
// shared section stays.
//
// sigma_phys ≤ 4 → factor = 1 (no downsample; source is sampled directly)
// sigma_phys ≤ 8 → factor = 2
// sigma_phys > 8 → factor = 4 (capped)
// 2. Gaussian blur — the only effect implemented today. Owns its own PSOs, working
// textures (downsample / h_blur), per-primitive storage layout, kernel math, and
// bracket-runner inner loop. None of this is shared with future backdrop effects: a
// refraction shader would have its own PSO, its own primitive struct, and likely
// wouldn't need the downsample/h_blur intermediates at all.
//
// Capped at factor=4 to favor visual quality over bandwidth at the high end. Larger factors
// (8 and 16) would lose more high-frequency detail than the kernel can mask even with the
// H+V split, and the bandwidth saving is small (the work region also shrinks quadratically,
// so most of the savings are already captured at factor=4).
//
// Working textures are sized at full swapchain resolution to support factor=1. Larger factors
// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: the two full-res
// RGBA8 working textures total roughly 16 MB at 1080p and 64 MB at 4K. On modern
// GPUs this is well within budget; on Mali Valhall SBCs it's negligible against unified-
// memory headroom.
//
// The shaders read the factor as a uniform. The downsample shader has three paths (factor=1
// identity, factor=2 single bilinear tap, factor>=4 four bilinear taps with offsets scaling
// by factor/4). The V-composite mode of backdrop_blur.frag uses inv_downsample_factor to
// scale full-res frag coords down to working-res UV.
// Maximum number of (weight, offset) pairs in a single blur kernel. Each pair represents
// the linear-sampling pair adjustment (one bilinear fetch covering two adjacent texels);
// pair[0] is the center weight with offset 0. With 32 pairs we cover up to 63 input texels
// (1 center + 31 paired symmetric taps × 2 texels each), enough for sigma values well past
// the 4..24 typical UI range. Must match MAX_KERNEL_PAIRS in shaders/source/backdrop_blur.frag.
MAX_BACKDROP_KERNEL_PAIRS :: 32
// Backdrop_Primitive is the GPU-side per-primitive storage layout. Mirrors the GLSL std430
// struct in shaders/source/backdrop_blur.vert. Field order is chosen so std430 alignment
// rules pack the struct to a clean 48-byte natural layout (no implicit padding): vec4
// members come first (16-byte aligned at any offset), then vec2, then scalars. The total is
// a multiple of 16 so the std430 array stride matches size_of(...) exactly.
//
// Backdrop primitives are RRect-only: rectangles, rounded rectangles, and circles
// (via uniform_radii) are all expressible. Rotation is intentionally omitted — backdrop
// sampling is in screen space, so a rotated mask over a stationary blur sample would look
// visually wrong. iOS, CSS backdrop-filter, and Flutter BackdropFilter all enforce this
// implicitly; we enforce it explicitly by leaving no rotation field.
//
// Outline is also intentionally omitted. A specialized edge effect (e.g. liquid-glass-style
// refraction outlines) would be implemented as a dedicated primitive type with its own
// pipeline rather than tacked onto this one as a flag bit.
Backdrop_Primitive :: struct {
bounds: [4]f32, // 0: 16 — world-space quad (min_xy, max_xy)
radii: [4]f32, // 16: 16 — per-corner radii in physical pixels (BR, TR, BL, TL)
half_size: [2]f32, // 32: 8 — RRect half extents (physical px)
half_feather: f32, // 40: 4 — feather_px * 0.5 (SDF anti-aliasing)
color: Color, // 44: 4 — tint, packed RGBA u8x4
}
#assert(size_of(Backdrop_Primitive) == 48)
// The `Backdrop` struct currently holds resources from both categories; field-group
// comments inside it mark which are which. When a second effect lands the struct will be
// split, but doing that pre-emptively means inventing a per-effect dispatch protocol on
// speculation. Better to keep the conflation visible (and labeled) until concrete needs
// shape the design.
// ---------------------------------------------------------------------------------------------------------------------
// ----- Uniform blocks ----------------
// ----- Shared backdrop infrastructure ------------
// ---------------------------------------------------------------------------------------------------------------------
// Vertex uniforms for the unified blur PSO (mode 0 = H-blur, mode 1 = V-composite).
// Matches the GLSL Uniforms block in shaders/source/backdrop_blur.vert. The downsample
// PSO has no vertex uniforms.
Backdrop_Vert_Uniforms :: struct {
projection: matrix[4, 4]f32, // 0: 64 — screen-space ortho (mode 1 only; mode 0 ignores)
dpi_scale: f32, // 64: 4
mode: u32, // 68: 4 — 0 = H-blur fullscreen tri; 1 = V-composite instanced quads
_pad0: [2]f32, // 72: 8 — std140 vec4 alignment pad
}
//INTERNAL
Backdrop :: struct {
// -- Shared across all backdrop effects --
// Fragment uniforms for the downsample PSO. Matches Uniforms block in
// shaders/source/backdrop_downsample.frag.
Backdrop_Downsample_Frag_Uniforms :: struct {
inv_source_size: [2]f32, // 0: 8 — 1.0 / source_texture pixel dimensions (full-res)
downsample_factor: u32, // 8: 4 — 1, 2, or 4 (selects identity / 1-tap / 4-tap path in shader)
_pad0: u32, // 12: 4
}
// When any backdrop draw exists this frame, the entire frame renders into source_texture
// instead of the swapchain. Acts as the bracket's snapshot input by virtue of already
// containing the pre-bracket frame. Copied to the swapchain at frame end.
source_texture: ^sdl.GPUTexture,
// Fragment uniforms for the unified blur PSO (mode 0 + mode 1). Matches the GLSL Uniforms
// block in shaders/source/backdrop_blur.frag. The kernel array holds the linear-sampling
// pair coefficients computed CPU-side via `compute_blur_kernel`.
Backdrop_Frag_Uniforms :: struct {
inv_working_size: [2]f32, // 0: 8 — 1.0 / working-resolution texture dimensions
pair_count: u32, // 8: 4 — number of (weight, offset) pairs; pair[0] is center
mode: u32, // 12: 4 — 0 = H-blur, 1 = V-composite (must match vert mode)
direction: [2]f32, // 16: 8 — (1,0) for H-blur, (0,1) for V-composite
inv_downsample_factor: f32, // 24: 4 — 1.0 / downsample_factor (mode 1 only; mode 0 ignores)
_pad0: f32, // 28: 4
kernel: [MAX_BACKDROP_KERNEL_PAIRS][4]f32, // 32: 512 — .x = weight, .y = offset (texels)
}
// Cached pixel dimensions for resize-detection in `ensure_backdrop_textures`.
cached_width: u32,
cached_height: u32,
// ---------------------------------------------------------------------------------------------------------------------
// ----- Pipeline ---------------
// ---------------------------------------------------------------------------------------------------------------------
// Linear-clamp sampler used for sampling source_texture (and Gaussian blur's working
// textures). Linear filtering is required by the Gaussian linear-sampling pair trick;
// any future backdrop effect that samples source_texture with bilinear interpolation
// can reuse this sampler. Clamp avoids edge-bleed at work-region boundaries.
sampler: ^sdl.GPUSampler,
// -- Gaussian blur effect --
Pipeline_2D_Backdrop :: struct {
// Two graphics pipelines. The downsample PSO is a single-bilinear-sample fullscreen pass;
// the blur PSO is mode-branched (H-blur fullscreen + V-composite instanced) and shares
// one shader program for both modes via a uniform `mode` selector.
downsample_pipeline: ^sdl.GPUGraphicsPipeline,
blur_pipeline: ^sdl.GPUGraphicsPipeline,
// Per-instance Backdrop_Primitive storage buffer. Grows on demand via grow_buffer_if_needed.
// Per-instance Gaussian_Blur_Primitive storage buffer. Grows on demand via grow_buffer_if_needed.
// All backdrop primitives across all layers in a frame share this single buffer; sub-batches
// reference into it by offset.
primitive_buffer: Buffer,
// Working textures, allocated once at swapchain resolution and recreated only on resize.
// All three are sized at full swapchain resolution and single-sample. Larger downsample
// factors fill only a sub-rect via viewport-limited rendering (see file-header comment).
// source_texture — when any backdrop draw exists this frame, the entire frame renders
// here instead of the swapchain. Copied to the swapchain at frame
// end. Acts as the bracket's snapshot input by virtue of already
// containing the pre-bracket frame.
// Both are sized at full swapchain resolution and single-sample. Larger downsample
// factors fill only a sub-rect via viewport-limited rendering (see the comment
// on adaptive downsampling in the Gaussian blur section below).
// downsample_texture — written by the downsample PSO. Read by the blur PSO in mode 0.
// h_blur_texture — written by the blur PSO in mode 0. Read by the blur PSO in mode 1.
source_texture: ^sdl.GPUTexture,
downsample_texture: ^sdl.GPUTexture,
h_blur_texture: ^sdl.GPUTexture,
// Cached pixel dimensions for resize-detection in `ensure_backdrop_textures`.
cached_width: u32,
cached_height: u32,
// Linear-clamp sampler used for all backdrop sampling. Linear filtering is required by the
// linear-sampling pair trick (one bilinear fetch covers two adjacent texels). Clamp avoids
// edge-bleed at the work-region boundary.
sampler: ^sdl.GPUSampler,
}
@(private)
create_pipeline_2d_backdrop :: proc(
device: ^sdl.GPUDevice,
window: ^sdl.Window,
) -> (
pipeline: Pipeline_2D_Backdrop,
ok: bool,
) {
//INTERNAL
create_backdrop :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window) -> (pipeline: Backdrop, ok: bool) {
// On failure, clean up any partially-created resources.
defer if !ok {
if pipeline.sampler != nil do sdl.ReleaseGPUSampler(device, pipeline.sampler)
@@ -307,10 +235,10 @@ create_pipeline_2d_backdrop :: proc(
return pipeline, false
}
//----- Storage buffer for Backdrop_Primitive instances -------------
//----- Storage buffer for Gaussian_Blur_Primitive instances -------------
pipeline.primitive_buffer = create_buffer(
device,
size_of(Backdrop_Primitive) * BUFFER_INIT_SIZE,
size_of(Gaussian_Blur_Primitive) * BUFFER_INIT_SIZE,
sdl.GPUBufferUsageFlags{.GRAPHICS_STORAGE_READ},
) or_return
@@ -331,12 +259,12 @@ create_pipeline_2d_backdrop :: proc(
return pipeline, false
}
log.debug("Done creating backdrop pipeline")
log.debug("Done creating backdrop subsystem")
return pipeline, true
}
@(private)
destroy_pipeline_2d_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Pipeline_2D_Backdrop) {
//INTERNAL
destroy_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Backdrop) {
if pipeline.h_blur_texture != nil do sdl.ReleaseGPUTexture(device, pipeline.h_blur_texture)
if pipeline.downsample_texture != nil do sdl.ReleaseGPUTexture(device, pipeline.downsample_texture)
if pipeline.source_texture != nil do sdl.ReleaseGPUTexture(device, pipeline.source_texture)
@@ -346,20 +274,22 @@ destroy_pipeline_2d_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Pipeline
if pipeline.downsample_pipeline != nil do sdl.ReleaseGPUGraphicsPipeline(device, pipeline.downsample_pipeline)
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Working texture management ----
// ---------------------------------------------------------------------------------------------------------------------
//----- Working texture management ----------------------------------
// Allocate (or reallocate, on resize) the three working textures that the backdrop bracket
// uses. All three are sized at full swapchain resolution, single-sample, share the swapchain
// format, and need {.COLOR_TARGET, .SAMPLER} usage so they can be written by render passes
// and read by subsequent passes.
//
// `source_texture` is shared infrastructure (used by every backdrop effect).
// `downsample_texture` and `h_blur_texture` are Gaussian-blur-specific intermediates; a
// future backdrop effect with no downsample/blur prep would skip them.
//
// Recreates on dimension change only — same-size frames hit the early-out and skip GPU
// resource churn.
@(private)
//INTERNAL
ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureFormat, width, height: u32) {
pipeline := &GLOB.pipeline_2d_backdrop
pipeline := &GLOB.backdrop
if pipeline.source_texture != nil && pipeline.cached_width == width && pipeline.cached_height == height {
return
}
@@ -449,10 +379,138 @@ ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureF
pipeline.cached_height = height
}
//----- Frame / layer scanners ----------------------------------
// Returns true if any sub-batch in any layer this frame is .Backdrop kind. Called once at the
// top of `end()` to decide whether to route the whole frame to source_texture.
// O(total sub-batches) but with an early-exit on the first hit, so typical cost is tiny.
//INTERNAL
frame_has_backdrop :: proc() -> bool {
for &batch in GLOB.tmp_sub_batches {
if batch.kind == .Backdrop do return true
}
return false
}
// Returns the absolute index of the first .Backdrop sub-batch in the layer's sub-batch range,
// or -1 if the layer has no backdrops. The index is into GLOB.tmp_sub_batches (not relative to
// layer.sub_batch_start), to match how draw_layer's render-range helpers consume it.
//INTERNAL
find_first_backdrop_in_layer :: proc(layer: ^Layer) -> int {
for i in 0 ..< layer.sub_batch_len {
abs_idx := layer.sub_batch_start + i
if GLOB.tmp_sub_batches[abs_idx].kind == .Backdrop do return int(abs_idx)
}
return -1
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Kernel computation ------------
// ----- Gaussian blur ------------
// ---------------------------------------------------------------------------------------------------------------------
// Adaptive downsample design (Flutter-style).
//
// The bracket picks a downsample factor per-sigma-group, not as a global constant. The choice
// is driven by Flutter's `CalculateScale` formula in
// impeller/entity/contents/filters/gaussian_blur_filter_contents.cc (originally from Skia's
// GrBlurUtils): downsample so that the sigma in working-resolution pixels stays in the
// 2..4 range. This keeps the kernel reach wide enough to hide high-frequency artifacts from
// the bilinear upsample at the composite, while keeping the kernel's discrete tap count
// small (≤3σ reach → ≈12 paired taps).
//
// The full table, in physical pixels (sigma_logical * dpi_scaling):
//
// sigma_phys ≤ 4 → factor = 1 (no downsample; source is sampled directly)
// sigma_phys ≤ 8 → factor = 2
// sigma_phys > 8 → factor = 4 (capped)
//
// Capped at factor=4 to favor visual quality over bandwidth at the high end. Larger factors
// (8 and 16) would lose more high-frequency detail than the kernel can mask even with the
// H+V split, and the bandwidth saving is small (the work region also shrinks quadratically,
// so most of the savings are already captured at factor=4).
//
// Working textures are sized at full swapchain resolution to support factor=1. Larger factors
// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: the two full-res
// RGBA8 working textures total roughly 16 MB at 1080p and 64 MB at 4K. On modern
// GPUs this is well within budget; on Mali Valhall SBCs it's negligible against unified-
// memory headroom.
//
// The shaders read the factor as a uniform. The downsample shader has three paths (factor=1
// identity, factor=2 single bilinear tap, factor>=4 four bilinear taps with offsets scaling
// by factor/4). The V-composite mode of backdrop_blur.frag uses inv_downsample_factor to
// scale full-res frag coords down to working-res UV.
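// Illustrative sketch only, not part of this commit: the factor table above, written as a
// stand-alone program. `pick_downsample_factor` and `working_sigma` are hypothetical names;
// the real selection lives in `compute_backdrop_downsample_factor`, which also applies
// GLOB.dpi_scaling. The sketch assumes sigma is already in physical pixels.

```odin
package main

import "core:fmt"

// Hypothetical stand-alone version of the factor table.
// sigma_phys is the blur sigma in physical pixels (sigma_logical * dpi_scaling).
pick_downsample_factor :: proc(sigma_phys: f32) -> u32 {
	switch {
	case sigma_phys <= 4:
		return 1
	case sigma_phys <= 8:
		return 2
	}
	return 4 // capped
}

// Sigma expressed in working-resolution texels after downsampling by `factor`.
working_sigma :: proc(sigma_phys: f32, factor: u32) -> f32 {
	return sigma_phys / f32(factor)
}

main :: proc() {
	sigmas := [?]f32{2, 4, 6, 8, 12}
	for s in sigmas {
		f := pick_downsample_factor(s)
		fmt.printf("sigma_phys=%v factor=%v working_sigma=%v\n", s, f, working_sigma(s, f))
	}
}
```

// Because of the cap at factor 4, the working sigma leaves the 2..4 band once sigma_phys
// exceeds 16; the kernel simply grows from there, up to the pair budget.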
//----- GPU types ----------------------------------
// Maximum number of (weight, offset) pairs in a single blur kernel. Each pair represents
// the linear-sampling pair adjustment (one bilinear fetch covering two adjacent texels);
// pair[0] is the center weight with offset 0. With 32 pairs we cover up to 63 input texels
// (1 center + 31 paired symmetric taps × 2 texels each), enough for sigma values well past
// the 4..24 typical UI range. Must match MAX_KERNEL_PAIRS in shaders/source/backdrop_blur.frag.
//INTERNAL
MAX_GAUSSIAN_BLUR_KERNEL_PAIRS :: 32
// Gaussian_Blur_Primitive is the GPU-side per-primitive storage layout. Mirrors the GLSL std430
// struct in shaders/source/backdrop_blur.vert. Field order is chosen so std430 alignment
// rules pack the struct to a clean 48-byte natural layout (no implicit padding): vec4
// members come first (16-byte aligned at any offset), then vec2, then scalars. The total is
// a multiple of 16 so the std430 array stride matches size_of(...) exactly.
//
// Gaussian blur primitives are RRect-only: rectangles, rounded rectangles, and circles
// (via uniform_radii) are all expressible. Rotation is intentionally omitted — backdrop
// sampling is in screen space, so a rotated mask over a stationary blur sample would look
// visually wrong. iOS, CSS backdrop-filter, and Flutter BackdropFilter all enforce this
// implicitly; we enforce it explicitly by leaving no rotation field.
//
// Outline is also intentionally omitted. A specialized edge effect (e.g. liquid-glass-style
// refraction outlines) would be implemented as a dedicated primitive type with its own
// pipeline rather than tacked onto this one as a flag bit.
//INTERNAL
Gaussian_Blur_Primitive :: struct {
bounds: [4]f32, // 0: 16 — world-space quad (min_xy, max_xy)
radii: [4]f32, // 16: 16 — per-corner radii in physical pixels (BR, TR, BL, TL)
half_size: [2]f32, // 32: 8 — RRect half extents (physical px)
half_feather: f32, // 40: 4 — feather_px * 0.5 (SDF anti-aliasing)
color: Color, // 44: 4 — tint, packed RGBA u8x4
}
#assert(size_of(Gaussian_Blur_Primitive) == 48)
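// Illustrative sketch only, not part of this commit: the size assert above could be backed
// by per-field offset checks so a reordered field fails at compile time. `Example_Primitive`
// is a hypothetical copy of the struct, and `Color` is assumed here to be a 4-byte packed
// RGBA value (its real definition is outside this diff).

```odin
package main

// Assumption: Color is 4 bytes (packed RGBA u8x4), matching the offset comments above.
Color :: distinct [4]u8

Example_Primitive :: struct {
	bounds:       [4]f32, //  0: 16
	radii:        [4]f32, // 16: 16
	half_size:    [2]f32, // 32: 8
	half_feather: f32,    // 40: 4
	color:        Color,  // 44: 4
}

// Compile-time checks that the CPU layout matches the documented std430 offsets.
#assert(offset_of(Example_Primitive, bounds) == 0)
#assert(offset_of(Example_Primitive, radii) == 16)
#assert(offset_of(Example_Primitive, half_size) == 32)
#assert(offset_of(Example_Primitive, half_feather) == 40)
#assert(offset_of(Example_Primitive, color) == 44)
#assert(size_of(Example_Primitive) == 48)
// The array stride equals size_of only if the total is a multiple of 16.
#assert(size_of(Example_Primitive) % 16 == 0)

main :: proc() {}
```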
// Vertex uniforms for the unified blur PSO (mode 0 = H-blur, mode 1 = V-composite).
// Matches the GLSL Uniforms block in shaders/source/backdrop_blur.vert. The downsample
// PSO has no vertex uniforms.
//INTERNAL
Gaussian_Blur_Vert_Uniforms :: struct {
projection: matrix[4, 4]f32, // 0: 64 — screen-space ortho (mode 1 only; mode 0 ignores)
dpi_scale: f32, // 64: 4
mode: u32, // 68: 4 — 0 = H-blur fullscreen tri; 1 = V-composite instanced quads
_pad0: [2]f32, // 72: 8 — std140 vec4 alignment pad
}
// Fragment uniforms for the downsample PSO. Matches Uniforms block in
// shaders/source/backdrop_downsample.frag.
//INTERNAL
Gaussian_Blur_Downsample_Frag_Uniforms :: struct {
inv_source_size: [2]f32, // 0: 8 — 1.0 / source_texture pixel dimensions (full-res)
downsample_factor: u32, // 8: 4 — 1, 2, or 4 (selects identity / 1-tap / 4-tap path in shader)
_pad0: u32, // 12: 4
}
// Fragment uniforms for the unified blur PSO (mode 0 + mode 1). Matches the GLSL Uniforms
// block in shaders/source/backdrop_blur.frag. The kernel array holds the linear-sampling
// pair coefficients computed CPU-side via `compute_blur_kernel`.
//INTERNAL
Gaussian_Blur_Frag_Uniforms :: struct {
inv_working_size: [2]f32, // 0: 8 — 1.0 / working-resolution texture dimensions
pair_count: u32, // 8: 4 — number of (weight, offset) pairs; pair[0] is center
mode: u32, // 12: 4 — 0 = H-blur, 1 = V-composite (must match vert mode)
direction: [2]f32, // 16: 8 — (1,0) for H-blur, (0,1) for V-composite
inv_downsample_factor: f32, // 24: 4 — 1.0 / downsample_factor (mode 1 only; mode 0 ignores)
_pad0: f32, // 28: 4
kernel: [MAX_GAUSSIAN_BLUR_KERNEL_PAIRS][4]f32, // 32: 512 — .x = weight, .y = offset (texels)
}
//----- Kernel computation ----------------------------------
// Compute Gaussian blur kernel weights with the linear-sampling pair adjustment.
// Adapted from RAD Debugger's r_d3d11_g_blur_shader_src CPU-side coefficient generation
// and Daniel Rákos's "Efficient Gaussian blur with linear sampling" article.
@@ -480,18 +538,23 @@ ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureF
// from it, rather than the inverse (RAD Debugger's algorithm passes a tap count and derives
// `stdev = (blur_count-1)/2`). Taking σ directly matches what callers expect when they read
// "gaussian_sigma" — passing tap count under that name was a footgun.
@(private)
compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f32) -> (pair_count: u32) {
//INTERNAL
compute_blur_kernel :: proc(
sigma: f32,
kernel: ^[MAX_GAUSSIAN_BLUR_KERNEL_PAIRS][4]f32,
) -> (
pair_count: u32,
) {
if sigma <= 0 {
kernel[0] = {1, 0, 0, 0}
return 1
}
// Per-side discrete tap count: ceil(3*sigma) + 1 (center + 3σ reach on each side).
// Cap at the storage budget. With MAX_BACKDROP_KERNEL_PAIRS=32 each pair collapses 2
// Cap at the storage budget. With MAX_GAUSSIAN_BLUR_KERNEL_PAIRS=32 each pair collapses 2
// discrete taps via linear-sampling, so max discrete taps per side = 1 + 31*2 = 63.
discrete_taps := u32(math.ceil(3 * sigma)) + 1
max_taps := u32(MAX_BACKDROP_KERNEL_PAIRS - 1) * 2 + 1
max_taps := u32(MAX_GAUSSIAN_BLUR_KERNEL_PAIRS - 1) * 2 + 1
if discrete_taps > max_taps do discrete_taps = max_taps
if discrete_taps < 2 {
// Sigma was so small that 3σ < 1 texel; degenerate to a sharp sample.
@@ -501,7 +564,7 @@ compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f
// Compute discrete weights[i] = exp(-i² / (2σ²)). The inv_root prefactor cancels in the
// final normalization, so we skip it.
weights: [MAX_BACKDROP_KERNEL_PAIRS * 2]f32 = {}
weights: [MAX_GAUSSIAN_BLUR_KERNEL_PAIRS * 2]f32 = {}
two_sigma_sq := 2 * sigma * sigma
total: f32 = 0
for i in 0 ..< discrete_taps {
@@ -535,38 +598,9 @@ compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f
return pair_count
}
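// Illustrative sketch only, not part of this commit: the linear-sampling pair adjustment
// from Rákos's article, shown for a single pair. `merge_taps` is a hypothetical helper; the
// pairing loop inside `compute_blur_kernel` is elided from this diff and may differ in
// detail (for example in when normalization is applied).

```odin
package main

import "core:fmt"
import "core:math"

// Merge two adjacent discrete taps into one bilinear fetch.
// The combined weight is the sum; the sample offset is the weight-averaged position, so
// the hardware's linear filter reproduces both original taps in a single texture read.
merge_taps :: proc(w1, o1, w2, o2: f32) -> (weight, offset: f32) {
	weight = w1 + w2
	offset = (o1 * w1 + o2 * w2) / weight
	return
}

main :: proc() {
	sigma: f32 = 2
	two_sigma_sq := 2 * sigma * sigma
	// Unnormalized Gaussian weights for the taps at texel distances 1 and 2.
	w1 := math.exp(-1 / two_sigma_sq)
	w2 := math.exp(-4 / two_sigma_sq)
	weight, offset := merge_taps(w1, 1, w2, 2)
	// The offset falls between 1 and 2, nearer to 1 because w1 > w2.
	fmt.printf("weight=%v offset=%v\n", weight, offset)
}
```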
// ---------------------------------------------------------------------------------------------------------------------
// ----- Uniform push helpers ----------
// ---------------------------------------------------------------------------------------------------------------------
// Push the Backdrop_Vert_Uniforms block to the vertex stage at slot 0.
@(private)
push_backdrop_vert_globals :: proc(cmd_buffer: ^sdl.GPUCommandBuffer, width: f32, height: f32, mode: u32) {
uniforms := Backdrop_Vert_Uniforms {
projection = ortho_rh(left = 0.0, top = 0.0, right = width, bottom = height, near = -1.0, far = 1.0),
dpi_scale = GLOB.dpi_scaling,
mode = mode,
}
sdl.PushGPUVertexUniformData(cmd_buffer, 0, &uniforms, size_of(Backdrop_Vert_Uniforms))
}
// Push the Backdrop_Downsample_Frag_Uniforms block to the fragment stage at slot 0.
@(private)
push_backdrop_downsample_frag_globals :: proc(
cmd_buffer: ^sdl.GPUCommandBuffer,
source_width, source_height: u32,
downsample_factor: u32,
) {
uniforms := Backdrop_Downsample_Frag_Uniforms {
inv_source_size = {1.0 / f32(source_width), 1.0 / f32(source_height)},
downsample_factor = downsample_factor,
}
sdl.PushGPUFragmentUniformData(cmd_buffer, 0, &uniforms, size_of(Backdrop_Downsample_Frag_Uniforms))
}
// Pick a downsample factor for a given sigma. See the adaptive-downsample comment at the top of
// the Gaussian blur section for the table and rationale. Returned values: {1, 2, 4}.
@(private)
//INTERNAL
compute_backdrop_downsample_factor :: proc(sigma_logical: f32) -> u32 {
sigma_phys := sigma_logical * GLOB.dpi_scaling
switch {
@@ -576,80 +610,76 @@ compute_backdrop_downsample_factor :: proc(sigma_logical: f32) -> u32 {
}
}
// Push the Backdrop_Frag_Uniforms block (kernel + pass mode/direction) to the fragment stage at slot 0.
@(private)
push_backdrop_blur_frag_globals :: proc(
cmd_buffer: ^sdl.GPUCommandBuffer,
uniforms: ^Backdrop_Frag_Uniforms,
) {
sdl.PushGPUFragmentUniformData(cmd_buffer, 0, uniforms, size_of(Backdrop_Frag_Uniforms))
//----- Uniform push helpers ----------------------------------
// Push the Gaussian_Blur_Vert_Uniforms block to the vertex stage at slot 0.
//INTERNAL
push_backdrop_vert_globals :: proc(cmd_buffer: ^sdl.GPUCommandBuffer, width: f32, height: f32, mode: u32) {
uniforms := Gaussian_Blur_Vert_Uniforms {
projection = ortho_rh(left = 0.0, top = 0.0, right = width, bottom = height, near = -1.0, far = 1.0),
dpi_scale = GLOB.dpi_scaling,
mode = mode,
}
sdl.PushGPUVertexUniformData(cmd_buffer, 0, &uniforms, size_of(Gaussian_Blur_Vert_Uniforms))
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Storage-buffer upload ---------
// ---------------------------------------------------------------------------------------------------------------------
// Push the Gaussian_Blur_Downsample_Frag_Uniforms block to the fragment stage at slot 0.
//INTERNAL
push_backdrop_downsample_frag_globals :: proc(
cmd_buffer: ^sdl.GPUCommandBuffer,
source_width, source_height: u32,
downsample_factor: u32,
) {
uniforms := Gaussian_Blur_Downsample_Frag_Uniforms {
inv_source_size = {1.0 / f32(source_width), 1.0 / f32(source_height)},
downsample_factor = downsample_factor,
}
sdl.PushGPUFragmentUniformData(cmd_buffer, 0, &uniforms, size_of(Gaussian_Blur_Downsample_Frag_Uniforms))
}
// Upload all Backdrop_Primitive instances staged this frame to the backdrop pipeline's storage
// buffer. Mirrors the SDF primitive upload in pipeline_2d_base.odin's `upload`. Called from
// Push the Gaussian_Blur_Frag_Uniforms block (kernel + pass mode/direction) to the fragment stage at slot 0.
//INTERNAL
push_backdrop_blur_frag_globals :: proc(
cmd_buffer: ^sdl.GPUCommandBuffer,
uniforms: ^Gaussian_Blur_Frag_Uniforms,
) {
sdl.PushGPUFragmentUniformData(cmd_buffer, 0, uniforms, size_of(Gaussian_Blur_Frag_Uniforms))
}
//----- Storage-buffer upload ----------------------------------
// Upload all Gaussian_Blur_Primitive instances staged this frame to the backdrop subsystem's storage
// buffer. Mirrors the SDF primitive upload in core_2d.odin's `upload`. Called from
// `end()` inside the same copy pass that uploads vertices/indices/SDF primitives.
@(private)
//INTERNAL
upload_backdrop_primitives :: proc(device: ^sdl.GPUDevice, pass: ^sdl.GPUCopyPass) {
prim_count := u32(len(GLOB.tmp_backdrop_primitives))
prim_count := u32(len(GLOB.tmp_gaussian_blur_primitives))
if prim_count == 0 do return
prim_size := prim_count * size_of(Backdrop_Primitive)
prim_size := prim_count * size_of(Gaussian_Blur_Primitive)
grow_buffer_if_needed(
device,
&GLOB.pipeline_2d_backdrop.primitive_buffer,
&GLOB.backdrop.primitive_buffer,
prim_size,
sdl.GPUBufferUsageFlags{.GRAPHICS_STORAGE_READ},
)
prim_array := sdl.MapGPUTransferBuffer(device, GLOB.pipeline_2d_backdrop.primitive_buffer.transfer, false)
prim_array := sdl.MapGPUTransferBuffer(device, GLOB.backdrop.primitive_buffer.transfer, false)
if prim_array == nil {
log.panicf("Failed to map backdrop primitive transfer buffer: %s", sdl.GetError())
}
mem.copy(prim_array, raw_data(GLOB.tmp_backdrop_primitives), int(prim_size))
sdl.UnmapGPUTransferBuffer(device, GLOB.pipeline_2d_backdrop.primitive_buffer.transfer)
mem.copy(prim_array, raw_data(GLOB.tmp_gaussian_blur_primitives), int(prim_size))
sdl.UnmapGPUTransferBuffer(device, GLOB.backdrop.primitive_buffer.transfer)
sdl.UploadToGPUBuffer(
pass,
sdl.GPUTransferBufferLocation{transfer_buffer = GLOB.pipeline_2d_backdrop.primitive_buffer.transfer},
sdl.GPUBufferRegion{buffer = GLOB.pipeline_2d_backdrop.primitive_buffer.gpu, offset = 0, size = prim_size},
sdl.GPUTransferBufferLocation{transfer_buffer = GLOB.backdrop.primitive_buffer.transfer},
sdl.GPUBufferRegion{buffer = GLOB.backdrop.primitive_buffer.gpu, offset = 0, size = prim_size},
false,
)
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Frame / layer scanners --------
// ---------------------------------------------------------------------------------------------------------------------
// Returns true if any sub-batch in any layer this frame is .Backdrop kind. Called once at the
// top of `end()` to decide whether to route the whole frame to source_texture.
// O(total sub-batches) but with an early-exit on the first hit, so typical cost is tiny.
@(private)
frame_has_backdrop :: proc() -> bool {
for &batch in GLOB.tmp_sub_batches {
if batch.kind == .Backdrop do return true
}
return false
}
// Returns the absolute index of the first .Backdrop sub-batch in the layer's sub-batch range,
// or -1 if the layer has no backdrops. The index is into GLOB.tmp_sub_batches (not relative to
// layer.sub_batch_start), to match how draw_layer's render-range helpers consume it.
@(private)
find_first_backdrop_in_layer :: proc(layer: ^Layer) -> int {
for i in 0 ..< layer.sub_batch_len {
abs_idx := layer.sub_batch_start + i
if GLOB.tmp_sub_batches[abs_idx].kind == .Backdrop do return int(abs_idx)
}
return -1
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Bracket scheduler -------------
// ---------------------------------------------------------------------------------------------------------------------
//----- Bracket scheduler ----------------------------------
// Compute the union AABB of the backdrop primitives in a contiguous-same-sigma sub-batch run
// (one "sigma group"), expanded by 6 sigmas of blur reach (the kernel weight beyond 3σ is
@@ -661,7 +691,7 @@ find_first_backdrop_in_layer :: proc(layer: ^Layer) -> int {
// Per-group (rather than per-layer) because the adaptive downsample picks a different factor
// per sigma, and the kernel reach is also per-sigma. A tighter region per group means less
// fragment work in the downsample and H-blur passes.
@(private)
//INTERNAL
compute_backdrop_group_work_region :: proc(
group_start, group_end: u32,
sigma_logical: f32,
@@ -680,7 +710,7 @@ compute_backdrop_group_work_region :: proc(
batch := GLOB.tmp_sub_batches[i]
if batch.kind != .Backdrop do continue
for p in batch.offset ..< batch.offset + batch.count {
prim := GLOB.tmp_backdrop_primitives[p]
prim := GLOB.tmp_gaussian_blur_primitives[p]
// prim.bounds is in logical pixels (world space).
if !has_any {
min_x = prim.bounds[0]
@@ -751,13 +781,13 @@ compute_backdrop_group_work_region :: proc(
// On exit, source_texture contains the pre-bracket contents plus all backdrop primitives
// composited on top. The caller then runs Pass B (post-bracket non-backdrop sub-batches) on
// source_texture with LOAD.
@(private)
//INTERNAL
run_backdrop_bracket :: proc(
cmd_buffer: ^sdl.GPUCommandBuffer,
layer: ^Layer,
swapchain_width, swapchain_height: u32,
) {
pipeline := &GLOB.pipeline_2d_backdrop
pipeline := &GLOB.backdrop
full_viewport := sdl.GPUViewport {
x = 0,
@@ -852,7 +882,7 @@ run_backdrop_bracket :: proc(
// Convert the user's logical-pixel sigma into the kernel's working space.
// sigma_working_texels = sigma_logical * dpi_scaling / downsample_factor.
effective_sigma := sigma * GLOB.dpi_scaling / f32(downsample_factor)
frag_uniforms := Backdrop_Frag_Uniforms {
frag_uniforms := Gaussian_Blur_Frag_Uniforms {
inv_working_size = inv_working_size,
inv_downsample_factor = 1.0 / f32(downsample_factor),
}
@@ -1002,24 +1032,20 @@ run_backdrop_bracket :: proc(
}
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Primitive builders ------------
// ---------------------------------------------------------------------------------------------------------------------
//----- Primitive builders ----------------------------------
// Internal
//
// Build a Backdrop_Primitive with bounds, radii, and feather computed from rectangle
// Build a Gaussian_Blur_Primitive with bounds, radii, and feather computed from rectangle
// geometry. The caller sets `color` (tint) on the returned primitive before submitting.
//
// No rotation, no outline — backdrop primitives are intentionally limited to axis-aligned
// No rotation, no outline: Gaussian blur primitives are intentionally limited to axis-aligned
// RRects. Rotation breaks screen-space blur sampling visually; outline would be a specialized
// edge effect that belongs in its own primitive type.
@(private)
//INTERNAL
build_backdrop_primitive :: proc(
rect: Rectangle,
radii: Rectangle_Radii,
feather_px: f32,
) -> Backdrop_Primitive {
) -> Gaussian_Blur_Primitive {
max_radius := min(rect.width, rect.height) * 0.5
clamped_top_left := clamp(radii.top_left, 0, max_radius)
clamped_top_right := clamp(radii.top_right, 0, max_radius)
@@ -1035,7 +1061,7 @@ build_backdrop_primitive :: proc(
center_x := rect.x + half_width
center_y := rect.y + half_height
return Backdrop_Primitive {
return Gaussian_Blur_Primitive {
bounds = {
center_x - half_width - padding,
center_y - half_height - padding,
@@ -1057,13 +1083,13 @@ build_backdrop_primitive :: proc(
}
}
// Internal — append a Backdrop_Primitive to the staging array and emit a .Backdrop sub-batch
// Append a Gaussian_Blur_Primitive to the staging array and emit a .Backdrop sub-batch
// carrying the requested gaussian_sigma. Sub-batch coalescing in append_or_extend_sub_batch
// will merge contiguous backdrops that share a sigma into a single instanced draw.
@(private)
prepare_backdrop_primitive :: proc(layer: ^Layer, prim: Backdrop_Primitive, gaussian_sigma: f32) {
offset := u32(len(GLOB.tmp_backdrop_primitives))
append(&GLOB.tmp_backdrop_primitives, prim)
//INTERNAL
prepare_backdrop_primitive :: proc(layer: ^Layer, prim: Gaussian_Blur_Primitive, gaussian_sigma: f32) {
offset := u32(len(GLOB.tmp_gaussian_blur_primitives))
append(&GLOB.tmp_gaussian_blur_primitives, prim)
scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1]
append_or_extend_sub_batch(
scissor,
@@ -1075,9 +1101,7 @@ prepare_backdrop_primitive :: proc(layer: ^Layer, prim: Backdrop_Primitive, gaus
)
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Public API --------------------
// ---------------------------------------------------------------------------------------------------------------------
//----- Public API ----------------------------------
// Draw a rectangle whose interior samples a Gaussian-blurred snapshot of the framebuffer
// behind it. RRect-only — covers rectangles, rounded rectangles, and circles via