Major reorg
+260
-236
@@ -5,151 +5,79 @@ import "core:math"
import "core:mem"
import sdl "vendor:sdl3"

// Adaptive downsample design (Flutter-style).
// This file hosts the backdrop subsystem: any visual effect that samples the current
// framebuffer as input. Today the only implemented effect is Gaussian blur (frosted glass);
// future effects (refraction, mirror, etc.) will live here too.
//
// The bracket picks a downsample factor per-sigma-group, not as a global constant. The choice
// is driven by Flutter's `CalculateScale` formula in
// impeller/entity/contents/filters/gaussian_blur_filter_contents.cc (originally from Skia's
// GrBlurUtils): downsample so that the sigma in working-resolution pixels stays in the
// 2..4 range. This keeps the kernel reach wide enough to hide high-frequency artifacts from
// the bilinear upsample at the composite, while keeping the kernel's discrete tap count
// small (≤3σ reach → ≈12 paired taps).
// The file is split into two top-level sections:
//
// The full table, in physical pixels (sigma_logical * dpi_scaling):
// 1. Shared backdrop infrastructure — bracket coordination, source_texture lifecycle,
// sub-batch scanners. These are general to any backdrop effect: every backdrop effect
// needs a snapshot of the framebuffer (source_texture) and needs to participate in the
// bracket render-pass-boundary scheduling. When a second effect is added, its
// per-effect resources go in their own section like the Gaussian blur one below; this
// shared section stays.
//
// sigma_phys ≤ 4 → factor = 1 (no downsample; source is sampled directly)
// sigma_phys ≤ 8 → factor = 2
// sigma_phys > 8 → factor = 4 (capped)
// 2. Gaussian blur — the only effect implemented today. Owns its own PSOs, working
// textures (downsample / h_blur), per-primitive storage layout, kernel math, and
// bracket-runner inner loop. None of this is shared with future backdrop effects: a
// refraction shader would have its own PSO, its own primitive struct, and likely
// wouldn't need the downsample/h_blur intermediates at all.
//
// Capped at factor=4 to favor visual quality over bandwidth at the high end. Larger factors
// (8 and 16) would lose more high-frequency detail than the kernel can mask even with the
// H+V split, and the bandwidth saving is small (the work region also shrinks quadratically,
// so most of the savings are already captured at factor=4).
//
// Working textures are sized at full swapchain resolution to support factor=1. Larger factors
// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: full-res
// working textures (2 textures, RGBA8) is roughly 16 MB at 1080p, 64 MB at 4K. On modern
// GPUs this is well within budget; on Mali Valhall SBCs it's negligible against unified-
// memory headroom.
//
// The shaders read the factor as a uniform. The downsample shader has three paths (factor=1
// identity, factor=2 single bilinear tap, factor>=4 four bilinear taps with offsets scaling
// by factor/4). The V-composite mode of backdrop_blur.frag uses inv_downsample_factor to
// scale full-res frag coords down to working-res UV.

// Maximum number of (weight, offset) pairs in a single blur kernel. Each pair represents
// the linear-sampling pair adjustment (one bilinear fetch covering two adjacent texels);
// pair[0] is the center weight with offset 0. With 32 pairs we cover up to 63 input texels
// (1 center + 31 paired symmetric taps × 2 texels each), enough for sigma values well past
// the 4..24 typical UI range. Must match MAX_KERNEL_PAIRS in shaders/source/backdrop_blur.frag.
MAX_BACKDROP_KERNEL_PAIRS :: 32

// Backdrop_Primitive is the GPU-side per-primitive storage layout. Mirrors the GLSL std430
// struct in shaders/source/backdrop_blur.vert. Field order is chosen so std430 alignment
// rules pack the struct to a clean 48-byte natural layout (no implicit padding): vec4
// members come first (16-byte aligned at any offset), then vec2, then scalars. The total is
// a multiple of 16 so the std430 array stride matches size_of(...) exactly.
//
// Backdrop primitives are RRect-only: rectangles, rounded rectangles, and circles
// (via uniform_radii) are all expressible. Rotation is intentionally omitted — backdrop
// sampling is in screen space, so a rotated mask over a stationary blur sample would look
// visually wrong. iOS, CSS backdrop-filter, and Flutter BackdropFilter all enforce this
// implicitly; we enforce it explicitly by leaving no rotation field.
//
// Outline is also intentionally omitted. A specialized edge effect (e.g. liquid-glass-style
// refraction outlines) would be implemented as a dedicated primitive type with its own
// pipeline rather than tacked onto this one as a flag bit.
Backdrop_Primitive :: struct {
    bounds: [4]f32, // 0: 16 — world-space quad (min_xy, max_xy)
    radii: [4]f32, // 16: 16 — per-corner radii in physical pixels (BR, TR, BL, TL)
    half_size: [2]f32, // 32: 8 — RRect half extents (physical px)
    half_feather: f32, // 40: 4 — feather_px * 0.5 (SDF anti-aliasing)
    color: Color, // 44: 4 — tint, packed RGBA u8x4
}
#assert(size_of(Backdrop_Primitive) == 48)
// The `Backdrop` struct currently holds resources from both categories; field-group
// comments inside it mark which are which. When a second effect lands the struct will be
// split, but doing that pre-emptively means inventing a per-effect dispatch protocol on
// speculation. Better to keep the conflation visible (and labeled) until concrete needs
// shape the design.

// ---------------------------------------------------------------------------------------------------------------------
// ----- Uniform blocks ----------------
// ----- Shared backdrop infrastructure ------------
// ---------------------------------------------------------------------------------------------------------------------

// Vertex uniforms for the unified blur PSO (mode 0 = H-blur, mode 1 = V-composite).
// Matches the GLSL Uniforms block in shaders/source/backdrop_blur.vert. The downsample
// PSO has no vertex uniforms.
Backdrop_Vert_Uniforms :: struct {
    projection: matrix[4, 4]f32, // 0: 64 — screen-space ortho (mode 1 only; mode 0 ignores)
    dpi_scale: f32, // 64: 4
    mode: u32, // 68: 4 — 0 = H-blur fullscreen tri; 1 = V-composite instanced quads
    _pad0: [2]f32, // 72: 8 — std140 vec4 alignment pad
}
//INTERNAL
Backdrop :: struct {
    // -- Shared across all backdrop effects --

// Fragment uniforms for the downsample PSO. Matches Uniforms block in
// shaders/source/backdrop_downsample.frag.
Backdrop_Downsample_Frag_Uniforms :: struct {
    inv_source_size: [2]f32, // 0: 8 — 1.0 / source_texture pixel dimensions (full-res)
    downsample_factor: u32, // 8: 4 — 1, 2, or 4 (selects identity / 1-tap / 4-tap path in shader)
    _pad0: u32, // 12: 4
}
    // When any backdrop draw exists this frame, the entire frame renders into source_texture
    // instead of the swapchain. Acts as the bracket's snapshot input by virtue of already
    // containing the pre-bracket frame. Copied to the swapchain at frame end.
    source_texture: ^sdl.GPUTexture,

// Fragment uniforms for the unified blur PSO (mode 0 + mode 1). Matches the GLSL Uniforms
// block in shaders/source/backdrop_blur.frag. The kernel array holds the linear-sampling
// pair coefficients computed CPU-side via `compute_blur_kernel`.
Backdrop_Frag_Uniforms :: struct {
    inv_working_size: [2]f32, // 0: 8 — 1.0 / working-resolution texture dimensions
    pair_count: u32, // 8: 4 — number of (weight, offset) pairs; pair[0] is center
    mode: u32, // 12: 4 — 0 = H-blur, 1 = V-composite (must match vert mode)
    direction: [2]f32, // 16: 8 — (1,0) for H-blur, (0,1) for V-composite
    inv_downsample_factor: f32, // 24: 4 — 1.0 / downsample_factor (mode 1 only; mode 0 ignores)
    _pad0: f32, // 28: 4
    kernel: [MAX_BACKDROP_KERNEL_PAIRS][4]f32, // 32: 512 — .x = weight, .y = offset (texels)
}
    // Cached pixel dimensions for resize-detection in `ensure_backdrop_textures`.
    cached_width: u32,
    cached_height: u32,

// ---------------------------------------------------------------------------------------------------------------------
// ----- Pipeline ---------------
// ---------------------------------------------------------------------------------------------------------------------
    // Linear-clamp sampler used for sampling source_texture (and Gaussian blur's working
    // textures). Linear filtering is required by the Gaussian linear-sampling pair trick;
    // any future backdrop effect that samples source_texture with bilinear interpolation
    // can reuse this sampler. Clamp avoids edge-bleed at work-region boundaries.
    sampler: ^sdl.GPUSampler,

    // -- Gaussian blur effect --

Pipeline_2D_Backdrop :: struct {
    // Two graphics pipelines. The downsample PSO is a single-bilinear-sample fullscreen pass;
    // the blur PSO is mode-branched (H-blur fullscreen + V-composite instanced) and shares
    // one shader program for both modes via a uniform `mode` selector.
    downsample_pipeline: ^sdl.GPUGraphicsPipeline,
    blur_pipeline: ^sdl.GPUGraphicsPipeline,

    // Per-instance Backdrop_Primitive storage buffer. Grows on demand via grow_buffer_if_needed.
    // Per-instance Gaussian_Blur_Primitive storage buffer. Grows on demand via grow_buffer_if_needed.
    // All backdrop primitives across all layers in a frame share this single buffer; sub-batches
    // reference into it by offset.
    primitive_buffer: Buffer,

    // Working textures, allocated once at swapchain resolution and recreated only on resize.
    // All three are sized at full swapchain resolution and single-sample. Larger downsample
    // factors fill only a sub-rect via viewport-limited rendering (see file-header comment).
    // source_texture — when any backdrop draw exists this frame, the entire frame renders
    // here instead of the swapchain. Copied to the swapchain at frame
    // end. Acts as the bracket's snapshot input by virtue of already
    // containing the pre-bracket frame.
    // Both are sized at full swapchain resolution and single-sample. Larger downsample
    // factors fill only a sub-rect via viewport-limited rendering (see file-header comment
    // on adaptive downsampling in the Gaussian blur section below).
    // downsample_texture — written by the downsample PSO. Read by the blur PSO in mode 0.
    // h_blur_texture — written by the blur PSO in mode 0. Read by the blur PSO in mode 1.
    source_texture: ^sdl.GPUTexture,
    downsample_texture: ^sdl.GPUTexture,
    h_blur_texture: ^sdl.GPUTexture,

    // Cached pixel dimensions for resize-detection in `ensure_backdrop_textures`.
    cached_width: u32,
    cached_height: u32,

    // Linear-clamp sampler used for all backdrop sampling. Linear filtering is required by the
    // linear-sampling pair trick (one bilinear fetch covers two adjacent texels). Clamp avoids
    // edge-bleed at the work-region boundary.
    sampler: ^sdl.GPUSampler,
}

@(private)
create_pipeline_2d_backdrop :: proc(
    device: ^sdl.GPUDevice,
    window: ^sdl.Window,
) -> (
    pipeline: Pipeline_2D_Backdrop,
    ok: bool,
) {
//INTERNAL
create_backdrop :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window) -> (pipeline: Backdrop, ok: bool) {
    // On failure, clean up any partially-created resources.
    defer if !ok {
        if pipeline.sampler != nil do sdl.ReleaseGPUSampler(device, pipeline.sampler)
@@ -307,10 +235,10 @@ create_pipeline_2d_backdrop :: proc(
        return pipeline, false
    }

    //----- Storage buffer for Backdrop_Primitive instances -------------
    //----- Storage buffer for Gaussian_Blur_Primitive instances -------------
    pipeline.primitive_buffer = create_buffer(
        device,
        size_of(Backdrop_Primitive) * BUFFER_INIT_SIZE,
        size_of(Gaussian_Blur_Primitive) * BUFFER_INIT_SIZE,
        sdl.GPUBufferUsageFlags{.GRAPHICS_STORAGE_READ},
    ) or_return

@@ -331,12 +259,12 @@ create_pipeline_2d_backdrop :: proc(
        return pipeline, false
    }

    log.debug("Done creating backdrop pipeline")
    log.debug("Done creating backdrop subsystem")
    return pipeline, true
}

@(private)
destroy_pipeline_2d_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Pipeline_2D_Backdrop) {
//INTERNAL
destroy_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Backdrop) {
    if pipeline.h_blur_texture != nil do sdl.ReleaseGPUTexture(device, pipeline.h_blur_texture)
    if pipeline.downsample_texture != nil do sdl.ReleaseGPUTexture(device, pipeline.downsample_texture)
    if pipeline.source_texture != nil do sdl.ReleaseGPUTexture(device, pipeline.source_texture)
@@ -346,20 +274,22 @@ destroy_pipeline_2d_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Pipeline
    if pipeline.downsample_pipeline != nil do sdl.ReleaseGPUGraphicsPipeline(device, pipeline.downsample_pipeline)
}

// ---------------------------------------------------------------------------------------------------------------------
// ----- Working texture management ----
// ---------------------------------------------------------------------------------------------------------------------
//----- Working texture management ----------------------------------

// Allocate (or reallocate, on resize) the three working textures that the backdrop bracket
// uses. All three are sized at full swapchain resolution, single-sample, share the swapchain
// format, and need {.COLOR_TARGET, .SAMPLER} usage so they can be written by render passes
// and read by subsequent passes.
//
// `source_texture` is shared infrastructure (used by every backdrop effect).
// `downsample_texture` and `h_blur_texture` are Gaussian-blur-specific intermediates; a
// future backdrop effect with no downsample/blur prep would skip them.
//
// Recreates on dimension change only — same-size frames hit the early-out and skip GPU
// resource churn.
@(private)
//INTERNAL
ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureFormat, width, height: u32) {
    pipeline := &GLOB.pipeline_2d_backdrop
    pipeline := &GLOB.backdrop
    if pipeline.source_texture != nil && pipeline.cached_width == width && pipeline.cached_height == height {
        return
    }
@@ -449,10 +379,138 @@ ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureF
    pipeline.cached_height = height
}

//----- Frame / layer scanners ----------------------------------

// Returns true if any sub-batch in any layer this frame is .Backdrop kind. Called once at the
// top of `end()` to decide whether to route the whole frame to source_texture.
// O(total sub-batches) but with an early-exit on the first hit, so typical cost is tiny.
//INTERNAL
frame_has_backdrop :: proc() -> bool {
    for &batch in GLOB.tmp_sub_batches {
        if batch.kind == .Backdrop do return true
    }
    return false
}

// Returns the absolute index of the first .Backdrop sub-batch in the layer's sub-batch range,
// or -1 if the layer has no backdrops. The index is into GLOB.tmp_sub_batches (not relative to
// layer.sub_batch_start), to match how draw_layer's render-range helpers consume it.
//INTERNAL
find_first_backdrop_in_layer :: proc(layer: ^Layer) -> int {
    for i in 0 ..< layer.sub_batch_len {
        abs_idx := layer.sub_batch_start + i
        if GLOB.tmp_sub_batches[abs_idx].kind == .Backdrop do return int(abs_idx)
    }
    return -1
}
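The absolute-index convention described for `find_first_backdrop_in_layer` can be illustrated with a standalone sketch. The `Sub_Batch_Kind`, `Sub_Batch`, and `Layer` definitions below are hypothetical stand-ins (the real ones are not shown in this diff), the field types are assumed to be `u32`, and the batch list is passed as a parameter instead of being read from `GLOB`.

```odin
package main

// Hypothetical stand-ins for types that this diff does not show.
Sub_Batch_Kind :: enum {
    Shapes,
    Backdrop,
}

Sub_Batch :: struct {
    kind: Sub_Batch_Kind,
}

Layer :: struct {
    sub_batch_start: u32, // assumed type
    sub_batch_len:   u32, // assumed type
}

// Same scan as in the diff, but over an explicit slice so the sketch is self-contained.
find_first_backdrop :: proc(batches: []Sub_Batch, layer: Layer) -> int {
    for i in 0 ..< layer.sub_batch_len {
        abs_idx := layer.sub_batch_start + i
        if batches[abs_idx].kind == .Backdrop do return int(abs_idx)
    }
    return -1
}

main :: proc() {
    batches := []Sub_Batch{
        {kind = .Shapes},
        {kind = .Shapes},
        {kind = .Shapes},
        {kind = .Backdrop},
        {kind = .Shapes},
    }
    first := Layer{sub_batch_start = 0, sub_batch_len = 2}
    second := Layer{sub_batch_start = 2, sub_batch_len = 3}

    assert(find_first_backdrop(batches, first) == -1)
    // The hit is the second entry of the layer's range, but the result is the
    // absolute index into the frame-wide list.
    assert(find_first_backdrop(batches, second) == 3)
}
```

Returning the absolute index means a caller can index the frame-wide list directly without adding `sub_batch_start` back in.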

// ---------------------------------------------------------------------------------------------------------------------
// ----- Kernel computation ------------
// ----- Gaussian blur ------------
// ---------------------------------------------------------------------------------------------------------------------

// Adaptive downsample design (Flutter-style).
//
// The bracket picks a downsample factor per-sigma-group, not as a global constant. The choice
// is driven by Flutter's `CalculateScale` formula in
// impeller/entity/contents/filters/gaussian_blur_filter_contents.cc (originally from Skia's
// GrBlurUtils): downsample so that the sigma in working-resolution pixels stays in the
// 2..4 range. This keeps the kernel reach wide enough to hide high-frequency artifacts from
// the bilinear upsample at the composite, while keeping the kernel's discrete tap count
// small (≤3σ reach → ≈12 paired taps).
//
// The full table, in physical pixels (sigma_logical * dpi_scaling):
//
// sigma_phys ≤ 4 → factor = 1 (no downsample; source is sampled directly)
// sigma_phys ≤ 8 → factor = 2
// sigma_phys > 8 → factor = 4 (capped)
//
// Capped at factor=4 to favor visual quality over bandwidth at the high end. Larger factors
// (8 and 16) would lose more high-frequency detail than the kernel can mask even with the
// H+V split, and the bandwidth saving is small (the work region also shrinks quadratically,
// so most of the savings are already captured at factor=4).
//
// Working textures are sized at full swapchain resolution to support factor=1. Larger factors
// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: full-res
// working textures (2 textures, RGBA8) is roughly 16 MB at 1080p, 64 MB at 4K. On modern
// GPUs this is well within budget; on Mali Valhall SBCs it's negligible against unified-
// memory headroom.
//
// The shaders read the factor as a uniform. The downsample shader has three paths (factor=1
// identity, factor=2 single bilinear tap, factor>=4 four bilinear taps with offsets scaling
// by factor/4). The V-composite mode of backdrop_blur.frag uses inv_downsample_factor to
// scale full-res frag coords down to working-res UV.
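The memory figures quoted in the comment above can be checked with a few lines of arithmetic. This is a sketch that assumes 4 bytes per texel (RGBA8) and counts only the two Gaussian blur working textures; `working_texture_bytes` is a hypothetical helper, not part of the commit.

```odin
package main

import "core:fmt"

// Bytes needed for `count` full-resolution textures at 4 bytes per texel (RGBA8 assumed).
working_texture_bytes :: proc(width, height, count: u64) -> u64 {
    return width * height * 4 * count
}

main :: proc() {
    fmt.println(working_texture_bytes(1920, 1080, 2)) // 16588800
    fmt.println(working_texture_bytes(3840, 2160, 2)) // 66355200
}
```

That is about 16.6 MB (15.8 MiB) at 1080p and about 66.4 MB (63.3 MiB) at 4K, consistent with the rounded numbers in the comment. If `source_texture` is counted as well, the totals grow by half.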

//----- GPU types ----------------------------------

// Maximum number of (weight, offset) pairs in a single blur kernel. Each pair represents
// the linear-sampling pair adjustment (one bilinear fetch covering two adjacent texels);
// pair[0] is the center weight with offset 0. With 32 pairs we cover up to 63 input texels
// (1 center + 31 paired symmetric taps × 2 texels each), enough for sigma values well past
// the 4..24 typical UI range. Must match MAX_KERNEL_PAIRS in shaders/source/backdrop_blur.frag.
//INTERNAL
MAX_GAUSSIAN_BLUR_KERNEL_PAIRS :: 32

// Gaussian_Blur_Primitive is the GPU-side per-primitive storage layout. Mirrors the GLSL std430
// struct in shaders/source/backdrop_blur.vert. Field order is chosen so std430 alignment
// rules pack the struct to a clean 48-byte natural layout (no implicit padding): vec4
// members come first (16-byte aligned at any offset), then vec2, then scalars. The total is
// a multiple of 16 so the std430 array stride matches size_of(...) exactly.
//
// Gaussian blur primitives are RRect-only: rectangles, rounded rectangles, and circles
// (via uniform_radii) are all expressible. Rotation is intentionally omitted — backdrop
// sampling is in screen space, so a rotated mask over a stationary blur sample would look
// visually wrong. iOS, CSS backdrop-filter, and Flutter BackdropFilter all enforce this
// implicitly; we enforce it explicitly by leaving no rotation field.
//
// Outline is also intentionally omitted. A specialized edge effect (e.g. liquid-glass-style
// refraction outlines) would be implemented as a dedicated primitive type with its own
// pipeline rather than tacked onto this one as a flag bit.
//INTERNAL
Gaussian_Blur_Primitive :: struct {
    bounds: [4]f32, // 0: 16 — world-space quad (min_xy, max_xy)
    radii: [4]f32, // 16: 16 — per-corner radii in physical pixels (BR, TR, BL, TL)
    half_size: [2]f32, // 32: 8 — RRect half extents (physical px)
    half_feather: f32, // 40: 4 — feather_px * 0.5 (SDF anti-aliasing)
    color: Color, // 44: 4 — tint, packed RGBA u8x4
}
#assert(size_of(Gaussian_Blur_Primitive) == 48)
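The per-field offsets annotated in the struct comments could also be pinned down at compile time, in the same way the size already is. The sketch below is an illustration, not part of the commit; it assumes `Color` is a 4-byte packed type (modeled here as `[4]u8`, which the diff does not confirm).

```odin
package main

// Assumption: Color is a packed RGBA u8x4, as the field comment describes.
Color :: [4]u8

Gaussian_Blur_Primitive :: struct {
    bounds:       [4]f32,
    radii:        [4]f32,
    half_size:    [2]f32,
    half_feather: f32,
    color:        Color,
}

// Offsets as annotated in the diff's field comments.
#assert(offset_of(Gaussian_Blur_Primitive, bounds) == 0)
#assert(offset_of(Gaussian_Blur_Primitive, radii) == 16)
#assert(offset_of(Gaussian_Blur_Primitive, half_size) == 32)
#assert(offset_of(Gaussian_Blur_Primitive, half_feather) == 40)
#assert(offset_of(Gaussian_Blur_Primitive, color) == 44)

// Total size, and the multiple-of-16 property the comment relies on for the array stride.
#assert(size_of(Gaussian_Blur_Primitive) == 48)
#assert(size_of(Gaussian_Blur_Primitive) % 16 == 0)

main :: proc() {}
```

With checks like these, reordering or resizing a field fails the build instead of silently drifting from the GLSL-side struct.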

// Vertex uniforms for the unified blur PSO (mode 0 = H-blur, mode 1 = V-composite).
// Matches the GLSL Uniforms block in shaders/source/backdrop_blur.vert. The downsample
// PSO has no vertex uniforms.
//INTERNAL
Gaussian_Blur_Vert_Uniforms :: struct {
    projection: matrix[4, 4]f32, // 0: 64 — screen-space ortho (mode 1 only; mode 0 ignores)
    dpi_scale: f32, // 64: 4
    mode: u32, // 68: 4 — 0 = H-blur fullscreen tri; 1 = V-composite instanced quads
    _pad0: [2]f32, // 72: 8 — std140 vec4 alignment pad
}

// Fragment uniforms for the downsample PSO. Matches Uniforms block in
// shaders/source/backdrop_downsample.frag.
//INTERNAL
Gaussian_Blur_Downsample_Frag_Uniforms :: struct {
    inv_source_size: [2]f32, // 0: 8 — 1.0 / source_texture pixel dimensions (full-res)
    downsample_factor: u32, // 8: 4 — 1, 2, or 4 (selects identity / 1-tap / 4-tap path in shader)
    _pad0: u32, // 12: 4
}

// Fragment uniforms for the unified blur PSO (mode 0 + mode 1). Matches the GLSL Uniforms
// block in shaders/source/backdrop_blur.frag. The kernel array holds the linear-sampling
// pair coefficients computed CPU-side via `compute_blur_kernel`.
//INTERNAL
Gaussian_Blur_Frag_Uniforms :: struct {
    inv_working_size: [2]f32, // 0: 8 — 1.0 / working-resolution texture dimensions
    pair_count: u32, // 8: 4 — number of (weight, offset) pairs; pair[0] is center
    mode: u32, // 12: 4 — 0 = H-blur, 1 = V-composite (must match vert mode)
    direction: [2]f32, // 16: 8 — (1,0) for H-blur, (0,1) for V-composite
    inv_downsample_factor: f32, // 24: 4 — 1.0 / downsample_factor (mode 1 only; mode 0 ignores)
    _pad0: f32, // 28: 4
    kernel: [MAX_GAUSSIAN_BLUR_KERNEL_PAIRS][4]f32, // 32: 512 — .x = weight, .y = offset (texels)
}

//----- Kernel computation ----------------------------------

// Compute Gaussian blur kernel weights with the linear-sampling pair adjustment.
// Adapted from RAD Debugger's r_d3d11_g_blur_shader_src CPU-side coefficient generation
// and Daniel Rákos's "Efficient Gaussian blur with linear sampling" article.
@@ -480,18 +538,23 @@ ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureF
// from it, rather than the inverse (RAD Debugger's algorithm passes a tap count and derives
// `stdev = (blur_count-1)/2`). Taking σ directly matches what callers expect when they read
// "gaussian_sigma" — passing tap count under that name was a footgun.
@(private)
compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f32) -> (pair_count: u32) {
//INTERNAL
compute_blur_kernel :: proc(
    sigma: f32,
    kernel: ^[MAX_GAUSSIAN_BLUR_KERNEL_PAIRS][4]f32,
) -> (
    pair_count: u32,
) {
    if sigma <= 0 {
        kernel[0] = {1, 0, 0, 0}
        return 1
    }

    // Per-side discrete tap count: ceil(3*sigma) + 1 (center + 3σ reach on each side).
    // Cap at the storage budget. With MAX_BACKDROP_KERNEL_PAIRS=32 each pair collapses 2
    // Cap at the storage budget. With MAX_GAUSSIAN_BLUR_KERNEL_PAIRS=32 each pair collapses 2
    // discrete taps via linear-sampling, so max discrete taps per side = 1 + 31*2 = 63.
    discrete_taps := u32(math.ceil(3 * sigma)) + 1
    max_taps := u32(MAX_BACKDROP_KERNEL_PAIRS - 1) * 2 + 1
    max_taps := u32(MAX_GAUSSIAN_BLUR_KERNEL_PAIRS - 1) * 2 + 1
    if discrete_taps > max_taps do discrete_taps = max_taps
    if discrete_taps < 2 {
        // Sigma was so small that 3σ < 1 texel; degenerate to a sharp sample.
@@ -501,7 +564,7 @@ compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f

    // Compute discrete weights[i] = exp(-i² / (2σ²)). The inv_root prefactor cancels in the
    // final normalization, so we skip it.
    weights: [MAX_BACKDROP_KERNEL_PAIRS * 2]f32 = {}
    weights: [MAX_GAUSSIAN_BLUR_KERNEL_PAIRS * 2]f32 = {}
    two_sigma_sq := 2 * sigma * sigma
    total: f32 = 0
    for i in 0 ..< discrete_taps {
@@ -535,38 +598,9 @@ compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f
    return pair_count
}
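Most of the body of `compute_blur_kernel` is elided by the diff hunks, so the following is only a sketch of the linear-sampling technique the comments describe (after Rákos's article), not the commit's actual code. `MAX_PAIRS`, `linear_sampling_pairs`, the two-component pair layout, and the normalization convention (center counted once, every other tap counted on both sides) are all assumptions made for illustration.

```odin
package main

import "core:fmt"
import "core:math"

MAX_PAIRS :: 32 // assumed to play the role of MAX_GAUSSIAN_BLUR_KERNEL_PAIRS

// Hypothetical sketch: fills `pairs` with (weight, offset) entries for one side of a
// symmetric kernel. pairs[0] is the center tap at offset 0; every later pair merges
// discrete taps i and i+1 into one bilinear fetch.
linear_sampling_pairs :: proc(sigma: f32, pairs: ^[MAX_PAIRS][2]f32) -> (count: u32) {
    if sigma <= 0 {
        pairs[0] = {1, 0}
        return 1
    }

    // Center plus a 3-sigma reach, capped so the merged pairs fit the array.
    taps := u32(math.ceil(3 * sigma)) + 1
    taps = min(taps, u32(MAX_PAIRS - 1) * 2 + 1)

    // Unnormalized Gaussian weights; the constant prefactor cancels on normalization.
    weights: [MAX_PAIRS * 2]f32
    total: f32 = 0
    for i in 0 ..< taps {
        x := f32(i)
        weights[i] = math.exp(-(x * x) / (2 * sigma * sigma))
        // The center is used once, every other tap on both sides of it.
        total += weights[i] if i == 0 else 2 * weights[i]
    }
    for i in 0 ..< taps do weights[i] /= total

    pairs[0] = {weights[0], 0}
    count = 1
    for i := u32(1); i < taps; i += 2 {
        w0 := weights[i]
        w1 := weights[i + 1] if i + 1 < taps else 0
        w := w0 + w1
        // Weighted-average offset: a bilinear fetch there returns w0*t[i] + w1*t[i+1]
        // once it is scaled by the combined weight w.
        pairs[count] = {w, (f32(i) * w0 + f32(i + 1) * w1) / w}
        count += 1
    }
    return count
}

main :: proc() {
    pairs: [MAX_PAIRS][2]f32
    count := linear_sampling_pairs(4, &pairs)
    sum := pairs[0].x
    for i in 1 ..< count do sum += 2 * pairs[i].x
    fmt.println("pairs:", count, "weight sum:", sum)
}
```

For sigma = 4 this yields 13 discrete taps per side, merged into 7 pairs (the center plus six bilinear fetches), and the weights sum to approximately 1 once both sides are counted.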

// ---------------------------------------------------------------------------------------------------------------------
// ----- Uniform push helpers ----------
// ---------------------------------------------------------------------------------------------------------------------

// Push the Backdrop_Vert_Uniforms block to the vertex stage at slot 0.
@(private)
push_backdrop_vert_globals :: proc(cmd_buffer: ^sdl.GPUCommandBuffer, width: f32, height: f32, mode: u32) {
    uniforms := Backdrop_Vert_Uniforms {
        projection = ortho_rh(left = 0.0, top = 0.0, right = width, bottom = height, near = -1.0, far = 1.0),
        dpi_scale = GLOB.dpi_scaling,
        mode = mode,
    }
    sdl.PushGPUVertexUniformData(cmd_buffer, 0, &uniforms, size_of(Backdrop_Vert_Uniforms))
}

// Push the Backdrop_Downsample_Frag_Uniforms block to the fragment stage at slot 0.
@(private)
push_backdrop_downsample_frag_globals :: proc(
    cmd_buffer: ^sdl.GPUCommandBuffer,
    source_width, source_height: u32,
    downsample_factor: u32,
) {
    uniforms := Backdrop_Downsample_Frag_Uniforms {
        inv_source_size = {1.0 / f32(source_width), 1.0 / f32(source_height)},
        downsample_factor = downsample_factor,
    }
    sdl.PushGPUFragmentUniformData(cmd_buffer, 0, &uniforms, size_of(Backdrop_Downsample_Frag_Uniforms))
}

// Pick a downsample factor for a given sigma. See the file-header comment for the table and
// rationale. Returned values: {1, 2, 4}.
@(private)
//INTERNAL
compute_backdrop_downsample_factor :: proc(sigma_logical: f32) -> u32 {
    sigma_phys := sigma_logical * GLOB.dpi_scaling
    switch {
@@ -576,80 +610,76 @@ compute_backdrop_downsample_factor :: proc(sigma_logical: f32) -> u32 {
    }
}
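The `switch` body of `compute_backdrop_downsample_factor` falls outside the diff hunks, so the table from the adaptive-downsample comment is sketched here as a standalone function. `pick_downsample_factor` is a hypothetical name, and it takes sigma already in physical pixels so that it does not depend on `GLOB.dpi_scaling`.

```odin
package main

import "core:fmt"

// Hypothetical standalone version of the documented table:
//   sigma_phys <= 4 -> 1, sigma_phys <= 8 -> 2, otherwise 4 (capped).
pick_downsample_factor :: proc(sigma_phys: f32) -> u32 {
    switch {
    case sigma_phys <= 4:
        return 1
    case sigma_phys <= 8:
        return 2
    }
    return 4
}

main :: proc() {
    sigmas := [?]f32{2, 4, 6, 8, 12, 24}
    for sigma in sigmas {
        factor := pick_downsample_factor(sigma)
        // Sigma as the blur kernel sees it, in working-resolution pixels.
        working_sigma := sigma / f32(factor)
        fmt.println("sigma_phys:", sigma, "factor:", factor, "working sigma:", working_sigma)
    }
}
```

Dividing by the factor shows where the 2..4 working-resolution target holds: a physical sigma of 6 or 8 maps to 3 or 4, while at the cap a physical sigma of 24 still leaves a working sigma of 6, which is the quality-over-bandwidth trade-off the comment describes.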

// Push the Backdrop_Frag_Uniforms block (kernel + pass mode/direction) to the fragment stage at slot 0.
@(private)
push_backdrop_blur_frag_globals :: proc(
    cmd_buffer: ^sdl.GPUCommandBuffer,
    uniforms: ^Backdrop_Frag_Uniforms,
) {
    sdl.PushGPUFragmentUniformData(cmd_buffer, 0, uniforms, size_of(Backdrop_Frag_Uniforms))
//----- Uniform push helpers ----------------------------------

// Push the Gaussian_Blur_Vert_Uniforms block to the vertex stage at slot 0.
//INTERNAL
push_backdrop_vert_globals :: proc(cmd_buffer: ^sdl.GPUCommandBuffer, width: f32, height: f32, mode: u32) {
    uniforms := Gaussian_Blur_Vert_Uniforms {
        projection = ortho_rh(left = 0.0, top = 0.0, right = width, bottom = height, near = -1.0, far = 1.0),
        dpi_scale = GLOB.dpi_scaling,
        mode = mode,
    }
    sdl.PushGPUVertexUniformData(cmd_buffer, 0, &uniforms, size_of(Gaussian_Blur_Vert_Uniforms))
}

// ---------------------------------------------------------------------------------------------------------------------
// ----- Storage-buffer upload ---------
// ---------------------------------------------------------------------------------------------------------------------
// Push the Gaussian_Blur_Downsample_Frag_Uniforms block to the fragment stage at slot 0.
//INTERNAL
push_backdrop_downsample_frag_globals :: proc(
    cmd_buffer: ^sdl.GPUCommandBuffer,
    source_width, source_height: u32,
    downsample_factor: u32,
) {
    uniforms := Gaussian_Blur_Downsample_Frag_Uniforms {
        inv_source_size = {1.0 / f32(source_width), 1.0 / f32(source_height)},
        downsample_factor = downsample_factor,
    }
    sdl.PushGPUFragmentUniformData(cmd_buffer, 0, &uniforms, size_of(Gaussian_Blur_Downsample_Frag_Uniforms))
}

// Upload all Backdrop_Primitive instances staged this frame to the backdrop pipeline's storage
// buffer. Mirrors the SDF primitive upload in pipeline_2d_base.odin's `upload`. Called from
// Push the Gaussian_Blur_Frag_Uniforms block (kernel + pass mode/direction) to the fragment stage at slot 0.
//INTERNAL
push_backdrop_blur_frag_globals :: proc(
    cmd_buffer: ^sdl.GPUCommandBuffer,
    uniforms: ^Gaussian_Blur_Frag_Uniforms,
) {
    sdl.PushGPUFragmentUniformData(cmd_buffer, 0, uniforms, size_of(Gaussian_Blur_Frag_Uniforms))
}

//----- Storage-buffer upload ----------------------------------

// Upload all Gaussian_Blur_Primitive instances staged this frame to the backdrop subsystem's storage
// buffer. Mirrors the SDF primitive upload in core_2d.odin's `upload`. Called from
// `end()` inside the same copy pass that uploads vertices/indices/SDF primitives.
@(private)
//INTERNAL
upload_backdrop_primitives :: proc(device: ^sdl.GPUDevice, pass: ^sdl.GPUCopyPass) {
    prim_count := u32(len(GLOB.tmp_backdrop_primitives))
    prim_count := u32(len(GLOB.tmp_gaussian_blur_primitives))
    if prim_count == 0 do return

    prim_size := prim_count * size_of(Backdrop_Primitive)
    prim_size := prim_count * size_of(Gaussian_Blur_Primitive)
    grow_buffer_if_needed(
        device,
        &GLOB.pipeline_2d_backdrop.primitive_buffer,
        &GLOB.backdrop.primitive_buffer,
        prim_size,
        sdl.GPUBufferUsageFlags{.GRAPHICS_STORAGE_READ},
    )

    prim_array := sdl.MapGPUTransferBuffer(device, GLOB.pipeline_2d_backdrop.primitive_buffer.transfer, false)
    prim_array := sdl.MapGPUTransferBuffer(device, GLOB.backdrop.primitive_buffer.transfer, false)
    if prim_array == nil {
        log.panicf("Failed to map backdrop primitive transfer buffer: %s", sdl.GetError())
    }
    mem.copy(prim_array, raw_data(GLOB.tmp_backdrop_primitives), int(prim_size))
    sdl.UnmapGPUTransferBuffer(device, GLOB.pipeline_2d_backdrop.primitive_buffer.transfer)
    mem.copy(prim_array, raw_data(GLOB.tmp_gaussian_blur_primitives), int(prim_size))
    sdl.UnmapGPUTransferBuffer(device, GLOB.backdrop.primitive_buffer.transfer)

    sdl.UploadToGPUBuffer(
        pass,
        sdl.GPUTransferBufferLocation{transfer_buffer = GLOB.pipeline_2d_backdrop.primitive_buffer.transfer},
        sdl.GPUBufferRegion{buffer = GLOB.pipeline_2d_backdrop.primitive_buffer.gpu, offset = 0, size = prim_size},
        sdl.GPUTransferBufferLocation{transfer_buffer = GLOB.backdrop.primitive_buffer.transfer},
        sdl.GPUBufferRegion{buffer = GLOB.backdrop.primitive_buffer.gpu, offset = 0, size = prim_size},
        false,
    )
}
|
||||
|
||||
// ---------------------------------------------------------------------------------------------------------------------
|
||||
// ----- Frame / layer scanners --------
// ---------------------------------------------------------------------------------------------------------------------

// Returns true if any sub-batch in any layer this frame is .Backdrop kind. Called once at the
// top of `end()` to decide whether to route the whole frame to source_texture.
// O(total sub-batches) but with an early-exit on the first hit, so typical cost is tiny.
@(private)
frame_has_backdrop :: proc() -> bool {
	for &batch in GLOB.tmp_sub_batches {
		if batch.kind == .Backdrop do return true
	}
	return false
}

// Returns the absolute index of the first .Backdrop sub-batch in the layer's sub-batch range,
// or -1 if the layer has no backdrops. The index is into GLOB.tmp_sub_batches (not relative to
// layer.sub_batch_start), to match how draw_layer's render-range helpers consume it.
@(private)
find_first_backdrop_in_layer :: proc(layer: ^Layer) -> int {
	for i in 0 ..< layer.sub_batch_len {
		abs_idx := layer.sub_batch_start + i
		if GLOB.tmp_sub_batches[abs_idx].kind == .Backdrop do return int(abs_idx)
	}
	return -1
}

// ---------------------------------------------------------------------------------------------------------------------
// ----- Bracket scheduler -------------
// ---------------------------------------------------------------------------------------------------------------------
//----- Bracket scheduler ----------------------------------

// Compute the union AABB of the backdrop primitives in a contiguous-same-sigma sub-batch run
// (one "sigma group"), expanded by 6 sigmas of blur reach (the kernel weight beyond 3σ is
@@ -661,7 +691,7 @@ find_first_backdrop_in_layer :: proc(layer: ^Layer) -> int {
// Per-group (rather than per-layer) because the adaptive downsample picks a different factor
// per sigma, and the kernel reach is also per-sigma. A tighter region per group means less
// fragment work in the downsample and H-blur passes.
@(private)
//INTERNAL
compute_backdrop_group_work_region :: proc(
	group_start, group_end: u32,
	sigma_logical: f32,
@@ -680,7 +710,7 @@ compute_backdrop_group_work_region :: proc(
		batch := GLOB.tmp_sub_batches[i]
		if batch.kind != .Backdrop do continue
		for p in batch.offset ..< batch.offset + batch.count {
			prim := GLOB.tmp_backdrop_primitives[p]
			prim := GLOB.tmp_gaussian_blur_primitives[p]
			// prim.bounds is in logical pixels (world space).
			if !has_any {
				min_x = prim.bounds[0]
@@ -751,13 +781,13 @@ compute_backdrop_group_work_region :: proc(
// On exit, source_texture contains the pre-bracket contents plus all backdrop primitives
// composited on top. The caller then runs Pass B (post-bracket non-backdrop sub-batches) on
// source_texture with LOAD.
@(private)
//INTERNAL
run_backdrop_bracket :: proc(
	cmd_buffer: ^sdl.GPUCommandBuffer,
	layer: ^Layer,
	swapchain_width, swapchain_height: u32,
) {
	pipeline := &GLOB.pipeline_2d_backdrop
	pipeline := &GLOB.backdrop

	full_viewport := sdl.GPUViewport {
		x = 0,
@@ -852,7 +882,7 @@ run_backdrop_bracket :: proc(
		// Convert the user's logical-pixel sigma into the kernel's working space.
		// sigma_working_texels = sigma_logical * dpi_scaling / downsample_factor.
		effective_sigma := sigma * GLOB.dpi_scaling / f32(downsample_factor)
		frag_uniforms := Backdrop_Frag_Uniforms {
		frag_uniforms := Gaussian_Blur_Frag_Uniforms {
			inv_working_size = inv_working_size,
			inv_downsample_factor = 1.0 / f32(downsample_factor),
		}
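Review note: a minimal sketch of the sigma conversion in the hunk above, for readers following the math. The procedure names `pick_downsample_factor` and `working_sigma` are hypothetical and do not appear in this commit. The thresholds (4 and 8 physical pixels, factor capped at 4) are taken from the header comment this commit removes, so treating them as still current is an assumption.

```odin
package main

import "core:fmt"

// Hypothetical helper. Thresholds follow the table in the removed header comment:
// sigma_phys <= 4 -> 1, sigma_phys <= 8 -> 2, otherwise 4 (capped).
pick_downsample_factor :: proc(sigma_phys: f32) -> u32 {
	if sigma_phys <= 4 do return 1
	if sigma_phys <= 8 do return 2
	return 4
}

// Hypothetical helper. Same arithmetic as
// `effective_sigma := sigma * GLOB.dpi_scaling / f32(downsample_factor)`,
// with dpi_scaling passed in instead of read from GLOB.
working_sigma :: proc(sigma_logical, dpi_scaling: f32) -> (sigma_working: f32, factor: u32) {
	sigma_phys := sigma_logical * dpi_scaling
	factor = pick_downsample_factor(sigma_phys)
	sigma_working = sigma_phys / f32(factor)
	return
}

main :: proc() {
	sigmas := [?]f32{2, 6, 20}
	for sigma in sigmas {
		sw, factor := working_sigma(sigma, 2.0)
		fmt.printf("sigma_logical=%v factor=%v sigma_working=%v\n", sigma, factor, sw)
	}
}
```

Because the factor is capped at 4, very large sigmas leave the 2..4 working range described in the old header comment; the sketch reproduces that behavior rather than hiding it.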
@@ -1002,24 +1032,20 @@ run_backdrop_bracket :: proc(
	}
}

// ---------------------------------------------------------------------------------------------------------------------
// ----- Primitive builders ------------
// ---------------------------------------------------------------------------------------------------------------------
//----- Primitive builders ----------------------------------

// Internal
//
// Build a Backdrop_Primitive with bounds, radii, and feather computed from rectangle
// Build a Gaussian_Blur_Primitive with bounds, radii, and feather computed from rectangle
// geometry. The caller sets `color` (tint) on the returned primitive before submitting.
//
// No rotation, no outline — backdrop primitives are intentionally limited to axis-aligned
// No rotation, no outline — gaussian blur primitives are intentionally limited to axis-aligned
// RRects. Rotation breaks screen-space blur sampling visually; outline would be a specialized
// edge effect that belongs in its own primitive type.
@(private)
//INTERNAL
build_backdrop_primitive :: proc(
	rect: Rectangle,
	radii: Rectangle_Radii,
	feather_px: f32,
) -> Backdrop_Primitive {
) -> Gaussian_Blur_Primitive {
	max_radius := min(rect.width, rect.height) * 0.5
	clamped_top_left := clamp(radii.top_left, 0, max_radius)
	clamped_top_right := clamp(radii.top_right, 0, max_radius)
@@ -1035,7 +1061,7 @@ build_backdrop_primitive :: proc(
	center_x := rect.x + half_width
	center_y := rect.y + half_height

	return Backdrop_Primitive {
	return Gaussian_Blur_Primitive {
		bounds = {
			center_x - half_width - padding,
			center_y - half_height - padding,
@@ -1057,13 +1083,13 @@ build_backdrop_primitive :: proc(
	}
}

// Internal — append a Backdrop_Primitive to the staging array and emit a .Backdrop sub-batch
// Append a Gaussian_Blur_Primitive to the staging array and emit a .Backdrop sub-batch
// carrying the requested gaussian_sigma. Sub-batch coalescing in append_or_extend_sub_batch
// will merge contiguous backdrops that share a sigma into a single instanced draw.
@(private)
prepare_backdrop_primitive :: proc(layer: ^Layer, prim: Backdrop_Primitive, gaussian_sigma: f32) {
	offset := u32(len(GLOB.tmp_backdrop_primitives))
	append(&GLOB.tmp_backdrop_primitives, prim)
//INTERNAL
prepare_backdrop_primitive :: proc(layer: ^Layer, prim: Gaussian_Blur_Primitive, gaussian_sigma: f32) {
	offset := u32(len(GLOB.tmp_gaussian_blur_primitives))
	append(&GLOB.tmp_gaussian_blur_primitives, prim)
	scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1]
	append_or_extend_sub_batch(
		scissor,
@@ -1075,9 +1101,7 @@ prepare_backdrop_primitive :: proc(layer: ^Layer, prim: Backdrop_Primitive, gaus
	)
}

// ---------------------------------------------------------------------------------------------------------------------
// ----- Public API --------------------
// ---------------------------------------------------------------------------------------------------------------------
//----- Public API ----------------------------------

// Draw a rectangle whose interior samples a Gaussian-blurred snapshot of the framebuffer
// behind it. RRect-only — covers rectangles, rounded rectangles, and circles via