Organization & cleanup

Zachary Levy
2026-04-30 16:52:55 -07:00
parent 16989cbb71
commit fd64bc01bf
13 changed files with 269 additions and 258 deletions
+33 -32
@@ -21,16 +21,16 @@ import sdl "vendor:sdl3"
// sigma_phys ≤ 8 → factor = 2
// sigma_phys > 8 → factor = 4 (capped)
//
// Capped at factor=4: master's preference for visual quality over bandwidth at the high end.
// Larger factors (8 and 16) would lose more high-frequency detail than the kernel can mask
// even with the H+V split, and the bandwidth saving is small (the work region also shrinks
// quadratically, so most of the savings are already captured at factor=4).
// Capped at factor=4 to favor visual quality over bandwidth at the high end. Larger factors
// (8 and 16) would lose more high-frequency detail than the kernel can mask even with the
// H+V split, and the bandwidth saving is small (the work region also shrinks quadratically,
// so most of the savings are already captured at factor=4).
//
// Working textures are sized at full swapchain resolution to support factor=1. Larger factors
// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: ½-res → full-
// res working textures means 4× more bytes per working texture (2 textures, RGBA8: roughly
// 16 MB at 1080p, 64 MB at 4K). On modern GPUs this is well within budget; on Mali Valhall
// SBCs it's negligible against unified-memory headroom.
// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: full-res
// working textures (2 textures, RGBA8) is roughly 16 MB at 1080p, 64 MB at 4K. On modern
// GPUs this is well within budget; on Mali Valhall SBCs it's negligible against unified-
// memory headroom.
//
// The shaders read the factor as a uniform. The downsample shader has three paths (factor=1
// identity, factor=2 single bilinear tap, factor>=4 four bilinear taps with offsets scaling
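Note on the memory figures quoted in the header comment of this hunk: they can be checked with a few lines of arithmetic. The sketch below is not part of the commit. It assumes 4 bytes per texel (RGBA8), no mip levels, and no driver padding; `working_texture_bytes` is a hypothetical helper name.

```odin
package main

import "core:fmt"

// Bytes per texel for an RGBA8 target (4 channels x 1 byte). Assumption: no padding.
BYTES_PER_TEXEL :: 4

// Total bytes for `count` single-sample textures of the given pixel size.
working_texture_bytes :: proc(width, height, count: u64) -> u64 {
	return width * height * BYTES_PER_TEXEL * count
}

main :: proc() {
	// 1080p: 1920 * 1080 * 4 * 2 = 16_588_800 bytes (about 16 MB)
	fmt.println(working_texture_bytes(1920, 1080, 2))
	// 4K:    3840 * 2160 * 4 * 2 = 66_355_200 bytes (about 64 MB)
	fmt.println(working_texture_bytes(3840, 2160, 2))
}
```

Both results agree with the "roughly 16 MB at 1080p, 64 MB at 4K" estimate in the comment.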
@@ -86,7 +86,7 @@ Backdrop_Vert_Uniforms :: struct {
// shaders/source/backdrop_downsample.frag.
Backdrop_Downsample_Frag_Uniforms :: struct {
inv_source_size: [2]f32, // 0: 8 — 1.0 / source_texture pixel dimensions (full-res)
downsample_factor: u32, // 8: 4 — 2 or 4 (selects 1-tap vs 4-tap path in shader)
downsample_factor: u32, // 8: 4 — 1, 2, or 4 (selects identity / 1-tap / 4-tap path in shader)
_pad0: u32, // 12: 4
}
@@ -120,11 +120,12 @@ Pipeline_2D_Backdrop :: struct {
primitive_buffer: Buffer,
// Working textures, allocated once at swapchain resolution and recreated only on resize.
// `source_texture` is full-resolution; the other two are ¼-res. All single-sample.
// All three are sized at full swapchain resolution and single-sample. Larger downsample
// factors fill only a sub-rect via viewport-limited rendering (see file-header comment).
// source_texture — when any backdrop draw exists this frame, the entire frame renders
// here instead of the swapchain (Approach B). Copied to the swapchain
// at frame end. Acts as the bracket's snapshot input by virtue of
// already containing the pre-bracket frame.
// here instead of the swapchain. Copied to the swapchain at frame
// end. Acts as the bracket's snapshot input by virtue of already
// containing the pre-bracket frame.
// downsample_texture — written by the downsample PSO. Read by the blur PSO in mode 0.
// h_blur_texture — written by the blur PSO in mode 0. Read by the blur PSO in mode 1.
source_texture: ^sdl.GPUTexture,
@@ -243,7 +244,7 @@ create_pipeline_2d_backdrop :: proc(
//----- Downsample PSO ----------------------------------
// Single bilinear sample, blend disabled. No vertex buffer (gl_VertexIndex 0..2 emits the
// fullscreen triangle). Single-sample target (the ¼-res working textures are never MSAA).
// fullscreen triangle). Single-sample target (working textures are never MSAA).
downsample_target := sdl.GPUColorTargetDescription {
format = swapchain_format,
blend_state = sdl.GPUColorTargetBlendState{enable_blend = false},
@@ -350,9 +351,9 @@ destroy_pipeline_2d_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Pipeline
// ---------------------------------------------------------------------------------------------------------------------
// Allocate (or reallocate, on resize) the three working textures that the backdrop bracket
// uses. `source_texture` is full swapchain resolution; the other two are ¼-res. All single-
// sample, all share the swapchain format, all need {.COLOR_TARGET, .SAMPLER} usage so they
// can be written by render passes and read by subsequent passes.
// uses. All three are sized at full swapchain resolution, single-sample, share the swapchain
// format, and need {.COLOR_TARGET, .SAMPLER} usage so they can be written by render passes
// and read by subsequent passes.
//
// Recreates on dimension change only — same-size frames hit the early-out and skip GPU
// resource churn.
@@ -466,19 +467,19 @@ ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureF
// `i in [1, pair_count)` and does two texture fetches per pair — one at +offset, one at
// -offset — for a total of 1 + 2*(pair_count-1) bilinear fetches per fragment.
//
// `sigma` is the true Gaussian standard deviation in the kernel's working-space units (¼-res
// texels, after the caller has converted from logical pixels via dpi_scaling and the
// downsample factor). The kernel extent reaches ±3σ, capturing 99.7% of the Gaussian's
// `sigma` is the true Gaussian standard deviation in the kernel's working-space units
// (working-resolution texels, after the caller has converted from logical pixels via
// dpi_scaling and the downsample factor). The kernel extent reaches ±3σ, capturing 99.7% of
// the Gaussian's
// mass; weights beyond that contribute imperceptibly. sigma <= 0 produces a degenerate
// kernel `{1, 0}` that acts as a sharp pass-through. After the loop, the discrete weights
// are normalized so they sum to 1.0 (truncating at ±3σ loses a tiny amount of mass; we
// renormalize to preserve overall image brightness).
//
// Earlier versions of this routine ported RAD Debugger's algorithm verbatim, which derives
// stdev from a tap-count parameter (`stdev = (blur_count-1)/2`). That made the parameter
// name misleading: the user thought they were passing σ but were actually passing
// half-kernel-width. This version takes σ directly and derives the tap count from it,
// matching what callers expect when they read "gaussian_sigma".
// Note on the parameter contract: this routine takes σ directly and derives the tap count
// from it, rather than the inverse (RAD Debugger's algorithm passes a tap count and derives
// `stdev = (blur_count-1)/2`). Taking σ directly matches what callers expect when they read
// "gaussian_sigma" — passing tap count under that name was a footgun.
@(private)
compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f32) -> (pair_count: u32) {
if sigma <= 0 {
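Note on the kernel contract described in this hunk (σ in, tap count out, truncation at ±3σ, renormalization): a simplified sketch may make it concrete. It is not the commit's `compute_blur_kernel`. It stores plain one-sided weights instead of the `[4]f32` pair layout, does not merge taps into bilinear fetches, and `MAX_TAPS` and `gaussian_weights` are hypothetical names.

```odin
package main

import "core:fmt"
import "core:math"

// Hypothetical cap for this sketch; unrelated to MAX_BACKDROP_KERNEL_PAIRS.
MAX_TAPS :: 32

// Fills weights[0..radius] with one-sided Gaussian weights (index 0 is the center tap)
// and returns the number of entries written. The weights are normalized so that
// center + 2 * (sum of off-center weights) equals 1.
gaussian_weights :: proc(sigma: f32, weights: ^[MAX_TAPS]f32) -> (count: int) {
	if sigma <= 0 {
		// Degenerate kernel: sharp pass-through.
		weights[0] = 1
		return 1
	}
	// Truncate at 3 sigma, clamped to the storage available.
	radius := min(int(math.ceil(3 * sigma)), MAX_TAPS - 1)
	total: f32 = 0
	for i in 0 ..= radius {
		x := f32(i)
		w := math.exp(-(x * x) / (2 * sigma * sigma))
		weights[i] = w
		// Off-center taps are applied at both +i and -i.
		total += w if i == 0 else 2 * w
	}
	// Renormalize to recover the mass lost by truncation.
	for i in 0 ..= radius {
		weights[i] /= total
	}
	return radius + 1
}

main :: proc() {
	weights: [MAX_TAPS]f32
	count := gaussian_weights(2.0, &weights)
	fmt.println(count) // sigma = 2 gives radius 6, so 7 one-sided weights
}
```

Taking σ as the input and deriving the radius from it keeps the parameter name honest, which is the point the comment in this hunk makes about the contract.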
@@ -624,7 +625,7 @@ upload_backdrop_primitives :: proc(device: ^sdl.GPUDevice, pass: ^sdl.GPUCopyPas
// ---------------------------------------------------------------------------------------------------------------------
// Returns true if any sub-batch in any layer this frame is .Backdrop kind. Called once at the
// top of `end()` to decide whether to route the whole frame to source_texture (Approach B).
// top of `end()` to decide whether to route the whole frame to source_texture.
// O(total sub-batches) but with an early-exit on the first hit, so typical cost is tiny.
@(private)
frame_has_backdrop :: proc() -> bool {
@@ -742,10 +743,10 @@ compute_backdrop_group_work_region :: proc(
// target viewport, per-primitive SDF discard handles masking and applies the tint. Each
// sub-batch in the group is one instanced draw.
//
// V-blur was historically combined with the composite into a single shader invocation, but
// that produced a horizontal-vs-vertical asymmetry artifact (horizontal source features
// looked sharper than vertical ones inside the panel). Splitting V-blur into its own
// working→working pass restores symmetry by making H and V blurs structurally identical.
// V-blur is run as its own working→working pass rather than folded into the composite. The
// folded variant produces a horizontal-vs-vertical asymmetry artifact (horizontal source
// features end up looking sharper than vertical ones inside the panel). Matching V's
// structure exactly to H's restores symmetry.
//
// On exit, source_texture contains the pre-bracket contents plus all backdrop primitives
// composited on top. The caller then runs Pass B (post-bracket non-backdrop sub-batches) on
@@ -1011,8 +1012,8 @@ run_backdrop_bracket :: proc(
// geometry. The caller sets `color` (tint) on the returned primitive before submitting.
//
// No rotation, no outline — backdrop primitives are intentionally limited to axis-aligned
// RRects in v1. Rotation breaks screen-space blur sampling visually; outline would be a
// specialized edge effect that belongs in its own primitive type.
// RRects. Rotation breaks screen-space blur sampling visually; outline would be a specialized
// edge effect that belongs in its own primitive type.
@(private)
build_backdrop_primitive :: proc(
rect: Rectangle,