Organization & cleanup
+33
-32
@@ -21,16 +21,16 @@ import sdl "vendor:sdl3"
 // sigma_phys ≤ 8 → factor = 2
 // sigma_phys > 8 → factor = 4 (capped)
 //
-// Capped at factor=4: master's preference for visual quality over bandwidth at the high end.
-// Larger factors (8 and 16) would lose more high-frequency detail than the kernel can mask
-// even with the H+V split, and the bandwidth saving is small (the work region also shrinks
-// quadratically, so most of the savings are already captured at factor=4).
+// Capped at factor=4 to favor visual quality over bandwidth at the high end. Larger factors
+// (8 and 16) would lose more high-frequency detail than the kernel can mask even with the
+// H+V split, and the bandwidth saving is small (the work region also shrinks quadratically,
+// so most of the savings are already captured at factor=4).
 //
 // Working textures are sized at full swapchain resolution to support factor=1. Larger factors
-// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: ½-res → full-
-// res working textures means 4× more bytes per working texture (2 textures, RGBA8: roughly
-// 16 MB at 1080p, 64 MB at 4K). On modern GPUs this is well within budget; on Mali Valhall
-// SBCs it's negligible against unified-memory headroom.
+// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: full-res
+// working textures (2 textures, RGBA8) is roughly 16 MB at 1080p, 64 MB at 4K. On modern
+// GPUs this is well within budget; on Mali Valhall SBCs it's negligible against unified-
+// memory headroom.
 //
 // The shaders read the factor as a uniform. The downsample shader has three paths (factor=1
 // identity, factor=2 single bilinear tap, factor>=4 four bilinear taps with offsets scaling
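Reviewer note: the memory figures in the header comment can be checked with a few lines of arithmetic. The sketch below is not part of the commit; `working_texture_bytes` is a hypothetical helper, and 4 bytes per texel is assumed for RGBA8.

```odin
package main

import "core:fmt"

// Hypothetical helper (not in the commit): bytes needed for `count` textures
// of width x height texels, assuming 4 bytes per texel (RGBA8).
working_texture_bytes :: proc(width, height, count: u64) -> u64 {
	return width * height * 4 * count
}

main :: proc() {
	mb :: 1_000_000.0
	// Two full-resolution working textures, as described in the header comment.
	fmt.printf("1080p: %.1f MB\n", f64(working_texture_bytes(1920, 1080, 2)) / mb)
	fmt.printf("4K:    %.1f MB\n", f64(working_texture_bytes(3840, 2160, 2)) / mb)
}
```

Under those assumptions the totals come to about 16.6 MB (15.8 MiB) at 1080p and 66.4 MB (63.3 MiB) at 4K, in line with the "roughly 16 MB" and "64 MB" quoted in the comment.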
@@ -86,7 +86,7 @@ Backdrop_Vert_Uniforms :: struct {
 // shaders/source/backdrop_downsample.frag.
 Backdrop_Downsample_Frag_Uniforms :: struct {
 inv_source_size: [2]f32, // 0: 8 — 1.0 / source_texture pixel dimensions (full-res)
-downsample_factor: u32, // 8: 4 — 2 or 4 (selects 1-tap vs 4-tap path in shader)
+downsample_factor: u32, // 8: 4 — 1, 2, or 4 (selects identity / 1-tap / 4-tap path in shader)
 _pad0: u32, // 12: 4
 }

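Reviewer note: the `offset: size` annotations on the uniform struct could be enforced at compile time so they cannot drift from the shader-side layout. This is only a sketch of one way to do that; the struct is copied from the diff so the example stands alone, and the assertions are not part of the commit.

```odin
package main

// Copy of the struct from the diff, reproduced so the sketch is self-contained.
Backdrop_Downsample_Frag_Uniforms :: struct {
	inv_source_size:   [2]f32, // 0: 8
	downsample_factor: u32,    // 8: 4
	_pad0:             u32,    // 12: 4
}

// Compile-time checks that mirror the "offset: size" comments above.
#assert(offset_of(Backdrop_Downsample_Frag_Uniforms, inv_source_size) == 0)
#assert(offset_of(Backdrop_Downsample_Frag_Uniforms, downsample_factor) == 8)
#assert(offset_of(Backdrop_Downsample_Frag_Uniforms, _pad0) == 12)
#assert(size_of(Backdrop_Downsample_Frag_Uniforms) == 16)

main :: proc() {}
```

If a field is later added or reordered, the build fails instead of the shader silently reading the wrong bytes.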
@@ -120,11 +120,12 @@ Pipeline_2D_Backdrop :: struct {
 primitive_buffer: Buffer,

 // Working textures, allocated once at swapchain resolution and recreated only on resize.
-// `source_texture` is full-resolution; the other two are ¼-res. All single-sample.
+// All three are sized at full swapchain resolution and single-sample. Larger downsample
+// factors fill only a sub-rect via viewport-limited rendering (see file-header comment).
 // source_texture — when any backdrop draw exists this frame, the entire frame renders
-// here instead of the swapchain (Approach B). Copied to the swapchain
-// at frame end. Acts as the bracket's snapshot input by virtue of
-// already containing the pre-bracket frame.
+// here instead of the swapchain. Copied to the swapchain at frame
+// end. Acts as the bracket's snapshot input by virtue of already
+// containing the pre-bracket frame.
 // downsample_texture — written by the downsample PSO. Read by the blur PSO in mode 0.
 // h_blur_texture — written by the blur PSO in mode 0. Read by the blur PSO in mode 1.
 source_texture: ^sdl.GPUTexture,
@@ -243,7 +244,7 @@ create_pipeline_2d_backdrop :: proc(

 //----- Downsample PSO ----------------------------------
 // Single bilinear sample, blend disabled. No vertex buffer (gl_VertexIndex 0..2 emits the
-// fullscreen triangle). Single-sample target (the ¼-res working textures are never MSAA).
+// fullscreen triangle). Single-sample target (working textures are never MSAA).
 downsample_target := sdl.GPUColorTargetDescription {
 format = swapchain_format,
 blend_state = sdl.GPUColorTargetBlendState{enable_blend = false},
@@ -350,9 +351,9 @@ destroy_pipeline_2d_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Pipeline
 // ---------------------------------------------------------------------------------------------------------------------

 // Allocate (or reallocate, on resize) the three working textures that the backdrop bracket
-// uses. `source_texture` is full swapchain resolution; the other two are ¼-res. All single-
-// sample, all share the swapchain format, all need {.COLOR_TARGET, .SAMPLER} usage so they
-// can be written by render passes and read by subsequent passes.
+// uses. All three are sized at full swapchain resolution, single-sample, share the swapchain
+// format, and need {.COLOR_TARGET, .SAMPLER} usage so they can be written by render passes
+// and read by subsequent passes.
 //
 // Recreates on dimension change only — same-size frames hit the early-out and skip GPU
 // resource churn.
@@ -466,19 +467,19 @@ ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureF
 // `i in [1, pair_count)` and does two texture fetches per pair — one at +offset, one at
 // -offset — for a total of 1 + 2*(pair_count-1) bilinear fetches per fragment.
 //
-// `sigma` is the true Gaussian standard deviation in the kernel's working-space units (¼-res
-// texels, after the caller has converted from logical pixels via dpi_scaling and the
-// downsample factor). The kernel extent reaches ±3σ, capturing 99.7% of the Gaussian's
+// `sigma` is the true Gaussian standard deviation in the kernel's working-space units
+// (working-resolution texels, after the caller has converted from logical pixels via
+// dpi_scaling and the downsample factor). The kernel extent reaches ±3σ, capturing 99.7% of
+// the Gaussian's
 // mass; weights beyond that contribute imperceptibly. sigma <= 0 produces a degenerate
 // kernel `{1, 0}` that acts as a sharp pass-through. After the loop, the discrete weights
 // are normalized so they sum to 1.0 (truncating at ±3σ loses a tiny amount of mass; we
 // renormalize to preserve overall image brightness).
 //
-// Earlier versions of this routine ported RAD Debugger's algorithm verbatim, which derives
-// stdev from a tap-count parameter (`stdev = (blur_count-1)/2`). That made the parameter
-// name misleading: the user thought they were passing σ but were actually passing
-// half-kernel-width. This version takes σ directly and derives the tap count from it,
-// matching what callers expect when they read "gaussian_sigma".
+// Note on the parameter contract: this routine takes σ directly and derives the tap count
+// from it, rather than the inverse (RAD Debugger's algorithm passes a tap count and derives
+// `stdev = (blur_count-1)/2`). Taking σ directly matches what callers expect when they read
+// "gaussian_sigma" — passing tap count under that name was a footgun.
 @(private)
 compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f32) -> (pair_count: u32) {
 if sigma <= 0 {
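Reviewer note: the σ-first contract described in the comment can be illustrated with a deliberately simplified sketch. This is not the body of `compute_blur_kernel`, which is not shown in the diff. The names `sketch_gaussian_weights` and `MAX_TAPS` are hypothetical, and the sketch stores one weight per integer texel offset rather than merging taps into bilinear fetch pairs.

```odin
package main

import "core:fmt"
import "core:math"

MAX_TAPS :: 32 // hypothetical cap, for this sketch only

// Hypothetical simplification: weights[0] is the center tap and weights[i]
// applies to both the +i and -i texel offsets. Returns the number of entries
// written. No bilinear tap merging is attempted here.
sketch_gaussian_weights :: proc(sigma: f32, weights: ^[MAX_TAPS]f32) -> (count: int) {
	if sigma <= 0 {
		weights[0] = 1 // degenerate kernel: sharp pass-through
		return 1
	}
	radius := int(math.ceil(3 * sigma)) // kernel extent reaches ±3σ
	count = min(radius + 1, MAX_TAPS)
	sum: f32 = 0
	for i in 0 ..< count {
		x := f32(i)
		w := math.exp(-(x * x) / (2 * sigma * sigma))
		weights[i] = w
		sum += w if i == 0 else 2 * w // off-center taps are used on both sides
	}
	for i in 0 ..< count {
		weights[i] /= sum // renormalize so the whole kernel sums to 1
	}
	return count
}

main :: proc() {
	weights: [MAX_TAPS]f32
	n := sketch_gaussian_weights(2.0, &weights)
	total := weights[0]
	for i in 1 ..< n {
		total += 2 * weights[i]
	}
	fmt.println(n, total) // total should be very close to 1.0
}
```

The point of the sketch is the direction of derivation: the caller supplies σ, and the tap count falls out of the ±3σ extent, rather than σ being derived from a tap count.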
@@ -624,7 +625,7 @@ upload_backdrop_primitives :: proc(device: ^sdl.GPUDevice, pass: ^sdl.GPUCopyPas
 // ---------------------------------------------------------------------------------------------------------------------

 // Returns true if any sub-batch in any layer this frame is .Backdrop kind. Called once at the
-// top of `end()` to decide whether to route the whole frame to source_texture (Approach B).
+// top of `end()` to decide whether to route the whole frame to source_texture.
 // O(total sub-batches) but with an early-exit on the first hit, so typical cost is tiny.
 @(private)
 frame_has_backdrop :: proc() -> bool {
@@ -742,10 +743,10 @@ compute_backdrop_group_work_region :: proc(
 // target viewport, per-primitive SDF discard handles masking and applies the tint. Each
 // sub-batch in the group is one instanced draw.
 //
-// V-blur was historically combined with the composite into a single shader invocation, but
-// that produced a horizontal-vs-vertical asymmetry artifact (horizontal source features
-// looked sharper than vertical ones inside the panel). Splitting V-blur into its own
-// working→working pass restores symmetry by making H and V blurs structurally identical.
+// V-blur is run as its own working→working pass rather than folded into the composite. The
+// folded variant produces a horizontal-vs-vertical asymmetry artifact (horizontal source
+// features end up looking sharper than vertical ones inside the panel). Matching V's
+// structure exactly to H's restores symmetry.
 //
 // On exit, source_texture contains the pre-bracket contents plus all backdrop primitives
 // composited on top. The caller then runs Pass B (post-bracket non-backdrop sub-batches) on
@@ -1011,8 +1012,8 @@ run_backdrop_bracket :: proc(
 // geometry. The caller sets `color` (tint) on the returned primitive before submitting.
 //
 // No rotation, no outline — backdrop primitives are intentionally limited to axis-aligned
-// RRects in v1. Rotation breaks screen-space blur sampling visually; outline would be a
-// specialized edge effect that belongs in its own primitive type.
+// RRects. Rotation breaks screen-space blur sampling visually; outline would be a specialized
+// edge effect that belongs in its own primitive type.
 @(private)
 build_backdrop_primitive :: proc(
 rect: Rectangle,