diff --git a/draw/README.md b/draw/README.md
index 879b520..e02a7c3 100644
--- a/draw/README.md
+++ b/draw/README.md
@@ -15,10 +15,10 @@ modes dispatched by a push constant:
 shader premultiplies the texture sample (`t.rgb *= t.a`) and computes `out = color * t`.
 - **Mode 1 (SDF):** A static 6-vertex unit-quad buffer is drawn instanced, with per-primitive
- `Primitive` structs (80 bytes each) uploaded each frame to a GPU storage buffer. The vertex shader
- reads `primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners +
+ `Base_2D_Primitive` structs (96 bytes each) uploaded each frame to a GPU storage buffer. The vertex
+ shader reads `primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners +
 primitive bounds. The fragment shader dispatches on `Shape_Kind` (encoded in the low byte of
- `Primitive.flags`) to evaluate one of four signed distance functions:
+ `Base_2D_Primitive.flags`) to evaluate one of four signed distance functions:
 - **RRect** (kind 1) — `sdRoundedBox` with per-corner radii. Covers rectangles (sharp or rounded), circles (uniform radii = half-size), and line segments / capsules (rotated RRect with uniform radii = half-thickness). Covers filled, outlined, textured, and gradient-filled variants.
@@ -28,21 +28,22 @@ modes dispatched by a push constant:
 normals. Covers full rings, partial arcs, and pie slices (`inner_radius = 0`).
 All SDF shapes support fill, outline, solid color, 2-color linear gradients, 2-color radial
-gradients, and texture fills via `Shape_Flags` (see `pipeline_2d_base.odin`). Gradient and outline
-parameters are packed into the same 16 bytes as the texture UV rect via a `Uv_Or_Effects` raw union
-— zero size increase to the 80-byte `Primitive` struct. Gradient/outline and texture are mutually
-exclusive.
+gradients, and texture fills via `Shape_Flags` (see `pipeline_2d_base.odin`). The texture UV rect
+(`uv_rect: [4]f32`) and the gradient/outline parameters (`effects: Gradient_Outline`) live in their
+own 16-byte slots in `Base_2D_Primitive`, so a primitive can carry texture and outline simultaneously.
+Gradient and texture remain mutually exclusive at the fill-source level (a Brush variant chooses one
+or the other) since they share the worst-case fragment-shader register path.
 All SDF shapes produce mathematically exact curves with analytical anti-aliasing via `smoothstep` —
-no tessellation, no piecewise-linear approximation. A rounded rectangle is 1 primitive (80 bytes)
+no tessellation, no piecewise-linear approximation. A rounded rectangle is 1 primitive (96 bytes)
 instead of ~250 vertices (~5000 bytes).
 The main pipeline's register budget is **≤24 registers** (see "Main/effects split: register pressure"
-in the pipeline plan below for the full cliff/margin analysis and SBC architecture context). The
-fragment shader's estimated peak footprint is ~22–26 fp32 VGPRs (~16–22 fp16 VGPRs on architectures
+in the pipeline plan below for the full cliff/margin analysis and SBC architecture context).
+The fragment shader's estimated peak footprint is ~22–26 fp32 VGPRs (~16–22 fp16 VGPRs on architectures
 with native mediump) via manual live-range analysis. The dominant peak is the Ring_Arc kind path (wedge normals + inner/outer radii + dot-product temporaries live simultaneously with carried state
-like `f_color`, `f_uv_or_effects`, and `half_size`). RRect is 1–2 regs lower (`corner_radii` vec4
+like `f_color`, `f_uv_rect`/`f_effects`, and `half_size`). RRect is 1–2 regs lower (`corner_radii` vec4
 replaces the separate inner/outer + normal pairs). NGon and Ellipse are lighter still.
 Real compilers apply live-range coalescing, mediump-to-fp16 promotion, and rematerialization that typically shave 2–4 regs from hand-counted estimates — the conservative 26-reg upper bound is expected to compile
@@ -439,12 +440,13 @@ vertex shader branches on this uniform to select the tessellated or SDF code pat
 - **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry. Used for text (SDL_ttf atlas sampling), triangles, triangle fans/strips, single-pixel points, and any user-provided raw vertex geometry.
-- **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of `Primitive`
- structs, drawn instanced. Used for all shapes with closed-form signed distance functions.
+- **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of
+ `Base_2D_Primitive` structs, drawn instanced. Used for all shapes with closed-form signed distance
+ functions.
 Both modes use the same fragment shader. The fragment shader checks `Shape_Kind` (low byte of
-`Primitive.flags`): kind 0 (`Solid`) is the tessellated path, which premultiplies the texture sample
-and computes `out = color * t`; kinds 1–4 dispatch to one of four SDF functions (RRect, NGon,
+`Base_2D_Primitive.flags`): kind 0 (`Solid`) is the tessellated path, which premultiplies the texture
+sample and computes `out = color * t`; kinds 1–4 dispatch to one of four SDF functions (RRect, NGon,
 Ellipse, Ring_Arc) and apply gradient/texture/outline/solid color based on `Shape_Flags` bits.
 #### Why SDF for shapes
@@ -452,8 +454,8 @@ Ellipse, Ring_Arc) and apply gradient/texture/outline/solid color based on `Shap
 CPU-side adaptive tessellation for curved shapes (the current approach) has three problems:
 1. **Vertex bandwidth.** A rounded rectangle with four corner arcs produces ~250 vertices × 20 bytes
- = 5 KB. An SDF rounded rectangle is one `Primitive` struct (~56 bytes) plus 4 shared unit-quad
- vertices. That is roughly a 90× reduction per shape.
+ = 5 KB. An SDF rounded rectangle is one `Base_2D_Primitive` struct (96 bytes) plus 4 shared
+ unit-quad vertices. That is roughly a 50× reduction per shape.
 2. **Quality.** Tessellated curves are piecewise-linear approximations. At high DPI or under animation/zoom, faceting is visible at any practical segment count. SDF evaluation produces
@@ -484,14 +486,14 @@ SDF primitives are submitted via a GPU storage buffer indexed by `gl_InstanceInd
 shader, rather than encoding per-primitive data redundantly in vertex attributes. This follows the pattern used by both Zed GPUI and vger-rs.
-Each SDF shape is described by a single `Primitive` struct (80 bytes) in the storage buffer. The
-vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position from the unit
-vertex and the primitive's bounds, and passes shape parameters to the fragment shader via `flat`
-interpolated varyings.
+Each SDF shape is described by a single `Base_2D_Primitive` struct (96 bytes) in the storage
+buffer. The vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position
+from the unit vertex and the primitive's bounds, and passes shape parameters to the fragment shader
+via `flat` interpolated varyings.
 Compared to encoding per-primitive data in vertex attributes (the "fat vertex" approach), storage-
 buffer instancing eliminates the 4–6× data duplication across quad corners. A rounded rectangle costs
-80 bytes instead of 4 vertices × 40+ bytes = 160+ bytes.
+96 bytes instead of 4 vertices × 60+ bytes = 240+ bytes.
 The tessellated path retains the existing direct vertex buffer layout (20 bytes/vertex, no storage buffer access).
 The vertex shader branch on `mode` (push constant) is warp-uniform — every invocation
@@ -499,15 +501,18 @@ in a draw call has the same mode — so it is effectively free on all modern GPU
 #### Shape kinds and SDF dispatch
-The fragment shader dispatches on `Shape_Kind` (low byte of `Primitive.flags`) to evaluate one of
-four signed distance functions. The `Shape_Kind` enum and per-kind `*_Params` structs are defined in
-`pipeline_2d_base.odin`. CPU-side drawing procs in `shapes.odin` build the appropriate `Primitive`
-and set the kind automatically:
+The fragment shader dispatches on `Shape_Kind` (low byte of `Base_2D_Primitive.flags`) to evaluate
+one of four signed distance functions. The `Shape_Kind` enum and per-kind `*_Params` structs are
+defined in `pipeline_2d_base.odin`. CPU-side drawing procs in `shapes.odin` build the appropriate
+`Base_2D_Primitive` and set the kind automatically:
+
+Each user-facing shape proc accepts a `Brush` union (color, linear gradient, radial gradient,
+or textured fill) as its fill source, plus optional outline parameters. The procs map to SDF
+kinds as follows:
 | User-facing proc | Shape_Kind | SDF function | Notes |
 | -------------------- | ---------- | ------------------ | ---------------------------------------------------------- |
 | `rectangle` | `RRect` | `sdRoundedBox` | Per-corner radii from `radii` param |
-| `rectangle_texture` | `RRect` | `sdRoundedBox` | Textured fill; `.Textured` flag set |
 | `circle` | `RRect` | `sdRoundedBox` | Uniform radii = half-size (circle is a degenerate RRect) |
 | `line`, `line_strip` | `RRect` | `sdRoundedBox` | Rotated capsule — stadium shape (radii = half-thickness) |
 | `ellipse` | `Ellipse` | `sdEllipseApprox` | Approximate ellipse SDF (fast, suitable for UI) |
@@ -599,20 +604,21 @@ to is a hard GPU constraint; the only way to satisfy it is to end the current re
 a new one. That render-pass boundary is what a “bracket” is.
 **Multi-pass implementation.** Backdrop effects are implemented as separable multi-pass sequences
-(downsample → horizontal blur → vertical-blur+composite), following the standard approach used by
-iOS `UIVisualEffectView`, Android `RenderEffect`, and Flutter's `BackdropFilter`. Each individual
+(downsample → horizontal blur → vertical blur → composite), following the standard approach used
+by iOS `UIVisualEffectView`, Android `RenderEffect`, and Flutter's `BackdropFilter`. Each individual
 sub-pass is budgeted at **≤24 registers** (same as the main pipeline — full Valhall occupancy). The multi-pass approach avoids the monolithic 70+ register shader that a single-pass Gaussian blur would require, keeping each sub-pass well under the 32-register cliff.
-**Approach B: render-target choice.** When any layer in the frame contains a backdrop draw, the
-entire frame renders into `source_texture` (a full-resolution single-sample texture owned by the
-backdrop pipeline) instead of directly into the swapchain. At the end of the frame, `source_texture`
-is copied to the swapchain via a single `CopyGPUTextureToTexture` call. This means the bracket has
-no mid-frame texture copy: by the time the bracket runs, `source_texture` already contains the pre-
-bracket frame contents and is the natural sampler input. When no layer in the frame has a backdrop
-draw, the existing fast path runs: the frame renders directly to the swapchain and the backdrop
-pipeline's working textures are never touched. Zero cost for backdrop-free frames.
+**Render-target choice.** When any layer in the frame contains a backdrop draw, the entire
+frame renders into `source_texture` (a full-resolution single-sample texture owned by the
+backdrop pipeline) instead of directly into the swapchain. At the end of the frame,
+`source_texture` is copied to the swapchain via a single `CopyGPUTextureToTexture` call.
+This means the bracket has no mid-frame texture copy: by the time the bracket runs,
+`source_texture` already contains the pre-bracket frame contents and is the natural sampler
+input. When no layer in the frame has a backdrop draw, the existing fast path runs: the frame
+renders directly to the swapchain and the backdrop pipeline's working textures are never
+touched. Zero cost for backdrop-free frames.
 **Why not split the backdrop sub-passes into separate pipelines?** Each sub-pass is budgeted at ≤24 registers, well under Valhall's 32-register cliff, so there is no occupancy motivation for splitting.
@@ -638,13 +644,20 @@ submission order. Concretely, a layer with one or more backdrops splits into thr
 range. If the layer has no backdrops, none of this kicks in and the layer renders in a single render pass via the existing fast path.
-The downsample runs once per layer, not once per sigma: it just copies `source_texture` to a ¼-
-resolution working texture and doesn't depend on the kernel. Each unique sigma in the layer triggers
-one H-blur (reads `downsample_texture`, writes `h_blur_texture`) and one V-composite (reads
-`h_blur_texture`, writes `source_texture` per-primitive with the SDF mask). Sub-batch coalescing in
-`append_or_extend_sub_batch` merges contiguous same-sigma backdrops into a single instanced V-
-composite draw call; non-contiguous same-sigma backdrops still share the H-blur output but issue
-separate V-composite draws.
+Per-sigma-group execution. The bracket walks each layer's sub-batches and groups contiguous
+`.Backdrop` sub-batches that share a sigma; each group picks its own downsample factor (1, 2, or 4)
+based on `compute_backdrop_downsample_factor`. For each group it runs four sub-passes: a downsample
+from `source_texture` to `downsample_texture`; an H-blur from `downsample_texture` to
+`h_blur_texture`; a V-blur from `h_blur_texture` back into `downsample_texture` (ping-pong reuse);
+and finally a composite that reads the fully-blurred `downsample_texture`, applies the SDF mask
+and tint, and writes the result to `source_texture`. Sub-batch coalescing in
+`append_or_extend_sub_batch` merges contiguous same-sigma backdrops into a single instanced
+composite draw; non-contiguous same-sigma backdrops still share the blur output but issue separate
+composite draws.
+
+The working textures are sized at the full swapchain resolution; larger downsample factors only
+fill a sub-rect via viewport-limited rendering (see the comment block at the top of `backdrop.odin`
+for the factor-selection table and rationale).
 #### Submission-order trade-off
@@ -654,12 +667,12 @@ layer. A non-backdrop sub-batch submitted between two backdrops still renders in
 bracket), not at its submission position. Worked example:
 ```
-draw.rectangle(layer, bg, GRAY) // 0 Tessellated → Pass A
-draw.rectangle(layer, card_blue, BLUE) // 1 SDF → Pass A
-draw.rectangle_backdrop(layer, panelA, 12) // 2 Backdrop → Bracket (sees: bg + blue card)
-draw.rectangle(layer, card_red, RED) // 3 SDF → Pass B (drawn ON TOP of panelA)
-draw.rectangle_backdrop(layer, panelB, 12) // 4 Backdrop → Bracket (sees: bg + blue card; same as panelA)
-draw.text(layer, "label", ...) // 5 Text → Pass B (drawn ON TOP of both panels)
+draw.rectangle(layer, bg, GRAY) // 0 Tessellated → Pass A
+draw.rectangle(layer, card_blue, BLUE) // 1 SDF → Pass A
+draw.gaussian_blur(layer, panelA, sigma=12) // 2 Backdrop → Bracket (sees: bg + blue card)
+draw.rectangle(layer, card_red, RED) // 3 SDF → Pass B (drawn ON TOP of panelA)
+draw.gaussian_blur(layer, panelB, sigma=12) // 4 Backdrop → Bracket (sees: bg + blue card; same as panelA)
+draw.text(layer, "label", ...) // 5 Text → Pass B (drawn ON TOP of both panels)
 ```
 In this layer, panelB does *not* see card_red — even though card_red was submitted before panelB —
@@ -674,11 +687,11 @@ card_red:
 base := draw.begin(...)
 draw.rectangle(base, bg, GRAY)
 draw.rectangle(base, card_blue, BLUE)
-draw.rectangle_backdrop(base, panelA, 12) // panelA in base layer's bracket
+draw.gaussian_blur(base, panelA, sigma=12) // panelA in base layer's bracket
 top := draw.new_layer(base, ...)
 draw.rectangle(top, card_red, RED)
-draw.rectangle_backdrop(top, panelB, 12) // top layer's bracket; sees base + card_red
+draw.gaussian_blur(top, panelB, sigma=12) // top layer's bracket; sees base + card_red
 draw.text(top, "label", ...)
 ```
@@ -708,29 +721,30 @@ draws, `position` carries actual world-space geometry. For SDF draws, `position`
 corners (0,0 to 1,1) and the vertex shader computes world-space position from the storage-buffer primitive's bounds.
-The `Primitive` struct for SDF shapes lives in the storage buffer, not in vertex attributes:
+The `Base_2D_Primitive` struct for SDF shapes lives in the storage buffer, not in vertex attributes:
 ```
-Primitive :: struct {
- bounds: [4]f32, // 0: min_x, min_y, max_x, max_y
- color: Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8
- flags: u32, // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
- rotation_sc: u32, // 24: packed f16 pair (sin, cos). Requires .Rotated flag.
- _pad: f32, // 28: reserved for future use
- params: Shape_Params, // 32: per-kind params union (half_feather, radii, etc.) (32 bytes)
- uv: Uv_Or_Effects, // 64: texture UV rect or gradient/outline parameters (16 bytes)
+Base_2D_Primitive :: struct {
+ bounds: [4]f32, // 0: min_x, min_y, max_x, max_y
+ color: Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8
+ flags: u32, // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
+ rotation_sc: u32, // 24: packed f16 pair (sin, cos). Requires .Rotated flag.
+ _pad: f32, // 28: reserved for future use
+ params: Shape_Params, // 32: per-kind params union (half_feather, radii, etc.) (32 bytes)
+ uv_rect: [4]f32, // 64: texture UV coordinates. Read when .Textured.
+ effects: Gradient_Outline, // 80: gradient and/or outline parameters (16 bytes).
 }
-// Total: 80 bytes (std430 aligned)
+// Total: 96 bytes (std430 aligned)
 ```
 `Shape_Params` is a `#raw_union` over `RRect_Params`, `NGon_Params`, `Ellipse_Params`, and `Ring_Arc_Params` (plus a `raw: [8]f32` view), defined in `pipeline_2d_base.odin`. Each SDF kind writes its own params variant; the fragment shader reads the appropriate fields based on `Shape_Kind`.
-`Uv_Or_Effects` is a `#raw_union` that aliases `[4]f32` (texture UV rect: u_min, v_min, u_max,
-v_max) with a `Gradient_Outline` struct containing `gradient_color: Color`, `outline_color: Color`,
+`Gradient_Outline` is a 16-byte struct containing `gradient_color: Color`, `outline_color: Color`,
 `gradient_dir_sc: u32` (packed f16 cos/sin pair), and `outline_packed: u32` (packed f16 outline
-width). The `flags` field encodes the `Shape_Kind` in the low byte and `Shape_Flags` in bits 8+
-via `pack_kind_flags`.
+width). It is independent of `uv_rect`, so a primitive can carry texture and outline parameters at
+the same time. The `flags` field encodes the `Shape_Kind` in the low byte and `Shape_Flags` in bits
+8+ via `pack_kind_flags`.
 ### Draw submission order
@@ -754,7 +768,7 @@ pair into bitmap atlases and emits indexed triangle data via `GetGPUTextDrawData
 **unchanged** by the SDF migration — text continues to flow through the main pipeline's tessellated mode with `mode = 0`, sampling the SDL_ttf atlas texture.
-A future phase may evaluate MSDF (multi-channel signed distance field) text rendering, which would
+MSDF (multi-channel signed distance field) text rendering may be evaluated later, which would
 allow resolution-independent glyph rendering from a single small atlas per font.
 This would involve:
 - Offline atlas generation via Chlumský's msdf-atlas-gen tool.
@@ -763,8 +777,7 @@ allow resolution-independent glyph rendering from a single small atlas per font.
 already exists for the four current SDF kinds).
 - Potential removal of the SDL_ttf dependency.
-This is explicitly deferred. The SDF shape migration is independent of and does not block text
-changes.
+This is explicitly deferred.
 **References:**
@@ -778,8 +791,8 @@ changes.
 ### Textures
 Textures plug into the existing main pipeline — no additional GPU pipeline, no shader rewrite. The
-work is a resource layer (registration, upload, sampling, lifecycle) plus two textured-draw procs
-that route into the existing tessellated and SDF paths respectively.
+work is a resource layer (registration, upload, sampling, lifecycle) plus a `Texture_Fill` Brush
+variant that routes the existing shape procs through the SDF path with the `.Textured` flag set.
 #### Why draw owns registered textures
@@ -829,22 +842,25 @@ with the same texture but different samplers produce separate draw calls, which
 #### Textured draw procs
-Textured rectangles route through the existing SDF path via `rectangle_texture`, which mirrors
-`rectangle` exactly — same parameters for radii, origin, rotation, feather — with the `color`
-parameter replaced by a `Texture_Id`, an optional `tint`, a `uv_rect`, and a `Sampler_Preset`.
+Textures share the same shape procs as colors and gradients. Each shape proc takes a `Brush`
+union as its fill source; passing a `Texture_Fill` value (carrying `Texture_Id`, `tint`,
+`uv_rect`, and `Sampler_Preset`) routes the draw through the SDF path with the `.Textured`
+flag set. There is no dedicated `rectangle_texture` / `circle_texture` proc — the same
+`rectangle`, `circle`, `ellipse`, `polygon`, `ring`, `line`, and `line_strip` procs handle
+all fill sources.
-An earlier iteration of this design considered a separate tessellated proc for "simple" fullscreen
-quads, on the theory that the tessellated path's lower register count would improve occupancy at
-large fragment counts. Both paths are well within the ≤24-register main pipeline budget — both run at
-full occupancy on every target architecture (Valhall and above). The remaining ALU difference (~15
-extra instructions for the SDF evaluation) amounts to ~20μs at 4K — below noise. Meanwhile,
-splitting into a separate pipeline would add ~1–5μs per pipeline bind on the CPU side per scissor,
-matching or exceeding the GPU-side savings. Within the main pipeline, unified remains strictly better.
+A separate tessellated proc for "simple" fullscreen quads was considered on the theory that
+the tessellated path's lower register count would improve occupancy at large fragment counts.
+Both paths are well within the ≤24-register main pipeline budget — both run at full
+occupancy on every target architecture (Valhall and above). The remaining ALU difference
+(~15 extra instructions for the SDF evaluation) amounts to ~20μs at 4K — below noise.
+Meanwhile, splitting into a separate pipeline would add ~1–5μs per pipeline bind on the CPU
+side per scissor, matching or exceeding the GPU-side savings. Within the main pipeline,
+unified remains strictly better.
-SDF drawing procs live in the `draw` package with unprefixed names (`rectangle`, `rectangle_texture`,
-`circle`, `ellipse`, `polygon`, `ring`, `line`, `line_strip`). Gradients and outlines are optional
-parameters on each proc rather than separate overloads. Future per-shape texture variants
-(`circle_texture`, `ellipse_texture`) are additive.
+SDF drawing procs live in the `draw` package with unprefixed names (`rectangle`, `circle`,
+`ellipse`, `polygon`, `ring`, `line`, `line_strip`). Gradients, textures, and outlines are
+selected via the `Brush` union and optional outline parameters rather than separate overloads.
 #### What SDF anti-aliasing does and does not do for textured draws
@@ -858,8 +874,8 @@ depends on how closely the display size matches the SDL_ttf atlas's rasterized s
 #### Fit modes are a computation layer, not a renderer concept
 Standard image-fit behaviors (stretch, fill/cover, fit/contain, tile, center) are expressed as UV
-sub-region computations on top of the `uv_rect` parameter that both textured-draw procs accept. The
-renderer has no knowledge of fit modes — it samples whatever UV region it is given.
+sub-region computations on top of the `uv_rect` field of `Texture_Fill`. The renderer has no
+knowledge of fit modes — it samples whatever UV region it is given.
 A `fit_params` helper computes the appropriate `uv_rect`, sampler preset, and (for letterbox/fit mode) shrunken inner rect from a `Fit_Mode` enum, the target rect, and the texture's pixel size.
@@ -883,13 +899,13 @@ textures onto a free list that is processed in `r_end_frame`, not at the call si
 Clay's `RenderCommandType.Image` is handled by dereferencing `imageData: rawptr` as a pointer to a `Clay_Image_Data` struct containing a `Texture_Id`, `Fit_Mode`, and tint color. Routing mirrors the
-existing rectangle handling: `fit_params` computes UVs from the fit mode, then
-`rectangle_texture` is called with the appropriate radii (zero for sharp corners, per-corner values
-from Clay's `cornerRadius` otherwise).
+existing rectangle handling: `fit_params` computes UVs from the fit mode, then `rectangle` is
+called with a `Texture_Fill` brush and the appropriate radii (zero for sharp corners, per-corner
+values from Clay's `cornerRadius` otherwise).
 #### Deferred features
-The following are plumbed in the descriptor but not implemented in phase 1:
+The following are plumbed in `Texture_Desc` but not yet implemented:
 - **Mipmaps**: `Texture_Desc.mip_levels` field exists; generation via SDL3 deferred.
 - **Compressed formats**: `Texture_Desc.format` accepts BC/ASTC; upload path deferred.
@@ -897,7 +913,6 @@ The following are plumbed in the descriptor but not implemented in phase 1:
 - **3D textures, arrays, cube maps**: `Texture_Desc.type` and `depth_or_layers` fields exist.
 - **Additional samplers**: anisotropic, trilinear, clamp-to-border — additive enum values.
 - **Atlas packing**: internal optimization for sub-batch coalescing; invisible to callers.
-- **Per-shape texture variants**: `circle_texture`, `ellipse_texture`, `polygon_texture` — potential future additions, following the existing naming convention.
 **References:**
diff --git a/draw/backdrop.odin b/draw/backdrop.odin
index 78776d7..729f571 100644
--- a/draw/backdrop.odin
+++ b/draw/backdrop.odin
@@ -21,16 +21,16 @@ import sdl "vendor:sdl3"
 // sigma_phys ≤ 8 → factor = 2
 // sigma_phys > 8 → factor = 4 (capped)
 //
-// Capped at factor=4: master's preference for visual quality over bandwidth at the high end.
-// Larger factors (8 and 16) would lose more high-frequency detail than the kernel can mask
-// even with the H+V split, and the bandwidth saving is small (the work region also shrinks
-// quadratically, so most of the savings are already captured at factor=4).
+// Capped at factor=4 to favor visual quality over bandwidth at the high end. Larger factors
+// (8 and 16) would lose more high-frequency detail than the kernel can mask even with the
+// H+V split, and the bandwidth saving is small (the work region also shrinks quadratically,
+// so most of the savings are already captured at factor=4).
 //
 // Working textures are sized at full swapchain resolution to support factor=1. Larger factors
-// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: ½-res → full-
-// res working textures means 4× more bytes per working texture (2 textures, RGBA8: roughly
-// 16 MB at 1080p, 64 MB at 4K). On modern GPUs this is well within budget; on Mali Valhall
-// SBCs it's negligible against unified-memory headroom.
+// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: full-res
+// working textures (2 textures, RGBA8) is roughly 16 MB at 1080p, 64 MB at 4K. On modern
+// GPUs this is well within budget; on Mali Valhall SBCs it's negligible against unified-
+// memory headroom.
+// just write to a smaller sub-rect via viewport-limited rendering. Memory cost: full-res +// working textures (2 textures, RGBA8) is roughly 16 MB at 1080p, 64 MB at 4K. On modern +// GPUs this is well within budget; on Mali Valhall SBCs it's negligible against unified- +// memory headroom. // // The shaders read the factor as a uniform. The downsample shader has three paths (factor=1 // identity, factor=2 single bilinear tap, factor>=4 four bilinear taps with offsets scaling @@ -86,7 +86,7 @@ Backdrop_Vert_Uniforms :: struct { // shaders/source/backdrop_downsample.frag. Backdrop_Downsample_Frag_Uniforms :: struct { inv_source_size: [2]f32, // 0: 8 — 1.0 / source_texture pixel dimensions (full-res) - downsample_factor: u32, // 8: 4 — 2 or 4 (selects 1-tap vs 4-tap path in shader) + downsample_factor: u32, // 8: 4 — 1, 2, or 4 (selects identity / 1-tap / 4-tap path in shader) _pad0: u32, // 12: 4 } @@ -120,11 +120,12 @@ Pipeline_2D_Backdrop :: struct { primitive_buffer: Buffer, // Working textures, allocated once at swapchain resolution and recreated only on resize. - // `source_texture` is full-resolution; the other two are ¼-res. All single-sample. + // All three are sized at full swapchain resolution and single-sample. Larger downsample + // factors fill only a sub-rect via viewport-limited rendering (see file-header comment). // source_texture — when any backdrop draw exists this frame, the entire frame renders - // here instead of the swapchain (Approach B). Copied to the swapchain - // at frame end. Acts as the bracket's snapshot input by virtue of - // already containing the pre-bracket frame. + // here instead of the swapchain. Copied to the swapchain at frame + // end. Acts as the bracket's snapshot input by virtue of already + // containing the pre-bracket frame. // downsample_texture — written by the downsample PSO. Read by the blur PSO in mode 0. // h_blur_texture — written by the blur PSO in mode 0. Read by the blur PSO in mode 1. 
 source_texture: ^sdl.GPUTexture,
@@ -243,7 +244,7 @@ create_pipeline_2d_backdrop :: proc(
 //----- Downsample PSO ----------------------------------
 // Single bilinear sample, blend disabled. No vertex buffer (gl_VertexIndex 0..2 emits the
- // fullscreen triangle). Single-sample target (the ¼-res working textures are never MSAA).
+ // fullscreen triangle). Single-sample target (working textures are never MSAA).
 downsample_target := sdl.GPUColorTargetDescription {
 format = swapchain_format,
 blend_state = sdl.GPUColorTargetBlendState{enable_blend = false},
@@ -350,9 +351,9 @@ destroy_pipeline_2d_backdrop :: proc(device: ^sdl.GPUDevice, pipeline: ^Pipeline
 // ---------------------------------------------------------------------------------------------------------------------
 // Allocate (or reallocate, on resize) the three working textures that the backdrop bracket
-// uses. `source_texture` is full swapchain resolution; the other two are ¼-res. All single-
-// sample, all share the swapchain format, all need {.COLOR_TARGET, .SAMPLER} usage so they
-// can be written by render passes and read by subsequent passes.
+// uses. All three are sized at full swapchain resolution, single-sample, share the swapchain
+// format, and need {.COLOR_TARGET, .SAMPLER} usage so they can be written by render passes
+// and read by subsequent passes.
 //
 // Recreates on dimension change only — same-size frames hit the early-out and skip GPU
 // resource churn.
@@ -466,19 +467,19 @@ ensure_backdrop_textures :: proc(device: ^sdl.GPUDevice, format: sdl.GPUTextureF
 // `i in [1, pair_count)` and does two texture fetches per pair — one at +offset, one at
 // -offset — for a total of 1 + 2*(pair_count-1) bilinear fetches per fragment.
 //
-// `sigma` is the true Gaussian standard deviation in the kernel's working-space units (¼-res
-// texels, after the caller has converted from logical pixels via dpi_scaling and the
-// downsample factor). The kernel extent reaches ±3σ, capturing 99.7% of the Gaussian's
+// `sigma` is the true Gaussian standard deviation in the kernel's working-space units
+// (working-resolution texels, after the caller has converted from logical pixels via
+// dpi_scaling and the downsample factor). The kernel extent reaches ±3σ, capturing 99.7% of
+// the Gaussian's
 // mass; weights beyond that contribute imperceptibly. sigma <= 0 produces a degenerate
 // kernel `{1, 0}` that acts as a sharp pass-through. After the loop, the discrete weights
 // are normalized so they sum to 1.0 (truncating at ±3σ loses a tiny amount of mass; we
 // renormalize to preserve overall image brightness).
 //
-// Earlier versions of this routine ported RAD Debugger's algorithm verbatim, which derives
-// stdev from a tap-count parameter (`stdev = (blur_count-1)/2`). That made the parameter
-// name misleading: the user thought they were passing σ but were actually passing
-// half-kernel-width. This version takes σ directly and derives the tap count from it,
-// matching what callers expect when they read "gaussian_sigma".
+// Note on the parameter contract: this routine takes σ directly and derives the tap count
+// from it, rather than the inverse (RAD Debugger's algorithm passes a tap count and derives
+// `stdev = (blur_count-1)/2`). Taking σ directly matches what callers expect when they read
+// "gaussian_sigma" — passing tap count under that name was a footgun.
 @(private)
 compute_blur_kernel :: proc(sigma: f32, kernel: ^[MAX_BACKDROP_KERNEL_PAIRS][4]f32) -> (pair_count: u32) {
 if sigma <= 0 {
@@ -624,7 +625,7 @@ upload_backdrop_primitives :: proc(device: ^sdl.GPUDevice, pass: ^sdl.GPUCopyPas
 // ---------------------------------------------------------------------------------------------------------------------
 // Returns true if any sub-batch in any layer this frame is .Backdrop kind. Called once at the
-// top of `end()` to decide whether to route the whole frame to source_texture (Approach B).
+// top of `end()` to decide whether to route the whole frame to source_texture. // O(total sub-batches) but with an early-exit on the first hit, so typical cost is tiny. @(private) frame_has_backdrop :: proc() -> bool { @@ -742,10 +743,10 @@ compute_backdrop_group_work_region :: proc( // target viewport, per-primitive SDF discard handles masking and applies the tint. Each // sub-batch in the group is one instanced draw. // -// V-blur was historically combined with the composite into a single shader invocation, but -// that produced a horizontal-vs-vertical asymmetry artifact (horizontal source features -// looked sharper than vertical ones inside the panel). Splitting V-blur into its own -// working→working pass restores symmetry by making H and V blurs structurally identical. +// V-blur is run as its own working→working pass rather than folded into the composite. The +// folded variant produces a horizontal-vs-vertical asymmetry artifact (horizontal source +// features end up looking sharper than vertical ones inside the panel). Matching V's +// structure exactly to H's restores symmetry. // // On exit, source_texture contains the pre-bracket contents plus all backdrop primitives // composited on top. The caller then runs Pass B (post-bracket non-backdrop sub-batches) on @@ -1011,8 +1012,8 @@ run_backdrop_bracket :: proc( // geometry. The caller sets `color` (tint) on the returned primitive before submitting. // // No rotation, no outline — backdrop primitives are intentionally limited to axis-aligned -// RRects in v1. Rotation breaks screen-space blur sampling visually; outline would be a -// specialized edge effect that belongs in its own primitive type. +// RRects. Rotation breaks screen-space blur sampling visually; outline would be a specialized +// edge effect that belongs in its own primitive type. 
@(private) build_backdrop_primitive :: proc( rect: Rectangle, diff --git a/draw/draw.odin b/draw/draw.odin index 770a59c..51c4cb3 100644 --- a/draw/draw.odin +++ b/draw/draw.odin @@ -830,9 +830,9 @@ end :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window, clear_color: Color = DF } // Pre-scan: if any layer this frame has a backdrop sub-batch, route the entire frame to - // source_texture (Approach B) so the bracket can sample the pre-bracket framebuffer - // without a mid-frame texture copy. Frames without any backdrop hit the existing fast - // path and never touch the backdrop pipeline's working textures. + // source_texture so the bracket can sample the pre-bracket framebuffer without a mid- + // frame texture copy. Frames without any backdrop hit the existing fast path and never + // touch the backdrop pipeline's working textures. has_backdrop := frame_has_backdrop() // Upload primitives to GPU (vertices, indices, SDF prims, and backdrop prims share one @@ -880,8 +880,8 @@ end :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window, clear_color: Color = DF draw_layer(device, window, cmd_buffer, render_texture, width, height, clear_color_f32, &layer) } - // Approach B finalization: when we rendered into source_texture, copy it to the swapchain. - // Single CopyGPUTextureToTexture call per frame, only when backdrop content was present. + // When we rendered into source_texture, copy it to the swapchain. Single + // CopyGPUTextureToTexture call per frame, only when backdrop content was present. if has_backdrop { copy_pass := sdl.BeginGPUCopyPass(cmd_buffer) sdl.CopyGPUTextureToTexture( diff --git a/draw/draw_qr/draw_qr.odin b/draw/draw_qr/draw_qr.odin index d7f8586..f9e9e9d 100644 --- a/draw/draw_qr/draw_qr.odin +++ b/draw/draw_qr/draw_qr.odin @@ -20,7 +20,7 @@ texture_size :: #force_inline proc(qrcode_buf: []u8) -> int { // // Returns ok=false when: // - qrcode_buf is invalid (qrcode.get_size returns 0). -// - texture_buf is smaller than to_texture_size(qrcode_buf). 
+// - texture_buf is smaller than texture_size(qrcode_buf). @(require_results) to_texture :: proc( qrcode_buf: []u8, diff --git a/draw/examples/backdrop.odin b/draw/examples/backdrop.odin index 6e033aa..2d3a6fb 100644 --- a/draw/examples/backdrop.odin +++ b/draw/examples/backdrop.odin @@ -10,8 +10,8 @@ import cyber "../cybersteel" // Backdrop example. // -// Verifies the Stage D bracket scheduler end-to-end. The demo is structured as three zones in -// one window so we can stress-test the cases that matter: +// Exercises the bracket scheduler end-to-end. The demo is structured as three zones in one +// window so we can stress-test the cases that matter: // // Zone 1 (top, base layer): animated colorful background + two side-by-side frosted panels // with DIFFERENT sigmas and DIFFERENT tints. Tests sigma grouping @@ -269,9 +269,8 @@ gaussian_blur :: proc() { // SPACE : reset to sigma=10 // T : toggle the test rectangle on top of the panel // -// Sigma is printed to the console label and to the title bar so you can correlate visual -// behavior with kernel state (which is also logged via the [backdrop] debug print in -// backdrop.odin's compute_blur_kernel callsite). +// Sigma is printed to the title bar so you can correlate visual behavior with the numeric +// value as you adjust it. gaussian_blur_debug :: proc() { if !sdl.Init({.VIDEO}) do os.exit(1) window := sdl.CreateWindow("Backdrop debug", 800, 600, {.HIGH_PIXEL_DENSITY}) diff --git a/draw/pipeline_2d_base.odin b/draw/pipeline_2d_base.odin index 6ae876f..a47e091 100644 --- a/draw/pipeline_2d_base.odin +++ b/draw/pipeline_2d_base.odin @@ -116,9 +116,9 @@ Gradient_Outline :: struct { // avoiding per-pixel trigonometry in the fragment shader. Only read when .Rotated is set. // // Named Base_2D_Primitive (not just Primitive) to disambiguate from Backdrop_Primitive in -// pipeline_2d_backdrop.odin. 
The two pipelines have unrelated GPU layouts and unrelated -// fragment-shader contracts; pairing each with its own primitive type keeps cross-references -// unambiguous when grepping the codebase. +// backdrop.odin. The two pipelines have unrelated GPU layouts and unrelated fragment-shader +// contracts; pairing each with its own primitive type keeps cross-references unambiguous +// when grepping the codebase. Base_2D_Primitive :: struct { bounds: [4]f32, // 0: min_x, min_y, max_x, max_y (world-space, pre-DPI) color: Color, // 16: u8x4, fill color / gradient start color / texture tint diff --git a/draw/shaders/source/backdrop_blur.frag b/draw/shaders/source/backdrop_blur.frag index 1ccf6de..7193a24 100644 --- a/draw/shaders/source/backdrop_blur.frag +++ b/draw/shaders/source/backdrop_blur.frag @@ -1,19 +1,18 @@ #version 450 core // Unified backdrop blur fragment shader. -// Handles both H-blur (mode 0, blurs the ¼-resolution downsample texture into -// the ¼-resolution h_blur texture) and V-blur+composite (mode 1, blurs h_blur -// vertically, masks via RRect SDF, applies tint, composites outline, and writes -// to the main render target with premultiplied alpha). +// Handles both the 1D separable blur passes (mode 0, used for BOTH the H-pass and V-pass; +// `direction` picks the axis) and the composite pass (mode 1, reads the fully-blurred +// working texture, masks via RRect SDF, applies tint, and writes to source_texture with +// premultiplied-over blending). Working textures are sized at the full swapchain resolution; +// downsampled content occupies only a sub-rect at downsample factor > 1 (set via viewport). // -// Following RAD's pattern, V-mode replaces a separate composite pass: the SDF -// discard limits V-blur work to the masked region, and the per-primitive tint -// is folded in. Output blends with the main render target via the standard -// premultiplied-over blend state (ONE, ONE_MINUS_SRC_ALPHA). 
+// The composite blends with source_texture via the standard premultiplied-over blend state +// (ONE, ONE_MINUS_SRC_ALPHA). // -// Backdrop primitives are tint-only — there is no outline. A specialized edge -// effect (e.g. liquid-glass-style refraction outlines) would be implemented -// as a dedicated primitive type with its own pipeline. +// Backdrop primitives are tint-only — there is no outline. A specialized edge effect +// (e.g. liquid-glass-style refraction outlines) would be implemented as a dedicated +// primitive type with its own pipeline. // // Two modes, structurally distinct: // @@ -30,11 +29,11 @@ // (gl_FragCoord.xy * inv_downsample_factor) * inv_working_size. // No kernel is applied here — the blur is already complete. // -// Splitting V-blur out of the composite pass (an earlier version combined them) was needed -// to avoid a horizontal-vs-vertical asymmetry artifact: when the V-blur sampled the H-blur -// output through the bilinear-upsample/SDF-mask/tint pipeline in one shader invocation, -// horizontal source features ended up looking sharper than vertical ones. Running V-blur as -// its own working→working pass (matching H's structure exactly) restores symmetry. +// V-blur is run as its own working→working pass rather than folded into the composite. The +// folded variant produced a horizontal-vs-vertical asymmetry artifact: when V-blur sampled +// the H-blur output through the bilinear-upsample/SDF-mask/tint pipeline in one shader +// invocation, horizontal source features ended up looking sharper than vertical ones. +// Matching V's structure exactly to H's restores symmetry. 
const uint MAX_KERNEL_PAIRS = 32; @@ -140,16 +139,16 @@ void main() { vec2 uv = (gl_FragCoord.xy * inv_downsample_factor) * inv_working_size; vec3 color = texture(blur_input_tex, uv).rgb; - // Tint composition (Option B semantics): inside the masked region the panel is fully - // opaque — it completely hides the original framebuffer content, just like real frosted - // glass and like iOS UIBlurEffect / CSS backdrop-filter. f_color.rgb specifies the tint - // color; f_color.a specifies the tint *mix strength* (NOT panel opacity). At alpha=0 we - // see the pure blur; at alpha=255 we see the blur fully multiplied by the tint color. + // Tint composition: inside the masked region the panel is fully opaque — it completely + // hides the original framebuffer content, just like real frosted glass and like iOS + // UIBlurEffect / CSS backdrop-filter. f_color.rgb specifies the tint color; f_color.a + // specifies the tint *mix strength* (NOT panel opacity). At alpha=0 we see the pure + // blur; at alpha=255 we see the blur fully multiplied by the tint color. // // Output is premultiplied to match the ONE, ONE_MINUS_SRC_ALPHA blend state. Coverage // (the SDF mask's edge AA) modulates only the alpha channel, never the panel-vs-source - // blend; that way edge pixels still feather correctly without re-introducing the bug - // where mid-panel pixels became semi-transparent. + // blend; that way edge pixels still feather correctly while mid-panel pixels stay fully + // opaque. mediump vec3 tinted = mix(color, color * f_color.rgb, f_color.a); mediump float coverage = sdf_alpha(d_n, h_n); out_color = vec4(tinted * coverage, coverage); diff --git a/draw/shaders/source/backdrop_blur.vert b/draw/shaders/source/backdrop_blur.vert index 879a6a2..6b2e313 100644 --- a/draw/shaders/source/backdrop_blur.vert +++ b/draw/shaders/source/backdrop_blur.vert @@ -1,18 +1,19 @@ #version 450 core // Unified backdrop blur vertex shader. 
-// Handles both H-blur (fullscreen triangle, mode 0) and V-blur+composite (instanced -// unit-quad over Backdrop_Primitive storage buffer, mode 1) for the second PSO of -// the backdrop bracket. The first PSO (downsample) uses backdrop_fullscreen.vert. +// Handles both the 1D separable blur passes (fullscreen triangle, mode 0; used for +// BOTH the H-pass and V-pass) and the composite pass (instanced unit-quad over +// Backdrop_Primitive storage buffer, mode 1) for the second PSO of the backdrop bracket. +// The first PSO (downsample) uses backdrop_fullscreen.vert. // // No vertex buffer for either mode. Mode 0 uses gl_VertexIndex 0..2 for a single // fullscreen triangle; mode 1 uses gl_VertexIndex 0..5 for a unit-quad (two // triangles, TRIANGLELIST topology) and gl_InstanceIndex to select the primitive. // -// Mode 0 viewport+scissor are CPU-set per layer-bracket to the work region (union -// AABB of backdrop primitives + 3*max_sigma, clamped to swapchain bounds). Mode 1 -// renders into the main render target with the screen-space orthographic projection; -// the per-primitive bounds drive the quad in screen space. +// Mode 0 viewport+scissor are CPU-set per sigma group to the work region (union AABB +// of that group's backdrop primitives + halo, clamped to swapchain bounds). Mode 1 +// renders into source_texture with the screen-space orthographic projection; the +// per-primitive bounds drive the quad in screen space. // // Backdrop primitives have NO rotation — backdrop sampling is in screen space, so // a rotated mask over a stationary blur sample would look wrong. @@ -46,11 +47,11 @@ layout(set = 1, binding = 0) uniform Uniforms { // vec2 and scalar tail packs tight to land the struct at a clean 48-byte // stride (a multiple of 16, so the array stride needs no rounding either). // Field semantics match the CPU-side Backdrop_Primitive declared in -// levlib/draw/pipeline_2d_backdrop.odin; keep both in sync. +// levlib/draw/backdrop.odin; keep both in sync. 
// -// Backdrop primitives are tint-only in v1: outline is intentionally absent. -// Future specialized effects (e.g. liquid-glass-style edges) would be a -// dedicated primitive type with its own pipeline rather than a flag bit here. +// Backdrop primitives are tint-only: outline is intentionally absent. Specialized +// edge effects (e.g. liquid-glass-style refraction outlines) would be a dedicated +// primitive type with its own pipeline rather than a flag bit here. struct Backdrop_Primitive { vec4 bounds; // 0-15: min_xy, max_xy (world-space) vec4 radii; // 16-31: per-corner radii (physical px) diff --git a/draw/shaders/source/backdrop_downsample.frag b/draw/shaders/source/backdrop_downsample.frag index 7c4e1d3..b933991 100644 --- a/draw/shaders/source/backdrop_downsample.frag +++ b/draw/shaders/source/backdrop_downsample.frag @@ -2,9 +2,9 @@ // Backdrop downsample fragment shader. // Reads source_texture (full-resolution snapshot of pre-bracket framebuffer contents) and -// writes a downsampled copy at factor 1, 2, 4, 8, or 16. The output is the working texture -// (sized at full swapchain resolution); larger factors only fill a sub-rect of it via the -// CPU-set viewport. See backdrop.odin for the factor selection table (Flutter-style). +// writes a downsampled copy at factor 1, 2, or 4. The output is the working texture (sized +// at full swapchain resolution); larger factors only fill a sub-rect of it via the CPU-set +// viewport. See backdrop.odin for the factor selection table (Flutter-style). // // Shader paths by factor: // @@ -15,15 +15,12 @@ // factor=2: each output covers a 2×2 source block. Single bilinear tap at the shared // corner reads all 4 source pixels with 0.25 weight. // -// factor>=4: each output covers a (factor)×(factor) source block. We use 4 bilinear taps, -// each at the shared corner of a (factor/2)×(factor/2) sub-block. 
Each tap reads -// 4 source pixels uniformly; combined, the 4 taps sample 16 source pixels arranged -// uniformly across the block. This is an approximation of a true (factor)² box -// filter — exact at factor=4 (16 pixels = full coverage), undersampled at factor=8 -// (16 pixels of 64) and factor=16 (16 of 256). Flutter uses a richer 13-tap COD- -// style downsample shader at high factors; we accept the simpler 4-tap pattern -// for now since the high-factor cases come with large kernels that mask any -// residual aliasing. +// factor=4: each output covers a 4×4 source block. We use 4 bilinear taps, each at the +// shared corner of a 2×2 sub-block. Each tap reads 4 source pixels uniformly; +// combined, the 4 taps sample 16 source pixels arranged uniformly across the +// block (full coverage at factor=4). The factor>=4 path is structured so the +// same shader code would extend to factor=8 (16 pixels of 64) or factor=16 (16 +// of 256) if the CPU-side cap is ever raised, though the current cap is 4. // // The viewport+scissor are set by the CPU to limit output to the layer's work region in // working-texture coords (work_region_phys / factor), clamped to the texture bounds. 
diff --git a/draw/shaders/source/base_2d.vert b/draw/shaders/source/base_2d.vert index 7bd5c15..c903d1c 100644 --- a/draw/shaders/source/base_2d.vert +++ b/draw/shaders/source/base_2d.vert @@ -45,7 +45,7 @@ layout(std430, set = 0, binding = 0) readonly buffer Base_2D_Primitives { // ---------- Entry point ---------- void main() { if (mode == 0u) { - // ---- Mode 0: Tessellated (legacy) ---- + // ---- Mode 0: Tessellated (used for text and arbitrary user geometry) ---- f_color = v_color; f_local_or_uv = v_uv; f_params = vec4(0.0); diff --git a/draw/shapes.odin b/draw/shapes.odin index 0c9b712..bd12413 100644 --- a/draw/shapes.odin +++ b/draw/shapes.odin @@ -53,7 +53,8 @@ emit_rectangle :: proc(x, y, width, height: f32, color: Color, vertices: []Verte } // Internal — submit an SDF primitive with optional texture binding. -// Replaces the old prepare_sdf_primitive and prepare_sdf_primitive_textured. +// The texture-aware counterpart of `draw.prepare_sdf_primitive`; lets shape procs route a +// texture_id and sampler into the sub-batch without growing the public API. @(private) prepare_sdf_primitive_ex :: proc( layer: ^Layer, diff --git a/qrcode/examples/examples.odin b/qrcode/examples/examples.odin index a3d4be1..d03d820 100644 --- a/qrcode/examples/examples.odin +++ b/qrcode/examples/examples.odin @@ -9,46 +9,45 @@ import qr ".." 
main :: proc() { //----- General setup ---------------------------------- - { - // Temp - track_temp: mem.Tracking_Allocator - mem.tracking_allocator_init(&track_temp, context.temp_allocator) - context.temp_allocator = mem.tracking_allocator(&track_temp) + // Temp + track_temp: mem.Tracking_Allocator + mem.tracking_allocator_init(&track_temp, context.temp_allocator) + context.temp_allocator = mem.tracking_allocator(&track_temp) - // Default - track: mem.Tracking_Allocator - mem.tracking_allocator_init(&track, context.allocator) - context.allocator = mem.tracking_allocator(&track) - // Log a warning about any memory that was not freed by the end of the program. - // This could be fine for some global state or it could be a memory leak. - defer { - // Temp allocator - if len(track_temp.bad_free_array) > 0 { - fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array)) - for entry in track_temp.bad_free_array { - fmt.eprintf("- %p @ %v\n", entry.memory, entry.location) - } - mem.tracking_allocator_destroy(&track_temp) + // Default + track: mem.Tracking_Allocator + mem.tracking_allocator_init(&track, context.allocator) + context.allocator = mem.tracking_allocator(&track) + // Log a warning about any memory that was not freed by the end of the program. + // This could be fine for some global state or it could be a memory leak. 
+ defer { + // Temp allocator + if len(track_temp.bad_free_array) > 0 { + fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array)) + for entry in track_temp.bad_free_array { + fmt.eprintf("- %p @ %v\n", entry.memory, entry.location) } - // Default allocator - if len(track.allocation_map) > 0 { - fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map)) - for _, entry in track.allocation_map { - fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location) - } - } - if len(track.bad_free_array) > 0 { - fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array)) - for entry in track.bad_free_array { - fmt.eprintf("- %p @ %v\n", entry.memory, entry.location) - } - } - mem.tracking_allocator_destroy(&track) + mem.tracking_allocator_destroy(&track_temp) } - // Logger - context.logger = log.create_console_logger() - defer log.destroy_console_logger(context.logger) + // Default allocator + if len(track.allocation_map) > 0 { + fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map)) + for _, entry in track.allocation_map { + fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location) + } + } + if len(track.bad_free_array) > 0 { + fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array)) + for entry in track.bad_free_array { + fmt.eprintf("- %p @ %v\n", entry.memory, entry.location) + } + } + mem.tracking_allocator_destroy(&track) } + // Logger + context.logger = log.create_console_logger() + defer log.destroy_console_logger(context.logger) + args := os.args if len(args) < 2 { diff --git a/vendor/lmdb/examples/examples.odin b/vendor/lmdb/examples/examples.odin index 4a2a805..4cc2d6b 100644 --- a/vendor/lmdb/examples/examples.odin +++ b/vendor/lmdb/examples/examples.odin @@ -14,46 +14,45 @@ DB_PATH :: "out/debug/lmdb_example_db" main :: proc() { //----- General setup ---------------------------------- - { 
- // Temp - track_temp: mem.Tracking_Allocator - mem.tracking_allocator_init(&track_temp, context.temp_allocator) - context.temp_allocator = mem.tracking_allocator(&track_temp) + // Temp + track_temp: mem.Tracking_Allocator + mem.tracking_allocator_init(&track_temp, context.temp_allocator) + context.temp_allocator = mem.tracking_allocator(&track_temp) - // Default - track: mem.Tracking_Allocator - mem.tracking_allocator_init(&track, context.allocator) - context.allocator = mem.tracking_allocator(&track) - // Log a warning about any memory that was not freed by the end of the program. - // This could be fine for some global state or it could be a memory leak. - defer { - // Temp allocator - if len(track_temp.bad_free_array) > 0 { - fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array)) - for entry in track_temp.bad_free_array { - fmt.eprintf("- %p @ %v\n", entry.memory, entry.location) - } - mem.tracking_allocator_destroy(&track_temp) + // Default + track: mem.Tracking_Allocator + mem.tracking_allocator_init(&track, context.allocator) + context.allocator = mem.tracking_allocator(&track) + // Log a warning about any memory that was not freed by the end of the program. + // This could be fine for some global state or it could be a memory leak. 
+ defer { + // Temp allocator + if len(track_temp.bad_free_array) > 0 { + fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array)) + for entry in track_temp.bad_free_array { + fmt.eprintf("- %p @ %v\n", entry.memory, entry.location) } - // Default allocator - if len(track.allocation_map) > 0 { - fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map)) - for _, entry in track.allocation_map { - fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location) - } - } - if len(track.bad_free_array) > 0 { - fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array)) - for entry in track.bad_free_array { - fmt.eprintf("- %p @ %v\n", entry.memory, entry.location) - } - } - mem.tracking_allocator_destroy(&track) + mem.tracking_allocator_destroy(&track_temp) } - // Logger - context.logger = log.create_console_logger() - defer log.destroy_console_logger(context.logger) + // Default allocator + if len(track.allocation_map) > 0 { + fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map)) + for _, entry in track.allocation_map { + fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location) + } + } + if len(track.bad_free_array) > 0 { + fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array)) + for entry in track.bad_free_array { + fmt.eprintf("- %p @ %v\n", entry.memory, entry.location) + } + } + mem.tracking_allocator_destroy(&track) } + // Logger + context.logger = log.create_console_logger() + defer log.destroy_console_logger(context.logger) + environment: ^mdb.Env