libusb cleanup

LMDB cleanup
2026-04-22 04:47:43 +00:00 · 2026-04-22 04:47:43 +00:00
89 changed files with 3944 additions and 10804 deletions
@@ -75,21 +75,6 @@
    "command": "odin run draw/examples -debug -out=out/debug/draw-examples -- textures",
    "cwd": "$ZED_WORKTREE_ROOT",
  },
-  {
-    "label": "Run draw clay-borders example",
-    "command": "odin run draw/examples -debug -out=out/debug/draw-examples -- clay-borders",
-    "cwd": "$ZED_WORKTREE_ROOT",
-  },
-  {
-    "label": "Run draw gaussian-blur example",
-    "command": "odin run draw/examples -debug -out=out/debug/draw-examples -- gaussian-blur",
-    "cwd": "$ZED_WORKTREE_ROOT",
-  },
-  {
-    "label": "Run draw gaussian-blur-debug example",
-    "command": "odin run draw/examples -debug -out=out/debug/draw-examples -- gaussian-blur-debug",
-    "cwd": "$ZED_WORKTREE_ROOT",
-  },
  {
    "label": "Run qrcode basic example",
    "command": "odin run qrcode/examples -debug -out=out/debug/qrcode-examples -- basic",
@@ -5,73 +5,38 @@ Clay UI integration.

 ## Current state

-The renderer uses a single unified `Core_2D` (`TRIANGLELIST` pipeline) with three submission
-modes dispatched by a push constant. The split is by **vertex coordinate space**, not by what the
-fragment shader does — modes 0 and 2 share the same fragment-shader path (kind 0) and differ only
-in whether the vertex shader applies `dpi_scale` to incoming positions:
+The renderer uses a single unified `Pipeline_2D_Base` (`TRIANGLELIST` pipeline) with two submission
+modes dispatched by a push constant:

- **Mode 0 (Tessellated):** Vertex buffer contains real geometry in _logical_ pixels. The vertex
-  shader scales by `dpi_scale` before projecting. Used for single-pixel points (`tess.pixel`),
-  arbitrary user geometry (`tess.triangle`, `tess.triangle_aa`, `tess.triangle_lines`,
-  `tess.triangle_fan`, `tess.triangle_strip`), and any raw vertex geometry submitted via
-  `prepare_shape`. The fragment shader premultiplies the texture sample (`t.rgb *= t.a`) and
-  computes `out = color * t`.
-
- **Mode 2 (Text):** Vertex buffer contains real geometry in _physical_ pixels. SDL_ttf's GPU text
-  engine lays out glyphs in physical pixels (`TTF_SetFontSizeDPI` is called with `72 * dpi_scale`),
-  so `prepare_text` adds an anchor offset that is itself snapped to integer physical pixels for
-  atlas-aligned bilinear sampling, then writes vertices straight to the buffer. The vertex shader
-  must NOT rescale these vertices. Same fragment-shader kind as Tessellated; same indexed draws
-  into SDL_ttf atlas textures; the only difference is the coordinate space of the input. Mode 2
-  exists because integer-physical-pixel snapping is the load-bearing property of crisp glyph
-  rendering and CPU is the only place that snap can happen once-per-text-element instead of
-  per-vertex.
+- **Mode 0 (Tessellated):** Vertex buffer contains real geometry. Used for text (indexed draws into
+  SDL_ttf atlas textures), axis-aligned sharp-corner rectangles (already optimal as 2 triangles),
+  per-vertex color gradients (`rectangle_gradient`, `circle_gradient`), angular-clipped circle
+  sectors (`circle_sector`), and arbitrary user geometry (`triangle`, `triangle_fan`,
+  `triangle_strip`). The fragment shader computes `out = color * texture(tex, uv)`.

 - **Mode 1 (SDF):** A static 6-vertex unit-quad buffer is drawn instanced, with per-primitive
-  `Core_2D_Primitive` structs (96 bytes each) uploaded each frame to a GPU storage buffer. The vertex
-  shader reads `primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners +
-  primitive bounds. The fragment shader dispatches on `Shape_Kind` (encoded in the low byte of
-  `Core_2D_Primitive.flags`) to evaluate one of four signed distance functions:
-  - **RRect** (kind 1) — `sdRoundedBox` with per-corner radii. Covers rectangles (sharp or rounded),
-    circles (uniform radii = half-size), and line segments / capsules (rotated RRect with uniform
-    radii = half-thickness). Covers filled, outlined, textured, and gradient-filled variants.
-  - **NGon** (kind 2) — `sdRegularPolygon` for regular N-sided polygons.
-  - **Ellipse** (kind 3) — `sdEllipseApprox`, an approximate ellipse SDF suitable for UI rendering.
-  - **Ring_Arc** (kind 4) — annular ring with optional angular clipping via pre-computed edge
-    normals. Covers full rings, partial arcs, and pie slices (`inner_radius = 0`).
+  `Primitive` structs uploaded each frame to a GPU storage buffer. The vertex shader reads
+  `primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners + primitive
+  bounds. The fragment shader dispatches on `Shape_Kind` to evaluate the correct signed distance
+  function analytically.

-All SDF shapes support fill, outline, solid color, 2-color linear gradients, 2-color radial
-gradients, and texture fills via `Shape_Flags` (see `core_2d.odin`). The texture UV rect
-(`uv_rect: [4]f32`) and the gradient/outline parameters (`effects: Gradient_Outline`) live in their
-own 16-byte slots in `Core_2D_Primitive`, so a primitive can carry texture and outline simultaneously.
-Gradient and texture remain mutually exclusive at the fill-source level (a Brush variant chooses one
-or the other) since they share the worst-case fragment-shader register path.
+Seven SDF shape kinds are implemented:

-All SDF shapes produce mathematically exact curves with analytical anti-aliasing via `smoothstep` —
-no tessellation, no piecewise-linear approximation. A rounded rectangle is 1 primitive (96 bytes)
-instead of ~250 vertices (~5000 bytes).
+1. **RRect** — rounded rectangle with per-corner radii (iq's `sdRoundedBox`)
+2. **Circle** — filled or stroked circle
+3. **Ellipse** — exact signed-distance ellipse (iq's iterative `sdEllipse`)
+4. **Segment** — capsule-style line segment with rounded caps
+5. **Ring_Arc** — annular ring with angular clipping for arcs
+6. **NGon** — regular polygon with arbitrary side count and rotation
+7. **Polyline** — decomposed into independent `Segment` primitives per adjacent point pair

-The main pipeline's register budget is **≤24 registers** (see "Main/effects split: register pressure"
-in the pipeline plan below for the full cliff/margin analysis and SBC architecture context).
-The fragment shader's estimated peak footprint is ~22–26 fp32 VGPRs (~16–22 fp16 VGPRs on architectures
-with native mediump) via manual live-range analysis. The dominant peak is the Ring_Arc kind path
-(wedge normals + inner/outer radii + dot-product temporaries live simultaneously with carried state
-like `f_color`, `f_uv_rect`/`f_effects`, and `half_size_ppx`). RRect is 1–2 regs lower
-(`corner_radii_ppx` vec4 replaces the separate inner/outer + normal pairs). NGon and Ellipse are lighter still. Real compilers
-apply live-range coalescing, mediump-to-fp16 promotion, and rematerialization that typically shave
-2–4 regs from hand-counted estimates — the conservative 26-reg upper bound is expected to compile
-down to within the 24-register budget, but this must be verified with `malioc` (see "Verifying
-register counts" below). On V3D and Bifrost architectures (16-register cliff), the compiler
-statically allocates registers for the worst-case path (Ring_Arc) regardless of which kind any given
-fragment actually evaluates, so all fragments pay the occupancy cost of the heaviest branch. This is
-a documented limitation, not a design constraint (see "Known limitations: V3D and Bifrost" below).
+All SDF shapes support fill and stroke modes via `Shape_Flags`, and produce mathematically exact
+curves with analytical anti-aliasing via `smoothstep` — no tessellation, no piecewise-linear
+approximation. A rounded rectangle is 1 primitive (64 bytes) instead of ~250 vertices (~5000 bytes).

-MSAA is intentionally not supported. SDF text and shapes compute fragment coverage analytically
-via `smoothstep`, so they don't benefit from multisampling. Tessellated user geometry submitted via
-`prepare_shape` is rendered without anti-aliasing — if AA is required for tessellated content, the
-caller must render it to their own offscreen target and submit the result as a texture. This
-decision matches RAD Debugger's architecture and aligns with the SBC target (Mali Valhall, where
-MSAA's per-tile bandwidth multiplier is expensive).
+MSAA is opt-in (default `._1`, no MSAA) via `Init_Options.msaa_samples`. SDF rendering does not
+benefit from MSAA because fragment coverage is computed analytically. MSAA remains useful for text
+glyph edges and tessellated user geometry if desired.

 ## 2D rendering pipeline plan

@@ -85,23 +50,22 @@ primitives and effects can be added to the library without architectural changes
 The 2D renderer uses three GPU pipelines, split by **register pressure** (main vs effects) and
 **render-pass structure** (everything vs backdrop):

-1. **Main pipeline** — shapes (SDF and tessellated), text, and textured rectangles. Register budget:
-   **≤24 registers** (full occupancy on Valhall and all desktop GPUs). Handles 90%+ of all fragments
-   in a typical frame.
+1. **Main pipeline** — shapes (SDF and tessellated), text, and textured rectangles. Low register
+   footprint (~18–24 registers per thread). Runs at full GPU occupancy on every architecture.
+   Handles 90%+ of all fragments in a typical frame.

 2. **Effects pipeline** — drop shadows, inner shadows, outer glow, and similar ALU-bound blur
-   effects. Register budget: **≤56 registers** (targets Valhall's second cliff at 64; reduced
-   occupancy at the first cliff is accepted by design). Each effects primitive includes the base
+   effects. Medium register footprint (~48–60 registers). Each effects primitive includes the base
   shape's SDF so that it can draw both the effect and the shape in a single fragment pass, avoiding
   redundant overdraw. Separated from the main pipeline to protect main-pipeline occupancy on
   low-end hardware (see register analysis below).

 3. **Backdrop pipeline** — frosted glass, refraction, and any effect that samples the current render
   target as input. Implemented as a multi-pass sequence (downsample, separable blur, composite),
-   where each individual sub-pass has a register budget of **≤24 registers** (full occupancy on
-   Valhall). Separated from the other pipelines because it structurally requires ending the current
-   render pass and copying the render target before any backdrop-sampling fragment can execute — a
-   command-buffer-level boundary that cannot be avoided regardless of shader complexity.
+   where each individual pass has a low-to-medium register footprint (~15–40 registers). Separated
+   from the other pipelines because it structurally requires ending the current render pass and
+   copying the render target before any backdrop-sampling fragment can execute. This is a
+   command-buffer-level boundary that cannot be avoided regardless of shader complexity.

 A typical UI frame with no effects uses 1 pipeline bind and 0 switches. A frame with drop shadows
 uses 2 pipelines and 1 switch. A frame with shadows and frosted glass uses all 3 pipelines and 2
@@ -117,113 +81,56 @@ code) or many per-primitive-type pipelines (no branching overhead, lean per-shad

 A GPU shader core has a fixed register pool shared among all concurrent threads. The compiler
 allocates registers pessimistically based on the worst-case path through the shader. If the shader
-contains both a 24-register RRect SDF and a 56-register drop-shadow blur, _every_ fragment — even
-trivial RRects — is allocated 56 registers. This directly reduces **occupancy** (the number of
+contains both a 20-register RRect SDF and a 48-register drop-shadow blur, _every_ fragment — even
+trivial RRects — is allocated 48 registers. This directly reduces **occupancy** (the number of
 warps/wavefronts that can run simultaneously), which reduces the GPU's ability to hide memory
 latency.

-Each GPU architecture has discrete **occupancy cliffs** — register counts above which the number of
-concurrent threads drops in a step. Below the cliff, adding registers has zero occupancy cost. One
-register over, throughput drops sharply.
+Each GPU architecture has a **register cliff** — a threshold above which occupancy starts dropping.
+Below the cliff, adding registers has zero occupancy cost.

-**Target architecture: ARM Mali Valhall (32-register first cliff).** The binding constraint for our
-register budgets comes from the SBC (single-board computer) market, where Mali Valhall is the
-dominant current GPU architecture:
+On consumer Ampere/Ada GPUs (RTX 30xx/40xx, 65,536 regs/SM, max 1,536 threads/SM, cliff at ~43 regs):

- **RK3588-class boards** (Orange Pi 5, Radxa Rock 5, Khadas Edge 2, NanoPi R6, Banana Pi M7) ship
-  **Mali-G610** (Valhall). This is the dominant non-Pi SBC platform. First occupancy cliff at **32
-  registers**, second cliff at **64 registers**.
- **ARM Mali Valhall** (G57, G77, G78, G610, G710, G715; 2019+) and **5th-gen / Mali-G1** (2024+):
-  same cliff structure — first at 32, second at 64.
- **ARM Mali Bifrost** (G31, G51, G52, G71, G72, G76; ~2016–2018): first cliff at **16 registers**.
-  Legacy; found on older budget boards (Allwinner H6/H618, Amlogic S922X). See Known limitations
-  below.
- **Broadcom V3D 4.x / 7.x** (Raspberry Pi 4 / Pi 5): first cliff at **16 registers**. Outlier in
-  the current SBC market. See Known limitations below.
- **Apple M3+**: Dynamic Caching (register file virtualization) eliminates the static cliff entirely.
-  Register allocation happens at runtime based on actual usage.
- **Qualcomm Adreno**: dynamic register allocation with soft thresholds; no hard cliff.
- **NVIDIA desktop** (Ampere/Ada): cliff at ~43 registers. Not a constraint for any of our pipelines.
+| Register allocation     | Reg-limited threads | Actual (hw-capped) | Occupancy |
+| ----------------------- | ------------------- | ------------------ | --------- |
+| 20 regs (main pipeline) | 3,276               | 1,536              | 100%      |
+| 32 regs                 | 2,048               | 1,536              | 100%      |
+| 48 regs (effects)       | 1,365               | 1,365              | ~89%      |

-**Register budgets and margin.** We target Valhall's 32-register first cliff for the main and
-backdrop pipelines, and Valhall's 64-register second cliff for the effects pipeline, each with **8
-registers of margin**:
+On Volta/A100 GPUs (65,536 regs/SM, max 2,048 threads/SM, cliff at ~32 regs):

-| Pipeline            | Cliff targeted         | Margin | Register budget   | Rationale                                                                                     |
-| ------------------- | ---------------------- | ------ | ----------------- | --------------------------------------------------------------------------------------------- |
-| Main pipeline       | 32 (Valhall 1st cliff) | 8      | **≤24 regs**      | Handles 90%+ of frame fragments; must run at full occupancy                                   |
-| Backdrop sub-passes | 32 (Valhall 1st cliff) | 8      | **≤24 regs** each | Multi-pass structure keeps each pass small; no reason to give up occupancy                    |
-| Effects pipeline    | 64 (Valhall 2nd cliff) | 8      | **≤56 regs**      | Reduced occupancy at 1st cliff accepted by design — the entire point of splitting effects out |
+| Register allocation     | Reg-limited threads | Actual (hw-capped) | Occupancy |
+| ----------------------- | ------------------- | ------------------ | --------- |
+| 20 regs (main pipeline) | 3,276               | 2,048              | 100%      |
+| 32 regs                 | 2,048               | 2,048              | 100%      |
+| 48 regs (effects)       | 1,365               | 1,365              | ~67%      |

-**Why 8 registers of margin.** Targeting the cliff exactly is fragile. Three forces push register
-counts upward over a shader's lifetime:
+On low-end mobile (ARM Mali Bifrost/Valhall, 64 regs/thread, cliff fixed at 32 regs):

-1. **Compiler version changes.** Mali driver releases (r35p0 → r55p0 etc.) ship new register
-   allocators. Shaders typically drift ±2–3 registers between versions on unchanged source.
-2. **Feature additions.** Each new effect, flag, or uniform adds 1–4 live registers. A new gradient
-   mode or outline option lands in this range.
-3. **Precision regressions.** A `mediump` demoted to `highp` (by bug fix, compiler heuristic change,
-   or a contributor not knowing) costs 2 registers per affected `vec4`.
+| Register allocation  | Occupancy                  |
+| -------------------- | -------------------------- |
+| 0–32 regs (main)     | 100% (full thread count)   |
+| 33–64 regs (effects) | ~50% (thread count halves) |

-Realistic creep over a couple of years is 4–8 registers. The cost of conservatism is zero — a shader
-at 24 regs runs identically to one at 32 on every Valhall device. The cost of crossing the cliff is
-a 2× throughput drop with no warning. Asymmetric costs justify a generous margin.
+Mali's cliff at 32 registers is the binding constraint. On desktop, moving from 20 to 48 registers
+costs little on consumer Ampere/Ada (100% to ~89% occupancy) and more on Volta/A100 (100% to ~67%);
+on Mali it is a hard 2× throughput reduction. The main/effects split protects 90%+ of a frame's
+fragments (shapes, text, textures) from the effects pipeline's register cost.

-**Why the main/effects split exists.** If the main pipeline shader contained both the 24-register
-SDF path and the ~50-register drop-shadow blur, every fragment — even trivial RRects — would be
-allocated ~50 registers. On Valhall this crosses the 32-register first cliff, halving occupancy for
-90%+ of the frame's fragments. Separating effects into their own pipeline means the main pipeline
-stays at ≤24 registers (full Valhall occupancy), and only the small fraction of fragments that
-actually render effects (~5–10% in a typical UI) run at reduced occupancy.
-
-For the effects pipeline's drop-shadow shader — analytical erf-approximation blur (~80 FLOPs, no
-texture samples) — 50% occupancy on Valhall roughly halves throughput. At 4K with 1.5× overdraw (~12.4M
+For the effects pipeline's drop-shadow shader — erf-approximation blur math with several texture
+fetches — 50% occupancy on Mali roughly halves throughput. At 4K with 1.5× overdraw (~12.4M
 fragments), a single unified shader containing the shadow branch would cost ~4ms instead of ~2ms on
-Valhall. This is a per-frame multiplier even when the heavy branch is never taken, because the
+low-end mobile. This is a per-frame multiplier even when the heavy branch is never taken, because the
 compiler allocates registers for the worst-case path.

-The effects pipeline's ≤56-register budget keeps it under Valhall's second cliff at 64, yielding
-50–67% occupancy on effected shapes. This is acceptable for the small fraction of frame fragments
-that effects cover.
+All main-pipeline members (SDF shapes, tessellated geometry, text, textured rectangles) cluster at
+12–24 registers — below the cliff on every architecture — so unifying them costs nothing in
+occupancy.

-**Note on Apple M3+ GPUs:** Apple's M3 Dynamic Caching allocates registers at runtime based on
-actual usage rather than worst-case. This eliminates the static register-pressure argument on M3 and
-later, but the split remains useful for isolating blur ALU complexity and keeping the backdrop
-texture-copy out of the main render pass.
-
-**Note on NVIDIA desktop GPUs:** On consumer Ampere/Ada (cliff at ~43 regs), even the effects
-pipeline's ≤56-register budget only reduces occupancy to ~89% — well within noise. On Volta/A100
-(cliff at ~32 regs), the effects pipeline drops to ~67%. In both cases the main pipeline runs at
-100% occupancy. Desktop GPUs are not the binding constraint; Valhall is.
-
-#### Known limitations: V3D and Bifrost (16-register cliff)
-
-Broadcom V3D 4.x / 7.x (Raspberry Pi 4 / Pi 5) and ARM Mali Bifrost (G31, G51, G52, G71, G72, G76)
-have a first occupancy cliff at **16 registers**. All three of our pipelines exceed this cliff — even
-the main pipeline's ≤24-register budget is above 16. On these architectures, every shader runs at
-reduced occupancy regardless of which shape kind or effect is active.
-
-Restoring full occupancy on V3D / Bifrost would require a fundamentally different shader
-architecture: per-shape-kind pipeline splitting (one pipeline per SDF kind, each with a minimal
-register footprint under 16). This conflicts with the unified-pipeline design that enables single
-draw calls per scissor, submission-order Z preservation, and low PSO compilation cost. It would
-effectively be the GPUI-style approach whose tradeoffs are analyzed in "Why not per-primitive-type
-pipelines" below.
-
-We treat this as a documented limitation, not a design constraint. The 16-register cliff is legacy
-(Bifrost) or a single-vendor outlier (V3D). The dominant current SBC platform (RK3588 / Mali-G610)
-and all mainstream mobile and desktop GPUs have cliffs at 32 or higher. The long-term direction in
-GPU architecture is toward eliminating static cliffs entirely (Apple Dynamic Caching, Adreno dynamic
-allocation).
-
-#### Verifying register counts
-
-The register estimates in this document are hand-counted via manual live-range analysis (see Current
-state). Shader changes that affect the main or effects pipeline should be verified with `malioc`
-(ARM Mali Offline Compiler) against current Valhall driver versions before merging. `malioc` reports
-exact register allocation, spilling, and occupancy for each Mali generation. On desktop, Radeon GPU
-Analyzer (RGA) and NVIDIA Nsight provide equivalent data. Replacing the hand-counted estimates with
-measured `malioc` numbers is a follow-up task.
+**Note on Apple M3+ GPUs:** Apple's M3 introduces Dynamic Caching (register file virtualization),
+which allocates registers at runtime based on actual usage rather than worst-case. This weakens the
+static register-pressure argument on M3 and later, but the split remains useful for isolating blur
+ALU complexity and keeping the backdrop texture-copy out of the main render pass.
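
The occupancy tables in this section can be reproduced with a few lines of arithmetic. The sketch below is a simplified model, not a vendor tool: the function names are hypothetical, the register-file and thread-limit numbers are the ones quoted in the tables, and it ignores register allocation granularity and every other occupancy limiter (shared memory, warp slots, and so on).

```python
def sm_occupancy(regs_per_thread, regs_per_sm=65_536, max_threads_per_sm=1_536):
    """Return (register-limited threads, resident threads, occupancy fraction).

    Simplified model: floor division of the register file by the per-thread
    allocation, capped at the hardware thread limit. Defaults are the consumer
    Ampere/Ada figures quoted in the table.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    reg_limited = regs_per_sm // regs_per_thread
    resident = min(reg_limited, max_threads_per_sm)
    return reg_limited, resident, resident / max_threads_per_sm


def mali_occupancy(regs_per_thread):
    """Step model from the Mali table: thread count halves above 32 registers."""
    if not 0 <= regs_per_thread <= 64:
        raise ValueError("the table covers 0-64 registers per thread")
    return 1.0 if regs_per_thread <= 32 else 0.5


if __name__ == "__main__":
    for name, max_threads in (("Ampere/Ada", 1_536), ("Volta/A100", 2_048)):
        for regs in (20, 32, 48):
            limited, resident, occ = sm_occupancy(regs, max_threads_per_sm=max_threads)
            print(f"{name}: {regs} regs -> {limited} limited, {resident} resident, {occ:.0%}")
    for regs in (24, 48):
        print(f"Mali: {regs} regs -> {mali_occupancy(regs):.0%}")
```

Measured register counts from an offline shader compiler would replace the `regs_per_thread` inputs here; the model only shows where the cliffs sit for a given allocation.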

 #### Backdrop split: render-pass structure

@@ -233,11 +140,10 @@ render target must be copied to a separate texture via `CopyGPUTextureToTexture`
 level operation that requires ending the current render pass. This boundary exists regardless of
 shader complexity and cannot be optimized away.

-The backdrop pipeline's individual shader passes (downsample, separable blur, composite) are budgeted
-at ≤24 registers each (same as the main pipeline), so merging them into the effects pipeline would
-cause no occupancy problem. But the render-pass boundary makes merging structurally impossible —
-effects draws happen inside the main render pass, backdrop draws happen inside their own bracketed
-pass sequence.
+The backdrop pipeline's individual shader passes (downsample, separable blur, composite) are
+register-light (~15–40 regs each), so merging them into the effects pipeline would cause no occupancy
+problem. But the render-pass boundary makes merging structurally impossible — effects draws happen
+inside the main render pass, backdrop draws happen inside their own bracketed pass sequence.

 #### Why not per-primitive-type pipelines (GPUI's approach)

@@ -266,9 +172,9 @@ API where each layer draws shadows before quads before glyphs. Our design avoids
 submission order is draw order, no layer juggling required.

 **PSO compilation costs multiply.** Each pipeline takes 1–50ms to compile on Metal/Vulkan/D3D12 at
-first use. 7 pipelines is ~175ms cold startup; 3 pipelines is ~75ms. Adding state axes (blend
-modes, color formats) multiplies combinatorially — a 2.3× larger variant matrix per additional
-axis with 7 pipelines vs 3.
+first use. 7 pipelines is ~175ms cold startup; 3 pipelines is ~75ms. Adding state axes (MSAA
+variants, blend modes, color formats) multiplies combinatorially — a 2.3× larger variant matrix per
+additional axis with 7 pipelines vs 3.
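
As a rough illustration of the compile-cost arithmetic above, the sketch below assumes a flat 25 ms per pipeline (the value implied by ~175 ms for 7 pipelines and ~75 ms for 3; the text gives a 1–50 ms range) and treats each state axis as a plain multiplier. The function names and the example axis sizes are hypothetical.

```python
from math import prod

MS_PER_PIPELINE = 25  # assumption: implied by ~175 ms / 7 and ~75 ms / 3


def pso_count(pipelines, axis_sizes=()):
    """Number of pipeline state objects: base pipelines times every state axis."""
    return pipelines * prod(axis_sizes)


def cold_start_ms(pipelines, axis_sizes=(), ms_per_pipeline=MS_PER_PIPELINE):
    """Estimated cold-start compile time if every variant is compiled once."""
    return pso_count(pipelines, axis_sizes) * ms_per_pipeline


if __name__ == "__main__":
    axes = (2, 3)  # hypothetical: 2 blend modes x 3 color formats
    for pipelines in (7, 3):
        print(pipelines, pso_count(pipelines, axes), cold_start_ms(pipelines, axes))
```

Whatever axes are added, the ratio between the two designs stays at 7/3, which is the 2.3× figure quoted above.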

 **Branching cost comparison: unified vs per-kind in the effects pipeline.** The effects pipeline is
 the strongest candidate for per-kind splitting because effect branches are heavier than shape
@@ -349,23 +255,17 @@ There are three categories of branch condition in a fragment shader, ranked by c

 #### Which category our branches fall into

-Our design has three branch points:
+Our design has two branch points:

 1. **`mode` (push constant): tessellated vs. SDF.** This is category 2 — uniform per draw call.
   Every thread in every warp of a draw call sees the same `mode` value. **Zero divergence, zero
   cost.**

-2. **`kind` (flat varying from storage buffer): SDF shape kind dispatch.** This is category 3.
-   The low byte of `Primitive.flags` encodes `Shape_Kind` (RRect, NGon, Ellipse, Ring_Arc), passed
-   to the fragment shader as a `flat` varying. All fragments of one primitive's quad receive the same
-   kind value. The fragment shader's `if/else if` chain selects the appropriate SDF function (~15–30
-   instructions per kind). Divergence occurs only at primitive boundaries where adjacent quads have
-   different kinds.
-
-3. **`flags` (flat varying from storage buffer): gradient/texture/outline mode.** Also category 3.
-   The upper bits of `Primitive.flags` encode `Shape_Flags`, controlling gradient vs. texture vs.
-   solid color selection and outline rendering — all lightweight branches (3–8 instructions per
-   path). Divergence at primitive boundaries between different flag combinations has negligible cost.
+2. **`shape_kind` (flat varying from storage buffer): which SDF to evaluate.** This is category 3.
+   The `flat` interpolation qualifier ensures that all fragments rasterized from one primitive's quad
+   receive the same `shape_kind` value. Divergence can only occur at the **boundary between two
+   adjacent primitives of different kinds**, where the rasterizer might pack fragments from both
+   primitives into the same warp.

 For category 3, the divergence analysis depends on primitive size:

@@ -382,12 +282,10 @@ For category 3, the divergence analysis depends on primitive size:
  frame-level divergence is typically **1–3%** of all warps.

 At 1–3% divergence, the throughput impact is negligible. At 4K with 12.4M total fragments
-(~387,000 warps), divergent boundary warps number in the low thousands. The longest SDF kind branch
-is Ring_Arc (~30 instructions); when a divergent warp straddles two different kinds, it pays the cost
-of both (~45–60 instructions total). Each divergent warp's extra cost is modest — at ~12G
-instructions/sec on a mid-range GPU, even 3,000 divergent warps × 60 extra instructions totals
-~15μs, under 0.2% of an 8.3ms (120 FPS) frame budget. This is confirmed by production renderers
-that use exactly this pattern:
+(~387,000 warps), divergent boundary warps number in the low thousands. Each divergent warp pays at
+most ~25 extra instructions (the cost of the longest untaken SDF branch). At ~12G instructions/sec
+on a mid-range GPU, roughly 2,000 such warps total ~4μs, under 0.05% of an 8.3ms (120 FPS) frame
+budget. This is confirmed by production renderers that use exactly this pattern:

 - **vger / vger-rs** (Audulus): single pipeline, 11 primitive kinds dispatched by a `switch` on a
  flat varying `prim_type`. Ships at 120 FPS on iPads. The author (Taylor Holliday) replaced nanovg
@@ -411,10 +309,9 @@ our design:
   > have no per-fragment data-dependent branches in the main pipeline.

 2. **Branches where both paths are very long.** If both sides of a branch are 500+ instructions,
-   divergent warps pay double a large cost. Our SDF kind branches are short (~15–30 instructions
-   each), and the gradient/texture/solid color selection branches are shorter still (3–8 instructions
-   each). Even fully divergent, the combined penalty is ~30–60 extra instructions — comparable to a
-   single texture sample's latency.
+   divergent warps pay double a large cost. Our SDF functions are 10–25 instructions each. Even
+   fully divergent, the penalty is ~25 extra instructions — less than a single texture sample's
+   latency.

 3. **Branches that prevent compiler optimizations.** Some compilers cannot schedule instructions
   across branch boundaries, reducing VLIW utilization on older architectures. Modern GPUs (NVIDIA
@@ -422,10 +319,9 @@ our design:
   concern.

 4. **Register pressure from the union of all branches.** This is the real cost, and it is why we
-   split heavy effects into separate pipelines. Within the main pipeline, the four
-   SDF kind branches and flag-based color selection cluster at ~22–26 registers (see register
-   analysis in Current state), within the ≤24-register budget that guarantees full occupancy on
-   Valhall and all desktop architectures. See Known limitations for V3D / Bifrost.
+   split heavy effects (shadows, glass) into separate pipelines. Within the main pipeline, all SDF
+   branches have similar register footprints (12–22 registers), so combining them causes negligible
+   occupancy loss.
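
The divergence-cost estimate earlier in this section can be checked with a short calculation. This is a sketch under stated assumptions: a 32-thread warp (implied by 12.4M fragments mapping to ~387,000 warps), one warp-instruction counted as one instruction against the ~12G instructions/sec figure, and 2,000 divergent warps, which is a value picked from "low thousands" because it reproduces the ~4μs result. The function names are hypothetical.

```python
WARP_SIZE = 32  # assumption: implied by 12.4M fragments ~ 387,000 warps


def frame_fragments(width=3840, height=2160, overdraw=1.5):
    """Fragments shaded per frame at 4K with the quoted 1.5x overdraw."""
    return width * height * overdraw


def divergence_cost_us(divergent_warps, extra_instructions=25, instr_per_sec=12e9):
    """Extra time in microseconds spent on untaken SDF branches."""
    return divergent_warps * extra_instructions / instr_per_sec * 1e6


if __name__ == "__main__":
    fragments = frame_fragments()
    warps = fragments / WARP_SIZE
    cost = divergence_cost_us(2_000)        # assumed warp count, see lead-in
    frame_budget_us = 1e6 / 120             # 120 FPS frame budget
    print(f"{fragments / 1e6:.1f}M fragments, {warps:,.0f} warps")
    print(f"{cost:.2f} us extra, {cost / frame_budget_us:.3%} of the frame")
```

Raising the assumed count to 3,000 warps gives about 6μs, which is still well under 0.1% of the frame budget.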

 **References:**

@@ -445,40 +341,26 @@ our design:

 ### Main pipeline: SDF + tessellated (unified)

-The main pipeline serves three submission modes through a single `TRIANGLELIST` pipeline and a
-single vertex input layout, distinguished by a `mode` field in the `Vertex_Uniforms_2D` push
-constant (`Core_2D_Mode.Tessellated = 0`, `Core_2D_Mode.SDF = 1`, `Core_2D_Mode.Text = 2`), pushed
-per draw call via `push_globals`. The vertex shader branches on this uniform to select the
-appropriate code path.
+The main pipeline serves two submission modes through a single `TRIANGLELIST` pipeline and a single
+vertex input layout, distinguished by a push constant:

- **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry in _logical_
-  pixels. Vertex shader scales positions by `dpi_scale`. Used for triangles, triangle fans/strips,
-  single-pixel points, and any user-provided raw vertex geometry.
- **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of
-  `Core_2D_Primitive` structs, drawn instanced. Used for all shapes with closed-form signed distance
-  functions. `Core_2D_Primitive.bounds` is in logical pixels; the vertex shader scales by
-  `dpi_scale`.
- **Text mode** (`mode = 2`): direct vertex buffer with explicit geometry in _physical_ pixels.
-  Vertex shader does NOT scale. Used for SDL_ttf atlas sampling. The CPU-side anchor snap to
-  integer physical pixels (`prepare_text`/`prepare_text_transformed`) is what produces crisp glyphs
-  — sub-pixel anchors blur via the bilinear sampler. Mode 2 shares the fragment-shader path with
-  Tessellated (kind 0), so the only divergence between text and shape rasterization is the vertex
-  shader's `* dpi_scale` step.
+- **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry. Unchanged from
+  today. Used for text (SDL_ttf atlas sampling), polylines, triangle fans/strips, gradient-filled
+  shapes, and any user-provided raw vertex geometry.
+- **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of `Primitive`
+  structs, drawn instanced. Used for all shapes with closed-form signed distance functions.

-All three modes use the same fragment shader. Modes 0 (Tessellated) and 2 (Text) take the same
-fragment-shader path (kind 0), which premultiplies the texture sample and computes `out = color * t`;
-they differ only in the vertex shader (whether positions are pre-scaled to physical pixels). Mode 1
-(SDF) checks `Shape_Kind` (low byte of `Core_2D_Primitive.flags`): kinds 1–4 dispatch to one of four
-SDF functions (RRect, NGon, Ellipse, Ring_Arc) and apply gradient/texture/outline/solid color based
-on `Shape_Flags` bits.
+Both modes converge on the same fragment shader, which dispatches on a `shape_kind` discriminant
+carried either in the vertex data (tessellated, always `Solid = 0`) or in the storage-buffer
+primitive struct (SDF mode).

 #### Why SDF for shapes

 CPU-side adaptive tessellation for curved shapes (the current approach) has three problems:

 1. **Vertex bandwidth.** A rounded rectangle with four corner arcs produces ~250 vertices × 20 bytes
-   = 5 KB. An SDF rounded rectangle is one `Core_2D_Primitive` struct (96 bytes) plus 4 shared
-   unit-quad vertices. That is roughly a 50× reduction per shape.
+   = 5 KB. An SDF rounded rectangle is one `Primitive` struct (80 bytes) plus 4 shared unit-quad
+   vertices. That is roughly a 60× reduction per shape.

 2. **Quality.** Tessellated curves are piecewise-linear approximations. At high DPI or under
   animation/zoom, faceting is visible at any practical segment count. SDF evaluation produces
@@ -509,55 +391,49 @@ SDF primitives are submitted via a GPU storage buffer indexed by `gl_InstanceInd
 shader, rather than encoding per-primitive data redundantly in vertex attributes. This follows the
 pattern used by both Zed GPUI and vger-rs.

-Each SDF shape is described by a single `Core_2D_Primitive` struct (96 bytes) in the storage
-buffer. The vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position
-from the unit vertex and the primitive's bounds, and passes shape parameters to the fragment shader
-via `flat` interpolated varyings.
+Each SDF shape is described by a single `Primitive` struct (80 bytes) in the storage buffer. The
+vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position from the unit
+vertex and the primitive's bounds, and passes shape parameters to the fragment shader via `flat`
+interpolated varyings.

 Compared to encoding per-primitive data in vertex attributes (the "fat vertex" approach), storage-
 buffer instancing eliminates the 4–6× data duplication across quad corners. A rounded rectangle costs
-96 bytes instead of 4 vertices × 60+ bytes = 240+ bytes.
+80 bytes instead of 4 vertices × 40+ bytes = 160+ bytes.

-The tessellated and text paths retain the existing direct vertex buffer layout (20 bytes/vertex, no
-storage buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every
-invocation in a draw call has the same mode — so it is effectively free on all modern GPUs.
+The tessellated path retains the existing direct vertex buffer layout (20 bytes/vertex, no storage
+buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every invocation
+in a draw call has the same mode — so it is effectively free on all modern GPUs.

-#### Shape kinds and SDF dispatch
+#### Shape kinds

-The fragment shader dispatches on `Shape_Kind` (low byte of `Core_2D_Primitive.flags`) to evaluate
-one of four signed distance functions. The `Shape_Kind` enum, per-kind `*_Params` structs, and
-CPU-side drawing procs all live in `core_2d.odin`. The drawing procs build the appropriate
-`Core_2D_Primitive` and set the kind automatically:
+Primitives in the main pipeline's storage buffer carry a `Shape_Kind` discriminant:

-Each user-facing shape proc accepts a `Brush` union (color, linear gradient, radial gradient,
-or textured fill) as its fill source, plus optional outline parameters. The procs map to SDF
-kinds as follows:
+| Kind       | SDF function                           | Notes                                                     |
+| ---------- | -------------------------------------- | --------------------------------------------------------- |
+| `RRect`    | `sdRoundedBox` (iq)                    | Per-corner radii. Covers all Clay rectangles and borders. |
+| `Circle`   | `sdCircle`                             | Filled and stroked.                                       |
+| `Ellipse`  | `sdEllipse`                            | Exact (iq's closed-form).                                 |
+| `Segment`  | `sdSegment` capsule                    | Rounded caps, correct sub-pixel thin lines.               |
+| `Ring_Arc` | `abs(sdCircle) - thickness` + arc mask | Rings, arcs, circle sectors unified.                      |
+| `NGon`     | `sdRegularPolygon`                     | Regular n-gon for n ≥ 5.                                  |

-| User-facing proc     | Shape_Kind | SDF function       | Notes                                                      |
-| -------------------- | ---------- | ------------------ | ---------------------------------------------------------- |
-| `rectangle`          | `RRect`    | `sdRoundedBox`     | Per-corner radii from `radii` param                        |
-| `circle`             | `RRect`    | `sdRoundedBox`     | Uniform radii = half-size (circle is a degenerate RRect)   |
-| `line`, `line_strip` | `RRect`    | `sdRoundedBox`     | Rotated capsule — stadium shape (radii = half-thickness)   |
-| `ellipse`            | `Ellipse`  | `sdEllipseApprox`  | Approximate ellipse SDF (fast, suitable for UI)            |
-| `polygon`            | `NGon`     | `sdRegularPolygon` | Regular N-sided polygon inscribed in a circle              |
-| `ring` (full)        | `Ring_Arc` | Annular radial SDF | `max(inner - r, r - outer)` with no angular clipping       |
-| `ring` (partial arc) | `Ring_Arc` | Annular radial SDF | Pre-computed edge normals for angular wedge mask           |
-| `ring` (pie slice)   | `Ring_Arc` | Annular radial SDF | `inner_radius = 0`, angular clipping via `start/end_angle` |
+The `Solid` kind (value 0) is reserved for the tessellated path, where `shape_kind` is implicitly
+zero because the fragment shader receives it from zero-initialized vertex attributes.

-The `Shape_Flags` bit set controls per-primitive rendering mode (outline, gradient, texture, rotation,
-arc geometry). See the `Shape_Flag` enum in `core_2d.odin` for the authoritative flag
-definitions and bit assignments.
+Stroke/outline variants of each shape are handled by the `Shape_Flags` bit set rather than separate
+shape kinds. The fragment shader transforms `d = abs(d) - stroke_width` when the `Stroke` flag is
+set.
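+
+The stroke transform can be sketched on the CPU side as follows. This is a minimal illustration,
+not the shader source: `sd_circle` and `apply_stroke` are hypothetical names, and the circle SDF
+stands in for whichever per-kind distance function the fragment shader evaluates.
+
+```
+package stroke_sketch
+
+import "core:fmt"
+import "core:math"
+
+// Signed distance from point p to a circle of radius r centered at the origin.
+// Negative inside, zero on the edge, positive outside.
+sd_circle :: proc(p: [2]f32, r: f32) -> f32 {
+    return math.sqrt(p.x * p.x + p.y * p.y) - r
+}
+
+// Hypothetical helper mirroring `d = abs(d) - stroke_width`: folds the fill
+// distance around the shape edge so the result is negative only in a band
+// that extends stroke_width to either side of the original edge.
+apply_stroke :: proc(d, stroke_width: f32) -> f32 {
+    return abs(d) - stroke_width
+}
+
+main :: proc() {
+    d_fill := sd_circle({10, 0}, 8)       // point lies 2 px outside the edge
+    d_stroke := apply_stroke(d_fill, 3)   // negative: the point is inside the stroke band
+    fmt.println(d_fill, d_stroke)
+}
+```
+
+Note that with this formula the visible band is `2 * stroke_width` wide, centered on the shape
+edge, so `stroke_width` behaves as a half-width unless the caller halves it beforehand.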

 **What stays tessellated:**

 - Text (SDL_ttf atlas, pending future MSDF evaluation)
- `tess.pixel` (single-pixel points)
- `tess.triangle`, `tess.triangle_aa`, `tess.triangle_lines` (single triangles)
- `tess.triangle_fan`, `tess.triangle_strip` (arbitrary user-provided geometry)
+- `rectangle_gradient`, `circle_gradient` (per-vertex color interpolation)
+- `triangle_fan`, `triangle_strip` (arbitrary user-provided point lists)
+- `line_strip` / polylines (SDF polyline rendering is possible but complex; deferred)
 - Any raw vertex geometry submitted via `prepare_shape`

-The design rule: if the shape has a closed-form SDF, it goes through the SDF path with its own
-`Shape_Kind`. If it is described by a vertex list or has no practical SDF, it stays tessellated.
+The rule: if the shape has a closed-form SDF, it goes through the SDF path. If it is described only
+by a vertex list or needs per-vertex color interpolation, it stays tessellated.

 ### Effects pipeline

@@ -618,153 +494,44 @@ Wallace's variant) and vger-rs.
 ### Backdrop pipeline

 The backdrop pipeline handles effects that sample the current render target as input: frosted glass,
-refraction, mirror surfaces. It is separated from the main and effects pipelines for a structural
-reason, not register pressure.
+refraction, mirror surfaces. It is separated from the effects pipeline for a structural reason, not
+register pressure.

 **Render-pass boundary.** Before any backdrop-sampling fragment can run, the current render target
-must be in a sampler-readable state. A draw call that samples the render target it is also writing
-to is a hard GPU constraint; the only way to satisfy it is to end the current render pass and start
-a new one. That render-pass boundary is what a “bracket” is.
+must be copied to a separate texture via `CopyGPUTextureToTexture`. This is a command-buffer-level
+operation that cannot happen mid-render-pass. The copy naturally creates a pipeline boundary that no
+amount of shader optimization can eliminate — it is a fundamental requirement of sampling a surface
+while also writing to it.

 **Multi-pass implementation.** Backdrop effects are implemented as separable multi-pass sequences
-(downsample → horizontal blur → vertical blur → composite), following the standard approach used
-by iOS `UIVisualEffectView`, Android `RenderEffect`, and Flutter's `BackdropFilter`. Each individual
-sub-pass is budgeted at **≤24 registers** (same as the main pipeline — full Valhall occupancy). The
-multi-pass approach avoids the monolithic 70+ register shader that a single-pass Gaussian blur would
-require, keeping each sub-pass well under the 32-register cliff.
+(downsample → horizontal blur → vertical blur → composite), following the standard approach used by
+iOS `UIVisualEffectView`, Android `RenderEffect`, and Flutter's `BackdropFilter`. Each individual
+pass has a low-to-medium register footprint (~15–40 registers), well within the main pipeline's
+occupancy range. The multi-pass approach avoids the monolithic 70+ register shader that a single-pass
+Gaussian blur would require, making backdrop effects viable on low-end mobile GPUs (including
+Mali-G31 and VideoCore VI) where per-thread register limits are tight.

-**Render-target choice.** When any layer in the frame contains a backdrop draw, the entire
-frame renders into `source_texture` (a full-resolution single-sample texture owned by the
-backdrop pipeline) instead of directly into the swapchain. At the end of the frame,
-`source_texture` is copied to the swapchain via a single `CopyGPUTextureToTexture` call.
-This means each bracket has no mid-frame texture copy: by the time a bracket runs,
-`source_texture` already contains the contents written by everything that preceded it on the
-timeline and is the natural sampler input. When no layer in the frame has a backdrop draw,
-the existing fast path runs: the frame renders directly to the swapchain and the backdrop
-pipeline's working textures are never touched. Zero cost for backdrop-free frames.
+**Bracketed execution.** All backdrop draws in a frame share a single bracketed region of the command
+buffer: end the current render pass, copy the render target, execute all backdrop sub-passes, then
+resume normal drawing. The entry/exit cost (texture copy + render-pass break) is paid once per frame
+regardless of how many backdrop effects are visible. When no backdrop effects are present, the bracket
+is never entered and the texture copy never happens — zero cost.

-**Why not split the backdrop sub-passes into separate pipelines?** Each sub-pass is budgeted at ≤24
-registers, well under Valhall's 32-register cliff, so there is no occupancy motivation for splitting.
-The sub-passes also have no common-vs-uncommon distinction — if backdrop effects are active, every
-sub-pass runs; if not, none run. The backdrop pipeline either executes as a complete unit or not at
-all. Additionally, backdrop effects cover a small fraction of the frame's total fragments (~5% at
-typical UI scales), so even if a sub-pass did cross a cliff, the occupancy variation within the
-bracket would have negligible impact on frame time.
-
-#### Bracket scheduling
-
-Backdrop draws are scheduled via **explicit scopes**: every call to `backdrop_blur` must be wrapped
-in a `begin_backdrop` / `end_backdrop` pair (or the RAII-style `backdrop_scope` wrapper). Each
-scope produces exactly one bracket at render time. A layer may contain any number of scopes; draws
-between scopes render at their submission position relative to the brackets, so the user controls
-exactly which backdrops share a bracket.
-
-At render time, `draw_layer` walks the layer's sub-batch list once, alternating between two run
-kinds:
-
- **Non-backdrop runs** are rendered to `source_texture` in one render pass via
-  `render_layer_sub_batch_range`. Clear-vs-load is tracked frame-globally via `GLOB.cleared`.
- **Backdrop runs** are dispatched to `run_backdrop_bracket` with their index range. Each run is
-  one bracket; the bracket opens and closes its own render passes for downsample, H-blur, V-blur,
-  and composite stages.
-
-Within a bracket, the scheduler groups contiguous same-sigma sub-batches and runs four sub-passes
-per group: downsample (`source_texture` → `downsample_texture`), H-blur (`downsample_texture` →
-`h_blur_texture`), V-blur (`h_blur_texture` → `downsample_texture`, ping-pong reuse), and
-composite (`downsample_texture` → `source_texture` with SDF mask and tint applied). Each group
-picks its own downsample factor (1, 2, or 4) based on sigma; see the comment block at the top of
-`backdrop.odin` for the factor-selection table.
-
-Sub-batch coalescing in `append_or_extend_sub_batch` merges contiguous same-sigma backdrops
-sharing one scissor into a single instanced composite draw. Same-sigma backdrops separated by a
-`ScissorStart` boundary stay in one sigma group (one set of blur passes) but issue separate
-composite draws; the composite pass calls `SetGPUScissor` between draws when the active scissor
-changes.
-
-Working textures are sized at full swapchain resolution; larger downsample factors fill a sub-rect
-via viewport-limited rendering.
-
-#### Scope contract
-
-Scope state is global: `GLOB.open_backdrop_layer` tracks the currently-open scope (or `nil`) for
-the whole renderer. The five misuse cases panic via `log.panic` / `log.panicf`:
-
-1. `backdrop_blur` called outside an open scope.
-2. A non-backdrop draw call issued on a layer with an open scope. Asserted at the top of
-   `append_or_extend_sub_batch`.
-3. `new_layer` called while a scope is open.
-4. `end()` called while a scope is open.
-5. `begin_backdrop` while one is already open, or `end_backdrop` on the wrong layer.
-
-Worked example with two scopes on the same layer:
-
-```
-base := draw.begin(...)
-draw.rectangle(base, bg, GRAY)
-draw.rectangle(base, card_blue, BLUE)
-
-{
-    draw.backdrop_scope(base)
-    draw.backdrop_blur(base, panelA, sigma=12)      // bracket 1: sees bg + blue card
-}
-
-draw.rectangle(base, card_red, RED)                  // renders ON TOP of panelA's composite
-
-{
-    draw.backdrop_scope(base)
-    draw.backdrop_blur(base, panelB, sigma=12)      // bracket 2: sees bg + blue card + panelA + card_red
-}
-
-draw.text(base, "label", ...)                        // renders ON TOP of panelB's composite
-```
-
-Each bracket adds four render passes (downsample + H-blur + V-blur + composite) plus tile-cache
-flushes on tilers like Mali Valhall, so users who don't need interleaving should group backdrops
-into a single scope to amortize:
-
-```
-{
-    draw.backdrop_scope(base)
-    draw.backdrop_blur(base, panelA, sigma=12)      // shares one bracket with panelB;
-    draw.backdrop_blur(base, panelB, sigma=12)      // same sigma also coalesces into one
-}                                                    // instanced composite draw call
-```
-
-#### Clay integration: `Backdrop_Marker`
-
-Clay has no notion of backdrops. The integration uses Clay's only extension point — the opaque
-`customData: rawptr` on `clay.CustomElementConfig` — to carry a magic-number-tagged struct that
-`prepare_clay_batch` recognizes:
-
-```
-Backdrop_Marker :: struct {
-    magic:       u32,  // BACKDROP_MARKER_MAGIC (0x42445054, 'BDPT')
-    sigma:       f32,
-    tint:        Color,
-    radii:       Rectangle_Radii,
-    feather_ppx: f32,
-}
-```
-
-The user populates a `Backdrop_Marker` (with stable lifetime through the `prepare_clay_batch`
-call) and points the corresponding `clay.CustomElementConfig.customData` at it.
-`prepare_clay_batch` walks Clay's command stream once, calling `is_clay_backdrop` per command
-(a u32 magic check on `customData`'s first 4 bytes). On a hit it opens a backdrop scope (or
-extends an open one) and dispatches via `backdrop_blur`. Non-backdrop commands issued during an
-open scope go to a deferred index buffer for replay after the scope closes; this preserves Clay's
-painter's-algorithm ordering across backdrops without violating the scope contract.
-
-The magic-number sentinel keeps the marker type self-describing in core dumps and decouples the
-integration from Clay-side changes. Zero-init memory has `magic = 0`, so a marker with a forgotten
-magic field gets routed through the regular `custom_draw` path and surfaces as "my custom draw
-never fired" rather than as a silent backdrop schedule.
+**Why not split the backdrop sub-passes into separate pipelines?** The individual passes range from
+~15 to ~40 registers, which does cross Mali's 32-register cliff. However, the register-pressure argument
+that justifies the main/effects split does not apply here. The main/effects split protects the
+_common path_ (90%+ of frame fragments) from the uncommon path's register cost. Inside the backdrop
+pipeline there is no common-vs-uncommon distinction — if backdrop effects are active, every sub-pass
+runs; if not, none run. The backdrop pipeline either executes as a complete unit or not at all.
+Additionally, backdrop effects cover a small fraction of the frame's total fragments (~5% at typical
+UI scales), so the occupancy variation within the bracket has negligible impact on frame time.

 ### Vertex layout

 The vertex struct is unchanged from the current 20-byte layout:

 ```
-Vertex_2D :: struct {
+Vertex :: struct {
    position: [2]f32,  //  0: screen-space position
    uv:       [2]f32,  //  8: atlas UV (text) or unused (shapes)
    color:    Color,   // 16: u8x4, GPU-normalized to float
@@ -776,30 +543,25 @@ draws, `position` carries actual world-space geometry. For SDF draws, `position`
 corners (0,0 to 1,1) and the vertex shader computes world-space position from the storage-buffer
 primitive's bounds.

-The `Core_2D_Primitive` struct for SDF shapes lives in the storage buffer, not in vertex attributes:
+The `Primitive` struct for SDF shapes lives in the storage buffer, not in vertex attributes:

 ```
-Core_2D_Primitive :: struct {
-    bounds:      [4]f32,           //  0: min_x, min_y, max_x, max_y
-    color:       Color,            // 16: u8x4, unpacked in shader via unpackUnorm4x8
-    flags:       u32,              // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
-    rotation_sc: u32,              // 24: packed f16 pair (sin, cos). Requires .Rotated flag.
-    _pad:        f32,              // 28: reserved for future use
-    params:      Shape_Params,     // 32: per-kind params union (half_feather_ppx, radii_ppx, etc.) (32 bytes)
-    uv_rect:     [4]f32,           // 64: texture UV coordinates. Read when .Textured.
-    effects:     Gradient_Outline, // 80: gradient and/or outline parameters (16 bytes).
+Primitive :: struct {
+    bounds:     [4]f32,         //  0: min_x, min_y, max_x, max_y
+    color:      Color,          // 16: u8x4, unpacked in shader via unpackUnorm4x8
+    kind_flags: u32,            // 20: (kind as u32) | (flags as u32 << 8)
+    rotation:   f32,            // 24: shader self-rotation in radians
+    _pad:       f32,            // 28: alignment
+    params:     Shape_Params,   // 32: raw union, 32 bytes (two vec4s of shape-specific data)
+    uv_rect:    [4]f32,         // 64: texture UV sub-region (u_min, v_min, u_max, v_max)
 }
-// Total: 96 bytes (std430 aligned)
+// Total: 80 bytes (std430 aligned)
 ```

-`Shape_Params` is a `#raw_union` over `RRect_Params`, `NGon_Params`, `Ellipse_Params`, and
-`Ring_Arc_Params` (plus a `raw: [8]f32` view), defined in `core_2d.odin`. Each SDF kind
-writes its own params variant; the fragment shader reads the appropriate fields based on `Shape_Kind`.
-`Gradient_Outline` is a 16-byte struct containing `gradient_color: Color`, `outline_color: Color`,
-`gradient_dir_sc: u32` (packed f16 cos/sin pair), and `outline_packed: u32` (packed f16 outline
-width). It is independent of `uv_rect`, so a primitive can carry texture and outline parameters at
-the same time. The `flags` field encodes the `Shape_Kind` in the low byte and `Shape_Flags` in bits
-8+ via `pack_kind_flags`.
+`Shape_Params` is a `#raw_union` with named variants per shape kind (`rrect`, `circle`, `segment`,
+etc.), ensuring type safety on the CPU side and zero-cost reinterpretation on the GPU side. The
+`uv_rect` field is used by textured SDF primitives (`Shape_Flag.Textured`); non-textured primitives
+leave it zeroed.
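+
+The offsets in the struct comment can be checked at compile time. The sketch below is an
+assumption-laden stand-in rather than the package's real definitions: `Color` is modeled as
+`[4]u8`, and `Shape_Params` is reduced to a single hypothetical `raw: [8]f32` view instead of the
+named per-kind variants.
+
+```
+package primitive_layout_sketch
+
+// Assumption: Color is a packed u8x4, as the struct comment suggests.
+Color :: [4]u8
+
+// Assumption: only a raw 32-byte view; the real union has named variants.
+Shape_Params :: struct #raw_union {
+    raw: [8]f32,
+}
+
+Primitive :: struct {
+    bounds:     [4]f32,       //  0
+    color:      Color,        // 16
+    kind_flags: u32,          // 20
+    rotation:   f32,          // 24
+    _pad:       f32,          // 28
+    params:     Shape_Params, // 32
+    uv_rect:    [4]f32,       // 64
+}
+
+// Compile-time checks: the build fails if the layout drifts from the comments.
+#assert(size_of(Shape_Params) == 32)
+#assert(offset_of(Primitive, kind_flags) == 20)
+#assert(offset_of(Primitive, params) == 32)
+#assert(offset_of(Primitive, uv_rect) == 64)
+#assert(size_of(Primitive) == 80)
+
+main :: proc() {}
+```
+
+Keeping assertions like these next to the struct definition is one way to catch a CPU/GPU layout
+mismatch before it shows up as corrupted primitives on screen.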

 ### Draw submission order

@@ -821,18 +583,18 @@ invariant is that each primitive is drawn exactly once, in the pipeline that own
 Text rendering currently uses SDL_ttf's GPU text engine, which rasterizes glyphs per `(font, size)`
 pair into bitmap atlases and emits indexed triangle data via `GetGPUTextDrawData`. This path is
 **unchanged** by the SDF migration — text continues to flow through the main pipeline's tessellated
-mode with `mode = 0`, sampling the SDL_ttf atlas texture.
+mode with `shape_kind = Solid`, sampling the SDL_ttf atlas texture.

-MSDF (multi-channel signed distance field) text rendering may be evaluated later, which would
+A future phase may evaluate MSDF (multi-channel signed distance field) text rendering, which would
 allow resolution-independent glyph rendering from a single small atlas per font. This would involve:

 - Offline atlas generation via Chlumský's msdf-atlas-gen tool.
 - Runtime glyph metrics via `vendor:stb/truetype` (already in the Odin distribution).
- A new MSDF glyph `Shape_Kind` in the fragment shader (additive — the kind dispatch infrastructure
-  already exists for the four current SDF kinds).
+- A new `Shape_Kind.MSDF_Glyph` variant in the main pipeline's fragment shader.
 - Potential removal of the SDL_ttf dependency.

-This is explicitly deferred.
+This is explicitly deferred. The SDF shape migration is independent of and does not block text
+changes.

 **References:**

@@ -846,8 +608,8 @@ This is explicitly deferred.
 ### Textures

 Textures plug into the existing main pipeline — no additional GPU pipeline, no shader rewrite. The
-work is a resource layer (registration, upload, sampling, lifecycle) plus a `Texture_Fill` Brush
-variant that routes the existing shape procs through the SDF path with the `.Textured` flag set.
+work is a resource layer (registration, upload, sampling, lifecycle) plus two textured-draw procs
+that route into the existing tessellated and SDF paths respectively.

 #### Why draw owns registered textures

@@ -897,30 +659,35 @@ with the same texture but different samplers produce separate draw calls, which

 #### Textured draw procs

-Textures share the same shape procs as colors and gradients. Each shape proc takes a `Brush`
-union as its fill source; passing a `Texture_Fill` value (carrying `Texture_Id`, `tint`,
-`uv_rect`, and `Sampler_Preset`) routes the draw through the SDF path with the `.Textured`
-flag set. There is no dedicated `rectangle_texture` / `circle_texture` proc — the same
-`rectangle`, `circle`, `ellipse`, `polygon`, `ring`, `line`, and `line_strip` procs handle
-all fill sources.
+Textured rectangles route through the existing SDF path via `draw.rectangle_texture` and
+`draw.rectangle_texture_corners`, mirroring `draw.rectangle` and `draw.rectangle_corners` exactly —
+same parameters, same naming — with the color parameter replaced by a texture ID plus an optional
+tint.

-A separate tessellated proc for "simple" fullscreen quads was considered on the theory that
-the tessellated path's lower register count would improve occupancy at large fragment counts.
-Both paths are well within the ≤24-register main pipeline budget — both run at full
-occupancy on every target architecture (Valhall and above). The remaining ALU difference
-(~15 extra instructions for the SDF evaluation) amounts to ~20μs at 4K — below noise.
-Meanwhile, splitting into a separate pipeline would add ~1–5μs per pipeline bind on the CPU
-side per scissor, matching or exceeding the GPU-side savings. Within the main pipeline,
-unified remains strictly better.
+An earlier iteration of this design considered a separate tessellated `draw.texture` proc for
+"simple" fullscreen quads, on the theory that the tessellated path's lower register count (~16 regs
+vs ~24 for the SDF textured branch) would improve occupancy at large fragment counts. Applying the
+register-pressure analysis from the pipeline-strategy section above shows this is wrong: both 16 and
+24 registers are well below the register cliff (~43 regs on consumer Ampere/Ada, ~32 on Volta/A100),
+so both run at 100% occupancy. The remaining ALU difference (~15 extra instructions for the SDF
+evaluation) amounts to ~20μs at 4K — below noise. Meanwhile, splitting into a separate pipeline
+would add ~1–5μs per pipeline bind on the CPU side per scissor, matching or exceeding the GPU-side
+savings. Within the main pipeline, unified remains strictly better.

-SDF drawing procs live in the `draw` package with unprefixed names (`rectangle`, `circle`,
-`ellipse`, `polygon`, `ring`, `line`, `line_strip`). Gradients, textures, and outlines are
-selected via the `Brush` union and optional outline parameters rather than separate overloads.
+The naming convention follows the existing shape API: `rectangle_texture` and
+`rectangle_texture_corners` sit alongside `rectangle` and `rectangle_corners`, mirroring the
+`rectangle_gradient` / `circle_gradient` pattern where the shape is the primary noun and the
+modifier (gradient, texture) is secondary. This groups related procs together in autocomplete
+(`rectangle_*`) and reads as natural English ("draw a rectangle with a texture").
+
+Future per-shape texture variants (`circle_texture`, `ellipse_texture`, `polygon_texture`) are
+reserved by this naming convention and require only a `Shape_Flag.Textured` bit plus a small
+per-shape UV mapping function in the fragment shader. These are additive.

 #### What SDF anti-aliasing does and does not do for textured draws

 The SDF path anti-aliases the **shape's outer silhouette** — rounded-corner edges, rotated edges,
-outline edges. It does not anti-alias or sharpen the texture content. Inside the shape, fragments
+stroke outlines. It does not anti-alias or sharpen the texture content. Inside the shape, fragments
 sample through the chosen `Sampler_Preset`, and image quality is whatever the sampler produces from
 the source texels. A low-resolution texture displayed at a large size shows bilinear blur regardless
 of which draw proc is used. This matches the current text-rendering model, where glyph sharpness
@@ -929,8 +696,8 @@ depends on how closely the display size matches the SDL_ttf atlas's rasterized s
 #### Fit modes are a computation layer, not a renderer concept

 Standard image-fit behaviors (stretch, fill/cover, fit/contain, tile, center) are expressed as UV
-sub-region computations on top of the `uv_rect` field of `Texture_Fill`. The renderer has no
-knowledge of fit modes — it samples whatever UV region it is given.
+sub-region computations on top of the `uv_rect` parameter that both textured-draw procs accept. The
+renderer has no knowledge of fit modes — it samples whatever UV region it is given.

 A `fit_params` helper computes the appropriate `uv_rect`, sampler preset, and (for letterbox/fit
 mode) shrunken inner rect from a `Fit_Mode` enum, the target rect, and the texture's pixel size.
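+
+As an illustration of fit modes being pure UV arithmetic, a fill/cover computation might look like
+the sketch below. `cover_uv_rect` is a hypothetical helper, not the signature of `fit_params`; it
+assumes a `(u_min, v_min, u_max, v_max)` result and a centered crop.
+
+```
+package fit_sketch
+
+import "core:fmt"
+
+// Hypothetical cover-mode helper: returns the centered UV sub-region of the
+// texture that fills the target rect without distortion, cropping the excess.
+cover_uv_rect :: proc(target_w, target_h, tex_w, tex_h: f32) -> [4]f32 {
+    // Uniform scale at which the texture just covers the target on both axes.
+    scale := max(target_w / tex_w, target_h / tex_h)
+    // Fraction of the texture visible along each axis (at most 1).
+    visible_u := target_w / (tex_w * scale)
+    visible_v := target_h / (tex_h * scale)
+    // Center the visible window inside the [0, 1] UV range.
+    u_min := (1 - visible_u) * 0.5
+    v_min := (1 - visible_v) * 0.5
+    return {u_min, v_min, u_min + visible_u, v_min + visible_v}
+}
+
+main :: proc() {
+    // A square texture shown in a 2:1 rect keeps its full width and crops
+    // the top and bottom quarters.
+    fmt.println(cover_uv_rect(200, 100, 100, 100))
+}
+```
+
+The renderer only ever sees the resulting `uv_rect`; swapping `max` for `min` and shrinking the
+destination rect instead of the UVs would give the letterbox/fit variant.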
@@ -954,13 +721,13 @@ textures onto a free list that is processed in `r_end_frame`, not at the call si

 Clay's `RenderCommandType.Image` is handled by dereferencing `imageData: rawptr` as a pointer to a
 `Clay_Image_Data` struct containing a `Texture_Id`, `Fit_Mode`, and tint color. Routing mirrors the
-existing rectangle handling: `fit_params` computes UVs from the fit mode, then `rectangle` is
-called with a `Texture_Fill` brush and the appropriate radii (zero for sharp corners, per-corner
-values from Clay's `cornerRadius` otherwise).
+existing rectangle handling: zero `cornerRadius` dispatches to `draw.texture` (tessellated), nonzero
+dispatches to `draw.rectangle_texture_corners` (SDF). A `fit_params` call computes UVs from the fit
+mode before dispatch.

 #### Deferred features

-The following are plumbed in `Texture_Desc` but not yet implemented:
+The following are plumbed in the descriptor but not implemented in phase 1:

 - **Mipmaps**: `Texture_Desc.mip_levels` field exists; generation via SDL3 deferred.
 - **Compressed formats**: `Texture_Desc.format` accepts BC/ASTC; upload path deferred.
@@ -968,6 +735,7 @@ The following are plumbed in `Texture_Desc` but not yet implemented:
 - **3D textures, arrays, cube maps**: `Texture_Desc.type` and `depth_or_layers` fields exist.
 - **Additional samplers**: anisotropic, trilinear, clamp-to-border — additive enum values.
 - **Atlas packing**: internal optimization for sub-batch coalescing; invisible to callers.
+- **Per-shape texture variants**: `circle_texture`, `ellipse_texture`, etc. — reserved by naming.

 **References:**

@@ -1,794 +0,0 @@
-// Clay UI integration for the `draw` package.
-//
-// All code in this file is dedicated to bridging Clay's render command stream into `draw`'s
-// primitive/sub-batch pipeline. Nothing outside this file should reference the `clay` package
-// directly; everything Clay-related (types, lifecycle helpers, render-command dispatch, the
-// border-merge stack, the Clay backdrop bracket walker, the text measure/error callbacks,
-// and the `Clay_Image_Data` user-facing helper) lives here. `draw.odin`'s lifecycle procs
-// call `init_clay`, `destroy_clay`, and `clear_clay_per_frame` to drive the bits of state
-// that necessarily live on the shared `Global` struct.
-package draw
-
-import "base:runtime"
-import "core:c"
-import "core:log"
-import "core:strings"
-import sdl "vendor:sdl3"
-import sdl_ttf "vendor:sdl3/ttf"
-
-import clay "../vendor/clay"
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Lifecycle ------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-// Allocate the Clay arena, build the merge-candidate stack, hand the arena to Clay, and
-// register the text-measurement and error callbacks. Called by `init` once `GLOB` has been
-// populated with the device/window state Clay's callbacks read from.
-//INTERNAL
-init_clay :: proc(window: ^sdl.Window, allocator: runtime.Allocator) {
-	min_memory_size: c.size_t = cast(c.size_t)clay.MinMemorySize()
-	GLOB.clay_merge_open_stack = make([dynamic]Clay_Merge_Candidate, 0, 16, allocator = allocator)
-	GLOB.clay_memory = make([^]u8, min_memory_size, allocator = allocator)
-	arena := clay.CreateArenaWithCapacityAndMemory(min_memory_size, GLOB.clay_memory)
-	window_width, window_height: c.int
-	sdl.GetWindowSize(window, &window_width, &window_height)
-	clay.Initialize(arena, {f32(window_width), f32(window_height)}, {handler = clay_error_handler})
-	clay.SetMeasureTextFunction(measure_text_clay, nil)
-}
-
-// Free the Clay arena memory allocated in `init_clay`. Called by `destroy`. The merge stack
-// is left to the package allocator's normal teardown to preserve historical behavior.
-//INTERNAL
-destroy_clay :: proc(allocator: runtime.Allocator) {
-	free(GLOB.clay_memory, allocator)
-}
-
-// Reset Clay per-frame state: the z-index high-water mark and the border-merge stack.
-// Called by `clear_global` at the start of every frame.
-//INTERNAL
-clear_clay_per_frame :: proc() {
-	GLOB.clay_z_index = 0
-	clear(&GLOB.clay_merge_open_stack)
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Image data (Clay RenderCommandType.Image payload) ------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-Clay_Image_Data :: struct {
-	texture_id: Texture_Id,
-	fit:        Fit_Mode,
-	tint:       Color,
-}
-
-clay_image_data :: proc(id: Texture_Id, fit: Fit_Mode = .Stretch, tint: Color = WHITE) -> Clay_Image_Data {
-	return {texture_id = id, fit = fit, tint = tint}
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Callbacks (clay -> draw) ------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-@(private = "file")
-clay_error_handler :: proc "c" (errorData: clay.ErrorData) {
-	context = GLOB.odin_context
-	log.error("Clay error:", errorData.errorType, errorData.errorText)
-}
-
-@(private = "file")
-measure_text_clay :: proc "c" (
-	text: clay.StringSlice,
-	config: ^clay.TextElementConfig,
-	user_data: rawptr,
-) -> clay.Dimensions {
-	context = GLOB.odin_context
-	text := string(text.chars[:text.length])
-	c_text := strings.clone_to_cstring(text, context.temp_allocator)
-	defer delete(c_text, context.temp_allocator)
-	width, height: c.int
-	if !sdl_ttf.GetStringSize(get_font(config.fontId, config.fontSize), c_text, 0, &width, &height) {
-		log.panicf("Failed to measure text: %s", sdl.GetError())
-	}
-
-	return clay.Dimensions{width = f32(width) / GLOB.dpi_scaling, height = f32(height) / GLOB.dpi_scaling}
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Custom draw + customData envelope ------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-// Called for each Clay `RenderCommandType.Custom` render command that
-// `prepare_clay_batch` encounters and which is NOT a levlib-managed variant
-// (e.g. `Backdrop_Marker`).
-//
-// - `layer` is the layer the command belongs to (post-z-index promotion).
-// - `bounds` is already translated into the active layer's coordinate system
-//   and pre-DPI, matching what the built-in shape procs expect.
-// - `render_data` is Clay's `CustomRenderData` for the element, exposing
-//   `backgroundColor` and `cornerRadius`. Its `customData` field has been
-//   unwrapped from the `Clay_Custom` envelope: it points at the user's own
-//   data (the value the user wrote into the `rawptr` variant), not at the
-//   `Clay_Custom` itself. If the union was zero-init (no variant set) or
-//   `customData` was originally nil, the callback receives nil.
-//
-// The callback must not call `new_layer` or `prepare_clay_batch`.
-Custom_Draw :: #type proc(layer: ^Layer, bounds: Rectangle, render_data: clay.CustomRenderData)
-
-ClayBatch :: struct {
-	bounds: Rectangle,
-	cmds:   clay.ClayArray(clay.RenderCommand),
-}
-
-// Discriminated sum of everything `clay.CustomElementConfig.customData` is allowed to point
-// at. levlib-defined variants (currently just `Backdrop_Marker`) are recognized by
-// `prepare_clay_batch` and routed to the appropriate internal path; the `rawptr` variant is
-// the escape hatch for user-defined custom drawing — `prepare_clay_batch` unwraps it before
-// invoking `custom_draw` so the callback sees the user's pointer in `render_data.customData`
-// exactly as if no wrapper were involved.
-//
-// Contract: `customData`, when non-nil, MUST point at storage holding a `Clay_Custom`
-// value. The user owns that storage; its lifetime must span the Clay layout call and the
-// matching `prepare_clay_batch` call. Pointing `customData` at a bare user struct violates
-// the contract — the dispatcher will read its first bytes as a union tag and either route
-// the draw incorrectly or panic on type assertion. There is no recovery path; this is a
-// strict-discipline API by design.
-//
-// Construction notes (Odin implicit-conversion rules):
-//   - Backdrop variant: `bd: Clay_Custom = Backdrop_Marker{...}` works directly.
-//     Variant-to-union conversion is implicit.
-//   - User pointer: `up: Clay_Custom = rawptr(&my_struct)` — the explicit `rawptr(...)` is
-//     required because Odin does not chain `^T -> rawptr -> Clay_Custom` implicitly. A bare
-//     `up: Clay_Custom = &my_struct` is a compile error.
-Clay_Custom :: union {
-	Backdrop_Marker,
-	rawptr,
-}
-
-// Per-primitive parameters for a backdrop blur dispatched through the Clay integration.
-// Embedded as a `Clay_Custom` variant; `prepare_clay_batch` walks the command stream,
-// opens/closes a backdrop scope around contiguous backdrop runs, and feeds these to
-// `backdrop_blur` via `dispatch_clay_backdrop`. The discriminant is the union tag — no
-// in-band magic field needed (compiler-enforced).
-Backdrop_Marker :: struct {
-	sigma:       f32,
-	tint:        Color,
-	radii:       Rectangle_Radii,
-	feather_ppx: f32,
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Border-merge stack ------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-// One entry on the Clay merge stack. Pushed by `dispatch_clay_command` when emitting a
-// Rectangle or an Image primitive, then popped by a matching Border to retroactively add
-// the outline. See `try_dispatch_clay_border_merge` for the matching semantics.
-//INTERNAL
-Clay_Merge_Candidate :: struct {
-	primitive_index: u32, // Index into `GLOB.tmp_primitives` of the candidate primitive.
-	outer_bounds:    Rectangle, // Clay's bounding box — keyed on for the bounds match check.
-	corner_radii:    clay.CornerRadius, // Clay's corner radii — also keyed on for the match check.
-	image_data:      Clay_Image_Data, // Only read when kind == .Fill_Texture (needed to refit UVs to inner_bounds).
-	kind:            Clay_Merge_Candidate_Kind,
-}
-
-//INTERNAL
-Clay_Merge_Candidate_Kind :: enum u8 {
-	// Solid Color brush. Used for Rectangle commands and for the bg primitive of an Image
-	// command that has `backgroundColor.a > 0`. Merge mutation: shrink shape + add outline.
-	Fill_Color,
-	// Texture_Fill brush. Used for the image primitive of an Image command with no bg, where
-	// `fit_params` returned `fit_rect == outer_bounds` (the image fully covers Clay's bounds).
-	// Merge mutation: shrink shape + add outline + refit UV against inner_bounds.
-	Fill_Texture,
-}
-
-// Returns true if this Clay render command represents a backdrop primitive — i.e. its
-// `customData` points at a `Clay_Custom` whose active variant is `Backdrop_Marker`.
-is_clay_backdrop :: proc(cmd: ^clay.RenderCommand) -> bool {
-	if cmd.commandType != .Custom do return false
-	p := cmd.renderData.custom.customData
-	if p == nil do return false
-	_, ok := (^Clay_Custom)(p).(Backdrop_Marker)
-	return ok
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Border emission ------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-// Emit a Clay border drawn INSIDE `bounds` — the outer edge of each side aligns with
-// `bounds`, the inner edge is `border_width.*` pixels inset. Matches Clay's layout model
-// (CSS border-box) so the visible element occupies exactly Clay's allocated space.
-//
-// The fast path (uniform widths) uses `rectangle()` with the built-in SDF outline, which
-// always extends outward from the shape it's given — we pre-shrink the shape by
-// `border_width` so the outline lands precisely at Clay's bounds. The slow path (non-uniform
-// widths) emits per-side rectangles and per-corner arcs directly, all positioned inside
-// `bounds`. All-zero widths is a no-op.
-//
-// A corner is rounded iff its radius is positive AND both adjacent sides have positive
-// width. Top corners take their thickness from `border_width.top`, bottom corners from
-// `border_width.bottom`. When the two widths meeting at a corner differ there is a step at
-// the side/corner junction (acceptable for the rare mixed-width case).
-//
-// When the border width exceeds a corner's radius, that corner's inner radius clamps to zero
-// (sharp inside, still rounded outside), matching CSS-standard behavior.
-//INTERNAL
-clay_emit_partial_border :: proc(
-	layer: ^Layer,
-	bounds: Rectangle,
-	border_color: Color,
-	border_width: clay.BorderWidth,
-	corner_radii: clay.CornerRadius,
-) {
-	// All-zero: nothing to draw.
-	if border_width.top == 0 && border_width.right == 0 && border_width.bottom == 0 && border_width.left == 0 {
-		return
-	}
-
-	// Convert side widths once (u16 -> f32) and cache for reuse.
-	width_top := f32(border_width.top)
-	width_right := f32(border_width.right)
-	width_bottom := f32(border_width.bottom)
-	width_left := f32(border_width.left)
-
-	// Fast path: all four sides have the same nonzero width. Pre-shrink the shape by the
-	// uniform width so the SDF outline (which always extends outward from the shape) lands
-	// exactly at Clay's `bounds` — the visible border ends up INSIDE Clay's allocation while
-	// the SDF mechanism keeps doing outward outlining. Single SDF primitive, exact curves,
-	// analytical AA.
-	if border_width.left == border_width.top &&
-	   border_width.top == border_width.right &&
-	   border_width.right == border_width.bottom {
-		uniform_width := width_top
-		inner_bounds := Rectangle {
-			x      = bounds.x + uniform_width,
-			y      = bounds.y + uniform_width,
-			width  = bounds.width - 2 * uniform_width,
-			height = bounds.height - 2 * uniform_width,
-		}
-		inner_radii := Rectangle_Radii {
-			top_left     = max(0, corner_radii.topLeft - uniform_width),
-			top_right    = max(0, corner_radii.topRight - uniform_width),
-			bottom_right = max(0, corner_radii.bottomRight - uniform_width),
-			bottom_left  = max(0, corner_radii.bottomLeft - uniform_width),
-		}
-		rectangle(
-			layer,
-			inner_bounds,
-			BLANK,
-			outline_color = border_color,
-			outline_width = uniform_width,
-			radii = inner_radii,
-		)
-		return
-	}
-
-	// A corner is drawn rounded only if its radius is positive AND both adjacent sides are present.
-	top_left_rounded := corner_radii.topLeft > 0 && border_width.top > 0 && border_width.left > 0
-	top_right_rounded := corner_radii.topRight > 0 && border_width.top > 0 && border_width.right > 0
-	bottom_left_rounded := corner_radii.bottomLeft > 0 && border_width.bottom > 0 && border_width.left > 0
-	bottom_right_rounded := corner_radii.bottomRight > 0 && border_width.bottom > 0 && border_width.right > 0
-
-	// Horizontal x-coordinates where the top/bottom side rectangles start/end. When the
-	// adjacent corner is rounded, the side stops at `bounds.x + radius` (where the corner
-	// arc takes over). When not rounded, the side runs to the bounds edge; the perpendicular
-	// side handles the inset to avoid overlap.
-	top_left_x: f32 = top_left_rounded ? bounds.x + corner_radii.topLeft : bounds.x
-	top_right_x: f32 =
-		top_right_rounded ? bounds.x + bounds.width - corner_radii.topRight : bounds.x + bounds.width
-	bottom_left_x: f32 = bottom_left_rounded ? bounds.x + corner_radii.bottomLeft : bounds.x
-	bottom_right_x: f32 =
-		bottom_right_rounded ? bounds.x + bounds.width - corner_radii.bottomRight : bounds.x + bounds.width
-
-	// Vertical y-coordinates where the left/right side rectangles start/end. When the
-	// adjacent corner is rounded, inset by the corner radius. When not rounded, inset by the
-	// adjacent horizontal width — the horizontal side owns the corner area (extending through
-	// it to the bounds edge), so the vertical side starts below it to avoid overdraw of
-	// translucent colors.
-	top_left_y: f32 = top_left_rounded ? bounds.y + corner_radii.topLeft : bounds.y + width_top
-	top_right_y: f32 = top_right_rounded ? bounds.y + corner_radii.topRight : bounds.y + width_top
-	bottom_left_y: f32 =
-		bottom_left_rounded ? bounds.y + bounds.height - corner_radii.bottomLeft : bounds.y + bounds.height - width_bottom
-	bottom_right_y: f32 =
-		bottom_right_rounded ? bounds.y + bounds.height - corner_radii.bottomRight : bounds.y + bounds.height - width_bottom
-
-	// Side rectangles drawn INSIDE `bounds`. Sharp corners, solid fill, no outline. Each
-	// gated on its own width — skipping zero-width sides saves the primitive upload.
-	if border_width.top > 0 {
-		top_side := Rectangle {
-			x      = top_left_x,
-			y      = bounds.y,
-			width  = top_right_x - top_left_x,
-			height = width_top,
-		}
-		rectangle(layer, top_side, border_color)
-	}
-	if border_width.bottom > 0 {
-		bottom_side := Rectangle {
-			x      = bottom_left_x,
-			y      = bounds.y + bounds.height - width_bottom,
-			width  = bottom_right_x - bottom_left_x,
-			height = width_bottom,
-		}
-		rectangle(layer, bottom_side, border_color)
-	}
-	if border_width.left > 0 {
-		left_side := Rectangle {
-			x      = bounds.x,
-			y      = top_left_y,
-			width  = width_left,
-			height = bottom_left_y - top_left_y,
-		}
-		rectangle(layer, left_side, border_color)
-	}
-	if border_width.right > 0 {
-		right_side := Rectangle {
-			x      = bounds.x + bounds.width - width_right,
-			y      = top_right_y,
-			width  = width_right,
-			height = bottom_right_y - top_right_y,
-		}
-		rectangle(layer, right_side, border_color)
-	}
-
-	// Corner arcs (90° quadrants) drawn INSIDE bounds: outer radius matches Clay's
-	// `corner_radii`, inner radius is the outer radius minus the relevant border thickness
-	// (clamped to 0 for thick borders — produces a filled pie slice when border > radius,
-	// matching CSS). Angle convention matches ring(): 0° = +x (right), 90° = +y (down),
-	// 180° = -x (left), 270° = -y (up).
-	if top_left_rounded {
-		radius := corner_radii.topLeft
-		inner_radius := max(0, radius - width_top)
-		center := Vec2{bounds.x + radius, bounds.y + radius}
-		ring(layer, center, inner_radius, radius, border_color, start_angle = 180, end_angle = 270)
-	}
-	if top_right_rounded {
-		radius := corner_radii.topRight
-		inner_radius := max(0, radius - width_top)
-		center := Vec2{bounds.x + bounds.width - radius, bounds.y + radius}
-		ring(layer, center, inner_radius, radius, border_color, start_angle = 270, end_angle = 360)
-	}
-	if bottom_right_rounded {
-		radius := corner_radii.bottomRight
-		inner_radius := max(0, radius - width_bottom)
-		center := Vec2{bounds.x + bounds.width - radius, bounds.y + bounds.height - radius}
-		ring(layer, center, inner_radius, radius, border_color, start_angle = 0, end_angle = 90)
-	}
-	if bottom_left_rounded {
-		radius := corner_radii.bottomLeft
-		inner_radius := max(0, radius - width_bottom)
-		center := Vec2{bounds.x + radius, bounds.y + bounds.height - radius}
-		ring(layer, center, inner_radius, radius, border_color, start_angle = 90, end_angle = 180)
-	}
-}
-
-// Try to retroactively merge this Border into a pending Rectangle/Image candidate on the
-// merge stack. Returns true on success so the caller can skip the standalone Border emission.
-//
-// Clay emits a parent element's bg and border bracketing all the children's commands, so a
-// simple "is the next command a Border?" check (the previous approach) only catches leaf
-// elements. The stack approach lets us pair them across arbitrary nesting: every Rectangle/
-// Image push registers itself; every Border pops down until it finds a geometric match.
-//
-// Pop semantics: non-matching candidates above the match are discarded — their elements had
-// no border anyway, so their primitives stay in `tmp_primitives` as plain Rectangles. A
-// Border that finds no match at all falls back to standalone `clay_emit_partial_border`.
-//
-// Predicates that decline a candidate:
-//   - non-uniform or zero border widths (can't be a single uniform outline)
-//   - translucent border (the unmerged path's bg-under-border blending differs)
-//   - mismatched bounds or cornerRadius (the candidate isn't from the same element)
-//
-// False-match risk: two unrelated elements with bit-identical bounds and corner radii.
-// Requires geometric coincidence (rare in practice), and even when it fires, the misattributed
-// outline still lands at the correct screen position with the correct color — the pixels
-// match the unmerged ground truth for opaque borders (the only kind we merge).
-//INTERNAL
-try_dispatch_clay_border_merge :: proc(bounds: Rectangle, border_data: clay.BorderRenderData) -> bool {
-	border_width := border_data.width
-	uniform_nonzero :=
-		border_width.left == border_width.top &&
-		border_width.top == border_width.right &&
-		border_width.right == border_width.bottom &&
-		border_width.top > 0
-	if !uniform_nonzero do return false
-	if border_data.color[3] < 255 do return false
-
-	for len(GLOB.clay_merge_open_stack) > 0 {
-		candidate := pop(&GLOB.clay_merge_open_stack)
-		if candidate.outer_bounds != bounds do continue
-		if candidate.corner_radii != border_data.cornerRadius do continue
-		apply_clay_border_merge_to_primitive(candidate, border_data)
-		return true
-	}
-	return false
-}
-
-// Mutates `tmp_primitives[candidate.primitive_index]` in place: shrinks the SDF shape by
-// the uniform border width so the (outward) outline lands at the outer bounds, sets the
-// outline flag and params, and — for `Fill_Texture` candidates — refits the texture's UV
-// against `inner_bounds` so the image doesn't overflow into the border strip.
-//
-// The primitive's `bounds` field stays at the outer bounds: the rasterized quad already
-// covers the area the outline now occupies. Skipping the bounds expansion that
-// `apply_brush_and_outline` would normally do is intentional — expanding here would push the
-// rasterized quad past Clay's outer edge.
-//INTERNAL
-apply_clay_border_merge_to_primitive :: proc(
-	candidate: Clay_Merge_Candidate,
-	border_data: clay.BorderRenderData,
-) {
-	prim := &GLOB.tmp_primitives[candidate.primitive_index]
-	uniform_width := f32(border_data.width.top)
-	dpi_scale := GLOB.dpi_scaling
-
-	inner_half_width := candidate.outer_bounds.width * 0.5 - uniform_width
-	inner_half_height := candidate.outer_bounds.height * 0.5 - uniform_width
-	prim.params.rrect.half_size_ppx = {inner_half_width * dpi_scale, inner_half_height * dpi_scale}
-	prim.params.rrect.radii_ppx = {
-		max(0, candidate.corner_radii.topLeft - uniform_width) * dpi_scale,
-		max(0, candidate.corner_radii.topRight - uniform_width) * dpi_scale,
-		max(0, candidate.corner_radii.bottomRight - uniform_width) * dpi_scale,
-		max(0, candidate.corner_radii.bottomLeft - uniform_width) * dpi_scale,
-	}
-
-	// Set the outline bit in the packed flags field (low byte = Shape_Kind, bits 8+ = Shape_Flags).
-	prim.flags |= u32(transmute(u8)Shape_Flags{.Outline}) << 8
-	prim.effects.outline_color = Color(border_data.color)
-	prim.effects.outline_packed = pack_f16_pair(f16(uniform_width * dpi_scale), 0)
-
-	if candidate.kind == .Fill_Texture {
-		// The candidate was only pushed if its `fit_rect == outer_bounds` at emission time, so the
-		// image fills the rasterized quad. Refit UVs against `inner_bounds` so the image is scoped
-		// to the area inside the new outline rather than overflowing into the border strip.
-		inner_bounds := Rectangle {
-			x      = candidate.outer_bounds.x + uniform_width,
-			y      = candidate.outer_bounds.y + uniform_width,
-			width  = candidate.outer_bounds.width - 2 * uniform_width,
-			height = candidate.outer_bounds.height - 2 * uniform_width,
-		}
-		uv_rect, _, _ := fit_params(candidate.image_data.fit, inner_bounds, candidate.image_data.texture_id)
-		prim.uv_rect = {uv_rect.x, uv_rect.y, uv_rect.width, uv_rect.height}
-	}
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Command dispatch ------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-// Dispatch a single non-backdrop Clay render command to the appropriate `draw` primitive.
-// Extracted from the main `prepare_clay_batch` walk so that the deferred-buffer flush path
-// can replay commands accumulated during an open backdrop scope without duplicating the
-// per-command lowering code.
-//INTERNAL
-dispatch_clay_command :: proc(
-	layer: ^Layer,
-	render_command: ^clay.RenderCommand,
-	custom_draw: Custom_Draw,
-	temp_allocator: runtime.Allocator,
-) {
-	// Translate bounding box of the primitive by the layer position
-	bounds := Rectangle {
-		x      = render_command.boundingBox.x + layer.bounds.x,
-		y      = render_command.boundingBox.y + layer.bounds.y,
-		width  = render_command.boundingBox.width,
-		height = render_command.boundingBox.height,
-	}
-
-	switch render_command.commandType {
-	case clay.RenderCommandType.None:
-		log.error(
-			"Received render command with type None. This generally means the renderer is in an invalid state.",
-		)
-	case clay.RenderCommandType.Text:
-		render_data := render_command.renderData.text
-		txt := string(render_data.stringContents.chars[:render_data.stringContents.length])
-		c_text := strings.clone_to_cstring(txt, temp_allocator)
-		defer delete(c_text, temp_allocator)
-		// Clay render-command IDs are derived via Clay's internal HashNumber (Jenkins-family)
-		// and namespaced with .Clay so they can never collide with user-provided custom text IDs.
-		sdl_text := cache_get_or_update(
-			Cache_Key{render_command.id, .Clay},
-			c_text,
-			get_font(render_data.fontId, render_data.fontSize),
-		)
-		prepare_text(layer, Text{sdl_text, {bounds.x, bounds.y}, Color(render_data.textColor)})
-	case clay.RenderCommandType.Image:
-		// Image payload: `imageData` must point at a `Clay_Image_Data` (see `clay_image_data`).
-		render_data := render_command.renderData.image
-		if render_data.imageData == nil do return
-		img_data := (^Clay_Image_Data)(render_data.imageData)^
-		corner_radii_clay := render_data.cornerRadius
-		radii := Rectangle_Radii {
-			top_left     = corner_radii_clay.topLeft,
-			top_right    = corner_radii_clay.topRight,
-			bottom_right = corner_radii_clay.bottomRight,
-			bottom_left  = corner_radii_clay.bottomLeft,
-		}
-
-		background_color := Color(render_data.backgroundColor)
-		uv_rect, sampler, fit_rect := fit_params(img_data.fit, bounds, img_data.texture_id)
-
-		if background_color.a > 0 {
-			// Bg behind image. Push the bg primitive as the merge candidate so a matching Border
-			// turns into a bg+border-merged primitive plus a separate image draw on top.
-			rectangle(layer, bounds, background_color, radii = radii)
-			bg_primitive_index := u32(len(GLOB.tmp_primitives) - 1)
-			rectangle(
-				layer,
-				fit_rect,
-				Texture_Fill{id = img_data.texture_id, tint = img_data.tint, uv_rect = uv_rect, sampler = sampler},
-				radii = radii,
-			)
-			append(
-				&GLOB.clay_merge_open_stack,
-				Clay_Merge_Candidate {
-					primitive_index = bg_primitive_index,
-					outer_bounds = bounds,
-					corner_radii = corner_radii_clay,
-					kind = .Fill_Color,
-				},
-			)
-		} else {
-			// No bg: the image itself can host the outline if its fit fully covers Clay's bounds.
-			// `Fit_Mode.Fit` with aspect mismatch returns a sub-rect, which can't host an outline
-			// (the rasterized quad wouldn't reach Clay's outer edge), so we skip pushing.
-			rectangle(
-				layer,
-				fit_rect,
-				Texture_Fill{id = img_data.texture_id, tint = img_data.tint, uv_rect = uv_rect, sampler = sampler},
-				radii = radii,
-			)
-			if fit_rect == bounds {
-				img_primitive_index := u32(len(GLOB.tmp_primitives) - 1)
-				append(
-					&GLOB.clay_merge_open_stack,
-					Clay_Merge_Candidate {
-						primitive_index = img_primitive_index,
-						outer_bounds = bounds,
-						corner_radii = corner_radii_clay,
-						image_data = img_data,
-						kind = .Fill_Texture,
-					},
-				)
-			}
-		}
-	case clay.RenderCommandType.ScissorStart:
-		if bounds.width == 0 || bounds.height == 0 do return
-
-		curr_scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1]
-
-		if curr_scissor.sub_batch_len != 0 {
-			// Scissor has some content, need to make a new scissor
-			new := Scissor {
-				sub_batch_start = curr_scissor.sub_batch_start + curr_scissor.sub_batch_len,
-				bounds          = sdl.Rect {
-					c.int(bounds.x * GLOB.dpi_scaling),
-					c.int(bounds.y * GLOB.dpi_scaling),
-					c.int(bounds.width * GLOB.dpi_scaling),
-					c.int(bounds.height * GLOB.dpi_scaling),
-				},
-			}
-			append(&GLOB.scissors, new)
-			layer.scissor_len += 1
-		} else {
-			curr_scissor.bounds = sdl.Rect {
-				c.int(bounds.x * GLOB.dpi_scaling),
-				c.int(bounds.y * GLOB.dpi_scaling),
-				c.int(bounds.width * GLOB.dpi_scaling),
-				c.int(bounds.height * GLOB.dpi_scaling),
-			}
-		}
-	case clay.RenderCommandType.ScissorEnd:
-	case clay.RenderCommandType.OverlayColorStart, clay.RenderCommandType.OverlayColorEnd:
-		unimplemented("Clay overlays not supported yet...")
-	case clay.RenderCommandType.Rectangle:
-		render_data := render_command.renderData.rectangle
-		corner_radii_clay := render_data.cornerRadius
-		background_color := Color(render_data.backgroundColor)
-		radii := Rectangle_Radii {
-			top_left     = corner_radii_clay.topLeft,
-			top_right    = corner_radii_clay.topRight,
-			bottom_right = corner_radii_clay.bottomRight,
-			bottom_left  = corner_radii_clay.bottomLeft,
-		}
-		rectangle(layer, bounds, background_color, radii = radii)
-		// Register this primitive as a merge candidate. If the element has a matching Border
-		// later in the stream (after its children's commands), `try_dispatch_clay_border_merge`
-		// will pop this candidate and mutate the primitive in-place to add the outline.
-		primitive_index := u32(len(GLOB.tmp_primitives) - 1)
-		append(
-			&GLOB.clay_merge_open_stack,
-			Clay_Merge_Candidate {
-				primitive_index = primitive_index,
-				outer_bounds = bounds,
-				corner_radii = corner_radii_clay,
-				kind = .Fill_Color,
-			},
-		)
-	case clay.RenderCommandType.Border:
-		render_data := render_command.renderData.border
-		if try_dispatch_clay_border_merge(bounds, render_data) do return
-		clay_emit_partial_border(
-			layer,
-			bounds,
-			Color(render_data.color),
-			render_data.width,
-			render_data.cornerRadius,
-		)
-	case clay.RenderCommandType.Custom:
-		// Copy the CustomRenderData by value so we can patch its `customData` field for the
-		// user callback without mutating Clay-owned memory. After unwrapping, the callback
-		// sees its own pointer in `render_data.customData`, identical to what it would see
-		// if `Clay_Custom` did not exist as an intermediary.
-		patched := render_command.renderData.custom
-		// Default to nil so a zero-init `Clay_Custom` (no variant set) and an originally-nil
-		// `customData` both surface to the callback as `customData = nil`.
-		patched.customData = nil
-		if custom_data_pointer := render_command.renderData.custom.customData; custom_data_pointer != nil {
-			switch custom_value in (^Clay_Custom)(custom_data_pointer)^ {
-			case Backdrop_Marker:
-				// The walker pre-filters backdrops into `dispatch_clay_backdrop` and never feeds
-				// them here; reaching this branch means either the walker logic is broken or the
-				// `Clay_Custom` variant tag mutated between the walker's `is_clay_backdrop` check
-				// and this re-check (heap corruption / lifetime bug in user-managed customData
-				// memory). Both are renderer-level bugs that warrant a hard failure rather than a
-				// silently-dropped panel.
-				log.panicf(
-					"backdrop marker reached dispatch_clay_command; either the prepare_clay_batch walker is misrouting commands or the customData pointee at %p was mutated mid-frame",
-					render_command.renderData.custom.customData,
-				)
-			case rawptr: patched.customData = custom_value
-			}
-		}
-		if custom_draw != nil {
-			custom_draw(layer, bounds, patched)
-		} else if patched.customData != nil {
-			log.panicf(
-				"Received clay render command of type custom with non-nil user data but no custom_draw proc provided.",
-			)
-		}
-	}
-}
-
-// Dispatch a single backdrop Clay render command to `backdrop_blur` on the active layer.
-// Caller guarantees:
-//   - a backdrop scope is open on `layer` so the underlying `append_or_extend_sub_batch`
-//     contract assertion is satisfied;
-//   - the command's `customData` points at a `Clay_Custom` whose active variant is
-//     `Backdrop_Marker` (the walker has already verified this via `is_clay_backdrop`).
-//INTERNAL
-dispatch_clay_backdrop :: proc(layer: ^Layer, cmd: ^clay.RenderCommand) {
-	bounds := Rectangle {
-		x      = cmd.boundingBox.x + layer.bounds.x,
-		y      = cmd.boundingBox.y + layer.bounds.y,
-		width  = cmd.boundingBox.width,
-		height = cmd.boundingBox.height,
-	}
-	// Type-asserting form (no `, ok`): panics loudly if the variant tag changed since
-	// `is_clay_backdrop`, which is the desired tripwire for a heap-corruption bug in
-	// user-managed customData.
-	marker := (^Clay_Custom)(cmd.renderData.custom.customData).(Backdrop_Marker)
-	backdrop_blur(
-		layer,
-		bounds,
-		gaussian_sigma = marker.sigma,
-		tint = marker.tint,
-		radii = marker.radii,
-		feather_ppx = marker.feather_ppx,
-	)
-}
-
-// Close the in-flight backdrop scope (if open) and replay every command accumulated in the
-// deferred index buffer. Ordering: end_backdrop first so deferred non-backdrop draws land
-// at submission position relative to the bracket they followed (the bracket is now closed,
-// so these draws render after it). Used at every zIndex transition and at end of stream.
-//INTERNAL
-flush_deferred_and_close_backdrop_scope :: proc(
-	layer: ^Layer,
-	batch: ^ClayBatch,
-	deferred_indices: ^[dynamic]i32,
-	backdrop_scope_open: ^bool,
-	custom_draw: Custom_Draw,
-	temp_allocator: runtime.Allocator,
-) {
-	if backdrop_scope_open^ {
-		end_backdrop(layer)
-		backdrop_scope_open^ = false
-	}
-	// Clear the merge stack at scope/stratum boundaries: any pending candidates from the
-	// pre-scope (or pre-transition) commands stay as plain primitives — they can't merge
-	// with Borders on the far side of the boundary because that would change draw order.
-	clear(&GLOB.clay_merge_open_stack)
-	for index in deferred_indices^ {
-		cmd := clay.RenderCommandArray_Get(&batch.cmds, index)
-		dispatch_clay_command(layer, cmd, custom_draw, temp_allocator)
-	}
-	clear(deferred_indices)
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Main entry point ------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-// Process Clay render commands into shape, text, and backdrop primitives.
-//
-// Single-walk dispatcher with a deferred buffer. The walk does three things per command:
-//   1. zIndex transitions: close the in-flight scope, flush any deferred non-backdrop
-//      commands into the current layer, then open a new layer seeded with `base_layer.bounds`
-//      (NOT the bumping element's bounds — Clay's floating elements with `clipTo = .None`
-//      should not be over-clipped, and `clipTo = .AttachedParent` floating elements get a
-//      Clay-emitted ScissorStart immediately afterward that narrows correctly).
-//   2. Backdrop commands: open a scope on first encounter (extending it on subsequent ones),
-//      then dispatch the backdrop_blur call.
-//   3. Non-backdrop commands during an open scope: append to the deferred buffer for replay
-//      after the scope closes. The buffer holds command indices, not pointers, so it stays
-//      valid even if the underlying ClayArray reallocates.
-// At end of stream, flush whatever remains.
-prepare_clay_batch :: proc(
-	base_layer: ^Layer,
-	batch: ^ClayBatch,
-	custom_draw: Custom_Draw = nil,
-	temp_allocator := context.temp_allocator,
-) {
-	layer := base_layer
-	command_count := int(batch.cmds.length)
-	deferred_indices := make([dynamic]i32, 0, 16, temp_allocator)
-	backdrop_scope_open := false
-	// Seed from GLOB.clay_z_index so multi-batch frames preserve the original semantics: a
-	// later call to `prepare_clay_batch` doesn't re-trigger layer splits for zIndex values
-	// the previous batch already saw.
-	previous_z_index := GLOB.clay_z_index
-
-	// Start with a clean merge stack. The stack is also cleared by
-	// `flush_deferred_and_close_backdrop_scope` at every stratum boundary; both clears together
-	// ensure merge candidates never pair across a boundary that would shift draw order.
-	clear(&GLOB.clay_merge_open_stack)
-	for i in 0 ..< command_count {
-		cmd := clay.RenderCommandArray_Get(&batch.cmds, i32(i))
-
-		// zIndex transition: close out current stratum, create new layer, continue.
-		if cmd.zIndex > previous_z_index {
-			log.debug("Higher zIndex found, creating new layer & setting z_index to", cmd.zIndex)
-			flush_deferred_and_close_backdrop_scope(
-				layer,
-				batch,
-				&deferred_indices,
-				&backdrop_scope_open,
-				custom_draw,
-				temp_allocator,
-			)
-			layer = new_layer(layer, base_layer.bounds)
-			previous_z_index = cmd.zIndex
-			// Keep GLOB.clay_z_index in sync for any external readers (debug tooling, etc.).
-			GLOB.clay_z_index = cmd.zIndex
-		}
-
-		if is_clay_backdrop(cmd) {
-			if !backdrop_scope_open {
-				begin_backdrop(layer)
-				backdrop_scope_open = true
-			}
-			dispatch_clay_backdrop(layer, cmd)
-		} else if backdrop_scope_open {
-			append(&deferred_indices, i32(i))
-		} else {
-			// Rectangle/Image dispatches push merge candidates; Border dispatches pop the stack
-			// to retroactively add an outline to a matching candidate. See
-			// `try_dispatch_clay_border_merge` for the matching semantics.
-			dispatch_clay_command(layer, cmd, custom_draw, temp_allocator)
-		}
-	}
-
-	// End-of-stream: flush whatever remains.
-	flush_deferred_and_close_backdrop_scope(
-		layer,
-		batch,
-		&deferred_indices,
-		&backdrop_scope_open,
-		custom_draw,
-		temp_allocator,
-	)
-}
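The deferred-buffer walk that the removed `prepare_clay_batch` describes can be sketched in isolation. This is a minimal illustration, not the removed implementation: `Command`, `walk`, and the command names are hypothetical stand-ins, and layer handling, zIndex splits, and border merging are left out. It shows only the ordering rule: backdrop commands run immediately, and non-backdrop commands seen while a scope is open are stored by index and replayed afterward.

```odin
package main

import "core:fmt"

// Hypothetical stand-in for a Clay render command; not the real clay type.
Command :: struct {
	name:        string,
	is_backdrop: bool,
}

// Walks `cmds` once. Backdrop commands are recorded immediately; every other command
// seen while a backdrop scope is open is deferred by index and replayed after the walk.
walk :: proc(cmds: []Command, order: ^[dynamic]string) {
	deferred := make([dynamic]i32, 0, 16, context.temp_allocator)
	scope_open := false

	for cmd, i in cmds {
		if cmd.is_backdrop {
			scope_open = true
			append(order, cmd.name)
		} else if scope_open {
			append(&deferred, i32(i)) // store the index, not a pointer
		} else {
			append(order, cmd.name)
		}
	}

	// End of stream: replay deferred commands in their original relative order.
	for index in deferred {
		append(order, cmds[index].name)
	}
}

main :: proc() {
	cmds := []Command {
		{"rect_a", false},
		{"blur", true},
		{"text", false},
		{"blur_2", true},
		{"rect_b", false},
	}
	order: [dynamic]string
	defer delete(order)
	walk(cmds, &order)
	fmt.println(order[:])
}
```

With this input the recorded order is rect_a, blur, blur_2, text, rect_b: both backdrop commands are dispatched inside the scope, and the two deferred commands follow in their original relative order.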
@@ -1,756 +0,0 @@
-// CYBERSTEEL DESIGN SYSTEM — Odin theme constants
-//
-// Retrofuturist. Technical. Direct. Gruvbox-derived palette
-// with Art Deco type system. Every visual token from the
-// Cybersteel design system, transferred 1:1 to Odin constants.
-//
-// Conventions:
-//   - Colors are [4]u8 RGBA. Alpha 255 = fully opaque.
-//     Translucent tints carry their alpha in the 4th channel.
-//   - Times are time.Duration via core:time.
-//   - Pixel sizes, weights, line-heights, letter-spacings, and
-//     ratio-like values are plain (untyped) numeric literals so
-//     callers can use them with whatever numeric type they need.
-//   - Letter-spacing values are expressed in EMs (multiply by
-//     the resolved font size to get pixels).
-//   - Line-heights are unitless multipliers of the font size.
-
-package cybersteel
-
-import "core:time"
-
-import draw ".."
-
-
-// ============================================================
-// BASE BACKGROUNDS — warm dark, Gruvbox-derived
-// Never pure black. The warmth is intentional: aged metal,
-// amber phosphor, old paper. Order is: deepest chrome first
-// (shell), then page, then progressively lighter surfaces.
-// ============================================================
-
-// Topbar, sidebar, nav chrome, modal backdrops. Deepest base.
-BG_SHELL :: draw.Color{0x1d, 0x20, 0x21, 0xff}
-
-// Default page canvas / main content area. One step up from shell.
-BG_PAGE :: draw.Color{0x31, 0x31, 0x31, 0xff}
-
-// Cards, panels, drawers, input fields, code blocks, table rows.
-// Slightly lighter than the page so raised surfaces read clearly
-// without shadows.
-BG_SURFACE :: draw.Color{0x3c, 0x38, 0x36, 0xff}
-
-// Selected rows, active nav items, hover states. One step lighter
-// than BG_SURFACE.
-BG_ACTIVE :: draw.Color{0x50, 0x49, 0x45, 0xff}
-
-// Disabled buttons / inputs background. Pairs with FG_MUTED text
-// only — the contrast is intentionally low.
-BG_DISABLED :: draw.Color{0x66, 0x5c, 0x54, 0xff}
-
-// Borders, dividers, rules, input outlines. Never use as a text
-// surface — it has no fg-pair guarantee.
-BG_BORDER :: draw.Color{0x7c, 0x6f, 0x64, 0xff}
-
-
-// ============================================================
-// BASE FOREGROUNDS — warm cream / ivory, never pure white
-// Five-step ramp from brightest (heading) to most muted.
-// ============================================================
-
-// Hero text, page headings, display titles. Brightest fg.
-FG_HEADING :: draw.Color{0xfb, 0xf1, 0xc7, 0xff}
-
-// Primary body text, default readable content.
-FG_BODY :: draw.Color{0xf2, 0xe2, 0xba, 0xff}
-
-// Labels, secondary descriptions, table data.
-FG_SECONDARY :: draw.Color{0xe0, 0xd0, 0xa8, 0xff}
-
-// Captions, metadata, timestamps, placeholders.
-FG_CAPTION :: draw.Color{0xce, 0xbd, 0x9e, 0xff}
-
-// Disabled text, token labels, subtle UI annotations.
-FG_MUTED :: draw.Color{0xb8, 0xa9, 0x8e, 0xff}
-
-
-// ============================================================
-// ACCENT — GOLD (signature color, Art Deco)
-// The defining accent of the system. Use sparingly: borders,
-// highlights, focus rings, primary interactive states.
-// ============================================================
-
-// Primary interactive, focus rings, headline interactive accent.
-GOLD_BRIGHT :: draw.Color{0xfa, 0xbd, 0x2f, 0xff}
-
-// Borders, decorative rules, default Art Deco ornament color.
-GOLD_DIM :: draw.Color{0xd7, 0x99, 0x21, 0xff}
-
-// Hover states, pressed accents, dimmer gold contexts.
-GOLD_MUTED :: draw.Color{0xb5, 0x76, 0x14, 0xff}
-
-// Pure CRT amber. Reserved for terminal-style glow / phosphor
-// references — distinct from gold ramp.
-AMBER :: draw.Color{0xff, 0xb0, 0x00, 0xff}
-
-
-// ============================================================
-// ACCENT — RED (danger, errors, critical alerts)
-// ============================================================
-
-RED_BRIGHT :: draw.Color{0xfb, 0x49, 0x34, 0xff}
-RED_DIM :: draw.Color{0xcc, 0x24, 0x1d, 0xff}
-RED_MUTED :: draw.Color{0x9d, 0x00, 0x06, 0xff}
-
-
-// ============================================================
-// ACCENT — GREEN (success, safe, complete)
-// ============================================================
-
-GREEN_BRIGHT :: draw.Color{0xb8, 0xbb, 0x26, 0xff}
-GREEN_DIM :: draw.Color{0x98, 0x97, 0x1a, 0xff}
-GREEN_MUTED :: draw.Color{0x79, 0x74, 0x0e, 0xff}
-
-
-// ============================================================
-// ACCENT — BLUE / TEAL (info, links, cool technical elements)
-// ============================================================
-
-BLUE_BRIGHT :: draw.Color{0x83, 0xa5, 0x98, 0xff}
-BLUE_DIM :: draw.Color{0x45, 0x85, 0x88, 0xff}
-BLUE_MUTED :: draw.Color{0x07, 0x66, 0x78, 0xff}
-
-
-// ============================================================
-// ACCENT — ORANGE (warnings, in-progress, hot paths)
-// ============================================================
-
-ORANGE_BRIGHT :: draw.Color{0xfe, 0x80, 0x19, 0xff}
-ORANGE_DIM :: draw.Color{0xd6, 0x5d, 0x0e, 0xff}
-ORANGE_MUTED :: draw.Color{0xaf, 0x3a, 0x03, 0xff}
-
-
-// ============================================================
-// ACCENT — AQUA (cool secondary accent, fresh/active states)
-// ============================================================
-
-AQUA_BRIGHT :: draw.Color{0x8e, 0xc0, 0x7c, 0xff}
-AQUA_DIM :: draw.Color{0x68, 0x9d, 0x6a, 0xff}
-AQUA_MUTED :: draw.Color{0x42, 0x7b, 0x58, 0xff}
-
-
-// ============================================================
-// ACCENT — PURPLE (rare, for categorical / data-vis variety)
-// ============================================================
-
-PURPLE_BRIGHT :: draw.Color{0xd3, 0x86, 0x9b, 0xff}
-PURPLE_DIM :: draw.Color{0xb1, 0x62, 0x86, 0xff}
-PURPLE_MUTED :: draw.Color{0x8f, 0x3f, 0x71, 0xff}
-
-
-// ============================================================
-// SEMANTIC COLOR ROLES
-// Aliases to accent ramps, named by intent. Prefer these in
-// product code so meaning travels with the value.
-// ============================================================
-
-// Primary brand interactive — buttons, key links, focus ring.
-COLOR_PRIMARY :: GOLD_BRIGHT
-COLOR_PRIMARY_DIM :: GOLD_DIM
-
-// Destructive / error / critical states.
-COLOR_DANGER :: RED_BRIGHT
-COLOR_DANGER_DIM :: RED_DIM
-
-// Successful operation / safe state / completion.
-COLOR_SUCCESS :: GREEN_BRIGHT
-COLOR_SUCCESS_DIM :: GREEN_DIM
-
-// Caution / in-progress / non-fatal anomaly.
-COLOR_WARNING :: ORANGE_BRIGHT
-COLOR_WARNING_DIM :: ORANGE_DIM
-
-// Informational / neutral status / passive notice.
-COLOR_INFO :: BLUE_BRIGHT
-COLOR_INFO_DIM :: BLUE_DIM
-
-// Hyperlinks at rest and on hover (links flip to gold on hover).
-COLOR_LINK :: BLUE_BRIGHT
-COLOR_LINK_HOVER :: GOLD_BRIGHT
-
-// Keyboard / programmatic focus ring color.
-COLOR_FOCUS :: GOLD_BRIGHT
-
-
-// ============================================================
-// SURFACE ROLES
-// Semantic aliases for the bg ramp by usage role.
-// ============================================================
-
-SURFACE_PAGE :: BG_PAGE // root canvas
-SURFACE_RAISED :: BG_SURFACE // cards, panels, inputs
-SURFACE_OVERLAY :: BG_SHELL // modals, popovers, deep chrome
-SURFACE_HOVER :: BG_ACTIVE // hovered raised surfaces
-SURFACE_ACTIVE :: BG_SURFACE // pressed/active raised surfaces
-
-
-// ============================================================
-// BORDER ROLES
-// Cybersteel borders are 1px solid, always crisp, always visible.
-// Color carries the meaning; weight rarely changes.
-// ============================================================
-
-BORDER :: BG_BORDER // structural borders, default
-BORDER_SUBTLE :: BG_DISABLED // very faint separators
-BORDER_ACCENT :: GOLD_DIM // decorative / active edge
-BORDER_FOCUS :: GOLD_BRIGHT // focus rings
-BORDER_DANGER :: RED_DIM // destructive states
-BORDER_SUCCESS :: GREEN_DIM // success states
-
-
-// ============================================================
-// TRANSLUCENT ACCENT TINTS
-// Used for hover fills behind ghost buttons and for warm
-// gradient overlays. Alpha encodes the tint strength.
-// ============================================================
-
-// 20% gold tint behind a hovered secondary button.
-TINT_GOLD_HOVER :: draw.Color{0xd7, 0x99, 0x21, 0x33} // ~20% alpha
-
-// 20% red tint behind a hovered danger ghost button.
-TINT_DANGER_HOVER :: draw.Color{0xcc, 0x24, 0x1d, 0x33}
-
-// 20% green tint behind a hovered success ghost button.
-TINT_SUCCESS_HOVER :: draw.Color{0x98, 0x97, 0x1a, 0x33}
-
-// 8% gold tint — top of the diagonal "gold fade" feature
-// section overlay.
-TINT_GOLD_FADE :: draw.Color{0xfa, 0xbd, 0x2f, 0x14} // ~8% alpha
-
-// 6% amber tint — top of the vertical "amber fade" overlay.
-TINT_AMBER_FADE :: draw.Color{0xff, 0xb0, 0x00, 0x0f} // ~6% alpha
-
-// 4% gold tint — corner of card gradient.
-TINT_GOLD_CARD :: draw.Color{0xfa, 0xbd, 0x2f, 0x0a} // ~4% alpha
-
-// 3% black tint — scanline overlay stripe color.
-TINT_SCANLINE :: draw.Color{0x00, 0x00, 0x00, 0x08} // ~3% alpha
-
-
-// ============================================================
-// SHADOWS
-// Cybersteel is FLAT — no drop shadows. Elevation is expressed
-// through bg + border only. The single permitted shadow use is
-// a 1px gold ring as a focus / active indicator. Constants are
-// kept here so callers don't reach for ad-hoc shadow values.
-// ============================================================
-
-// 1px inset gold ring — only permitted shadow, used as focus
-// or selected-state outline. Width is 1px; color follows.
-SHADOW_GOLD_RING_WIDTH :: 1
-SHADOW_GOLD_RING_COLOR :: GOLD_DIM
-
-
-// ============================================================
-// SPACING SCALE (8px base grid)
-// All spacing values are multiples of 4px, with the main scale
-// in multiples of 8px. Names describe the scope of the gap, not
-// the raw size — pick by intent, not by pixel count.
-// ============================================================
-
-// Badge/tag inner padding, icon-label gap, border offsets, micro nudges.
-SPACE_CHIP :: 4
-
-// Inline element gaps, chip/pill padding, icon inset, tight row spacing.
-SPACE_ELEMENT :: 8
-
-// Button vertical padding, input inset, list row gap, label-to-field gap.
-SPACE_COMPONENT :: 12
-
-// Card inset, input horizontal padding, form field gap, default gap.
-SPACE_GROUP :: 16
-
-// Grouped nav items, related form section spacing, compact panel inset.
-SPACE_CLUSTER :: 20
-
-// Sidebar / panel inset, modal body padding, drawer inset, section
-// subheader gap.
-SPACE_PANEL :: 24
-
-// Between distinct content blocks, card grid gutter, toolbar height.
-SPACE_BLOCK :: 32
-
-// Major content group spacing, dialog padding, page sub-section gap.
-SPACE_CONTENT :: 40
-
-// Page section breaks, feature group dividers, hero subheading gap.
-SPACE_SECTION :: 48
-
-// Hero vertical padding, layout area spacing, large feature gaps.
-SPACE_REGION :: 64
-
-// Page-scale layout spacing, full-width section vertical rhythm.
-SPACE_ZONE :: 80
-
-// Page margins, full-bleed hero top padding, maximum layout gutter.
-SPACE_CANVAS :: 96
-
-
-// ============================================================
-// CORNER RADIUS
-// Cybersteel does not round its corners like a toy. 0–4px is the
-// preferred range; larger radii exist only for chips/pills.
-// ============================================================
-
-RADIUS_NONE :: 0 // sharp corners — preferred default for chrome
-RADIUS_SM :: 4 // micro-rounding for inline code, small badges
-RADIUS_MD :: 6 // default for cards, buttons, inputs
-RADIUS_LG :: 10 // rare — used only for prominent containers
-RADIUS_PILL :: 999 // fully-rounded chips, status pills, tags
-
-
-// ============================================================
-// BORDER WIDTH
-// 1px solid is the standard. Heavier weights are only used for
-// the Art Deco hairline accent on pre/code blocks.
-// ============================================================
-
-// Standard border weight everywhere — always crisp, always visible.
-BORDER_WIDTH_DEFAULT :: 1
-
-// Accent edge on <pre> blocks (left side, gold) and similar
-// emphasized rule treatments.
-BORDER_WIDTH_ACCENT :: 2
-
-
-// ============================================================
-// MOTION — TRANSITION DURATIONS
-// Fast and purposeful. No bounce, no spring, no elastic. UI
-// state changes in well under a quarter-second. Animations
-// must explain causality; nothing is decorative.
-// ============================================================
-
-// Entering active/pressed state. Snap-down feel — must feel
-// instant under the finger.
-TRANSITION_PRESS :: 55 * time.Millisecond
-
-// Releasing from a pressed state, and slower hover-out cases.
-TRANSITION_UI :: 180 * time.Millisecond
-
-// Hover enter / exit color shift on buttons, cards, links.
-TRANSITION_HOVER :: 150 * time.Millisecond
-
-// Overlay / modal / popover fade-in. Slightly longer to
-// signal "a layer changed", not "a control changed".
-TRANSITION_MODAL :: 200 * time.Millisecond
-
-// Cursor / immediate-feedback transitions (caret moves,
-// terminal output ticks).
-TRANSITION_CURSOR :: 80 * time.Millisecond
-
-
-// ============================================================
-// MOTION — COMPONENT-LEVEL TIMINGS
-// Specific named durations for known interactions. Prefer these
-// over picking a raw transition for a given component.
-// ============================================================
-
-// Button press fade — primary/secondary/danger/success share this.
-BUTTON_PRESS_FADE_DUR :: 55 * time.Millisecond
-
-// Button release / hover-out fade.
-BUTTON_RELEASE_FADE_DUR :: 180 * time.Millisecond
-
-// Card hover (border + bg crossfade).
-CARD_HOVER_FADE_DUR :: 150 * time.Millisecond
-
-// Card press (border + bg snap to active).
-CARD_PRESS_FADE_DUR :: 55 * time.Millisecond
-
-// Modal / overlay enter.
-MODAL_ENTER_DUR :: 200 * time.Millisecond
-
-// Modal / overlay exit (mirror of enter for symmetry).
-MODAL_EXIT_DUR :: 200 * time.Millisecond
-
-// Link color crossfade on hover.
-LINK_HOVER_FADE_DUR :: 180 * time.Millisecond
-
-// Terminal scanline flicker tick — single frame of the loop.
-SCANLINE_FLICKER_TICK :: 80 * time.Millisecond
-
-
-// ============================================================
-// TYPOGRAPHY — FONT FAMILY NAMES
-// Sans: IBM Plex Sans
-// Mono: Lilex — IBM Plex Mono with programming ligatures.
-//       Drop-in Plex Mono replacement; same skeleton, same
-//       proportions, plus =>, !=, >=, <=, etc. ligatures.
-// Plex Sans covers display, body, and condensed roles by
-// default. Lilex is for code, terminal output, data values,
-// and full mono-mode surfaces.
-// ============================================================
-
-// Plain family names
-FONT_FAMILY_SANS :: "IBM Plex Sans"
-FONT_FAMILY_MONO :: "Lilex"
-
-// IBM Plex Sans raw font data
-SANS_THIN_RAW :: #load("fonts/IBMPlexSans-Thin.ttf") // IBM Plex Sans
-SANS_THIN_ITALIC_RAW :: #load("fonts/IBMPlexSans-ThinItalic.ttf") // IBM Plex Sans
-SANS_EXTRALIGHT_RAW :: #load("fonts/IBMPlexSans-ExtraLight.ttf") // IBM Plex Sans
-SANS_EXTRALIGHT_ITALIC_RAW :: #load("fonts/IBMPlexSans-ExtraLightItalic.ttf") // IBM Plex Sans
-SANS_LIGHT_RAW :: #load("fonts/IBMPlexSans-Light.ttf") // IBM Plex Sans
-SANS_LIGHT_ITALIC_RAW :: #load("fonts/IBMPlexSans-LightItalic.ttf") // IBM Plex Sans
-SANS_REGULAR_RAW :: #load("fonts/IBMPlexSans-Regular.ttf") // IBM Plex Sans
-SANS_ITALIC_RAW :: #load("fonts/IBMPlexSans-Italic.ttf") // IBM Plex Sans
-SANS_MEDIUM_RAW :: #load("fonts/IBMPlexSans-Medium.ttf") // IBM Plex Sans
-SANS_MEDIUM_ITALIC_RAW :: #load("fonts/IBMPlexSans-MediumItalic.ttf") // IBM Plex Sans
-SANS_SEMIBOLD_RAW :: #load("fonts/IBMPlexSans-SemiBold.ttf") // IBM Plex Sans
-SANS_SEMIBOLD_ITALIC_RAW :: #load("fonts/IBMPlexSans-SemiBoldItalic.ttf") // IBM Plex Sans
-SANS_BOLD_RAW :: #load("fonts/IBMPlexSans-Bold.ttf") // IBM Plex Sans
-SANS_BOLD_ITALIC_RAW :: #load("fonts/IBMPlexSans-BoldItalic.ttf") // IBM Plex Sans
-
-// Lilex raw font data
-MONO_THIN_RAW :: #load("fonts/Lilex-Thin.ttf") // Lilex
-MONO_THIN_ITALIC_RAW :: #load("fonts/Lilex-ThinItalic.ttf") // Lilex
-MONO_EXTRALIGHT_RAW :: #load("fonts/Lilex-ExtraLight.ttf") // Lilex
-MONO_EXTRALIGHT_ITALIC_RAW :: #load("fonts/Lilex-ExtraLightItalic.ttf") // Lilex
-MONO_LIGHT_RAW :: #load("fonts/Lilex-Light.ttf") // Lilex
-MONO_LIGHT_ITALIC_RAW :: #load("fonts/Lilex-LightItalic.ttf") // Lilex
-MONO_REGULAR_RAW :: #load("fonts/Lilex-Regular.ttf") // Lilex
-MONO_ITALIC_RAW :: #load("fonts/Lilex-Italic.ttf") // Lilex
-MONO_MEDIUM_RAW :: #load("fonts/Lilex-Medium.ttf") // Lilex
-MONO_MEDIUM_ITALIC_RAW :: #load("fonts/Lilex-MediumItalic.ttf") // Lilex
-MONO_SEMIBOLD_RAW :: #load("fonts/Lilex-SemiBold.ttf") // Lilex
-MONO_SEMIBOLD_ITALIC_RAW :: #load("fonts/Lilex-SemiBoldItalic.ttf") // Lilex
-MONO_BOLD_RAW :: #load("fonts/Lilex-Bold.ttf") // Lilex
-MONO_BOLD_ITALIC_RAW :: #load("fonts/Lilex-BoldItalic.ttf") // Lilex
-
-
-// ============================================================
-// TYPOGRAPHY — TYPE SCALE (1.25 modular ratio, base 16px)
-// Minimum body size on web is 14px; print is 12pt.
-// ============================================================
-
-TEXT_XS :: 11 // status badges, fine print
-TEXT_SM :: 13 // secondary labels, captions
-TEXT_BASE :: 15 // default body text
-TEXT_MD :: 16 // slightly prominent body
-TEXT_LG :: 18 // subheadings, emphasized labels
-TEXT_XL :: 22 // H3 level
-TEXT_2XL :: 28 // H2 level
-TEXT_3XL :: 36 // H1 level
-TEXT_4XL :: 48 // display / hero
-TEXT_5XL :: 64 // hero display
-TEXT_6XL :: 96 // max scale; masthead only
-
-
-// ============================================================
-// TYPOGRAPHY — FONT WEIGHTS
-// Constrained to the STATIC weights that BOTH faces actually
-// ship from Google Fonts — IBM Plex Sans and Lilex share the
-// same seven static instances:
-//   100 Thin · 200 ExtraLight · 300 Light · 400 Regular ·
-//   500 Medium · 600 SemiBold · 700 Bold
-// There is no 800 ExtraBold and no 900 Black for either face.
-// Do not request a weight outside this set — Google's API
-// will fail or substitute, and the design will drift.
-// ============================================================
-
-WEIGHT_THIN :: 100
-WEIGHT_EXTRALIGHT :: 200
-WEIGHT_LIGHT :: 300
-WEIGHT_REGULAR :: 400
-WEIGHT_MEDIUM :: 500
-WEIGHT_SEMIBOLD :: 600
-WEIGHT_BOLD :: 700
-
-
-// ============================================================
-// TYPOGRAPHY — LINE HEIGHTS (unitless multipliers)
-// Multiply by font size to derive a leading in pixels.
-// ============================================================
-
-LEADING_TIGHT :: 1.15 // display headings
-LEADING_SNUG :: 1.30 // subheadings
-LEADING_NORMAL :: 1.50 // default body prose
-LEADING_LOOSE :: 1.70 // long-form reading, sparse density
-LEADING_MONO :: 1.40 // code / terminal output
-
-
-// ============================================================
-// TYPOGRAPHY — LETTER SPACING (in EM units)
-// Multiply by the resolved font size to get pixel spacing.
-// ============================================================
-
-TRACKING_TIGHT :: -0.02 // large headings, tightened display
-TRACKING_NORMAL :: 0.00 // body default
-TRACKING_WIDE :: 0.05 // H1/H2 ALL CAPS, button labels
-TRACKING_WIDER :: 0.10 // H5 caps, section headers
-TRACKING_WIDEST :: 0.20 // .label / .label-mono — ALL CAPS chip text
-
-
-// ============================================================
-// HEADING ROLES — paired size + tracking + casing intent
-// Casing is documentation only; these are the numbers a
-// renderer actually consumes.
-// ============================================================
-
-// H1 — page title, masthead. Title Case, ALL CAPS at display.
-H1_SIZE :: TEXT_3XL
-H1_WEIGHT :: WEIGHT_BOLD
-H1_TRACKING :: TRACKING_WIDE
-H1_LEADING :: LEADING_TIGHT
-
-// H2 — major section. ALL CAPS.
-H2_SIZE :: TEXT_2XL
-H2_WEIGHT :: WEIGHT_BOLD
-H2_TRACKING :: TRACKING_WIDE
-H2_LEADING :: LEADING_TIGHT
-
-// H3 — subsection. Sentence case, condensed semibold.
-H3_SIZE :: TEXT_XL
-H3_WEIGHT :: WEIGHT_SEMIBOLD
-H3_TRACKING :: TRACKING_NORMAL
-H3_LEADING :: LEADING_TIGHT
-
-// H4 — minor subsection.
-H4_SIZE :: TEXT_LG
-H4_WEIGHT :: WEIGHT_SEMIBOLD
-H4_TRACKING :: TRACKING_NORMAL
-H4_LEADING :: LEADING_SNUG
-
-// H5 — small caps section header (uses FG_SECONDARY).
-H5_SIZE :: TEXT_BASE
-H5_WEIGHT :: WEIGHT_SEMIBOLD
-H5_TRACKING :: TRACKING_WIDER
-H5_LEADING :: LEADING_SNUG
-
-// H6 — mono caps eyebrow / overline (uses FG_CAPTION).
-H6_SIZE :: TEXT_SM
-H6_WEIGHT :: WEIGHT_REGULAR
-H6_TRACKING :: TRACKING_WIDEST
-H6_LEADING :: LEADING_SNUG
-
-
-// ============================================================
-// LABEL ROLES — small caps annotation chips
-// ============================================================
-
-// .label — sans condensed, ALL CAPS, FG_CAPTION.
-LABEL_SIZE :: TEXT_XS
-LABEL_WEIGHT :: WEIGHT_SEMIBOLD
-LABEL_TRACKING :: TRACKING_WIDEST
-
-// .label-mono — mono ALL CAPS, FG_MUTED.
-LABEL_MONO_SIZE :: TEXT_XS
-LABEL_MONO_WEIGHT :: WEIGHT_REGULAR
-LABEL_MONO_TRACKING :: TRACKING_WIDEST
-
-
-// ============================================================
-// FOCUS RING
-// 1px solid gold outline at 2px offset. Crisp, never blurry.
-// No glow, no box-shadow halo.
-// ============================================================
-
-FOCUS_RING_WIDTH :: 1
-FOCUS_RING_OFFSET :: 2
-FOCUS_RING_COLOR :: BORDER_FOCUS // GOLD_BRIGHT
-
-
-// ============================================================
-// COMPONENT — BUTTONS
-// Cybersteel buttons are uppercase, semibold→bold, with wide
-// tracking. Default size is "md"; sm/lg shift padding + size.
-// ============================================================
-
-// Default (md) padding: vertical / horizontal
-BUTTON_PAD_Y :: 8
-BUTTON_PAD_X :: 18
-BUTTON_FONT_SIZE :: 12
-BUTTON_FONT_WEIGHT :: WEIGHT_BOLD
-BUTTON_TRACKING :: 0.07 // EM — ALL CAPS button label
-BUTTON_RADIUS :: RADIUS_MD
-BUTTON_BORDER :: BORDER_WIDTH_DEFAULT
-
-// Small button
-BUTTON_SM_PAD_Y :: 5
-BUTTON_SM_PAD_X :: 12
-BUTTON_SM_FONT_SIZE :: 10
-
-// Large button
-BUTTON_LG_PAD_Y :: 11
-BUTTON_LG_PAD_X :: 24
-BUTTON_LG_FONT_SIZE :: 14
-
-// Primary — solid gold fill, dark text. Hover brightens, press
-// flips to fg-heading (cream) fill.
-BUTTON_PRIMARY_BG :: GOLD_DIM
-BUTTON_PRIMARY_FG :: BG_SHELL
-BUTTON_PRIMARY_BORDER :: GOLD_DIM
-BUTTON_PRIMARY_BG_HOVER :: GOLD_BRIGHT
-BUTTON_PRIMARY_BORDER_HOVER :: GOLD_BRIGHT
-BUTTON_PRIMARY_BG_PRESS :: FG_HEADING
-BUTTON_PRIMARY_FG_PRESS :: BG_SHELL
-BUTTON_PRIMARY_BORDER_PRESS :: FG_HEADING
-
-// Secondary — transparent bg, structural border, hover gains
-// gold tint + gold-dim border, press fills with gold-bright.
-BUTTON_SECONDARY_BG :: [4]u8{0, 0, 0, 0} // transparent
-BUTTON_SECONDARY_FG :: FG_SECONDARY
-BUTTON_SECONDARY_BORDER :: BG_BORDER
-BUTTON_SECONDARY_BG_HOVER :: TINT_GOLD_HOVER
-BUTTON_SECONDARY_BORDER_HOVER :: GOLD_DIM
-BUTTON_SECONDARY_FG_HOVER :: FG_BODY
-BUTTON_SECONDARY_BG_PRESS :: GOLD_BRIGHT
-BUTTON_SECONDARY_FG_PRESS :: [4]u8{0xff, 0xff, 0xff, 0xff}
-BUTTON_SECONDARY_BORDER_PRESS :: GOLD_BRIGHT
-
-// Ghost — fully transparent, no border. Hover lifts to BG_ACTIVE.
-BUTTON_GHOST_BG :: [4]u8{0, 0, 0, 0}
-BUTTON_GHOST_FG :: FG_CAPTION
-BUTTON_GHOST_BORDER :: [4]u8{0, 0, 0, 0}
-BUTTON_GHOST_BG_HOVER :: BG_ACTIVE
-BUTTON_GHOST_FG_HOVER :: FG_BODY
-BUTTON_GHOST_BG_PRESS :: GOLD_DIM
-BUTTON_GHOST_FG_PRESS :: [4]u8{0xff, 0xff, 0xff, 0xff}
-
-// Danger — destructive ghost button.
-BUTTON_DANGER_BG :: [4]u8{0, 0, 0, 0}
-BUTTON_DANGER_FG :: RED_BRIGHT
-BUTTON_DANGER_BORDER :: RED_DIM
-BUTTON_DANGER_BG_HOVER :: TINT_DANGER_HOVER
-BUTTON_DANGER_BORDER_HOVER :: RED_BRIGHT
-BUTTON_DANGER_FG_HOVER :: FG_BODY
-BUTTON_DANGER_BG_PRESS :: RED_BRIGHT
-BUTTON_DANGER_FG_PRESS :: [4]u8{0xff, 0xff, 0xff, 0xff}
-BUTTON_DANGER_BORDER_PRESS :: RED_BRIGHT
-
-// Success — confirming ghost button.
-BUTTON_SUCCESS_BG :: [4]u8{0, 0, 0, 0}
-BUTTON_SUCCESS_FG :: GREEN_BRIGHT
-BUTTON_SUCCESS_BORDER :: GREEN_DIM
-BUTTON_SUCCESS_BG_HOVER :: TINT_SUCCESS_HOVER
-BUTTON_SUCCESS_BORDER_HOVER :: GREEN_BRIGHT
-BUTTON_SUCCESS_FG_HOVER :: FG_BODY
-BUTTON_SUCCESS_BG_PRESS :: GREEN_BRIGHT
-BUTTON_SUCCESS_FG_PRESS :: [4]u8{0xff, 0xff, 0xff, 0xff}
-BUTTON_SUCCESS_BORDER_PRESS :: GREEN_BRIGHT
-
-// Disabled — flat low-contrast surface, opacity-dimmed.
-BUTTON_DISABLED_BG :: BG_ACTIVE
-BUTTON_DISABLED_FG :: FG_MUTED
-BUTTON_DISABLED_BORDER :: BG_BORDER
-BUTTON_DISABLED_OPACITY :: 0.5
-
-
-// ============================================================
-// COMPONENT — CARDS
-// Flat, structural, mechanical. Background sits one step above
-// page; border is structural by default and shifts to gold-dim
-// on hover/press. Corner radius is the default 6px (RADIUS_MD).
-// ============================================================
-
-CARD_BG :: BG_SURFACE
-CARD_BORDER :: BG_BORDER
-CARD_BORDER_HOVER :: GOLD_DIM
-CARD_BG_PRESS :: BG_ACTIVE
-CARD_BORDER_PRESS :: GOLD_DIM
-CARD_RADIUS :: RADIUS_MD
-CARD_BORDER_WIDTH :: BORDER_WIDTH_DEFAULT
-CARD_PADDING :: SPACE_GROUP // 16px default inset
-
-
-// ============================================================
-// COMPONENT — INPUTS
-// Inputs sit on BG_SURFACE with structural borders. Focus
-// promotes the border to gold-bright; the focus ring follows.
-// ============================================================
-
-INPUT_BG :: BG_SURFACE
-INPUT_FG :: FG_BODY
-INPUT_PLACEHOLDER :: FG_CAPTION
-INPUT_BORDER :: BG_BORDER
-INPUT_BORDER_HOVER :: GOLD_DIM
-INPUT_BORDER_FOCUS :: GOLD_BRIGHT
-INPUT_BORDER_DANGER :: RED_DIM
-INPUT_RADIUS :: RADIUS_MD
-INPUT_PAD_Y :: SPACE_COMPONENT // 12
-INPUT_PAD_X :: SPACE_GROUP // 16
-
-
-// ============================================================
-// COMPONENT — BADGES / STATUS PILLS
-// ============================================================
-
-BADGE_FONT_SIZE :: TEXT_XS
-BADGE_WEIGHT :: WEIGHT_SEMIBOLD
-BADGE_TRACKING :: TRACKING_WIDEST
-BADGE_PAD_Y :: SPACE_CHIP // 4
-BADGE_PAD_X :: SPACE_ELEMENT // 8
-BADGE_RADIUS :: RADIUS_SM
-
-
-// ============================================================
-// COMPONENT — DECO RULE
-// Hairline Art Deco horizontal rule: 1px gold-dim top + 1px
-// structural drop, with panel-sized vertical margins.
-// ============================================================
-
-DECO_RULE_TOP_WIDTH :: 1
-DECO_RULE_TOP_COLOR :: GOLD_DIM
-DECO_RULE_DROP_WIDTH :: 1
-DECO_RULE_DROP_COLOR :: BG_BORDER
-DECO_RULE_MARGIN_Y :: SPACE_PANEL // 24
-
-
-// ============================================================
-// LAYOUT — FIXED CHROME WIDTHS
-// Sidebar widths are fixed; content lives in 8 or 12 column
-// grids. No responsive collapsing for chrome — Cybersteel UIs
-// run on real workstations.
-// ============================================================
-
-SIDEBAR_WIDTH_NARROW :: 240
-SIDEBAR_WIDTH_WIDE :: 280
-
-GRID_COLUMNS_NARROW :: 8
-GRID_COLUMNS_WIDE :: 12
-
-// Toolbar height matches SPACE_BLOCK so vertical rhythm aligns.
-TOOLBAR_HEIGHT :: SPACE_BLOCK // 32
-
-
-// ============================================================
-// CODE BLOCKS — <pre>
-// Mono, BG_SHELL surface with a 1px structural border and a
-// 2px gold-dim accent on the left edge.
-// ============================================================
-
-CODE_INLINE_BG :: BG_SURFACE
-CODE_INLINE_FG :: GOLD_BRIGHT
-CODE_INLINE_BORDER :: BG_BORDER
-CODE_INLINE_PAD_Y :: 2
-CODE_INLINE_PAD_X :: 6
-CODE_INLINE_RADIUS :: RADIUS_SM
-
-PRE_BG :: BG_SHELL
-PRE_FG :: FG_BODY
-PRE_BORDER :: BG_BORDER
-PRE_BORDER_LEFT_COLOR :: GOLD_DIM
-PRE_BORDER_LEFT_WIDTH :: BORDER_WIDTH_ACCENT // 2
-PRE_PAD_Y :: SPACE_GROUP // 16
-PRE_PAD_X :: SPACE_PANEL // 24
-
-
-// ============================================================
-// SCANLINE OVERLAY (opt-in, terminal surfaces only)
-// Repeating-stripe pattern at very low opacity. Stripe is 2 logical
-// pixels transparent + 2 logical pixels black-at-3% (TINT_SCANLINE).
-// ============================================================
-
-SCANLINE_STRIPE_LPX :: 2
-SCANLINE_GAP_LPX :: 2
-SCANLINE_COLOR :: TINT_SCANLINE
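The removed theme stores letter-spacing in EMs and line-height as a unitless multiplier, so a renderer has to resolve both against a font size. A minimal sketch of that conversion, assuming the values are redeclared locally (the three constants are copied from the removed file; `font_size`, `tracking_px`, and `leading_px` are illustrative names):

```odin
package main

import "core:fmt"

// Values copied from the removed theme file; redeclared so the sketch is self-contained.
TEXT_3XL :: 36
TRACKING_WIDE :: 0.05
LEADING_TIGHT :: 1.15

main :: proc() {
	// The untyped constants take on f32 here, which is why the theme left them untyped.
	font_size: f32 = TEXT_3XL
	tracking_px := TRACKING_WIDE * font_size // EM -> pixels (about 1.8 for H1)
	leading_px := LEADING_TIGHT * font_size // multiplier -> pixels (about 41.4 for H1)
	fmt.println(tracking_px, leading_px)
}
```

Because the constants are untyped, the same declarations would work unchanged with `f64` or a fixed-point type on the caller's side.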
@@ -1,71 +1,50 @@
 package draw_qr

-import "core:mem"
-import "core:slice"
-
 import draw ".."
 import "../../qrcode"

-DFT_QR_DARK :: draw.BLACK // Default QR code dark module color.
-DFT_QR_LIGHT :: draw.WHITE // Default QR code light module color.
-DFT_QR_BOOST_ECL :: true // Default QR error correction level boost.
-DFT_QR_QUIET_ZONE :: 4 // Default light-pixel border on each side; 4 is the QR spec value.
-
-// Returns the number of bytes to_texture will write. Equals dim*dim*4 where
-// dim = qrcode.get_size(qrcode_buf) + 2*quiet_zone.
-texture_size :: #force_inline proc(qrcode_buf: []u8, quiet_zone: int = DFT_QR_QUIET_ZONE) -> int {
+// Returns the number of bytes to_texture will write for the given encoded
+// QR buffer. Equivalent to size*size*4 where size = qrcode.get_size(qrcode_buf).
+texture_size :: #force_inline proc(qrcode_buf: []u8) -> int {
 	size := qrcode.get_size(qrcode_buf)
-	if size == 0 || quiet_zone < 0 do return 0
-	padded_size := size + 2 * quiet_zone
-	return padded_size * padded_size * 4
+	return size * size * 4
 }

 // Decodes an encoded QR buffer into tightly-packed RGBA pixel data written to
 // texture_buf. No allocations, no GPU calls. Returns the Texture_Desc the
 // caller should pass to draw.register_texture alongside texture_buf.
 //
-// quiet_zone adds that many `light` pixels on each side; the spec value is 4.
-// Final dimension is qrcode.get_size + 2*quiet_zone on each axis.
-//
 // Returns ok=false when:
 //   - qrcode_buf is invalid (qrcode.get_size returns 0).
-//   - quiet_zone is negative.
-//   - texture_buf is smaller than texture_size(qrcode_buf, quiet_zone).
+//   - texture_buf is smaller than texture_size(qrcode_buf).
 @(require_results)
 to_texture :: proc(
 	qrcode_buf: []u8,
 	texture_buf: []u8,
-	dark: draw.Color = DFT_QR_DARK,
-	light: draw.Color = DFT_QR_LIGHT,
-	quiet_zone: int = DFT_QR_QUIET_ZONE,
+	dark: draw.Color = draw.BLACK,
+	light: draw.Color = draw.WHITE,
 ) -> (
 	desc: draw.Texture_Desc,
 	ok: bool,
 ) {
 	size := qrcode.get_size(qrcode_buf)
-	if size == 0 || quiet_zone < 0 do return
-	padded_size := size + 2 * quiet_zone
-	if len(texture_buf) < padded_size * padded_size * 4 do return
+	if size == 0 do return {}, false
+	if len(texture_buf) < size * size * 4 do return {}, false

-	// Type-pun to []Color so each store is a single 32-bit write.
-	pixels := mem.slice_data_cast([]draw.Color, texture_buf[:padded_size * padded_size * 4])
-
-	// Bulk-fill with light: handles the border and every light QR module at once.
-	slice.fill(pixels, light)
-
-	// Overwrite only the dark modules, offset by the quiet-zone border.
 	for y in 0 ..< size {
-		row := (y + quiet_zone) * padded_size + quiet_zone
 		for x in 0 ..< size {
-			if qrcode.get_module(qrcode_buf, x, y) {
-				pixels[row + x] = dark
-			}
+			i := (y * size + x) * 4
+			c := dark if qrcode.get_module(qrcode_buf, x, y) else light
+			texture_buf[i + 0] = c[0]
+			texture_buf[i + 1] = c[1]
+			texture_buf[i + 2] = c[2]
+			texture_buf[i + 3] = c[3]
 		}
 	}

 	return draw.Texture_Desc {
-			width = u32(padded_size),
-			height = u32(padded_size),
+			width = u32(size),
+			height = u32(size),
 			depth_or_layers = 1,
 			type = .D2,
 			format = .R8G8B8A8_UNORM,
@@ -86,22 +65,21 @@ to_texture :: proc(
@(require_results)
 register_texture_from_raw :: proc(
 	qrcode_buf: []u8,
-	dark: draw.Color = DFT_QR_DARK,
-	light: draw.Color = DFT_QR_LIGHT,
-	quiet_zone: int = DFT_QR_QUIET_ZONE,
+	dark: draw.Color = draw.BLACK,
+	light: draw.Color = draw.WHITE,
 	temp_allocator := context.temp_allocator,
 ) -> (
 	texture: draw.Texture_Id,
 	ok: bool,
 ) {
-	tex_size := texture_size(qrcode_buf, quiet_zone)
+	tex_size := texture_size(qrcode_buf)
 	if tex_size == 0 do return draw.INVALID_TEXTURE, false

 	pixels, alloc_err := make([]u8, tex_size, temp_allocator)
 	if alloc_err != nil do return draw.INVALID_TEXTURE, false
 	defer delete(pixels, temp_allocator)

-	desc := to_texture(qrcode_buf, pixels, dark, light, quiet_zone) or_return
+	desc := to_texture(qrcode_buf, pixels, dark, light) or_return
 	return draw.register_texture(desc, pixels)
 }

@@ -118,10 +96,9 @@ register_texture_from_text :: proc(
 	min_version: int = qrcode.VERSION_MIN,
 	max_version: int = qrcode.VERSION_MAX,
 	mask: Maybe(qrcode.Mask) = nil,
-	boost_ecl: bool = DFT_QR_BOOST_ECL,
-	dark: draw.Color = DFT_QR_DARK,
-	light: draw.Color = DFT_QR_LIGHT,
-	quiet_zone: int = DFT_QR_QUIET_ZONE,
+	boost_ecl: bool = true,
+	dark: draw.Color = draw.BLACK,
+	light: draw.Color = draw.WHITE,
 	temp_allocator := context.temp_allocator,
 ) -> (
 	texture: draw.Texture_Id,
@@ -142,7 +119,7 @@ register_texture_from_text :: proc(
 		temp_allocator,
 	) or_return

-	return register_texture_from_raw(qrcode_buf, dark, light, quiet_zone, temp_allocator)
+	return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator)
 }

 // Encodes arbitrary binary data as a QR Code and registers the result as an RGBA texture.
@@ -158,10 +135,9 @@ register_texture_from_binary :: proc(
 	min_version: int = qrcode.VERSION_MIN,
 	max_version: int = qrcode.VERSION_MAX,
 	mask: Maybe(qrcode.Mask) = nil,
-	boost_ecl: bool = DFT_QR_BOOST_ECL,
-	dark: draw.Color = DFT_QR_DARK,
-	light: draw.Color = DFT_QR_LIGHT,
-	quiet_zone: int = DFT_QR_QUIET_ZONE,
+	boost_ecl: bool = true,
+	dark: draw.Color = draw.BLACK,
+	light: draw.Color = draw.WHITE,
 	temp_allocator := context.temp_allocator,
 ) -> (
 	texture: draw.Texture_Id,
@@ -182,10 +158,18 @@ register_texture_from_binary :: proc(
 		temp_allocator,
 	) or_return

-	return register_texture_from_raw(qrcode_buf, dark, light, quiet_zone, temp_allocator)
+	return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator)
 }

 register_texture_from :: proc {
 	register_texture_from_text,
-	register_texture_from_binary,
+	register_texture_from_binary
+}
+
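+// Builds Clay image data for a QR texture. Always uses fit = .Fit, which preserves the
+// QR's square aspect; call draw.clay_image_data directly if a different fit is needed.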
+clay_image :: #force_inline proc(
+	texture: draw.Texture_Id,
+	tint: draw.Color = draw.WHITE,
+) -> draw.Clay_Image_Data {
+	return draw.clay_image_data(texture, fit = .Fit, tint = tint)
 }
@@ -1,409 +0,0 @@
-package examples
-
-import "core:fmt"
-import "core:math"
-import "core:os"
-import sdl "vendor:sdl3"
-
-import "../../draw"
-import cyber "../cybersteel"
-
-// Backdrop example.
-//
-// Exercises the bracket scheduler end-to-end. The demo is structured as three zones in one
-// window so we can stress-test the cases that matter:
-//
-//   Zone 1 (top, base layer): animated colorful background + two side-by-side frosted panels
-//                             with DIFFERENT sigmas and DIFFERENT tints. Tests sigma grouping
-//                             and per-primitive tint.
-//
-//   Zone 2 (bottom-left, second layer): a small frosted panel in a NEW layer; its bracket sees
-//                                       Zone 1's full content (base layer's bracket output is
-//                                       carried forward via source_texture). Tests multi-layer
-//                                       backdrop sampling.
-//
-//   Zone 3 (bottom-right, base layer): edge cases. A sigma=0 "mirror" panel (no blur), two
-//                                      same-sigma panels stacked (tests sub-batch coalescing
-//                                      via append_or_extend_sub_batch), and text drawn ON TOP
-//                                      of a backdrop (tests Pass B post-bracket rendering).
-//
-// Animation: an orbiting gradient stripe plus a few orbiting circles in Zone 1. Motion is the
-// only way to visually confirm the blur is Gaussian; a static panel can't tell you whether the
-// kernel coefficients are right.
-gaussian_blur :: proc() {
-	if !sdl.Init({.VIDEO}) do os.exit(1)
-	window := sdl.CreateWindow("Backdrop blur", 800, 600, {.HIGH_PIXEL_DENSITY})
-	gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil)
-	if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1)
-	if !draw.init(gpu, window) do os.exit(1)
-	PLEX_SANS_REGULAR = draw.register_font(cyber.SANS_REGULAR_RAW)
-
-	WINDOW_W :: f32(800)
-	WINDOW_H :: f32(600)
-	FONT_SIZE :: u16(14)
-
-	t: f32 = 0
-
-	for {
-		defer free_all(context.temp_allocator)
-		ev: sdl.Event
-		for sdl.PollEvent(&ev) {
-			if ev.type == .QUIT do return
-		}
-		t += 1
-
-		base_layer := draw.begin({width = WINDOW_W, height = WINDOW_H})
-
-		//----- Background fill ----------------------------------
-		draw.rectangle(base_layer, {0, 0, WINDOW_W, WINDOW_H}, draw.Color{20, 20, 28, 255})
-
-		//----- Zone 1: animated background for the top frosted panels ----------------------------------
-
-		// A wide rotating gradient stripe sweeps left-to-right across Zone 1. The angle changes
-		// over time so the gradient itself shifts visibly.
-		stripe_angle := t * 0.4
-		draw.rectangle(
-			base_layer,
-			{20, 20, WINDOW_W - 40, 240},
-			draw.Linear_Gradient {
-				start_color = {255, 80, 60, 255},
-				end_color = {60, 120, 255, 255},
-				angle = stripe_angle,
-			},
-		)
-
-		// Five orbiting circles inside Zone 1's strip. The blur should smooth their hard edges
-		// and the gradient behind them into a continuous wash.
-		for i in 0 ..< 5 {
-			phase := f32(i) * 1.2 + t * 0.04
-			cx := 100 + f32(i) * 140 + math.cos(phase) * 30
-			cy := 140 + math.sin(phase) * 50
-			circle_color := draw.Color {
-				u8(clamp(120 + math.cos(phase) * 100, 0, 255)),
-				u8(clamp(180 + math.sin(phase * 1.3) * 60, 0, 255)),
-				u8(clamp(220 - math.sin(phase) * 80, 0, 255)),
-				255,
-			}
-			draw.circle(base_layer, {cx, cy}, 22, circle_color)
-		}
-
-		// Bright accent rectangles to give the blur some sharp edges to munch on.
-		draw.rectangle(base_layer, {200, 60, 60, 12}, draw.Color{255, 255, 200, 255})
-		draw.rectangle(base_layer, {500, 200, 80, 16}, draw.Color{200, 255, 200, 255})
-
-		//----- Zone 1 frosted panels: different sigmas, different tints --------------------------------
-
-		// Panel A: heavy blur, cool blue-grey tint. sigma=14 in logical px.
-		// Both panels share rounded corners.
-		panel_radii := draw.Rectangle_Radii{16, 16, 16, 16}
-
-		// Both zone1 panels share one scope. Different sigmas still trigger separate blur
-		// passes (cost scales with unique sigmas, not with backdrop count); the scope just
-		// declares "these draws form one bracket." `backdrop_scope` is the RAII-style API:
-		// `end_backdrop` fires automatically when the block exits.
-		{
-			draw.backdrop_scope(base_layer)
-			draw.backdrop_blur(
-				base_layer,
-				{60, 80, 320, 140},
-				gaussian_sigma = 30,
-				tint = draw.Color{170, 200, 240, 200}, // cool blue, strong mix
-				radii = panel_radii,
-			)
-
-			// Panel B: lighter blur, warm amber tint. sigma=6.
-			draw.backdrop_blur(
-				base_layer,
-				{420, 80, 320, 140},
-				gaussian_sigma = 6,
-				tint = draw.Color{255, 220, 160, 200}, // warm amber, strong mix
-				radii = panel_radii,
-			)
-		}
-
-		// Text labels for the two panels. Drawn AFTER `end_backdrop` (which fires at the
-		// scope-block exit above), so they composite on top of both panels.
-		draw.text(
-			base_layer,
-			"sigma = 20, cool tint",
-			{72, 90},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.Color{30, 35, 50, 255},
-		)
-		draw.text(
-			base_layer,
-			"sigma = 6, warm tint",
-			{432, 90},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.Color{60, 40, 20, 255},
-		)
-
-		// Post-bracket verification: a white stripe drawn AFTER `end_backdrop` in the same
-		// layer. Should render ON TOP of both panels because the backdrop scope (and its
-		// composite output) is now closed; any non-backdrop draw on this layer composites
-		// with LOAD on top of whatever the bracket left in source_texture.
-		draw.rectangle(base_layer, {WINDOW_W * 0.5 - 4, 70, 8, 160}, draw.Color{255, 255, 255, 230})
-
-		//----- Zone 2: second layer with its own backdrop --------------------------------
-		// Zone 2's panel is in a NEW layer. Its bracket samples source_texture as it stands
-		// after the base layer fully finished (including the base layer's bracket V-composite
-		// output). So this panel sees Zone 1's frosted panels through its own blur.
-
-		zone2 := draw.new_layer(base_layer, {0, 280, WINDOW_W * 0.55, WINDOW_H - 280})
-
-		// Pass A content for zone2: a translucent darker overlay to make the panel pop.
-		draw.rectangle(zone2, {20, 300, WINDOW_W * 0.55 - 40, WINDOW_H - 320}, draw.Color{0, 0, 0, 80})
-
-		// Animated diagonal stripe in Zone 2 so the blur in this layer's panel has motion to
-		// smooth, not just the static base-layer content.
-		stripe_y := 320 + (math.sin(t * 0.05) * 0.5 + 0.5) * 200
-		draw.rectangle(zone2, {30, stripe_y, WINDOW_W * 0.55 - 60, 18}, draw.Color{255, 100, 200, 200})
-
-		// Zone 2's frosted panel. Single-panel scope; `backdrop_scope` keeps the begin/end
-		// pair tied to the block.
-		{
-			draw.backdrop_scope(zone2)
-			draw.backdrop_blur(
-				zone2,
-				{60, 360, WINDOW_W * 0.55 - 120, 160},
-				gaussian_sigma = 10,
-				tint = draw.WHITE, // pure blur (white tint with any alpha is a no-op)
-				radii = draw.Rectangle_Radii{24, 24, 24, 24},
-			)
-		}
-		draw.text(
-			zone2,
-			"Layer 2 backdrop",
-			{72, 372},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.Color{30, 30, 30, 255},
-		)
-		draw.text(
-			zone2,
-			"sigma = 10",
-			{72, 392},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.Color{60, 60, 60, 255},
-		)
-
-		//----- Zone 3: edge cases (back in base layer would also work, but we use zone2 to keep --------
-		// the demo's two-layer structure simple). Zone 3 lives in a third layer so it gets
-		// a fresh source snapshot too.
-		zone3 := draw.new_layer(zone2, {WINDOW_W * 0.55, 280, WINDOW_W * 0.45, WINDOW_H - 280})
-
-		// Animated background patch for Zone 3 so its mirror panel has something to reflect.
-		for i in 0 ..< 4 {
-			phase := f32(i) * 1.5 + t * 0.06
-			y := 310 + f32(i) * 60 + math.sin(phase) * 8
-			draw.rectangle(
-				zone3,
-				{WINDOW_W * 0.55 + 20, y, WINDOW_W * 0.45 - 40, 14},
-				draw.Color {
-					u8(clamp(200 + math.cos(phase) * 50, 0, 255)),
-					u8(clamp(150 + math.sin(phase) * 80, 0, 255)),
-					u8(clamp(220 - math.cos(phase * 1.7) * 60, 0, 255)),
-					255,
-				},
-			)
-		}
-
-		// All three Zone 3 backdrops share one scope. The sigma=0 mirror, then the two
-		// contiguous sigma=8 panels. The sigma=8 pair stays contiguous in the sub-batch list,
-		// so `append_or_extend_sub_batch` still coalesces them into a single instanced
-		// composite draw — scope boundaries don't affect coalescing, only kind/sigma identity.
-		{
-			draw.backdrop_scope(zone3)
-
-			// Edge case 1: sigma = 0 "mirror" — sharp framebuffer sample, no blur. Should reproduce
-			// the underlying pixels exactly through the SDF mask. Tinted slightly so it's visible.
-			draw.backdrop_blur(
-				zone3,
-				{WINDOW_W * 0.55 + 30, 310, 150, 70},
-				gaussian_sigma = 0,
-				tint = draw.WHITE, // pure mirror (no blur, no tint)
-				radii = draw.Rectangle_Radii{12, 12, 12, 12},
-			)
-
-			// Edge case 2: two same-sigma panels submitted contiguously. The sub-batch coalescer
-			// should merge these into a single instanced V-composite draw. Visually, both should
-			// look identical (modulo position) — same blur radius, same tint.
-			draw.backdrop_blur(
-				zone3,
-				{WINDOW_W * 0.55 + 30, 400, 150, 70},
-				gaussian_sigma = 8,
-				tint = draw.Color{160, 255, 160, 200}, // green tint, strong mix
-				radii = draw.Rectangle_Radii{12, 12, 12, 12},
-			)
-			draw.backdrop_blur(
-				zone3,
-				{WINDOW_W * 0.55 + 200, 400, 150, 70},
-				gaussian_sigma = 8,
-				tint = draw.Color{160, 255, 160, 200}, // identical: tests sub-batch coalescing
-				radii = draw.Rectangle_Radii{12, 12, 12, 12},
-			)
-		}
-
-		// Edge case 3: text drawn AFTER `end_backdrop` in the same layer. Composites on top of
-		// the bracket's V-composite output and should appear sharply over the green panels.
-		draw.text(
-			zone3,
-			"sigma=0 (mirror)",
-			{WINDOW_W * 0.55 + 38, 318},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.Color{20, 20, 20, 255},
-		)
-		draw.text(
-			zone3,
-			"sigma=8 (coalesced pair)",
-			{WINDOW_W * 0.55 + 38, 408},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.Color{20, 40, 20, 255},
-		)
-		draw.text(
-			zone3,
-			"Post-scope text overlay",
-			{WINDOW_W * 0.55 + 38, 480},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.WHITE,
-		)
-
-		draw.end(gpu, window, draw.Color{15, 15, 22, 255})
-	}
-}
-
-// Backdrop diagnostic example.
-//
-// Minimal isolation harness for debugging the blur. ONE panel, ONE sigma, NO animation. The
-// fixed background gives the eye a stable reference: the blur should smooth a *known* set of
-// hard edges, and any artifacts (crisp circles, ghost mirrors, no apparent change with sigma)
-// stand out clearly.
-//
-// Controls:
-//   UP / DOWN arrow  : adjust sigma by ±1
-//   LEFT / RIGHT arrow : adjust sigma by ±5
-//   SPACE            : reset to sigma=10
-//   T                : toggle the test rectangle on top of the panel
-//
-// Sigma is printed to the title bar so you can correlate visual behavior with the numeric
-// value as you adjust it.
-gaussian_blur_debug :: proc() {
-	if !sdl.Init({.VIDEO}) do os.exit(1)
-	window := sdl.CreateWindow("Backdrop debug", 800, 600, {.HIGH_PIXEL_DENSITY})
-	gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil)
-	if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1)
-	if !draw.init(gpu, window) do os.exit(1)
-	defer draw.destroy(gpu)
-	PLEX_SANS_REGULAR = draw.register_font(cyber.SANS_REGULAR_RAW)
-
-	WINDOW_W :: f32(800)
-	WINDOW_H :: f32(600)
-	FONT_SIZE :: u16(14)
-
-	sigma: f32 = 10
-	show_test_rect := true
-
-	for {
-		defer free_all(context.temp_allocator)
-		ev: sdl.Event
-		for sdl.PollEvent(&ev) {
-			if ev.type == .QUIT do return
-			if ev.type == .KEY_DOWN {
-				#partial switch ev.key.scancode {
-				case .UP: sigma += 1
-				case .DOWN: sigma = max(sigma - 1, 0)
-				case .RIGHT: sigma += 5
-				case .LEFT: sigma = max(sigma - 5, 0)
-				case .SPACE: sigma = 10
-				case .T: show_test_rect = !show_test_rect
-				}
-			}
-		}
-
-		// Update title with current sigma so we can correlate visuals to numbers.
-		title := fmt.ctprintf("Backdrop debug | sigma = %.1f", sigma)
-		sdl.SetWindowTitle(window, title)
-
-		base_layer := draw.begin({width = WINDOW_W, height = WINDOW_H})
-
-		// Background: deliberately high-contrast static content. The eye can verify whether
-		// hard edges (the black grid lines, the crisp circles, the fine vertical bars) get
-		// smoothed by the panel. NOTHING animates here — every difference between frames is
-		// caused by user input (sigma change), not by the demo itself.
-		draw.rectangle(base_layer, {0, 0, WINDOW_W, WINDOW_H}, draw.Color{255, 255, 255, 255})
-
-		// Black grid: 8x6 cells with thin lines. Each grid cell is 100x100 logical px.
-		for x: f32 = 0; x <= WINDOW_W; x += 100 {
-			draw.rectangle(base_layer, {x - 1, 0, 2, WINDOW_H}, draw.BLACK)
-		}
-		for y: f32 = 0; y <= WINDOW_H; y += 100 {
-			draw.rectangle(base_layer, {0, y - 1, WINDOW_W, 2}, draw.BLACK)
-		}
-
-		// A row of small bright circles across the middle. Their crisp edges are the most
-		// sensitive blur indicator.
-		for i in 0 ..< 8 {
-			cx := f32(i) * 100 + 50
-			color := draw.Color{u8((i * 32) & 0xff), u8((i * 64) & 0xff), u8(255 - (i * 32) & 0xff), 255}
-			draw.circle(base_layer, {cx, 350}, 25, color)
-		}
-
-		// Vertical fine-detail stripes on the left edge. At any meaningful sigma these should
-		// merge into a flat color through the panel.
-		for i in 0 ..< 20 {
-			x := 30 + f32(i) * 6
-			color := draw.RED if i % 2 == 0 else draw.BLUE
-			draw.rectangle(base_layer, {x, 200, 4, 200}, color)
-		}
-
-		// THE PANEL UNDER TEST. Square, centered, large enough to cover multiple grid cells and
-		// the circle row. Square shape makes any horizontal-vs-vertical asymmetry purely
-		// renderer-driven (geometry can't introduce it).
-		//
-		// Uses the explicit begin/end form (instead of `backdrop_scope`) to exercise the
-		// alternative API surface in the diagnostic harness.
-		panel := draw.Rectangle{250, 150, 300, 300}
-		draw.begin_backdrop(base_layer)
-		draw.backdrop_blur(
-			base_layer,
-			panel,
-			gaussian_sigma = sigma,
-			tint = draw.WHITE,
-			radii = draw.Rectangle_Radii{20, 20, 20, 20},
-		)
-		draw.end_backdrop(base_layer)
-
-		// Post-scope test: a bright rectangle drawn AFTER `end_backdrop` in the same layer.
-		// Should always render on top of the panel. If the panel ever shows a "ghost" of this
-		// rect inside its blur, the V-composite is sampling the wrong texture state.
-		if show_test_rect {
-			draw.rectangle(base_layer, {380, 280, 40, 40}, draw.Color{0, 200, 0, 255})
-		}
-
-		// Sigma label at the bottom in giant text so you can read it from across the room.
-		draw.text(
-			base_layer,
-			fmt.tprintf("sigma = %.1f", sigma),
-			{20, WINDOW_H - 40},
-			PLEX_SANS_REGULAR,
-			28,
-			color = draw.BLACK,
-		)
-		draw.text(
-			base_layer,
-			"UP/DOWN ±1   LEFT/RIGHT ±5   SPACE reset   T toggle test rect",
-			{20, WINDOW_H - 70},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.Color{60, 60, 60, 255},
-		)
-
-		draw.end(gpu, window, draw.Color{255, 255, 255, 255})
-	}
-}
@@ -1,363 +0,0 @@
-package examples
-
-import "core:os"
-import sdl "vendor:sdl3"
-
-import "../../draw"
-import "../../vendor/clay"
-import cyber "../cybersteel"
-
-// Clay border debug example.
-//
-// Lays out a grid of bordered Clay elements that exercise every code path in
-// `clay_emit_partial_border` and `try_dispatch_clay_rect_border_pair`:
-//
-//   1. Uniform borders (fast path) — sharp, rounded, and the border-thicker-than-radius
-//      edge case (inner corner clamps to 0).
-//   2. Background + border combinations — opaque bg + opaque uniform border MERGES into one
-//      SDF primitive; translucent border DECLINES the merge to preserve blend fidelity;
-//      non-uniform border declines and falls through to the slow path; translucent bg with
-//      opaque border still merges (bg alpha doesn't affect merge correctness).
-//   3. Single-side borders — top / right / bottom / left individually.
-//   4. Two-side borders — parallel pairs (no corners drawn) and adjacent pairs (one corner
-//      rounds, others stay square).
-//   5. Three-side borders + asymmetric widths.
-//   6. Layout correctness — a vertical list with bottom-border separators (each border
-//      lives inside its own item, no bleed between siblings) and a row of adjacent fully
-//      bordered siblings (no border overlap, each in its own bounds).
-clay_borders :: proc() {
-	if !sdl.Init({.VIDEO}) do os.exit(1)
-	window := sdl.CreateWindow("Clay Borders Debug", 1200, 900, {.HIGH_PIXEL_DENSITY})
-	gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil)
-	if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1)
-	if !draw.init(gpu, window) do os.exit(1)
-	PLEX_SANS_REGULAR = draw.register_font(cyber.SANS_REGULAR_RAW)
-
-	// Distinct colors so the fill, border, and translucent variants are visually unambiguous.
-	BG_PAGE :: draw.Color{25, 25, 30, 255}
-	FILL_OPAQUE :: draw.Color{80, 120, 200, 255}
-	FILL_TRANSLUCENT :: draw.Color{80, 120, 200, 128}
-	BORDER_OPAQUE :: draw.Color{255, 200, 100, 255}
-	BORDER_TRANSLUCENT :: draw.Color{255, 200, 100, 128}
-
-	label_config := clay.TextElementConfig {
-		fontId    = PLEX_SANS_REGULAR,
-		fontSize  = 12,
-		textColor = {220, 220, 220, 255},
-	}
-	header_config := clay.TextElementConfig {
-		fontId    = PLEX_SANS_REGULAR,
-		fontSize  = 16,
-		textColor = {255, 255, 255, 255},
-	}
-	title_config := clay.TextElementConfig {
-		fontId    = PLEX_SANS_REGULAR,
-		fontSize  = 22,
-		textColor = {255, 255, 255, 255},
-	}
-
-	for {
-		defer free_all(context.temp_allocator)
-		ev: sdl.Event
-		for sdl.PollEvent(&ev) {
-			if ev.type == .QUIT do return
-		}
-
-		base_layer := draw.begin({width = 1200, height = 900})
-		clay.SetLayoutDimensions({width = base_layer.bounds.width, height = base_layer.bounds.height})
-		clay.BeginLayout()
-
-		if clay.UI(clay.ID("borders_page"))(
-		{
-			layout = {
-				sizing = {clay.SizingGrow({}), clay.SizingGrow({})},
-				padding = clay.PaddingAll(20),
-				childGap = 14,
-				layoutDirection = .TopToBottom,
-			},
-			backgroundColor = clay_color(BG_PAGE),
-		},
-		) {
-			clay.Text("Clay Borders Debug", title_config)
-
-			//----- Section 1: Uniform borders (fast path) -----------------------------------
-			clay.Text("Uniform borders (fast path)", header_config)
-			if clay.UI(clay.ID("row_uniform"))(border_row_layout()) {
-				border_test_card(
-					"1px sharp",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{left = 1, right = 1, top = 1, bottom = 1},
-					{},
-				)
-				border_test_card(
-					"2px, radius 8",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{left = 2, right = 2, top = 2, bottom = 2},
-					{topLeft = 8, topRight = 8, bottomRight = 8, bottomLeft = 8},
-				)
-				border_test_card(
-					"8px, radius 20",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{left = 8, right = 8, top = 8, bottom = 8},
-					{topLeft = 20, topRight = 20, bottomRight = 20, bottomLeft = 20},
-				)
-				border_test_card(
-					"10px > radius 5 (inner clamps)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{left = 10, right = 10, top = 10, bottom = 10},
-					{topLeft = 5, topRight = 5, bottomRight = 5, bottomLeft = 5},
-				)
-			}
-
-			//----- Section 2: Background + border (merge optimization) ----------------------
-			clay.Text("Background + border (merge optimization)", header_config)
-			if clay.UI(clay.ID("row_bg_border"))(border_row_layout()) {
-				border_test_card(
-					"opaque bg + opaque (MERGES: 1 prim)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{left = 2, right = 2, top = 2, bottom = 2},
-					{topLeft = 6, topRight = 6, bottomRight = 6, bottomLeft = 6},
-				)
-				border_test_card(
-					"translucent bg + opaque (MERGES)",
-					label_config,
-					FILL_TRANSLUCENT,
-					BORDER_OPAQUE,
-					{left = 3, right = 3, top = 3, bottom = 3},
-					{topLeft = 6, topRight = 6, bottomRight = 6, bottomLeft = 6},
-				)
-				border_test_card(
-					"opaque bg + translucent (NO merge)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_TRANSLUCENT,
-					{left = 4, right = 4, top = 4, bottom = 4},
-					{topLeft = 8, topRight = 8, bottomRight = 8, bottomLeft = 8},
-				)
-				border_test_card(
-					"opaque bg + non-uniform (NO merge)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{left = 1, right = 4, top = 2, bottom = 3},
-					{topLeft = 6, topRight = 6, bottomRight = 6, bottomLeft = 6},
-				)
-			}
-
-			//----- Section 3: Single side borders -------------------------------------------
-			clay.Text("Single side", header_config)
-			if clay.UI(clay.ID("row_single_side"))(border_row_layout()) {
-				border_test_card("top only (4px)", label_config, FILL_OPAQUE, BORDER_OPAQUE, {top = 4}, {})
-				border_test_card("right only (4px)", label_config, FILL_OPAQUE, BORDER_OPAQUE, {right = 4}, {})
-				border_test_card(
-					"bottom only (4px, divider)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{bottom = 4},
-					{},
-				)
-				border_test_card("left only (4px)", label_config, FILL_OPAQUE, BORDER_OPAQUE, {left = 4}, {})
-			}
-
-			//----- Section 4: Two side borders ----------------------------------------------
-			clay.Text("Two sides", header_config)
-			if clay.UI(clay.ID("row_two_sides"))(border_row_layout()) {
-				border_test_card(
-					"T+B parallel (no corners)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{top = 3, bottom = 3},
-					{topLeft = 8, topRight = 8, bottomRight = 8, bottomLeft = 8},
-				)
-				border_test_card(
-					"L+R parallel (no corners)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{left = 3, right = 3},
-					{topLeft = 8, topRight = 8, bottomRight = 8, bottomLeft = 8},
-				)
-				border_test_card(
-					"T+L adjacent (TL rounds)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{top = 3, left = 3},
-					{topLeft = 12, topRight = 12, bottomRight = 12, bottomLeft = 12},
-				)
-				border_test_card(
-					"B+R adjacent (BR rounds)",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{bottom = 3, right = 3},
-					{topLeft = 12, topRight = 12, bottomRight = 12, bottomLeft = 12},
-				)
-			}
-
-			//----- Section 5: Three sides + asymmetric widths -------------------------------
-			clay.Text("Three sides + asymmetric widths", header_config)
-			if clay.UI(clay.ID("row_advanced"))(border_row_layout()) {
-				border_test_card(
-					"T+R+B (no L), rounded",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{top = 3, right = 3, bottom = 3},
-					{topLeft = 8, topRight = 8, bottomRight = 8, bottomLeft = 8},
-				)
-				border_test_card(
-					"T+L+R (no B), rounded",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{top = 3, left = 3, right = 3},
-					{topLeft = 8, topRight = 8, bottomRight = 8, bottomLeft = 8},
-				)
-				border_test_card(
-					"asym 1/2/3/4 T/R/B/L",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{top = 1, right = 2, bottom = 3, left = 4},
-					{},
-				)
-				border_test_card(
-					"asym + rounded",
-					label_config,
-					FILL_OPAQUE,
-					BORDER_OPAQUE,
-					{top = 2, right = 4, bottom = 2, left = 4},
-					{topLeft = 10, topRight = 10, bottomRight = 10, bottomLeft = 10},
-				)
-			}
-
-			//----- Section 6: Layout correctness --------------------------------------------
-			clay.Text("Layout correctness", header_config)
-			if clay.UI(clay.ID("row_correctness"))(
-			{layout = {sizing = {clay.SizingGrow({}), clay.SizingFit({})}, childGap = 14}},
-			) {
-				// 6a: vertical list with per-item bottom-border separator. Each item's
-				// border draws INSIDE its own bounds, so adjacent items don't bleed.
-				if clay.UI(clay.ID("list_demo"))(
-				{
-					layout = {
-						sizing = {clay.SizingFixed(300), clay.SizingFit({})},
-						padding = clay.PaddingAll(6),
-						childGap = 6,
-						layoutDirection = .TopToBottom,
-					},
-				},
-				) {
-					clay.Text("List with bottom-border separators", label_config)
-					if clay.UI(clay.ID("list_outer"))(
-					{
-						layout = {sizing = {clay.SizingGrow({}), clay.SizingFit({})}, layoutDirection = .TopToBottom},
-						backgroundColor = clay_color(FILL_OPAQUE),
-					},
-					) {
-						for index in 0 ..< 5 {
-							if clay.UI(clay.ID("list_item", u32(index)))(
-							{
-								layout = {sizing = {clay.SizingGrow({}), clay.SizingFixed(28)}, padding = clay.PaddingAll(6)},
-								border = {color = clay_color(BORDER_OPAQUE), width = {bottom = 1}},
-							},
-							) {
-								clay.Text("Item", label_config)
-							}
-						}
-					}
-				}
-
-				// 6b: row of adjacent fully bordered siblings. With borders rendered
-				// INSIDE each element's bounds, the boundary between two siblings shows
-				// the natural 2*width sum (no overlap, no bleed).
-				if clay.UI(clay.ID("adj_demo"))(
-				{
-					layout = {
-						sizing = {clay.SizingFixed(380), clay.SizingFit({})},
-						padding = clay.PaddingAll(6),
-						childGap = 6,
-						layoutDirection = .TopToBottom,
-					},
-				},
-				) {
-					clay.Text("Adjacent bordered siblings (no gap)", label_config)
-					if clay.UI(clay.ID("adj_row"))({layout = {sizing = {clay.SizingGrow({}), clay.SizingFit({})}}}) {
-						for index in 0 ..< 4 {
-							if clay.UI(clay.ID("adj_item", u32(index)))(
-							{
-								layout = {sizing = {clay.SizingFixed(80), clay.SizingFixed(60)}},
-								backgroundColor = clay_color(FILL_OPAQUE),
-								border = {color = clay_color(BORDER_OPAQUE), width = {left = 2, right = 2, top = 2, bottom = 2}},
-							},
-							) {}
-						}
-					}
-				}
-			}
-		}
-
-		clay_batch := draw.ClayBatch {
-			bounds = base_layer.bounds,
-			cmds   = clay.EndLayout(0),
-		}
-		draw.prepare_clay_batch(base_layer, &clay_batch)
-		draw.end(gpu, window)
-	}
-}
-
-// Helper: convert a draw.Color (RGBA u8) to clay.Color (RGBA float in 0-255 range).
-clay_color :: proc(c: draw.Color) -> clay.Color {
-	return clay.Color{f32(c[0]), f32(c[1]), f32(c[2]), f32(c[3])}
-}
-
-// Helper: shared row container declaration for the test sections.
-border_row_layout :: proc() -> clay.ElementDeclaration {
-	return clay.ElementDeclaration{layout = {sizing = {clay.SizingGrow({}), clay.SizingFit({})}, childGap = 12}}
-}
-
-// One labeled test card: a fixed-width column with a caption above and a sample bordered
-// rectangle below. Uses `clay.ID_LOCAL` for the inner element so each card gets a unique
-// child ID without the caller passing one explicitly.
-border_test_card :: proc(
-	label: string,
-	label_config: clay.TextElementConfig,
-	fill_color: draw.Color,
-	border_color: draw.Color,
-	border_width: clay.BorderWidth,
-	corner_radii: clay.CornerRadius,
-) {
-	if clay.UI(clay.ID(label))(
-	{
-		layout = {
-			sizing = {clay.SizingFixed(275), clay.SizingFit({})},
-			padding = clay.PaddingAll(4),
-			childGap = 6,
-			layoutDirection = .TopToBottom,
-		},
-	},
-	) {
-		clay.Text(label, label_config)
-		if clay.UI(clay.ID_LOCAL("test_inner"))(
-		{
-			layout = {sizing = {clay.SizingGrow({}), clay.SizingFixed(64)}},
-			backgroundColor = clay_color(fill_color),
-			border = clay.BorderElementConfig{color = clay_color(border_color), width = border_width},
-			cornerRadius = corner_radii,
-		},
-		) {}
-	}
-}
@@ -1,96 +0,0 @@
-package examples
-
-import "core:fmt"
-import "core:log"
-import "core:mem"
-import "core:os"
-
-EX_HELLOPE_SHAPES :: "hellope-shapes"
-EX_HELLOPE_TEXT :: "hellope-text"
-EX_HELLOPE_CLAY :: "hellope-clay"
-EX_HELLOPE_CUSTOM :: "hellope-custom"
-EX_CLAY_BORDERS :: "clay-borders"
-EX_TEXTURES :: "textures"
-EX_GAUSSIAN_BLUR :: "gaussian-blur"
-EX_GAUSSIAN_BLUR_DEBUG :: "gaussian-blur-debug"
-
-AVAILABLE_EXAMPLES_MSG ::
-	"Available examples: " +
-	EX_HELLOPE_SHAPES +
-	", " +
-	EX_HELLOPE_TEXT +
-	", " +
-	EX_HELLOPE_CLAY +
-	", " +
-	EX_HELLOPE_CUSTOM +
-	", " +
-	EX_CLAY_BORDERS +
-	", " +
-	EX_TEXTURES +
-	", " +
-	EX_GAUSSIAN_BLUR +
-	", " +
-	EX_GAUSSIAN_BLUR_DEBUG
-
-main :: proc() {
-	//----- General setup ----------------------------------
-	// Temp
-	track_temp: mem.Tracking_Allocator
-	mem.tracking_allocator_init(&track_temp, context.temp_allocator)
-	context.temp_allocator = mem.tracking_allocator(&track_temp)
-
-	// Default
-	track: mem.Tracking_Allocator
-	mem.tracking_allocator_init(&track, context.allocator)
-	context.allocator = mem.tracking_allocator(&track)
-	// Log a warning about any memory that was not freed by the end of the program.
-	// This could be fine for some global state or it could be a memory leak.
-	defer {
-		// Temp allocator
-		if len(track_temp.bad_free_array) > 0 {
-			fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
-			for entry in track_temp.bad_free_array {
-				fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
-			}
-			mem.tracking_allocator_destroy(&track_temp)
-		}
-		// Default allocator
-		if len(track.allocation_map) > 0 {
-			fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map))
-			for _, entry in track.allocation_map {
-				fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
-			}
-		}
-		if len(track.bad_free_array) > 0 {
-			fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array))
-			for entry in track.bad_free_array {
-				fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
-			}
-		}
-		mem.tracking_allocator_destroy(&track)
-	}
-	context.logger = log.create_console_logger()
-	defer log.destroy_console_logger(context.logger)
-
-	args := os.args
-	if len(args) < 2 {
-		fmt.eprintln("Usage: examples <example_name>")
-		fmt.eprintln(AVAILABLE_EXAMPLES_MSG)
-		os.exit(1)
-	}
-
-	switch args[1] {
-	case EX_HELLOPE_CLAY: hellope_clay()
-	case EX_HELLOPE_CUSTOM: hellope_custom()
-	case EX_HELLOPE_SHAPES: hellope_shapes()
-	case EX_HELLOPE_TEXT: hellope_text()
-	case EX_CLAY_BORDERS: clay_borders()
-	case EX_TEXTURES: textures()
-	case EX_GAUSSIAN_BLUR: gaussian_blur()
-	case EX_GAUSSIAN_BLUR_DEBUG: gaussian_blur_debug()
-	case:
-		fmt.eprintf("Unknown example: %v\n", args[1])
-		fmt.eprintln(AVAILABLE_EXAMPLES_MSG)
-		os.exit(1)
-	}
-}
@@ -1,15 +1,13 @@
 package examples

+import "../../draw"
+import "../../vendor/clay"
 import "core:math"
 import "core:os"
 import sdl "vendor:sdl3"

-import "../../draw"
-import "../../draw/tess"
-import "../../vendor/clay"
-import cyber "../cybersteel"
-
-PLEX_SANS_REGULAR: draw.Font_Id = max(draw.Font_Id) // Max so we crash if registration is forgotten
+JETBRAINS_MONO_REGULAR_RAW :: #load("fonts/JetBrainsMono-Regular.ttf")
+JETBRAINS_MONO_REGULAR: draw.Font_Id = max(draw.Font_Id) // Max so we crash if registration is forgotten

 hellope_shapes :: proc() {
 	if !sdl.Init({.VIDEO}) do os.exit(1)
@@ -30,25 +28,19 @@ hellope_shapes :: proc() {
 		base_layer := draw.begin({width = 500, height = 500})

 		// Background
-		draw.rectangle(base_layer, {0, 0, 500, 500}, draw.Color{40, 40, 40, 255})
+		draw.rectangle(base_layer, {0, 0, 500, 500}, {40, 40, 40, 255})

 		// ----- Shapes without rotation (existing demo) -----
-		draw.rectangle(
-			base_layer,
-			{20, 20, 200, 120},
-			draw.Color{80, 120, 200, 255},
-			outline_color = draw.WHITE,
-			outline_width = 2,
-			radii = {top_right = 15, top_left = 5},
-		)
-
-		red_rect_raddi := draw.uniform_radii({240, 20, 240, 120}, 0.3)
-		red_rect_raddi.bottom_left = 0
-		draw.rectangle(base_layer, {240, 20, 240, 120}, draw.Color{200, 80, 80, 255}, radii = red_rect_raddi)
-		draw.rectangle(
+		draw.rectangle(base_layer, {20, 20, 200, 120}, {80, 120, 200, 255})
+		draw.rectangle_lines(base_layer, {20, 20, 200, 120}, draw.WHITE, thickness = 2)
+		draw.rectangle(base_layer, {240, 20, 240, 120}, {200, 80, 80, 255}, roundness = 0.3)
+		draw.rectangle_gradient(
 			base_layer,
 			{20, 160, 460, 60},
-			draw.Linear_Gradient{start_color = {255, 0, 0, 255}, end_color = {0, 0, 255, 255}, angle = 0},
+			{255, 0, 0, 255},
+			{0, 255, 0, 255},
+			{0, 0, 255, 255},
+			{255, 255, 0, 255},
 		)

 		// ----- Rotation demos -----
@@ -58,12 +50,17 @@ hellope_shapes :: proc() {
 		draw.rectangle(
 			base_layer,
 			rect,
-			draw.Color{100, 200, 100, 255},
-			outline_color = draw.WHITE,
-			outline_width = 2,
+			{100, 200, 100, 255},
+			origin = draw.center_of(rect),
+			rotation = spin_angle,
+		)
+		draw.rectangle_lines(
+			base_layer,
+			rect,
+			draw.WHITE,
+			thickness = 2,
 			origin = draw.center_of(rect),
 			rotation = spin_angle,
-			feather_ppx = 1,
 		)

 		// Rounded rectangle rotating around its center
@@ -71,46 +68,30 @@ hellope_shapes :: proc() {
 		draw.rectangle(
 			base_layer,
 			rrect,
-			draw.Color{200, 100, 200, 255},
-			radii = draw.uniform_radii(rrect, 0.4),
+			{200, 100, 200, 255},
+			roundness = 0.4,
 			origin = draw.center_of(rrect),
 			rotation = spin_angle,
 		)

 		// Ellipse rotating around its center (tilted ellipse)
-		draw.ellipse(base_layer, {410, 340}, 50, 30, draw.Color{255, 200, 50, 255}, rotation = spin_angle)
+		draw.ellipse(base_layer, {410, 340}, 50, 30, {255, 200, 50, 255}, rotation = spin_angle)

 		// Circle orbiting a point (moon orbiting planet)
 		// Convention B: center = pivot point (planet), origin = offset from moon center to pivot.
 		// Moon's visual center at rotation=0: planet_pos - origin = (100, 450) - (0, 40) = (100, 410).
-		planet_pos := draw.Vec2{100, 450}
-		draw.circle(base_layer, planet_pos, 8, draw.Color{200, 200, 200, 255}) // planet (stationary)
-		draw.circle(
-			base_layer,
-			planet_pos,
-			5,
-			draw.Color{100, 150, 255, 255},
-			origin = draw.Vec2{0, 40},
-			rotation = spin_angle,
-		) // moon orbiting
+		planet_pos := [2]f32{100, 450}
+		draw.circle(base_layer, planet_pos, 8, {200, 200, 200, 255}) // planet (stationary)
+		draw.circle(base_layer, planet_pos, 5, {100, 150, 255, 255}, origin = {0, 40}, rotation = spin_angle) // moon orbiting

-		// Sector (pie slice) rotating in place
-		draw.ring(
-			base_layer,
-			draw.Vec2{250, 450},
-			0,
-			30,
-			draw.Color{100, 100, 220, 255},
-			start_angle = 0,
-			end_angle = 270,
-			rotation = spin_angle,
-		)
+		// Ring arc rotating in place
+		draw.ring(base_layer, {250, 450}, 15, 30, 0, 270, {100, 100, 220, 255}, rotation = spin_angle)

 		// Triangle rotating around its center
-		tv1 := draw.Vec2{350, 420}
-		tv2 := draw.Vec2{420, 480}
-		tv3 := draw.Vec2{340, 480}
-		tess.triangle_aa(
+		tv1 := [2]f32{350, 420}
+		tv2 := [2]f32{420, 480}
+		tv3 := [2]f32{340, 480}
+		draw.triangle(
 			base_layer,
 			tv1,
 			tv2,
@@ -121,16 +102,8 @@ hellope_shapes :: proc() {
 		)

 		// Polygon rotating around its center (already had rotation; now with origin for orbit)
-		draw.polygon(
-			base_layer,
-			{460, 450},
-			6,
-			30,
-			draw.Color{180, 100, 220, 255},
-			outline_color = draw.WHITE,
-			outline_width = 2,
-			rotation = spin_angle,
-		)
+		draw.polygon(base_layer, {460, 450}, 6, 30, {180, 100, 220, 255}, rotation = spin_angle)
+		draw.polygon_lines(base_layer, {460, 450}, 6, 30, draw.WHITE, rotation = spin_angle, thickness = 2)

 		draw.end(gpu, window)
 	}
@@ -147,7 +120,7 @@ hellope_text :: proc() {
 	gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil)
 	if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1)
 	if !draw.init(gpu, window) do os.exit(1)
-	PLEX_SANS_REGULAR = draw.register_font(cyber.SANS_REGULAR_RAW)
+	JETBRAINS_MONO_REGULAR = draw.register_font(JETBRAINS_MONO_REGULAR_RAW)

 	FONT_SIZE :: u16(24)
 	spin_angle: f32 = 0
@@ -161,6 +134,9 @@ hellope_text :: proc() {
 		spin_angle += 0.5
 		base_layer := draw.begin({width = 600, height = 600})

+		// Grey background
+		draw.rectangle(base_layer, {0, 0, 600, 600}, {127, 127, 127, 255})
+
 		// ----- Text API demos -----

 		// Cached text with id — TTF_Text reused across frames (good for text-heavy apps)
@@ -168,10 +144,10 @@ hellope_text :: proc() {
 			base_layer,
 			"Hellope!",
 			{300, 80},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
-			origin = draw.center_of("Hellope!", PLEX_SANS_REGULAR, FONT_SIZE),
+			origin = draw.center_of("Hellope!", JETBRAINS_MONO_REGULAR, FONT_SIZE),
 			id = HELLOPE_ID,
 		)

@@ -180,28 +156,35 @@ hellope_text :: proc() {
 			base_layer,
 			"Hellope World!",
 			{300, 250},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = {255, 200, 50, 255},
-			origin = draw.center_of("Hellope World!", PLEX_SANS_REGULAR, FONT_SIZE),
+			origin = draw.center_of("Hellope World!", JETBRAINS_MONO_REGULAR, FONT_SIZE),
 			rotation = spin_angle,
 			id = ROTATING_SENTENCE_ID,
 		)

 		// Uncached text (no id) — created and destroyed each frame, simplest usage
-		draw.text(base_layer, "Top-left anchored", {20, 450}, PLEX_SANS_REGULAR, FONT_SIZE, color = draw.WHITE)
+		draw.text(
+			base_layer,
+			"Top-left anchored",
+			{20, 450},
+			JETBRAINS_MONO_REGULAR,
+			FONT_SIZE,
+			color = draw.WHITE,
+		)

 		// Measure text for manual layout
-		size := draw.measure_text("Measured!", PLEX_SANS_REGULAR, FONT_SIZE)
-		draw.rectangle(base_layer, {300 - size.x / 2, 380, size.x, size.y}, draw.Color{60, 60, 60, 200})
+		size := draw.measure_text("Measured!", JETBRAINS_MONO_REGULAR, FONT_SIZE)
+		draw.rectangle(base_layer, {300 - size.x / 2, 380, size.x, size.y}, {60, 60, 60, 200})
 		draw.text(
 			base_layer,
 			"Measured!",
 			{300, 380},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
-			origin = draw.top_of("Measured!", PLEX_SANS_REGULAR, FONT_SIZE),
+			origin = draw.top_of("Measured!", JETBRAINS_MONO_REGULAR, FONT_SIZE),
 			id = MEASURED_ID,
 		)

@@ -210,14 +193,14 @@ hellope_text :: proc() {
 			base_layer,
 			"Corner spin",
 			{150, 530},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = {100, 200, 255, 255},
 			rotation = spin_angle,
 			id = CORNER_SPIN_ID,
 		)

-		draw.end(gpu, window, draw.Color{127, 127, 127, 255})
+		draw.end(gpu, window)
 	}
 }

@@ -227,10 +210,10 @@ hellope_clay :: proc() {
 	gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil)
 	if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1)
 	if !draw.init(gpu, window) do os.exit(1)
-	PLEX_SANS_REGULAR = draw.register_font(cyber.SANS_REGULAR_RAW)
+	JETBRAINS_MONO_REGULAR = draw.register_font(JETBRAINS_MONO_REGULAR_RAW)

 	text_config := clay.TextElementConfig {
-		fontId    = PLEX_SANS_REGULAR,
+		fontId    = JETBRAINS_MONO_REGULAR,
 		fontSize  = 36,
 		textColor = {255, 255, 255, 255},
 	}
@@ -244,8 +227,9 @@ hellope_clay :: proc() {
 		base_layer := draw.begin({width = 500, height = 500})
 		clay.SetLayoutDimensions({width = base_layer.bounds.width, height = base_layer.bounds.height})
 		clay.BeginLayout()
-		if clay.UI(clay.ID("outer"))(
+		if clay.UI()(
 		{
+			id = clay.ID("outer"),
 			layout = {
 				sizing = {clay.SizingGrow({}), clay.SizingGrow({})},
 				childAlignment = {x = .Center, y = .Center},
@@ -253,13 +237,13 @@ hellope_clay :: proc() {
 			backgroundColor = {127, 127, 127, 255},
 		},
 		) {
-			clay.Text("Hellope!", text_config)
+			clay.Text("Hellope!", &text_config)
 		}
 		clay_batch := draw.ClayBatch {
 			bounds = base_layer.bounds,
-			cmds   = clay.EndLayout(0),
+			cmds   = clay.EndLayout(),
 		}
-		draw.prepare_clay_batch(base_layer, &clay_batch)
+		draw.prepare_clay_batch(base_layer, &clay_batch, {0, 0})
 		draw.end(gpu, window)
 	}
 }
@@ -270,40 +254,22 @@ hellope_custom :: proc() {
 	gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil)
 	if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1)
 	if !draw.init(gpu, window) do os.exit(1)
-	PLEX_SANS_REGULAR = draw.register_font(cyber.SANS_REGULAR_RAW)
+	JETBRAINS_MONO_REGULAR = draw.register_font(JETBRAINS_MONO_REGULAR_RAW)

 	text_config := clay.TextElementConfig {
-		fontId    = PLEX_SANS_REGULAR,
+		fontId    = JETBRAINS_MONO_REGULAR,
 		fontSize  = 24,
 		textColor = {255, 255, 255, 255},
 	}

 	gauge := Gauge {
-		value    = 0.73,
-		color    = {50, 200, 100, 255},
-		bg_color = {80, 80, 80, 255},
+		value = 0.73,
+		color = {50, 200, 100, 255},
 	}
 	gauge2 := Gauge {
-		value    = 0.45,
-		color    = {200, 100, 50, 255},
-		bg_color = {80, 80, 80, 255},
+		value = 0.45,
+		color = {200, 100, 50, 255},
 	}
-
-	// `clay.CustomElementConfig.customData` is a rawptr; the Clay integration in `draw`
-	// requires it to point at a `Clay_Custom` value. The explicit `rawptr(...)` cast is
-	// necessary because Odin does not chain `^Gauge -> rawptr -> Clay_Custom` implicitly
-	// (variant-to-union and ^T-to-rawptr are each implicit on their own, but not stacked).
-	gauge_custom: draw.Clay_Custom = rawptr(&gauge)
-	gauge2_custom: draw.Clay_Custom = rawptr(&gauge2)
-
-	// Backdrop variant: variant-to-union conversion is implicit, so no cast needed.
-	// `tint = draw.WHITE` is the no-op tint per the backdrop module's convention
-	// (matches `examples/backdrop.odin`'s "pure blur, no color" usage).
-	backdrop_custom: draw.Clay_Custom = draw.Backdrop_Marker {
-		sigma = 8,
-		tint  = draw.WHITE,
-	}
-
 	spin_angle: f32 = 0

 	for {
@@ -321,8 +287,9 @@ hellope_custom :: proc() {
 		clay.SetLayoutDimensions({width = base_layer.bounds.width, height = base_layer.bounds.height})
 		clay.BeginLayout()

-		if clay.UI(clay.ID("outer"))(
+		if clay.UI()(
 		{
+			id = clay.ID("outer"),
 			layout = {
 				sizing = {clay.SizingGrow({}), clay.SizingGrow({})},
 				childAlignment = {x = .Center, y = .Center},
@@ -332,75 +299,54 @@ hellope_custom :: proc() {
 			backgroundColor = {50, 50, 50, 255},
 		},
 		) {
-			if clay.UI(clay.ID("title"))({layout = {sizing = {clay.SizingFit({}), clay.SizingFit({})}}}) {
-				clay.Text("Custom Draw Demo", text_config)
+			if clay.UI()({id = clay.ID("title"), layout = {sizing = {clay.SizingFit({}), clay.SizingFit({})}}}) {
+				clay.Text("Custom Draw Demo", &text_config)
 			}

-			// gauge1 is BEHIND the backdrop — the backdrop is declared as a floating CHILD
-			// of gauge1, pinned to gauge1's LeftTop and sized 300x30 so it covers exactly
-			// gauge1's footprint. Clay emits a floating child's render command after the
-			// parent's, so the stream order is gauge1 → backdrop → gauge2: gauge1's pixels
-			// land in `source_texture` before the bracket samples (visible as a blurred
-			// reflection inside the strip), and gauge2 is deferred-replayed by
-			// `prepare_clay_batch` after the bracket closes (renders crisp on top of the
-			// bracket output — unrelated to the strip since they don't overlap).
-			// `backgroundColor` is omitted on the gauges; bg lives on `Gauge.bg_color`. See `draw_custom`.
-			if clay.UI(clay.ID("gauge"))(
+			if clay.UI()(
 			{
+				id = clay.ID("gauge"),
 				layout = {sizing = {clay.SizingFixed(300), clay.SizingFixed(30)}},
-				custom = {customData = &gauge_custom},
+				custom = {customData = &gauge},
+				backgroundColor = {80, 80, 80, 255},
 			},
-			) {
-				if clay.UI(clay.ID("backdrop"))(
-				{
-					floating = {attachTo = .Parent, attachment = {parent = .LeftTop, element = .LeftTop}},
-					layout = {sizing = {clay.SizingFixed(300), clay.SizingFixed(30)}},
-					custom = {customData = &backdrop_custom},
-				},
-				) {}
-			}
+			) {}

-			if clay.UI(clay.ID("gauge2"))(
+			if clay.UI()(
 			{
+				id = clay.ID("gauge2"),
 				layout = {sizing = {clay.SizingFixed(300), clay.SizingFixed(30)}},
-				custom = {customData = &gauge2_custom},
+				custom = {customData = &gauge2},
+				backgroundColor = {80, 80, 80, 255},
 			},
 			) {}
 		}

 		clay_batch := draw.ClayBatch {
 			bounds = base_layer.bounds,
-			cmds   = clay.EndLayout(0),
+			cmds   = clay.EndLayout(),
 		}
-		draw.prepare_clay_batch(base_layer, &clay_batch, custom_draw = draw_custom)
+		draw.prepare_clay_batch(base_layer, &clay_batch, {0, 0}, custom_draw = draw_custom)
 		draw.end(gpu, window)
 	}

 	Gauge :: struct {
-		value:    f32,
-		color:    draw.Color,
-		bg_color: draw.Color,
+		value: f32,
+		color: draw.Color,
 	}

 	draw_custom :: proc(layer: ^draw.Layer, bounds: draw.Rectangle, render_data: clay.CustomRenderData) {
-		// `render_data.customData` has been unwrapped from the `Clay_Custom` envelope by
-		// `prepare_clay_batch` — it points at the Gauge directly, the same as it would have
-		// before the union refactor.
 		gauge := cast(^Gauge)render_data.customData

-		// `gauge.bg_color` instead of `render_data.backgroundColor`: under Clay master, an
-		// element with both `custom.customData` and `backgroundColor` emits a Custom AND a
-		// Rectangle for the same bounds, in that order — the Rectangle paints over the
-		// callback's output. Carrying bg on user data sidesteps it.
-		border_width: f32 = 2
-		draw.rectangle(layer, bounds, gauge.bg_color, outline_color = draw.WHITE, outline_width = border_width)
+		// Background from clay's backgroundColor
+		draw.rectangle(layer, bounds, draw.color_from_clay(render_data.backgroundColor), roundness = 0.25)

-		fill := draw.Rectangle {
-			x      = bounds.x,
-			y      = bounds.y,
-			width  = bounds.width * gauge.value,
-			height = bounds.height,
-		}
-		draw.rectangle(layer, fill, gauge.color)
+		// Fill bar
+		fill := bounds
+		fill.width *= gauge.value
+		draw.rectangle(layer, fill, gauge.color, roundness = 0.25)
+
+		// Border
+		draw.rectangle_lines(layer, bounds, draw.WHITE, thickness = 2, roundness = 0.25)
 	}
 }
@@ -0,0 +1,75 @@
+package examples
+
+import "core:fmt"
+import "core:mem"
+import "core:os"
+
+main :: proc() {
+	//----- Tracking allocator ----------------------------------
+	// Set up at procedure scope: in Odin, `context` changes and `defer` are both scope-local,
+	// so a nested block here would end tracking before any example runs.
+	tracking_temp_allocator := false
+	// Temp
+	track_temp: mem.Tracking_Allocator
+	if tracking_temp_allocator {
+		mem.tracking_allocator_init(&track_temp, context.temp_allocator)
+		context.temp_allocator = mem.tracking_allocator(&track_temp)
+	}
+	// Default
+	track: mem.Tracking_Allocator
+	mem.tracking_allocator_init(&track, context.allocator)
+	context.allocator = mem.tracking_allocator(&track)
+	// Log a warning about any memory that was not freed by the end of the program.
+	// This could be fine for some global state or it could be a memory leak.
+	defer {
+		// Temp allocator
+		if tracking_temp_allocator {
+			if len(track_temp.allocation_map) > 0 {
+				fmt.eprintf("=== %v allocations not freed - temp allocator: ===\n", len(track_temp.allocation_map))
+				for _, entry in track_temp.allocation_map {
+					fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
+				}
+			}
+			if len(track_temp.bad_free_array) > 0 {
+				fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
+				for entry in track_temp.bad_free_array {
+					fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
+				}
+			}
+			mem.tracking_allocator_destroy(&track_temp)
+		}
+		// Default allocator
+		if len(track.allocation_map) > 0 {
+			fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map))
+			for _, entry in track.allocation_map {
+				fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
+			}
+		}
+		if len(track.bad_free_array) > 0 {
+			fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array))
+			for entry in track.bad_free_array {
+				fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
+			}
+		}
+		mem.tracking_allocator_destroy(&track)
+	}
+
+	args := os.args
+	if len(args) < 2 {
+		fmt.eprintln("Usage: examples <example_name>")
+		fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom, textures")
+		os.exit(1)
+	}
+
+	switch args[1] {
+	case "hellope-clay": hellope_clay()
+	case "hellope-custom": hellope_custom()
+	case "hellope-shapes": hellope_shapes()
+	case "hellope-text": hellope_text()
+	case "textures": textures()
+	case:
+		fmt.eprintf("Unknown example: %v\n", args[1])
+		fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom, textures")
+		os.exit(1)
+	}
+}
@@ -1,19 +1,17 @@
 package examples

-import "core:os"
-import sdl "vendor:sdl3"
-
 import "../../draw"
 import "../../draw/draw_qr"
-import cyber "../cybersteel"
+import "core:os"
+import sdl "vendor:sdl3"

 textures :: proc() {
 	if !sdl.Init({.VIDEO}) do os.exit(1)
-	window := sdl.CreateWindow("Textures", 800, 750, {.HIGH_PIXEL_DENSITY})
+	window := sdl.CreateWindow("Textures", 800, 600, {.HIGH_PIXEL_DENSITY})
 	gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil)
 	if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1)
 	if !draw.init(gpu, window) do os.exit(1)
-	PLEX_SANS_REGULAR = draw.register_font(cyber.SANS_REGULAR_RAW)
+	JETBRAINS_MONO_REGULAR = draw.register_font(JETBRAINS_MONO_REGULAR_RAW)

 	FONT_SIZE :: u16(14)
 	LABEL_OFFSET :: f32(8) // gap between item and its label
@@ -88,10 +86,10 @@ textures :: proc() {
 		}
 		spin_angle += 1

-		base_layer := draw.begin({width = 800, height = 750})
+		base_layer := draw.begin({width = 800, height = 600})

 		// Background
-		draw.rectangle(base_layer, {0, 0, 800, 750}, draw.Color{30, 30, 30, 255})
+		draw.rectangle(base_layer, {0, 0, 800, 600}, {30, 30, 30, 255})

 		//----- Row 1: Sampler presets (y=30) ----------------------------------

@@ -103,61 +101,50 @@ textures :: proc() {
 		COL4 :: f32(480)

 		// Nearest (sharp pixel edges)
-		draw.rectangle(
+		draw.rectangle_texture(
 			base_layer,
 			{COL1, ROW1_Y, ITEM_SIZE, ITEM_SIZE},
-			draw.Texture_Fill {
-				id = checker_texture,
-				tint = draw.WHITE,
-				uv_rect = {0, 0, 1, 1},
-				sampler = .Nearest_Clamp,
-			},
+			checker_texture,
+			sampler = .Nearest_Clamp,
 		)
 		draw.text(
 			base_layer,
 			"Nearest",
 			{COL1, ROW1_Y + ITEM_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)

 		// Linear (bilinear blur)
-		draw.rectangle(
+		draw.rectangle_texture(
 			base_layer,
 			{COL2, ROW1_Y, ITEM_SIZE, ITEM_SIZE},
-			draw.Texture_Fill {
-				id = checker_texture,
-				tint = draw.WHITE,
-				uv_rect = {0, 0, 1, 1},
-				sampler = .Linear_Clamp,
-			},
+			checker_texture,
+			sampler = .Linear_Clamp,
 		)
 		draw.text(
 			base_layer,
 			"Linear",
 			{COL2, ROW1_Y + ITEM_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)

 		// Tiled (4x repeat)
-		draw.rectangle(
+		draw.rectangle_texture(
 			base_layer,
 			{COL3, ROW1_Y, ITEM_SIZE, ITEM_SIZE},
-			draw.Texture_Fill {
-				id = checker_texture,
-				tint = draw.WHITE,
-				uv_rect = {0, 0, 4, 4},
-				sampler = .Nearest_Repeat,
-			},
+			checker_texture,
+			sampler = .Nearest_Repeat,
+			uv_rect = {0, 0, 4, 4},
 		)
 		draw.text(
 			base_layer,
 			"Tiled 4x",
 			{COL3, ROW1_Y + ITEM_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)
@@ -166,60 +153,47 @@ textures :: proc() {

 		ROW2_Y :: f32(190)

-		// QR code (RGBA texture with baked colors, nearest sampling) + thin framing border.
-		draw.rectangle(base_layer, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, draw.Color{255, 255, 255, 255}) // white bg
-		draw.rectangle(
+		// QR code (RGBA texture with baked colors, nearest sampling)
+		draw.rectangle(base_layer, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, {255, 255, 255, 255}) // white bg
+		draw.rectangle_texture(
 			base_layer,
 			{COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE},
-			draw.Texture_Fill{id = qr_texture, tint = draw.WHITE, uv_rect = {0, 0, 1, 1}, sampler = .Nearest_Clamp},
-			outline_color = draw.WHITE,
-			outline_width = 2,
+			qr_texture,
+			sampler = .Nearest_Clamp,
 		)
 		draw.text(
 			base_layer,
 			"QR Code",
 			{COL1, ROW2_Y + ITEM_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)

-		// Rounded corners + outline traces the rounded shape.
-		draw.rectangle(
+		// Rounded corners
+		draw.rectangle_texture(
 			base_layer,
 			{COL2, ROW2_Y, ITEM_SIZE, ITEM_SIZE},
-			draw.Texture_Fill {
-				id = checker_texture,
-				tint = draw.WHITE,
-				uv_rect = {0, 0, 1, 1},
-				sampler = .Nearest_Clamp,
-			},
-			outline_color = draw.Color{255, 200, 100, 255},
-			outline_width = 3,
-			radii = draw.uniform_radii({COL2, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, 0.3),
+			checker_texture,
+			sampler = .Nearest_Clamp,
+			roundness = 0.3,
 		)
 		draw.text(
 			base_layer,
 			"Rounded",
 			{COL2, ROW2_Y + ITEM_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)

-		// Rotating + outline rotates with the texture.
+		// Rotating
 		rot_rect := draw.Rectangle{COL3, ROW2_Y, ITEM_SIZE, ITEM_SIZE}
-		draw.rectangle(
+		draw.rectangle_texture(
 			base_layer,
 			rot_rect,
-			draw.Texture_Fill {
-				id = checker_texture,
-				tint = draw.WHITE,
-				uv_rect = {0, 0, 1, 1},
-				sampler = .Nearest_Clamp,
-			},
-			outline_color = draw.WHITE,
-			outline_width = 2,
+			checker_texture,
+			sampler = .Nearest_Clamp,
 			origin = draw.center_of(rot_rect),
 			rotation = spin_angle,
 		)
@@ -227,7 +201,7 @@ textures :: proc() {
 			base_layer,
 			"Rotating",
 			{COL3, ROW2_Y + ITEM_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)
@@ -239,178 +213,56 @@ textures :: proc() {

 		// Stretch
 		uv_s, sampler_s, inner_s := draw.fit_params(.Stretch, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
-		draw.rectangle(base_layer, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255}) // bg
-		draw.rectangle(
-			base_layer,
-			inner_s,
-			draw.Texture_Fill{id = stripe_texture, tint = draw.WHITE, uv_rect = uv_s, sampler = sampler_s},
-		)
+		draw.rectangle(base_layer, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255}) // bg
+		draw.rectangle_texture(base_layer, inner_s, stripe_texture, uv_rect = uv_s, sampler = sampler_s)
 		draw.text(
 			base_layer,
 			"Stretch",
 			{COL1, ROW3_Y + FIT_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)

 		// Fill (center-crop)
 		uv_f, sampler_f, inner_f := draw.fit_params(.Fill, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
-		draw.rectangle(base_layer, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255})
-		draw.rectangle(
-			base_layer,
-			inner_f,
-			draw.Texture_Fill{id = stripe_texture, tint = draw.WHITE, uv_rect = uv_f, sampler = sampler_f},
-		)
+		draw.rectangle(base_layer, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255})
+		draw.rectangle_texture(base_layer, inner_f, stripe_texture, uv_rect = uv_f, sampler = sampler_f)
 		draw.text(
 			base_layer,
 			"Fill",
 			{COL2, ROW3_Y + FIT_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)

 		// Fit (letterbox)
 		uv_ft, sampler_ft, inner_ft := draw.fit_params(.Fit, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
-		draw.rectangle(base_layer, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255}) // visible margin bg
-		draw.rectangle(
-			base_layer,
-			inner_ft,
-			draw.Texture_Fill{id = stripe_texture, tint = draw.WHITE, uv_rect = uv_ft, sampler = sampler_ft},
-		)
+		draw.rectangle(base_layer, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255}) // visible margin bg
+		draw.rectangle_texture(base_layer, inner_ft, stripe_texture, uv_rect = uv_ft, sampler = sampler_ft)
 		draw.text(
 			base_layer,
 			"Fit",
 			{COL3, ROW3_Y + FIT_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)

-		// Per-corner radii + outline traces the asymmetric corner shape.
-		draw.rectangle(
+		// Per-corner radii
+		draw.rectangle_texture_corners(
 			base_layer,
 			{COL4, ROW3_Y, FIT_SIZE, FIT_SIZE},
-			draw.Texture_Fill {
-				id = checker_texture,
-				tint = draw.WHITE,
-				uv_rect = {0, 0, 1, 1},
-				sampler = .Nearest_Clamp,
-			},
-			outline_color = draw.Color{255, 100, 100, 255},
-			outline_width = 3,
-			radii = {20, 0, 20, 0},
+			{20, 0, 20, 0},
+			checker_texture,
+			sampler = .Nearest_Clamp,
 		)
 		draw.text(
 			base_layer,
 			"Per-corner",
 			{COL4, ROW3_Y + FIT_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.WHITE,
-		)
-
-		//----- Row 4: Textured shapes (y=520) ----------------------------------
-
-		ROW4_Y :: f32(520)
-		SHAPE_SIZE :: f32(80)
-		SHAPE_GAP :: f32(30)
-		SHAPE_COL1 :: f32(30)
-		SHAPE_COL2 :: SHAPE_COL1 + SHAPE_SIZE + SHAPE_GAP
-		SHAPE_COL3 :: SHAPE_COL2 + SHAPE_SIZE + SHAPE_GAP
-		SHAPE_COL4 :: SHAPE_COL3 + SHAPE_SIZE + SHAPE_GAP
-		SHAPE_COL5 :: SHAPE_COL4 + SHAPE_SIZE + SHAPE_GAP
-
-		checker_fill := draw.Texture_Fill {
-			id      = checker_texture,
-			tint    = draw.WHITE,
-			uv_rect = {0, 0, 1, 1},
-			sampler = .Nearest_Clamp,
-		}
-
-		// Textured circle + outline (textured shape with built-in border).
-		draw.circle(
-			base_layer,
-			{SHAPE_COL1 + SHAPE_SIZE / 2, ROW4_Y + SHAPE_SIZE / 2},
-			SHAPE_SIZE / 2,
-			checker_fill,
-			outline_color = draw.WHITE,
-			outline_width = 2,
-		)
-		draw.text(
-			base_layer,
-			"Circle",
-			{SHAPE_COL1, ROW4_Y + SHAPE_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.WHITE,
-		)
-
-		// Textured ellipse
-		draw.ellipse(
-			base_layer,
-			{SHAPE_COL2 + SHAPE_SIZE / 2, ROW4_Y + SHAPE_SIZE / 2},
-			SHAPE_SIZE / 2,
-			SHAPE_SIZE / 3,
-			checker_fill,
-		)
-		draw.text(
-			base_layer,
-			"Ellipse",
-			{SHAPE_COL2, ROW4_Y + SHAPE_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.WHITE,
-		)
-
-		// Textured polygon (hexagon)
-		draw.polygon(
-			base_layer,
-			{SHAPE_COL3 + SHAPE_SIZE / 2, ROW4_Y + SHAPE_SIZE / 2},
-			6,
-			SHAPE_SIZE / 2,
-			checker_fill,
-		)
-		draw.text(
-			base_layer,
-			"Polygon",
-			{SHAPE_COL3, ROW4_Y + SHAPE_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.WHITE,
-		)
-
-		// Textured ring
-		draw.ring(
-			base_layer,
-			{SHAPE_COL4 + SHAPE_SIZE / 2, ROW4_Y + SHAPE_SIZE / 2},
-			SHAPE_SIZE / 4,
-			SHAPE_SIZE / 2,
-			checker_fill,
-		)
-		draw.text(
-			base_layer,
-			"Ring",
-			{SHAPE_COL4, ROW4_Y + SHAPE_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
-			FONT_SIZE,
-			color = draw.WHITE,
-		)
-
-		// Textured line (capsule)
-		draw.line(
-			base_layer,
-			{SHAPE_COL5, ROW4_Y + SHAPE_SIZE / 2},
-			{SHAPE_COL5 + SHAPE_SIZE, ROW4_Y + SHAPE_SIZE / 2},
-			checker_fill,
-			thickness = 20,
-		)
-		draw.text(
-			base_layer,
-			"Line",
-			{SHAPE_COL5, ROW4_Y + SHAPE_SIZE + LABEL_OFFSET},
-			PLEX_SANS_REGULAR,
+			JETBRAINS_MONO_REGULAR,
 			FONT_SIZE,
 			color = draw.WHITE,
 		)
@@ -0,0 +1,685 @@
+package draw
+
+import "core:c"
+import "core:log"
+import "core:mem"
+import sdl "vendor:sdl3"
+
+Vertex :: struct {
+	position: [2]f32,
+	uv:       [2]f32,
+	color:    Color,
+}
+
+TextBatch :: struct {
+	atlas_texture: ^sdl.GPUTexture,
+	vertex_start:  u32,
+	vertex_count:  u32,
+	index_start:   u32,
+	index_count:   u32,
+}
+
+// ----------------------------------------------------------------------------------------------------------------
+// ----- SDF primitive types -----------
+// ----------------------------------------------------------------------------------------------------------------
+
+Shape_Kind :: enum u8 {
+	Solid    = 0,
+	RRect    = 1,
+	Circle   = 2,
+	Ellipse  = 3,
+	Segment  = 4,
+	Ring_Arc = 5,
+	NGon     = 6,
+}
+
+Shape_Flag :: enum u8 {
+	Stroke,
+	Textured,
+}
+
+Shape_Flags :: bit_set[Shape_Flag;u8]
+
+RRect_Params :: struct {
+	half_size: [2]f32,
+	radii:     [4]f32,
+	soft_px:   f32,
+	stroke_px: f32,
+}
+
+Circle_Params :: struct {
+	radius:    f32,
+	soft_px:   f32,
+	stroke_px: f32,
+	_:         [5]f32,
+}
+
+Ellipse_Params :: struct {
+	radii:     [2]f32,
+	soft_px:   f32,
+	stroke_px: f32,
+	_:         [4]f32,
+}
+
+Segment_Params :: struct {
+	a:       [2]f32,
+	b:       [2]f32,
+	width:   f32,
+	soft_px: f32,
+	_:       [2]f32,
+}
+
+Ring_Arc_Params :: struct {
+	inner_radius: f32,
+	outer_radius: f32,
+	start_rad:    f32,
+	end_rad:      f32,
+	soft_px:      f32,
+	_:            [3]f32,
+}
+
+NGon_Params :: struct {
+	radius:    f32,
+	rotation:  f32,
+	sides:     f32,
+	soft_px:   f32,
+	stroke_px: f32,
+	_:         [3]f32,
+}
+
+Shape_Params :: struct #raw_union {
+	rrect:    RRect_Params,
+	circle:   Circle_Params,
+	ellipse:  Ellipse_Params,
+	segment:  Segment_Params,
+	ring_arc: Ring_Arc_Params,
+	ngon:     NGon_Params,
+	raw:      [8]f32,
+}
+
+#assert(size_of(Shape_Params) == 32)
+
+// GPU layout: 80 bytes, std430-compatible. The shader declares this as a storage buffer struct.
+Primitive :: struct {
+	bounds:     [4]f32, //  0: min_x, min_y, max_x, max_y (world-space, pre-DPI)
+	color:      Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8
+	kind_flags: u32, // 20: (kind as u32) | (flags as u32 << 8)
+	rotation:   f32, // 24: shader self-rotation in radians (used by RRect, Ellipse)
+	_pad:       f32, // 28: alignment to vec4 boundary
+	params:     Shape_Params, // 32: two vec4s of shape params
+	uv_rect:    [4]f32, // 64: u_min, v_min, u_max, v_max (default {0,0,1,1})
+}
+
+#assert(size_of(Primitive) == 80)
+
+pack_kind_flags :: #force_inline proc(kind: Shape_Kind, flags: Shape_Flags) -> u32 {
+	return u32(kind) | (u32(transmute(u8)flags) << 8)
+}
+
+Pipeline_2D_Base :: struct {
+	sdl_pipeline:     ^sdl.GPUGraphicsPipeline,
+	vertex_buffer:    Buffer,
+	index_buffer:     Buffer,
+	unit_quad_buffer: ^sdl.GPUBuffer,
+	primitive_buffer: Buffer,
+	white_texture:    ^sdl.GPUTexture,
+	sampler:          ^sdl.GPUSampler,
+}
+
+@(private)
+create_pipeline_2d_base :: proc(
+	device: ^sdl.GPUDevice,
+	window: ^sdl.Window,
+	sample_count: sdl.GPUSampleCount,
+) -> (
+	pipeline: Pipeline_2D_Base,
+	ok: bool,
+) {
+	// On failure, clean up any partially-created resources
+	defer if !ok {
+		if pipeline.sampler != nil do sdl.ReleaseGPUSampler(device, pipeline.sampler)
+		if pipeline.white_texture != nil do sdl.ReleaseGPUTexture(device, pipeline.white_texture)
+		if pipeline.unit_quad_buffer != nil do sdl.ReleaseGPUBuffer(device, pipeline.unit_quad_buffer)
+		if pipeline.primitive_buffer.gpu != nil do destroy_buffer(device, &pipeline.primitive_buffer)
+		if pipeline.index_buffer.gpu != nil do destroy_buffer(device, &pipeline.index_buffer)
+		if pipeline.vertex_buffer.gpu != nil do destroy_buffer(device, &pipeline.vertex_buffer)
+		if pipeline.sdl_pipeline != nil do sdl.ReleaseGPUGraphicsPipeline(device, pipeline.sdl_pipeline)
+	}
+
+	active_shader_formats := sdl.GetGPUShaderFormats(device)
+	if PLATFORM_SHADER_FORMAT_FLAG not_in active_shader_formats {
+		log.errorf(
+			"draw: no embedded shader matches active GPU formats; this build supports %v but device reports %v",
+			PLATFORM_SHADER_FORMAT,
+			active_shader_formats,
+		)
+		return pipeline, false
+	}
+
+	log.debug("Loaded", len(BASE_VERT_2D_RAW), "vert bytes")
+	log.debug("Loaded", len(BASE_FRAG_2D_RAW), "frag bytes")
+
+	vert_info := sdl.GPUShaderCreateInfo {
+		code_size           = len(BASE_VERT_2D_RAW),
+		code                = raw_data(BASE_VERT_2D_RAW),
+		entrypoint          = SHADER_ENTRY,
+		format              = {PLATFORM_SHADER_FORMAT_FLAG},
+		stage               = .VERTEX,
+		num_uniform_buffers = 1,
+		num_storage_buffers = 1,
+	}
+
+	frag_info := sdl.GPUShaderCreateInfo {
+		code_size    = len(BASE_FRAG_2D_RAW),
+		code         = raw_data(BASE_FRAG_2D_RAW),
+		entrypoint   = SHADER_ENTRY,
+		format       = {PLATFORM_SHADER_FORMAT_FLAG},
+		stage        = .FRAGMENT,
+		num_samplers = 1,
+	}
+
+	vert_shader := sdl.CreateGPUShader(device, vert_info)
+	if vert_shader == nil {
+		log.errorf("Could not create draw vertex shader: %s", sdl.GetError())
+		return pipeline, false
+	}
+
+	frag_shader := sdl.CreateGPUShader(device, frag_info)
+	if frag_shader == nil {
+		sdl.ReleaseGPUShader(device, vert_shader)
+		log.errorf("Could not create draw fragment shader: %s", sdl.GetError())
+		return pipeline, false
+	}
+
+	vertex_attributes: [3]sdl.GPUVertexAttribute = {
+		// position (GLSL location 0)
+		sdl.GPUVertexAttribute{buffer_slot = 0, location = 0, format = .FLOAT2, offset = 0},
+		// uv (GLSL location 1)
+		sdl.GPUVertexAttribute{buffer_slot = 0, location = 1, format = .FLOAT2, offset = size_of([2]f32)},
+		// color (GLSL location 2, u8x4 normalized to float by GPU)
+		sdl.GPUVertexAttribute{buffer_slot = 0, location = 2, format = .UBYTE4_NORM, offset = size_of([2]f32) * 2},
+	}
+
+	pipeline_info := sdl.GPUGraphicsPipelineCreateInfo {
+		vertex_shader = vert_shader,
+		fragment_shader = frag_shader,
+		primitive_type = .TRIANGLELIST,
+		multisample_state = sdl.GPUMultisampleState{sample_count = sample_count},
+		target_info = sdl.GPUGraphicsPipelineTargetInfo {
+			color_target_descriptions = &sdl.GPUColorTargetDescription {
+				format = sdl.GetGPUSwapchainTextureFormat(device, window),
+				blend_state = sdl.GPUColorTargetBlendState {
+					enable_blend = true,
+					enable_color_write_mask = true,
+					src_color_blendfactor = .SRC_ALPHA,
+					dst_color_blendfactor = .ONE_MINUS_SRC_ALPHA,
+					color_blend_op = .ADD,
+					src_alpha_blendfactor = .SRC_ALPHA,
+					dst_alpha_blendfactor = .ONE_MINUS_SRC_ALPHA,
+					alpha_blend_op = .ADD,
+					color_write_mask = sdl.GPUColorComponentFlags{.R, .G, .B, .A},
+				},
+			},
+			num_color_targets = 1,
+		},
+		vertex_input_state = sdl.GPUVertexInputState {
+			vertex_buffer_descriptions = &sdl.GPUVertexBufferDescription {
+				slot = 0,
+				input_rate = .VERTEX,
+				pitch = size_of(Vertex),
+			},
+			num_vertex_buffers = 1,
+			vertex_attributes = raw_data(vertex_attributes[:]),
+			num_vertex_attributes = 3,
+		},
+	}
+
+	pipeline.sdl_pipeline = sdl.CreateGPUGraphicsPipeline(device, pipeline_info)
+	// Shaders are no longer needed regardless of pipeline creation success
+	sdl.ReleaseGPUShader(device, vert_shader)
+	sdl.ReleaseGPUShader(device, frag_shader)
+	if pipeline.sdl_pipeline == nil {
+		log.errorf("Failed to create draw graphics pipeline: %s", sdl.GetError())
+		return pipeline, false
+	}
+
+	// Create vertex buffer
+	vert_buf_ok: bool
+	pipeline.vertex_buffer, vert_buf_ok = create_buffer(
+		device,
+		size_of(Vertex) * BUFFER_INIT_SIZE,
+		sdl.GPUBufferUsageFlags{.VERTEX},
+	)
+	if !vert_buf_ok do return pipeline, false
+
+	// Create index buffer (used by text)
+	idx_buf_ok: bool
+	pipeline.index_buffer, idx_buf_ok = create_buffer(
+		device,
+		size_of(c.int) * BUFFER_INIT_SIZE,
+		sdl.GPUBufferUsageFlags{.INDEX},
+	)
+	if !idx_buf_ok do return pipeline, false
+
+	// Create primitive storage buffer (used by SDF instanced drawing)
+	prim_buf_ok: bool
+	pipeline.primitive_buffer, prim_buf_ok = create_buffer(
+		device,
+		size_of(Primitive) * BUFFER_INIT_SIZE,
+		sdl.GPUBufferUsageFlags{.GRAPHICS_STORAGE_READ},
+	)
+	if !prim_buf_ok do return pipeline, false
+
+	// Create static 6-vertex unit quad buffer (two triangles, TRIANGLELIST)
+	pipeline.unit_quad_buffer = sdl.CreateGPUBuffer(
+		device,
+		sdl.GPUBufferCreateInfo{usage = {.VERTEX}, size = 6 * size_of(Vertex)},
+	)
+	if pipeline.unit_quad_buffer == nil {
+		log.errorf("Failed to create unit quad buffer: %s", sdl.GetError())
+		return pipeline, false
+	}
+
+	// Create 1x1 white pixel texture
+	pipeline.white_texture = sdl.CreateGPUTexture(
+		device,
+		sdl.GPUTextureCreateInfo {
+			type = .D2,
+			format = .R8G8B8A8_UNORM,
+			usage = {.SAMPLER},
+			width = 1,
+			height = 1,
+			layer_count_or_depth = 1,
+			num_levels = 1,
+			sample_count = ._1,
+		},
+	)
+	if pipeline.white_texture == nil {
+		log.errorf("Failed to create white pixel texture: %s", sdl.GetError())
+		return pipeline, false
+	}
+
+	// Upload white pixel and unit quad data in a single command buffer
+	white_pixel := [4]u8{255, 255, 255, 255}
+	white_transfer_buf := sdl.CreateGPUTransferBuffer(
+		device,
+		sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = size_of(white_pixel)},
+	)
+	if white_transfer_buf == nil {
+		log.errorf("Failed to create white pixel transfer buffer: %s", sdl.GetError())
+		return pipeline, false
+	}
+	defer sdl.ReleaseGPUTransferBuffer(device, white_transfer_buf)
+
+	white_ptr := sdl.MapGPUTransferBuffer(device, white_transfer_buf, false)
+	if white_ptr == nil {
+		log.errorf("Failed to map white pixel transfer buffer: %s", sdl.GetError())
+		return pipeline, false
+	}
+	mem.copy(white_ptr, &white_pixel, size_of(white_pixel))
+	sdl.UnmapGPUTransferBuffer(device, white_transfer_buf)
+
+	quad_verts := [6]Vertex {
+		{position = {0, 0}},
+		{position = {1, 0}},
+		{position = {0, 1}},
+		{position = {0, 1}},
+		{position = {1, 0}},
+		{position = {1, 1}},
+	}
+	quad_transfer_buf := sdl.CreateGPUTransferBuffer(
+		device,
+		sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = size_of(quad_verts)},
+	)
+	if quad_transfer_buf == nil {
+		log.errorf("Failed to create unit quad transfer buffer: %s", sdl.GetError())
+		return pipeline, false
+	}
+	defer sdl.ReleaseGPUTransferBuffer(device, quad_transfer_buf)
+
+	quad_ptr := sdl.MapGPUTransferBuffer(device, quad_transfer_buf, false)
+	if quad_ptr == nil {
+		log.errorf("Failed to map unit quad transfer buffer: %s", sdl.GetError())
+		return pipeline, false
+	}
+	mem.copy(quad_ptr, &quad_verts, size_of(quad_verts))
+	sdl.UnmapGPUTransferBuffer(device, quad_transfer_buf)
+
+	upload_cmd_buffer := sdl.AcquireGPUCommandBuffer(device)
+	if upload_cmd_buffer == nil {
+		log.errorf("Failed to acquire command buffer for init upload: %s", sdl.GetError())
+		return pipeline, false
+	}
+	upload_pass := sdl.BeginGPUCopyPass(upload_cmd_buffer)
+
+	sdl.UploadToGPUTexture(
+		upload_pass,
+		sdl.GPUTextureTransferInfo{transfer_buffer = white_transfer_buf},
+		sdl.GPUTextureRegion{texture = pipeline.white_texture, w = 1, h = 1, d = 1},
+		false,
+	)
+
+	sdl.UploadToGPUBuffer(
+		upload_pass,
+		sdl.GPUTransferBufferLocation{transfer_buffer = quad_transfer_buf},
+		sdl.GPUBufferRegion{buffer = pipeline.unit_quad_buffer, offset = 0, size = size_of(quad_verts)},
+		false,
+	)
+
+	sdl.EndGPUCopyPass(upload_pass)
+	if !sdl.SubmitGPUCommandBuffer(upload_cmd_buffer) {
+		log.errorf("Failed to submit init upload command buffer: %s", sdl.GetError())
+		return pipeline, false
+	}
+
+	log.debug("White pixel texture and unit quad buffer created and uploaded")
+
+	// Create sampler (shared by shapes and text)
+	pipeline.sampler = sdl.CreateGPUSampler(
+		device,
+		sdl.GPUSamplerCreateInfo {
+			min_filter = .LINEAR,
+			mag_filter = .LINEAR,
+			mipmap_mode = .LINEAR,
+			address_mode_u = .CLAMP_TO_EDGE,
+			address_mode_v = .CLAMP_TO_EDGE,
+			address_mode_w = .CLAMP_TO_EDGE,
+		},
+	)
+	if pipeline.sampler == nil {
+		log.errorf("Could not create GPU sampler: %s", sdl.GetError())
+		return pipeline, false
+	}
+
+	log.debug("Done creating unified draw pipeline")
+	return pipeline, true
+}
+
+@(private)
+upload :: proc(device: ^sdl.GPUDevice, pass: ^sdl.GPUCopyPass) {
+	// Upload vertices (shapes then text into one buffer)
+	shape_vert_count := u32(len(GLOB.tmp_shape_verts))
+	text_vert_count := u32(len(GLOB.tmp_text_verts))
+	total_vert_count := shape_vert_count + text_vert_count
+
+	if total_vert_count > 0 {
+		total_vert_size := total_vert_count * size_of(Vertex)
+		shape_vert_size := shape_vert_count * size_of(Vertex)
+		text_vert_size := text_vert_count * size_of(Vertex)
+
+		grow_buffer_if_needed(
+			device,
+			&GLOB.pipeline_2d_base.vertex_buffer,
+			total_vert_size,
+			sdl.GPUBufferUsageFlags{.VERTEX},
+		)
+
+		vert_array := sdl.MapGPUTransferBuffer(device, GLOB.pipeline_2d_base.vertex_buffer.transfer, false)
+		if vert_array == nil {
+			log.panicf("Failed to map vertex transfer buffer: %s", sdl.GetError())
+		}
+		if shape_vert_size > 0 {
+			mem.copy(vert_array, raw_data(GLOB.tmp_shape_verts), int(shape_vert_size))
+		}
+		if text_vert_size > 0 {
+			mem.copy(
+				rawptr(uintptr(vert_array) + uintptr(shape_vert_size)),
+				raw_data(GLOB.tmp_text_verts),
+				int(text_vert_size),
+			)
+		}
+		sdl.UnmapGPUTransferBuffer(device, GLOB.pipeline_2d_base.vertex_buffer.transfer)
+
+		sdl.UploadToGPUBuffer(
+			pass,
+			sdl.GPUTransferBufferLocation{transfer_buffer = GLOB.pipeline_2d_base.vertex_buffer.transfer},
+			sdl.GPUBufferRegion{buffer = GLOB.pipeline_2d_base.vertex_buffer.gpu, offset = 0, size = total_vert_size},
+			false,
+		)
+	}
+
+	// Upload text indices
+	index_count := u32(len(GLOB.tmp_text_indices))
+	if index_count > 0 {
+		index_size := index_count * size_of(c.int)
+
+		grow_buffer_if_needed(
+			device,
+			&GLOB.pipeline_2d_base.index_buffer,
+			index_size,
+			sdl.GPUBufferUsageFlags{.INDEX},
+		)
+
+		idx_array := sdl.MapGPUTransferBuffer(device, GLOB.pipeline_2d_base.index_buffer.transfer, false)
+		if idx_array == nil {
+			log.panicf("Failed to map index transfer buffer: %s", sdl.GetError())
+		}
+		mem.copy(idx_array, raw_data(GLOB.tmp_text_indices), int(index_size))
+		sdl.UnmapGPUTransferBuffer(device, GLOB.pipeline_2d_base.index_buffer.transfer)
+
+		sdl.UploadToGPUBuffer(
+			pass,
+			sdl.GPUTransferBufferLocation{transfer_buffer = GLOB.pipeline_2d_base.index_buffer.transfer},
+			sdl.GPUBufferRegion{buffer = GLOB.pipeline_2d_base.index_buffer.gpu, offset = 0, size = index_size},
+			false,
+		)
+	}
+
+	// Upload SDF primitives
+	prim_count := u32(len(GLOB.tmp_primitives))
+	if prim_count > 0 {
+		prim_size := prim_count * size_of(Primitive)
+
+		grow_buffer_if_needed(
+			device,
+			&GLOB.pipeline_2d_base.primitive_buffer,
+			prim_size,
+			sdl.GPUBufferUsageFlags{.GRAPHICS_STORAGE_READ},
+		)
+
+		prim_array := sdl.MapGPUTransferBuffer(device, GLOB.pipeline_2d_base.primitive_buffer.transfer, false)
+		if prim_array == nil {
+			log.panicf("Failed to map primitive transfer buffer: %s", sdl.GetError())
+		}
+		mem.copy(prim_array, raw_data(GLOB.tmp_primitives), int(prim_size))
+		sdl.UnmapGPUTransferBuffer(device, GLOB.pipeline_2d_base.primitive_buffer.transfer)
+
+		sdl.UploadToGPUBuffer(
+			pass,
+			sdl.GPUTransferBufferLocation{transfer_buffer = GLOB.pipeline_2d_base.primitive_buffer.transfer},
+			sdl.GPUBufferRegion{buffer = GLOB.pipeline_2d_base.primitive_buffer.gpu, offset = 0, size = prim_size},
+			false,
+		)
+	}
+}
+
+@(private)
+draw_layer :: proc(
+	device: ^sdl.GPUDevice,
+	window: ^sdl.Window,
+	cmd_buffer: ^sdl.GPUCommandBuffer,
+	render_texture: ^sdl.GPUTexture,
+	swapchain_width: u32,
+	swapchain_height: u32,
+	clear_color: [4]f32,
+	layer: ^Layer,
+) {
+	if layer.sub_batch_len == 0 {
+		if !GLOB.cleared {
+			pass := sdl.BeginGPURenderPass(
+				cmd_buffer,
+				&sdl.GPUColorTargetInfo {
+					texture = render_texture,
+					clear_color = sdl.FColor{clear_color[0], clear_color[1], clear_color[2], clear_color[3]},
+					load_op = .CLEAR,
+					store_op = .STORE,
+				},
+				1,
+				nil,
+			)
+			sdl.EndGPURenderPass(pass)
+			GLOB.cleared = true
+		}
+		return
+	}
+
+	render_pass := sdl.BeginGPURenderPass(
+		cmd_buffer,
+		&sdl.GPUColorTargetInfo {
+			texture = render_texture,
+			clear_color = sdl.FColor{clear_color[0], clear_color[1], clear_color[2], clear_color[3]},
+			load_op = GLOB.cleared ? .LOAD : .CLEAR,
+			store_op = .STORE,
+		},
+		1,
+		nil,
+	)
+	GLOB.cleared = true
+
+	sdl.BindGPUGraphicsPipeline(render_pass, GLOB.pipeline_2d_base.sdl_pipeline)
+
+	// Bind storage buffer (read by vertex shader in SDF mode)
+	sdl.BindGPUVertexStorageBuffers(
+		render_pass,
+		0,
+		([^]^sdl.GPUBuffer)(&GLOB.pipeline_2d_base.primitive_buffer.gpu),
+		1,
+	)
+
+	// Always bind index buffer — harmless if no indexed draws are issued
+	sdl.BindGPUIndexBuffer(
+		render_pass,
+		sdl.GPUBufferBinding{buffer = GLOB.pipeline_2d_base.index_buffer.gpu, offset = 0},
+		._32BIT,
+	)
+
+	// Shorthand aliases for frequently-used pipeline resources
+	main_vert_buf := GLOB.pipeline_2d_base.vertex_buffer.gpu
+	unit_quad := GLOB.pipeline_2d_base.unit_quad_buffer
+	white_texture := GLOB.pipeline_2d_base.white_texture
+	sampler := GLOB.pipeline_2d_base.sampler
+	width := f32(swapchain_width)
+	height := f32(swapchain_height)
+
+	// Initial GPU state: tessellated mode, main vertex buffer, no atlas bound yet
+	push_globals(cmd_buffer, width, height, .Tessellated)
+	sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = main_vert_buf, offset = 0}, 1)
+
+	current_mode: Draw_Mode = .Tessellated
+	current_vert_buf := main_vert_buf
+	current_atlas: ^sdl.GPUTexture
+	current_sampler := sampler
+
+	// Text vertices live after shape vertices in the GPU vertex buffer
+	text_vertex_gpu_base := u32(len(GLOB.tmp_shape_verts))
+
+	for &scissor in GLOB.scissors[layer.scissor_start:][:layer.scissor_len] {
+		sdl.SetGPUScissor(render_pass, scissor.bounds)
+
+		for &batch in GLOB.tmp_sub_batches[scissor.sub_batch_start:][:scissor.sub_batch_len] {
+			switch batch.kind {
+			case .Shapes:
+				if current_mode != .Tessellated {
+					push_globals(cmd_buffer, width, height, .Tessellated)
+					current_mode = .Tessellated
+				}
+				if current_vert_buf != main_vert_buf {
+					sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = main_vert_buf, offset = 0}, 1)
+					current_vert_buf = main_vert_buf
+				}
+				// Determine texture and sampler for this batch
+				batch_texture: ^sdl.GPUTexture = white_texture
+				batch_sampler: ^sdl.GPUSampler = sampler
+				if batch.texture_id != INVALID_TEXTURE {
+					if bound_texture := texture_gpu_handle(batch.texture_id); bound_texture != nil {
+						batch_texture = bound_texture
+					}
+					batch_sampler = get_sampler(batch.sampler)
+				}
+				if current_atlas != batch_texture || current_sampler != batch_sampler {
+					sdl.BindGPUFragmentSamplers(
+						render_pass,
+						0,
+						&sdl.GPUTextureSamplerBinding{texture = batch_texture, sampler = batch_sampler},
+						1,
+					)
+					current_atlas = batch_texture
+					current_sampler = batch_sampler
+				}
+				sdl.DrawGPUPrimitives(render_pass, batch.count, 1, batch.offset, 0)
+
+			case .Text:
+				if current_mode != .Tessellated {
+					push_globals(cmd_buffer, width, height, .Tessellated)
+					current_mode = .Tessellated
+				}
+				if current_vert_buf != main_vert_buf {
+					sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = main_vert_buf, offset = 0}, 1)
+					current_vert_buf = main_vert_buf
+				}
+				text_batch := &GLOB.tmp_text_batches[batch.offset]
+				if current_atlas != text_batch.atlas_texture || current_sampler != sampler {
+					sdl.BindGPUFragmentSamplers(
+						render_pass,
+						0,
+						&sdl.GPUTextureSamplerBinding{texture = text_batch.atlas_texture, sampler = sampler},
+						1,
+					)
+					current_atlas, current_sampler = text_batch.atlas_texture, sampler
+				}
+				sdl.DrawGPUIndexedPrimitives(
+					render_pass,
+					text_batch.index_count,
+					1,
+					text_batch.index_start,
+					i32(text_vertex_gpu_base + text_batch.vertex_start),
+					0,
+				)
+
+			case .SDF:
+				if current_mode != .SDF {
+					push_globals(cmd_buffer, width, height, .SDF)
+					current_mode = .SDF
+				}
+				if current_vert_buf != unit_quad {
+					sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = unit_quad, offset = 0}, 1)
+					current_vert_buf = unit_quad
+				}
+				// Determine texture and sampler for this batch
+				batch_texture: ^sdl.GPUTexture = white_texture
+				batch_sampler: ^sdl.GPUSampler = sampler
+				if batch.texture_id != INVALID_TEXTURE {
+					if bound_texture := texture_gpu_handle(batch.texture_id); bound_texture != nil {
+						batch_texture = bound_texture
+					}
+					batch_sampler = get_sampler(batch.sampler)
+				}
+				if current_atlas != batch_texture || current_sampler != batch_sampler {
+					sdl.BindGPUFragmentSamplers(
+						render_pass,
+						0,
+						&sdl.GPUTextureSamplerBinding{texture = batch_texture, sampler = batch_sampler},
+						1,
+					)
+					current_atlas = batch_texture
+					current_sampler = batch_sampler
+				}
+				sdl.DrawGPUPrimitives(render_pass, 6, batch.count, 0, batch.offset)
+			}
+		}
+	}
+
+	sdl.EndGPURenderPass(render_pass)
+}
+
+destroy_pipeline_2d_base :: proc(device: ^sdl.GPUDevice, pipeline: ^Pipeline_2D_Base) {
+	destroy_buffer(device, &pipeline.vertex_buffer)
+	destroy_buffer(device, &pipeline.index_buffer)
+	destroy_buffer(device, &pipeline.primitive_buffer)
+	if pipeline.unit_quad_buffer != nil {
+		sdl.ReleaseGPUBuffer(device, pipeline.unit_quad_buffer)
+	}
+	sdl.ReleaseGPUTexture(device, pipeline.white_texture)
+	sdl.ReleaseGPUSampler(device, pipeline.sampler)
+	sdl.ReleaseGPUGraphicsPipeline(device, pipeline.sdl_pipeline)
+}
@@ -1,118 +0,0 @@
-#pragma clang diagnostic ignored "-Wmissing-prototypes"
-
-#include <metal_stdlib>
-#include <simd/simd.h>
-
-using namespace metal;
-
-struct Uniforms
-{
-    float2 inv_working_size;
-    uint pair_count;
-    uint mode;
-    float2 direction;
-    float inv_downsample_factor;
-    float _pad0;
-    float4 kernel0[32];
-};
-
-struct main0_out
-{
-    float4 out_color [[color(0)]];
-};
-
-struct main0_in
-{
-    float2 p_local [[user(locn0)]];
-    float4 f_color [[user(locn1)]];
-    float2 f_half_size_ppx [[user(locn2), flat]];
-    float4 f_radii_ppx [[user(locn3), flat]];
-    float f_half_feather_ppx [[user(locn4), flat]];
-};
-
-static inline __attribute__((always_inline))
-float3 blur_sample(thread const float2& uv, constant Uniforms& _108, texture2d<float> blur_input_tex, sampler blur_input_texSmplr)
-{
-    float3 color = blur_input_tex.sample(blur_input_texSmplr, uv).xyz * _108.kernel0[0].x;
-    float2 axis_step = _108.direction * _108.inv_working_size;
-    for (uint i = 1u; i < _108.pair_count; i++)
-    {
-        float w = _108.kernel0[i].x;
-        float off = _108.kernel0[i].y;
-        float2 step_uv = axis_step * off;
-        color += (blur_input_tex.sample(blur_input_texSmplr, (uv - step_uv)).xyz * w);
-        color += (blur_input_tex.sample(blur_input_texSmplr, (uv + step_uv)).xyz * w);
-    }
-    return color;
-}
-
-static inline __attribute__((always_inline))
-float sdRoundedBox(thread const float2& p, thread const float2& b, thread const float4& r)
-{
-    float2 _36;
-    if (p.x > 0.0)
-    {
-        _36 = r.xy;
-    }
-    else
-    {
-        _36 = r.zw;
-    }
-    float2 rxy = _36;
-    float _50;
-    if (p.y > 0.0)
-    {
-        _50 = rxy.x;
-    }
-    else
-    {
-        _50 = rxy.y;
-    }
-    float rr = _50;
-    float2 q = abs(p) - b;
-    if (rr == 0.0)
-    {
-        return fast::max(q.x, q.y);
-    }
-    q += float2(rr);
-    return (fast::min(fast::max(q.x, q.y), 0.0) + length(fast::max(q, float2(0.0)))) - rr;
-}
-
-static inline __attribute__((always_inline))
-float sdf_alpha(thread const float& d, thread const float& h)
-{
-    return 1.0 - smoothstep(-h, h, d);
-}
-
-fragment main0_out main0(main0_in in [[stage_in]], constant Uniforms& _108 [[buffer(0)]], texture2d<float> blur_input_tex [[texture(0)]], sampler blur_input_texSmplr [[sampler(0)]], float4 gl_FragCoord [[position]])
-{
-    main0_out out = {};
-    if (_108.mode == 0u)
-    {
-        float2 uv = gl_FragCoord.xy * _108.inv_working_size;
-        float2 param = uv;
-        float3 color = blur_sample(param, _108, blur_input_tex, blur_input_texSmplr);
-        out.out_color = float4(color, 1.0);
-        return out;
-    }
-    float2 param_1 = in.p_local;
-    float2 param_2 = in.f_half_size_ppx;
-    float4 param_3 = in.f_radii_ppx;
-    float d = sdRoundedBox(param_1, param_2, param_3);
-    if (d > in.f_half_feather_ppx)
-    {
-        discard_fragment();
-    }
-    float grad_magnitude = fast::max(fwidth(d), 9.9999999747524270787835121154785e-07);
-    float d_n = d / grad_magnitude;
-    float h_n = in.f_half_feather_ppx / grad_magnitude;
-    float2 uv_1 = (gl_FragCoord.xy * _108.inv_downsample_factor) * _108.inv_working_size;
-    float3 color_1 = blur_input_tex.sample(blur_input_texSmplr, uv_1).xyz;
-    float3 tinted = mix(color_1, color_1 * in.f_color.xyz, float3(in.f_color.w));
-    float param_4 = d_n;
-    float param_5 = h_n;
-    float coverage = sdf_alpha(param_4, param_5);
-    out.out_color = float4(tinted * coverage, coverage);
-    return out;
-}
-
@@ -1,123 +0,0 @@
-#pragma clang diagnostic ignored "-Wmissing-prototypes"
-#pragma clang diagnostic ignored "-Wmissing-braces"
-
-#include <metal_stdlib>
-#include <simd/simd.h>
-
-using namespace metal;
-
-template<typename T, size_t Num>
-struct spvUnsafeArray
-{
-    T elements[Num ? Num : 1];
-    
-    thread T& operator [] (size_t pos) thread
-    {
-        return elements[pos];
-    }
-    constexpr const thread T& operator [] (size_t pos) const thread
-    {
-        return elements[pos];
-    }
-    
-    device T& operator [] (size_t pos) device
-    {
-        return elements[pos];
-    }
-    constexpr const device T& operator [] (size_t pos) const device
-    {
-        return elements[pos];
-    }
-    
-    constexpr const constant T& operator [] (size_t pos) const constant
-    {
-        return elements[pos];
-    }
-    
-    threadgroup T& operator [] (size_t pos) threadgroup
-    {
-        return elements[pos];
-    }
-    constexpr const threadgroup T& operator [] (size_t pos) const threadgroup
-    {
-        return elements[pos];
-    }
-};
-
-struct Uniforms
-{
-    float4x4 projection;
-    float dpi_scale;
-    uint mode;
-    float2 _pad0;
-};
-
-struct Gaussian_Blur_Primitive
-{
-    float4 bounds;
-    float4 radii_ppx;
-    float2 half_size_ppx;
-    float half_feather_ppx;
-    uint color;
-};
-
-struct Gaussian_Blur_Primitive_1
-{
-    float4 bounds;
-    float4 radii_ppx;
-    float2 half_size_ppx;
-    float half_feather_ppx;
-    uint color;
-};
-
-struct Gaussian_Blur_Primitives
-{
-    Gaussian_Blur_Primitive_1 primitives[1];
-};
-
-constant spvUnsafeArray<float2, 6> _97 = spvUnsafeArray<float2, 6>({ float2(0.0), float2(1.0, 0.0), float2(0.0, 1.0), float2(0.0, 1.0), float2(1.0, 0.0), float2(1.0) });
-
-struct main0_out
-{
-    float2 p_local [[user(locn0)]];
-    float4 f_color [[user(locn1)]];
-    float2 f_half_size_ppx [[user(locn2)]];
-    float4 f_radii_ppx [[user(locn3)]];
-    float f_half_feather_ppx [[user(locn4)]];
-    float4 gl_Position [[position]];
-};
-
-vertex main0_out main0(constant Uniforms& _13 [[buffer(0)]], const device Gaussian_Blur_Primitives& _69 [[buffer(1)]], uint gl_VertexIndex [[vertex_id]], uint gl_InstanceIndex [[instance_id]])
-{
-    main0_out out = {};
-    if (_13.mode == 0u)
-    {
-        float2 ndc = float2((int(gl_VertexIndex) == 1) ? 3.0 : (-1.0), (int(gl_VertexIndex) == 2) ? 3.0 : (-1.0));
-        out.gl_Position = float4(ndc, 0.0, 1.0);
-        out.p_local = float2(0.0);
-        out.f_color = float4(0.0);
-        out.f_half_size_ppx = float2(0.0);
-        out.f_radii_ppx = float4(0.0);
-        out.f_half_feather_ppx = 0.0;
-    }
-    else
-    {
-        Gaussian_Blur_Primitive p;
-        p.bounds = _69.primitives[int(gl_InstanceIndex)].bounds;
-        p.radii_ppx = _69.primitives[int(gl_InstanceIndex)].radii_ppx;
-        p.half_size_ppx = _69.primitives[int(gl_InstanceIndex)].half_size_ppx;
-        p.half_feather_ppx = _69.primitives[int(gl_InstanceIndex)].half_feather_ppx;
-        p.color = _69.primitives[int(gl_InstanceIndex)].color;
-        float2 corner = _97[int(gl_VertexIndex)];
-        float2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner);
-        float2 center = (p.bounds.xy + p.bounds.zw) * 0.5;
-        out.p_local = (world_pos - center) * _13.dpi_scale;
-        out.f_color = unpack_unorm4x8_to_float(p.color);
-        out.f_half_size_ppx = p.half_size_ppx;
-        out.f_radii_ppx = p.radii_ppx;
-        out.f_half_feather_ppx = p.half_feather_ppx;
-        out.gl_Position = _13.projection * float4(world_pos * _13.dpi_scale, 0.0, 1.0);
-    }
-    return out;
-}
-
@@ -1,47 +0,0 @@
-#include <metal_stdlib>
-#include <simd/simd.h>
-
-using namespace metal;
-
-struct Uniforms
-{
-    float2 inv_source_size;
-    uint downsample_factor;
-    uint _pad0;
-};
-
-struct main0_out
-{
-    float4 out_color [[color(0)]];
-};
-
-fragment main0_out main0(constant Uniforms& _18 [[buffer(0)]], texture2d<float> source_tex [[texture(0)]], sampler source_texSmplr [[sampler(0)]], float4 gl_FragCoord [[position]])
-{
-    main0_out out = {};
-    float2 src_block_center = gl_FragCoord.xy * float(_18.downsample_factor);
-    if (_18.downsample_factor == 1u)
-    {
-        float2 uv = src_block_center * _18.inv_source_size;
-        out.out_color = source_tex.sample(source_texSmplr, uv);
-    }
-    else
-    {
-        if (_18.downsample_factor == 2u)
-        {
-            float2 uv_1 = src_block_center * _18.inv_source_size;
-            out.out_color = source_tex.sample(source_texSmplr, uv_1);
-        }
-        else
-        {
-            float off = float(_18.downsample_factor) * 0.25;
-            float2 uv_tl = (src_block_center + float2(-off, -off)) * _18.inv_source_size;
-            float2 uv_tr = (src_block_center + float2(off, -off)) * _18.inv_source_size;
-            float2 uv_bl = (src_block_center + float2(-off, off)) * _18.inv_source_size;
-            float2 uv_br = (src_block_center + float2(off)) * _18.inv_source_size;
-            float4 c = ((source_tex.sample(source_texSmplr, uv_tl) + source_tex.sample(source_texSmplr, uv_tr)) + source_tex.sample(source_texSmplr, uv_bl)) + source_tex.sample(source_texSmplr, uv_br);
-            out.out_color = c * 0.25;
-        }
-    }
-    return out;
-}
-
@@ -1,18 +0,0 @@
-#include <metal_stdlib>
-#include <simd/simd.h>
-
-using namespace metal;
-
-struct main0_out
-{
-    float4 gl_Position [[position]];
-};
-
-vertex main0_out main0(uint gl_VertexIndex [[vertex_id]])
-{
-    main0_out out = {};
-    float2 ndc = float2((int(gl_VertexIndex) == 1) ? 3.0 : (-1.0), (int(gl_VertexIndex) == 2) ? 3.0 : (-1.0));
-    out.gl_Position = float4(ndc, 0.0, 1.0);
-    return out;
-}
-
@@ -23,220 +23,293 @@ struct main0_in
    float2 f_local_or_uv [[user(locn1)]];
    float4 f_params [[user(locn2)]];
    float4 f_params2 [[user(locn3)]];
-    uint f_flags [[user(locn4)]];
+    uint f_kind_flags [[user(locn4)]];
+    float f_rotation [[user(locn5), flat]];
    float4 f_uv_rect [[user(locn6), flat]];
-    uint4 f_effects [[user(locn7)]];
 };

 static inline __attribute__((always_inline))
-float sdRoundedBox(thread const float2& p, thread const float2& b, thread const float4& r)
+float2 apply_rotation(thread const float2& p, thread const float& angle)
 {
-    float2 _48;
+    float cr = cos(-angle);
+    float sr = sin(-angle);
+    return float2x2(float2(cr, sr), float2(-sr, cr)) * p;
+}
+
+static inline __attribute__((always_inline))
+float sdRoundedBox(thread const float2& p, thread const float2& b, thread float4& r)
+{
+    float2 _61;
    if (p.x > 0.0)
    {
-        _48 = r.xy;
+        _61 = r.xy;
    }
    else
    {
-        _48 = r.zw;
+        _61 = r.zw;
    }
-    float2 rxy = _48;
-    float _62;
+    r.x = _61.x;
+    r.y = _61.y;
+    float _78;
    if (p.y > 0.0)
    {
-        _62 = rxy.x;
+        _78 = r.x;
    }
    else
    {
-        _62 = rxy.y;
+        _78 = r.y;
    }
-    float rr = _62;
-    float2 q = abs(p) - b;
-    if (rr == 0.0)
+    r.x = _78;
+    float2 q = (abs(p) - b) + float2(r.x);
+    return (fast::min(fast::max(q.x, q.y), 0.0) + length(fast::max(q, float2(0.0)))) - r.x;
+}
+
+static inline __attribute__((always_inline))
+float sdf_stroke(thread const float& d, thread const float& stroke_width)
+{
+    return abs(d) - (stroke_width * 0.5);
+}
+
+static inline __attribute__((always_inline))
+float sdf_alpha(thread const float& d, thread const float& soft)
+{
+    return 1.0 - smoothstep(-soft, soft, d);
+}
+
+static inline __attribute__((always_inline))
+float sdCircle(thread const float2& p, thread const float& r)
+{
+    return length(p) - r;
+}
+
+static inline __attribute__((always_inline))
+float sdEllipse(thread float2& p, thread float2& ab)
+{
+    p = abs(p);
+    if (p.x > p.y)
    {
-        return fast::max(q.x, q.y);
+        p = p.yx;
+        ab = ab.yx;
    }
-    q += float2(rr);
-    return (fast::min(fast::max(q.x, q.y), 0.0) + length(fast::max(q, float2(0.0)))) - rr;
+    float l = (ab.y * ab.y) - (ab.x * ab.x);
+    float m = (ab.x * p.x) / l;
+    float m2 = m * m;
+    float n = (ab.y * p.y) / l;
+    float n2 = n * n;
+    float c = ((m2 + n2) - 1.0) / 3.0;
+    float c3 = (c * c) * c;
+    float q = c3 + ((m2 * n2) * 2.0);
+    float d = c3 + (m2 * n2);
+    float g = m + (m * n2);
+    float co;
+    if (d < 0.0)
+    {
+        float h = acos(q / c3) / 3.0;
+        float s = cos(h);
+        float t = sin(h) * 1.73205077648162841796875;
+        float rx = sqrt(((-c) * ((s + t) + 2.0)) + m2);
+        float ry = sqrt(((-c) * ((s - t) + 2.0)) + m2);
+        co = (((ry + (sign(l) * rx)) + (abs(g) / (rx * ry))) - m) / 2.0;
+    }
+    else
+    {
+        float h_1 = ((2.0 * m) * n) * sqrt(d);
+        float s_1 = sign(q + h_1) * powr(abs(q + h_1), 0.3333333432674407958984375);
+        float u = sign(q - h_1) * powr(abs(q - h_1), 0.3333333432674407958984375);
+        float rx_1 = (((-s_1) - u) - (c * 4.0)) + (2.0 * m2);
+        float ry_1 = (s_1 - u) * 1.73205077648162841796875;
+        float rm = sqrt((rx_1 * rx_1) + (ry_1 * ry_1));
+        co = (((ry_1 / sqrt(rm - rx_1)) + ((2.0 * g) / rm)) - m) / 2.0;
+    }
+    float2 r = ab * float2(co, sqrt(1.0 - (co * co)));
+    return length(r - p) * sign(p.y - r.y);
 }

 static inline __attribute__((always_inline))
-float sdRegularPolygon(thread const float2& p, thread const float& r, thread const float& n)
+float sdSegment(thread const float2& p, thread const float2& a, thread const float2& b)
 {
-    float an = 3.1415927410125732421875 / n;
-    float bn = mod(precise::atan2(p.y, p.x), 2.0 * an) - an;
-    return (length(p) * cos(bn)) - r;
-}
-
-static inline __attribute__((always_inline))
-float sdEllipseApprox(thread const float2& p, thread const float2& ab)
-{
-    float k0 = length(p / ab);
-    float k1 = length(p / (ab * ab));
-    return (k0 * (k0 - 1.0)) / k1;
-}
-
-static inline __attribute__((always_inline))
-float4 gradient_2color(thread const float4& start_color, thread const float4& end_color, thread const float& t)
-{
-    return mix(start_color, end_color, float4(fast::clamp(t, 0.0, 1.0)));
-}
-
-static inline __attribute__((always_inline))
-float sdf_alpha(thread const float& d, thread const float& h)
-{
-    return 1.0 - smoothstep(-h, h, d);
+    float2 pa = p - a;
+    float2 ba = b - a;
+    float h = fast::clamp(dot(pa, ba) / dot(ba, ba), 0.0, 1.0);
+    return length(pa - (ba * h));
 }

 fragment main0_out main0(main0_in in [[stage_in]], texture2d<float> tex [[texture(0)]], sampler texSmplr [[sampler(0)]])
 {
    main0_out out = {};
-    uint kind = in.f_flags & 255u;
-    uint flags = (in.f_flags >> 8u) & 255u;
+    uint kind = in.f_kind_flags & 255u;
+    uint flags = (in.f_kind_flags >> 8u) & 255u;
    if (kind == 0u)
    {
-        float4 t = tex.sample(texSmplr, in.f_local_or_uv);
-        float _195 = t.w;
-        float4 _197 = t;
-        float3 _199 = _197.xyz * _195;
-        t.x = _199.x;
-        t.y = _199.y;
-        t.z = _199.z;
-        out.out_color = in.f_color * t;
+        out.out_color = in.f_color * tex.sample(texSmplr, in.f_local_or_uv);
        return out;
    }
    float d = 1000000015047466219876688855040.0;
-    float h = 0.5;
-    float2 half_size_ppx = in.f_params.xy;
-    float2 p_local_ppx = in.f_local_or_uv;
+    float soft = 1.0;
    if (kind == 1u)
    {
-        float4 corner_radii_ppx = float4(in.f_params.zw, in.f_params2.xy);
-        h = in.f_params2.z;
-        float2 param = p_local_ppx;
-        float2 param_1 = half_size_ppx;
-        float4 param_2 = corner_radii_ppx;
-        d = sdRoundedBox(param, param_1, param_2);
+        float2 b = in.f_params.xy;
+        float4 r = float4(in.f_params.zw, in.f_params2.xy);
+        soft = fast::max(in.f_params2.z, 1.0);
+        float stroke_px = in.f_params2.w;
+        float2 p_local = in.f_local_or_uv;
+        if (in.f_rotation != 0.0)
+        {
+            float2 param = p_local;
+            float param_1 = in.f_rotation;
+            p_local = apply_rotation(param, param_1);
+        }
+        float2 param_2 = p_local;
+        float2 param_3 = b;
+        float4 param_4 = r;
+        float _491 = sdRoundedBox(param_2, param_3, param_4);
+        d = _491;
+        if ((flags & 1u) != 0u)
+        {
+            float param_5 = d;
+            float param_6 = stroke_px;
+            d = sdf_stroke(param_5, param_6);
+        }
+        float4 shape_color = in.f_color;
+        if ((flags & 2u) != 0u)
+        {
+            float2 p_for_uv = in.f_local_or_uv;
+            if (in.f_rotation != 0.0)
+            {
+                float2 param_7 = p_for_uv;
+                float param_8 = in.f_rotation;
+                p_for_uv = apply_rotation(param_7, param_8);
+            }
+            float2 local_uv = ((p_for_uv / b) * 0.5) + float2(0.5);
+            float2 uv = mix(in.f_uv_rect.xy, in.f_uv_rect.zw, local_uv);
+            shape_color *= tex.sample(texSmplr, uv);
+        }
+        float param_9 = d;
+        float param_10 = soft;
+        float alpha = sdf_alpha(param_9, param_10);
+        out.out_color = float4(shape_color.xyz, shape_color.w * alpha);
+        return out;
    }
    else
    {
        if (kind == 2u)
        {
-            float radius_ppx = in.f_params.x;
-            float sides = in.f_params.y;
-            h = in.f_params.z;
-            float2 param_3 = p_local_ppx;
-            float param_4 = radius_ppx;
-            float param_5 = sides;
-            d = sdRegularPolygon(param_3, param_4, param_5);
-            half_size_ppx = float2(radius_ppx);
+            float radius = in.f_params.x;
+            soft = fast::max(in.f_params.y, 1.0);
+            float stroke_px_1 = in.f_params.z;
+            float2 param_11 = in.f_local_or_uv;
+            float param_12 = radius;
+            d = sdCircle(param_11, param_12);
+            if ((flags & 1u) != 0u)
+            {
+                float param_13 = d;
+                float param_14 = stroke_px_1;
+                d = sdf_stroke(param_13, param_14);
+            }
        }
        else
        {
            if (kind == 3u)
            {
-                float2 radii_ppx = in.f_params.xy;
-                h = in.f_params.z;
-                float2 param_6 = p_local_ppx;
-                float2 param_7 = radii_ppx;
-                d = sdEllipseApprox(param_6, param_7);
-                half_size_ppx = radii_ppx;
+                float2 ab = in.f_params.xy;
+                soft = fast::max(in.f_params.z, 1.0);
+                float stroke_px_2 = in.f_params.w;
+                float2 p_local_1 = in.f_local_or_uv;
+                if (in.f_rotation != 0.0)
+                {
+                    float2 param_15 = p_local_1;
+                    float param_16 = in.f_rotation;
+                    p_local_1 = apply_rotation(param_15, param_16);
+                }
+                float2 param_17 = p_local_1;
+                float2 param_18 = ab;
+                float _616 = sdEllipse(param_17, param_18);
+                d = _616;
+                if ((flags & 1u) != 0u)
+                {
+                    float param_19 = d;
+                    float param_20 = stroke_px_2;
+                    d = sdf_stroke(param_19, param_20);
+                }
            }
            else
            {
                if (kind == 4u)
                {
-                    float inner_radius_ppx = in.f_params.x;
-                    float outer_radius_ppx = in.f_params.y;
-                    float2 n_start = in.f_params.zw;
-                    float2 n_end = in.f_params2.xy;
-                    uint arc_bits = (flags >> 5u) & 3u;
-                    h = in.f_params2.z;
-                    float r = length(p_local_ppx);
-                    d = fast::max(inner_radius_ppx - r, r - outer_radius_ppx);
-                    if (arc_bits != 0u)
+                    float2 a = in.f_params.xy;
+                    float2 b_1 = in.f_params.zw;
+                    float width = in.f_params2.x;
+                    soft = fast::max(in.f_params2.y, 1.0);
+                    float2 param_21 = in.f_local_or_uv;
+                    float2 param_22 = a;
+                    float2 param_23 = b_1;
+                    d = sdSegment(param_21, param_22, param_23) - (width * 0.5);
+                }
+                else
+                {
+                    if (kind == 5u)
                    {
-                        float d_start = dot(p_local_ppx, n_start);
-                        float d_end = dot(p_local_ppx, n_end);
-                        float _338;
-                        if (arc_bits == 1u)
+                        float inner = in.f_params.x;
+                        float outer = in.f_params.y;
+                        float start_rad = in.f_params.z;
+                        float end_rad = in.f_params.w;
+                        soft = fast::max(in.f_params2.x, 1.0);
+                        float r_1 = length(in.f_local_or_uv);
+                        float d_ring = fast::max(inner - r_1, r_1 - outer);
+                        float angle = precise::atan2(in.f_local_or_uv.y, in.f_local_or_uv.x);
+                        if (angle < 0.0)
                        {
-                            _338 = fast::max(d_start, d_end);
+                            angle += 6.283185482025146484375;
+                        }
+                        float ang_start = mod(start_rad, 6.283185482025146484375);
+                        float ang_end = mod(end_rad, 6.283185482025146484375);
+                        float _710;
+                        if (ang_end > ang_start)
+                        {
+                            _710 = float((angle >= ang_start) && (angle <= ang_end));
                        }
                        else
                        {
-                            _338 = fast::min(d_start, d_end);
+                            _710 = float((angle >= ang_start) || (angle <= ang_end));
+                        }
+                        float in_arc = _710;
+                        if (abs(ang_end - ang_start) >= 6.282185077667236328125)
+                        {
+                            in_arc = 1.0;
+                        }
+                        d = (in_arc > 0.5) ? d_ring : 1000000015047466219876688855040.0;
+                    }
+                    else
+                    {
+                        if (kind == 6u)
+                        {
+                            float radius_1 = in.f_params.x;
+                            float rotation = in.f_params.y;
+                            float sides = in.f_params.z;
+                            soft = fast::max(in.f_params.w, 1.0);
+                            float stroke_px_3 = in.f_params2.x;
+                            float2 p = in.f_local_or_uv;
+                            float c = cos(rotation);
+                            float s = sin(rotation);
+                            p = float2x2(float2(c, -s), float2(s, c)) * p;
+                            float an = 3.1415927410125732421875 / sides;
+                            float bn = mod(precise::atan2(p.y, p.x), 2.0 * an) - an;
+                            d = (length(p) * cos(bn)) - radius_1;
+                            if ((flags & 1u) != 0u)
+                            {
+                                float param_24 = d;
+                                float param_25 = stroke_px_3;
+                                d = sdf_stroke(param_24, param_25);
+                            }
                        }
-                        float d_wedge = _338;
-                        d = fast::max(d, d_wedge);
                    }
-                    half_size_ppx = float2(outer_radius_ppx);
                }
            }
        }
    }
-    float grad_magnitude = fast::max(fwidth(d), 9.9999999747524270787835121154785e-07);
-    d /= grad_magnitude;
-    h /= grad_magnitude;
-    float4 shape_color;
-    if ((flags & 2u) != 0u)
-    {
-        float4 gradient_start = in.f_color;
-        float4 gradient_end = unpack_unorm4x8_to_float(in.f_effects.x);
-        if ((flags & 4u) != 0u)
-        {
-            float t_1 = length(p_local_ppx / half_size_ppx);
-            float4 param_8 = gradient_start;
-            float4 param_9 = gradient_end;
-            float param_10 = t_1;
-            shape_color = gradient_2color(param_8, param_9, param_10);
-        }
-        else
-        {
-            float2 direction = float2(as_type<half2>(in.f_effects.z));
-            float t_2 = (dot(p_local_ppx / half_size_ppx, direction) * 0.5) + 0.5;
-            float4 param_11 = gradient_start;
-            float4 param_12 = gradient_end;
-            float param_13 = t_2;
-            shape_color = gradient_2color(param_11, param_12, param_13);
-        }
-    }
-    else
-    {
-        if ((flags & 1u) != 0u)
-        {
-            float4 uv_rect = in.f_uv_rect;
-            float2 local_uv = ((p_local_ppx / half_size_ppx) * 0.5) + float2(0.5);
-            float2 uv = mix(uv_rect.xy, uv_rect.zw, local_uv);
-            shape_color = in.f_color * tex.sample(texSmplr, uv);
-        }
-        else
-        {
-            shape_color = in.f_color;
-        }
-    }
-    if ((flags & 8u) != 0u)
-    {
-        float4 ol_color = unpack_unorm4x8_to_float(in.f_effects.y);
-        float ol_width = float2(as_type<half2>(in.f_effects.w)).x / grad_magnitude;
-        float param_14 = d;
-        float param_15 = h;
-        float fill_cov = sdf_alpha(param_14, param_15);
-        float param_16 = d - ol_width;
-        float param_17 = h;
-        float total_cov = sdf_alpha(param_16, param_17);
-        float outline_cov = fast::max(total_cov - fill_cov, 0.0);
-        float3 rgb_pm = ((shape_color.xyz * shape_color.w) * fill_cov) + ((ol_color.xyz * ol_color.w) * outline_cov);
-        float alpha_pm = (shape_color.w * fill_cov) + (ol_color.w * outline_cov);
-        out.out_color = float4(rgb_pm, alpha_pm);
-    }
-    else
-    {
-        float param_18 = d;
-        float param_19 = h;
-        float alpha = sdf_alpha(param_18, param_19);
-        out.out_color = float4((shape_color.xyz * shape_color.w) * alpha, shape_color.w * alpha);
-    }
+    float param_26 = d;
+    float param_27 = soft;
+    float alpha_1 = sdf_alpha(param_26, param_27);
+    out.out_color = float4(in.f_color.xyz, in.f_color.w * alpha_1);
    return out;
 }
-
@@ -10,35 +10,33 @@ struct Uniforms
    uint mode;
 };

-struct Core_2D_Primitive
+struct Primitive
 {
    float4 bounds;
    uint color;
-    uint flags;
-    uint rotation_sc;
+    uint kind_flags;
+    float rotation;
    float _pad;
    float4 params;
    float4 params2;
    float4 uv_rect;
-    uint4 effects;
 };

-struct Core_2D_Primitive_1
+struct Primitive_1
 {
    float4 bounds;
    uint color;
-    uint flags;
-    uint rotation_sc;
+    uint kind_flags;
+    float rotation;
    float _pad;
    float4 params;
    float4 params2;
    float4 uv_rect;
-    uint4 effects;
 };

-struct Core_2D_Primitives
+struct Primitives
 {
-    Core_2D_Primitive_1 primitives[1];
+    Primitive_1 primitives[1];
 };

 struct main0_out
@@ -47,9 +45,9 @@ struct main0_out
    float2 f_local_or_uv [[user(locn1)]];
    float4 f_params [[user(locn2)]];
    float4 f_params2 [[user(locn3)]];
-    uint f_flags [[user(locn4)]];
+    uint f_kind_flags [[user(locn4)]];
+    float f_rotation [[user(locn5)]];
    float4 f_uv_rect [[user(locn6)]];
-    uint4 f_effects [[user(locn7)]];
    float4 gl_Position [[position]];
 };

@@ -60,61 +58,42 @@ struct main0_in
    float4 v_color [[attribute(2)]];
 };

-vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer(0)]], const device Core_2D_Primitives& _31 [[buffer(1)]], uint gl_InstanceIndex [[instance_id]])
+vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer(0)]], const device Primitives& _74 [[buffer(1)]], uint gl_InstanceIndex [[instance_id]])
 {
    main0_out out = {};
-    if (_12.mode == 1u)
-    {
-        Core_2D_Primitive p;
-        p.bounds = _31.primitives[int(gl_InstanceIndex)].bounds;
-        p.color = _31.primitives[int(gl_InstanceIndex)].color;
-        p.flags = _31.primitives[int(gl_InstanceIndex)].flags;
-        p.rotation_sc = _31.primitives[int(gl_InstanceIndex)].rotation_sc;
-        p._pad = _31.primitives[int(gl_InstanceIndex)]._pad;
-        p.params = _31.primitives[int(gl_InstanceIndex)].params;
-        p.params2 = _31.primitives[int(gl_InstanceIndex)].params2;
-        p.uv_rect = _31.primitives[int(gl_InstanceIndex)].uv_rect;
-        p.effects = _31.primitives[int(gl_InstanceIndex)].effects;
-        float2 corner = in.v_position;
-        float2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner);
-        float2 center = (p.bounds.xy + p.bounds.zw) * 0.5;
-        float2 local = (world_pos - center) * _12.dpi_scale;
-        uint flags = (p.flags >> 8u) & 255u;
-        if ((flags & 16u) != 0u)
-        {
-            float2 sc = float2(as_type<half2>(p.rotation_sc));
-            local = float2((sc.y * local.x) + (sc.x * local.y), ((-sc.x) * local.x) + (sc.y * local.y));
-        }
-        out.f_color = unpack_unorm4x8_to_float(p.color);
-        out.f_local_or_uv = local;
-        out.f_params = p.params;
-        out.f_params2 = p.params2;
-        out.f_flags = p.flags;
-        out.f_uv_rect = p.uv_rect;
-        out.f_effects = p.effects;
-        out.gl_Position = _12.projection * float4(world_pos * _12.dpi_scale, 0.0, 1.0);
-    }
-    else
+    if (_12.mode == 0u)
    {
        out.f_color = in.v_color;
        out.f_local_or_uv = in.v_uv;
        out.f_params = float4(0.0);
        out.f_params2 = float4(0.0);
-        out.f_flags = 0u;
-        out.f_uv_rect = float4(0.0);
-        out.f_effects = uint4(0u);
-        float2 _199;
-        if (_12.mode == 2u)
-        {
-            _199 = in.v_position;
-        }
-        else
-        {
-            _199 = in.v_position * _12.dpi_scale;
-        }
-        float2 pos = _199;
-        out.gl_Position = _12.projection * float4(pos, 0.0, 1.0);
+        out.f_kind_flags = 0u;
+        out.f_rotation = 0.0;
+        out.f_uv_rect = float4(0.0, 0.0, 1.0, 1.0);
+        out.gl_Position = _12.projection * float4(in.v_position * _12.dpi_scale, 0.0, 1.0);
+    }
+    else
+    {
+        Primitive p;
+        p.bounds = _74.primitives[int(gl_InstanceIndex)].bounds;
+        p.color = _74.primitives[int(gl_InstanceIndex)].color;
+        p.kind_flags = _74.primitives[int(gl_InstanceIndex)].kind_flags;
+        p.rotation = _74.primitives[int(gl_InstanceIndex)].rotation;
+        p._pad = _74.primitives[int(gl_InstanceIndex)]._pad;
+        p.params = _74.primitives[int(gl_InstanceIndex)].params;
+        p.params2 = _74.primitives[int(gl_InstanceIndex)].params2;
+        p.uv_rect = _74.primitives[int(gl_InstanceIndex)].uv_rect;
+        float2 corner = in.v_position;
+        float2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner);
+        float2 center = (p.bounds.xy + p.bounds.zw) * 0.5;
+        out.f_color = unpack_unorm4x8_to_float(p.color);
+        out.f_local_or_uv = (world_pos - center) * _12.dpi_scale;
+        out.f_params = p.params;
+        out.f_params2 = p.params2;
+        out.f_kind_flags = p.kind_flags;
+        out.f_rotation = p.rotation;
+        out.f_uv_rect = p.uv_rect;
+        out.gl_Position = _12.projection * float4(world_pos * _12.dpi_scale, 0.0, 1.0);
    }
    return out;
 }
-
@@ -1,155 +0,0 @@
-#version 450 core
-
-// Unified backdrop blur fragment shader.
-// Handles both the 1D separable blur passes (mode 0, used for BOTH the H-pass and V-pass;
-// `direction` picks the axis) and the composite pass (mode 1, reads the fully-blurred
-// working texture, masks via RRect SDF, applies tint, and writes to source_texture with
-// premultiplied-over blending). Working textures are sized at the full swapchain resolution;
-// downsampled content occupies only a sub-rect at downsample factor > 1 (set via viewport).
-//
-// The composite blends with source_texture via the standard premultiplied-over blend state
-// (ONE, ONE_MINUS_SRC_ALPHA).
-//
-// Backdrop primitives are tint-only — there is no outline. A specialized edge effect
-// (e.g. liquid-glass-style refraction outlines) would be implemented as a dedicated
-// primitive type with its own pipeline.
-//
-// Two modes, structurally distinct:
-//
-//   Mode 0: 1D separable blur. Used for BOTH the H-pass and V-pass; `direction` (set in the
-//           per-pass uniforms) picks (1,0) for H or (0,1) for V. Reads the previous working-
-//           res texture and writes the next working-res texture. Fullscreen-triangle vertex
-//           output; gl_FragCoord.xy is in working-res target pixel space; UV =
-//           gl_FragCoord.xy * inv_working_size.
-//
-//   Mode 1: composite. Reads the fully-blurred working-res texture, applies the SDF mask and
-//           tint, writes to source_texture. Instanced unit-quad vertex output covering the
-//           per-primitive bounds; gl_FragCoord.xy is in the full-resolution render target;
-//           UV into the blurred working texture =
-//           (gl_FragCoord.xy * inv_downsample_factor) * inv_working_size.
-//           No kernel is applied here — the blur is already complete.
-//
-// V-blur is run as its own working→working pass rather than folded into the composite. The
-// folded variant produced a horizontal-vs-vertical asymmetry artifact: when V-blur sampled
-// the H-blur output through the bilinear-upsample/SDF-mask/tint pipeline in one shader
-// invocation, horizontal source features ended up looking sharper than vertical ones.
-// Matching V's structure exactly to H's restores symmetry.
-
-const uint MAX_KERNEL_PAIRS = 32;
-
-// --- Inputs from vertex shader ---
-layout(location = 0) in vec2 p_local;
-layout(location = 1) in mediump vec4 f_color;
-layout(location = 2) flat in vec2 f_half_size_ppx;
-layout(location = 3) flat in vec4 f_radii_ppx;
-layout(location = 4) flat in float f_half_feather_ppx;
-
-// --- Output ---
-layout(location = 0) out vec4 out_color;
-
-// --- Sampler ---
-// Mode 0: bound to downsample_texture. Mode 1: bound to h_blur_texture.
-layout(set = 2, binding = 0) uniform sampler2D blur_input_tex;
-
-// --- Uniforms (set 3) ---
-// Per-bracket-substage. `mode` matches the vertex shader's mode (0 = H, 1 = V).
-// `direction` selects the kernel axis for blur offsets.
-// `kernel` holds the per-sigma weight/offset pairs computed CPU-side using the
-// linear-sampling pair adjustment (RAD/Rákos).
-layout(set = 3, binding = 0) uniform Uniforms {
-    vec2 inv_working_size; // 1.0 / working-resolution texture dimensions
-    uint pair_count; // number of (weight, offset) pairs; pair[0] is the center
-    uint mode; // 0 = H-blur, 1 = V-composite
-    vec2 direction; // (1,0) for H, (0,1) for V — multiplied into the kernel offset
-    float inv_downsample_factor; // 1.0 / downsample_factor (mode 1 only; mode 0 ignores)
-    float _pad0;
-    vec4 kernel[MAX_KERNEL_PAIRS]; // .x = weight (paired-sum for idx>0), .y = offset (texels)
-};
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- SDF helper --------------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-float sdRoundedBox(vec2 p, vec2 b, vec4 r) {
-    vec2 rxy = (p.x > 0.0) ? r.xy : r.zw;
-    float rr = (p.y > 0.0) ? rxy.x : rxy.y;
-    vec2 q = abs(p) - b;
-    if (rr == 0.0) {
-        return max(q.x, q.y);
-    }
-    q += rr;
-    return min(max(q.x, q.y), 0.0) + length(max(q, vec2(0.0))) - rr;
-}
-
-float sdf_alpha(float d, float h) {
-    return 1.0 - smoothstep(-h, h, d);
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Blur sample loop --------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-vec3 blur_sample(vec2 uv) {
-    vec3 color = kernel[0].x * texture(blur_input_tex, uv).rgb;
-
-    // Per-pair offset in texel space, projected onto the active axis.
-    vec2 axis_step = direction * inv_working_size;
-
-    for (uint i = 1u; i < pair_count; i += 1u) {
-        float w = kernel[i].x;
-        float off = kernel[i].y;
-        vec2 step_uv = off * axis_step;
-        color += w * texture(blur_input_tex, uv - step_uv).rgb;
-        color += w * texture(blur_input_tex, uv + step_uv).rgb;
-    }
-
-    return color;
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Main --------------------------
-// ---------------------------------------------------------------------------------------------------------------------
-
-void main() {
-    if (mode == 0u) {
-        // ---- Mode 0: 1D separable blur (used for both H-pass and V-pass).
-        // gl_FragCoord is in working-res target pixel space; sample the previous working-res
-        // texture along `direction` with the kernel.
-        vec2 uv = gl_FragCoord.xy * inv_working_size;
-        vec3 color = blur_sample(uv);
-        out_color = vec4(color, 1.0);
-        return;
-    }
-
-    // ---- Mode 1: composite per-primitive.
-    // RRect SDF — early discard for fragments well outside the masked region.
-    float d = sdRoundedBox(p_local, f_half_size_ppx, f_radii_ppx);
-    if (d > f_half_feather_ppx) {
-        discard;
-    }
-
-    // fwidth-based normalization for AA (matches main pipeline approach).
-    float grad_magnitude = max(fwidth(d), 1e-6);
-    float d_n = d / grad_magnitude;
-    float h_n = f_half_feather_ppx / grad_magnitude;
-
-    // Sample the fully-blurred working-res texture. gl_FragCoord is full-res; convert to
-    // working-res UV via inv_downsample_factor. No kernel is applied — the H+V blur passes
-    // already produced the final blurred image; this is just an upsample + tint.
-    vec2 uv = (gl_FragCoord.xy * inv_downsample_factor) * inv_working_size;
-    vec3 color = texture(blur_input_tex, uv).rgb;
-
-    // Tint composition: inside the masked region the panel is fully opaque — it completely
-    // hides the original framebuffer content, just like real frosted glass and like iOS
-    // UIBlurEffect / CSS backdrop-filter. f_color.rgb specifies the tint color; f_color.a
-    // specifies the tint *mix strength* (NOT panel opacity). At alpha=0 we see the pure
-    // blur; at alpha=255 we see the blur fully multiplied by the tint color.
-    //
-    // Output is premultiplied to match the ONE, ONE_MINUS_SRC_ALPHA blend state. Coverage
-    // (the SDF mask's edge AA) modulates only the alpha channel, never the panel-vs-source
-    // blend; that way edge pixels still feather correctly while mid-panel pixels stay fully
-    // opaque.
-    mediump vec3 tinted = mix(color, color * f_color.rgb, f_color.a);
-    mediump float coverage = sdf_alpha(d_n, h_n);
-    out_color = vec4(tinted * coverage, coverage);
-}
@@ -1,110 +0,0 @@
-#version 450 core
-
-// Unified backdrop blur vertex shader.
-// Handles both the 1D separable blur passes (fullscreen triangle, mode 0; used for
-// BOTH the H-pass and V-pass) and the composite pass (instanced unit-quad over
-// Gaussian_Blur_Primitive storage buffer, mode 1) for the second PSO of the backdrop bracket.
-// The first PSO (downsample) uses backdrop_fullscreen.vert.
-//
-// No vertex buffer for either mode. Mode 0 uses gl_VertexIndex 0..2 for a single
-// fullscreen triangle; mode 1 uses gl_VertexIndex 0..5 for a unit-quad (two
-// triangles, TRIANGLELIST topology) and gl_InstanceIndex to select the primitive.
-//
-// Mode 0 viewport+scissor are CPU-set per sigma group to the work region (union AABB
-// of that group's backdrop primitives + halo, clamped to swapchain bounds). Mode 1
-// renders into source_texture with the screen-space orthographic projection; the
-// per-primitive bounds drive the quad in screen space.
-//
-// Backdrop primitives have NO rotation — backdrop sampling is in screen space, so
-// a rotated mask over a stationary blur sample would look wrong.
-
-// --- Outputs to fragment shader ---
-// p_local: shape-local position in physical pixels (origin at shape center).
-//          Only meaningful in mode 1 (V-composite). Zero-init for mode 0.
-layout(location = 0) out vec2 p_local;
-// f_color: tint, unpacked from primitive.color. Only meaningful in mode 1.
-layout(location = 1) out mediump vec4 f_color;
-// f_half_size_ppx: RRect half extents in physical pixels (mode 1 only).
-layout(location = 2) flat out vec2 f_half_size_ppx;
-// f_radii_ppx: per-corner radii in physical pixels (mode 1 only).
-layout(location = 3) flat out vec4 f_radii_ppx;
-// f_half_feather_ppx: SDF anti-aliasing feather in physical pixels (mode 1 only).
-layout(location = 4) flat out float f_half_feather_ppx;
-
-// --- Uniforms (set 1) ---
-// Backdrop pipeline's own uniform block — distinct from the main pipeline's
-// Vertex_Uniforms_2D. `mode` selects between H-blur (0) and V-composite (1).
-layout(set = 1, binding = 0) uniform Uniforms {
-    mat4 projection;
-    float dpi_scale;
-    uint mode; // 0 = H-blur, 1 = V-composite
-    vec2 _pad0;
-};
-
-// --- Gaussian blur primitive storage buffer (set 0) ---
-// 48 bytes, std430-natural layout (no implicit padding). vec4 members are
-// front-loaded so their 16-byte alignment is satisfied without holes; the
-// vec2 and scalar tail packs tight to land the struct at a clean 48-byte
-// stride (a multiple of 16, so the array stride needs no rounding either).
-// Field semantics match the CPU-side Gaussian_Blur_Primitive declared in
-// levlib/draw/backdrop.odin; keep both in sync.
-//
-// Gaussian blur primitives are tint-only: outline is intentionally absent. Specialized
-// edge effects (e.g. liquid-glass-style refraction outlines) would be a dedicated
-// primitive type with its own pipeline rather than a flag bit here.
-struct Gaussian_Blur_Primitive {
-    vec4 bounds; //  0-15: min_xy, max_xy (world-space, logical px)
-    vec4 radii_ppx; // 16-31: per-corner radii
-    vec2 half_size_ppx; // 32-39: RRect half extents
-    float half_feather_ppx; // 40-43: SDF anti-aliasing feather
-    uint color; // 44-47: tint, packed RGBA u8x4
-};
-
-layout(std430, set = 0, binding = 0) readonly buffer Gaussian_Blur_Primitives {
-    Gaussian_Blur_Primitive primitives[];
-};
-
-void main() {
-    if (mode == 0u) {
-        // ---- Mode 0: H-blur fullscreen triangle ----
-        // gl_VertexIndex 0 -> ( -1, -1)
-        // gl_VertexIndex 1 -> (  3, -1)
-        // gl_VertexIndex 2 -> ( -1,  3)
-        vec2 ndc = vec2(
-                (gl_VertexIndex == 1) ? 3.0 : -1.0,
-                (gl_VertexIndex == 2) ? 3.0 : -1.0);
-        gl_Position = vec4(ndc, 0.0, 1.0);
-
-        // Mode 0 doesn't read the per-primitive varyings; zero-init for safety.
-        p_local = vec2(0.0);
-        f_color = vec4(0.0);
-        f_half_size_ppx = vec2(0.0);
-        f_radii_ppx = vec4(0.0);
-        f_half_feather_ppx = 0.0;
-    } else {
-        // ---- Mode 1: V-composite instanced unit-quad over Gaussian_Blur_Primitive ----
-        Gaussian_Blur_Primitive p = primitives[gl_InstanceIndex];
-
-        // Unit-quad corners for TRIANGLELIST (2 triangles, 6 vertices):
-        //   index 0 -> (0,0)   index 3 -> (0,1)
-        //   index 1 -> (1,0)   index 4 -> (1,0)
-        //   index 2 -> (0,1)   index 5 -> (1,1)
-        vec2 quad_corners[6] = vec2[6](
-                vec2(0.0, 0.0), vec2(1.0, 0.0), vec2(0.0, 1.0),
-                vec2(0.0, 1.0), vec2(1.0, 0.0), vec2(1.0, 1.0));
-        vec2 corner = quad_corners[gl_VertexIndex];
-
-        vec2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner);
-        vec2 center = 0.5 * (p.bounds.xy + p.bounds.zw);
-
-        // Shape-local position in physical pixels (no rotation for backdrops).
-        p_local = (world_pos - center) * dpi_scale;
-
-        f_color = unpackUnorm4x8(p.color);
-        f_half_size_ppx = p.half_size_ppx;
-        f_radii_ppx = p.radii_ppx;
-        f_half_feather_ppx = p.half_feather_ppx;
-
-        gl_Position = projection * vec4(world_pos * dpi_scale, 0.0, 1.0);
-    }
-}
@@ -1,67 +0,0 @@
-#version 450 core
-
-// Backdrop downsample fragment shader.
-// Reads source_texture (full-resolution snapshot of pre-bracket framebuffer contents) and
-// writes a downsampled copy at factor 1, 2, or 4. The output is the working texture (sized
-// at full swapchain resolution); larger factors only fill a sub-rect of it via the CPU-set
-// viewport. See backdrop.odin for the factor selection table (Flutter-style).
-//
-// Shader paths by factor:
-//
-//   factor=1: identity copy. One bilinear tap aligned to the source pixel center. Useful
-//             when sigma is small enough that any downsample round-trip would visibly soften
-//             the output (Flutter does this for sigma_phys ≤ 4).
-//
-//   factor=2: each output covers a 2×2 source block. Single bilinear tap at the shared
-//             corner reads all 4 source pixels with 0.25 weight.
-//
-//   factor=4: each output covers a 4×4 source block. We use 4 bilinear taps, each at the
-//             shared corner of a 2×2 sub-block. Each tap reads 4 source pixels uniformly;
-//             combined, the 4 taps sample 16 source pixels arranged uniformly across the
-//             block (full coverage at factor=4). The factor>=4 path is structured so the
-//             same shader code would extend to factor=8 (16 pixels of 64) or factor=16 (16
-//             of 256) if the CPU-side cap is ever raised, though the current cap is 4.
-//
-// The viewport+scissor are set by the CPU to limit output to the layer's work region in
-// working-texture coords (work_region_phys / factor), clamped to the texture bounds.
-
-layout(set = 3, binding = 0) uniform Uniforms {
-    vec2 inv_source_size; // 1.0 / source_texture pixel dimensions
-    uint downsample_factor; // 1, 2, or 4 today (CPU-side cap); the else branch would also handle 8 or 16
-    uint _pad0;
-};
-
-layout(set = 2, binding = 0) uniform sampler2D source_tex;
-
-layout(location = 0) out vec4 out_color;
-
-void main() {
-    // Output pixel index (i): gl_FragCoord.xy - 0.5. Source-pixel block top-left for this
-    // output: i * factor. Center of the block: i*factor + factor/2 = gl_FragCoord.xy * factor.
-    vec2 src_block_center = gl_FragCoord.xy * float(downsample_factor);
-
-    if (downsample_factor == 1u) {
-        // Identity copy. UV at src_block_center hits the source pixel center directly.
-        vec2 uv = src_block_center * inv_source_size;
-        out_color = texture(source_tex, uv);
-    } else if (downsample_factor == 2u) {
-        // Single tap at the shared corner of the 2×2 source block; one bilinear sample reads
-        // all 4 source pixels with equal 0.25 weights — uniform 2×2 box filter for free.
-        vec2 uv = src_block_center * inv_source_size;
-        out_color = texture(source_tex, uv);
-    } else {
-        // Four taps at offsets ±(factor/4) from the block center. Each tap lands on a corner
-        // shared by 4 source pixels of a (factor/2)×(factor/2) sub-block (equivalent at the
-        // bilinear level), giving a 4-tap = 16-source-pixel uniform sample of the block.
-        float off = float(downsample_factor) * 0.25;
-        vec2 uv_tl = (src_block_center + vec2(-off, -off)) * inv_source_size;
-        vec2 uv_tr = (src_block_center + vec2(off, -off)) * inv_source_size;
-        vec2 uv_bl = (src_block_center + vec2(-off, off)) * inv_source_size;
-        vec2 uv_br = (src_block_center + vec2(off, off)) * inv_source_size;
-        vec4 c = texture(source_tex, uv_tl)
-                + texture(source_tex, uv_tr)
-                + texture(source_tex, uv_bl)
-                + texture(source_tex, uv_br);
-        out_color = c * 0.25;
-    }
-}
@@ -1,21 +0,0 @@
-#version 450 core
-
-// Fullscreen-triangle vertex shader for the backdrop downsample and H-blur sub-passes.
-// Emits a single triangle covering NDC [-1,1]^2; the rasterizer clips edges outside.
-// No vertex buffer; uses gl_VertexIndex to pick corners.
-//
-// The CPU sets the viewport (and matching scissor) per layer-bracket to limit work to
-// the union AABB of the layer's backdrop primitives, expanded by 3*max_sigma and
-// clamped to swapchain bounds. The fragment shader uses gl_FragCoord (absolute pixel
-// space in the bound target) plus an inv-size uniform to compute its own UVs — see
-// each fragment shader for the per-pass sampling math.
-
-void main() {
-    // gl_VertexIndex 0 -> ( -1, -1)
-    // gl_VertexIndex 1 -> (  3, -1)
-    // gl_VertexIndex 2 -> ( -1,  3)
-    vec2 ndc = vec2(
-            (gl_VertexIndex == 1) ? 3.0 : -1.0,
-            (gl_VertexIndex == 2) ? 3.0 : -1.0);
-    gl_Position = vec4(ndc, 0.0, 1.0);
-}
@@ -1,13 +1,13 @@
 #version 450 core

 // --- Inputs from vertex shader ---
-layout(location = 0) in mediump vec4 f_color;
+layout(location = 0) in vec4 f_color;
 layout(location = 1) in vec2 f_local_or_uv;
 layout(location = 2) in vec4 f_params;
 layout(location = 3) in vec4 f_params2;
-layout(location = 4) flat in uint f_flags;
+layout(location = 4) flat in uint f_kind_flags;
+layout(location = 5) flat in float f_rotation;
 layout(location = 6) flat in vec4 f_uv_rect;
-layout(location = 7) flat in uvec4 f_effects;

 // --- Output ---
 layout(location = 0) out vec4 out_color;
@@ -20,43 +20,77 @@ layout(set = 2, binding = 0) uniform sampler2D tex;
 // All operate in physical pixel space — no dpi_scale needed here.
 // ---------------------------------------------------------------------------

+const float PI = 3.14159265358979;
+
+float sdCircle(vec2 p, float r) {
+    return length(p) - r;
+}
+
 float sdRoundedBox(vec2 p, vec2 b, vec4 r) {
-    vec2 rxy = (p.x > 0.0) ? r.xy : r.zw;
-    float rr = (p.y > 0.0) ? rxy.x : rxy.y;
-    vec2 q = abs(p) - b;
-    if (rr == 0.0) {
-        return max(q.x, q.y);
+    r.xy = (p.x > 0.0) ? r.xy : r.zw;
+    r.x = (p.y > 0.0) ? r.x : r.y;
+    vec2 q = abs(p) - b + r.x;
+    return min(max(q.x, q.y), 0.0) + length(max(q, vec2(0.0))) - r.x;
+}
+
+float sdSegment(vec2 p, vec2 a, vec2 b) {
+    vec2 pa = p - a, ba = b - a;
+    float h = clamp(dot(pa, ba) / dot(ba, ba), 0.0, 1.0);
+    return length(pa - ba * h);
+}
+
+float sdEllipse(vec2 p, vec2 ab) {
+    p = abs(p);
+    if (p.x > p.y) {
+        p = p.yx;
+        ab = ab.yx;
    }
-    q += rr;
-    return min(max(q.x, q.y), 0.0) + length(max(q, vec2(0.0))) - rr;
+    float l = ab.y * ab.y - ab.x * ab.x;
+    float m = ab.x * p.x / l;
+    float m2 = m * m;
+    float n = ab.y * p.y / l;
+    float n2 = n * n;
+    float c = (m2 + n2 - 1.0) / 3.0;
+    float c3 = c * c * c;
+    float q = c3 + m2 * n2 * 2.0;
+    float d = c3 + m2 * n2;
+    float g = m + m * n2;
+    float co;
+    if (d < 0.0) {
+        float h = acos(q / c3) / 3.0;
+        float s = cos(h);
+        float t = sin(h) * sqrt(3.0);
+        float rx = sqrt(-c * (s + t + 2.0) + m2);
+        float ry = sqrt(-c * (s - t + 2.0) + m2);
+        co = (ry + sign(l) * rx + abs(g) / (rx * ry) - m) / 2.0;
+    } else {
+        float h = 2.0 * m * n * sqrt(d);
+        float s = sign(q + h) * pow(abs(q + h), 1.0 / 3.0);
+        float u = sign(q - h) * pow(abs(q - h), 1.0 / 3.0);
+        float rx = -s - u - c * 4.0 + 2.0 * m2;
+        float ry = (s - u) * sqrt(3.0);
+        float rm = sqrt(rx * rx + ry * ry);
+        co = (ry / sqrt(rm - rx) + 2.0 * g / rm - m) / 2.0;
+    }
+    vec2 r = ab * vec2(co, sqrt(1.0 - co * co));
+    return length(r - p) * sign(p.y - r.y);
 }

-// Approximate ellipse SDF — fast, suitable for UI, NOT a true Euclidean distance.
-float sdEllipseApprox(vec2 p, vec2 ab) {
-    float k0 = length(p / ab);
-    float k1 = length(p / (ab * ab));
-    return k0 * (k0 - 1.0) / k1;
+float sdf_alpha(float d, float soft) {
+    return 1.0 - smoothstep(-soft, soft, d);
 }

-// Regular N-gon SDF (Inigo Quilez).
-float sdRegularPolygon(vec2 p, float r, float n) {
-    float an = 3.141592653589793 / n;
-    float bn = mod(atan(p.y, p.x), 2.0 * an) - an;
-    return length(p) * cos(bn) - r;
+float sdf_stroke(float d, float stroke_width) {
+    return abs(d) - stroke_width * 0.5;
 }

-// Coverage from SDF distance using half-feather width (feather_ppx * 0.5, pre-computed on CPU).
-// Produces a symmetric transition centered on d=0: smoothstep(-h, h, d).
-float sdf_alpha(float d, float h) {
-    return 1.0 - smoothstep(-h, h, d);
-}
-
-// ---------------------------------------------------------------------------
-// Gradient helpers
-// ---------------------------------------------------------------------------
-
-mediump vec4 gradient_2color(mediump vec4 start_color, mediump vec4 end_color, mediump float t) {
-    return mix(start_color, end_color, clamp(t, 0.0, 1.0));
+// Rotate a 2D point by the negative of the given angle (inverse rotation).
+// Used to rotate the sampling frame opposite to the shape's rotation so that
+// the SDF evaluates correctly for the rotated shape.
+vec2 apply_rotation(vec2 p, float angle) {
+    float cr = cos(-angle);
+    float sr = sin(-angle);
+    return mat2(cr, sr, -sr, cr) * p;
 }

 // ---------------------------------------------------------------------------
@@ -64,128 +98,131 @@ mediump vec4 gradient_2color(mediump vec4 start_color, mediump vec4 end_color, m
 // ---------------------------------------------------------------------------

 void main() {
-    uint kind = f_flags & 0xFFu;
-    uint flags = (f_flags >> 8u) & 0xFFu;
+    uint kind = f_kind_flags & 0xFFu;
+    uint flags = (f_kind_flags >> 8u) & 0xFFu;

-    // Kind 0: Tessellated path — vertex colors arrive premultiplied from CPU.
-    // Texture samples are straight-alpha (SDL_ttf glyph atlas: rgb=1, a=coverage;
-    // or the 1x1 white texture: rgba=1). Convert to premultiplied form so the
-    // blend state (ONE, ONE_MINUS_SRC_ALPHA) composites correctly.
+    // -----------------------------------------------------------------------
+    // Kind 0: Tessellated path. Texture multiply for text atlas,
+    //         white pixel for solid shapes.
+    // -----------------------------------------------------------------------
    if (kind == 0u) {
-        vec4 t = texture(tex, f_local_or_uv);
-        t.rgb *= t.a;
-        out_color = f_color * t;
+        out_color = f_color * texture(tex, f_local_or_uv);
        return;
    }

-    // SDF path — dispatch on kind
+    // -----------------------------------------------------------------------
+    // SDF path. f_local_or_uv = shape-centered position in physical pixels.
+    // All dimensional params are already in physical pixels (CPU pre-scaled).
+    // -----------------------------------------------------------------------
    float d = 1e30;
-    float h = 0.5; // half-feather width (physical px); overwritten per shape kind
-    vec2 half_size_ppx = f_params.xy; // used by RRect and as reference size for gradients
-
-    vec2 p_local_ppx = f_local_or_uv; // arrives rotated; vertex shader handled .Rotated
+    float soft = 1.0;

    if (kind == 1u) {
-        // RRect — half_feather_ppx in params2.z
-        vec4 corner_radii_ppx = vec4(f_params.zw, f_params2.xy);
-        h = f_params2.z;
-        d = sdRoundedBox(p_local_ppx, half_size_ppx, corner_radii_ppx);
+        // RRect: rounded box
+        vec2 b = f_params.xy; // half_size (phys px)
+        vec4 r = vec4(f_params.zw, f_params2.xy); // corner radii: tr, br, tl, bl
+        soft = max(f_params2.z, 1.0);
+        float stroke_px = f_params2.w;
+
+        vec2 p_local = f_local_or_uv;
+        if (f_rotation != 0.0) {
+            p_local = apply_rotation(p_local, f_rotation);
+        }
+
+        d = sdRoundedBox(p_local, b, r);
+        if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
+
+        // Texture sampling for textured SDF primitives
+        vec4 shape_color = f_color;
+        if ((flags & 2u) != 0u) {
+            // Compute UV from local position and half_size
+            vec2 p_for_uv = f_local_or_uv;
+            if (f_rotation != 0.0) {
+                p_for_uv = apply_rotation(p_for_uv, f_rotation);
+            }
+            vec2 local_uv = p_for_uv / b * 0.5 + 0.5;
+            vec2 uv = mix(f_uv_rect.xy, f_uv_rect.zw, local_uv);
+            shape_color *= texture(tex, uv);
+        }
+
+        float alpha = sdf_alpha(d, soft);
+        out_color = vec4(shape_color.rgb, shape_color.a * alpha);
+        return;
    }
    else if (kind == 2u) {
-        // NGon — half_feather_ppx in params.z
-        float radius_ppx = f_params.x;
-        float sides = f_params.y;
-        h = f_params.z;
-        d = sdRegularPolygon(p_local_ppx, radius_ppx, sides);
-        half_size_ppx = vec2(radius_ppx); // for gradient UV computation
+        // Circle — rotationally symmetric, no rotation needed
+        float radius = f_params.x;
+        soft = max(f_params.y, 1.0);
+        float stroke_px = f_params.z;
+
+        d = sdCircle(f_local_or_uv, radius);
+        if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
    }
    else if (kind == 3u) {
-        // Ellipse — half_feather_ppx in params.z
-        vec2 radii_ppx = f_params.xy;
-        h = f_params.z;
-        d = sdEllipseApprox(p_local_ppx, radii_ppx);
-        half_size_ppx = radii_ppx; // for gradient UV computation
+        // Ellipse
+        vec2 ab = f_params.xy;
+        soft = max(f_params.z, 1.0);
+        float stroke_px = f_params.w;
+
+        vec2 p_local = f_local_or_uv;
+        if (f_rotation != 0.0) {
+            p_local = apply_rotation(p_local, f_rotation);
+        }
+
+        d = sdEllipse(p_local, ab);
+        if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
    }
    else if (kind == 4u) {
-        // Ring_Arc — half_feather_ppx in params2.z
-        // Arc mode from flag bits 5-6: 0 = full, 1 = narrow (≤π), 2 = wide (>π)
-        float inner_radius_ppx = f_params.x;
-        float outer_radius_ppx = f_params.y;
-        vec2 n_start = f_params.zw;
-        vec2 n_end = f_params2.xy;
-        uint arc_bits = (flags >> 5u) & 3u;
+        // Segment (capsule line) — no rotation (excluded)
+        vec2 a = f_params.xy; // already in local physical pixels
+        vec2 b = f_params.zw;
+        float width = f_params2.x;
+        soft = max(f_params2.y, 1.0);

-        h = f_params2.z;
+        d = sdSegment(f_local_or_uv, a, b) - width * 0.5;
+    }
+    else if (kind == 5u) {
+        // Ring / Arc — rotation handled by CPU angle offset, no shader rotation
+        float inner = f_params.x;
+        float outer = f_params.y;
+        float start_rad = f_params.z;
+        float end_rad = f_params.w;
+        soft = max(f_params2.x, 1.0);

-        float r = length(p_local_ppx);
-        d = max(inner_radius_ppx - r, r - outer_radius_ppx);
+        float r = length(f_local_or_uv);
+        float d_ring = max(inner - r, r - outer);

-        if (arc_bits != 0u) {
-            float d_start = dot(p_local_ppx, n_start);
-            float d_end = dot(p_local_ppx, n_end);
-            float d_wedge = (arc_bits == 1u)
-                ? max(d_start, d_end) // arc ≤ π: intersect half-planes
-                : min(d_start, d_end); // arc > π: union half-planes
-            d = max(d, d_wedge);
-        }
+        // Angular clip
+        float angle = atan(f_local_or_uv.y, f_local_or_uv.x);
+        if (angle < 0.0) angle += 2.0 * PI;
+        float ang_start = mod(start_rad, 2.0 * PI);
+        float ang_end = mod(end_rad, 2.0 * PI);

-        half_size_ppx = vec2(outer_radius_ppx); // for gradient UV computation
+        float in_arc = (ang_end > ang_start)
+            ? ((angle >= ang_start && angle <= ang_end) ? 1.0 : 0.0) : ((angle >= ang_start || angle <= ang_end) ? 1.0 : 0.0);
+        if (abs(ang_end - ang_start) >= 2.0 * PI - 0.001) in_arc = 1.0;
+
+        d = in_arc > 0.5 ? d_ring : 1e30;
+    }
+    else if (kind == 6u) {
+        // Regular N-gon — has its own rotation in params, no Primitive.rotation used
+        float radius = f_params.x;
+        float rotation = f_params.y;
+        float sides = f_params.z;
+        soft = max(f_params.w, 1.0);
+        float stroke_px = f_params2.x;
+
+        vec2 p = f_local_or_uv;
+        float c = cos(rotation), s = sin(rotation);
+        p = mat2(c, -s, s, c) * p;
+
+        float an = PI / sides;
+        float bn = mod(atan(p.y, p.x), 2.0 * an) - an;
+        d = length(p) * cos(bn) - radius;
+
+        if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
    }

-    // --- fwidth-based normalization for correct AA and stroke width ---
-    float grad_magnitude = max(fwidth(d), 1e-6);
-    d = d / grad_magnitude;
-    h = h / grad_magnitude;
-
-    // --- Determine shape color based on flags ---
-    mediump vec4 shape_color;
-    if ((flags & 2u) != 0u) {
-        // Gradient active (bit 1)
-        mediump vec4 gradient_start = f_color;
-        mediump vec4 gradient_end = unpackUnorm4x8(f_effects.x);
-
-        if ((flags & 4u) != 0u) {
-            // Radial gradient (bit 2): t from distance to center
-            mediump float t = length(p_local_ppx / half_size_ppx);
-            shape_color = gradient_2color(gradient_start, gradient_end, t);
-        } else {
-            // Linear gradient: direction pre-computed on CPU as (cos, sin) f16 pair
-            vec2 direction = unpackHalf2x16(f_effects.z);
-            mediump float t = dot(p_local_ppx / half_size_ppx, direction) * 0.5 + 0.5;
-            shape_color = gradient_2color(gradient_start, gradient_end, t);
-        }
-    } else if ((flags & 1u) != 0u) {
-        // Textured (bit 0)
-        vec4 uv_rect = f_uv_rect;
-        vec2 local_uv = p_local_ppx / half_size_ppx * 0.5 + 0.5;
-        vec2 uv = mix(uv_rect.xy, uv_rect.zw, local_uv);
-        shape_color = f_color * texture(tex, uv);
-    } else {
-        // Solid color
-        shape_color = f_color;
-    }
-
-    // --- Outline (bit 3) — outer outline via premultiplied compositing ---
-    // The outline band sits OUTSIDE the original shape boundary (d=0 to d=+ol_width).
-    // fill_cov covers the interior with AA at d=0; total_cov covers interior+outline with
-    // AA at d=ol_width. The outline band's coverage is total_cov - fill_cov.
-    // Output is premultiplied: blend state is ONE, ONE_MINUS_SRC_ALPHA.
-    if ((flags & 8u) != 0u) {
-        mediump vec4 ol_color = unpackUnorm4x8(f_effects.y);
-        // Outline width in f_effects.w (low f16 half)
-        float ol_width = unpackHalf2x16(f_effects.w).x / grad_magnitude;
-
-        float fill_cov = sdf_alpha(d, h);
-        float total_cov = sdf_alpha(d - ol_width, h);
-        float outline_cov = max(total_cov - fill_cov, 0.0);
-
-        // Premultiplied output — no divide, no threshold check
-        vec3 rgb_pm = shape_color.rgb * shape_color.a * fill_cov
-                + ol_color.rgb * ol_color.a * outline_cov;
-        float alpha_pm = shape_color.a * fill_cov + ol_color.a * outline_cov;
-        out_color = vec4(rgb_pm, alpha_pm);
-    } else {
-        mediump float alpha = sdf_alpha(d, h);
-        out_color = vec4(shape_color.rgb * shape_color.a * alpha, shape_color.a * alpha);
-    }
+    float alpha = sdf_alpha(d, soft);
+    out_color = vec4(f_color.rgb, f_color.a * alpha);
 }
@@ -1,107 +1,71 @@
 #version 450 core

-// ---------- Vertex attributes (used in all modes) ----------
+// ---------- Vertex attributes (used in both modes) ----------
 layout(location = 0) in vec2 v_position;
 layout(location = 1) in vec2 v_uv;
 layout(location = 2) in vec4 v_color;

 // ---------- Outputs to fragment shader ----------
-layout(location = 0) out mediump vec4 f_color;
+layout(location = 0) out vec4 f_color;
 layout(location = 1) out vec2 f_local_or_uv;
 layout(location = 2) out vec4 f_params;
 layout(location = 3) out vec4 f_params2;
-layout(location = 4) flat out uint f_flags;
-
+layout(location = 4) flat out uint f_kind_flags;
+layout(location = 5) flat out float f_rotation;
 layout(location = 6) flat out vec4 f_uv_rect;
-layout(location = 7) flat out uvec4 f_effects;

 // ---------- Uniforms (single block — avoids spirv-cross reordering on Metal) ----------
-// Mode values mirror Core_2D_Mode in core_2d.odin:
-//   0 = Tessellated  v_position is in logical pixels; shader scales by dpi_scale.
-//   1 = SDF          v_position is a unit-quad corner; world-space comes from
-//                    primitives[gl_InstanceIndex].bounds (logical px). Shader
-//                    scales by dpi_scale.
-//   2 = Text         v_position is in *physical* pixels already (the CPU baked
-//                    the anchor snap and SDL_ttf glyph offsets, both physical).
-//                    Shader must NOT rescale.
 layout(set = 1, binding = 0) uniform Uniforms {
    mat4 projection;
    float dpi_scale;
-    uint mode;
+    uint mode; // 0 = tessellated, 1 = SDF
 };

 // ---------- SDF primitive storage buffer ----------
-// Mirrors the CPU-side Core_2D_Primitive in core_2d.odin. Named with the
-// subsystem prefix so a project-wide grep on the type name matches both the GLSL
-// declaration and the Odin declaration.
-struct Core_2D_Primitive {
-    vec4 bounds; // 0-15
-    uint color; // 16-19
-    uint flags; // 20-23
-    uint rotation_sc; // 24-27: packed f16 pair (sin, cos)
-    float _pad; // 28-31
-    vec4 params; // 32-47
-    vec4 params2; // 48-63
-    vec4 uv_rect; // 64-79: texture UV coordinates (read when .Textured)
-    uvec4 effects; // 80-95: gradient/outline parameters (read when .Gradient/.Outline)
+struct Primitive {
+    vec4 bounds; // 0-15:  min_x, min_y, max_x, max_y
+    uint color; // 16-19: packed u8x4 (unpack with unpackUnorm4x8)
+    uint kind_flags; // 20-23: kind | (flags << 8)
+    float rotation; // 24-27: shader self-rotation in radians
+    float _pad; // 28-31: alignment padding
+    vec4 params; // 32-47: shape params part 1
+    vec4 params2; // 48-63: shape params part 2
+    vec4 uv_rect; // 64-79: u_min, v_min, u_max, v_max
 };

-layout(std430, set = 0, binding = 0) readonly buffer Core_2D_Primitives {
-    Core_2D_Primitive primitives[];
+layout(std430, set = 0, binding = 0) readonly buffer Primitives {
+    Primitive primitives[];
 };

 // ---------- Entry point ----------
 void main() {
-    if (mode == 1u) {
+    if (mode == 0u) {
+        // ---- Mode 0: Tessellated (legacy) ----
+        f_color = v_color;
+        f_local_or_uv = v_uv;
+        f_params = vec4(0.0);
+        f_params2 = vec4(0.0);
+        f_kind_flags = 0u;
+        f_rotation = 0.0;
+        f_uv_rect = vec4(0.0, 0.0, 1.0, 1.0);
+
+        gl_Position = projection * vec4(v_position * dpi_scale, 0.0, 1.0);
+    } else {
        // ---- Mode 1: SDF instanced quads ----
-        Core_2D_Primitive p = primitives[gl_InstanceIndex];
+        Primitive p = primitives[gl_InstanceIndex];

        vec2 corner = v_position; // unit quad corners: (0,0)-(1,1)
        vec2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner);
        vec2 center = 0.5 * (p.bounds.xy + p.bounds.zw);

-        // Compute shape-local position. Apply inverse rotation here in the vertex
-        // shader; the rasterizer interpolates the rotated values across the quad,
-        // which is mathematically equivalent to per-fragment rotation under 2D ortho
-        // projection. Frees one fragment-shader varying and per-pixel rotation math.
-        vec2 local = (world_pos - center) * dpi_scale;
-        uint flags = (p.flags >> 8u) & 0xFFu;
-        if ((flags & 16u) != 0u) {
-            // Rotated flag (bit 4); rotation_sc holds packed f16 (sin, cos).
-            // Inverse rotation matrix R(-angle) = [[cos, sin], [-sin, cos]].
-            vec2 sc = unpackHalf2x16(p.rotation_sc);
-            local = vec2(sc.y * local.x + sc.x * local.y,
-                    -sc.x * local.x + sc.y * local.y);
-        }
-
        f_color = unpackUnorm4x8(p.color);
-        f_local_or_uv = local; // shape-local physical pixels (rotated if .Rotated set)
+        f_local_or_uv = (world_pos - center) * dpi_scale; // shape-centered physical pixels
        f_params = p.params;
        f_params2 = p.params2;
-        f_flags = p.flags;
+        f_kind_flags = p.kind_flags;
+        f_rotation = p.rotation;
        f_uv_rect = p.uv_rect;
-        f_effects = p.effects;

        gl_Position = projection * vec4(world_pos * dpi_scale, 0.0, 1.0);
-    } else {
-        // ---- Mode 0 (Tessellated) and Mode 2 (Text) ----
-        // Both feed the raw-vertex pipeline (kind 0 in the fragment shader).
-        // They differ only in what coord space `v_position` is in:
-        //   Mode 0 — logical pixels, scale here by dpi_scale.
-        //   Mode 2 — physical pixels (CPU pre-scaled and snapped to integer
-        //            physical pixels for atlas-aligned bilinear sampling).
-        //            Do NOT rescale.
-        // `mode` is uniform across the draw call, so the select compiles to a
-        // uniform-controlled branch with no SIMT divergence cost.
-        f_color = v_color;
-        f_local_or_uv = v_uv;
-        f_params = vec4(0.0);
-        f_params2 = vec4(0.0);
-        f_flags = 0u;
-        f_uv_rect = vec4(0.0);
-        f_effects = uvec4(0);
-
-        vec2 pos = (mode == 2u) ? v_position : (v_position * dpi_scale);
-        gl_Position = projection * vec4(pos, 0.0, 1.0);
    }
 }
@@ -1,369 +0,0 @@
-package tess
-
-import "core:math"
-
-import draw ".."
-
-//INTERNAL
-SMOOTH_CIRCLE_ERROR_RATE :: 0.1
-
-auto_segments :: proc(radius: f32, arc_degrees: f32) -> int {
-	if radius <= 0 do return 4
-	phys_radius := radius * draw.GLOB.dpi_scaling
-	acos_arg := clamp(2 * math.pow(1 - SMOOTH_CIRCLE_ERROR_RATE / phys_radius, 2) - 1, -1, 1)
-	theta := math.acos(acos_arg)
-	if theta <= 0 do return 4
-	full_circle_segments := int(math.ceil(2 * math.PI / theta))
-	segments := int(f32(full_circle_segments) * arc_degrees / 360.0)
-	min_segments := max(int(math.ceil(f64(arc_degrees / 90.0))), 4)
-	return max(segments, min_segments)
-}
-
-// ----- Internal helpers -----
-
-// Premultiplies the color before storing it on the vertex (see draw package doc's
-// "Color and blending" section for why).
-//INTERNAL
-solid_vertex :: proc(position: draw.Vec2, color: draw.Color) -> draw.Vertex_2D {
-	return draw.Vertex_2D{position = position, color = draw.premultiply_color(color)}
-}
-
-//INTERNAL
-emit_rectangle :: proc(
-	x, y, width, height: f32,
-	color: draw.Color,
-	vertices: []draw.Vertex_2D,
-	offset: int,
-) {
-	vertices[offset + 0] = solid_vertex({x, y}, color)
-	vertices[offset + 1] = solid_vertex({x + width, y}, color)
-	vertices[offset + 2] = solid_vertex({x + width, y + height}, color)
-	vertices[offset + 3] = solid_vertex({x, y}, color)
-	vertices[offset + 4] = solid_vertex({x + width, y + height}, color)
-	vertices[offset + 5] = solid_vertex({x, y + height}, color)
-}
-
-//INTERNAL
-extrude_line :: proc(
-	start, end_pos: draw.Vec2,
-	thickness: f32,
-	color: draw.Color,
-	vertices: []draw.Vertex_2D,
-	offset: int,
-) -> int {
-	direction := end_pos - start
-	delta_x := direction[0]
-	delta_y := direction[1]
-	length := math.sqrt(delta_x * delta_x + delta_y * delta_y)
-	if length < 0.0001 do return 0
-
-	scale := thickness / (2 * length)
-	perpendicular := draw.Vec2{-delta_y * scale, delta_x * scale}
-
-	p0 := start + perpendicular
-	p1 := start - perpendicular
-	p2 := end_pos - perpendicular
-	p3 := end_pos + perpendicular
-
-	vertices[offset + 0] = solid_vertex(p0, color)
-	vertices[offset + 1] = solid_vertex(p1, color)
-	vertices[offset + 2] = solid_vertex(p2, color)
-	vertices[offset + 3] = solid_vertex(p0, color)
-	vertices[offset + 4] = solid_vertex(p2, color)
-	vertices[offset + 5] = solid_vertex(p3, color)
-
-	return 6
-}
-
-// ----- Public draw -----
-
-pixel :: proc(layer: ^draw.Layer, pos: draw.Vec2, color: draw.Color) {
-	vertices: [6]draw.Vertex_2D
-	emit_rectangle(pos[0], pos[1], 1, 1, color, vertices[:], 0)
-	draw.prepare_shape(layer, vertices[:])
-}
-
-triangle :: proc(
-	layer: ^draw.Layer,
-	v1, v2, v3: draw.Vec2,
-	color: draw.Color,
-	origin: draw.Vec2 = {},
-	rotation: f32 = 0,
-) {
-	if !draw.needs_transform(origin, rotation) {
-		vertices := [3]draw.Vertex_2D{solid_vertex(v1, color), solid_vertex(v2, color), solid_vertex(v3, color)}
-		draw.prepare_shape(layer, vertices[:])
-		return
-	}
-	bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
-	transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
-	local_v1 := v1 - bounds_min
-	local_v2 := v2 - bounds_min
-	local_v3 := v3 - bounds_min
-	vertices := [3]draw.Vertex_2D {
-		solid_vertex(draw.apply_transform(transform, local_v1), color),
-		solid_vertex(draw.apply_transform(transform, local_v2), color),
-		solid_vertex(draw.apply_transform(transform, local_v3), color),
-	}
-	draw.prepare_shape(layer, vertices[:])
-}
-
-// Draw an anti-aliased triangle via extruded edge quads plus corner fan caps.
-// Interior vertices get the full premultiplied color; outer fringe vertices get BLANK (0,0,0,0).
-// The rasterizer linearly interpolates between them, producing a smooth ~1-physical-pixel AA band.
-// `aa_ppx` controls the extrusion width in *physical* pixels (default 1.0). The CPU divides by
-// `dpi_scaling` here so the vertex stream stays in logical px; the mode-0 vertex shader scales
-// back to physical at draw time. Net AA band is ~aa_ppx physical pixels regardless of DPI.
-//
-// Topology: 3 interior verts + 6 edge-quad triangles (×3 verts) + 3 corner-fan triangles (×3 verts)
-// = 30 verts total. The corner fans plug the wedge gaps that would otherwise appear between
-// adjacent edge fringes at each triangle vertex; without them, sharp corners show a small
-// background-colored crescent. Apex vertex is full color, both fringe verts are BLANK, so the
-// fan rasterizes as an alpha-falloff triangle that blends visually into the adjacent edge bands.
-triangle_aa :: proc(
-	layer: ^draw.Layer,
-	v1, v2, v3: draw.Vec2,
-	color: draw.Color,
-	aa_ppx: f32 = draw.DFT_FEATHER_PPX,
-	origin: draw.Vec2 = {},
-	rotation: f32 = 0,
-) {
-	// Apply rotation if needed, then work in world space.
-	p0, p1, p2: draw.Vec2
-	if !draw.needs_transform(origin, rotation) {
-		p0 = v1
-		p1 = v2
-		p2 = v3
-	} else {
-		bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
-		transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
-		p0 = draw.apply_transform(transform, v1 - bounds_min)
-		p1 = draw.apply_transform(transform, v2 - bounds_min)
-		p2 = draw.apply_transform(transform, v3 - bounds_min)
-	}
-
-	// Compute outward edge normals (unit length, pointing away from triangle interior).
-	// Winding-independent: we check against the centroid to ensure normals point outward.
-	centroid_x := (p0.x + p1.x + p2.x) / 3.0
-	centroid_y := (p0.y + p1.y + p2.y) / 3.0
-
-	edge_normal :: proc(edge_start, edge_end: draw.Vec2, centroid_x, centroid_y: f32) -> draw.Vec2 {
-		delta_x := edge_end.x - edge_start.x
-		delta_y := edge_end.y - edge_start.y
-		length := math.sqrt(delta_x * delta_x + delta_y * delta_y)
-		if length < 0.0001 do return {0, 0}
-		inverse_length := 1.0 / length
-		// Perpendicular: (-delta_y, delta_x) normalized
-		normal_x := -delta_y * inverse_length
-		normal_y := delta_x * inverse_length
-		// Midpoint of the edge
-		midpoint_x := (edge_start.x + edge_end.x) * 0.5
-		midpoint_y := (edge_start.y + edge_end.y) * 0.5
-		// If normal points toward centroid, flip it
-		if normal_x * (centroid_x - midpoint_x) + normal_y * (centroid_y - midpoint_y) > 0 {
-			normal_x = -normal_x
-			normal_y = -normal_y
-		}
-		return {normal_x, normal_y}
-	}
-
-	normal_01 := edge_normal(p0, p1, centroid_x, centroid_y)
-	normal_12 := edge_normal(p1, p2, centroid_x, centroid_y)
-	normal_20 := edge_normal(p2, p0, centroid_x, centroid_y)
-
-	// aa_ppx is in physical pixels; divide by dpi_scaling so the extrusion lives in logical-pixel
-	// space (the mode-0 vertex shader will scale back to physical at draw time).
-	extrude_distance := aa_ppx / draw.GLOB.dpi_scaling
-
-	// Outer fringe vertices: each edge vertex extruded outward
-	outer_0_01 := p0 + normal_01 * extrude_distance
-	outer_1_01 := p1 + normal_01 * extrude_distance
-	outer_1_12 := p1 + normal_12 * extrude_distance
-	outer_2_12 := p2 + normal_12 * extrude_distance
-	outer_2_20 := p2 + normal_20 * extrude_distance
-	outer_0_20 := p0 + normal_20 * extrude_distance
-
-	// Premultiplied interior color (solid_vertex does premul internally).
-	// Outer fringe is BLANK = {0,0,0,0} which is already premul.
-	transparent := draw.BLANK
-
-	// 3 interior + 6 edge-quad tris (×3 verts) + 3 corner-fan tris (×3 verts) = 30 vertices
-	vertices: [30]draw.Vertex_2D
-
-	// Interior triangle
-	vertices[0] = solid_vertex(p0, color)
-	vertices[1] = solid_vertex(p1, color)
-	vertices[2] = solid_vertex(p2, color)
-
-	// Edge quad: p0→p1 (2 triangles)
-	vertices[3] = solid_vertex(p0, color)
-	vertices[4] = solid_vertex(p1, color)
-	vertices[5] = solid_vertex(outer_1_01, transparent)
-	vertices[6] = solid_vertex(p0, color)
-	vertices[7] = solid_vertex(outer_1_01, transparent)
-	vertices[8] = solid_vertex(outer_0_01, transparent)
-
-	// Edge quad: p1→p2 (2 triangles)
-	vertices[9] = solid_vertex(p1, color)
-	vertices[10] = solid_vertex(p2, color)
-	vertices[11] = solid_vertex(outer_2_12, transparent)
-	vertices[12] = solid_vertex(p1, color)
-	vertices[13] = solid_vertex(outer_2_12, transparent)
-	vertices[14] = solid_vertex(outer_1_12, transparent)
-
-	// Edge quad: p2→p0 (2 triangles)
-	vertices[15] = solid_vertex(p2, color)
-	vertices[16] = solid_vertex(p0, color)
-	vertices[17] = solid_vertex(outer_0_20, transparent)
-	vertices[18] = solid_vertex(p2, color)
-	vertices[19] = solid_vertex(outer_0_20, transparent)
-	vertices[20] = solid_vertex(outer_2_20, transparent)
-
-	// Corner fan caps: each fills the wedge gap between the two edge fringes meeting at a
-	// triangle vertex. Apex is full color; both fringe verts are BLANK, so the rasterizer
-	// produces a smooth alpha falloff across the wedge (matches the adjacent edge-band
-	// gradients at the shared edges, so the seams are invisible). Vertex order per fan:
-	// [apex, fringe-from-incoming-edge, fringe-from-outgoing-edge].
-
-	// Cap at p0 (between incoming edge p2→p0 and outgoing edge p0→p1)
-	vertices[21] = solid_vertex(p0, color)
-	vertices[22] = solid_vertex(outer_0_20, transparent)
-	vertices[23] = solid_vertex(outer_0_01, transparent)
-
-	// Cap at p1 (between incoming edge p0→p1 and outgoing edge p1→p2)
-	vertices[24] = solid_vertex(p1, color)
-	vertices[25] = solid_vertex(outer_1_01, transparent)
-	vertices[26] = solid_vertex(outer_1_12, transparent)
-
-	// Cap at p2 (between incoming edge p1→p2 and outgoing edge p2→p0)
-	vertices[27] = solid_vertex(p2, color)
-	vertices[28] = solid_vertex(outer_2_12, transparent)
-	vertices[29] = solid_vertex(outer_2_20, transparent)
-
-	draw.prepare_shape(layer, vertices[:])
-}
-
-triangle_lines :: proc(
-	layer: ^draw.Layer,
-	v1, v2, v3: draw.Vec2,
-	color: draw.Color,
-	thickness: f32 = draw.DFT_STROKE_THICKNESS,
-	origin: draw.Vec2 = {},
-	rotation: f32 = 0,
-	temp_allocator := context.temp_allocator,
-) {
-	vertices := make([]draw.Vertex_2D, 18, temp_allocator)
-	defer delete(vertices, temp_allocator)
-	write_offset := 0
-
-	if !draw.needs_transform(origin, rotation) {
-		write_offset += extrude_line(v1, v2, thickness, color, vertices, write_offset)
-		write_offset += extrude_line(v2, v3, thickness, color, vertices, write_offset)
-		write_offset += extrude_line(v3, v1, thickness, color, vertices, write_offset)
-	} else {
-		bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
-		transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
-		transformed_v1 := draw.apply_transform(transform, v1 - bounds_min)
-		transformed_v2 := draw.apply_transform(transform, v2 - bounds_min)
-		transformed_v3 := draw.apply_transform(transform, v3 - bounds_min)
-		write_offset += extrude_line(transformed_v1, transformed_v2, thickness, color, vertices, write_offset)
-		write_offset += extrude_line(transformed_v2, transformed_v3, thickness, color, vertices, write_offset)
-		write_offset += extrude_line(transformed_v3, transformed_v1, thickness, color, vertices, write_offset)
-	}
-
-	if write_offset > 0 {
-		draw.prepare_shape(layer, vertices[:write_offset])
-	}
-}
-
-triangle_fan :: proc(
-	layer: ^draw.Layer,
-	points: []draw.Vec2,
-	color: draw.Color,
-	origin: draw.Vec2 = {},
-	rotation: f32 = 0,
-	temp_allocator := context.temp_allocator,
-) {
-	if len(points) < 3 do return
-
-	triangle_count := len(points) - 2
-	vertex_count := triangle_count * 3
-	vertices := make([]draw.Vertex_2D, vertex_count, temp_allocator)
-	defer delete(vertices, temp_allocator)
-
-	if !draw.needs_transform(origin, rotation) {
-		for i in 1 ..< len(points) - 1 {
-			idx := (i - 1) * 3
-			vertices[idx + 0] = solid_vertex(points[0], color)
-			vertices[idx + 1] = solid_vertex(points[i], color)
-			vertices[idx + 2] = solid_vertex(points[i + 1], color)
-		}
-	} else {
-		bounds_min := draw.Vec2{max(f32), max(f32)}
-		for point in points {
-			bounds_min.x = min(bounds_min.x, point.x)
-			bounds_min.y = min(bounds_min.y, point.y)
-		}
-		transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
-		for i in 1 ..< len(points) - 1 {
-			idx := (i - 1) * 3
-			vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[0] - bounds_min), color)
-			vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
-			vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
-		}
-	}
-
-	draw.prepare_shape(layer, vertices)
-}
-
-triangle_strip :: proc(
-	layer: ^draw.Layer,
-	points: []draw.Vec2,
-	color: draw.Color,
-	origin: draw.Vec2 = {},
-	rotation: f32 = 0,
-	temp_allocator := context.temp_allocator,
-) {
-	if len(points) < 3 do return
-
-	triangle_count := len(points) - 2
-	vertex_count := triangle_count * 3
-	vertices := make([]draw.Vertex_2D, vertex_count, temp_allocator)
-	defer delete(vertices, temp_allocator)
-
-	if !draw.needs_transform(origin, rotation) {
-		for i in 0 ..< triangle_count {
-			idx := i * 3
-			if i % 2 == 0 {
-				vertices[idx + 0] = solid_vertex(points[i], color)
-				vertices[idx + 1] = solid_vertex(points[i + 1], color)
-				vertices[idx + 2] = solid_vertex(points[i + 2], color)
-			} else {
-				vertices[idx + 0] = solid_vertex(points[i + 1], color)
-				vertices[idx + 1] = solid_vertex(points[i], color)
-				vertices[idx + 2] = solid_vertex(points[i + 2], color)
-			}
-		}
-	} else {
-		bounds_min := draw.Vec2{max(f32), max(f32)}
-		for point in points {
-			bounds_min.x = min(bounds_min.x, point.x)
-			bounds_min.y = min(bounds_min.y, point.y)
-		}
-		transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
-		for i in 0 ..< triangle_count {
-			idx := i * 3
-			if i % 2 == 0 {
-				vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
-				vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
-				vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 2] - bounds_min), color)
-			} else {
-				vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
-				vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
-				vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 2] - bounds_min), color)
-			}
-		}
-	}
-
-	draw.prepare_shape(layer, vertices)
-}
@@ -8,25 +8,21 @@ import sdl_ttf "vendor:sdl3/ttf"

 Font_Id :: u16

-//INTERNAL
 Font_Key :: struct {
 	id:   Font_Id,
 	size: u16,
 }

-//INTERNAL
 Cache_Source :: enum u8 {
 	Custom,
 	Clay,
 }

-//INTERNAL
 Cache_Key :: struct {
 	id:     u32,
 	source: Cache_Source,
 }

-//INTERNAL
 Text_Cache :: struct {
 	engine:     ^sdl_ttf.TextEngine,
 	font_bytes: [dynamic][]u8,
@@ -34,8 +30,7 @@ Text_Cache :: struct {
 	cache:      map[Cache_Key]^sdl_ttf.Text,
 }

-// Fetch SDL TTF font pointer for rendering.
-//INTERNAL
+// Internal: fetch the SDL TTF font pointer for rendering.
 get_font :: proc(id: Font_Id, size: u16) -> ^sdl_ttf.Font {
 	assert(int(id) < len(GLOB.text_cache.font_bytes), "Invalid font ID.")
 	key := Font_Key{id, size}
@@ -82,10 +77,9 @@ register_font :: proc(bytes: []u8) -> (id: Font_Id, ok: bool) #optional_ok {
 	return Font_Id(len(GLOB.text_cache.font_bytes) - 1), true
 }

-//INTERNAL
 Text :: struct {
 	sdl_text: ^sdl_ttf.Text,
-	position: Vec2,
+	position: [2]f32,
 	color:    Color,
 }

@@ -95,7 +89,7 @@ Text :: struct {

 // Shared cache lookup/create/update logic used by both the `text` proc and the Clay render path.
 // Returns the cached (or newly created) TTF_Text pointer.
-//INTERNAL
+@(private)
 cache_get_or_update :: proc(key: Cache_Key, c_str: cstring, font: ^sdl_ttf.Font) -> ^sdl_ttf.Text {
 	existing, found := GLOB.text_cache.cache[key]
 	if !found {
@@ -135,11 +129,11 @@ cache_get_or_update :: proc(key: Cache_Key, c_str: cstring, font: ^sdl_ttf.Font)
 text :: proc(
 	layer: ^Layer,
 	text_string: string,
-	position: Vec2,
+	position: [2]f32,
 	font_id: Font_Id,
-	font_size: u16 = DFT_FONT_SIZE,
-	color: Color = DFT_TEXT_COLOR,
-	origin: Vec2 = {},
+	font_size: u16 = 44,
+	color: Color = BLACK,
+	origin: [2]f32 = {0, 0},
 	rotation: f32 = 0,
 	id: Maybe(u32) = nil,
 	temp_allocator := context.temp_allocator,
@@ -183,9 +177,9 @@ text :: proc(
 measure_text :: proc(
 	text_string: string,
 	font_id: Font_Id,
-	font_size: u16 = DFT_FONT_SIZE,
+	font_size: u16 = 44,
 	allocator := context.temp_allocator,
-) -> Vec2 {
+) -> [2]f32 {
 	c_str := strings.clone_to_cstring(text_string, allocator)
 	defer delete(c_str, allocator)
 	width, height: c.int
@@ -199,46 +193,46 @@ measure_text :: proc(
 // ----- Text anchor helpers -----------
 // ---------------------------------------------------------------------------------------------------------------------

-center_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+center_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	size := measure_text(text_string, font_id, font_size)
 	return size * 0.5
 }

-top_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+top_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	return {0, 0}
 }

-top_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+top_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	size := measure_text(text_string, font_id, font_size)
 	return {size.x * 0.5, 0}
 }

-top_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+top_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	size := measure_text(text_string, font_id, font_size)
 	return {size.x, 0}
 }

-left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	size := measure_text(text_string, font_id, font_size)
 	return {0, size.y * 0.5}
 }

-right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	size := measure_text(text_string, font_id, font_size)
 	return {size.x, size.y * 0.5}
 }

-bottom_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+bottom_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	size := measure_text(text_string, font_id, font_size)
 	return {0, size.y}
 }

-bottom_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+bottom_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	size := measure_text(text_string, font_id, font_size)
 	return {size.x * 0.5, size.y}
 }

-bottom_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
+bottom_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 {
 	size := measure_text(text_string, font_id, font_size)
 	return size
 }
@@ -274,8 +268,7 @@ clear_text_cache_entry :: proc(id: u32) {
 // ----- Internal cache lifecycle ------
 // ---------------------------------------------------------------------------------------------------------------------

-//INTERNAL
-@(require_results)
+@(private, require_results)
 init_text_cache :: proc(
 	device: ^sdl.GPUDevice,
 	allocator := context.allocator,
@@ -306,7 +299,6 @@ init_text_cache :: proc(
 	return text_cache, true
 }

-//INTERNAL
 destroy_text_cache :: proc() {
 	for _, font in GLOB.text_cache.sdl_fonts {
 		sdl_ttf.CloseFont(font)
@@ -14,8 +14,8 @@ Texture_Kind :: enum u8 {
 }

 Sampler_Preset :: enum u8 {
-	Linear_Clamp,
 	Nearest_Clamp,
+	Linear_Clamp,
 	Nearest_Repeat,
 	Linear_Repeat,
 }
@@ -41,7 +41,8 @@ Texture_Desc :: struct {
 	kind:            Texture_Kind,
 }

-//INTERNAL
+// Internal slot — not exported.
+@(private)
 Texture_Slot :: struct {
 	gpu_texture: ^sdl.GPUTexture,
 	desc:        Texture_Desc,
@@ -56,6 +57,16 @@ Texture_Slot :: struct {
 //   GLOB.pending_texture_releases : [dynamic]Texture_Id
 //   GLOB.samplers               : [SAMPLER_PRESET_COUNT]^sdl.GPUSampler

+Clay_Image_Data :: struct {
+	texture_id: Texture_Id,
+	fit:        Fit_Mode,
+	tint:       Color,
+}
+
+clay_image_data :: proc(id: Texture_Id, fit: Fit_Mode = .Stretch, tint: Color = WHITE) -> Clay_Image_Data {
+	return {texture_id = id, fit = fit, tint = tint}
+}
+
 // ---------------------------------------------------------------------------------------------------------------------
 // ----- Registration -------------
 // ---------------------------------------------------------------------------------------------------------------------
@@ -308,8 +319,8 @@ texture_kind :: proc(id: Texture_Id) -> Texture_Kind {
 	return GLOB.texture_slots[u32(id)].desc.kind
 }

-// Get the raw GPU texture pointer for binding during draw.
-//INTERNAL
+// Internal: get the raw GPU texture pointer for binding during draw.
+@(private)
 texture_gpu_handle :: proc(id: Texture_Id) -> ^sdl.GPUTexture {
 	if id == INVALID_TEXTURE do return nil
 	idx := u32(id)
@@ -317,8 +328,8 @@ texture_gpu_handle :: proc(id: Texture_Id) -> ^sdl.GPUTexture {
 	return GLOB.texture_slots[idx].gpu_texture
 }

-// Deferred release (called from end / clear_global).
-//INTERNAL
+// Deferred release (called from draw.end / clear_global)
+@(private)
 process_pending_texture_releases :: proc() {
 	device := GLOB.device
 	for id in GLOB.pending_texture_releases {
@@ -335,7 +346,7 @@ process_pending_texture_releases :: proc() {
 	clear(&GLOB.pending_texture_releases)
 }

-//INTERNAL
+@(private)
 get_sampler :: proc(preset: Sampler_Preset) -> ^sdl.GPUSampler {
 	idx := int(preset)
 	if GLOB.samplers[idx] != nil do return GLOB.samplers[idx]
@@ -368,15 +379,15 @@ get_sampler :: proc(preset: Sampler_Preset) -> ^sdl.GPUSampler {
 	)
 	if sampler == nil {
 		log.errorf("Failed to create sampler preset %v: %s", preset, sdl.GetError())
-		return GLOB.core_2d.sampler // fallback to existing default sampler
+		return GLOB.pipeline_2d_base.sampler // fallback to existing default sampler
 	}

 	GLOB.samplers[idx] = sampler
 	return sampler
 }

-// Destroy all sampler pool entries. Called from destroy().
-//INTERNAL
+// Internal: destroy all sampler pool entries. Called from draw.destroy().
+@(private)
 destroy_sampler_pool :: proc() {
 	device := GLOB.device
 	for &s in GLOB.samplers {
@@ -387,8 +398,8 @@ destroy_sampler_pool :: proc() {
 	}
 }

-// Destroy all registered textures. Called from destroy().
-//INTERNAL
+// Internal: destroy all registered textures. Called from draw.destroy().
+@(private)
 destroy_all_textures :: proc() {
 	device := GLOB.device
 	for &slot in GLOB.texture_slots {
@@ -120,52 +120,10 @@ spinlock_try_lock :: #force_inline proc "contextless" (lock: ^Spinlock) -> bool
 	return lock_acquired
 }

-// Spins until the lock is acquired, relaxing the CPU between attempts.
-spinlock_lock :: #force_inline proc "contextless" (lock: ^Spinlock) {
-	for !spinlock_try_lock(lock) {
-		intrinsics.cpu_relax()
-	}
-}
-
 spinlock_unlock :: #force_inline proc "contextless" (lock: ^Spinlock) {
 	intrinsics.atomic_store_explicit(lock, false, .Release)
 }

-// Spins until the lock is acquired, then unlocks at the end of the calling scope. Always returns
-// true so it can guard a critical section from within an `if`:
-//
-//	if spinlock_guard(&lock) {
-//		// critical section
-//	}
-@(deferred_in = spinlock_unlock)
-spinlock_guard :: #force_inline proc "contextless" (lock: ^Spinlock) -> bool {
-	spinlock_lock(lock)
-	return true
-}
-
-// Tries to acquire the lock once without spinning. Returns true and unlocks at the end of the
-// calling scope if acquired, otherwise returns false and does nothing:
-//
-//	if spinlock_try_guard(&lock) {
-//		// critical section, entered only if the lock was acquired
-//	}
-@(deferred_in_out = spinlock_try_guard_unlock)
-spinlock_try_guard :: #force_inline proc "contextless" (lock: ^Spinlock) -> bool {
-	return spinlock_try_lock(lock)
-}
-
-// Deferred companion of `spinlock_try_guard`; unlocks only when the lock was actually acquired.
-@(private)
-spinlock_try_guard_unlock :: #force_inline proc "contextless" (lock: ^Spinlock, locked: bool) {
-	if locked {
-		spinlock_unlock(lock)
-	}
-}
-
-lock :: proc {
-	spinlock_lock,
-}
-
 try_lock :: proc {
 	spinlock_try_lock,
 }
@@ -174,14 +132,6 @@ unlock :: proc {
 	spinlock_unlock,
 }

-guard :: proc {
-	spinlock_guard,
-}
-
-try_guard :: proc {
-	spinlock_try_guard,
-}
-
 // ---------------------------------------------------------------------------------------------------------------------
 // ----- Tests ------------------------
 // ---------------------------------------------------------------------------------------------------------------------
@@ -189,10 +139,10 @@ import "core:sync"
 import "core:testing"
 import "core:thread"

-// Multiple threads will each add 1.0 this many times.
-// If any updates are lost due to race conditions, the final sum will be wrong.
@(test)
 test_concurrent_atomic_add_no_lost_updates :: proc(t: ^testing.T) {
+	// Multiple threads will each add 1.0 this many times.
+	// If any updates are lost due to race conditions, the final sum will be wrong.
 	NUM_THREADS :: 8
 	ITERATIONS_PER_THREAD :: 10_000

@@ -234,10 +184,10 @@ test_concurrent_atomic_add_no_lost_updates :: proc(t: ^testing.T) {
 	testing.expect_value(t, shared_value, expected)
 }

-// Start with a known value, multiple threads subtract.
-// If any updates are lost due to race conditions, the final result will be wrong.
@(test)
 test_concurrent_atomic_sub_no_lost_updates :: proc(t: ^testing.T) {
+	// Start with a known value, multiple threads subtract.
+	// If any updates are lost due to race conditions, the final result will be wrong.
 	NUM_THREADS :: 8
 	ITERATIONS_PER_THREAD :: 10_000

@@ -278,11 +228,11 @@ test_concurrent_atomic_sub_no_lost_updates :: proc(t: ^testing.T) {
 	testing.expect_value(t, shared_value, 0.0)
 }

-// Each thread multiplies by 2.0 then divides by 2.0.
-// Since these are inverses, the final value should equal the starting value
-// regardless of how operations interleave.
@(test)
 test_concurrent_atomic_mul_div_round_trip :: proc(t: ^testing.T) {
+	// Each thread multiplies by 2.0 then divides by 2.0.
+	// Since these are inverses, the final value should equal the starting value
+	// regardless of how operations interleave.
 	NUM_THREADS :: 8
 	ITERATIONS_PER_THREAD :: 10_000

@@ -324,10 +274,10 @@ test_concurrent_atomic_mul_div_round_trip :: proc(t: ^testing.T) {
 	testing.expect_value(t, shared_value, 1000.0)
 }

-// Verify the f32 type dispatch works correctly under contention.
-// Same approach as the f64 add test but with f32.
@(test)
 test_atomic_add_with_f32 :: proc(t: ^testing.T) {
+	// Verify the f32 type dispatch works correctly under contention.
+	// Same approach as the f64 add test but with f32.
 	NUM_THREADS :: 8
 	ITERATIONS_PER_THREAD :: 10_000

@@ -369,17 +319,17 @@ test_atomic_add_with_f32 :: proc(t: ^testing.T) {
 	testing.expect_value(t, shared_value, expected)
 }

-// Tests that the memory order passed to atomic_float_op's CAS success condition
-// provides full ordering guarantees for the entire float operation.
-//
-// Both sides use atomic_add_float (not raw intrinsics) to verify:
-// - Release on CAS success publishes prior non-atomic writes
-// - Acquire on CAS success makes those writes visible to the reader
-//
-// NOTE: This test may pass even with Relaxed ordering on x86 due to its strong memory model.
-// On ARM or other weak-memory architectures, using Relaxed here would likely cause failures.
@(test)
 test_atomic_release_acquire_publish_visibility :: proc(t: ^testing.T) {
+	// Tests that the memory order passed to atomic_float_op's CAS success condition
+	// provides full ordering guarantees for the entire float operation.
+	//
+	// Both sides use atomic_add_float (not raw intrinsics) to verify:
+	// - Release on CAS success publishes prior non-atomic writes
+	// - Acquire on CAS success makes those writes visible to the reader
+	//
+	// NOTE: This test may pass even with Relaxed ordering on x86 due to its strong memory model.
+	// On ARM or other weak-memory architectures, using Relaxed here would likely cause failures.
 	NUM_READERS :: 4

 	Shared_State :: struct {
@@ -476,20 +426,17 @@ test_atomic_release_acquire_publish_visibility :: proc(t: ^testing.T) {
 	}
 }

-// Stress test for every spinlock acquisition variant: N threads contend on a
-// single lock and perform a deliberate non-atomic read-modify-write on shared
-// data. Each iteration rotates through spinlock_try_lock, spinlock_lock,
-// spinlock_guard, and spinlock_try_guard so every variant runs concurrently and
-// must uphold mutual exclusion on the same lock.
-//
-// If mutual exclusion holds:
-//   - `counter` ends at exactly NUM_THREADS * ITERATIONS_PER_THREAD
-//   - `concurrent_holders` never exceeds 1
-//
-// A multi-step RMW (read → relax → write) widens the critical section so
-// any failure to exclude is virtually guaranteed to corrupt the counter.
@(test)
-test_spinlock_mutual_exclusion :: proc(t: ^testing.T) {
+test_spinlock_try_lock_mutual_exclusion :: proc(t: ^testing.T) {
+	// Stress test for spinlock_try_lock: N threads spin-acquire the lock and
+	// perform a deliberate non-atomic read-modify-write on shared data.
+	//
+	// If mutual exclusion holds:
+	//   - `counter` ends at exactly NUM_THREADS * ITERATIONS_PER_THREAD
+	//   - `concurrent_holders` never exceeds 1
+	//
+	// A multi-step RMW (read → relax → write) widens the critical section so
+	// any failure to exclude is virtually guaranteed to corrupt the counter.
 	NUM_THREADS :: 8
 	ITERATIONS_PER_THREAD :: 50_000

@@ -514,29 +461,6 @@ test_spinlock_mutual_exclusion :: proc(t: ^testing.T) {
 	barrier: sync.Barrier
 	sync.barrier_init(&barrier, NUM_THREADS)

-	// The single critical section every acquisition variant must protect. Sharing
-	// it guarantees they all stress the exact same non-atomic read-modify-write.
-	critical_section :: proc(s: ^Shared) {
-		// Atomically bump the holder count so we can detect overlapping holders.
-		holders := intrinsics.atomic_add_explicit(&s.concurrent_holders, 1, .Relaxed)
-
-		// Track the maximum we ever observed (relaxed is fine, this is
-		// purely diagnostic and protected by the spinlock for writes).
-		if holders + 1 > s.max_holders {
-			s.max_holders = holders + 1
-		}
-
-		// Non-atomic RMW: read, spin a tiny bit, then write.
-		// This deliberately creates a wide window where a second holder
-		// would cause a lost update.
-		val := s.counter
-		intrinsics.cpu_relax()
-		intrinsics.cpu_relax()
-		s.counter = val + 1
-
-		intrinsics.atomic_sub_explicit(&s.concurrent_holders, 1, .Relaxed)
-	}
-
 	thread_proc :: proc(th: ^thread.Thread) {
 		ctx := cast(^Thread_Data)th.data
 		s := ctx.shared
@@ -544,35 +468,36 @@ test_spinlock_mutual_exclusion :: proc(t: ^testing.T) {
 		// All threads rendezvous here for maximum contention.
 		sync.barrier_wait(ctx.barrier)

-		for i in 0 ..< ITERATIONS_PER_THREAD {
-			// Rotate through every acquisition variant so they all contend on the
-			// same lock simultaneously and must each uphold mutual exclusion.
-			switch i & 3 {
-			case 0:
-				// Manual spin on try_lock until we acquire it.
-				for !spinlock_try_lock(&s.lock) {
-					intrinsics.cpu_relax()
-				}
-				critical_section(s)
-				spinlock_unlock(&s.lock)
-			case 1:
-				// Blocking lock that loops internally until acquired.
-				spinlock_lock(&s.lock)
-				critical_section(s)
-				spinlock_unlock(&s.lock)
-			case 2: // Scoped guard: unlocks automatically at the end of the block.
-					if spinlock_guard(&s.lock) {
-						critical_section(s)
-					}
-			case 3: // Scoped try-guard: retry until acquired, auto-unlocks on success.
-					for {
-						if spinlock_try_guard(&s.lock) {
-							critical_section(s)
-							break
-						}
-						intrinsics.cpu_relax()
-					}
+		for _ in 0 ..< ITERATIONS_PER_THREAD {
+			// Spin on try_lock until we acquire it.
+			for !spinlock_try_lock(&s.lock) {
+				intrinsics.cpu_relax()
 			}
+
+			// --- critical section start ---
+
+			// Atomically bump the holder count so we can detect overlapping holders.
+			holders := intrinsics.atomic_add_explicit(&s.concurrent_holders, 1, .Relaxed)
+
+			// Track the maximum we ever observed (a plain write is fine: this is
+			// purely diagnostic and only written while holding the spinlock).
+			if holders + 1 > s.max_holders {
+				s.max_holders = holders + 1
+			}
+
+			// Non-atomic RMW: read, spin a tiny bit, then write.
+			// This deliberately creates a wide window where a second holder
+			// would cause a lost update.
+			val := s.counter
+			intrinsics.cpu_relax()
+			intrinsics.cpu_relax()
+			s.counter = val + 1
+
+			intrinsics.atomic_sub_explicit(&s.concurrent_holders, 1, .Relaxed)
+
+			// --- critical section end ---
+
+			spinlock_unlock(&s.lock)
 		}
 	}

@@ -2,7 +2,6 @@ package many_bits

 import "base:builtin"
 import "base:intrinsics"
-import "base:runtime"
 import "core:fmt"
 import "core:slice"

@@ -26,20 +25,15 @@ Bits :: struct {
 	length:    int, // Total number of bits being stored
 }

-destroy :: proc(bits: Bits, allocator := context.allocator) -> runtime.Allocator_Error {
-	return delete_slice(bits.int_array, allocator)
+delete :: proc(bits: Bits, allocator := context.allocator) {
+	delete_slice(bits.int_array, allocator)
 }

-create :: proc(
-	#any_int length: int,
-	allocator := context.allocator,
-) -> (
-	bits: Bits,
-	err: runtime.Allocator_Error,
-) #optional_allocator_error {
-	bits.int_array, err = make_slice([]Int_Bits, ((length - 1) >> INDEX_SHIFT) + 1, allocator)
-	bits.length = length
-	return bits, err
+make :: proc(#any_int length: int, allocator := context.allocator) -> Bits {
+	return Bits {
+		int_array = make_slice([]Int_Bits, ((length - 1) >> INDEX_SHIFT) + 1, allocator),
+		length = length,
+	}
 }

 // Sets all bits to 0 (false)
@@ -513,8 +507,8 @@ import "core:testing"

@(test)
 test_set :: proc(t: ^testing.T) {
-	bits := create(128)
-	defer destroy(bits)
+	bits := make(128)
+	defer delete(bits)

 	set(bits, 0, true)
 	testing.expect_value(t, bits.int_array[0], Int_Bits{0})
@@ -530,8 +524,8 @@ test_set :: proc(t: ^testing.T) {

@(test)
 test_get :: proc(t: ^testing.T) {
-	bits := create(128)
-	defer destroy(bits)
+	bits := make(128)
+	defer delete(bits)

 	// Default is false
 	testing.expect(t, !get(bits, 0))
@@ -566,8 +560,8 @@ test_get :: proc(t: ^testing.T) {

@(test)
 test_set_true_set_false :: proc(t: ^testing.T) {
-	bits := create(128)
-	defer destroy(bits)
+	bits := make(128)
+	defer delete(bits)

 	// set_true within first uint
 	set_true(bits, 0)
@@ -611,8 +605,8 @@ all_true_test :: proc(t: ^testing.T) {
 	uint_max := UINT_MAX
 	all_ones := transmute(Int_Bits)uint_max

-	bits := create(132)
-	defer destroy(bits)
+	bits := make(132)
+	defer delete(bits)

 	bits.int_array[0] = all_ones
 	bits.int_array[1] = all_ones
@@ -622,8 +616,8 @@ all_true_test :: proc(t: ^testing.T) {
 	bits.int_array[2] = {0, 1, 2}
 	testing.expect(t, !all_true(bits))

-	bits2 := create(1)
-	defer destroy(bits2)
+	bits2 := make(1)
+	defer delete(bits2)

 	bits2.int_array[0] = {0}
 	testing.expect(t, all_true(bits2))
@@ -634,8 +628,8 @@ test_range_true :: proc(t: ^testing.T) {
 	uint_max := UINT_MAX
 	all_ones := transmute(Int_Bits)uint_max

-	bits := create(192)
-	defer destroy(bits)
+	bits := make(192)
+	defer delete(bits)

 	// Empty range is vacuously true
 	testing.expect(t, range_true(bits, 0, 0))
@@ -682,7 +676,7 @@ test_range_true :: proc(t: ^testing.T) {

@(test)
 nearest_true_handles_same_word_and_boundaries :: proc(t: ^testing.T) {
-	bits := create(128, context.temp_allocator)
+	bits := make(128, context.temp_allocator)

 	set_true(bits, 0)
 	set_true(bits, 10)
@@ -716,7 +710,7 @@ nearest_true_handles_same_word_and_boundaries :: proc(t: ^testing.T) {

@(test)
 nearest_false_handles_same_word_and_boundaries :: proc(t: ^testing.T) {
-	bits := create(128, context.temp_allocator)
+	bits := make(128, context.temp_allocator)

 	// Start with all bits true, then clear a few to false.
 	for i := 0; i < bits.length; i += 1 {
@@ -755,7 +749,7 @@ nearest_false_handles_same_word_and_boundaries :: proc(t: ^testing.T) {

@(test)
 nearest_false_scans_across_words_and_returns_false_when_all_true :: proc(t: ^testing.T) {
-	bits := create(192, context.temp_allocator)
+	bits := make(192, context.temp_allocator)

 	// Start with all bits true, then clear a couple far apart.
 	for i := 0; i < bits.length; i += 1 {
@@ -779,7 +773,7 @@ nearest_false_scans_across_words_and_returns_false_when_all_true :: proc(t: ^tes

@(test)
 nearest_true_scans_across_words_and_returns_false_when_empty :: proc(t: ^testing.T) {
-	bits := create(192, context.temp_allocator)
+	bits := make(192, context.temp_allocator)

 	set_true(bits, 5)
 	set_true(bits, 130)
@@ -796,7 +790,7 @@ nearest_true_scans_across_words_and_returns_false_when_empty :: proc(t: ^testing

@(test)
 nearest_false_handles_last_word_partial_length :: proc(t: ^testing.T) {
-	bits := create(130, context.temp_allocator)
+	bits := make(130, context.temp_allocator)

 	// Start with all bits true, then clear the first and last valid bits.
 	for i := 0; i < bits.length; i += 1 {
@@ -817,7 +811,7 @@ nearest_false_handles_last_word_partial_length :: proc(t: ^testing.T) {

@(test)
 nearest_true_handles_last_word_partial_length :: proc(t: ^testing.T) {
-	bits := create(130, context.temp_allocator)
+	bits := make(130, context.temp_allocator)

 	set_true(bits, 0)
 	set_true(bits, 129)
@@ -834,7 +828,7 @@ nearest_true_handles_last_word_partial_length :: proc(t: ^testing.T) {
@(test)
 iterator_basic_mixed_bits :: proc(t: ^testing.T) {
 	// Use non-word-aligned length to test partial last word handling
-	bits := create(100, context.temp_allocator)
+	bits := make(100, context.temp_allocator)

 	// Set specific bits: 0, 3, 64, 99 (last valid index)
 	set_true(bits, 0)
@@ -909,7 +903,7 @@ iterator_basic_mixed_bits :: proc(t: ^testing.T) {
@(test)
 iterator_all_false_bits :: proc(t: ^testing.T) {
 	// Use non-word-aligned length
-	bits := create(100, context.temp_allocator)
+	bits := make(100, context.temp_allocator)
 	// All bits default to false, no need to set anything

 	// Test iterate - should return all 100 bits as false
@@ -950,7 +944,7 @@ iterator_all_false_bits :: proc(t: ^testing.T) {
@(test)
 iterator_all_true_bits :: proc(t: ^testing.T) {
 	// Use non-word-aligned length
-	bits := create(100, context.temp_allocator)
+	bits := make(100, context.temp_allocator)
 	// Set all bits to true
 	for i := 0; i < bits.length; i += 1 {
 		set_true(bits, i)
@@ -1,8 +1,6 @@
 package meta

 import "core:fmt"
-import "core:log"
-import "core:mem"
 import "core:os"

 Command :: struct {
@@ -22,48 +20,6 @@ COMMANDS :: []Command {
 }

 main :: proc() {
-	//----- General setup ----------------------------------
-	when ODIN_DEBUG {
-		// Temp
-		track_temp: mem.Tracking_Allocator
-		mem.tracking_allocator_init(&track_temp, context.temp_allocator)
-		context.temp_allocator = mem.tracking_allocator(&track_temp)
-
-		// Default
-		track: mem.Tracking_Allocator
-		mem.tracking_allocator_init(&track, context.allocator)
-		context.allocator = mem.tracking_allocator(&track)
-		// Log a warning about any memory that was not freed by the end of the program.
-		// This could be fine for some global state or it could be a memory leak.
-		defer {
-			// Temp allocator
-			if len(track_temp.bad_free_array) > 0 {
-				fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
-				for entry in track_temp.bad_free_array {
-					fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
-				}
-				mem.tracking_allocator_destroy(&track_temp)
-			}
-			// Default allocator
-			if len(track.allocation_map) > 0 {
-				fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map))
-				for _, entry in track.allocation_map {
-					fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
-				}
-			}
-			if len(track.bad_free_array) > 0 {
-				fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array))
-				for entry in track.bad_free_array {
-					fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
-				}
-			}
-			mem.tracking_allocator_destroy(&track)
-		}
-		// Logger
-		context.logger = log.create_console_logger()
-		defer log.destroy_console_logger(context.logger)
-	}
-
 	args := os.args[1:]

 	if len(args) == 0 {
@@ -4,8 +4,7 @@
 package phased_executor

 import "base:intrinsics"
-import "base:runtime"
-import que "core:container/queue"
+import q "core:container/queue"
 import "core:prof/spall"
 import "core:sync"
 import "core:thread"
@@ -19,7 +18,7 @@ DEFT_SPIN_LIMIT :: 2_500_000
 Harness :: struct($T: typeid) where intrinsics.type_has_nil(T) {
 	mutex:      sync.Mutex,
 	condition:  sync.Cond,
-	cmd_queue:  que.Queue(T),
+	cmd_queue:  q.Queue(T),
 	spin:       bool,
 	lock:       levsync.Spinlock,
 	_pad:       [64 - size_of(uint)]u8, // We want join_count to have its own cache line
@@ -43,13 +42,13 @@ Executor :: struct($T: typeid) where intrinsics.type_has_nil(T) {
 }

 //TODO: Provide a way to set some aspects of context for the executor threads. Namely a logger.
-init :: proc(
+init_executor :: proc(
 	executor: ^Executor($T),
 	#any_int num_threads: int,
 	$on_command_received: proc(command: T),
 	#any_int spin_limit: uint = DEFT_SPIN_LIMIT,
 	allocator := context.allocator,
-) -> runtime.Allocator_Error {
+) {
 	was_initialized, _ := intrinsics.atomic_compare_exchange_strong_explicit(
 		&executor.initialized,
 		false,
@@ -61,9 +60,9 @@ init :: proc(

 	slave_task := build_task(on_command_received)
 	executor.spin_limit = spin_limit
-	executor.harnesses = make([]Harness(T), num_threads, allocator) or_return
+	executor.harnesses = make([]Harness(T), num_threads, allocator)
 	for &harness in executor.harnesses {
-		que.init(&harness.cmd_queue, allocator = allocator) or_return
+		q.init(&harness.cmd_queue, allocator = allocator)
 		harness.spin = true
 	}

@@ -73,11 +72,11 @@ init :: proc(
 	}
 	thread.pool_start(&executor.thread_pool)

-	return nil
+	return
 }

 // Cleanly shuts down all executor tasks then destroys the executor
-destroy :: proc(executor: ^Executor($T), allocator := context.allocator) -> runtime.Allocator_Error {
+destroy_executor :: proc(executor: ^Executor($T), allocator := context.allocator) {
 	was_initialized, _ := intrinsics.atomic_compare_exchange_strong_explicit(
 		&executor.initialized,
 		true,
@@ -91,7 +90,7 @@ destroy :: proc(executor: ^Executor($T), allocator := context.allocator) -> runt
 	for &harness in executor.harnesses {
 		for {
 			if levsync.try_lock(&harness.lock) {
-				que.push_back(&harness.cmd_queue, nil)
+				q.push_back(&harness.cmd_queue, nil)
 				if !harness.spin {
 					sync.mutex_lock(&harness.mutex)
 					sync.cond_signal(&harness.condition)
@@ -106,11 +105,9 @@ destroy :: proc(executor: ^Executor($T), allocator := context.allocator) -> runt
 	thread.pool_join(&executor.thread_pool)
 	thread.pool_destroy(&executor.thread_pool)
 	for &harness in executor.harnesses {
-		que.destroy(&harness.cmd_queue)
+		q.destroy(&harness.cmd_queue)
 	}
-	delete(executor.harnesses, allocator) or_return
-
-	return nil
+	delete(executor.harnesses, allocator)
 }

 build_task :: proc(
@@ -134,10 +131,10 @@ build_task :: proc(
 			spin_count: uint = 0
 			spin_loop: for {
 				if levsync.try_lock(&harness.lock) {
-					if que.len(harness.cmd_queue) > 0 {
+					if q.len(harness.cmd_queue) > 0 {

 						// Execute command
-						command := que.pop_front(&harness.cmd_queue)
+						command := q.pop_front(&harness.cmd_queue)
 						levsync.unlock(&harness.lock)
 						if command == nil do return
 						on_command_received(command)
@@ -166,7 +163,7 @@ build_task :: proc(
 					defer intrinsics.cpu_relax()
 					if levsync.try_lock(&harness.lock) {
 						defer levsync.unlock(&harness.lock)
-						if que.len(harness.cmd_queue) > 0 {
+						if q.len(harness.cmd_queue) > 0 {
 							harness.spin = true
 							break cond_loop
 						} else {
@@ -193,9 +190,9 @@ exec_command :: proc(executor: ^Executor($T), command: T) {
 		}
 		harness := &executor.harnesses[executor.harness_index]
 		if levsync.try_lock(&harness.lock) {
-			if que.len(harness.cmd_queue) <= executor.cmd_queue_floor {
-				que.push_back(&harness.cmd_queue, command)
-				executor.cmd_queue_floor = que.len(harness.cmd_queue)
+			if q.len(harness.cmd_queue) <= executor.cmd_queue_floor {
+				q.push_back(&harness.cmd_queue, command)
+				executor.cmd_queue_floor = q.len(harness.cmd_queue)
 				slave_sleeping := !harness.spin
 				// Must release lock before signalling to avoid race from slave spurious wakeup
 				levsync.unlock(&harness.lock)
@@ -261,7 +258,7 @@ stress_test_executor :: proc(t: ^testing.T) {
 	defer free(exec_counts)

 	executor: Executor(Stress_Cmd)
-	init(&executor, STRESS_NUM_THREADS, stress_handler, spin_limit = 500)
+	init_executor(&executor, STRESS_NUM_THREADS, stress_handler, spin_limit = 500)

 	for round in 0 ..< STRESS_NUM_ROUNDS {
 		base := round * STRESS_CMDS_PER_ROUND
@@ -284,6 +281,6 @@ stress_test_executor :: proc(t: ^testing.T) {
 	// Explicitly destroy to verify clean shutdown.
 	// If destroy_executor returns, all threads received the nil sentinel and exited,
 	// and thread.pool_join completed without deadlock.
-	destroy(&executor)
+	destroy_executor(&executor)
 	testing.expect(t, !executor.initialized, "Executor still marked initialized after destroy")
 }
@@ -1,53 +1,58 @@
 package examples

 import "core:fmt"
-import "core:log"
 import "core:mem"
 import "core:os"

 import qr ".."

 main :: proc() {
-	//----- General setup ----------------------------------
-	// Temp
-	track_temp: mem.Tracking_Allocator
-	mem.tracking_allocator_init(&track_temp, context.temp_allocator)
-	context.temp_allocator = mem.tracking_allocator(&track_temp)
-
-	// Default
-	track: mem.Tracking_Allocator
-	mem.tracking_allocator_init(&track, context.allocator)
-	context.allocator = mem.tracking_allocator(&track)
-	// Log a warning about any memory that was not freed by the end of the program.
-	// This could be fine for some global state or it could be a memory leak.
-	defer {
-		// Temp allocator
-		if len(track_temp.bad_free_array) > 0 {
-			fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
-			for entry in track_temp.bad_free_array {
-				fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
-			}
-			mem.tracking_allocator_destroy(&track_temp)
+	//----- Tracking allocator ----------------------------------
+	{
+		tracking_temp_allocator := false
+		// Temp
+		track_temp: mem.Tracking_Allocator
+		if tracking_temp_allocator {
+			mem.tracking_allocator_init(&track_temp, context.temp_allocator)
+			context.temp_allocator = mem.tracking_allocator(&track_temp)
 		}
-		// Default allocator
-		if len(track.allocation_map) > 0 {
-			fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map))
-			for _, entry in track.allocation_map {
-				fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
+		// Default
+		track: mem.Tracking_Allocator
+		mem.tracking_allocator_init(&track, context.allocator)
+		context.allocator = mem.tracking_allocator(&track)
+		defer {
+			// Temp allocator
+			if tracking_temp_allocator {
+				if len(track_temp.allocation_map) > 0 {
+					fmt.eprintf("=== %v allocations not freed - temp allocator: ===\n", len(track_temp.allocation_map))
+					for _, entry in track_temp.allocation_map {
+						fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
+					}
+				}
+				if len(track_temp.bad_free_array) > 0 {
+					fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
+					for entry in track_temp.bad_free_array {
+						fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
+					}
+				}
+				mem.tracking_allocator_destroy(&track_temp)
 			}
-		}
-		if len(track.bad_free_array) > 0 {
-			fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array))
-			for entry in track.bad_free_array {
-				fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
+			// Default allocator
+			if len(track.allocation_map) > 0 {
+				fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map))
+				for _, entry in track.allocation_map {
+					fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
+				}
 			}
+			if len(track.bad_free_array) > 0 {
+				fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array))
+				for entry in track.bad_free_array {
+					fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
+				}
+			}
+			mem.tracking_allocator_destroy(&track)
 		}
-		mem.tracking_allocator_destroy(&track)
 	}
-	// Logger
-	context.logger = log.create_console_logger()
-	defer log.destroy_console_logger(context.logger)
-

 	args := os.args
 	if len(args) < 2 {
@@ -1,774 +0,0 @@
-package quantity
-
-import "base:intrinsics"
-
-GRAMS_PER_POUND :: 453.59237
-OUNCES_PER_POUND :: 16
-GRAMS_PER_OUNCE :: GRAMS_PER_POUND / OUNCES_PER_POUND
-
-KILO_GRAMS_PER_POUND :: GRAMS_PER_POUND / KILO
-MILLI_GRAMS_PER_POUND :: GRAMS_PER_POUND * MILLI
-MICRO_GRAMS_PER_POUND :: GRAMS_PER_POUND * MICRO
-NANO_GRAMS_PER_POUND :: GRAMS_PER_POUND * NANO
-
-KILO_GRAMS_PER_OUNCE :: GRAMS_PER_OUNCE / KILO
-MILLI_GRAMS_PER_OUNCE :: GRAMS_PER_OUNCE * MILLI
-MICRO_GRAMS_PER_OUNCE :: GRAMS_PER_OUNCE * MICRO
-NANO_GRAMS_PER_OUNCE :: GRAMS_PER_OUNCE * NANO
-
-//----- Grams ----------------------------------
-Grams :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_kilo_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-grams_to_kilo_grams :: #force_inline proc "contextless" (
-	grams: Grams($V),
-) -> Kilo_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Kilo_Grams(V){grams.v / KILO}
-}
-
-// Prefer the to_milli_grams procedure group.
-grams_to_milli_grams :: #force_inline proc "contextless" (
-	grams: Grams($V),
-) -> Milli_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Milli_Grams(V){grams.v * MILLI}
-}
-
-// Prefer the to_micro_grams procedure group.
-grams_to_micro_grams :: #force_inline proc "contextless" (
-	grams: Grams($V),
-) -> Micro_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Micro_Grams(V){grams.v * MICRO}
-}
-
-// Prefer the to_nano_grams procedure group.
-grams_to_nano_grams :: #force_inline proc "contextless" (
-	grams: Grams($V),
-) -> Nano_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Nano_Grams(V){grams.v * NANO}
-}
-
-// Prefer the to_pounds procedure group.
-@(fast_math = {.Allow_Reciprocal})
-grams_to_pounds :: #force_inline proc "contextless" (
-	grams: Grams($V),
-) -> Pounds(V) where intrinsics.type_is_float(V) {
-	return Pounds(V){grams.v / GRAMS_PER_POUND}
-}
-
-// Prefer the to_ounces procedure group.
-@(fast_math = {.Allow_Reciprocal})
-grams_to_ounces :: #force_inline proc "contextless" (
-	grams: Grams($V),
-) -> Ounces(V) where intrinsics.type_is_float(V) {
-	return Ounces(V){grams.v / GRAMS_PER_OUNCE}
-}
-
-//----- Kilograms ----------------------------------
-Kilo_Grams :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_grams procedure group.
-kilo_grams_to_grams :: #force_inline proc "contextless" (
-	kilo_grams: Kilo_Grams($V),
-) -> Grams(V) where intrinsics.type_is_numeric(V) {
-	return Grams(V){kilo_grams.v * KILO}
-}
-
-// Prefer the to_milli_grams procedure group.
-kilo_grams_to_milli_grams :: #force_inline proc "contextless" (
-	kilo_grams: Kilo_Grams($V),
-) -> Milli_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Milli_Grams(V){kilo_grams.v * (KILO * MILLI)}
-}
-
-// Prefer the to_micro_grams procedure group.
-kilo_grams_to_micro_grams :: #force_inline proc "contextless" (
-	kilo_grams: Kilo_Grams($V),
-) -> Micro_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Micro_Grams(V){kilo_grams.v * (KILO * MICRO)}
-}
-
-// Prefer the to_nano_grams procedure group.
-kilo_grams_to_nano_grams :: #force_inline proc "contextless" (
-	kilo_grams: Kilo_Grams($V),
-) -> Nano_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Nano_Grams(V){kilo_grams.v * (KILO * NANO)}
-}
-
-// Prefer the to_pounds procedure group.
-@(fast_math = {.Allow_Reciprocal})
-kilo_grams_to_pounds :: #force_inline proc "contextless" (
-	kilo_grams: Kilo_Grams($V),
-) -> Pounds(V) where intrinsics.type_is_float(V) {
-	return Pounds(V){kilo_grams.v / KILO_GRAMS_PER_POUND}
-}
-
-// Prefer the to_ounces procedure group.
-@(fast_math = {.Allow_Reciprocal})
-kilo_grams_to_ounces :: #force_inline proc "contextless" (
-	kilo_grams: Kilo_Grams($V),
-) -> Ounces(V) where intrinsics.type_is_float(V) {
-	return Ounces(V){kilo_grams.v / KILO_GRAMS_PER_OUNCE}
-}
-
-//----- Milligrams ----------------------------------
-Milli_Grams :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-milli_grams_to_grams :: #force_inline proc "contextless" (
-	milli_grams: Milli_Grams($V),
-) -> Grams(V) where intrinsics.type_is_numeric(V) {
-	return Grams(V){milli_grams.v / MILLI}
-}
-
-// Prefer the to_kilo_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-milli_grams_to_kilo_grams :: #force_inline proc "contextless" (
-	milli_grams: Milli_Grams($V),
-) -> Kilo_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Kilo_Grams(V){milli_grams.v / (KILO * MILLI)}
-}
-
-// Prefer the to_micro_grams procedure group.
-milli_grams_to_micro_grams :: #force_inline proc "contextless" (
-	milli_grams: Milli_Grams($V),
-) -> Micro_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Micro_Grams(V){milli_grams.v * (MICRO / MILLI)}
-}
-
-// Prefer the to_nano_grams procedure group.
-milli_grams_to_nano_grams :: #force_inline proc "contextless" (
-	milli_grams: Milli_Grams($V),
-) -> Nano_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Nano_Grams(V){milli_grams.v * (NANO / MILLI)}
-}
-
-// Prefer the to_pounds procedure group.
-@(fast_math = {.Allow_Reciprocal})
-milli_grams_to_pounds :: #force_inline proc "contextless" (
-	milli_grams: Milli_Grams($V),
-) -> Pounds(V) where intrinsics.type_is_float(V) {
-	return Pounds(V){milli_grams.v / MILLI_GRAMS_PER_POUND}
-}
-
-// Prefer the to_ounces procedure group.
-@(fast_math = {.Allow_Reciprocal})
-milli_grams_to_ounces :: #force_inline proc "contextless" (
-	milli_grams: Milli_Grams($V),
-) -> Ounces(V) where intrinsics.type_is_float(V) {
-	return Ounces(V){milli_grams.v / MILLI_GRAMS_PER_OUNCE}
-}
-
-//----- Micrograms ----------------------------------
-Micro_Grams :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-micro_grams_to_grams :: #force_inline proc "contextless" (
-	micro_grams: Micro_Grams($V),
-) -> Grams(V) where intrinsics.type_is_numeric(V) {
-	return Grams(V){micro_grams.v / MICRO}
-}
-
-// Prefer the to_kilo_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-micro_grams_to_kilo_grams :: #force_inline proc "contextless" (
-	micro_grams: Micro_Grams($V),
-) -> Kilo_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Kilo_Grams(V){micro_grams.v / (KILO * MICRO)}
-}
-
-// Prefer the to_milli_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-micro_grams_to_milli_grams :: #force_inline proc "contextless" (
-	micro_grams: Micro_Grams($V),
-) -> Milli_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Milli_Grams(V){micro_grams.v / (MICRO / MILLI)}
-}
-
-// Prefer the to_nano_grams procedure group.
-micro_grams_to_nano_grams :: #force_inline proc "contextless" (
-	micro_grams: Micro_Grams($V),
-) -> Nano_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Nano_Grams(V){micro_grams.v * (NANO / MICRO)}
-}
-
-// Prefer the to_pounds procedure group.
-@(fast_math = {.Allow_Reciprocal})
-micro_grams_to_pounds :: #force_inline proc "contextless" (
-	micro_grams: Micro_Grams($V),
-) -> Pounds(V) where intrinsics.type_is_float(V) {
-	return Pounds(V){micro_grams.v / MICRO_GRAMS_PER_POUND}
-}
-
-// Prefer the to_ounces procedure group.
-@(fast_math = {.Allow_Reciprocal})
-micro_grams_to_ounces :: #force_inline proc "contextless" (
-	micro_grams: Micro_Grams($V),
-) -> Ounces(V) where intrinsics.type_is_float(V) {
-	return Ounces(V){micro_grams.v / MICRO_GRAMS_PER_OUNCE}
-}
-
-//----- Nanograms ----------------------------------
-Nano_Grams :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-nano_grams_to_grams :: #force_inline proc "contextless" (
-	nano_grams: Nano_Grams($V),
-) -> Grams(V) where intrinsics.type_is_numeric(V) {
-	return Grams(V){nano_grams.v / NANO}
-}
-
-// Prefer the to_kilo_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-nano_grams_to_kilo_grams :: #force_inline proc "contextless" (
-	nano_grams: Nano_Grams($V),
-) -> Kilo_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Kilo_Grams(V){nano_grams.v / (KILO * NANO)}
-}
-
-// Prefer the to_milli_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-nano_grams_to_milli_grams :: #force_inline proc "contextless" (
-	nano_grams: Nano_Grams($V),
-) -> Milli_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Milli_Grams(V){nano_grams.v / (NANO / MILLI)}
-}
-
-// Prefer the to_micro_grams procedure group.
-@(fast_math = {.Allow_Reciprocal})
-nano_grams_to_micro_grams :: #force_inline proc "contextless" (
-	nano_grams: Nano_Grams($V),
-) -> Micro_Grams(V) where intrinsics.type_is_numeric(V) {
-	return Micro_Grams(V){nano_grams.v / (NANO / MICRO)}
-}
-
-// Prefer the to_pounds procedure group.
-@(fast_math = {.Allow_Reciprocal})
-nano_grams_to_pounds :: #force_inline proc "contextless" (
-	nano_grams: Nano_Grams($V),
-) -> Pounds(V) where intrinsics.type_is_float(V) {
-	return Pounds(V){nano_grams.v / NANO_GRAMS_PER_POUND}
-}
-
-// Prefer the to_ounces procedure group.
-@(fast_math = {.Allow_Reciprocal})
-nano_grams_to_ounces :: #force_inline proc "contextless" (
-	nano_grams: Nano_Grams($V),
-) -> Ounces(V) where intrinsics.type_is_float(V) {
-	return Ounces(V){nano_grams.v / NANO_GRAMS_PER_OUNCE}
-}
-
-//----- Pounds ----------------------------------
-Pounds :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_grams procedure group.
-pounds_to_grams :: #force_inline proc "contextless" (
-	pounds: Pounds($V),
-) -> Grams(V) where intrinsics.type_is_float(V) {
-	return Grams(V){pounds.v * GRAMS_PER_POUND}
-}
-
-// Prefer the to_kilo_grams procedure group.
-pounds_to_kilo_grams :: #force_inline proc "contextless" (
-	pounds: Pounds($V),
-) -> Kilo_Grams(V) where intrinsics.type_is_float(V) {
-	return Kilo_Grams(V){pounds.v * KILO_GRAMS_PER_POUND}
-}
-
-// Prefer the to_milli_grams procedure group.
-pounds_to_milli_grams :: #force_inline proc "contextless" (
-	pounds: Pounds($V),
-) -> Milli_Grams(V) where intrinsics.type_is_float(V) {
-	return Milli_Grams(V){pounds.v * MILLI_GRAMS_PER_POUND}
-}
-
-// Prefer the to_micro_grams procedure group.
-pounds_to_micro_grams :: #force_inline proc "contextless" (
-	pounds: Pounds($V),
-) -> Micro_Grams(V) where intrinsics.type_is_float(V) {
-	return Micro_Grams(V){pounds.v * MICRO_GRAMS_PER_POUND}
-}
-
-// Prefer the to_nano_grams procedure group.
-pounds_to_nano_grams :: #force_inline proc "contextless" (
-	pounds: Pounds($V),
-) -> Nano_Grams(V) where intrinsics.type_is_float(V) {
-	return Nano_Grams(V){pounds.v * NANO_GRAMS_PER_POUND}
-}
-
-// Prefer the to_ounces procedure group.
-pounds_to_ounces :: #force_inline proc "contextless" (
-	pounds: Pounds($V),
-) -> Ounces(V) where intrinsics.type_is_numeric(V) {
-	return Ounces(V){pounds.v * OUNCES_PER_POUND}
-}
-
-//----- Ounces ----------------------------------
-Ounces :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_grams procedure group.
-ounces_to_grams :: #force_inline proc "contextless" (
-	ounces: Ounces($V),
-) -> Grams(V) where intrinsics.type_is_float(V) {
-	return Grams(V){ounces.v * GRAMS_PER_OUNCE}
-}
-
-// Prefer the to_kilo_grams procedure group.
-ounces_to_kilo_grams :: #force_inline proc "contextless" (
-	ounces: Ounces($V),
-) -> Kilo_Grams(V) where intrinsics.type_is_float(V) {
-	return Kilo_Grams(V){ounces.v * KILO_GRAMS_PER_OUNCE}
-}
-
-// Prefer the to_milli_grams procedure group.
-ounces_to_milli_grams :: #force_inline proc "contextless" (
-	ounces: Ounces($V),
-) -> Milli_Grams(V) where intrinsics.type_is_float(V) {
-	return Milli_Grams(V){ounces.v * MILLI_GRAMS_PER_OUNCE}
-}
-
-// Prefer the to_micro_grams procedure group.
-ounces_to_micro_grams :: #force_inline proc "contextless" (
-	ounces: Ounces($V),
-) -> Micro_Grams(V) where intrinsics.type_is_float(V) {
-	return Micro_Grams(V){ounces.v * MICRO_GRAMS_PER_OUNCE}
-}
-
-// Prefer the to_nano_grams procedure group.
-ounces_to_nano_grams :: #force_inline proc "contextless" (
-	ounces: Ounces($V),
-) -> Nano_Grams(V) where intrinsics.type_is_float(V) {
-	return Nano_Grams(V){ounces.v * NANO_GRAMS_PER_OUNCE}
-}
-
-// Prefer the to_pounds procedure group.
-@(fast_math = {.Allow_Reciprocal})
-ounces_to_pounds :: #force_inline proc "contextless" (
-	ounces: Ounces($V),
-) -> Pounds(V) where intrinsics.type_is_numeric(V) {
-	return Pounds(V){ounces.v / OUNCES_PER_POUND}
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Conversion Overloads ------------------------
-// ---------------------------------------------------------------------------------------------------------------------
-to_grams :: proc {
-	kilo_grams_to_grams,
-	milli_grams_to_grams,
-	micro_grams_to_grams,
-	nano_grams_to_grams,
-	pounds_to_grams,
-	ounces_to_grams,
-}
-
-to_kilo_grams :: proc {
-	grams_to_kilo_grams,
-	milli_grams_to_kilo_grams,
-	micro_grams_to_kilo_grams,
-	nano_grams_to_kilo_grams,
-	pounds_to_kilo_grams,
-	ounces_to_kilo_grams,
-}
-
-to_milli_grams :: proc {
-	grams_to_milli_grams,
-	kilo_grams_to_milli_grams,
-	micro_grams_to_milli_grams,
-	nano_grams_to_milli_grams,
-	pounds_to_milli_grams,
-	ounces_to_milli_grams,
-}
-
-to_micro_grams :: proc {
-	grams_to_micro_grams,
-	kilo_grams_to_micro_grams,
-	milli_grams_to_micro_grams,
-	nano_grams_to_micro_grams,
-	pounds_to_micro_grams,
-	ounces_to_micro_grams,
-}
-
-to_nano_grams :: proc {
-	grams_to_nano_grams,
-	kilo_grams_to_nano_grams,
-	milli_grams_to_nano_grams,
-	micro_grams_to_nano_grams,
-	pounds_to_nano_grams,
-	ounces_to_nano_grams,
-}
-
-to_pounds :: proc {
-	grams_to_pounds,
-	kilo_grams_to_pounds,
-	milli_grams_to_pounds,
-	micro_grams_to_pounds,
-	nano_grams_to_pounds,
-	ounces_to_pounds,
-}
-
-to_ounces :: proc {
-	grams_to_ounces,
-	kilo_grams_to_ounces,
-	milli_grams_to_ounces,
-	micro_grams_to_ounces,
-	nano_grams_to_ounces,
-	pounds_to_ounces,
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Tests ------------------------
-// ---------------------------------------------------------------------------------------------------------------------
-import "core:testing"
-
-@(test)
-test_grams_to_kilo_grams :: proc(t: ^testing.T) {
-	grams := Grams(int){12_000}
-	kilo_grams := to_kilo_grams(grams)
-
-	testing.expect_value(t, kilo_grams, Kilo_Grams(int){12})
-}
-
-@(test)
-test_grams_to_milli_grams :: proc(t: ^testing.T) {
-	grams := Grams(int){12}
-	milli_grams := to_milli_grams(grams)
-
-	testing.expect_value(t, milli_grams, Milli_Grams(int){12_000})
-}
-
-@(test)
-test_grams_to_micro_grams :: proc(t: ^testing.T) {
-	grams := Grams(int){12}
-	micro_grams := to_micro_grams(grams)
-
-	testing.expect_value(t, micro_grams, Micro_Grams(int){12_000_000})
-}
-
-@(test)
-test_grams_to_nano_grams :: proc(t: ^testing.T) {
-	grams := Grams(int){12}
-	nano_grams := to_nano_grams(grams)
-
-	testing.expect_value(t, nano_grams, Nano_Grams(int){12_000_000_000})
-}
-
-@(test)
-test_grams_to_pounds :: proc(t: ^testing.T) {
-	grams := Grams(f32){453.59237}
-	pounds := to_pounds(grams)
-
-	testing.expect(t, pounds.v > 0.99 && pounds.v < 1.01)
-}
-
-@(test)
-test_grams_to_ounces :: proc(t: ^testing.T) {
-	grams := Grams(f32){28.349523125}
-	ounces := to_ounces(grams)
-
-	testing.expect(t, ounces.v > 0.99 && ounces.v < 1.01)
-}
-
-@(test)
-test_kilo_grams_to_grams :: proc(t: ^testing.T) {
-	kilo_grams := Kilo_Grams(int){12}
-	grams := to_grams(kilo_grams)
-
-	testing.expect_value(t, grams, Grams(int){12_000})
-}
-
-@(test)
-test_kilo_grams_to_milli_grams :: proc(t: ^testing.T) {
-	kilo_grams := Kilo_Grams(int){5}
-	milli_grams := to_milli_grams(kilo_grams)
-
-	testing.expect_value(t, milli_grams, Milli_Grams(int){5_000_000})
-}
-
-@(test)
-test_kilo_grams_to_micro_grams :: proc(t: ^testing.T) {
-	kilo_grams := Kilo_Grams(int){5}
-	micro_grams := to_micro_grams(kilo_grams)
-
-	testing.expect_value(t, micro_grams, Micro_Grams(int){5_000_000_000})
-}
-
-@(test)
-test_kilo_grams_to_nano_grams :: proc(t: ^testing.T) {
-	kilo_grams := Kilo_Grams(int){5}
-	nano_grams := to_nano_grams(kilo_grams)
-
-	testing.expect_value(t, nano_grams, Nano_Grams(int){5_000_000_000_000})
-}
-
-@(test)
-test_kilo_grams_to_pounds :: proc(t: ^testing.T) {
-	kilo_grams := Kilo_Grams(f32){0.45359237}
-	pounds := to_pounds(kilo_grams)
-
-	testing.expect(t, pounds.v > 0.99 && pounds.v < 1.01)
-}
-
-@(test)
-test_kilo_grams_to_ounces :: proc(t: ^testing.T) {
-	kilo_grams := Kilo_Grams(f32){0.028349523125}
-	ounces := to_ounces(kilo_grams)
-
-	testing.expect(t, ounces.v > 0.99 && ounces.v < 1.01)
-}
-
-@(test)
-test_milli_grams_to_grams :: proc(t: ^testing.T) {
-	milli_grams := Milli_Grams(int){12_000}
-	grams := to_grams(milli_grams)
-
-	testing.expect_value(t, grams, Grams(int){12})
-}
-
-@(test)
-test_milli_grams_to_kilo_grams :: proc(t: ^testing.T) {
-	milli_grams := Milli_Grams(int){5_000_000}
-	kilo_grams := to_kilo_grams(milli_grams)
-
-	testing.expect_value(t, kilo_grams, Kilo_Grams(int){5})
-}
-
-@(test)
-test_milli_grams_to_micro_grams :: proc(t: ^testing.T) {
-	milli_grams := Milli_Grams(int){5}
-	micro_grams := to_micro_grams(milli_grams)
-
-	testing.expect_value(t, micro_grams, Micro_Grams(int){5_000})
-}
-
-@(test)
-test_milli_grams_to_nano_grams :: proc(t: ^testing.T) {
-	milli_grams := Milli_Grams(int){5}
-	nano_grams := to_nano_grams(milli_grams)
-
-	testing.expect_value(t, nano_grams, Nano_Grams(int){5_000_000})
-}
-
-@(test)
-test_milli_grams_to_pounds :: proc(t: ^testing.T) {
-	milli_grams := Milli_Grams(f64){453_592.37}
-	pounds := to_pounds(milli_grams)
-
-	testing.expect(t, pounds.v > 0.9999999 && pounds.v < 1.0000001)
-}
-
-@(test)
-test_milli_grams_to_ounces :: proc(t: ^testing.T) {
-	milli_grams := Milli_Grams(f64){28_349.523125}
-	ounces := to_ounces(milli_grams)
-
-	testing.expect(t, ounces.v > 0.9999999 && ounces.v < 1.0000001)
-}
-
-@(test)
-test_micro_grams_to_grams :: proc(t: ^testing.T) {
-	micro_grams := Micro_Grams(int){12_000_000}
-	grams := to_grams(micro_grams)
-
-	testing.expect_value(t, grams, Grams(int){12})
-}
-
-@(test)
-test_micro_grams_to_kilo_grams :: proc(t: ^testing.T) {
-	micro_grams := Micro_Grams(int){5_000_000_000}
-	kilo_grams := to_kilo_grams(micro_grams)
-
-	testing.expect_value(t, kilo_grams, Kilo_Grams(int){5})
-}
-
-@(test)
-test_micro_grams_to_milli_grams :: proc(t: ^testing.T) {
-	micro_grams := Micro_Grams(int){5_000}
-	milli_grams := to_milli_grams(micro_grams)
-
-	testing.expect_value(t, milli_grams, Milli_Grams(int){5})
-}
-
-@(test)
-test_micro_grams_to_nano_grams :: proc(t: ^testing.T) {
-	micro_grams := Micro_Grams(int){5}
-	nano_grams := to_nano_grams(micro_grams)
-
-	testing.expect_value(t, nano_grams, Nano_Grams(int){5_000})
-}
-
-@(test)
-test_micro_grams_to_pounds :: proc(t: ^testing.T) {
-	micro_grams := Micro_Grams(f64){453_592_370}
-	pounds := to_pounds(micro_grams)
-
-	testing.expect(t, pounds.v > 0.9999999 && pounds.v < 1.0000001)
-}
-
-@(test)
-test_micro_grams_to_ounces :: proc(t: ^testing.T) {
-	micro_grams := Micro_Grams(f64){28_349_523.125}
-	ounces := to_ounces(micro_grams)
-
-	testing.expect(t, ounces.v > 0.9999999 && ounces.v < 1.0000001)
-}
-
-@(test)
-test_nano_grams_to_grams :: proc(t: ^testing.T) {
-	nano_grams := Nano_Grams(int){12_000_000_000}
-	grams := to_grams(nano_grams)
-
-	testing.expect_value(t, grams, Grams(int){12})
-}
-
-@(test)
-test_nano_grams_to_kilo_grams :: proc(t: ^testing.T) {
-	nano_grams := Nano_Grams(int){5_000_000_000_000}
-	kilo_grams := to_kilo_grams(nano_grams)
-
-	testing.expect_value(t, kilo_grams, Kilo_Grams(int){5})
-}
-
-@(test)
-test_nano_grams_to_milli_grams :: proc(t: ^testing.T) {
-	nano_grams := Nano_Grams(int){5_000_000}
-	milli_grams := to_milli_grams(nano_grams)
-
-	testing.expect_value(t, milli_grams, Milli_Grams(int){5})
-}
-
-@(test)
-test_nano_grams_to_micro_grams :: proc(t: ^testing.T) {
-	nano_grams := Nano_Grams(int){5_000}
-	micro_grams := to_micro_grams(nano_grams)
-
-	testing.expect_value(t, micro_grams, Micro_Grams(int){5})
-}
-
-@(test)
-test_nano_grams_to_pounds :: proc(t: ^testing.T) {
-	nano_grams := Nano_Grams(f64){453_592_370_000}
-	pounds := to_pounds(nano_grams)
-
-	testing.expect(t, pounds.v > 0.9999999 && pounds.v < 1.0000001)
-}
-
-@(test)
-test_nano_grams_to_ounces :: proc(t: ^testing.T) {
-	nano_grams := Nano_Grams(f64){28_349_523_125}
-	ounces := to_ounces(nano_grams)
-
-	testing.expect(t, ounces.v > 0.9999999 && ounces.v < 1.0000001)
-}
-
-@(test)
-test_pounds_to_grams :: proc(t: ^testing.T) {
-	pounds := Pounds(f32){1}
-	grams := to_grams(pounds)
-
-	testing.expect(t, grams.v > 453.59 && grams.v < 453.6)
-}
-
-@(test)
-test_pounds_to_kilo_grams :: proc(t: ^testing.T) {
-	pounds := Pounds(f32){1}
-	kilo_grams := to_kilo_grams(pounds)
-
-	testing.expect(t, kilo_grams.v > 0.4535 && kilo_grams.v < 0.4536)
-}
-
-@(test)
-test_pounds_to_milli_grams :: proc(t: ^testing.T) {
-	pounds := Pounds(f64){1}
-	milli_grams := to_milli_grams(pounds)
-
-	testing.expect(t, milli_grams.v > 453_592.369 && milli_grams.v < 453_592.371)
-}
-
-@(test)
-test_pounds_to_micro_grams :: proc(t: ^testing.T) {
-	pounds := Pounds(f64){1}
-	micro_grams := to_micro_grams(pounds)
-
-	testing.expect(t, micro_grams.v > 453_592_369.9 && micro_grams.v < 453_592_370.1)
-}
-
-@(test)
-test_pounds_to_nano_grams :: proc(t: ^testing.T) {
-	pounds := Pounds(f64){1}
-	nano_grams := to_nano_grams(pounds)
-
-	testing.expect(t, nano_grams.v > 453_592_369_999.0 && nano_grams.v < 453_592_370_001.0)
-}
-
-@(test)
-test_pounds_to_ounces :: proc(t: ^testing.T) {
-	pounds := Pounds(int){2}
-	ounces := to_ounces(pounds)
-
-	testing.expect_value(t, ounces, Ounces(int){32})
-}
-
-@(test)
-test_ounces_to_grams :: proc(t: ^testing.T) {
-	ounces := Ounces(f32){1}
-	grams := to_grams(ounces)
-
-	testing.expect(t, grams.v > 28.34 && grams.v < 28.35)
-}
-
-@(test)
-test_ounces_to_kilo_grams :: proc(t: ^testing.T) {
-	ounces := Ounces(f32){1}
-	kilo_grams := to_kilo_grams(ounces)
-
-	testing.expect(t, kilo_grams.v > 0.0283 && kilo_grams.v < 0.0284)
-}
-
-@(test)
-test_ounces_to_milli_grams :: proc(t: ^testing.T) {
-	ounces := Ounces(f64){1}
-	milli_grams := to_milli_grams(ounces)
-
-	testing.expect(t, milli_grams.v > 28_349.523 && milli_grams.v < 28_349.524)
-}
-
-@(test)
-test_ounces_to_micro_grams :: proc(t: ^testing.T) {
-	ounces := Ounces(f64){1}
-	micro_grams := to_micro_grams(ounces)
-
-	testing.expect(t, micro_grams.v > 28_349_523.124 && micro_grams.v < 28_349_523.126)
-}
-
-@(test)
-test_ounces_to_nano_grams :: proc(t: ^testing.T) {
-	ounces := Ounces(f64){1}
-	nano_grams := to_nano_grams(ounces)
-
-	testing.expect(t, nano_grams.v > 28_349_523_124.0 && nano_grams.v < 28_349_523_126.0)
-}
-
-@(test)
-test_ounces_to_pounds :: proc(t: ^testing.T) {
-	ounces := Ounces(int){32}
-	pounds := to_pounds(ounces)
-
-	testing.expect_value(t, pounds, Pounds(int){2})
-}
@@ -10,14 +10,14 @@ Pascals :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_kilo_pascals procedure group.
+@(private = "file")
 pascals_to_kilo_pascals :: #force_inline proc "contextless" (
 	pascals: Pascals($V),
 ) -> Kilo_Pascals(V) where intrinsics.type_is_numeric(V) {
 	return Kilo_Pascals(V){pascals.v / KILO}
 }

-// Prefer the to_torr procedure group.
+@(private = "file")
 pascals_to_torr :: #force_inline proc "contextless" (
 	pascals: Pascals($V),
 ) -> Torr(V) where intrinsics.type_is_float(V) {
@@ -29,16 +29,15 @@ Kilo_Pascals :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_pascals procedure group.
+@(private = "file")
 kilo_pascals_to_pascals :: #force_inline proc "contextless" (
 	kilo_pascals: Kilo_Pascals($V),
 ) -> Pascals(V) where intrinsics.type_is_numeric(V) {
 	return Pascals(V){kilo_pascals.v * KILO}
 }

-// Prefer the to_psi procedure group.
 kilo_pascals_to_psi :: #force_inline proc "contextless" (
-	kilo_pascals: Kilo_Pascals($V),
+    kilo_pascals: Kilo_Pascals($V),
 ) -> Psi(V) where intrinsics.type_is_float(V) {
 	return Psi(V){kilo_pascals.v / KILO_PASCALS_PER_PSI}
 }
@@ -48,7 +47,7 @@ Torr :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_pascals procedure group.
+@(private = "file")
 torr_to_pascals :: #force_inline proc "contextless" (
 	torr: Torr($V),
 ) -> Pascals(V) where intrinsics.type_is_float(V) {
@@ -60,7 +59,6 @@ Psi :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_kilo_pascals procedure group.
 psi_to_kilo_pascals :: #force_inline proc "contextless" (
 	psi: Psi($V),
 ) -> Kilo_Pascals(V) where intrinsics.type_is_float(V) {
@@ -81,11 +79,11 @@ to_kilo_pascals :: proc {
 }

 to_torr :: proc {
-	pascals_to_torr,
+    pascals_to_torr,
 }

 to_psi :: proc {
-	kilo_pascals_to_psi,
+    kilo_pascals_to_psi,
 }


@@ -112,25 +110,25 @@ test_kilo_pascals_to_pascals :: proc(t: ^testing.T) {

 @(test)
 test_pascals_to_torr :: proc(t: ^testing.T) {
-	pascals := Pascals(f32){1000}
-	torr := to_torr(pascals)
+    pascals := Pascals(f32){1000}
+    torr := to_torr(pascals)

-	testing.expect(t, torr.v > 7.49 && torr.v < 7.51)
+    testing.expect(t, torr.v > 7.49 && torr.v < 7.51)
 }

 @(test)
 test_torr_to_pascals :: proc(t: ^testing.T) {
-	torr := Torr(f32){7.5}
-	pascals := to_pascals(torr)
+    torr := Torr(f32){7.5}
+    pascals := to_pascals(torr)

-	testing.expect(t, pascals.v > 999.91 && pascals.v < 999.92)
+    testing.expect(t, pascals.v > 999.91 && pascals.v < 999.92)
 }

 @(test)
 test_psi_kilo_pascals :: proc(t: ^testing.T) {
-	psi := Psi(f32){2.5}
-	kilo_pascals := Kilo_Pascals(f32){17.23689323292091}
+    psi := Psi(f32){2.5}
+    kilo_pascals := Kilo_Pascals(f32){17.23689323292091}

-	testing.expect(t, to_kilo_pascals(psi).v > 17.22 && to_kilo_pascals(psi).v < 17.24)
-	testing.expect(t, to_psi(kilo_pascals).v > 2.49 && to_psi(kilo_pascals).v < 2.51)
+    testing.expect(t, to_kilo_pascals(psi).v > 17.22 && to_kilo_pascals(psi).v < 17.24)
+    testing.expect(t, to_psi(kilo_pascals).v > 2.49 && to_psi(kilo_pascals).v < 2.51)
 }
@@ -17,11 +17,6 @@ kelvins_celsius_offset :: #force_inline proc "contextless" (
 	return OFFSET
 }

-@(private = "file")
-FAHRENHEIT_PER_CELSIUS_DEGREE :: 9.0 / 5.0
-@(private = "file")
-CELSIUS_PER_FAHRENHEIT_DEGREE :: 5.0 / 9.0
-
 // ---------------------------------------------------------------------------------------------------------------------
 // ----- Types ------------------------
 // ---------------------------------------------------------------------------------------------------------------------
@@ -30,33 +25,26 @@ Kelvins :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_celsius procedure group.
+@(private = "file")
 kelvins_to_celsius :: #force_inline proc "contextless" (
 	kelvins: Kelvins($V),
 ) -> Celsius(V) where intrinsics.type_is_numeric(V) {
 	return Celsius(V){kelvins.v - kelvins_celsius_offset(V)}
 }

-// Prefer the to_deci_kelvins procedure group.
+@(private = "file")
 kelvins_to_deci_kelvins :: #force_inline proc "contextless" (
 	kelvins: Kelvins($V),
 ) -> Deci_Kelvins(V) where intrinsics.type_is_numeric(V) {
 	return Deci_Kelvins(V){kelvins.v * DECI}
 }

-// Prefer the to_fahrenheit procedure group.
-kelvins_to_fahrenheit :: #force_inline proc "contextless" (
-	kelvins: Kelvins($V),
-) -> Fahrenheit(V) where intrinsics.type_is_numeric(V) {
-	return celsius_to_fahrenheit(kelvins_to_celsius(kelvins))
-}
-
 //----- Decikelvins ----------------------------------
 Deci_Kelvins :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_kelvins procedure group.
+@(private = "file")
 deci_kelvins_to_kelvins :: #force_inline proc "contextless" (
 	deci_kelvins: Deci_Kelvins($V),
 ) -> Kelvins(V) where intrinsics.type_is_numeric(V) {
@@ -68,74 +56,38 @@ Celsius :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_kelvins procedure group.
+@(private = "file")
 celsius_to_kelvins :: #force_inline proc "contextless" (
 	degrees_celsius: Celsius($V),
 ) -> Kelvins(V) where intrinsics.type_is_numeric(V) {
 	return Kelvins(V){degrees_celsius.v + kelvins_celsius_offset(V)}
 }

-// Prefer the to_deci_celsius procedure group.
+@(private = "file")
 celsius_to_deci_celsius :: #force_inline proc "contextless" (
 	degrees_celsius: Celsius($V),
 ) -> Deci_Celsius(V) where intrinsics.type_is_numeric(V) {
 	return Deci_Celsius(V){degrees_celsius.v * DECI}
 }

-// Prefer the to_fahrenheit procedure group.
-@(fast_math = {.Allow_Contract})
-celsius_to_fahrenheit :: #force_inline proc "contextless" (
-	degrees_celsius: Celsius($V),
-) -> Fahrenheit(V) where intrinsics.type_is_numeric(V) {
-	when intrinsics.type_is_float(V) {
-		return Fahrenheit(V){degrees_celsius.v * FAHRENHEIT_PER_CELSIUS_DEGREE + 32.0}
-	} else {
-		return Fahrenheit(V){degrees_celsius.v * 9 / 5 + 32}
-	}
-}
-
 //----- Deci Degrees Celsius ----------------------------------
 Deci_Celsius :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_celsius procedure group.
+@(private = "file")
 deci_celsius_to_celsius :: #force_inline proc "contextless" (
 	deci_degrees_celsius: Deci_Celsius($V),
 ) -> Celsius(V) where intrinsics.type_is_numeric(V) {
 	return Celsius(V){deci_degrees_celsius.v / DECI}
 }

-//----- Degrees Fahrenheit ----------------------------------
-Fahrenheit :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_celsius procedure group.
-fahrenheit_to_celsius :: #force_inline proc "contextless" (
-	degrees_fahrenheit: Fahrenheit($V),
-) -> Celsius(V) where intrinsics.type_is_numeric(V) {
-	when intrinsics.type_is_float(V) {
-		return Celsius(V){(degrees_fahrenheit.v - 32.0) * CELSIUS_PER_FAHRENHEIT_DEGREE}
-	} else {
-		return Celsius(V){(degrees_fahrenheit.v - 32) * 5 / 9}
-	}
-}
-
-// Prefer the to_kelvins procedure group.
-fahrenheit_to_kelvins :: #force_inline proc "contextless" (
-	degrees_fahrenheit: Fahrenheit($V),
-) -> Kelvins(V) where intrinsics.type_is_numeric(V) {
-	return celsius_to_kelvins(fahrenheit_to_celsius(degrees_fahrenheit))
-}
-
 // ---------------------------------------------------------------------------------------------------------------------
 // ----- Conversion Overloads ------------------------
 // ---------------------------------------------------------------------------------------------------------------------
 to_kelvins :: proc {
 	deci_kelvins_to_kelvins,
 	celsius_to_kelvins,
-	fahrenheit_to_kelvins,
 }

 to_deci_kelvins :: proc {
@@ -145,18 +97,12 @@ to_deci_kelvins :: proc {
 to_celsius :: proc {
 	kelvins_to_celsius,
 	deci_celsius_to_celsius,
-	fahrenheit_to_celsius,
 }

 to_deci_celsius :: proc {
 	celsius_to_deci_celsius,
 }

-to_fahrenheit :: proc {
-	celsius_to_fahrenheit,
-	kelvins_to_fahrenheit,
-}
-
 // ---------------------------------------------------------------------------------------------------------------------
 // ----- Tests ------------------------
 // ---------------------------------------------------------------------------------------------------------------------
@@ -180,80 +126,32 @@ test_kelvins_to_deci_kelvins :: proc(t: ^testing.T) {

 @(test)
 test_deci_kelvins_to_kelvins :: proc(t: ^testing.T) {
-	deci_kelvins := Deci_Kelvins(int){1000}
-	kelvins := to_kelvins(deci_kelvins)
+    deci_kelvins := Deci_Kelvins(int){1000}
+    kelvins := to_kelvins(deci_kelvins)

-	testing.expect_value(t, kelvins, Kelvins(int){100})
+    testing.expect_value(t, kelvins, Kelvins(int){100})
 }

 @(test)
 test_celsius_to_kelvins :: proc(t: ^testing.T) {
-	degrees_celsius := Celsius(f32){0}
-	kelvins := to_kelvins(degrees_celsius)
+    degrees_celsius := Celsius(f32){0}
+    kelvins := to_kelvins(degrees_celsius)

-	testing.expect_value(t, kelvins, Kelvins(f32){273.15})
+    testing.expect_value(t, kelvins, Kelvins(f32){273.15})
 }

 @(test)
 test_celsius_to_deci_celsius :: proc(t: ^testing.T) {
-	degrees_celsius := Celsius(int){100}
-	deci_degrees_celsius := to_deci_celsius(degrees_celsius)
+    degrees_celsius := Celsius(int){100}
+    deci_degrees_celsius := to_deci_celsius(degrees_celsius)

-	testing.expect_value(t, deci_degrees_celsius, Deci_Celsius(int){1000})
+    testing.expect_value(t, deci_degrees_celsius, Deci_Celsius(int){1000})
 }

 @(test)
 test_deci_celsius_to_celsius :: proc(t: ^testing.T) {
-	deci_degrees_celsius := Deci_Celsius(int){1000}
-	degrees_celsius := to_celsius(deci_degrees_celsius)
+    deci_degrees_celsius := Deci_Celsius(int){1000}
+    degrees_celsius := to_celsius(deci_degrees_celsius)

-	testing.expect_value(t, degrees_celsius, Celsius(int){100})
-}
-
-@(test)
-test_celsius_to_fahrenheit :: proc(t: ^testing.T) {
-	degrees_celsius := Celsius(int){100}
-	degrees_fahrenheit := to_fahrenheit(degrees_celsius)
-
-	testing.expect_value(t, degrees_fahrenheit, Fahrenheit(int){212})
-}
-
-@(test)
-test_fahrenheit_to_celsius :: proc(t: ^testing.T) {
-	degrees_fahrenheit := Fahrenheit(int){212}
-	degrees_celsius := to_celsius(degrees_fahrenheit)
-
-	testing.expect_value(t, degrees_celsius, Celsius(int){100})
-}
-
-@(test)
-test_kelvins_to_fahrenheit :: proc(t: ^testing.T) {
-	kelvins := Kelvins(int){373}
-	degrees_fahrenheit := to_fahrenheit(kelvins)
-
-	testing.expect_value(t, degrees_fahrenheit, Fahrenheit(int){212})
-}
-
-@(test)
-test_fahrenheit_to_kelvins :: proc(t: ^testing.T) {
-	degrees_fahrenheit := Fahrenheit(int){212}
-	kelvins := to_kelvins(degrees_fahrenheit)
-
-	testing.expect_value(t, kelvins, Kelvins(int){373})
-}
-
-@(test)
-test_celsius_to_fahrenheit_f64 :: proc(t: ^testing.T) {
-	// -40 is the point where the Celsius and Fahrenheit scales coincide. It converts exactly in
-	// f64, so a passing equality here also confirms the 9/5 ratio constant is not lossy.
-	degrees_fahrenheit := to_fahrenheit(Celsius(f64){-40})
-
-	testing.expect_value(t, degrees_fahrenheit, Fahrenheit(f64){-40})
-}
-
-@(test)
-test_fahrenheit_to_celsius_f64 :: proc(t: ^testing.T) {
-	degrees_celsius := to_celsius(Fahrenheit(f64){-40})
-
-	testing.expect_value(t, degrees_celsius, Celsius(f64){-40})
+    testing.expect_value(t, degrees_celsius, Celsius(int){100})
 }
@@ -7,7 +7,7 @@ Volts :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_milli_volts procedure group.
+@(private = "file")
 volts_to_milli_volts :: #force_inline proc "contextless" (
 	volts: Volts($V),
 ) -> Milli_Volts(V) where intrinsics.type_is_numeric(V) {
@@ -19,7 +19,7 @@ Milli_Volts :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_volts procedure group.
+@(private = "file")
 milli_volts_to_volts :: #force_inline proc "contextless" (
 	milli_volts: Milli_Volts($V),
 ) -> Volts(V) where intrinsics.type_is_numeric(V) {
@@ -30,11 +30,11 @@ milli_volts_to_volts :: #force_inline proc "contextless" (
 // ----- Conversion Overloads ------------------------
 // ---------------------------------------------------------------------------------------------------------------------
 to_volts :: proc {
-	milli_volts_to_volts,
+    milli_volts_to_volts,
 }

 to_milli_volts :: proc {
-	volts_to_milli_volts,
+    volts_to_milli_volts,
 }

 // ---------------------------------------------------------------------------------------------------------------------
@@ -52,8 +52,8 @@ test_volts_to_milli_volts :: proc(t: ^testing.T) {

 @(test)
 test_milli_volts_to_volts :: proc(t: ^testing.T) {
-	milli_volts := Milli_Volts(int){1000}
-	volts := to_volts(milli_volts)
+    milli_volts := Milli_Volts(int){1000}
+    volts := to_volts(milli_volts)

-	testing.expect_value(t, volts, Volts(int){1})
+    testing.expect_value(t, volts, Volts(int){1})
 }
@@ -2,34 +2,16 @@ package quantity

 import "base:intrinsics"

-LITERS_PER_GALLON :: 3.785411784
-MICRO_LITERS_PER_GALLON :: 473176473.0 / 125.0
-
 //----- Liters ----------------------------------
 Liters :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_milli_liters procedure group.
+@(private = "file")
 liters_to_milli_liters :: #force_inline proc "contextless" (
-	liters: Liters($V),
+    liters: Liters($V),
 ) -> Milli_Liters(V) where intrinsics.type_is_numeric(V) {
-	return Milli_Liters(V){liters.v * MILLI}
-}
-
-// Prefer the to_gallons procedure group.
-@(fast_math = {.Allow_Reciprocal})
-liters_to_gallons :: #force_inline proc "contextless" (
-	liters: Liters($V),
-) -> Gallons(V) where intrinsics.type_is_float(V) {
-	return Gallons(V){liters.v / LITERS_PER_GALLON}
-}
-
-// Prefer the to_micro_liters procedure group.
-liters_to_micro_liters :: #force_inline proc "contextless" (
-	liters: Liters($V),
-) -> Micro_Liters(V) where intrinsics.type_is_numeric(V) {
-	return Micro_Liters(V){liters.v * MICRO}
+    return Milli_Liters(V){liters.v * MILLI}
 }

 //----- Milliliters ----------------------------------
@@ -37,92 +19,22 @@ Milli_Liters :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }

-// Prefer the to_liters procedure group.
-@(fast_math = {.Allow_Reciprocal})
+@(private = "file")
 milli_liters_to_liters :: #force_inline proc "contextless" (
 	milli_liters: Milli_Liters($V),
 ) -> Liters(V) where intrinsics.type_is_numeric(V) {
 	return Liters(V){milli_liters.v / MILLI}
 }

-// Prefer the to_micro_liters procedure group.
-milli_liters_to_micro_liters :: #force_inline proc "contextless" (
-	milli_liters: Milli_Liters($V),
-) -> Micro_Liters(V) where intrinsics.type_is_numeric(V) {
-	return Micro_Liters(V){milli_liters.v * (MICRO / MILLI)}
-}
-
-//----- Microliters ----------------------------------
-Micro_Liters :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_liters procedure group.
-@(fast_math = {.Allow_Reciprocal})
-micro_liters_to_liters :: #force_inline proc "contextless" (
-	micro_liters: Micro_Liters($V),
-) -> Liters(V) where intrinsics.type_is_numeric(V) {
-	return Liters(V){micro_liters.v / MICRO}
-}
-
-// Prefer the to_milli_liters procedure group.
-@(fast_math = {.Allow_Reciprocal})
-micro_liters_to_milli_liters :: #force_inline proc "contextless" (
-	micro_liters: Micro_Liters($V),
-) -> Milli_Liters(V) where intrinsics.type_is_numeric(V) {
-	return Milli_Liters(V){micro_liters.v / (MICRO / MILLI)}
-}
-
-// Prefer the to_gallons procedure group.
-@(fast_math = {.Allow_Reciprocal})
-micro_liters_to_gallons :: #force_inline proc "contextless" (
-	micro_liters: Micro_Liters($V),
-) -> Gallons(V) where intrinsics.type_is_float(V) {
-	return Gallons(V){micro_liters.v / MICRO_LITERS_PER_GALLON}
-}
-
-//----- Gallons ----------------------------------
-Gallons :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_liters procedure group.
-gallons_to_liters :: #force_inline proc "contextless" (
-	gallons: Gallons($V),
-) -> Liters(V) where intrinsics.type_is_float(V) {
-	return Liters(V){gallons.v * LITERS_PER_GALLON}
-}
-
-// Prefer the to_micro_liters procedure group.
-gallons_to_micro_liters :: #force_inline proc "contextless" (
-	gallons: Gallons($V),
-) -> Micro_Liters(V) where intrinsics.type_is_float(V) {
-	return Micro_Liters(V){gallons.v * MICRO_LITERS_PER_GALLON}
-}
-
 // ---------------------------------------------------------------------------------------------------------------------
 // ----- Conversion Overloads ------------------------
 // ---------------------------------------------------------------------------------------------------------------------
 to_liters :: proc {
-	milli_liters_to_liters,
-	micro_liters_to_liters,
-	gallons_to_liters,
+    milli_liters_to_liters,
 }

 to_milli_liters :: proc {
-	liters_to_milli_liters,
-	micro_liters_to_milli_liters,
-}
-
-to_micro_liters :: proc {
-	liters_to_micro_liters,
-	milli_liters_to_micro_liters,
-	gallons_to_micro_liters,
-}
-
-to_gallons :: proc {
-	liters_to_gallons,
-	micro_liters_to_gallons,
+    liters_to_milli_liters,
 }

 // ---------------------------------------------------------------------------------------------------------------------
@@ -132,80 +44,16 @@ import "core:testing"

 @(test)
 test_liters_to_milli_liters :: proc(t: ^testing.T) {
-	liters := Liters(int){12}
-	milli_liters := to_milli_liters(liters)
+    liters := Liters(int){12}
+    milli_liters := to_milli_liters(liters)

-	testing.expect_value(t, milli_liters, Milli_Liters(int){12_000})
+    testing.expect_value(t, milli_liters, Milli_Liters(int){12_000})
 }

 @(test)
 test_milli_liters_to_liters :: proc(t: ^testing.T) {
-	milli_liters := Milli_Liters(int){12_000}
-	liters := to_liters(milli_liters)
+    milli_liters := Milli_Liters(int){12_000}
+    liters := to_liters(milli_liters)

-	testing.expect_value(t, liters, Liters(int){12})
-}
-
-@(test)
-test_gallons_to_liters :: proc(t: ^testing.T) {
-	gallons := Gallons(f32){1}
-	liters := to_liters(gallons)
-
-	testing.expect(t, liters.v > 3.78 && liters.v < 3.79)
-}
-
-@(test)
-test_liters_to_gallons :: proc(t: ^testing.T) {
-	liters := Liters(f32){3.785411784}
-	gallons := to_gallons(liters)
-
-	testing.expect(t, gallons.v > 0.99 && gallons.v < 1.01)
-}
-
-@(test)
-test_liters_to_micro_liters :: proc(t: ^testing.T) {
-	liters := Liters(int){12}
-	micro_liters := to_micro_liters(liters)
-
-	testing.expect_value(t, micro_liters, Micro_Liters(int){12_000_000})
-}
-
-@(test)
-test_micro_liters_to_liters :: proc(t: ^testing.T) {
-	micro_liters := Micro_Liters(int){12_000_000}
-	liters := to_liters(micro_liters)
-
-	testing.expect_value(t, liters, Liters(int){12})
-}
-
-@(test)
-test_milli_liters_to_micro_liters :: proc(t: ^testing.T) {
-	milli_liters := Milli_Liters(int){5}
-	micro_liters := to_micro_liters(milli_liters)
-
-	testing.expect_value(t, micro_liters, Micro_Liters(int){5_000})
-}
-
-@(test)
-test_micro_liters_to_milli_liters :: proc(t: ^testing.T) {
-	micro_liters := Micro_Liters(int){5_000}
-	milli_liters := to_milli_liters(micro_liters)
-
-	testing.expect_value(t, milli_liters, Milli_Liters(int){5})
-}
-
-@(test)
-test_gallons_to_micro_liters :: proc(t: ^testing.T) {
-	gallons := Gallons(f64){1}
-	micro_liters := to_micro_liters(gallons)
-
-	testing.expect(t, micro_liters.v > 3_785_411.783 && micro_liters.v < 3_785_411.785)
-}
-
-@(test)
-test_micro_liters_to_gallons :: proc(t: ^testing.T) {
-	micro_liters := Micro_Liters(f64){3_785_411.784}
-	gallons := to_gallons(micro_liters)
-
-	testing.expect(t, gallons.v > 0.9999999 && gallons.v < 1.0000001)
+    testing.expect_value(t, liters, Liters(int){12})
 }
@@ -2,58 +2,6 @@ package quantity

 import "base:intrinsics"

-//----- Liters Per Minute ----------------------------------
 Liters_Per_Minute :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
 	v: V,
 }
-
-// Prefer the to_gallons_per_minute procedure group.
-liters_per_minute_to_gallons_per_minute :: #force_inline proc "contextless" (
-	liters_per_minute: Liters_Per_Minute($V),
-) -> Gallons_Per_Minute(V) where intrinsics.type_is_float(V) {
-	return Gallons_Per_Minute(V){liters_per_minute.v / LITERS_PER_GALLON}
-}
-
-//----- Gallons Per Minute ----------------------------------
-Gallons_Per_Minute :: struct($V: typeid) where intrinsics.type_is_numeric(V) {
-	v: V,
-}
-
-// Prefer the to_liters_per_minute procedure group.
-gallons_per_minute_to_liters_per_minute :: #force_inline proc "contextless" (
-	gallons_per_minute: Gallons_Per_Minute($V),
-) -> Liters_Per_Minute(V) where intrinsics.type_is_float(V) {
-	return Liters_Per_Minute(V){gallons_per_minute.v * LITERS_PER_GALLON}
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Conversion Overloads ------------------------
-// ---------------------------------------------------------------------------------------------------------------------
-to_liters_per_minute :: proc {
-	gallons_per_minute_to_liters_per_minute,
-}
-
-to_gallons_per_minute :: proc {
-	liters_per_minute_to_gallons_per_minute,
-}
-
-// ---------------------------------------------------------------------------------------------------------------------
-// ----- Tests ------------------------
-// ---------------------------------------------------------------------------------------------------------------------
-import "core:testing"
-
-@(test)
-test_gallons_per_minute_to_liters_per_minute :: proc(t: ^testing.T) {
-	gallons_per_minute := Gallons_Per_Minute(f32){1}
-	liters_per_minute := to_liters_per_minute(gallons_per_minute)
-
-	testing.expect(t, liters_per_minute.v > 3.78 && liters_per_minute.v < 3.79)
-}
-
-@(test)
-test_liters_per_minute_to_gallons_per_minute :: proc(t: ^testing.T) {
-	liters_per_minute := Liters_Per_Minute(f32){3.785411784}
-	gallons_per_minute := to_gallons_per_minute(liters_per_minute)
-
-	testing.expect(t, gallons_per_minute.v > 0.99 && gallons_per_minute.v < 1.01)
-}
@@ -1,139 +1,103 @@
 package ring

-import "base:runtime"
 import "core:fmt"

 @(private)
 ODIN_BOUNDS_CHECK :: !ODIN_NO_BOUNDS_CHECK

-Ring :: struct($E: typeid) {
-	data:                  []E,
-	next_write_index, len: int,
+Ring :: struct($T: typeid) {
+	data:            []T,
+	_end_index, len: int,
 }

-Ring_Soa :: struct($E: typeid) {
-	data:                  #soa[]E,
-	next_write_index, len: int,
+Ring_Soa :: struct($T: typeid) {
+	data:            #soa[]T,
+	_end_index, len: int,
 }

-destroy_aos :: #force_inline proc(
-	ring: ^Ring($E),
-	allocator := context.allocator,
-) -> runtime.Allocator_Error {
-	return delete(ring.data)
+from_slice_raos :: #force_inline proc(data: $T/[]$E) -> Ring(E) {
+	return {data = data, _end_index = -1}
 }

-destroy_soa :: #force_inline proc(
-	ring: ^Ring_Soa($E),
-	allocator := context.allocator,
-) -> runtime.Allocator_Error {
-	return delete(ring.data)
+from_slice_rsoa :: #force_inline proc(data: $T/#soa[]$E) -> Ring_Soa(E) {
+	return {data = data, _end_index = -1}
 }

-destroy :: proc {
-	destroy_aos,
-	destroy_soa,
+from_slice :: proc {
+	from_slice_raos,
+	from_slice_rsoa,
 }

-create_aos :: #force_inline proc(
-	$E: typeid,
-	capacity: int,
-	allocator := context.allocator,
-) -> (
-	ring: Ring(E),
-	err: runtime.Allocator_Error,
-) #optional_allocator_error {
-	ring.data, err = make([]E, capacity, allocator)
-	return ring, err
-}
-
-create_soa :: #force_inline proc(
-	$E: typeid,
-	capacity: int,
-	allocator := context.allocator,
-) -> (
-	ring: Ring_Soa(E),
-	err: runtime.Allocator_Error,
-) #optional_allocator_error {
-	ring.data, err = make(#soa[]E, capacity, allocator)
-	return ring, err
-}
-
-// All contents of `data` will be completely ignored, `data` is treated as an empty slice.
-init_from_slice_aos :: #force_inline proc(ring: ^Ring($E), data: $T/[]E) {
-	ring.data = data
-	ring.len = 0
-	ring.next_write_index = 0
-	return
-}
-
-// All contents of `data` will be completely ignored, `data` is treated as an empty slice.
-init_from_slice_soa :: #force_inline proc(ring: ^Ring_Soa($E), data: $T/#soa[]E) {
-	ring.data = data
-	ring.len = 0
-	ring.next_write_index = 0
-	return
-}
-
-init_from_slice :: proc {
-	init_from_slice_aos,
-	init_from_slice_soa,
-}
-
-// Internal
 // Index in the backing array where the ring starts
-start_index_aos :: #force_inline proc(ring: Ring($E)) -> int {
-	return ring.len < len(ring.data) ? 0 : ring.next_write_index
+_start_index_raos :: proc(ring: Ring($T)) -> int {
+	if ring.len < len(ring.data) {
+		return 0
+	} else {
+		start_index := ring._end_index + 1
+		return 0 if start_index == len(ring.data) else start_index
+	}
 }

-// Internal
 // Index in the backing array where the ring starts
-start_index_soa :: #force_inline proc(ring: Ring_Soa($E)) -> int {
-	return ring.len < len(ring.data) ? 0 : ring.next_write_index
+_start_index_rsoa :: proc(ring: Ring_Soa($T)) -> int {
+	if ring.len < len(ring.data) {
+		return 0
+	} else {
+		start_index := ring._end_index + 1
+		return 0 if start_index == len(ring.data) else start_index
+	}
 }

-advance_aos :: #force_inline proc(ring: ^Ring($E)) {
+advance_raos :: proc(ring: ^Ring($T)) {
 	// Length
 	if ring.len != len(ring.data) do ring.len += 1
-	// Write index
-	ring.next_write_index += 1
-	if ring.next_write_index == len(ring.data) do ring.next_write_index = 0
+	// End index
+	if ring._end_index == len(ring.data) - 1 { 	// If we are at the end of the backing array
+		ring._end_index = 0 // Overflow end to 0
+	} else {
+		ring._end_index += 1
+	}
 }

-advance_soa :: #force_inline proc(ring: ^Ring_Soa($E)) {
+advance_rsoa :: proc(ring: ^Ring_Soa($T)) {
 	// Length
 	if ring.len != len(ring.data) do ring.len += 1
-	// Write index
-	ring.next_write_index += 1
-	if ring.next_write_index == len(ring.data) do ring.next_write_index = 0
+	// End index
+	if ring._end_index == len(ring.data) - 1 { 	// If we are at the end of the backing array
+		ring._end_index = 0 // Overflow end to 0
+	} else {
+		ring._end_index += 1
+	}
 }

 advance :: proc {
-	advance_aos,
-	advance_soa,
+	advance_raos,
+	advance_rsoa,
 }

-push_aos :: #force_inline proc(ring: ^Ring($E), element: E) {
-	ring.data[ring.next_write_index] = element
+append_raos :: proc(ring: ^Ring($T), element: T) {
 	advance(ring)
+	ring.data[ring._end_index] = element
 }

-push_soa :: #force_inline proc(ring: ^Ring_Soa($E), element: E) {
-	ring.data[ring.next_write_index] = element
+append_rsoa :: proc(ring: ^Ring_Soa($T), element: T) {
 	advance(ring)
+	ring.data[ring._end_index] = element
 }

-push :: proc {
-	push_aos,
-	push_soa,
+append :: proc {
+	append_raos,
+	append_rsoa,
 }

-get_aos :: #force_inline proc(ring: Ring($E), index: int) -> ^E {
+get_raos :: proc(ring: Ring($T), index: int) -> ^T {
 	when ODIN_BOUNDS_CHECK {
-		fmt.assertf(index < ring.len, "Ring index %i out of bounds for length %i", index, ring.len)
+		if index >= ring.len {
+			panic(fmt.tprintf("Ring index %i out of bounds for length %i", index, ring.len))
+		}
 	}

-	array_index := start_index_aos(ring) + index
+	array_index := _start_index_raos(ring) + index
 	if array_index < len(ring.data) {
 		return &ring.data[array_index]
 	} else {
@@ -143,12 +107,14 @@ get_aos :: #force_inline proc(ring: Ring($E), index: int) -> ^E {
 }

 // SOA can't return soa pointer to parapoly T.
-get_soa :: #force_inline proc(ring: Ring_Soa($E), index: int) -> E {
+get_rsoa :: proc(ring: Ring_Soa($T), index: int) -> T {
 	when ODIN_BOUNDS_CHECK {
-		fmt.assertf(index < ring.len, "Ring index %i out of bounds for length %i", index, ring.len)
+		if index >= ring.len {
+			panic(fmt.tprintf("Ring index %i out of bounds for length %i", index, ring.len))
+		}
 	}

-	array_index := start_index_soa(ring) + index
+	array_index := _start_index_rsoa(ring) + index
 	if array_index < len(ring.data) {
 		return ring.data[array_index]
 	} else {
@@ -158,36 +124,36 @@ get_soa :: #force_inline proc(ring: Ring_Soa($E), index: int) -> E {
 }

 get :: proc {
-	get_aos,
-	get_soa,
+	get_raos,
+	get_rsoa,
 }

-get_last_aos :: #force_inline proc(ring: Ring($E)) -> ^E {
+get_last_raos :: #force_inline proc(ring: Ring($T)) -> ^T {
 	return get(ring, ring.len - 1)
 }

-get_last_soa :: #force_inline proc(ring: Ring_Soa($E)) -> E {
+get_last_rsoa :: #force_inline proc(ring: Ring_Soa($T)) -> T {
 	return get(ring, ring.len - 1)
 }

 get_last :: proc {
-	get_last_aos,
-	get_last_soa,
+	get_last_raos,
+	get_last_rsoa,
 }

-clear_aos :: #force_inline proc "contextless" (ring: ^Ring($E)) {
+clear_raos :: #force_inline proc "contextless" (ring: ^Ring($T)) {
 	ring.len = 0
-	ring.next_write_index = 0
+	ring._end_index = -1
 }

-clear_soa :: #force_inline proc "contextless" (ring: ^Ring_Soa($E)) {
+clear_rsoa :: #force_inline proc "contextless" (ring: ^Ring_Soa($T)) {
 	ring.len = 0
-	ring.next_write_index = 0
+	ring._end_index = -1
 }

 clear :: proc {
-	clear_aos,
-	clear_soa,
+	clear_raos,
+	clear_rsoa,
 }

 // ---------------------------------------------------------------------------------------------------------------------
@@ -198,27 +164,28 @@ import "core:testing"

 @(test)
 test_ring_aos :: proc(t: ^testing.T) {
-	ring := create_aos(int, 10)
-	defer destroy(&ring)
+	data := make_slice([]int, 10)
+	ring := from_slice(data)
+	defer delete(ring.data)

 	for i in 1 ..= 5 {
-		push(&ring, i)
+		append(&ring, i)
 		log.debug("Length:", ring.len)
-		log.debug("Start index:", start_index_aos(ring))
-		log.debug("Next write index:", ring.next_write_index)
+		log.debug("Start index:", _start_index_raos(ring))
+		log.debug("End index:", ring._end_index)
 		log.debug(ring.data)
 	}
 	testing.expect_value(t, get(ring, 0)^, 1)
 	testing.expect_value(t, get(ring, 4)^, 5)
 	testing.expect_value(t, ring.len, 5)
-	testing.expect_value(t, ring.next_write_index, 5)
-	testing.expect_value(t, start_index_aos(ring), 0)
+	testing.expect_value(t, ring._end_index, 4)
+	testing.expect_value(t, _start_index_raos(ring), 0)

 	for i in 6 ..= 15 {
-		push(&ring, i)
+		append(&ring, i)
 		log.debug("Length:", ring.len)
-		log.debug("Start index:", start_index_aos(ring))
-		log.debug("Next write index:", ring.next_write_index)
+		log.debug("Start index:", _start_index_raos(ring))
+		log.debug("End index:", ring._end_index)
 		log.debug(ring.data)
 	}
 	testing.expect_value(t, get(ring, 0)^, 6)
@@ -226,22 +193,22 @@ test_ring_aos :: proc(t: ^testing.T) {
 	testing.expect_value(t, get(ring, 9)^, 15)
 	testing.expect_value(t, get_last(ring)^, 15)
 	testing.expect_value(t, ring.len, 10)
-	testing.expect_value(t, ring.next_write_index, 5)
-	testing.expect_value(t, start_index_aos(ring), 5)
+	testing.expect_value(t, ring._end_index, 4)
+	testing.expect_value(t, _start_index_raos(ring), 5)

 	for i in 15 ..= 25 {
-		push(&ring, i)
+		append(&ring, i)
 		log.debug("Length:", ring.len)
-		log.debug("Start index:", start_index_aos(ring))
-		log.debug("Next write index:", ring.next_write_index)
+		log.debug("Start index:", _start_index_raos(ring))
+		log.debug("End index:", ring._end_index)
 		log.debug(ring.data)
 	}
 	testing.expect_value(t, get(ring, 0)^, 16)
-	testing.expect_value(t, ring.next_write_index, 6)
+	testing.expect_value(t, ring._end_index, 5)
 	testing.expect_value(t, get_last(ring)^, 25)

 	clear(&ring)
-	push(&ring, 1)
+	append(&ring, 1)
 	testing.expect_value(t, ring.len, 1)
 	testing.expect_value(t, get(ring, 0)^, 1)
 }
@@ -252,27 +219,28 @@ test_ring_soa :: proc(t: ^testing.T) {
 		x, y: int,
 	}

-	ring := create_soa(Ints, 10)
-	defer destroy(&ring)
+	data := make_soa_slice(#soa[]Ints, 10)
+	ring := from_slice(data)
+	defer delete(ring.data)

 	for i in 1 ..= 5 {
-		push(&ring, Ints{i, i})
+		append(&ring, Ints{i, i})
 		log.debug("Length:", ring.len)
-		log.debug("Start index:", start_index_soa(ring))
-		log.debug("Next write index:", ring.next_write_index)
+		log.debug("Start index:", _start_index_rsoa(ring))
+		log.debug("End index:", ring._end_index)
 		log.debug(ring.data)
 	}
 	testing.expect_value(t, get(ring, 0), Ints{1, 1})
 	testing.expect_value(t, get(ring, 4), Ints{5, 5})
 	testing.expect_value(t, ring.len, 5)
-	testing.expect_value(t, ring.next_write_index, 5)
-	testing.expect_value(t, start_index_soa(ring), 0)
+	testing.expect_value(t, ring._end_index, 4)
+	testing.expect_value(t, _start_index_rsoa(ring), 0)

 	for i in 6 ..= 15 {
-		push(&ring, Ints{i, i})
+		append(&ring, Ints{i, i})
 		log.debug("Length:", ring.len)
-		log.debug("Start index:", start_index_soa(ring))
-		log.debug("Next write index:", ring.next_write_index)
+		log.debug("Start index:", _start_index_rsoa(ring))
+		log.debug("End index:", ring._end_index)
 		log.debug(ring.data)
 	}
 	testing.expect_value(t, get(ring, 0), Ints{6, 6})
@@ -280,160 +248,22 @@ test_ring_soa :: proc(t: ^testing.T) {
 	testing.expect_value(t, get(ring, 9), Ints{15, 15})
 	testing.expect_value(t, get_last(ring), Ints{15, 15})
 	testing.expect_value(t, ring.len, 10)
-	testing.expect_value(t, ring.next_write_index, 5)
-	testing.expect_value(t, start_index_soa(ring), 5)
+	testing.expect_value(t, ring._end_index, 4)
+	testing.expect_value(t, _start_index_rsoa(ring), 5)

 	for i in 15 ..= 25 {
-		push(&ring, Ints{i, i})
+		append(&ring, Ints{i, i})
 		log.debug("Length:", ring.len)
-		log.debug("Start index:", start_index_soa(ring))
-		log.debug("Next write index:", ring.next_write_index)
+		log.debug("Start index:", _start_index_rsoa(ring))
+		log.debug("End index:", ring._end_index)
 		log.debug(ring.data)
 	}
 	testing.expect_value(t, get(ring, 0), Ints{16, 16})
-	testing.expect_value(t, ring.next_write_index, 6)
+	testing.expect_value(t, ring._end_index, 5)
 	testing.expect_value(t, get_last(ring), Ints{25, 25})

 	clear(&ring)
-	push(&ring, Ints{1, 1})
+	append(&ring, Ints{1, 1})
 	testing.expect_value(t, ring.len, 1)
 	testing.expect_value(t, get(ring, 0), Ints{1, 1})
 }
-
-@(test)
-test_ring_aos_init_from_slice :: proc(t: ^testing.T) {
-	// Stack-allocated backing with pre-existing garbage and odd capacity.
-	backing: [7]int = {99, 99, 99, 99, 99, 99, 99}
-
-	ring: Ring(int)
-	init_from_slice(&ring, backing[:])
-
-	// Empty ring invariants after init_from_slice.
-	testing.expect_value(t, ring.len, 0)
-	testing.expect_value(t, ring.next_write_index, 0)
-	testing.expect_value(t, start_index_aos(ring), 0)
-
-	// Partial fill (3 / 7).
-	for i in 1 ..= 3 do push(&ring, i)
-	testing.expect_value(t, ring.len, 3)
-	testing.expect_value(t, ring.next_write_index, 3)
-	testing.expect_value(t, start_index_aos(ring), 0)
-	testing.expect_value(t, get(ring, 0)^, 1)
-	testing.expect_value(t, get(ring, 2)^, 3)
-	testing.expect_value(t, get_last(ring)^, 3)
-
-	// Fill exactly to capacity. Pushing element 7 must make len == cap
-	// AND wrap next_write_index from 6 back to 0 in the same step.
-	for i in 4 ..= 7 do push(&ring, i)
-	testing.expect_value(t, ring.len, 7)
-	testing.expect_value(t, ring.next_write_index, 0)
-	testing.expect_value(t, start_index_aos(ring), 0)
-	testing.expect_value(t, get(ring, 0)^, 1)
-	testing.expect_value(t, get(ring, 6)^, 7)
-	testing.expect_value(t, get_last(ring)^, 7)
-
-	// First overwrite — oldest element shifts by one.
-	push(&ring, 8)
-	testing.expect_value(t, ring.len, 7)
-	testing.expect_value(t, ring.next_write_index, 1)
-	testing.expect_value(t, start_index_aos(ring), 1)
-	testing.expect_value(t, get(ring, 0)^, 2)
-	testing.expect_value(t, get(ring, 6)^, 8)
-	testing.expect_value(t, get_last(ring)^, 8)
-
-	// Stress: 3 more complete wrap cycles (21 more pushes).
-	// After 29 total pushes, ring contains the last 7 (23..=29),
-	// and next_write_index = 29 mod 7 = 1.
-	for i in 9 ..= 29 do push(&ring, i)
-	testing.expect_value(t, ring.len, 7)
-	testing.expect_value(t, ring.next_write_index, 1)
-	testing.expect_value(t, start_index_aos(ring), 1)
-	testing.expect_value(t, get(ring, 0)^, 23)
-	testing.expect_value(t, get(ring, 3)^, 26)
-	testing.expect_value(t, get(ring, 6)^, 29)
-	testing.expect_value(t, get_last(ring)^, 29)
-
-	// Clear returns ring to empty-equivalent state.
-	clear(&ring)
-	testing.expect_value(t, ring.len, 0)
-	testing.expect_value(t, ring.next_write_index, 0)
-	testing.expect_value(t, start_index_aos(ring), 0)
-
-	// Single-element edge case: get_last(len==1) routes through get(ring, 0).
-	push(&ring, 42)
-	testing.expect_value(t, ring.len, 1)
-	testing.expect_value(t, ring.next_write_index, 1)
-	testing.expect_value(t, get(ring, 0)^, 42)
-	testing.expect_value(t, get_last(ring)^, 42)
-}
-
-@(test)
-test_ring_soa_init_from_slice :: proc(t: ^testing.T) {
-	Ints :: struct {
-		x, y: int,
-	}
-
-	// Stack-allocated backing with pre-existing garbage and odd capacity.
-	backing: #soa[7]Ints = {{99, 99}, {99, 99}, {99, 99}, {99, 99}, {99, 99}, {99, 99}, {99, 99}}
-
-	ring: Ring_Soa(Ints)
-	init_from_slice(&ring, backing[:])
-
-	// Empty ring invariants after init_from_slice.
-	testing.expect_value(t, ring.len, 0)
-	testing.expect_value(t, ring.next_write_index, 0)
-	testing.expect_value(t, start_index_soa(ring), 0)
-
-	// Partial fill (3 / 7).
-	for i in 1 ..= 3 do push(&ring, Ints{i, i})
-	testing.expect_value(t, ring.len, 3)
-	testing.expect_value(t, ring.next_write_index, 3)
-	testing.expect_value(t, start_index_soa(ring), 0)
-	testing.expect_value(t, get(ring, 0), Ints{1, 1})
-	testing.expect_value(t, get(ring, 2), Ints{3, 3})
-	testing.expect_value(t, get_last(ring), Ints{3, 3})
-
-	// Fill exactly to capacity. Pushing element 7 must make len == cap
-	// AND wrap next_write_index from 6 back to 0 in the same step.
-	for i in 4 ..= 7 do push(&ring, Ints{i, i})
-	testing.expect_value(t, ring.len, 7)
-	testing.expect_value(t, ring.next_write_index, 0)
-	testing.expect_value(t, start_index_soa(ring), 0)
-	testing.expect_value(t, get(ring, 0), Ints{1, 1})
-	testing.expect_value(t, get(ring, 6), Ints{7, 7})
-	testing.expect_value(t, get_last(ring), Ints{7, 7})
-
-	// First overwrite — oldest element shifts by one.
-	push(&ring, Ints{8, 8})
-	testing.expect_value(t, ring.len, 7)
-	testing.expect_value(t, ring.next_write_index, 1)
-	testing.expect_value(t, start_index_soa(ring), 1)
-	testing.expect_value(t, get(ring, 0), Ints{2, 2})
-	testing.expect_value(t, get(ring, 6), Ints{8, 8})
-	testing.expect_value(t, get_last(ring), Ints{8, 8})
-
-	// Stress: 3 more complete wrap cycles (21 more pushes).
-	// After 29 total pushes, ring contains the last 7 (23..=29),
-	// and next_write_index = 29 mod 7 = 1.
-	for i in 9 ..= 29 do push(&ring, Ints{i, i})
-	testing.expect_value(t, ring.len, 7)
-	testing.expect_value(t, ring.next_write_index, 1)
-	testing.expect_value(t, start_index_soa(ring), 1)
-	testing.expect_value(t, get(ring, 0), Ints{23, 23})
-	testing.expect_value(t, get(ring, 3), Ints{26, 26})
-	testing.expect_value(t, get(ring, 6), Ints{29, 29})
-	testing.expect_value(t, get_last(ring), Ints{29, 29})
-
-	// Clear returns ring to empty-equivalent state.
-	clear(&ring)
-	testing.expect_value(t, ring.len, 0)
-	testing.expect_value(t, ring.next_write_index, 0)
-	testing.expect_value(t, start_index_soa(ring), 0)
-
-	// Single-element edge case: get_last(len==1) routes through get(ring, 0).
-	push(&ring, Ints{42, 42})
-	testing.expect_value(t, ring.len, 1)
-	testing.expect_value(t, ring.next_write_index, 1)
-	testing.expect_value(t, get(ring, 0), Ints{42, 42})
-	testing.expect_value(t, get_last(ring), Ints{42, 42})
-}
@@ -18,14 +18,14 @@ when ODIN_OS == .Windows {

 String :: struct {
 	isStaticallyAllocated: c.bool,
-	length:                c.int32_t,
-	chars:                 [^]c.char,
+	length: c.int32_t,
+	chars:  [^]c.char,
 }

 StringSlice :: struct {
-	length:    c.int32_t,
-	chars:     [^]c.char,
-	baseChars: [^]c.char,
+	length: c.int32_t,
+	chars:  [^]c.char,
+	baseChars:  [^]c.char,
 }

 Vector2 :: [2]c.float
@@ -57,6 +57,11 @@ CornerRadius :: struct {
 	bottomRight: c.float,
 }

+BorderData :: struct {
+	width: u32,
+	color: Color,
+}
+
 ElementId :: struct {
 	id:       u32,
 	offset:   u32,
@@ -64,12 +69,6 @@ ElementId :: struct {
 	stringId: String,
 }

-ElementIdArray :: struct {
-	capacity:      i32,
-	length:        i32,
-	internalArray: [^]ElementId,
-}
-
 when ODIN_OS == .Windows {
 	EnumBackingType :: u32
 } else {
@@ -84,13 +83,11 @@ RenderCommandType :: enum EnumBackingType {
 	Image,
 	ScissorStart,
 	ScissorEnd,
-	OverlayColorStart,
-	OverlayColorEnd,
 	Custom,
 }

 RectangleElementConfig :: struct {
-	color: Color,
+	color:        Color,
 }

 TextWrapMode :: enum EnumBackingType {
@@ -106,22 +103,22 @@ TextAlignment :: enum EnumBackingType {
 }

 TextElementConfig :: struct {
-	userData:      rawptr,
-	textColor:     Color,
-	fontId:        u16,
-	fontSize:      u16,
-	letterSpacing: u16,
-	lineHeight:    u16,
-	wrapMode:      TextWrapMode,
-	textAlignment: TextAlignment,
+	userData:           rawptr,
+	textColor:          Color,
+	fontId:             u16,
+	fontSize:           u16,
+	letterSpacing:      u16,
+	lineHeight:         u16,
+	wrapMode:           TextWrapMode,
+	textAlignment:      TextAlignment,
 }

 AspectRatioElementConfig :: struct {
-	aspectRatio: f32,
+	aspectRatio:        f32,
 }

 ImageElementConfig :: struct {
-	imageData: rawptr,
+	imageData:        rawptr,
 }

 CustomElementConfig :: struct {
@@ -129,10 +126,10 @@ CustomElementConfig :: struct {
 }

 BorderWidth :: struct {
-	left:            u16,
-	right:           u16,
-	top:             u16,
-	bottom:          u16,
+	left: u16,
+	right: u16,
+	top: u16,
+	bottom: u16,
 	betweenChildren: u16,
 }

@@ -141,92 +138,6 @@ BorderElementConfig :: struct {
 	width: BorderWidth,
 }

-TransitionData :: struct {
-	boundingBox:     BoundingBox,
-	backgroundColor: Color,
-	overlayColor:    Color,
-	borderColor:     Color,
-	borderWidth:     BorderWidth,
-}
-
-TransitionState :: enum c.int {
-	Idle,
-	Entering,
-	Transitioning,
-	Exiting,
-}
-
-TransitionProperty :: enum c.int {
-	X,
-	Y,
-	Width,
-	Height,
-	BackgroundColor,
-	OverlayColor,
-	CornerRadius,
-	BorderColor,
-	BorderWidth,
-}
-
-TransitionPropertyFlags :: bit_set[TransitionProperty;c.int]
-TransitionPropertyPosition :: TransitionPropertyFlags{.X, .Y}
-TransitionPropertyDimensions :: TransitionPropertyFlags{.Width, .Height}
-TransitionPropertyBoundingBox :: TransitionPropertyPosition + TransitionPropertyDimensions
-TransitionPropertyBorder :: TransitionPropertyFlags{.BorderColor, .BorderWidth}
-
-TransitionCallbackArguments :: struct {
-	transitionState: TransitionState,
-	initial:         TransitionData,
-	current:         ^TransitionData,
-	target:          TransitionData,
-	elapsedTime:     f32,
-	duration:        f32,
-	properties:      TransitionPropertyFlags,
-}
-
-TransitionEnterTriggerType :: enum EnumBackingType {
-	SkipOnFirstParentFrame,
-	TriggerOnFirstParentFrame,
-}
-
-TransitionExitTriggerType :: enum EnumBackingType {
-	SkipWhenParentExits,
-	TriggerWhenParentExits,
-}
-
-TransitionInteractionHandlingType :: enum EnumBackingType {
-	DisableInteractionsWhileTransitioningPosition,
-	AllowInteractionsWhileTransitioningPosition,
-}
-
-ExitTransitionSiblingOrdering :: enum EnumBackingType {
-	UnderneathSiblings,
-	NaturalOrder,
-	AboveSiblings,
-}
-
-TransitionElementConfig :: struct {
-	handler:             proc "c" (args: TransitionCallbackArguments) -> bool,
-	duration:            f32,
-	properties:          TransitionPropertyFlags,
-	interactionHandling: TransitionInteractionHandlingType,
-	enter:               struct {
-		setInitialState: proc "c" (
-			initialState: TransitionData,
-			properties: TransitionPropertyFlags,
-		) -> TransitionData,
-		trigger:         TransitionEnterTriggerType,
-	},
-	exit:                struct {
-		setFinalState:   proc "c" (
-			finalState: TransitionData,
-			properties: TransitionPropertyFlags,
-		) -> TransitionData,
-		trigger:         TransitionExitTriggerType,
-		siblingOrdering: ExitTransitionSiblingOrdering,
-	},
-}
-
 ClipElementConfig :: struct {
 	horizontal:  bool, // clip overflowing elements on the "X" axis
 	vertical:    bool, // clip overflowing elements on the "Y" axis
@@ -275,67 +186,56 @@ FloatingElementConfig :: struct {
 	attachment:         FloatingAttachPoints,
 	pointerCaptureMode: PointerCaptureMode,
 	attachTo:           FloatingAttachToElement,
-	clipTo:             FloatingClipToElement,
+	clipTo: 			FloatingClipToElement,
 }

 TextRenderData :: struct {
 	stringContents: StringSlice,
-	textColor:      Color,
-	fontId:         u16,
-	fontSize:       u16,
-	letterSpacing:  u16,
-	lineHeight:     u16,
+	textColor: Color,
+	fontId: u16,
+	fontSize: u16,
+	letterSpacing: u16,
+	lineHeight: u16,
 }

 RectangleRenderData :: struct {
 	backgroundColor: Color,
-	cornerRadius:    CornerRadius,
+	cornerRadius: CornerRadius,
 }

 ImageRenderData :: struct {
 	backgroundColor: Color,
-	cornerRadius:    CornerRadius,
-	imageData:       rawptr,
+	cornerRadius: CornerRadius,
+	imageData: rawptr,
 }

 CustomRenderData :: struct {
 	backgroundColor: Color,
-	cornerRadius:    CornerRadius,
-	customData:      rawptr,
-}
-
-ClipRenderData :: struct {
-	horizontal: bool,
-	vertical:   bool,
-}
-
-OverlayColorRenderData :: struct {
-	color: Color,
+	cornerRadius: CornerRadius,
+	customData: rawptr,
 }

 BorderRenderData :: struct {
-	color:        Color,
+	color: Color,
 	cornerRadius: CornerRadius,
-	width:        BorderWidth,
+	width: BorderWidth,
 }

 RenderCommandData :: struct #raw_union {
-	rectangle:    RectangleRenderData,
-	text:         TextRenderData,
-	image:        ImageRenderData,
-	custom:       CustomRenderData,
-	border:       BorderRenderData,
-	clip:         ClipRenderData,
-	overlayColor: OverlayColorRenderData,
+	rectangle: RectangleRenderData,
+	text: TextRenderData,
+	image: ImageRenderData,
+	custom: CustomRenderData,
+	border: BorderRenderData,
 }

 RenderCommand :: struct {
-	boundingBox: BoundingBox,
-	renderData:  RenderCommandData,
-	userData:    rawptr,
-	id:          u32,
-	zIndex:      i16,
-	commandType: RenderCommandType,
+	boundingBox:        BoundingBox,
+	renderData:         RenderCommandData,
+	userData:           rawptr,
+	id:                 u32,
+	zIndex:             i16,
+	commandType:        RenderCommandType,
 }

 ScrollContainerData :: struct {
@@ -395,9 +295,9 @@ Sizing :: struct {
 }

 Padding :: struct {
-	left:   u16,
-	right:  u16,
-	top:    u16,
+	left: u16,
+	right: u16,
+	top: u16,
 	bottom: u16,
 }

@@ -438,17 +338,16 @@ ClayArray :: struct($type: typeid) {
 }

 ElementDeclaration :: struct {
+	id:              ElementId,
 	layout:          LayoutConfig,
 	backgroundColor: Color,
-	overlayColor:    Color,
 	cornerRadius:    CornerRadius,
-	aspectRatio:     AspectRatioElementConfig,
+	aspectRatio: 	 AspectRatioElementConfig,
 	image:           ImageElementConfig,
 	floating:        FloatingElementConfig,
 	custom:          CustomElementConfig,
 	clip:            ClipElementConfig,
 	border:          BorderElementConfig,
-	transition:      TransitionElementConfig,
 	userData:        rawptr,
 }

@@ -461,17 +360,16 @@ ErrorType :: enum EnumBackingType {
 	FloatingContainerParentNotFound,
 	PercentageOver1,
 	InternalError,
-	UnbalancedOpenClose,
 }

 ErrorData :: struct {
 	errorType: ErrorType,
 	errorText: String,
-	userData:  rawptr,
+	userData: rawptr,
 }

 ErrorHandler :: struct {
-	handler:  proc "c" (errorData: ErrorData),
+	handler: proc "c" (errorData: ErrorData),
 	userData: rawptr,
 }

@@ -480,27 +378,23 @@ Context :: struct {} // opaque structure, only use as a pointer
@(link_prefix = "Clay_", default_calling_convention = "c")
 foreign Clay {
 	_OpenElement :: proc() ---
-	_OpenElementWithId :: proc(id: ElementId) ---
 	_CloseElement :: proc() ---
 	MinMemorySize :: proc() -> u32 ---
 	CreateArenaWithCapacityAndMemory :: proc(capacity: c.size_t, offset: [^]u8) -> Arena ---
 	SetPointerState :: proc(position: Vector2, pointerDown: bool) ---
-	GetPointerState :: proc() -> PointerData ---
 	Initialize :: proc(arena: Arena, layoutDimensions: Dimensions, errorHandler: ErrorHandler) -> ^Context ---
 	GetCurrentContext :: proc() -> ^Context ---
 	SetCurrentContext :: proc(ctx: ^Context) ---
 	UpdateScrollContainers :: proc(enableDragScrolling: bool, scrollDelta: Vector2, deltaTime: c.float) ---
 	SetLayoutDimensions :: proc(dimensions: Dimensions) ---
 	BeginLayout :: proc() ---
-	EndLayout :: proc(deltaTime: c.float) -> ClayArray(RenderCommand) ---
-	GetOpenElementId :: proc() -> u32 ---
+	EndLayout :: proc() -> ClayArray(RenderCommand) ---
 	GetElementId :: proc(id: String) -> ElementId ---
 	GetElementIdWithIndex :: proc(id: String, index: u32) -> ElementId ---
 	GetElementData :: proc(id: ElementId) -> ElementData ---
 	Hovered :: proc() -> bool ---
 	OnHover :: proc(onHoverFunction: proc "c" (id: ElementId, pointerData: PointerData, userData: rawptr), userData: rawptr) ---
 	PointerOver :: proc(id: ElementId) -> bool ---
-	GetPointerOverIds :: proc() -> ElementIdArray ---
 	GetScrollOffset :: proc() -> Vector2 ---
 	GetScrollContainerData :: proc(id: ElementId) -> ScrollContainerData ---
 	SetMeasureTextFunction :: proc(measureTextFunction: proc "c" (text: StringSlice, config: ^TextElementConfig, userData: rawptr) -> Dimensions, userData: rawptr) ---
@@ -514,15 +408,15 @@ foreign Clay {
 	GetMaxMeasureTextCacheWordCount :: proc() -> i32 ---
 	SetMaxMeasureTextCacheWordCount :: proc(maxMeasureTextCacheWordCount: i32) ---
 	ResetMeasureTextCache :: proc() ---
-	EaseOut :: proc(arguments: TransitionCallbackArguments) -> bool ---
 }

@(link_prefix = "Clay_", default_calling_convention = "c", private)
 foreign Clay {
 	_ConfigureOpenElement :: proc(config: ElementDeclaration) ---
-	_HashString :: proc(key: String, seed: u32) -> ElementId ---
-	_HashStringWithOffset :: proc(key: String, index: u32, seed: u32) -> ElementId ---
-	_OpenTextElement :: proc(text: String, textConfig: TextElementConfig) ---
+	_HashString :: proc(key: String, offset: u32, seed: u32) -> ElementId ---
+	_OpenTextElement :: proc(text: String, textConfig: ^TextElementConfig) ---
+	_StoreTextElementConfig :: proc(config: TextElementConfig) -> ^TextElementConfig ---
+	_GetParentElementId :: proc() -> u32 ---
 }

 ConfigureOpenElement :: proc(config: ElementDeclaration) -> bool {
@@ -531,39 +425,27 @@ ConfigureOpenElement :: proc(config: ElementDeclaration) -> bool {
 }

@(deferred_none = _CloseElement)
-UI_WithId :: proc(id: ElementId) -> proc(config: ElementDeclaration) -> bool {
-	_OpenElementWithId(id)
-	return ConfigureOpenElement
-}
-
-@(deferred_none = _CloseElement)
-UI_AutoId :: proc() -> proc(config: ElementDeclaration) -> bool {
+UI :: proc() -> proc (config: ElementDeclaration) -> bool {
 	_OpenElement()
 	return ConfigureOpenElement
 }

-UI :: proc {
-	UI_WithId,
-	UI_AutoId,
-}
-
-Text :: proc {
-	TextStatic,
-	TextDynamic,
-}
-
-TextStatic :: proc($text: string, config: TextElementConfig) {
+Text :: proc($text: string, config: ^TextElementConfig) {
 	wrapped := MakeString(text)
 	wrapped.isStaticallyAllocated = true
 	_OpenTextElement(wrapped, config)
 }

-TextDynamic :: proc(text: string, config: TextElementConfig) {
+TextDynamic :: proc(text: string, config: ^TextElementConfig) {
 	_OpenTextElement(MakeString(text), config)
 }

+TextConfig :: proc(config: TextElementConfig) -> ^TextElementConfig {
+	return _StoreTextElementConfig(config)
+}
+
 PaddingAll :: proc(allPadding: u16) -> Padding {
-	return {left = allPadding, right = allPadding, top = allPadding, bottom = allPadding}
+	return { left = allPadding, right = allPadding, top = allPadding, bottom = allPadding }
 }

 BorderOutside :: proc(width: u16) -> BorderWidth {
@@ -578,11 +460,11 @@ CornerRadiusAll :: proc(radius: f32) -> CornerRadius {
 	return CornerRadius{radius, radius, radius, radius}
 }

-SizingFit :: proc(sizeMinMax: SizingConstraintsMinMax = {}) -> SizingAxis {
+SizingFit :: proc(sizeMinMax: SizingConstraintsMinMax) -> SizingAxis {
 	return SizingAxis{type = SizingType.Fit, constraints = {sizeMinMax = sizeMinMax}}
 }

-SizingGrow :: proc(sizeMinMax: SizingConstraintsMinMax = {}) -> SizingAxis {
+SizingGrow :: proc(sizeMinMax: SizingConstraintsMinMax) -> SizingAxis {
 	return SizingAxis{type = SizingType.Grow, constraints = {sizeMinMax = sizeMinMax}}
 }

@@ -599,9 +481,9 @@ MakeString :: proc(label: string) -> String {
 }

 ID :: proc(label: string, index: u32 = 0) -> ElementId {
-	return _HashString(MakeString(label), index)
+	return _HashString(MakeString(label), index, 0)
 }

 ID_LOCAL :: proc(label: string, index: u32 = 0) -> ElementId {
-	return _HashStringWithOffset(MakeString(label), index, GetOpenElementId())
-}
+	return _HashString(MakeString(label), index, _GetParentElementId())
+}
@@ -0,0 +1,6 @@
+{
+	"$schema": "https://raw.githubusercontent.com/DanielGavin/ols/master/misc/odinfmt.schema.json",
+	"character_width": 180,
+	"sort_imports": true,
+	"tabs": false
+}
@@ -1,11 +1,8 @@
 package examples

 import "core:fmt"
-import "core:log"
-import "core:mem"
 import "core:os"
 import "core:sys/posix"
-
 import mdb "../../lmdb"

 // 0o660
@@ -13,74 +10,33 @@ DB_MODE :: posix.mode_t{.IWGRP, .IRGRP, .IWUSR, .IRUSR}
 DB_PATH :: "out/debug/lmdb_example_db"

 main :: proc() {
-	//----- General setup ----------------------------------
-	// Temp
-	track_temp: mem.Tracking_Allocator
-	mem.tracking_allocator_init(&track_temp, context.temp_allocator)
-	context.temp_allocator = mem.tracking_allocator(&track_temp)
+    environment: ^mdb.Env

-	// Default
-	track: mem.Tracking_Allocator
-	mem.tracking_allocator_init(&track, context.allocator)
-	context.allocator = mem.tracking_allocator(&track)
-	// Log a warning about any memory that was not freed by the end of the program.
-	// This could be fine for some global state or it could be a memory leak.
-	defer {
-		// Temp allocator
-		if len(track_temp.bad_free_array) > 0 {
-			fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
-			for entry in track_temp.bad_free_array {
-				fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
-			}
-			mem.tracking_allocator_destroy(&track_temp)
-		}
-		// Default allocator
-		if len(track.allocation_map) > 0 {
-			fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map))
-			for _, entry in track.allocation_map {
-				fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
-			}
-		}
-		if len(track.bad_free_array) > 0 {
-			fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array))
-			for entry in track.bad_free_array {
-				fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
-			}
-		}
-		mem.tracking_allocator_destroy(&track)
-	}
-	// Logger
-	context.logger = log.create_console_logger()
-	defer log.destroy_console_logger(context.logger)
+    // Create environment for lmdb
+    mdb.panic_on_err(mdb.env_create(&environment))
+    // Create directory for databases. Won't do anything if it already exists.
+    os.make_directory(DB_PATH)
+    // Open the database files (creates them if they don't already exist)
+    mdb.panic_on_err(mdb.env_open(environment, DB_PATH, {}, DB_MODE))

+    // Transactions
+    txn_handle: ^mdb.Txn
+    db_handle: mdb.Dbi
+    // Put transaction
+    key := 7
+    key_val := mdb.blittable_val(&key)
+    put_data := 12
+    put_data_val := mdb.blittable_val(&put_data)
+    mdb.panic_on_err(mdb.txn_begin(environment, nil, {}, &txn_handle))
+    mdb.panic_on_err(mdb.dbi_open(txn_handle, nil, {}, &db_handle))
+    mdb.panic_on_err(mdb.put(txn_handle, db_handle, &key_val, &put_data_val, {}))
+    mdb.panic_on_err(mdb.txn_commit(txn_handle))

-	environment: ^mdb.Env
-
-	// Create environment for lmdb
-	mdb.panic_on_err(mdb.env_create(&environment))
-	// Create directory for databases. Won't do anything if it already exists.
-	os.make_directory(DB_PATH)
-	// Open the database files (creates them if they don't already exist)
-	mdb.panic_on_err(mdb.env_open(environment, DB_PATH, {}, DB_MODE))
-
-	// Transactions
-	txn_handle: ^mdb.Txn
-	db_handle: mdb.Dbi
-	// Put transaction
-	key := 7
-	key_val := mdb.pod_val(&key)
-	put_data := 12
-	put_data_val := mdb.pod_val(&put_data)
-	mdb.panic_on_err(mdb.txn_begin(environment, nil, {}, &txn_handle))
-	mdb.panic_on_err(mdb.dbi_open(txn_handle, nil, {}, &db_handle))
-	mdb.panic_on_err(mdb.put(txn_handle, db_handle, &key_val, &put_data_val, {}))
-	mdb.panic_on_err(mdb.txn_commit(txn_handle))
-
-	// Get transaction
-	data_val: mdb.Val
-	mdb.panic_on_err(mdb.txn_begin(environment, nil, {}, &txn_handle))
-	mdb.panic_on_err(mdb.get(txn_handle, db_handle, &key_val, &data_val))
-	data_cpy := mdb.pod_copy(data_val, int)
-	mdb.txn_abort(txn_handle)
-	fmt.println("Get result:", data_cpy)
+    // Get transaction
+    data_val: mdb.Val
+    mdb.panic_on_err(mdb.txn_begin(environment, nil, {}, &txn_handle))
+    mdb.panic_on_err(mdb.get(txn_handle, db_handle, &key_val, &data_val))
+    data_cpy := mdb.blittable_copy(&data_val, int)
+    mdb.panic_on_err(mdb.txn_commit(txn_handle))
+    fmt.println("Get result:", data_cpy)
 }
@@ -169,86 +169,58 @@ import "core:fmt"
 import "core:reflect"
 import "core:sys/posix"

-import b "../../basic"
-
 // ---------------------------------------------------------------------------------------------------------------------
 // ----- Added Odin Helpers ------------------------
 // ---------------------------------------------------------------------------------------------------------------------

-// Wrap a POD value's bytes as an LMDB Val.
+// Wrap a blittable value's bytes as an LMDB Val.
 // T must be a contiguous type with no indirection (no pointers, slices, strings, maps, etc.).
-pod_val :: #force_inline proc(val_ptr: ^$T) -> Val {
-	when ODIN_DEBUG {
-		fmt.assertf(
-			reflect.has_no_indirections(type_info_of(T)),
-			"pod_val: type '%v' contains indirection and cannot be stored directly in LMDB",
-			typeid_of(T),
-		)
-	}
+blittable_val :: #force_inline proc(val_ptr: ^$T) -> Val {
+	fmt.assertf(
+		reflect.has_no_indirections(type_info_of(T)),
+		"blittable_val: type '%v' contains indirection and cannot be stored directly in LMDB",
+		typeid_of(T),
+	)
 	return Val{size_of(T), val_ptr}
 }

-// Reads a POD T out of the LMDB memory map by copying it into caller
+// Reads a blittable T out of the LMDB memory map by copying it into caller
 // storage. The returned T has no lifetime tie to the transaction.
-pod_copy :: #force_inline proc(val: Val, $T: typeid) -> T {
-	when ODIN_DEBUG {
-		fmt.assertf(
-			reflect.has_no_indirections(type_info_of(T)),
-			"pod_copy: type '%v' contains indirection and cannot be read directly from LMDB",
-			typeid_of(T),
-		)
-	}
-	when b.ODIN_BOUNDS_CHECK {
-		fmt.assertf(
-			val.size == size_of(T),
-			"size_of(%v) (%v) != val.size (%v)",
-			typeid_of(T),
-			size_of(T),
-			val.size,
-		)
-	}
+blittable_copy :: #force_inline proc(val: ^Val, $T: typeid) -> T {
+	fmt.assertf(
+		reflect.has_no_indirections(type_info_of(T)),
+		"blittable_copy: type '%v' contains indirection and cannot be read directly from LMDB",
+		typeid_of(T),
+	)
 	return (cast(^T)val.data)^
 }

 // Zero-copy pointer view into the LMDB memory map as a ^T.
-// Useful for large POD types where you want to read individual fields
+// Useful for large blittable types where you want to read individual fields
 // without copying the entire value (e.g. ptr.timestamp, ptr.flags).
 // MUST NOT be written through — writes either segfault (default env mode)
 // or silently corrupt the database (ENV_WRITEMAP).
 // MUST NOT be retained past txn_commit, txn_abort, or any subsequent write
 // operation on the same env — the pointer is invalidated.
-pod_view :: #force_inline proc(val: Val, $T: typeid) -> ^T {
-	when ODIN_DEBUG {
-		fmt.assertf(
-			reflect.has_no_indirections(type_info_of(T)),
-			"pod_view: type '%v' contains indirection and cannot be viewed directly from LMDB",
-			typeid_of(T),
-		)
-	}
-	when b.ODIN_BOUNDS_CHECK {
-		fmt.assertf(
-			val.size == size_of(T),
-			"size_of(%v) (%v) != val.size (%v)",
-			typeid_of(T),
-			size_of(T),
-			val.size,
-		)
-	}
+blittable_view :: #force_inline proc(val: ^Val, $T: typeid) -> ^T {
+	fmt.assertf(
+		reflect.has_no_indirections(type_info_of(T)),
+		"blittable_view: type '%v' contains indirection and cannot be viewed directly from LMDB",
+		typeid_of(T),
+	)
 	return cast(^T)val.data
 }

-// Wrap a slice of POD elements as an LMDB Val for use with put/get.
+// Wrap a slice of blittable elements as an LMDB Val for use with put/get.
 // T must be a contiguous type with no indirection.
 // The caller's slice must remain valid (not freed, not resized) for the
 // duration of the put call that consumes this Val.
-pod_slice_val :: #force_inline proc(s: []$T) -> Val {
-	when ODIN_DEBUG {
-		fmt.assertf(
-			reflect.has_no_indirections(type_info_of(T)),
-			"pod_slice_val: element type '%v' contains indirection and cannot be stored directly in LMDB",
-			typeid_of(T),
-		)
-	}
+slice_val :: #force_inline proc(s: []$T) -> Val {
+	fmt.assertf(
+		reflect.has_no_indirections(type_info_of(T)),
+		"slice_val: element type '%v' contains indirection and cannot be stored directly in LMDB",
+		typeid_of(T),
+	)
 	return Val{uint(len(s) * size_of(T)), raw_data(s)}
 }

@@ -259,21 +231,12 @@ pod_slice_val :: #force_inline proc(s: []$T) -> Val {
 // MUST be copied (e.g. slice.clone) if it needs to outlive the current
 // transaction; the view is invalidated by txn_commit, txn_abort, or any
 // subsequent write operation on the same env.
-pod_slice_view :: #force_inline proc(val: Val, $T: typeid) -> []T {
-	when ODIN_DEBUG {
-		fmt.assertf(
-			reflect.has_no_indirections(type_info_of(T)),
-			"pod_slice_view: element type '%v' contains indirection and cannot be read directly from LMDB",
-			typeid_of(T),
-		)
-		fmt.assertf(
-			val.size % size_of(T) == 0,
-			"pod_slice_view: val.size (%v) is not a multiple of size_of(%v) (%v)",
-			val.size,
-			typeid_of(T),
-			size_of(T),
-		)
-	}
+slice_view :: #force_inline proc(val: ^Val, $T: typeid) -> []T {
+	fmt.assertf(
+		reflect.has_no_indirections(type_info_of(T)),
+		"slice_view: element type '%v' contains indirection and cannot be read directly from LMDB",
+		typeid_of(T),
+	)
 	return (cast([^]T)val.data)[:val.size / size_of(T)]
 }

@@ -290,7 +253,7 @@ string_val :: #force_inline proc(s: string) -> Val {
 // MUST be copied (e.g. strings.clone) if it needs to outlive the current
 // transaction; the view is invalidated by txn_commit, txn_abort, or any
 // subsequent write operation on the same env.
-string_view :: #force_inline proc(val: Val) -> string {
+string_view :: #force_inline proc(val: ^Val) -> string {
 	return string((cast([^]u8)val.data)[:val.size])
 }
Author	SHA1	Message	Date
Zachary Levy	20b9360925	libusb cleanup	2026-04-22 04:47:43 +00:00
Zachary Levy	3bfd14158a	LMDB cleanup	2026-04-22 04:47:43 +00:00