draw-improvements (#17)

Major rework of the draw rendering system. We are building an SDF-first rendering system, with tessellated geometry only as a fallback for the specific situations where SDF is particularly poorly suited.

Co-authored-by: Zachary Levy <zachary@sunforge.is>
Reviewed-on: #17
This commit was merged in pull request #17.
2026-04-24 07:57:44 +00:00
parent 37da2ea068
commit bca19277b3
15 changed files with 1773 additions and 1736 deletions


@@ -9,35 +9,51 @@ The renderer uses a single unified `Pipeline_2D_Base` (`TRIANGLELIST` pipeline)
modes dispatched by a push constant: modes dispatched by a push constant:
- **Mode 0 (Tessellated):** Vertex buffer contains real geometry. Used for text (indexed draws into - **Mode 0 (Tessellated):** Vertex buffer contains real geometry. Used for text (indexed draws into
SDL_ttf atlas textures), axis-aligned sharp-corner rectangles (already optimal as 2 triangles), SDL_ttf atlas textures), single-pixel points (`tes_pixel`), arbitrary user geometry (`tes_triangle`,
per-vertex color gradients (`rectangle_gradient`, `circle_gradient`), angular-clipped circle `tes_triangle_fan`, `tes_triangle_strip`), and shapes without a closed-form rounded-rectangle
sectors (`circle_sector`), and arbitrary user geometry (`triangle`, `triangle_fan`, reduction: ellipses (`tes_ellipse`), regular polygons (`tes_polygon`), and circle sectors
`triangle_strip`). The fragment shader computes `out = color * texture(tex, uv)`. (`tes_sector`). The fragment shader computes `out = color * texture(tex, uv)`.
- **Mode 1 (SDF):** A static 6-vertex unit-quad buffer is drawn instanced, with per-primitive - **Mode 1 (SDF):** A static 6-vertex unit-quad buffer is drawn instanced, with per-primitive
`Primitive` structs uploaded each frame to a GPU storage buffer. The vertex shader reads `Primitive` structs (80 bytes each) uploaded each frame to a GPU storage buffer. The vertex shader
`primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners + primitive reads `primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners +
bounds. The fragment shader dispatches on `Shape_Kind` to evaluate the correct signed distance primitive bounds. The fragment shader always evaluates `sdRoundedBox` — there is no per-primitive
function analytically. kind dispatch.
Seven SDF shape kinds are implemented: The SDF path handles all shapes that are algebraically reducible to a rounded rectangle:
1. **RRect:** rounded rectangle with per-corner radii (iq's `sdRoundedBox`) - **Rounded rectangles:** per-corner radii via `sdRoundedBox` (iq). Covers filled, stroked,
2. **Circle:** filled or stroked circle textured, and gradient-filled rectangles.
3. **Ellipse:** exact signed-distance ellipse (iq's iterative `sdEllipse`) - **Circles:** uniform radii equal to half-size. Covers filled, stroked, and radial-gradient circles.
4. **Segment:** capsule-style line segment with rounded caps - **Line segments / capsules:** rotated RRect with uniform radii equal to half-thickness (stadium shape).
5. **Ring_Arc:** annular ring with angular clipping for arcs - **Full rings / annuli:** stroked circle (mid-radius with stroke thickness = outer - inner).
6. **NGon:** regular polygon with arbitrary side count and rotation
7. **Polyline:** decomposed into independent `Segment` primitives per adjacent point pair
All SDF shapes support fill and stroke modes via `Shape_Flags`, and produce mathematically exact All SDF shapes support fill, stroke, solid color, bilinear 4-corner gradients, radial 2-color
curves with analytical anti-aliasing via `smoothstep` — no tessellation, no piecewise-linear gradients, and texture fills via `Shape_Flags`. Gradient colors are packed into the same 16 bytes as
approximation. A rounded rectangle is 1 primitive (64 bytes) instead of ~250 vertices (~5000 bytes). the texture UV rect via a `Uv_Or_Gradient` raw union — zero size increase to the 80-byte `Primitive`
struct. Gradient and texture are mutually exclusive.
All SDF shapes produce mathematically exact curves with analytical anti-aliasing via `smoothstep`:
no tessellation, no piecewise-linear approximation. A rounded rectangle is 1 primitive (80 bytes)
instead of ~250 vertices (~5000 bytes).
The fragment shader's estimated register footprint is ~20-23 VGPRs via static live-range analysis.
RRect and Ring_Arc are roughly tied at peak pressure — RRect carries `corner_radii` (4 regs) plus
`sdRoundedBox` temporaries, Ring_Arc carries wedge normals plus dot-product temporaries. Both land
comfortably under Mali Valhall's 32-register occupancy cliff (G57/G77/G78 and later) and well under
desktop limits. On older Bifrost Mali (G71/G72/G76, 16-register cliff) either shape kind may incur
partial occupancy reduction. These estimates are hand-counted; exact numbers require `malioc` or
Radeon GPU Analyzer against the compiled SPIR-V.
MSAA is opt-in (default `._1`, no MSAA) via `Init_Options.msaa_samples`. SDF rendering does not MSAA is opt-in (default `._1`, no MSAA) via `Init_Options.msaa_samples`. SDF rendering does not
benefit from MSAA because fragment coverage is computed analytically. MSAA remains useful for text benefit from MSAA because fragment coverage is computed analytically. MSAA remains useful for text
glyph edges and tessellated user geometry if desired. glyph edges and tessellated user geometry if desired.
All public drawing procs use prefixed names for clarity: `sdf_*` for SDF-path shapes, `tes_*` for
tessellated-path shapes. Proc groups provide a single entry point per shape concept (e.g.,
`sdf_rectangle` dispatches to `sdf_rectangle_solid` or `sdf_rectangle_gradient` based on argument
count).
## 2D rendering pipeline plan ## 2D rendering pipeline plan
This section documents the planned architecture for levlib's 2D rendering system. The design is driven This section documents the planned architecture for levlib's 2D rendering system. The design is driven
@@ -91,19 +107,19 @@ Below the cliff, adding registers has zero occupancy cost.
On consumer Ampere/Ada GPUs (RTX 30xx/40xx, 65,536 regs/SM, max 1,536 threads/SM, cliff at ~43 regs): On consumer Ampere/Ada GPUs (RTX 30xx/40xx, 65,536 regs/SM, max 1,536 threads/SM, cliff at ~43 regs):
| Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy | | Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy |
| ----------------------- | ------------------- | ------------------ | --------- | | ------------------------ | ------------------- | ------------------ | --------- |
| 20 regs (main pipeline) | 3,276 | 1,536 | 100% | | ~16 regs (main pipeline) | 4,096 | 1,536 | 100% |
| 32 regs | 2,048 | 1,536 | 100% | | 32 regs | 2,048 | 1,536 | 100% |
| 48 regs (effects) | 1,365 | 1,365 | ~89% | | 48 regs (effects) | 1,365 | 1,365 | ~89% |
On Volta/A100 GPUs (65,536 regs/SM, max 2,048 threads/SM, cliff at ~32 regs): On Volta/A100 GPUs (65,536 regs/SM, max 2,048 threads/SM, cliff at ~32 regs):
| Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy | | Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy |
| ----------------------- | ------------------- | ------------------ | --------- | | ------------------------ | ------------------- | ------------------ | --------- |
| 20 regs (main pipeline) | 3,276 | 2,048 | 100% | | ~16 regs (main pipeline) | 4,096 | 2,048 | 100% |
| 32 regs | 2,048 | 2,048 | 100% | | 32 regs | 2,048 | 2,048 | 100% |
| 48 regs (effects) | 1,365 | 1,365 | ~67% | | 48 regs (effects) | 1,365 | 1,365 | ~67% |
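
The table rows above follow from simple division. The sketch below (in Python, purely as a calculator, not part of levlib) reproduces them under one simplifying assumption: register-limited threads are `floor(regs_per_SM / regs_per_thread)`, ignoring the warp-granular allocation real hardware applies. The function names are hypothetical.

```python
# Register-limited occupancy, mirroring the tables above.
# Assumption of this sketch: threads = floor(regs_per_sm / regs_per_thread),
# with no warp-granular rounding.

REGS_PER_SM = 65_536


def occupancy(regs_per_thread, max_threads_per_sm):
    """Return (register-limited threads, hardware-capped threads, occupancy fraction)."""
    reg_limited = REGS_PER_SM // regs_per_thread
    actual = min(reg_limited, max_threads_per_sm)
    return reg_limited, actual, actual / max_threads_per_sm


def register_cliff(max_threads_per_sm):
    """Largest per-thread register budget that still permits full occupancy."""
    return REGS_PER_SM / max_threads_per_sm


if __name__ == "__main__":
    for name, max_threads in (("Ampere/Ada", 1536), ("Volta/A100", 2048)):
        print(f"{name}: cliff at ~{register_cliff(max_threads):.0f} regs")
        for regs in (16, 20, 32, 48):
            reg_limited, actual, occ = occupancy(regs, max_threads)
            print(f"  {regs:>2} regs: {reg_limited:>5} -> {actual:>5} threads, {occ:.0%}")
```

Anything at or below the cliff is capped by the hardware thread limit rather than by registers, which is why 16, 20, and 32 registers all report full occupancy.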
On low-end mobile (ARM Mali Bifrost/Valhall, 64 regs/thread, cliff fixed at 32 regs): On low-end mobile (ARM Mali Bifrost/Valhall, 64 regs/thread, cliff fixed at 32 regs):
@@ -261,11 +277,12 @@ Our design has two branch points:
Every thread in every warp of a draw call sees the same `mode` value. **Zero divergence, zero Every thread in every warp of a draw call sees the same `mode` value. **Zero divergence, zero
cost.** cost.**
2. **`shape_kind` (flat varying from storage buffer): which SDF to evaluate.** This is category 3. 2. **`flags` (flat varying from storage buffer): gradient/texture/stroke mode.** This is category 3.
The `flat` interpolation qualifier ensures that all fragments rasterized from one primitive's quad The `flat` interpolation qualifier ensures that all fragments rasterized from one primitive's quad
receive the same `shape_kind` value. Divergence can only occur at the **boundary between two receive the same flag bits. However, since the SDF path now evaluates only `sdRoundedBox` with no
adjacent primitives of different kinds**, where the rasterizer might pack fragments from both kind dispatch, the only flag-dependent branches are gradient vs. texture vs. solid color selection
primitives into the same warp. (all lightweight, at 3-8 instructions per path). Divergence at primitive boundaries between
different flag combinations has negligible cost.
For category 3, the divergence analysis depends on primitive size: For category 3, the divergence analysis depends on primitive size:
@@ -282,9 +299,10 @@ For category 3, the divergence analysis depends on primitive size:
frame-level divergence is typically **1-3%** of all warps. frame-level divergence is typically **1-3%** of all warps.
At 1-3% divergence, the throughput impact is negligible. At 4K with 12.4M total fragments At 1-3% divergence, the throughput impact is negligible. At 4K with 12.4M total fragments
(~387,000 warps), divergent boundary warps number in the low thousands. Each divergent warp pays at (~387,000 warps), divergent boundary warps number in the low thousands. Without kind dispatch, the
most ~25 extra instructions (the cost of the longest untaken SDF branch). At ~12G instructions/sec longest untaken branch is the gradient evaluation (~8 instructions), not a different SDF function.
on a mid-range GPU, that totals ~4μs — under 0.05% of an 8.3ms (120 FPS) frame budget. This is Each divergent warp pays at most ~8 extra instructions. At ~12G instructions/sec on a mid-range GPU,
that totals ~1.3μs — under 0.02% of an 8.3ms (120 FPS) frame budget. This is
confirmed by production renderers that use exactly this pattern: confirmed by production renderers that use exactly this pattern:
- **vger / vger-rs** (Audulus): single pipeline, 11 primitive kinds dispatched by a `switch` on a - **vger / vger-rs** (Audulus): single pipeline, 11 primitive kinds dispatched by a `switch` on a
@@ -309,9 +327,10 @@ our design:
> have no per-fragment data-dependent branches in the main pipeline. > have no per-fragment data-dependent branches in the main pipeline.
2. **Branches where both paths are very long.** If both sides of a branch are 500+ instructions, 2. **Branches where both paths are very long.** If both sides of a branch are 500+ instructions,
divergent warps pay double a large cost. Our SDF functions are 10-25 instructions each. Even divergent warps pay double a large cost. Without kind dispatch, the SDF path always evaluates
fully divergent, the penalty is ~25 extra instructions, less than a single texture sample's `sdRoundedBox`; the only branches are gradient/texture/solid color selection at 3-8 instructions
latency. each. Even fully divergent, the penalty is ~8 extra instructions, less than a single texture
sample's latency.
3. **Branches that prevent compiler optimizations.** Some compilers cannot schedule instructions 3. **Branches that prevent compiler optimizations.** Some compilers cannot schedule instructions
across branch boundaries, reducing VLIW utilization on older architectures. Modern GPUs (NVIDIA across branch boundaries, reducing VLIW utilization on older architectures. Modern GPUs (NVIDIA
@@ -319,9 +338,9 @@ our design:
concern. concern.
4. **Register pressure from the union of all branches.** This is the real cost, and it is why we 4. **Register pressure from the union of all branches.** This is the real cost, and it is why we
split heavy effects (shadows, glass) into separate pipelines. Within the main pipeline, all SDF split heavy effects (shadows, glass) into separate pipelines. Within the main pipeline, the SDF
branches have similar register footprints (12-22 registers), so combining them causes negligible path has a single evaluation (sdRoundedBox) with flag-based color selection, clustering at ~15-18
occupancy loss. registers, so there is negligible occupancy loss.
**References:** **References:**
@@ -342,17 +361,19 @@ our design:
### Main pipeline: SDF + tessellated (unified) ### Main pipeline: SDF + tessellated (unified)
The main pipeline serves two submission modes through a single `TRIANGLELIST` pipeline and a single The main pipeline serves two submission modes through a single `TRIANGLELIST` pipeline and a single
vertex input layout, distinguished by a push constant: vertex input layout, distinguished by a mode marker in the `Primitive.flags` field (low byte:
0 = tessellated, 1 = SDF). The tessellated path sets this to 0 via zero-initialization in the vertex
shader; the SDF path sets it to 1 via `pack_flags`.
- **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry. Unchanged from - **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry. Used for text
today. Used for text (SDL_ttf atlas sampling), polylines, triangle fans/strips, gradient-filled (SDL_ttf atlas sampling), triangle fans/strips, ellipses, regular polygons, circle sectors, and
shapes, and any user-provided raw vertex geometry. any user-provided raw vertex geometry.
- **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of `Primitive` - **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of `Primitive`
structs, drawn instanced. Used for all shapes with closed-form signed distance functions. structs, drawn instanced. Used for all shapes with closed-form signed distance functions.
Both modes converge on the same fragment shader, which dispatches on a `shape_kind` discriminant Both modes use the same fragment shader. The fragment shader checks the mode marker: mode 0 computes
carried either in the vertex data (tessellated, always `Solid = 0`) or in the storage-buffer `out = color * texture(tex, uv)`; mode 1 always evaluates `sdRoundedBox` and applies
primitive struct (SDF modes). gradient/texture/solid color based on flag bits.
#### Why SDF for shapes #### Why SDF for shapes
@@ -391,49 +412,60 @@ SDF primitives are submitted via a GPU storage buffer indexed by `gl_InstanceInd
shader, rather than encoding per-primitive data redundantly in vertex attributes. This follows the shader, rather than encoding per-primitive data redundantly in vertex attributes. This follows the
pattern used by both Zed GPUI and vger-rs. pattern used by both Zed GPUI and vger-rs.
Each SDF shape is described by a single `Primitive` struct (~56 bytes) in the storage buffer. The Each SDF shape is described by a single `Primitive` struct (80 bytes) in the storage buffer. The
vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position from the unit vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position from the unit
vertex and the primitive's bounds, and passes shape parameters to the fragment shader via `flat` vertex and the primitive's bounds, and passes shape parameters to the fragment shader via `flat`
interpolated varyings. interpolated varyings.
Compared to encoding per-primitive data in vertex attributes (the "fat vertex" approach), storage- Compared to encoding per-primitive data in vertex attributes (the "fat vertex" approach), storage-
buffer instancing eliminates the 4-6× data duplication across quad corners. A rounded rectangle costs buffer instancing eliminates the 4-6× data duplication across quad corners. A rounded rectangle costs
56 bytes instead of 4 vertices × 40+ bytes = 160+ bytes. 80 bytes instead of 4 vertices × 40+ bytes = 160+ bytes.
The tessellated path retains the existing direct vertex buffer layout (20 bytes/vertex, no storage The tessellated path retains the existing direct vertex buffer layout (20 bytes/vertex, no storage
buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every invocation buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every invocation
in a draw call has the same mode — so it is effectively free on all modern GPUs. in a draw call has the same mode — so it is effectively free on all modern GPUs.
#### Shape kinds #### Shape folding
Primitives in the main pipeline's storage buffer carry a `Shape_Kind` discriminant: The SDF path evaluates a single function — `sdRoundedBox` — for all primitives. There is no
`Shape_Kind` enum or per-primitive kind dispatch in the fragment shader. Shapes that are algebraically
special cases of a rounded rectangle are emitted as RRect primitives by the CPU-side drawing procs:
| Kind | SDF function | Notes | | User-facing shape | RRect mapping | Notes |
| ---------- | -------------------------------------- | --------------------------------------------------------- | | ---------------------------- | -------------------------------------------- | ---------------------------------------- |
| `RRect` | `sdRoundedBox` (iq) | Per-corner radii. Covers all Clay rectangles and borders. | | Rectangle (sharp or rounded) | Direct | Per-corner radii from `radii` param |
| `Circle` | `sdCircle` | Filled and stroked. | | Circle | `half_size = (r, r)`, `radii = (r, r, r, r)` | Uniform radii = half-size |
| `Ellipse` | `sdEllipse` | Exact (iq's closed-form). | | Line segment / capsule | Rotated RRect, `radii = half_thickness` | Stadium shape (fully-rounded minor axis) |
| `Segment` | `sdSegment` capsule | Rounded caps, correct sub-pixel thin lines. | | Full ring / annulus | Stroked circle at mid-radius | `stroke_px = outer - inner` |
| `Ring_Arc` | `abs(sdCircle) - thickness` + arc mask | Rings, arcs, circle sectors unified. |
| `NGon` | `sdRegularPolygon` | Regular n-gon for n ≥ 5. |
The `Solid` kind (value 0) is reserved for the tessellated path, where `shape_kind` is implicitly Shapes without a closed-form RRect reduction are drawn via the tessellated path:
zero because the fragment shader receives it from zero-initialized vertex attributes.
Stroke/outline variants of each shape are handled by the `Shape_Flags` bit set rather than separate | Shape | Tessellated proc | Method |
shape kinds. The fragment shader transforms `d = abs(d) - stroke_width` when the `Stroke` flag is | ------------------------- | ---------------------------------- | -------------------------- |
set. | Ellipse | `tes_ellipse`, `tes_ellipse_lines` | Triangle fan approximation |
| Regular polygon (N-gon) | `tes_polygon`, `tes_polygon_lines` | Triangle fan from center |
| Circle sector (pie slice) | `tes_sector` | Triangle fan arc |
The `Shape_Flags` bit set controls rendering mode per primitive:
| Flag | Bit | Effect |
| ----------------- | --- | -------------------------------------------------------------------- |
| `Stroke` | 0 | Outline instead of fill (`d = abs(d) - stroke_width/2`) |
| `Textured` | 1 | Sample texture using `uv.uv_rect` (mutually exclusive with Gradient) |
| `Gradient` | 2 | Bilinear 4-corner interpolation from `uv.corner_colors` |
| `Gradient_Radial` | 3 | Radial 2-color falloff (inner/outer) from `uv.corner_colors[0..1]` |
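
As a rough illustration of the `Gradient` flag's bilinear 4-corner interpolation, here is a minimal sketch in Python rather than shader code. It assumes the clockwise corner order TL, TR, BR, BL and assumes that `(u, v)` is the fragment's normalized position inside the primitive bounds; the actual shader may compute this differently, and `mix` and `bilinear_gradient` are hypothetical names.

```python
# Bilinear 4-corner gradient sketch.
# Corner order: clockwise from top-left (TL, TR, BR, BL).
# Assumption: (u, v) is the normalized position inside the primitive,
# with (0, 0) at the top-left corner and (1, 1) at the bottom-right.


def mix(a, b, t):
    """Component-wise linear interpolation, like GLSL mix()."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def bilinear_gradient(corners, u, v):
    tl, tr, br, bl = corners
    top = mix(tl, tr, u)      # interpolate along the top edge
    bottom = mix(bl, br, u)   # interpolate along the bottom edge
    return mix(top, bottom, v)


if __name__ == "__main__":
    red, green = (1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)
    blue, white = (0.0, 0.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)
    print(bilinear_gradient((red, green, blue, white), 0.5, 0.5))
```

Each corner reproduces its own color exactly, and the center is the average of all four.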
**What stays tessellated:** **What stays tessellated:**
- Text (SDL_ttf atlas, pending future MSDF evaluation) - Text (SDL_ttf atlas, pending future MSDF evaluation)
- `rectangle_gradient`, `circle_gradient` (per-vertex color interpolation) - Ellipses (`tes_ellipse`, `tes_ellipse_lines`)
- `triangle_fan`, `triangle_strip` (arbitrary user-provided point lists) - Regular polygons (`tes_polygon`, `tes_polygon_lines`)
- `line_strip` / polylines (SDF polyline rendering is possible but complex; deferred) - Circle sectors / pie slices (`tes_sector`)
- `tes_triangle`, `tes_triangle_fan`, `tes_triangle_strip` (arbitrary user-provided geometry)
- Any raw vertex geometry submitted via `prepare_shape` - Any raw vertex geometry submitted via `prepare_shape`
The rule: if the shape has a closed-form SDF, it goes SDF. If it's described only by a vertex list or The design rule: if the shape reduces to `sdRoundedBox`, it goes SDF. If it requires a different SDF
needs per-vertex color interpolation, it stays tessellated. function or is described by a vertex list, it stays tessellated.
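
The folding rule can be checked numerically. The sketch below ports iq's `sdRoundedBox` to Python and expresses circle, capsule, and full ring as special cases of it. It is an illustration, not levlib's code: the helper names are hypothetical, the capsule is assumed to be already rotated into its local frame (segment along the x axis, centered at the origin) with half-length extended by the cap radius, and the radii order follows iq's reference rather than any levlib convention.

```python
import math


def sd_rounded_box(p, half_size, radii):
    """Port of iq's sdRoundedBox; radii are ordered as in iq's reference (+x+y, +x-y, -x+y, -x-y)."""
    px, py = p
    rx, ry = (radii[0], radii[1]) if px > 0.0 else (radii[2], radii[3])
    r = rx if py > 0.0 else ry
    qx = abs(px) - half_size[0] + r
    qy = abs(py) - half_size[1] + r
    return min(max(qx, qy), 0.0) + math.hypot(max(qx, 0.0), max(qy, 0.0)) - r


def circle(p, radius):
    # half_size = (r, r) and all radii = r collapses the box to a circle.
    return sd_rounded_box(p, (radius, radius), (radius,) * 4)


def capsule(p, length, thickness):
    # Stadium: radii equal half-thickness; half-length grows by the cap radius (assumption).
    h = thickness / 2.0
    return sd_rounded_box(p, (length / 2.0 + h, h), (h,) * 4)


def stroke(d, stroke_width):
    # Stroke flag transform: d = abs(d) - stroke_width / 2.
    return abs(d) - stroke_width / 2.0


def ring(p, inner, outer):
    # Stroked circle at mid-radius, stroke thickness = outer - inner.
    return stroke(circle(p, (inner + outer) / 2.0), outer - inner)


if __name__ == "__main__":
    print(circle((3.0, 4.0), 2.0))       # distance from a radius-2 circle
    print(capsule((7.0, 0.0), 10.0, 2.0))
    print(ring((6.0, 0.0), 4.0, 6.0))
```

With uniform radii equal to the half-size, the box term reduces to `length(p) - r`, which is why no separate circle function is needed.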
### Effects pipeline ### Effects pipeline
@@ -547,21 +579,21 @@ The `Primitive` struct for SDF shapes lives in the storage buffer, not in vertex
``` ```
Primitive :: struct { Primitive :: struct {
bounds: [4]f32, // 0: min_x, min_y, max_x, max_y bounds: [4]f32, // 0: min_x, min_y, max_x, max_y
color: Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8 color: Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8
kind_flags: u32, // 20: (kind as u32) | (flags as u32 << 8) flags: u32, // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
rotation: f32, // 24: shader self-rotation in radians rotation_sc: u32, // 24: packed f16 pair (sin, cos). Requires .Rotated flag.
_pad: f32, // 28: alignment _pad: f32, // 28: reserved for future use
params: Shape_Params, // 32: raw union, 32 bytes (two vec4s of shape-specific data) params: Shape_Params, // 32: per-kind params union (half_feather, radii, etc.) (32 bytes)
uv_rect: [4]f32, // 64: texture UV sub-region (u_min, v_min, u_max, v_max) uv: Uv_Or_Effects, // 64: texture UV rect or gradient/outline parameters (16 bytes)
} }
// Total: 80 bytes (std430 aligned) // Total: 80 bytes (std430 aligned)
``` ```
`Shape_Params` is a `#raw_union` with named variants per shape kind (`rrect`, `circle`, `segment`, `RRect_Params` holds the rounded-rectangle parameters directly — there is no `Shape_Params` union.
etc.), ensuring type safety on the CPU side and zero-cost reinterpretation on the GPU side. The `Uv_Or_Gradient` is a `#raw_union` that aliases `[4]f32` (texture UV rect) with `[4]Color` (gradient
`uv_rect` field is used by textured SDF primitives (Shape_Flag.Textured); non-textured primitives corner colors, clockwise from top-left: TL, TR, BR, BL). The `flags` field encodes both the
leave it zeroed. tessellated/SDF mode marker (low byte) and shape flags (bits 8+) via `pack_flags`.
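
The 80-byte total and the flag packing can be sanity-checked with a short script. This is a sketch in Python, not the Odin implementation: field offsets come from the struct comments above, while the `pack_flags` signature, the flag constant names, and the sin-low/cos-high ordering of `rotation_sc` are assumptions made for illustration.

```python
import math
import struct

# Field order and sizes follow the struct comments above (little-endian, no implicit padding).
PRIMITIVE_LAYOUT = struct.Struct(
    "<"
    "4f"   # bounds       offset 0,  16 bytes
    "4B"   # color        offset 16, u8x4
    "I"    # flags        offset 20
    "I"    # rotation_sc  offset 24, packed f16 pair
    "f"    # _pad         offset 28
    "8f"   # params       offset 32, 32 bytes
    "16s"  # uv           offset 64, 16 raw bytes (UV rect or 4 corner colors)
)

MODE_TESSELLATED = 0
MODE_SDF = 1

# Bit positions from the Shape_Flags table.
FLAG_STROKE = 1 << 0
FLAG_TEXTURED = 1 << 1
FLAG_GRADIENT = 1 << 2
FLAG_GRADIENT_RADIAL = 1 << 3


def pack_flags(mode, shape_flags):
    """Hypothetical re-creation: mode marker in the low byte, shape flags in bits 8+."""
    return (mode & 0xFF) | (shape_flags << 8)


def unpack_flags(packed):
    return packed & 0xFF, packed >> 8


def pack_rotation_sc(radians):
    """Pack (sin, cos) as two f16 values; sin in the low 16 bits is an assumption."""
    raw = struct.pack("<2e", math.sin(radians), math.cos(radians))
    (packed,) = struct.unpack("<I", raw)
    return packed


# The raw union works because both views occupy the same 16 bytes.
uv_rect_bytes = struct.pack("<4f", 0.0, 0.0, 1.0, 1.0)   # [4]f32 texture UV rect
gradient_bytes = struct.pack("<16B", *([255] * 16))      # [4]Color, 4 x u8 each

if __name__ == "__main__":
    print(PRIMITIVE_LAYOUT.size)
    print(hex(pack_flags(MODE_SDF, FLAG_STROKE | FLAG_GRADIENT)))
```

Because a zero-initialized `flags` word decodes to mode 0 with no flags set, the tessellated path needs no explicit packing step, which matches the zero-initialization described for the vertex shader.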
### Draw submission order ### Draw submission order
@@ -583,14 +615,15 @@ invariant is that each primitive is drawn exactly once, in the pipeline that own
Text rendering currently uses SDL_ttf's GPU text engine, which rasterizes glyphs per `(font, size)` Text rendering currently uses SDL_ttf's GPU text engine, which rasterizes glyphs per `(font, size)`
pair into bitmap atlases and emits indexed triangle data via `GetGPUTextDrawData`. This path is pair into bitmap atlases and emits indexed triangle data via `GetGPUTextDrawData`. This path is
**unchanged** by the SDF migration — text continues to flow through the main pipeline's tessellated **unchanged** by the SDF migration — text continues to flow through the main pipeline's tessellated
mode with `shape_kind = Solid`, sampling the SDL_ttf atlas texture. mode with `mode = 0`, sampling the SDL_ttf atlas texture.
A future phase may evaluate MSDF (multi-channel signed distance field) text rendering, which would A future phase may evaluate MSDF (multi-channel signed distance field) text rendering, which would
allow resolution-independent glyph rendering from a single small atlas per font. This would involve: allow resolution-independent glyph rendering from a single small atlas per font. This would involve:
- Offline atlas generation via Chlumský's msdf-atlas-gen tool. - Offline atlas generation via Chlumský's msdf-atlas-gen tool.
- Runtime glyph metrics via `vendor:stb/truetype` (already in the Odin distribution). - Runtime glyph metrics via `vendor:stb/truetype` (already in the Odin distribution).
- A new `Shape_Kind.MSDF_Glyph` variant in the main pipeline's fragment shader. - A new MSDF glyph mode in the fragment shader, which would require reintroducing a mode/kind
distinction (the current shader evaluates only `sdRoundedBox` with no kind dispatch).
- Potential removal of the SDL_ttf dependency. - Potential removal of the SDL_ttf dependency.
This is explicitly deferred. The SDF shape migration is independent of and does not block text This is explicitly deferred. The SDF shape migration is independent of and does not block text
@@ -659,30 +692,26 @@ with the same texture but different samplers produce separate draw calls, which
#### Textured draw procs #### Textured draw procs
Textured rectangles route through the existing SDF path via `draw.rectangle_texture` and Textured rectangles route through the existing SDF path via `sdf_rectangle_texture` and
`draw.rectangle_texture_corners`, mirroring `draw.rectangle` and `draw.rectangle_corners` exactly — `sdf_rectangle_texture_corners`, mirroring `sdf_rectangle` and `sdf_rectangle_corners` exactly —
same parameters, same naming — with the color parameter replaced by a texture ID plus an optional same parameters, same naming — with the color parameter replaced by a texture ID plus an optional
tint. tint.
An earlier iteration of this design considered a separate tessellated `draw.texture` proc for An earlier iteration of this design considered a separate tessellated proc for "simple" fullscreen
"simple" fullscreen quads, on the theory that the tessellated path's lower register count (~16 regs quads, on the theory that the tessellated path's lower register count (~16 regs vs ~18 for the SDF
vs ~24 for the SDF textured branch) would improve occupancy at large fragment counts. Applying the textured branch) would improve occupancy at large fragment counts. Applying the register-pressure
register-pressure analysis from the pipeline-strategy section above shows this is wrong: both 16 and analysis from the pipeline-strategy section above shows this is wrong: both 16 and 18 registers are
24 registers are well below the register cliff (~43 regs on consumer Ampere/Ada, ~32 on Volta/A100), well below the register cliff (~43 regs on consumer Ampere/Ada, ~32 on Volta/A100), so both run at
so both run at 100% occupancy. The remaining ALU difference (~15 extra instructions for the SDF 100% occupancy. The remaining ALU difference (~15 extra instructions for the SDF evaluation) amounts
evaluation) amounts to ~20μs at 4K — below noise. Meanwhile, splitting into a separate pipeline to ~20μs at 4K — below noise. Meanwhile, splitting into a separate pipeline would add ~15μs per
would add ~15μs per pipeline bind on the CPU side per scissor, matching or exceeding the GPU-side pipeline bind on the CPU side per scissor, matching or exceeding the GPU-side savings. Within the
savings. Within the main pipeline, unified remains strictly better. main pipeline, unified remains strictly better.
The naming convention follows the existing shape API: `rectangle_texture` and The naming convention uses `sdf_` and `tes_` prefixes to indicate the rendering path, with suffixes
`rectangle_texture_corners` sit alongside `rectangle` and `rectangle_corners`, mirroring the for modifiers: `sdf_rectangle_texture` and `sdf_rectangle_texture_corners` sit alongside
`rectangle_gradient` / `circle_gradient` pattern where the shape is the primary noun and the `sdf_rectangle` (solid or gradient overload). Proc groups like `sdf_rectangle` dispatch to
modifier (gradient, texture) is secondary. This groups related procs together in autocomplete `sdf_rectangle_solid` or `sdf_rectangle_gradient` based on argument count. Future per-shape texture
(`rectangle_*`) and reads as natural English ("draw a rectangle with a texture"). variants (`sdf_circle_texture`) are additive.
Future per-shape texture variants (`circle_texture`, `ellipse_texture`, `polygon_texture`) are
reserved by this naming convention and require only a `Shape_Flag.Textured` bit plus a small
per-shape UV mapping function in the fragment shader. These are additive.
#### What SDF anti-aliasing does and does not do for textured draws #### What SDF anti-aliasing does and does not do for textured draws
@@ -721,9 +750,9 @@ textures onto a free list that is processed in `r_end_frame`, not at the call si
Clay's `RenderCommandType.Image` is handled by dereferencing `imageData: rawptr` as a pointer to a Clay's `RenderCommandType.Image` is handled by dereferencing `imageData: rawptr` as a pointer to a
`Clay_Image_Data` struct containing a `Texture_Id`, `Fit_Mode`, and tint color. Routing mirrors the `Clay_Image_Data` struct containing a `Texture_Id`, `Fit_Mode`, and tint color. Routing mirrors the
existing rectangle handling: zero `cornerRadius` dispatches to `draw.texture` (tessellated), nonzero existing rectangle handling: zero `cornerRadius` dispatches to `sdf_rectangle_texture` (SDF, sharp
dispatches to `draw.rectangle_texture_corners` (SDF). A `fit_params` call computes UVs from the fit corners), nonzero dispatches to `sdf_rectangle_texture_corners` (SDF, per-corner radii). A
mode before dispatch. `fit_params` call computes UVs from the fit mode before dispatch.
#### Deferred features #### Deferred features
@@ -735,7 +764,7 @@ The following are plumbed in the descriptor but not implemented in phase 1:
- **3D textures, arrays, cube maps**: `Texture_Desc.type` and `depth_or_layers` fields exist. - **3D textures, arrays, cube maps**: `Texture_Desc.type` and `depth_or_layers` fields exist.
- **Additional samplers**: anisotropic, trilinear, clamp-to-border — additive enum values. - **Additional samplers**: anisotropic, trilinear, clamp-to-border — additive enum values.
- **Atlas packing**: internal optimization for sub-batch coalescing; invisible to callers. - **Atlas packing**: internal optimization for sub-batch coalescing; invisible to callers.
- **Per-shape texture variants**: `circle_texture`, `ellipse_texture`, etc. — reserved by naming. - **Per-shape texture variants**: `sdf_circle_texture`, `tes_ellipse_texture`, `tes_polygon_texture` — potential future additions, reserved by naming convention.
**References:** **References:**


@@ -4,6 +4,7 @@ import "base:runtime"
import "core:c" import "core:c"
import "core:log" import "core:log"
import "core:math" import "core:math"
import "core:strings" import "core:strings"
import sdl "vendor:sdl3" import sdl "vendor:sdl3"
import sdl_ttf "vendor:sdl3/ttf" import sdl_ttf "vendor:sdl3/ttf"
@@ -27,72 +28,22 @@ BUFFER_INIT_SIZE :: 256
INITIAL_LAYER_SIZE :: 5 INITIAL_LAYER_SIZE :: 5
INITIAL_SCISSOR_SIZE :: 10 INITIAL_SCISSOR_SIZE :: 10
// --------------------------------------------------------------------------------------------------------------------- // Sentinel value: when passed as msaa_samples, `init` will use the maximum MSAA sample count
// ----- Color ------------------------- // supported by the GPU for the swapchain format.
// --------------------------------------------------------------------------------------------------------------------- MSAA_MAX :: sdl.GPUSampleCount(0xFF)
Color :: distinct [4]u8 // ----- Default parameter values -----
// Named constants for non-zero default procedure parameters. Centralizes magic numbers
BLACK :: Color{0, 0, 0, 255} // so they can be tuned in one place and referenced by name in proc signatures.
WHITE :: Color{255, 255, 255, 255} DFT_FEATHER_PX :: 1 // Total AA feather width in physical pixels (half on each side of boundary).
RED :: Color{255, 0, 0, 255} DFT_STROKE_THICKNESS :: 1 // Default line/stroke thickness in logical pixels.
GREEN :: Color{0, 255, 0, 255} DFT_FONT_SIZE :: 44 // Default font size in points for text rendering.
BLUE :: Color{0, 0, 255, 255} DFT_CIRC_END_ANGLE :: 360 // Full-circle end angle in degrees (ring/arc).
BLANK :: Color{0, 0, 0, 0} DFT_UV_RECT :: Rectangle{0, 0, 1, 1} // Full-texture UV rect (rectangle_texture).
DFT_TINT :: WHITE // Default texture tint (rectangle_texture, clay_image).
// Convert clay.Color ([4]c.float in 0-255 range) to Color. DFT_TEXT_COLOR :: BLACK // Default text color.
color_from_clay :: proc(clay_color: clay.Color) -> Color { DFT_CLEAR_COLOR :: BLACK // Default clear color for end().
return Color{u8(clay_color[0]), u8(clay_color[1]), u8(clay_color[2]), u8(clay_color[3])} DFT_SAMPLER :: Sampler_Preset.Linear_Clamp // Default texture sampler preset.
}
// Convert Color to [4]f32 in 0.0-1.0 range. Useful for SDL interop (e.g. clear color).
color_to_f32 :: proc(color: Color) -> [4]f32 {
INV :: 1.0 / 255.0
return {f32(color[0]) * INV, f32(color[1]) * INV, f32(color[2]) * INV, f32(color[3]) * INV}
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Core types --------------------
// ---------------------------------------------------------------------------------------------------------------------
Rectangle :: struct {
x: f32,
y: f32,
width: f32,
height: f32,
}
Sub_Batch_Kind :: enum u8 {
Shapes, // non-indexed, white texture or user texture, mode 0
Text, // indexed, atlas texture, mode 0
SDF, // instanced unit quad, white texture or user texture, mode 1
}
Sub_Batch :: struct {
kind: Sub_Batch_Kind,
offset: u32, // Shapes: vertex offset; Text: text_batch index; SDF: primitive index
count: u32, // Shapes: vertex count; Text: always 1; SDF: primitive count
texture_id: Texture_Id,
sampler: Sampler_Preset,
}
Layer :: struct {
bounds: Rectangle,
sub_batch_start: u32,
sub_batch_len: u32,
scissor_start: u32,
scissor_len: u32,
}
Scissor :: struct {
bounds: sdl.Rect,
sub_batch_start: u32,
sub_batch_len: u32,
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Global state ------------------
// ---------------------------------------------------------------------------------------------------------------------
GLOB: Global GLOB: Global
@@ -153,6 +104,136 @@ Global :: struct {
odin_context: runtime.Context, // Odin context captured at init for use in callbacks. odin_context: runtime.Context, // Odin context captured at init for use in callbacks.
} }
// ---------------------------------------------------------------------------------------------------------------------
// ----- Core types --------------------
// ---------------------------------------------------------------------------------------------------------------------
// A 2D position in world space. Non-distinct alias for [2]f32 — bare literals like {100, 200}
// work at non-ambiguous call sites.
//
// Coordinate system: origin is the top-left corner of the window/layer. X increases rightward,
// Y increases downward. This matches SDL, HTML Canvas, and most 2D UI coordinate conventions.
// All position parameters in the draw API (center, origin, start_position, end_position, etc.)
// use this coordinate system.
//
// Units are logical pixels (pre-DPI-scaling). The renderer multiplies by dpi_scaling internally
// before uploading to the GPU. A Vec2{100, 50} refers to the same visual location regardless of
// display DPI.
Vec2 :: [2]f32
// An RGBA color with 8 bits per channel. Distinct type over [4]u8 so that proc-group
// overloads can disambiguate Color from other 4-byte structs.
//
// Channel order: R, G, B, A (indices 0, 1, 2, 3). Alpha 255 is fully opaque, 0 is fully
// transparent. This matches the GPU-side layout: the shader unpacks via unpackUnorm4x8 which
// reads the bytes in memory order as R, G, B, A and normalizes each to [0, 1].
//
// When used in the Primitive struct (Primitive.color), the 4 bytes are stored as a u32 in
// native byte order and unpacked by the shader.
Color :: [4]u8
BLACK :: Color{0, 0, 0, 255}
WHITE :: Color{255, 255, 255, 255}
RED :: Color{255, 0, 0, 255}
GREEN :: Color{0, 255, 0, 255}
BLUE :: Color{0, 0, 255, 255}
BLANK :: Color{0, 0, 0, 0}
// Per-corner rounding radii for rectangles, specified clockwise from top-left.
// All values are in logical pixels (pre-DPI-scaling).
Rectangle_Radii :: struct {
top_left: f32,
top_right: f32,
bottom_right: f32,
bottom_left: f32,
}
// A linear gradient between two colors along an arbitrary angle.
// The `end_color` is the color at the end of the gradient direction; the shape's fill `color`
// parameter acts as the start color. `angle` is in degrees: 0 = left-to-right, 90 = top-to-bottom.
Linear_Gradient :: struct {
end_color: Color,
angle: f32,
}
// A radial gradient between two colors from center to edge.
// The `outer_color` is the color at the shape's edge; the shape's fill `color` parameter
// acts as the inner (center) color.
Radial_Gradient :: struct {
outer_color: Color,
}
// Tagged union for specifying a gradient on any shape. Defaults to `nil` (no gradient).
// When a gradient is active, the shape's `color` parameter becomes the start/inner color,
// and the gradient struct carries the end/outer color plus any type-specific parameters.
//
// Gradient and Textured are mutually exclusive on the same primitive. If a shape uses
// `rectangle_texture`, gradients are not applicable — use the tint color instead.
Gradient :: union {
Linear_Gradient,
Radial_Gradient,
}
// Convert clay.Color ([4]c.float in 0-255 range) to Color.
color_from_clay :: #force_inline proc(clay_color: clay.Color) -> Color {
return Color{u8(clay_color[0]), u8(clay_color[1]), u8(clay_color[2]), u8(clay_color[3])}
}
// Convert Color to [4]f32 in 0.0-1.0 range. Useful for SDL interop (e.g. clear color).
color_to_f32 :: proc(color: Color) -> [4]f32 {
INV :: 1.0 / 255.0
return {f32(color[0]) * INV, f32(color[1]) * INV, f32(color[2]) * INV, f32(color[3]) * INV}
}
// Pre-multiply RGB channels by alpha. The tessellated vertex path and text path require
// premultiplied colors because the blend state is ONE, ONE_MINUS_SRC_ALPHA and the
// tessellated fragment shader passes vertex color through without further modification.
// Users who construct Vertex structs manually for prepare_shape must premultiply their colors.
premultiply_color :: #force_inline proc(color: Color) -> Color {
a := u32(color[3])
return Color {
u8((u32(color[0]) * a + 127) / 255),
u8((u32(color[1]) * a + 127) / 255),
u8((u32(color[2]) * a + 127) / 255),
color[3],
}
}
Rectangle :: struct {
x: f32,
y: f32,
width: f32,
height: f32,
}
Sub_Batch_Kind :: enum u8 {
Tessellated, // non-indexed, white texture or user texture, mode 0
Text, // indexed, atlas texture, mode 0
SDF, // instanced unit quad, white texture or user texture, mode 1
}
Sub_Batch :: struct {
kind: Sub_Batch_Kind,
offset: u32, // Tessellated: vertex offset; Text: text_batch index; SDF: primitive index
count: u32, // Tessellated: vertex count; Text: always 1; SDF: primitive count
texture_id: Texture_Id,
sampler: Sampler_Preset,
}
Layer :: struct {
bounds: Rectangle,
sub_batch_start: u32,
sub_batch_len: u32,
scissor_start: u32,
scissor_len: u32,
}
Scissor :: struct {
bounds: sdl.Rect,
sub_batch_start: u32,
sub_batch_len: u32,
}
Init_Options :: struct { Init_Options :: struct {
// MSAA sample count. Default is ._1 (no MSAA). SDF rendering does not benefit from MSAA // MSAA sample count. Default is ._1 (no MSAA). SDF rendering does not benefit from MSAA
// because SDF fragments compute coverage analytically via `smoothstep`. MSAA helps for // because SDF fragments compute coverage analytically via `smoothstep`. MSAA helps for
@@ -162,10 +243,6 @@ Init_Options :: struct {
msaa_samples: sdl.GPUSampleCount, msaa_samples: sdl.GPUSampleCount,
} }
// Sentinel value: when passed as msaa_samples, `init` will use the maximum MSAA sample count
// supported by the GPU for the swapchain format.
MSAA_MAX :: sdl.GPUSampleCount(0xFF)
// Initialize the renderer. Returns false if GPU pipeline or text engine creation fails. // Initialize the renderer. Returns false if GPU pipeline or text engine creation fails.
@(require_results) @(require_results)
init :: proc( init :: proc(
@@ -378,12 +455,13 @@ new_layer :: proc(prev_layer: ^Layer, bounds: Rectangle) -> ^Layer {
// --------------------------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Submit shape vertices (colored triangles) to the given layer for rendering. // Submit shape vertices (colored triangles) to the given layer for rendering.
// TODO: Should probably be renamed to better match tessellated naming conventions in the library.
prepare_shape :: proc(layer: ^Layer, vertices: []Vertex) { prepare_shape :: proc(layer: ^Layer, vertices: []Vertex) {
if len(vertices) == 0 do return if len(vertices) == 0 do return
offset := u32(len(GLOB.tmp_shape_verts)) offset := u32(len(GLOB.tmp_shape_verts))
append(&GLOB.tmp_shape_verts, ..vertices) append(&GLOB.tmp_shape_verts, ..vertices)
scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1] scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1]
append_or_extend_sub_batch(scissor, layer, .Shapes, offset, u32(len(vertices))) append_or_extend_sub_batch(scissor, layer, .Tessellated, offset, u32(len(vertices)))
} }
// Submit an SDF primitive to the given layer for rendering. // Submit an SDF primitive to the given layer for rendering.
@@ -409,6 +487,9 @@ prepare_text :: proc(layer: ^Layer, text: Text) {
base_x := math.round(text.position[0] * GLOB.dpi_scaling) base_x := math.round(text.position[0] * GLOB.dpi_scaling)
base_y := math.round(text.position[1] * GLOB.dpi_scaling) base_y := math.round(text.position[1] * GLOB.dpi_scaling)
// Premultiply text color once — reused across all glyph vertices.
pm_color := premultiply_color(text.color)
for data != nil { for data != nil {
vertex_start := u32(len(GLOB.tmp_text_verts)) vertex_start := u32(len(GLOB.tmp_text_verts))
index_start := u32(len(GLOB.tmp_text_indices)) index_start := u32(len(GLOB.tmp_text_indices))
@@ -419,7 +500,7 @@ prepare_text :: proc(layer: ^Layer, text: Text) {
uv := data.uv[i] uv := data.uv[i]
append( append(
&GLOB.tmp_text_verts, &GLOB.tmp_text_verts,
Vertex{position = {pos.x + base_x, -pos.y + base_y}, uv = {uv.x, uv.y}, color = text.color}, Vertex{position = {pos.x + base_x, -pos.y + base_y}, uv = {uv.x, uv.y}, color = pm_color},
) )
} }
@@ -457,6 +538,9 @@ prepare_text_transformed :: proc(layer: ^Layer, text: Text, transform: Transform
scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1] scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1]
// Premultiply text color once — reused across all glyph vertices.
pm_color := premultiply_color(text.color)
for data != nil { for data != nil {
vertex_start := u32(len(GLOB.tmp_text_verts)) vertex_start := u32(len(GLOB.tmp_text_verts))
index_start := u32(len(GLOB.tmp_text_indices)) index_start := u32(len(GLOB.tmp_text_indices))
@@ -469,7 +553,7 @@ prepare_text_transformed :: proc(layer: ^Layer, text: Text, transform: Transform
// so we apply directly — no per-vertex DPI divide/multiply. // so we apply directly — no per-vertex DPI divide/multiply.
append( append(
&GLOB.tmp_text_verts, &GLOB.tmp_text_verts,
Vertex{position = apply_transform(transform, {pos.x, -pos.y}), uv = {uv.x, uv.y}, color = text.color}, Vertex{position = apply_transform(transform, {pos.x, -pos.y}), uv = {uv.x, uv.y}, color = pm_color},
) )
} }
@@ -502,7 +586,7 @@ append_or_extend_sub_batch :: proc(
offset: u32, offset: u32,
count: u32, count: u32,
texture_id: Texture_Id = INVALID_TEXTURE, texture_id: Texture_Id = INVALID_TEXTURE,
sampler: Sampler_Preset = .Linear_Clamp, sampler: Sampler_Preset = DFT_SAMPLER,
) { ) {
if scissor.sub_batch_len > 0 { if scissor.sub_batch_len > 0 {
last := &GLOB.tmp_sub_batches[scissor.sub_batch_start + scissor.sub_batch_len - 1] last := &GLOB.tmp_sub_batches[scissor.sub_batch_start + scissor.sub_batch_len - 1]
@@ -595,6 +679,9 @@ prepare_clay_batch :: proc(
switch (render_command.commandType) { switch (render_command.commandType) {
case clay.RenderCommandType.None: case clay.RenderCommandType.None:
log.errorf(
"Received render command with type None. This generally means we're in some kind of fucked up state.",
)
case clay.RenderCommandType.Text: case clay.RenderCommandType.Text:
render_data := render_command.renderData.text render_data := render_command.renderData.text
txt := string(render_data.stringContents.chars[:render_data.stringContents.length]) txt := string(render_data.stringContents.chars[:render_data.stringContents.length])
@@ -609,46 +696,29 @@ prepare_clay_batch :: proc(
) )
prepare_text(layer, Text{sdl_text, {bounds.x, bounds.y}, color_from_clay(render_data.textColor)}) prepare_text(layer, Text{sdl_text, {bounds.x, bounds.y}, color_from_clay(render_data.textColor)})
case clay.RenderCommandType.Image: case clay.RenderCommandType.Image:
// Any texture
render_data := render_command.renderData.image render_data := render_command.renderData.image
if render_data.imageData == nil do continue if render_data.imageData == nil do continue
img_data := (^Clay_Image_Data)(render_data.imageData)^ img_data := (^Clay_Image_Data)(render_data.imageData)^
cr := render_data.cornerRadius cr := render_data.cornerRadius
radii := [4]f32{cr.topLeft, cr.topRight, cr.bottomRight, cr.bottomLeft} radii := Rectangle_Radii {
top_left = cr.topLeft,
top_right = cr.topRight,
bottom_right = cr.bottomRight,
bottom_left = cr.bottomLeft,
}
// Background color behind the image (Clay allows it) // Background color behind the image (Clay allows it)
bg := color_from_clay(render_data.backgroundColor) bg := color_from_clay(render_data.backgroundColor)
if bg[3] > 0 { if bg[3] > 0 {
if radii == {0, 0, 0, 0} { rectangle(layer, bounds, bg, radii = radii)
rectangle(layer, bounds, bg)
} else {
rectangle_corners(layer, bounds, radii, bg)
}
} }
// Compute fit UVs // Compute fit UVs
uv, sampler, inner := fit_params(img_data.fit, bounds, img_data.texture_id) uv, sampler, inner := fit_params(img_data.fit, bounds, img_data.texture_id)
// Draw the image — route by cornerRadius // Draw the image
if radii == {0, 0, 0, 0} { rectangle_texture(layer, inner, img_data.texture_id, img_data.tint, uv, sampler, radii)
rectangle_texture(
layer,
inner,
img_data.texture_id,
tint = img_data.tint,
uv_rect = uv,
sampler = sampler,
)
} else {
rectangle_texture_corners(
layer,
inner,
radii,
img_data.texture_id,
tint = img_data.tint,
uv_rect = uv,
sampler = sampler,
)
}
case clay.RenderCommandType.ScissorStart: case clay.RenderCommandType.ScissorStart:
if bounds.width == 0 || bounds.height == 0 do continue if bounds.width == 0 || bounds.height == 0 do continue
@@ -680,34 +750,38 @@ prepare_clay_batch :: proc(
render_data := render_command.renderData.rectangle render_data := render_command.renderData.rectangle
cr := render_data.cornerRadius cr := render_data.cornerRadius
color := color_from_clay(render_data.backgroundColor) color := color_from_clay(render_data.backgroundColor)
radii := [4]f32{cr.topLeft, cr.topRight, cr.bottomRight, cr.bottomLeft} radii := Rectangle_Radii {
top_left = cr.topLeft,
if radii == {0, 0, 0, 0} { top_right = cr.topRight,
rectangle(layer, bounds, color) bottom_right = cr.bottomRight,
} else { bottom_left = cr.bottomLeft,
rectangle_corners(layer, bounds, radii, color)
} }
rectangle(layer, bounds, color, radii = radii)
case clay.RenderCommandType.Border: case clay.RenderCommandType.Border:
render_data := render_command.renderData.border render_data := render_command.renderData.border
cr := render_data.cornerRadius cr := render_data.cornerRadius
color := color_from_clay(render_data.color) color := color_from_clay(render_data.color)
thickness := f32(render_data.width.top) thickness := f32(render_data.width.top)
radii := [4]f32{cr.topLeft, cr.topRight, cr.bottomRight, cr.bottomLeft} radii := Rectangle_Radii {
top_left = cr.topLeft,
if radii == {0, 0, 0, 0} { top_right = cr.topRight,
rectangle_lines(layer, bounds, color, thickness) bottom_right = cr.bottomRight,
} else { bottom_left = cr.bottomLeft,
rectangle_corners_lines(layer, bounds, radii, color, thickness)
} }
rectangle(layer, bounds, BLANK, outline_color = color, outline_width = thickness, radii = radii)
case clay.RenderCommandType.Custom: if custom_draw != nil { case clay.RenderCommandType.Custom: if custom_draw != nil {
custom_draw(layer, bounds, render_command.renderData.custom) custom_draw(layer, bounds, render_command.renderData.custom)
} else {
log.error("Received clay render command of type custom but no custom_draw proc provided.")
} }
} }
} }
} }
// Render primitives. clear_color is the background fill before any layers are drawn. // Render primitives. clear_color is the background fill before any layers are drawn.
end :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window, clear_color: Color = BLACK) { end :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window, clear_color: Color = DFT_CLEAR_COLOR) {
cmd_buffer := sdl.AcquireGPUCommandBuffer(device) cmd_buffer := sdl.AcquireGPUCommandBuffer(device)
if cmd_buffer == nil { if cmd_buffer == nil {
log.panicf("Failed to acquire GPU command buffer: %s", sdl.GetError()) log.panicf("Failed to acquire GPU command buffer: %s", sdl.GetError())
@@ -740,7 +814,16 @@ end :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window, clear_color: Color = BL
render_texture = GLOB.msaa_texture render_texture = GLOB.msaa_texture
} }
clear_color_f32 := color_to_f32(clear_color) // Premultiply clear color: the blend state is ONE, ONE_MINUS_SRC_ALPHA (premultiplied),
// so the clear color must also be premultiplied for correct background compositing.
clear_color_straight := color_to_f32(clear_color)
clear_alpha := clear_color_straight[3]
clear_color_f32 := [4]f32 {
clear_color_straight[0] * clear_alpha,
clear_color_straight[1] * clear_alpha,
clear_color_straight[2] * clear_alpha,
clear_alpha,
}
// Draw layers. One render pass per layer; sub-batches draw in submission order within each scissor. // Draw layers. One render pass per layer; sub-batches draw in submission order within each scissor.
for &layer, index in GLOB.layers { for &layer, index in GLOB.layers {
@@ -948,10 +1031,20 @@ Transform_2D :: struct {
// origin pivot point in local space (measured from the shape's natural reference point). // origin pivot point in local space (measured from the shape's natural reference point).
// rotation_deg rotation in degrees, counter-clockwise. // rotation_deg rotation in degrees, counter-clockwise.
// //
build_pivot_rotation :: proc(position: [2]f32, origin: [2]f32, rotation_deg: f32) -> Transform_2D { build_pivot_rotation :: proc(position: Vec2, origin: Vec2, rotation_deg: f32) -> Transform_2D {
radians := math.to_radians(rotation_deg) radians := math.to_radians(rotation_deg)
cos_angle := math.cos(radians) cos_angle := math.cos(radians)
sin_angle := math.sin(radians) sin_angle := math.sin(radians)
return build_pivot_rotation_sc(position, origin, cos_angle, sin_angle)
}
// Variant of build_pivot_rotation that accepts pre-computed cos/sin values,
// avoiding redundant trigonometry when the caller has already computed them.
build_pivot_rotation_sc :: #force_inline proc(
position: Vec2,
origin: Vec2,
cos_angle, sin_angle: f32,
) -> Transform_2D {
return Transform_2D { return Transform_2D {
m00 = cos_angle, m00 = cos_angle,
m01 = -sin_angle, m01 = -sin_angle,
@@ -963,7 +1056,7 @@ build_pivot_rotation :: proc(position: [2]f32, origin: [2]f32, rotation_deg: f32
} }
// Apply the transform to a local-space point, producing a world-space point. // Apply the transform to a local-space point, producing a world-space point.
apply_transform :: #force_inline proc(transform: Transform_2D, point: [2]f32) -> [2]f32 { apply_transform :: #force_inline proc(transform: Transform_2D, point: Vec2) -> Vec2 {
return { return {
transform.m00 * point.x + transform.m01 * point.y + transform.tx, transform.m00 * point.x + transform.m01 * point.y + transform.tx,
transform.m10 * point.x + transform.m11 * point.y + transform.ty, transform.m10 * point.x + transform.m11 * point.y + transform.ty,
@@ -973,7 +1066,7 @@ apply_transform :: #force_inline proc(transform: Transform_2D, point: [2]f32) ->
// Fast-path check callers use BEFORE building a transform. // Fast-path check callers use BEFORE building a transform.
// Returns true if either the origin is non-zero or rotation is non-zero, // Returns true if either the origin is non-zero or rotation is non-zero,
// meaning a transform actually needs to be computed. // meaning a transform actually needs to be computed.
needs_transform :: #force_inline proc(origin: [2]f32, rotation: f32) -> bool { needs_transform :: #force_inline proc(origin: Vec2, rotation: f32) -> bool {
return origin != {0, 0} || rotation != 0 return origin != {0, 0} || rotation != 0
} }


@@ -3,6 +3,10 @@ package draw_qr
import draw ".." import draw ".."
import "../../qrcode" import "../../qrcode"
DFT_QR_DARK :: draw.BLACK // Default QR code dark module color.
DFT_QR_LIGHT :: draw.WHITE // Default QR code light module color.
DFT_QR_BOOST_ECL :: true // Default QR error correction level boost.
// Returns the number of bytes to_texture will write for the given encoded // Returns the number of bytes to_texture will write for the given encoded
// QR buffer. Equivalent to size*size*4 where size = qrcode.get_size(qrcode_buf). // QR buffer. Equivalent to size*size*4 where size = qrcode.get_size(qrcode_buf).
texture_size :: #force_inline proc(qrcode_buf: []u8) -> int { texture_size :: #force_inline proc(qrcode_buf: []u8) -> int {
@@ -21,8 +25,8 @@ texture_size :: #force_inline proc(qrcode_buf: []u8) -> int {
to_texture :: proc( to_texture :: proc(
qrcode_buf: []u8, qrcode_buf: []u8,
texture_buf: []u8, texture_buf: []u8,
dark: draw.Color = draw.BLACK, dark: draw.Color = DFT_QR_DARK,
light: draw.Color = draw.WHITE, light: draw.Color = DFT_QR_LIGHT,
) -> ( ) -> (
desc: draw.Texture_Desc, desc: draw.Texture_Desc,
ok: bool, ok: bool,
@@ -65,8 +69,8 @@ to_texture :: proc(
@(require_results) @(require_results)
register_texture_from_raw :: proc( register_texture_from_raw :: proc(
qrcode_buf: []u8, qrcode_buf: []u8,
dark: draw.Color = draw.BLACK, dark: draw.Color = DFT_QR_DARK,
light: draw.Color = draw.WHITE, light: draw.Color = DFT_QR_LIGHT,
temp_allocator := context.temp_allocator, temp_allocator := context.temp_allocator,
) -> ( ) -> (
texture: draw.Texture_Id, texture: draw.Texture_Id,
@@ -96,9 +100,9 @@ register_texture_from_text :: proc(
min_version: int = qrcode.VERSION_MIN, min_version: int = qrcode.VERSION_MIN,
max_version: int = qrcode.VERSION_MAX, max_version: int = qrcode.VERSION_MAX,
mask: Maybe(qrcode.Mask) = nil, mask: Maybe(qrcode.Mask) = nil,
boost_ecl: bool = true, boost_ecl: bool = DFT_QR_BOOST_ECL,
dark: draw.Color = draw.BLACK, dark: draw.Color = DFT_QR_DARK,
light: draw.Color = draw.WHITE, light: draw.Color = DFT_QR_LIGHT,
temp_allocator := context.temp_allocator, temp_allocator := context.temp_allocator,
) -> ( ) -> (
texture: draw.Texture_Id, texture: draw.Texture_Id,
@@ -135,9 +139,9 @@ register_texture_from_binary :: proc(
min_version: int = qrcode.VERSION_MIN, min_version: int = qrcode.VERSION_MIN,
max_version: int = qrcode.VERSION_MAX, max_version: int = qrcode.VERSION_MAX,
mask: Maybe(qrcode.Mask) = nil, mask: Maybe(qrcode.Mask) = nil,
boost_ecl: bool = true, boost_ecl: bool = DFT_QR_BOOST_ECL,
dark: draw.Color = draw.BLACK, dark: draw.Color = DFT_QR_DARK,
light: draw.Color = draw.WHITE, light: draw.Color = DFT_QR_LIGHT,
temp_allocator := context.temp_allocator, temp_allocator := context.temp_allocator,
) -> ( ) -> (
texture: draw.Texture_Id, texture: draw.Texture_Id,
@@ -163,13 +167,13 @@ register_texture_from_binary :: proc(
register_texture_from :: proc { register_texture_from :: proc {
register_texture_from_text, register_texture_from_text,
register_texture_from_binary register_texture_from_binary,
} }
// Default fit=.Fit preserves the QR's square aspect; override as needed. // Default fit=.Fit preserves the QR's square aspect; override as needed.
clay_image :: #force_inline proc( clay_image :: #force_inline proc(
texture: draw.Texture_Id, texture: draw.Texture_Id,
tint: draw.Color = draw.WHITE, tint: draw.Color = draw.DFT_TINT,
) -> draw.Clay_Image_Data { ) -> draw.Clay_Image_Data {
return draw.clay_image_data(texture, fit = .Fit, tint = tint) return draw.clay_image_data(texture, fit = .Fit, tint = tint)
} }


@@ -1,6 +1,7 @@
package examples package examples
import "../../draw" import "../../draw"
import "../../draw/tess"
import "../../vendor/clay" import "../../vendor/clay"
import "core:math" import "core:math"
import "core:os" import "core:os"
@@ -28,19 +29,26 @@ hellope_shapes :: proc() {
base_layer := draw.begin({width = 500, height = 500}) base_layer := draw.begin({width = 500, height = 500})
// Background // Background
draw.rectangle(base_layer, {0, 0, 500, 500}, {40, 40, 40, 255}) draw.rectangle(base_layer, {0, 0, 500, 500}, draw.Color{40, 40, 40, 255})
// ----- Shapes without rotation (existing demo) ----- // ----- Shapes without rotation (existing demo) -----
draw.rectangle(base_layer, {20, 20, 200, 120}, {80, 120, 200, 255}) draw.rectangle(
draw.rectangle_lines(base_layer, {20, 20, 200, 120}, draw.WHITE, thickness = 2) base_layer,
draw.rectangle(base_layer, {240, 20, 240, 120}, {200, 80, 80, 255}, roundness = 0.3) {20, 20, 200, 120},
draw.rectangle_gradient( draw.Color{80, 120, 200, 255},
outline_color = draw.WHITE,
outline_width = 2,
radii = {top_right = 15, top_left = 5},
)
red_rect_radii := draw.uniform_radii({240, 20, 240, 120}, 0.3)
red_rect_radii.bottom_left = 0
draw.rectangle(base_layer, {240, 20, 240, 120}, draw.Color{200, 80, 80, 255}, radii = red_rect_radii)
draw.rectangle(
base_layer, base_layer,
{20, 160, 460, 60}, {20, 160, 460, 60},
{255, 0, 0, 255}, {255, 0, 0, 255},
{0, 255, 0, 255}, gradient = draw.Linear_Gradient{end_color = {0, 0, 255, 255}, angle = 0},
{0, 0, 255, 255},
{255, 255, 0, 255},
) )
// ----- Rotation demos ----- // ----- Rotation demos -----
@@ -50,17 +58,12 @@ hellope_shapes :: proc() {
draw.rectangle( draw.rectangle(
base_layer, base_layer,
rect, rect,
{100, 200, 100, 255}, draw.Color{100, 200, 100, 255},
origin = draw.center_of(rect), outline_color = draw.WHITE,
rotation = spin_angle, outline_width = 2,
)
draw.rectangle_lines(
base_layer,
rect,
draw.WHITE,
thickness = 2,
origin = draw.center_of(rect), origin = draw.center_of(rect),
rotation = spin_angle, rotation = spin_angle,
feather_px = 1,
) )
// Rounded rectangle rotating around its center // Rounded rectangle rotating around its center
@@ -68,8 +71,8 @@ hellope_shapes :: proc() {
draw.rectangle( draw.rectangle(
base_layer, base_layer,
rrect, rrect,
{200, 100, 200, 255}, draw.Color{200, 100, 200, 255},
roundness = 0.4, radii = draw.uniform_radii(rrect, 0.4),
origin = draw.center_of(rrect), origin = draw.center_of(rrect),
rotation = spin_angle, rotation = spin_angle,
) )
@@ -80,18 +83,34 @@ hellope_shapes :: proc() {
// Circle orbiting a point (moon orbiting planet) // Circle orbiting a point (moon orbiting planet)
// Convention B: center = pivot point (planet), origin = offset from moon center to pivot. // Convention B: center = pivot point (planet), origin = offset from moon center to pivot.
// Moon's visual center at rotation=0: planet_pos - origin = (100, 450) - (0, 40) = (100, 410). // Moon's visual center at rotation=0: planet_pos - origin = (100, 450) - (0, 40) = (100, 410).
planet_pos := [2]f32{100, 450} planet_pos := draw.Vec2{100, 450}
draw.circle(base_layer, planet_pos, 8, {200, 200, 200, 255}) // planet (stationary) draw.circle(base_layer, planet_pos, 8, {200, 200, 200, 255}) // planet (stationary)
draw.circle(base_layer, planet_pos, 5, {100, 150, 255, 255}, origin = {0, 40}, rotation = spin_angle) // moon orbiting draw.circle(
base_layer,
planet_pos,
5,
{100, 150, 255, 255},
origin = draw.Vec2{0, 40},
rotation = spin_angle,
) // moon orbiting
// Ring arc rotating in place // Sector (pie slice) rotating in place
draw.ring(base_layer, {250, 450}, 15, 30, 0, 270, {100, 100, 220, 255}, rotation = spin_angle) draw.ring(
base_layer,
draw.Vec2{250, 450},
0,
30,
{100, 100, 220, 255},
start_angle = 0,
end_angle = 270,
rotation = spin_angle,
)
// Triangle rotating around its center // Triangle rotating around its center
tv1 := [2]f32{350, 420} tv1 := draw.Vec2{350, 420}
tv2 := [2]f32{420, 480} tv2 := draw.Vec2{420, 480}
tv3 := [2]f32{340, 480} tv3 := draw.Vec2{340, 480}
draw.triangle( tess.triangle_aa(
base_layer, base_layer,
tv1, tv1,
tv2, tv2,
@@ -102,8 +121,16 @@ hellope_shapes :: proc() {
) )
// Polygon rotating around its center (already had rotation; now with origin for orbit) // Polygon rotating around its center (already had rotation; now with origin for orbit)
draw.polygon(base_layer, {460, 450}, 6, 30, {180, 100, 220, 255}, rotation = spin_angle) draw.polygon(
draw.polygon_lines(base_layer, {460, 450}, 6, 30, draw.WHITE, rotation = spin_angle, thickness = 2) base_layer,
{460, 450},
6,
30,
{180, 100, 220, 255},
outline_color = draw.WHITE,
outline_width = 2,
rotation = spin_angle,
)
draw.end(gpu, window) draw.end(gpu, window)
} }
@@ -134,9 +161,6 @@ hellope_text :: proc() {
spin_angle += 0.5 spin_angle += 0.5
base_layer := draw.begin({width = 600, height = 600}) base_layer := draw.begin({width = 600, height = 600})
// Grey background
draw.rectangle(base_layer, {0, 0, 600, 600}, {127, 127, 127, 255})
// ----- Text API demos ----- // ----- Text API demos -----
// Cached text with id — TTF_Text reused across frames (good for text-heavy apps) // Cached text with id — TTF_Text reused across frames (good for text-heavy apps)
@@ -176,7 +200,7 @@ hellope_text :: proc() {
// Measure text for manual layout // Measure text for manual layout
size := draw.measure_text("Measured!", JETBRAINS_MONO_REGULAR, FONT_SIZE) size := draw.measure_text("Measured!", JETBRAINS_MONO_REGULAR, FONT_SIZE)
draw.rectangle(base_layer, {300 - size.x / 2, 380, size.x, size.y}, {60, 60, 60, 200}) draw.rectangle(base_layer, {300 - size.x / 2, 380, size.x, size.y}, draw.Color{60, 60, 60, 200})
draw.text( draw.text(
base_layer, base_layer,
"Measured!", "Measured!",
@@ -200,7 +224,7 @@ hellope_text :: proc() {
id = CORNER_SPIN_ID, id = CORNER_SPIN_ID,
) )
draw.end(gpu, window) draw.end(gpu, window, draw.Color{127, 127, 127, 255})
} }
} }
@@ -338,15 +362,21 @@ hellope_custom :: proc() {
draw_custom :: proc(layer: ^draw.Layer, bounds: draw.Rectangle, render_data: clay.CustomRenderData) { draw_custom :: proc(layer: ^draw.Layer, bounds: draw.Rectangle, render_data: clay.CustomRenderData) {
gauge := cast(^Gauge)render_data.customData gauge := cast(^Gauge)render_data.customData
// Background from clay's backgroundColor border_width: f32 = 2
draw.rectangle(layer, bounds, draw.color_from_clay(render_data.backgroundColor), roundness = 0.25) draw.rectangle(
layer,
bounds,
draw.color_from_clay(render_data.backgroundColor),
outline_color = draw.WHITE,
outline_width = border_width,
)
// Fill bar fill := draw.Rectangle {
fill := bounds x = bounds.x,
fill.width *= gauge.value y = bounds.y,
draw.rectangle(layer, fill, gauge.color, roundness = 0.25) width = bounds.width * gauge.value,
height = bounds.height,
// Border }
draw.rectangle_lines(layer, bounds, draw.WHITE, thickness = 2, roundness = 0.25) draw.rectangle(layer, fill, gauge.color)
} }
} }


@@ -89,7 +89,7 @@ textures :: proc() {
base_layer := draw.begin({width = 800, height = 600}) base_layer := draw.begin({width = 800, height = 600})
// Background // Background
draw.rectangle(base_layer, {0, 0, 800, 600}, {30, 30, 30, 255}) draw.rectangle(base_layer, {0, 0, 800, 600}, draw.Color{30, 30, 30, 255})
//----- Row 1: Sampler presets (y=30) ---------------------------------- //----- Row 1: Sampler presets (y=30) ----------------------------------
@@ -154,7 +154,7 @@ textures :: proc() {
ROW2_Y :: f32(190) ROW2_Y :: f32(190)
// QR code (RGBA texture with baked colors, nearest sampling) // QR code (RGBA texture with baked colors, nearest sampling)
draw.rectangle(base_layer, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, {255, 255, 255, 255}) // white bg draw.rectangle(base_layer, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, draw.Color{255, 255, 255, 255}) // white bg
draw.rectangle_texture( draw.rectangle_texture(
base_layer, base_layer,
{COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE},
@@ -176,7 +176,7 @@ textures :: proc() {
{COL2, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, {COL2, ROW2_Y, ITEM_SIZE, ITEM_SIZE},
checker_texture, checker_texture,
sampler = .Nearest_Clamp, sampler = .Nearest_Clamp,
roundness = 0.3, radii = draw.uniform_radii({COL2, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, 0.3),
) )
draw.text( draw.text(
base_layer, base_layer,
@@ -213,7 +213,7 @@ textures :: proc() {
// Stretch // Stretch
uv_s, sampler_s, inner_s := draw.fit_params(.Stretch, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture) uv_s, sampler_s, inner_s := draw.fit_params(.Stretch, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
draw.rectangle(base_layer, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255}) // bg draw.rectangle(base_layer, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255}) // bg
draw.rectangle_texture(base_layer, inner_s, stripe_texture, uv_rect = uv_s, sampler = sampler_s) draw.rectangle_texture(base_layer, inner_s, stripe_texture, uv_rect = uv_s, sampler = sampler_s)
draw.text( draw.text(
base_layer, base_layer,
@@ -226,7 +226,7 @@ textures :: proc() {
// Fill (center-crop) // Fill (center-crop)
uv_f, sampler_f, inner_f := draw.fit_params(.Fill, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture) uv_f, sampler_f, inner_f := draw.fit_params(.Fill, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
draw.rectangle(base_layer, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255}) draw.rectangle(base_layer, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255})
draw.rectangle_texture(base_layer, inner_f, stripe_texture, uv_rect = uv_f, sampler = sampler_f) draw.rectangle_texture(base_layer, inner_f, stripe_texture, uv_rect = uv_f, sampler = sampler_f)
draw.text( draw.text(
base_layer, base_layer,
@@ -239,7 +239,7 @@ textures :: proc() {
// Fit (letterbox) // Fit (letterbox)
uv_ft, sampler_ft, inner_ft := draw.fit_params(.Fit, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture) uv_ft, sampler_ft, inner_ft := draw.fit_params(.Fit, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
draw.rectangle(base_layer, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255}) // visible margin bg draw.rectangle(base_layer, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255}) // visible margin bg
draw.rectangle_texture(base_layer, inner_ft, stripe_texture, uv_rect = uv_ft, sampler = sampler_ft) draw.rectangle_texture(base_layer, inner_ft, stripe_texture, uv_rect = uv_ft, sampler = sampler_ft)
draw.text( draw.text(
base_layer, base_layer,
@@ -251,12 +251,12 @@ textures :: proc() {
) )
// Per-corner radii // Per-corner radii
draw.rectangle_texture_corners( draw.rectangle_texture(
base_layer, base_layer,
{COL4, ROW3_Y, FIT_SIZE, FIT_SIZE}, {COL4, ROW3_Y, FIT_SIZE, FIT_SIZE},
{20, 0, 20, 0},
checker_texture, checker_texture,
sampler = .Nearest_Clamp, sampler = .Nearest_Clamp,
radii = {20, 0, 20, 0},
) )
draw.text( draw.text(
base_layer, base_layer,


@@ -5,8 +5,13 @@ import "core:log"
import "core:mem" import "core:mem"
import sdl "vendor:sdl3" import sdl "vendor:sdl3"
// Vertex layout for tessellated and text geometry.
// IMPORTANT: `color` must be premultiplied alpha (RGB channels pre-scaled by alpha).
// The tessellated fragment shader passes vertex color through directly — it does NOT
// premultiply. The blend state is ONE, ONE_MINUS_SRC_ALPHA (premultiplied-over).
// Use `premultiply_color` when constructing vertices manually for `prepare_shape`.
Vertex :: struct { Vertex :: struct {
position: [2]f32, position: Vec2,
uv: [2]f32, uv: [2]f32,
color: Color, color: Color,
} }
@@ -23,99 +28,127 @@ TextBatch :: struct {
// ----- SDF primitive types ----------- // ----- SDF primitive types -----------
// ---------------------------------------------------------------------------------------------------------------- // ----------------------------------------------------------------------------------------------------------------
// The SDF path evaluates one of four signed distance functions per primitive, dispatched
// by Shape_Kind encoded in the low byte of Primitive.flags:
//
// RRect — rounded rectangle with per-corner radii (sdRoundedBox). Also covers circles
// (uniform radii = half-size), capsule-style line segments (rotated, max rounding),
// and other RRect-reducible shapes.
// NGon — regular polygon with N sides and optional rounding.
// Ellipse — approximate ellipse (non-exact SDF, suitable for UI but not for shape merging).
// Ring_Arc — annular ring with optional angular clipping. Covers full rings, partial arcs,
// pie slices (inner_radius = 0), and loading spinners.
Shape_Kind :: enum u8 { Shape_Kind :: enum u8 {
Solid = 0, Solid = 0, // tessellated path (mode marker; not a real SDF kind)
RRect = 1, RRect = 1,
Circle = 2, NGon = 2,
Ellipse = 3, Ellipse = 3,
Segment = 4, Ring_Arc = 4,
Ring_Arc = 5,
NGon = 6,
} }
Shape_Flag :: enum u8 { Shape_Flag :: enum u8 {
Stroke, Textured, // bit 0: sample texture using uv.uv_rect (mutually exclusive with Gradient)
Textured, Gradient, // bit 1: 2-color gradient using uv.effects.gradient_color as end/outer color
Gradient_Radial, // bit 2: if set with Gradient, radial from center; else linear at angle
Outline, // bit 3: outer outline band using uv.effects.outline_color; CPU expands bounds by outline_width
Rotated, // bit 4: shape has non-zero rotation; rotation_sc contains packed sin/cos
Arc_Narrow, // bit 5: ring arc span ≤ π — intersect half-planes. Neither Arc bit = full ring.
Arc_Wide, // bit 6: ring arc span > π — union half-planes. Neither Arc bit = full ring.
} }
Shape_Flags :: bit_set[Shape_Flag;u8] Shape_Flags :: bit_set[Shape_Flag;u8]
RRect_Params :: struct { RRect_Params :: struct {
half_size: [2]f32, half_size: [2]f32,
radii: [4]f32, radii: [4]f32,
soft_px: f32, half_feather: f32, // feather_px * 0.5; shader uses smoothstep(-h, h, d)
stroke_px: f32, _: f32,
}
Circle_Params :: struct {
radius: f32,
soft_px: f32,
stroke_px: f32,
_: [5]f32,
}
Ellipse_Params :: struct {
radii: [2]f32,
soft_px: f32,
stroke_px: f32,
_: [4]f32,
}
Segment_Params :: struct {
a: [2]f32,
b: [2]f32,
width: f32,
soft_px: f32,
_: [2]f32,
}
Ring_Arc_Params :: struct {
inner_radius: f32,
outer_radius: f32,
start_rad: f32,
end_rad: f32,
soft_px: f32,
_: [3]f32,
} }
NGon_Params :: struct { NGon_Params :: struct {
radius: f32, radius: f32,
rotation: f32, sides: f32,
sides: f32, half_feather: f32, // feather_px * 0.5; shader uses smoothstep(-h, h, d)
soft_px: f32, _: [5]f32,
stroke_px: f32, }
_: [3]f32,
Ellipse_Params :: struct {
radii: [2]f32,
half_feather: f32, // feather_px * 0.5; shader uses smoothstep(-h, h, d)
_: [5]f32,
}
Ring_Arc_Params :: struct {
inner_radius: f32, // inner radius in physical pixels (0 for pie slice)
outer_radius: f32, // outer radius in physical pixels
normal_start: [2]f32, // pre-computed outward normal of start edge: (sin(start), -cos(start))
normal_end: [2]f32, // pre-computed outward normal of end edge: (-sin(end), cos(end))
half_feather: f32, // feather_px * 0.5; shader uses smoothstep(-h, h, d)
_: f32,
} }
Shape_Params :: struct #raw_union { Shape_Params :: struct #raw_union {
rrect: RRect_Params, rrect: RRect_Params,
circle: Circle_Params,
ellipse: Ellipse_Params,
segment: Segment_Params,
ring_arc: Ring_Arc_Params,
ngon: NGon_Params, ngon: NGon_Params,
ellipse: Ellipse_Params,
ring_arc: Ring_Arc_Params,
raw: [8]f32, raw: [8]f32,
} }
#assert(size_of(Shape_Params) == 32) #assert(size_of(Shape_Params) == 32)
// GPU layout: 64 bytes, std430-compatible. The shader declares this as a storage buffer struct. // GPU-side storage for 2-color gradient parameters and/or outline parameters.
// Packed into 16 bytes to alias with uv_rect in the Uv_Or_Effects raw union.
// The shader reads gradient_color and outline_color via unpackUnorm4x8.
// gradient_dir_sc stores the pre-computed gradient direction as (cos, sin) in f16 pair
// via unpackHalf2x16. outline_packed stores outline_width as f16 via unpackHalf2x16.
Gradient_Outline :: struct {
gradient_color: Color, // 0: end (linear) or outer (radial) gradient color
outline_color: Color, // 4: outline band color
gradient_dir_sc: u32, // 8: packed f16 pair: low = cos(angle), high = sin(angle) — pre-computed gradient direction
outline_packed: u32, // 12: packed f16 pair: low = outline_width (f16, physical pixels), high = reserved
}
#assert(size_of(Gradient_Outline) == 16)
// Uv_Or_Effects aliases the final 16 bytes of a Primitive. When .Textured is set,
// uv_rect holds texture-atlas coordinates. When .Gradient or .Outline is set,
// effects holds 2-color gradient parameters and/or outline parameters.
// Textured and Gradient are mutually exclusive; if both are set, Gradient takes precedence.
Uv_Or_Effects :: struct #raw_union {
uv_rect: [4]f32, // u_min, v_min, u_max, v_max (default {0,0,1,1})
effects: Gradient_Outline, // gradient + outline parameters
}
// GPU layout: 80 bytes, std430-compatible. The shader declares this as a storage buffer struct.
// The low byte of `flags` encodes the Shape_Kind (0 = tessellated, 1-4 = SDF kinds).
// Bits 8-15 encode Shape_Flags (Textured, Gradient, Gradient_Radial, Outline, Rotated, Arc_Narrow, Arc_Wide).
// rotation_sc stores pre-computed sin/cos of the rotation angle as a packed f16 pair,
// avoiding per-pixel trigonometry in the fragment shader. Only read when .Rotated is set.
Primitive :: struct { Primitive :: struct {
bounds: [4]f32, // 0: min_x, min_y, max_x, max_y (world-space, pre-DPI) bounds: [4]f32, // 0: min_x, min_y, max_x, max_y (world-space, pre-DPI)
color: Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8 color: Color, // 16: u8x4, fill color / gradient start color / texture tint
kind_flags: u32, // 20: (kind as u32) | (flags as u32 << 8) flags: u32, // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
rotation: f32, // 24: shader self-rotation in radians (used by RRect, Ellipse) rotation_sc: u32, // 24: packed f16 pair: low = sin(angle), high = cos(angle). Requires .Rotated flag.
_pad: f32, // 28: alignment to vec4 boundary _pad: f32, // 28: reserved for future use
params: Shape_Params, // 32: two vec4s of shape params params: Shape_Params, // 32: per-kind shape parameters (raw union, 32 bytes)
uv_rect: [4]f32, // 64: u_min, v_min, u_max, v_max (default {0,0,1,1}) uv: Uv_Or_Effects, // 64: texture coords or gradient/outline parameters
} }
#assert(size_of(Primitive) == 80) #assert(size_of(Primitive) == 80)
// Pack shape kind and flags into the Primitive.flags field. The low byte encodes the Shape_Kind
// (which also serves as the SDF mode marker — kind > 0 means SDF path). The tessellated path
// leaves the field at 0 (Solid kind, set by vertex shader zero-initialization).
pack_kind_flags :: #force_inline proc(kind: Shape_Kind, flags: Shape_Flags) -> u32 { pack_kind_flags :: #force_inline proc(kind: Shape_Kind, flags: Shape_Flags) -> u32 {
return u32(kind) | (u32(transmute(u8)flags) << 8) return u32(kind) | (u32(transmute(u8)flags) << 8)
} }
// Pack two f16 values into a single u32 for GPU consumption via unpackHalf2x16.
// Used to pack gradient_dir_sc (cos/sin) and outline_packed (width/reserved) in Gradient_Outline.
pack_f16_pair :: #force_inline proc(low, high: f16) -> u32 {
return u32(transmute(u16)low) | (u32(transmute(u16)high) << 16)
}
Pipeline_2D_Base :: struct { Pipeline_2D_Base :: struct {
sdl_pipeline: ^sdl.GPUGraphicsPipeline, sdl_pipeline: ^sdl.GPUGraphicsPipeline,
vertex_buffer: Buffer, vertex_buffer: Buffer,
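Note on the packing above: the sketch below restates `pack_kind_flags` and the shader-side decode (`f_flags & 255u`, `(f_flags >> 8u) & 255u`, `(flags >> 5u) & 3u`) as CPU-side C++ so the bit layout can be checked in isolation. The C++ names (`ShapeKind`, `ShapeFlag`, `unpack_*`, `pack_u16_pair`) are illustrative and not part of the package. `pack_u16_pair` takes already-encoded half-float bit patterns, because the float-to-f16 conversion itself is left to the host language here.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors Shape_Kind: stored in the low byte of Primitive.flags.
enum class ShapeKind : std::uint8_t { Solid = 0, RRect = 1, NGon = 2, Ellipse = 3, RingArc = 4 };

// Mirrors Shape_Flag: one bit per flag, in declaration order.
enum ShapeFlag : std::uint32_t {
    Textured       = 1u << 0,
    Gradient       = 1u << 1,
    GradientRadial = 1u << 2,
    Outline        = 1u << 3,
    Rotated        = 1u << 4,
    ArcNarrow      = 1u << 5,
    ArcWide        = 1u << 6,
};

// Same layout as pack_kind_flags in the diff: kind in bits 0-7, flags in bits 8-15.
constexpr std::uint32_t pack_kind_flags(ShapeKind kind, std::uint32_t flags) {
    return std::uint32_t(kind) | ((flags & 0xFFu) << 8);
}

// Decode steps as written in the fragment shader.
constexpr std::uint32_t unpack_kind(std::uint32_t packed) { return packed & 255u; }
constexpr std::uint32_t unpack_flags(std::uint32_t packed) { return (packed >> 8) & 255u; }
constexpr std::uint32_t unpack_arc_bits(std::uint32_t packed) { return (unpack_flags(packed) >> 5) & 3u; }

// Same layout as pack_f16_pair, but operating on raw 16-bit patterns (assumption:
// the caller has already converted each float to its IEEE half encoding).
constexpr std::uint32_t pack_u16_pair(std::uint16_t low, std::uint16_t high) {
    return std::uint32_t(low) | (std::uint32_t(high) << 16);
}

// Usage: an outlined narrow arc decodes to kind 4 with arc_bits == 1.
inline void example_pack() {
    const std::uint32_t packed = pack_kind_flags(ShapeKind::RingArc, Outline | ArcNarrow);
    assert(unpack_kind(packed) == 4u);
    assert(unpack_arc_bits(packed) == 1u);
    assert((unpack_flags(packed) & Outline) != 0u);
}
```

Keeping the kind in the low byte means a zeroed field reads as `Solid`, which matches the tessellated path writing `f_flags = 0u` in the vertex shader.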
@@ -208,19 +241,23 @@ create_pipeline_2d_base :: proc(
target_info = sdl.GPUGraphicsPipelineTargetInfo { target_info = sdl.GPUGraphicsPipelineTargetInfo {
color_target_descriptions = &sdl.GPUColorTargetDescription { color_target_descriptions = &sdl.GPUColorTargetDescription {
format = sdl.GetGPUSwapchainTextureFormat(device, window), format = sdl.GetGPUSwapchainTextureFormat(device, window),
// Premultiplied-alpha blending: src outputs RGB pre-multiplied by alpha,
// so src factor is ONE (not SRC_ALPHA). This eliminates the per-pixel
// divide in the outline path and is the standard blend mode used by
// Skia, Flutter, and GPUI.
blend_state = sdl.GPUColorTargetBlendState { blend_state = sdl.GPUColorTargetBlendState {
enable_blend = true, enable_blend = true,
enable_color_write_mask = true, enable_color_write_mask = true,
src_color_blendfactor = .SRC_ALPHA, src_color_blendfactor = .ONE,
dst_color_blendfactor = .ONE_MINUS_SRC_ALPHA, dst_color_blendfactor = .ONE_MINUS_SRC_ALPHA,
color_blend_op = .ADD, color_blend_op = .ADD,
src_alpha_blendfactor = .SRC_ALPHA, src_alpha_blendfactor = .ONE,
dst_alpha_blendfactor = .ONE_MINUS_SRC_ALPHA, dst_alpha_blendfactor = .ONE_MINUS_SRC_ALPHA,
alpha_blend_op = .ADD, alpha_blend_op = .ADD,
color_write_mask = sdl.GPUColorComponentFlags{.R, .G, .B, .A}, color_write_mask = sdl.GPUColorComponentFlags{.R, .G, .B, .A},
}, },
}, },
num_color_targets = 1, num_color_targets = 1,
}, },
vertex_input_state = sdl.GPUVertexInputState { vertex_input_state = sdl.GPUVertexInputState {
vertex_buffer_descriptions = &sdl.GPUVertexBufferDescription { vertex_buffer_descriptions = &sdl.GPUVertexBufferDescription {
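Note on the blend state above: with `src = ONE`, `dst = ONE_MINUS_SRC_ALPHA` and `ADD`, the hardware computes `result = src + dst * (1 - src.a)` per channel, which is only correct if `src` arrives with RGB already scaled by alpha. A minimal C++ sketch of that arithmetic follows. `Rgba`, `premultiply` and `blend_premultiplied_over` are hypothetical names for illustration, and `premultiply` only stands in for the package's `premultiply_color`, whose signature is not shown in this diff.

```cpp
#include <cassert>
#include <cmath>

struct Rgba { float r, g, b, a; };

// Convert a straight-alpha color to premultiplied alpha: RGB scaled by A.
inline Rgba premultiply(Rgba straight) {
    return {straight.r * straight.a, straight.g * straight.a, straight.b * straight.a, straight.a};
}

// What the ONE / ONE_MINUS_SRC_ALPHA / ADD blend state evaluates per channel.
// Both inputs are premultiplied; no divide or extra multiply by src.a is needed.
inline Rgba blend_premultiplied_over(Rgba src_pm, Rgba dst_pm) {
    const float keep = 1.0f - src_pm.a;
    return {src_pm.r + dst_pm.r * keep,
            src_pm.g + dst_pm.g * keep,
            src_pm.b + dst_pm.b * keep,
            src_pm.a + dst_pm.a * keep};
}

// Usage: 50% red over opaque blue gives an even red/blue mix that stays opaque.
inline void example_blend() {
    const Rgba src = premultiply({1.0f, 0.0f, 0.0f, 0.5f});
    const Rgba out = blend_premultiplied_over(src, {0.0f, 0.0f, 1.0f, 1.0f});
    assert(std::fabs(out.r - 0.5f) < 1e-6f);
    assert(std::fabs(out.b - 0.5f) < 1e-6f);
    assert(std::fabs(out.a - 1.0f) < 1e-6f);
}
```

Feeding a straight-alpha color into this blend state would over-brighten translucent pixels, which is why the `Vertex` comment insists on premultiplied `color`.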
@@ -300,7 +337,7 @@ create_pipeline_2d_base :: proc(
} }
// Upload white pixel and unit quad data in a single command buffer // Upload white pixel and unit quad data in a single command buffer
white_pixel := [4]u8{255, 255, 255, 255} white_pixel := Color{255, 255, 255, 255}
white_transfer_buf := sdl.CreateGPUTransferBuffer( white_transfer_buf := sdl.CreateGPUTransferBuffer(
device, device,
sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = size_of(white_pixel)}, sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = size_of(white_pixel)},
@@ -578,7 +615,7 @@ draw_layer :: proc(
for &batch in GLOB.tmp_sub_batches[scissor.sub_batch_start:][:scissor.sub_batch_len] { for &batch in GLOB.tmp_sub_batches[scissor.sub_batch_start:][:scissor.sub_batch_len] {
switch batch.kind { switch batch.kind {
case .Shapes: case .Tessellated:
if current_mode != .Tessellated { if current_mode != .Tessellated {
push_globals(cmd_buffer, width, height, .Tessellated) push_globals(cmd_buffer, width, height, .Tessellated)
current_mode = .Tessellated current_mode = .Tessellated


@@ -23,293 +23,225 @@ struct main0_in
float2 f_local_or_uv [[user(locn1)]]; float2 f_local_or_uv [[user(locn1)]];
float4 f_params [[user(locn2)]]; float4 f_params [[user(locn2)]];
float4 f_params2 [[user(locn3)]]; float4 f_params2 [[user(locn3)]];
uint f_kind_flags [[user(locn4)]]; uint f_flags [[user(locn4)]];
float f_rotation [[user(locn5), flat]]; uint f_rotation_sc [[user(locn5)]];
float4 f_uv_rect [[user(locn6), flat]]; uint4 f_uv_or_effects [[user(locn6)]];
}; };
static inline __attribute__((always_inline)) static inline __attribute__((always_inline))
float2 apply_rotation(thread const float2& p, thread const float& angle) float sdRoundedBox(thread const float2& p, thread const float2& b, thread const float4& r)
{ {
float cr = cos(-angle); float2 _48;
float sr = sin(-angle);
return float2x2(float2(cr, sr), float2(-sr, cr)) * p;
}
static inline __attribute__((always_inline))
float sdRoundedBox(thread const float2& p, thread const float2& b, thread float4& r)
{
float2 _61;
if (p.x > 0.0) if (p.x > 0.0)
{ {
_61 = r.xy; _48 = r.xy;
} }
else else
{ {
_61 = r.zw; _48 = r.zw;
} }
r.x = _61.x; float2 rxy = _48;
r.y = _61.y; float _62;
float _78;
if (p.y > 0.0) if (p.y > 0.0)
{ {
_78 = r.x; _62 = rxy.x;
} }
else else
{ {
_78 = r.y; _62 = rxy.y;
} }
r.x = _78; float rr = _62;
float2 q = (abs(p) - b) + float2(r.x); float2 q = abs(p) - b;
return (fast::min(fast::max(q.x, q.y), 0.0) + length(fast::max(q, float2(0.0)))) - r.x; if (rr == 0.0)
}
static inline __attribute__((always_inline))
float sdf_stroke(thread const float& d, thread const float& stroke_width)
{
return abs(d) - (stroke_width * 0.5);
}
static inline __attribute__((always_inline))
float sdf_alpha(thread const float& d, thread const float& soft)
{
return 1.0 - smoothstep(-soft, soft, d);
}
static inline __attribute__((always_inline))
float sdCircle(thread const float2& p, thread const float& r)
{
return length(p) - r;
}
static inline __attribute__((always_inline))
float sdEllipse(thread float2& p, thread float2& ab)
{
p = abs(p);
if (p.x > p.y)
{ {
p = p.yx; return fast::max(q.x, q.y);
ab = ab.yx;
} }
float l = (ab.y * ab.y) - (ab.x * ab.x); q += float2(rr);
float m = (ab.x * p.x) / l; return (fast::min(fast::max(q.x, q.y), 0.0) + length(fast::max(q, float2(0.0)))) - rr;
float m2 = m * m;
float n = (ab.y * p.y) / l;
float n2 = n * n;
float c = ((m2 + n2) - 1.0) / 3.0;
float c3 = (c * c) * c;
float q = c3 + ((m2 * n2) * 2.0);
float d = c3 + (m2 * n2);
float g = m + (m * n2);
float co;
if (d < 0.0)
{
float h = acos(q / c3) / 3.0;
float s = cos(h);
float t = sin(h) * 1.73205077648162841796875;
float rx = sqrt(((-c) * ((s + t) + 2.0)) + m2);
float ry = sqrt(((-c) * ((s - t) + 2.0)) + m2);
co = (((ry + (sign(l) * rx)) + (abs(g) / (rx * ry))) - m) / 2.0;
}
else
{
float h_1 = ((2.0 * m) * n) * sqrt(d);
float s_1 = sign(q + h_1) * powr(abs(q + h_1), 0.3333333432674407958984375);
float u = sign(q - h_1) * powr(abs(q - h_1), 0.3333333432674407958984375);
float rx_1 = (((-s_1) - u) - (c * 4.0)) + (2.0 * m2);
float ry_1 = (s_1 - u) * 1.73205077648162841796875;
float rm = sqrt((rx_1 * rx_1) + (ry_1 * ry_1));
co = (((ry_1 / sqrt(rm - rx_1)) + ((2.0 * g) / rm)) - m) / 2.0;
}
float2 r = ab * float2(co, sqrt(1.0 - (co * co)));
return length(r - p) * sign(p.y - r.y);
} }
static inline __attribute__((always_inline)) static inline __attribute__((always_inline))
float sdSegment(thread const float2& p, thread const float2& a, thread const float2& b) float sdRegularPolygon(thread const float2& p, thread const float& r, thread const float& n)
{ {
float2 pa = p - a; float an = 3.1415927410125732421875 / n;
float2 ba = b - a; float bn = mod(precise::atan2(p.y, p.x), 2.0 * an) - an;
float h = fast::clamp(dot(pa, ba) / dot(ba, ba), 0.0, 1.0); return (length(p) * cos(bn)) - r;
return length(pa - (ba * h)); }
static inline __attribute__((always_inline))
float sdEllipseApprox(thread const float2& p, thread const float2& ab)
{
float k0 = length(p / ab);
float k1 = length(p / (ab * ab));
return (k0 * (k0 - 1.0)) / k1;
}
static inline __attribute__((always_inline))
float4 gradient_2color(thread const float4& start_color, thread const float4& end_color, thread const float& t)
{
return mix(start_color, end_color, float4(fast::clamp(t, 0.0, 1.0)));
}
static inline __attribute__((always_inline))
float sdf_alpha(thread const float& d, thread const float& h)
{
return 1.0 - smoothstep(-h, h, d);
} }
fragment main0_out main0(main0_in in [[stage_in]], texture2d<float> tex [[texture(0)]], sampler texSmplr [[sampler(0)]]) fragment main0_out main0(main0_in in [[stage_in]], texture2d<float> tex [[texture(0)]], sampler texSmplr [[sampler(0)]])
{ {
main0_out out = {}; main0_out out = {};
uint kind = in.f_kind_flags & 255u; uint kind = in.f_flags & 255u;
uint flags = (in.f_kind_flags >> 8u) & 255u; uint flags = (in.f_flags >> 8u) & 255u;
if (kind == 0u) if (kind == 0u)
{ {
out.out_color = in.f_color * tex.sample(texSmplr, in.f_local_or_uv); float4 t = tex.sample(texSmplr, in.f_local_or_uv);
float _195 = t.w;
float4 _197 = t;
float3 _199 = _197.xyz * _195;
t.x = _199.x;
t.y = _199.y;
t.z = _199.z;
out.out_color = in.f_color * t;
return out; return out;
} }
float d = 1000000015047466219876688855040.0; float d = 1000000015047466219876688855040.0;
float soft = 1.0; float h = 0.5;
float2 half_size = in.f_params.xy;
float2 p_local = in.f_local_or_uv;
if ((flags & 16u) != 0u)
{
float2 sc = float2(as_type<half2>(in.f_rotation_sc));
p_local = float2((sc.y * p_local.x) + (sc.x * p_local.y), ((-sc.x) * p_local.x) + (sc.y * p_local.y));
}
if (kind == 1u) if (kind == 1u)
{ {
float2 b = in.f_params.xy; float4 corner_radii = float4(in.f_params.zw, in.f_params2.xy);
float4 r = float4(in.f_params.zw, in.f_params2.xy); h = in.f_params2.z;
soft = fast::max(in.f_params2.z, 1.0); float2 param = p_local;
float stroke_px = in.f_params2.w; float2 param_1 = half_size;
float2 p_local = in.f_local_or_uv; float4 param_2 = corner_radii;
if (in.f_rotation != 0.0) d = sdRoundedBox(param, param_1, param_2);
{
float2 param = p_local;
float param_1 = in.f_rotation;
p_local = apply_rotation(param, param_1);
}
float2 param_2 = p_local;
float2 param_3 = b;
float4 param_4 = r;
float _491 = sdRoundedBox(param_2, param_3, param_4);
d = _491;
if ((flags & 1u) != 0u)
{
float param_5 = d;
float param_6 = stroke_px;
d = sdf_stroke(param_5, param_6);
}
float4 shape_color = in.f_color;
if ((flags & 2u) != 0u)
{
float2 p_for_uv = in.f_local_or_uv;
if (in.f_rotation != 0.0)
{
float2 param_7 = p_for_uv;
float param_8 = in.f_rotation;
p_for_uv = apply_rotation(param_7, param_8);
}
float2 local_uv = ((p_for_uv / b) * 0.5) + float2(0.5);
float2 uv = mix(in.f_uv_rect.xy, in.f_uv_rect.zw, local_uv);
shape_color *= tex.sample(texSmplr, uv);
}
float param_9 = d;
float param_10 = soft;
float alpha = sdf_alpha(param_9, param_10);
out.out_color = float4(shape_color.xyz, shape_color.w * alpha);
return out;
} }
else else
{ {
if (kind == 2u) if (kind == 2u)
{ {
float radius = in.f_params.x; float radius = in.f_params.x;
soft = fast::max(in.f_params.y, 1.0); float sides = in.f_params.y;
float stroke_px_1 = in.f_params.z; h = in.f_params.z;
float2 param_11 = in.f_local_or_uv; float2 param_3 = p_local;
float param_12 = radius; float param_4 = radius;
d = sdCircle(param_11, param_12); float param_5 = sides;
if ((flags & 1u) != 0u) d = sdRegularPolygon(param_3, param_4, param_5);
{ half_size = float2(radius);
float param_13 = d;
float param_14 = stroke_px_1;
d = sdf_stroke(param_13, param_14);
}
} }
else else
{ {
if (kind == 3u) if (kind == 3u)
{ {
float2 ab = in.f_params.xy; float2 ab = in.f_params.xy;
soft = fast::max(in.f_params.z, 1.0); h = in.f_params.z;
float stroke_px_2 = in.f_params.w; float2 param_6 = p_local;
float2 p_local_1 = in.f_local_or_uv; float2 param_7 = ab;
if (in.f_rotation != 0.0) d = sdEllipseApprox(param_6, param_7);
{ half_size = ab;
float2 param_15 = p_local_1;
float param_16 = in.f_rotation;
p_local_1 = apply_rotation(param_15, param_16);
}
float2 param_17 = p_local_1;
float2 param_18 = ab;
float _616 = sdEllipse(param_17, param_18);
d = _616;
if ((flags & 1u) != 0u)
{
float param_19 = d;
float param_20 = stroke_px_2;
d = sdf_stroke(param_19, param_20);
}
} }
else else
{ {
if (kind == 4u) if (kind == 4u)
{ {
float2 a = in.f_params.xy; float inner = in.f_params.x;
float2 b_1 = in.f_params.zw; float outer = in.f_params.y;
float width = in.f_params2.x; float2 n_start = in.f_params.zw;
soft = fast::max(in.f_params2.y, 1.0); float2 n_end = in.f_params2.xy;
float2 param_21 = in.f_local_or_uv; uint arc_bits = (flags >> 5u) & 3u;
float2 param_22 = a; h = in.f_params2.z;
float2 param_23 = b_1; float r = length(p_local);
d = sdSegment(param_21, param_22, param_23) - (width * 0.5); d = fast::max(inner - r, r - outer);
} if (arc_bits != 0u)
else
{
if (kind == 5u)
{ {
float inner = in.f_params.x; float d_start = dot(p_local, n_start);
float outer = in.f_params.y; float d_end = dot(p_local, n_end);
float start_rad = in.f_params.z; float _372;
float end_rad = in.f_params.w; if (arc_bits == 1u)
soft = fast::max(in.f_params2.x, 1.0);
float r_1 = length(in.f_local_or_uv);
float d_ring = fast::max(inner - r_1, r_1 - outer);
float angle = precise::atan2(in.f_local_or_uv.y, in.f_local_or_uv.x);
if (angle < 0.0)
{ {
angle += 6.283185482025146484375; _372 = fast::max(d_start, d_end);
}
float ang_start = mod(start_rad, 6.283185482025146484375);
float ang_end = mod(end_rad, 6.283185482025146484375);
float _710;
if (ang_end > ang_start)
{
_710 = float((angle >= ang_start) && (angle <= ang_end));
} }
else else
{ {
_710 = float((angle >= ang_start) || (angle <= ang_end)); _372 = fast::min(d_start, d_end);
}
float in_arc = _710;
if (abs(ang_end - ang_start) >= 6.282185077667236328125)
{
in_arc = 1.0;
}
d = (in_arc > 0.5) ? d_ring : 1000000015047466219876688855040.0;
}
else
{
if (kind == 6u)
{
float radius_1 = in.f_params.x;
float rotation = in.f_params.y;
float sides = in.f_params.z;
soft = fast::max(in.f_params.w, 1.0);
float stroke_px_3 = in.f_params2.x;
float2 p = in.f_local_or_uv;
float c = cos(rotation);
float s = sin(rotation);
p = float2x2(float2(c, -s), float2(s, c)) * p;
float an = 3.1415927410125732421875 / sides;
float bn = mod(precise::atan2(p.y, p.x), 2.0 * an) - an;
d = (length(p) * cos(bn)) - radius_1;
if ((flags & 1u) != 0u)
{
float param_24 = d;
float param_25 = stroke_px_3;
d = sdf_stroke(param_24, param_25);
}
} }
float d_wedge = _372;
d = fast::max(d, d_wedge);
} }
half_size = float2(outer);
} }
} }
} }
} }
float param_26 = d; float grad_magnitude = fast::max(fwidth(d), 9.9999999747524270787835121154785e-07);
float param_27 = soft; d /= grad_magnitude;
float alpha_1 = sdf_alpha(param_26, param_27); h /= grad_magnitude;
out.out_color = float4(in.f_color.xyz, in.f_color.w * alpha_1); float4 shape_color;
if ((flags & 2u) != 0u)
{
float4 gradient_start = in.f_color;
float4 gradient_end = unpack_unorm4x8_to_float(in.f_uv_or_effects.x);
if ((flags & 4u) != 0u)
{
float t_1 = length(p_local / half_size);
float4 param_8 = gradient_start;
float4 param_9 = gradient_end;
float param_10 = t_1;
shape_color = gradient_2color(param_8, param_9, param_10);
}
else
{
float2 direction = float2(as_type<half2>(in.f_uv_or_effects.z));
float t_2 = (dot(p_local / half_size, direction) * 0.5) + 0.5;
float4 param_11 = gradient_start;
float4 param_12 = gradient_end;
float param_13 = t_2;
shape_color = gradient_2color(param_11, param_12, param_13);
}
}
else
{
if ((flags & 1u) != 0u)
{
float4 uv_rect = as_type<float4>(in.f_uv_or_effects);
float2 local_uv = ((p_local / half_size) * 0.5) + float2(0.5);
float2 uv = mix(uv_rect.xy, uv_rect.zw, local_uv);
shape_color = in.f_color * tex.sample(texSmplr, uv);
}
else
{
shape_color = in.f_color;
}
}
if ((flags & 8u) != 0u)
{
float4 ol_color = unpack_unorm4x8_to_float(in.f_uv_or_effects.y);
float ol_width = float2(as_type<half2>(in.f_uv_or_effects.w)).x / grad_magnitude;
float param_14 = d;
float param_15 = h;
float fill_cov = sdf_alpha(param_14, param_15);
float param_16 = d - ol_width;
float param_17 = h;
float total_cov = sdf_alpha(param_16, param_17);
float outline_cov = fast::max(total_cov - fill_cov, 0.0);
float3 rgb_pm = ((shape_color.xyz * shape_color.w) * fill_cov) + ((ol_color.xyz * ol_color.w) * outline_cov);
float alpha_pm = (shape_color.w * fill_cov) + (ol_color.w * outline_cov);
out.out_color = float4(rgb_pm, alpha_pm);
}
else
{
float param_18 = d;
float param_19 = h;
float alpha = sdf_alpha(param_18, param_19);
out.out_color = float4((shape_color.xyz * shape_color.w) * alpha, shape_color.w * alpha);
}
return out; return out;
} }
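Note on the new `Ring_Arc` branch above: the `atan2`-based angle test is replaced by two half-plane distances against precomputed edge normals, combined with `max` (narrow arc, intersection) or `min` (wide arc, union). The C++ sketch below reproduces that logic on the CPU to make the geometry easier to check. It assumes the normals documented in `Ring_Arc_Params`, `(sin(start), -cos(start))` and `(-sin(end), cos(end))`. The names `RingArc`, `make_ring_arc` and `ring_arc_distance` are hypothetical, and the rule for choosing narrow versus wide (span at most π versus greater) follows the `Shape_Flag` comments rather than any host code shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

inline float dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }

struct RingArc {
    float inner, outer;
    Vec2 normal_start;   // outward normal of the start edge
    Vec2 normal_end;     // outward normal of the end edge
    unsigned arc_bits;   // 0 = full ring, 1 = narrow (span <= pi), 2 = wide (span > pi)
};

// Hypothetical CPU-side setup; assumes 0 < end_rad - start_rad < 2*pi.
inline RingArc make_ring_arc(float inner, float outer, float start_rad, float end_rad) {
    const float PI = 3.14159265358979f;
    const unsigned bits = (end_rad - start_rad <= PI) ? 1u : 2u;
    return {inner, outer,
            {std::sin(start_rad), -std::cos(start_rad)},
            {-std::sin(end_rad), std::cos(end_rad)},
            bits};
}

// Same steps as the shader: annulus distance, then clip by the angular wedge.
inline float ring_arc_distance(const RingArc& arc, Vec2 p) {
    const float r = std::sqrt(dot(p, p));
    float d = std::max(arc.inner - r, r - arc.outer);
    if (arc.arc_bits != 0u) {
        const float d_start = dot(p, arc.normal_start);
        const float d_end = dot(p, arc.normal_end);
        const float d_wedge = (arc.arc_bits == 1u) ? std::max(d_start, d_end)
                                                   : std::min(d_start, d_end);
        d = std::max(d, d_wedge);
    }
    return d;
}

// Helper for examples: point at a given angle and radius in the local frame.
inline Vec2 polar(float angle, float radius) {
    return {radius * std::cos(angle), radius * std::sin(angle)};
}

// Usage: a quarter arc contains the 45 degree direction but not the 135 degree one.
inline void example_ring_arc() {
    const float PI = 3.14159265358979f;
    const RingArc quarter = make_ring_arc(5.0f, 10.0f, 0.0f, PI * 0.5f);
    assert(ring_arc_distance(quarter, polar(PI * 0.25f, 7.5f)) < 0.0f);
    assert(ring_arc_distance(quarter, polar(PI * 0.75f, 7.5f)) > 0.0f);
}
```

A single half-plane pair cannot describe a wedge wider than π as an intersection, which is presumably why the flags distinguish `Arc_Narrow` from `Arc_Wide` instead of carrying one arc bit.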


@@ -14,24 +14,24 @@ struct Primitive
{ {
float4 bounds; float4 bounds;
uint color; uint color;
uint kind_flags; uint flags;
float rotation; uint rotation_sc;
float _pad; float _pad;
float4 params; float4 params;
float4 params2; float4 params2;
float4 uv_rect; uint4 uv_or_effects;
}; };
struct Primitive_1 struct Primitive_1
{ {
float4 bounds; float4 bounds;
uint color; uint color;
uint kind_flags; uint flags;
float rotation; uint rotation_sc;
float _pad; float _pad;
float4 params; float4 params;
float4 params2; float4 params2;
float4 uv_rect; uint4 uv_or_effects;
}; };
struct Primitives struct Primitives
@@ -45,9 +45,9 @@ struct main0_out
float2 f_local_or_uv [[user(locn1)]]; float2 f_local_or_uv [[user(locn1)]];
float4 f_params [[user(locn2)]]; float4 f_params [[user(locn2)]];
float4 f_params2 [[user(locn3)]]; float4 f_params2 [[user(locn3)]];
uint f_kind_flags [[user(locn4)]]; uint f_flags [[user(locn4)]];
float f_rotation [[user(locn5)]]; uint f_rotation_sc [[user(locn5)]];
float4 f_uv_rect [[user(locn6)]]; uint4 f_uv_or_effects [[user(locn6)]];
float4 gl_Position [[position]]; float4 gl_Position [[position]];
}; };
@@ -58,7 +58,7 @@ struct main0_in
float4 v_color [[attribute(2)]]; float4 v_color [[attribute(2)]];
}; };
vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer(0)]], const device Primitives& _74 [[buffer(1)]], uint gl_InstanceIndex [[instance_id]]) vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer(0)]], const device Primitives& _75 [[buffer(1)]], uint gl_InstanceIndex [[instance_id]])
{ {
main0_out out = {}; main0_out out = {};
if (_12.mode == 0u) if (_12.mode == 0u)
@@ -67,22 +67,22 @@ vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer
out.f_local_or_uv = in.v_uv; out.f_local_or_uv = in.v_uv;
out.f_params = float4(0.0); out.f_params = float4(0.0);
out.f_params2 = float4(0.0); out.f_params2 = float4(0.0);
out.f_kind_flags = 0u; out.f_flags = 0u;
out.f_rotation = 0.0; out.f_rotation_sc = 0u;
out.f_uv_rect = float4(0.0, 0.0, 1.0, 1.0); out.f_uv_or_effects = uint4(0u);
out.gl_Position = _12.projection * float4(in.v_position * _12.dpi_scale, 0.0, 1.0); out.gl_Position = _12.projection * float4(in.v_position * _12.dpi_scale, 0.0, 1.0);
} }
else else
{ {
Primitive p; Primitive p;
p.bounds = _74.primitives[int(gl_InstanceIndex)].bounds; p.bounds = _75.primitives[int(gl_InstanceIndex)].bounds;
p.color = _74.primitives[int(gl_InstanceIndex)].color; p.color = _75.primitives[int(gl_InstanceIndex)].color;
p.kind_flags = _74.primitives[int(gl_InstanceIndex)].kind_flags; p.flags = _75.primitives[int(gl_InstanceIndex)].flags;
p.rotation = _74.primitives[int(gl_InstanceIndex)].rotation; p.rotation_sc = _75.primitives[int(gl_InstanceIndex)].rotation_sc;
p._pad = _74.primitives[int(gl_InstanceIndex)]._pad; p._pad = _75.primitives[int(gl_InstanceIndex)]._pad;
p.params = _74.primitives[int(gl_InstanceIndex)].params; p.params = _75.primitives[int(gl_InstanceIndex)].params;
p.params2 = _74.primitives[int(gl_InstanceIndex)].params2; p.params2 = _75.primitives[int(gl_InstanceIndex)].params2;
p.uv_rect = _74.primitives[int(gl_InstanceIndex)].uv_rect; p.uv_or_effects = _75.primitives[int(gl_InstanceIndex)].uv_or_effects;
float2 corner = in.v_position; float2 corner = in.v_position;
float2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner); float2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner);
float2 center = (p.bounds.xy + p.bounds.zw) * 0.5; float2 center = (p.bounds.xy + p.bounds.zw) * 0.5;
@@ -90,10 +90,11 @@ vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer
out.f_local_or_uv = (world_pos - center) * _12.dpi_scale; out.f_local_or_uv = (world_pos - center) * _12.dpi_scale;
out.f_params = p.params; out.f_params = p.params;
out.f_params2 = p.params2; out.f_params2 = p.params2;
out.f_kind_flags = p.kind_flags; out.f_flags = p.flags;
out.f_rotation = p.rotation; out.f_rotation_sc = p.rotation_sc;
out.f_uv_rect = p.uv_rect; out.f_uv_or_effects = p.uv_or_effects;
out.gl_Position = _12.projection * float4(world_pos * _12.dpi_scale, 0.0, 1.0); out.gl_Position = _12.projection * float4(world_pos * _12.dpi_scale, 0.0, 1.0);
} }
return out; return out;
} }
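Note on the SDF branch of the vertex shader above: each instance expands a unit-quad corner into `world_pos = mix(bounds.xy, bounds.zw, corner)` and hands the fragment shader a centered, DPI-scaled local coordinate, `(world_pos - center) * dpi_scale`. The C++ sketch below restates that mapping for a single corner. `Bounds`, `QuadVertex` and `place_corner` are hypothetical names, and the projection step is left out.

```cpp
#include <cassert>

struct Vec2 { float x, y; };
struct Bounds { float min_x, min_y, max_x, max_y; };

// Scalar equivalent of the shader's mix(a, b, t).
inline float mix(float a, float b, float t) { return a * (1.0f - t) + b * t; }

struct QuadVertex {
    Vec2 world_pos;  // position before DPI scaling and projection
    Vec2 local;      // centered coordinate in physical pixels (f_local_or_uv)
};

// corner is a unit-quad vertex with components in [0, 1].
inline QuadVertex place_corner(Bounds b, Vec2 corner, float dpi_scale) {
    const Vec2 world = {mix(b.min_x, b.max_x, corner.x), mix(b.min_y, b.max_y, corner.y)};
    const Vec2 center = {(b.min_x + b.max_x) * 0.5f, (b.min_y + b.max_y) * 0.5f};
    const Vec2 local = {(world.x - center.x) * dpi_scale, (world.y - center.y) * dpi_scale};
    return {world, local};
}

// Usage: the max corner of a 20x40 primitive at 2x DPI lands at local (20, 40).
inline void example_place_corner() {
    const QuadVertex v = place_corner({10.0f, 20.0f, 30.0f, 60.0f}, {1.0f, 1.0f}, 2.0f);
    assert(v.world_pos.x == 30.0f && v.world_pos.y == 60.0f);
    assert(v.local.x == 20.0f && v.local.y == 40.0f);
}
```

Because `local` is already in physical pixels, the SDF functions in the fragment shader need no `dpi_scale` of their own, consistent with the comment at the top of the GLSL file.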


@@ -1,13 +1,13 @@
#version 450 core #version 450 core
// --- Inputs from vertex shader --- // --- Inputs from vertex shader ---
layout(location = 0) in vec4 f_color; layout(location = 0) in mediump vec4 f_color;
layout(location = 1) in vec2 f_local_or_uv; layout(location = 1) in vec2 f_local_or_uv;
layout(location = 2) in vec4 f_params; layout(location = 2) in vec4 f_params;
layout(location = 3) in vec4 f_params2; layout(location = 3) in vec4 f_params2;
layout(location = 4) flat in uint f_kind_flags; layout(location = 4) flat in uint f_flags;
layout(location = 5) flat in float f_rotation; layout(location = 5) flat in uint f_rotation_sc;
layout(location = 6) flat in vec4 f_uv_rect; layout(location = 6) flat in uvec4 f_uv_or_effects;
// --- Output --- // --- Output ---
layout(location = 0) out vec4 out_color; layout(location = 0) out vec4 out_color;
@@ -20,77 +20,43 @@ layout(set = 2, binding = 0) uniform sampler2D tex;
// All operate in physical pixel space — no dpi_scale needed here. // All operate in physical pixel space — no dpi_scale needed here.
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
const float PI = 3.14159265358979;
float sdCircle(vec2 p, float r) {
return length(p) - r;
}
float sdRoundedBox(vec2 p, vec2 b, vec4 r) { float sdRoundedBox(vec2 p, vec2 b, vec4 r) {
r.xy = (p.x > 0.0) ? r.xy : r.zw; vec2 rxy = (p.x > 0.0) ? r.xy : r.zw;
r.x = (p.y > 0.0) ? r.x : r.y; float rr = (p.y > 0.0) ? rxy.x : rxy.y;
vec2 q = abs(p) - b + r.x; vec2 q = abs(p) - b;
return min(max(q.x, q.y), 0.0) + length(max(q, vec2(0.0))) - r.x; if (rr == 0.0) {
} return max(q.x, q.y);
float sdSegment(vec2 p, vec2 a, vec2 b) {
vec2 pa = p - a, ba = b - a;
float h = clamp(dot(pa, ba) / dot(ba, ba), 0.0, 1.0);
return length(pa - ba * h);
}
float sdEllipse(vec2 p, vec2 ab) {
p = abs(p);
if (p.x > p.y) {
p = p.yx;
ab = ab.yx;
} }
float l = ab.y * ab.y - ab.x * ab.x; q += rr;
float m = ab.x * p.x / l; return min(max(q.x, q.y), 0.0) + length(max(q, vec2(0.0))) - rr;
float m2 = m * m;
float n = ab.y * p.y / l;
float n2 = n * n;
float c = (m2 + n2 - 1.0) / 3.0;
float c3 = c * c * c;
float q = c3 + m2 * n2 * 2.0;
float d = c3 + m2 * n2;
float g = m + m * n2;
float co;
if (d < 0.0) {
float h = acos(q / c3) / 3.0;
float s = cos(h);
float t = sin(h) * sqrt(3.0);
float rx = sqrt(-c * (s + t + 2.0) + m2);
float ry = sqrt(-c * (s - t + 2.0) + m2);
co = (ry + sign(l) * rx + abs(g) / (rx * ry) - m) / 2.0;
} else {
float h = 2.0 * m * n * sqrt(d);
float s = sign(q + h) * pow(abs(q + h), 1.0 / 3.0);
float u = sign(q - h) * pow(abs(q - h), 1.0 / 3.0);
float rx = -s - u - c * 4.0 + 2.0 * m2;
float ry = (s - u) * sqrt(3.0);
float rm = sqrt(rx * rx + ry * ry);
co = (ry / sqrt(rm - rx) + 2.0 * g / rm - m) / 2.0;
}
vec2 r = ab * vec2(co, sqrt(1.0 - co * co));
return length(r - p) * sign(p.y - r.y);
} }
float sdf_alpha(float d, float soft) { // Approximate ellipse SDF — fast, suitable for UI, NOT a true Euclidean distance.
return 1.0 - smoothstep(-soft, soft, d); float sdEllipseApprox(vec2 p, vec2 ab) {
float k0 = length(p / ab);
float k1 = length(p / (ab * ab));
return k0 * (k0 - 1.0) / k1;
} }
float sdf_stroke(float d, float stroke_width) { // Regular N-gon SDF (Inigo Quilez).
return abs(d) - stroke_width * 0.5; float sdRegularPolygon(vec2 p, float r, float n) {
float an = 3.141592653589793 / n;
float bn = mod(atan(p.y, p.x), 2.0 * an) - an;
return length(p) * cos(bn) - r;
} }
// Rotate a 2D point by the negative of the given angle (inverse rotation). // Coverage from SDF distance using half-feather width (feather_px * 0.5, pre-computed on CPU).
// Used to rotate the sampling frame opposite to the shape's rotation so that // Produces a symmetric transition centered on d=0: smoothstep(-h, h, d).
// the SDF evaluates correctly for the rotated shape. float sdf_alpha(float d, float h) {
vec2 apply_rotation(vec2 p, float angle) { return 1.0 - smoothstep(-h, h, d);
float cr = cos(-angle); }
float sr = sin(-angle);
return mat2(cr, sr, -sr, cr) * p; // ---------------------------------------------------------------------------
// Gradient helpers
// ---------------------------------------------------------------------------
mediump vec4 gradient_2color(mediump vec4 start_color, mediump vec4 end_color, mediump float t) {
return mix(start_color, end_color, clamp(t, 0.0, 1.0));
} }
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
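
For checking values on the CPU, the new `sdRoundedBox` (including its sharp-corner early-out) and `sdf_alpha` can be ported directly. This is a Python sketch of the post-change shader code, not code from the repository; tuples stand in for `vec2`/`vec4`, and `smoothstep` is re-implemented to match the GLSL built-in:

```python
import math


def sd_rounded_box(p, b, r):
    """p: point, b: half-size, r: four corner radii (r.x, r.y, r.z, r.w)."""
    # Select the radius for the quadrant p falls in, as the shader does.
    rxy = (r[0], r[1]) if p[0] > 0.0 else (r[2], r[3])
    rr = rxy[0] if p[1] > 0.0 else rxy[1]
    qx = abs(p[0]) - b[0]
    qy = abs(p[1]) - b[1]
    if rr == 0.0:
        # Sharp corner: the shader returns max(q.x, q.y) directly.
        return max(qx, qy)
    qx += rr
    qy += rr
    return min(max(qx, qy), 0.0) + math.hypot(max(qx, 0.0), max(qy, 0.0)) - rr


def smoothstep(edge0, edge1, x):
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def sdf_alpha(d, h):
    """Coverage with a symmetric transition of half-width h centered on d = 0."""
    return 1.0 - smoothstep(-h, h, d)
```

With `h = 0.5`, coverage is exactly 0.5 on the shape boundary, 1.0 from half a pixel inside, and 0.0 from half a pixel outside.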
@@ -98,131 +64,137 @@ vec2 apply_rotation(vec2 p, float angle) {
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
void main() { void main() {
uint kind = f_kind_flags & 0xFFu; uint kind = f_flags & 0xFFu;
uint flags = (f_kind_flags >> 8u) & 0xFFu; uint flags = (f_flags >> 8u) & 0xFFu;
// ----------------------------------------------------------------------- // Kind 0: Tessellated path — vertex colors arrive premultiplied from CPU.
// Kind 0: Tessellated path. Texture multiply for text atlas, // Texture samples are straight-alpha (SDL_ttf glyph atlas: rgb=1, a=coverage;
// white pixel for solid shapes. // or the 1x1 white texture: rgba=1). Convert to premultiplied form so the
// ----------------------------------------------------------------------- // blend state (ONE, ONE_MINUS_SRC_ALPHA) composites correctly.
if (kind == 0u) { if (kind == 0u) {
out_color = f_color * texture(tex, f_local_or_uv); vec4 t = texture(tex, f_local_or_uv);
t.rgb *= t.a;
out_color = f_color * t;
return; return;
} }
// ----------------------------------------------------------------------- // SDF path — dispatch on kind
// SDF path. f_local_or_uv = shape-centered position in physical pixels.
// All dimensional params are already in physical pixels (CPU pre-scaled).
// -----------------------------------------------------------------------
float d = 1e30; float d = 1e30;
float soft = 1.0; float h = 0.5; // half-feather width; overwritten per shape kind
vec2 half_size = f_params.xy; // used by RRect and as reference size for gradients
vec2 p_local = f_local_or_uv;
// Apply inverse rotation using pre-computed sin/cos (no per-pixel trig).
// .Rotated flag = bit 4 = 16u
if ((flags & 16u) != 0u) {
vec2 sc = unpackHalf2x16(f_rotation_sc); // .x = sin(angle), .y = cos(angle)
// Inverse rotation matrix R(-angle) = [[cos, sin], [-sin, cos]]
p_local = vec2(sc.y * p_local.x + sc.x * p_local.y,
-sc.x * p_local.x + sc.y * p_local.y);
}
if (kind == 1u) { if (kind == 1u) {
// RRect: rounded box // RRect — half_feather in params2.z
vec2 b = f_params.xy; // half_size (phys px) vec4 corner_radii = vec4(f_params.zw, f_params2.xy);
vec4 r = vec4(f_params.zw, f_params2.xy); // corner radii: tr, br, tl, bl h = f_params2.z;
soft = max(f_params2.z, 1.0); d = sdRoundedBox(p_local, half_size, corner_radii);
float stroke_px = f_params2.w;
vec2 p_local = f_local_or_uv;
if (f_rotation != 0.0) {
p_local = apply_rotation(p_local, f_rotation);
}
d = sdRoundedBox(p_local, b, r);
if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
// Texture sampling for textured SDF primitives
vec4 shape_color = f_color;
if ((flags & 2u) != 0u) {
// Compute UV from local position and half_size
vec2 p_for_uv = f_local_or_uv;
if (f_rotation != 0.0) {
p_for_uv = apply_rotation(p_for_uv, f_rotation);
}
vec2 local_uv = p_for_uv / b * 0.5 + 0.5;
vec2 uv = mix(f_uv_rect.xy, f_uv_rect.zw, local_uv);
shape_color *= texture(tex, uv);
}
float alpha = sdf_alpha(d, soft);
out_color = vec4(shape_color.rgb, shape_color.a * alpha);
return;
} }
else if (kind == 2u) { else if (kind == 2u) {
// Circle — rotationally symmetric, no rotation needed // NGon — half_feather in params.z
float radius = f_params.x; float radius = f_params.x;
soft = max(f_params.y, 1.0); float sides = f_params.y;
float stroke_px = f_params.z; h = f_params.z;
d = sdRegularPolygon(p_local, radius, sides);
d = sdCircle(f_local_or_uv, radius); half_size = vec2(radius); // for gradient UV computation
if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
} }
else if (kind == 3u) { else if (kind == 3u) {
// Ellipse // Ellipse — half_feather in params.z
vec2 ab = f_params.xy; vec2 ab = f_params.xy;
soft = max(f_params.z, 1.0); h = f_params.z;
float stroke_px = f_params.w; d = sdEllipseApprox(p_local, ab);
half_size = ab; // for gradient UV computation
vec2 p_local = f_local_or_uv;
if (f_rotation != 0.0) {
p_local = apply_rotation(p_local, f_rotation);
}
d = sdEllipse(p_local, ab);
if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
} }
else if (kind == 4u) { else if (kind == 4u) {
// Segment (capsule line) — no rotation (excluded) // Ring_Arc — half_feather in params2.z
vec2 a = f_params.xy; // already in local physical pixels // Arc mode from flag bits 5-6: 0 = full, 1 = narrow (≤π), 2 = wide (>π)
vec2 b = f_params.zw;
float width = f_params2.x;
soft = max(f_params2.y, 1.0);
d = sdSegment(f_local_or_uv, a, b) - width * 0.5;
}
else if (kind == 5u) {
// Ring / Arc — rotation handled by CPU angle offset, no shader rotation
float inner = f_params.x; float inner = f_params.x;
float outer = f_params.y; float outer = f_params.y;
float start_rad = f_params.z; vec2 n_start = f_params.zw;
float end_rad = f_params.w; vec2 n_end = f_params2.xy;
soft = max(f_params2.x, 1.0); uint arc_bits = (flags >> 5u) & 3u;
float r = length(f_local_or_uv); h = f_params2.z;
float d_ring = max(inner - r, r - outer);
// Angular clip float r = length(p_local);
float angle = atan(f_local_or_uv.y, f_local_or_uv.x); d = max(inner - r, r - outer);
if (angle < 0.0) angle += 2.0 * PI;
float ang_start = mod(start_rad, 2.0 * PI);
float ang_end = mod(end_rad, 2.0 * PI);
float in_arc = (ang_end > ang_start) if (arc_bits != 0u) {
? ((angle >= ang_start && angle <= ang_end) ? 1.0 : 0.0) : ((angle >= ang_start || angle <= ang_end) ? 1.0 : 0.0); float d_start = dot(p_local, n_start);
if (abs(ang_end - ang_start) >= 2.0 * PI - 0.001) in_arc = 1.0; float d_end = dot(p_local, n_end);
float d_wedge = (arc_bits == 1u)
? max(d_start, d_end) // arc ≤ π: intersect half-planes
: min(d_start, d_end); // arc > π: union half-planes
d = max(d, d_wedge);
}
d = in_arc > 0.5 ? d_ring : 1e30; half_size = vec2(outer); // for gradient UV computation
}
else if (kind == 6u) {
// Regular N-gon — has its own rotation in params, no Primitive.rotation used
float radius = f_params.x;
float rotation = f_params.y;
float sides = f_params.z;
soft = max(f_params.w, 1.0);
float stroke_px = f_params2.x;
vec2 p = f_local_or_uv;
float c = cos(rotation), s = sin(rotation);
p = mat2(c, -s, s, c) * p;
float an = PI / sides;
float bn = mod(atan(p.y, p.x), 2.0 * an) - an;
d = length(p) * cos(bn) - radius;
if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
} }
float alpha = sdf_alpha(d, soft); // --- fwidth-based normalization for correct AA and stroke width ---
out_color = vec4(f_color.rgb, f_color.a * alpha); float grad_magnitude = max(fwidth(d), 1e-6);
d = d / grad_magnitude;
h = h / grad_magnitude;
// --- Determine shape color based on flags ---
mediump vec4 shape_color;
if ((flags & 2u) != 0u) {
// Gradient active (bit 1)
mediump vec4 gradient_start = f_color;
mediump vec4 gradient_end = unpackUnorm4x8(f_uv_or_effects.x);
if ((flags & 4u) != 0u) {
// Radial gradient (bit 2): t from distance to center
mediump float t = length(p_local / half_size);
shape_color = gradient_2color(gradient_start, gradient_end, t);
} else {
// Linear gradient: direction pre-computed on CPU as (cos, sin) f16 pair
vec2 direction = unpackHalf2x16(f_uv_or_effects.z);
mediump float t = dot(p_local / half_size, direction) * 0.5 + 0.5;
shape_color = gradient_2color(gradient_start, gradient_end, t);
}
} else if ((flags & 1u) != 0u) {
// Textured (bit 0) — RRect only in practice
vec4 uv_rect = uintBitsToFloat(f_uv_or_effects);
vec2 local_uv = p_local / half_size * 0.5 + 0.5;
vec2 uv = mix(uv_rect.xy, uv_rect.zw, local_uv);
shape_color = f_color * texture(tex, uv);
} else {
// Solid color
shape_color = f_color;
}
// --- Outline (bit 3) — outer outline via premultiplied compositing ---
// The outline band sits OUTSIDE the original shape boundary (d=0 to d=+ol_width).
// fill_cov covers the interior with AA at d=0; total_cov covers interior+outline with
// AA at d=ol_width. The outline band's coverage is total_cov - fill_cov.
// Output is premultiplied: blend state is ONE, ONE_MINUS_SRC_ALPHA.
if ((flags & 8u) != 0u) {
mediump vec4 ol_color = unpackUnorm4x8(f_uv_or_effects.y);
// Outline width in f_uv_or_effects.w (low f16 half)
float ol_width = unpackHalf2x16(f_uv_or_effects.w).x / grad_magnitude;
float fill_cov = sdf_alpha(d, h);
float total_cov = sdf_alpha(d - ol_width, h);
float outline_cov = max(total_cov - fill_cov, 0.0);
// Premultiplied output — no divide, no threshold check
vec3 rgb_pm = shape_color.rgb * shape_color.a * fill_cov
+ ol_color.rgb * ol_color.a * outline_cov;
float alpha_pm = shape_color.a * fill_cov + ol_color.a * outline_cov;
out_color = vec4(rgb_pm, alpha_pm);
} else {
mediump float alpha = sdf_alpha(d, h);
out_color = vec4(shape_color.rgb * shape_color.a * alpha, shape_color.a * alpha);
}
} }
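
The outline branch above composites fill and outline in premultiplied form. A small Python sketch of that coverage arithmetic follows, assuming `d`, `h`, and `ol_width` have already been divided by the `fwidth`-based gradient magnitude as in the shader; the function name `shade_with_outline` is hypothetical:

```python
def smoothstep(edge0, edge1, x):
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def sdf_alpha(d, h):
    return 1.0 - smoothstep(-h, h, d)


def shade_with_outline(d, h, fill_rgba, outline_rgba, ol_width):
    """Return premultiplied RGBA for one fragment at signed distance d."""
    fill_cov = sdf_alpha(d, h)                # interior, AA edge at d = 0
    total_cov = sdf_alpha(d - ol_width, h)    # interior + band, AA edge at d = ol_width
    outline_cov = max(total_cov - fill_cov, 0.0)
    fr, fg, fb, fa = fill_rgba                # straight-alpha inputs
    outr, outg, outb, outa = outline_rgba
    rgb = tuple(f * fa * fill_cov + o * outa * outline_cov
                for f, o in ((fr, outr), (fg, outg), (fb, outb)))
    alpha = fa * fill_cov + outa * outline_cov
    return (rgb[0], rgb[1], rgb[2], alpha)
```

Because `fill_cov + outline_cov` equals `total_cov` wherever the band is non-negative, the two layers never sum to more than full coverage, which is why the shader needs no divide or threshold check before writing the premultiplied result for the `ONE, ONE_MINUS_SRC_ALPHA` blend state.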


@@ -6,13 +6,13 @@ layout(location = 1) in vec2 v_uv;
layout(location = 2) in vec4 v_color; layout(location = 2) in vec4 v_color;
// ---------- Outputs to fragment shader ---------- // ---------- Outputs to fragment shader ----------
layout(location = 0) out vec4 f_color; layout(location = 0) out mediump vec4 f_color;
layout(location = 1) out vec2 f_local_or_uv; layout(location = 1) out vec2 f_local_or_uv;
layout(location = 2) out vec4 f_params; layout(location = 2) out vec4 f_params;
layout(location = 3) out vec4 f_params2; layout(location = 3) out vec4 f_params2;
layout(location = 4) flat out uint f_kind_flags; layout(location = 4) flat out uint f_flags;
layout(location = 5) flat out float f_rotation; layout(location = 5) flat out uint f_rotation_sc;
layout(location = 6) flat out vec4 f_uv_rect; layout(location = 6) flat out uvec4 f_uv_or_effects;
// ---------- Uniforms (single block — avoids spirv-cross reordering on Metal) ---------- // ---------- Uniforms (single block — avoids spirv-cross reordering on Metal) ----------
layout(set = 1, binding = 0) uniform Uniforms { layout(set = 1, binding = 0) uniform Uniforms {
@@ -23,14 +23,14 @@ layout(set = 1, binding = 0) uniform Uniforms {
// ---------- SDF primitive storage buffer ---------- // ---------- SDF primitive storage buffer ----------
struct Primitive { struct Primitive {
vec4 bounds; // 0-15: min_x, min_y, max_x, max_y vec4 bounds; // 0-15
uint color; // 16-19: packed u8x4 (unpack with unpackUnorm4x8) uint color; // 16-19
uint kind_flags; // 20-23: kind | (flags << 8) uint flags; // 20-23
float rotation; // 24-27: shader self-rotation in radians uint rotation_sc; // 24-27: packed f16 pair (sin, cos)
float _pad; // 28-31: alignment padding float _pad; // 28-31
vec4 params; // 32-47: shape params part 1 vec4 params; // 32-47
vec4 params2; // 48-63: shape params part 2 vec4 params2; // 48-63
vec4 uv_rect; // 64-79: u_min, v_min, u_max, v_max uvec4 uv_or_effects; // 64-79
}; };
layout(std430, set = 0, binding = 0) readonly buffer Primitives { layout(std430, set = 0, binding = 0) readonly buffer Primitives {
@@ -45,9 +45,9 @@ void main() {
f_local_or_uv = v_uv; f_local_or_uv = v_uv;
f_params = vec4(0.0); f_params = vec4(0.0);
f_params2 = vec4(0.0); f_params2 = vec4(0.0);
f_kind_flags = 0u; f_flags = 0u;
f_rotation = 0.0; f_rotation_sc = 0u;
f_uv_rect = vec4(0.0, 0.0, 1.0, 1.0); f_uv_or_effects = uvec4(0);
gl_Position = projection * vec4(v_position * dpi_scale, 0.0, 1.0); gl_Position = projection * vec4(v_position * dpi_scale, 0.0, 1.0);
} else { } else {
@@ -62,9 +62,9 @@ void main() {
f_local_or_uv = (world_pos - center) * dpi_scale; // shape-centered physical pixels f_local_or_uv = (world_pos - center) * dpi_scale; // shape-centered physical pixels
f_params = p.params; f_params = p.params;
f_params2 = p.params2; f_params2 = p.params2;
f_kind_flags = p.kind_flags; f_flags = p.flags;
f_rotation = p.rotation; f_rotation_sc = p.rotation_sc;
f_uv_rect = p.uv_rect; f_uv_or_effects = p.uv_or_effects;
gl_Position = projection * vec4(world_pos * dpi_scale, 0.0, 1.0); gl_Position = projection * vec4(world_pos * dpi_scale, 0.0, 1.0);
} }
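
The `rotation_sc` field replaces a float angle with a packed f16 `(sin, cos)` pair that the fragment stage reads with `unpackHalf2x16`. The CPU-side packing code is not visible in this diff, so the following Python sketch is an assumption about the layout: it follows the GLSL `packHalf2x16` convention (first component in the low 16 bits), with sin as the first component to match the shader comment. The helper names are hypothetical:

```python
import math
import struct


def pack_half2x16(x, y):
    """Pack two floats as IEEE half floats: x in the low 16 bits, y in the high."""
    lo, hi = struct.unpack("<2H", struct.pack("<2e", x, y))
    return lo | (hi << 16)


def unpack_half2x16(v):
    """Inverse of pack_half2x16; returns (x, y) as Python floats."""
    return struct.unpack("<2e", struct.pack("<2H", v & 0xFFFF, v >> 16))


def inverse_rotate(p, rotation_sc):
    """Apply R(-angle) = [[cos, sin], [-sin, cos]] using the packed pair."""
    s, c = unpack_half2x16(rotation_sc)
    return (c * p[0] + s * p[1], -s * p[0] + c * p[1])


# Example: pack a 30 degree rotation the way the CPU side might.
angle = math.radians(30.0)
packed = pack_half2x16(math.sin(angle), math.cos(angle))
```

Half precision carries roughly three decimal digits, so the recovered sine and cosine differ from the double-precision values by a few parts in ten thousand; that trade buys a 4-byte field and removes per-pixel trigonometry.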

File diff suppressed because it is too large

draw/tess/tess.odin Normal file

@@ -0,0 +1,330 @@
package tess
import "core:math"
import draw ".."
SMOOTH_CIRCLE_ERROR_RATE :: 0.1
auto_segments :: proc(radius: f32, arc_degrees: f32) -> int {
if radius <= 0 do return 4
phys_radius := radius * draw.GLOB.dpi_scaling
acos_arg := clamp(2 * math.pow(1 - SMOOTH_CIRCLE_ERROR_RATE / phys_radius, 2) - 1, -1, 1)
theta := math.acos(acos_arg)
if theta <= 0 do return 4
full_circle_segments := int(math.ceil(2 * math.PI / theta))
segments := int(f32(full_circle_segments) * arc_degrees / 360.0)
min_segments := max(int(math.ceil(f64(arc_degrees / 90.0))), 4)
return max(segments, min_segments)
}
// ----- Internal helpers -----
// Color is premultiplied: the tessellated fragment shader passes it through directly
// and the blend state is ONE, ONE_MINUS_SRC_ALPHA.
solid_vertex :: proc(position: draw.Vec2, color: draw.Color) -> draw.Vertex {
return draw.Vertex{position = position, color = draw.premultiply_color(color)}
}
emit_rectangle :: proc(x, y, width, height: f32, color: draw.Color, vertices: []draw.Vertex, offset: int) {
vertices[offset + 0] = solid_vertex({x, y}, color)
vertices[offset + 1] = solid_vertex({x + width, y}, color)
vertices[offset + 2] = solid_vertex({x + width, y + height}, color)
vertices[offset + 3] = solid_vertex({x, y}, color)
vertices[offset + 4] = solid_vertex({x + width, y + height}, color)
vertices[offset + 5] = solid_vertex({x, y + height}, color)
}
extrude_line :: proc(
start, end_pos: draw.Vec2,
thickness: f32,
color: draw.Color,
vertices: []draw.Vertex,
offset: int,
) -> int {
direction := end_pos - start
delta_x := direction[0]
delta_y := direction[1]
length := math.sqrt(delta_x * delta_x + delta_y * delta_y)
if length < 0.0001 do return 0
scale := thickness / (2 * length)
perpendicular := draw.Vec2{-delta_y * scale, delta_x * scale}
p0 := start + perpendicular
p1 := start - perpendicular
p2 := end_pos - perpendicular
p3 := end_pos + perpendicular
vertices[offset + 0] = solid_vertex(p0, color)
vertices[offset + 1] = solid_vertex(p1, color)
vertices[offset + 2] = solid_vertex(p2, color)
vertices[offset + 3] = solid_vertex(p0, color)
vertices[offset + 4] = solid_vertex(p2, color)
vertices[offset + 5] = solid_vertex(p3, color)
return 6
}
// ----- Public draw -----
pixel :: proc(layer: ^draw.Layer, pos: draw.Vec2, color: draw.Color) {
vertices: [6]draw.Vertex
emit_rectangle(pos[0], pos[1], 1, 1, color, vertices[:], 0)
draw.prepare_shape(layer, vertices[:])
}
triangle :: proc(
layer: ^draw.Layer,
v1, v2, v3: draw.Vec2,
color: draw.Color,
origin: draw.Vec2 = {},
rotation: f32 = 0,
) {
if !draw.needs_transform(origin, rotation) {
vertices := [3]draw.Vertex{solid_vertex(v1, color), solid_vertex(v2, color), solid_vertex(v3, color)}
draw.prepare_shape(layer, vertices[:])
return
}
bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
local_v1 := v1 - bounds_min
local_v2 := v2 - bounds_min
local_v3 := v3 - bounds_min
vertices := [3]draw.Vertex {
solid_vertex(draw.apply_transform(transform, local_v1), color),
solid_vertex(draw.apply_transform(transform, local_v2), color),
solid_vertex(draw.apply_transform(transform, local_v3), color),
}
draw.prepare_shape(layer, vertices[:])
}
// Draw an anti-aliased triangle via extruded edge quads.
// Interior vertices get the full premultiplied color; outer fringe vertices get BLANK (0,0,0,0).
// The rasterizer linearly interpolates between them, producing a smooth 1-pixel AA band.
// `aa_px` controls the extrusion width in logical pixels (default 1.0).
// This proc emits 21 vertices (3 interior + 3 edge quads × 2 triangles × 3 verts each).
triangle_aa :: proc(
layer: ^draw.Layer,
v1, v2, v3: draw.Vec2,
color: draw.Color,
aa_px: f32 = draw.DFT_FEATHER_PX,
origin: draw.Vec2 = {},
rotation: f32 = 0,
) {
// Apply rotation if needed, then work in world space.
p0, p1, p2: draw.Vec2
if !draw.needs_transform(origin, rotation) {
p0 = v1
p1 = v2
p2 = v3
} else {
bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
p0 = draw.apply_transform(transform, v1 - bounds_min)
p1 = draw.apply_transform(transform, v2 - bounds_min)
p2 = draw.apply_transform(transform, v3 - bounds_min)
}
// Compute outward edge normals (unit length, pointing away from triangle interior).
// Winding-independent: we check against the centroid to ensure normals point outward.
centroid_x := (p0.x + p1.x + p2.x) / 3.0
centroid_y := (p0.y + p1.y + p2.y) / 3.0
edge_normal :: proc(edge_start, edge_end: draw.Vec2, centroid_x, centroid_y: f32) -> draw.Vec2 {
delta_x := edge_end.x - edge_start.x
delta_y := edge_end.y - edge_start.y
length := math.sqrt(delta_x * delta_x + delta_y * delta_y)
if length < 0.0001 do return {0, 0}
inverse_length := 1.0 / length
// Perpendicular: (-delta_y, delta_x) normalized
normal_x := -delta_y * inverse_length
normal_y := delta_x * inverse_length
// Midpoint of the edge
midpoint_x := (edge_start.x + edge_end.x) * 0.5
midpoint_y := (edge_start.y + edge_end.y) * 0.5
// If normal points toward centroid, flip it
if normal_x * (centroid_x - midpoint_x) + normal_y * (centroid_y - midpoint_y) > 0 {
normal_x = -normal_x
normal_y = -normal_y
}
return {normal_x, normal_y}
}
normal_01 := edge_normal(p0, p1, centroid_x, centroid_y)
normal_12 := edge_normal(p1, p2, centroid_x, centroid_y)
normal_20 := edge_normal(p2, p0, centroid_x, centroid_y)
extrude_distance := aa_px * draw.GLOB.dpi_scaling
// Outer fringe vertices: each edge vertex extruded outward
outer_0_01 := p0 + normal_01 * extrude_distance
outer_1_01 := p1 + normal_01 * extrude_distance
outer_1_12 := p1 + normal_12 * extrude_distance
outer_2_12 := p2 + normal_12 * extrude_distance
outer_2_20 := p2 + normal_20 * extrude_distance
outer_0_20 := p0 + normal_20 * extrude_distance
// Premultiplied interior color (solid_vertex does premul internally).
// Outer fringe is BLANK = {0,0,0,0} which is already premul.
transparent := draw.BLANK
// 3 interior + 3 edge quads × 6 vertices = 21 vertices
vertices: [21]draw.Vertex
// Interior triangle
vertices[0] = solid_vertex(p0, color)
vertices[1] = solid_vertex(p1, color)
vertices[2] = solid_vertex(p2, color)
// Edge quad: p0→p1 (2 triangles)
vertices[3] = solid_vertex(p0, color)
vertices[4] = solid_vertex(p1, color)
vertices[5] = solid_vertex(outer_1_01, transparent)
vertices[6] = solid_vertex(p0, color)
vertices[7] = solid_vertex(outer_1_01, transparent)
vertices[8] = solid_vertex(outer_0_01, transparent)
// Edge quad: p1→p2 (2 triangles)
vertices[9] = solid_vertex(p1, color)
vertices[10] = solid_vertex(p2, color)
vertices[11] = solid_vertex(outer_2_12, transparent)
vertices[12] = solid_vertex(p1, color)
vertices[13] = solid_vertex(outer_2_12, transparent)
vertices[14] = solid_vertex(outer_1_12, transparent)
// Edge quad: p2→p0 (2 triangles)
vertices[15] = solid_vertex(p2, color)
vertices[16] = solid_vertex(p0, color)
vertices[17] = solid_vertex(outer_0_20, transparent)
vertices[18] = solid_vertex(p2, color)
vertices[19] = solid_vertex(outer_0_20, transparent)
vertices[20] = solid_vertex(outer_2_20, transparent)
draw.prepare_shape(layer, vertices[:])
}
triangle_lines :: proc(
layer: ^draw.Layer,
v1, v2, v3: draw.Vec2,
color: draw.Color,
thickness: f32 = draw.DFT_STROKE_THICKNESS,
origin: draw.Vec2 = {},
rotation: f32 = 0,
temp_allocator := context.temp_allocator,
) {
vertices := make([]draw.Vertex, 18, temp_allocator)
defer delete(vertices, temp_allocator)
write_offset := 0
if !draw.needs_transform(origin, rotation) {
write_offset += extrude_line(v1, v2, thickness, color, vertices, write_offset)
write_offset += extrude_line(v2, v3, thickness, color, vertices, write_offset)
write_offset += extrude_line(v3, v1, thickness, color, vertices, write_offset)
} else {
bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
transformed_v1 := draw.apply_transform(transform, v1 - bounds_min)
transformed_v2 := draw.apply_transform(transform, v2 - bounds_min)
transformed_v3 := draw.apply_transform(transform, v3 - bounds_min)
write_offset += extrude_line(transformed_v1, transformed_v2, thickness, color, vertices, write_offset)
write_offset += extrude_line(transformed_v2, transformed_v3, thickness, color, vertices, write_offset)
write_offset += extrude_line(transformed_v3, transformed_v1, thickness, color, vertices, write_offset)
}
if write_offset > 0 {
draw.prepare_shape(layer, vertices[:write_offset])
}
}
triangle_fan :: proc(
layer: ^draw.Layer,
points: []draw.Vec2,
color: draw.Color,
origin: draw.Vec2 = {},
rotation: f32 = 0,
temp_allocator := context.temp_allocator,
) {
if len(points) < 3 do return
triangle_count := len(points) - 2
vertex_count := triangle_count * 3
vertices := make([]draw.Vertex, vertex_count, temp_allocator)
defer delete(vertices, temp_allocator)
if !draw.needs_transform(origin, rotation) {
for i in 1 ..< len(points) - 1 {
idx := (i - 1) * 3
vertices[idx + 0] = solid_vertex(points[0], color)
vertices[idx + 1] = solid_vertex(points[i], color)
vertices[idx + 2] = solid_vertex(points[i + 1], color)
}
} else {
bounds_min := draw.Vec2{max(f32), max(f32)}
for point in points {
bounds_min.x = min(bounds_min.x, point.x)
bounds_min.y = min(bounds_min.y, point.y)
}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
for i in 1 ..< len(points) - 1 {
idx := (i - 1) * 3
vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[0] - bounds_min), color)
vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
}
}
draw.prepare_shape(layer, vertices)
}
triangle_strip :: proc(
layer: ^draw.Layer,
points: []draw.Vec2,
color: draw.Color,
origin: draw.Vec2 = {},
rotation: f32 = 0,
temp_allocator := context.temp_allocator,
) {
if len(points) < 3 do return
triangle_count := len(points) - 2
vertex_count := triangle_count * 3
vertices := make([]draw.Vertex, vertex_count, temp_allocator)
defer delete(vertices, temp_allocator)
if !draw.needs_transform(origin, rotation) {
for i in 0 ..< triangle_count {
idx := i * 3
if i % 2 == 0 {
vertices[idx + 0] = solid_vertex(points[i], color)
vertices[idx + 1] = solid_vertex(points[i + 1], color)
vertices[idx + 2] = solid_vertex(points[i + 2], color)
} else {
vertices[idx + 0] = solid_vertex(points[i + 1], color)
vertices[idx + 1] = solid_vertex(points[i], color)
vertices[idx + 2] = solid_vertex(points[i + 2], color)
}
}
} else {
bounds_min := draw.Vec2{max(f32), max(f32)}
for point in points {
bounds_min.x = min(bounds_min.x, point.x)
bounds_min.y = min(bounds_min.y, point.y)
}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
for i in 0 ..< triangle_count {
idx := i * 3
if i % 2 == 0 {
vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 2] - bounds_min), color)
} else {
vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 2] - bounds_min), color)
}
}
}
draw.prepare_shape(layer, vertices)
}
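
The `acos` argument in `auto_segments` follows from bounding the sagitta (the gap between a chord and the arc it approximates). For a segment spanning angle θ on a circle of radius r the gap is r(1 − cos(θ/2)); setting it equal to the error budget e gives cos(θ/2) = 1 − e/r, and the double-angle identity cos θ = 2cos²(θ/2) − 1 yields the expression in the code. A Python port for experimenting with the numbers, assuming the DPI scale is passed as a parameter rather than read from `draw.GLOB` (the `sagitta` helper is hypothetical and exists only to check the bound):

```python
import math

SMOOTH_CIRCLE_ERROR_RATE = 0.1  # maximum chord-to-arc gap, in physical pixels


def auto_segments(radius, arc_degrees, dpi_scaling=1.0):
    if radius <= 0:
        return 4
    phys_radius = radius * dpi_scaling
    # cos(theta) = 2 * (1 - e / r)^2 - 1, clamped into acos's domain
    acos_arg = 2.0 * (1.0 - SMOOTH_CIRCLE_ERROR_RATE / phys_radius) ** 2 - 1.0
    theta = math.acos(min(max(acos_arg, -1.0), 1.0))
    if theta <= 0:
        return 4
    full_circle_segments = math.ceil(2.0 * math.pi / theta)
    segments = int(full_circle_segments * arc_degrees / 360.0)
    min_segments = max(math.ceil(arc_degrees / 90.0), 4)
    return max(segments, min_segments)


def sagitta(radius, segments):
    """Chord-to-arc gap for a full circle split into equal segments."""
    return radius * (1.0 - math.cos(math.pi / segments))
```

The segment count grows roughly with the square root of the radius, so large circles stay smooth without small ones paying for vertices they cannot show.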


@@ -79,7 +79,7 @@ register_font :: proc(bytes: []u8) -> (id: Font_Id, ok: bool) #optional_ok {
Text :: struct { Text :: struct {
sdl_text: ^sdl_ttf.Text, sdl_text: ^sdl_ttf.Text,
position: [2]f32, position: Vec2,
color: Color, color: Color,
} }
@@ -129,11 +129,11 @@ cache_get_or_update :: proc(key: Cache_Key, c_str: cstring, font: ^sdl_ttf.Font)
text :: proc( text :: proc(
layer: ^Layer, layer: ^Layer,
text_string: string, text_string: string,
position: [2]f32, position: Vec2,
font_id: Font_Id, font_id: Font_Id,
font_size: u16 = 44, font_size: u16 = DFT_FONT_SIZE,
color: Color = BLACK, color: Color = DFT_TEXT_COLOR,
origin: [2]f32 = {0, 0}, origin: Vec2 = {},
rotation: f32 = 0, rotation: f32 = 0,
id: Maybe(u32) = nil, id: Maybe(u32) = nil,
temp_allocator := context.temp_allocator, temp_allocator := context.temp_allocator,
@@ -177,9 +177,9 @@ text :: proc(
measure_text :: proc( measure_text :: proc(
text_string: string, text_string: string,
font_id: Font_Id, font_id: Font_Id,
font_size: u16 = 44, font_size: u16 = DFT_FONT_SIZE,
allocator := context.temp_allocator, allocator := context.temp_allocator,
) -> [2]f32 { ) -> Vec2 {
c_str := strings.clone_to_cstring(text_string, allocator) c_str := strings.clone_to_cstring(text_string, allocator)
defer delete(c_str, allocator) defer delete(c_str, allocator)
width, height: c.int width, height: c.int
@@ -193,46 +193,46 @@ measure_text :: proc(
// ----- Text anchor helpers ----------- // ----- Text anchor helpers -----------
// --------------------------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
center_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { center_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return size * 0.5 return size * 0.5
} }
top_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { top_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
return {0, 0} return {0, 0}
} }
top_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { top_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {size.x * 0.5, 0} return {size.x * 0.5, 0}
} }
top_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { top_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {size.x, 0} return {size.x, 0}
} }
left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {0, size.y * 0.5} return {0, size.y * 0.5}
} }
right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {size.x, size.y * 0.5} return {size.x, size.y * 0.5}
} }
bottom_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { bottom_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {0, size.y} return {0, size.y}
} }
bottom_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { bottom_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {size.x * 0.5, size.y} return {size.x * 0.5, size.y}
} }
bottom_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { bottom_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return size return size
} }
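
Each of the nine anchor helpers above returns a fixed fraction of the measured text size, suitable for passing as `origin`. The same mapping can be written as a table; this Python sketch is illustrative only, with `ANCHORS` and `anchor_offset` as hypothetical names and `size` standing in for the result of `measure_text`:

```python
# Fraction of (width, height) returned by each *_of_text helper.
ANCHORS = {
    "top_left": (0.0, 0.0),
    "top": (0.5, 0.0),
    "top_right": (1.0, 0.0),
    "left": (0.0, 0.5),
    "center": (0.5, 0.5),
    "right": (1.0, 0.5),
    "bottom_left": (0.0, 1.0),
    "bottom": (0.5, 1.0),
    "bottom_right": (1.0, 1.0),
}


def anchor_offset(size, anchor):
    """Return the origin offset for a text block of the given measured size."""
    fx, fy = ANCHORS[anchor]
    return (size[0] * fx, size[1] * fy)
```

For text measuring 200x40, the `center` anchor gives `(100, 20)` and `bottom_right` gives the full size.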