Organization & cleanup
+107
-92
@@ -15,10 +15,10 @@ modes dispatched by a push constant:
 shader premultiplies the texture sample (`t.rgb *= t.a`) and computes `out = color * t`.

 - **Mode 1 (SDF):** A static 6-vertex unit-quad buffer is drawn instanced, with per-primitive
-`Primitive` structs (80 bytes each) uploaded each frame to a GPU storage buffer. The vertex shader
-reads `primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners +
+`Base_2D_Primitive` structs (96 bytes each) uploaded each frame to a GPU storage buffer. The vertex
+shader reads `primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners +
 primitive bounds. The fragment shader dispatches on `Shape_Kind` (encoded in the low byte of
-`Primitive.flags`) to evaluate one of four signed distance functions:
+`Base_2D_Primitive.flags`) to evaluate one of four signed distance functions:
 - **RRect** (kind 1) — `sdRoundedBox` with per-corner radii. Covers rectangles (sharp or rounded),
 circles (uniform radii = half-size), and line segments / capsules (rotated RRect with uniform
 radii = half-thickness). Covers filled, outlined, textured, and gradient-filled variants.
@@ -28,21 +28,22 @@ modes dispatched by a push constant:
 normals. Covers full rings, partial arcs, and pie slices (`inner_radius = 0`).

 All SDF shapes support fill, outline, solid color, 2-color linear gradients, 2-color radial
-gradients, and texture fills via `Shape_Flags` (see `pipeline_2d_base.odin`). Gradient and outline
-parameters are packed into the same 16 bytes as the texture UV rect via a `Uv_Or_Effects` raw union
-— zero size increase to the 80-byte `Primitive` struct. Gradient/outline and texture are mutually
-exclusive.
+gradients, and texture fills via `Shape_Flags` (see `pipeline_2d_base.odin`). The texture UV rect
+(`uv_rect: [4]f32`) and the gradient/outline parameters (`effects: Gradient_Outline`) live in their
+own 16-byte slots in `Base_2D_Primitive`, so a primitive can carry texture and outline simultaneously.
+Gradient and texture remain mutually exclusive at the fill-source level (a Brush variant chooses one
+or the other) since they share the worst-case fragment-shader register path.

 All SDF shapes produce mathematically exact curves with analytical anti-aliasing via `smoothstep` —
-no tessellation, no piecewise-linear approximation. A rounded rectangle is 1 primitive (80 bytes)
+no tessellation, no piecewise-linear approximation. A rounded rectangle is 1 primitive (96 bytes)
 instead of ~250 vertices (~5000 bytes).

 The main pipeline's register budget is **≤24 registers** (see "Main/effects split: register pressure"
-in the pipeline plan below for the full cliff/margin analysis and SBC architecture context). The
-fragment shader's estimated peak footprint is ~22–26 fp32 VGPRs (~16–22 fp16 VGPRs on architectures
+in the pipeline plan below for the full cliff/margin analysis and SBC architecture context).
+The fragment shader's estimated peak footprint is ~22–26 fp32 VGPRs (~16–22 fp16 VGPRs on architectures
 with native mediump) via manual live-range analysis. The dominant peak is the Ring_Arc kind path
 (wedge normals + inner/outer radii + dot-product temporaries live simultaneously with carried state
-like `f_color`, `f_uv_or_effects`, and `half_size`). RRect is 1–2 regs lower (`corner_radii` vec4
+like `f_color`, `f_uv_rect`/`f_effects`, and `half_size`). RRect is 1–2 regs lower (`corner_radii` vec4
 replaces the separate inner/outer + normal pairs). NGon and Ellipse are lighter still. Real compilers
 apply live-range coalescing, mediump-to-fp16 promotion, and rematerialization that typically shave
 2–4 regs from hand-counted estimates — the conservative 26-reg upper bound is expected to compile
@@ -439,12 +440,13 @@ vertex shader branches on this uniform to select the tessellated or SDF code pat
 - **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry. Used for text
 (SDL_ttf atlas sampling), triangles, triangle fans/strips, single-pixel points, and any
 user-provided raw vertex geometry.
-- **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of `Primitive`
-structs, drawn instanced. Used for all shapes with closed-form signed distance functions.
+- **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of
+`Base_2D_Primitive` structs, drawn instanced. Used for all shapes with closed-form signed distance
+functions.

 Both modes use the same fragment shader. The fragment shader checks `Shape_Kind` (low byte of
-`Primitive.flags`): kind 0 (`Solid`) is the tessellated path, which premultiplies the texture sample
-and computes `out = color * t`; kinds 1–4 dispatch to one of four SDF functions (RRect, NGon,
+`Base_2D_Primitive.flags`): kind 0 (`Solid`) is the tessellated path, which premultiplies the texture
+sample and computes `out = color * t`; kinds 1–4 dispatch to one of four SDF functions (RRect, NGon,
 Ellipse, Ring_Arc) and apply gradient/texture/outline/solid color based on `Shape_Flags` bits.

 #### Why SDF for shapes
@@ -452,8 +454,8 @@ Ellipse, Ring_Arc) and apply gradient/texture/outline/solid color based on `Shap
 CPU-side adaptive tessellation for curved shapes (the current approach) has three problems:

 1. **Vertex bandwidth.** A rounded rectangle with four corner arcs produces ~250 vertices × 20 bytes
-= 5 KB. An SDF rounded rectangle is one `Primitive` struct (~56 bytes) plus 4 shared unit-quad
-vertices. That is roughly a 90× reduction per shape.
+= 5 KB. An SDF rounded rectangle is one `Base_2D_Primitive` struct (96 bytes) plus 4 shared
+unit-quad vertices. That is roughly a 50× reduction per shape.

 2. **Quality.** Tessellated curves are piecewise-linear approximations. At high DPI or under
 animation/zoom, faceting is visible at any practical segment count. SDF evaluation produces
@@ -484,14 +486,14 @@ SDF primitives are submitted via a GPU storage buffer indexed by `gl_InstanceInd
 shader, rather than encoding per-primitive data redundantly in vertex attributes. This follows the
 pattern used by both Zed GPUI and vger-rs.

-Each SDF shape is described by a single `Primitive` struct (80 bytes) in the storage buffer. The
-vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position from the unit
-vertex and the primitive's bounds, and passes shape parameters to the fragment shader via `flat`
-interpolated varyings.
+Each SDF shape is described by a single `Base_2D_Primitive` struct (96 bytes) in the storage
+buffer. The vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position
+from the unit vertex and the primitive's bounds, and passes shape parameters to the fragment shader
+via `flat` interpolated varyings.

 Compared to encoding per-primitive data in vertex attributes (the "fat vertex" approach), storage-
 buffer instancing eliminates the 4–6× data duplication across quad corners. A rounded rectangle costs
-80 bytes instead of 4 vertices × 40+ bytes = 160+ bytes.
+96 bytes instead of 4 vertices × 60+ bytes = 240+ bytes.
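The byte counts quoted in the updated lines can be checked with a few lines of arithmetic. The following is a Python sketch that uses only figures stated in this diff (96-byte `Base_2D_Primitive`, 20-byte tessellated vertex, roughly 250 vertices per rounded rectangle, 4 quad corners at 60 bytes as the lower bound for the fat-vertex case); the constant names are illustrative and do not come from the codebase.

```python
# Figures taken from the README text; names are illustrative only.
PRIMITIVE_BYTES = 96      # one Base_2D_Primitive in the storage buffer
TESS_VERTEX_BYTES = 20    # tessellated-path vertex size
TESS_VERTEX_COUNT = 250   # approximate vertices for a tessellated rounded rect
FAT_VERTEX_BYTES = 60     # lower bound quoted for the "fat vertex" approach
QUAD_CORNERS = 4

tessellated_bytes = TESS_VERTEX_COUNT * TESS_VERTEX_BYTES
fat_vertex_bytes = QUAD_CORNERS * FAT_VERTEX_BYTES

# Ratio of each alternative to a single storage-buffer primitive.
tessellated_ratio = tessellated_bytes / PRIMITIVE_BYTES
fat_vertex_ratio = fat_vertex_bytes / PRIMITIVE_BYTES

print(tessellated_bytes, round(tessellated_ratio, 1))  # 5000 52.1
print(fat_vertex_bytes, fat_vertex_ratio)              # 240 2.5
```

The first ratio is where the "roughly 50×" figure comes from; the second shows that, even at the 60-byte lower bound, per-corner duplication costs about 2.5 times the storage-buffer layout.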

 The tessellated path retains the existing direct vertex buffer layout (20 bytes/vertex, no storage
 buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every invocation
@@ -499,15 +501,18 @@ in a draw call has the same mode — so it is effectively free on all modern GPU
 #### Shape kinds and SDF dispatch

-The fragment shader dispatches on `Shape_Kind` (low byte of `Primitive.flags`) to evaluate one of
-four signed distance functions. The `Shape_Kind` enum and per-kind `*_Params` structs are defined in
-`pipeline_2d_base.odin`. CPU-side drawing procs in `shapes.odin` build the appropriate `Primitive`
-and set the kind automatically:
+The fragment shader dispatches on `Shape_Kind` (low byte of `Base_2D_Primitive.flags`) to evaluate
+one of four signed distance functions. The `Shape_Kind` enum and per-kind `*_Params` structs are
+defined in `pipeline_2d_base.odin`. CPU-side drawing procs in `shapes.odin` build the appropriate
+`Base_2D_Primitive` and set the kind automatically:

+Each user-facing shape proc accepts a `Brush` union (color, linear gradient, radial gradient,
+or textured fill) as its fill source, plus optional outline parameters. The procs map to SDF
+kinds as follows:
+
 | User-facing proc | Shape_Kind | SDF function | Notes |
 | -------------------- | ---------- | ------------------ | ---------------------------------------------------------- |
 | `rectangle` | `RRect` | `sdRoundedBox` | Per-corner radii from `radii` param |
-| `rectangle_texture` | `RRect` | `sdRoundedBox` | Textured fill; `.Textured` flag set |
 | `circle` | `RRect` | `sdRoundedBox` | Uniform radii = half-size (circle is a degenerate RRect) |
 | `line`, `line_strip` | `RRect` | `sdRoundedBox` | Rotated capsule — stadium shape (radii = half-thickness) |
 | `ellipse` | `Ellipse` | `sdEllipseApprox` | Approximate ellipse SDF (fast, suitable for UI) |
@@ -599,20 +604,21 @@ to is a hard GPU constraint; the only way to satisfy it is to end the current re
 a new one. That render-pass boundary is what a “bracket” is.

 **Multi-pass implementation.** Backdrop effects are implemented as separable multi-pass sequences
-(downsample → horizontal blur → vertical-blur+composite), following the standard approach used by
-iOS `UIVisualEffectView`, Android `RenderEffect`, and Flutter's `BackdropFilter`. Each individual
+(downsample → horizontal blur → vertical blur → composite), following the standard approach used
+by iOS `UIVisualEffectView`, Android `RenderEffect`, and Flutter's `BackdropFilter`. Each individual
 sub-pass is budgeted at **≤24 registers** (same as the main pipeline — full Valhall occupancy). The
 multi-pass approach avoids the monolithic 70+ register shader that a single-pass Gaussian blur would
 require, keeping each sub-pass well under the 32-register cliff.

-**Approach B: render-target choice.** When any layer in the frame contains a backdrop draw, the
-entire frame renders into `source_texture` (a full-resolution single-sample texture owned by the
-backdrop pipeline) instead of directly into the swapchain. At the end of the frame, `source_texture`
-is copied to the swapchain via a single `CopyGPUTextureToTexture` call. This means the bracket has
-no mid-frame texture copy: by the time the bracket runs, `source_texture` already contains the pre-
-bracket frame contents and is the natural sampler input. When no layer in the frame has a backdrop
-draw, the existing fast path runs: the frame renders directly to the swapchain and the backdrop
-pipeline's working textures are never touched. Zero cost for backdrop-free frames.
+**Render-target choice.** When any layer in the frame contains a backdrop draw, the entire
+frame renders into `source_texture` (a full-resolution single-sample texture owned by the
+backdrop pipeline) instead of directly into the swapchain. At the end of the frame,
+`source_texture` is copied to the swapchain via a single `CopyGPUTextureToTexture` call.
+This means the bracket has no mid-frame texture copy: by the time the bracket runs,
+`source_texture` already contains the pre-bracket frame contents and is the natural sampler
+input. When no layer in the frame has a backdrop draw, the existing fast path runs: the frame
+renders directly to the swapchain and the backdrop pipeline's working textures are never
+touched. Zero cost for backdrop-free frames.

 **Why not split the backdrop sub-passes into separate pipelines?** Each sub-pass is budgeted at ≤24
 registers, well under Valhall's 32-register cliff, so there is no occupancy motivation for splitting.
@@ -638,13 +644,20 @@ submission order. Concretely, a layer with one or more backdrops splits into thr
 range. If the layer has no backdrops, none of this kicks in and the layer renders in a single render
 pass via the existing fast path.

-The downsample runs once per layer, not once per sigma: it just copies `source_texture` to a ¼-
-resolution working texture and doesn't depend on the kernel. Each unique sigma in the layer triggers
-one H-blur (reads `downsample_texture`, writes `h_blur_texture`) and one V-composite (reads
-`h_blur_texture`, writes `source_texture` per-primitive with the SDF mask). Sub-batch coalescing in
-`append_or_extend_sub_batch` merges contiguous same-sigma backdrops into a single instanced V-
-composite draw call; non-contiguous same-sigma backdrops still share the H-blur output but issue
-separate V-composite draws.
+Per-sigma-group execution. The bracket walks each layer's sub-batches and groups contiguous
+`.Backdrop` sub-batches that share a sigma; each group picks its own downsample factor (1, 2, or 4)
+based on `compute_backdrop_downsample_factor`. For each group it runs four sub-passes: a downsample
+from `source_texture` to `downsample_texture`; an H-blur from `downsample_texture` to
+`h_blur_texture`; a V-blur from `h_blur_texture` back into `downsample_texture` (ping-pong reuse);
+and finally a composite that reads the fully-blurred `downsample_texture`, applies the SDF mask
+and tint, and writes the result to `source_texture`. Sub-batch coalescing in
+`append_or_extend_sub_batch` merges contiguous same-sigma backdrops into a single instanced
+composite draw; non-contiguous same-sigma backdrops still share the blur output but issue separate
+composite draws.
+
+The working textures are sized at the full swapchain resolution; larger downsample factors only
+fill a sub-rect via viewport-limited rendering (see the comment block at the top of `backdrop.odin`
+for the factor-selection table and rationale).
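The per-sigma-group walk described in the added lines can be sketched as follows. This is a Python illustration, not the Odin implementation: the `SubBatch` type and the `sigma_groups` helper are hypothetical, and it assumes one reading of the text, namely that non-backdrop sub-batches are skipped before grouping so that same-sigma backdrops separated by other draws still share one blur. The factor selection in `compute_backdrop_downsample_factor` is deliberately left out because its table is not shown in this diff.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass
class SubBatch:
    """Hypothetical stand-in for a layer sub-batch."""
    kind: str            # e.g. "Backdrop", "SDF", "Text"
    sigma: float = 0.0   # only meaningful for Backdrop sub-batches

# The four sub-passes named in the text: (name, reads, writes).
SUB_PASSES = [
    ("downsample", "source_texture", "downsample_texture"),
    ("h_blur", "downsample_texture", "h_blur_texture"),
    ("v_blur", "h_blur_texture", "downsample_texture"),  # ping-pong reuse
    ("composite", "downsample_texture", "source_texture"),
]

def sigma_groups(sub_batches):
    """Return (sigma, backdrop_count) for each run of same-sigma backdrops.

    Assumption: non-backdrop sub-batches are filtered out before grouping.
    """
    backdrops = [b for b in sub_batches if b.kind == "Backdrop"]
    return [(sigma, len(list(run)))
            for sigma, run in groupby(backdrops, key=lambda b: b.sigma)]

layer = [
    SubBatch("SDF"),
    SubBatch("Backdrop", 12.0),
    SubBatch("SDF"),
    SubBatch("Backdrop", 12.0),
    SubBatch("Backdrop", 4.0),
]

for sigma, count in sigma_groups(layer):
    for name, reads, writes in SUB_PASSES:
        print(f"sigma={sigma} n={count}: {name} {reads} -> {writes}")
```

Each group runs the same four-step sequence; only the sigma (and, in the real code, the downsample factor) changes between groups.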

 #### Submission-order trade-off

@@ -654,12 +667,12 @@ layer. A non-backdrop sub-batch submitted between two backdrops still renders in
 bracket), not at its submission position. Worked example:

 ```
-draw.rectangle(layer, bg, GRAY)              // 0 Tessellated → Pass A
-draw.rectangle(layer, card_blue, BLUE)       // 1 SDF → Pass A
-draw.rectangle_backdrop(layer, panelA, 12)   // 2 Backdrop → Bracket (sees: bg + blue card)
-draw.rectangle(layer, card_red, RED)         // 3 SDF → Pass B (drawn ON TOP of panelA)
-draw.rectangle_backdrop(layer, panelB, 12)   // 4 Backdrop → Bracket (sees: bg + blue card; same as panelA)
-draw.text(layer, "label", ...)               // 5 Text → Pass B (drawn ON TOP of both panels)
+draw.rectangle(layer, bg, GRAY)              // 0 Tessellated → Pass A
+draw.rectangle(layer, card_blue, BLUE)       // 1 SDF → Pass A
+draw.gaussian_blur(layer, panelA, sigma=12)  // 2 Backdrop → Bracket (sees: bg + blue card)
+draw.rectangle(layer, card_red, RED)         // 3 SDF → Pass B (drawn ON TOP of panelA)
+draw.gaussian_blur(layer, panelB, sigma=12)  // 4 Backdrop → Bracket (sees: bg + blue card; same as panelA)
+draw.text(layer, "label", ...)               // 5 Text → Pass B (drawn ON TOP of both panels)
 ```

 In this layer, panelB does *not* see card_red — even though card_red was submitted before panelB —
@@ -674,11 +687,11 @@ card_red:
 base := draw.begin(...)
 draw.rectangle(base, bg, GRAY)
 draw.rectangle(base, card_blue, BLUE)
-draw.rectangle_backdrop(base, panelA, 12)   // panelA in base layer's bracket
+draw.gaussian_blur(base, panelA, sigma=12)  // panelA in base layer's bracket

 top := draw.new_layer(base, ...)
 draw.rectangle(top, card_red, RED)
-draw.rectangle_backdrop(top, panelB, 12)    // top layer's bracket; sees base + card_red
+draw.gaussian_blur(top, panelB, sigma=12)   // top layer's bracket; sees base + card_red
 draw.text(top, "label", ...)
 ```

@@ -708,29 +721,30 @@ draws, `position` carries actual world-space geometry. For SDF draws, `position`
 corners (0,0 to 1,1) and the vertex shader computes world-space position from the storage-buffer
 primitive's bounds.

-The `Primitive` struct for SDF shapes lives in the storage buffer, not in vertex attributes:
+The `Base_2D_Primitive` struct for SDF shapes lives in the storage buffer, not in vertex attributes:

 ```
-Primitive :: struct {
-    bounds:      [4]f32,        // 0: min_x, min_y, max_x, max_y
-    color:       Color,         // 16: u8x4, unpacked in shader via unpackUnorm4x8
-    flags:       u32,           // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
-    rotation_sc: u32,           // 24: packed f16 pair (sin, cos). Requires .Rotated flag.
-    _pad:        f32,           // 28: reserved for future use
-    params:      Shape_Params,  // 32: per-kind params union (half_feather, radii, etc.) (32 bytes)
-    uv:          Uv_Or_Effects, // 64: texture UV rect or gradient/outline parameters (16 bytes)
+Base_2D_Primitive :: struct {
+    bounds:      [4]f32,           // 0: min_x, min_y, max_x, max_y
+    color:       Color,            // 16: u8x4, unpacked in shader via unpackUnorm4x8
+    flags:       u32,              // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
+    rotation_sc: u32,              // 24: packed f16 pair (sin, cos). Requires .Rotated flag.
+    _pad:        f32,              // 28: reserved for future use
+    params:      Shape_Params,     // 32: per-kind params union (half_feather, radii, etc.) (32 bytes)
+    uv_rect:     [4]f32,           // 64: texture UV coordinates. Read when .Textured.
+    effects:     Gradient_Outline, // 80: gradient and/or outline parameters (16 bytes).
 }
-// Total: 80 bytes (std430 aligned)
+// Total: 96 bytes (std430 aligned)
 ```
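The offsets in the new struct comments can be cross-checked mechanically. The sketch below uses Python's `struct.calcsize` with explicit little-endian, unpadded formats to recompute each field offset and the 96-byte total. It rests on assumptions taken from the surrounding text: `Color` is four `u8` values, `Shape_Params` is modeled through its `raw: [8]f32` view, and `Gradient_Outline` is two `Color` values followed by two `u32` values. The `FIELDS` table and the `layout` helper are illustrative, not part of the codebase.

```python
import struct

# (field name, struct format) in declaration order.
FIELDS = [
    ("bounds",      "4f"),      # [4]f32
    ("color",       "4B"),      # Color, assumed u8x4
    ("flags",       "I"),       # u32
    ("rotation_sc", "I"),       # u32, packed f16 pair
    ("_pad",        "f"),       # f32
    ("params",      "8f"),      # Shape_Params via its raw [8]f32 view
    ("uv_rect",     "4f"),      # [4]f32
    ("effects",     "4B4BII"),  # Gradient_Outline: 2 x Color, 2 x u32
]

def layout(fields):
    """Return ({name: byte offset}, total size) for tightly packed fields."""
    offsets, cursor = {}, 0
    for name, fmt in fields:
        offsets[name] = cursor
        cursor += struct.calcsize("<" + fmt)  # "<" disables implicit padding
    return offsets, cursor

offsets, total = layout(FIELDS)
for name, offset in offsets.items():
    print(f"{name:12s} {offset:3d}")
print("total", total)
```

Every field starts on a 4-byte boundary, so the tightly packed layout computed here coincides with the commented offsets (0, 16, 20, 24, 28, 32, 64, 80) without any compiler-inserted padding.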

 `Shape_Params` is a `#raw_union` over `RRect_Params`, `NGon_Params`, `Ellipse_Params`, and
 `Ring_Arc_Params` (plus a `raw: [8]f32` view), defined in `pipeline_2d_base.odin`. Each SDF kind
 writes its own params variant; the fragment shader reads the appropriate fields based on `Shape_Kind`.
-`Uv_Or_Effects` is a `#raw_union` that aliases `[4]f32` (texture UV rect: u_min, v_min, u_max,
-v_max) with a `Gradient_Outline` struct containing `gradient_color: Color`, `outline_color: Color`,
+`Gradient_Outline` is a 16-byte struct containing `gradient_color: Color`, `outline_color: Color`,
 `gradient_dir_sc: u32` (packed f16 cos/sin pair), and `outline_packed: u32` (packed f16 outline
-width). The `flags` field encodes the `Shape_Kind` in the low byte and `Shape_Flags` in bits 8+
-via `pack_kind_flags`.
+width). It is independent of `uv_rect`, so a primitive can carry texture and outline parameters at
+the same time. The `flags` field encodes the `Shape_Kind` in the low byte and `Shape_Flags` in bits
+8+ via `pack_kind_flags`.
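The `flags` encoding described above (kind in the low byte, flag bits from bit 8 upward) can be illustrated with a short Python sketch. Only the overall bit layout, the `pack_kind_flags` name, `Solid = 0`, and `RRect = 1` come from the text. The numeric values for the other kinds, the bit positions of `.Textured` and `.Rotated`, and the `unpack_kind_flags` helper are assumptions made for illustration.

```python
from enum import IntEnum, IntFlag

class ShapeKind(IntEnum):
    SOLID = 0
    RRECT = 1
    NGON = 2       # assumed value
    ELLIPSE = 3    # assumed value
    RING_ARC = 4   # assumed value

class ShapeFlags(IntFlag):
    TEXTURED = 1 << 0  # hypothetical bit position
    ROTATED = 1 << 1   # hypothetical bit position

def pack_kind_flags(kind, flags):
    """Kind goes in bits 0-7, flag bits are shifted up to bit 8 and above."""
    return (int(kind) & 0xFF) | (int(flags) << 8)

def unpack_kind_flags(packed):
    """Inverse of pack_kind_flags (hypothetical helper)."""
    return ShapeKind(packed & 0xFF), ShapeFlags(packed >> 8)

packed = pack_kind_flags(ShapeKind.RRECT, ShapeFlags.TEXTURED | ShapeFlags.ROTATED)
print(hex(packed))  # 0x301
kind, flags = unpack_kind_flags(packed)
print(kind.name, ShapeFlags.TEXTURED in flags, ShapeFlags.ROTATED in flags)
```

A shader-side decode would mirror `unpack_kind_flags`: mask with `0xFF` for the dispatch kind and shift right by 8 before testing individual flag bits.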

 ### Draw submission order

@@ -754,7 +768,7 @@ pair into bitmap atlases and emits indexed triangle data via `GetGPUTextDrawData
 **unchanged** by the SDF migration — text continues to flow through the main pipeline's tessellated
 mode with `mode = 0`, sampling the SDL_ttf atlas texture.

-A future phase may evaluate MSDF (multi-channel signed distance field) text rendering, which would
+MSDF (multi-channel signed distance field) text rendering may be evaluated later, which would
 allow resolution-independent glyph rendering from a single small atlas per font. This would involve:

 - Offline atlas generation via Chlumský's msdf-atlas-gen tool.
@@ -763,8 +777,7 @@ allow resolution-independent glyph rendering from a single small atlas per font.
 already exists for the four current SDF kinds).
 - Potential removal of the SDL_ttf dependency.

-This is explicitly deferred. The SDF shape migration is independent of and does not block text
-changes.
+This is explicitly deferred.

 **References:**

@@ -778,8 +791,8 @@ changes.
 ### Textures

 Textures plug into the existing main pipeline — no additional GPU pipeline, no shader rewrite. The
-work is a resource layer (registration, upload, sampling, lifecycle) plus two textured-draw procs
-that route into the existing tessellated and SDF paths respectively.
+work is a resource layer (registration, upload, sampling, lifecycle) plus a `Texture_Fill` Brush
+variant that routes the existing shape procs through the SDF path with the `.Textured` flag set.

 #### Why draw owns registered textures

@@ -829,22 +842,25 @@ with the same texture but different samplers produce separate draw calls, which
 #### Textured draw procs

-Textured rectangles route through the existing SDF path via `rectangle_texture`, which mirrors
-`rectangle` exactly — same parameters for radii, origin, rotation, feather — with the `color`
-parameter replaced by a `Texture_Id`, an optional `tint`, a `uv_rect`, and a `Sampler_Preset`.
+Textures share the same shape procs as colors and gradients. Each shape proc takes a `Brush`
+union as its fill source; passing a `Texture_Fill` value (carrying `Texture_Id`, `tint`,
+`uv_rect`, and `Sampler_Preset`) routes the draw through the SDF path with the `.Textured`
+flag set. There is no dedicated `rectangle_texture` / `circle_texture` proc — the same
+`rectangle`, `circle`, `ellipse`, `polygon`, `ring`, `line`, and `line_strip` procs handle
+all fill sources.

-An earlier iteration of this design considered a separate tessellated proc for "simple" fullscreen
-quads, on the theory that the tessellated path's lower register count would improve occupancy at
-large fragment counts. Both paths are well within the ≤24-register main pipeline budget — both run at
-full occupancy on every target architecture (Valhall and above). The remaining ALU difference (~15
-extra instructions for the SDF evaluation) amounts to ~20μs at 4K — below noise. Meanwhile,
-splitting into a separate pipeline would add ~1–5μs per pipeline bind on the CPU side per scissor,
-matching or exceeding the GPU-side savings. Within the main pipeline, unified remains strictly better.
+A separate tessellated proc for "simple" fullscreen quads was considered on the theory that
+the tessellated path's lower register count would improve occupancy at large fragment counts.
+Both paths are well within the ≤24-register main pipeline budget — both run at full
+occupancy on every target architecture (Valhall and above). The remaining ALU difference
+(~15 extra instructions for the SDF evaluation) amounts to ~20μs at 4K — below noise.
+Meanwhile, splitting into a separate pipeline would add ~1–5μs per pipeline bind on the CPU
+side per scissor, matching or exceeding the GPU-side savings. Within the main pipeline,
+unified remains strictly better.

-SDF drawing procs live in the `draw` package with unprefixed names (`rectangle`, `rectangle_texture`,
-`circle`, `ellipse`, `polygon`, `ring`, `line`, `line_strip`). Gradients and outlines are optional
-parameters on each proc rather than separate overloads. Future per-shape texture variants
-(`circle_texture`, `ellipse_texture`) are additive.
+SDF drawing procs live in the `draw` package with unprefixed names (`rectangle`, `circle`,
+`ellipse`, `polygon`, `ring`, `line`, `line_strip`). Gradients, textures, and outlines are
+selected via the `Brush` union and optional outline parameters rather than separate overloads.

 #### What SDF anti-aliasing does and does not do for textured draws

@@ -858,8 +874,8 @@ depends on how closely the display size matches the SDL_ttf atlas's rasterized s
 #### Fit modes are a computation layer, not a renderer concept

 Standard image-fit behaviors (stretch, fill/cover, fit/contain, tile, center) are expressed as UV
-sub-region computations on top of the `uv_rect` parameter that both textured-draw procs accept. The
-renderer has no knowledge of fit modes — it samples whatever UV region it is given.
+sub-region computations on top of the `uv_rect` field of `Texture_Fill`. The renderer has no
+knowledge of fit modes — it samples whatever UV region it is given.

 A `fit_params` helper computes the appropriate `uv_rect`, sampler preset, and (for letterbox/fit
 mode) shrunken inner rect from a `Fit_Mode` enum, the target rect, and the texture's pixel size.
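To make the "UV sub-region computation" idea concrete, here is a Python sketch of how a fill/cover mode could derive a centered `uv_rect` from the target size and the texture's pixel size. It is not the `fit_params` implementation, whose body is not shown in this diff; `cover_uv_rect` is a hypothetical helper, and the only detail borrowed from the text is the `(u_min, v_min, u_max, v_max)` ordering of the UV rect.

```python
def cover_uv_rect(target_w, target_h, tex_w, tex_h):
    """Centered UV sub-rect that fills the target and crops the overflow.

    Returns (u_min, v_min, u_max, v_max) in normalized texture coordinates.
    """
    target_aspect = target_w / target_h
    tex_aspect = tex_w / tex_h
    if tex_aspect > target_aspect:
        # Texture is relatively wider than the target: crop left and right.
        visible = target_aspect / tex_aspect
        u_min = (1.0 - visible) / 2.0
        return (u_min, 0.0, u_min + visible, 1.0)
    # Texture is relatively taller (or equal): crop top and bottom.
    visible = tex_aspect / target_aspect
    v_min = (1.0 - visible) / 2.0
    return (0.0, v_min, 1.0, v_min + visible)

print(cover_uv_rect(100, 100, 200, 100))  # (0.25, 0.0, 0.75, 1.0)
print(cover_uv_rect(200, 100, 100, 100))  # (0.0, 0.25, 1.0, 0.75)
```

Because the result is just a UV rect, the renderer needs no knowledge of the mode that produced it, which is the point the section makes.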
@@ -883,13 +899,13 @@ textures onto a free list that is processed in `r_end_frame`, not at the call si
 Clay's `RenderCommandType.Image` is handled by dereferencing `imageData: rawptr` as a pointer to a
 `Clay_Image_Data` struct containing a `Texture_Id`, `Fit_Mode`, and tint color. Routing mirrors the
-existing rectangle handling: `fit_params` computes UVs from the fit mode, then
-`rectangle_texture` is called with the appropriate radii (zero for sharp corners, per-corner values
-from Clay's `cornerRadius` otherwise).
+existing rectangle handling: `fit_params` computes UVs from the fit mode, then `rectangle` is
+called with a `Texture_Fill` brush and the appropriate radii (zero for sharp corners, per-corner
+values from Clay's `cornerRadius` otherwise).

 #### Deferred features

-The following are plumbed in the descriptor but not implemented in phase 1:
+The following are plumbed in `Texture_Desc` but not yet implemented:

 - **Mipmaps**: `Texture_Desc.mip_levels` field exists; generation via SDL3 deferred.
 - **Compressed formats**: `Texture_Desc.format` accepts BC/ASTC; upload path deferred.
@@ -897,7 +913,6 @@ The following are plumbed in the descriptor but not implemented in phase 1:
 - **3D textures, arrays, cube maps**: `Texture_Desc.type` and `depth_or_layers` fields exist.
 - **Additional samplers**: anisotropic, trilinear, clamp-to-border — additive enum values.
 - **Atlas packing**: internal optimization for sub-batch coalescing; invisible to callers.
-- **Per-shape texture variants**: `circle_texture`, `ellipse_texture`, `polygon_texture` — potential future additions, following the existing naming convention.

 **References:**