DPI scaling fixes
+53
-30
@@ -5,14 +5,27 @@ Clay UI integration.
 
 ## Current state
 
-The renderer uses a single unified `Core_2D` (`TRIANGLELIST` pipeline) with two submission
-modes dispatched by a push constant:
+The renderer uses a single unified `Core_2D` (`TRIANGLELIST` pipeline) with three submission
+modes dispatched by a push constant. The split is by **vertex coordinate space**, not by what the
+fragment shader does — modes 0 and 2 share the same fragment-shader path (kind 0) and differ only
+in whether the vertex shader applies `dpi_scale` to incoming positions:
 
-- **Mode 0 (Tessellated):** Vertex buffer contains real geometry. Used for text (indexed draws into
-SDL_ttf atlas textures), single-pixel points (`tess.pixel`), arbitrary user geometry
-(`tess.triangle`, `tess.triangle_aa`, `tess.triangle_lines`, `tess.triangle_fan`,
-`tess.triangle_strip`), and any raw vertex geometry submitted via `prepare_shape`. The fragment
-shader premultiplies the texture sample (`t.rgb *= t.a`) and computes `out = color * t`.
+- **Mode 0 (Tessellated):** Vertex buffer contains real geometry in _logical_ pixels. The vertex
+shader scales by `dpi_scale` before projecting. Used for single-pixel points (`tess.pixel`),
+arbitrary user geometry (`tess.triangle`, `tess.triangle_aa`, `tess.triangle_lines`,
+`tess.triangle_fan`, `tess.triangle_strip`), and any raw vertex geometry submitted via
+`prepare_shape`. The fragment shader premultiplies the texture sample (`t.rgb *= t.a`) and
+computes `out = color * t`.
+
+- **Mode 2 (Text):** Vertex buffer contains real geometry in _physical_ pixels. SDL_ttf's GPU text
+engine lays out glyphs in physical pixels (`TTF_SetFontSizeDPI` is called with `72 * dpi_scale`),
+so `prepare_text` adds an anchor offset that is itself snapped to integer physical pixels for
+atlas-aligned bilinear sampling, then writes vertices straight to the buffer. The vertex shader
+must NOT rescale these vertices. Same fragment-shader kind as Tessellated; same indexed draws
+into SDL_ttf atlas textures; the only difference is the coordinate space of the input. Mode 2
+exists because integer-physical-pixel snapping is the load-bearing property of crisp glyph
+rendering and CPU is the only place that snap can happen once-per-text-element instead of
+per-vertex.
 
 - **Mode 1 (SDF):** A static 6-vertex unit-quad buffer is drawn instanced, with per-primitive
 `Core_2D_Primitive` structs (96 bytes each) uploaded each frame to a GPU storage buffer. The vertex
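Review note (not part of the commit): the coordinate-space split in this hunk can be modelled on the CPU as below. This is a minimal C sketch, not the project's shader or its Odin code. `Core2DMode`, `Vec2`, and `to_physical_px` are hypothetical names, and the SDF branch is reduced to the same scale step as Tessellated, on the assumption that `bounds` arrives in logical pixels as the later hunk states.

```c
#include <assert.h>

/* Hypothetical mirror of the three submission modes named in the hunk. */
typedef enum {
    MODE_TESSELLATED = 0, /* logical pixels in, scaled by dpi_scale */
    MODE_SDF         = 1, /* logical pixels in, scaled by dpi_scale */
    MODE_TEXT        = 2  /* physical pixels in, passed through     */
} Core2DMode;

typedef struct { float x, y; } Vec2;

/* Returns the position in physical pixels, before projection.
 * Only Text mode skips the multiply, because its vertices were
 * already laid out (and snapped) in physical pixels on the CPU. */
static Vec2 to_physical_px(Vec2 pos, Core2DMode mode, float dpi_scale) {
    if (mode == MODE_TEXT) {
        return pos;
    }
    Vec2 out = { pos.x * dpi_scale, pos.y * dpi_scale };
    return out;
}
```

With `dpi_scale = 2`, a Tessellated vertex at (10, 4) lands at (20, 8), while a Text vertex at (10, 4) stays put. Scaling the text vertex a second time is the error this mode split is meant to prevent.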
@@ -43,8 +56,8 @@ in the pipeline plan below for the full cliff/margin analysis and SBC architectu
 The fragment shader's estimated peak footprint is ~22–26 fp32 VGPRs (~16–22 fp16 VGPRs on architectures
 with native mediump) via manual live-range analysis. The dominant peak is the Ring_Arc kind path
 (wedge normals + inner/outer radii + dot-product temporaries live simultaneously with carried state
-like `f_color`, `f_uv_rect`/`f_effects`, and `half_size`). RRect is 1–2 regs lower (`corner_radii` vec4
-replaces the separate inner/outer + normal pairs). NGon and Ellipse are lighter still. Real compilers
+like `f_color`, `f_uv_rect`/`f_effects`, and `half_size_ppx`). RRect is 1–2 regs lower
+(`corner_radii_ppx` vec4 replaces the separate inner/outer + normal pairs). NGon and Ellipse are lighter still. Real compilers
 apply live-range coalescing, mediump-to-fp16 promotion, and rematerialization that typically shave
 2–4 regs from hand-counted estimates — the conservative 26-reg upper bound is expected to compile
 down to within the 24-register budget, but this must be verified with `malioc` (see "Verifying
@@ -432,22 +445,32 @@ our design:
 
 ### Main pipeline: SDF + tessellated (unified)
 
-The main pipeline serves two submission modes through a single `TRIANGLELIST` pipeline and a single
-vertex input layout, distinguished by a `mode` field in the `Vertex_Uniforms_2D` push constant
-(`Core_2D_Mode.Tessellated = 0`, `Core_2D_Mode.SDF = 1`), pushed per draw call via `push_globals`. The
-vertex shader branches on this uniform to select the tessellated or SDF code path.
+The main pipeline serves three submission modes through a single `TRIANGLELIST` pipeline and a
+single vertex input layout, distinguished by a `mode` field in the `Vertex_Uniforms_2D` push
+constant (`Core_2D_Mode.Tessellated = 0`, `Core_2D_Mode.SDF = 1`, `Core_2D_Mode.Text = 2`), pushed
+per draw call via `push_globals`. The vertex shader branches on this uniform to select the
+appropriate code path.
 
-- **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry. Used for text
-(SDL_ttf atlas sampling), triangles, triangle fans/strips, single-pixel points, and any
-user-provided raw vertex geometry.
+- **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry in _logical_
+pixels. Vertex shader scales positions by `dpi_scale`. Used for triangles, triangle fans/strips,
+single-pixel points, and any user-provided raw vertex geometry.
 - **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of
 `Core_2D_Primitive` structs, drawn instanced. Used for all shapes with closed-form signed distance
-functions.
+functions. `Core_2D_Primitive.bounds` is in logical pixels; the vertex shader scales by
+`dpi_scale`.
+- **Text mode** (`mode = 2`): direct vertex buffer with explicit geometry in _physical_ pixels.
+Vertex shader does NOT scale. Used for SDL_ttf atlas sampling. The CPU-side anchor snap to
+integer physical pixels (`prepare_text`/`prepare_text_transformed`) is what produces crisp glyphs
+— sub-pixel anchors blur via the bilinear sampler. Mode 2 shares the fragment-shader path with
+Tessellated (kind 0), so the only divergence between text and shape rasterization is the vertex
+shader's `* dpi_scale` step.
 
-Both modes use the same fragment shader. The fragment shader checks `Shape_Kind` (low byte of
-`Core_2D_Primitive.flags`): kind 0 (`Solid`) is the tessellated path, which premultiplies the texture
-sample and computes `out = color * t`; kinds 1–4 dispatch to one of four SDF functions (RRect, NGon,
-Ellipse, Ring_Arc) and apply gradient/texture/outline/solid color based on `Shape_Flags` bits.
+All three modes use the same fragment shader. Modes 0 (Tessellated) and 2 (Text) take the same
+fragment-shader path (kind 0), which premultiplies the texture sample and computes `out = color * t`;
+they differ only in the vertex shader (whether positions are pre-scaled to physical pixels). Mode 1
+(SDF) checks `Shape_Kind` (low byte of `Core_2D_Primitive.flags`): kinds 1–4 dispatch to one of four
+SDF functions (RRect, NGon, Ellipse, Ring_Arc) and apply gradient/texture/outline/solid color based
+on `Shape_Flags` bits.
 
 #### Why SDF for shapes
 
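Review note (not part of the commit): the CPU-side anchor snap that the Text mode bullet relies on might look like the sketch below. It is a C illustration under stated assumptions, not the project's `prepare_text`. The function names are hypothetical, and round-half-away-from-zero is an assumed rounding rule, since the diff only says the anchor is snapped to integer physical pixels.

```c
#include <assert.h>

/* Round to the nearest integer, halves away from zero.
 * Written with integer casts so the sketch has no libm dependency.
 * Assumes |v| is small enough to fit in a long (true for screen coords). */
static float snap_to_pixel(float v) {
    return (v >= 0.0f) ? (float)(long)(v + 0.5f)
                       : -(float)(long)(-v + 0.5f);
}

/* Converts one logical-pixel anchor coordinate to physical pixels and
 * snaps it once per text element. Glyph vertices from the text engine are
 * then offset by this integer anchor, so texels stay aligned with pixels. */
static float snap_anchor_ppx(float anchor_logical, float dpi_scale) {
    return snap_to_pixel(anchor_logical * dpi_scale);
}
```

For example, a logical anchor of 10.3 at `dpi_scale = 1.5` is 15.45 physical pixels and snaps to 15. Without the snap, every glyph quad would sit 0.45 pixels off the atlas grid and the bilinear sampler would blend neighbouring texels.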
@@ -495,9 +518,9 @@ Compared to encoding per-primitive data in vertex attributes (the "fat vertex" a
 buffer instancing eliminates the 4–6× data duplication across quad corners. A rounded rectangle costs
 96 bytes instead of 4 vertices × 60+ bytes = 240+ bytes.
 
-The tessellated path retains the existing direct vertex buffer layout (20 bytes/vertex, no storage
-buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every invocation
-in a draw call has the same mode — so it is effectively free on all modern GPUs.
+The tessellated and text paths retain the existing direct vertex buffer layout (20 bytes/vertex, no
+storage buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every
+invocation in a draw call has the same mode — so it is effectively free on all modern GPUs.
 
 #### Shape kinds and SDF dispatch
 
@@ -715,11 +738,11 @@ Clay has no notion of backdrops. The integration uses Clay's only extension poin
 
 ```
 Backdrop_Marker :: struct {
-    magic:      u32, // BACKDROP_MARKER_MAGIC (0x42445054, 'BDPT')
-    sigma:      f32,
-    tint:       Color,
-    radii:      Rectangle_Radii,
-    feather_px: f32,
+    magic:       u32, // BACKDROP_MARKER_MAGIC (0x42445054, 'BDPT')
+    sigma:       f32,
+    tint:        Color,
+    radii:       Rectangle_Radii,
+    feather_ppx: f32,
 }
 ```
 
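Review note (not part of the commit): the magic value in the comment can be checked by packing the four ASCII characters most-significant byte first. The helper name `fourcc_be` is hypothetical, and this C sketch only confirms that `0x42445054` spells `'BDPT'`. It makes no claim about how the project defines `BACKDROP_MARKER_MAGIC`.

```c
#include <assert.h>
#include <stdint.h>

/* Packs four characters into a u32 with the first character in the
 * most significant byte, matching how 0x42445054 reads as 'BDPT'. */
static uint32_t fourcc_be(char a, char b, char c, char d) {
    return ((uint32_t)(unsigned char)a << 24) |
           ((uint32_t)(unsigned char)b << 16) |
           ((uint32_t)(unsigned char)c << 8)  |
            (uint32_t)(unsigned char)d;
}
```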
@@ -762,7 +785,7 @@ Core_2D_Primitive :: struct {
     flags:       u32,              // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
     rotation_sc: u32,              // 24: packed f16 pair (sin, cos). Requires .Rotated flag.
     _pad:        f32,              // 28: reserved for future use
-    params:      Shape_Params,     // 32: per-kind params union (half_feather, radii, etc.) (32 bytes)
+    params:      Shape_Params,     // 32: per-kind params union (half_feather_ppx, radii_ppx, etc.) (32 bytes)
     uv_rect:     [4]f32,           // 64: texture UV coordinates. Read when .Textured.
     effects:     Gradient_Outline, // 80: gradient and/or outline parameters (16 bytes).
 }
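Review note (not part of the commit): the byte offsets in the struct comments can be cross-checked against the 96-byte size quoted earlier in the document. The C mirror below is a sketch under assumptions. The first 20 bytes are not visible in this hunk, so they are an opaque placeholder. `Shape_Params` and `Gradient_Outline` are stood in by byte arrays of the sizes the comments give, and the type name `Core2DPrimitiveMirror` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  head[20];    /*  0: fields not shown in the hunk (opaque placeholder) */
    uint32_t flags;       /* 20: low byte = Shape_Kind, bits 8+ = Shape_Flags      */
    uint32_t rotation_sc; /* 24: packed f16 pair (sin, cos)                        */
    float    pad;         /* 28: reserved                                          */
    uint8_t  params[32];  /* 32: stand-in for the Shape_Params union               */
    float    uv_rect[4];  /* 64: texture UV coordinates                            */
    uint8_t  effects[16]; /* 80: stand-in for Gradient_Outline                     */
} Core2DPrimitiveMirror;

/* Compile-time checks that the commented offsets are mutually consistent. */
_Static_assert(offsetof(Core2DPrimitiveMirror, flags) == 20, "flags offset");
_Static_assert(offsetof(Core2DPrimitiveMirror, rotation_sc) == 24, "rotation_sc offset");
_Static_assert(offsetof(Core2DPrimitiveMirror, pad) == 28, "pad offset");
_Static_assert(offsetof(Core2DPrimitiveMirror, params) == 32, "params offset");
_Static_assert(offsetof(Core2DPrimitiveMirror, uv_rect) == 64, "uv_rect offset");
_Static_assert(offsetof(Core2DPrimitiveMirror, effects) == 80, "effects offset");
_Static_assert(sizeof(Core2DPrimitiveMirror) == 96, "primitive is 96 bytes");
```

The rename in this hunk touches only the comment, so the layout is unchanged: 80 + 16 still gives the 96 bytes per primitive that the storage-buffer upload assumes.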