Compare commits

5 Commits

Author SHA1 Message Date
zack e36229a3ef Improved consistency with naming of init / create / destroy and when to propagate allocation errors and (#18)
Co-authored-by: Zachary Levy <zachary@sunforge.is>
Reviewed-on: #18
2026-04-24 21:46:21 +00:00
zack bca19277b3 draw-improvements (#17)
Major rework to draw rendering system. We are making an SDF-first rendering system with tessellated stuff only as a fallback strategy for specific situations where SDF is particularly poorly suited

Co-authored-by: Zachary Levy <zachary@sunforge.is>
Reviewed-on: #17
2026-04-24 07:57:44 +00:00
zack 37da2ea068 Tweaked general setup tracking allocator and added logger (#11)
Co-authored-by: Zachary Levy <zachary@sunforge.is>
Reviewed-on: #11
2026-04-22 06:03:10 +00:00
zack cfd9e504e1 vendor-cleanup (#10)
Major rework of libusb and lmdb bindings

Co-authored-by: Zachary Levy <zachary@sunforge.is>
Reviewed-on: #10
2026-04-22 04:47:59 +00:00
zack 0d424cbd6e Texture Rendering (#9)
Co-authored-by: Zachary Levy <zachary@sunforge.is>
Reviewed-on: #9
2026-04-22 00:05:08 +00:00
33 changed files with 4667 additions and 3017 deletions
+5
@@ -70,6 +70,11 @@
"command": "odin run draw/examples -debug -out=out/debug/draw-examples -- hellope-custom", "command": "odin run draw/examples -debug -out=out/debug/draw-examples -- hellope-custom",
"cwd": "$ZED_WORKTREE_ROOT", "cwd": "$ZED_WORKTREE_ROOT",
}, },
{
"label": "Run draw textures example",
"command": "odin run draw/examples -debug -out=out/debug/draw-examples -- textures",
"cwd": "$ZED_WORKTREE_ROOT",
},
{ {
"label": "Run qrcode basic example", "label": "Run qrcode basic example",
"command": "odin run qrcode/examples -debug -out=out/debug/qrcode-examples -- basic", "command": "odin run qrcode/examples -debug -out=out/debug/qrcode-examples -- basic",
+397 -134
@@ -9,35 +9,51 @@ The renderer uses a single unified `Pipeline_2D_Base` (`TRIANGLELIST` pipeline)
modes dispatched by a push constant: modes dispatched by a push constant:
- **Mode 0 (Tessellated):** Vertex buffer contains real geometry. Used for text (indexed draws into - **Mode 0 (Tessellated):** Vertex buffer contains real geometry. Used for text (indexed draws into
SDL_ttf atlas textures), axis-aligned sharp-corner rectangles (already optimal as 2 triangles), SDL_ttf atlas textures), single-pixel points (`tes_pixel`), arbitrary user geometry (`tes_triangle`,
per-vertex color gradients (`rectangle_gradient`, `circle_gradient`), angular-clipped circle `tes_triangle_fan`, `tes_triangle_strip`), and shapes without a closed-form rounded-rectangle
sectors (`circle_sector`), and arbitrary user geometry (`triangle`, `triangle_fan`, reduction: ellipses (`tes_ellipse`), regular polygons (`tes_polygon`), and circle sectors
`triangle_strip`). The fragment shader computes `out = color * texture(tex, uv)`. (`tes_sector`). The fragment shader computes `out = color * texture(tex, uv)`.
- **Mode 1 (SDF):** A static 6-vertex unit-quad buffer is drawn instanced, with per-primitive - **Mode 1 (SDF):** A static 6-vertex unit-quad buffer is drawn instanced, with per-primitive
`Primitive` structs uploaded each frame to a GPU storage buffer. The vertex shader reads `Primitive` structs (80 bytes each) uploaded each frame to a GPU storage buffer. The vertex shader
`primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners + primitive reads `primitives[gl_InstanceIndex]`, computes world-space position from unit quad corners +
bounds. The fragment shader dispatches on `Shape_Kind` to evaluate the correct signed distance primitive bounds. The fragment shader always evaluates `sdRoundedBox` — there is no per-primitive
function analytically. kind dispatch.
Seven SDF shape kinds are implemented: The SDF path handles all shapes that are algebraically reducible to a rounded rectangle:
1. **RRect** — rounded rectangle with per-corner radii (iq's `sdRoundedBox`) - **Rounded rectangles** — per-corner radii via `sdRoundedBox` (iq). Covers filled, stroked,
2. **Circle** — filled or stroked circle textured, and gradient-filled rectangles.
3. **Ellipse**: exact signed-distance ellipse (iq's iterative `sdEllipse`) - **Circles**: uniform radii equal to half-size. Covers filled, stroked, and radial-gradient circles.
4. **Segment** — capsule-style line segment with rounded caps - **Line segments / capsules** — rotated RRect with uniform radii equal to half-thickness (stadium shape).
5. **Ring_Arc** — annular ring with angular clipping for arcs - **Full rings / annuli** — stroked circle (mid-radius with stroke thickness = outer - inner).
6. **NGon** — regular polygon with arbitrary side count and rotation
7. **Polyline** — decomposed into independent `Segment` primitives per adjacent point pair
All SDF shapes support fill and stroke modes via `Shape_Flags`, and produce mathematically exact All SDF shapes support fill, stroke, solid color, bilinear 4-corner gradients, radial 2-color
curves with analytical anti-aliasing via `smoothstep` — no tessellation, no piecewise-linear gradients, and texture fills via `Shape_Flags`. Gradient colors are packed into the same 16 bytes as
approximation. A rounded rectangle is 1 primitive (64 bytes) instead of ~250 vertices (~5000 bytes). the texture UV rect via a `Uv_Or_Gradient` raw union — zero size increase to the 80-byte `Primitive`
struct. Gradient and texture are mutually exclusive.
All SDF shapes produce mathematically exact curves with analytical anti-aliasing via `smoothstep`,
with no tessellation and no piecewise-linear approximation. A rounded rectangle is 1 primitive (80 bytes)
instead of ~250 vertices (~5000 bytes).
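
The folding of circles and capsules into a rounded rectangle can be checked numerically. The following is a minimal Python sketch, not the library's shader: `sd_rounded_box` is a port of iq's reference `sdRoundedBox` (its corner ordering is iq's, which may differ from levlib's), and `coverage` with its one-pixel `aa` width is an assumed illustration of `smoothstep` anti-aliasing.

```python
import math

def sd_rounded_box(p, half_size, radii):
    """Signed distance from p to a rounded box centered at the origin.

    Port of iq's sdRoundedBox. radii follows iq's ordering:
    (top_right, bottom_right, top_left, bottom_left).
    """
    px, py = p
    bx, by = half_size
    # Pick the corner radius for the quadrant that contains p.
    rx, ry = (radii[0], radii[1]) if px > 0.0 else (radii[2], radii[3])
    r = rx if py > 0.0 else ry
    qx = abs(px) - bx + r
    qy = abs(py) - by + r
    outside = math.hypot(max(qx, 0.0), max(qy, 0.0))
    return min(max(qx, qy), 0.0) + outside - r

def smoothstep(e0, e1, x):
    t = min(max((x - e0) / (e1 - e0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def coverage(d, aa=1.0):
    """Alpha from distance: 1 inside, 0 outside, smooth ramp over `aa` pixels (assumed width)."""
    return 1.0 - smoothstep(-0.5 * aa, 0.5 * aa, d)

# Circle of radius 5: half_size = (r, r), all radii = r.
circle_edge = sd_rounded_box((3.0, 4.0), (5.0, 5.0), (5.0,) * 4)
# Capsule of length 20 and thickness 4: radii = half-thickness.
capsule_tip = sd_rounded_box((10.0, 0.0), (10.0, 2.0), (2.0,) * 4)
```

With uniform radii equal to the half-size, the box term vanishes and the function reduces to `length(p) - r`, which is why a circle needs no separate SDF.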
The fragment shader's estimated register footprint is ~20–23 VGPRs via static live-range analysis.
RRect and Ring_Arc are roughly tied at peak pressure — RRect carries `corner_radii` (4 regs) plus
`sdRoundedBox` temporaries, Ring_Arc carries wedge normals plus dot-product temporaries. Both land
comfortably under Mali Valhall's 32-register occupancy cliff (G57/G77/G78 and later) and well under
desktop limits. On older Bifrost Mali (G71/G72/G76, 16-register cliff) either shape kind may incur
partial occupancy reduction. These estimates are hand-counted; exact numbers require `malioc` or
Radeon GPU Analyzer against the compiled SPIR-V.
MSAA is opt-in (default `._1`, no MSAA) via `Init_Options.msaa_samples`. SDF rendering does not MSAA is opt-in (default `._1`, no MSAA) via `Init_Options.msaa_samples`. SDF rendering does not
benefit from MSAA because fragment coverage is computed analytically. MSAA remains useful for text benefit from MSAA because fragment coverage is computed analytically. MSAA remains useful for text
glyph edges and tessellated user geometry if desired. glyph edges and tessellated user geometry if desired.
All public drawing procs use prefixed names for clarity: `sdf_*` for SDF-path shapes, `tes_*` for
tessellated-path shapes. Proc groups provide a single entry point per shape concept (e.g.,
`sdf_rectangle` dispatches to `sdf_rectangle_solid` or `sdf_rectangle_gradient` based on argument
count).
## 2D rendering pipeline plan ## 2D rendering pipeline plan
This section documents the planned architecture for levlib's 2D rendering system. The design is driven This section documents the planned architecture for levlib's 2D rendering system. The design is driven
@@ -47,68 +63,107 @@ primitives and effects can be added to the library without architectural changes
### Overview: three pipelines ### Overview: three pipelines
The 2D renderer will use three GPU pipelines, split by **register pressure compatibility** and The 2D renderer uses three GPU pipelines, split by **register pressure** (main vs effects) and
**render-state requirements**: **render-pass structure** (everything vs backdrop):
1. **Main pipeline**: shapes (SDF and tessellated) and text. Low register footprint (~18–22 1. **Main pipeline**: shapes (SDF and tessellated), text, and textured rectangles. Low register
registers per thread). Runs at high GPU occupancy. Handles 90%+ of all fragments in a typical footprint (~18–24 registers per thread). Runs at full GPU occupancy on every architecture.
frame. Handles 90%+ of all fragments in a typical frame.
2. **Effects pipeline** — drop shadows, inner shadows, outer glow, and similar ALU-bound blur 2. **Effects pipeline** — drop shadows, inner shadows, outer glow, and similar ALU-bound blur
effects. Medium register footprint (~48–60 registers). Each effects primitive includes the base effects. Medium register footprint (~48–60 registers). Each effects primitive includes the base
shape's SDF so that it can draw both the effect and the shape in a single fragment pass, avoiding shape's SDF so that it can draw both the effect and the shape in a single fragment pass, avoiding
redundant overdraw. redundant overdraw. Separated from the main pipeline to protect main-pipeline occupancy on
low-end hardware (see register analysis below).
3. **Backdrop-effects pipeline** — frosted glass, refraction, and any effect that samples the current 3. **Backdrop pipeline** — frosted glass, refraction, and any effect that samples the current render
render target as input. High register footprint (~70–80 registers) and structurally requires a target as input. Implemented as a multi-pass sequence (downsample, separable blur, composite),
`CopyGPUTextureToTexture` from the render target before drawing. Separated both for register where each individual pass has a low-to-medium register footprint (~15–40 registers). Separated
pressure and because the texture-copy requirement forces a render-pass-level state change. from the other pipelines because it structurally requires ending the current render pass and
copying the render target before any backdrop-sampling fragment can execute — a command-buffer-
level boundary that cannot be avoided regardless of shader complexity.
A typical UI frame with no effects uses 1 pipeline bind and 0 switches. A frame with drop shadows A typical UI frame with no effects uses 1 pipeline bind and 0 switches. A frame with drop shadows
uses 2 pipelines and 1 switch. A frame with shadows and frosted glass uses all 3 pipelines and 2 uses 2 pipelines and 1 switch. A frame with shadows and frosted glass uses all 3 pipelines and 2
switches plus 1 texture copy. At ~5μs per pipeline bind on modern APIs, worst-case switching overhead switches plus 1 texture copy. At ~1–5μs per pipeline bind on modern APIs, worst-case switching
is under 0.15% of an 8.3ms (120 FPS) frame budget. overhead is negligible relative to an 8.3ms (120 FPS) frame budget.
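
As a rough check of the switching-overhead claim, the arithmetic can be sketched in Python. The 5μs per-bind cost is an assumption (an upper-end figure of the kind quoted in this document), and the texture copy is deliberately left out because its cost depends on resolution and hardware.

```python
FRAME_BUDGET_US = 1_000_000 / 120  # 120 FPS, about 8333 microseconds

def bind_overhead_fraction(pipeline_binds, cost_per_bind_us=5.0):
    """Fraction of the frame budget spent on pipeline binds (assumed per-bind cost)."""
    return pipeline_binds * cost_per_bind_us / FRAME_BUDGET_US

# 1 bind: no effects; 2: drop shadows; 3: shadows plus frosted glass.
for binds in (1, 2, 3):
    print(f"{binds} bind(s): {bind_overhead_fraction(binds) * 100:.2f}% of frame budget")
```

Even the worst case of three binds stays below a fifth of a percent of the budget under this assumption.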
### Why three pipelines, not one or seven ### Why three pipelines, not one or seven
The natural question is whether we should use a single unified pipeline (fewer state changes, simpler The natural question is whether we should use a single unified pipeline (fewer state changes, simpler
code) or many per-primitive-type pipelines (no branching overhead, lean per-shader register usage). code) or many per-primitive-type pipelines (no branching overhead, lean per-shader register usage).
The dominant cost factor is **GPU register pressure**, not pipeline switching overhead or fragment #### Main/effects split: register pressure
shader branching. A GPU shader core has a fixed register pool shared among all concurrent threads. The
compiler allocates registers pessimistically based on the worst-case path through the shader. If the
shader contains both a 20-register RRect SDF and a 72-register frosted-glass blur, _every_ fragment
— even trivial RRects — is allocated 72 registers. This directly reduces **occupancy** (the number of
warps that can run simultaneously), which reduces the GPU's ability to hide memory latency.
Concrete example on a modern NVIDIA SM with 65,536 registers: A GPU shader core has a fixed register pool shared among all concurrent threads. The compiler
allocates registers pessimistically based on the worst-case path through the shader. If the shader
contains both a 20-register RRect SDF and a 48-register drop-shadow blur, _every_ fragment — even
trivial RRects — is allocated 48 registers. This directly reduces **occupancy** (the number of
warps/wavefronts that can run simultaneously), which reduces the GPU's ability to hide memory
latency.
| Register allocation | Max concurrent threads | Occupancy | Each GPU architecture has a **register cliff** — a threshold above which occupancy starts dropping.
| ------------------------- | ---------------------- | --------- | Below the cliff, adding registers has zero occupancy cost.
| 20 regs (RRect only) | 3,276 | ~100% |
| 48 regs (+ drop shadow) | 1,365 | ~42% |
| 72 regs (+ frosted glass) | 910 | ~28% |
For a 4K frame (3840×2160) at 1.5× overdraw (~12.4M fragments), running all fragments at 28% On consumer Ampere/Ada GPUs (RTX 30xx/40xx, 65,536 regs/SM, max 1,536 threads/SM, cliff at ~43 regs):
occupancy instead of 100% roughly triples fragment shading time. At 4K this is severe: if the main
pipeline's fragment work at full occupancy takes ~2ms, a single unified shader containing the glass
branch would push it to ~6ms — consuming 72% of the 8.3ms budget available at 120 FPS and leaving
almost nothing for CPU work, uploads, and presentation. This is a per-frame multiplier, not a
per-primitive cost — it applies even when the heavy branch is never taken.
The three-pipeline split groups primitives by register footprint so that: | Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy |
| ------------------------ | ------------------- | ------------------ | --------- |
| ~16 regs (main pipeline) | 4,096 | 1,536 | 100% |
| 32 regs | 2,048 | 1,536 | 100% |
| 48 regs (effects) | 1,365 | 1,365 | ~89% |
- Main pipeline (~20 regs): 90%+ of fragments run at near-full occupancy. On Volta/A100 GPUs (65,536 regs/SM, max 2,048 threads/SM, cliff at ~32 regs):
- Effects pipeline (~55 regs): shadow/glow fragments run at moderate occupancy; unavoidable given the
blur math complexity.
- Backdrop-effects pipeline (~75 regs): glass fragments run at low occupancy; also unavoidable, and
structurally separated anyway by the texture-copy requirement.
This avoids the register-pressure tax of a single unified shader while keeping pipeline count minimal | Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy |
(3 vs. Zed GPUI's 7). The effects that drag occupancy down are isolated to the fragments that | ------------------------ | ------------------- | ------------------ | --------- |
actually need them. | ~16 regs (main pipeline) | 4,096 | 2,048 | 100% |
| 32 regs | 2,048 | 2,048 | 100% |
| 48 regs (effects) | 1,365 | 1,365 | ~67% |
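
The occupancy figures in the two NVIDIA tables follow from integer division of the register file by the per-thread allocation, capped at the hardware thread limit. A minimal Python sketch of that arithmetic follows. Like the tables, it ignores warp-granular register allocation and other occupancy limiters, so treat the results as estimates rather than profiler output.

```python
def occupancy(regs_per_thread, regs_per_sm=65_536, max_threads_per_sm=1_536):
    """Return (register-limited threads, hardware-capped threads, occupancy fraction)."""
    reg_limited = regs_per_sm // regs_per_thread
    actual = min(reg_limited, max_threads_per_sm)
    return reg_limited, actual, actual / max_threads_per_sm

# Consumer Ampere/Ada: 1,536 threads per SM.
for regs in (16, 32, 48):
    limited, actual, occ = occupancy(regs)
    print(f"Ampere/Ada  {regs:2d} regs: {limited:5d} -> {actual:5d} threads, {occ:.0%}")

# Volta/A100: 2,048 threads per SM.
for regs in (16, 32, 48):
    limited, actual, occ = occupancy(regs, max_threads_per_sm=2_048)
    print(f"Volta/A100  {regs:2d} regs: {limited:5d} -> {actual:5d} threads, {occ:.0%}")
```

The register cliff is the allocation at which `reg_limited` first drops below the hardware cap: 65,536 / 1,536 ≈ 42.7 registers on consumer Ampere/Ada and exactly 32 on Volta/A100.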
**Why not per-primitive-type pipelines (GPUI's approach)?** Zed's GPUI uses 7 separate shader pairs: On low-end mobile (ARM Mali Bifrost/Valhall, 64 regs/thread, cliff fixed at 32 regs):
| Register allocation | Occupancy |
| -------------------- | -------------------------- |
| 0–32 regs (main)     | 100% (full thread count)   |
| 33–64 regs (effects) | ~50% (thread count halves) |
Mali's cliff at 32 registers is the binding constraint. On desktop the occupancy difference between
20 and 48 registers is modest (89–100%); on Mali it is a hard 2× throughput reduction. The
main/effects split protects 90%+ of a frame's fragments (shapes, text, textures) from the effects
pipeline's register cost.
For the effects pipeline's drop-shadow shader — erf-approximation blur math with several texture
fetches — 50% occupancy on Mali roughly halves throughput. At 4K with 1.5× overdraw (~12.4M
fragments), a single unified shader containing the shadow branch would cost ~4ms instead of ~2ms on
low-end mobile. This is a per-frame multiplier even when the heavy branch is never taken, because the
compiler allocates registers for the worst-case path.
All main-pipeline members (SDF shapes, tessellated geometry, text, textured rectangles) cluster at
12–24 registers, below the cliff on every architecture, so unifying them costs nothing in
occupancy.
**Note on Apple M3+ GPUs:** Apple's M3 introduces Dynamic Caching (register file virtualization),
which allocates registers at runtime based on actual usage rather than worst-case. This weakens the
static register-pressure argument on M3 and later, but the split remains useful for isolating blur
ALU complexity and keeping the backdrop texture-copy out of the main render pass.
#### Backdrop split: render-pass structure
The backdrop pipeline (frosted glass, refraction, mirror surfaces) is separated for a structural
reason unrelated to register pressure. Before any backdrop-sampling fragment can execute, the current
render target must be copied to a separate texture via `CopyGPUTextureToTexture` — a command-buffer-
level operation that requires ending the current render pass. This boundary exists regardless of
shader complexity and cannot be optimized away.
The backdrop pipeline's individual shader passes (downsample, separable blur, composite) are
register-light (~15–40 regs each), so merging them into the effects pipeline would cause no occupancy
problem. But the render-pass boundary makes merging structurally impossible — effects draws happen
inside the main render pass, backdrop draws happen inside their own bracketed pass sequence.
#### Why not per-primitive-type pipelines (GPUI's approach)
Zed's GPUI uses 7 separate shader pairs:
quad, shadow, underline, monochrome sprite, polychrome sprite, path, surface. This eliminates all quad, shadow, underline, monochrome sprite, polychrome sprite, path, surface. This eliminates all
branching and gives each shader minimal register usage. Three concrete costs make this approach wrong branching and gives each shader minimal register usage. Three concrete costs make this approach wrong
for our use case: for our use case:
@@ -120,7 +175,7 @@ typical UI frame with 15 scissors and 3–4 primitive kinds per scissor, per-kin
~45–60 draw calls and pipeline binds; our unified approach produces ~15–20 draw calls and 1–5 ~45–60 draw calls and pipeline binds; our unified approach produces ~15–20 draw calls and 1–5
pipeline binds. At ~5μs each for CPU-side command encoding on modern APIs, per-kind splitting adds pipeline binds. At ~5μs each for CPU-side command encoding on modern APIs, per-kind splitting adds
375–500μs of CPU overhead per frame, **4.5–6% of an 8.3ms (120 FPS) budget**, with no 375–500μs of CPU overhead per frame, **4.5–6% of an 8.3ms (120 FPS) budget**, with no
compensating GPU-side benefit, because the register-pressure savings within the simple-SDF tier are compensating GPU-side benefit, because the register-pressure savings within the simple-SDF range are
negligible (all members cluster at 12–22 registers). negligible (all members cluster at 12–22 registers).
**Z-order preservation forces the API to expose layers.** With a single pipeline drawing all kinds **Z-order preservation forces the API to expose layers.** With a single pipeline drawing all kinds
@@ -159,10 +214,10 @@ in submission order:
~60 boundary warps at ~80 extra instructions each), unified divergence costs ~13μs — still 3.5× ~60 boundary warps at ~80 extra instructions each), unified divergence costs ~13μs — still 3.5×
cheaper than the pipeline-switching alternative. cheaper than the pipeline-switching alternative.
The split we _do_ perform (main / effects / backdrop-effects) is motivated by register-pressure tier The split we _do_ perform (main / effects / backdrop) is motivated by register-pressure boundaries
boundaries where occupancy differences are catastrophic at 4K (see numbers above). Within a tier, and structural render-pass requirements (see analysis above). Within a pipeline, unified is
unified is strictly better by every measure: fewer draw calls, simpler Z-order, lower CPU overhead, strictly better by every measure: fewer draw calls, simpler Z-order, lower CPU overhead, and
and negligible GPU-side branching cost. negligible GPU-side branching cost.
**References:** **References:**
@@ -172,6 +227,16 @@ and negligible GPU-side branching cost.
https://github.com/zed-industries/zed/blob/cb6fc11/crates/gpui/src/platform/mac/shaders.metal https://github.com/zed-industries/zed/blob/cb6fc11/crates/gpui/src/platform/mac/shaders.metal
- NVIDIA Nsight Graphics 2024.3 documentation on active-threads-per-warp and divergence analysis: - NVIDIA Nsight Graphics 2024.3 documentation on active-threads-per-warp and divergence analysis:
https://developer.nvidia.com/blog/optimize-gpu-workloads-for-graphics-applications-with-nvidia-nsight-graphics/ https://developer.nvidia.com/blog/optimize-gpu-workloads-for-graphics-applications-with-nvidia-nsight-graphics/
- NVIDIA Ampere GPU Architecture Tuning Guide — SM specs, max warps per SM (48 for cc 8.6, 64 for
cc 8.0), register file size (64K), occupancy factors:
https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html
- NVIDIA Ada GPU Architecture Tuning Guide — SM specs, max warps per SM (48 for cc 8.9):
https://docs.nvidia.com/cuda/ada-tuning-guide/index.html
- CUDA Occupancy Calculation walkthrough (register allocation granularity, worked examples):
https://leimao.github.io/blog/CUDA-Occupancy-Calculation/
- Apple M3 GPU architecture — Dynamic Caching (register file virtualization) eliminates static
worst-case register allocation, reducing the occupancy penalty for high-register shaders:
https://asplos.dev/wiki/m3-chip-explainer/gpu/index.html
### Why fragment shader branching is safe in this design ### Why fragment shader branching is safe in this design
@@ -212,11 +277,12 @@ Our design has two branch points:
Every thread in every warp of a draw call sees the same `mode` value. **Zero divergence, zero Every thread in every warp of a draw call sees the same `mode` value. **Zero divergence, zero
cost.** cost.**
2. **`shape_kind` (flat varying from storage buffer): which SDF to evaluate.** This is category 3. 2. **`flags` (flat varying from storage buffer): gradient/texture/stroke mode.** This is category 3.
The `flat` interpolation qualifier ensures that all fragments rasterized from one primitive's quad The `flat` interpolation qualifier ensures that all fragments rasterized from one primitive's quad
receive the same `shape_kind` value. Divergence can only occur at the **boundary between two receive the same flag bits. However, since the SDF path now evaluates only `sdRoundedBox` with no
adjacent primitives of different kinds**, where the rasterizer might pack fragments from both kind dispatch, the only flag-dependent branches are gradient vs. texture vs. solid color selection
primitives into the same warp. (all lightweight, 3–8 instructions per path). Divergence at primitive boundaries between
different flag combinations has negligible cost.
For category 3, the divergence analysis depends on primitive size: For category 3, the divergence analysis depends on primitive size:
@@ -233,9 +299,10 @@ For category 3, the divergence analysis depends on primitive size:
frame-level divergence is typically **1–3%** of all warps. frame-level divergence is typically **1–3%** of all warps.
At 1–3% divergence, the throughput impact is negligible. At 4K with 12.4M total fragments At 1–3% divergence, the throughput impact is negligible. At 4K with 12.4M total fragments
(~387,000 warps), divergent boundary warps number in the low thousands. Each divergent warp pays at (~387,000 warps), divergent boundary warps number in the low thousands. Without kind dispatch, the
most ~25 extra instructions (the cost of the longest untaken SDF branch). At ~12G instructions/sec longest untaken branch is the gradient evaluation (~8 instructions), not a different SDF function.
on a mid-range GPU, that totals ~4μs — under 0.05% of an 8.3ms (120 FPS) frame budget. This is Each divergent warp pays at most ~8 extra instructions. At ~12G instructions/sec on a mid-range GPU,
that totals ~1.3μs — under 0.02% of an 8.3ms (120 FPS) frame budget. This is
confirmed by production renderers that use exactly this pattern: confirmed by production renderers that use exactly this pattern:
- **vger / vger-rs** (Audulus): single pipeline, 11 primitive kinds dispatched by a `switch` on a - **vger / vger-rs** (Audulus): single pipeline, 11 primitive kinds dispatched by a `switch` on a
@@ -260,9 +327,10 @@ our design:
> have no per-fragment data-dependent branches in the main pipeline. > have no per-fragment data-dependent branches in the main pipeline.
2. **Branches where both paths are very long.** If both sides of a branch are 500+ instructions, 2. **Branches where both paths are very long.** If both sides of a branch are 500+ instructions,
divergent warps pay double a large cost. Our SDF functions are 10–25 instructions each. Even divergent warps pay double a large cost. Without kind dispatch, the SDF path always evaluates
fully divergent, the penalty is ~25 extra instructions, less than a single texture sample's `sdRoundedBox`; the only branches are gradient/texture/solid color selection at 3–8 instructions
latency. each. Even fully divergent, the penalty is ~8 extra instructions — less than a single texture
sample's latency.
3. **Branches that prevent compiler optimizations.** Some compilers cannot schedule instructions 3. **Branches that prevent compiler optimizations.** Some compilers cannot schedule instructions
across branch boundaries, reducing VLIW utilization on older architectures. Modern GPUs (NVIDIA across branch boundaries, reducing VLIW utilization on older architectures. Modern GPUs (NVIDIA
@@ -270,9 +338,9 @@ our design:
concern. concern.
4. **Register pressure from the union of all branches.** This is the real cost, and it is why we 4. **Register pressure from the union of all branches.** This is the real cost, and it is why we
split heavy effects (shadows, glass) into separate pipelines. Within the main pipeline, all SDF split heavy effects (shadows, glass) into separate pipelines. Within the main pipeline, the SDF
branches have similar register footprints (12–22 registers), so combining them causes negligible path has a single evaluation (`sdRoundedBox`) with flag-based color selection, clustering at ~15–18
occupancy loss. registers, so there is negligible occupancy loss.
**References:** **References:**
@@ -293,17 +361,19 @@ our design:
### Main pipeline: SDF + tessellated (unified) ### Main pipeline: SDF + tessellated (unified)
The main pipeline serves two submission modes through a single `TRIANGLELIST` pipeline and a single The main pipeline serves two submission modes through a single `TRIANGLELIST` pipeline and a single
vertex input layout, distinguished by a push constant: vertex input layout, distinguished by a mode marker in the `Primitive.flags` field (low byte:
0 = tessellated, 1 = SDF). The tessellated path sets this to 0 via zero-initialization in the vertex
shader; the SDF path sets it to 1 via `pack_flags`.
- **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry. Unchanged from - **Tessellated mode** (`mode = 0`): direct vertex buffer with explicit geometry. Used for text
today. Used for text (SDL_ttf atlas sampling), polylines, triangle fans/strips, gradient-filled (SDL_ttf atlas sampling), triangle fans/strips, ellipses, regular polygons, circle sectors, and
shapes, and any user-provided raw vertex geometry. any user-provided raw vertex geometry.
- **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of `Primitive` - **SDF mode** (`mode = 1`): shared unit-quad vertex buffer + GPU storage buffer of `Primitive`
structs, drawn instanced. Used for all shapes with closed-form signed distance functions. structs, drawn instanced. Used for all shapes with closed-form signed distance functions.
Both modes converge on the same fragment shader, which dispatches on a `shape_kind` discriminant Both modes use the same fragment shader. The fragment shader checks the mode marker: mode 0 computes
carried either in the vertex data (tessellated, always `Solid = 0`) or in the storage-buffer `out = color * texture(tex, uv)`; mode 1 always evaluates `sdRoundedBox` and applies
primitive struct (SDF modes). gradient/texture/solid color based on flag bits.
#### Why SDF for shapes #### Why SDF for shapes
@@ -342,49 +412,60 @@ SDF primitives are submitted via a GPU storage buffer indexed by `gl_InstanceInd
shader, rather than encoding per-primitive data redundantly in vertex attributes. This follows the shader, rather than encoding per-primitive data redundantly in vertex attributes. This follows the
pattern used by both Zed GPUI and vger-rs. pattern used by both Zed GPUI and vger-rs.
Each SDF shape is described by a single `Primitive` struct (~56 bytes) in the storage buffer. The Each SDF shape is described by a single `Primitive` struct (80 bytes) in the storage buffer. The
vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position from the unit vertex shader reads `primitives[gl_InstanceIndex]`, computes the quad corner position from the unit
vertex and the primitive's bounds, and passes shape parameters to the fragment shader via `flat` vertex and the primitive's bounds, and passes shape parameters to the fragment shader via `flat`
interpolated varyings. interpolated varyings.
Compared to encoding per-primitive data in vertex attributes (the "fat vertex" approach), storage- Compared to encoding per-primitive data in vertex attributes (the "fat vertex" approach), storage-
buffer instancing eliminates the 4–6× data duplication across quad corners. A rounded rectangle costs buffer instancing eliminates the 4–6× data duplication across quad corners. A rounded rectangle costs
56 bytes instead of 4 vertices × 40+ bytes = 160+ bytes. 80 bytes instead of 4 vertices × 40+ bytes = 160+ bytes.
The tessellated path retains the existing direct vertex buffer layout (20 bytes/vertex, no storage The tessellated path retains the existing direct vertex buffer layout (20 bytes/vertex, no storage
buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every invocation buffer access). The vertex shader branch on `mode` (push constant) is warp-uniform — every invocation
in a draw call has the same mode — so it is effectively free on all modern GPUs. in a draw call has the same mode — so it is effectively free on all modern GPUs.
#### Shape kinds #### Shape folding
Primitives in the main pipeline's storage buffer carry a `Shape_Kind` discriminant: The SDF path evaluates a single function — `sdRoundedBox` — for all primitives. There is no
`Shape_Kind` enum or per-primitive kind dispatch in the fragment shader. Shapes that are algebraically
special cases of a rounded rectangle are emitted as RRect primitives by the CPU-side drawing procs:
| Kind | SDF function | Notes | | User-facing shape | RRect mapping | Notes |
| ---------- | -------------------------------------- | --------------------------------------------------------- | | ---------------------------- | -------------------------------------------- | ---------------------------------------- |
| `RRect` | `sdRoundedBox` (iq) | Per-corner radii. Covers all Clay rectangles and borders. | | Rectangle (sharp or rounded) | Direct | Per-corner radii from `radii` param |
| `Circle` | `sdCircle` | Filled and stroked. | | Circle | `half_size = (r, r)`, `radii = (r, r, r, r)` | Uniform radii = half-size |
| `Ellipse` | `sdEllipse` | Exact (iq's closed-form). | | Line segment / capsule | Rotated RRect, `radii = half_thickness` | Stadium shape (fully-rounded minor axis) |
| `Segment` | `sdSegment` capsule | Rounded caps, correct sub-pixel thin lines. | | Full ring / annulus | Stroked circle at mid-radius | `stroke_px = outer - inner` |
| `Ring_Arc` | `abs(sdCircle) - thickness` + arc mask | Rings, arcs, circle sectors unified. |
| `NGon` | `sdRegularPolygon` | Regular n-gon for n ≥ 5. |
The `Solid` kind (value 0) is reserved for the tessellated path, where `shape_kind` is implicitly Shapes without a closed-form RRect reduction are drawn via the tessellated path:
zero because the fragment shader receives it from zero-initialized vertex attributes.
Stroke/outline variants of each shape are handled by the `Shape_Flags` bit set rather than separate | Shape | Tessellated proc | Method |
shape kinds. The fragment shader transforms `d = abs(d) - stroke_width` when the `Stroke` flag is | ------------------------- | ---------------------------------- | -------------------------- |
set. | Ellipse | `tes_ellipse`, `tes_ellipse_lines` | Triangle fan approximation |
| Regular polygon (N-gon) | `tes_polygon`, `tes_polygon_lines` | Triangle fan from center |
| Circle sector (pie slice) | `tes_sector` | Triangle fan arc |
The `Shape_Flags` bit set controls rendering mode per primitive:
| Flag | Bit | Effect |
| ----------------- | --- | -------------------------------------------------------------------- |
| `Stroke` | 0 | Outline instead of fill (`d = abs(d) - stroke_width/2`) |
| `Textured` | 1 | Sample texture using `uv.uv_rect` (mutually exclusive with Gradient) |
| `Gradient` | 2 | Bilinear 4-corner interpolation from `uv.corner_colors` |
| `Gradient_Radial` | 3 | Radial 2-color falloff (inner/outer) from `uv.corner_colors[0..1]` |
**What stays tessellated:** **What stays tessellated:**
- Text (SDL_ttf atlas, pending future MSDF evaluation) - Text (SDL_ttf atlas, pending future MSDF evaluation)
- `rectangle_gradient`, `circle_gradient` (per-vertex color interpolation) - Ellipses (`tes_ellipse`, `tes_ellipse_lines`)
- `triangle_fan`, `triangle_strip` (arbitrary user-provided point lists) - Regular polygons (`tes_polygon`, `tes_polygon_lines`)
- `line_strip` / polylines (SDF polyline rendering is possible but complex; deferred) - Circle sectors / pie slices (`tes_sector`)
- `tes_triangle`, `tes_triangle_fan`, `tes_triangle_strip` (arbitrary user-provided geometry)
- Any raw vertex geometry submitted via `prepare_shape` - Any raw vertex geometry submitted via `prepare_shape`
The rule: if the shape has a closed-form SDF, it goes SDF. If it's described only by a vertex list or The design rule: if the shape reduces to `sdRoundedBox`, it goes SDF. If it requires a different SDF
needs per-vertex color interpolation, it stays tessellated. function or is described by a vertex list, it stays tessellated.
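
As a rough illustration of the folding rule on the new side of this hunk, the sketch below maps a circle and a full ring onto rounded-rectangle parameters, following the table rows above. `RRect_Desc`, `circle_as_rrect`, and `ring_as_rrect` are hypothetical names made up for this example; they are not the library's drawing procs.

```odin
package main

import "core:fmt"

// Hypothetical CPU-side description of one rounded-rectangle SDF primitive.
// Field names are illustrative only and are not taken from the library.
RRect_Desc :: struct {
	center:    [2]f32,
	half_size: [2]f32,
	radii:     [4]f32, // clockwise from top-left: TL, TR, BR, BL
	stroke_px: f32,    // 0 means filled
}

// Circle: a square whose corner radii all equal its half-size.
circle_as_rrect :: proc(center: [2]f32, r: f32) -> RRect_Desc {
	return RRect_Desc{center = center, half_size = {r, r}, radii = {r, r, r, r}}
}

// Full ring: a stroked circle at the mid-radius, stroke width = outer - inner.
ring_as_rrect :: proc(center: [2]f32, inner, outer: f32) -> RRect_Desc {
	mid := 0.5 * (inner + outer)
	desc := circle_as_rrect(center, mid)
	desc.stroke_px = outer - inner
	return desc
}

main :: proc() {
	fmt.println(circle_as_rrect({100, 100}, 20))
	fmt.println(ring_as_rrect({100, 100}, 10, 20))
}
```

Because both shapes end up as the same kind of record, the fragment shader only ever needs the one `sdRoundedBox` evaluation.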
### Effects pipeline ### Effects pipeline
@@ -442,25 +523,40 @@ Wallace's variant) and vger-rs.
- Vello's implementation of blurred rounded rectangle as a gradient type: - Vello's implementation of blurred rounded rectangle as a gradient type:
https://github.com/linebender/vello/pull/665 https://github.com/linebender/vello/pull/665
### Backdrop-effects pipeline ### Backdrop pipeline
The backdrop-effects pipeline handles effects that sample the current render target as input: frosted The backdrop pipeline handles effects that sample the current render target as input: frosted glass,
glass, refraction, mirror surfaces. It is structurally separated from the effects pipeline for two refraction, mirror surfaces. It is separated from the effects pipeline for a structural reason, not
reasons: register pressure.
1. **Render-state requirement.** Before any backdrop-sampling fragment can run, the current render **Render-pass boundary.** Before any backdrop-sampling fragment can run, the current render target
target must be copied to a separate texture via `CopyGPUTextureToTexture`. This is a command- must be copied to a separate texture via `CopyGPUTextureToTexture`. This is a command-buffer-level
buffer-level operation that cannot happen mid-render-pass. The copy naturally creates a pipeline operation that cannot happen mid-render-pass. The copy naturally creates a pipeline boundary that no
boundary. amount of shader optimization can eliminate — it is a fundamental requirement of sampling a surface
while also writing to it.
2. **Register pressure.** Backdrop-sampling shaders read from a texture with Gaussian kernel weights **Multi-pass implementation.** Backdrop effects are implemented as separable multi-pass sequences
(multiple texture fetches per fragment), pushing register usage to ~70–80. Including this in the (downsample → horizontal blur → vertical blur → composite), following the standard approach used by
effects pipeline would reduce occupancy for all shadow/glow fragments from ~30% to ~20%, costing iOS `UIVisualEffectView`, Android `RenderEffect`, and Flutter's `BackdropFilter`. Each individual
measurable throughput on the common case. pass has a low-to-medium register footprint (~15–40 registers), well within the main pipeline's
occupancy range. The multi-pass approach avoids the monolithic 70+ register shader that a single-pass
Gaussian blur would require, making backdrop effects viable on low-end mobile GPUs (including
Mali-G31 and VideoCore VI) where per-thread register limits are tight.
The backdrop-effects pipeline binds a secondary sampler pointing at the captured backdrop texture. When **Bracketed execution.** All backdrop draws in a frame share a single bracketed region of the command
no backdrop effects are present in a frame, this pipeline is never bound and the texture copy never buffer: end the current render pass, copy the render target, execute all backdrop sub-passes, then
happens — zero cost. resume normal drawing. The entry/exit cost (texture copy + render-pass break) is paid once per frame
regardless of how many backdrop effects are visible. When no backdrop effects are present, the bracket
is never entered and the texture copy never happens — zero cost.
**Why not split the backdrop sub-passes into separate pipelines?** The individual passes range from
~15 to ~40 registers, which does cross Mali's 32-register cliff. However, the register-pressure argument
that justifies the main/effects split does not apply here. The main/effects split protects the
_common path_ (90%+ of frame fragments) from the uncommon path's register cost. Inside the backdrop
pipeline there is no common-vs-uncommon distinction — if backdrop effects are active, every sub-pass
runs; if not, none run. The backdrop pipeline either executes as a complete unit or not at all.
Additionally, backdrop effects cover a small fraction of the frame's total fragments (~5% at typical
UI scales), so the occupancy variation within the bracket has negligible impact on frame time.
### Vertex layout ### Vertex layout
@@ -483,19 +579,21 @@ The `Primitive` struct for SDF shapes lives in the storage buffer, not in vertex
``` ```
Primitive :: struct { Primitive :: struct {
kind: Shape_Kind, // 0: enum u8 bounds: [4]f32, // 0: min_x, min_y, max_x, max_y
flags: Shape_Flags, // 1: bit_set[Shape_Flag; u8] color: Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8
_pad: u16, // 2: reserved flags: u32, // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
bounds: [4]f32, // 4: min_x, min_y, max_x, max_y rotation_sc: u32, // 24: packed f16 pair (sin, cos). Requires .Rotated flag.
color: Color, // 20: u8x4 _pad: f32, // 28: reserved for future use
_pad2: [3]u8, // 24: alignment params: Shape_Params, // 32: per-kind params union (half_feather, radii, etc.) (32 bytes)
params: Shape_Params, // 28: raw union, 32 bytes uv: Uv_Or_Effects, // 64: texture UV rect or gradient/outline parameters (16 bytes)
} }
// Total: 60 bytes (padded to 64 for GPU alignment) // Total: 80 bytes (std430 aligned)
``` ```
`Shape_Params` is a `#raw_union` with named variants per shape kind (`rrect`, `circle`, `segment`, `RRect_Params` holds the rounded-rectangle parameters directly — there is no `Shape_Params` union.
etc.), ensuring type safety on the CPU side and zero-cost reinterpretation on the GPU side. `Uv_Or_Gradient` is a `#raw_union` that aliases `[4]f32` (texture UV rect) with `[4]Color` (gradient
corner colors, clockwise from top-left: TL, TR, BR, BL). The `flags` field encodes both the
tessellated/SDF mode marker (low byte) and shape flags (bits 8+) via `pack_flags`.
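
The packing described above could look roughly like the following. This is a sketch under assumptions: the real `pack_flags` signature is not shown in this diff, and the `Shape_Flag` enum here is reconstructed from the bit positions in the flags table rather than copied from the source.

```odin
package main

import "core:fmt"

// Assumed flag enum, using the bit positions listed in the Shape_Flags table.
Shape_Flag :: enum u8 {
	Stroke          = 0,
	Textured        = 1,
	Gradient        = 2,
	Gradient_Radial = 3,
}
Shape_Flags :: bit_set[Shape_Flag; u8]

// Assumed packing: mode marker in the low byte, shape flags in bits 8 and up.
pack_flags :: proc(mode: u8, flags: Shape_Flags) -> u32 {
	return u32(mode) | (u32(transmute(u8)flags) << 8)
}

main :: proc() {
	// Mode 1 (SDF) with Stroke and Textured set.
	packed := pack_flags(1, {.Stroke, .Textured})
	fmt.println(packed)
}
```

On the shader side the same word would be split with a mask and a shift, which keeps the struct at a single `u32` for both pieces of state.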
### Draw submission order ### Draw submission order
@@ -506,7 +604,7 @@ Within each scissor region, draws are issued in submission order to preserve the
2. Bind **main pipeline, tessellated mode** → draw all queued tessellated vertices (non-indexed for 2. Bind **main pipeline, tessellated mode** → draw all queued tessellated vertices (non-indexed for
shapes, indexed for text). Pipeline state unchanged from today. shapes, indexed for text). Pipeline state unchanged from today.
3. Bind **main pipeline, SDF mode** → draw all queued SDF primitives (instanced, one draw call). 3. Bind **main pipeline, SDF mode** → draw all queued SDF primitives (instanced, one draw call).
4. If backdrop effects are present: copy render target, bind **backdrop-effects pipeline** → draw 4. If backdrop effects are present: copy render target, bind **backdrop pipeline** → draw
backdrop primitives. backdrop primitives.
The exact ordering within a scissor may be refined based on actual Z-ordering requirements. The key The exact ordering within a scissor may be refined based on actual Z-ordering requirements. The key
@@ -517,14 +615,15 @@ invariant is that each primitive is drawn exactly once, in the pipeline that own
Text rendering currently uses SDL_ttf's GPU text engine, which rasterizes glyphs per `(font, size)` Text rendering currently uses SDL_ttf's GPU text engine, which rasterizes glyphs per `(font, size)`
pair into bitmap atlases and emits indexed triangle data via `GetGPUTextDrawData`. This path is pair into bitmap atlases and emits indexed triangle data via `GetGPUTextDrawData`. This path is
**unchanged** by the SDF migration — text continues to flow through the main pipeline's tessellated **unchanged** by the SDF migration — text continues to flow through the main pipeline's tessellated
mode with `shape_kind = Solid`, sampling the SDL_ttf atlas texture. mode with `mode = 0`, sampling the SDL_ttf atlas texture.
A future phase may evaluate MSDF (multi-channel signed distance field) text rendering, which would A future phase may evaluate MSDF (multi-channel signed distance field) text rendering, which would
allow resolution-independent glyph rendering from a single small atlas per font. This would involve: allow resolution-independent glyph rendering from a single small atlas per font. This would involve:
- Offline atlas generation via Chlumský's msdf-atlas-gen tool. - Offline atlas generation via Chlumský's msdf-atlas-gen tool.
- Runtime glyph metrics via `vendor:stb/truetype` (already in the Odin distribution). - Runtime glyph metrics via `vendor:stb/truetype` (already in the Odin distribution).
- A new `Shape_Kind.MSDF_Glyph` variant in the main pipeline's fragment shader. - A new MSDF glyph mode in the fragment shader, which would require reintroducing a mode/kind
distinction (the current shader evaluates only `sdRoundedBox` with no kind dispatch).
- Potential removal of the SDL_ttf dependency. - Potential removal of the SDL_ttf dependency.
This is explicitly deferred. The SDF shape migration is independent of and does not block text This is explicitly deferred. The SDF shape migration is independent of and does not block text
@@ -539,12 +638,176 @@ changes.
- Valve's original SDF text rendering paper (SIGGRAPH 2007): - Valve's original SDF text rendering paper (SIGGRAPH 2007):
https://steamcdn-a.akamaihd.net/apps/valve/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf https://steamcdn-a.akamaihd.net/apps/valve/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf
### Textures
Textures plug into the existing main pipeline — no additional GPU pipeline, no shader rewrite. The
work is a resource layer (registration, upload, sampling, lifecycle) plus two textured-draw procs
that route into the existing tessellated and SDF paths respectively.
#### Why draw owns registered textures
A texture's GPU resource (the `^sdl.GPUTexture`, transfer buffer, shader resource view) is created
and destroyed by draw. The user provides raw bytes and a descriptor at registration time; draw
uploads synchronously and returns an opaque `Texture_Id` handle. The user can free their CPU-side
bytes immediately after `register_texture` returns.
This follows the model used by the RAD Debugger's render layer (`src/render/render_core.h` in
EpicGamesExt/raddebugger, MIT license), where `r_tex2d_alloc` takes `(kind, size, format, data)`
and returns an opaque handle that the renderer owns and releases. The single-owner model eliminates
an entire class of lifecycle bugs (double-free, use-after-free across subsystems, unclear cleanup
responsibility) that dual-ownership designs introduce.
If advanced interop is ever needed (e.g., a future 3D pipeline or compute shader sharing the same
GPU texture), the clean extension is a borrowed-reference accessor (`get_gpu_texture(id)`) that
returns the underlying handle without transferring ownership. This is purely additive and does not
require changing the registration API.
#### Why `Texture_Kind` exists
`Texture_Kind` (Static / Dynamic / Stream) is a driver hint for update frequency, adopted from the
RAD Debugger's `R_ResourceKind`. It maps directly to SDL3 GPU usage patterns:
- **Static**: uploaded once, never changes. Covers QR codes, decoded PNGs, icons — the 90% case.
- **Dynamic**: updatable via `update_texture_region`. Covers font atlas growth, procedural updates.
- **Stream**: frequent full re-uploads. Covers video playback, per-frame procedural generation.
This costs one byte in the descriptor and lets the backend pick optimal memory placement without a
future API change.
#### Why samplers are per-draw, not per-texture
A sampler describes how to filter and address a texture during sampling — nearest vs bilinear, clamp
vs repeat. This is a property of the _draw_, not the texture. The same QR code texture should be
sampled with `Nearest_Clamp` when displayed at native resolution but could reasonably be sampled
with `Linear_Clamp` in a zoomed-out thumbnail. The same icon atlas might be sampled with
`Nearest_Clamp` for pixel art or `Linear_Clamp` for smooth scaling.
The RAD Debugger follows this pattern: `R_BatchGroup2DParams` carries `tex_sample_kind` alongside
the texture handle, chosen per batch group at draw time. We do the same — `Sampler_Preset` is a
parameter on the draw procs, not a field on `Texture_Desc`.
Internally, draw keeps a small pool of pre-created `^sdl.GPUSampler` objects (one per preset,
lazily initialized). Sub-batch coalescing keys on `(kind, texture_id, sampler_preset)` — draws
with the same texture but different samplers produce separate draw calls, which is correct.
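
A minimal sketch of the coalescing rule, assuming a sub-batch is extended only when the kind, texture, and sampler all match and the new range is contiguous with the previous one. The `Sub_Batch` fields follow the struct in this diff; `push_sub_batch` is a hypothetical helper, and text batches are left unmerged here because their `count` is documented as always 1.

```odin
package main

import "core:fmt"

Sub_Batch_Kind :: enum u8 { Tessellated, Text, SDF }
Texture_Id     :: distinct u32
Sampler_Preset :: enum u8 { Nearest_Clamp, Linear_Clamp }

Sub_Batch :: struct {
	kind:       Sub_Batch_Kind,
	offset:     u32,
	count:      u32,
	texture_id: Texture_Id,
	sampler:    Sampler_Preset,
}

// Hypothetical helper: merge into the previous sub-batch when the key
// (kind, texture_id, sampler) matches, otherwise start a new draw call.
push_sub_batch :: proc(
	batches: ^[dynamic]Sub_Batch,
	kind: Sub_Batch_Kind,
	offset, count: u32,
	texture_id: Texture_Id,
	sampler: Sampler_Preset,
) {
	if n := len(batches^); n > 0 && kind != .Text {
		last := &batches^[n - 1]
		same_key := last.kind == kind && last.texture_id == texture_id && last.sampler == sampler
		if same_key && last.offset + last.count == offset {
			last.count += count
			return
		}
	}
	append(batches, Sub_Batch{kind, offset, count, texture_id, sampler})
}

main :: proc() {
	batches: [dynamic]Sub_Batch
	defer delete(batches)
	push_sub_batch(&batches, .SDF, 0, 4, 7, .Linear_Clamp)
	push_sub_batch(&batches, .SDF, 4, 2, 7, .Linear_Clamp)  // merges with the first
	push_sub_batch(&batches, .SDF, 6, 1, 7, .Nearest_Clamp) // new sampler, new batch
	fmt.println(len(batches))
}
```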
#### Textured draw procs
Textured rectangles route through the existing SDF path via `sdf_rectangle_texture` and
`sdf_rectangle_texture_corners`, mirroring `sdf_rectangle` and `sdf_rectangle_corners` exactly —
same parameters, same naming — with the color parameter replaced by a texture ID plus an optional
tint.
An earlier iteration of this design considered a separate tessellated proc for "simple" fullscreen
quads, on the theory that the tessellated path's lower register count (~16 regs vs ~18 for the SDF
textured branch) would improve occupancy at large fragment counts. Applying the register-pressure
analysis from the pipeline-strategy section above shows this is wrong: both 16 and 18 registers are
well below the register cliff (~43 regs on consumer Ampere/Ada, ~32 on Volta/A100), so both run at
100% occupancy. The remaining ALU difference (~15 extra instructions for the SDF evaluation) amounts
to ~20μs at 4K — below noise. Meanwhile, splitting into a separate pipeline would add ~15μs per
pipeline bind on the CPU side per scissor, matching or exceeding the GPU-side savings. Within the
main pipeline, unified remains strictly better.
The naming convention uses `sdf_` and `tes_` prefixes to indicate the rendering path, with suffixes
for modifiers: `sdf_rectangle_texture` and `sdf_rectangle_texture_corners` sit alongside
`sdf_rectangle` (solid or gradient overload). Proc groups like `sdf_rectangle` dispatch to
`sdf_rectangle_solid` or `sdf_rectangle_gradient` based on argument count. Future per-shape texture
variants (`sdf_circle_texture`) are additive.
#### What SDF anti-aliasing does and does not do for textured draws
The SDF path anti-aliases the **shape's outer silhouette** — rounded-corner edges, rotated edges,
stroke outlines. It does not anti-alias or sharpen the texture content. Inside the shape, fragments
sample through the chosen `Sampler_Preset`, and image quality is whatever the sampler produces from
the source texels. A low-resolution texture displayed at a large size shows bilinear blur regardless
of which draw proc is used. This matches the current text-rendering model, where glyph sharpness
depends on how closely the display size matches the SDL_ttf atlas's rasterized size.
#### Fit modes are a computation layer, not a renderer concept
Standard image-fit behaviors (stretch, fill/cover, fit/contain, tile, center) are expressed as UV
sub-region computations on top of the `uv_rect` parameter that both textured-draw procs accept. The
renderer has no knowledge of fit modes — it samples whatever UV region it is given.
A `fit_params` helper computes the appropriate `uv_rect`, sampler preset, and (for letterbox/fit
mode) shrunken inner rect from a `Fit_Mode` enum, the target rect, and the texture's pixel size.
Users who need custom UV control (sprite atlas sub-regions, UV animation, nine-patch slicing) skip
the helper and compute `uv_rect` directly. This keeps the renderer primitive minimal while making
the common cases convenient.
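
To make the "computation layer" idea concrete, here is a hedged sketch of two fit computations: a cover-style UV crop and a contain-style inner rect. The proc names `uv_cover` and `rect_contain` and the `Rectangle` field names are assumptions for illustration; the actual `fit_params` helper and its signature are not shown in this diff.

```odin
package main

import "core:fmt"

// Assumed layout; the diff only shows the first field of the real Rectangle.
Rectangle :: struct {
	x, y, width, height: f32,
}

// Cover: crop the texture symmetrically so it fills the target without distortion.
// Returns a UV sub-rect in normalized [0, 1] texture space.
uv_cover :: proc(target: Rectangle, tex_w, tex_h: f32) -> Rectangle {
	target_aspect := target.width / target.height
	tex_aspect := tex_w / tex_h
	if tex_aspect > target_aspect {
		// Texture is relatively wider: crop left and right.
		frac := target_aspect / tex_aspect
		return Rectangle{(1 - frac) * 0.5, 0, frac, 1}
	}
	// Texture is relatively taller: crop top and bottom.
	frac := tex_aspect / target_aspect
	return Rectangle{0, (1 - frac) * 0.5, 1, frac}
}

// Contain: shrink the destination rect so the whole texture fits, centered.
rect_contain :: proc(target: Rectangle, tex_w, tex_h: f32) -> Rectangle {
	target_aspect := target.width / target.height
	tex_aspect := tex_w / tex_h
	if tex_aspect > target_aspect {
		h := target.width / tex_aspect
		return Rectangle{target.x, target.y + (target.height - h) * 0.5, target.width, h}
	}
	w := target.height * tex_aspect
	return Rectangle{target.x + (target.width - w) * 0.5, target.y, w, target.height}
}

main :: proc() {
	target := Rectangle{0, 0, 200, 100}
	fmt.println(uv_cover(target, 100, 100))
	fmt.println(rect_contain(target, 100, 100))
}
```

Either result is then passed to the textured draw proc as a plain `uv_rect` or destination rect, so the renderer itself never sees a fit mode.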
#### Deferred release
`unregister_texture` does not immediately release the GPU texture. It queues the slot for release at
the end of the current frame, after `SubmitGPUCommandBuffer` has handed work to the GPU. This
prevents a race condition where a texture is freed while the GPU is still sampling from it in an
already-submitted command buffer. The same deferred-release pattern is applied to `clear_text_cache`
and `clear_text_cache_entry`, fixing a pre-existing latent bug where destroying a cached
`^sdl_ttf.Text` mid-frame could free an atlas texture still referenced by in-flight draw batches.
This pattern is standard in production renderers — the RAD Debugger's `r_tex2d_release` queues
textures onto a free list that is processed in `r_end_frame`, not at the call site.
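
The deferred-release pattern can be sketched as a queue that is only drained at the frame boundary. The field names loosely mirror `pending_texture_releases`, `texture_slots`, and `texture_free_list` from the `Global` struct in this diff, but `Registry`, the `alive` flag, and `process_pending_releases` are hypothetical stand-ins; the real code would release an `^sdl.GPUTexture` where the flag is cleared.

```odin
package main

import "core:fmt"

Texture_Id :: distinct u32

// Hypothetical slot; `alive` stands in for the real GPU texture handle.
Texture_Slot :: struct {
	alive: bool,
}

Registry :: struct {
	slots:            [dynamic]Texture_Slot,
	free_list:        [dynamic]u32,
	pending_releases: [dynamic]Texture_Id,
}

// Called by user code at any point in the frame: only queues the release.
unregister_texture :: proc(r: ^Registry, id: Texture_Id) {
	append(&r.pending_releases, id)
}

// Called once at the frame boundary, after the command buffer was submitted.
process_pending_releases :: proc(r: ^Registry) {
	for id in r.pending_releases {
		r.slots[int(id)].alive = false // real code would release the GPU texture here
		append(&r.free_list, u32(id))  // slot index becomes reusable
	}
	clear(&r.pending_releases)
}

main :: proc() {
	r: Registry
	defer {
		delete(r.slots)
		delete(r.free_list)
		delete(r.pending_releases)
	}
	append(&r.slots, Texture_Slot{alive = true})

	unregister_texture(&r, 0)
	fmt.println(r.slots[0].alive) // still alive mid-frame
	process_pending_releases(&r)
	fmt.println(r.slots[0].alive) // released at the frame boundary
}
```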
#### Clay integration
Clay's `RenderCommandType.Image` is handled by dereferencing `imageData: rawptr` as a pointer to a
`Clay_Image_Data` struct containing a `Texture_Id`, `Fit_Mode`, and tint color. Routing mirrors the
existing rectangle handling: zero `cornerRadius` dispatches to `sdf_rectangle_texture` (SDF, sharp
corners), nonzero dispatches to `sdf_rectangle_texture_corners` (SDF, per-corner radii). A
`fit_params` call computes UVs from the fit mode before dispatch.
#### Deferred features
The following are plumbed in the descriptor but not implemented in phase 1:
- **Mipmaps**: `Texture_Desc.mip_levels` field exists; generation via SDL3 deferred.
- **Compressed formats**: `Texture_Desc.format` accepts BC/ASTC; upload path deferred.
- **Render-to-texture**: `Texture_Desc.usage` accepts `.COLOR_TARGET`; render-pass refactor deferred.
- **3D textures, arrays, cube maps**: `Texture_Desc.type` and `depth_or_layers` fields exist.
- **Additional samplers**: anisotropic, trilinear, clamp-to-border — additive enum values.
- **Atlas packing**: internal optimization for sub-batch coalescing; invisible to callers.
- **Per-shape texture variants**: `sdf_circle_texture`, `tes_ellipse_texture`, `tes_polygon_texture` — potential future additions, reserved by naming convention.
**References:**
- RAD Debugger render layer (ownership model, deferred release, sampler-at-draw-time):
https://github.com/EpicGamesExt/raddebugger — `src/render/render_core.h`, `src/render/d3d11/render_d3d11.c`
- Casey Muratori, Handmade Hero day 472 — texture handling as a renderer-owned resource concern,
atlases as a separate layer above the renderer.
## 3D rendering ## 3D rendering
3D pipeline architecture is under consideration and will be documented separately. The current 3D pipeline architecture is under consideration and will be documented separately. The current
expectation is that 3D rendering will use dedicated pipelines (separate from the 2D pipelines) expectation is that 3D rendering will use dedicated pipelines (separate from the 2D pipelines)
sharing GPU resources (textures, samplers, command buffer lifecycle) with the 2D renderer. sharing GPU resources (textures, samplers, command buffer lifecycle) with the 2D renderer.
## Multi-window support
The renderer currently assumes a single window via the global `GLOB` state. Multi-window support is
deferred but anticipated. When revisited, the RAD Debugger's bucket + pass-list model
(`src/draw/draw.h`, `src/draw/draw.c` in EpicGamesExt/raddebugger) is worth studying as a reference.
RAD separates draw submission from rendering via **buckets**. A `DR_Bucket` is an explicit handle
that accumulates an ordered list of render passes (`R_PassList`). The user creates a bucket, pushes
it onto a thread-local stack, issues draw calls (which target the top-of-stack bucket), then submits
the bucket to a specific window. Multiple buckets can exist simultaneously — one per window, or one
per UI panel that gets composited into a parent bucket via `dr_sub_bucket`. Implicit draw parameters
(clip rect, 2D transform, sampler mode, transparency) are managed via push/pop stacks scoped to each
bucket, so different windows can have independent clip and transform state without interference.
The key properties this gives RAD:
- **Per-window isolation.** Each window builds its own bucket with its own pass list and state stacks.
No global contention.
- **Thread-parallel building.** Each thread has its own draw context and arena. Multiple threads can
build buckets concurrently, then submit them to the render backend sequentially.
- **Compositing.** A pre-built bucket (e.g., a tooltip or overlay) can be injected into another
bucket with a transform applied, without rebuilding its draw calls.
For our library, the likely adaptation would be replacing the single `GLOB` with a per-window draw
context that users create and pass to `begin`/`end`, while keeping the explicit-parameter draw call
style rather than adopting RAD's implicit state stacks. Texture and sampler resources would remain
global (shared across windows), with only the per-frame staging buffers and layer/scissor state
becoming per-context.
## Building shaders ## Building shaders
GLSL shader sources live in `shaders/source/`. Compiled outputs (SPIR-V and Metal Shading Language) GLSL shader sources live in `shaders/source/`. Compiled outputs (SPIR-V and Metal Shading Language)
+282 -91
View File
@@ -4,6 +4,7 @@ import "base:runtime"
import "core:c" import "core:c"
import "core:log" import "core:log"
import "core:math" import "core:math"
import "core:strings" import "core:strings"
import sdl "vendor:sdl3" import sdl "vendor:sdl3"
import sdl_ttf "vendor:sdl3/ttf" import sdl_ttf "vendor:sdl3/ttf"
@@ -27,11 +28,109 @@ BUFFER_INIT_SIZE :: 256
INITIAL_LAYER_SIZE :: 5 INITIAL_LAYER_SIZE :: 5
INITIAL_SCISSOR_SIZE :: 10 INITIAL_SCISSOR_SIZE :: 10
// Sentinel value: when passed as msaa_samples, `init` will use the maximum MSAA sample count
// supported by the GPU for the swapchain format.
MSAA_MAX :: sdl.GPUSampleCount(0xFF)
// ----- Default parameter values -----
// Named constants for non-zero default procedure parameters. Centralizes magic numbers
// so they can be tuned in one place and referenced by name in proc signatures.
DFT_FEATHER_PX :: 1 // Total AA feather width in physical pixels (half on each side of boundary).
DFT_STROKE_THICKNESS :: 1 // Default line/stroke thickness in logical pixels.
DFT_FONT_SIZE :: 44 // Default font size in points for text rendering.
DFT_CIRC_END_ANGLE :: 360 // Full-circle end angle in degrees (ring/arc).
DFT_UV_RECT :: Rectangle{0, 0, 1, 1} // Full-texture UV rect (rectangle_texture).
DFT_TINT :: WHITE // Default texture tint (rectangle_texture, clay_image).
DFT_TEXT_COLOR :: BLACK // Default text color.
DFT_CLEAR_COLOR :: BLACK // Default clear color for end().
DFT_SAMPLER :: Sampler_Preset.Linear_Clamp // Default texture sampler preset.
GLOB: Global
Global :: struct {
// -- Per-frame staging (hottest — touched by every prepare/upload/clear cycle) --
tmp_shape_verts: [dynamic]Vertex, // Tessellated shape vertices staged for GPU upload.
tmp_text_verts: [dynamic]Vertex, // Text vertices staged for GPU upload.
tmp_text_indices: [dynamic]c.int, // Text index buffer staged for GPU upload.
tmp_text_batches: [dynamic]TextBatch, // Text atlas batch metadata for indexed drawing.
tmp_primitives: [dynamic]Primitive, // SDF primitives staged for GPU storage buffer upload.
tmp_sub_batches: [dynamic]Sub_Batch, // Sub-batch records that drive draw call dispatch.
tmp_uncached_text: [dynamic]^sdl_ttf.Text, // Uncached TTF_Text objects destroyed after end() submits.
layers: [dynamic]Layer, // Draw layers, each with its own scissor stack.
scissors: [dynamic]Scissor, // Scissor rects that clip drawing within each layer.
// -- Per-frame scalars (accessed during prepare and draw_layer) --
curr_layer_index: uint, // Index of the currently active layer.
dpi_scaling: f32, // Window DPI scale factor applied to all pixel coordinates.
clay_z_index: i16, // Tracks z-index for layer splitting during Clay batch processing.
cleared: bool, // Whether the render target has been cleared this frame.
// -- Pipeline (accessed every draw_layer call) --
pipeline_2d_base: Pipeline_2D_Base, // The unified 2D GPU pipeline (shaders, buffers, samplers).
device: ^sdl.GPUDevice, // GPU device handle, stored at init.
samplers: [SAMPLER_PRESET_COUNT]^sdl.GPUSampler, // Lazily-created sampler objects, one per Sampler_Preset.
// -- Deferred release (processed once per frame at frame boundary) --
pending_texture_releases: [dynamic]Texture_Id, // Deferred GPU texture releases, processed next frame.
pending_text_releases: [dynamic]^sdl_ttf.Text, // Deferred TTF_Text destroys, processed next frame.
// -- Textures (registration is occasional, binding is per draw call) --
texture_slots: [dynamic]Texture_Slot, // Registered texture slots indexed by Texture_Id.
texture_free_list: [dynamic]u32, // Recycled slot indices available for reuse.
// -- MSAA (once per frame in end()) --
msaa_texture: ^sdl.GPUTexture, // Intermediate render target for multi-sample resolve.
msaa_width: u32, // Cached width to detect when MSAA texture needs recreation.
msaa_height: u32, // Cached height to detect when MSAA texture needs recreation.
sample_count: sdl.GPUSampleCount, // Sample count chosen at init (._1 means MSAA disabled).
// -- Clay (once per frame in prepare_clay_batch) --
clay_memory: [^]u8, // Raw memory block backing Clay's internal arena.
// -- Text (occasional — font registration and text cache lookups) --
text_cache: Text_Cache, // Font registry, SDL_ttf engine, and cached TTF_Text objects.
// -- Resize tracking (cold — checked once per frame in resize_global) --
max_layers: int, // High-water marks for dynamic array shrink heuristic.
max_scissors: int,
max_shape_verts: int,
max_text_verts: int,
max_text_indices: int,
max_text_batches: int,
max_primitives: int,
max_sub_batches: int,
// -- Init-only (coldest — set once at init, never written again) --
odin_context: runtime.Context, // Odin context captured at init for use in callbacks.
}
// --------------------------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// ----- Color ------------------------- // ----- Core types --------------------
// --------------------------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
Color :: distinct [4]u8 // A 2D position in world space. Non-distinct alias for [2]f32 — bare literals like {100, 200}
// work at non-ambiguous call sites.
//
// Coordinate system: origin is the top-left corner of the window/layer. X increases rightward,
// Y increases downward. This matches SDL, HTML Canvas, and most 2D UI coordinate conventions.
// All position parameters in the draw API (center, origin, start_position, end_position, etc.)
// use this coordinate system.
//
// Units are logical pixels (pre-DPI-scaling). The renderer multiplies by dpi_scaling internally
// before uploading to the GPU. A Vec2{100, 50} refers to the same visual location regardless of
// display DPI.
Vec2 :: [2]f32
// An RGBA color with 8 bits per channel. Distinct type over [4]u8 so that proc-group
// overloads can disambiguate Color from other 4-byte structs.
//
// Channel order: R, G, B, A (indices 0, 1, 2, 3). Alpha 255 is fully opaque, 0 is fully
// transparent. This matches the GPU-side layout: the shader unpacks via unpackUnorm4x8 which
// reads the bytes in memory order as R, G, B, A and normalizes each to [0, 1].
//
// When used in the Primitive struct (Primitive.color), the 4 bytes are stored as a u32 in
// native byte order and unpacked by the shader.
Color :: [4]u8
BLACK :: Color{0, 0, 0, 255} BLACK :: Color{0, 0, 0, 255}
WHITE :: Color{255, 255, 255, 255} WHITE :: Color{255, 255, 255, 255}
@@ -40,8 +139,43 @@ GREEN :: Color{0, 255, 0, 255}
BLUE :: Color{0, 0, 255, 255} BLUE :: Color{0, 0, 255, 255}
BLANK :: Color{0, 0, 0, 0} BLANK :: Color{0, 0, 0, 0}
// Per-corner rounding radii for rectangles, specified clockwise from top-left.
// All values are in logical pixels (pre-DPI-scaling).
Rectangle_Radii :: struct {
top_left: f32,
top_right: f32,
bottom_right: f32,
bottom_left: f32,
}
// A linear gradient between two colors along an arbitrary angle.
// The `end_color` is the color at the end of the gradient direction; the shape's fill `color`
// parameter acts as the start color. `angle` is in degrees: 0 = left-to-right, 90 = top-to-bottom.
Linear_Gradient :: struct {
end_color: Color,
angle: f32,
}
// A radial gradient between two colors from center to edge.
// The `outer_color` is the color at the shape's edge; the shape's fill `color` parameter
// acts as the inner (center) color.
Radial_Gradient :: struct {
outer_color: Color,
}
// Tagged union for specifying a gradient on any shape. Defaults to `nil` (no gradient).
// When a gradient is active, the shape's `color` parameter becomes the start/inner color,
// and the gradient struct carries the end/outer color plus any type-specific parameters.
//
// Gradient and Textured are mutually exclusive on the same primitive. If a shape uses
// `rectangle_texture`, gradients are not applicable — use the tint color instead.
Gradient :: union {
Linear_Gradient,
Radial_Gradient,
}
// Convert clay.Color ([4]c.float in 0–255 range) to Color. // Convert clay.Color ([4]c.float in 0–255 range) to Color.
color_from_clay :: proc(clay_color: clay.Color) -> Color { color_from_clay :: #force_inline proc(clay_color: clay.Color) -> Color {
return Color{u8(clay_color[0]), u8(clay_color[1]), u8(clay_color[2]), u8(clay_color[3])} return Color{u8(clay_color[0]), u8(clay_color[1]), u8(clay_color[2]), u8(clay_color[3])}
} }
@@ -51,9 +185,19 @@ color_to_f32 :: proc(color: Color) -> [4]f32 {
return {f32(color[0]) * INV, f32(color[1]) * INV, f32(color[2]) * INV, f32(color[3]) * INV} return {f32(color[0]) * INV, f32(color[1]) * INV, f32(color[2]) * INV, f32(color[3]) * INV}
} }
// --------------------------------------------------------------------------------------------------------------------- // Pre-multiply RGB channels by alpha. The tessellated vertex path and text path require
// ----- Core types -------------------- // premultiplied colors because the blend state is ONE, ONE_MINUS_SRC_ALPHA and the
// --------------------------------------------------------------------------------------------------------------------- // tessellated fragment shader passes vertex color through without further modification.
// Users who construct Vertex structs manually for prepare_shape must premultiply their colors.
premultiply_color :: #force_inline proc(color: Color) -> Color {
a := u32(color[3])
return Color {
u8((u32(color[0]) * a + 127) / 255),
u8((u32(color[1]) * a + 127) / 255),
u8((u32(color[2]) * a + 127) / 255),
color[3],
}
}
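The rounded integer premultiply above can be checked in isolation. The following is a minimal sketch, assuming `Color` is four `u8` channels in RGBA order (the diff does not show its definition); the proc body mirrors the one in the hunk, and the sample color is made up for illustration.

```odin
package main

// Assumption: Color is four 8-bit channels in RGBA order.
Color :: [4]u8

// Same arithmetic as the proc in the diff: multiply each RGB channel by alpha
// and divide by 255, adding 127 first so the integer division rounds to nearest.
premultiply_color :: proc(color: Color) -> Color {
	a := u32(color[3])
	return Color {
		u8((u32(color[0]) * a + 127) / 255),
		u8((u32(color[1]) * a + 127) / 255),
		u8((u32(color[2]) * a + 127) / 255),
		color[3],
	}
}

main :: proc() {
	// Roughly half-transparent orange (illustrative value).
	straight := Color{200, 100, 50, 128}
	assert(premultiply_color(straight) == Color{100, 50, 25, 128})

	// Fully opaque colors pass through unchanged.
	assert(premultiply_color(Color{255, 10, 0, 255}) == Color{255, 10, 0, 255})

	// Fully transparent colors collapse to zero RGB.
	assert(premultiply_color(Color{255, 255, 255, 0}) == Color{0, 0, 0, 0})
}
```

Widening to `u32` before the multiply matters: `255 * 255 + 127` does not fit in a `u8` or `u16`.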
Rectangle :: struct { Rectangle :: struct {
x: f32, x: f32,
@@ -63,15 +207,17 @@ Rectangle :: struct {
} }
Sub_Batch_Kind :: enum u8 { Sub_Batch_Kind :: enum u8 {
Shapes, // non-indexed, white texture, mode 0 Tessellated, // non-indexed, white texture or user texture, mode 0
Text, // indexed, atlas texture, mode 0 Text, // indexed, atlas texture, mode 0
SDF, // instanced unit quad, white texture, mode 1 SDF, // instanced unit quad, white texture or user texture, mode 1
} }
Sub_Batch :: struct { Sub_Batch :: struct {
kind: Sub_Batch_Kind, kind: Sub_Batch_Kind,
offset: u32, // Shapes: vertex offset; Text: text_batch index; SDF: primitive index offset: u32, // Tessellated: vertex offset; Text: text_batch index; SDF: primitive index
count: u32, // Shapes: vertex count; Text: always 1; SDF: primitive count count: u32, // Tessellated: vertex count; Text: always 1; SDF: primitive count
texture_id: Texture_Id,
sampler: Sampler_Preset,
} }
Layer :: struct { Layer :: struct {
@@ -88,44 +234,6 @@ Scissor :: struct {
sub_batch_len: u32, sub_batch_len: u32,
} }
// ---------------------------------------------------------------------------------------------------------------------
// ----- Global state ------------------
// ---------------------------------------------------------------------------------------------------------------------
GLOB: Global
Global :: struct {
odin_context: runtime.Context,
pipeline_2d_base: Pipeline_2D_Base,
text_cache: Text_Cache,
layers: [dynamic]Layer,
scissors: [dynamic]Scissor,
tmp_shape_verts: [dynamic]Vertex,
tmp_text_verts: [dynamic]Vertex,
tmp_text_indices: [dynamic]c.int,
tmp_text_batches: [dynamic]TextBatch,
tmp_primitives: [dynamic]Primitive,
tmp_sub_batches: [dynamic]Sub_Batch,
tmp_uncached_text: [dynamic]^sdl_ttf.Text, // Uncached TTF_Text objects to destroy after end()
clay_memory: [^]u8,
msaa_texture: ^sdl.GPUTexture,
curr_layer_index: uint,
max_layers: int,
max_scissors: int,
max_shape_verts: int,
max_text_verts: int,
max_text_indices: int,
max_text_batches: int,
max_primitives: int,
max_sub_batches: int,
dpi_scaling: f32,
msaa_width: u32,
msaa_height: u32,
sample_count: sdl.GPUSampleCount,
clay_z_index: i16,
cleared: bool,
}
Init_Options :: struct { Init_Options :: struct {
// MSAA sample count. Default is ._1 (no MSAA). SDF rendering does not benefit from MSAA // MSAA sample count. Default is ._1 (no MSAA). SDF rendering does not benefit from MSAA
// because SDF fragments compute coverage analytically via `smoothstep`. MSAA helps for // because SDF fragments compute coverage analytically via `smoothstep`. MSAA helps for
@@ -135,10 +243,6 @@ Init_Options :: struct {
msaa_samples: sdl.GPUSampleCount, msaa_samples: sdl.GPUSampleCount,
} }
// Sentinel value: when passed as msaa_samples, `init` will use the maximum MSAA sample count
// supported by the GPU for the swapchain format.
MSAA_MAX :: sdl.GPUSampleCount(0xFF)
// Initialize the renderer. Returns false if GPU pipeline or text engine creation fails. // Initialize the renderer. Returns false if GPU pipeline or text engine creation fails.
@(require_results) @(require_results)
init :: proc( init :: proc(
@@ -168,22 +272,30 @@ init :: proc(
} }
GLOB = Global { GLOB = Global {
layers = make([dynamic]Layer, 0, INITIAL_LAYER_SIZE, allocator = allocator), layers = make([dynamic]Layer, 0, INITIAL_LAYER_SIZE, allocator = allocator),
scissors = make([dynamic]Scissor, 0, INITIAL_SCISSOR_SIZE, allocator = allocator), scissors = make([dynamic]Scissor, 0, INITIAL_SCISSOR_SIZE, allocator = allocator),
tmp_shape_verts = make([dynamic]Vertex, 0, BUFFER_INIT_SIZE, allocator = allocator), tmp_shape_verts = make([dynamic]Vertex, 0, BUFFER_INIT_SIZE, allocator = allocator),
tmp_text_verts = make([dynamic]Vertex, 0, BUFFER_INIT_SIZE, allocator = allocator), tmp_text_verts = make([dynamic]Vertex, 0, BUFFER_INIT_SIZE, allocator = allocator),
tmp_text_indices = make([dynamic]c.int, 0, BUFFER_INIT_SIZE, allocator = allocator), tmp_text_indices = make([dynamic]c.int, 0, BUFFER_INIT_SIZE, allocator = allocator),
tmp_text_batches = make([dynamic]TextBatch, 0, BUFFER_INIT_SIZE, allocator = allocator), tmp_text_batches = make([dynamic]TextBatch, 0, BUFFER_INIT_SIZE, allocator = allocator),
tmp_primitives = make([dynamic]Primitive, 0, BUFFER_INIT_SIZE, allocator = allocator), tmp_primitives = make([dynamic]Primitive, 0, BUFFER_INIT_SIZE, allocator = allocator),
tmp_sub_batches = make([dynamic]Sub_Batch, 0, BUFFER_INIT_SIZE, allocator = allocator), tmp_sub_batches = make([dynamic]Sub_Batch, 0, BUFFER_INIT_SIZE, allocator = allocator),
tmp_uncached_text = make([dynamic]^sdl_ttf.Text, 0, 16, allocator = allocator), tmp_uncached_text = make([dynamic]^sdl_ttf.Text, 0, 16, allocator = allocator),
odin_context = odin_context, device = device,
dpi_scaling = sdl.GetWindowDisplayScale(window), texture_slots = make([dynamic]Texture_Slot, 0, 16, allocator = allocator),
clay_memory = make([^]u8, min_memory_size, allocator = allocator), texture_free_list = make([dynamic]u32, 0, 16, allocator = allocator),
sample_count = resolved_sample_count, pending_texture_releases = make([dynamic]Texture_Id, 0, 16, allocator = allocator),
pipeline_2d_base = pipeline, pending_text_releases = make([dynamic]^sdl_ttf.Text, 0, 16, allocator = allocator),
text_cache = text_cache, odin_context = odin_context,
dpi_scaling = sdl.GetWindowDisplayScale(window),
clay_memory = make([^]u8, min_memory_size, allocator = allocator),
sample_count = resolved_sample_count,
pipeline_2d_base = pipeline,
text_cache = text_cache,
} }
// Reserve slot 0 for INVALID_TEXTURE
append(&GLOB.texture_slots, Texture_Slot{})
log.debug("Window DPI scaling:", GLOB.dpi_scaling) log.debug("Window DPI scaling:", GLOB.dpi_scaling)
arena := clay.CreateArenaWithCapacityAndMemory(min_memory_size, GLOB.clay_memory) arena := clay.CreateArenaWithCapacityAndMemory(min_memory_size, GLOB.clay_memory)
window_width, window_height: c.int window_width, window_height: c.int
@@ -230,12 +342,23 @@ destroy :: proc(device: ^sdl.GPUDevice, allocator := context.allocator) {
if GLOB.msaa_texture != nil { if GLOB.msaa_texture != nil {
sdl.ReleaseGPUTexture(device, GLOB.msaa_texture) sdl.ReleaseGPUTexture(device, GLOB.msaa_texture)
} }
process_pending_texture_releases()
destroy_all_textures()
destroy_sampler_pool()
for ttf_text in GLOB.pending_text_releases do sdl_ttf.DestroyText(ttf_text)
delete(GLOB.pending_text_releases)
destroy_pipeline_2d_base(device, &GLOB.pipeline_2d_base) destroy_pipeline_2d_base(device, &GLOB.pipeline_2d_base)
destroy_text_cache() destroy_text_cache()
} }
// Internal // Internal
clear_global :: proc() { clear_global :: proc() {
// Process deferred texture releases from the previous frame
process_pending_texture_releases()
// Process deferred text releases from the previous frame
for ttf_text in GLOB.pending_text_releases do sdl_ttf.DestroyText(ttf_text)
clear(&GLOB.pending_text_releases)
GLOB.curr_layer_index = 0 GLOB.curr_layer_index = 0
GLOB.clay_z_index = 0 GLOB.clay_z_index = 0
GLOB.cleared = false GLOB.cleared = false
@@ -265,6 +388,7 @@ measure_text_clay :: proc "c" (
context = GLOB.odin_context context = GLOB.odin_context
text := string(text.chars[:text.length]) text := string(text.chars[:text.length])
c_text := strings.clone_to_cstring(text, context.temp_allocator) c_text := strings.clone_to_cstring(text, context.temp_allocator)
defer delete(c_text, context.temp_allocator)
width, height: c.int width, height: c.int
if !sdl_ttf.GetStringSize(get_font(config.fontId, config.fontSize), c_text, 0, &width, &height) { if !sdl_ttf.GetStringSize(get_font(config.fontId, config.fontSize), c_text, 0, &width, &height) {
log.panicf("Failed to measure text: %s", sdl.GetError()) log.panicf("Failed to measure text: %s", sdl.GetError())
@@ -331,12 +455,13 @@ new_layer :: proc(prev_layer: ^Layer, bounds: Rectangle) -> ^Layer {
// --------------------------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Submit shape vertices (colored triangles) to the given layer for rendering. // Submit shape vertices (colored triangles) to the given layer for rendering.
// TODO: Should probably be renamed to better match the tessellated naming conventions in the library.
prepare_shape :: proc(layer: ^Layer, vertices: []Vertex) { prepare_shape :: proc(layer: ^Layer, vertices: []Vertex) {
if len(vertices) == 0 do return if len(vertices) == 0 do return
offset := u32(len(GLOB.tmp_shape_verts)) offset := u32(len(GLOB.tmp_shape_verts))
append(&GLOB.tmp_shape_verts, ..vertices) append(&GLOB.tmp_shape_verts, ..vertices)
scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1] scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1]
append_or_extend_sub_batch(scissor, layer, .Shapes, offset, u32(len(vertices))) append_or_extend_sub_batch(scissor, layer, .Tessellated, offset, u32(len(vertices)))
} }
// Submit an SDF primitive to the given layer for rendering. // Submit an SDF primitive to the given layer for rendering.
@@ -362,6 +487,9 @@ prepare_text :: proc(layer: ^Layer, text: Text) {
base_x := math.round(text.position[0] * GLOB.dpi_scaling) base_x := math.round(text.position[0] * GLOB.dpi_scaling)
base_y := math.round(text.position[1] * GLOB.dpi_scaling) base_y := math.round(text.position[1] * GLOB.dpi_scaling)
// Premultiply text color once — reused across all glyph vertices.
pm_color := premultiply_color(text.color)
for data != nil { for data != nil {
vertex_start := u32(len(GLOB.tmp_text_verts)) vertex_start := u32(len(GLOB.tmp_text_verts))
index_start := u32(len(GLOB.tmp_text_indices)) index_start := u32(len(GLOB.tmp_text_indices))
@@ -372,7 +500,7 @@ prepare_text :: proc(layer: ^Layer, text: Text) {
uv := data.uv[i] uv := data.uv[i]
append( append(
&GLOB.tmp_text_verts, &GLOB.tmp_text_verts,
Vertex{position = {pos.x + base_x, -pos.y + base_y}, uv = {uv.x, uv.y}, color = text.color}, Vertex{position = {pos.x + base_x, -pos.y + base_y}, uv = {uv.x, uv.y}, color = pm_color},
) )
} }
@@ -410,6 +538,9 @@ prepare_text_transformed :: proc(layer: ^Layer, text: Text, transform: Transform
scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1] scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1]
// Premultiply text color once — reused across all glyph vertices.
pm_color := premultiply_color(text.color)
for data != nil { for data != nil {
vertex_start := u32(len(GLOB.tmp_text_verts)) vertex_start := u32(len(GLOB.tmp_text_verts))
index_start := u32(len(GLOB.tmp_text_indices)) index_start := u32(len(GLOB.tmp_text_indices))
@@ -422,7 +553,7 @@ prepare_text_transformed :: proc(layer: ^Layer, text: Text, transform: Transform
// so we apply directly — no per-vertex DPI divide/multiply. // so we apply directly — no per-vertex DPI divide/multiply.
append( append(
&GLOB.tmp_text_verts, &GLOB.tmp_text_verts,
Vertex{position = apply_transform(transform, {pos.x, -pos.y}), uv = {uv.x, uv.y}, color = text.color}, Vertex{position = apply_transform(transform, {pos.x, -pos.y}), uv = {uv.x, uv.y}, color = pm_color},
) )
} }
@@ -454,15 +585,24 @@ append_or_extend_sub_batch :: proc(
kind: Sub_Batch_Kind, kind: Sub_Batch_Kind,
offset: u32, offset: u32,
count: u32, count: u32,
texture_id: Texture_Id = INVALID_TEXTURE,
sampler: Sampler_Preset = DFT_SAMPLER,
) { ) {
if scissor.sub_batch_len > 0 { if scissor.sub_batch_len > 0 {
last := &GLOB.tmp_sub_batches[scissor.sub_batch_start + scissor.sub_batch_len - 1] last := &GLOB.tmp_sub_batches[scissor.sub_batch_start + scissor.sub_batch_len - 1]
if last.kind == kind && kind != .Text && last.offset + last.count == offset { if last.kind == kind &&
kind != .Text &&
last.offset + last.count == offset &&
last.texture_id == texture_id &&
last.sampler == sampler {
last.count += count last.count += count
return return
} }
} }
append(&GLOB.tmp_sub_batches, Sub_Batch{kind = kind, offset = offset, count = count}) append(
&GLOB.tmp_sub_batches,
Sub_Batch{kind = kind, offset = offset, count = count, texture_id = texture_id, sampler = sampler},
)
scissor.sub_batch_len += 1 scissor.sub_batch_len += 1
layer.sub_batch_len += 1 layer.sub_batch_len += 1
} }
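The merge rule in `append_or_extend_sub_batch` can be illustrated without the renderer's global state. This is a simplified sketch, not the library's code: `Batch`, `Kind`, and `push_batch` are hypothetical names, the texture id is a plain `u32`, and the sampler comparison is left out for brevity.

```odin
package main

Kind :: enum u8 {
	Tessellated,
	Text,
	SDF,
}

// Hypothetical, trimmed-down stand-in for Sub_Batch.
Batch :: struct {
	kind:       Kind,
	offset:     u32,
	count:      u32,
	texture_id: u32,
}

// Appends a batch, or grows the previous one when the new range is contiguous
// and uses the same kind and texture. Text batches are never merged.
push_batch :: proc(batches: ^[dynamic]Batch, kind: Kind, offset, count: u32, texture_id: u32 = 0) {
	if len(batches^) > 0 {
		last := &batches^[len(batches^) - 1]
		if last.kind == kind &&
		   kind != .Text &&
		   last.offset + last.count == offset &&
		   last.texture_id == texture_id {
			last.count += count
			return
		}
	}
	append(batches, Batch{kind = kind, offset = offset, count = count, texture_id = texture_id})
}

main :: proc() {
	batches: [dynamic]Batch
	defer delete(batches)

	push_batch(&batches, .SDF, 0, 4)
	push_batch(&batches, .SDF, 4, 2) // contiguous, same texture: merged
	push_batch(&batches, .SDF, 6, 1, 7) // different texture: starts a new batch

	assert(len(batches) == 2)
	assert(batches[0].count == 6)
	assert(batches[1].texture_id == 7)
}
```

Including the texture in the comparison is what keeps a run of primitives from being drawn with the wrong binding: a change of texture has to end the current draw call even when the ranges are contiguous.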
@@ -502,6 +642,7 @@ prepare_clay_batch :: proc(
mouse_wheel_delta: [2]f32, mouse_wheel_delta: [2]f32,
frame_time: f32 = 0, frame_time: f32 = 0,
custom_draw: Custom_Draw = nil, custom_draw: Custom_Draw = nil,
temp_allocator := context.temp_allocator,
) { ) {
mouse_pos: [2]f32 mouse_pos: [2]f32
mouse_flags := sdl.GetMouseState(&mouse_pos.x, &mouse_pos.y) mouse_flags := sdl.GetMouseState(&mouse_pos.x, &mouse_pos.y)
@@ -538,10 +679,14 @@ prepare_clay_batch :: proc(
switch (render_command.commandType) { switch (render_command.commandType) {
case clay.RenderCommandType.None: case clay.RenderCommandType.None:
log.errorf(
"Received render command with type None. This generally means the render command list is in an invalid state.",
)
case clay.RenderCommandType.Text: case clay.RenderCommandType.Text:
render_data := render_command.renderData.text render_data := render_command.renderData.text
txt := string(render_data.stringContents.chars[:render_data.stringContents.length]) txt := string(render_data.stringContents.chars[:render_data.stringContents.length])
c_text := strings.clone_to_cstring(txt, context.temp_allocator) c_text := strings.clone_to_cstring(txt, temp_allocator)
defer delete(c_text, temp_allocator)
// Clay render-command IDs are derived via Clay's internal HashNumber (Jenkins-family) // Clay render-command IDs are derived via Clay's internal HashNumber (Jenkins-family)
// and namespaced with .Clay so they can never collide with user-provided custom text IDs. // and namespaced with .Clay so they can never collide with user-provided custom text IDs.
sdl_text := cache_get_or_update( sdl_text := cache_get_or_update(
@@ -551,6 +696,29 @@ prepare_clay_batch :: proc(
) )
prepare_text(layer, Text{sdl_text, {bounds.x, bounds.y}, color_from_clay(render_data.textColor)}) prepare_text(layer, Text{sdl_text, {bounds.x, bounds.y}, color_from_clay(render_data.textColor)})
case clay.RenderCommandType.Image: case clay.RenderCommandType.Image:
// Any texture
render_data := render_command.renderData.image
if render_data.imageData == nil do continue
img_data := (^Clay_Image_Data)(render_data.imageData)^
cr := render_data.cornerRadius
radii := Rectangle_Radii {
top_left = cr.topLeft,
top_right = cr.topRight,
bottom_right = cr.bottomRight,
bottom_left = cr.bottomLeft,
}
// Background color behind the image (Clay allows it)
bg := color_from_clay(render_data.backgroundColor)
if bg[3] > 0 {
rectangle(layer, bounds, bg, radii = radii)
}
// Compute fit UVs
uv, sampler, inner := fit_params(img_data.fit, bounds, img_data.texture_id)
// Draw the image
rectangle_texture(layer, inner, img_data.texture_id, img_data.tint, uv, sampler, radii)
case clay.RenderCommandType.ScissorStart: case clay.RenderCommandType.ScissorStart:
if bounds.width == 0 || bounds.height == 0 do continue if bounds.width == 0 || bounds.height == 0 do continue
@@ -582,34 +750,38 @@ prepare_clay_batch :: proc(
render_data := render_command.renderData.rectangle render_data := render_command.renderData.rectangle
cr := render_data.cornerRadius cr := render_data.cornerRadius
color := color_from_clay(render_data.backgroundColor) color := color_from_clay(render_data.backgroundColor)
radii := [4]f32{cr.topLeft, cr.topRight, cr.bottomRight, cr.bottomLeft} radii := Rectangle_Radii {
top_left = cr.topLeft,
if radii == {0, 0, 0, 0} { top_right = cr.topRight,
rectangle(layer, bounds, color) bottom_right = cr.bottomRight,
} else { bottom_left = cr.bottomLeft,
rectangle_corners(layer, bounds, radii, color)
} }
rectangle(layer, bounds, color, radii = radii)
case clay.RenderCommandType.Border: case clay.RenderCommandType.Border:
render_data := render_command.renderData.border render_data := render_command.renderData.border
cr := render_data.cornerRadius cr := render_data.cornerRadius
color := color_from_clay(render_data.color) color := color_from_clay(render_data.color)
thickness := f32(render_data.width.top) thickness := f32(render_data.width.top)
radii := [4]f32{cr.topLeft, cr.topRight, cr.bottomRight, cr.bottomLeft} radii := Rectangle_Radii {
top_left = cr.topLeft,
if radii == {0, 0, 0, 0} { top_right = cr.topRight,
rectangle_lines(layer, bounds, color, thickness) bottom_right = cr.bottomRight,
} else { bottom_left = cr.bottomLeft,
rectangle_corners_lines(layer, bounds, radii, color, thickness)
} }
rectangle(layer, bounds, BLANK, outline_color = color, outline_width = thickness, radii = radii)
case clay.RenderCommandType.Custom: if custom_draw != nil { case clay.RenderCommandType.Custom: if custom_draw != nil {
custom_draw(layer, bounds, render_command.renderData.custom) custom_draw(layer, bounds, render_command.renderData.custom)
} else {
log.error("Received clay render command of type custom but no custom_draw proc provided.")
} }
} }
} }
} }
// Render primitives. clear_color is the background fill before any layers are drawn. // Render primitives. clear_color is the background fill before any layers are drawn.
end :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window, clear_color: Color = BLACK) { end :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window, clear_color: Color = DFT_CLEAR_COLOR) {
cmd_buffer := sdl.AcquireGPUCommandBuffer(device) cmd_buffer := sdl.AcquireGPUCommandBuffer(device)
if cmd_buffer == nil { if cmd_buffer == nil {
log.panicf("Failed to acquire GPU command buffer: %s", sdl.GetError()) log.panicf("Failed to acquire GPU command buffer: %s", sdl.GetError())
@@ -642,7 +814,16 @@ end :: proc(device: ^sdl.GPUDevice, window: ^sdl.Window, clear_color: Color = BL
render_texture = GLOB.msaa_texture render_texture = GLOB.msaa_texture
} }
clear_color_f32 := color_to_f32(clear_color) // Premultiply clear color: the blend state is ONE, ONE_MINUS_SRC_ALPHA (premultiplied),
// so the clear color must also be premultiplied for correct background compositing.
clear_color_straight := color_to_f32(clear_color)
clear_alpha := clear_color_straight[3]
clear_color_f32 := [4]f32 {
clear_color_straight[0] * clear_alpha,
clear_color_straight[1] * clear_alpha,
clear_color_straight[2] * clear_alpha,
clear_alpha,
}
// Draw layers. One render pass per layer; sub-batches draw in submission order within each scissor. // Draw layers. One render pass per layer; sub-batches draw in submission order within each scissor.
for &layer, index in GLOB.layers { for &layer, index in GLOB.layers {
@@ -850,10 +1031,20 @@ Transform_2D :: struct {
// origin pivot point in local space (measured from the shape's natural reference point). // origin pivot point in local space (measured from the shape's natural reference point).
// rotation_deg rotation in degrees, counter-clockwise. // rotation_deg rotation in degrees, counter-clockwise.
// //
build_pivot_rotation :: proc(position: [2]f32, origin: [2]f32, rotation_deg: f32) -> Transform_2D { build_pivot_rotation :: proc(position: Vec2, origin: Vec2, rotation_deg: f32) -> Transform_2D {
radians := math.to_radians(rotation_deg) radians := math.to_radians(rotation_deg)
cos_angle := math.cos(radians) cos_angle := math.cos(radians)
sin_angle := math.sin(radians) sin_angle := math.sin(radians)
return build_pivot_rotation_sc(position, origin, cos_angle, sin_angle)
}
// Variant of build_pivot_rotation that accepts pre-computed cos/sin values,
// avoiding redundant trigonometry when the caller has already computed them.
build_pivot_rotation_sc :: #force_inline proc(
position: Vec2,
origin: Vec2,
cos_angle, sin_angle: f32,
) -> Transform_2D {
return Transform_2D { return Transform_2D {
m00 = cos_angle, m00 = cos_angle,
m01 = -sin_angle, m01 = -sin_angle,
@@ -865,7 +1056,7 @@ build_pivot_rotation :: proc(position: [2]f32, origin: [2]f32, rotation_deg: f32
} }
// Apply the transform to a local-space point, producing a world-space point. // Apply the transform to a local-space point, producing a world-space point.
apply_transform :: #force_inline proc(transform: Transform_2D, point: [2]f32) -> [2]f32 { apply_transform :: #force_inline proc(transform: Transform_2D, point: Vec2) -> Vec2 {
return { return {
transform.m00 * point.x + transform.m01 * point.y + transform.tx, transform.m00 * point.x + transform.m01 * point.y + transform.tx,
transform.m10 * point.x + transform.m11 * point.y + transform.ty, transform.m10 * point.x + transform.m11 * point.y + transform.ty,
@@ -875,7 +1066,7 @@ apply_transform :: #force_inline proc(transform: Transform_2D, point: [2]f32) ->
// Fast-path check callers use BEFORE building a transform. // Fast-path check callers use BEFORE building a transform.
// Returns true if either the origin is non-zero or rotation is non-zero, // Returns true if either the origin is non-zero or rotation is non-zero,
// meaning a transform actually needs to be computed. // meaning a transform actually needs to be computed.
needs_transform :: #force_inline proc(origin: [2]f32, rotation: f32) -> bool { needs_transform :: #force_inline proc(origin: Vec2, rotation: f32) -> bool {
return origin != {0, 0} || rotation != 0 return origin != {0, 0} || rotation != 0
} }
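The pivot-rotation helpers can be exercised on their own. The sketch below is an assumption-laden reconstruction: the hunk elides the `m10`, `m11`, `tx`, and `ty` fields, so the composition used here (`world = position + R * (local - origin)`) is inferred from the "planet_pos - origin" comment in the examples diff, not copied from the library.

```odin
package main

import "core:math"

Vec2 :: [2]f32

Transform_2D :: struct {
	m00, m01, m10, m11: f32,
	tx, ty:             f32,
}

// Assumed composition: world = position + R * (local - origin).
build_pivot_rotation :: proc(position, origin: Vec2, rotation_deg: f32) -> Transform_2D {
	radians := math.to_radians(rotation_deg)
	cos_angle := math.cos(radians)
	sin_angle := math.sin(radians)
	return Transform_2D {
		m00 = cos_angle,
		m01 = -sin_angle,
		m10 = sin_angle,
		m11 = cos_angle,
		tx = position.x - (cos_angle * origin.x - sin_angle * origin.y),
		ty = position.y - (sin_angle * origin.x + cos_angle * origin.y),
	}
}

apply_transform :: proc(transform: Transform_2D, point: Vec2) -> Vec2 {
	return {
		transform.m00 * point.x + transform.m01 * point.y + transform.tx,
		transform.m10 * point.x + transform.m11 * point.y + transform.ty,
	}
}

needs_transform :: proc(origin: Vec2, rotation: f32) -> bool {
	return origin != {0, 0} || rotation != 0
}

main :: proc() {
	planet := Vec2{100, 450}
	origin := Vec2{0, 40}

	// A non-zero origin alone is enough to require a transform.
	assert(needs_transform(origin, 0))
	assert(!needs_transform({0, 0}, 0))

	// At rotation 0 the shape's local center lands at planet - origin.
	transform := build_pivot_rotation(planet, origin, 0)
	assert(apply_transform(transform, {0, 0}) == Vec2{100, 410})
}
```

Under this reading, the local point equal to `origin` always maps to `position`, which is what makes `position` the fixed pivot as the rotation angle changes.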
+179
@@ -0,0 +1,179 @@
package draw_qr
import draw ".."
import "../../qrcode"
DFT_QR_DARK :: draw.BLACK // Default QR code dark module color.
DFT_QR_LIGHT :: draw.WHITE // Default QR code light module color.
DFT_QR_BOOST_ECL :: true // Default QR error correction level boost.
// Returns the number of bytes to_texture will write for the given encoded
// QR buffer. Equivalent to size*size*4 where size = qrcode.get_size(qrcode_buf).
texture_size :: #force_inline proc(qrcode_buf: []u8) -> int {
size := qrcode.get_size(qrcode_buf)
return size * size * 4
}
// Decodes an encoded QR buffer into tightly-packed RGBA pixel data written to
// texture_buf. No allocations, no GPU calls. Returns the Texture_Desc the
// caller should pass to draw.register_texture alongside texture_buf.
//
// Returns ok=false when:
// - qrcode_buf is invalid (qrcode.get_size returns 0).
// - texture_buf is smaller than texture_size(qrcode_buf).
@(require_results)
to_texture :: proc(
qrcode_buf: []u8,
texture_buf: []u8,
dark: draw.Color = DFT_QR_DARK,
light: draw.Color = DFT_QR_LIGHT,
) -> (
desc: draw.Texture_Desc,
ok: bool,
) {
size := qrcode.get_size(qrcode_buf)
if size == 0 do return {}, false
if len(texture_buf) < size * size * 4 do return {}, false
for y in 0 ..< size {
for x in 0 ..< size {
i := (y * size + x) * 4
c := dark if qrcode.get_module(qrcode_buf, x, y) else light
texture_buf[i + 0] = c[0]
texture_buf[i + 1] = c[1]
texture_buf[i + 2] = c[2]
texture_buf[i + 3] = c[3]
}
}
return draw.Texture_Desc {
width = u32(size),
height = u32(size),
depth_or_layers = 1,
type = .D2,
format = .R8G8B8A8_UNORM,
usage = {.SAMPLER},
mip_levels = 1,
kind = .Static,
},
true
}
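The pixel expansion done by `to_texture` can be tried without the `qrcode` or `draw` packages. This is a minimal sketch under stated assumptions: a row-major `[]bool` grid stands in for the encoded QR buffer, `Color` is assumed to be `[4]u8`, and `fill_rgba` is a hypothetical name.

```odin
package main

// Assumption: Color is four 8-bit channels in RGBA order.
Color :: [4]u8

// Writes size*size tightly packed RGBA pixels into `pixels`.
// `modules` is a hypothetical stand-in for the encoded QR buffer:
// a row-major grid where true means a dark module.
fill_rgba :: proc(modules: []bool, size: int, pixels: []u8, dark, light: Color) -> bool {
	if size <= 0 || len(modules) < size * size do return false
	if len(pixels) < size * size * 4 do return false

	for y in 0 ..< size {
		for x in 0 ..< size {
			i := (y * size + x) * 4
			c := dark if modules[y * size + x] else light
			pixels[i + 0] = c[0]
			pixels[i + 1] = c[1]
			pixels[i + 2] = c[2]
			pixels[i + 3] = c[3]
		}
	}
	return true
}

main :: proc() {
	// 2x2 checkerboard: dark, light / light, dark.
	modules := []bool{true, false, false, true}
	pixels: [16]u8

	ok := fill_rgba(modules, 2, pixels[:], Color{0, 0, 0, 255}, Color{255, 255, 255, 255})
	assert(ok)
	assert(pixels[0] == 0 && pixels[3] == 255) // first pixel is opaque black
	assert(pixels[4] == 255) // second pixel starts with white's red channel

	// A destination buffer that is too small is rejected before any write.
	small: [8]u8
	assert(!fill_rgba(modules, 2, small[:], Color{0, 0, 0, 255}, Color{255, 255, 255, 255}))
}
```

Checking the buffer length up front, as the original does, keeps the proc allocation-free and lets the caller decide where the pixel memory comes from.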
// Allocates pixel buffer via temp_allocator, decodes qrcode_buf into it, and
// registers with the GPU. The pixel allocation is freed before return.
//
// Returns ok=false when:
// - qrcode_buf is invalid (qrcode.get_size returns 0).
// - temp_allocator fails to allocate the pixel buffer.
// - GPU texture registration fails.
@(require_results)
register_texture_from_raw :: proc(
qrcode_buf: []u8,
dark: draw.Color = DFT_QR_DARK,
light: draw.Color = DFT_QR_LIGHT,
temp_allocator := context.temp_allocator,
) -> (
texture: draw.Texture_Id,
ok: bool,
) {
tex_size := texture_size(qrcode_buf)
if tex_size == 0 do return draw.INVALID_TEXTURE, false
pixels, alloc_err := make([]u8, tex_size, temp_allocator)
if alloc_err != nil do return draw.INVALID_TEXTURE, false
defer delete(pixels, temp_allocator)
desc := to_texture(qrcode_buf, pixels, dark, light) or_return
return draw.register_texture(desc, pixels)
}
// Encodes text as a QR Code and registers the result as an RGBA texture.
//
// Returns ok=false when:
// - temp_allocator fails to allocate.
// - The text cannot fit in any version within [min_version, max_version] at the given ECL.
// - GPU texture registration fails.
@(require_results)
register_texture_from_text :: proc(
text: string,
ecl: qrcode.Ecc = .Low,
min_version: int = qrcode.VERSION_MIN,
max_version: int = qrcode.VERSION_MAX,
mask: Maybe(qrcode.Mask) = nil,
boost_ecl: bool = DFT_QR_BOOST_ECL,
dark: draw.Color = DFT_QR_DARK,
light: draw.Color = DFT_QR_LIGHT,
temp_allocator := context.temp_allocator,
) -> (
texture: draw.Texture_Id,
ok: bool,
) {
qrcode_buf, alloc_err := make([]u8, qrcode.buffer_len_for_version(max_version), temp_allocator)
if alloc_err != nil do return draw.INVALID_TEXTURE, false
defer delete(qrcode_buf, temp_allocator)
qrcode.encode_auto(
text,
qrcode_buf,
ecl,
min_version,
max_version,
mask,
boost_ecl,
temp_allocator,
) or_return
return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator)
}
// Encodes arbitrary binary data as a QR Code and registers the result as an RGBA texture.
//
// Returns ok=false when:
// - temp_allocator fails to allocate.
// - The payload cannot fit in any version within [min_version, max_version] at the given ECL.
// - GPU texture registration fails.
@(require_results)
register_texture_from_binary :: proc(
bin_data: []u8,
ecl: qrcode.Ecc = .Low,
min_version: int = qrcode.VERSION_MIN,
max_version: int = qrcode.VERSION_MAX,
mask: Maybe(qrcode.Mask) = nil,
boost_ecl: bool = DFT_QR_BOOST_ECL,
dark: draw.Color = DFT_QR_DARK,
light: draw.Color = DFT_QR_LIGHT,
temp_allocator := context.temp_allocator,
) -> (
texture: draw.Texture_Id,
ok: bool,
) {
qrcode_buf, alloc_err := make([]u8, qrcode.buffer_len_for_version(max_version), temp_allocator)
if alloc_err != nil do return draw.INVALID_TEXTURE, false
defer delete(qrcode_buf, temp_allocator)
qrcode.encode_auto(
bin_data,
qrcode_buf,
ecl,
min_version,
max_version,
mask,
boost_ecl,
temp_allocator,
) or_return
return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator)
}
register_texture_from :: proc {
register_texture_from_text,
register_texture_from_binary,
}
// Default fit=.Fit preserves the QR's square aspect; override as needed.
clay_image :: #force_inline proc(
texture: draw.Texture_Id,
tint: draw.Color = draw.DFT_TINT,
) -> draw.Clay_Image_Data {
return draw.clay_image_data(texture, fit = .Fit, tint = tint)
}
@@ -1,19 +1,18 @@
package examples package examples
import "core:fmt" import "core:fmt"
import "core:log"
import "core:mem" import "core:mem"
import "core:os" import "core:os"
main :: proc() { main :: proc() {
//----- Tracking allocator ---------------------------------- //----- General setup ----------------------------------
{ {
tracking_temp_allocator := false
// Temp // Temp
track_temp: mem.Tracking_Allocator track_temp: mem.Tracking_Allocator
if tracking_temp_allocator { mem.tracking_allocator_init(&track_temp, context.temp_allocator)
mem.tracking_allocator_init(&track_temp, context.temp_allocator) context.temp_allocator = mem.tracking_allocator(&track_temp)
context.temp_allocator = mem.tracking_allocator(&track_temp)
}
// Default // Default
track: mem.Tracking_Allocator track: mem.Tracking_Allocator
mem.tracking_allocator_init(&track, context.allocator) mem.tracking_allocator_init(&track, context.allocator)
@@ -22,18 +21,10 @@ main :: proc() {
// This could be fine for some global state or it could be a memory leak. // This could be fine for some global state or it could be a memory leak.
defer { defer {
// Temp allocator // Temp allocator
if tracking_temp_allocator { if len(track_temp.bad_free_array) > 0 {
if len(track_temp.allocation_map) > 0 { fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
fmt.eprintf("=== %v allocations not freed - temp allocator: ===\n", len(track_temp.allocation_map)) for entry in track_temp.bad_free_array {
for _, entry in track_temp.allocation_map { fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
}
}
if len(track_temp.bad_free_array) > 0 {
fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
for entry in track_temp.bad_free_array {
fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
}
} }
mem.tracking_allocator_destroy(&track_temp) mem.tracking_allocator_destroy(&track_temp)
} }
@@ -52,12 +43,15 @@ main :: proc() {
} }
mem.tracking_allocator_destroy(&track) mem.tracking_allocator_destroy(&track)
} }
// Logger
context.logger = log.create_console_logger()
defer log.destroy_console_logger(context.logger)
} }
args := os.args args := os.args
if len(args) < 2 { if len(args) < 2 {
fmt.eprintln("Usage: examples <example_name>") fmt.eprintln("Usage: examples <example_name>")
fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom") fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom, textures")
os.exit(1) os.exit(1)
} }
@@ -66,9 +60,10 @@ main :: proc() {
case "hellope-custom": hellope_custom() case "hellope-custom": hellope_custom()
case "hellope-shapes": hellope_shapes() case "hellope-shapes": hellope_shapes()
case "hellope-text": hellope_text() case "hellope-text": hellope_text()
case "textures": textures()
case: case:
fmt.eprintf("Unknown example: %v\n", args[1]) fmt.eprintf("Unknown example: %v\n", args[1])
fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom") fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom, textures")
os.exit(1) os.exit(1)
} }
} }
+75 -44
@@ -1,6 +1,7 @@
package examples package examples
import "../../draw" import "../../draw"
import "../../draw/tess"
import "../../vendor/clay" import "../../vendor/clay"
import "core:math" import "core:math"
import "core:os" import "core:os"
@@ -28,19 +29,26 @@ hellope_shapes :: proc() {
base_layer := draw.begin({width = 500, height = 500}) base_layer := draw.begin({width = 500, height = 500})
// Background // Background
draw.rectangle(base_layer, {0, 0, 500, 500}, {40, 40, 40, 255}) draw.rectangle(base_layer, {0, 0, 500, 500}, draw.Color{40, 40, 40, 255})
// ----- Shapes without rotation (existing demo) ----- // ----- Shapes without rotation (existing demo) -----
draw.rectangle(base_layer, {20, 20, 200, 120}, {80, 120, 200, 255}) draw.rectangle(
draw.rectangle_lines(base_layer, {20, 20, 200, 120}, draw.WHITE, thickness = 2) base_layer,
draw.rectangle(base_layer, {240, 20, 240, 120}, {200, 80, 80, 255}, roundness = 0.3) {20, 20, 200, 120},
draw.rectangle_gradient( draw.Color{80, 120, 200, 255},
outline_color = draw.WHITE,
outline_width = 2,
radii = {top_right = 15, top_left = 5},
)
red_rect_radii := draw.uniform_radii({240, 20, 240, 120}, 0.3)
red_rect_radii.bottom_left = 0
draw.rectangle(base_layer, {240, 20, 240, 120}, draw.Color{200, 80, 80, 255}, radii = red_rect_radii)
draw.rectangle(
base_layer, base_layer,
{20, 160, 460, 60}, {20, 160, 460, 60},
{255, 0, 0, 255}, {255, 0, 0, 255},
{0, 255, 0, 255}, gradient = draw.Linear_Gradient{end_color = {0, 0, 255, 255}, angle = 0},
{0, 0, 255, 255},
{255, 255, 0, 255},
) )
// ----- Rotation demos ----- // ----- Rotation demos -----
@@ -50,17 +58,12 @@ hellope_shapes :: proc() {
draw.rectangle( draw.rectangle(
base_layer, base_layer,
rect, rect,
{100, 200, 100, 255}, draw.Color{100, 200, 100, 255},
origin = draw.center_of(rect), outline_color = draw.WHITE,
rotation = spin_angle, outline_width = 2,
)
draw.rectangle_lines(
base_layer,
rect,
draw.WHITE,
thickness = 2,
origin = draw.center_of(rect), origin = draw.center_of(rect),
rotation = spin_angle, rotation = spin_angle,
feather_px = 1,
) )
// Rounded rectangle rotating around its center // Rounded rectangle rotating around its center
@@ -68,8 +71,8 @@ hellope_shapes :: proc() {
draw.rectangle( draw.rectangle(
base_layer, base_layer,
rrect, rrect,
{200, 100, 200, 255}, draw.Color{200, 100, 200, 255},
roundness = 0.4, radii = draw.uniform_radii(rrect, 0.4),
origin = draw.center_of(rrect), origin = draw.center_of(rrect),
rotation = spin_angle, rotation = spin_angle,
) )
@@ -78,19 +81,36 @@ hellope_shapes :: proc() {
draw.ellipse(base_layer, {410, 340}, 50, 30, {255, 200, 50, 255}, rotation = spin_angle) draw.ellipse(base_layer, {410, 340}, 50, 30, {255, 200, 50, 255}, rotation = spin_angle)
// Circle orbiting a point (moon orbiting planet) // Circle orbiting a point (moon orbiting planet)
planet_pos := [2]f32{100, 450} // Convention B: center = pivot point (planet), origin = offset from moon center to pivot.
moon_pos := planet_pos + {0, -40} // Moon's visual center at rotation=0: planet_pos - origin = (100, 450) - (0, 40) = (100, 410).
planet_pos := draw.Vec2{100, 450}
draw.circle(base_layer, planet_pos, 8, {200, 200, 200, 255}) // planet (stationary) draw.circle(base_layer, planet_pos, 8, {200, 200, 200, 255}) // planet (stationary)
draw.circle(base_layer, moon_pos, 5, {100, 150, 255, 255}, origin = {0, 40}, rotation = spin_angle) // moon orbiting draw.circle(
base_layer,
planet_pos,
5,
{100, 150, 255, 255},
origin = draw.Vec2{0, 40},
rotation = spin_angle,
) // moon orbiting
// Ring arc rotating in place // Sector (pie slice) rotating in place
draw.ring(base_layer, {250, 450}, 15, 30, 0, 270, {100, 100, 220, 255}, rotation = spin_angle) draw.ring(
base_layer,
draw.Vec2{250, 450},
0,
30,
{100, 100, 220, 255},
start_angle = 0,
end_angle = 270,
rotation = spin_angle,
)
// Triangle rotating around its center // Triangle rotating around its center
tv1 := [2]f32{350, 420} tv1 := draw.Vec2{350, 420}
tv2 := [2]f32{420, 480} tv2 := draw.Vec2{420, 480}
tv3 := [2]f32{340, 480} tv3 := draw.Vec2{340, 480}
draw.triangle( tess.triangle_aa(
base_layer, base_layer,
tv1, tv1,
tv2, tv2,
@@ -101,8 +121,16 @@ hellope_shapes :: proc() {
) )
// Polygon rotating around its center (already had rotation; now with origin for orbit) // Polygon rotating around its center (already had rotation; now with origin for orbit)
draw.polygon(base_layer, {460, 450}, 6, 30, {180, 100, 220, 255}, rotation = spin_angle) draw.polygon(
draw.polygon_lines(base_layer, {460, 450}, 6, 30, draw.WHITE, rotation = spin_angle, thickness = 2) base_layer,
{460, 450},
6,
30,
{180, 100, 220, 255},
outline_color = draw.WHITE,
outline_width = 2,
rotation = spin_angle,
)
draw.end(gpu, window) draw.end(gpu, window)
} }
@@ -133,9 +161,6 @@ hellope_text :: proc() {
spin_angle += 0.5 spin_angle += 0.5
base_layer := draw.begin({width = 600, height = 600}) base_layer := draw.begin({width = 600, height = 600})
// Grey background
draw.rectangle(base_layer, {0, 0, 600, 600}, {127, 127, 127, 255})
// ----- Text API demos ----- // ----- Text API demos -----
// Cached text with id — TTF_Text reused across frames (good for text-heavy apps) // Cached text with id — TTF_Text reused across frames (good for text-heavy apps)
@@ -175,7 +200,7 @@ hellope_text :: proc() {
// Measure text for manual layout // Measure text for manual layout
size := draw.measure_text("Measured!", JETBRAINS_MONO_REGULAR, FONT_SIZE) size := draw.measure_text("Measured!", JETBRAINS_MONO_REGULAR, FONT_SIZE)
draw.rectangle(base_layer, {300 - size.x / 2, 380, size.x, size.y}, {60, 60, 60, 200}) draw.rectangle(base_layer, {300 - size.x / 2, 380, size.x, size.y}, draw.Color{60, 60, 60, 200})
draw.text( draw.text(
base_layer, base_layer,
"Measured!", "Measured!",
@@ -199,7 +224,7 @@ hellope_text :: proc() {
id = CORNER_SPIN_ID, id = CORNER_SPIN_ID,
) )
draw.end(gpu, window) draw.end(gpu, window, draw.Color{127, 127, 127, 255})
} }
} }
@@ -337,15 +362,21 @@ hellope_custom :: proc() {
draw_custom :: proc(layer: ^draw.Layer, bounds: draw.Rectangle, render_data: clay.CustomRenderData) { draw_custom :: proc(layer: ^draw.Layer, bounds: draw.Rectangle, render_data: clay.CustomRenderData) {
gauge := cast(^Gauge)render_data.customData gauge := cast(^Gauge)render_data.customData
// Background from clay's backgroundColor border_width: f32 = 2
draw.rectangle(layer, bounds, draw.color_from_clay(render_data.backgroundColor), roundness = 0.25) draw.rectangle(
layer,
bounds,
draw.color_from_clay(render_data.backgroundColor),
outline_color = draw.WHITE,
outline_width = border_width,
)
// Fill bar fill := draw.Rectangle {
fill := bounds x = bounds.x,
fill.width *= gauge.value y = bounds.y,
draw.rectangle(layer, fill, gauge.color, roundness = 0.25) width = bounds.width * gauge.value,
height = bounds.height,
// Border }
draw.rectangle_lines(layer, bounds, draw.WHITE, thickness = 2, roundness = 0.25) draw.rectangle(layer, fill, gauge.color)
} }
} }
+272
@@ -0,0 +1,272 @@
package examples
import "../../draw"
import "../../draw/draw_qr"
import "core:os"
import sdl "vendor:sdl3"
textures :: proc() {
if !sdl.Init({.VIDEO}) do os.exit(1)
window := sdl.CreateWindow("Textures", 800, 600, {.HIGH_PIXEL_DENSITY})
gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil)
if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1)
if !draw.init(gpu, window) do os.exit(1)
JETBRAINS_MONO_REGULAR = draw.register_font(JETBRAINS_MONO_REGULAR_RAW)
FONT_SIZE :: u16(14)
LABEL_OFFSET :: f32(8) // gap between item and its label
//----- Texture registration ----------------------------------
checker_size :: 8
checker_pixels: [checker_size * checker_size * 4]u8
for y in 0 ..< checker_size {
for x in 0 ..< checker_size {
i := (y * checker_size + x) * 4
is_dark := ((x + y) % 2) == 0
val: u8 = 40 if is_dark else 220
checker_pixels[i + 0] = val // R
checker_pixels[i + 1] = val / 2 // G — slight color tint
checker_pixels[i + 2] = val // B
checker_pixels[i + 3] = 255 // A
}
}
checker_texture, _ := draw.register_texture(
draw.Texture_Desc {
width = checker_size,
height = checker_size,
depth_or_layers = 1,
type = .D2,
format = .R8G8B8A8_UNORM,
usage = {.SAMPLER},
mip_levels = 1,
},
checker_pixels[:],
)
defer draw.unregister_texture(checker_texture)
stripe_w :: 16
stripe_h :: 8
stripe_pixels: [stripe_w * stripe_h * 4]u8
for y in 0 ..< stripe_h {
for x in 0 ..< stripe_w {
i := (y * stripe_w + x) * 4
stripe_pixels[i + 0] = u8(x * 255 / (stripe_w - 1)) // R gradient left→right
stripe_pixels[i + 1] = u8(y * 255 / (stripe_h - 1)) // G gradient top→bottom
stripe_pixels[i + 2] = 128 // B constant
stripe_pixels[i + 3] = 255 // A
}
}
stripe_texture, _ := draw.register_texture(
draw.Texture_Desc {
width = stripe_w,
height = stripe_h,
depth_or_layers = 1,
type = .D2,
format = .R8G8B8A8_UNORM,
usage = {.SAMPLER},
mip_levels = 1,
},
stripe_pixels[:],
)
defer draw.unregister_texture(stripe_texture)
qr_texture, _ := draw_qr.register_texture_from("https://x.com/miiilato/status/1880241066471051443")
defer draw.unregister_texture(qr_texture)
spin_angle: f32 = 0
//----- Draw loop ----------------------------------
for {
defer free_all(context.temp_allocator)
ev: sdl.Event
for sdl.PollEvent(&ev) {
if ev.type == .QUIT do return
}
spin_angle += 1
base_layer := draw.begin({width = 800, height = 600})
// Background
draw.rectangle(base_layer, {0, 0, 800, 600}, draw.Color{30, 30, 30, 255})
//----- Row 1: Sampler presets (y=30) ----------------------------------
ROW1_Y :: f32(30)
ITEM_SIZE :: f32(120)
COL1 :: f32(30)
COL2 :: f32(180)
COL3 :: f32(330)
COL4 :: f32(480)
// Nearest (sharp pixel edges)
draw.rectangle_texture(
base_layer,
{COL1, ROW1_Y, ITEM_SIZE, ITEM_SIZE},
checker_texture,
sampler = .Nearest_Clamp,
)
draw.text(
base_layer,
"Nearest",
{COL1, ROW1_Y + ITEM_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
// Linear (bilinear blur)
draw.rectangle_texture(
base_layer,
{COL2, ROW1_Y, ITEM_SIZE, ITEM_SIZE},
checker_texture,
sampler = .Linear_Clamp,
)
draw.text(
base_layer,
"Linear",
{COL2, ROW1_Y + ITEM_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
// Tiled (4x repeat)
draw.rectangle_texture(
base_layer,
{COL3, ROW1_Y, ITEM_SIZE, ITEM_SIZE},
checker_texture,
sampler = .Nearest_Repeat,
uv_rect = {0, 0, 4, 4},
)
draw.text(
base_layer,
"Tiled 4x",
{COL3, ROW1_Y + ITEM_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
//----- Row 2: QR code, rounded corners, rotation (y=190) ----------------------------------
ROW2_Y :: f32(190)
// QR code (RGBA texture with baked colors, nearest sampling)
draw.rectangle(base_layer, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, draw.Color{255, 255, 255, 255}) // white bg
draw.rectangle_texture(
base_layer,
{COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE},
qr_texture,
sampler = .Nearest_Clamp,
)
draw.text(
base_layer,
"QR Code",
{COL1, ROW2_Y + ITEM_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
// Rounded corners
draw.rectangle_texture(
base_layer,
{COL2, ROW2_Y, ITEM_SIZE, ITEM_SIZE},
checker_texture,
sampler = .Nearest_Clamp,
radii = draw.uniform_radii({COL2, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, 0.3),
)
draw.text(
base_layer,
"Rounded",
{COL2, ROW2_Y + ITEM_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
// Rotating
rot_rect := draw.Rectangle{COL3, ROW2_Y, ITEM_SIZE, ITEM_SIZE}
draw.rectangle_texture(
base_layer,
rot_rect,
checker_texture,
sampler = .Nearest_Clamp,
origin = draw.center_of(rot_rect),
rotation = spin_angle,
)
draw.text(
base_layer,
"Rotating",
{COL3, ROW2_Y + ITEM_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
//----- Row 3: Fit modes + Per-corner radii (y=360) ----------------------------------
ROW3_Y :: f32(360)
FIT_SIZE :: f32(120) // square target rect
// Stretch
uv_s, sampler_s, inner_s := draw.fit_params(.Stretch, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
draw.rectangle(base_layer, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255}) // bg
draw.rectangle_texture(base_layer, inner_s, stripe_texture, uv_rect = uv_s, sampler = sampler_s)
draw.text(
base_layer,
"Stretch",
{COL1, ROW3_Y + FIT_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
// Fill (center-crop)
uv_f, sampler_f, inner_f := draw.fit_params(.Fill, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
draw.rectangle(base_layer, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255})
draw.rectangle_texture(base_layer, inner_f, stripe_texture, uv_rect = uv_f, sampler = sampler_f)
draw.text(
base_layer,
"Fill",
{COL2, ROW3_Y + FIT_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
// Fit (letterbox)
uv_ft, sampler_ft, inner_ft := draw.fit_params(.Fit, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture)
draw.rectangle(base_layer, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, draw.Color{60, 60, 60, 255}) // visible margin bg
draw.rectangle_texture(base_layer, inner_ft, stripe_texture, uv_rect = uv_ft, sampler = sampler_ft)
draw.text(
base_layer,
"Fit",
{COL3, ROW3_Y + FIT_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
// Per-corner radii
draw.rectangle_texture(
base_layer,
{COL4, ROW3_Y, FIT_SIZE, FIT_SIZE},
checker_texture,
sampler = .Nearest_Clamp,
radii = {20, 0, 20, 0},
)
draw.text(
base_layer,
"Per-corner",
{COL4, ROW3_Y + FIT_SIZE + LABEL_OFFSET},
JETBRAINS_MONO_REGULAR,
FONT_SIZE,
color = draw.WHITE,
)
draw.end(gpu, window)
}
}
+133 -73
@@ -5,8 +5,13 @@ import "core:log"
import "core:mem" import "core:mem"
import sdl "vendor:sdl3" import sdl "vendor:sdl3"
// Vertex layout for tessellated and text geometry.
// IMPORTANT: `color` must be premultiplied alpha (RGB channels pre-scaled by alpha).
// The tessellated fragment shader passes vertex color through directly — it does NOT
// premultiply. The blend state is ONE, ONE_MINUS_SRC_ALPHA (premultiplied-over).
// Use `premultiply_color` when constructing vertices manually for `prepare_shape`.
Vertex :: struct { Vertex :: struct {
position: [2]f32, position: Vec2,
uv: [2]f32, uv: [2]f32,
color: Color, color: Color,
} }
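The premultiplied-alpha requirement described on `Vertex` can be illustrated with a small standalone sketch. `Color` is assumed here to be a four-byte RGBA array, and `premultiply_sketch` is a hypothetical stand-in; the package's own `premultiply_color` is not shown in this diff and may round differently.

```odin
package premultiply_example

import "core:fmt"

// Assumption: an RGBA color stored as four u8 channels.
Color :: [4]u8

// Hypothetical helper (not the package's premultiply_color):
// scales RGB by alpha with round-to-nearest; alpha is left unchanged.
premultiply_sketch :: proc(c: Color) -> Color {
	a := u32(c[3])
	return Color {
		u8((u32(c[0]) * a + 127) / 255),
		u8((u32(c[1]) * a + 127) / 255),
		u8((u32(c[2]) * a + 127) / 255),
		c[3],
	}
}

main :: proc() {
	// Straight-alpha {200, 80, 80, 128} becomes premultiplied {100, 40, 40, 128}.
	fmt.println(premultiply_sketch(Color{200, 80, 80, 128}))
}
```

With a `ONE, ONE_MINUS_SRC_ALPHA` blend state, passing a straight-alpha color instead would presumably make translucent geometry look too bright, since RGB would never be scaled by alpha.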
@@ -23,97 +28,127 @@ TextBatch :: struct {
// ----- SDF primitive types ----------- // ----- SDF primitive types -----------
// ---------------------------------------------------------------------------------------------------------------- // ----------------------------------------------------------------------------------------------------------------
// The SDF path evaluates one of four signed distance functions per primitive, dispatched
// by Shape_Kind encoded in the low byte of Primitive.flags:
//
// RRect — rounded rectangle with per-corner radii (sdRoundedBox). Also covers circles
// (uniform radii = half-size), capsule-style line segments (rotated, max rounding),
// and other RRect-reducible shapes.
// NGon — regular polygon with N sides and optional rounding.
// Ellipse — approximate ellipse (non-exact SDF, suitable for UI but not for shape merging).
// Ring_Arc — annular ring with optional angular clipping. Covers full rings, partial arcs,
// pie slices (inner_radius = 0), and loading spinners.
Shape_Kind :: enum u8 { Shape_Kind :: enum u8 {
Solid = 0, Solid = 0, // tessellated path (mode marker; not a real SDF kind)
RRect = 1, RRect = 1,
Circle = 2, NGon = 2,
Ellipse = 3, Ellipse = 3,
Segment = 4, Ring_Arc = 4,
Ring_Arc = 5,
NGon = 6,
} }
Shape_Flag :: enum u8 { Shape_Flag :: enum u8 {
Stroke, Textured, // bit 0: sample texture using uv.uv_rect (mutually exclusive with Gradient)
Gradient, // bit 1: 2-color gradient using uv.effects.gradient_color as end/outer color
Gradient_Radial, // bit 2: if set with Gradient, radial from center; else linear at angle
Outline, // bit 3: outer outline band using uv.effects.outline_color; CPU expands bounds by outline_width
Rotated, // bit 4: shape has non-zero rotation; rotation_sc contains packed sin/cos
Arc_Narrow, // bit 5: ring arc span ≤ π — intersect half-planes. Neither Arc bit = full ring.
Arc_Wide, // bit 6: ring arc span > π — union half-planes. Neither Arc bit = full ring.
} }
Shape_Flags :: bit_set[Shape_Flag;u8] Shape_Flags :: bit_set[Shape_Flag;u8]
RRect_Params :: struct { RRect_Params :: struct {
half_size: [2]f32, half_size: [2]f32,
radii: [4]f32, radii: [4]f32,
soft_px: f32, half_feather: f32, // feather_px * 0.5; shader uses smoothstep(-h, h, d)
stroke_px: f32, _: f32,
}
Circle_Params :: struct {
radius: f32,
soft_px: f32,
stroke_px: f32,
_: [5]f32,
}
Ellipse_Params :: struct {
radii: [2]f32,
soft_px: f32,
stroke_px: f32,
_: [4]f32,
}
Segment_Params :: struct {
a: [2]f32,
b: [2]f32,
width: f32,
soft_px: f32,
_: [2]f32,
}
Ring_Arc_Params :: struct {
inner_radius: f32,
outer_radius: f32,
start_rad: f32,
end_rad: f32,
soft_px: f32,
_: [3]f32,
} }
NGon_Params :: struct { NGon_Params :: struct {
radius: f32, radius: f32,
rotation: f32, sides: f32,
sides: f32, half_feather: f32, // feather_px * 0.5; shader uses smoothstep(-h, h, d)
soft_px: f32, _: [5]f32,
stroke_px: f32, }
_: [3]f32,
Ellipse_Params :: struct {
radii: [2]f32,
half_feather: f32, // feather_px * 0.5; shader uses smoothstep(-h, h, d)
_: [5]f32,
}
Ring_Arc_Params :: struct {
inner_radius: f32, // inner radius in physical pixels (0 for pie slice)
outer_radius: f32, // outer radius in physical pixels
normal_start: [2]f32, // pre-computed outward normal of start edge: (sin(start), -cos(start))
normal_end: [2]f32, // pre-computed outward normal of end edge: (-sin(end), cos(end))
half_feather: f32, // feather_px * 0.5; shader uses smoothstep(-h, h, d)
_: f32,
} }
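As a rough sketch of how the `normal_start` / `normal_end` fields might be filled on the CPU, the following uses the formulas given in the field comments. `Arc_Mode`, `arc_normals`, the degree-based inputs, and the rule for treating a span of 2π or more as a full ring are all assumptions made for illustration; the package's actual code is not shown in this diff.

```odin
package ring_arc_example

import "core:fmt"
import "core:math"

// Hypothetical mirror of the Arc_Narrow / Arc_Wide flags.
Arc_Mode :: enum {
	Full,   // neither Arc bit set
	Narrow, // span <= pi: shader intersects the two half-planes
	Wide,   // span > pi: shader unions the two half-planes
}

// Computes edge normals from the formulas in the Ring_Arc_Params comments:
//   normal_start = ( sin(start), -cos(start))
//   normal_end   = (-sin(end),    cos(end))
// Assumption: angles arrive in degrees with end >= start.
arc_normals :: proc(start_deg, end_deg: f32) -> (n_start, n_end: [2]f32, mode: Arc_Mode) {
	s := math.to_radians(start_deg)
	e := math.to_radians(end_deg)
	n_start = {math.sin(s), -math.cos(s)}
	n_end = {-math.sin(e), math.cos(e)}
	span := e - s
	switch {
	case span >= 2 * math.PI:
		mode = .Full
	case span <= math.PI:
		mode = .Narrow
	case:
		mode = .Wide
	}
	return
}

main :: proc() {
	// The 0..270 degree sector from the shapes example spans more than pi.
	n_start, n_end, mode := arc_normals(0, 270)
	fmt.println(n_start, n_end, mode)
}
```

Precomputing the normals this way lets the fragment shader clip the arc with two dot products instead of a per-pixel `atan2`.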
Shape_Params :: struct #raw_union { Shape_Params :: struct #raw_union {
rrect: RRect_Params, rrect: RRect_Params,
circle: Circle_Params,
ellipse: Ellipse_Params,
segment: Segment_Params,
ring_arc: Ring_Arc_Params,
ngon: NGon_Params, ngon: NGon_Params,
ellipse: Ellipse_Params,
ring_arc: Ring_Arc_Params,
raw: [8]f32, raw: [8]f32,
} }
#assert(size_of(Shape_Params) == 32) #assert(size_of(Shape_Params) == 32)
// GPU layout: 64 bytes, std430-compatible. The shader declares this as a storage buffer struct. // GPU-side storage for 2-color gradient parameters and/or outline parameters.
Primitive :: struct { // Packed into 16 bytes to alias with uv_rect in the Uv_Or_Effects raw union.
bounds: [4]f32, // 0: min_x, min_y, max_x, max_y (world-space, pre-DPI) // The shader reads gradient_color and outline_color via unpackUnorm4x8.
color: Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8 // gradient_dir_sc stores the pre-computed gradient direction as (cos, sin) in f16 pair
kind_flags: u32, // 20: (kind as u32) | (flags as u32 << 8) // via unpackHalf2x16. outline_packed stores outline_width as f16 via unpackHalf2x16.
rotation: f32, // 24: shader self-rotation in radians (used by RRect, Ellipse) Gradient_Outline :: struct {
_pad: f32, // 28: alignment to vec4 boundary gradient_color: Color, // 0: end (linear) or outer (radial) gradient color
params: Shape_Params, // 32: two vec4s of shape params outline_color: Color, // 4: outline band color
gradient_dir_sc: u32, // 8: packed f16 pair: low = cos(angle), high = sin(angle) — pre-computed gradient direction
outline_packed: u32, // 12: packed f16 pair: low = outline_width (f16, physical pixels), high = reserved
} }
#assert(size_of(Primitive) == 64) #assert(size_of(Gradient_Outline) == 16)
// Uv_Or_Effects aliases the final 16 bytes of a Primitive. When .Textured is set,
// uv_rect holds texture-atlas coordinates. When .Gradient or .Outline is set,
// effects holds 2-color gradient parameters and/or outline parameters.
// Textured and Gradient are mutually exclusive; if both are set, Gradient takes precedence.
Uv_Or_Effects :: struct #raw_union {
uv_rect: [4]f32, // u_min, v_min, u_max, v_max (default {0,0,1,1})
effects: Gradient_Outline, // gradient + outline parameters
}
// GPU layout: 80 bytes, std430-compatible. The shader declares this as a storage buffer struct.
// The low byte of `flags` encodes the Shape_Kind (0 = tessellated, 1-4 = SDF kinds).
// Bits 8-15 encode Shape_Flags (Textured, Gradient, Gradient_Radial, Outline, Rotated, Arc_Narrow, Arc_Wide).
// rotation_sc stores pre-computed sin/cos of the rotation angle as a packed f16 pair,
// avoiding per-pixel trigonometry in the fragment shader. Only read when .Rotated is set.
Primitive :: struct {
bounds: [4]f32, // 0: min_x, min_y, max_x, max_y (world-space, pre-DPI)
color: Color, // 16: u8x4, fill color / gradient start color / texture tint
flags: u32, // 20: low byte = Shape_Kind, bits 8+ = Shape_Flags
rotation_sc: u32, // 24: packed f16 pair: low = sin(angle), high = cos(angle). Requires .Rotated flag.
_pad: f32, // 28: reserved for future use
params: Shape_Params, // 32: per-kind shape parameters (raw union, 32 bytes)
uv: Uv_Or_Effects, // 64: texture coords or gradient/outline parameters
}
#assert(size_of(Primitive) == 80)
// Pack shape kind and flags into the Primitive.flags field. The low byte encodes the Shape_Kind
// (which also serves as the SDF mode marker — kind > 0 means SDF path). The tessellated path
// leaves the field at 0 (Solid kind, set by vertex shader zero-initialization).
pack_kind_flags :: #force_inline proc(kind: Shape_Kind, flags: Shape_Flags) -> u32 { pack_kind_flags :: #force_inline proc(kind: Shape_Kind, flags: Shape_Flags) -> u32 {
return u32(kind) | (u32(transmute(u8)flags) << 8) return u32(kind) | (u32(transmute(u8)flags) << 8)
} }
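The packing above can be checked against the way the fragment shader decodes it (`kind = flags & 255`, `flags = (flags >> 8) & 255`). The sketch below copies the enum and proc declarations from the new side of this diff into a standalone program; only `main` is added for illustration.

```odin
package kind_flags_example

import "core:fmt"

Shape_Kind :: enum u8 {
	Solid    = 0,
	RRect    = 1,
	NGon     = 2,
	Ellipse  = 3,
	Ring_Arc = 4,
}

Shape_Flag :: enum u8 {
	Textured,        // bit 0
	Gradient,        // bit 1
	Gradient_Radial, // bit 2
	Outline,         // bit 3
	Rotated,         // bit 4
	Arc_Narrow,      // bit 5
	Arc_Wide,        // bit 6
}

Shape_Flags :: bit_set[Shape_Flag;u8]

pack_kind_flags :: proc(kind: Shape_Kind, flags: Shape_Flags) -> u32 {
	return u32(kind) | (u32(transmute(u8)flags) << 8)
}

main :: proc() {
	// Outline (bit 3) + Rotated (bit 4) = 0x18 in the flag byte; RRect = 1 in the low byte.
	packed := pack_kind_flags(.RRect, {.Outline, .Rotated})

	// Decode the same way the shader does.
	kind := packed & 0xFF
	flags := (packed >> 8) & 0xFF
	is_rotated := (flags & 16) != 0
	fmt.println(packed == 0x1801, kind, is_rotated)
}
```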
// Pack two f16 values into a single u32 for GPU consumption via unpackHalf2x16.
// Used to pack gradient_dir_sc (cos/sin) and outline_packed (width/reserved) in Gradient_Outline.
pack_f16_pair :: #force_inline proc(low, high: f16) -> u32 {
return u32(transmute(u16)low) | (u32(transmute(u16)high) << 16)
}
Pipeline_2D_Base :: struct { Pipeline_2D_Base :: struct {
sdl_pipeline: ^sdl.GPUGraphicsPipeline, sdl_pipeline: ^sdl.GPUGraphicsPipeline,
vertex_buffer: Buffer, vertex_buffer: Buffer,
@@ -206,19 +241,23 @@ create_pipeline_2d_base :: proc(
target_info = sdl.GPUGraphicsPipelineTargetInfo { target_info = sdl.GPUGraphicsPipelineTargetInfo {
color_target_descriptions = &sdl.GPUColorTargetDescription { color_target_descriptions = &sdl.GPUColorTargetDescription {
format = sdl.GetGPUSwapchainTextureFormat(device, window), format = sdl.GetGPUSwapchainTextureFormat(device, window),
// Premultiplied-alpha blending: src outputs RGB pre-multiplied by alpha,
// so src factor is ONE (not SRC_ALPHA). This eliminates the per-pixel
// divide in the outline path and is the standard blend mode used by
// Skia, Flutter, and GPUI.
blend_state = sdl.GPUColorTargetBlendState { blend_state = sdl.GPUColorTargetBlendState {
enable_blend = true, enable_blend = true,
enable_color_write_mask = true, enable_color_write_mask = true,
src_color_blendfactor = .SRC_ALPHA, src_color_blendfactor = .ONE,
dst_color_blendfactor = .ONE_MINUS_SRC_ALPHA, dst_color_blendfactor = .ONE_MINUS_SRC_ALPHA,
color_blend_op = .ADD, color_blend_op = .ADD,
src_alpha_blendfactor = .SRC_ALPHA, src_alpha_blendfactor = .ONE,
dst_alpha_blendfactor = .ONE_MINUS_SRC_ALPHA, dst_alpha_blendfactor = .ONE_MINUS_SRC_ALPHA,
alpha_blend_op = .ADD, alpha_blend_op = .ADD,
color_write_mask = sdl.GPUColorComponentFlags{.R, .G, .B, .A}, color_write_mask = sdl.GPUColorComponentFlags{.R, .G, .B, .A},
}, },
}, },
num_color_targets = 1, num_color_targets = 1,
}, },
vertex_input_state = sdl.GPUVertexInputState { vertex_input_state = sdl.GPUVertexInputState {
vertex_buffer_descriptions = &sdl.GPUVertexBufferDescription { vertex_buffer_descriptions = &sdl.GPUVertexBufferDescription {
@@ -298,7 +337,7 @@ create_pipeline_2d_base :: proc(
} }
// Upload white pixel and unit quad data in a single command buffer // Upload white pixel and unit quad data in a single command buffer
white_pixel := [4]u8{255, 255, 255, 255} white_pixel := Color{255, 255, 255, 255}
white_transfer_buf := sdl.CreateGPUTransferBuffer( white_transfer_buf := sdl.CreateGPUTransferBuffer(
device, device,
sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = size_of(white_pixel)}, sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = size_of(white_pixel)},
@@ -566,6 +605,7 @@ draw_layer :: proc(
current_mode: Draw_Mode = .Tessellated current_mode: Draw_Mode = .Tessellated
current_vert_buf := main_vert_buf current_vert_buf := main_vert_buf
current_atlas: ^sdl.GPUTexture current_atlas: ^sdl.GPUTexture
current_sampler := sampler
// Text vertices live after shape vertices in the GPU vertex buffer // Text vertices live after shape vertices in the GPU vertex buffer
text_vertex_gpu_base := u32(len(GLOB.tmp_shape_verts)) text_vertex_gpu_base := u32(len(GLOB.tmp_shape_verts))
@@ -575,7 +615,7 @@ draw_layer :: proc(
for &batch in GLOB.tmp_sub_batches[scissor.sub_batch_start:][:scissor.sub_batch_len] { for &batch in GLOB.tmp_sub_batches[scissor.sub_batch_start:][:scissor.sub_batch_len] {
switch batch.kind { switch batch.kind {
case .Shapes: case .Tessellated:
if current_mode != .Tessellated { if current_mode != .Tessellated {
push_globals(cmd_buffer, width, height, .Tessellated) push_globals(cmd_buffer, width, height, .Tessellated)
current_mode = .Tessellated current_mode = .Tessellated
@@ -584,14 +624,24 @@ draw_layer :: proc(
sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = main_vert_buf, offset = 0}, 1) sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = main_vert_buf, offset = 0}, 1)
current_vert_buf = main_vert_buf current_vert_buf = main_vert_buf
} }
if current_atlas != white_texture { // Determine texture and sampler for this batch
batch_texture: ^sdl.GPUTexture = white_texture
batch_sampler: ^sdl.GPUSampler = sampler
if batch.texture_id != INVALID_TEXTURE {
if bound_texture := texture_gpu_handle(batch.texture_id); bound_texture != nil {
batch_texture = bound_texture
}
batch_sampler = get_sampler(batch.sampler)
}
if current_atlas != batch_texture || current_sampler != batch_sampler {
sdl.BindGPUFragmentSamplers( sdl.BindGPUFragmentSamplers(
render_pass, render_pass,
0, 0,
&sdl.GPUTextureSamplerBinding{texture = white_texture, sampler = sampler}, &sdl.GPUTextureSamplerBinding{texture = batch_texture, sampler = batch_sampler},
1, 1,
) )
current_atlas = white_texture current_atlas = batch_texture
current_sampler = batch_sampler
} }
sdl.DrawGPUPrimitives(render_pass, batch.count, 1, batch.offset, 0) sdl.DrawGPUPrimitives(render_pass, batch.count, 1, batch.offset, 0)
@@ -632,14 +682,24 @@ draw_layer :: proc(
sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = unit_quad, offset = 0}, 1) sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = unit_quad, offset = 0}, 1)
current_vert_buf = unit_quad current_vert_buf = unit_quad
} }
if current_atlas != white_texture { // Determine texture and sampler for this batch
batch_texture: ^sdl.GPUTexture = white_texture
batch_sampler: ^sdl.GPUSampler = sampler
if batch.texture_id != INVALID_TEXTURE {
if bound_texture := texture_gpu_handle(batch.texture_id); bound_texture != nil {
batch_texture = bound_texture
}
batch_sampler = get_sampler(batch.sampler)
}
if current_atlas != batch_texture || current_sampler != batch_sampler {
sdl.BindGPUFragmentSamplers( sdl.BindGPUFragmentSamplers(
render_pass, render_pass,
0, 0,
&sdl.GPUTextureSamplerBinding{texture = white_texture, sampler = sampler}, &sdl.GPUTextureSamplerBinding{texture = batch_texture, sampler = batch_sampler},
1, 1,
) )
current_atlas = white_texture current_atlas = batch_texture
current_sampler = batch_sampler
} }
sdl.DrawGPUPrimitives(render_pass, 6, batch.count, 0, batch.offset) sdl.DrawGPUPrimitives(render_pass, 6, batch.count, 0, batch.offset)
} }
+153 -202
@@ -23,274 +23,225 @@ struct main0_in
float2 f_local_or_uv [[user(locn1)]]; float2 f_local_or_uv [[user(locn1)]];
float4 f_params [[user(locn2)]]; float4 f_params [[user(locn2)]];
float4 f_params2 [[user(locn3)]]; float4 f_params2 [[user(locn3)]];
uint f_kind_flags [[user(locn4)]]; uint f_flags [[user(locn4)]];
float f_rotation [[user(locn5), flat]]; uint f_rotation_sc [[user(locn5)]];
uint4 f_uv_or_effects [[user(locn6)]];
}; };
static inline __attribute__((always_inline)) static inline __attribute__((always_inline))
float2 apply_rotation(thread const float2& p, thread const float& angle) float sdRoundedBox(thread const float2& p, thread const float2& b, thread const float4& r)
{ {
float cr = cos(-angle); float2 _48;
float sr = sin(-angle);
return float2x2(float2(cr, sr), float2(-sr, cr)) * p;
}
static inline __attribute__((always_inline))
float sdRoundedBox(thread const float2& p, thread const float2& b, thread float4& r)
{
float2 _61;
if (p.x > 0.0) if (p.x > 0.0)
{ {
_61 = r.xy; _48 = r.xy;
} }
else else
{ {
_61 = r.zw; _48 = r.zw;
} }
r.x = _61.x; float2 rxy = _48;
r.y = _61.y; float _62;
float _78;
if (p.y > 0.0) if (p.y > 0.0)
{ {
_78 = r.x; _62 = rxy.x;
} }
else else
{ {
_78 = r.y; _62 = rxy.y;
} }
r.x = _78; float rr = _62;
float2 q = (abs(p) - b) + float2(r.x); float2 q = abs(p) - b;
return (fast::min(fast::max(q.x, q.y), 0.0) + length(fast::max(q, float2(0.0)))) - r.x; if (rr == 0.0)
}
static inline __attribute__((always_inline))
float sdf_stroke(thread const float& d, thread const float& stroke_width)
{
return abs(d) - (stroke_width * 0.5);
}
static inline __attribute__((always_inline))
float sdCircle(thread const float2& p, thread const float& r)
{
return length(p) - r;
}
static inline __attribute__((always_inline))
float sdEllipse(thread float2& p, thread float2& ab)
{
p = abs(p);
if (p.x > p.y)
{ {
p = p.yx; return fast::max(q.x, q.y);
ab = ab.yx;
} }
float l = (ab.y * ab.y) - (ab.x * ab.x); q += float2(rr);
float m = (ab.x * p.x) / l; return (fast::min(fast::max(q.x, q.y), 0.0) + length(fast::max(q, float2(0.0)))) - rr;
float m2 = m * m;
float n = (ab.y * p.y) / l;
float n2 = n * n;
float c = ((m2 + n2) - 1.0) / 3.0;
float c3 = (c * c) * c;
float q = c3 + ((m2 * n2) * 2.0);
float d = c3 + (m2 * n2);
float g = m + (m * n2);
float co;
if (d < 0.0)
{
float h = acos(q / c3) / 3.0;
float s = cos(h);
float t = sin(h) * 1.73205077648162841796875;
float rx = sqrt(((-c) * ((s + t) + 2.0)) + m2);
float ry = sqrt(((-c) * ((s - t) + 2.0)) + m2);
co = (((ry + (sign(l) * rx)) + (abs(g) / (rx * ry))) - m) / 2.0;
}
else
{
float h_1 = ((2.0 * m) * n) * sqrt(d);
float s_1 = sign(q + h_1) * powr(abs(q + h_1), 0.3333333432674407958984375);
float u = sign(q - h_1) * powr(abs(q - h_1), 0.3333333432674407958984375);
float rx_1 = (((-s_1) - u) - (c * 4.0)) + (2.0 * m2);
float ry_1 = (s_1 - u) * 1.73205077648162841796875;
float rm = sqrt((rx_1 * rx_1) + (ry_1 * ry_1));
co = (((ry_1 / sqrt(rm - rx_1)) + ((2.0 * g) / rm)) - m) / 2.0;
}
float2 r = ab * float2(co, sqrt(1.0 - (co * co)));
return length(r - p) * sign(p.y - r.y);
} }
static inline __attribute__((always_inline)) static inline __attribute__((always_inline))
float sdSegment(thread const float2& p, thread const float2& a, thread const float2& b) float sdRegularPolygon(thread const float2& p, thread const float& r, thread const float& n)
{ {
float2 pa = p - a; float an = 3.1415927410125732421875 / n;
float2 ba = b - a; float bn = mod(precise::atan2(p.y, p.x), 2.0 * an) - an;
float h = fast::clamp(dot(pa, ba) / dot(ba, ba), 0.0, 1.0); return (length(p) * cos(bn)) - r;
return length(pa - (ba * h));
} }
static inline __attribute__((always_inline)) static inline __attribute__((always_inline))
float sdf_alpha(thread const float& d, thread const float& soft) float sdEllipseApprox(thread const float2& p, thread const float2& ab)
{ {
return 1.0 - smoothstep(-soft, soft, d); float k0 = length(p / ab);
float k1 = length(p / (ab * ab));
return (k0 * (k0 - 1.0)) / k1;
}
static inline __attribute__((always_inline))
float4 gradient_2color(thread const float4& start_color, thread const float4& end_color, thread const float& t)
{
return mix(start_color, end_color, float4(fast::clamp(t, 0.0, 1.0)));
}
static inline __attribute__((always_inline))
float sdf_alpha(thread const float& d, thread const float& h)
{
return 1.0 - smoothstep(-h, h, d);
} }
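For reasoning about the new `half_feather` convention on the CPU (for example in a unit test), the shader's `sdf_alpha` can be mirrored in Odin. This is an illustrative port, not code from the repository, and it assumes `h > 0`.

```odin
package sdf_alpha_example

import "core:fmt"

// CPU port of the shader's sdf_alpha: 1 - smoothstep(-h, h, d).
// d is the signed distance in pixels (negative inside the shape),
// h is half_feather (feather_px * 0.5). Assumes h > 0.
sdf_alpha :: proc(d, h: f32) -> f32 {
	t := clamp((d + h) / (2 * h), 0, 1)
	return 1 - t * t * (3 - 2 * t)
}

main :: proc() {
	h: f32 = 0.5 // feather_px = 1
	// Coverage is 1 half a pixel inside the edge, 0.5 on the edge, 0 half a pixel outside.
	fmt.println(sdf_alpha(-0.5, h), sdf_alpha(0, h), sdf_alpha(0.5, h))
}
```

Compared with the old `soft` parameter, which the shader clamped to at least 1.0, the transition band here is exactly `feather_px` wide and centered on the shape's edge.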
fragment main0_out main0(main0_in in [[stage_in]], texture2d<float> tex [[texture(0)]], sampler texSmplr [[sampler(0)]]) fragment main0_out main0(main0_in in [[stage_in]], texture2d<float> tex [[texture(0)]], sampler texSmplr [[sampler(0)]])
{ {
main0_out out = {}; main0_out out = {};
uint kind = in.f_kind_flags & 255u; uint kind = in.f_flags & 255u;
uint flags = (in.f_kind_flags >> 8u) & 255u; uint flags = (in.f_flags >> 8u) & 255u;
if (kind == 0u) if (kind == 0u)
{ {
out.out_color = in.f_color * tex.sample(texSmplr, in.f_local_or_uv); float4 t = tex.sample(texSmplr, in.f_local_or_uv);
float _195 = t.w;
float4 _197 = t;
float3 _199 = _197.xyz * _195;
t.x = _199.x;
t.y = _199.y;
t.z = _199.z;
out.out_color = in.f_color * t;
return out; return out;
} }
float d = 1000000015047466219876688855040.0; float d = 1000000015047466219876688855040.0;
float soft = 1.0; float h = 0.5;
float2 half_size = in.f_params.xy;
float2 p_local = in.f_local_or_uv;
if ((flags & 16u) != 0u)
{
float2 sc = float2(as_type<half2>(in.f_rotation_sc));
p_local = float2((sc.y * p_local.x) + (sc.x * p_local.y), ((-sc.x) * p_local.x) + (sc.y * p_local.y));
}
if (kind == 1u) if (kind == 1u)
{ {
float2 b = in.f_params.xy; float4 corner_radii = float4(in.f_params.zw, in.f_params2.xy);
float4 r = float4(in.f_params.zw, in.f_params2.xy); h = in.f_params2.z;
soft = fast::max(in.f_params2.z, 1.0); float2 param = p_local;
float stroke_px = in.f_params2.w; float2 param_1 = half_size;
float2 p_local = in.f_local_or_uv; float4 param_2 = corner_radii;
if (in.f_rotation != 0.0) d = sdRoundedBox(param, param_1, param_2);
{
float2 param = p_local;
float param_1 = in.f_rotation;
p_local = apply_rotation(param, param_1);
}
float2 param_2 = p_local;
float2 param_3 = b;
float4 param_4 = r;
float _491 = sdRoundedBox(param_2, param_3, param_4);
d = _491;
if ((flags & 1u) != 0u)
{
float param_5 = d;
float param_6 = stroke_px;
d = sdf_stroke(param_5, param_6);
}
} }
else else
{ {
if (kind == 2u) if (kind == 2u)
{ {
float radius = in.f_params.x; float radius = in.f_params.x;
soft = fast::max(in.f_params.y, 1.0); float sides = in.f_params.y;
float stroke_px_1 = in.f_params.z; h = in.f_params.z;
float2 param_7 = in.f_local_or_uv; float2 param_3 = p_local;
float param_8 = radius; float param_4 = radius;
d = sdCircle(param_7, param_8); float param_5 = sides;
if ((flags & 1u) != 0u) d = sdRegularPolygon(param_3, param_4, param_5);
{ half_size = float2(radius);
float param_9 = d;
float param_10 = stroke_px_1;
d = sdf_stroke(param_9, param_10);
}
} }
else else
{ {
if (kind == 3u) if (kind == 3u)
{ {
float2 ab = in.f_params.xy; float2 ab = in.f_params.xy;
soft = fast::max(in.f_params.z, 1.0); h = in.f_params.z;
float stroke_px_2 = in.f_params.w; float2 param_6 = p_local;
float2 p_local_1 = in.f_local_or_uv; float2 param_7 = ab;
if (in.f_rotation != 0.0) d = sdEllipseApprox(param_6, param_7);
{ half_size = ab;
float2 param_11 = p_local_1;
float param_12 = in.f_rotation;
p_local_1 = apply_rotation(param_11, param_12);
}
float2 param_13 = p_local_1;
float2 param_14 = ab;
float _560 = sdEllipse(param_13, param_14);
d = _560;
if ((flags & 1u) != 0u)
{
float param_15 = d;
float param_16 = stroke_px_2;
d = sdf_stroke(param_15, param_16);
}
} }
else else
{ {
if (kind == 4u) if (kind == 4u)
{ {
float2 a = in.f_params.xy; float inner = in.f_params.x;
float2 b_1 = in.f_params.zw; float outer = in.f_params.y;
float width = in.f_params2.x; float2 n_start = in.f_params.zw;
soft = fast::max(in.f_params2.y, 1.0); float2 n_end = in.f_params2.xy;
float2 param_17 = in.f_local_or_uv; uint arc_bits = (flags >> 5u) & 3u;
float2 param_18 = a; h = in.f_params2.z;
float2 param_19 = b_1; float r = length(p_local);
d = sdSegment(param_17, param_18, param_19) - (width * 0.5); d = fast::max(inner - r, r - outer);
} if (arc_bits != 0u)
else
{
if (kind == 5u)
{ {
float inner = in.f_params.x; float d_start = dot(p_local, n_start);
float outer = in.f_params.y; float d_end = dot(p_local, n_end);
float start_rad = in.f_params.z; float _372;
float end_rad = in.f_params.w; if (arc_bits == 1u)
soft = fast::max(in.f_params2.x, 1.0);
float r_1 = length(in.f_local_or_uv);
float d_ring = fast::max(inner - r_1, r_1 - outer);
float angle = precise::atan2(in.f_local_or_uv.y, in.f_local_or_uv.x);
if (angle < 0.0)
{ {
angle += 6.283185482025146484375; _372 = fast::max(d_start, d_end);
}
float ang_start = mod(start_rad, 6.283185482025146484375);
float ang_end = mod(end_rad, 6.283185482025146484375);
float _654;
if (ang_end > ang_start)
{
_654 = float((angle >= ang_start) && (angle <= ang_end));
} }
else else
{ {
_654 = float((angle >= ang_start) || (angle <= ang_end)); _372 = fast::min(d_start, d_end);
}
float in_arc = _654;
if (abs(ang_end - ang_start) >= 6.282185077667236328125)
{
in_arc = 1.0;
}
d = (in_arc > 0.5) ? d_ring : 1000000015047466219876688855040.0;
}
else
{
if (kind == 6u)
{
float radius_1 = in.f_params.x;
float rotation = in.f_params.y;
float sides = in.f_params.z;
soft = fast::max(in.f_params.w, 1.0);
float stroke_px_3 = in.f_params2.x;
float2 p = in.f_local_or_uv;
float c = cos(rotation);
float s = sin(rotation);
p = float2x2(float2(c, -s), float2(s, c)) * p;
float an = 3.1415927410125732421875 / sides;
float bn = mod(precise::atan2(p.y, p.x), 2.0 * an) - an;
d = (length(p) * cos(bn)) - radius_1;
if ((flags & 1u) != 0u)
{
float param_20 = d;
float param_21 = stroke_px_3;
d = sdf_stroke(param_20, param_21);
}
} }
float d_wedge = _372;
d = fast::max(d, d_wedge);
} }
half_size = float2(outer);
} }
} }
} }
} }
float param_22 = d; float grad_magnitude = fast::max(fwidth(d), 9.9999999747524270787835121154785e-07);
float param_23 = soft; d /= grad_magnitude;
float alpha = sdf_alpha(param_22, param_23); h /= grad_magnitude;
out.out_color = float4(in.f_color.xyz, in.f_color.w * alpha); float4 shape_color;
if ((flags & 2u) != 0u)
{
float4 gradient_start = in.f_color;
float4 gradient_end = unpack_unorm4x8_to_float(in.f_uv_or_effects.x);
if ((flags & 4u) != 0u)
{
float t_1 = length(p_local / half_size);
float4 param_8 = gradient_start;
float4 param_9 = gradient_end;
float param_10 = t_1;
shape_color = gradient_2color(param_8, param_9, param_10);
}
else
{
float2 direction = float2(as_type<half2>(in.f_uv_or_effects.z));
float t_2 = (dot(p_local / half_size, direction) * 0.5) + 0.5;
float4 param_11 = gradient_start;
float4 param_12 = gradient_end;
float param_13 = t_2;
shape_color = gradient_2color(param_11, param_12, param_13);
}
}
else
{
if ((flags & 1u) != 0u)
{
float4 uv_rect = as_type<float4>(in.f_uv_or_effects);
float2 local_uv = ((p_local / half_size) * 0.5) + float2(0.5);
float2 uv = mix(uv_rect.xy, uv_rect.zw, local_uv);
shape_color = in.f_color * tex.sample(texSmplr, uv);
}
else
{
shape_color = in.f_color;
}
}
if ((flags & 8u) != 0u)
{
float4 ol_color = unpack_unorm4x8_to_float(in.f_uv_or_effects.y);
float ol_width = float2(as_type<half2>(in.f_uv_or_effects.w)).x / grad_magnitude;
float param_14 = d;
float param_15 = h;
float fill_cov = sdf_alpha(param_14, param_15);
float param_16 = d - ol_width;
float param_17 = h;
float total_cov = sdf_alpha(param_16, param_17);
float outline_cov = fast::max(total_cov - fill_cov, 0.0);
float3 rgb_pm = ((shape_color.xyz * shape_color.w) * fill_cov) + ((ol_color.xyz * ol_color.w) * outline_cov);
float alpha_pm = (shape_color.w * fill_cov) + (ol_color.w * outline_cov);
out.out_color = float4(rgb_pm, alpha_pm);
}
else
{
float param_18 = d;
float param_19 = h;
float alpha = sdf_alpha(param_18, param_19);
out.out_color = float4((shape_color.xyz * shape_color.w) * alpha, shape_color.w * alpha);
}
return out; return out;
} }
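The rotation step in the new fragment shader avoids per-pixel trig by consuming a pre-computed (sin, cos) pair and applying the inverse rotation R(-angle) to the local position. A minimal CPU-side sketch of that same matrix, in C, is given here for reference. The names `Vec2` and `inverse_rotate` are hypothetical, and the f16 packing used by `f_rotation_sc` is left out.

```c
/* Hypothetical 2D vector type used only for this sketch. */
typedef struct { float x, y; } Vec2;

/* Applies R(-angle) = [[cos, sin], [-sin, cos]] to p.
   sc_sin and sc_cos are sin(angle) and cos(angle), computed once per shape
   rather than once per pixel. */
static Vec2 inverse_rotate(Vec2 p, float sc_sin, float sc_cos) {
    Vec2 out;
    out.x = sc_cos * p.x + sc_sin * p.y;
    out.y = -sc_sin * p.x + sc_cos * p.y;
    return out;
}
```

Rotating the sample point by the negative angle is equivalent to evaluating the SDF of a shape rotated by the positive angle, which is why the shader transforms `p_local` rather than the shape parameters.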
Binary file not shown.
+24 -18
@@ -14,22 +14,24 @@ struct Primitive
{ {
float4 bounds; float4 bounds;
uint color; uint color;
uint kind_flags; uint flags;
float rotation; uint rotation_sc;
float _pad; float _pad;
float4 params; float4 params;
float4 params2; float4 params2;
uint4 uv_or_effects;
}; };
struct Primitive_1 struct Primitive_1
{ {
float4 bounds; float4 bounds;
uint color; uint color;
uint kind_flags; uint flags;
float rotation; uint rotation_sc;
float _pad; float _pad;
float4 params; float4 params;
float4 params2; float4 params2;
uint4 uv_or_effects;
}; };
struct Primitives struct Primitives
@@ -43,8 +45,9 @@ struct main0_out
float2 f_local_or_uv [[user(locn1)]]; float2 f_local_or_uv [[user(locn1)]];
float4 f_params [[user(locn2)]]; float4 f_params [[user(locn2)]];
float4 f_params2 [[user(locn3)]]; float4 f_params2 [[user(locn3)]];
uint f_kind_flags [[user(locn4)]]; uint f_flags [[user(locn4)]];
float f_rotation [[user(locn5)]]; uint f_rotation_sc [[user(locn5)]];
uint4 f_uv_or_effects [[user(locn6)]];
float4 gl_Position [[position]]; float4 gl_Position [[position]];
}; };
@@ -55,7 +58,7 @@ struct main0_in
float4 v_color [[attribute(2)]]; float4 v_color [[attribute(2)]];
}; };
vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer(0)]], const device Primitives& _72 [[buffer(1)]], uint gl_InstanceIndex [[instance_id]]) vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer(0)]], const device Primitives& _75 [[buffer(1)]], uint gl_InstanceIndex [[instance_id]])
{ {
main0_out out = {}; main0_out out = {};
if (_12.mode == 0u) if (_12.mode == 0u)
@@ -64,20 +67,22 @@ vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer
out.f_local_or_uv = in.v_uv; out.f_local_or_uv = in.v_uv;
out.f_params = float4(0.0); out.f_params = float4(0.0);
out.f_params2 = float4(0.0); out.f_params2 = float4(0.0);
out.f_kind_flags = 0u; out.f_flags = 0u;
out.f_rotation = 0.0; out.f_rotation_sc = 0u;
out.f_uv_or_effects = uint4(0u);
out.gl_Position = _12.projection * float4(in.v_position * _12.dpi_scale, 0.0, 1.0); out.gl_Position = _12.projection * float4(in.v_position * _12.dpi_scale, 0.0, 1.0);
} }
else else
{ {
Primitive p; Primitive p;
p.bounds = _72.primitives[int(gl_InstanceIndex)].bounds; p.bounds = _75.primitives[int(gl_InstanceIndex)].bounds;
p.color = _72.primitives[int(gl_InstanceIndex)].color; p.color = _75.primitives[int(gl_InstanceIndex)].color;
p.kind_flags = _72.primitives[int(gl_InstanceIndex)].kind_flags; p.flags = _75.primitives[int(gl_InstanceIndex)].flags;
p.rotation = _72.primitives[int(gl_InstanceIndex)].rotation; p.rotation_sc = _75.primitives[int(gl_InstanceIndex)].rotation_sc;
p._pad = _72.primitives[int(gl_InstanceIndex)]._pad; p._pad = _75.primitives[int(gl_InstanceIndex)]._pad;
p.params = _72.primitives[int(gl_InstanceIndex)].params; p.params = _75.primitives[int(gl_InstanceIndex)].params;
p.params2 = _72.primitives[int(gl_InstanceIndex)].params2; p.params2 = _75.primitives[int(gl_InstanceIndex)].params2;
p.uv_or_effects = _75.primitives[int(gl_InstanceIndex)].uv_or_effects;
float2 corner = in.v_position; float2 corner = in.v_position;
float2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner); float2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner);
float2 center = (p.bounds.xy + p.bounds.zw) * 0.5; float2 center = (p.bounds.xy + p.bounds.zw) * 0.5;
@@ -85,8 +90,9 @@ vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer
out.f_local_or_uv = (world_pos - center) * _12.dpi_scale; out.f_local_or_uv = (world_pos - center) * _12.dpi_scale;
out.f_params = p.params; out.f_params = p.params;
out.f_params2 = p.params2; out.f_params2 = p.params2;
out.f_kind_flags = p.kind_flags; out.f_flags = p.flags;
out.f_rotation = p.rotation; out.f_rotation_sc = p.rotation_sc;
out.f_uv_or_effects = p.uv_or_effects;
out.gl_Position = _12.projection * float4(world_pos * _12.dpi_scale, 0.0, 1.0); out.gl_Position = _12.projection * float4(world_pos * _12.dpi_scale, 0.0, 1.0);
} }
return out; return out;
Binary file not shown.
+142 -152
@@ -1,12 +1,13 @@
#version 450 core #version 450 core
// --- Inputs from vertex shader --- // --- Inputs from vertex shader ---
layout(location = 0) in vec4 f_color; layout(location = 0) in mediump vec4 f_color;
layout(location = 1) in vec2 f_local_or_uv; layout(location = 1) in vec2 f_local_or_uv;
layout(location = 2) in vec4 f_params; layout(location = 2) in vec4 f_params;
layout(location = 3) in vec4 f_params2; layout(location = 3) in vec4 f_params2;
layout(location = 4) flat in uint f_kind_flags; layout(location = 4) flat in uint f_flags;
layout(location = 5) flat in float f_rotation; layout(location = 5) flat in uint f_rotation_sc;
layout(location = 6) flat in uvec4 f_uv_or_effects;
// --- Output --- // --- Output ---
layout(location = 0) out vec4 out_color; layout(location = 0) out vec4 out_color;
@@ -19,77 +20,43 @@ layout(set = 2, binding = 0) uniform sampler2D tex;
// All operate in physical pixel space — no dpi_scale needed here. // All operate in physical pixel space — no dpi_scale needed here.
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
const float PI = 3.14159265358979;
float sdCircle(vec2 p, float r) {
return length(p) - r;
}
float sdRoundedBox(vec2 p, vec2 b, vec4 r) { float sdRoundedBox(vec2 p, vec2 b, vec4 r) {
r.xy = (p.x > 0.0) ? r.xy : r.zw; vec2 rxy = (p.x > 0.0) ? r.xy : r.zw;
r.x = (p.y > 0.0) ? r.x : r.y; float rr = (p.y > 0.0) ? rxy.x : rxy.y;
vec2 q = abs(p) - b + r.x; vec2 q = abs(p) - b;
return min(max(q.x, q.y), 0.0) + length(max(q, vec2(0.0))) - r.x; if (rr == 0.0) {
} return max(q.x, q.y);
float sdSegment(vec2 p, vec2 a, vec2 b) {
vec2 pa = p - a, ba = b - a;
float h = clamp(dot(pa, ba) / dot(ba, ba), 0.0, 1.0);
return length(pa - ba * h);
}
float sdEllipse(vec2 p, vec2 ab) {
p = abs(p);
if (p.x > p.y) {
p = p.yx;
ab = ab.yx;
} }
float l = ab.y * ab.y - ab.x * ab.x; q += rr;
float m = ab.x * p.x / l; return min(max(q.x, q.y), 0.0) + length(max(q, vec2(0.0))) - rr;
float m2 = m * m;
float n = ab.y * p.y / l;
float n2 = n * n;
float c = (m2 + n2 - 1.0) / 3.0;
float c3 = c * c * c;
float q = c3 + m2 * n2 * 2.0;
float d = c3 + m2 * n2;
float g = m + m * n2;
float co;
if (d < 0.0) {
float h = acos(q / c3) / 3.0;
float s = cos(h);
float t = sin(h) * sqrt(3.0);
float rx = sqrt(-c * (s + t + 2.0) + m2);
float ry = sqrt(-c * (s - t + 2.0) + m2);
co = (ry + sign(l) * rx + abs(g) / (rx * ry) - m) / 2.0;
} else {
float h = 2.0 * m * n * sqrt(d);
float s = sign(q + h) * pow(abs(q + h), 1.0 / 3.0);
float u = sign(q - h) * pow(abs(q - h), 1.0 / 3.0);
float rx = -s - u - c * 4.0 + 2.0 * m2;
float ry = (s - u) * sqrt(3.0);
float rm = sqrt(rx * rx + ry * ry);
co = (ry / sqrt(rm - rx) + 2.0 * g / rm - m) / 2.0;
}
vec2 r = ab * vec2(co, sqrt(1.0 - co * co));
return length(r - p) * sign(p.y - r.y);
} }
float sdf_alpha(float d, float soft) { // Approximate ellipse SDF — fast, suitable for UI, NOT a true Euclidean distance.
return 1.0 - smoothstep(-soft, soft, d); float sdEllipseApprox(vec2 p, vec2 ab) {
float k0 = length(p / ab);
float k1 = length(p / (ab * ab));
return k0 * (k0 - 1.0) / k1;
} }
float sdf_stroke(float d, float stroke_width) { // Regular N-gon SDF (Inigo Quilez).
return abs(d) - stroke_width * 0.5; float sdRegularPolygon(vec2 p, float r, float n) {
float an = 3.141592653589793 / n;
float bn = mod(atan(p.y, p.x), 2.0 * an) - an;
return length(p) * cos(bn) - r;
} }
// Rotate a 2D point by the negative of the given angle (inverse rotation). // Coverage from SDF distance using half-feather width (feather_px * 0.5, pre-computed on CPU).
// Used to rotate the sampling frame opposite to the shape's rotation so that // Produces a symmetric transition centered on d=0: smoothstep(-h, h, d).
// the SDF evaluates correctly for the rotated shape. float sdf_alpha(float d, float h) {
vec2 apply_rotation(vec2 p, float angle) { return 1.0 - smoothstep(-h, h, d);
float cr = cos(-angle); }
float sr = sin(-angle);
return mat2(cr, sr, -sr, cr) * p; // ---------------------------------------------------------------------------
// Gradient helpers
// ---------------------------------------------------------------------------
mediump vec4 gradient_2color(mediump vec4 start_color, mediump vec4 end_color, mediump float t) {
return mix(start_color, end_color, clamp(t, 0.0, 1.0));
} }
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
@@ -97,114 +64,137 @@ vec2 apply_rotation(vec2 p, float angle) {
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
void main() { void main() {
uint kind = f_kind_flags & 0xFFu; uint kind = f_flags & 0xFFu;
uint flags = (f_kind_flags >> 8u) & 0xFFu; uint flags = (f_flags >> 8u) & 0xFFu;
// ----------------------------------------------------------------------- // Kind 0: Tessellated path — vertex colors arrive premultiplied from CPU.
// Kind 0: Tessellated path. Texture multiply for text atlas, // Texture samples are straight-alpha (SDL_ttf glyph atlas: rgb=1, a=coverage;
// white pixel for solid shapes. // or the 1x1 white texture: rgba=1). Convert to premultiplied form so the
// ----------------------------------------------------------------------- // blend state (ONE, ONE_MINUS_SRC_ALPHA) composites correctly.
if (kind == 0u) { if (kind == 0u) {
out_color = f_color * texture(tex, f_local_or_uv); vec4 t = texture(tex, f_local_or_uv);
t.rgb *= t.a;
out_color = f_color * t;
return; return;
} }
// ----------------------------------------------------------------------- // SDF path — dispatch on kind
// SDF path. f_local_or_uv = shape-centered position in physical pixels.
// All dimensional params are already in physical pixels (CPU pre-scaled).
// -----------------------------------------------------------------------
float d = 1e30; float d = 1e30;
float soft = 1.0; float h = 0.5; // half-feather width; overwritten per shape kind
vec2 half_size = f_params.xy; // used by RRect and as reference size for gradients
vec2 p_local = f_local_or_uv;
// Apply inverse rotation using pre-computed sin/cos (no per-pixel trig).
// .Rotated flag = bit 4 = 16u
if ((flags & 16u) != 0u) {
vec2 sc = unpackHalf2x16(f_rotation_sc); // .x = sin(angle), .y = cos(angle)
// Inverse rotation matrix R(-angle) = [[cos, sin], [-sin, cos]]
p_local = vec2(sc.y * p_local.x + sc.x * p_local.y,
-sc.x * p_local.x + sc.y * p_local.y);
}
if (kind == 1u) { if (kind == 1u) {
// RRect: rounded box // RRect — half_feather in params2.z
vec2 b = f_params.xy; // half_size (phys px) vec4 corner_radii = vec4(f_params.zw, f_params2.xy);
vec4 r = vec4(f_params.zw, f_params2.xy); // corner radii: tr, br, tl, bl h = f_params2.z;
soft = max(f_params2.z, 1.0); d = sdRoundedBox(p_local, half_size, corner_radii);
float stroke_px = f_params2.w;
vec2 p_local = f_local_or_uv;
if (f_rotation != 0.0) {
p_local = apply_rotation(p_local, f_rotation);
}
d = sdRoundedBox(p_local, b, r);
if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
} }
else if (kind == 2u) { else if (kind == 2u) {
// Circle — rotationally symmetric, no rotation needed // NGon — half_feather in params.z
float radius = f_params.x; float radius = f_params.x;
soft = max(f_params.y, 1.0); float sides = f_params.y;
float stroke_px = f_params.z; h = f_params.z;
d = sdRegularPolygon(p_local, radius, sides);
d = sdCircle(f_local_or_uv, radius); half_size = vec2(radius); // for gradient UV computation
if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
} }
else if (kind == 3u) { else if (kind == 3u) {
// Ellipse // Ellipse — half_feather in params.z
vec2 ab = f_params.xy; vec2 ab = f_params.xy;
soft = max(f_params.z, 1.0); h = f_params.z;
float stroke_px = f_params.w; d = sdEllipseApprox(p_local, ab);
half_size = ab; // for gradient UV computation
vec2 p_local = f_local_or_uv;
if (f_rotation != 0.0) {
p_local = apply_rotation(p_local, f_rotation);
}
d = sdEllipse(p_local, ab);
if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
} }
else if (kind == 4u) { else if (kind == 4u) {
// Segment (capsule line) — no rotation (excluded) // Ring_Arc — half_feather in params2.z
vec2 a = f_params.xy; // already in local physical pixels // Arc mode from flag bits 5-6: 0 = full, 1 = narrow (≤π), 2 = wide (>π)
vec2 b = f_params.zw;
float width = f_params2.x;
soft = max(f_params2.y, 1.0);
d = sdSegment(f_local_or_uv, a, b) - width * 0.5;
}
else if (kind == 5u) {
// Ring / Arc — rotation handled by CPU angle offset, no shader rotation
float inner = f_params.x; float inner = f_params.x;
float outer = f_params.y; float outer = f_params.y;
float start_rad = f_params.z; vec2 n_start = f_params.zw;
float end_rad = f_params.w; vec2 n_end = f_params2.xy;
soft = max(f_params2.x, 1.0); uint arc_bits = (flags >> 5u) & 3u;
float r = length(f_local_or_uv); h = f_params2.z;
float d_ring = max(inner - r, r - outer);
// Angular clip float r = length(p_local);
float angle = atan(f_local_or_uv.y, f_local_or_uv.x); d = max(inner - r, r - outer);
if (angle < 0.0) angle += 2.0 * PI;
float ang_start = mod(start_rad, 2.0 * PI);
float ang_end = mod(end_rad, 2.0 * PI);
float in_arc = (ang_end > ang_start) if (arc_bits != 0u) {
? ((angle >= ang_start && angle <= ang_end) ? 1.0 : 0.0) : ((angle >= ang_start || angle <= ang_end) ? 1.0 : 0.0); float d_start = dot(p_local, n_start);
if (abs(ang_end - ang_start) >= 2.0 * PI - 0.001) in_arc = 1.0; float d_end = dot(p_local, n_end);
float d_wedge = (arc_bits == 1u)
? max(d_start, d_end) // arc ≤ π: intersect half-planes
: min(d_start, d_end); // arc > π: union half-planes
d = max(d, d_wedge);
}
d = in_arc > 0.5 ? d_ring : 1e30; half_size = vec2(outer); // for gradient UV computation
}
else if (kind == 6u) {
// Regular N-gon — has its own rotation in params, no Primitive.rotation used
float radius = f_params.x;
float rotation = f_params.y;
float sides = f_params.z;
soft = max(f_params.w, 1.0);
float stroke_px = f_params2.x;
vec2 p = f_local_or_uv;
float c = cos(rotation), s = sin(rotation);
p = mat2(c, -s, s, c) * p;
float an = PI / sides;
float bn = mod(atan(p.y, p.x), 2.0 * an) - an;
d = length(p) * cos(bn) - radius;
if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px);
} }
float alpha = sdf_alpha(d, soft); // --- fwidth-based normalization for correct AA and stroke width ---
out_color = vec4(f_color.rgb, f_color.a * alpha); float grad_magnitude = max(fwidth(d), 1e-6);
d = d / grad_magnitude;
h = h / grad_magnitude;
// --- Determine shape color based on flags ---
mediump vec4 shape_color;
if ((flags & 2u) != 0u) {
// Gradient active (bit 1)
mediump vec4 gradient_start = f_color;
mediump vec4 gradient_end = unpackUnorm4x8(f_uv_or_effects.x);
if ((flags & 4u) != 0u) {
// Radial gradient (bit 2): t from distance to center
mediump float t = length(p_local / half_size);
shape_color = gradient_2color(gradient_start, gradient_end, t);
} else {
// Linear gradient: direction pre-computed on CPU as (cos, sin) f16 pair
vec2 direction = unpackHalf2x16(f_uv_or_effects.z);
mediump float t = dot(p_local / half_size, direction) * 0.5 + 0.5;
shape_color = gradient_2color(gradient_start, gradient_end, t);
}
} else if ((flags & 1u) != 0u) {
// Textured (bit 0) — RRect only in practice
vec4 uv_rect = uintBitsToFloat(f_uv_or_effects);
vec2 local_uv = p_local / half_size * 0.5 + 0.5;
vec2 uv = mix(uv_rect.xy, uv_rect.zw, local_uv);
shape_color = f_color * texture(tex, uv);
} else {
// Solid color
shape_color = f_color;
}
// --- Outline (bit 3) — outer outline via premultiplied compositing ---
// The outline band sits OUTSIDE the original shape boundary (d=0 to d=+ol_width).
// fill_cov covers the interior with AA at d=0; total_cov covers interior+outline with
// AA at d=ol_width. The outline band's coverage is total_cov - fill_cov.
// Output is premultiplied: blend state is ONE, ONE_MINUS_SRC_ALPHA.
if ((flags & 8u) != 0u) {
mediump vec4 ol_color = unpackUnorm4x8(f_uv_or_effects.y);
// Outline width in f_uv_or_effects.w (low f16 half)
float ol_width = unpackHalf2x16(f_uv_or_effects.w).x / grad_magnitude;
float fill_cov = sdf_alpha(d, h);
float total_cov = sdf_alpha(d - ol_width, h);
float outline_cov = max(total_cov - fill_cov, 0.0);
// Premultiplied output — no divide, no threshold check
vec3 rgb_pm = shape_color.rgb * shape_color.a * fill_cov
+ ol_color.rgb * ol_color.a * outline_cov;
float alpha_pm = shape_color.a * fill_cov + ol_color.a * outline_cov;
out_color = vec4(rgb_pm, alpha_pm);
} else {
mediump float alpha = sdf_alpha(d, h);
out_color = vec4(shape_color.rgb * shape_color.a * alpha, shape_color.a * alpha);
}
} }
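The outline branch of the new fragment shader derives the outline band as the difference between two coverage values: one with its edge at d = 0 and one with its edge at d = ol_width. The following C sketch reproduces that arithmetic on the CPU so the behaviour can be checked by hand. `smoothstep_f`, `Coverage` and `outline_coverage` are hypothetical names, and the inputs are assumed to be already normalized by `fwidth(d)` as in the shader.

```c
/* GLSL-style smoothstep; assumes edge0 < edge1. */
static float smoothstep_f(float edge0, float edge1, float x) {
    float t = (x - edge0) / (edge1 - edge0);
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return t * t * (3.0f - 2.0f * t);
}

/* Mirrors the shader's sdf_alpha: 1 inside, 0 outside,
   with a symmetric ramp of half-width h centered on d = 0. */
static float sdf_alpha(float d, float h) {
    return 1.0f - smoothstep_f(-h, h, d);
}

typedef struct { float fill, outline; } Coverage;

/* fill covers the interior; total covers interior plus outline band;
   the band's own coverage is their difference, clamped at zero. */
static Coverage outline_coverage(float d, float h, float ol_width) {
    Coverage c;
    float total = sdf_alpha(d - ol_width, h);
    float diff;
    c.fill = sdf_alpha(d, h);
    diff = total - c.fill;
    c.outline = diff > 0.0f ? diff : 0.0f;
    return c;
}
```

With h = 0.5 and ol_width = 3, a sample deep inside the shape gets fill 1 and outline 0, a sample in the middle of the band gets fill 0 and outline 1, and a sample exactly on the shape edge splits evenly between the two.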
+18 -14
@@ -6,12 +6,13 @@ layout(location = 1) in vec2 v_uv;
layout(location = 2) in vec4 v_color; layout(location = 2) in vec4 v_color;
// ---------- Outputs to fragment shader ---------- // ---------- Outputs to fragment shader ----------
layout(location = 0) out vec4 f_color; layout(location = 0) out mediump vec4 f_color;
layout(location = 1) out vec2 f_local_or_uv; layout(location = 1) out vec2 f_local_or_uv;
layout(location = 2) out vec4 f_params; layout(location = 2) out vec4 f_params;
layout(location = 3) out vec4 f_params2; layout(location = 3) out vec4 f_params2;
layout(location = 4) flat out uint f_kind_flags; layout(location = 4) flat out uint f_flags;
layout(location = 5) flat out float f_rotation; layout(location = 5) flat out uint f_rotation_sc;
layout(location = 6) flat out uvec4 f_uv_or_effects;
// ---------- Uniforms (single block — avoids spirv-cross reordering on Metal) ---------- // ---------- Uniforms (single block — avoids spirv-cross reordering on Metal) ----------
layout(set = 1, binding = 0) uniform Uniforms { layout(set = 1, binding = 0) uniform Uniforms {
@@ -22,13 +23,14 @@ layout(set = 1, binding = 0) uniform Uniforms {
// ---------- SDF primitive storage buffer ---------- // ---------- SDF primitive storage buffer ----------
struct Primitive { struct Primitive {
vec4 bounds; // 0-15: min_x, min_y, max_x, max_y vec4 bounds; // 0-15
uint color; // 16-19: packed u8x4 (unpack with unpackUnorm4x8) uint color; // 16-19
uint kind_flags; // 20-23: kind | (flags << 8) uint flags; // 20-23
float rotation; // 24-27: shader self-rotation in radians uint rotation_sc; // 24-27: packed f16 pair (sin, cos)
float _pad; // 28-31: alignment padding float _pad; // 28-31
vec4 params; // 32-47: shape params part 1 vec4 params; // 32-47
vec4 params2; // 48-63: shape params part 2 vec4 params2; // 48-63
uvec4 uv_or_effects; // 64-79
}; };
layout(std430, set = 0, binding = 0) readonly buffer Primitives { layout(std430, set = 0, binding = 0) readonly buffer Primitives {
@@ -43,8 +45,9 @@ void main() {
f_local_or_uv = v_uv; f_local_or_uv = v_uv;
f_params = vec4(0.0); f_params = vec4(0.0);
f_params2 = vec4(0.0); f_params2 = vec4(0.0);
f_kind_flags = 0u; f_flags = 0u;
f_rotation = 0.0; f_rotation_sc = 0u;
f_uv_or_effects = uvec4(0);
gl_Position = projection * vec4(v_position * dpi_scale, 0.0, 1.0); gl_Position = projection * vec4(v_position * dpi_scale, 0.0, 1.0);
} else { } else {
@@ -59,8 +62,9 @@ void main() {
f_local_or_uv = (world_pos - center) * dpi_scale; // shape-centered physical pixels f_local_or_uv = (world_pos - center) * dpi_scale; // shape-centered physical pixels
f_params = p.params; f_params = p.params;
f_params2 = p.params2; f_params2 = p.params2;
f_kind_flags = p.kind_flags; f_flags = p.flags;
f_rotation = p.rotation; f_rotation_sc = p.rotation_sc;
f_uv_or_effects = p.uv_or_effects;
gl_Position = projection * vec4(world_pos * dpi_scale, 0.0, 1.0); gl_Position = projection * vec4(world_pos * dpi_scale, 0.0, 1.0);
} }
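The `flags` word passed through the vertex shader is decoded in the fragment shader as a kind in the low byte and a flag byte in bits 8 to 15, with the arc mode in flag bits 5 and 6. The CPU-side packing lives in a diff that is not shown here, so the C sketch below is only an illustration of a layout consistent with that decoding. The enum names and `pack_flags` are hypothetical.

```c
#include <stdint.h>

/* Flag bits as tested by the fragment shader (names are hypothetical). */
enum {
    FLAG_TEXTURED = 1u << 0,
    FLAG_GRADIENT = 1u << 1,
    FLAG_RADIAL   = 1u << 2,
    FLAG_OUTLINE  = 1u << 3,
    FLAG_ROTATED  = 1u << 4
};

/* Packs kind into bits 0-7, flags into bits 8-12, arc mode into bits 13-14. */
static uint32_t pack_flags(uint32_t kind, uint32_t flags, uint32_t arc_mode) {
    uint32_t flag_byte = (flags & 0x1Fu) | ((arc_mode & 3u) << 5);
    return (kind & 0xFFu) | (flag_byte << 8);
}

/* The three decoders match the expressions used in the fragment shader. */
static uint32_t unpack_kind(uint32_t word) { return word & 0xFFu; }
static uint32_t unpack_flag_byte(uint32_t word) { return (word >> 8) & 0xFFu; }
static uint32_t unpack_arc_mode(uint32_t word) {
    return (unpack_flag_byte(word) >> 5) & 3u;
}
```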
+522 -769
File diff suppressed because it is too large
+330
@@ -0,0 +1,330 @@
package tess
import "core:math"
import draw ".."
SMOOTH_CIRCLE_ERROR_RATE :: 0.1
auto_segments :: proc(radius: f32, arc_degrees: f32) -> int {
if radius <= 0 do return 4
phys_radius := radius * draw.GLOB.dpi_scaling
acos_arg := clamp(2 * math.pow(1 - SMOOTH_CIRCLE_ERROR_RATE / phys_radius, 2) - 1, -1, 1)
theta := math.acos(acos_arg)
if theta <= 0 do return 4
full_circle_segments := int(math.ceil(2 * math.PI / theta))
segments := int(f32(full_circle_segments) * arc_degrees / 360.0)
min_segments := max(int(math.ceil(f64(arc_degrees / 90.0))), 4)
return max(segments, min_segments)
}
// ----- Internal helpers -----
// Color is premultiplied: the tessellated fragment shader passes it through directly
// and the blend state is ONE, ONE_MINUS_SRC_ALPHA.
solid_vertex :: proc(position: draw.Vec2, color: draw.Color) -> draw.Vertex {
return draw.Vertex{position = position, color = draw.premultiply_color(color)}
}
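`draw.premultiply_color` itself is not part of this diff. As an illustration of what premultiplying an 8-bit RGBA color for a (ONE, ONE_MINUS_SRC_ALPHA) blend state typically looks like, here is a C sketch. The `Color8` layout, the rounding choice and the function names are all assumptions and may differ from the real implementation.

```c
#include <stdint.h>

/* Assumed 8-bit straight-alpha RGBA layout. */
typedef struct { uint8_t r, g, b, a; } Color8;

/* channel * alpha / 255, rounded to nearest. */
static uint8_t mul_div_255(uint8_t channel, uint8_t alpha) {
    return (uint8_t)(((uint32_t)channel * alpha + 127u) / 255u);
}

/* Scales the color channels by alpha; alpha itself is unchanged. */
static Color8 premultiply_color_ref(Color8 c) {
    Color8 out;
    out.r = mul_div_255(c.r, c.a);
    out.g = mul_div_255(c.g, c.a);
    out.b = mul_div_255(c.b, c.a);
    out.a = c.a;
    return out;
}
```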
emit_rectangle :: proc(x, y, width, height: f32, color: draw.Color, vertices: []draw.Vertex, offset: int) {
vertices[offset + 0] = solid_vertex({x, y}, color)
vertices[offset + 1] = solid_vertex({x + width, y}, color)
vertices[offset + 2] = solid_vertex({x + width, y + height}, color)
vertices[offset + 3] = solid_vertex({x, y}, color)
vertices[offset + 4] = solid_vertex({x + width, y + height}, color)
vertices[offset + 5] = solid_vertex({x, y + height}, color)
}
extrude_line :: proc(
start, end_pos: draw.Vec2,
thickness: f32,
color: draw.Color,
vertices: []draw.Vertex,
offset: int,
) -> int {
direction := end_pos - start
delta_x := direction[0]
delta_y := direction[1]
length := math.sqrt(delta_x * delta_x + delta_y * delta_y)
if length < 0.0001 do return 0
scale := thickness / (2 * length)
perpendicular := draw.Vec2{-delta_y * scale, delta_x * scale}
p0 := start + perpendicular
p1 := start - perpendicular
p2 := end_pos - perpendicular
p3 := end_pos + perpendicular
vertices[offset + 0] = solid_vertex(p0, color)
vertices[offset + 1] = solid_vertex(p1, color)
vertices[offset + 2] = solid_vertex(p2, color)
vertices[offset + 3] = solid_vertex(p0, color)
vertices[offset + 4] = solid_vertex(p2, color)
vertices[offset + 5] = solid_vertex(p3, color)
return 6
}
// ----- Public draw -----
pixel :: proc(layer: ^draw.Layer, pos: draw.Vec2, color: draw.Color) {
vertices: [6]draw.Vertex
emit_rectangle(pos[0], pos[1], 1, 1, color, vertices[:], 0)
draw.prepare_shape(layer, vertices[:])
}
triangle :: proc(
layer: ^draw.Layer,
v1, v2, v3: draw.Vec2,
color: draw.Color,
origin: draw.Vec2 = {},
rotation: f32 = 0,
) {
if !draw.needs_transform(origin, rotation) {
vertices := [3]draw.Vertex{solid_vertex(v1, color), solid_vertex(v2, color), solid_vertex(v3, color)}
draw.prepare_shape(layer, vertices[:])
return
}
bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
local_v1 := v1 - bounds_min
local_v2 := v2 - bounds_min
local_v3 := v3 - bounds_min
vertices := [3]draw.Vertex {
solid_vertex(draw.apply_transform(transform, local_v1), color),
solid_vertex(draw.apply_transform(transform, local_v2), color),
solid_vertex(draw.apply_transform(transform, local_v3), color),
}
draw.prepare_shape(layer, vertices[:])
}
// Draw an anti-aliased triangle via extruded edge quads.
// Interior vertices get the full premultiplied color; outer fringe vertices get BLANK (0,0,0,0).
// The rasterizer linearly interpolates between them, producing a smooth 1-pixel AA band.
// `aa_px` controls the extrusion width in logical pixels (default `draw.DFT_FEATHER_PX`).
// This proc emits 21 vertices (3 interior + 3 edge quads × 2 triangles × 3 verts each).
triangle_aa :: proc(
layer: ^draw.Layer,
v1, v2, v3: draw.Vec2,
color: draw.Color,
aa_px: f32 = draw.DFT_FEATHER_PX,
origin: draw.Vec2 = {},
rotation: f32 = 0,
) {
// Apply rotation if needed, then work in world space.
p0, p1, p2: draw.Vec2
if !draw.needs_transform(origin, rotation) {
p0 = v1
p1 = v2
p2 = v3
} else {
bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
p0 = draw.apply_transform(transform, v1 - bounds_min)
p1 = draw.apply_transform(transform, v2 - bounds_min)
p2 = draw.apply_transform(transform, v3 - bounds_min)
}
// Compute outward edge normals (unit length, pointing away from triangle interior).
// Winding-independent: we check against the centroid to ensure normals point outward.
centroid_x := (p0.x + p1.x + p2.x) / 3.0
centroid_y := (p0.y + p1.y + p2.y) / 3.0
edge_normal :: proc(edge_start, edge_end: draw.Vec2, centroid_x, centroid_y: f32) -> draw.Vec2 {
delta_x := edge_end.x - edge_start.x
delta_y := edge_end.y - edge_start.y
length := math.sqrt(delta_x * delta_x + delta_y * delta_y)
if length < 0.0001 do return {0, 0}
inverse_length := 1.0 / length
// Perpendicular: (-delta_y, delta_x) normalized
normal_x := -delta_y * inverse_length
normal_y := delta_x * inverse_length
// Midpoint of the edge
midpoint_x := (edge_start.x + edge_end.x) * 0.5
midpoint_y := (edge_start.y + edge_end.y) * 0.5
// If normal points toward centroid, flip it
if normal_x * (centroid_x - midpoint_x) + normal_y * (centroid_y - midpoint_y) > 0 {
normal_x = -normal_x
normal_y = -normal_y
}
return {normal_x, normal_y}
}
normal_01 := edge_normal(p0, p1, centroid_x, centroid_y)
normal_12 := edge_normal(p1, p2, centroid_x, centroid_y)
normal_20 := edge_normal(p2, p0, centroid_x, centroid_y)
extrude_distance := aa_px * draw.GLOB.dpi_scaling
// Outer fringe vertices: each edge vertex extruded outward
outer_0_01 := p0 + normal_01 * extrude_distance
outer_1_01 := p1 + normal_01 * extrude_distance
outer_1_12 := p1 + normal_12 * extrude_distance
outer_2_12 := p2 + normal_12 * extrude_distance
outer_2_20 := p2 + normal_20 * extrude_distance
outer_0_20 := p0 + normal_20 * extrude_distance
// Premultiplied interior color (solid_vertex does premul internally).
// Outer fringe is BLANK = {0,0,0,0} which is already premul.
transparent := draw.BLANK
// 3 interior + 3 edge quads × 6 vertices = 21 vertices
vertices: [21]draw.Vertex
// Interior triangle
vertices[0] = solid_vertex(p0, color)
vertices[1] = solid_vertex(p1, color)
vertices[2] = solid_vertex(p2, color)
// Edge quad: p0→p1 (2 triangles)
vertices[3] = solid_vertex(p0, color)
vertices[4] = solid_vertex(p1, color)
vertices[5] = solid_vertex(outer_1_01, transparent)
vertices[6] = solid_vertex(p0, color)
vertices[7] = solid_vertex(outer_1_01, transparent)
vertices[8] = solid_vertex(outer_0_01, transparent)
// Edge quad: p1→p2 (2 triangles)
vertices[9] = solid_vertex(p1, color)
vertices[10] = solid_vertex(p2, color)
vertices[11] = solid_vertex(outer_2_12, transparent)
vertices[12] = solid_vertex(p1, color)
vertices[13] = solid_vertex(outer_2_12, transparent)
vertices[14] = solid_vertex(outer_1_12, transparent)
// Edge quad: p2→p0 (2 triangles)
vertices[15] = solid_vertex(p2, color)
vertices[16] = solid_vertex(p0, color)
vertices[17] = solid_vertex(outer_0_20, transparent)
vertices[18] = solid_vertex(p2, color)
vertices[19] = solid_vertex(outer_0_20, transparent)
vertices[20] = solid_vertex(outer_2_20, transparent)
draw.prepare_shape(layer, vertices[:])
}
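The fringe geometry above depends on each edge normal pointing away from the triangle's interior. A minimal standalone sketch of that test follows; `Vec2` and `outward_normal` are illustrative names for this note, not part of the diff, and the centroid is passed as a vector rather than as two scalars.

```odin
package main

import "core:fmt"
import "core:math"

Vec2 :: [2]f32

// Unit normal of edge a→b, flipped if necessary so it points away from `centroid`.
outward_normal :: proc(a, b, centroid: Vec2) -> Vec2 {
	delta := b - a
	length := math.sqrt(delta.x * delta.x + delta.y * delta.y)
	if length < 0.0001 do return {0, 0} // degenerate edge: no usable normal

	// Perpendicular of (dx, dy) is (-dy, dx); dividing by length normalizes it.
	normal := Vec2{-delta.y, delta.x} / length

	// A positive dot product with (centroid - midpoint) means the normal
	// points into the triangle, so it has to be reversed.
	midpoint := (a + b) * 0.5
	to_centroid := centroid - midpoint
	if normal.x * to_centroid.x + normal.y * to_centroid.y > 0 {
		normal = {-normal.x, -normal.y}
	}
	return normal
}

main :: proc() {
	p0, p1, p2 := Vec2{0, 0}, Vec2{10, 0}, Vec2{0, 10}
	centroid := (p0 + p1 + p2) / 3
	fmt.println(outward_normal(p0, p1, centroid))
	fmt.println(outward_normal(p1, p2, centroid))
	fmt.println(outward_normal(p2, p0, centroid))
}
```

Because the flip is decided per edge against the centroid, the result does not depend on whether the caller supplies the triangle clockwise or counter-clockwise.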
triangle_lines :: proc(
layer: ^draw.Layer,
v1, v2, v3: draw.Vec2,
color: draw.Color,
thickness: f32 = draw.DFT_STROKE_THICKNESS,
origin: draw.Vec2 = {},
rotation: f32 = 0,
temp_allocator := context.temp_allocator,
) {
vertices := make([]draw.Vertex, 18, temp_allocator)
defer delete(vertices, temp_allocator)
write_offset := 0
if !draw.needs_transform(origin, rotation) {
write_offset += extrude_line(v1, v2, thickness, color, vertices, write_offset)
write_offset += extrude_line(v2, v3, thickness, color, vertices, write_offset)
write_offset += extrude_line(v3, v1, thickness, color, vertices, write_offset)
} else {
bounds_min := draw.Vec2{min(v1.x, v2.x, v3.x), min(v1.y, v2.y, v3.y)}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
transformed_v1 := draw.apply_transform(transform, v1 - bounds_min)
transformed_v2 := draw.apply_transform(transform, v2 - bounds_min)
transformed_v3 := draw.apply_transform(transform, v3 - bounds_min)
write_offset += extrude_line(transformed_v1, transformed_v2, thickness, color, vertices, write_offset)
write_offset += extrude_line(transformed_v2, transformed_v3, thickness, color, vertices, write_offset)
write_offset += extrude_line(transformed_v3, transformed_v1, thickness, color, vertices, write_offset)
}
if write_offset > 0 {
draw.prepare_shape(layer, vertices[:write_offset])
}
}
triangle_fan :: proc(
layer: ^draw.Layer,
points: []draw.Vec2,
color: draw.Color,
origin: draw.Vec2 = {},
rotation: f32 = 0,
temp_allocator := context.temp_allocator,
) {
if len(points) < 3 do return
triangle_count := len(points) - 2
vertex_count := triangle_count * 3
vertices := make([]draw.Vertex, vertex_count, temp_allocator)
defer delete(vertices, temp_allocator)
if !draw.needs_transform(origin, rotation) {
for i in 1 ..< len(points) - 1 {
idx := (i - 1) * 3
vertices[idx + 0] = solid_vertex(points[0], color)
vertices[idx + 1] = solid_vertex(points[i], color)
vertices[idx + 2] = solid_vertex(points[i + 1], color)
}
} else {
bounds_min := draw.Vec2{max(f32), max(f32)}
for point in points {
bounds_min.x = min(bounds_min.x, point.x)
bounds_min.y = min(bounds_min.y, point.y)
}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
for i in 1 ..< len(points) - 1 {
idx := (i - 1) * 3
vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[0] - bounds_min), color)
vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
}
}
draw.prepare_shape(layer, vertices)
}
triangle_strip :: proc(
layer: ^draw.Layer,
points: []draw.Vec2,
color: draw.Color,
origin: draw.Vec2 = {},
rotation: f32 = 0,
temp_allocator := context.temp_allocator,
) {
if len(points) < 3 do return
triangle_count := len(points) - 2
vertex_count := triangle_count * 3
vertices := make([]draw.Vertex, vertex_count, temp_allocator)
defer delete(vertices, temp_allocator)
if !draw.needs_transform(origin, rotation) {
for i in 0 ..< triangle_count {
idx := i * 3
if i % 2 == 0 {
vertices[idx + 0] = solid_vertex(points[i], color)
vertices[idx + 1] = solid_vertex(points[i + 1], color)
vertices[idx + 2] = solid_vertex(points[i + 2], color)
} else {
vertices[idx + 0] = solid_vertex(points[i + 1], color)
vertices[idx + 1] = solid_vertex(points[i], color)
vertices[idx + 2] = solid_vertex(points[i + 2], color)
}
}
} else {
bounds_min := draw.Vec2{max(f32), max(f32)}
for point in points {
bounds_min.x = min(bounds_min.x, point.x)
bounds_min.y = min(bounds_min.y, point.y)
}
transform := draw.build_pivot_rotation(bounds_min, origin, rotation)
for i in 0 ..< triangle_count {
idx := i * 3
if i % 2 == 0 {
vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 2] - bounds_min), color)
} else {
vertices[idx + 0] = solid_vertex(draw.apply_transform(transform, points[i + 1] - bounds_min), color)
vertices[idx + 1] = solid_vertex(draw.apply_transform(transform, points[i] - bounds_min), color)
vertices[idx + 2] = solid_vertex(draw.apply_transform(transform, points[i + 2] - bounds_min), color)
}
}
}
draw.prepare_shape(layer, vertices)
}
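`triangle_strip` expands a strip into an explicit triangle list and swaps the first two points of every odd triangle so the winding stays consistent. The index pattern alone can be sketched as below; `strip_to_list` is a hypothetical helper written for this note and works on indices instead of `draw.Vertex` values.

```odin
package main

import "core:fmt"

// Expand a triangle strip of `point_count` points into triangle-list index triples.
// Odd-numbered triangles swap their first two indices so that every triangle
// keeps the same winding as the first one.
strip_to_list :: proc(point_count: int, allocator := context.allocator) -> [][3]int {
	if point_count < 3 do return nil
	triangle_count := point_count - 2
	triangles := make([][3]int, triangle_count, allocator)
	for i in 0 ..< triangle_count {
		if i % 2 == 0 {
			triangles[i] = {i, i + 1, i + 2}
		} else {
			triangles[i] = {i + 1, i, i + 2}
		}
	}
	return triangles
}

main :: proc() {
	// 5 points give 3 triangles: (0,1,2), (2,1,3), (2,3,4)
	triangles := strip_to_list(5)
	defer delete(triangles)
	for triangle in triangles do fmt.println(triangle)
}
```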
+20 -18
View File
@@ -79,7 +79,7 @@ register_font :: proc(bytes: []u8) -> (id: Font_Id, ok: bool) #optional_ok {
Text :: struct { Text :: struct {
sdl_text: ^sdl_ttf.Text, sdl_text: ^sdl_ttf.Text,
position: [2]f32, position: Vec2,
color: Color, color: Color,
} }
@@ -129,16 +129,17 @@ cache_get_or_update :: proc(key: Cache_Key, c_str: cstring, font: ^sdl_ttf.Font)
text :: proc( text :: proc(
layer: ^Layer, layer: ^Layer,
text_string: string, text_string: string,
position: [2]f32, position: Vec2,
font_id: Font_Id, font_id: Font_Id,
font_size: u16 = 44, font_size: u16 = DFT_FONT_SIZE,
color: Color = BLACK, color: Color = DFT_TEXT_COLOR,
origin: [2]f32 = {0, 0}, origin: Vec2 = {},
rotation: f32 = 0, rotation: f32 = 0,
id: Maybe(u32) = nil, id: Maybe(u32) = nil,
temp_allocator := context.temp_allocator, temp_allocator := context.temp_allocator,
) { ) {
c_str := strings.clone_to_cstring(text_string, temp_allocator) c_str := strings.clone_to_cstring(text_string, temp_allocator)
defer delete(c_str, temp_allocator)
sdl_text: ^sdl_ttf.Text sdl_text: ^sdl_ttf.Text
cached := false cached := false
@@ -176,10 +177,11 @@ text :: proc(
measure_text :: proc( measure_text :: proc(
text_string: string, text_string: string,
font_id: Font_Id, font_id: Font_Id,
font_size: u16 = 44, font_size: u16 = DFT_FONT_SIZE,
allocator := context.temp_allocator, allocator := context.temp_allocator,
) -> [2]f32 { ) -> Vec2 {
c_str := strings.clone_to_cstring(text_string, allocator) c_str := strings.clone_to_cstring(text_string, allocator)
defer delete(c_str, allocator)
width, height: c.int width, height: c.int
if !sdl_ttf.GetStringSize(get_font(font_id, font_size), c_str, 0, &width, &height) { if !sdl_ttf.GetStringSize(get_font(font_id, font_size), c_str, 0, &width, &height) {
log.panicf("Failed to measure text: %s", sdl.GetError()) log.panicf("Failed to measure text: %s", sdl.GetError())
@@ -191,46 +193,46 @@ measure_text :: proc(
// ----- Text anchor helpers ----------- // ----- Text anchor helpers -----------
// --------------------------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
center_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { center_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return size * 0.5 return size * 0.5
} }
top_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { top_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
return {0, 0} return {0, 0}
} }
top_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { top_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {size.x * 0.5, 0} return {size.x * 0.5, 0}
} }
top_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { top_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {size.x, 0} return {size.x, 0}
} }
left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {0, size.y * 0.5} return {0, size.y * 0.5}
} }
right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {size.x, size.y * 0.5} return {size.x, size.y * 0.5}
} }
bottom_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { bottom_left_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {0, size.y} return {0, size.y}
} }
bottom_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { bottom_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return {size.x * 0.5, size.y} return {size.x * 0.5, size.y}
} }
bottom_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = 44) -> [2]f32 { bottom_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u16 = DFT_FONT_SIZE) -> Vec2 {
size := measure_text(text_string, font_id, font_size) size := measure_text(text_string, font_id, font_size)
return size return size
} }
@@ -244,7 +246,7 @@ bottom_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u
// After calling this, subsequent text draws with an `id` will re-create their cache entries. // After calling this, subsequent text draws with an `id` will re-create their cache entries.
clear_text_cache :: proc() { clear_text_cache :: proc() {
for _, sdl_text in GLOB.text_cache.cache { for _, sdl_text in GLOB.text_cache.cache {
sdl_ttf.DestroyText(sdl_text) append(&GLOB.pending_text_releases, sdl_text)
} }
clear(&GLOB.text_cache.cache) clear(&GLOB.text_cache.cache)
} }
@@ -257,7 +259,7 @@ clear_text_cache_entry :: proc(id: u32) {
key := Cache_Key{id, .Custom} key := Cache_Key{id, .Custom}
sdl_text, ok := GLOB.text_cache.cache[key] sdl_text, ok := GLOB.text_cache.cache[key]
if ok { if ok {
sdl_ttf.DestroyText(sdl_text) append(&GLOB.pending_text_releases, sdl_text)
delete_key(&GLOB.text_cache.cache, key) delete_key(&GLOB.text_cache.cache, key)
} }
} }
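The two hunks above stop destroying cached text objects immediately and append them to `GLOB.pending_text_releases` instead. The diff does not show where that queue is drained, so the sketch below assumes a once-per-frame flush, analogous to `process_pending_texture_releases` in the texture file; `Resource`, `release_later` and `flush_pending_releases` are hypothetical names.

```odin
package main

import "core:fmt"

Resource :: distinct int

// Resources waiting to be destroyed once the current frame no longer uses them.
pending_releases: [dynamic]Resource

// Queue the resource instead of destroying it: work recorded earlier in the
// frame may still reference it.
release_later :: proc(resource: Resource) {
	append(&pending_releases, resource)
}

// Assumed to run once per frame, after the frame's work has been submitted.
flush_pending_releases :: proc() {
	for resource in pending_releases {
		fmt.println("destroying resource", int(resource)) // stand-in for the real destroy call
	}
	clear(&pending_releases)
}

main :: proc() {
	defer delete(pending_releases)
	release_later(Resource(1))
	release_later(Resource(2))
	flush_pending_releases()
}
```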
+414
View File
@@ -0,0 +1,414 @@
package draw
import "core:log"
import "core:mem"
import sdl "vendor:sdl3"
Texture_Id :: distinct u32
INVALID_TEXTURE :: Texture_Id(0) // Slot 0 is reserved/unused
Texture_Kind :: enum u8 {
Static, // Uploaded once, never changes (QR codes, decoded PNGs, icons)
Dynamic, // Updatable via update_texture_region
Stream, // Frequent full re-uploads (video, procedural)
}
Sampler_Preset :: enum u8 {
Nearest_Clamp,
Linear_Clamp,
Nearest_Repeat,
Linear_Repeat,
}
SAMPLER_PRESET_COUNT :: 4
Fit_Mode :: enum u8 {
Stretch, // Fill rect, may distort aspect ratio (default)
Fit, // Preserve aspect, letterbox (may leave margins)
Fill, // Preserve aspect, center-crop (may crop edges)
Tile, // Repeat at native texture size
Center, // 1:1 pixel size, centered, no scaling
}
Texture_Desc :: struct {
width: u32,
height: u32,
depth_or_layers: u32,
type: sdl.GPUTextureType,
format: sdl.GPUTextureFormat,
usage: sdl.GPUTextureUsageFlags,
mip_levels: u32,
kind: Texture_Kind,
}
// Internal slot; not exported.
@(private)
Texture_Slot :: struct {
gpu_texture: ^sdl.GPUTexture,
desc: Texture_Desc,
generation: u32,
}
// State stored in GLOB
// This file references:
// GLOB.device : ^sdl.GPUDevice
// GLOB.texture_slots : [dynamic]Texture_Slot
// GLOB.texture_free_list : [dynamic]u32
// GLOB.pending_texture_releases : [dynamic]Texture_Id
// GLOB.samplers : [SAMPLER_PRESET_COUNT]^sdl.GPUSampler
Clay_Image_Data :: struct {
texture_id: Texture_Id,
fit: Fit_Mode,
tint: Color,
}
clay_image_data :: proc(id: Texture_Id, fit: Fit_Mode = .Stretch, tint: Color = WHITE) -> Clay_Image_Data {
return {texture_id = id, fit = fit, tint = tint}
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Registration -------------
// ---------------------------------------------------------------------------------------------------------------------
// Register a texture. Draw owns the GPU resource and releases it on unregister.
// `data` is tightly-packed row-major bytes matching desc.format.
// The caller may free `data` immediately after this proc returns.
@(require_results)
register_texture :: proc(desc: Texture_Desc, data: []u8) -> (id: Texture_Id, ok: bool) {
device := GLOB.device
if device == nil {
log.error("register_texture called before draw.init()")
return INVALID_TEXTURE, false
}
assert(desc.width > 0, "Texture_Desc.width must be > 0")
assert(desc.height > 0, "Texture_Desc.height must be > 0")
assert(desc.depth_or_layers > 0, "Texture_Desc.depth_or_layers must be > 0")
assert(desc.mip_levels > 0, "Texture_Desc.mip_levels must be > 0")
assert(desc.usage != {}, "Texture_Desc.usage must not be empty (e.g. {.SAMPLER})")
// Create the GPU texture
gpu_texture := sdl.CreateGPUTexture(
device,
sdl.GPUTextureCreateInfo {
type = desc.type,
format = desc.format,
usage = desc.usage,
width = desc.width,
height = desc.height,
layer_count_or_depth = desc.depth_or_layers,
num_levels = desc.mip_levels,
sample_count = ._1,
},
)
if gpu_texture == nil {
log.errorf("Failed to create GPU texture (%dx%d): %s", desc.width, desc.height, sdl.GetError())
return INVALID_TEXTURE, false
}
// Upload pixel data via a transfer buffer
if len(data) > 0 {
data_size := u32(len(data))
transfer := sdl.CreateGPUTransferBuffer(
device,
sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = data_size},
)
if transfer == nil {
log.errorf("Failed to create texture transfer buffer: %s", sdl.GetError())
sdl.ReleaseGPUTexture(device, gpu_texture)
return INVALID_TEXTURE, false
}
defer sdl.ReleaseGPUTransferBuffer(device, transfer)
mapped := sdl.MapGPUTransferBuffer(device, transfer, false)
if mapped == nil {
log.errorf("Failed to map texture transfer buffer: %s", sdl.GetError())
sdl.ReleaseGPUTexture(device, gpu_texture)
return INVALID_TEXTURE, false
}
mem.copy(mapped, raw_data(data), int(data_size))
sdl.UnmapGPUTransferBuffer(device, transfer)
cmd_buffer := sdl.AcquireGPUCommandBuffer(device)
if cmd_buffer == nil {
log.errorf("Failed to acquire command buffer for texture upload: %s", sdl.GetError())
sdl.ReleaseGPUTexture(device, gpu_texture)
return INVALID_TEXTURE, false
}
copy_pass := sdl.BeginGPUCopyPass(cmd_buffer)
sdl.UploadToGPUTexture(
copy_pass,
sdl.GPUTextureTransferInfo{transfer_buffer = transfer},
sdl.GPUTextureRegion{texture = gpu_texture, w = desc.width, h = desc.height, d = desc.depth_or_layers},
false,
)
sdl.EndGPUCopyPass(copy_pass)
if !sdl.SubmitGPUCommandBuffer(cmd_buffer) {
log.errorf("Failed to submit texture upload: %s", sdl.GetError())
sdl.ReleaseGPUTexture(device, gpu_texture)
return INVALID_TEXTURE, false
}
}
// Allocate a slot (reuse from free list or append)
slot_index: u32
if len(GLOB.texture_free_list) > 0 {
slot_index = pop(&GLOB.texture_free_list)
GLOB.texture_slots[slot_index] = Texture_Slot {
gpu_texture = gpu_texture,
desc = desc,
generation = GLOB.texture_slots[slot_index].generation + 1,
}
} else {
slot_index = u32(len(GLOB.texture_slots))
append(&GLOB.texture_slots, Texture_Slot{gpu_texture = gpu_texture, desc = desc, generation = 1})
}
return Texture_Id(slot_index), true
}
// Queue a texture for release at the end of the current frame.
// The GPU resource is not freed immediately; see "Deferred release" in the README.
unregister_texture :: proc(id: Texture_Id) {
if id == INVALID_TEXTURE do return
append(&GLOB.pending_texture_releases, id)
}
// Re-upload a sub-region of a Dynamic texture.
update_texture_region :: proc(id: Texture_Id, region: Rectangle, data: []u8) {
if id == INVALID_TEXTURE do return
slot := &GLOB.texture_slots[u32(id)]
if slot.gpu_texture == nil do return
device := GLOB.device
data_size := u32(len(data))
if data_size == 0 do return
transfer := sdl.CreateGPUTransferBuffer(
device,
sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = data_size},
)
if transfer == nil {
log.errorf("Failed to create transfer buffer for texture region update: %s", sdl.GetError())
return
}
defer sdl.ReleaseGPUTransferBuffer(device, transfer)
mapped := sdl.MapGPUTransferBuffer(device, transfer, false)
if mapped == nil {
log.errorf("Failed to map transfer buffer for texture region update: %s", sdl.GetError())
return
}
mem.copy(mapped, raw_data(data), int(data_size))
sdl.UnmapGPUTransferBuffer(device, transfer)
cmd_buffer := sdl.AcquireGPUCommandBuffer(device)
if cmd_buffer == nil {
log.errorf("Failed to acquire command buffer for texture region update: %s", sdl.GetError())
return
}
copy_pass := sdl.BeginGPUCopyPass(cmd_buffer)
sdl.UploadToGPUTexture(
copy_pass,
sdl.GPUTextureTransferInfo{transfer_buffer = transfer},
sdl.GPUTextureRegion {
texture = slot.gpu_texture,
x = u32(region.x),
y = u32(region.y),
w = u32(region.width),
h = u32(region.height),
d = 1,
},
false,
)
sdl.EndGPUCopyPass(copy_pass)
if !sdl.SubmitGPUCommandBuffer(cmd_buffer) {
log.errorf("Failed to submit texture region update: %s", sdl.GetError())
}
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Helpers -------------
// ---------------------------------------------------------------------------------------------------------------------
// Compute UV rect, recommended sampler, and inner rect for a given fit mode.
// `rect` is the target drawing area; `texture_id` identifies the texture whose
// pixel dimensions are looked up via texture_size().
// For Fit mode, `inner_rect` is smaller than `rect` (centered). For all other modes, `inner_rect == rect`.
fit_params :: proc(
fit: Fit_Mode,
rect: Rectangle,
texture_id: Texture_Id,
) -> (
uv_rect: Rectangle,
sampler: Sampler_Preset,
inner_rect: Rectangle,
) {
size := texture_size(texture_id)
texture_width := f32(size.x)
texture_height := f32(size.y)
rect_width := rect.width
rect_height := rect.height
inner_rect = rect
if texture_width == 0 || texture_height == 0 || rect_width == 0 || rect_height == 0 {
return {0, 0, 1, 1}, .Linear_Clamp, inner_rect
}
texture_aspect := texture_width / texture_height
rect_aspect := rect_width / rect_height
switch fit {
case .Stretch: return {0, 0, 1, 1}, .Linear_Clamp, inner_rect
case .Fill:
if texture_aspect > rect_aspect {
// Texture wider than rect: crop sides
scale := rect_aspect / texture_aspect
margin := (1 - scale) * 0.5
return {margin, 0, 1 - margin, 1}, .Linear_Clamp, inner_rect
} else {
// Texture taller than rect: crop top/bottom
scale := texture_aspect / rect_aspect
margin := (1 - scale) * 0.5
return {0, margin, 1, 1 - margin}, .Linear_Clamp, inner_rect
}
case .Fit:
// Preserve aspect, fit inside rect. Returns a shrunken inner_rect.
if texture_aspect > rect_aspect {
// Image wider than rect: letterbox top/bottom
fit_height := rect_width / texture_aspect
padding := (rect_height - fit_height) * 0.5
inner_rect = Rectangle{rect.x, rect.y + padding, rect_width, fit_height}
} else {
// Image taller than rect: letterbox left/right
fit_width := rect_height * texture_aspect
padding := (rect_width - fit_width) * 0.5
inner_rect = Rectangle{rect.x + padding, rect.y, fit_width, rect_height}
}
return {0, 0, 1, 1}, .Linear_Clamp, inner_rect
case .Tile:
uv_width := rect_width / texture_width
uv_height := rect_height / texture_height
return {0, 0, uv_width, uv_height}, .Linear_Repeat, inner_rect
case .Center:
u_half := rect_width / (2 * texture_width)
v_half := rect_height / (2 * texture_height)
return {0.5 - u_half, 0.5 - v_half, 0.5 + u_half, 0.5 + v_half}, .Nearest_Clamp, inner_rect
}
return {0, 0, 1, 1}, .Linear_Clamp, inner_rect
}
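The `.Fit` branch of `fit_params` is plain aspect-ratio arithmetic. A standalone sketch with concrete numbers follows; `Rect` and `letterbox` are illustrative stand-ins for `Rectangle` and the `.Fit` case, and the texture size is passed in directly instead of being looked up through `texture_size()`.

```odin
package main

import "core:fmt"

Rect :: struct {
	x, y, width, height: f32,
}

// Largest rect with the texture's aspect ratio that fits inside `rect`, centered.
letterbox :: proc(rect: Rect, texture_width, texture_height: f32) -> Rect {
	if texture_width == 0 || texture_height == 0 || rect.width == 0 || rect.height == 0 {
		return rect
	}
	texture_aspect := texture_width / texture_height
	rect_aspect := rect.width / rect.height
	if texture_aspect > rect_aspect {
		// Texture is relatively wider: use the full width, pad top and bottom.
		fit_height := rect.width / texture_aspect
		padding := (rect.height - fit_height) * 0.5
		return {rect.x, rect.y + padding, rect.width, fit_height}
	}
	// Texture is relatively taller: use the full height, pad left and right.
	fit_width := rect.height * texture_aspect
	padding := (rect.width - fit_width) * 0.5
	return {rect.x + padding, rect.y, fit_width, rect.height}
}

main :: proc() {
	// A 200x100 texture (aspect 2) in a 100x100 rect becomes a 100x50 band
	// that starts 25 units below the top edge.
	fmt.println(letterbox({0, 0, 100, 100}, 200, 100))
}
```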
texture_size :: proc(id: Texture_Id) -> [2]u32 {
if id == INVALID_TEXTURE do return {0, 0}
slot := &GLOB.texture_slots[u32(id)]
return {slot.desc.width, slot.desc.height}
}
texture_format :: proc(id: Texture_Id) -> sdl.GPUTextureFormat {
if id == INVALID_TEXTURE do return .INVALID
return GLOB.texture_slots[u32(id)].desc.format
}
texture_kind :: proc(id: Texture_Id) -> Texture_Kind {
if id == INVALID_TEXTURE do return .Static
return GLOB.texture_slots[u32(id)].desc.kind
}
// Internal: get the raw GPU texture pointer for binding during draw.
@(private)
texture_gpu_handle :: proc(id: Texture_Id) -> ^sdl.GPUTexture {
if id == INVALID_TEXTURE do return nil
idx := u32(id)
if idx >= u32(len(GLOB.texture_slots)) do return nil
return GLOB.texture_slots[idx].gpu_texture
}
// Deferred release (called from draw.end / clear_global)
@(private)
process_pending_texture_releases :: proc() {
device := GLOB.device
for id in GLOB.pending_texture_releases {
idx := u32(id)
if idx >= u32(len(GLOB.texture_slots)) do continue
slot := &GLOB.texture_slots[idx]
if slot.gpu_texture != nil {
sdl.ReleaseGPUTexture(device, slot.gpu_texture)
slot.gpu_texture = nil
}
slot.generation += 1
append(&GLOB.texture_free_list, idx)
}
clear(&GLOB.pending_texture_releases)
}
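Texture ids are indices into a slot array whose freed entries are recycled through a free list, with a generation counter bumped on reuse. The sketch below shows that pattern in isolation. `Pool`, `acquire` and `release` are hypothetical, and two details are assumptions rather than things the diff shows: slot 0 is reserved by appending a dummy entry up front, and an `in_use` flag makes a second release of the same index a no-op.

```odin
package main

import "core:fmt"

Slot :: struct {
	in_use:     bool,
	generation: u32,
}

Pool :: struct {
	slots:     [dynamic]Slot,
	free_list: [dynamic]u32,
}

// Reuse a freed slot if one exists, otherwise grow the slot array.
acquire :: proc(pool: ^Pool) -> u32 {
	if len(pool.free_list) > 0 {
		index := pop(&pool.free_list)
		pool.slots[index].in_use = true
		pool.slots[index].generation += 1
		return index
	}
	append(&pool.slots, Slot{in_use = true, generation = 1})
	return u32(len(pool.slots) - 1)
}

// Return a slot to the free list. Out-of-range or already-free indices are
// ignored so the same index can never sit on the free list twice.
release :: proc(pool: ^Pool, index: u32) {
	if int(index) >= len(pool.slots) || !pool.slots[index].in_use do return
	pool.slots[index].in_use = false
	append(&pool.free_list, index)
}

main :: proc() {
	pool: Pool
	defer {
		delete(pool.slots)
		delete(pool.free_list)
	}
	append(&pool.slots, Slot{}) // assumption: slot 0 stays reserved as the invalid handle

	a := acquire(&pool) // first real slot, index 1
	release(&pool, a)
	release(&pool, a) // second release is ignored
	b := acquire(&pool) // reuses index 1 with generation 2
	fmt.println(a, b, pool.slots[b].generation)
}
```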
@(private)
get_sampler :: proc(preset: Sampler_Preset) -> ^sdl.GPUSampler {
idx := int(preset)
if GLOB.samplers[idx] != nil do return GLOB.samplers[idx]
// Lazily create
min_filter, mag_filter: sdl.GPUFilter
address_mode: sdl.GPUSamplerAddressMode
switch preset {
case .Nearest_Clamp:
min_filter = .NEAREST; mag_filter = .NEAREST; address_mode = .CLAMP_TO_EDGE
case .Linear_Clamp:
min_filter = .LINEAR; mag_filter = .LINEAR; address_mode = .CLAMP_TO_EDGE
case .Nearest_Repeat:
min_filter = .NEAREST; mag_filter = .NEAREST; address_mode = .REPEAT
case .Linear_Repeat:
min_filter = .LINEAR; mag_filter = .LINEAR; address_mode = .REPEAT
}
sampler := sdl.CreateGPUSampler(
GLOB.device,
sdl.GPUSamplerCreateInfo {
min_filter = min_filter,
mag_filter = mag_filter,
mipmap_mode = .LINEAR,
address_mode_u = address_mode,
address_mode_v = address_mode,
address_mode_w = address_mode,
},
)
if sampler == nil {
log.errorf("Failed to create sampler preset %v: %s", preset, sdl.GetError())
return GLOB.pipeline_2d_base.sampler // fallback to existing default sampler
}
GLOB.samplers[idx] = sampler
return sampler
}
// Internal: destroy all sampler pool entries. Called from draw.destroy().
@(private)
destroy_sampler_pool :: proc() {
device := GLOB.device
for &s in GLOB.samplers {
if s != nil {
sdl.ReleaseGPUSampler(device, s)
s = nil
}
}
}
// Internal: destroy all registered textures. Called from draw.destroy().
@(private)
destroy_all_textures :: proc() {
device := GLOB.device
for &slot in GLOB.texture_slots {
if slot.gpu_texture != nil {
sdl.ReleaseGPUTexture(device, slot.gpu_texture)
slot.gpu_texture = nil
}
}
delete(GLOB.texture_slots)
delete(GLOB.texture_free_list)
delete(GLOB.pending_texture_releases)
}
+34 -28
View File
@@ -2,6 +2,7 @@ package many_bits
import "base:builtin" import "base:builtin"
import "base:intrinsics" import "base:intrinsics"
import "base:runtime"
import "core:fmt" import "core:fmt"
import "core:slice" import "core:slice"
@@ -25,15 +26,20 @@ Bits :: struct {
length: int, // Total number of bits being stored length: int, // Total number of bits being stored
} }
delete :: proc(bits: Bits, allocator := context.allocator) { destroy :: proc(bits: Bits, allocator := context.allocator) -> runtime.Allocator_Error {
delete_slice(bits.int_array, allocator) return delete_slice(bits.int_array, allocator)
} }
make :: proc(#any_int length: int, allocator := context.allocator) -> Bits { create :: proc(
return Bits { #any_int length: int,
int_array = make_slice([]Int_Bits, ((length - 1) >> INDEX_SHIFT) + 1, allocator), allocator := context.allocator,
length = length, ) -> (
} bits: Bits,
err: runtime.Allocator_Error,
) #optional_allocator_error {
bits.int_array, err = make_slice([]Int_Bits, ((length - 1) >> INDEX_SHIFT) + 1, allocator)
bits.length = length
return bits, err
} }
// Sets all bits to 0 (false) // Sets all bits to 0 (false)
@@ -507,8 +513,8 @@ import "core:testing"
@(test) @(test)
test_set :: proc(t: ^testing.T) { test_set :: proc(t: ^testing.T) {
bits := make(128) bits := create(128)
defer delete(bits) defer destroy(bits)
set(bits, 0, true) set(bits, 0, true)
testing.expect_value(t, bits.int_array[0], Int_Bits{0}) testing.expect_value(t, bits.int_array[0], Int_Bits{0})
@@ -524,8 +530,8 @@ test_set :: proc(t: ^testing.T) {
@(test) @(test)
test_get :: proc(t: ^testing.T) { test_get :: proc(t: ^testing.T) {
bits := make(128) bits := create(128)
defer delete(bits) defer destroy(bits)
// Default is false // Default is false
testing.expect(t, !get(bits, 0)) testing.expect(t, !get(bits, 0))
@@ -560,8 +566,8 @@ test_get :: proc(t: ^testing.T) {
@(test) @(test)
test_set_true_set_false :: proc(t: ^testing.T) { test_set_true_set_false :: proc(t: ^testing.T) {
bits := make(128) bits := create(128)
defer delete(bits) defer destroy(bits)
// set_true within first uint // set_true within first uint
set_true(bits, 0) set_true(bits, 0)
@@ -605,8 +611,8 @@ all_true_test :: proc(t: ^testing.T) {
uint_max := UINT_MAX uint_max := UINT_MAX
all_ones := transmute(Int_Bits)uint_max all_ones := transmute(Int_Bits)uint_max
bits := make(132) bits := create(132)
defer delete(bits) defer destroy(bits)
bits.int_array[0] = all_ones bits.int_array[0] = all_ones
bits.int_array[1] = all_ones bits.int_array[1] = all_ones
@@ -616,8 +622,8 @@ all_true_test :: proc(t: ^testing.T) {
bits.int_array[2] = {0, 1, 2} bits.int_array[2] = {0, 1, 2}
testing.expect(t, !all_true(bits)) testing.expect(t, !all_true(bits))
bits2 := make(1) bits2 := create(1)
defer delete(bits2) defer destroy(bits2)
bits2.int_array[0] = {0} bits2.int_array[0] = {0}
testing.expect(t, all_true(bits2)) testing.expect(t, all_true(bits2))
@@ -628,8 +634,8 @@ test_range_true :: proc(t: ^testing.T) {
uint_max := UINT_MAX uint_max := UINT_MAX
all_ones := transmute(Int_Bits)uint_max all_ones := transmute(Int_Bits)uint_max
bits := make(192) bits := create(192)
defer delete(bits) defer destroy(bits)
// Empty range is vacuously true // Empty range is vacuously true
testing.expect(t, range_true(bits, 0, 0)) testing.expect(t, range_true(bits, 0, 0))
@@ -676,7 +682,7 @@ test_range_true :: proc(t: ^testing.T) {
@(test) @(test)
nearest_true_handles_same_word_and_boundaries :: proc(t: ^testing.T) { nearest_true_handles_same_word_and_boundaries :: proc(t: ^testing.T) {
bits := make(128, context.temp_allocator) bits := create(128, context.temp_allocator)
set_true(bits, 0) set_true(bits, 0)
set_true(bits, 10) set_true(bits, 10)
@@ -710,7 +716,7 @@ nearest_true_handles_same_word_and_boundaries :: proc(t: ^testing.T) {
@(test) @(test)
nearest_false_handles_same_word_and_boundaries :: proc(t: ^testing.T) { nearest_false_handles_same_word_and_boundaries :: proc(t: ^testing.T) {
bits := make(128, context.temp_allocator) bits := create(128, context.temp_allocator)
// Start with all bits true, then clear a few to false. // Start with all bits true, then clear a few to false.
for i := 0; i < bits.length; i += 1 { for i := 0; i < bits.length; i += 1 {
@@ -749,7 +755,7 @@ nearest_false_handles_same_word_and_boundaries :: proc(t: ^testing.T) {
@(test) @(test)
nearest_false_scans_across_words_and_returns_false_when_all_true :: proc(t: ^testing.T) { nearest_false_scans_across_words_and_returns_false_when_all_true :: proc(t: ^testing.T) {
bits := make(192, context.temp_allocator) bits := create(192, context.temp_allocator)
// Start with all bits true, then clear a couple far apart. // Start with all bits true, then clear a couple far apart.
for i := 0; i < bits.length; i += 1 { for i := 0; i < bits.length; i += 1 {
@@ -773,7 +779,7 @@ nearest_false_scans_across_words_and_returns_false_when_all_true :: proc(t: ^tes
@(test) @(test)
nearest_true_scans_across_words_and_returns_false_when_empty :: proc(t: ^testing.T) { nearest_true_scans_across_words_and_returns_false_when_empty :: proc(t: ^testing.T) {
bits := make(192, context.temp_allocator) bits := create(192, context.temp_allocator)
set_true(bits, 5) set_true(bits, 5)
set_true(bits, 130) set_true(bits, 130)
@@ -790,7 +796,7 @@ nearest_true_scans_across_words_and_returns_false_when_empty :: proc(t: ^testing
@(test) @(test)
nearest_false_handles_last_word_partial_length :: proc(t: ^testing.T) { nearest_false_handles_last_word_partial_length :: proc(t: ^testing.T) {
bits := make(130, context.temp_allocator) bits := create(130, context.temp_allocator)
// Start with all bits true, then clear the first and last valid bits. // Start with all bits true, then clear the first and last valid bits.
for i := 0; i < bits.length; i += 1 { for i := 0; i < bits.length; i += 1 {
@@ -811,7 +817,7 @@ nearest_false_handles_last_word_partial_length :: proc(t: ^testing.T) {
@(test) @(test)
nearest_true_handles_last_word_partial_length :: proc(t: ^testing.T) { nearest_true_handles_last_word_partial_length :: proc(t: ^testing.T) {
bits := make(130, context.temp_allocator) bits := create(130, context.temp_allocator)
set_true(bits, 0) set_true(bits, 0)
set_true(bits, 129) set_true(bits, 129)
@@ -828,7 +834,7 @@ nearest_true_handles_last_word_partial_length :: proc(t: ^testing.T) {
@(test) @(test)
iterator_basic_mixed_bits :: proc(t: ^testing.T) { iterator_basic_mixed_bits :: proc(t: ^testing.T) {
// Use non-word-aligned length to test partial last word handling // Use non-word-aligned length to test partial last word handling
bits := make(100, context.temp_allocator) bits := create(100, context.temp_allocator)
// Set specific bits: 0, 3, 64, 99 (last valid index) // Set specific bits: 0, 3, 64, 99 (last valid index)
set_true(bits, 0) set_true(bits, 0)
@@ -903,7 +909,7 @@ iterator_basic_mixed_bits :: proc(t: ^testing.T) {
@(test) @(test)
iterator_all_false_bits :: proc(t: ^testing.T) { iterator_all_false_bits :: proc(t: ^testing.T) {
// Use non-word-aligned length // Use non-word-aligned length
bits := make(100, context.temp_allocator) bits := create(100, context.temp_allocator)
// All bits default to false, no need to set anything // All bits default to false, no need to set anything
// Test iterate - should return all 100 bits as false // Test iterate - should return all 100 bits as false
@@ -944,7 +950,7 @@ iterator_all_false_bits :: proc(t: ^testing.T) {
@(test) @(test)
iterator_all_true_bits :: proc(t: ^testing.T) { iterator_all_true_bits :: proc(t: ^testing.T) {
// Use non-word-aligned length // Use non-word-aligned length
bits := make(100, context.temp_allocator) bits := create(100, context.temp_allocator)
// Set all bits to true // Set all bits to true
for i := 0; i < bits.length; i += 1 { for i := 0; i < bits.length; i += 1 {
set_true(bits, i) set_true(bits, i)
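The `create` signature above is tagged `#optional_allocator_error`, which is why the tests can keep writing `bits := create(128)` while other callers can still receive the error. A minimal sketch of both call styles, using a hypothetical `make_buffer` proc instead of the `many_bits` package:

```odin
package main

import "base:runtime"
import "core:fmt"

// Same shape as `create`: the trailing Allocator_Error may be dropped by callers.
make_buffer :: proc(
	length: int,
	allocator := context.allocator,
) -> (
	buffer: []u8,
	err: runtime.Allocator_Error,
) #optional_allocator_error {
	buffer, err = make([]u8, length, allocator)
	return
}

main :: proc() {
	// Style 1: ignore the error, as the updated tests do.
	a := make_buffer(16)
	defer delete(a)

	// Style 2: receive the error and handle it.
	b, err := make_buffer(32)
	if err != .None {
		fmt.eprintln("allocation failed:", err)
		return
	}
	defer delete(b)

	fmt.println(len(a), len(b))
}
```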
+44
View File
@@ -1,6 +1,8 @@
package meta package meta
import "core:fmt" import "core:fmt"
import "core:log"
import "core:mem"
import "core:os" import "core:os"
Command :: struct { Command :: struct {
@@ -20,6 +22,48 @@ COMMANDS :: []Command {
} }
main :: proc() { main :: proc() {
//----- General setup ----------------------------------
when ODIN_DEBUG {
// Temp
track_temp: mem.Tracking_Allocator
mem.tracking_allocator_init(&track_temp, context.temp_allocator)
context.temp_allocator = mem.tracking_allocator(&track_temp)
// Default
track: mem.Tracking_Allocator
mem.tracking_allocator_init(&track, context.allocator)
context.allocator = mem.tracking_allocator(&track)
// Log a warning about any memory that was not freed by the end of the program.
// This could be fine for some global state or it could be a memory leak.
defer {
// Temp allocator
if len(track_temp.bad_free_array) > 0 {
fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
for entry in track_temp.bad_free_array {
fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
}
}
mem.tracking_allocator_destroy(&track_temp)
// Default allocator
if len(track.allocation_map) > 0 {
fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map))
for _, entry in track.allocation_map {
fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
}
}
if len(track.bad_free_array) > 0 {
fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array))
for entry in track.bad_free_array {
fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
}
}
mem.tracking_allocator_destroy(&track)
}
// Logger
context.logger = log.create_console_logger()
defer log.destroy_console_logger(context.logger)
}
args := os.args[1:] args := os.args[1:]
if len(args) == 0 { if len(args) == 0 {
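Review note: the general setup added above wraps both allocators in `mem.Tracking_Allocator` and reports leftovers in a `defer`. The sketch below isolates that idea in a self-contained program. `count_leaks`, `leaky`, and `clean` are hypothetical names chosen for illustration and are not part of this repository. The sketch destroys the tracker unconditionally, whether or not anything was reported.

```odin
package tracking_sketch

import "core:fmt"
import "core:mem"

// Hypothetical helper: runs body under a tracking allocator and returns
// how many allocations it left unfreed.
count_leaks :: proc(body: proc()) -> int {
	track: mem.Tracking_Allocator
	mem.tracking_allocator_init(&track, context.allocator)
	// Always release the tracker's own bookkeeping, report or no report.
	defer mem.tracking_allocator_destroy(&track)
	context.allocator = mem.tracking_allocator(&track)

	body()

	leaks := len(track.allocation_map)
	for _, entry in track.allocation_map {
		fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
	}
	return leaks
}

// Allocates and deliberately never frees, so the tracker has something to report.
leaky :: proc() {
	p := new(int)
	p^ = 1
}

// Allocates and frees, so the tracker should report nothing.
clean :: proc() {
	p := new(int)
	free(p)
}

main :: proc() {
	fmt.println("leaky:", count_leaks(leaky))
	fmt.println("clean:", count_leaks(clean))
}
```

As the comment in the diff says, an unfreed allocation can be legitimate global state rather than a leak, so the count is a prompt to look, not a verdict.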
+22 -19
@@ -4,7 +4,8 @@
package phased_executor package phased_executor
import "base:intrinsics" import "base:intrinsics"
import q "core:container/queue" import "base:runtime"
import que "core:container/queue"
import "core:prof/spall" import "core:prof/spall"
import "core:sync" import "core:sync"
import "core:thread" import "core:thread"
@@ -18,7 +19,7 @@ DEFT_SPIN_LIMIT :: 2_500_000
Harness :: struct($T: typeid) where intrinsics.type_has_nil(T) { Harness :: struct($T: typeid) where intrinsics.type_has_nil(T) {
mutex: sync.Mutex, mutex: sync.Mutex,
condition: sync.Cond, condition: sync.Cond,
cmd_queue: q.Queue(T), cmd_queue: que.Queue(T),
spin: bool, spin: bool,
lock: levsync.Spinlock, lock: levsync.Spinlock,
_pad: [64 - size_of(uint)]u8, // We want join_count to have its own cache line _pad: [64 - size_of(uint)]u8, // We want join_count to have its own cache line
@@ -42,13 +43,13 @@ Executor :: struct($T: typeid) where intrinsics.type_has_nil(T) {
} }
//TODO: Provide a way to set some aspects of context for the executor threads. Namely a logger. //TODO: Provide a way to set some aspects of context for the executor threads. Namely a logger.
init_executor :: proc( init :: proc(
executor: ^Executor($T), executor: ^Executor($T),
#any_int num_threads: int, #any_int num_threads: int,
$on_command_received: proc(command: T), $on_command_received: proc(command: T),
#any_int spin_limit: uint = DEFT_SPIN_LIMIT, #any_int spin_limit: uint = DEFT_SPIN_LIMIT,
allocator := context.allocator, allocator := context.allocator,
) { ) -> runtime.Allocator_Error {
was_initialized, _ := intrinsics.atomic_compare_exchange_strong_explicit( was_initialized, _ := intrinsics.atomic_compare_exchange_strong_explicit(
&executor.initialized, &executor.initialized,
false, false,
@@ -60,9 +61,9 @@ init_executor :: proc(
slave_task := build_task(on_command_received) slave_task := build_task(on_command_received)
executor.spin_limit = spin_limit executor.spin_limit = spin_limit
executor.harnesses = make([]Harness(T), num_threads, allocator) executor.harnesses = make([]Harness(T), num_threads, allocator) or_return
for &harness in executor.harnesses { for &harness in executor.harnesses {
q.init(&harness.cmd_queue, allocator = allocator) que.init(&harness.cmd_queue, allocator = allocator) or_return
harness.spin = true harness.spin = true
} }
@@ -72,11 +73,11 @@ init_executor :: proc(
} }
thread.pool_start(&executor.thread_pool) thread.pool_start(&executor.thread_pool)
return return nil
} }
// Cleanly shuts down all executor tasks then destroys the executor // Cleanly shuts down all executor tasks then destroys the executor
destroy_executor :: proc(executor: ^Executor($T), allocator := context.allocator) { destroy :: proc(executor: ^Executor($T), allocator := context.allocator) -> runtime.Allocator_Error {
was_initialized, _ := intrinsics.atomic_compare_exchange_strong_explicit( was_initialized, _ := intrinsics.atomic_compare_exchange_strong_explicit(
&executor.initialized, &executor.initialized,
true, true,
@@ -90,7 +91,7 @@ destroy_executor :: proc(executor: ^Executor($T), allocator := context.allocator
for &harness in executor.harnesses { for &harness in executor.harnesses {
for { for {
if levsync.try_lock(&harness.lock) { if levsync.try_lock(&harness.lock) {
q.push_back(&harness.cmd_queue, nil) que.push_back(&harness.cmd_queue, nil)
if !harness.spin { if !harness.spin {
sync.mutex_lock(&harness.mutex) sync.mutex_lock(&harness.mutex)
sync.cond_signal(&harness.condition) sync.cond_signal(&harness.condition)
@@ -105,9 +106,11 @@ destroy_executor :: proc(executor: ^Executor($T), allocator := context.allocator
thread.pool_join(&executor.thread_pool) thread.pool_join(&executor.thread_pool)
thread.pool_destroy(&executor.thread_pool) thread.pool_destroy(&executor.thread_pool)
for &harness in executor.harnesses { for &harness in executor.harnesses {
q.destroy(&harness.cmd_queue) que.destroy(&harness.cmd_queue)
} }
delete(executor.harnesses, allocator) delete(executor.harnesses, allocator) or_return
return nil
} }
build_task :: proc( build_task :: proc(
@@ -131,10 +134,10 @@ build_task :: proc(
spin_count: uint = 0 spin_count: uint = 0
spin_loop: for { spin_loop: for {
if levsync.try_lock(&harness.lock) { if levsync.try_lock(&harness.lock) {
if q.len(harness.cmd_queue) > 0 { if que.len(harness.cmd_queue) > 0 {
// Execute command // Execute command
command := q.pop_front(&harness.cmd_queue) command := que.pop_front(&harness.cmd_queue)
levsync.unlock(&harness.lock) levsync.unlock(&harness.lock)
if command == nil do return if command == nil do return
on_command_received(command) on_command_received(command)
@@ -163,7 +166,7 @@ build_task :: proc(
defer intrinsics.cpu_relax() defer intrinsics.cpu_relax()
if levsync.try_lock(&harness.lock) { if levsync.try_lock(&harness.lock) {
defer levsync.unlock(&harness.lock) defer levsync.unlock(&harness.lock)
if q.len(harness.cmd_queue) > 0 { if que.len(harness.cmd_queue) > 0 {
harness.spin = true harness.spin = true
break cond_loop break cond_loop
} else { } else {
@@ -190,9 +193,9 @@ exec_command :: proc(executor: ^Executor($T), command: T) {
} }
harness := &executor.harnesses[executor.harness_index] harness := &executor.harnesses[executor.harness_index]
if levsync.try_lock(&harness.lock) { if levsync.try_lock(&harness.lock) {
if q.len(harness.cmd_queue) <= executor.cmd_queue_floor { if que.len(harness.cmd_queue) <= executor.cmd_queue_floor {
q.push_back(&harness.cmd_queue, command) que.push_back(&harness.cmd_queue, command)
executor.cmd_queue_floor = q.len(harness.cmd_queue) executor.cmd_queue_floor = que.len(harness.cmd_queue)
slave_sleeping := !harness.spin slave_sleeping := !harness.spin
// Must release lock before signalling to avoid race from slave spurious wakeup // Must release lock before signalling to avoid race from slave spurious wakeup
levsync.unlock(&harness.lock) levsync.unlock(&harness.lock)
@@ -258,7 +261,7 @@ stress_test_executor :: proc(t: ^testing.T) {
defer free(exec_counts) defer free(exec_counts)
executor: Executor(Stress_Cmd) executor: Executor(Stress_Cmd)
init_executor(&executor, STRESS_NUM_THREADS, stress_handler, spin_limit = 500) init(&executor, STRESS_NUM_THREADS, stress_handler, spin_limit = 500)
for round in 0 ..< STRESS_NUM_ROUNDS { for round in 0 ..< STRESS_NUM_ROUNDS {
base := round * STRESS_CMDS_PER_ROUND base := round * STRESS_CMDS_PER_ROUND
@@ -281,6 +284,6 @@ stress_test_executor :: proc(t: ^testing.T) {
// Explicitly destroy to verify clean shutdown. // Explicitly destroy to verify clean shutdown.
// If destroy returns, all threads received the nil sentinel and exited, // If destroy returns, all threads received the nil sentinel and exited,
// and thread.pool_join completed without deadlock. // and thread.pool_join completed without deadlock.
destroy_executor(&executor) destroy(&executor)
testing.expect(t, !executor.initialized, "Executor still marked initialized after destroy") testing.expect(t, !executor.initialized, "Executor still marked initialized after destroy")
} }
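Review note: `init` and `destroy` in this file now return `runtime.Allocator_Error` and forward failures with `or_return`. The sketch below shows the same pattern in a self-contained form. `Worker_Set` is a hypothetical stand-in for `Executor` and is not part of this repository. Only `make`, `delete`, and the `core:container/queue` calls already used in the diff are assumed.

```odin
package executor_error_sketch

import "base:runtime"
import que "core:container/queue"
import "core:fmt"

// Hypothetical stand-in for the executor: one command queue per worker slot.
Worker_Set :: struct {
	queues: []que.Queue(int),
}

// Every fallible allocation forwards its error to the caller via or_return.
init :: proc(set: ^Worker_Set, count: int, allocator := context.allocator) -> runtime.Allocator_Error {
	set.queues = make([]que.Queue(int), count, allocator) or_return
	for &q in set.queues {
		que.init(&q, allocator = allocator) or_return
	}
	return nil
}

// Releases the queues first, then the slice that holds them.
destroy :: proc(set: ^Worker_Set, allocator := context.allocator) -> runtime.Allocator_Error {
	for &q in set.queues {
		que.destroy(&q)
	}
	delete(set.queues, allocator) or_return
	set.queues = nil
	return nil
}

main :: proc() {
	set: Worker_Set
	if err := init(&set, 4); err != .None {
		fmt.eprintln("init failed:", err)
		return
	}
	defer destroy(&set)

	que.push_back(&set.queues[0], 42)
	fmt.println("queued commands:", que.len(set.queues[0]))
}
```

The stress test in this diff calls `init` without inspecting the result. A caller that wants to react to a failed allocation could check it the way `main` does here.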
+44 -68
@@ -1,39 +1,32 @@
package examples package examples
import "core:fmt" import "core:fmt"
import "core:log"
import "core:mem" import "core:mem"
import "core:os" import "core:os"
import qr ".." import qr ".."
main :: proc() { main :: proc() {
//----- Tracking allocator ---------------------------------- //----- General setup ----------------------------------
{ {
tracking_temp_allocator := false
// Temp // Temp
track_temp: mem.Tracking_Allocator track_temp: mem.Tracking_Allocator
if tracking_temp_allocator { mem.tracking_allocator_init(&track_temp, context.temp_allocator)
mem.tracking_allocator_init(&track_temp, context.temp_allocator) context.temp_allocator = mem.tracking_allocator(&track_temp)
context.temp_allocator = mem.tracking_allocator(&track_temp)
}
// Default // Default
track: mem.Tracking_Allocator track: mem.Tracking_Allocator
mem.tracking_allocator_init(&track, context.allocator) mem.tracking_allocator_init(&track, context.allocator)
context.allocator = mem.tracking_allocator(&track) context.allocator = mem.tracking_allocator(&track)
// Log a warning about any memory that was not freed by the end of the program.
// This could be fine for some global state or it could be a memory leak.
defer { defer {
// Temp allocator // Temp allocator
if tracking_temp_allocator { if len(track_temp.bad_free_array) > 0 {
if len(track_temp.allocation_map) > 0 { fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
fmt.eprintf("=== %v allocations not freed - temp allocator: ===\n", len(track_temp.allocation_map)) for entry in track_temp.bad_free_array {
for _, entry in track_temp.allocation_map { fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
}
}
if len(track_temp.bad_free_array) > 0 {
fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
for entry in track_temp.bad_free_array {
fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
}
} }
mem.tracking_allocator_destroy(&track_temp) mem.tracking_allocator_destroy(&track_temp)
} }
@@ -52,6 +45,9 @@ main :: proc() {
} }
mem.tracking_allocator_destroy(&track) mem.tracking_allocator_destroy(&track)
} }
// Logger
context.logger = log.create_console_logger()
defer log.destroy_console_logger(context.logger)
} }
args := os.args args := os.args
@@ -73,57 +69,32 @@ main :: proc() {
} }
} }
// -------------------------------------------------------------------------------------------------
// Utilities
// -------------------------------------------------------------------------------------------------
// Prints the given QR Code to the console.
print_qr :: proc(qrcode: []u8) {
size := qr.get_size(qrcode)
border :: 4
for y in -border ..< size + border {
for x in -border ..< size + border {
fmt.print("##" if qr.get_module(qrcode, x, y) else " ")
}
fmt.println()
}
fmt.println()
}
// -------------------------------------------------------------------------------------------------
// Demo: Basic
// -------------------------------------------------------------------------------------------------
// Creates a single QR Code, then prints it to the console. // Creates a single QR Code, then prints it to the console.
basic :: proc() { basic :: proc() {
text :: "Hello, world!" text :: "Hello, world!"
ecl :: qr.Ecc.Low ecl :: qr.Ecc.Low
qrcode: [qr.BUFFER_LEN_MAX]u8 qrcode: [qr.BUFFER_LEN_MAX]u8
ok := qr.encode(text, qrcode[:], ecl) ok := qr.encode_auto(text, qrcode[:], ecl)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
// -------------------------------------------------------------------------------------------------
// Demo: Variety
// -------------------------------------------------------------------------------------------------
// Creates a variety of QR Codes that exercise different features of the library. // Creates a variety of QR Codes that exercise different features of the library.
variety :: proc() { variety :: proc() {
qrcode: [qr.BUFFER_LEN_MAX]u8 qrcode: [qr.BUFFER_LEN_MAX]u8
{ // Numeric mode encoding (3.33 bits per digit) { // Numeric mode encoding (3.33 bits per digit)
ok := qr.encode("314159265358979323846264338327950288419716939937510", qrcode[:], qr.Ecc.Medium) ok := qr.encode_auto("314159265358979323846264338327950288419716939937510", qrcode[:], qr.Ecc.Medium)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
{ // Alphanumeric mode encoding (5.5 bits per character) { // Alphanumeric mode encoding (5.5 bits per character)
ok := qr.encode("DOLLAR-AMOUNT:$39.87 PERCENTAGE:100.00% OPERATIONS:+-*/", qrcode[:], qr.Ecc.High) ok := qr.encode_auto("DOLLAR-AMOUNT:$39.87 PERCENTAGE:100.00% OPERATIONS:+-*/", qrcode[:], qr.Ecc.High)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
{ // Unicode text as UTF-8 { // Unicode text as UTF-8
ok := qr.encode( ok := qr.encode_auto(
"\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1wa\xE3\x80\x81" + "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1wa\xE3\x80\x81" +
"\xE4\xB8\x96\xE7\x95\x8C\xEF\xBC\x81\x20\xCE\xB1\xCE\xB2\xCE\xB3\xCE\xB4", "\xE4\xB8\x96\xE7\x95\x8C\xEF\xBC\x81\x20\xCE\xB1\xCE\xB2\xCE\xB3\xCE\xB4",
qrcode[:], qrcode[:],
@@ -133,7 +104,7 @@ variety :: proc() {
} }
{ // Moderately large QR Code using longer text (from Lewis Carroll's Alice in Wonderland) { // Moderately large QR Code using longer text (from Lewis Carroll's Alice in Wonderland)
ok := qr.encode( ok := qr.encode_auto(
"Alice was beginning to get very tired of sitting by her sister on the bank, " + "Alice was beginning to get very tired of sitting by her sister on the bank, " +
"and of having nothing to do: once or twice she had peeped into the book her sister was reading, " + "and of having nothing to do: once or twice she had peeped into the book her sister was reading, " +
"but it had no pictures or conversations in it, 'and what is the use of a book,' thought Alice " + "but it had no pictures or conversations in it, 'and what is the use of a book,' thought Alice " +
@@ -148,10 +119,6 @@ variety :: proc() {
} }
} }
// -------------------------------------------------------------------------------------------------
// Demo: Segment
// -------------------------------------------------------------------------------------------------
// Creates QR Codes with manually specified segments for better compactness. // Creates QR Codes with manually specified segments for better compactness.
segment :: proc() { segment :: proc() {
qrcode: [qr.BUFFER_LEN_MAX]u8 qrcode: [qr.BUFFER_LEN_MAX]u8
@@ -163,7 +130,7 @@ segment :: proc() {
// Encode as single text (auto mode selection) // Encode as single text (auto mode selection)
{ {
concat :: silver0 + silver1 concat :: silver0 + silver1
ok := qr.encode(concat, qrcode[:], qr.Ecc.Low) ok := qr.encode_auto(concat, qrcode[:], qr.Ecc.Low)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
@@ -172,7 +139,7 @@ segment :: proc() {
seg_buf0: [qr.BUFFER_LEN_MAX]u8 seg_buf0: [qr.BUFFER_LEN_MAX]u8
seg_buf1: [qr.BUFFER_LEN_MAX]u8 seg_buf1: [qr.BUFFER_LEN_MAX]u8
segs := [2]qr.Segment{qr.make_alphanumeric(silver0, seg_buf0[:]), qr.make_numeric(silver1, seg_buf1[:])} segs := [2]qr.Segment{qr.make_alphanumeric(silver0, seg_buf0[:]), qr.make_numeric(silver1, seg_buf1[:])}
ok := qr.encode(segs[:], qr.Ecc.Low, qrcode[:]) ok := qr.encode_auto(segs[:], qr.Ecc.Low, qrcode[:])
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
} }
@@ -185,7 +152,7 @@ segment :: proc() {
// Encode as single text (auto mode selection) // Encode as single text (auto mode selection)
{ {
concat :: golden0 + golden1 + golden2 concat :: golden0 + golden1 + golden2
ok := qr.encode(concat, qrcode[:], qr.Ecc.Low) ok := qr.encode_auto(concat, qrcode[:], qr.Ecc.Low)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
@@ -201,7 +168,7 @@ segment :: proc() {
qr.make_numeric(golden1, seg_buf1[:]), qr.make_numeric(golden1, seg_buf1[:]),
qr.make_alphanumeric(golden2, seg_buf2[:]), qr.make_alphanumeric(golden2, seg_buf2[:]),
} }
ok := qr.encode(segs[:], qr.Ecc.Low, qrcode[:]) ok := qr.encode_auto(segs[:], qr.Ecc.Low, qrcode[:])
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
} }
@@ -219,7 +186,7 @@ segment :: proc() {
"\xEF\xBD\x84\xEF\xBD\x85\xEF\xBD\x93\xEF" + "\xEF\xBD\x84\xEF\xBD\x85\xEF\xBD\x93\xEF" +
"\xBD\x95\xE3\x80\x80\xCE\xBA\xCE\xB1\xEF" + "\xBD\x95\xE3\x80\x80\xCE\xBA\xCE\xB1\xEF" +
"\xBC\x9F" "\xBC\x9F"
ok := qr.encode(madoka, qrcode[:], qr.Ecc.Low) ok := qr.encode_auto(madoka, qrcode[:], qr.Ecc.Low)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
@@ -254,16 +221,12 @@ segment :: proc() {
seg.data = seg_buf[:(seg.bit_length + 7) / 8] seg.data = seg_buf[:(seg.bit_length + 7) / 8]
segs := [1]qr.Segment{seg} segs := [1]qr.Segment{seg}
ok := qr.encode(segs[:], qr.Ecc.Low, qrcode[:]) ok := qr.encode_auto(segs[:], qr.Ecc.Low, qrcode[:])
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
} }
} }
// -------------------------------------------------------------------------------------------------
// Demo: Mask
// -------------------------------------------------------------------------------------------------
// Creates QR Codes with the same size and contents but different mask patterns. // Creates QR Codes with the same size and contents but different mask patterns.
mask :: proc() { mask :: proc() {
qrcode: [qr.BUFFER_LEN_MAX]u8 qrcode: [qr.BUFFER_LEN_MAX]u8
@@ -271,10 +234,10 @@ mask :: proc() {
{ // Project Nayuki URL { // Project Nayuki URL
ok: bool ok: bool
ok = qr.encode("https://www.nayuki.io/", qrcode[:], qr.Ecc.High) ok = qr.encode_auto("https://www.nayuki.io/", qrcode[:], qr.Ecc.High)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
ok = qr.encode("https://www.nayuki.io/", qrcode[:], qr.Ecc.High, mask = qr.Mask.M3) ok = qr.encode_auto("https://www.nayuki.io/", qrcode[:], qr.Ecc.High, mask = qr.Mask.M3)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
@@ -290,16 +253,29 @@ mask :: proc() {
ok: bool ok: bool
ok = qr.encode(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M0) ok = qr.encode_auto(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M0)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
ok = qr.encode(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M1) ok = qr.encode_auto(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M1)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
ok = qr.encode(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M5) ok = qr.encode_auto(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M5)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
ok = qr.encode(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M7) ok = qr.encode_auto(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M7)
if ok do print_qr(qrcode[:]) if ok do print_qr(qrcode[:])
} }
} }
// Prints the given QR Code to the console.
print_qr :: proc(qrcode: []u8) {
size := qr.get_size(qrcode)
border :: 4
for y in -border ..< size + border {
for x in -border ..< size + border {
fmt.print("##" if qr.get_module(qrcode, x, y) else " ")
}
fmt.println()
}
fmt.println()
}
+128 -115
@@ -2,10 +2,30 @@ package qrcode
import "core:slice" import "core:slice"
VERSION_MIN :: 1
VERSION_MAX :: 40
// ------------------------------------------------------------------------------------------------- // The worst-case number of bytes needed to store one QR Code, up to and including version 40.
// Types BUFFER_LEN_MAX :: 3918 // buffer_len_for_version(VERSION_MAX)
// -------------------------------------------------------------------------------------------------
// Returns the number of bytes needed to store any QR Code up to and including the given version.
buffer_len_for_version :: #force_inline proc(n: int) -> int {
size := n * 4 + 17
return (size * size + 7) / 8 + 1
}
@(private)
LENGTH_OVERFLOW :: -1
@(private)
REED_SOLOMON_DEGREE_MAX :: 30
@(private)
PENALTY_N1 :: 3
@(private)
PENALTY_N2 :: 3
@(private)
PENALTY_N3 :: 40
@(private)
PENALTY_N4 :: 10
// The error correction level in a QR Code symbol. // The error correction level in a QR Code symbol.
Ecc :: enum { Ecc :: enum {
@@ -44,39 +64,6 @@ Segment :: struct {
bit_length: int, bit_length: int,
} }
// -------------------------------------------------------------------------------------------------
// Constants
// -------------------------------------------------------------------------------------------------
VERSION_MIN :: 1
VERSION_MAX :: 40
// The worst-case number of bytes needed to store one QR Code, up to and including version 40.
BUFFER_LEN_MAX :: 3918 // buffer_len_for_version(VERSION_MAX)
// Returns the number of bytes needed to store any QR Code up to and including the given version.
buffer_len_for_version :: #force_inline proc(n: int) -> int {
size := n * 4 + 17
return (size * size + 7) / 8 + 1
}
// -------------------------------------------------------------------------------------------------
// Private constants
// -------------------------------------------------------------------------------------------------
@(private)
LENGTH_OVERFLOW :: -1
@(private)
REED_SOLOMON_DEGREE_MAX :: 30
@(private)
PENALTY_N1 :: 3
@(private)
PENALTY_N2 :: 3
@(private)
PENALTY_N3 :: 40
@(private)
PENALTY_N4 :: 10
//odinfmt: disable //odinfmt: disable
// For generating error correction codes. Index 0 is padding (set to illegal value). // For generating error correction codes. Index 0 is padding (set to illegal value).
@(private) @(private)
@@ -96,10 +83,9 @@ NUM_ERROR_CORRECTION_BLOCKS := [4][41]i8{
} }
//odinfmt: enable //odinfmt: enable
// ---------------------------------------------------------------------------------------------------------------------
// ------------------------------------------------------------------------------------------------- // ----- Encode Procedures ------------------------
// Encode procedures // ---------------------------------------------------------------------------------------------------------------------
// -------------------------------------------------------------------------------------------------
// Encodes the given text string to a QR Code, automatically selecting // Encodes the given text string to a QR Code, automatically selecting
// numeric, alphanumeric, or byte mode based on content. // numeric, alphanumeric, or byte mode based on content.
@@ -117,7 +103,7 @@ NUM_ERROR_CORRECTION_BLOCKS := [4][41]i8{
// - The text cannot fit in any version within [min_version, max_version] at the given ECL. // - The text cannot fit in any version within [min_version, max_version] at the given ECL.
// - The encoded segment data exceeds the buffer capacity. // - The encoded segment data exceeds the buffer capacity.
@(require_results) @(require_results)
encode_text_explicit_temp :: proc( encode_text_manual :: proc(
text: string, text: string,
temp_buffer, qrcode: []u8, temp_buffer, qrcode: []u8,
ecl: Ecc, ecl: Ecc,
@@ -130,7 +116,7 @@ encode_text_explicit_temp :: proc(
) { ) {
text_len := len(text) text_len := len(text)
if text_len == 0 { if text_len == 0 {
return encode_segments_advanced_explicit_temp( return encode_segments_advanced_manual(
nil, nil,
ecl, ecl,
min_version, min_version,
@@ -162,7 +148,7 @@ encode_text_explicit_temp :: proc(
seg.data = temp_buffer[:text_len] seg.data = temp_buffer[:text_len]
} }
segs := [1]Segment{seg} segs := [1]Segment{seg}
return encode_segments_advanced_explicit_temp( return encode_segments_advanced_manual(
segs[:], segs[:],
ecl, ecl,
min_version, min_version,
@@ -211,13 +197,9 @@ encode_text_auto :: proc(
return false return false
} }
defer delete(temp_buffer, temp_allocator) defer delete(temp_buffer, temp_allocator)
return encode_text_explicit_temp(text, temp_buffer, qrcode, ecl, min_version, max_version, mask, boost_ecl) return encode_text_manual(text, temp_buffer, qrcode, ecl, min_version, max_version, mask, boost_ecl)
} }
encode_text :: proc {
encode_text_explicit_temp,
encode_text_auto,
}
// Encodes arbitrary binary data to a QR Code using byte mode. // Encodes arbitrary binary data to a QR Code using byte mode.
// //
@@ -234,7 +216,7 @@ encode_text :: proc {
// Returns ok=false when: // Returns ok=false when:
// - The payload cannot fit in any version within [min_version, max_version] at the given ECL. // - The payload cannot fit in any version within [min_version, max_version] at the given ECL.
@(require_results) @(require_results)
encode_binary :: proc( encode_binary_manual :: proc(
data_and_temp: []u8, data_and_temp: []u8,
data_len: int, data_len: int,
qrcode: []u8, qrcode: []u8,
@@ -256,7 +238,7 @@ encode_binary :: proc(
seg.num_chars = data_len seg.num_chars = data_len
seg.data = data_and_temp[:data_len] seg.data = data_and_temp[:data_len]
segs := [1]Segment{seg} segs := [1]Segment{seg}
return encode_segments_advanced( return encode_segments_advanced_manual(
segs[:], segs[:],
ecl, ecl,
min_version, min_version,
@@ -268,6 +250,55 @@ encode_binary :: proc(
) )
} }
// Encodes arbitrary binary data to a QR Code using byte mode,
// automatically allocating and freeing the temp buffer.
//
// Parameters:
// bin_data - [in] Payload bytes (aliased by the internal segment; not modified).
// qrcode - [out] On success, contains the encoded QR Code. On failure, qrcode[0] is
// set to 0.
// temp_allocator - Allocator used for the internal scratch buffer. Freed before return.
//
// qrcode must have length >= buffer_len_for_version(max_version).
//
// Returns ok=false when:
// - The payload cannot fit in any version within [min_version, max_version] at the given ECL.
// - The temp_allocator fails to allocate.
@(require_results)
encode_binary_auto :: proc(
bin_data: []u8,
qrcode: []u8,
ecl: Ecc,
min_version: int = VERSION_MIN,
max_version: int = VERSION_MAX,
mask: Maybe(Mask) = nil,
boost_ecl: bool = true,
temp_allocator := context.temp_allocator,
) -> (
ok: bool,
) {
seg: Segment
seg.mode = .Byte
seg.bit_length = calc_segment_bit_length(.Byte, len(bin_data))
if seg.bit_length == LENGTH_OVERFLOW {
qrcode[0] = 0
return false
}
seg.num_chars = len(bin_data)
seg.data = bin_data
segs := [1]Segment{seg}
return encode_segments_advanced_auto(
segs[:],
ecl,
min_version,
max_version,
mask,
boost_ecl,
qrcode,
temp_allocator,
)
}
// Encodes the given segments to a QR Code using default parameters // Encodes the given segments to a QR Code using default parameters
// (VERSION_MIN..VERSION_MAX, auto mask, boost ECL). // (VERSION_MIN..VERSION_MAX, auto mask, boost ECL).
// //
@@ -282,17 +313,8 @@ encode_binary :: proc(
// Returns ok=false when: // Returns ok=false when:
// - The total segment data exceeds the capacity of version 40 at the given ECL. // - The total segment data exceeds the capacity of version 40 at the given ECL.
@(require_results) @(require_results)
encode_segments_explicit_temp :: proc(segs: []Segment, ecl: Ecc, temp_buffer, qrcode: []u8) -> (ok: bool) { encode_segments_manual :: proc(segs: []Segment, ecl: Ecc, temp_buffer, qrcode: []u8) -> (ok: bool) {
return encode_segments_advanced_explicit_temp( return encode_segments_advanced_manual(segs, ecl, VERSION_MIN, VERSION_MAX, nil, true, temp_buffer, qrcode)
segs,
ecl,
VERSION_MIN,
VERSION_MAX,
nil,
true,
temp_buffer,
qrcode,
)
} }
// Encodes segments to a QR Code using default parameters, automatically allocating the temp buffer. // Encodes segments to a QR Code using default parameters, automatically allocating the temp buffer.
@@ -328,13 +350,9 @@ encode_segments_auto :: proc(
return false return false
} }
defer delete(temp_buffer, temp_allocator) defer delete(temp_buffer, temp_allocator)
return encode_segments_explicit_temp(segs, ecl, temp_buffer, qrcode) return encode_segments_manual(segs, ecl, temp_buffer, qrcode)
} }
encode_segments :: proc {
encode_segments_explicit_temp,
encode_segments_auto,
}
// Encodes the given segments to a QR Code with full control over version range, mask, and ECL boosting. // Encodes the given segments to a QR Code with full control over version range, mask, and ECL boosting.
// //
@@ -353,7 +371,7 @@ encode_segments :: proc {
// - The total segment data exceeds the capacity of every version in [min_version, max_version] // - The total segment data exceeds the capacity of every version in [min_version, max_version]
// at the given ECL. // at the given ECL.
@(require_results) @(require_results)
encode_segments_advanced_explicit_temp :: proc( encode_segments_advanced_manual :: proc(
segs: []Segment, segs: []Segment,
ecl: Ecc, ecl: Ecc,
min_version, max_version: int, min_version, max_version: int,
@@ -490,7 +508,7 @@ encode_segments_advanced_auto :: proc(
return false return false
} }
defer delete(temp_buffer, temp_allocator) defer delete(temp_buffer, temp_allocator)
return encode_segments_advanced_explicit_temp( return encode_segments_advanced_manual(
segs, segs,
ecl, ecl,
min_version, min_version,
@@ -502,24 +520,24 @@ encode_segments_advanced_auto :: proc(
) )
} }
encode_segments_advanced :: proc { encode_manual :: proc {
encode_segments_advanced_explicit_temp, encode_text_manual,
encode_segments_advanced_auto, encode_binary_manual,
encode_segments_manual,
encode_segments_advanced_manual,
} }
encode :: proc { encode_auto :: proc {
encode_text_explicit_temp,
encode_text_auto, encode_text_auto,
encode_binary, encode_binary_auto,
encode_segments_explicit_temp,
encode_segments_auto, encode_segments_auto,
encode_segments_advanced_explicit_temp,
encode_segments_advanced_auto, encode_segments_advanced_auto,
} }
// -------------------------------------------------------------------------------------------------
// Error correction code generation // ---------------------------------------------------------------------------------------------------------------------
// ------------------------------------------------------------------------------------------------- // ----- Error Correction Code Generation ------------------------
// ---------------------------------------------------------------------------------------------------------------------
// Appends error correction bytes to each block of data, then interleaves bytes from all blocks. // Appends error correction bytes to each block of data, then interleaves bytes from all blocks.
@(private) @(private)
@@ -587,10 +605,6 @@ get_num_raw_data_modules :: proc(ver: int) -> int {
return result return result
} }
// -------------------------------------------------------------------------------------------------
// Reed-Solomon ECC generator
// -------------------------------------------------------------------------------------------------
@(private) @(private)
reed_solomon_compute_divisor :: proc(degree: int, result: []u8) { reed_solomon_compute_divisor :: proc(degree: int, result: []u8) {
assert(1 <= degree && degree <= REED_SOLOMON_DEGREE_MAX, "reed-solomon degree out of range") assert(1 <= degree && degree <= REED_SOLOMON_DEGREE_MAX, "reed-solomon degree out of range")
@@ -637,9 +651,9 @@ reed_solomon_multiply :: proc(x, y: u8) -> u8 {
return z return z
} }
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Drawing function modules // ----- Drawing Function Modules ------------------------
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Clears the QR Code grid and marks every function module as dark. // Clears the QR Code grid and marks every function module as dark.
@(private) @(private)
@@ -785,9 +799,9 @@ fill_rectangle :: proc(left, top, width, height: int, qrcode: []u8) {
} }
} }
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Drawing data modules and masking // ----- Drawing data modules and masking ------------------------
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
@(private) @(private)
draw_codewords :: proc(data: []u8, data_len: int, qrcode: []u8) { draw_codewords :: proc(data: []u8, data_len: int, qrcode: []u8) {
@@ -965,9 +979,9 @@ finder_penalty_add_history :: proc(current_run_length: int, run_history: ^[7]int
run_history[0] = current_run_length run_history[0] = current_run_length
} }
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Basic QR Code information // ----- Basic QR code information ------------------------
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Returns the minimum buffer size (in bytes) needed for both temp_buffer and qrcode // Returns the minimum buffer size (in bytes) needed for both temp_buffer and qrcode
// to encode the given content at the given ECC level within the given version range. // to encode the given content at the given ECC level within the given version range.
@@ -981,7 +995,7 @@ min_buffer_size :: proc {
min_buffer_size_segments, min_buffer_size_segments,
} }
// Text path: auto-selects numeric/alphanumeric/byte mode the same way encode_text does. // Text path: auto-selects numeric/alphanumeric/byte mode the same way encode_text_manual does.
// //
// Returns ok=false when: // Returns ok=false when:
// - The text exceeds QR Code capacity for every version in the range at the given ECL. // - The text exceeds QR Code capacity for every version in the range at the given ECL.
@@ -1127,9 +1141,9 @@ get_bit :: #force_inline proc(x: int, i: uint) -> bool {
return ((x >> i) & 1) != 0 return ((x >> i) & 1) != 0
} }
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Segment handling // ----- Segment Handling ------------------------
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Tests whether the given string can be encoded in numeric mode. // Tests whether the given string can be encoded in numeric mode.
is_numeric :: proc(text: string) -> bool { is_numeric :: proc(text: string) -> bool {
@@ -1162,7 +1176,6 @@ calc_segment_buffer_size :: proc(mode: Mode, num_chars: int) -> int {
return (temp + 7) / 8 return (temp + 7) / 8
} }
@(private)
calc_segment_bit_length :: proc(mode: Mode, num_chars: int) -> int { calc_segment_bit_length :: proc(mode: Mode, num_chars: int) -> int {
if num_chars < 0 || num_chars > 32767 { if num_chars < 0 || num_chars > 32767 {
return LENGTH_OVERFLOW return LENGTH_OVERFLOW
@@ -1319,11 +1332,11 @@ make_eci :: proc(assign_val: int, buf: []u8) -> Segment {
return result return result
} }
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
// Private helpers // ----- Helpers ------------------------
// ------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
@(private) // Internal
append_bits_to_buffer :: proc(val: uint, num_bits: int, buffer: []u8, bit_len: ^int) { append_bits_to_buffer :: proc(val: uint, num_bits: int, buffer: []u8, bit_len: ^int) {
assert(0 <= num_bits && num_bits <= 16 && val >> uint(num_bits) == 0, "invalid bit count or value overflow") assert(0 <= num_bits && num_bits <= 16 && val >> uint(num_bits) == 0, "invalid bit count or value overflow")
for i := num_bits - 1; i >= 0; i -= 1 { for i := num_bits - 1; i >= 0; i -= 1 {
@@ -1332,7 +1345,7 @@ append_bits_to_buffer :: proc(val: uint, num_bits: int, buffer: []u8, bit_len: ^
} }
} }
@(private) // Internal
get_total_bits :: proc(segs: []Segment, version: int) -> int { get_total_bits :: proc(segs: []Segment, version: int) -> int {
result := 0 result := 0
for &seg in segs { for &seg in segs {
@@ -1354,7 +1367,7 @@ get_total_bits :: proc(segs: []Segment, version: int) -> int {
return result return result
} }
@(private) // Internal
num_char_count_bits :: proc(mode: Mode, version: int) -> int { num_char_count_bits :: proc(mode: Mode, version: int) -> int {
assert(VERSION_MIN <= version && version <= VERSION_MAX, "version out of bounds") assert(VERSION_MIN <= version && version <= VERSION_MAX, "version out of bounds")
i := (version + 7) / 17 i := (version + 7) / 17
@@ -1376,8 +1389,8 @@ num_char_count_bits :: proc(mode: Mode, version: int) -> int {
unreachable() unreachable()
} }
// Internal
// Returns the index of c in the alphanumeric charset (0-44), or -1 if not found. // Returns the index of c in the alphanumeric charset (0-44), or -1 if not found.
@(private)
alphanumeric_index :: proc(c: u8) -> int { alphanumeric_index :: proc(c: u8) -> int {
switch c { switch c {
case '0' ..= '9': return int(c - '0') case '0' ..= '9': return int(c - '0')
@@ -2487,7 +2500,7 @@ test_min_buffer_size_text :: proc(t: ^testing.T) {
testing.expect(t, planned > 0) testing.expect(t, planned > 0)
qrcode: [BUFFER_LEN_MAX]u8 qrcode: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok := encode_text(text, temp[:], qrcode[:], Ecc.Low) ok := encode_text_manual(text, temp[:], qrcode[:], Ecc.Low)
testing.expect(t, ok) testing.expect(t, ok)
actual_version_size := get_size(qrcode[:]) actual_version_size := get_size(qrcode[:])
actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4) actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4)
@@ -2538,7 +2551,7 @@ test_min_buffer_size_binary :: proc(t: ^testing.T) {
testing.expect(t, size > 0) testing.expect(t, size > 0)
testing.expect(t, size <= buffer_len_for_version(2)) testing.expect(t, size <= buffer_len_for_version(2))
// Verify agreement with encode_binary // Verify agreement with encode_binary_manual
{ {
data_len :: 100 data_len :: 100
planned, planned_ok := min_buffer_size(data_len, .Medium) planned, planned_ok := min_buffer_size(data_len, .Medium)
@@ -2549,7 +2562,7 @@ test_min_buffer_size_binary :: proc(t: ^testing.T) {
for i in 0 ..< data_len { for i in 0 ..< data_len {
dat[i] = u8(i) dat[i] = u8(i)
} }
ok := encode_binary(dat[:], data_len, qrcode[:], .Medium) ok := encode_binary_manual(dat[:], data_len, qrcode[:], .Medium)
testing.expect(t, ok) testing.expect(t, ok)
actual_version_size := get_size(qrcode[:]) actual_version_size := get_size(qrcode[:])
actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4) actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4)
@@ -2609,7 +2622,7 @@ test_min_buffer_size_segments :: proc(t: ^testing.T) {
// Verify against actual encode // Verify against actual encode
qrcode: [BUFFER_LEN_MAX]u8 qrcode: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok := encode_segments(segs[:], Ecc.Low, temp[:], qrcode[:]) ok := encode_segments_manual(segs[:], Ecc.Low, temp[:], qrcode[:])
testing.expect(t, ok) testing.expect(t, ok)
actual_version_size := get_size(qrcode[:]) actual_version_size := get_size(qrcode[:])
actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4) actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4)
@@ -2631,7 +2644,7 @@ test_encode_text_auto :: proc(t: ^testing.T) {
text :: "Hello, world!" text :: "Hello, world!"
qr_explicit: [BUFFER_LEN_MAX]u8 qr_explicit: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok_explicit := encode_text_explicit_temp(text, temp[:], qr_explicit[:], .Low) ok_explicit := encode_text_manual(text, temp[:], qr_explicit[:], .Low)
testing.expect(t, ok_explicit) testing.expect(t, ok_explicit)
qr_auto: [BUFFER_LEN_MAX]u8 qr_auto: [BUFFER_LEN_MAX]u8
@@ -2650,7 +2663,7 @@ test_encode_text_auto :: proc(t: ^testing.T) {
text :: "314159265358979323846264338327950288419716939937510" text :: "314159265358979323846264338327950288419716939937510"
qr_explicit: [BUFFER_LEN_MAX]u8 qr_explicit: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok_explicit := encode_text_explicit_temp(text, temp[:], qr_explicit[:], .Medium) ok_explicit := encode_text_manual(text, temp[:], qr_explicit[:], .Medium)
testing.expect(t, ok_explicit) testing.expect(t, ok_explicit)
qr_auto: [BUFFER_LEN_MAX]u8 qr_auto: [BUFFER_LEN_MAX]u8
@@ -2669,7 +2682,7 @@ test_encode_text_auto :: proc(t: ^testing.T) {
text :: "HELLO WORLD" text :: "HELLO WORLD"
qr_explicit: [BUFFER_LEN_MAX]u8 qr_explicit: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok_explicit := encode_text_explicit_temp(text, temp[:], qr_explicit[:], .Quartile) ok_explicit := encode_text_manual(text, temp[:], qr_explicit[:], .Quartile)
testing.expect(t, ok_explicit) testing.expect(t, ok_explicit)
qr_auto: [BUFFER_LEN_MAX]u8 qr_auto: [BUFFER_LEN_MAX]u8
@@ -2695,7 +2708,7 @@ test_encode_text_auto :: proc(t: ^testing.T) {
text :: "https://www.nayuki.io/" text :: "https://www.nayuki.io/"
qr_explicit: [BUFFER_LEN_MAX]u8 qr_explicit: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok_explicit := encode_text_explicit_temp(text, temp[:], qr_explicit[:], .High, mask = .M3) ok_explicit := encode_text_manual(text, temp[:], qr_explicit[:], .High, mask = .M3)
testing.expect(t, ok_explicit) testing.expect(t, ok_explicit)
qr_auto: [BUFFER_LEN_MAX]u8 qr_auto: [BUFFER_LEN_MAX]u8
@@ -2732,7 +2745,7 @@ test_encode_segments_auto :: proc(t: ^testing.T) {
qr_explicit: [BUFFER_LEN_MAX]u8 qr_explicit: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok_explicit := encode_segments_explicit_temp(segs[:], .Low, temp[:], qr_explicit[:]) ok_explicit := encode_segments_manual(segs[:], .Low, temp[:], qr_explicit[:])
testing.expect(t, ok_explicit) testing.expect(t, ok_explicit)
qr_auto: [BUFFER_LEN_MAX]u8 qr_auto: [BUFFER_LEN_MAX]u8
@@ -2764,7 +2777,7 @@ test_encode_segments_advanced_auto :: proc(t: ^testing.T) {
qr_explicit: [BUFFER_LEN_MAX]u8 qr_explicit: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok_explicit := encode_segments_advanced_explicit_temp( ok_explicit := encode_segments_advanced_manual(
segs[:], segs[:],
.Medium, .Medium,
VERSION_MIN, VERSION_MIN,
@@ -2795,7 +2808,7 @@ test_encode_segments_advanced_auto :: proc(t: ^testing.T) {
qr_explicit: [BUFFER_LEN_MAX]u8 qr_explicit: [BUFFER_LEN_MAX]u8
temp: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8
ok_explicit := encode_segments_advanced_explicit_temp( ok_explicit := encode_segments_advanced_manual(
segs[:], segs[:],
.High, .High,
1, 1,
+269 -99
@@ -1,103 +1,139 @@
package ring package ring
import "base:runtime"
import "core:fmt" import "core:fmt"
@(private) @(private)
ODIN_BOUNDS_CHECK :: !ODIN_NO_BOUNDS_CHECK ODIN_BOUNDS_CHECK :: !ODIN_NO_BOUNDS_CHECK
Ring :: struct($T: typeid) { Ring :: struct($E: typeid) {
data: []T, data: []E,
_end_index, len: int, next_write_index, len: int,
} }
Ring_Soa :: struct($T: typeid) { Ring_Soa :: struct($E: typeid) {
data: #soa[]T, data: #soa[]E,
_end_index, len: int, next_write_index, len: int,
} }
from_slice_raos :: #force_inline proc(data: $T/[]$E) -> Ring(E) { destroy_aos :: #force_inline proc(
return {data = data, _end_index = -1} ring: ^Ring($E),
allocator := context.allocator,
) -> runtime.Allocator_Error {
return delete(ring.data, allocator)
} }
from_slice_rsoa :: #force_inline proc(data: $T/#soa[]$E) -> Ring_Soa(E) { destroy_soa :: #force_inline proc(
return {data = data, _end_index = -1} ring: ^Ring_Soa($E),
allocator := context.allocator,
) -> runtime.Allocator_Error {
return delete(ring.data, allocator)
} }
from_slice :: proc { destroy :: proc {
from_slice_raos, destroy_aos,
from_slice_rsoa, destroy_soa,
} }
create_aos :: #force_inline proc(
$E: typeid,
capacity: int,
allocator := context.allocator,
) -> (
ring: Ring(E),
err: runtime.Allocator_Error,
) #optional_allocator_error {
ring.data, err = make([]E, capacity, allocator)
return ring, err
}
create_soa :: #force_inline proc(
$E: typeid,
capacity: int,
allocator := context.allocator,
) -> (
ring: Ring_Soa(E),
err: runtime.Allocator_Error,
) #optional_allocator_error {
ring.data, err = make(#soa[]E, capacity, allocator)
return ring, err
}
// All contents of `data` will be completely ignored, `data` is treated as an empty slice.
init_from_slice_aos :: #force_inline proc(ring: ^Ring($E), data: $T/[]E) {
ring.data = data
ring.len = 0
ring.next_write_index = 0
return
}
// All contents of `data` will be completely ignored, `data` is treated as an empty slice.
init_from_slice_soa :: #force_inline proc(ring: ^Ring_Soa($E), data: $T/#soa[]E) {
ring.data = data
ring.len = 0
ring.next_write_index = 0
return
}
init_from_slice :: proc {
init_from_slice_aos,
init_from_slice_soa,
}
// Internal
// Index in the backing array where the ring starts // Index in the backing array where the ring starts
_start_index_raos :: proc(ring: Ring($T)) -> int { start_index_aos :: #force_inline proc(ring: Ring($E)) -> int {
if ring.len < len(ring.data) { return ring.len < len(ring.data) ? 0 : ring.next_write_index
return 0
} else {
start_index := ring._end_index + 1
return 0 if start_index == len(ring.data) else start_index
}
} }
// Internal
// Index in the backing array where the ring starts // Index in the backing array where the ring starts
_start_index_rsoa :: proc(ring: Ring_Soa($T)) -> int { start_index_soa :: #force_inline proc(ring: Ring_Soa($E)) -> int {
if ring.len < len(ring.data) { return ring.len < len(ring.data) ? 0 : ring.next_write_index
return 0
} else {
start_index := ring._end_index + 1
return 0 if start_index == len(ring.data) else start_index
}
} }
advance_raos :: proc(ring: ^Ring($T)) { advance_aos :: #force_inline proc(ring: ^Ring($E)) {
// Length // Length
if ring.len != len(ring.data) do ring.len += 1 if ring.len != len(ring.data) do ring.len += 1
// End index // Write index
if ring._end_index == len(ring.data) - 1 { // If we are at the end of the backing array ring.next_write_index += 1
ring._end_index = 0 // Overflow end to 0 if ring.next_write_index == len(ring.data) do ring.next_write_index = 0
} else {
ring._end_index += 1
}
} }
advance_rsoa :: proc(ring: ^Ring_Soa($T)) { advance_soa :: #force_inline proc(ring: ^Ring_Soa($E)) {
// Length // Length
if ring.len != len(ring.data) do ring.len += 1 if ring.len != len(ring.data) do ring.len += 1
// End index // Write index
if ring._end_index == len(ring.data) - 1 { // If we are at the end of the backing array ring.next_write_index += 1
ring._end_index = 0 // Overflow end to 0 if ring.next_write_index == len(ring.data) do ring.next_write_index = 0
} else {
ring._end_index += 1
}
} }
advance :: proc { advance :: proc {
advance_raos, advance_aos,
advance_rsoa, advance_soa,
} }
append_raos :: proc(ring: ^Ring($T), element: T) { append_aos :: #force_inline proc(ring: ^Ring($E), element: E) {
ring.data[ring.next_write_index] = element
advance(ring) advance(ring)
ring.data[ring._end_index] = element
} }
append_rsoa :: proc(ring: ^Ring_Soa($T), element: T) { append_soa :: #force_inline proc(ring: ^Ring_Soa($E), element: E) {
ring.data[ring.next_write_index] = element
advance(ring) advance(ring)
ring.data[ring._end_index] = element
} }
append :: proc { append :: proc {
append_raos, append_aos,
append_rsoa, append_soa,
} }
get_raos :: proc(ring: Ring($T), index: int) -> ^T { get_aos :: #force_inline proc(ring: Ring($E), index: int) -> ^E {
when ODIN_BOUNDS_CHECK { when ODIN_BOUNDS_CHECK {
if index >= ring.len { fmt.assertf(index < ring.len, "Ring index %i out of bounds for length %i", index, ring.len)
panic(fmt.tprintf("Ring index %i out of bounds for length %i", index, ring.len))
}
} }
array_index := _start_index_raos(ring) + index array_index := start_index_aos(ring) + index
if array_index < len(ring.data) { if array_index < len(ring.data) {
return &ring.data[array_index] return &ring.data[array_index]
} else { } else {
@@ -107,14 +143,12 @@ get_raos :: proc(ring: Ring($T), index: int) -> ^T {
} }
// SOA can't return soa pointer to parapoly T. // SOA can't return soa pointer to parapoly T.
get_rsoa :: proc(ring: Ring_Soa($T), index: int) -> T { get_soa :: #force_inline proc(ring: Ring_Soa($E), index: int) -> E {
when ODIN_BOUNDS_CHECK { when ODIN_BOUNDS_CHECK {
if index >= ring.len { fmt.assertf(index < ring.len, "Ring index %i out of bounds for length %i", index, ring.len)
panic(fmt.tprintf("Ring index %i out of bounds for length %i", index, ring.len))
}
} }
array_index := _start_index_rsoa(ring) + index array_index := start_index_soa(ring) + index
if array_index < len(ring.data) { if array_index < len(ring.data) {
return ring.data[array_index] return ring.data[array_index]
} else { } else {
@@ -124,36 +158,36 @@ get_rsoa :: proc(ring: Ring_Soa($T), index: int) -> T {
} }
get :: proc { get :: proc {
get_raos, get_aos,
get_rsoa, get_soa,
} }
get_last_raos :: #force_inline proc(ring: Ring($T)) -> ^T { get_last_aos :: #force_inline proc(ring: Ring($E)) -> ^E {
return get(ring, ring.len - 1) return get(ring, ring.len - 1)
} }
get_last_rsoa :: #force_inline proc(ring: Ring_Soa($T)) -> T { get_last_soa :: #force_inline proc(ring: Ring_Soa($E)) -> E {
return get(ring, ring.len - 1) return get(ring, ring.len - 1)
} }
get_last :: proc { get_last :: proc {
get_last_raos, get_last_aos,
get_last_rsoa, get_last_soa,
} }
clear_raos :: #force_inline proc "contextless" (ring: ^Ring($T)) { clear_aos :: #force_inline proc "contextless" (ring: ^Ring($E)) {
ring.len = 0 ring.len = 0
ring._end_index = -1 ring.next_write_index = 0
} }
clear_rsoa :: #force_inline proc "contextless" (ring: ^Ring_Soa($T)) { clear_soa :: #force_inline proc "contextless" (ring: ^Ring_Soa($E)) {
ring.len = 0 ring.len = 0
ring._end_index = -1 ring.next_write_index = 0
} }
clear :: proc { clear :: proc {
clear_raos, clear_aos,
clear_rsoa, clear_soa,
} }
// --------------------------------------------------------------------------------------------------------------------- // ---------------------------------------------------------------------------------------------------------------------
@@ -164,28 +198,27 @@ import "core:testing"
@(test) @(test)
test_ring_aos :: proc(t: ^testing.T) { test_ring_aos :: proc(t: ^testing.T) {
data := make_slice([]int, 10) ring := create_aos(int, 10)
ring := from_slice(data) defer destroy(&ring)
defer delete(ring.data)
for i in 1 ..= 5 { for i in 1 ..= 5 {
append(&ring, i) append(&ring, i)
log.debug("Length:", ring.len) log.debug("Length:", ring.len)
log.debug("Start index:", _start_index_raos(ring)) log.debug("Start index:", start_index_aos(ring))
log.debug("End index:", ring._end_index) log.debug("Next write index:", ring.next_write_index)
log.debug(ring.data) log.debug(ring.data)
} }
testing.expect_value(t, get(ring, 0)^, 1) testing.expect_value(t, get(ring, 0)^, 1)
testing.expect_value(t, get(ring, 4)^, 5) testing.expect_value(t, get(ring, 4)^, 5)
testing.expect_value(t, ring.len, 5) testing.expect_value(t, ring.len, 5)
testing.expect_value(t, ring._end_index, 4) testing.expect_value(t, ring.next_write_index, 5)
testing.expect_value(t, _start_index_raos(ring), 0) testing.expect_value(t, start_index_aos(ring), 0)
for i in 6 ..= 15 { for i in 6 ..= 15 {
append(&ring, i) append(&ring, i)
log.debug("Length:", ring.len) log.debug("Length:", ring.len)
log.debug("Start index:", _start_index_raos(ring)) log.debug("Start index:", start_index_aos(ring))
log.debug("End index:", ring._end_index) log.debug("Next write index:", ring.next_write_index)
log.debug(ring.data) log.debug(ring.data)
} }
testing.expect_value(t, get(ring, 0)^, 6) testing.expect_value(t, get(ring, 0)^, 6)
@@ -193,18 +226,18 @@ test_ring_aos :: proc(t: ^testing.T) {
testing.expect_value(t, get(ring, 9)^, 15) testing.expect_value(t, get(ring, 9)^, 15)
testing.expect_value(t, get_last(ring)^, 15) testing.expect_value(t, get_last(ring)^, 15)
testing.expect_value(t, ring.len, 10) testing.expect_value(t, ring.len, 10)
testing.expect_value(t, ring._end_index, 4) testing.expect_value(t, ring.next_write_index, 5)
testing.expect_value(t, _start_index_raos(ring), 5) testing.expect_value(t, start_index_aos(ring), 5)
for i in 15 ..= 25 { for i in 15 ..= 25 {
append(&ring, i) append(&ring, i)
log.debug("Length:", ring.len) log.debug("Length:", ring.len)
log.debug("Start index:", _start_index_raos(ring)) log.debug("Start index:", start_index_aos(ring))
log.debug("End index:", ring._end_index) log.debug("Next write index:", ring.next_write_index)
log.debug(ring.data) log.debug(ring.data)
} }
testing.expect_value(t, get(ring, 0)^, 16) testing.expect_value(t, get(ring, 0)^, 16)
testing.expect_value(t, ring._end_index, 5) testing.expect_value(t, ring.next_write_index, 6)
testing.expect_value(t, get_last(ring)^, 25) testing.expect_value(t, get_last(ring)^, 25)
clear(&ring) clear(&ring)
@@ -219,28 +252,27 @@ test_ring_soa :: proc(t: ^testing.T) {
x, y: int, x, y: int,
} }
data := make_soa_slice(#soa[]Ints, 10) ring := create_soa(Ints, 10)
ring := from_slice(data) defer destroy(&ring)
defer delete(ring.data)
for i in 1 ..= 5 { for i in 1 ..= 5 {
append(&ring, Ints{i, i}) append(&ring, Ints{i, i})
log.debug("Length:", ring.len) log.debug("Length:", ring.len)
log.debug("Start index:", _start_index_rsoa(ring)) log.debug("Start index:", start_index_soa(ring))
log.debug("End index:", ring._end_index) log.debug("Next write index:", ring.next_write_index)
log.debug(ring.data) log.debug(ring.data)
} }
testing.expect_value(t, get(ring, 0), Ints{1, 1}) testing.expect_value(t, get(ring, 0), Ints{1, 1})
testing.expect_value(t, get(ring, 4), Ints{5, 5}) testing.expect_value(t, get(ring, 4), Ints{5, 5})
testing.expect_value(t, ring.len, 5) testing.expect_value(t, ring.len, 5)
testing.expect_value(t, ring._end_index, 4) testing.expect_value(t, ring.next_write_index, 5)
testing.expect_value(t, _start_index_rsoa(ring), 0) testing.expect_value(t, start_index_soa(ring), 0)
for i in 6 ..= 15 { for i in 6 ..= 15 {
append(&ring, Ints{i, i}) append(&ring, Ints{i, i})
log.debug("Length:", ring.len) log.debug("Length:", ring.len)
log.debug("Start index:", _start_index_rsoa(ring)) log.debug("Start index:", start_index_soa(ring))
log.debug("End index:", ring._end_index) log.debug("Next write index:", ring.next_write_index)
log.debug(ring.data) log.debug(ring.data)
} }
testing.expect_value(t, get(ring, 0), Ints{6, 6}) testing.expect_value(t, get(ring, 0), Ints{6, 6})
@@ -248,18 +280,18 @@ test_ring_soa :: proc(t: ^testing.T) {
testing.expect_value(t, get(ring, 9), Ints{15, 15}) testing.expect_value(t, get(ring, 9), Ints{15, 15})
testing.expect_value(t, get_last(ring), Ints{15, 15}) testing.expect_value(t, get_last(ring), Ints{15, 15})
testing.expect_value(t, ring.len, 10) testing.expect_value(t, ring.len, 10)
testing.expect_value(t, ring._end_index, 4) testing.expect_value(t, ring.next_write_index, 5)
testing.expect_value(t, _start_index_rsoa(ring), 5) testing.expect_value(t, start_index_soa(ring), 5)
for i in 15 ..= 25 { for i in 15 ..= 25 {
append(&ring, Ints{i, i}) append(&ring, Ints{i, i})
log.debug("Length:", ring.len) log.debug("Length:", ring.len)
log.debug("Start index:", _start_index_rsoa(ring)) log.debug("Start index:", start_index_soa(ring))
log.debug("End index:", ring._end_index) log.debug("Next write index:", ring.next_write_index)
log.debug(ring.data) log.debug(ring.data)
} }
testing.expect_value(t, get(ring, 0), Ints{16, 16}) testing.expect_value(t, get(ring, 0), Ints{16, 16})
testing.expect_value(t, ring._end_index, 5) testing.expect_value(t, ring.next_write_index, 6)
testing.expect_value(t, get_last(ring), Ints{25, 25}) testing.expect_value(t, get_last(ring), Ints{25, 25})
clear(&ring) clear(&ring)
@@ -267,3 +299,141 @@ test_ring_soa :: proc(t: ^testing.T) {
testing.expect_value(t, ring.len, 1) testing.expect_value(t, ring.len, 1)
testing.expect_value(t, get(ring, 0), Ints{1, 1}) testing.expect_value(t, get(ring, 0), Ints{1, 1})
} }
@(test)
test_ring_aos_init_from_slice :: proc(t: ^testing.T) {
// Stack-allocated backing with pre-existing garbage and odd capacity.
backing: [7]int = {99, 99, 99, 99, 99, 99, 99}
ring: Ring(int)
init_from_slice(&ring, backing[:])
// Empty ring invariants after init_from_slice.
testing.expect_value(t, ring.len, 0)
testing.expect_value(t, ring.next_write_index, 0)
testing.expect_value(t, start_index_aos(ring), 0)
// Partial fill (3 / 7).
for i in 1 ..= 3 do append(&ring, i)
testing.expect_value(t, ring.len, 3)
testing.expect_value(t, ring.next_write_index, 3)
testing.expect_value(t, start_index_aos(ring), 0)
testing.expect_value(t, get(ring, 0)^, 1)
testing.expect_value(t, get(ring, 2)^, 3)
testing.expect_value(t, get_last(ring)^, 3)
// Fill exactly to capacity. Pushing element 7 must make len == cap
// AND wrap next_write_index from 6 back to 0 in the same step.
for i in 4 ..= 7 do append(&ring, i)
testing.expect_value(t, ring.len, 7)
testing.expect_value(t, ring.next_write_index, 0)
testing.expect_value(t, start_index_aos(ring), 0)
testing.expect_value(t, get(ring, 0)^, 1)
testing.expect_value(t, get(ring, 6)^, 7)
testing.expect_value(t, get_last(ring)^, 7)
// First overwrite oldest element shifts by one.
append(&ring, 8)
testing.expect_value(t, ring.len, 7)
testing.expect_value(t, ring.next_write_index, 1)
testing.expect_value(t, start_index_aos(ring), 1)
testing.expect_value(t, get(ring, 0)^, 2)
testing.expect_value(t, get(ring, 6)^, 8)
testing.expect_value(t, get_last(ring)^, 8)
// Stress: 3 more complete wrap cycles (21 more pushes).
// After 29 total pushes, ring contains the last 7 (23..=29),
// and next_write_index = 29 mod 7 = 1.
for i in 9 ..= 29 do append(&ring, i)
testing.expect_value(t, ring.len, 7)
testing.expect_value(t, ring.next_write_index, 1)
testing.expect_value(t, start_index_aos(ring), 1)
testing.expect_value(t, get(ring, 0)^, 23)
testing.expect_value(t, get(ring, 3)^, 26)
testing.expect_value(t, get(ring, 6)^, 29)
testing.expect_value(t, get_last(ring)^, 29)
// Clear returns ring to empty-equivalent state.
clear(&ring)
testing.expect_value(t, ring.len, 0)
testing.expect_value(t, ring.next_write_index, 0)
testing.expect_value(t, start_index_aos(ring), 0)
// Single-element edge case: get_last(len==1) routes through get(ring, 0).
append(&ring, 42)
testing.expect_value(t, ring.len, 1)
testing.expect_value(t, ring.next_write_index, 1)
testing.expect_value(t, get(ring, 0)^, 42)
testing.expect_value(t, get_last(ring)^, 42)
}
@(test)
test_ring_soa_init_from_slice :: proc(t: ^testing.T) {
Ints :: struct {
x, y: int,
}
// Stack-allocated backing with pre-existing garbage and odd capacity.
backing: #soa[7]Ints = {{99, 99}, {99, 99}, {99, 99}, {99, 99}, {99, 99}, {99, 99}, {99, 99}}
ring: Ring_Soa(Ints)
init_from_slice(&ring, backing[:])
// Empty ring invariants after init_from_slice.
testing.expect_value(t, ring.len, 0)
testing.expect_value(t, ring.next_write_index, 0)
testing.expect_value(t, start_index_soa(ring), 0)
// Partial fill (3 / 7).
for i in 1 ..= 3 do append(&ring, Ints{i, i})
testing.expect_value(t, ring.len, 3)
testing.expect_value(t, ring.next_write_index, 3)
testing.expect_value(t, start_index_soa(ring), 0)
testing.expect_value(t, get(ring, 0), Ints{1, 1})
testing.expect_value(t, get(ring, 2), Ints{3, 3})
testing.expect_value(t, get_last(ring), Ints{3, 3})
// Fill exactly to capacity. Pushing element 7 must make len == cap
// AND wrap next_write_index from 6 back to 0 in the same step.
for i in 4 ..= 7 do append(&ring, Ints{i, i})
testing.expect_value(t, ring.len, 7)
testing.expect_value(t, ring.next_write_index, 0)
testing.expect_value(t, start_index_soa(ring), 0)
testing.expect_value(t, get(ring, 0), Ints{1, 1})
testing.expect_value(t, get(ring, 6), Ints{7, 7})
testing.expect_value(t, get_last(ring), Ints{7, 7})
// First overwrite oldest element shifts by one.
append(&ring, Ints{8, 8})
testing.expect_value(t, ring.len, 7)
testing.expect_value(t, ring.next_write_index, 1)
testing.expect_value(t, start_index_soa(ring), 1)
testing.expect_value(t, get(ring, 0), Ints{2, 2})
testing.expect_value(t, get(ring, 6), Ints{8, 8})
testing.expect_value(t, get_last(ring), Ints{8, 8})
// Stress: 3 more complete wrap cycles (21 more pushes).
// After 29 total pushes, ring contains the last 7 (23..=29),
// and next_write_index = 29 mod 7 = 1.
for i in 9 ..= 29 do append(&ring, Ints{i, i})
testing.expect_value(t, ring.len, 7)
testing.expect_value(t, ring.next_write_index, 1)
testing.expect_value(t, start_index_soa(ring), 1)
testing.expect_value(t, get(ring, 0), Ints{23, 23})
testing.expect_value(t, get(ring, 3), Ints{26, 26})
testing.expect_value(t, get(ring, 6), Ints{29, 29})
testing.expect_value(t, get_last(ring), Ints{29, 29})
// Clear returns ring to empty-equivalent state.
clear(&ring)
testing.expect_value(t, ring.len, 0)
testing.expect_value(t, ring.next_write_index, 0)
testing.expect_value(t, start_index_soa(ring), 0)
// Single-element edge case: get_last(len==1) routes through get(ring, 0).
append(&ring, Ints{42, 42})
testing.expect_value(t, ring.len, 1)
testing.expect_value(t, ring.next_write_index, 1)
testing.expect_value(t, get(ring, 0), Ints{42, 42})
testing.expect_value(t, get_last(ring), Ints{42, 42})
}
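The ring changes above replace `_end_index` (last written slot, `-1` when empty) with `next_write_index` (slot the next append writes to, `0` when empty). The following is a minimal, self-contained sketch of that indexing scheme, not the package's actual code: `Mini_Ring`, `push`, `start_index`, and `at` are hypothetical names, the element type is fixed to `int`, and the wrap in `at` uses a modulo, which is an assumption because the wrap branch of `get_aos` is not visible in the diff.

```odin
package ring_sketch

import "core:fmt"

// Hypothetical, simplified stand-in for the Ring type in the diff (AoS only, int elements).
Mini_Ring :: struct {
	data:             []int,
	next_write_index: int,
	len:              int,
}

// Write into the slot at next_write_index, then advance the index and wrap it at capacity.
push :: proc(ring: ^Mini_Ring, element: int) {
	ring.data[ring.next_write_index] = element
	if ring.len != len(ring.data) do ring.len += 1
	ring.next_write_index += 1
	if ring.next_write_index == len(ring.data) do ring.next_write_index = 0
}

// Until the ring is full the oldest element sits in slot 0.
// Once full, the oldest element is the slot that the next push will overwrite.
start_index :: proc(ring: Mini_Ring) -> int {
	return 0 if ring.len < len(ring.data) else ring.next_write_index
}

// Logical index 0 is the oldest element, ring.len - 1 the newest.
at :: proc(ring: Mini_Ring, index: int) -> int {
	assert(index >= 0 && index < ring.len, "ring index out of bounds")
	return ring.data[(start_index(ring) + index) % len(ring.data)]
}

main :: proc() {
	backing: [7]int
	ring := Mini_Ring{data = backing[:]}

	// Eight pushes into a capacity of seven: the eighth overwrites the oldest value (1).
	for i in 1 ..= 8 do push(&ring, i)

	// Prints length, next write slot, oldest element, newest element.
	fmt.println(ring.len, ring.next_write_index, at(ring, 0), at(ring, ring.len - 1))
}
```

With a capacity of 7 and 8 pushes this mirrors the "first overwrite" step of `test_ring_aos_init_from_slice`: length 7, `next_write_index` 1, oldest element 2, newest element 8. Because the empty state is `next_write_index = 0`, a zero-initialized ring needs no `-1` sentinel.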
+879 -973
File diff suppressed because it is too large
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
+71 -27
@@ -1,8 +1,11 @@
package examples package examples
import "core:fmt" import "core:fmt"
import "core:log"
import "core:mem"
import "core:os" import "core:os"
import "core:sys/posix" import "core:sys/posix"
import mdb "../../lmdb" import mdb "../../lmdb"
// 0o660 // 0o660
@@ -10,34 +13,75 @@ DB_MODE :: posix.mode_t{.IWGRP, .IRGRP, .IWUSR, .IRUSR}
DB_PATH :: "out/debug/lmdb_example_db" DB_PATH :: "out/debug/lmdb_example_db"
main :: proc() { main :: proc() {
environment: ^mdb.Env //----- General setup ----------------------------------
{
// Temp
track_temp: mem.Tracking_Allocator
mem.tracking_allocator_init(&track_temp, context.temp_allocator)
context.temp_allocator = mem.tracking_allocator(&track_temp)
// Create environment for lmdb // Default
mdb.panic_on_err(mdb.env_create(&environment)) track: mem.Tracking_Allocator
// Create directory for databases. Won't do anything if it already exists. mem.tracking_allocator_init(&track, context.allocator)
// 0o774 gives all permissions for owner and group, read for everyone else. context.allocator = mem.tracking_allocator(&track)
os.make_directory(DB_PATH, 0o774) // Log a warning about any memory that was not freed by the end of the program.
// Open the database files (creates them if they don't already exist) // This could be fine for some global state or it could be a memory leak.
mdb.panic_on_err(mdb.env_open(environment, DB_PATH, 0, DB_MODE)) defer {
// Temp allocator
if len(track_temp.bad_free_array) > 0 {
fmt.eprintf("=== %v incorrect frees - temp allocator: ===\n", len(track_temp.bad_free_array))
for entry in track_temp.bad_free_array {
fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
}
mem.tracking_allocator_destroy(&track_temp)
}
// Default allocator
if len(track.allocation_map) > 0 {
fmt.eprintf("=== %v allocations not freed - main allocator: ===\n", len(track.allocation_map))
for _, entry in track.allocation_map {
fmt.eprintf("- %v bytes @ %v\n", entry.size, entry.location)
}
}
if len(track.bad_free_array) > 0 {
fmt.eprintf("=== %v incorrect frees - main allocator: ===\n", len(track.bad_free_array))
for entry in track.bad_free_array {
fmt.eprintf("- %p @ %v\n", entry.memory, entry.location)
}
}
mem.tracking_allocator_destroy(&track)
}
// Logger
context.logger = log.create_console_logger()
defer log.destroy_console_logger(context.logger)
}
// Transactions environment: ^mdb.Env
txn_handle: ^mdb.Txn
db_handle: mdb.Dbi
// Put transaction
key := 7
key_val := mdb.autoval(&key)
put_data := 12
put_data_val := mdb.autoval(&put_data)
mdb.panic_on_err(mdb.txn_begin(environment, nil, 0, &txn_handle))
mdb.panic_on_err(mdb.dbi_open(txn_handle, nil, 0, &db_handle))
mdb.panic_on_err(mdb.put(txn_handle, db_handle, &key_val.raw, &put_data_val.raw, 0))
mdb.panic_on_err(mdb.txn_commit(txn_handle))
// Get transaction // Create environment for lmdb
get_data_val := mdb.nil_autoval(int) mdb.panic_on_err(mdb.env_create(&environment))
mdb.panic_on_err(mdb.txn_begin(environment, nil, 0, &txn_handle)) // Create directory for databases. Won't do anything if it already exists.
mdb.panic_on_err(mdb.get(txn_handle, db_handle, &key_val.raw, &get_data_val.raw)) os.make_directory(DB_PATH)
mdb.panic_on_err(mdb.txn_commit(txn_handle)) // Open the database files (creates them if they don't already exist)
data_cpy := mdb.autoval_get_data(&get_data_val)^ mdb.panic_on_err(mdb.env_open(environment, DB_PATH, {}, DB_MODE))
fmt.println("Get result:", data_cpy)
// Transactions
txn_handle: ^mdb.Txn
db_handle: mdb.Dbi
// Put transaction
key := 7
key_val := mdb.blittable_val(&key)
put_data := 12
put_data_val := mdb.blittable_val(&put_data)
mdb.panic_on_err(mdb.txn_begin(environment, nil, {}, &txn_handle))
mdb.panic_on_err(mdb.dbi_open(txn_handle, nil, {}, &db_handle))
mdb.panic_on_err(mdb.put(txn_handle, db_handle, &key_val, &put_data_val, {}))
mdb.panic_on_err(mdb.txn_commit(txn_handle))
// Get transaction
data_val: mdb.Val
mdb.panic_on_err(mdb.txn_begin(environment, nil, {}, &txn_handle))
mdb.panic_on_err(mdb.get(txn_handle, db_handle, &key_val, &data_val))
data_cpy := mdb.blittable_copy(&data_val, int)
mdb.panic_on_err(mdb.txn_commit(txn_handle))
fmt.println("Get result:", data_cpy)
} }
+195 -153
@@ -164,24 +164,123 @@
*/ */
package lmdb package lmdb
foreign import lib "system:lmdb"
import "core:c" import "core:c"
import "core:fmt" import "core:fmt"
import "core:reflect"
import "core:sys/posix" import "core:sys/posix"
// ---------------------------------------------------------------------------------------------------------------------
// ----- Added Odin Helpers ------------------------
// ---------------------------------------------------------------------------------------------------------------------
// Wrap a blittable value's bytes as an LMDB Val.
// T must be a contiguous type with no indirection (no pointers, slices, strings, maps, etc.).
blittable_val :: #force_inline proc(val_ptr: ^$T) -> Val {
fmt.assertf(
reflect.has_no_indirections(type_info_of(T)),
"blittable_val: type '%v' contains indirection and cannot be stored directly in LMDB",
typeid_of(T),
)
return Val{size_of(T), val_ptr}
}
// Reads a blittable T out of the LMDB memory map by copying it into caller
// storage. The returned T has no lifetime tie to the transaction.
blittable_copy :: #force_inline proc(val: ^Val, $T: typeid) -> T {
fmt.assertf(
reflect.has_no_indirections(type_info_of(T)),
"blittable_copy: type '%v' contains indirection and cannot be read directly from LMDB",
typeid_of(T),
)
return (cast(^T)val.data)^
}
// Zero-copy pointer view into the LMDB memory map as a ^T.
// Useful for large blittable types where you want to read individual fields
// without copying the entire value (e.g. ptr.timestamp, ptr.flags).
// MUST NOT be written through: writes either segfault (default env mode)
// or silently corrupt the database (Env_Flag.WRITEMAP).
// MUST NOT be retained past txn_commit, txn_abort, or any subsequent write
// operation on the same env, because the pointer is invalidated.
blittable_view :: #force_inline proc(val: ^Val, $T: typeid) -> ^T {
fmt.assertf(
reflect.has_no_indirections(type_info_of(T)),
"blittable_view: type '%v' contains indirection and cannot be viewed directly from LMDB",
typeid_of(T),
)
return cast(^T)val.data
}
// Wrap a slice of blittable elements as an LMDB Val for use with put/get.
// T must be a contiguous type with no indirection.
// The caller's slice must remain valid (not freed, not resized) for the
// duration of the put call that consumes this Val.
slice_val :: #force_inline proc(s: []$T) -> Val {
fmt.assertf(
reflect.has_no_indirections(type_info_of(T)),
"slice_val: element type '%v' contains indirection and cannot be stored directly in LMDB",
typeid_of(T),
)
return Val{uint(len(s) * size_of(T)), raw_data(s)}
}
// Zero-copy slice view into the LMDB memory map.
// T must match the element type that was originally stored.
// MUST NOT be modified: writes through this slice either segfault (default
// env mode) or silently corrupt the database (Env_Flag.WRITEMAP).
// MUST be copied (e.g. slice.clone) if it needs to outlive the current
// transaction; the view is invalidated by txn_commit, txn_abort, or any
// subsequent write operation on the same env.
slice_view :: #force_inline proc(val: ^Val, $T: typeid) -> []T {
fmt.assertf(
reflect.has_no_indirections(type_info_of(T)),
"slice_view: element type '%v' contains indirection and cannot be read directly from LMDB",
typeid_of(T),
)
return (cast([^]T)val.data)[:val.size / size_of(T)]
}
// Wrap a string's bytes as an LMDB Val for use with put/get.
// The caller's string must remain valid (backing memory not freed) for the
// duration of the put call that consumes this Val.
string_val :: #force_inline proc(s: string) -> Val {
return Val{uint(len(s)), raw_data(s)}
}
// Zero-copy string view into the LMDB memory map.
// MUST NOT be modified: writes through the underlying bytes either segfault
// (default env mode) or silently corrupt the database (Env_Flag.WRITEMAP).
// MUST be copied (e.g. strings.clone) if it needs to outlive the current
// transaction; the view is invalidated by txn_commit, txn_abort, or any
// subsequent write operation on the same env.
string_view :: #force_inline proc(val: ^Val) -> string {
return string((cast([^]u8)val.data)[:val.size])
}
// Panic if there is an error
panic_on_err :: #force_inline proc(error: Error, loc := #caller_location) {
if error != .NONE {
fmt.panicf("LMDB error %v: %s", error, strerror(i32(error)), loc = loc)
}
}
// ---------------------------------------------------------------------------------------------------------------------
// ----- Bindings ------------------------
// ---------------------------------------------------------------------------------------------------------------------
_ :: c _ :: c
when ODIN_OS == .Windows { when ODIN_OS == .Windows {
#panic("TODO: Compile windows .lib for lmdb")
mode_t :: c.int mode_t :: c.int
} else {
mode_t :: posix.mode_t
}
when ODIN_OS == .Windows {
filehandle_t :: rawptr filehandle_t :: rawptr
} else { } else when ODIN_OS ==
.Linux || ODIN_OS == .Darwin || ODIN_OS == .FreeBSD || ODIN_OS == .OpenBSD || ODIN_OS == .NetBSD {
foreign import lib "system:lmdb"
mode_t :: posix.mode_t
filehandle_t :: c.int filehandle_t :: c.int
} else {
#panic("levlib/vendor/lmdb: unsupported OS target")
} }
Env :: struct {} Env :: struct {}
@@ -189,7 +288,7 @@ Env :: struct {}
Txn :: struct {} Txn :: struct {}
/** @brief A handle for an individual database in the DB environment. */ /** @brief A handle for an individual database in the DB environment. */
Dbi :: u32 Dbi :: c.uint
Cursor :: struct {} Cursor :: struct {}
@@ -205,33 +304,8 @@ Cursor :: struct {}
* Other data items can in theory be from 0 to 0xffffffff bytes long. * Other data items can in theory be from 0 to 0xffffffff bytes long.
*/ */
Val :: struct { Val :: struct {
mv_size: uint, /**< size of the data item */ size: uint, /**< size of the data item */
mv_data: rawptr, /**< address of the data item */ data: rawptr, /**< address of the data item */
}
// Automatic `Val` handling for a given type 'T'.
// Will not traverse pointers. If `T` stores pointers, you probably don't want to use this.
Auto_Val :: struct($T: typeid) {
raw: Val,
}
autoval :: #force_inline proc "contextless" (val_ptr: ^$T) -> Auto_Val(T) {
return Auto_Val(T){Val{size_of(T), val_ptr}}
}
nil_autoval :: #force_inline proc "contextless" ($T: typeid) -> Auto_Val(T) {
return Auto_Val(T){Val{size_of(T), nil}}
}
autoval_get_data :: #force_inline proc "contextless" (val: ^Auto_Val($T)) -> ^T {
return cast(^T)val.raw.mv_data
}
// Panic if there is an error
panic_on_err :: #force_inline proc(error: Error) {
if error != .NONE {
fmt.panicf("Irrecoverable LMDB error", strerror(i32(error)))
}
} }
/** @brief A callback function used to compare two keys in a database */ /** @brief A callback function used to compare two keys in a database */
@@ -253,85 +327,65 @@ Cmp_Func :: #type proc "c" (_: ^Val, _: ^Val) -> i32
*/ */
Rel_Func :: #type proc "c" (item: ^Val, oldptr, newptr, relctx: rawptr) Rel_Func :: #type proc "c" (item: ^Val, oldptr, newptr, relctx: rawptr)
/** @defgroup mdb_env Environment Flags /** @defgroup mdb_env Environment Flags
* @{ * @{
*/ */
/** mmap at a fixed address (experimental) */ Env_Flag :: enum u32 {
ENV_FIXEDMAP :: 0x01 FIXEDMAP = 0, /**< mmap at a fixed address (experimental) */
/** no environment directory */ NOSUBDIR = 14, /**< no environment directory */
ENV_NOSUBDIR :: 0x4000 NOSYNC = 16, /**< don't fsync after commit */
/** don't fsync after commit */ RDONLY = 17, /**< read only */
ENV_NOSYNC :: 0x10000 NOMETASYNC = 18, /**< don't fsync metapage after commit */
/** read only */ WRITEMAP = 19, /**< use writable mmap */
ENV_RDONLY :: 0x20000 MAPASYNC = 20, /**< use asynchronous msync when WRITEMAP is used */
/** don't fsync metapage after commit */ NOTLS = 21, /**< tie reader locktable slots to Txn objects instead of to threads */
ENV_NOMETASYNC :: 0x40000 NOLOCK = 22, /**< don't do any locking, caller must manage their own locks */
/** use writable mmap */ NORDAHEAD = 23, /**< don't do readahead (no effect on Windows) */
ENV_WRITEMAP :: 0x80000 NOMEMINIT = 24, /**< don't initialize malloc'd memory before writing to datafile */
/** use asynchronous msync when #MDB_WRITEMAP is used */ PREVSNAPSHOT = 25, /**< use the previous snapshot rather than the latest one */
ENV_MAPASYNC :: 0x100000 }
/** tie reader locktable slots to #MDB_txn objects instead of to threads */ Env_Flags :: distinct bit_set[Env_Flag;c.uint]
ENV_NOTLS :: 0x200000
/** don't do any locking, caller must manage their own locks */
ENV_NOLOCK :: 0x400000
/** don't do readahead (no effect on Windows) */
ENV_NORDAHEAD :: 0x800000
/** don't initialize malloc'd memory before writing to datafile */
ENV_NOMEMINIT :: 0x1000000
/** @} */ /** @} */
/** @defgroup mdb_dbi_open Database Flags /** @defgroup mdb_dbi_open Database Flags
* @{ * @{
*/ */
/** use reverse string keys */ Db_Flag :: enum u32 {
DB_REVERSEKEY :: 0x02 REVERSEKEY = 1, /**< use reverse string keys */
/** use sorted duplicates */ DUPSORT = 2, /**< use sorted duplicates */
DB_DUPSORT :: 0x04 INTEGERKEY = 3, /**< numeric keys in native byte order */
/** numeric keys in native byte order: either unsigned int or size_t. DUPFIXED = 4, /**< with DUPSORT, sorted dup items have fixed size */
* The keys must all be of the same size. */ INTEGERDUP = 5, /**< with DUPSORT, dups are INTEGERKEY-style integers */
DB_INTEGERKEY :: 0x08 REVERSEDUP = 6, /**< with DUPSORT, use reverse string dups */
/** with #MDB_DUPSORT, sorted dup items have fixed size */ CREATE = 18, /**< create DB if not already existing */
DB_DUPFIXED :: 0x10 }
/** with #MDB_DUPSORT, dups are #MDB_INTEGERKEY-style integers */ Db_Flags :: distinct bit_set[Db_Flag;c.uint]
DB_INTEGERDUP :: 0x20
/** with #MDB_DUPSORT, use reverse string dups */
DB_REVERSEDUP :: 0x40
/** create DB if not already existing */
DB_CREATE :: 0x40000
/** @} */ /** @} */
/** @defgroup mdb_put Write Flags /** @defgroup mdb_put Write Flags
* @{ * @{
*/ */
/** For put: Don't write if the key already exists. */ Write_Flag :: enum u32 {
WRITE_NOOVERWRITE :: 0x10 NOOVERWRITE = 4, /**< For put: Don't write if the key already exists */
/** Only for #MDB_DUPSORT<br> NODUPDATA = 5, /**< For DUPSORT: don't write if the key and data pair already exist.
* For put: don't write if the key and data pair already exist.<br> For mdb_cursor_del: remove all duplicate data items. */
* For mdb_cursor_del: remove all duplicate data items. CURRENT = 6, /**< For mdb_cursor_put: overwrite the current key/data pair */
*/ RESERVE = 16, /**< For put: Just reserve space for data, don't copy it */
WRITE_NODUPDATA :: 0x20 APPEND = 17, /**< Data is being appended, don't split full pages */
/** For mdb_cursor_put: overwrite the current key/data pair */ APPENDDUP = 18, /**< Duplicate data is being appended, don't split full pages */
WRITE_CURRENT :: 0x40 MULTIPLE = 19, /**< Store multiple data items in one call. Only for DUPFIXED. */
/** For put: Just reserve space for data, don't copy it. Return a }
* pointer to the reserved space. Write_Flags :: distinct bit_set[Write_Flag;c.uint]
*/ /** @} */
WRITE_RESERVE :: 0x10000
/** Data is being appended, don't split full pages. */
WRITE_APPEND :: 0x20000
/** Duplicate data is being appended, don't split full pages. */
WRITE_APPENDDUP :: 0x40000
/** Store multiple data items in one call. Only for #MDB_DUPFIXED. */
WRITE_MULTIPLE :: 0x80000
/* @} */
/** @defgroup mdb_copy Copy Flags /** @defgroup mdb_copy Copy Flags
* @{ * @{
*/ */
/** Compacting copy: Omit free space from copy, and renumber all Copy_Flag :: enum u32 {
* pages sequentially. COMPACT = 0, /**< Compacting copy: Omit free space from copy, and renumber all pages sequentially. */
*/ }
CP_COMPACT :: 0x01 Copy_Flags :: distinct bit_set[Copy_Flag;c.uint]
/* @} */ /** @} */
/** @brief Cursor Get operations. /** @brief Cursor Get operations.
* *
@@ -340,33 +394,24 @@ CP_COMPACT :: 0x01
*/ */
Cursor_Op :: enum c.int { Cursor_Op :: enum c.int {
FIRST, /**< Position at first key/data item */ FIRST, /**< Position at first key/data item */
FIRST_DUP, /**< Position at first data item of current key. FIRST_DUP, /**< Position at first data item of current key. Only for DUPSORT */
Only for #MDB_DUPSORT */ GET_BOTH, /**< Position at key/data pair. Only for DUPSORT */
GET_BOTH, /**< Position at key/data pair. Only for #MDB_DUPSORT */ GET_BOTH_RANGE, /**< Position at key, nearest data. Only for DUPSORT */
GET_BOTH_RANGE, /**< position at key, nearest data. Only for #MDB_DUPSORT */
GET_CURRENT, /**< Return key/data at current cursor position */ GET_CURRENT, /**< Return key/data at current cursor position */
GET_MULTIPLE, /**< Return up to a page of duplicate data items GET_MULTIPLE, /**< Return up to a page of duplicate data items from current cursor position. Only for DUPFIXED */
from current cursor position. Move cursor to prepare
for #MDB_NEXT_MULTIPLE. Only for #MDB_DUPFIXED */
LAST, /**< Position at last key/data item */ LAST, /**< Position at last key/data item */
LAST_DUP, /**< Position at last data item of current key. LAST_DUP, /**< Position at last data item of current key. Only for DUPSORT */
Only for #MDB_DUPSORT */
NEXT, /**< Position at next data item */ NEXT, /**< Position at next data item */
NEXT_DUP, /**< Position at next data item of current key. NEXT_DUP, /**< Position at next data item of current key. Only for DUPSORT */
Only for #MDB_DUPSORT */ NEXT_MULTIPLE, /**< Return up to a page of duplicate data items from next cursor position. Only for DUPFIXED */
NEXT_MULTIPLE, /**< Return up to a page of duplicate data items
from next cursor position. Move cursor to prepare
for #MDB_NEXT_MULTIPLE. Only for #MDB_DUPFIXED */
NEXT_NODUP, /**< Position at first data item of next key */ NEXT_NODUP, /**< Position at first data item of next key */
PREV, /**< Position at previous data item */ PREV, /**< Position at previous data item */
PREV_DUP, /**< Position at previous data item of current key. PREV_DUP, /**< Position at previous data item of current key. Only for DUPSORT */
Only for #MDB_DUPSORT */
PREV_NODUP, /**< Position at last data item of previous key */ PREV_NODUP, /**< Position at last data item of previous key */
SET, /**< Position at specified key */ SET, /**< Position at specified key */
SET_KEY, /**< Position at specified key, return key + data */ SET_KEY, /**< Position at specified key, return key + data */
SET_RANGE, /**< Position at first key greater than or equal to specified key. */ SET_RANGE, /**< Position at first key greater than or equal to specified key */
PREV_MULTIPLE, /**< Position at previous page and return up to PREV_MULTIPLE, /**< Position at previous page and return up to a page of duplicate data items. Only for DUPFIXED */
a page of duplicate data items. Only for #MDB_DUPFIXED */
} }
Error :: enum c.int { Error :: enum c.int {
@@ -419,33 +464,28 @@ Error :: enum c.int {
BAD_VALSIZE = -30781, BAD_VALSIZE = -30781,
/** The specified DBI was changed unexpectedly */ /** The specified DBI was changed unexpectedly */
BAD_DBI = -30780, BAD_DBI = -30780,
/** Unexpected problem - txn should abort */
PROBLEM = -30779,
} }
/** @brief Statistics for a database in the environment */ /** @brief Statistics for a database in the environment */
Stat :: struct { Stat :: struct {
ms_psize: u32, psize: u32, /**< Size of a database page. This is currently the same for all databases. */
/**< Size of a database page. depth: u32, /**< Depth (height) of the B-tree */
This is currently the same for all databases. */ branch_pages: uint, /**< Number of internal (non-leaf) pages */
ms_depth: u32, leaf_pages: uint, /**< Number of leaf pages */
/**< Depth (height) of the B-tree */ overflow_pages: uint, /**< Number of overflow pages */
ms_branch_pages: uint, entries: uint, /**< Number of data items */
/**< Number of internal (non-leaf) pages */
ms_leaf_pages: uint,
/**< Number of leaf pages */
ms_overflow_pages: uint,
/**< Number of overflow pages */
ms_entries: uint,
/**< Number of data items */
} }
/** @brief Information about the environment */ /** @brief Information about the environment */
Env_Info :: struct { Env_Info :: struct {
me_mapaddr: rawptr, /**< Address of map, if fixed */ mapaddr: rawptr, /**< Address of map, if fixed */
me_mapsize: uint, /**< Size of the data memory map */ mapsize: uint, /**< Size of the data memory map */
me_last_pgno: uint, /**< ID of the last used page */ last_pgno: uint, /**< ID of the last used page */
me_last_txnid: uint, /**< ID of the last committed transaction */ last_txnid: uint, /**< ID of the last committed transaction */
me_maxreaders: u32, /**< max reader slots in the environment */ maxreaders: u32, /**< max reader slots in the environment */
me_numreaders: u32, /**< max reader slots used in the environment */ numreaders: u32, /**< max reader slots used in the environment */
} }
/** @brief A callback function for most LMDB assert() failures, /** @brief A callback function for most LMDB assert() failures,
@@ -454,7 +494,7 @@ Env_Info :: struct {
* @param[in] env An environment handle returned by #mdb_env_create(). * @param[in] env An environment handle returned by #mdb_env_create().
* @param[in] msg The assertion message, not including newline. * @param[in] msg The assertion message, not including newline.
*/ */
Assert_Func :: proc "c" (_: ^Env, _: cstring) Assert_Func :: #type proc "c" (_: ^Env, _: cstring)
/** @brief A callback function used to print a message from the library. /** @brief A callback function used to print a message from the library.
* *
@@ -462,7 +502,7 @@ Assert_Func :: proc "c" (_: ^Env, _: cstring)
* @param[in] ctx An arbitrary context pointer for the callback. * @param[in] ctx An arbitrary context pointer for the callback.
* @return < 0 on failure, >= 0 on success. * @return < 0 on failure, >= 0 on success.
*/ */
Msg_Func :: proc "c" (_: cstring, _: rawptr) -> i32 Msg_Func :: #type proc "c" (_: cstring, _: rawptr) -> i32
@(default_calling_convention = "c", link_prefix = "mdb_") @(default_calling_convention = "c", link_prefix = "mdb_")
foreign lib { foreign lib {
@@ -623,7 +663,7 @@ foreign lib {
* </ul> * </ul>
*/ */
@(require_results) @(require_results)
env_open :: proc(env: ^Env, path: cstring, flags: u32, mode: mode_t) -> Error --- env_open :: proc(env: ^Env, path: cstring, flags: Env_Flags, mode: mode_t) -> Error ---
/** @brief Copy an LMDB environment to the specified path. /** @brief Copy an LMDB environment to the specified path.
* *
@@ -682,7 +722,7 @@ foreign lib {
* @return A non-zero error value on failure and 0 on success. * @return A non-zero error value on failure and 0 on success.
*/ */
@(require_results) @(require_results)
env_copy2 :: proc(env: ^Env, path: cstring, flags: u32) -> Error --- env_copy2 :: proc(env: ^Env, path: cstring, flags: Copy_Flags) -> Error ---
/** @brief Copy an LMDB environment to the specified file descriptor, /** @brief Copy an LMDB environment to the specified file descriptor,
* with options. * with options.
@@ -702,7 +742,7 @@ foreign lib {
* @return A non-zero error value on failure and 0 on success. * @return A non-zero error value on failure and 0 on success.
*/ */
@(require_results) @(require_results)
env_copyfd2 :: proc(env: ^Env, fd: filehandle_t, flags: u32) -> Error --- env_copyfd2 :: proc(env: ^Env, fd: filehandle_t, flags: Copy_Flags) -> Error ---
/** @brief Return statistics about the LMDB environment. /** @brief Return statistics about the LMDB environment.
* *
@@ -767,7 +807,7 @@ foreign lib {
* </ul> * </ul>
*/ */
@(require_results) @(require_results)
env_set_flags :: proc(env: ^Env, flags: u32, onoff: i32) -> Error --- env_set_flags :: proc(env: ^Env, flags: Env_Flags, onoff: i32) -> Error ---
/** @brief Get environment flags. /** @brief Get environment flags.
* *
@@ -780,7 +820,7 @@ foreign lib {
* </ul> * </ul>
*/ */
@(require_results) @(require_results)
env_get_flags :: proc(env: ^Env, flags: ^u32) -> Error --- env_get_flags :: proc(env: ^Env, flags: ^Env_Flags) -> Error ---
/** @brief Return the path that was used in #mdb_env_open(). /** @brief Return the path that was used in #mdb_env_open().
* *
@@ -973,7 +1013,7 @@ foreign lib {
* </ul> * </ul>
*/ */
@(require_results) @(require_results)
txn_begin :: proc(env: ^Env, parent: ^Txn, flags: u32, txn: ^^Txn) -> Error --- txn_begin :: proc(env: ^Env, parent: ^Txn, flags: Env_Flags, txn: ^^Txn) -> Error ---
/** @brief Returns the transaction's #MDB_env /** @brief Returns the transaction's #MDB_env
* *
@@ -1126,7 +1166,7 @@ foreign lib {
* </ul> * </ul>
*/ */
@(require_results) @(require_results)
dbi_open :: proc(txn: ^Txn, name: cstring, flags: u32, dbi: ^Dbi) -> Error --- dbi_open :: proc(txn: ^Txn, name: cstring, flags: Db_Flags, dbi: ^Dbi) -> Error ---
/** @brief Retrieve statistics for a database. /** @brief Retrieve statistics for a database.
* *
@@ -1151,7 +1191,7 @@ foreign lib {
* @return A non-zero error value on failure and 0 on success. * @return A non-zero error value on failure and 0 on success.
*/ */
@(require_results) @(require_results)
dbi_flags :: proc(txn: ^Txn, dbi: Dbi, flags: ^u32) -> Error --- dbi_flags :: proc(txn: ^Txn, dbi: Dbi, flags: ^Db_Flags) -> Error ---
/** @brief Close a database handle. Normally unnecessary. Use with care: /** @brief Close a database handle. Normally unnecessary. Use with care:
* *
@@ -1229,6 +1269,7 @@ foreign lib {
@(require_results) @(require_results)
set_dupsort :: proc(txn: ^Txn, dbi: Dbi, cmp: Cmp_Func) -> Error --- set_dupsort :: proc(txn: ^Txn, dbi: Dbi, cmp: Cmp_Func) -> Error ---
// NOTE: Unimplemented in current LMDB; this function has no effect.
/** @brief Set a relocation function for a #MDB_FIXEDMAP database. /** @brief Set a relocation function for a #MDB_FIXEDMAP database.
* *
* @todo The relocation function is called whenever it is necessary to move the data * @todo The relocation function is called whenever it is necessary to move the data
@@ -1250,6 +1291,7 @@ foreign lib {
@(require_results) @(require_results)
set_relfunc :: proc(txn: ^Txn, dbi: Dbi, rel: Rel_Func) -> Error --- set_relfunc :: proc(txn: ^Txn, dbi: Dbi, rel: Rel_Func) -> Error ---
// NOTE: Unimplemented in current LMDB; this function has no effect.
/** @brief Set a context pointer for a #MDB_FIXEDMAP database's relocation function. /** @brief Set a context pointer for a #MDB_FIXEDMAP database's relocation function.
* *
* See #mdb_set_relfunc and #MDB_rel_func for more details. * See #mdb_set_relfunc and #MDB_rel_func for more details.
@@ -1344,7 +1386,7 @@ foreign lib {
* </ul> * </ul>
*/ */
@(require_results) @(require_results)
put :: proc(txn: ^Txn, dbi: Dbi, key: ^Val, data: ^Val, flags: u32) -> Error --- put :: proc(txn: ^Txn, dbi: Dbi, key: ^Val, data: ^Val, flags: Write_Flags) -> Error ---
/** @brief Delete items from a database. /** @brief Delete items from a database.
* *
@@ -1517,7 +1559,7 @@ foreign lib {
* </ul> * </ul>
*/ */
@(require_results) @(require_results)
cursor_put :: proc(cursor: ^Cursor, key: ^Val, data: ^Val, flags: u32) -> Error --- cursor_put :: proc(cursor: ^Cursor, key: ^Val, data: ^Val, flags: Write_Flags) -> Error ---
/** @brief Delete current key/data pair /** @brief Delete current key/data pair
* *
@@ -1541,7 +1583,7 @@ foreign lib {
* </ul> * </ul>
*/ */
@(require_results) @(require_results)
cursor_del :: proc(cursor: ^Cursor, flags: u32) -> Error --- cursor_del :: proc(cursor: ^Cursor, flags: Write_Flags) -> Error ---
/** @brief Return count of duplicates for current key. /** @brief Return count of duplicates for current key.
* *