From f85187eff3485311ab630b0d8f919bca51cab8c5 Mon Sep 17 00:00:00 2001 From: Zachary Levy Date: Mon, 20 Apr 2026 20:27:40 -0700 Subject: [PATCH 1/5] Clean up memory management --- draw/README.md | 222 ++++++++++++++++++++++++++++++++++++++++++----- draw/draw.odin | 5 +- draw/shapes.odin | 6 ++ draw/text.odin | 2 + 4 files changed, 213 insertions(+), 22 deletions(-) diff --git a/draw/README.md b/draw/README.md index 5f9225a..1066a7e 100644 --- a/draw/README.md +++ b/draw/README.md @@ -81,32 +81,63 @@ shader contains both a 20-register RRect SDF and a 72-register frosted-glass blu — even trivial RRects — is allocated 72 registers. This directly reduces **occupancy** (the number of warps that can run simultaneously), which reduces the GPU's ability to hide memory latency. -Concrete example on a modern NVIDIA SM with 65,536 registers: +Concrete occupancy analysis on modern NVIDIA SMs, which have 65,536 32-bit registers and a +hardware-imposed maximum thread count per SM that varies by architecture (Volta/A100: 2,048; +consumer Ampere/Ada: 1,536). Occupancy is register-limited only when `65536 / regs_per_thread` falls +below the hardware thread cap; above that cap, occupancy is 100% regardless of register count. -| Register allocation | Max concurrent threads | Occupancy | -| ------------------------- | ---------------------- | --------- | -| 20 regs (RRect only) | 3,276 | ~100% | -| 48 regs (+ drop shadow) | 1,365 | ~42% | -| 72 regs (+ frosted glass) | 910 | ~28% | +On consumer Ampere/Ada GPUs (RTX 30xx/40xx, max 1,536 threads per SM): -For a 4K frame (3840×2160) at 1.5× overdraw (~12.4M fragments), running all fragments at 28% -occupancy instead of 100% roughly triples fragment shading time. 
At 4K this is severe: if the main -pipeline's fragment work at full occupancy takes ~2ms, a single unified shader containing the glass -branch would push it to ~6ms — consuming 72% of the 8.3ms budget available at 120 FPS and leaving -almost nothing for CPU work, uploads, and presentation. This is a per-frame multiplier, not a -per-primitive cost — it applies even when the heavy branch is never taken. +| Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy | +| ------------------------- | ------------------- | ------------------ | --------- | +| 20 regs (RRect only) | 3,276 | 1,536 | 100% | +| 32 regs | 2,048 | 1,536 | 100% | +| 48 regs (+ drop shadow) | 1,365 | 1,365 | ~89% | +| 72 regs (+ frosted glass) | 910 | 910 | ~59% | + +On Volta/A100 GPUs (max 2,048 threads per SM): + +| Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy | +| ------------------------- | ------------------- | ------------------ | --------- | +| 20 regs (RRect only) | 3,276 | 2,048 | 100% | +| 32 regs | 2,048 | 2,048 | 100% | +| 48 regs (+ drop shadow) | 1,365 | 1,365 | ~67% | +| 72 regs (+ frosted glass) | 910 | 910 | ~44% | + +The register cliff — where occupancy begins dropping — starts at ~43 regs/thread on consumer +Ampere/Ada (65536 / 1536) and ~32 regs/thread on Volta/A100 (65536 / 2048). Below the cliff, +adding registers has zero occupancy cost. + +The impact of reduced occupancy depends on whether the shader is memory-latency-bound (where +occupancy is critical for hiding latency) or ALU-bound (where it matters less). For the +backdrop-effects pipeline's frosted-glass shader, which performs multiple dependent texture reads, +59% occupancy (consumer) or 44% occupancy (Volta) meaningfully reduces the GPU's ability to hide +texture latency — roughly a 1.7× to 2.3× throughput reduction compared to full occupancy. 
At 4K with +1.5× overdraw (~12.4M fragments), if the main pipeline's fragment work at full occupancy takes ~2ms, +a single unified shader containing the glass branch would push it to ~3.4–4.6ms depending on +architecture. This is a per-frame multiplier, not a per-primitive cost — it applies even when the +heavy branch is never taken, because the compiler allocates registers for the worst-case path. + +**Note on Apple M3+ GPUs:** Apple's M3 GPU architecture introduces Dynamic Caching (register file +virtualization), which allocates registers dynamically at runtime based on actual usage rather than +worst-case declared usage. This significantly reduces the static register-pressure-to-occupancy +penalty described above. The tier split remains useful on Apple hardware for other reasons (keeping +the backdrop texture-copy out of the main render pass, isolating blur ALU complexity), but the +register-pressure argument specifically weakens on M3 and later. The three-pipeline split groups primitives by register footprint so that: -- Main pipeline (~20 regs): 90%+ of fragments run at near-full occupancy. -- Effects pipeline (~55 regs): shadow/glow fragments run at moderate occupancy; unavoidable given the - blur math complexity. -- Backdrop-effects pipeline (~75 regs): glass fragments run at low occupancy; also unavoidable, and - structurally separated anyway by the texture-copy requirement. +- Main pipeline (~20 regs): all fragments run at full occupancy on every architecture. +- Effects pipeline (~48–55 regs): shadow/glow fragments run at 67–89% occupancy depending on + architecture; unavoidable given the blur math complexity. +- Backdrop-effects pipeline (~72–75 regs): glass fragments run at 44–59% occupancy; also + unavoidable, and structurally separated anyway by the texture-copy requirement. This avoids the register-pressure tax of a single unified shader while keeping pipeline count minimal (3 vs. Zed GPUI's 7). 
The effects that drag occupancy down are isolated to the fragments that -actually need them. +actually need them. Crucially, all shape kinds within the main pipeline (SDF, tessellated, text) +cluster at 12–24 registers — well below the register cliff on every architecture — so unifying them +costs nothing in occupancy. **Why not per-primitive-type pipelines (GPUI's approach)?** Zed's GPUI uses 7 separate shader pairs: quad, shadow, underline, monochrome sprite, polychrome sprite, path, surface. This eliminates all @@ -160,9 +191,9 @@ in submission order: cheaper than the pipeline-switching alternative. The split we _do_ perform (main / effects / backdrop-effects) is motivated by register-pressure tier -boundaries where occupancy differences are catastrophic at 4K (see numbers above). Within a tier, -unified is strictly better by every measure: fewer draw calls, simpler Z-order, lower CPU overhead, -and negligible GPU-side branching cost. +boundaries where occupancy drops are significant at 4K (see numbers above). Within a tier, unified is +strictly better by every measure: fewer draw calls, simpler Z-order, lower CPU overhead, and +negligible GPU-side branching cost. **References:** @@ -172,6 +203,16 @@ and negligible GPU-side branching cost. 
https://github.com/zed-industries/zed/blob/cb6fc11/crates/gpui/src/platform/mac/shaders.metal - NVIDIA Nsight Graphics 2024.3 documentation on active-threads-per-warp and divergence analysis: https://developer.nvidia.com/blog/optimize-gpu-workloads-for-graphics-applications-with-nvidia-nsight-graphics/ +- NVIDIA Ampere GPU Architecture Tuning Guide — SM specs, max warps per SM (48 for cc 8.6, 64 for + cc 8.0), register file size (64K), occupancy factors: + https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html +- NVIDIA Ada GPU Architecture Tuning Guide — SM specs, max warps per SM (48 for cc 8.9): + https://docs.nvidia.com/cuda/ada-tuning-guide/index.html +- CUDA Occupancy Calculation walkthrough (register allocation granularity, worked examples): + https://leimao.github.io/blog/CUDA-Occupancy-Calculation/ +- Apple M3 GPU architecture — Dynamic Caching (register file virtualization) eliminates static + worst-case register allocation, reducing the occupancy penalty for high-register shaders: + https://asplos.dev/wiki/m3-chip-explainer/gpu/index.html ### Why fragment shader branching is safe in this design @@ -539,6 +580,145 @@ changes. - Valve's original SDF text rendering paper (SIGGRAPH 2007): https://steamcdn-a.akamaihd.net/apps/valve/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf +### Textures + +Textures plug into the existing main pipeline — no additional GPU pipeline, no shader rewrite. The +work is a resource layer (registration, upload, sampling, lifecycle) plus two textured-draw procs +that route into the existing tessellated and SDF paths respectively. + +#### Why draw owns registered textures + +A texture's GPU resource (the `^sdl.GPUTexture`, transfer buffer, shader resource view) is created +and destroyed by draw. The user provides raw bytes and a descriptor at registration time; draw +uploads synchronously and returns an opaque `Texture_Id` handle. The user can free their CPU-side +bytes immediately after `register_texture` returns. 
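A minimal sketch of the single-owner handle model, written in Python with hypothetical names (`TextureRegistry`, `register`, `unregister`, `end_frame`) that are not the library's API. It models the lifecycle only and makes no GPU calls. Registration copies the bytes and returns an opaque integer handle, so the caller's buffer can be dropped right away. Unregistration only queues the slot, and the actual release happens at end of frame, in the spirit of the "Deferred release" subsection of this README.

```python
class TextureRegistry:
    """Toy model of renderer-owned textures with end-of-frame release."""

    def __init__(self) -> None:
        self._slots: dict[int, bytes] = {}  # handle -> renderer-owned copy
        self._pending_release: list[int] = []
        self._next_id = 1

    def register(self, pixels: bytes) -> int:
        """Copy the caller's bytes (stand-in for the GPU upload); return a handle."""
        texture_id = self._next_id
        self._next_id += 1
        self._slots[texture_id] = bytes(pixels)
        return texture_id

    def unregister(self, texture_id: int) -> None:
        """Queue the slot; draws already recorded this frame may still use it."""
        if texture_id in self._slots and texture_id not in self._pending_release:
            self._pending_release.append(texture_id)

    def is_alive(self, texture_id: int) -> bool:
        """True while the renderer still holds the resource."""
        return texture_id in self._slots

    def end_frame(self) -> None:
        """Release queued slots once the frame's work has been submitted."""
        for texture_id in self._pending_release:
            del self._slots[texture_id]
        self._pending_release.clear()


if __name__ == "__main__":
    registry = TextureRegistry()
    pixels = bytes(16)                # caller-side CPU bytes
    handle = registry.register(pixels)
    del pixels                        # safe: the registry holds its own copy
    registry.unregister(handle)
    print(registry.is_alive(handle))  # queried mid-frame, before end_frame
    registry.end_frame()
    print(registry.is_alive(handle))  # queried after the deferred release
```

Because the handle is an opaque integer, callers never touch the underlying resource, which is what keeps cleanup responsibility in one place.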
+ +This follows the model used by the RAD Debugger's render layer (`src/render/render_core.h` in +EpicGamesExt/raddebugger, MIT license), where `r_tex2d_alloc` takes `(kind, size, format, data)` +and returns an opaque handle that the renderer owns and releases. The single-owner model eliminates +an entire class of lifecycle bugs (double-free, use-after-free across subsystems, unclear cleanup +responsibility) that dual-ownership designs introduce. + +If advanced interop is ever needed (e.g., a future 3D pipeline or compute shader sharing the same +GPU texture), the clean extension is a borrowed-reference accessor (`get_gpu_texture(id)`) that +returns the underlying handle without transferring ownership. This is purely additive and does not +require changing the registration API. + +#### Why `Texture_Kind` exists + +`Texture_Kind` (Static / Dynamic / Stream) is a driver hint for update frequency, adopted from the +RAD Debugger's `R_ResourceKind`. It maps directly to SDL3 GPU usage patterns: + +- **Static**: uploaded once, never changes. Covers QR codes, decoded PNGs, icons — the 90% case. +- **Dynamic**: updatable via `update_texture_region`. Covers font atlas growth, procedural updates. +- **Stream**: frequent full re-uploads. Covers video playback, per-frame procedural generation. + +This costs one byte in the descriptor and lets the backend pick optimal memory placement without a +future API change. + +#### Why samplers are per-draw, not per-texture + +A sampler describes how to filter and address a texture during sampling — nearest vs bilinear, clamp +vs repeat. This is a property of the _draw_, not the texture. The same QR code texture should be +sampled with `Nearest_Clamp` when displayed at native resolution but could reasonably be sampled +with `Linear_Clamp` in a zoomed-out thumbnail. The same icon atlas might be sampled with +`Nearest_Clamp` for pixel art or `Linear_Clamp` for smooth scaling. 
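As a rough illustration of why the sampler belongs to the draw, the sketch below models draw records that carry a sampler preset next to the texture handle, and merges records only when kind, texture, and sampler all match, using the `(kind, texture_id, sampler_preset)` key this README gives for sub-batch coalescing. The `Draw` record and `coalesce` helper are hypothetical Python stand-ins, and merging only adjacent draws is an assumption made here to preserve submission order.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Draw:
    kind: str            # illustrative label, e.g. "sdf" or "text"
    texture_id: int      # opaque handle returned at registration
    sampler_preset: str  # e.g. "Nearest_Clamp" or "Linear_Clamp"


def batch_key(draw: Draw) -> tuple[str, int, str]:
    """Key on which consecutive draws are allowed to merge."""
    return (draw.kind, draw.texture_id, draw.sampler_preset)


def coalesce(draws: list[Draw]) -> list[tuple[tuple[str, int, str], int]]:
    """Merge runs of adjacent draws sharing a key.

    Returns one (key, draw_count) pair per resulting draw call.
    """
    return [(key, len(list(run))) for key, run in groupby(draws, key=batch_key)]


if __name__ == "__main__":
    qr_texture = 7  # hypothetical texture handle
    submitted = [
        Draw("sdf", qr_texture, "Nearest_Clamp"),
        Draw("sdf", qr_texture, "Nearest_Clamp"),
        Draw("sdf", qr_texture, "Linear_Clamp"),  # same texture, other sampler
    ]
    for key, count in coalesce(submitted):
        print(key, count)
```

The third draw reuses the texture but starts a new batch, which is the behavior the per-draw sampler design implies: no texture state has to be duplicated to get a different filter.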
+ +The RAD Debugger follows this pattern: `R_BatchGroup2DParams` carries `tex_sample_kind` alongside +the texture handle, chosen per batch group at draw time. We do the same — `Sampler_Preset` is a +parameter on the draw procs, not a field on `Texture_Desc`. + +Internally, draw keeps a small pool of pre-created `^sdl.GPUSampler` objects (one per preset, +lazily initialized). Sub-batch coalescing keys on `(kind, texture_id, sampler_preset)` — draws +with the same texture but different samplers produce separate draw calls, which is correct. + +#### Textured draw procs + +Textured rectangles route through the existing SDF path via `draw.rectangle_texture` and +`draw.rectangle_texture_corners`, mirroring `draw.rectangle` and `draw.rectangle_corners` exactly — +same parameters, same naming — with the color parameter replaced by a texture ID plus an optional +tint. + +An earlier iteration of this design considered a separate tessellated `draw.texture` proc for +"simple" fullscreen quads, on the theory that the tessellated path's lower register count (~16 regs +vs ~24 for the SDF textured branch) would improve occupancy at large fragment counts. Applying the +register-pressure analysis from the pipeline-strategy section above shows this is wrong: both 16 and +24 registers are well below the register cliff (~43 regs on consumer Ampere/Ada, ~32 on Volta/A100), +so both run at 100% occupancy. The remaining ALU difference (~15 extra instructions for the SDF +evaluation) amounts to ~20μs at 4K — below noise. Meanwhile, splitting into a separate pipeline +would add ~1–5μs per pipeline bind on the CPU side per scissor, matching or exceeding the GPU-side +savings. Within the main tier, unified remains strictly better. 
+ +The naming convention follows the existing shape API: `rectangle_texture` and +`rectangle_texture_corners` sit alongside `rectangle` and `rectangle_corners`, mirroring the +`rectangle_gradient` / `circle_gradient` pattern where the shape is the primary noun and the +modifier (gradient, texture) is secondary. This groups related procs together in autocomplete +(`rectangle_*`) and reads as natural English ("draw a rectangle with a texture"). + +Future per-shape texture variants (`circle_texture`, `ellipse_texture`, `polygon_texture`) are +reserved by this naming convention and require only a `Shape_Flag.Textured` bit plus a small +per-shape UV mapping function in the fragment shader. These are additive. + +#### What SDF anti-aliasing does and does not do for textured draws + +The SDF path anti-aliases the **shape's outer silhouette** — rounded-corner edges, rotated edges, +stroke outlines. It does not anti-alias or sharpen the texture content. Inside the shape, fragments +sample through the chosen `Sampler_Preset`, and image quality is whatever the sampler produces from +the source texels. A low-resolution texture displayed at a large size shows bilinear blur regardless +of which draw proc is used. This matches the current text-rendering model, where glyph sharpness +depends on how closely the display size matches the SDL_ttf atlas's rasterized size. + +#### Fit modes are a computation layer, not a renderer concept + +Standard image-fit behaviors (stretch, fill/cover, fit/contain, tile, center) are expressed as UV +sub-region computations on top of the `uv_rect` parameter that both textured-draw procs accept. The +renderer has no knowledge of fit modes — it samples whatever UV region it is given. + +A `fit_params` helper computes the appropriate `uv_rect`, sampler preset, and (for letterbox/fit +mode) shrunken inner rect from a `Fit_Mode` enum, the target rect, and the texture's pixel size. 
+Users who need custom UV control (sprite atlas sub-regions, UV animation, nine-patch slicing) skip +the helper and compute `uv_rect` directly. This keeps the renderer primitive minimal while making +the common cases convenient. + +#### Deferred release + +`unregister_texture` does not immediately release the GPU texture. It queues the slot for release at +the end of the current frame, after `SubmitGPUCommandBuffer` has handed work to the GPU. This +prevents a race condition where a texture is freed while the GPU is still sampling from it in an +already-submitted command buffer. The same deferred-release pattern is applied to `clear_text_cache` +and `clear_text_cache_entry`, fixing a pre-existing latent bug where destroying a cached +`^sdl_ttf.Text` mid-frame could free an atlas texture still referenced by in-flight draw batches. + +This pattern is standard in production renderers — the RAD Debugger's `r_tex2d_release` queues +textures onto a free list that is processed in `r_end_frame`, not at the call site. + +#### Clay integration + +Clay's `RenderCommandType.Image` is handled by dereferencing `imageData: rawptr` as a pointer to a +`Clay_Image_Data` struct containing a `Texture_Id`, `Fit_Mode`, and tint color. Routing mirrors the +existing rectangle handling: zero `cornerRadius` dispatches to `draw.texture` (tessellated), nonzero +dispatches to `draw.rectangle_texture_corners` (SDF). A `fit_params` call computes UVs from the fit +mode before dispatch. + +#### Deferred features + +The following are plumbed in the descriptor but not implemented in phase 1: + +- **Mipmaps**: `Texture_Desc.mip_levels` field exists; generation via SDL3 deferred. +- **Compressed formats**: `Texture_Desc.format` accepts BC/ASTC; upload path deferred. +- **Render-to-texture**: `Texture_Desc.usage` accepts `.COLOR_TARGET`; render-pass refactor deferred. +- **3D textures, arrays, cube maps**: `Texture_Desc.type` and `depth_or_layers` fields exist. 
+- **Additional samplers**: anisotropic, trilinear, clamp-to-border — additive enum values. +- **Atlas packing**: internal optimization for sub-batch coalescing; invisible to callers. +- **Per-shape texture variants**: `circle_texture`, `ellipse_texture`, etc. — reserved by naming. + +**References:** + +- RAD Debugger render layer (ownership model, deferred release, sampler-at-draw-time): + https://github.com/EpicGamesExt/raddebugger — `src/render/render_core.h`, `src/render/d3d11/render_d3d11.c` +- Casey Muratori, Handmade Hero day 472 — texture handling as a renderer-owned resource concern, + atlases as a separate layer above the renderer. + ## 3D rendering 3D pipeline architecture is under consideration and will be documented separately. The current diff --git a/draw/draw.odin b/draw/draw.odin index 0ed28b0..0cb0f82 100644 --- a/draw/draw.odin +++ b/draw/draw.odin @@ -265,6 +265,7 @@ measure_text_clay :: proc "c" ( context = GLOB.odin_context text := string(text.chars[:text.length]) c_text := strings.clone_to_cstring(text, context.temp_allocator) + defer delete(c_text, context.temp_allocator) width, height: c.int if !sdl_ttf.GetStringSize(get_font(config.fontId, config.fontSize), c_text, 0, &width, &height) { log.panicf("Failed to measure text: %s", sdl.GetError()) @@ -502,6 +503,7 @@ prepare_clay_batch :: proc( mouse_wheel_delta: [2]f32, frame_time: f32 = 0, custom_draw: Custom_Draw = nil, + temp_allocator := context.temp_allocator, ) { mouse_pos: [2]f32 mouse_flags := sdl.GetMouseState(&mouse_pos.x, &mouse_pos.y) @@ -541,7 +543,8 @@ prepare_clay_batch :: proc( case clay.RenderCommandType.Text: render_data := render_command.renderData.text txt := string(render_data.stringContents.chars[:render_data.stringContents.length]) - c_text := strings.clone_to_cstring(txt, context.temp_allocator) + c_text := strings.clone_to_cstring(txt, temp_allocator) + defer delete(c_text, temp_allocator) // Clay render-command IDs are derived via Clay's internal HashNumber 
(Jenkins-family) // and namespaced with .Clay so they can never collide with user-provided custom text IDs. sdl_text := cache_get_or_update( diff --git a/draw/shapes.odin b/draw/shapes.odin index 2b15f25..5a8b929 100644 --- a/draw/shapes.odin +++ b/draw/shapes.odin @@ -83,6 +83,7 @@ rectangle_gradient :: proc( temp_allocator := context.temp_allocator, ) { vertices := make([]Vertex, 6, temp_allocator) + defer delete(vertices, temp_allocator) corner_top_left := [2]f32{rect.x, rect.y} corner_top_right := [2]f32{rect.x + rect.width, rect.y} @@ -115,6 +116,7 @@ circle_sector :: proc( vertex_count := segment_count * 3 vertices := make([]Vertex, vertex_count, temp_allocator) + defer delete(vertices, temp_allocator) start_radians := math.to_radians(start_angle) end_radians := math.to_radians(end_angle) @@ -167,6 +169,7 @@ circle_gradient :: proc( vertex_count := segment_count * 3 vertices := make([]Vertex, vertex_count, temp_allocator) + defer delete(vertices, temp_allocator) step_angle := math.TAU / f32(segment_count) @@ -238,6 +241,7 @@ triangle_lines :: proc( temp_allocator := context.temp_allocator, ) { vertices := make([]Vertex, 18, temp_allocator) + defer delete(vertices, temp_allocator) write_offset := 0 if !needs_transform(origin, rotation) { @@ -273,6 +277,7 @@ triangle_fan :: proc( triangle_count := len(points) - 2 vertex_count := triangle_count * 3 vertices := make([]Vertex, vertex_count, temp_allocator) + defer delete(vertices, temp_allocator) if !needs_transform(origin, rotation) { for i in 1 ..< len(points) - 1 { @@ -312,6 +317,7 @@ triangle_strip :: proc( triangle_count := len(points) - 2 vertex_count := triangle_count * 3 vertices := make([]Vertex, vertex_count, temp_allocator) + defer delete(vertices, temp_allocator) if !needs_transform(origin, rotation) { for i in 0 ..< triangle_count { diff --git a/draw/text.odin b/draw/text.odin index 5ff7265..7400b33 100644 --- a/draw/text.odin +++ b/draw/text.odin @@ -139,6 +139,7 @@ text :: proc( temp_allocator := 
context.temp_allocator, ) { c_str := strings.clone_to_cstring(text_string, temp_allocator) + defer delete(c_str, temp_allocator) sdl_text: ^sdl_ttf.Text cached := false @@ -180,6 +181,7 @@ measure_text :: proc( allocator := context.temp_allocator, ) -> [2]f32 { c_str := strings.clone_to_cstring(text_string, allocator) + defer delete(c_str, allocator) width, height: c.int if !sdl_ttf.GetStringSize(get_font(font_id, font_size), c_str, 0, &width, &height) { log.panicf("Failed to measure text: %s", sdl.GetError()) -- 2.43.0 From a4623a13b576c662dc6477899a17a4d52e5b6bb4 Mon Sep 17 00:00:00 2001 From: Zachary Levy Date: Tue, 21 Apr 2026 13:01:02 -0700 Subject: [PATCH 2/5] Basic texture support --- .zed/tasks.json | 5 + draw/README.md | 244 +++++++----- draw/draw.odin | 197 +++++++--- draw/draw_qr/draw_qr.odin | 78 ++++ draw/examples/hellope.odin | 5 +- draw/examples/main.odin | 5 +- draw/examples/textures.odin | 285 ++++++++++++++ draw/pipeline_2d_base.odin | 37 +- draw/shaders/generated/base_2d.frag.metal | 95 +++-- draw/shaders/generated/base_2d.frag.spv | Bin 17776 -> 19164 bytes draw/shaders/generated/base_2d.vert.metal | 23 +- draw/shaders/generated/base_2d.vert.spv | Bin 4716 -> 5008 bytes draw/shaders/source/base_2d.frag | 18 + draw/shaders/source/base_2d.vert | 4 + draw/shapes.odin | 158 +++++++- draw/text.odin | 4 +- draw/textures.odin | 433 ++++++++++++++++++++++ 17 files changed, 1375 insertions(+), 216 deletions(-) create mode 100644 draw/draw_qr/draw_qr.odin create mode 100644 draw/examples/textures.odin create mode 100644 draw/textures.odin diff --git a/.zed/tasks.json b/.zed/tasks.json index e08acae..8b14508 100644 --- a/.zed/tasks.json +++ b/.zed/tasks.json @@ -70,6 +70,11 @@ "command": "odin run draw/examples -debug -out=out/debug/draw-examples -- hellope-custom", "cwd": "$ZED_WORKTREE_ROOT", }, + { + "label": "Run draw textures example", + "command": "odin run draw/examples -debug -out=out/debug/draw-examples -- textures", + "cwd": "$ZED_WORKTREE_ROOT", 
+ }, { "label": "Run qrcode basic example", "command": "odin run qrcode/examples -debug -out=out/debug/qrcode-examples -- basic", diff --git a/draw/README.md b/draw/README.md index 1066a7e..5eeabf2 100644 --- a/draw/README.md +++ b/draw/README.md @@ -47,99 +47,107 @@ primitives and effects can be added to the library without architectural changes ### Overview: three pipelines -The 2D renderer will use three GPU pipelines, split by **register pressure compatibility** and -**render-state requirements**: +The 2D renderer uses three GPU pipelines, split by **register pressure** (main vs effects) and +**render-pass structure** (everything vs backdrop): -1. **Main pipeline** — shapes (SDF and tessellated) and text. Low register footprint (~18–22 - registers per thread). Runs at high GPU occupancy. Handles 90%+ of all fragments in a typical - frame. +1. **Main pipeline** — shapes (SDF and tessellated), text, and textured rectangles. Low register + footprint (~18–24 registers per thread). Runs at full GPU occupancy on every architecture. + Handles 90%+ of all fragments in a typical frame. 2. **Effects pipeline** — drop shadows, inner shadows, outer glow, and similar ALU-bound blur effects. Medium register footprint (~48–60 registers). Each effects primitive includes the base shape's SDF so that it can draw both the effect and the shape in a single fragment pass, avoiding - redundant overdraw. + redundant overdraw. Separated from the main pipeline to protect main-pipeline occupancy on + low-end hardware (see register analysis below). -3. **Backdrop-effects pipeline** — frosted glass, refraction, and any effect that samples the current - render target as input. High register footprint (~70–80 registers) and structurally requires a - `CopyGPUTextureToTexture` from the render target before drawing. Separated both for register - pressure and because the texture-copy requirement forces a render-pass-level state change. +3. 
**Backdrop pipeline** — frosted glass, refraction, and any effect that samples the current render + target as input. Implemented as a multi-pass sequence (downsample, separable blur, composite), + where each individual pass has a low-to-medium register footprint (~15–40 registers). Separated + from the other pipelines because it structurally requires ending the current render pass and + copying the render target before any backdrop-sampling fragment can execute — a command-buffer- + level boundary that cannot be avoided regardless of shader complexity. A typical UI frame with no effects uses 1 pipeline bind and 0 switches. A frame with drop shadows uses 2 pipelines and 1 switch. A frame with shadows and frosted glass uses all 3 pipelines and 2 -switches plus 1 texture copy. At ~5μs per pipeline bind on modern APIs, worst-case switching overhead -is under 0.15% of an 8.3ms (120 FPS) frame budget. +switches plus 1 texture copy. At ~1–5μs per pipeline bind on modern APIs, worst-case switching +overhead is negligible relative to an 8.3ms (120 FPS) frame budget. ### Why three pipelines, not one or seven The natural question is whether we should use a single unified pipeline (fewer state changes, simpler code) or many per-primitive-type pipelines (no branching overhead, lean per-shader register usage). -The dominant cost factor is **GPU register pressure**, not pipeline switching overhead or fragment -shader branching. A GPU shader core has a fixed register pool shared among all concurrent threads. The -compiler allocates registers pessimistically based on the worst-case path through the shader. If the -shader contains both a 20-register RRect SDF and a 72-register frosted-glass blur, _every_ fragment -— even trivial RRects — is allocated 72 registers. This directly reduces **occupancy** (the number of -warps that can run simultaneously), which reduces the GPU's ability to hide memory latency. 
+#### Main/effects split: register pressure -Concrete occupancy analysis on modern NVIDIA SMs, which have 65,536 32-bit registers and a -hardware-imposed maximum thread count per SM that varies by architecture (Volta/A100: 2,048; -consumer Ampere/Ada: 1,536). Occupancy is register-limited only when `65536 / regs_per_thread` falls -below the hardware thread cap; above that cap, occupancy is 100% regardless of register count. +A GPU shader core has a fixed register pool shared among all concurrent threads. The compiler +allocates registers pessimistically based on the worst-case path through the shader. If the shader +contains both a 20-register RRect SDF and a 48-register drop-shadow blur, _every_ fragment — even +trivial RRects — is allocated 48 registers. This directly reduces **occupancy** (the number of +warps/wavefronts that can run simultaneously), which reduces the GPU's ability to hide memory +latency. -On consumer Ampere/Ada GPUs (RTX 30xx/40xx, max 1,536 threads per SM): +Each GPU architecture has a **register cliff** — a threshold above which occupancy starts dropping. +Below the cliff, adding registers has zero occupancy cost. 
-| Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy | -| ------------------------- | ------------------- | ------------------ | --------- | -| 20 regs (RRect only) | 3,276 | 1,536 | 100% | -| 32 regs | 2,048 | 1,536 | 100% | -| 48 regs (+ drop shadow) | 1,365 | 1,365 | ~89% | -| 72 regs (+ frosted glass) | 910 | 910 | ~59% | +On consumer Ampere/Ada GPUs (RTX 30xx/40xx, 65,536 regs/SM, max 1,536 threads/SM, cliff at ~43 regs): -On Volta/A100 GPUs (max 2,048 threads per SM): +| Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy | +| ----------------------- | ------------------- | ------------------ | --------- | +| 20 regs (main pipeline) | 3,276 | 1,536 | 100% | +| 32 regs | 2,048 | 1,536 | 100% | +| 48 regs (effects) | 1,365 | 1,365 | ~89% | -| Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy | -| ------------------------- | ------------------- | ------------------ | --------- | -| 20 regs (RRect only) | 3,276 | 2,048 | 100% | -| 32 regs | 2,048 | 2,048 | 100% | -| 48 regs (+ drop shadow) | 1,365 | 1,365 | ~67% | -| 72 regs (+ frosted glass) | 910 | 910 | ~44% | +On Volta/A100 GPUs (65,536 regs/SM, max 2,048 threads/SM, cliff at ~32 regs): -The register cliff — where occupancy begins dropping — starts at ~43 regs/thread on consumer -Ampere/Ada (65536 / 1536) and ~32 regs/thread on Volta/A100 (65536 / 2048). Below the cliff, -adding registers has zero occupancy cost. +| Register allocation | Reg-limited threads | Actual (hw-capped) | Occupancy | +| ----------------------- | ------------------- | ------------------ | --------- | +| 20 regs (main pipeline) | 3,276 | 2,048 | 100% | +| 32 regs | 2,048 | 2,048 | 100% | +| 48 regs (effects) | 1,365 | 1,365 | ~67% | -The impact of reduced occupancy depends on whether the shader is memory-latency-bound (where -occupancy is critical for hiding latency) or ALU-bound (where it matters less). 
For the -backdrop-effects pipeline's frosted-glass shader, which performs multiple dependent texture reads, -59% occupancy (consumer) or 44% occupancy (Volta) meaningfully reduces the GPU's ability to hide -texture latency — roughly a 1.7× to 2.3× throughput reduction compared to full occupancy. At 4K with -1.5× overdraw (~12.4M fragments), if the main pipeline's fragment work at full occupancy takes ~2ms, -a single unified shader containing the glass branch would push it to ~3.4–4.6ms depending on -architecture. This is a per-frame multiplier, not a per-primitive cost — it applies even when the -heavy branch is never taken, because the compiler allocates registers for the worst-case path. +On low-end mobile (ARM Mali Bifrost/Valhall, 64 regs/thread, cliff fixed at 32 regs): -**Note on Apple M3+ GPUs:** Apple's M3 GPU architecture introduces Dynamic Caching (register file -virtualization), which allocates registers dynamically at runtime based on actual usage rather than -worst-case declared usage. This significantly reduces the static register-pressure-to-occupancy -penalty described above. The tier split remains useful on Apple hardware for other reasons (keeping -the backdrop texture-copy out of the main render pass, isolating blur ALU complexity), but the -register-pressure argument specifically weakens on M3 and later. +| Register allocation | Occupancy | +| -------------------- | -------------------------- | +| 0–32 regs (main) | 100% (full thread count) | +| 33–64 regs (effects) | ~50% (thread count halves) | -The three-pipeline split groups primitives by register footprint so that: +Mali's cliff at 32 registers is the binding constraint. On desktop the occupancy difference between +20 and 48 registers is modest (89–100%); on Mali it is a hard 2× throughput reduction. The +main/effects split protects 90%+ of a frame's fragments (shapes, text, textures) from the effects +pipeline's register cost. 
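The desktop rows in the tables above follow from a simple model: resident threads are limited by the register file (65,536 registers per SM divided by registers per thread) and by the hardware thread cap, whichever is smaller. A minimal Python sketch of that arithmetic follows. It assumes whole-thread granularity, whereas real hardware allocates registers per warp in fixed-size chunks, so measured figures can be slightly lower. The function names are illustrative only, and the Mali case (a fixed step at 32 registers) is not modeled.

```python
REGISTERS_PER_SM = 65_536  # 32-bit registers per NVIDIA SM, as stated in the text


def occupancy(regs_per_thread: int, max_threads_per_sm: int) -> float:
    """Fraction of the hardware thread cap that fits in the register file."""
    reg_limited_threads = REGISTERS_PER_SM // regs_per_thread
    resident_threads = min(reg_limited_threads, max_threads_per_sm)
    return resident_threads / max_threads_per_sm


def register_cliff(max_threads_per_sm: int) -> float:
    """Registers per thread above which occupancy starts to drop."""
    return REGISTERS_PER_SM / max_threads_per_sm


if __name__ == "__main__":
    for name, thread_cap in (("consumer Ampere/Ada", 1536), ("Volta/A100", 2048)):
        print(f"{name}: cliff at ~{register_cliff(thread_cap):.0f} regs/thread")
        for regs in (20, 32, 48):
            print(f"  {regs} regs -> {occupancy(regs, thread_cap):.0%}")
```

Keeping the thread cap as a parameter makes it easy to check a new shader's register count against each target architecture before deciding which pipeline it belongs in.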
-- Main pipeline (~20 regs): all fragments run at full occupancy on every architecture. -- Effects pipeline (~48–55 regs): shadow/glow fragments run at 67–89% occupancy depending on - architecture; unavoidable given the blur math complexity. -- Backdrop-effects pipeline (~72–75 regs): glass fragments run at 44–59% occupancy; also - unavoidable, and structurally separated anyway by the texture-copy requirement. +For the effects pipeline's drop-shadow shader — erf-approximation blur math with several texture +fetches — 50% occupancy on Mali roughly halves throughput. At 4K with 1.5× overdraw (~12.4M +fragments), a single unified shader containing the shadow branch would cost ~4ms instead of ~2ms on +low-end mobile. This is a per-frame multiplier even when the heavy branch is never taken, because the +compiler allocates registers for the worst-case path. -This avoids the register-pressure tax of a single unified shader while keeping pipeline count minimal -(3 vs. Zed GPUI's 7). The effects that drag occupancy down are isolated to the fragments that -actually need them. Crucially, all shape kinds within the main pipeline (SDF, tessellated, text) -cluster at 12–24 registers — well below the register cliff on every architecture — so unifying them -costs nothing in occupancy. +All main-pipeline members (SDF shapes, tessellated geometry, text, textured rectangles) cluster at +12–24 registers — below the cliff on every architecture — so unifying them costs nothing in +occupancy. -**Why not per-primitive-type pipelines (GPUI's approach)?** Zed's GPUI uses 7 separate shader pairs: +**Note on Apple M3+ GPUs:** Apple's M3 introduces Dynamic Caching (register file virtualization), +which allocates registers at runtime based on actual usage rather than worst-case. This weakens the +static register-pressure argument on M3 and later, but the split remains useful for isolating blur +ALU complexity and keeping the backdrop texture-copy out of the main render pass. 
+ +#### Backdrop split: render-pass structure + +The backdrop pipeline (frosted glass, refraction, mirror surfaces) is separated for a structural +reason unrelated to register pressure. Before any backdrop-sampling fragment can execute, the current +render target must be copied to a separate texture via `CopyGPUTextureToTexture` — a command-buffer- +level operation that requires ending the current render pass. This boundary exists regardless of +shader complexity and cannot be optimized away. + +The backdrop pipeline's individual shader passes (downsample, separable blur, composite) are +register-light (~15–40 regs each), so merging them into the effects pipeline would cause no occupancy +problem. But the render-pass boundary makes merging structurally impossible — effects draws happen +inside the main render pass, backdrop draws happen inside their own bracketed pass sequence. + +#### Why not per-primitive-type pipelines (GPUI's approach) + +Zed's GPUI uses 7 separate shader pairs: quad, shadow, underline, monochrome sprite, polychrome sprite, path, surface. This eliminates all branching and gives each shader minimal register usage. Three concrete costs make this approach wrong for our use case: @@ -151,7 +159,7 @@ typical UI frame with 15 scissors and 3–4 primitive kinds per scissor, per-kin ~45–60 draw calls and pipeline binds; our unified approach produces ~15–20 draw calls and 1–5 pipeline binds. At ~5μs each for CPU-side command encoding on modern APIs, per-kind splitting adds 375–500μs of CPU overhead per frame — **4.5–6% of an 8.3ms (120 FPS) budget** — with no -compensating GPU-side benefit, because the register-pressure savings within the simple-SDF tier are +compensating GPU-side benefit, because the register-pressure savings within the simple-SDF range are negligible (all members cluster at 12–22 registers). 
**Z-order preservation forces the API to expose layers.** With a single pipeline drawing all kinds @@ -190,8 +198,8 @@ in submission order: ~60 boundary warps at ~80 extra instructions each), unified divergence costs ~13μs — still 3.5× cheaper than the pipeline-switching alternative. -The split we _do_ perform (main / effects / backdrop-effects) is motivated by register-pressure tier -boundaries where occupancy drops are significant at 4K (see numbers above). Within a tier, unified is +The split we _do_ perform (main / effects / backdrop) is motivated by register-pressure boundaries +and structural render-pass requirements (see analysis above). Within a pipeline, unified is strictly better by every measure: fewer draw calls, simpler Z-order, lower CPU overhead, and negligible GPU-side branching cost. @@ -483,25 +491,40 @@ Wallace's variant) and vger-rs. - Vello's implementation of blurred rounded rectangle as a gradient type: https://github.com/linebender/vello/pull/665 -### Backdrop-effects pipeline +### Backdrop pipeline -The backdrop-effects pipeline handles effects that sample the current render target as input: frosted -glass, refraction, mirror surfaces. It is structurally separated from the effects pipeline for two -reasons: +The backdrop pipeline handles effects that sample the current render target as input: frosted glass, +refraction, mirror surfaces. It is separated from the effects pipeline for a structural reason, not +register pressure. -1. **Render-state requirement.** Before any backdrop-sampling fragment can run, the current render - target must be copied to a separate texture via `CopyGPUTextureToTexture`. This is a command- - buffer-level operation that cannot happen mid-render-pass. The copy naturally creates a pipeline - boundary. +**Render-pass boundary.** Before any backdrop-sampling fragment can run, the current render target +must be copied to a separate texture via `CopyGPUTextureToTexture`. 
This is a command-buffer-level +operation that cannot happen mid-render-pass. The copy naturally creates a pipeline boundary that no +amount of shader optimization can eliminate — it is a fundamental requirement of sampling a surface +while also writing to it. -2. **Register pressure.** Backdrop-sampling shaders read from a texture with Gaussian kernel weights - (multiple texture fetches per fragment), pushing register usage to ~70–80. Including this in the - effects pipeline would reduce occupancy for all shadow/glow fragments from ~30% to ~20%, costing - measurable throughput on the common case. +**Multi-pass implementation.** Backdrop effects are implemented as separable multi-pass sequences +(downsample → horizontal blur → vertical blur → composite), following the standard approach used by +iOS `UIVisualEffectView`, Android `RenderEffect`, and Flutter's `BackdropFilter`. Each individual +pass has a low-to-medium register footprint (~15–40 registers), well within the main pipeline's +occupancy range. The multi-pass approach avoids the monolithic 70+ register shader that a single-pass +Gaussian blur would require, making backdrop effects viable on low-end mobile GPUs (including +Mali-G31 and VideoCore VI) where per-thread register limits are tight. -The backdrop-effects pipeline binds a secondary sampler pointing at the captured backdrop texture. When -no backdrop effects are present in a frame, this pipeline is never bound and the texture copy never -happens — zero cost. +**Bracketed execution.** All backdrop draws in a frame share a single bracketed region of the command +buffer: end the current render pass, copy the render target, execute all backdrop sub-passes, then +resume normal drawing. The entry/exit cost (texture copy + render-pass break) is paid once per frame +regardless of how many backdrop effects are visible. When no backdrop effects are present, the bracket +is never entered and the texture copy never happens — zero cost. 
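
The ordering of the bracket described above can be sketched as follows. Every name here is a hypothetical stand-in rather than the library's API, and each step only prints, so the sketch shows control flow and nothing about the real SDL GPU calls. Whether each sub-pass runs once over the whole region or once per backdrop draw is an assumption made for illustration.

```odin
package backdrop_bracket_sketch

import "core:fmt"

// Hypothetical record for one queued backdrop effect (not the draw package's type).
Backdrop_Draw :: struct {
	label: string,
}

// Hypothetical sketch of the bracketed region. Real code would end the render
// pass, issue the texture copy, and begin new passes; here each step prints.
run_backdrop_bracket :: proc(backdrop_draws: []Backdrop_Draw) {
	// No backdrop effects this frame: the bracket is never entered, so the
	// render-pass break and the texture copy are skipped entirely.
	if len(backdrop_draws) == 0 do return

	// Entry cost, paid once per frame regardless of how many effects follow.
	fmt.println("end main render pass")
	fmt.println("copy render target to backdrop texture")

	// Separable multi-pass sequence from the text.
	// Assumption: each sub-pass visits every queued backdrop draw.
	sub_passes := [?]string{"downsample", "horizontal blur", "vertical blur", "composite"}
	for sub_pass in sub_passes {
		for backdrop_draw in backdrop_draws {
			fmt.printf("%s: %s\n", sub_pass, backdrop_draw.label)
		}
	}

	// Exit: normal drawing continues in a fresh render pass.
	fmt.println("resume main render pass")
}

main :: proc() {
	draws := [?]Backdrop_Draw{{label = "sidebar glass"}, {label = "tooltip glass"}}
	run_backdrop_bracket(draws[:])

	// A frame with no backdrop effects prints nothing.
	run_backdrop_bracket(nil)
}
```

The early return is what makes the zero-cost claim hold: the copy and the render-pass break sit behind a single check instead of being repeated per effect.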
+ +**Why not split the backdrop sub-passes into separate pipelines?** The individual passes range from +~15 to ~40 registers, which does cross Mali's 32-register cliff. However, the register-pressure argument +that justifies the main/effects split does not apply here. The main/effects split protects the +_common path_ (90%+ of frame fragments) from the uncommon path's register cost. Inside the backdrop +pipeline there is no common-vs-uncommon distinction — if backdrop effects are active, every sub-pass +runs; if not, none run. The backdrop pipeline either executes as a complete unit or not at all. +Additionally, backdrop effects cover a small fraction of the frame's total fragments (~5% at typical +UI scales), so the occupancy variation within the bracket has negligible impact on frame time. ### Vertex layout @@ -524,19 +547,21 @@ The `Primitive` struct for SDF shapes lives in the storage buffer, not in vertex ``` Primitive :: struct { - kind: Shape_Kind, // 0: enum u8 - flags: Shape_Flags, // 1: bit_set[Shape_Flag; u8] - _pad: u16, // 2: reserved - bounds: [4]f32, // 4: min_x, min_y, max_x, max_y - color: Color, // 20: u8x4 - _pad2: [3]u8, // 24: alignment - params: Shape_Params, // 28: raw union, 32 bytes + bounds: [4]f32, // 0: min_x, min_y, max_x, max_y + color: Color, // 16: u8x4, unpacked in shader via unpackUnorm4x8 + kind_flags: u32, // 20: (kind as u32) | (flags as u32 << 8) + rotation: f32, // 24: shader self-rotation in radians + _pad: f32, // 28: alignment + params: Shape_Params, // 32: raw union, 32 bytes (two vec4s of shape-specific data) + uv_rect: [4]f32, // 64: texture UV sub-region (u_min, v_min, u_max, v_max) } -// Total: 60 bytes (padded to 64 for GPU alignment) +// Total: 80 bytes (std430 aligned) ``` `Shape_Params` is a `#raw_union` with named variants per shape kind (`rrect`, `circle`, `segment`, -etc.), ensuring type safety on the CPU side and zero-cost reinterpretation on the GPU side. 
+etc.), ensuring type safety on the CPU side and zero-cost reinterpretation on the GPU side. The +`uv_rect` field is used by textured SDF primitives (Shape_Flag.Textured); non-textured primitives +leave it zeroed. ### Draw submission order @@ -547,7 +572,7 @@ Within each scissor region, draws are issued in submission order to preserve the 2. Bind **main pipeline, tessellated mode** → draw all queued tessellated vertices (non-indexed for shapes, indexed for text). Pipeline state unchanged from today. 3. Bind **main pipeline, SDF mode** → draw all queued SDF primitives (instanced, one draw call). -4. If backdrop effects are present: copy render target, bind **backdrop-effects pipeline** → draw +4. If backdrop effects are present: copy render target, bind **backdrop pipeline** → draw backdrop primitives. The exact ordering within a scissor may be refined based on actual Z-ordering requirements. The key @@ -647,7 +672,7 @@ register-pressure analysis from the pipeline-strategy section above shows this i so both run at 100% occupancy. The remaining ALU difference (~15 extra instructions for the SDF evaluation) amounts to ~20μs at 4K — below noise. Meanwhile, splitting into a separate pipeline would add ~1–5μs per pipeline bind on the CPU side per scissor, matching or exceeding the GPU-side -savings. Within the main tier, unified remains strictly better. +savings. Within the main pipeline, unified remains strictly better. The naming convention follows the existing shape API: `rectangle_texture` and `rectangle_texture_corners` sit alongside `rectangle` and `rectangle_corners`, mirroring the @@ -725,6 +750,35 @@ The following are plumbed in the descriptor but not implemented in phase 1: expectation is that 3D rendering will use dedicated pipelines (separate from the 2D pipelines) sharing GPU resources (textures, samplers, command buffer lifecycle) with the 2D renderer. +## Multi-window support + +The renderer currently assumes a single window via the global `GLOB` state. 
Multi-window support is +deferred but anticipated. When revisited, the RAD Debugger's bucket + pass-list model +(`src/draw/draw.h`, `src/draw/draw.c` in EpicGamesExt/raddebugger) is worth studying as a reference. + +RAD separates draw submission from rendering via **buckets**. A `DR_Bucket` is an explicit handle +that accumulates an ordered list of render passes (`R_PassList`). The user creates a bucket, pushes +it onto a thread-local stack, issues draw calls (which target the top-of-stack bucket), then submits +the bucket to a specific window. Multiple buckets can exist simultaneously — one per window, or one +per UI panel that gets composited into a parent bucket via `dr_sub_bucket`. Implicit draw parameters +(clip rect, 2D transform, sampler mode, transparency) are managed via push/pop stacks scoped to each +bucket, so different windows can have independent clip and transform state without interference. + +The key properties this gives RAD: + +- **Per-window isolation.** Each window builds its own bucket with its own pass list and state stacks. + No global contention. +- **Thread-parallel building.** Each thread has its own draw context and arena. Multiple threads can + build buckets concurrently, then submit them to the render backend sequentially. +- **Compositing.** A pre-built bucket (e.g., a tooltip or overlay) can be injected into another + bucket with a transform applied, without rebuilding its draw calls. + +For our library, the likely adaptation would be replacing the single `GLOB` with a per-window draw +context that users create and pass to `begin`/`end`, while keeping the explicit-parameter draw call +style rather than adopting RAD's implicit state stacks. Texture and sampler resources would remain +global (shared across windows), with only the per-frame staging buffers and layer/scissor state +becoming per-context. + ## Building shaders GLSL shader sources live in `shaders/source/`. 
Compiled outputs (SPIR-V and Metal Shading Language) diff --git a/draw/draw.odin b/draw/draw.odin index 0cb0f82..0fb4934 100644 --- a/draw/draw.odin +++ b/draw/draw.odin @@ -63,15 +63,17 @@ Rectangle :: struct { } Sub_Batch_Kind :: enum u8 { - Shapes, // non-indexed, white texture, mode 0 + Shapes, // non-indexed, white texture or user texture, mode 0 Text, // indexed, atlas texture, mode 0 - SDF, // instanced unit quad, white texture, mode 1 + SDF, // instanced unit quad, white texture or user texture, mode 1 } Sub_Batch :: struct { - kind: Sub_Batch_Kind, - offset: u32, // Shapes: vertex offset; Text: text_batch index; SDF: primitive index - count: u32, // Shapes: vertex count; Text: always 1; SDF: primitive count + kind: Sub_Batch_Kind, + offset: u32, // Shapes: vertex offset; Text: text_batch index; SDF: primitive index + count: u32, // Shapes: vertex count; Text: always 1; SDF: primitive count + texture_id: Texture_Id, + sampler: Sampler_Preset, } Layer :: struct { @@ -95,35 +97,60 @@ Scissor :: struct { GLOB: Global Global :: struct { - odin_context: runtime.Context, - pipeline_2d_base: Pipeline_2D_Base, - text_cache: Text_Cache, - layers: [dynamic]Layer, - scissors: [dynamic]Scissor, - tmp_shape_verts: [dynamic]Vertex, - tmp_text_verts: [dynamic]Vertex, - tmp_text_indices: [dynamic]c.int, - tmp_text_batches: [dynamic]TextBatch, - tmp_primitives: [dynamic]Primitive, - tmp_sub_batches: [dynamic]Sub_Batch, - tmp_uncached_text: [dynamic]^sdl_ttf.Text, // Uncached TTF_Text objects to destroy after end() - clay_memory: [^]u8, - msaa_texture: ^sdl.GPUTexture, - curr_layer_index: uint, - max_layers: int, - max_scissors: int, - max_shape_verts: int, - max_text_verts: int, - max_text_indices: int, - max_text_batches: int, - max_primitives: int, - max_sub_batches: int, - dpi_scaling: f32, - msaa_width: u32, - msaa_height: u32, - sample_count: sdl.GPUSampleCount, - clay_z_index: i16, - cleared: bool, + // -- Per-frame staging (hottest — touched by every 
prepare/upload/clear cycle) -- + tmp_shape_verts: [dynamic]Vertex, // Tessellated shape vertices staged for GPU upload. + tmp_text_verts: [dynamic]Vertex, // Text vertices staged for GPU upload. + tmp_text_indices: [dynamic]c.int, // Text index buffer staged for GPU upload. + tmp_text_batches: [dynamic]TextBatch, // Text atlas batch metadata for indexed drawing. + tmp_primitives: [dynamic]Primitive, // SDF primitives staged for GPU storage buffer upload. + tmp_sub_batches: [dynamic]Sub_Batch, // Sub-batch records that drive draw call dispatch. + tmp_uncached_text: [dynamic]^sdl_ttf.Text, // Uncached TTF_Text objects destroyed after end() submits. + layers: [dynamic]Layer, // Draw layers, each with its own scissor stack. + scissors: [dynamic]Scissor, // Scissor rects that clip drawing within each layer. + + // -- Per-frame scalars (accessed during prepare and draw_layer) -- + curr_layer_index: uint, // Index of the currently active layer. + dpi_scaling: f32, // Window DPI scale factor applied to all pixel coordinates. + clay_z_index: i16, // Tracks z-index for layer splitting during Clay batch processing. + cleared: bool, // Whether the render target has been cleared this frame. + + // -- Pipeline (accessed every draw_layer call) -- + pipeline_2d_base: Pipeline_2D_Base, // The unified 2D GPU pipeline (shaders, buffers, samplers). + device: ^sdl.GPUDevice, // GPU device handle, stored at init. + samplers: [SAMPLER_PRESET_COUNT]^sdl.GPUSampler, // Lazily-created sampler objects, one per Sampler_Preset. + + // -- Deferred release (processed once per frame at frame boundary) -- + pending_texture_releases: [dynamic]Texture_Id, // Deferred GPU texture releases, processed next frame. + pending_text_releases: [dynamic]^sdl_ttf.Text, // Deferred TTF_Text destroys, processed next frame. + + // -- Textures (registration is occasional, binding is per draw call) -- + texture_slots: [dynamic]Texture_Slot, // Registered texture slots indexed by Texture_Id. 
+ texture_free_list: [dynamic]u32, // Recycled slot indices available for reuse. + + // -- MSAA (once per frame in end()) -- + msaa_texture: ^sdl.GPUTexture, // Intermediate render target for multi-sample resolve. + msaa_width: u32, // Cached width to detect when MSAA texture needs recreation. + msaa_height: u32, // Cached height to detect when MSAA texture needs recreation. + sample_count: sdl.GPUSampleCount, // Sample count chosen at init (._1 means MSAA disabled). + + // -- Clay (once per frame in prepare_clay_batch) -- + clay_memory: [^]u8, // Raw memory block backing Clay's internal arena. + + // -- Text (occasional — font registration and text cache lookups) -- + text_cache: Text_Cache, // Font registry, SDL_ttf engine, and cached TTF_Text objects. + + // -- Resize tracking (cold — checked once per frame in resize_global) -- + max_layers: int, // High-water marks for dynamic array shrink heuristic. + max_scissors: int, + max_shape_verts: int, + max_text_verts: int, + max_text_indices: int, + max_text_batches: int, + max_primitives: int, + max_sub_batches: int, + + // -- Init-only (coldest — set once at init, never written again) -- + odin_context: runtime.Context, // Odin context captured at init for use in callbacks. 
} Init_Options :: struct { @@ -168,22 +195,30 @@ init :: proc( } GLOB = Global { - layers = make([dynamic]Layer, 0, INITIAL_LAYER_SIZE, allocator = allocator), - scissors = make([dynamic]Scissor, 0, INITIAL_SCISSOR_SIZE, allocator = allocator), - tmp_shape_verts = make([dynamic]Vertex, 0, BUFFER_INIT_SIZE, allocator = allocator), - tmp_text_verts = make([dynamic]Vertex, 0, BUFFER_INIT_SIZE, allocator = allocator), - tmp_text_indices = make([dynamic]c.int, 0, BUFFER_INIT_SIZE, allocator = allocator), - tmp_text_batches = make([dynamic]TextBatch, 0, BUFFER_INIT_SIZE, allocator = allocator), - tmp_primitives = make([dynamic]Primitive, 0, BUFFER_INIT_SIZE, allocator = allocator), - tmp_sub_batches = make([dynamic]Sub_Batch, 0, BUFFER_INIT_SIZE, allocator = allocator), - tmp_uncached_text = make([dynamic]^sdl_ttf.Text, 0, 16, allocator = allocator), - odin_context = odin_context, - dpi_scaling = sdl.GetWindowDisplayScale(window), - clay_memory = make([^]u8, min_memory_size, allocator = allocator), - sample_count = resolved_sample_count, - pipeline_2d_base = pipeline, - text_cache = text_cache, + layers = make([dynamic]Layer, 0, INITIAL_LAYER_SIZE, allocator = allocator), + scissors = make([dynamic]Scissor, 0, INITIAL_SCISSOR_SIZE, allocator = allocator), + tmp_shape_verts = make([dynamic]Vertex, 0, BUFFER_INIT_SIZE, allocator = allocator), + tmp_text_verts = make([dynamic]Vertex, 0, BUFFER_INIT_SIZE, allocator = allocator), + tmp_text_indices = make([dynamic]c.int, 0, BUFFER_INIT_SIZE, allocator = allocator), + tmp_text_batches = make([dynamic]TextBatch, 0, BUFFER_INIT_SIZE, allocator = allocator), + tmp_primitives = make([dynamic]Primitive, 0, BUFFER_INIT_SIZE, allocator = allocator), + tmp_sub_batches = make([dynamic]Sub_Batch, 0, BUFFER_INIT_SIZE, allocator = allocator), + tmp_uncached_text = make([dynamic]^sdl_ttf.Text, 0, 16, allocator = allocator), + device = device, + texture_slots = make([dynamic]Texture_Slot, 0, 16, allocator = allocator), + texture_free_list = 
make([dynamic]u32, 0, 16, allocator = allocator), + pending_texture_releases = make([dynamic]Texture_Id, 0, 16, allocator = allocator), + pending_text_releases = make([dynamic]^sdl_ttf.Text, 0, 16, allocator = allocator), + odin_context = odin_context, + dpi_scaling = sdl.GetWindowDisplayScale(window), + clay_memory = make([^]u8, min_memory_size, allocator = allocator), + sample_count = resolved_sample_count, + pipeline_2d_base = pipeline, + text_cache = text_cache, } + + // Reserve slot 0 for INVALID_TEXTURE + append(&GLOB.texture_slots, Texture_Slot{}) log.debug("Window DPI scaling:", GLOB.dpi_scaling) arena := clay.CreateArenaWithCapacityAndMemory(min_memory_size, GLOB.clay_memory) window_width, window_height: c.int @@ -230,12 +265,23 @@ destroy :: proc(device: ^sdl.GPUDevice, allocator := context.allocator) { if GLOB.msaa_texture != nil { sdl.ReleaseGPUTexture(device, GLOB.msaa_texture) } + process_pending_texture_releases() + destroy_all_textures() + destroy_sampler_pool() + for ttf_text in GLOB.pending_text_releases do sdl_ttf.DestroyText(ttf_text) + delete(GLOB.pending_text_releases) destroy_pipeline_2d_base(device, &GLOB.pipeline_2d_base) destroy_text_cache() } // Internal clear_global :: proc() { + // Process deferred texture releases from the previous frame + process_pending_texture_releases() + // Process deferred text releases from the previous frame + for ttf_text in GLOB.pending_text_releases do sdl_ttf.DestroyText(ttf_text) + clear(&GLOB.pending_text_releases) + GLOB.curr_layer_index = 0 GLOB.clay_z_index = 0 GLOB.cleared = false @@ -455,15 +501,24 @@ append_or_extend_sub_batch :: proc( kind: Sub_Batch_Kind, offset: u32, count: u32, + texture_id: Texture_Id = INVALID_TEXTURE, + sampler: Sampler_Preset = .Linear_Clamp, ) { if scissor.sub_batch_len > 0 { last := &GLOB.tmp_sub_batches[scissor.sub_batch_start + scissor.sub_batch_len - 1] - if last.kind == kind && kind != .Text && last.offset + last.count == offset { + if last.kind == kind && + kind != 
.Text && + last.offset + last.count == offset && + last.texture_id == texture_id && + last.sampler == sampler { last.count += count return } } - append(&GLOB.tmp_sub_batches, Sub_Batch{kind = kind, offset = offset, count = count}) + append( + &GLOB.tmp_sub_batches, + Sub_Batch{kind = kind, offset = offset, count = count, texture_id = texture_id, sampler = sampler}, + ) scissor.sub_batch_len += 1 layer.sub_batch_len += 1 } @@ -554,6 +609,46 @@ prepare_clay_batch :: proc( ) prepare_text(layer, Text{sdl_text, {bounds.x, bounds.y}, color_from_clay(render_data.textColor)}) case clay.RenderCommandType.Image: + render_data := render_command.renderData.image + if render_data.imageData == nil do continue + img_data := (^Clay_Image_Data)(render_data.imageData)^ + cr := render_data.cornerRadius + radii := [4]f32{cr.topLeft, cr.topRight, cr.bottomRight, cr.bottomLeft} + + // Background color behind the image (Clay allows it) + bg := color_from_clay(render_data.backgroundColor) + if bg[3] > 0 { + if radii == {0, 0, 0, 0} { + rectangle(layer, bounds, bg) + } else { + rectangle_corners(layer, bounds, radii, bg) + } + } + + // Compute fit UVs + uv, sampler, inner := fit_params(img_data.fit, bounds, img_data.texture_id) + + // Draw the image — route by cornerRadius + if radii == {0, 0, 0, 0} { + rectangle_texture( + layer, + inner, + img_data.texture_id, + tint = img_data.tint, + uv_rect = uv, + sampler = sampler, + ) + } else { + rectangle_texture_corners( + layer, + inner, + radii, + img_data.texture_id, + tint = img_data.tint, + uv_rect = uv, + sampler = sampler, + ) + } case clay.RenderCommandType.ScissorStart: if bounds.width == 0 || bounds.height == 0 do continue diff --git a/draw/draw_qr/draw_qr.odin b/draw/draw_qr/draw_qr.odin new file mode 100644 index 0000000..9fb3a0f --- /dev/null +++ b/draw/draw_qr/draw_qr.odin @@ -0,0 +1,78 @@ +package draw_qr + +import draw ".." +import "../../qrcode" + +// A registered QR code texture, ready for display via draw.rectangle_texture. 
+QR :: struct { + texture_id: draw.Texture_Id, + size: int, // modules per side (e.g. 21..177) +} + +// Encode text as a QR code and register the result as an R8 texture. +// The texture uses Nearest_Clamp sampling by default (sharp module edges). +// Returns ok=false if encoding or registration fails. +@(require_results) +create_from_text :: proc( + text: string, + ecl: qrcode.Ecc = .Low, + min_version: int = qrcode.VERSION_MIN, + max_version: int = qrcode.VERSION_MAX, + mask: Maybe(qrcode.Mask) = nil, + boost_ecl: bool = true, +) -> ( + qr: QR, + ok: bool, +) { + qrcode_buf: [qrcode.BUFFER_LEN_MAX]u8 + encode_ok := qrcode.encode(text, qrcode_buf[:], ecl, min_version, max_version, mask, boost_ecl) + if !encode_ok do return {}, false + return create(qrcode_buf[:]) +} + +// Register an already-encoded QR code buffer as an R8 texture. +// qrcode_buf must be the output of qrcode.encode (byte 0 = side length, remaining = bit-packed modules). +@(require_results) +create :: proc(qrcode_buf: []u8) -> (qr: QR, ok: bool) { + size := qrcode.get_size(qrcode_buf) + if size == 0 do return {}, false + + // Build R8 pixel buffer: 0 = light, 255 = dark + pixels := make([]u8, size * size, context.temp_allocator) + for y in 0 ..< size { + for x in 0 ..< size { + pixels[y * size + x] = 255 if qrcode.get_module(qrcode_buf, x, y) else 0 + } + } + + id, reg_ok := draw.register_texture( + draw.Texture_Desc { + width = u32(size), + height = u32(size), + depth_or_layers = 1, + type = .D2, + format = .R8_UNORM, + usage = {.SAMPLER}, + mip_levels = 1, + kind = .Static, + }, + pixels, + ) + if !reg_ok do return {}, false + + return QR{texture_id = id, size = size}, true +} + +// Release the GPU texture. +destroy :: proc(qr: ^QR) { + draw.unregister_texture(qr.texture_id) + qr.texture_id = draw.INVALID_TEXTURE + qr.size = 0 +} + +// Convenience: build a Clay_Image_Data for embedding a QR in Clay layouts. 
+// Uses Nearest_Clamp sampling (set via Sampler_Preset at draw time, not here) and Fit mode +// to preserve the QR's square aspect ratio. +clay_image :: proc(qr: QR, tint: draw.Color = draw.WHITE) -> draw.Clay_Image_Data { + return draw.clay_image_data(qr.texture_id, fit = .Fit, tint = tint) +} diff --git a/draw/examples/hellope.odin b/draw/examples/hellope.odin index 08026da..eb945bd 100644 --- a/draw/examples/hellope.odin +++ b/draw/examples/hellope.odin @@ -78,10 +78,11 @@ hellope_shapes :: proc() { draw.ellipse(base_layer, {410, 340}, 50, 30, {255, 200, 50, 255}, rotation = spin_angle) // Circle orbiting a point (moon orbiting planet) + // Convention B: center = pivot point (planet), origin = offset from moon center to pivot. + // Moon's visual center at rotation=0: planet_pos - origin = (100, 450) - (0, 40) = (100, 410). planet_pos := [2]f32{100, 450} - moon_pos := planet_pos + {0, -40} draw.circle(base_layer, planet_pos, 8, {200, 200, 200, 255}) // planet (stationary) - draw.circle(base_layer, moon_pos, 5, {100, 150, 255, 255}, origin = {0, 40}, rotation = spin_angle) // moon orbiting + draw.circle(base_layer, planet_pos, 5, {100, 150, 255, 255}, origin = {0, 40}, rotation = spin_angle) // moon orbiting // Ring arc rotating in place draw.ring(base_layer, {250, 450}, 15, 30, 0, 270, {100, 100, 220, 255}, rotation = spin_angle) diff --git a/draw/examples/main.odin b/draw/examples/main.odin index f8107eb..e3ee109 100644 --- a/draw/examples/main.odin +++ b/draw/examples/main.odin @@ -57,7 +57,7 @@ main :: proc() { args := os.args if len(args) < 2 { fmt.eprintln("Usage: examples ") - fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom") + fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom, textures") os.exit(1) } @@ -66,9 +66,10 @@ main :: proc() { case "hellope-custom": hellope_custom() case "hellope-shapes": hellope_shapes() case "hellope-text": hellope_text() + case "textures": 
textures() case: fmt.eprintf("Unknown example: %v\n", args[1]) - fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom") + fmt.eprintln("Available examples: hellope-shapes, hellope-text, hellope-clay, hellope-custom, textures") os.exit(1) } } diff --git a/draw/examples/textures.odin b/draw/examples/textures.odin new file mode 100644 index 0000000..ca53ba3 --- /dev/null +++ b/draw/examples/textures.odin @@ -0,0 +1,285 @@ +package examples + +import "../../draw" +import "../../draw/draw_qr" +import "core:math" +import "core:os" +import sdl "vendor:sdl3" + +textures :: proc() { + if !sdl.Init({.VIDEO}) do os.exit(1) + window := sdl.CreateWindow("Textures", 800, 600, {.HIGH_PIXEL_DENSITY}) + gpu := sdl.CreateGPUDevice(draw.PLATFORM_SHADER_FORMAT, true, nil) + if !sdl.ClaimWindowForGPUDevice(gpu, window) do os.exit(1) + if !draw.init(gpu, window) do os.exit(1) + JETBRAINS_MONO_REGULAR = draw.register_font(JETBRAINS_MONO_REGULAR_RAW) + + FONT_SIZE :: u16(14) + LABEL_OFFSET :: f32(8) // gap between item and its label + + // ------------------------------------------------------------------------- + // Procedural checkerboard texture (8x8, RGBA8) + // ------------------------------------------------------------------------- + checker_size :: 8 + checker_pixels: [checker_size * checker_size * 4]u8 + for y in 0 ..< checker_size { + for x in 0 ..< checker_size { + i := (y * checker_size + x) * 4 + is_dark := ((x + y) % 2) == 0 + val: u8 = 40 if is_dark else 220 + checker_pixels[i + 0] = val // R + checker_pixels[i + 1] = val / 2 // G — slight color tint + checker_pixels[i + 2] = val // B + checker_pixels[i + 3] = 255 // A + } + } + checker_texture, _ := draw.register_texture( + draw.Texture_Desc { + width = checker_size, + height = checker_size, + depth_or_layers = 1, + type = .D2, + format = .R8G8B8A8_UNORM, + usage = {.SAMPLER}, + mip_levels = 1, + }, + checker_pixels[:], + ) + defer draw.unregister_texture(checker_texture) + + // 
------------------------------------------------------------------------- + // Non-square gradient stripe texture (16x8, RGBA8) for fit mode demos + // ------------------------------------------------------------------------- + stripe_w :: 16 + stripe_h :: 8 + stripe_pixels: [stripe_w * stripe_h * 4]u8 + for y in 0 ..< stripe_h { + for x in 0 ..< stripe_w { + i := (y * stripe_w + x) * 4 + stripe_pixels[i + 0] = u8(x * 255 / (stripe_w - 1)) // R gradient left→right + stripe_pixels[i + 1] = u8(y * 255 / (stripe_h - 1)) // G gradient top→bottom + stripe_pixels[i + 2] = 128 // B constant + stripe_pixels[i + 3] = 255 // A + } + } + stripe_texture, _ := draw.register_texture( + draw.Texture_Desc { + width = stripe_w, + height = stripe_h, + depth_or_layers = 1, + type = .D2, + format = .R8G8B8A8_UNORM, + usage = {.SAMPLER}, + mip_levels = 1, + }, + stripe_pixels[:], + ) + defer draw.unregister_texture(stripe_texture) + + // ------------------------------------------------------------------------- + // QR code texture (R8_UNORM — see rendering note below) + // ------------------------------------------------------------------------- + qr, _ := draw_qr.create_from_text("https://odin-lang.org/") + defer draw_qr.destroy(&qr) + + spin_angle: f32 = 0 + + for { + defer free_all(context.temp_allocator) + ev: sdl.Event + for sdl.PollEvent(&ev) { + if ev.type == .QUIT do return + } + spin_angle += 1 + + base_layer := draw.begin({width = 800, height = 600}) + + // Background + draw.rectangle(base_layer, {0, 0, 800, 600}, {30, 30, 30, 255}) + + // ===================================================================== + // Row 1: Sampler presets (y=30) + // ===================================================================== + ROW1_Y :: f32(30) + ITEM_SIZE :: f32(120) + COL1 :: f32(30) + COL2 :: f32(180) + COL3 :: f32(330) + COL4 :: f32(480) + + // Nearest (sharp pixel edges) + draw.rectangle_texture( + base_layer, + {COL1, ROW1_Y, ITEM_SIZE, ITEM_SIZE}, + checker_texture, + sampler = 
.Nearest_Clamp, + ) + draw.text( + base_layer, + "Nearest", + {COL1, ROW1_Y + ITEM_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // Linear (bilinear blur) + draw.rectangle_texture( + base_layer, + {COL2, ROW1_Y, ITEM_SIZE, ITEM_SIZE}, + checker_texture, + sampler = .Linear_Clamp, + ) + draw.text( + base_layer, + "Linear", + {COL2, ROW1_Y + ITEM_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // Tiled (4x repeat) + draw.rectangle_texture( + base_layer, + {COL3, ROW1_Y, ITEM_SIZE, ITEM_SIZE}, + checker_texture, + sampler = .Nearest_Repeat, + uv_rect = {0, 0, 4, 4}, + ) + draw.text( + base_layer, + "Tiled 4x", + {COL3, ROW1_Y + ITEM_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // ===================================================================== + // Row 2: QR code, Rounded, Rotating (y=190) + // ===================================================================== + ROW2_Y :: f32(190) + + // QR code (R8_UNORM texture, nearest sampling) + // NOTE: R8_UNORM samples as (r, 0, 0, 1) in Metal's default swizzle. + // With WHITE tint: dark modules (R=1) → red, light modules (R=0) → black. + // The result is a red-on-black QR code. The white bg rect below is + // occluded by the fully-opaque texture but kept for illustration. 
+ draw.rectangle(base_layer, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, {255, 255, 255, 255}) // white bg + draw.rectangle_texture( + base_layer, + {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, + qr.texture_id, + sampler = .Nearest_Clamp, + ) + draw.text( + base_layer, + "QR Code", + {COL1, ROW2_Y + ITEM_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // Rounded corners + draw.rectangle_texture( + base_layer, + {COL2, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, + checker_texture, + sampler = .Nearest_Clamp, + roundness = 0.3, + ) + draw.text( + base_layer, + "Rounded", + {COL2, ROW2_Y + ITEM_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // Rotating + rot_rect := draw.Rectangle{COL3, ROW2_Y, ITEM_SIZE, ITEM_SIZE} + draw.rectangle_texture( + base_layer, + rot_rect, + checker_texture, + sampler = .Nearest_Clamp, + origin = draw.center_of(rot_rect), + rotation = spin_angle, + ) + draw.text( + base_layer, + "Rotating", + {COL3, ROW2_Y + ITEM_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // ===================================================================== + // Row 3: Fit modes + Per-corner radii (y=360) + // ===================================================================== + ROW3_Y :: f32(360) + FIT_SIZE :: f32(120) // square target rect + + // Stretch + uv_s, sampler_s, inner_s := draw.fit_params(.Stretch, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture) + draw.rectangle(base_layer, {COL1, ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255}) // bg + draw.rectangle_texture(base_layer, inner_s, stripe_texture, uv_rect = uv_s, sampler = sampler_s) + draw.text( + base_layer, + "Stretch", + {COL1, ROW3_Y + FIT_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // Fill (center-crop) + uv_f, sampler_f, inner_f := draw.fit_params(.Fill, {COL2, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture) + draw.rectangle(base_layer, {COL2, 
ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255}) + draw.rectangle_texture(base_layer, inner_f, stripe_texture, uv_rect = uv_f, sampler = sampler_f) + draw.text( + base_layer, + "Fill", + {COL2, ROW3_Y + FIT_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // Fit (letterbox) + uv_ft, sampler_ft, inner_ft := draw.fit_params(.Fit, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, stripe_texture) + draw.rectangle(base_layer, {COL3, ROW3_Y, FIT_SIZE, FIT_SIZE}, {60, 60, 60, 255}) // visible margin bg + draw.rectangle_texture(base_layer, inner_ft, stripe_texture, uv_rect = uv_ft, sampler = sampler_ft) + draw.text( + base_layer, + "Fit", + {COL3, ROW3_Y + FIT_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + // Per-corner radii + draw.rectangle_texture_corners( + base_layer, + {COL4, ROW3_Y, FIT_SIZE, FIT_SIZE}, + {20, 0, 20, 0}, + checker_texture, + sampler = .Nearest_Clamp, + ) + draw.text( + base_layer, + "Per-corner", + {COL4, ROW3_Y + FIT_SIZE + LABEL_OFFSET}, + JETBRAINS_MONO_REGULAR, + FONT_SIZE, + color = draw.WHITE, + ) + + draw.end(gpu, window) + } +} diff --git a/draw/pipeline_2d_base.odin b/draw/pipeline_2d_base.odin index 7b27ca2..a69facb 100644 --- a/draw/pipeline_2d_base.odin +++ b/draw/pipeline_2d_base.odin @@ -35,6 +35,7 @@ Shape_Kind :: enum u8 { Shape_Flag :: enum u8 { Stroke, + Textured, } Shape_Flags :: bit_set[Shape_Flag;u8] @@ -106,9 +107,10 @@ Primitive :: struct { rotation: f32, // 24: shader self-rotation in radians (used by RRect, Ellipse) _pad: f32, // 28: alignment to vec4 boundary params: Shape_Params, // 32: two vec4s of shape params + uv_rect: [4]f32, // 64: u_min, v_min, u_max, v_max (default {0,0,1,1}) } -#assert(size_of(Primitive) == 64) +#assert(size_of(Primitive) == 80) pack_kind_flags :: #force_inline proc(kind: Shape_Kind, flags: Shape_Flags) -> u32 { return u32(kind) | (u32(transmute(u8)flags) << 8) @@ -566,6 +568,7 @@ draw_layer :: proc( current_mode: Draw_Mode = 
.Tessellated current_vert_buf := main_vert_buf current_atlas: ^sdl.GPUTexture + current_sampler := sampler // Text vertices live after shape vertices in the GPU vertex buffer text_vertex_gpu_base := u32(len(GLOB.tmp_shape_verts)) @@ -584,14 +587,24 @@ draw_layer :: proc( sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = main_vert_buf, offset = 0}, 1) current_vert_buf = main_vert_buf } - if current_atlas != white_texture { + // Determine texture and sampler for this batch + batch_texture: ^sdl.GPUTexture = white_texture + batch_sampler: ^sdl.GPUSampler = sampler + if batch.texture_id != INVALID_TEXTURE { + if bound_texture := texture_gpu_handle(batch.texture_id); bound_texture != nil { + batch_texture = bound_texture + } + batch_sampler = get_sampler(batch.sampler) + } + if current_atlas != batch_texture || current_sampler != batch_sampler { sdl.BindGPUFragmentSamplers( render_pass, 0, - &sdl.GPUTextureSamplerBinding{texture = white_texture, sampler = sampler}, + &sdl.GPUTextureSamplerBinding{texture = batch_texture, sampler = batch_sampler}, 1, ) - current_atlas = white_texture + current_atlas = batch_texture + current_sampler = batch_sampler } sdl.DrawGPUPrimitives(render_pass, batch.count, 1, batch.offset, 0) @@ -632,14 +645,24 @@ draw_layer :: proc( sdl.BindGPUVertexBuffers(render_pass, 0, &sdl.GPUBufferBinding{buffer = unit_quad, offset = 0}, 1) current_vert_buf = unit_quad } - if current_atlas != white_texture { + // Determine texture and sampler for this batch + batch_texture: ^sdl.GPUTexture = white_texture + batch_sampler: ^sdl.GPUSampler = sampler + if batch.texture_id != INVALID_TEXTURE { + if bound_texture := texture_gpu_handle(batch.texture_id); bound_texture != nil { + batch_texture = bound_texture + } + batch_sampler = get_sampler(batch.sampler) + } + if current_atlas != batch_texture || current_sampler != batch_sampler { sdl.BindGPUFragmentSamplers( render_pass, 0, - &sdl.GPUTextureSamplerBinding{texture = white_texture, sampler 
= sampler}, + &sdl.GPUTextureSamplerBinding{texture = batch_texture, sampler = batch_sampler}, 1, ) - current_atlas = white_texture + current_atlas = batch_texture + current_sampler = batch_sampler } sdl.DrawGPUPrimitives(render_pass, 6, batch.count, 0, batch.offset) } diff --git a/draw/shaders/generated/base_2d.frag.metal b/draw/shaders/generated/base_2d.frag.metal index e03eb46..7a4b934 100644 --- a/draw/shaders/generated/base_2d.frag.metal +++ b/draw/shaders/generated/base_2d.frag.metal @@ -25,6 +25,7 @@ struct main0_in float4 f_params2 [[user(locn3)]]; uint f_kind_flags [[user(locn4)]]; float f_rotation [[user(locn5), flat]]; + float4 f_uv_rect [[user(locn6), flat]]; }; static inline __attribute__((always_inline)) @@ -69,6 +70,12 @@ float sdf_stroke(thread const float& d, thread const float& stroke_width) return abs(d) - (stroke_width * 0.5); } +static inline __attribute__((always_inline)) +float sdf_alpha(thread const float& d, thread const float& soft) +{ + return 1.0 - smoothstep(-soft, soft, d); +} + static inline __attribute__((always_inline)) float sdCircle(thread const float2& p, thread const float& r) { @@ -127,12 +134,6 @@ float sdSegment(thread const float2& p, thread const float2& a, thread const flo return length(pa - (ba * h)); } -static inline __attribute__((always_inline)) -float sdf_alpha(thread const float& d, thread const float& soft) -{ - return 1.0 - smoothstep(-soft, soft, d); -} - fragment main0_out main0(main0_in in [[stage_in]], texture2d tex [[texture(0)]], sampler texSmplr [[sampler(0)]]) { main0_out out = {}; @@ -169,6 +170,25 @@ fragment main0_out main0(main0_in in [[stage_in]], texture2d tex [[textur float param_6 = stroke_px; d = sdf_stroke(param_5, param_6); } + float4 shape_color = in.f_color; + if ((flags & 2u) != 0u) + { + float2 p_for_uv = in.f_local_or_uv; + if (in.f_rotation != 0.0) + { + float2 param_7 = p_for_uv; + float param_8 = in.f_rotation; + p_for_uv = apply_rotation(param_7, param_8); + } + float2 local_uv = 
((p_for_uv / b) * 0.5) + float2(0.5); + float2 uv = mix(in.f_uv_rect.xy, in.f_uv_rect.zw, local_uv); + shape_color *= tex.sample(texSmplr, uv); + } + float param_9 = d; + float param_10 = soft; + float alpha = sdf_alpha(param_9, param_10); + out.out_color = float4(shape_color.xyz, shape_color.w * alpha); + return out; } else { @@ -177,14 +197,14 @@ fragment main0_out main0(main0_in in [[stage_in]], texture2d tex [[textur float radius = in.f_params.x; soft = fast::max(in.f_params.y, 1.0); float stroke_px_1 = in.f_params.z; - float2 param_7 = in.f_local_or_uv; - float param_8 = radius; - d = sdCircle(param_7, param_8); + float2 param_11 = in.f_local_or_uv; + float param_12 = radius; + d = sdCircle(param_11, param_12); if ((flags & 1u) != 0u) { - float param_9 = d; - float param_10 = stroke_px_1; - d = sdf_stroke(param_9, param_10); + float param_13 = d; + float param_14 = stroke_px_1; + d = sdf_stroke(param_13, param_14); } } else @@ -197,19 +217,19 @@ fragment main0_out main0(main0_in in [[stage_in]], texture2d tex [[textur float2 p_local_1 = in.f_local_or_uv; if (in.f_rotation != 0.0) { - float2 param_11 = p_local_1; - float param_12 = in.f_rotation; - p_local_1 = apply_rotation(param_11, param_12); + float2 param_15 = p_local_1; + float param_16 = in.f_rotation; + p_local_1 = apply_rotation(param_15, param_16); } - float2 param_13 = p_local_1; - float2 param_14 = ab; - float _560 = sdEllipse(param_13, param_14); - d = _560; + float2 param_17 = p_local_1; + float2 param_18 = ab; + float _616 = sdEllipse(param_17, param_18); + d = _616; if ((flags & 1u) != 0u) { - float param_15 = d; - float param_16 = stroke_px_2; - d = sdf_stroke(param_15, param_16); + float param_19 = d; + float param_20 = stroke_px_2; + d = sdf_stroke(param_19, param_20); } } else @@ -220,10 +240,10 @@ fragment main0_out main0(main0_in in [[stage_in]], texture2d tex [[textur float2 b_1 = in.f_params.zw; float width = in.f_params2.x; soft = fast::max(in.f_params2.y, 1.0); - float2 param_17 = 
in.f_local_or_uv; - float2 param_18 = a; - float2 param_19 = b_1; - d = sdSegment(param_17, param_18, param_19) - (width * 0.5); + float2 param_21 = in.f_local_or_uv; + float2 param_22 = a; + float2 param_23 = b_1; + d = sdSegment(param_21, param_22, param_23) - (width * 0.5); } else { @@ -243,16 +263,16 @@ fragment main0_out main0(main0_in in [[stage_in]], texture2d tex [[textur } float ang_start = mod(start_rad, 6.283185482025146484375); float ang_end = mod(end_rad, 6.283185482025146484375); - float _654; + float _710; if (ang_end > ang_start) { - _654 = float((angle >= ang_start) && (angle <= ang_end)); + _710 = float((angle >= ang_start) && (angle <= ang_end)); } else { - _654 = float((angle >= ang_start) || (angle <= ang_end)); + _710 = float((angle >= ang_start) || (angle <= ang_end)); } - float in_arc = _654; + float in_arc = _710; if (abs(ang_end - ang_start) >= 6.282185077667236328125) { in_arc = 1.0; @@ -277,9 +297,9 @@ fragment main0_out main0(main0_in in [[stage_in]], texture2d tex [[textur d = (length(p) * cos(bn)) - radius_1; if ((flags & 1u) != 0u) { - float param_20 = d; - float param_21 = stroke_px_3; - d = sdf_stroke(param_20, param_21); + float param_24 = d; + float param_25 = stroke_px_3; + d = sdf_stroke(param_24, param_25); } } } @@ -287,10 +307,9 @@ fragment main0_out main0(main0_in in [[stage_in]], texture2d tex [[textur } } } - float param_22 = d; - float param_23 = soft; - float alpha = sdf_alpha(param_22, param_23); - out.out_color = float4(in.f_color.xyz, in.f_color.w * alpha); + float param_26 = d; + float param_27 = soft; + float alpha_1 = sdf_alpha(param_26, param_27); + out.out_color = float4(in.f_color.xyz, in.f_color.w * alpha_1); return out; } - diff --git a/draw/shaders/generated/base_2d.frag.spv b/draw/shaders/generated/base_2d.frag.spv index 39179297819cb4e1f6a9a29737cab241a4c3ce03..c1411b31d8b912d85965a6d1cdfbf070e642fbd9 100644 GIT binary patch literal 19164 zcmZvi2bf(&`Nl8VT@p$NJ@h0%AP@);YA7KDAwf}^QY@GyoNP(azSrh>k6a+<4 
z6h)9GC@Kh%5)=d#!2&8G#e$$nQz=sZzu&p%O(y60pXbgq@B4n=eDlqmIdf+3-Ht(9 z3~senYOU59(wf-Os;;$KgHT#%D_8ZDgO50P-24T-<969~M;+E^wH@^t)*9OCq_)iJ znK7HH^%V@)P_CnVpR$zlQ_3ThH3v~>uRk4wI_RzK(}h~b>B3_uw)wsL&zLuTX5W~T z`^WFqztdhwtuETFMm@L2t5eUb@G7nKsOR?{K4;2IB~{;9kc$sffHx4xQ;VL%z*VdR$Tj^37!s7KLd?-74MvGkcD=YxK_6W%xztLh%`#vv5K8^f@!wtXJg z$DZmoPd4~p8~mvTf4ae6Xz-U-;9af%!snmRGq-Qq+;z3y(5JgUQ&r{OF=)lSx>n#k zW$^Y4SJO+Ij@B?VnPYW->}Tq`=k-l*ugA{TCTP`OP>*#}csceF4ZbscUQh3gh1>;W zPJlP(c<)A=i4A^WgHK(7ceD3Z1psFZ-e(Y`0NIs zy8`cMoeJ-%=BJ}|8hBbgSEqxs$@z?&2`}gFtOh@)!7qZ(m_55M&(Fnho~fF5wJw3r zU(hpeK{qpA_jxH=9}lOB_xEym+FuFAGGjIylKET@U%roawU)FwdpI%cc(>OYF{?I9 z;WK7;_snAxG{&>=@_s)ruXk`lR6Z}E&7aZR*WRbPTCd7icla>`EQ@ zR&hF7Yk+&$lpWXF;As`^Y;6NB=XGp@kF%ZcvpSz08upzEK84txA=bcm3gZ*|uI#M2 z`{;`5 z8MSAz_)P_`MA?qm#yx?euSFl*4p3UH#ruZeS@B7e!tbwmYZ-qQ{_u^DF7qEl2iIJE z^~Ct^ZHKMJ%lsEtytVHJ^mhT+JaTWoRP**2J6m6d>n`65f4S>a&G_nHg}hbpZ@`@| z@6**_*M;}V_Hx&?n)YkKe3seeu5+~wxeM0=oD+F@g>Oux8zA}_?>n_Y&Hm&*7hc-G z2bMZ!?Q)+JwanM8ko#Ud=p^mt<8z}u+~-D~@jn2`{5@FP%sKXl&>XMrwfk)GSr*Ud zN$xt|9BfpuK^xX-0r-ygys8Fufa@WF@cb(&p-38a*^G)gR`KIKa zZ^HHWd{gqd1vj5l8~n6_+y3-|>+d-y@#LO!O71x)-1eSxO71zQO?aK3*@lAN_9v>!W^Vt=$YZjyA{ZyHV{8-U#+*zv}L@x7OG_c7S*p z&yV2ccy5EMO~uFY+yVA-JlbxjjAI=bU!3vW1@>Kc2-%zekHKmdFDr9#PuG@qcZ17z z_rTRGbVG~fzW1W}{!AbDfz>QtR);Y+zNdZ$F_u@fpVwOH_X~(|O27MSP5DqpNo};Kahhk%U|6c^#pZ$8KkjM6Ku=iElOVo1b(02a;Z$oVxZOL2RIZR*H znTwaf*VVqBPvjZnE8udBufo+Hpk$1%fz>m{*TG(nQQQA0YK~ECZ14A*;Hl3~z-1T=hYr*lZ2G>XZyjoiwtiBk(Lu+kl&D92KZ4ESiwV8|OKsEbv zJ+B4!oS%JfZLpffi%`pa*Fh_N*M+ND_z->h7}i6uuhsdB%Q37EHkNZ^AGX!UoIHza zH>VqL8V)YsR~x|f$@^+Uus-S;$3|e+VBTLF!`1Y)4{d7RgRCiQ+q?cZ!Dlx8pm_6JkQN2@MzlGMw>pHP^%}mEx`7F2y?MHcuR_!#S2q4f9}nmg+`<6{}3fP zY=x#CzBSl>9IyG00qdilchWXs`|^smZLQg7w6V3OYu-uQL2R$TcH=v?YiOt~=K$Z= z=H$M%1Nai`Ilt@(SM!paKXqd}pZkH!J#;c$t$a`H4_D6~dH~qVIo9?zikfpQ z&X}fvv)&H`%VRqT>^RM7Ds{!{_r%-bz9(#>EqSZ^Y^5)2x%b?@z0Vof!QgV8yc4eG zdUu^10`_tY+TKN(Pce?zIPP7CfxTDRyAFq|S-db+`5g+^mV0~zxOtD?ji#RU@*c45 z$56Bl)UjMI+7jz1u(8T#^=LG8=jKRixjDOD-V4rpIR-4xyuS}zzPsKJSIfKWSg@D# 
zqU{3|HRnZ~K8^z$+y1+$<>_M@IPb0=usrXs>0tL~+i0`Ban!w+>B~BO^?}n@FIXPi z@nGlPIQ`V}tp5|hBk9XF+VpWfs@vZ6KLcE@{|~~|%JqLDTs`Z57TC)<(l(Q#<{XJL z4|Bj-|FgmJ*iHg_-^^()wLE#x1H1lhqb+%>J4fluTJBo5Z`W_ewE$eM|C8Zr3n{Mu zMbuu7LE9;m&rpmbHjeB6Ltxil*8hj$Y8Ed{ReqEpW z^~GTSW+?aN0W@{z;dE+w{LcXEU+&8vLsQRl_HnRz`kZMSsQdOg(`L>-H=h9Kxj7Ro zkL_%*V>E}esO7PJ66`p%okK0pdHqx1jfrhM+iAC*d-tcojwkQebHQqh$?IeFJNG=e zF|=hJsCnPLhu$ymn=xI_p9Poq{Q|h!`4sQ_h16c&cWs}eJVh~%IP>~>u=lUL4_`o2 zH;;>`R@&7Vd|MDI96*P7IFQJyl|EpmA%RTUGXzKc3N-dB7 zW#HyI`Z}6=*2Oo#gsj$uwgIrQTo>AMf4>QKU94WepRTCeO~zlFZ7!!)&%ON?*f|Qn z3an4=>(yX=)Z=pvSfB7~!OmOSTnE-iJ!AMbIAbuTKIU^JwYJR14PeJm-k}=kKS~8Mn53C=XH`w>WX`1v~EW`@mJMb=;r9^-+({&%tGz zU%<_6N!{jtus-Tn*4i(@#?p3Gy}lm+s}GQ<_se;{8?Np=I|lPL@AUmEaM|~-;bp$R zf$O85HopazZ61P``92KSNBxRA-`|0ar7ijX9<1KXSG~;Fywmp|z>dkWEv1&n_6WGV zr+B{sz`h{Z{&IW%?ic$Qio{};gemwVNVXzKbuPc4uCzrp&K z?~?zZsq6m|wLJd+1?ykFOI}7(&p!7G7?Se|?K7TB?6P0_r{VGY`8Gv1rj>S;3wT(;?`+xT~%Hrx0+hkDv{f{hd2 z1=r_va`arXGFUD4!L>bn2;BDhPG}XlF>M!ZRj}>!Rl9(BTMewQHe)#-tAm|~^YM`n zh3lto?%LGS_nKhur{7zd>soMs(^c2s-*D8tKYq`#4%iswT3Q!PJwEHzKIQscA5A^$ zVK~_J;Cs*Z+LO};VEfHEZbR_>_^IbTvJu!A+R|=gut$xNBUbQ-yP{IWAS%Ed2Hjs z?knc$Z-nxUWoNM8{n?X%{02J5T- zCNbq@F8==F<^NqJm$eJLPL2JaighXGwLZ1~S7P4#P|Qzm?(ZtN_M>Z_{b>()JO1*V zC!ne446rv?E$8cf!CpRN+9pyQx4DS3b|!(%F@N9h2UfFqIbYrvbJphiHnwq$mpJ=_ z%Q$a?t62t7xTVYDXe;9wFZb+#3NPQA4}=@ny(8;s3RpicbJIsH^K}r|Hts=b^LDUp zG9G=@(&imt=h!w`7YBpYGso`)t7VQ41$#Ni+76){L2+)xiG3J2V@T}7(bUb~`!9D+ zjA?u0r2V_W8Bf~32TeWg+BeI(vd;N)X_xjy>4E{~x&7d|)g_fniU&$im) z_dc*|-RDNX_fzy$PhTGZr?2F5ESh@G3f*8e%Rn9LII!`wnWN9dag@aE0js$#5_cL{ zJwDUH`jq#y7p|W7dLP)ajiK0Hdu;t+ZKEmrdEYbkOkkdfS9L z`_oBa#}vM>;H!X7t$F5R9^Clh^TGC!b+G`fkGl8L_-gS#8EhWmr+~}$i{SdG+unTC z;{PGAF~dI$*5|@{o=*eoqaL4+fb|LgXwB2+bhtk1IcqKk8%vwNIeHJzpm^WMQWjJ8 zqzq8Z(|OXKK0XF6^ZYnGIT-6qif!`y>nFhasAn$EsxaS8Qs=#PHrO0oU)o(0YUP?3 z3HRERk~Ohefk%K{2cxL-Ub+~Zb>Mx!yuq(+@EaQZ<_2HV;CD6ny$$|ggFoEhPc-;b z4gO4nKiA-|HFy`3SdMpigO6zNu?@akgHLYoDFq)w-^bQG>*G^s-jnRRp9bSn?Yr7s 
zQ`*(DUz`hW-^cbka~+*WQIF4OYM-gp_NC8fDe8H*oe#F1`pfk5IchK8ZQ3rNsJTYO z$@xOC{pb8}5m?P)?q11TTiSgd?3zovFM!o7WjlSe`EE=9)*0Iu!TOYA`x0C|K9~GI zK3_&t&-wN%VB4vu-KAh-?@r0QeHE-8pRa+NK9`}XC-&FDwo}g@{td8wYIA;k*3=UB zo8V^LE6~*Ab0xTW@4tnnp1!XF+fF@WxEgGK+1IWC>!)rmKIdxjzYeSx{_UElkL%(3 zsN3FsK`nWH2kcnNd-PqndVFpKJC@StdvNvi^L?=G)RW5(z~xwOg6pSlF78Qc@&6%M zE&P_6r;l6V`l#F9JxneBKLR_x@Y}%p4Ak$M+rj#%=U(3dRv(~b&XRs3;!9o9NW*~jxBxu0<4dEv|obN z@26xA9snC#oBg`4tEHb`fy;h>4KMrs4ZQ64w{U&b)5k+#<7l&w2dUN4$M3-9n*Du) z{{de1@d#WW_4M&auyM56$HUZW>Eltb{pL50KY`Wuq_~GZPVMC$s_ik#`YW~fP;ufs z33iUc{|t71vR3{A)<-?>{J(+;Rs9XBJ-&YfYs_c zPn&;$%QpXn+h!<5Th_wAz}lXr80Q&kwZwTI?3&K^j4y!IVt)~A-{CKTjgkBDZ?Hb< z(f$Kgcl{-g|ALLJEwNq(yT;P)6|jElnTuD!j%x%(yJL8cT0L?82QJ6(I$SOGH^Ai> z-h?}b>;-Ru^-)hQcB)>EVI?$UYfJ1w;BpKdaQ)OXhE6q_IR@>H;We;&;;al-3m*)> z7@inI;C!lcg0}dr0(L$!&#S`qRnJ&f0~=pk+N}=u*-8ANa6Z+!MqB*W02@Di7~DM5 z-!FlsMsOH(@aWv09KEH{% z0oa(yZ$r3k)#I}f_=dt~W4M0m@!16IK9c#_6war*huRWdtz+~E@O>`m$9~j>!Y4nTZ8#j_f>oTj^3u$H^QdR7)r*p zE!Z6NaZH}E)sn|{;4+Wx;pUP1I1a3jdh*x-%%>WY_Qcu|T*lf7Ud9>^*GE0Eb_VmQ z#-u%C+NIVVlRo8`c7>aRK90%16V#H&?qJ86eQ*!3TI>_Rw#gXx1nZ+7pS{5DLz%~Y z;A(qQ(rzNyw%XEeU$A>f+D(GHX4SJ!_5<5ioB8{9jap*t4>tesw}H!i4}j~Vo_-Dl ztLNP?8LXBVQ^EEVeo)QR{_Svm)Z_CGuyMiU_H`IspV$uvn}6p12(Uit>F3>GK2<;3v+ukIY<%sBcO=+7Gh;al?tM~^ z&(UCWNStH9`l)BU?*$uQTiU%3T<(SMhwG=FSRVkZmwTaFxfec8KE@nO-Hpw6PWm_w z-bD%TY4B+^&-kXp^-<3p_JRk9sV#HZ2j)}tr9Iyv_Ji$HyD`n_c(8eC^PV0{t(Kfm z0IP-10B=S+`?viE;rghj{}aJ{s_$B~?}zUy@R?xSPNihbv(VJzGaGDwrOzC=dd55# zY&-Szc@mgUwV!FvoXrE9k9NQFk(b~3jHZp(<`lp4G0!auyd~Io+E&!bc_A^>vp1a# zSBw3W+8(|LZp?fib}F1t_4|o!_17ok`w(~lyME>@FLT}on^&3h*6`#%25ipTQYVLx z5JR8L*GJ)esy$m<{7whkKJzyKS6fU;erJGfug(1A$Wr?atc_u#xHl}f+T?n?FzG@dxUqsPYo3Wg$ W&x4)U^Xq5v3vm6^y)Sa_%l`q6m_iEx literal 17776 zcmZvi2bf+})rG%gW|APGmrw)}iUb3ZCcPv93@Aub0Y!#MCKE;`nUF%UkwFkdK|xd$ z8;T%JKtK=>2_Oh6f`urE1}q4Q2q+~q-}ioZC5QX}_nGsYwbtHepMARR&KQ&^;xAgy46W-nbp%j zo2vCS43|@`q+CzAo$^!4!<1ENt1q2`I_Rlw??NqObm6fQwt2mK_0OF?Fk}4jecSHZ zx6Q6dtuETFNIj>CcSi4&!4s=?wmrUYdmGz!b?UTT13azP 
z*Q9R88P!@JJg@iA88c_im_5G;vMoW3u@3gcSQp$wQD2X`9dBf7EO=h;J_7^&bLO=N zU?Oe54Y8&DM&KSAIp)o(HhtYa19Oh)8P9-=F>ivcm*QM)s^4JWe5$J57V_S{?s@a) z4jwyWc|&vB99z5Hh}L-hDxdBX`g`Xe1Eu|_)Yh6|zh{2`;OylyZj3Fl zxAX06ZKdz*nFBMbwjHg*sgJ9%^&BR{HVeShYJ3vd&eYGR9;op})U#@Qg?QVx&(&Zz zlm0h~YyB4R^jg0|yj|Pp9&mDa0NfsLM{6;7CdD|9QYVMU!SibUNoprjpQowkp4i4q z#U!rYHH^Ln+jtB(aW(=c&IGVy)_(_Z;!F|Oai)qHus%nLYyZ>0(;@1g0?(^(XX^s+ zvHi0h4z>~1yVcjN(%m=EGt;ioJ6jjx7oiKq_iu2){O;+41B2~8I;;1wuU72m`}oYC zabgwM_q4BjU~qcRK=?ghtJaXVT!p||I+DIy3r~?a8*!5`FBcZu3}! z|Fyv%Z}2A?{Fw%SuEAf1&+X~$U%<%BY1r`pn|CK#nbU{{U%A0oeFNt_fw#|quGWU& z=Ddv4f7!frwl;r5`_9%D4ZdZAPi*iV8hocW@Q&7Gcu&0t_6AR@=XxqQ2Z#Ia0C+j> z0~`F{2LB+ufA;Jd*^@`Z*{3z{YE6gFo8L2cem65-_t}RwgAK3Zz8!$4{U8`i|7>1z z^Em;&{A}uKo!aJ{i^QzsonC9itlFFj@1Nb>Gna$K7}vwg^}bPFpNfR2d~QaY*WWv% z{qA?Q7RleRmpfW_qUDUg3!HQ79(}7g9jyn!Jsfn7>ml&83U{`i0hjaoT!TMvJHHWi zJ})%vFBN<$vHd2mjPF#&C-(c(S#!?{xsP_g+a>p$7}qvecKOC|x~RBkhyGL*_j@mo z-EY6Vqq28Y?{sqUPN1!evIZ?|GY(IE#cP8`P>-dIA-9p#a&2Sbqo~y^@)c^V&p4vK ziCV0mHe;+(YmVPGt5VxueH`AaQKyaVw5yM4=-TXKgNClQajn^>MQsAL{Un#Q=z9nF zSc=btn(x@A6*rGvYQ83;_6`%jeZj*hTN2y2$58aO=wsU;5xf50AHwzb z?oje21=rtuLTUG&P;&1H;r8b}A>8)QG`RPK(*AOT4`YI2H^0t;8_zpJ>~il2CHIaH zZhP+t;re?|2-iNY!M!8IuD^GLaQ(d_l-xT)xc=S|O70yY-1d_jeD8wW-n&BS?_Htf z-W5vjT_N1|-W5uIbc1_WDDB=8O71ker#86vgwpOkq2y;bxc7$A?!BSp z-Wf{1sNl}$odtLN-Wf`N?+qpQ?oe{?4<+{wQF8APCHD?da_F z2sVy3$LqI8?Nv@-`?Ftl&$XLt?3p%1yo~2Z@Nzu2!qxW0$MM_-_HjJgeoT3uVtjGN za|hUO$$?~V{&#}aEIwA|;u)$f?d}4X?e2!FSy*mbEcd+!&F@qCxEHKu@v%CXx$!&k zGl;Q#qW!$qO21z~j8ppES8K`#F&gLpmk?tb!#4Uj9=`$FCeyFqzh6I`rC)LA+WJr8*PcT7;G%Rv5ECNu)6+!Z{*sX^M}Ch8Et-hJBR7ZT7Da~efz#74fA2YRZSK7;aNY~=Jo2;~2{u>TRreow=6e))CED9Yn?A0ay6xR3D}c*= z@+P=ixldMvt7o6A1om-`wT-5zImhDUwhB1=#JiY0w$;Fn)117Q$&>f$VE2h_v?XtK z_eJ`$mb>Qm?K)>%Yl6#tvKCy;v&eli2JGV)w5?6qm0}#RaeNoo1-n)`v)6;GS$r^6 z`FWSrmNi}<++5=~qp4@VYyh_Xc#5{6I+puITViblHdc9#jzv>1!)+`tpt~kL_(>=iWHpx8>PKTZ6~amu=E(_W#?#KF*Q0?I~)`kvQ|PBRKni2e3T0ox!e~Ir)2lJbCW|cK_Q(Tk=+S zj?$O4+`Vky?%#}SH*mTC_kgSU`>gwa619(G(6%S#Fp6=+#&Q2o0lW9I|M!BcS$r^6 
z`AvpvOS^r*WxIF4)hzswLA%}F5N+A(`+~hQl;`q(XzI?xRBCzr-wD>gJeS{vrk?j~ zf3SJ_o@pDZ`}RH4X3oAh?*`|+IRGq=?I5sYG=~GJ<*^+Cb{yIcrk3Y>eJFT+VjItP z+HL3e@I7G1<9%5EUa;Ci^7=%ry$@~-ZP^EEuDfgKdbw`KbUz;sF4z72aJ3^SuKSVH zKCZjA4^Xb67)P9WJqqmlmFv)rrfweoE-8wh%0JpR34{mXCQ3^aB9 zr&G)0KNGBfc?KSXrmlY!Ti@d0>6Q=YyTMv{?YwM?GUW9-J{4Qy=pg zq}G=CI1%g^%Jn%3O+7vzu6@ci{|K6T`uix@{`@yu+iSPK6R5Q%_m6?qZmjQ`9|x=D zJF^h%<37}OGDXdOC{CPHz{VM>--l0t)rKhPSIze!?XA=PG_ZT=th)Ut!D?xLDp)Q4 zXMo*H@&6Q9E&iv2)#CqYuv++Mz{b3(9`{*bebjybK1ZE#Yx^wa%M`~gPMmYVjywEZ zuyJp#<9;5jk9vI01D9>S05`W=>NZ~l>!W^2t$hh>ENz$8`}=&b`Vfh_Ue5E`aCPU| zF_^D;r|++T%f2swm-&7bu8(@!d<|T-`8vGJ_Zx71)Gw~{{U+F0+LG^uVD)Cc>Sey> zoxZ;Xc1(`#Olo;--v*a!dJ){Q!)!P+jNxc-+>`?&A5 zT}n}NKE=t$v6y>o*2(X3@R^kCu`9r83rQe8SHe95GDqJ7>!*G*e)i+tMBRRlq&A1E zsEy&bxsKY$v1q%NqGlX%&Xpg4Z>O*FT)6>F-8`LxUG{cog}$Nv_v{^hsiM`-H$-%Kr!|E*yC%Wuh#(bRL!-3G>``Zj1A0vpS3gSN!F z9o+oR+<~U9{~~I+bLt)AC*a&M?gY!TW_N=fTlUmX!RFwez-qDovbKlc54Zh6ypxXU0kAP`7wuPI+v%%z zE=m0wtgkj>+4eVJ=i!_>)^EZ3shhhtwecY*!M9x z{R6C@x;c4fsOj$+@lSC1`_of!HRpTT-KeJwk=wF_Hgl(xOr(tmW$zB=mFKiU`{5`q@Y+pXnI%~~7qjlAq zuK9cP2sGR4uif~L?Q$Aw^WOk=qWJHkzUQ82KK}oM9q}7Q?f+R6`=0xM7F`4PxMT2a z*_xtH-h*wx<~4<)Eo1fnPuV_uemk(f>aP(~UgqN8Uw!<44asHo0K+ zNnmr#@As3zY8D^o%XKklZSHSl8^?HwvlqCGvo~DLGMvJaE{mhBjAOj4**+Csem1`Y zZd}if?5C+<{d~+#AGOTaeqh^p2BpnA!M4eG^ifNjcY&Q_+hkwt4_40{9{^U%93KSs zagMbeNI8_^+=vtVU~tBe*oUC0o4@NXcTS9Hd*h`2d%zh_+P@b~J?-UXd*j6SePH`Y zyu-lB$M$l4^mkt#L2)j8Z{+W%IB(u5e{&$ZL)i&Q3{T!ob+OZV-(4L$Jz|HxZg{GeKX*Srl z>Yh*X@_h2|O+I5NIiJ=k@VW(F5A1$>Gj-0V{OSh3vB7U?@H-m(o(6xQ!5269qYeIegFo5ePdE6|2Jd1L%khqB@QoULLWA$n z;8PlWYQaa*_fa*^{`fSSYm#&KGhke*b61;tO1pZ_i?hJ(b!_i5_tDuD_4s_Y_Su)( zzV!JVMLplPbHKJ!e}R5JPwnHkP20H?HTQ@(IiCl%|J)zG09Lb@yHE1gmUdqRyXVsG zOJFri*-jsAe%sQ&b;kB(us-G3z5-W|&jtUF&sWjZbHDu>*mmk^_jRzbccf(Az5!N` z&o{wMp9|5{6Z>0W+o|Ua|2Eh@wK+e&Yif!69dI-5#c1mBxdhx?`|qNur|(O_wo}g- zE(6bCb>P)nZIfE`P@M&F04$LCtGV<~;E zgR7^X>%q2DPcA!)rX zp0{f8zZa|){xfhnwx7ctTl)M3SReIhzXYq_N68%A4>q{SK-< zzJCL2%f5OPtd`h+2dgE{6JX=4Owr~zAE#DNn}2}IHvfd%W+jTY?1g`UwLL{K&Xd$? 
ziSuu;dpdt-dkV!Z%%kEPvqZpeyV$ow)m|KHh%ajaPv%mtHSkB&wgJGY&>nrVRf*1XiKa$z@7(L zhc)5)sVCN2V0G8QdGkJ~=G|inns*=H-^5%8Y|P}hF5I^2@mUXib>Xu#2E`N&yjI(eKIc_gY{8QUz>pWsn%M1Vr>d8V{Ha6 zW4#5gk9uNl4(6v?SMB*ddW%|L51T&YDH+pS!RDZkWAcuzmOQoumw9XjH;=62M6f>U z$>VKceyTBPPpqxMWvp%BWvp%C`lu(?c3^(0F=@}3wy$-^q)$1fx5LdrAIIeH32Mn> zN3i3}Ik*#8E%u$kw#gWG0qdh4pIyP8Lz&0j;cB~4(ryp1ZMCJ{o?y?Aw3`HX&#Gsi zOa|LloB8{Djap*t1vdZiy}@O^`@r>4Pe1PftLNJ=1+11B`-1H!e7~Bf{X60MsK@7B zVB>`EU-Ps%0IrXE#`bQov9x844+Qg5y>Hqb^Hgf}#6AdI_H{5^pV$upn}6p1P_RDg z>E}IQeyV=7=iGTO*!bEL?|op;%#7tQxa*`IpToiCkT^$x^;6Gy-w!svwzT^IxI7Dw zgzKlCSRVwdmuI0`c@{1qA7hTB?#AXfCw&|Z@1lhFH2AcdXMEG)`lx3Pd%;7*)RsA% z0p_RbOMCu?*ax;x?Zz~xnPBtM=9(Twt(KgR0jq`ggEyp|{oDRSaDCL%|FK|xs=u{p zpN#Jc@By%G_oZaav(VJzGaGDwrOzN-J!75&ww-$VJPyoHb)IR@oXrKBk9Pm&BQO8v zvl(rCHm3MDAM@O#z?*{oroDwaIWHiFdd{Zf;cBs;P}{>#gc~z|4?79YPxbF7w$)#s zjPJwXA?*5@v%JiC3v51R&YQ!N|9G%Dzm+;Ue2f_SWWGKQ=chWewZ-pbu`l-7va@Xa5^Xk~f diff --git a/draw/shaders/generated/base_2d.vert.metal b/draw/shaders/generated/base_2d.vert.metal index b24ba01..75fa3b4 100644 --- a/draw/shaders/generated/base_2d.vert.metal +++ b/draw/shaders/generated/base_2d.vert.metal @@ -19,6 +19,7 @@ struct Primitive float _pad; float4 params; float4 params2; + float4 uv_rect; }; struct Primitive_1 @@ -30,6 +31,7 @@ struct Primitive_1 float _pad; float4 params; float4 params2; + float4 uv_rect; }; struct Primitives @@ -45,6 +47,7 @@ struct main0_out float4 f_params2 [[user(locn3)]]; uint f_kind_flags [[user(locn4)]]; float f_rotation [[user(locn5)]]; + float4 f_uv_rect [[user(locn6)]]; float4 gl_Position [[position]]; }; @@ -55,7 +58,7 @@ struct main0_in float4 v_color [[attribute(2)]]; }; -vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer(0)]], const device Primitives& _72 [[buffer(1)]], uint gl_InstanceIndex [[instance_id]]) +vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer(0)]], const device Primitives& _74 [[buffer(1)]], uint gl_InstanceIndex [[instance_id]]) 
{ main0_out out = {}; if (_12.mode == 0u) @@ -66,18 +69,20 @@ vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer out.f_params2 = float4(0.0); out.f_kind_flags = 0u; out.f_rotation = 0.0; + out.f_uv_rect = float4(0.0, 0.0, 1.0, 1.0); out.gl_Position = _12.projection * float4(in.v_position * _12.dpi_scale, 0.0, 1.0); } else { Primitive p; - p.bounds = _72.primitives[int(gl_InstanceIndex)].bounds; - p.color = _72.primitives[int(gl_InstanceIndex)].color; - p.kind_flags = _72.primitives[int(gl_InstanceIndex)].kind_flags; - p.rotation = _72.primitives[int(gl_InstanceIndex)].rotation; - p._pad = _72.primitives[int(gl_InstanceIndex)]._pad; - p.params = _72.primitives[int(gl_InstanceIndex)].params; - p.params2 = _72.primitives[int(gl_InstanceIndex)].params2; + p.bounds = _74.primitives[int(gl_InstanceIndex)].bounds; + p.color = _74.primitives[int(gl_InstanceIndex)].color; + p.kind_flags = _74.primitives[int(gl_InstanceIndex)].kind_flags; + p.rotation = _74.primitives[int(gl_InstanceIndex)].rotation; + p._pad = _74.primitives[int(gl_InstanceIndex)]._pad; + p.params = _74.primitives[int(gl_InstanceIndex)].params; + p.params2 = _74.primitives[int(gl_InstanceIndex)].params2; + p.uv_rect = _74.primitives[int(gl_InstanceIndex)].uv_rect; float2 corner = in.v_position; float2 world_pos = mix(p.bounds.xy, p.bounds.zw, corner); float2 center = (p.bounds.xy + p.bounds.zw) * 0.5; @@ -87,8 +92,8 @@ vertex main0_out main0(main0_in in [[stage_in]], constant Uniforms& _12 [[buffer out.f_params2 = p.params2; out.f_kind_flags = p.kind_flags; out.f_rotation = p.rotation; + out.f_uv_rect = p.uv_rect; out.gl_Position = _12.projection * float4(world_pos * _12.dpi_scale, 0.0, 1.0); } return out; } - diff --git a/draw/shaders/generated/base_2d.vert.spv b/draw/shaders/generated/base_2d.vert.spv index c318fc2f5e7d89cff97175f6bdbe615b0a6f3218..ca08cba758c5e0e0184d61dc6f3b8bc35780228d 100644 GIT binary patch literal 5008 
zcmaKu33pRf5XT?14G4%RvZzqgR;+-ctb(W%Sqh3;L~u8RCaHlmsp$gl;J%``uV28A z;g@nb$8-Gs-g{GGj>pUS_s;z9%-or|_rB1xaA;AIEJ&6ni<4iHY+Ro#gh}8E={$Pu z#IY^YGnFkncHX1K@}ws<)aJV6`c&@a?_{~&R9Opd2K}H22Ehg}3^syIU@O=Kj)5Nj zEoAk-_Im=bf+JoDc`W4b%)Jo^-%}S}(C{LtyX={IKrFLtkoYf>}ziMn*YExd|) zqcm1+pQ^TJsuxhx`sE$2wWjN7kV3L3k+&}?UvJKwsK1)_A=c~-YX=(jse|?DnR0Wy zO1#X^({^pP(U@Pef1aVk96igq(o}4jatGkX+Vx4Yn2U3ebDU|-HY-$Jz*5fKvhHVf z$eB}Thdf{T9&oioS7Uf@XnlogK5id!oJQ8GS_Y zU2^6}o7r7B+N@-E%(rruweeQFS#4+iX`r8PwHp=BKlSfTX0@p{XS5k5wr|)z9Km1N zVNFN)8@jo#=;DUZ9bspGP_FSdpWf69y#o(BRJ#A#TjX5e7PGz*co5WJ_qMsa_c%c>y|5a zaMmm54^8-bFXTpm=QGNz-B_ykU>EiErZVdM-mW?Hci78^na#~u$^*a=`e3G}y6?bV zZ3Zi;)_Mns(Zj5~8p&LWU=8uiRo!~m@^>+_ocb#MUc+1jj{f}aYokxL)|>8U>?L3? zKJM?GmNUk+%tf#OY|eG>lvGw6t~=iV95-UMl3BYo%+_Ik?xSvB4^_4fCy7wyL-HNPltg9DuF7_Wp*WSF1Jp>$gl7)HP2IM*szP>|! z&HeUVwszOmhjY&LJ961QTvy+ia}m!sXuQb9JCGz7_eCyuqD3CAWytz{7kXA}Z?Xm1 zedPL)e81Kr=bA7l@4WlDZVu}H#zqdFyZ?W1XuqA=-g|$vx1atFyKmf!-MOy)9%Sv^ zx1ZZ}fZ06kk&qX10sRiw)zvyo%x>3b@PipJd11}wSO*Ox7JvH5!p8ydKua8Q|J}sPWuUT z_cUHTm%Z28Yv=zD8T)n-#axZ;Fo!>2jBRc9_7E_Zy0Ol7>5VSk-&XDS6UDc_0{DGc z4jv$eb^E_^`CqZ|z59KZ^E>c6S*&Ju{Dr-1tH49R+?0*u9=8DB{Db)TZBTd3S_eQ8 zxaK=oR~}@(Eth?hArB+Fzcw40hjQ6>=3BBJ@AM8}J>GTq@_z3G^0D{5$li5*yw~>u zeZ245cm{ic{AFy+_kN(C{BbyK9su%=;2uPlTL8xBf_>Nrv~|tAed||<;o1l=KW)s% z@92Ku+qDMkRUZL=(qDUTpY6A~de0vP$H0EF@_s)Kd=vV+rrlxYG2oiHy7%#%V@mga zg82mSPQ~8p+UeuncoG=L`powv^C_T@{WD*6?ZW3NWPKKb@Oc_pAMc$$>e}gJZ=V7B zc<1$bj`?|@kMB_*b?w6E1!R4EFX2-{*2g!hkGgjH*n`tRpFW_EJvamODFc1fPlNCo zN7lzY!>5X@PX*|su3h-lko9rT@OcSYp9!Fkx^{6-&mntHUGrV5hwmh^zMhlz#%%)m z$I;d?w{lM25oczNQ=r@b<(v=yHoBbf?0P`tHjQjf%Xne#H3N*l1-P%i+L(KT+4ac% zJaYHm7vSV$?^lrJgrhU}SJAuuUxSkm|JRY_MC5*n^EKBufbq47_x)zh&oP_xEVG8VE-=x^Be&efpOL{%SYedMz$ZJzk}>v@8r3?i>!^jd+%X> z4|IQ@-p~2S{{wV=<=wZRSuT41A+q_GfqR?pM?hZt!CVjfkI}WSfUy4rSzh}M%<9pf zPm#6vJ!@}&J_GXF59hk}zU9k6d*8D51?JCzy!IP&J@WYiUHb_T`CLJk*M1YTy7t?d zzXbMd8_<3+^H)G#-FxsgXaIfe^EW`BIAe4AHs_wr1{eUdzeTd(I sACa|nZ3t`zSJ>OL_z9SwHs-U2`Dd^dSc5gHYuEWZup0eucG(U70ijBNdjJ3c literal 4716 
zcmZ{liFQ;)5QZw7DVbFg#%>8ii0s;jH3Z%Tc`*JRm>Y*n^0`!&nQO<5mI1~-uAV<%3Z z*tW1(+qP@>b}iOqrDUkh4cU!JF7rD#*=VV31h;`fPys_=1Z)Odz&5ZGJON7l^)dfW zu>P#f-<3H1FxR0!*J{kPJ97(&WHgXC<@rwgOnrK>(Pr1+&8arus4h%THtWegkKrD3 z?V1!(OWA4#hI^XtOm({5Y?QwcV+nUn=}6$u6n*^`~>rROcr< zlX((8p&ZGyC&INgTyyli>Dj!rZdNybztpqvoU|9v9LJVnywQs_w!a=JKt-<*t^q{MB-nFN5x! z<$80L`)|(u*-^OVJX0M95(iEH7d40XeB+Xge(uw^#a$k6<#&9%Rm*$Fe)o>1+nrXu zlhv4@y53sU<`}Vkhn{&He`SaHkMkQkACmMn=;nqsWCUj|_-aSyYsAmr`9C7oMdgi zKi7Ptv8L>rb#To$X-v-=?{gKpBRF%Ca}G`)Irk4vSuWqVoZg*#Ikd4>4(sL!&bV@Z zm4h?B+$~+4@#U;#*twV79bKGfldE)b_Nbiy46OI#%yOeiFQxu0A=@kVuOs#eU4Msl zvu|BDmhvXx2z@A5lWt#Im#tte)%I+gh~a+9>ygZ*0yYrOo>cd28~I(yET_JX-)osG zz%f|#vNrnUYvr_`vF-Cm@o|59RL&UJGgrV0u(iF3 z_AqeVPZs8JH<0T_`1-E&HTQc8*|WQ@zPaFBzqgRx!*%s71s65D4?WMfobF#+xj{1X zy_t`ky&;VI0JAx`Zd`T$pUK5tJ&5ek-k!F8_JH=*WG8Unu-{$i+V4TuKHoR5(LVGn zyKr!jn;t_lFV8Ks6LP{BhYDF+eU23R|M}cW_P#mo29b|4s~hVWv-!xGzxsH=xqcjZ z1+ZsbSNBcHxqgCKe`BjpFn8yAy3oxn>hU6S-1AGtx@U~_SCPwIdKJ09OP@m4-u-IK z?q^TxFMkd&4(`Fr+>GUL&p%;|`-<&-Uo^g+4|U-|3COx9$7gPYlm_ z6OjK68{ewmQ90l6A7rtf+3`2_uB`*3z}%FLV@|gN-|GQckdwZoq7h(sIJ{# z)YN)g+fin7wU?g+_VE~5*~3o(`(A(7v^&E5G;qya-TP$0F{QmHm`?%wDE3y@P9OW> z8DJdGXTH9PXMsM}&wSOj3!mqZ_2~oQ^E|RXYk)rL+Ua9$UjX`)fIhyhmw`UMIepZ% z3!hhz_3@2_&uhr~ybAPD*G?a6@H)`PcchOscmwF;Th&KhyYQJr*2g@AMTHLD*jK2-IufE!t`z*8Tk$VTZckcx_`Ph3ASxz{*b3c#X>%RmiAO3G4%ZbSS zB6nr3Zv*3N6L0mMf&hyN2=6ss@UErR(fi}?NS1Q$T$|1q+@^6opxEEhGuglzuyr+b_4 zCqQ2Np+XP)PtmoX0%89dvb^>q%<55}&ylrX1GKk3UjTXSHy670zRNFx_P$H)J=0e} zUi&SDuKjl8uYom)eqV|FO~LE$y?zU3fj-vlGU%<_cLg7H`yO3C`S>RMfGqC_?nh)f w?>D~DKOt-D+A!Fc^!yv~GqU+10j14jZvX%Q diff --git a/draw/shaders/source/base_2d.frag b/draw/shaders/source/base_2d.frag index e6af939..cf301d5 100644 --- a/draw/shaders/source/base_2d.frag +++ b/draw/shaders/source/base_2d.frag @@ -7,6 +7,7 @@ layout(location = 2) in vec4 f_params; layout(location = 3) in vec4 f_params2; layout(location = 4) flat in uint f_kind_flags; layout(location = 5) flat in float f_rotation; +layout(location = 6) flat in vec4 
f_uv_rect; // --- Output --- layout(location = 0) out vec4 out_color; @@ -130,6 +131,23 @@ void main() { d = sdRoundedBox(p_local, b, r); if ((flags & 1u) != 0u) d = sdf_stroke(d, stroke_px); + + // Texture sampling for textured SDF primitives + vec4 shape_color = f_color; + if ((flags & 2u) != 0u) { + // Compute UV from local position and half_size + vec2 p_for_uv = f_local_or_uv; + if (f_rotation != 0.0) { + p_for_uv = apply_rotation(p_for_uv, f_rotation); + } + vec2 local_uv = p_for_uv / b * 0.5 + 0.5; + vec2 uv = mix(f_uv_rect.xy, f_uv_rect.zw, local_uv); + shape_color *= texture(tex, uv); + } + + float alpha = sdf_alpha(d, soft); + out_color = vec4(shape_color.rgb, shape_color.a * alpha); + return; } else if (kind == 2u) { // Circle — rotationally symmetric, no rotation needed diff --git a/draw/shaders/source/base_2d.vert b/draw/shaders/source/base_2d.vert index e72aa3b..a43b51f 100644 --- a/draw/shaders/source/base_2d.vert +++ b/draw/shaders/source/base_2d.vert @@ -12,6 +12,7 @@ layout(location = 2) out vec4 f_params; layout(location = 3) out vec4 f_params2; layout(location = 4) flat out uint f_kind_flags; layout(location = 5) flat out float f_rotation; +layout(location = 6) flat out vec4 f_uv_rect; // ---------- Uniforms (single block — avoids spirv-cross reordering on Metal) ---------- layout(set = 1, binding = 0) uniform Uniforms { @@ -29,6 +30,7 @@ struct Primitive { float _pad; // 28-31: alignment padding vec4 params; // 32-47: shape params part 1 vec4 params2; // 48-63: shape params part 2 + vec4 uv_rect; // 64-79: u_min, v_min, u_max, v_max }; layout(std430, set = 0, binding = 0) readonly buffer Primitives { @@ -45,6 +47,7 @@ void main() { f_params2 = vec4(0.0); f_kind_flags = 0u; f_rotation = 0.0; + f_uv_rect = vec4(0.0, 0.0, 1.0, 1.0); gl_Position = projection * vec4(v_position * dpi_scale, 0.0, 1.0); } else { @@ -61,6 +64,7 @@ void main() { f_params2 = p.params2; f_kind_flags = p.kind_flags; f_rotation = p.rotation; + f_uv_rect = p.uv_rect; 
gl_Position = projection * vec4(world_pos * dpi_scale, 0.0, 1.0); } diff --git a/draw/shapes.odin b/draw/shapes.odin index 5a8b929..cca0140 100644 --- a/draw/shapes.odin +++ b/draw/shapes.odin @@ -68,6 +68,19 @@ emit_rectangle :: proc(x, y, width, height: f32, color: Color, vertices: []Verte vertices[offset + 5] = solid_vertex({x, y + height}, color) } +@(private = "file") +prepare_sdf_primitive_textured :: proc( + layer: ^Layer, + prim: Primitive, + texture_id: Texture_Id, + sampler: Sampler_Preset, +) { + offset := u32(len(GLOB.tmp_primitives)) + append(&GLOB.tmp_primitives, prim) + scissor := &GLOB.scissors[layer.scissor_start + layer.scissor_len - 1] + append_or_extend_sub_batch(scissor, layer, .SDF, offset, 1, texture_id, sampler) +} + // ----- Drawing functions ---- pixel :: proc(layer: ^Layer, pos: [2]f32, color: Color) { @@ -358,17 +371,20 @@ triangle_strip :: proc( // ----- SDF drawing functions ---- -// Compute new center position after rotating a center-parametrized shape -// around a pivot point. The pivot is at (center + origin) in world space. +// Compute the visual center of a center-parametrized shape after applying +// Convention B origin semantics: `center` is where the origin-point lands in +// world space; the visual center is offset by -origin and then rotated around +// the landing point. +// visual_center = center + R(θ) · (-origin) +// When θ=0: visual_center = center - origin (pure positioning shift). +// When origin={0,0}: visual_center = center (no change). 
@(private = "file") compute_pivot_center :: proc(center: [2]f32, origin: [2]f32, rotation_deg: f32) -> [2]f32 { if origin == {0, 0} do return center theta := math.to_radians(rotation_deg) cos_angle, sin_angle := math.cos(theta), math.sin(theta) - // pivot = center + origin; new_center = pivot + R(θ) * (center - pivot) return( center + - origin + {cos_angle * (-origin.x) - sin_angle * (-origin.y), sin_angle * (-origin.x) + cos_angle * (-origin.y)} \ ) } @@ -384,6 +400,13 @@ rotated_aabb_half_extents :: proc(half_width, half_height, rotation_radians: f32 // Draw a filled rectangle via SDF (analytical anti-aliasing at all orientations). // `roundness` is a 0–1 fraction controlling uniform corner rounding — 0 is sharp, 1 is fully rounded. // For per-corner pixel-precise rounding, use `rectangle_corners` instead. +// +// Origin semantics: +// `origin` is a local offset from the rect's top-left corner that selects both the positioning +// anchor and the rotation pivot. `rect.x, rect.y` specifies where that anchor point lands in +// world space. When `origin = {0, 0}` (default), `rect.x, rect.y` is the top-left corner. +// When `origin = center_of_rectangle(rect)`, `rect.x, rect.y` is the visual center. +// Rotation always occurs around the anchor point. rectangle :: proc( layer: ^Layer, rect: Rectangle, @@ -400,6 +423,7 @@ rectangle :: proc( // Draw a stroked rectangle via SDF (analytical anti-aliasing at all orientations). // `roundness` is a 0–1 fraction controlling uniform corner rounding — 0 is sharp, 1 is fully rounded. // For per-corner pixel-precise rounding, use `rectangle_corners_lines` instead. +// Origin semantics: see `rectangle`. rectangle_lines :: proc( layer: ^Layer, rect: Rectangle, @@ -415,6 +439,7 @@ rectangle_lines :: proc( } // Draw a rectangle with per-corner rounding radii via SDF. +// Origin semantics: see `rectangle`. 
rectangle_corners :: proc( layer: ^Layer, rect: Rectangle, @@ -436,12 +461,12 @@ rectangle_corners :: proc( half_width := rect.width * 0.5 half_height := rect.height * 0.5 rotation_radians: f32 = 0 - center_x := rect.x + half_width - center_y := rect.y + half_height + center_x := rect.x + half_width - origin.x + center_y := rect.y + half_height - origin.y if needs_transform(origin, rotation) { rotation_radians = math.to_radians(rotation) - transform := build_pivot_rotation({rect.x, rect.y}, origin, rotation) + transform := build_pivot_rotation({rect.x + origin.x, rect.y + origin.y}, origin, rotation) new_center := apply_transform(transform, {half_width, half_height}) center_x = new_center.x center_y = new_center.y @@ -480,6 +505,7 @@ rectangle_corners :: proc( } // Draw a stroked rectangle with per-corner rounding radii via SDF. +// Origin semantics: see `rectangle`. rectangle_corners_lines :: proc( layer: ^Layer, rect: Rectangle, @@ -502,12 +528,12 @@ rectangle_corners_lines :: proc( half_width := rect.width * 0.5 half_height := rect.height * 0.5 rotation_radians: f32 = 0 - center_x := rect.x + half_width - center_y := rect.y + half_height + center_x := rect.x + half_width - origin.x + center_y := rect.y + half_height - origin.y if needs_transform(origin, rotation) { rotation_radians = math.to_radians(rotation) - transform := build_pivot_rotation({rect.x, rect.y}, origin, rotation) + transform := build_pivot_rotation({rect.x + origin.x, rect.y + origin.y}, origin, rotation) new_center := apply_transform(transform, {half_width, half_height}) center_x = new_center.x center_y = new_center.y @@ -545,7 +571,114 @@ rectangle_corners_lines :: proc( prepare_sdf_primitive(layer, prim) } +// Draw a rectangle with a texture fill via SDF. Supports rounded corners via `roundness`, +// rotation, and analytical anti-aliasing on the shape silhouette. +// Origin semantics: see `rectangle`. 
+rectangle_texture :: proc( + layer: ^Layer, + rect: Rectangle, + id: Texture_Id, + tint: Color = WHITE, + uv_rect: Rectangle = {0, 0, 1, 1}, + sampler: Sampler_Preset = .Linear_Clamp, + roundness: f32 = 0, + origin: [2]f32 = {0, 0}, + rotation: f32 = 0, + soft_px: f32 = 1.0, +) { + cr := min(rect.width, rect.height) * clamp(roundness, 0, 1) * 0.5 + rectangle_texture_corners( + layer, + rect, + {cr, cr, cr, cr}, + id, + tint, + uv_rect, + sampler, + origin, + rotation, + soft_px, + ) +} + +// Draw a rectangle with a texture fill and per-corner rounding radii via SDF. +// Origin semantics: see `rectangle`. +rectangle_texture_corners :: proc( + layer: ^Layer, + rect: Rectangle, + radii: [4]f32, + id: Texture_Id, + tint: Color = WHITE, + uv_rect: Rectangle = {0, 0, 1, 1}, + sampler: Sampler_Preset = .Linear_Clamp, + origin: [2]f32 = {0, 0}, + rotation: f32 = 0, + soft_px: f32 = 1.0, +) { + max_radius := min(rect.width, rect.height) * 0.5 + top_left := clamp(radii[0], 0, max_radius) + top_right := clamp(radii[1], 0, max_radius) + bottom_right := clamp(radii[2], 0, max_radius) + bottom_left := clamp(radii[3], 0, max_radius) + + padding := soft_px / GLOB.dpi_scaling + dpi_scale := GLOB.dpi_scaling + + half_width := rect.width * 0.5 + half_height := rect.height * 0.5 + rotation_radians: f32 = 0 + center_x := rect.x + half_width - origin.x + center_y := rect.y + half_height - origin.y + + if needs_transform(origin, rotation) { + rotation_radians = math.to_radians(rotation) + transform := build_pivot_rotation({rect.x + origin.x, rect.y + origin.y}, origin, rotation) + new_center := apply_transform(transform, {half_width, half_height}) + center_x = new_center.x + center_y = new_center.y + } + + bounds_half_width, bounds_half_height := half_width, half_height + if rotation_radians != 0 { + expanded := rotated_aabb_half_extents(half_width, half_height, rotation_radians) + bounds_half_width = expanded.x + bounds_half_height = expanded.y + } + + prim := Primitive { + bounds = { 
+ center_x - bounds_half_width - padding, + center_y - bounds_half_height - padding, + center_x + bounds_half_width + padding, + center_y + bounds_half_height + padding, + }, + color = tint, + kind_flags = pack_kind_flags(.RRect, {.Textured}), + rotation = rotation_radians, + uv_rect = {uv_rect.x, uv_rect.y, uv_rect.width, uv_rect.height}, + } + prim.params.rrect = RRect_Params { + half_size = {half_width * dpi_scale, half_height * dpi_scale}, + radii = { + top_right * dpi_scale, + bottom_right * dpi_scale, + top_left * dpi_scale, + bottom_left * dpi_scale, + }, + soft_px = soft_px, + stroke_px = 0, + } + prepare_sdf_primitive_textured(layer, prim, id, sampler) +} + // Draw a filled circle via SDF. +// +// Origin semantics (Convention B): +// `origin` is a local offset from the shape's center that selects both the positioning anchor +// and the rotation pivot. The `center` parameter specifies where that anchor point lands in +// world space. When `origin = {0, 0}` (default), `center` is the visual center. +// When `origin = {r, 0}`, the point `r` pixels to the right of the shape center lands at +// `center`, shifting the shape left by `r`. circle :: proc( layer: ^Layer, center: [2]f32, @@ -582,6 +715,7 @@ circle :: proc( } // Draw a stroked circle via SDF. +// Origin semantics: see `circle`. circle_lines :: proc( layer: ^Layer, center: [2]f32, @@ -619,6 +753,7 @@ circle_lines :: proc( } // Draw a filled ellipse via SDF. +// Origin semantics: see `circle`. ellipse :: proc( layer: ^Layer, center: [2]f32, @@ -665,6 +800,7 @@ ellipse :: proc( } // Draw a stroked ellipse via SDF. +// Origin semantics: see `circle`. ellipse_lines :: proc( layer: ^Layer, center: [2]f32, @@ -715,6 +851,7 @@ ellipse_lines :: proc( } // Draw a filled ring arc via SDF. +// Origin semantics: see `circle`. ring :: proc( layer: ^Layer, center: [2]f32, @@ -757,6 +894,7 @@ ring :: proc( } // Draw stroked ring arc outlines via SDF. +// Origin semantics: see `circle`. 
ring_lines :: proc( layer: ^Layer, center: [2]f32, diff --git a/draw/text.odin b/draw/text.odin index 7400b33..0a741b3 100644 --- a/draw/text.odin +++ b/draw/text.odin @@ -246,7 +246,7 @@ bottom_right_of_text :: proc(text_string: string, font_id: Font_Id, font_size: u // After calling this, subsequent text draws with an `id` will re-create their cache entries. clear_text_cache :: proc() { for _, sdl_text in GLOB.text_cache.cache { - sdl_ttf.DestroyText(sdl_text) + append(&GLOB.pending_text_releases, sdl_text) } clear(&GLOB.text_cache.cache) } @@ -259,7 +259,7 @@ clear_text_cache_entry :: proc(id: u32) { key := Cache_Key{id, .Custom} sdl_text, ok := GLOB.text_cache.cache[key] if ok { - sdl_ttf.DestroyText(sdl_text) + append(&GLOB.pending_text_releases, sdl_text) delete_key(&GLOB.text_cache.cache, key) } } diff --git a/draw/textures.odin b/draw/textures.odin new file mode 100644 index 0000000..64f636d --- /dev/null +++ b/draw/textures.odin @@ -0,0 +1,433 @@ +package draw + +import "core:log" +import "core:mem" +import sdl "vendor:sdl3" + +// --------------------------------------------------------------------------- +// Texture types +// --------------------------------------------------------------------------- + +Texture_Id :: distinct u32 +INVALID_TEXTURE :: Texture_Id(0) // Slot 0 is reserved/unused + +Texture_Kind :: enum u8 { + Static, // Uploaded once, never changes (QR codes, decoded PNGs, icons) + Dynamic, // Updatable via update_texture_region + Stream, // Frequent full re-uploads (video, procedural) +} + +Sampler_Preset :: enum u8 { + Nearest_Clamp, + Linear_Clamp, + Nearest_Repeat, + Linear_Repeat, +} + +SAMPLER_PRESET_COUNT :: 4 + +Fit_Mode :: enum u8 { + Stretch, // Fill rect, may distort aspect ratio (default) + Fit, // Preserve aspect, letterbox (may leave margins) + Fill, // Preserve aspect, center-crop (may crop edges) + Tile, // Repeat at native texture size + Center, // 1:1 pixel size, centered, no scaling +} + +Texture_Desc :: struct { + width: 
u32, + height: u32, + depth_or_layers: u32, + type: sdl.GPUTextureType, + format: sdl.GPUTextureFormat, + usage: sdl.GPUTextureUsageFlags, + mip_levels: u32, + kind: Texture_Kind, +} + +// Internal slot — not exported. +@(private) +Texture_Slot :: struct { + gpu_texture: ^sdl.GPUTexture, + desc: Texture_Desc, + generation: u32, +} + +// State stored in GLOB +// This file references: +// GLOB.device : ^sdl.GPUDevice +// GLOB.texture_slots : [dynamic]Texture_Slot +// GLOB.texture_free_list : [dynamic]u32 +// GLOB.pending_texture_releases : [dynamic]Texture_Id +// GLOB.samplers : [SAMPLER_PRESET_COUNT]^sdl.GPUSampler + +// --------------------------------------------------------------------------- +// Clay integration type +// --------------------------------------------------------------------------- + +Clay_Image_Data :: struct { + texture_id: Texture_Id, + fit: Fit_Mode, + tint: Color, +} + +clay_image_data :: proc(id: Texture_Id, fit: Fit_Mode = .Stretch, tint: Color = WHITE) -> Clay_Image_Data { + return {texture_id = id, fit = fit, tint = tint} +} + +// --------------------------------------------------------------------------- +// Registration +// --------------------------------------------------------------------------- + +// Register a texture. Draw owns the GPU resource and releases it on unregister. +// `data` is tightly-packed row-major bytes matching desc.format. +// The caller may free `data` immediately after this proc returns. 
+@(require_results) +register_texture :: proc(desc: Texture_Desc, data: []u8) -> (id: Texture_Id, ok: bool) { + device := GLOB.device + if device == nil { + log.error("register_texture called before draw.init()") + return INVALID_TEXTURE, false + } + + assert(desc.width > 0, "Texture_Desc.width must be > 0") + assert(desc.height > 0, "Texture_Desc.height must be > 0") + assert(desc.depth_or_layers > 0, "Texture_Desc.depth_or_layers must be > 0") + assert(desc.mip_levels > 0, "Texture_Desc.mip_levels must be > 0") + assert(desc.usage != {}, "Texture_Desc.usage must not be empty (e.g. {.SAMPLER})") + + // Create the GPU texture + gpu_texture := sdl.CreateGPUTexture( + device, + sdl.GPUTextureCreateInfo { + type = desc.type, + format = desc.format, + usage = desc.usage, + width = desc.width, + height = desc.height, + layer_count_or_depth = desc.depth_or_layers, + num_levels = desc.mip_levels, + sample_count = ._1, + }, + ) + if gpu_texture == nil { + log.errorf("Failed to create GPU texture (%dx%d): %s", desc.width, desc.height, sdl.GetError()) + return INVALID_TEXTURE, false + } + + // Upload pixel data via a transfer buffer + if len(data) > 0 { + data_size := u32(len(data)) + transfer := sdl.CreateGPUTransferBuffer( + device, + sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = data_size}, + ) + if transfer == nil { + log.errorf("Failed to create texture transfer buffer: %s", sdl.GetError()) + sdl.ReleaseGPUTexture(device, gpu_texture) + return INVALID_TEXTURE, false + } + defer sdl.ReleaseGPUTransferBuffer(device, transfer) + + mapped := sdl.MapGPUTransferBuffer(device, transfer, false) + if mapped == nil { + log.errorf("Failed to map texture transfer buffer: %s", sdl.GetError()) + sdl.ReleaseGPUTexture(device, gpu_texture) + return INVALID_TEXTURE, false + } + mem.copy(mapped, raw_data(data), int(data_size)) + sdl.UnmapGPUTransferBuffer(device, transfer) + + cmd_buffer := sdl.AcquireGPUCommandBuffer(device) + if cmd_buffer == nil { + log.errorf("Failed to 
acquire command buffer for texture upload: %s", sdl.GetError()) + sdl.ReleaseGPUTexture(device, gpu_texture) + return INVALID_TEXTURE, false + } + copy_pass := sdl.BeginGPUCopyPass(cmd_buffer) + sdl.UploadToGPUTexture( + copy_pass, + sdl.GPUTextureTransferInfo{transfer_buffer = transfer}, + sdl.GPUTextureRegion{texture = gpu_texture, w = desc.width, h = desc.height, d = desc.depth_or_layers}, + false, + ) + sdl.EndGPUCopyPass(copy_pass) + if !sdl.SubmitGPUCommandBuffer(cmd_buffer) { + log.errorf("Failed to submit texture upload: %s", sdl.GetError()) + sdl.ReleaseGPUTexture(device, gpu_texture) + return INVALID_TEXTURE, false + } + } + + // Allocate a slot (reuse from free list or append) + slot_index: u32 + if len(GLOB.texture_free_list) > 0 { + slot_index = pop(&GLOB.texture_free_list) + GLOB.texture_slots[slot_index] = Texture_Slot { + gpu_texture = gpu_texture, + desc = desc, + generation = GLOB.texture_slots[slot_index].generation + 1, + } + } else { + slot_index = u32(len(GLOB.texture_slots)) + append(&GLOB.texture_slots, Texture_Slot{gpu_texture = gpu_texture, desc = desc, generation = 1}) + } + + return Texture_Id(slot_index), true +} + +// Queue a texture for release at the end of the current frame. +// The GPU resource is not freed immediately — see "Deferred release" in the README. +unregister_texture :: proc(id: Texture_Id) { + if id == INVALID_TEXTURE do return + append(&GLOB.pending_texture_releases, id) +} + +// Re-upload a sub-region of a Dynamic texture. 
+update_texture_region :: proc(id: Texture_Id, region: Rectangle, data: []u8) { + if id == INVALID_TEXTURE do return + slot := &GLOB.texture_slots[u32(id)] + if slot.gpu_texture == nil do return + + device := GLOB.device + data_size := u32(len(data)) + if data_size == 0 do return + + transfer := sdl.CreateGPUTransferBuffer( + device, + sdl.GPUTransferBufferCreateInfo{usage = .UPLOAD, size = data_size}, + ) + if transfer == nil { + log.errorf("Failed to create transfer buffer for texture region update: %s", sdl.GetError()) + return + } + defer sdl.ReleaseGPUTransferBuffer(device, transfer) + + mapped := sdl.MapGPUTransferBuffer(device, transfer, false) + if mapped == nil { + log.errorf("Failed to map transfer buffer for texture region update: %s", sdl.GetError()) + return + } + mem.copy(mapped, raw_data(data), int(data_size)) + sdl.UnmapGPUTransferBuffer(device, transfer) + + cmd_buffer := sdl.AcquireGPUCommandBuffer(device) + if cmd_buffer == nil { + log.errorf("Failed to acquire command buffer for texture region update: %s", sdl.GetError()) + return + } + copy_pass := sdl.BeginGPUCopyPass(cmd_buffer) + sdl.UploadToGPUTexture( + copy_pass, + sdl.GPUTextureTransferInfo{transfer_buffer = transfer}, + sdl.GPUTextureRegion { + texture = slot.gpu_texture, + x = u32(region.x), + y = u32(region.y), + w = u32(region.width), + h = u32(region.height), + d = 1, + }, + false, + ) + sdl.EndGPUCopyPass(copy_pass) + if !sdl.SubmitGPUCommandBuffer(cmd_buffer) { + log.errorf("Failed to submit texture region update: %s", sdl.GetError()) + } +} + +// --------------------------------------------------------------------------- +// Accessors +// --------------------------------------------------------------------------- + +texture_size :: proc(id: Texture_Id) -> [2]u32 { + if id == INVALID_TEXTURE do return {0, 0} + slot := &GLOB.texture_slots[u32(id)] + return {slot.desc.width, slot.desc.height} +} + +texture_format :: proc(id: Texture_Id) -> sdl.GPUTextureFormat { + if id == 
INVALID_TEXTURE do return .INVALID + return GLOB.texture_slots[u32(id)].desc.format +} + +texture_kind :: proc(id: Texture_Id) -> Texture_Kind { + if id == INVALID_TEXTURE do return .Static + return GLOB.texture_slots[u32(id)].desc.kind +} + +// Internal: get the raw GPU texture pointer for binding during draw. +@(private) +texture_gpu_handle :: proc(id: Texture_Id) -> ^sdl.GPUTexture { + if id == INVALID_TEXTURE do return nil + idx := u32(id) + if idx >= u32(len(GLOB.texture_slots)) do return nil + return GLOB.texture_slots[idx].gpu_texture +} + +// --------------------------------------------------------------------------- +// Deferred release (called from draw.end / clear_global) +// --------------------------------------------------------------------------- + +@(private) +process_pending_texture_releases :: proc() { + device := GLOB.device + for id in GLOB.pending_texture_releases { + idx := u32(id) + if idx >= u32(len(GLOB.texture_slots)) do continue + slot := &GLOB.texture_slots[idx] + if slot.gpu_texture != nil { + sdl.ReleaseGPUTexture(device, slot.gpu_texture) + slot.gpu_texture = nil + } + slot.generation += 1 + append(&GLOB.texture_free_list, idx) + } + clear(&GLOB.pending_texture_releases) +} + +// --------------------------------------------------------------------------- +// Sampler pool +// --------------------------------------------------------------------------- + +@(private) +get_sampler :: proc(preset: Sampler_Preset) -> ^sdl.GPUSampler { + idx := int(preset) + if GLOB.samplers[idx] != nil do return GLOB.samplers[idx] + + // Lazily create + min_filter, mag_filter: sdl.GPUFilter + address_mode: sdl.GPUSamplerAddressMode + + switch preset { + case .Nearest_Clamp: + min_filter = .NEAREST; mag_filter = .NEAREST; address_mode = .CLAMP_TO_EDGE + case .Linear_Clamp: + min_filter = .LINEAR; mag_filter = .LINEAR; address_mode = .CLAMP_TO_EDGE + case .Nearest_Repeat: + min_filter = .NEAREST; mag_filter = .NEAREST; address_mode = .REPEAT + case 
.Linear_Repeat: + min_filter = .LINEAR; mag_filter = .LINEAR; address_mode = .REPEAT + } + + sampler := sdl.CreateGPUSampler( + GLOB.device, + sdl.GPUSamplerCreateInfo { + min_filter = min_filter, + mag_filter = mag_filter, + mipmap_mode = .LINEAR, + address_mode_u = address_mode, + address_mode_v = address_mode, + address_mode_w = address_mode, + }, + ) + if sampler == nil { + log.errorf("Failed to create sampler preset %v: %s", preset, sdl.GetError()) + return GLOB.pipeline_2d_base.sampler // fallback to existing default sampler + } + + GLOB.samplers[idx] = sampler + return sampler +} + +// Internal: destroy all sampler pool entries. Called from draw.destroy(). +@(private) +destroy_sampler_pool :: proc() { + device := GLOB.device + for &s in GLOB.samplers { + if s != nil { + sdl.ReleaseGPUSampler(device, s) + s = nil + } + } +} + +// Internal: destroy all registered textures. Called from draw.destroy(). +@(private) +destroy_all_textures :: proc() { + device := GLOB.device + for &slot in GLOB.texture_slots { + if slot.gpu_texture != nil { + sdl.ReleaseGPUTexture(device, slot.gpu_texture) + slot.gpu_texture = nil + } + } + delete(GLOB.texture_slots) + delete(GLOB.texture_free_list) + delete(GLOB.pending_texture_releases) +} + +// --------------------------------------------------------------------------- +// Fit mode helper +// --------------------------------------------------------------------------- + +// Compute UV rect, recommended sampler, and inner rect for a given fit mode. +// `rect` is the target drawing area; `texture_id` identifies the texture whose +// pixel dimensions are looked up via texture_size(). +// For Fit mode, `inner_rect` is smaller than `rect` (centered). For all other modes, `inner_rect == rect`. 
+fit_params :: proc( + fit: Fit_Mode, + rect: Rectangle, + texture_id: Texture_Id, +) -> ( + uv_rect: Rectangle, + sampler: Sampler_Preset, + inner_rect: Rectangle, +) { + size := texture_size(texture_id) + texture_width := f32(size.x) + texture_height := f32(size.y) + rect_width := rect.width + rect_height := rect.height + inner_rect = rect + + if texture_width == 0 || texture_height == 0 || rect_width == 0 || rect_height == 0 { + return {0, 0, 1, 1}, .Linear_Clamp, inner_rect + } + + texture_aspect := texture_width / texture_height + rect_aspect := rect_width / rect_height + + switch fit { + case .Stretch: return {0, 0, 1, 1}, .Linear_Clamp, inner_rect + + case .Fill: if texture_aspect > rect_aspect { + // Texture wider than rect — crop sides + scale := rect_aspect / texture_aspect + margin := (1 - scale) * 0.5 + return {margin, 0, 1 - margin, 1}, .Linear_Clamp, inner_rect + } else { + // Texture taller than rect — crop top/bottom + scale := texture_aspect / rect_aspect + margin := (1 - scale) * 0.5 + return {0, margin, 1, 1 - margin}, .Linear_Clamp, inner_rect + } + + case .Fit: + // Preserve aspect, fit inside rect. Returns a shrunken inner_rect. 
+ if texture_aspect > rect_aspect { + // Image wider — letterbox top/bottom + fit_height := rect_width / texture_aspect + padding := (rect_height - fit_height) * 0.5 + inner_rect = Rectangle{rect.x, rect.y + padding, rect_width, fit_height} + } else { + // Image taller — letterbox left/right + fit_width := rect_height * texture_aspect + padding := (rect_width - fit_width) * 0.5 + inner_rect = Rectangle{rect.x + padding, rect.y, fit_width, rect_height} + } + return {0, 0, 1, 1}, .Linear_Clamp, inner_rect + + case .Tile: + uv_width := rect_width / texture_width + uv_height := rect_height / texture_height + return {0, 0, uv_width, uv_height}, .Linear_Repeat, inner_rect + + case .Center: + u_half := rect_width / (2 * texture_width) + v_half := rect_height / (2 * texture_height) + return {0.5 - u_half, 0.5 - v_half, 0.5 + u_half, 0.5 + v_half}, .Nearest_Clamp, inner_rect + } + + return {0, 0, 1, 1}, .Linear_Clamp, inner_rect +} -- 2.43.0 From ba522fa051143ce15a1371918ee043e902ac07e2 Mon Sep 17 00:00:00 2001 From: Zachary Levy Date: Tue, 21 Apr 2026 15:35:55 -0700 Subject: [PATCH 3/5] QR code improvements --- draw/draw_qr/draw_qr.odin | 216 +++++++++++++++++++++++++++--------- draw/examples/textures.odin | 12 +- qrcode/generate.odin | 128 +++++++++++++-------- 3 files changed, 247 insertions(+), 109 deletions(-) diff --git a/draw/draw_qr/draw_qr.odin b/draw/draw_qr/draw_qr.odin index 9fb3a0f..e5b1d84 100644 --- a/draw/draw_qr/draw_qr.odin +++ b/draw/draw_qr/draw_qr.odin @@ -3,76 +3,188 @@ package draw_qr import draw ".." import "../../qrcode" -// A registered QR code texture, ready for display via draw.rectangle_texture. -QR :: struct { - texture_id: draw.Texture_Id, - size: int, // modules per side (e.g. 
21..177) +// ----------------------------------------------------------------------------- +// Layer 1 — pure: encoded QR buffer → RGBA pixels + descriptor +// ----------------------------------------------------------------------------- + +// Returns the number of bytes to_texture will write for the given encoded +// QR buffer. Equivalent to size*size*4 where size = qrcode.get_size(qrcode_buf). +texture_size :: #force_inline proc(qrcode_buf: []u8) -> int { + size := qrcode.get_size(qrcode_buf) + return size * size * 4 } -// Encode text as a QR code and register the result as an R8 texture. -// The texture uses Nearest_Clamp sampling by default (sharp module edges). -// Returns ok=false if encoding or registration fails. +// Decodes an encoded QR buffer into tightly-packed RGBA pixel data written to +// texture_buf. No allocations, no GPU calls. Returns the Texture_Desc the +// caller should pass to draw.register_texture alongside texture_buf. +// +// Returns ok=false when: +// - qrcode_buf is invalid (qrcode.get_size returns 0). +// - texture_buf is smaller than texture_size(qrcode_buf). 
@(require_results) -create_from_text :: proc( +to_texture :: proc( + qrcode_buf: []u8, + texture_buf: []u8, + dark: draw.Color = draw.BLACK, + light: draw.Color = draw.WHITE, +) -> ( + desc: draw.Texture_Desc, + ok: bool, +) { + size := qrcode.get_size(qrcode_buf) + if size == 0 do return {}, false + if len(texture_buf) < size * size * 4 do return {}, false + + for y in 0 ..< size { + for x in 0 ..< size { + i := (y * size + x) * 4 + c := dark if qrcode.get_module(qrcode_buf, x, y) else light + texture_buf[i + 0] = c[0] + texture_buf[i + 1] = c[1] + texture_buf[i + 2] = c[2] + texture_buf[i + 3] = c[3] + } + } + + return draw.Texture_Desc { + width = u32(size), + height = u32(size), + depth_or_layers = 1, + type = .D2, + format = .R8G8B8A8_UNORM, + usage = {.SAMPLER}, + mip_levels = 1, + kind = .Static, + }, + true +} + +// ----------------------------------------------------------------------------- +// Layer 2 — raw: pre-encoded QR buffer → registered GPU texture +// ----------------------------------------------------------------------------- + +// Allocates pixel buffer via temp_allocator, decodes qrcode_buf into it, and +// registers with the GPU. The pixel allocation is freed before return. +// +// Returns ok=false when: +// - qrcode_buf is invalid (qrcode.get_size returns 0). +// - temp_allocator fails to allocate the pixel buffer. +// - GPU texture registration fails. 
+@(require_results) +register_texture_from_raw :: proc( + qrcode_buf: []u8, + dark: draw.Color = draw.BLACK, + light: draw.Color = draw.WHITE, + temp_allocator := context.temp_allocator, +) -> ( + texture: draw.Texture_Id, + ok: bool, +) { + tex_size := texture_size(qrcode_buf) + if tex_size == 0 do return draw.INVALID_TEXTURE, false + + pixels, alloc_err := make([]u8, tex_size, temp_allocator) + if alloc_err != nil do return draw.INVALID_TEXTURE, false + defer delete(pixels, temp_allocator) + + desc := to_texture(qrcode_buf, pixels, dark, light) or_return + return draw.register_texture(desc, pixels) +} + +// ----------------------------------------------------------------------------- +// Layer 3 — text → registered GPU texture +// ----------------------------------------------------------------------------- + +// Encodes text as a QR Code and registers the result as an RGBA texture. +// +// Returns ok=false when: +// - temp_allocator fails to allocate. +// - The text cannot fit in any version within [min_version, max_version] at the given ECL. +// - GPU texture registration fails. 
+@(require_results) +register_texture_from_text :: proc( text: string, ecl: qrcode.Ecc = .Low, min_version: int = qrcode.VERSION_MIN, max_version: int = qrcode.VERSION_MAX, mask: Maybe(qrcode.Mask) = nil, boost_ecl: bool = true, + dark: draw.Color = draw.BLACK, + light: draw.Color = draw.WHITE, + temp_allocator := context.temp_allocator, ) -> ( - qr: QR, + texture: draw.Texture_Id, ok: bool, ) { - qrcode_buf: [qrcode.BUFFER_LEN_MAX]u8 - encode_ok := qrcode.encode(text, qrcode_buf[:], ecl, min_version, max_version, mask, boost_ecl) - if !encode_ok do return {}, false - return create(qrcode_buf[:]) + qrcode_buf, alloc_err := make([]u8, qrcode.buffer_len_for_version(max_version), temp_allocator) + if alloc_err != nil do return draw.INVALID_TEXTURE, false + defer delete(qrcode_buf, temp_allocator) + + qrcode.encode_auto( + text, + qrcode_buf, + ecl, + min_version, + max_version, + mask, + boost_ecl, + temp_allocator, + ) or_return + + return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator) } -// Register an already-encoded QR code buffer as an R8 texture. -// qrcode_buf must be the output of qrcode.encode (byte 0 = side length, remaining = bit-packed modules). +// ----------------------------------------------------------------------------- +// Layer 4 — binary → registered GPU texture +// ----------------------------------------------------------------------------- + +// Encodes arbitrary binary data as a QR Code and registers the result as an RGBA texture. +// +// Returns ok=false when: +// - temp_allocator fails to allocate. +// - The payload cannot fit in any version within [min_version, max_version] at the given ECL. +// - GPU texture registration fails. 
@(require_results) -create :: proc(qrcode_buf: []u8) -> (qr: QR, ok: bool) { - size := qrcode.get_size(qrcode_buf) - if size == 0 do return {}, false +register_texture_from_binary :: proc( + bin_data: []u8, + ecl: qrcode.Ecc = .Low, + min_version: int = qrcode.VERSION_MIN, + max_version: int = qrcode.VERSION_MAX, + mask: Maybe(qrcode.Mask) = nil, + boost_ecl: bool = true, + dark: draw.Color = draw.BLACK, + light: draw.Color = draw.WHITE, + temp_allocator := context.temp_allocator, +) -> ( + texture: draw.Texture_Id, + ok: bool, +) { + qrcode_buf, alloc_err := make([]u8, qrcode.buffer_len_for_version(max_version), temp_allocator) + if alloc_err != nil do return draw.INVALID_TEXTURE, false + defer delete(qrcode_buf, temp_allocator) - // Build R8 pixel buffer: 0 = light, 255 = dark - pixels := make([]u8, size * size, context.temp_allocator) - for y in 0 ..< size { - for x in 0 ..< size { - pixels[y * size + x] = 255 if qrcode.get_module(qrcode_buf, x, y) else 0 - } - } + qrcode.encode_auto( + bin_data, + qrcode_buf, + ecl, + min_version, + max_version, + mask, + boost_ecl, + temp_allocator, + ) or_return - id, reg_ok := draw.register_texture( - draw.Texture_Desc { - width = u32(size), - height = u32(size), - depth_or_layers = 1, - type = .D2, - format = .R8_UNORM, - usage = {.SAMPLER}, - mip_levels = 1, - kind = .Static, - }, - pixels, - ) - if !reg_ok do return {}, false - - return QR{texture_id = id, size = size}, true + return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator) } -// Release the GPU texture. -destroy :: proc(qr: ^QR) { - draw.unregister_texture(qr.texture_id) - qr.texture_id = draw.INVALID_TEXTURE - qr.size = 0 -} +// ----------------------------------------------------------------------------- +// Clay integration helper +// ----------------------------------------------------------------------------- -// Convenience: build a Clay_Image_Data for embedding a QR in Clay layouts. 
-// Uses Nearest_Clamp sampling (set via Sampler_Preset at draw time, not here) and Fit mode -// to preserve the QR's square aspect ratio. -clay_image :: proc(qr: QR, tint: draw.Color = draw.WHITE) -> draw.Clay_Image_Data { - return draw.clay_image_data(qr.texture_id, fit = .Fit, tint = tint) +// Default fit=.Fit preserves the QR's square aspect; override as needed. +clay_image :: #force_inline proc( + texture: draw.Texture_Id, + tint: draw.Color = draw.WHITE, +) -> draw.Clay_Image_Data { + return draw.clay_image_data(texture, fit = .Fit, tint = tint) } diff --git a/draw/examples/textures.odin b/draw/examples/textures.odin index ca53ba3..a89be7d 100644 --- a/draw/examples/textures.odin +++ b/draw/examples/textures.odin @@ -79,8 +79,8 @@ textures :: proc() { // ------------------------------------------------------------------------- // QR code texture (R8_UNORM — see rendering note below) // ------------------------------------------------------------------------- - qr, _ := draw_qr.create_from_text("https://odin-lang.org/") - defer draw_qr.destroy(&qr) + qr_texture, _ := draw_qr.register_texture_from_text("https://x.com/miiilato/status/1880241066471051443") + defer draw.unregister_texture(qr_texture) spin_angle: f32 = 0 @@ -161,16 +161,12 @@ textures :: proc() { // ===================================================================== ROW2_Y :: f32(190) - // QR code (R8_UNORM texture, nearest sampling) - // NOTE: R8_UNORM samples as (r, 0, 0, 1) in Metal's default swizzle. - // With WHITE tint: dark modules (R=1) → red, light modules (R=0) → black. - // The result is a red-on-black QR code. The white bg rect below is - // occluded by the fully-opaque texture but kept for illustration. 
+ // QR code (RGBA texture with baked colors, nearest sampling) draw.rectangle(base_layer, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, {255, 255, 255, 255}) // white bg draw.rectangle_texture( base_layer, {COL1, ROW2_Y, ITEM_SIZE, ITEM_SIZE}, - qr.texture_id, + qr_texture, sampler = .Nearest_Clamp, ) draw.text( diff --git a/qrcode/generate.odin b/qrcode/generate.odin index 9014b56..8261021 100644 --- a/qrcode/generate.odin +++ b/qrcode/generate.odin @@ -117,7 +117,7 @@ NUM_ERROR_CORRECTION_BLOCKS := [4][41]i8{ // - The text cannot fit in any version within [min_version, max_version] at the given ECL. // - The encoded segment data exceeds the buffer capacity. @(require_results) -encode_text_explicit_temp :: proc( +encode_text_manual :: proc( text: string, temp_buffer, qrcode: []u8, ecl: Ecc, @@ -130,7 +130,7 @@ encode_text_explicit_temp :: proc( ) { text_len := len(text) if text_len == 0 { - return encode_segments_advanced_explicit_temp( + return encode_segments_advanced_manual( nil, ecl, min_version, @@ -162,7 +162,7 @@ encode_text_explicit_temp :: proc( seg.data = temp_buffer[:text_len] } segs := [1]Segment{seg} - return encode_segments_advanced_explicit_temp( + return encode_segments_advanced_manual( segs[:], ecl, min_version, @@ -211,13 +211,9 @@ encode_text_auto :: proc( return false } defer delete(temp_buffer, temp_allocator) - return encode_text_explicit_temp(text, temp_buffer, qrcode, ecl, min_version, max_version, mask, boost_ecl) + return encode_text_manual(text, temp_buffer, qrcode, ecl, min_version, max_version, mask, boost_ecl) } -encode_text :: proc { - encode_text_explicit_temp, - encode_text_auto, -} // Encodes arbitrary binary data to a QR Code using byte mode. // @@ -234,7 +230,7 @@ encode_text :: proc { // Returns ok=false when: // - The payload cannot fit in any version within [min_version, max_version] at the given ECL. 
@(require_results) -encode_binary :: proc( +encode_binary_manual :: proc( data_and_temp: []u8, data_len: int, qrcode: []u8, @@ -256,7 +252,7 @@ encode_binary :: proc( seg.num_chars = data_len seg.data = data_and_temp[:data_len] segs := [1]Segment{seg} - return encode_segments_advanced( + return encode_segments_advanced_manual( segs[:], ecl, min_version, @@ -268,6 +264,55 @@ encode_binary :: proc( ) } +// Encodes arbitrary binary data to a QR Code using byte mode, +// automatically allocating and freeing the temp buffer. +// +// Parameters: +// bin_data - [in] Payload bytes (aliased by the internal segment; not modified). +// qrcode - [out] On success, contains the encoded QR Code. On failure, qrcode[0] is +// set to 0. +// temp_allocator - Allocator used for the internal scratch buffer. Freed before return. +// +// qrcode must have length >= buffer_len_for_version(max_version). +// +// Returns ok=false when: +// - The payload cannot fit in any version within [min_version, max_version] at the given ECL. +// - The temp_allocator fails to allocate. +@(require_results) +encode_binary_auto :: proc( + bin_data: []u8, + qrcode: []u8, + ecl: Ecc, + min_version: int = VERSION_MIN, + max_version: int = VERSION_MAX, + mask: Maybe(Mask) = nil, + boost_ecl: bool = true, + temp_allocator := context.temp_allocator, +) -> ( + ok: bool, +) { + seg: Segment + seg.mode = .Byte + seg.bit_length = calc_segment_bit_length(.Byte, len(bin_data)) + if seg.bit_length == LENGTH_OVERFLOW { + qrcode[0] = 0 + return false + } + seg.num_chars = len(bin_data) + seg.data = bin_data + segs := [1]Segment{seg} + return encode_segments_advanced_auto( + segs[:], + ecl, + min_version, + max_version, + mask, + boost_ecl, + qrcode, + temp_allocator, + ) +} + // Encodes the given segments to a QR Code using default parameters // (VERSION_MIN..VERSION_MAX, auto mask, boost ECL). 
// @@ -282,17 +327,8 @@ encode_binary :: proc( // Returns ok=false when: // - The total segment data exceeds the capacity of version 40 at the given ECL. @(require_results) -encode_segments_explicit_temp :: proc(segs: []Segment, ecl: Ecc, temp_buffer, qrcode: []u8) -> (ok: bool) { - return encode_segments_advanced_explicit_temp( - segs, - ecl, - VERSION_MIN, - VERSION_MAX, - nil, - true, - temp_buffer, - qrcode, - ) +encode_segments_manual :: proc(segs: []Segment, ecl: Ecc, temp_buffer, qrcode: []u8) -> (ok: bool) { + return encode_segments_advanced_manual(segs, ecl, VERSION_MIN, VERSION_MAX, nil, true, temp_buffer, qrcode) } // Encodes segments to a QR Code using default parameters, automatically allocating the temp buffer. @@ -328,13 +364,9 @@ encode_segments_auto :: proc( return false } defer delete(temp_buffer, temp_allocator) - return encode_segments_explicit_temp(segs, ecl, temp_buffer, qrcode) + return encode_segments_manual(segs, ecl, temp_buffer, qrcode) } -encode_segments :: proc { - encode_segments_explicit_temp, - encode_segments_auto, -} // Encodes the given segments to a QR Code with full control over version range, mask, and ECL boosting. // @@ -353,7 +385,7 @@ encode_segments :: proc { // - The total segment data exceeds the capacity of every version in [min_version, max_version] // at the given ECL. 
@(require_results) -encode_segments_advanced_explicit_temp :: proc( +encode_segments_advanced_manual :: proc( segs: []Segment, ecl: Ecc, min_version, max_version: int, @@ -490,7 +522,7 @@ encode_segments_advanced_auto :: proc( return false } defer delete(temp_buffer, temp_allocator) - return encode_segments_advanced_explicit_temp( + return encode_segments_advanced_manual( segs, ecl, min_version, @@ -502,18 +534,17 @@ encode_segments_advanced_auto :: proc( ) } -encode_segments_advanced :: proc { - encode_segments_advanced_explicit_temp, - encode_segments_advanced_auto, +encode_manual :: proc { + encode_text_manual, + encode_binary_manual, + encode_segments_manual, + encode_segments_advanced_manual, } -encode :: proc { - encode_text_explicit_temp, +encode_auto :: proc { encode_text_auto, - encode_binary, - encode_segments_explicit_temp, + encode_binary_auto, encode_segments_auto, - encode_segments_advanced_explicit_temp, encode_segments_advanced_auto, } @@ -981,7 +1012,7 @@ min_buffer_size :: proc { min_buffer_size_segments, } -// Text path: auto-selects numeric/alphanumeric/byte mode the same way encode_text does. +// Text path: auto-selects numeric/alphanumeric/byte mode the same way encode_text_manual does. // // Returns ok=false when: // - The text exceeds QR Code capacity for every version in the range at the given ECL. 
@@ -1162,7 +1193,6 @@ calc_segment_buffer_size :: proc(mode: Mode, num_chars: int) -> int { return (temp + 7) / 8 } -@(private) calc_segment_bit_length :: proc(mode: Mode, num_chars: int) -> int { if num_chars < 0 || num_chars > 32767 { return LENGTH_OVERFLOW @@ -2487,7 +2517,7 @@ test_min_buffer_size_text :: proc(t: ^testing.T) { testing.expect(t, planned > 0) qrcode: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok := encode_text(text, temp[:], qrcode[:], Ecc.Low) + ok := encode_text_manual(text, temp[:], qrcode[:], Ecc.Low) testing.expect(t, ok) actual_version_size := get_size(qrcode[:]) actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4) @@ -2538,7 +2568,7 @@ test_min_buffer_size_binary :: proc(t: ^testing.T) { testing.expect(t, size > 0) testing.expect(t, size <= buffer_len_for_version(2)) - // Verify agreement with encode_binary + // Verify agreement with encode_binary_manual { data_len :: 100 planned, planned_ok := min_buffer_size(data_len, .Medium) @@ -2549,7 +2579,7 @@ test_min_buffer_size_binary :: proc(t: ^testing.T) { for i in 0 ..< data_len { dat[i] = u8(i) } - ok := encode_binary(dat[:], data_len, qrcode[:], .Medium) + ok := encode_binary_manual(dat[:], data_len, qrcode[:], .Medium) testing.expect(t, ok) actual_version_size := get_size(qrcode[:]) actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4) @@ -2609,7 +2639,7 @@ test_min_buffer_size_segments :: proc(t: ^testing.T) { // Verify against actual encode qrcode: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok := encode_segments(segs[:], Ecc.Low, temp[:], qrcode[:]) + ok := encode_segments_manual(segs[:], Ecc.Low, temp[:], qrcode[:]) testing.expect(t, ok) actual_version_size := get_size(qrcode[:]) actual_buf_len := buffer_len_for_version((actual_version_size - 17) / 4) @@ -2631,7 +2661,7 @@ test_encode_text_auto :: proc(t: ^testing.T) { text :: "Hello, world!" 
qr_explicit: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok_explicit := encode_text_explicit_temp(text, temp[:], qr_explicit[:], .Low) + ok_explicit := encode_text_manual(text, temp[:], qr_explicit[:], .Low) testing.expect(t, ok_explicit) qr_auto: [BUFFER_LEN_MAX]u8 @@ -2650,7 +2680,7 @@ test_encode_text_auto :: proc(t: ^testing.T) { text :: "314159265358979323846264338327950288419716939937510" qr_explicit: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok_explicit := encode_text_explicit_temp(text, temp[:], qr_explicit[:], .Medium) + ok_explicit := encode_text_manual(text, temp[:], qr_explicit[:], .Medium) testing.expect(t, ok_explicit) qr_auto: [BUFFER_LEN_MAX]u8 @@ -2669,7 +2699,7 @@ test_encode_text_auto :: proc(t: ^testing.T) { text :: "HELLO WORLD" qr_explicit: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok_explicit := encode_text_explicit_temp(text, temp[:], qr_explicit[:], .Quartile) + ok_explicit := encode_text_manual(text, temp[:], qr_explicit[:], .Quartile) testing.expect(t, ok_explicit) qr_auto: [BUFFER_LEN_MAX]u8 @@ -2695,7 +2725,7 @@ test_encode_text_auto :: proc(t: ^testing.T) { text :: "https://www.nayuki.io/" qr_explicit: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok_explicit := encode_text_explicit_temp(text, temp[:], qr_explicit[:], .High, mask = .M3) + ok_explicit := encode_text_manual(text, temp[:], qr_explicit[:], .High, mask = .M3) testing.expect(t, ok_explicit) qr_auto: [BUFFER_LEN_MAX]u8 @@ -2732,7 +2762,7 @@ test_encode_segments_auto :: proc(t: ^testing.T) { qr_explicit: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok_explicit := encode_segments_explicit_temp(segs[:], .Low, temp[:], qr_explicit[:]) + ok_explicit := encode_segments_manual(segs[:], .Low, temp[:], qr_explicit[:]) testing.expect(t, ok_explicit) qr_auto: [BUFFER_LEN_MAX]u8 @@ -2764,7 +2794,7 @@ test_encode_segments_advanced_auto :: proc(t: ^testing.T) { qr_explicit: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok_explicit := encode_segments_advanced_explicit_temp( + 
ok_explicit := encode_segments_advanced_manual( segs[:], .Medium, VERSION_MIN, @@ -2795,7 +2825,7 @@ test_encode_segments_advanced_auto :: proc(t: ^testing.T) { qr_explicit: [BUFFER_LEN_MAX]u8 temp: [BUFFER_LEN_MAX]u8 - ok_explicit := encode_segments_advanced_explicit_temp( + ok_explicit := encode_segments_advanced_manual( segs[:], .High, 1, -- 2.43.0 From 7650b90d911dbe8cda13db1c522a89df4075a95d Mon Sep 17 00:00:00 2001 From: Zachary Levy Date: Tue, 21 Apr 2026 16:09:40 -0700 Subject: [PATCH 4/5] Comment cleanup --- draw/draw_qr/draw_qr.odin | 20 -------------------- 1 file changed, 20 deletions(-) diff --git a/draw/draw_qr/draw_qr.odin b/draw/draw_qr/draw_qr.odin index e5b1d84..3567092 100644 --- a/draw/draw_qr/draw_qr.odin +++ b/draw/draw_qr/draw_qr.odin @@ -3,10 +3,6 @@ package draw_qr import draw ".." import "../../qrcode" -// ----------------------------------------------------------------------------- -// Layer 1 — pure: encoded QR buffer → RGBA pixels + descriptor -// ----------------------------------------------------------------------------- - // Returns the number of bytes to_texture will write for the given encoded // QR buffer. Equivalent to size*size*4 where size = qrcode.get_size(qrcode_buf). texture_size :: #force_inline proc(qrcode_buf: []u8) -> int { @@ -59,10 +55,6 @@ to_texture :: proc( true } -// ----------------------------------------------------------------------------- -// Layer 2 — raw: pre-encoded QR buffer → registered GPU texture -// ----------------------------------------------------------------------------- - // Allocates pixel buffer via temp_allocator, decodes qrcode_buf into it, and // registers with the GPU. The pixel allocation is freed before return. 
// @@ -91,10 +83,6 @@ register_texture_from_raw :: proc( return draw.register_texture(desc, pixels) } -// ----------------------------------------------------------------------------- -// Layer 3 — text → registered GPU texture -// ----------------------------------------------------------------------------- - // Encodes text as a QR Code and registers the result as an RGBA texture. // // Returns ok=false when: @@ -134,10 +122,6 @@ register_texture_from_text :: proc( return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator) } -// ----------------------------------------------------------------------------- -// Layer 4 — binary → registered GPU texture -// ----------------------------------------------------------------------------- - // Encodes arbitrary binary data as a QR Code and registers the result as an RGBA texture. // // Returns ok=false when: @@ -177,10 +161,6 @@ register_texture_from_binary :: proc( return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator) } -// ----------------------------------------------------------------------------- -// Clay integration helper -// ----------------------------------------------------------------------------- - // Default fit=.Fit preserves the QR's square aspect; override as needed. 
clay_image :: #force_inline proc( texture: draw.Texture_Id, -- 2.43.0 From ea19b83ba4054448c76290623b16d0308e7a4dcf Mon Sep 17 00:00:00 2001 From: Zachary Levy Date: Tue, 21 Apr 2026 16:16:51 -0700 Subject: [PATCH 5/5] Cleanup --- draw/draw_qr/draw_qr.odin | 5 + draw/examples/textures.odin | 31 ++--- draw/textures.odin | 251 ++++++++++++++++------------------ qrcode/examples/examples.odin | 80 ++++------- qrcode/generate.odin | 115 +++++++--------- 5 files changed, 211 insertions(+), 271 deletions(-) diff --git a/draw/draw_qr/draw_qr.odin b/draw/draw_qr/draw_qr.odin index 3567092..91cf532 100644 --- a/draw/draw_qr/draw_qr.odin +++ b/draw/draw_qr/draw_qr.odin @@ -161,6 +161,11 @@ register_texture_from_binary :: proc( return register_texture_from_raw(qrcode_buf, dark, light, temp_allocator) } +register_texture_from :: proc { + register_texture_from_text, + register_texture_from_binary, +} + // Default fit=.Fit preserves the QR's square aspect; override as needed. clay_image :: #force_inline proc( texture: draw.Texture_Id, diff --git a/draw/examples/textures.odin b/draw/examples/textures.odin index a89be7d..b49b33a 100644 --- a/draw/examples/textures.odin +++ b/draw/examples/textures.odin @@ -2,7 +2,6 @@ package examples import "../../draw" import "../../draw/draw_qr" -import "core:math" import "core:os" import sdl "vendor:sdl3" @@ -17,9 +16,8 @@ textures :: proc() { FONT_SIZE :: u16(14) LABEL_OFFSET :: f32(8) // gap between item and its label - // ------------------------------------------------------------------------- - // Procedural checkerboard texture (8x8, RGBA8) - // ------------------------------------------------------------------------- + //----- Texture registration ---------------------------------- + checker_size :: 8 checker_pixels: [checker_size * checker_size * 4]u8 for y in 0 ..< checker_size { @@ -47,9 +45,6 @@ textures :: proc() { ) defer draw.unregister_texture(checker_texture) - //
------------------------------------------------------------------------- - // Non-square gradient stripe texture (16x8, RGBA8) for fit mode demos - // ------------------------------------------------------------------------- stripe_w :: 16 stripe_h :: 8 stripe_pixels: [stripe_w * stripe_h * 4]u8 @@ -76,14 +71,13 @@ textures :: proc() { ) defer draw.unregister_texture(stripe_texture) - // ------------------------------------------------------------------------- - // QR code texture (R8_UNORM — see rendering note below) - // ------------------------------------------------------------------------- - qr_texture, _ := draw_qr.register_texture_from_text("https://x.com/miiilato/status/1880241066471051443") + qr_texture, _ := draw_qr.register_texture_from("https://x.com/miiilato/status/1880241066471051443") defer draw.unregister_texture(qr_texture) spin_angle: f32 = 0 + //----- Draw loop ---------------------------------- + for { defer free_all(context.temp_allocator) ev: sdl.Event @@ -97,9 +91,8 @@ textures :: proc() { // Background draw.rectangle(base_layer, {0, 0, 800, 600}, {30, 30, 30, 255}) - // ===================================================================== - // Row 1: Sampler presets (y=30) - // ===================================================================== + //----- Row 1: Sampler presets (y=30) ---------------------------------- + ROW1_Y :: f32(30) ITEM_SIZE :: f32(120) COL1 :: f32(30) @@ -156,9 +149,8 @@ textures :: proc() { color = draw.WHITE, ) - // ===================================================================== - // Row 2: QR code, Rounded, Rotating (y=190) - // ===================================================================== + //----- Row 2: QR code, Rounded, Rotating (y=190) ---------------------------------- + ROW2_Y :: f32(190) // QR code (RGBA texture with baked colors, nearest sampling) @@ -214,9 +206,8 @@ textures :: proc() { color = draw.WHITE, ) - // ===================================================================== - // Row 3: Fit
modes + Per-corner radii (y=360) - // ===================================================================== + //----- Row 3: Fit modes + Per-corner radii (y=360) ---------------------------------- + ROW3_Y :: f32(360) FIT_SIZE :: f32(120) // square target rect diff --git a/draw/textures.odin b/draw/textures.odin index 64f636d..b9e5b31 100644 --- a/draw/textures.odin +++ b/draw/textures.odin @@ -4,10 +4,6 @@ import "core:log" import "core:mem" import sdl "vendor:sdl3" -// --------------------------------------------------------------------------- -// Texture types -// --------------------------------------------------------------------------- - Texture_Id :: distinct u32 INVALID_TEXTURE :: Texture_Id(0) // Slot 0 is reserved/unused @@ -61,10 +57,6 @@ Texture_Slot :: struct { // GLOB.pending_texture_releases : [dynamic]Texture_Id // GLOB.samplers : [SAMPLER_PRESET_COUNT]^sdl.GPUSampler -// --------------------------------------------------------------------------- -// Clay integration type -// --------------------------------------------------------------------------- - Clay_Image_Data :: struct { texture_id: Texture_Id, fit: Fit_Mode, @@ -75,9 +67,9 @@ clay_image_data :: proc(id: Texture_Id, fit: Fit_Mode = .Stretch, tint: Color = return {texture_id = id, fit = fit, tint = tint} } -// --------------------------------------------------------------------------- -// Registration -// --------------------------------------------------------------------------- +// --------------------------------------------------------------------------------------------------------------------- +// ----- Registration ------------- +// --------------------------------------------------------------------------------------------------------------------- // Register a texture. Draw owns the GPU resource and releases it on unregister. // `data` is tightly-packed row-major bytes matching desc.format. 
@@ -236,130 +228,9 @@ update_texture_region :: proc(id: Texture_Id, region: Rectangle, data: []u8) { } } -// --------------------------------------------------------------------------- -// Accessors -// --------------------------------------------------------------------------- - -texture_size :: proc(id: Texture_Id) -> [2]u32 { - if id == INVALID_TEXTURE do return {0, 0} - slot := &GLOB.texture_slots[u32(id)] - return {slot.desc.width, slot.desc.height} -} - -texture_format :: proc(id: Texture_Id) -> sdl.GPUTextureFormat { - if id == INVALID_TEXTURE do return .INVALID - return GLOB.texture_slots[u32(id)].desc.format -} - -texture_kind :: proc(id: Texture_Id) -> Texture_Kind { - if id == INVALID_TEXTURE do return .Static - return GLOB.texture_slots[u32(id)].desc.kind -} - -// Internal: get the raw GPU texture pointer for binding during draw. -@(private) -texture_gpu_handle :: proc(id: Texture_Id) -> ^sdl.GPUTexture { - if id == INVALID_TEXTURE do return nil - idx := u32(id) - if idx >= u32(len(GLOB.texture_slots)) do return nil - return GLOB.texture_slots[idx].gpu_texture -} - -// --------------------------------------------------------------------------- -// Deferred release (called from draw.end / clear_global) -// --------------------------------------------------------------------------- - -@(private) -process_pending_texture_releases :: proc() { - device := GLOB.device - for id in GLOB.pending_texture_releases { - idx := u32(id) - if idx >= u32(len(GLOB.texture_slots)) do continue - slot := &GLOB.texture_slots[idx] - if slot.gpu_texture != nil { - sdl.ReleaseGPUTexture(device, slot.gpu_texture) - slot.gpu_texture = nil - } - slot.generation += 1 - append(&GLOB.texture_free_list, idx) - } - clear(&GLOB.pending_texture_releases) -} - -// --------------------------------------------------------------------------- -// Sampler pool -// --------------------------------------------------------------------------- - -@(private) -get_sampler :: proc(preset: 
Sampler_Preset) -> ^sdl.GPUSampler { - idx := int(preset) - if GLOB.samplers[idx] != nil do return GLOB.samplers[idx] - - // Lazily create - min_filter, mag_filter: sdl.GPUFilter - address_mode: sdl.GPUSamplerAddressMode - - switch preset { - case .Nearest_Clamp: - min_filter = .NEAREST; mag_filter = .NEAREST; address_mode = .CLAMP_TO_EDGE - case .Linear_Clamp: - min_filter = .LINEAR; mag_filter = .LINEAR; address_mode = .CLAMP_TO_EDGE - case .Nearest_Repeat: - min_filter = .NEAREST; mag_filter = .NEAREST; address_mode = .REPEAT - case .Linear_Repeat: - min_filter = .LINEAR; mag_filter = .LINEAR; address_mode = .REPEAT - } - - sampler := sdl.CreateGPUSampler( - GLOB.device, - sdl.GPUSamplerCreateInfo { - min_filter = min_filter, - mag_filter = mag_filter, - mipmap_mode = .LINEAR, - address_mode_u = address_mode, - address_mode_v = address_mode, - address_mode_w = address_mode, - }, - ) - if sampler == nil { - log.errorf("Failed to create sampler preset %v: %s", preset, sdl.GetError()) - return GLOB.pipeline_2d_base.sampler // fallback to existing default sampler - } - - GLOB.samplers[idx] = sampler - return sampler -} - -// Internal: destroy all sampler pool entries. Called from draw.destroy(). -@(private) -destroy_sampler_pool :: proc() { - device := GLOB.device - for &s in GLOB.samplers { - if s != nil { - sdl.ReleaseGPUSampler(device, s) - s = nil - } - } -} - -// Internal: destroy all registered textures. Called from draw.destroy(). 
-@(private) -destroy_all_textures :: proc() { - device := GLOB.device - for &slot in GLOB.texture_slots { - if slot.gpu_texture != nil { - sdl.ReleaseGPUTexture(device, slot.gpu_texture) - slot.gpu_texture = nil - } - } - delete(GLOB.texture_slots) - delete(GLOB.texture_free_list) - delete(GLOB.pending_texture_releases) -} - -// --------------------------------------------------------------------------- -// Fit mode helper -// --------------------------------------------------------------------------- +// --------------------------------------------------------------------------------------------------------------------- +// ----- Helpers ------------- +// --------------------------------------------------------------------------------------------------------------------- // Compute UV rect, recommended sampler, and inner rect for a given fit mode. // `rect` is the target drawing area; `texture_id` identifies the texture whose @@ -431,3 +302,113 @@ fit_params :: proc( return {0, 0, 1, 1}, .Linear_Clamp, inner_rect } + +texture_size :: proc(id: Texture_Id) -> [2]u32 { + if id == INVALID_TEXTURE do return {0, 0} + slot := &GLOB.texture_slots[u32(id)] + return {slot.desc.width, slot.desc.height} +} + +texture_format :: proc(id: Texture_Id) -> sdl.GPUTextureFormat { + if id == INVALID_TEXTURE do return .INVALID + return GLOB.texture_slots[u32(id)].desc.format +} + +texture_kind :: proc(id: Texture_Id) -> Texture_Kind { + if id == INVALID_TEXTURE do return .Static + return GLOB.texture_slots[u32(id)].desc.kind +} + +// Internal: get the raw GPU texture pointer for binding during draw. 
+@(private) +texture_gpu_handle :: proc(id: Texture_Id) -> ^sdl.GPUTexture { + if id == INVALID_TEXTURE do return nil + idx := u32(id) + if idx >= u32(len(GLOB.texture_slots)) do return nil + return GLOB.texture_slots[idx].gpu_texture +} + +// Deferred release (called from draw.end / clear_global) +@(private) +process_pending_texture_releases :: proc() { + device := GLOB.device + for id in GLOB.pending_texture_releases { + idx := u32(id) + if idx >= u32(len(GLOB.texture_slots)) do continue + slot := &GLOB.texture_slots[idx] + if slot.gpu_texture != nil { + sdl.ReleaseGPUTexture(device, slot.gpu_texture) + slot.gpu_texture = nil + } + slot.generation += 1 + append(&GLOB.texture_free_list, idx) + } + clear(&GLOB.pending_texture_releases) +} + +@(private) +get_sampler :: proc(preset: Sampler_Preset) -> ^sdl.GPUSampler { + idx := int(preset) + if GLOB.samplers[idx] != nil do return GLOB.samplers[idx] + + // Lazily create + min_filter, mag_filter: sdl.GPUFilter + address_mode: sdl.GPUSamplerAddressMode + + switch preset { + case .Nearest_Clamp: + min_filter = .NEAREST; mag_filter = .NEAREST; address_mode = .CLAMP_TO_EDGE + case .Linear_Clamp: + min_filter = .LINEAR; mag_filter = .LINEAR; address_mode = .CLAMP_TO_EDGE + case .Nearest_Repeat: + min_filter = .NEAREST; mag_filter = .NEAREST; address_mode = .REPEAT + case .Linear_Repeat: + min_filter = .LINEAR; mag_filter = .LINEAR; address_mode = .REPEAT + } + + sampler := sdl.CreateGPUSampler( + GLOB.device, + sdl.GPUSamplerCreateInfo { + min_filter = min_filter, + mag_filter = mag_filter, + mipmap_mode = .LINEAR, + address_mode_u = address_mode, + address_mode_v = address_mode, + address_mode_w = address_mode, + }, + ) + if sampler == nil { + log.errorf("Failed to create sampler preset %v: %s", preset, sdl.GetError()) + return GLOB.pipeline_2d_base.sampler // fallback to existing default sampler + } + + GLOB.samplers[idx] = sampler + return sampler +} + +// Internal: destroy all sampler pool entries. 
Called from draw.destroy(). +@(private) +destroy_sampler_pool :: proc() { + device := GLOB.device + for &s in GLOB.samplers { + if s != nil { + sdl.ReleaseGPUSampler(device, s) + s = nil + } + } +} + +// Internal: destroy all registered textures. Called from draw.destroy(). +@(private) +destroy_all_textures :: proc() { + device := GLOB.device + for &slot in GLOB.texture_slots { + if slot.gpu_texture != nil { + sdl.ReleaseGPUTexture(device, slot.gpu_texture) + slot.gpu_texture = nil + } + } + delete(GLOB.texture_slots) + delete(GLOB.texture_free_list) + delete(GLOB.pending_texture_releases) +} diff --git a/qrcode/examples/examples.odin b/qrcode/examples/examples.odin index 4db3d59..fabca9a 100644 --- a/qrcode/examples/examples.odin +++ b/qrcode/examples/examples.odin @@ -73,57 +73,32 @@ main :: proc() { } } -// ------------------------------------------------------------------------------------------------- -// Utilities -// ------------------------------------------------------------------------------------------------- - -// Prints the given QR Code to the console. -print_qr :: proc(qrcode: []u8) { - size := qr.get_size(qrcode) - border :: 4 - for y in -border ..< size + border { - for x in -border ..< size + border { - fmt.print("##" if qr.get_module(qrcode, x, y) else " ") - } - fmt.println() - } - fmt.println() -} - -// ------------------------------------------------------------------------------------------------- -// Demo: Basic -// ------------------------------------------------------------------------------------------------- - // Creates a single QR Code, then prints it to the console. basic :: proc() { text :: "Hello, world!" 
ecl :: qr.Ecc.Low qrcode: [qr.BUFFER_LEN_MAX]u8 - ok := qr.encode(text, qrcode[:], ecl) + ok := qr.encode_auto(text, qrcode[:], ecl) if ok do print_qr(qrcode[:]) } -// ------------------------------------------------------------------------------------------------- -// Demo: Variety -// ------------------------------------------------------------------------------------------------- - // Creates a variety of QR Codes that exercise different features of the library. variety :: proc() { qrcode: [qr.BUFFER_LEN_MAX]u8 { // Numeric mode encoding (3.33 bits per digit) - ok := qr.encode("314159265358979323846264338327950288419716939937510", qrcode[:], qr.Ecc.Medium) + ok := qr.encode_auto("314159265358979323846264338327950288419716939937510", qrcode[:], qr.Ecc.Medium) if ok do print_qr(qrcode[:]) } { // Alphanumeric mode encoding (5.5 bits per character) - ok := qr.encode("DOLLAR-AMOUNT:$39.87 PERCENTAGE:100.00% OPERATIONS:+-*/", qrcode[:], qr.Ecc.High) + ok := qr.encode_auto("DOLLAR-AMOUNT:$39.87 PERCENTAGE:100.00% OPERATIONS:+-*/", qrcode[:], qr.Ecc.High) if ok do print_qr(qrcode[:]) } { // Unicode text as UTF-8 - ok := qr.encode( + ok := qr.encode_auto( "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1wa\xE3\x80\x81" + "\xE4\xB8\x96\xE7\x95\x8C\xEF\xBC\x81\x20\xCE\xB1\xCE\xB2\xCE\xB3\xCE\xB4", qrcode[:], @@ -133,7 +108,7 @@ variety :: proc() { } { // Moderately large QR Code using longer text (from Lewis Carroll's Alice in Wonderland) - ok := qr.encode( + ok := qr.encode_auto( "Alice was beginning to get very tired of sitting by her sister on the bank, " + "and of having nothing to do: once or twice she had peeped into the book her sister was reading, " + "but it had no pictures or conversations in it, 'and what is the use of a book,' thought Alice " + @@ -148,10 +123,6 @@ variety :: proc() { } } -// ------------------------------------------------------------------------------------------------- -// Demo: Segment -// 
------------------------------------------------------------------------------------------------- - // Creates QR Codes with manually specified segments for better compactness. segment :: proc() { qrcode: [qr.BUFFER_LEN_MAX]u8 @@ -163,7 +134,7 @@ segment :: proc() { // Encode as single text (auto mode selection) { concat :: silver0 + silver1 - ok := qr.encode(concat, qrcode[:], qr.Ecc.Low) + ok := qr.encode_auto(concat, qrcode[:], qr.Ecc.Low) if ok do print_qr(qrcode[:]) } @@ -172,7 +143,7 @@ segment :: proc() { seg_buf0: [qr.BUFFER_LEN_MAX]u8 seg_buf1: [qr.BUFFER_LEN_MAX]u8 segs := [2]qr.Segment{qr.make_alphanumeric(silver0, seg_buf0[:]), qr.make_numeric(silver1, seg_buf1[:])} - ok := qr.encode(segs[:], qr.Ecc.Low, qrcode[:]) + ok := qr.encode_auto(segs[:], qr.Ecc.Low, qrcode[:]) if ok do print_qr(qrcode[:]) } } @@ -185,7 +156,7 @@ segment :: proc() { // Encode as single text (auto mode selection) { concat :: golden0 + golden1 + golden2 - ok := qr.encode(concat, qrcode[:], qr.Ecc.Low) + ok := qr.encode_auto(concat, qrcode[:], qr.Ecc.Low) if ok do print_qr(qrcode[:]) } @@ -201,7 +172,7 @@ segment :: proc() { qr.make_numeric(golden1, seg_buf1[:]), qr.make_alphanumeric(golden2, seg_buf2[:]), } - ok := qr.encode(segs[:], qr.Ecc.Low, qrcode[:]) + ok := qr.encode_auto(segs[:], qr.Ecc.Low, qrcode[:]) if ok do print_qr(qrcode[:]) } } @@ -219,7 +190,7 @@ segment :: proc() { "\xEF\xBD\x84\xEF\xBD\x85\xEF\xBD\x93\xEF" + "\xBD\x95\xE3\x80\x80\xCE\xBA\xCE\xB1\xEF" + "\xBC\x9F" - ok := qr.encode(madoka, qrcode[:], qr.Ecc.Low) + ok := qr.encode_auto(madoka, qrcode[:], qr.Ecc.Low) if ok do print_qr(qrcode[:]) } @@ -254,16 +225,12 @@ segment :: proc() { seg.data = seg_buf[:(seg.bit_length + 7) / 8] segs := [1]qr.Segment{seg} - ok := qr.encode(segs[:], qr.Ecc.Low, qrcode[:]) + ok := qr.encode_auto(segs[:], qr.Ecc.Low, qrcode[:]) if ok do print_qr(qrcode[:]) } } } -// ------------------------------------------------------------------------------------------------- -// Demo: Mask -// 
------------------------------------------------------------------------------------------------- - // Creates QR Codes with the same size and contents but different mask patterns. mask :: proc() { qrcode: [qr.BUFFER_LEN_MAX]u8 @@ -271,10 +238,10 @@ mask :: proc() { { // Project Nayuki URL ok: bool - ok = qr.encode("https://www.nayuki.io/", qrcode[:], qr.Ecc.High) + ok = qr.encode_auto("https://www.nayuki.io/", qrcode[:], qr.Ecc.High) if ok do print_qr(qrcode[:]) - ok = qr.encode("https://www.nayuki.io/", qrcode[:], qr.Ecc.High, mask = qr.Mask.M3) + ok = qr.encode_auto("https://www.nayuki.io/", qrcode[:], qr.Ecc.High, mask = qr.Mask.M3) if ok do print_qr(qrcode[:]) } @@ -290,16 +257,29 @@ mask :: proc() { ok: bool - ok = qr.encode(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M0) + ok = qr.encode_auto(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M0) if ok do print_qr(qrcode[:]) - ok = qr.encode(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M1) + ok = qr.encode_auto(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M1) if ok do print_qr(qrcode[:]) - ok = qr.encode(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M5) + ok = qr.encode_auto(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M5) if ok do print_qr(qrcode[:]) - ok = qr.encode(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M7) + ok = qr.encode_auto(text, qrcode[:], qr.Ecc.Medium, mask = qr.Mask.M7) if ok do print_qr(qrcode[:]) } } + +// Prints the given QR Code to the console. 
+print_qr :: proc(qrcode: []u8) { + size := qr.get_size(qrcode) + border :: 4 + for y in -border ..< size + border { + for x in -border ..< size + border { + fmt.print("##" if qr.get_module(qrcode, x, y) else " ") + } + fmt.println() + } + fmt.println() +} diff --git a/qrcode/generate.odin b/qrcode/generate.odin index 8261021..0bf3b0d 100644 --- a/qrcode/generate.odin +++ b/qrcode/generate.odin @@ -2,10 +2,30 @@ package qrcode import "core:slice" +VERSION_MIN :: 1 +VERSION_MAX :: 40 -// ------------------------------------------------------------------------------------------------- -// Types -// ------------------------------------------------------------------------------------------------- +// The worst-case number of bytes needed to store one QR Code, up to and including version 40. +BUFFER_LEN_MAX :: 3918 // buffer_len_for_version(VERSION_MAX) + +// Returns the number of bytes needed to store any QR Code up to and including the given version. +buffer_len_for_version :: #force_inline proc(n: int) -> int { + size := n * 4 + 17 + return (size * size + 7) / 8 + 1 +} + +@(private) +LENGTH_OVERFLOW :: -1 +@(private) +REED_SOLOMON_DEGREE_MAX :: 30 +@(private) +PENALTY_N1 :: 3 +@(private) +PENALTY_N2 :: 3 +@(private) +PENALTY_N3 :: 40 +@(private) +PENALTY_N4 :: 10 // The error correction level in a QR Code symbol. Ecc :: enum { @@ -44,39 +64,6 @@ Segment :: struct { bit_length: int, } -// ------------------------------------------------------------------------------------------------- -// Constants -// ------------------------------------------------------------------------------------------------- - -VERSION_MIN :: 1 -VERSION_MAX :: 40 - -// The worst-case number of bytes needed to store one QR Code, up to and including version 40. -BUFFER_LEN_MAX :: 3918 // buffer_len_for_version(VERSION_MAX) - -// Returns the number of bytes needed to store any QR Code up to and including the given version. 
-buffer_len_for_version :: #force_inline proc(n: int) -> int { - size := n * 4 + 17 - return (size * size + 7) / 8 + 1 -} - -// ------------------------------------------------------------------------------------------------- -// Private constants -// ------------------------------------------------------------------------------------------------- - -@(private) -LENGTH_OVERFLOW :: -1 -@(private) -REED_SOLOMON_DEGREE_MAX :: 30 -@(private) -PENALTY_N1 :: 3 -@(private) -PENALTY_N2 :: 3 -@(private) -PENALTY_N3 :: 40 -@(private) -PENALTY_N4 :: 10 - //odinfmt: disable // For generating error correction codes. Index 0 is padding (set to illegal value). @(private) @@ -96,10 +83,9 @@ NUM_ERROR_CORRECTION_BLOCKS := [4][41]i8{ } //odinfmt: enable - -// ------------------------------------------------------------------------------------------------- -// Encode procedures -// ------------------------------------------------------------------------------------------------- +// --------------------------------------------------------------------------------------------------------------------- +// ----- Encode Procedures ------------------------ +// --------------------------------------------------------------------------------------------------------------------- // Encodes the given text string to a QR Code, automatically selecting // numeric, alphanumeric, or byte mode based on content. 
@@ -548,9 +534,10 @@ encode_auto :: proc { encode_segments_advanced_auto, } -// ------------------------------------------------------------------------------------------------- -// Error correction code generation -// ------------------------------------------------------------------------------------------------- + +// --------------------------------------------------------------------------------------------------------------------- +// ----- Error Correction Code Generation ------------------------ +// --------------------------------------------------------------------------------------------------------------------- // Appends error correction bytes to each block of data, then interleaves bytes from all blocks. @(private) @@ -618,10 +605,6 @@ get_num_raw_data_modules :: proc(ver: int) -> int { return result } -// ------------------------------------------------------------------------------------------------- -// Reed-Solomon ECC generator -// ------------------------------------------------------------------------------------------------- - @(private) reed_solomon_compute_divisor :: proc(degree: int, result: []u8) { assert(1 <= degree && degree <= REED_SOLOMON_DEGREE_MAX, "reed-solomon degree out of range") @@ -668,9 +651,9 @@ reed_solomon_multiply :: proc(x, y: u8) -> u8 { return z } -// ------------------------------------------------------------------------------------------------- -// Drawing function modules -// ------------------------------------------------------------------------------------------------- +// --------------------------------------------------------------------------------------------------------------------- +// ----- Drawing Function Modules ------------------------ +// --------------------------------------------------------------------------------------------------------------------- // Clears the QR Code grid and marks every function module as dark. 
@(private) @@ -816,9 +799,9 @@ fill_rectangle :: proc(left, top, width, height: int, qrcode: []u8) { } } -// ------------------------------------------------------------------------------------------------- -// Drawing data modules and masking -// ------------------------------------------------------------------------------------------------- +// --------------------------------------------------------------------------------------------------------------------- +// ----- Drawing data modules and masking ------------------------ +// --------------------------------------------------------------------------------------------------------------------- @(private) draw_codewords :: proc(data: []u8, data_len: int, qrcode: []u8) { @@ -996,9 +979,9 @@ finder_penalty_add_history :: proc(current_run_length: int, run_history: ^[7]int run_history[0] = current_run_length } -// ------------------------------------------------------------------------------------------------- -// Basic QR Code information -// ------------------------------------------------------------------------------------------------- +// --------------------------------------------------------------------------------------------------------------------- +// ----- Basic QR code information ------------------------ +// --------------------------------------------------------------------------------------------------------------------- // Returns the minimum buffer size (in bytes) needed for both temp_buffer and qrcode // to encode the given content at the given ECC level within the given version range. 
@@ -1158,9 +1141,9 @@ get_bit :: #force_inline proc(x: int, i: uint) -> bool { return ((x >> i) & 1) != 0 } -// ------------------------------------------------------------------------------------------------- -// Segment handling -// ------------------------------------------------------------------------------------------------- +// --------------------------------------------------------------------------------------------------------------------- +// ----- Segment Handling ------------------------ +// --------------------------------------------------------------------------------------------------------------------- // Tests whether the given string can be encoded in numeric mode. is_numeric :: proc(text: string) -> bool { @@ -1349,11 +1332,11 @@ make_eci :: proc(assign_val: int, buf: []u8) -> Segment { return result } -// ------------------------------------------------------------------------------------------------- -// Private helpers -// ------------------------------------------------------------------------------------------------- +// --------------------------------------------------------------------------------------------------------------------- +// ----- Helpers ------------------------ +// --------------------------------------------------------------------------------------------------------------------- -@(private) +// Internal append_bits_to_buffer :: proc(val: uint, num_bits: int, buffer: []u8, bit_len: ^int) { assert(0 <= num_bits && num_bits <= 16 && val >> uint(num_bits) == 0, "invalid bit count or value overflow") for i := num_bits - 1; i >= 0; i -= 1 { @@ -1362,7 +1345,7 @@ append_bits_to_buffer :: proc(val: uint, num_bits: int, buffer: []u8, bit_len: ^ } } -@(private) +// Internal get_total_bits :: proc(segs: []Segment, version: int) -> int { result := 0 for &seg in segs { @@ -1384,7 +1367,7 @@ get_total_bits :: proc(segs: []Segment, version: int) -> int { return result } -@(private) +// Internal num_char_count_bits :: proc(mode: Mode, 
version: int) -> int { assert(VERSION_MIN <= version && version <= VERSION_MAX, "version out of bounds") i := (version + 7) / 17 @@ -1406,8 +1389,8 @@ num_char_count_bits :: proc(mode: Mode, version: int) -> int { unreachable() } +// Internal // Returns the index of c in the alphanumeric charset (0-44), or -1 if not found. -@(private) alphanumeric_index :: proc(c: u8) -> int { switch c { case '0' ..= '9': return int(c - '0') -- 2.43.0