Capturing the WebGPU Ecosystem

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • gpuweb

    Where the GPU for the Web work happens!

    WebGPU currently doesn't support the "bindless" resource access model (see: https://github.com/gpuweb/gpuweb/issues/380).

    The maximum number of sampled textures per shader stage (maxSampledTexturesPerShaderStage) is a runtime device limit, and its guaranteed minimum appears to be 16. So texture atlases are still a thing in WebGPU.
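
    As a rough illustration of why the limit above still pushes people toward atlases, here is a minimal sketch. It assumes the limits are available as a plain dictionary (a hypothetical stand-in for the `limits` object a WebGPU adapter reports) and falls back to 16 when the key is absent. The function names are made up for this example.

```python
import math

# Assumed fallback: the guaranteed minimum mentioned in the comment above.
DEFAULT_MAX_SAMPLED_TEXTURES = 16


def max_bound_textures(limits=None):
    """Read the per-stage sampled-texture limit from a dict of device limits."""
    limits = limits or {}
    return limits.get("maxSampledTexturesPerShaderStage", DEFAULT_MAX_SAMPLED_TEXTURES)


def needs_atlas(texture_count, limits=None):
    """True when one draw call cannot bind every texture individually."""
    return texture_count > max_bound_textures(limits)


def batches_without_atlas(texture_count, limits=None):
    """Draw batches needed if each batch binds as many textures as the limit allows."""
    return math.ceil(texture_count / max_bound_textures(limits))


if __name__ == "__main__":
    print(needs_atlas(1000))            # one draw call is not enough
    print(batches_without_atlas(1000))  # batches needed at the default limit
```

    With the default limit, 1000 unique textures would need dozens of batches, which is the overhead an atlas (or bindless access) avoids.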

  • apfel-kruemel

    Pre-Designed Component Library for Spatial User Interfaces

    https://github.com/coconut-xr/apfel-kruemel works today. I only know about battery-optimized software in the context of games, though.

    https://felt.com/blog/svg-to-canvas-part-2-building-interact... and https://github.com/servo/pathfinder/ might also be of interest.

  • pathfinder

    A fast, practical GPU rasterizer for fonts and vector graphics

  • wgpu-native

    Native WebGPU implementation based on wgpu-core

    The Mach engine project has prebuilt Dawn libraries and also a simplified build-from-source process using the Zig build system, see:

    https://machengine.org/pkg/mach-gpu-dawn/

    It's also possible to use wgpu-native in C/C++ projects as a prebuilt library, see:

    https://github.com/gfx-rs/wgpu-native

  • bevy

    A refreshingly simple data-driven game engine built in Rust

    Most of Nanite is actually implementable in WebGPU. The exceptions are the LOD system, which I haven't tried, and the compute rasterizer, which WebGPU can't do for lack of storage image atomics (a gap that exists because Metal lacks them).

    I have a PR open for Bevy, https://github.com/bevyengine/bevy/pull/10164, that I've been working on. It does a lot of the same things (meshlets, visbuffer, material depth, two-pass occlusion culling) and uses WebGPU.

    WebGPU is actually a pretty good API imo. It's missing some advanced features like raytracing, mesh shaders, and subgroup operations (coming soon!), but it can still do a lot.

    The much bigger missing feature is "bindless" support (non-uniform arrays of bound resources). BindGroup overhead (and ergonomics) is a significant downside.

  • wasmdec

    WebAssembly to C decompiler

    I think you're missing a good amount of nuance here.

    Minified JS can be turned into reasonable JS, yes, but you're probably not going to get TypeScript code back, so the same sort of challenge exists there.

    Assembly -> high-level language is harder, but there are absolutely binary -> C decompilers that are very popular/used in the RE community to make changes to existing programs.

    But that doesn't even matter: WASM is much higher level than assembly. It's a stack machine, there is no arbitrary control flow (no labels, no `goto`), there are pre-defined data types, and so on. All of this means it's easier to convert WASM -> high-level language than it is with a generic x86/ARM binary.

    There are WASM decompilers[0][1] which can convert WASM binaries into C code and back.

    In both cases (minified JS and WASM), you're not going to get out exactly what you put in, but WASM doesn't really change the situation very much given the widespread adoption of 'compile to JS' languages like TypeScript these days.

    [0] https://chromium.googlesource.com/external/github.com/WebAss...

    [1] https://github.com/wwwg/wasmdec
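
    To illustrate the point above that a stack machine with typed instructions maps fairly directly onto high-level expressions, here is a minimal sketch. It handles only a tiny, hand-picked subset of instructions, written as Python tuples (a made-up encoding for this example), and is not how wasmdec or any real decompiler is implemented.

```python
# Binary operators in the tiny subset handled by this sketch.
BINARY_OPS = {"i32.add": "+", "i32.sub": "-", "i32.mul": "*"}


def decompile_expression(instructions):
    """Turn a straight-line stack program into one infix expression string.

    Each instruction is a tuple: (mnemonic, *immediates).
    """
    stack = []
    for op, *args in instructions:
        if op == "i32.const":
            stack.append(str(args[0]))
        elif op == "local.get":
            # Locals have no names in this sketch, so invent one from the index.
            stack.append(f"local{args[0]}")
        elif op in BINARY_OPS:
            # The right operand was pushed last, so it is popped first.
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(f"({lhs} {BINARY_OPS[op]} {rhs})")
        else:
            raise ValueError(f"unsupported instruction: {op}")
    if len(stack) != 1:
        raise ValueError("program must leave exactly one value on the stack")
    return stack[0]


if __name__ == "__main__":
    program = [
        ("local.get", 0),
        ("i32.const", 2),
        ("i32.mul",),
        ("local.get", 1),
        ("i32.add",),
    ]
    print(decompile_expression(program))
```

    Because every instruction has a fixed effect on the stack, the expression tree can be rebuilt in a single pass. Recovering the same structure from register-based machine code generally takes more analysis.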

  • stb

    stb single-file public domain libraries for C/C++

    So I read through the materials on mesh shaders and work graphs and looked at sample code. These won't really work (see below). As I implied previously, it's best to research and discuss these sorts of matters with professional graphics programmers who have experience actually using the technologies under consideration.

    So for the sake of future web searchers who discover this thread: there are only two proven ways to efficiently draw thousands of unique textures of different sizes with a single draw call that are actually used by experienced graphics programmers in production code as of 2023.

    Proven method #1: Pack these thousands of textures into a texture atlas.

    Proven method #2: Use bindless resources, which are still fairly bleeding edge and will require a fallback to atlases if targeting the PC instead of only high-end consoles (Xbox Series S|X...).

    Mesh shaders by themselves won't work: These have similar texture access limitations to the old geometry/tessellation stage they improve upon. A limited, fixed number of textures still must be bound before each draw call (say, 16 or 32 textures, not 1000s), unless bindless resources are used. So mesh shaders must be used with an atlas or with bindless resources.

    Work graphs by themselves won't work: This feature is bleeding-edge Shader Model 6.8, whereas bindless resources are SM 6.6. (Xbox Series X|S might top out at SM 6.7; I can't find an authoritative answer.) It looks like work graphs might only work well on NVIDIA GPUs and won't work well on Intel GPUs anytime soon (but, again, I'm not knowledgeable enough to say this authoritatively). Furthermore, this feature may have a hard dependency on using bindless to begin with. That is, I can't tell if one is allowed to execute a work graph that binds and unbinds individual texture resources. And if one could do such a thing, it would certainly be slower than using bindless. The cost of bindless is paid "up front" when the textures are uploaded.

    Some programmers use Texture2DArray/GL_TEXTURE_2D_ARRAY as an alternative to atlases, but two limitations are (1) the max array length (e.g. GL_MAX_ARRAY_TEXTURE_LAYERS) might only be 256 (e.g. for OpenGL 3.0), and (2) all textures must be the same size.
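
    The two limitations above can be expressed as a simple eligibility check. This is a minimal sketch: the function name is hypothetical, and the default layer limit of 256 is just the example figure from the comment above, so a real renderer would query the actual device value.

```python
def can_use_texture_array(sizes, max_layers=256):
    """Check whether a set of textures could live in one 2D array texture.

    sizes: list of (width, height) tuples, one per texture.
    max_layers: assumed device limit on array layers.
    """
    if not sizes:
        return False
    same_size = len(set(sizes)) == 1         # limitation (2): identical dimensions
    fits_layers = len(sizes) <= max_layers   # limitation (1): layer-count limit
    return same_size and fits_layers


if __name__ == "__main__":
    print(can_use_texture_array([(64, 64)] * 100))      # uniform sizes, few layers
    print(can_use_texture_array([(64, 64), (32, 64)]))  # mixed sizes
```

    When the check fails, the textures would have to go into an atlas or be accessed through bindless resources, as described above.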

    Finally, for the sake of any web searcher who lands on this thread in the years to come: to pack an atlas well, a good packing algorithm is needed. It's harder to pack triangles than rectangles, but triangles use atlas memory more efficiently, and a good triangle packing will outperform the fancy new bindless rendering. Some open source starting points for packing:

    https://github.com/nothings/stb/blob/master/stb_rect_pack.h

    https://github.com/ands/trianglepacker
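
    For readers who want to see the basic idea of rectangle packing before reading the libraries above, here is a minimal sketch of shelf packing, one of the simplest strategies. It is not the algorithm used by stb_rect_pack.h or trianglepacker, and the function name and return format are made up for this example.

```python
def shelf_pack(rects, atlas_width):
    """Place rectangles left to right on horizontal shelves.

    rects: list of (width, height) tuples.
    Returns (positions, used_height), where positions[i] is the (x, y)
    of the top-left corner of rects[i] inside the atlas.
    """
    # Packing tallest-first keeps the rectangles on each shelf similar in height.
    order = sorted(range(len(rects)), key=lambda i: rects[i][1], reverse=True)
    positions = [None] * len(rects)
    x = y = shelf_height = 0
    for i in order:
        w, h = rects[i]
        if w > atlas_width:
            raise ValueError(f"rectangle {i} is wider than the atlas")
        if x + w > atlas_width:
            # The current shelf is full: start a new one below it.
            y += shelf_height
            x = 0
            shelf_height = 0
        positions[i] = (x, y)
        x += w
        shelf_height = max(shelf_height, h)
    return positions, y + shelf_height


if __name__ == "__main__":
    placed, height = shelf_pack([(4, 2), (3, 4), (2, 3), (4, 1)], atlas_width=8)
    print(placed, height)
```

    Shelf packing wastes the space above shorter rectangles on each shelf, which is one reason production code tends to use better packers such as the ones linked above.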

  • trianglepacker

    A C/C++ single-file library that packs triangles of a 3D mesh into a rectangle/texture.
