
This minimalism is very effective.

I took the opposite approach, and it has caused great pain. I've been writing a metaverse client in Rust. Right now, it's running on another screen, showing an avatar riding a tram through a large steampunk city. I let that run for 12 hours before shipping a new pre-release.

This uses Vulkan, but it has WGPU and Rend3 on top. Rend3 offers a very clean API - you create meshes, 2D textures, etc., and "objects", which reference the meshes and textures. Creating an object puts it on screen. Rust reference counting interlocks everything. It's very straightforward to use.
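To illustrate the reference-counting interlock (hypothetical types, not Rend3's actual API - just the ownership pattern, sketched with plain Arc):

```rust
use std::sync::Arc;

// Illustrative stand-ins for Rend3-style resources.
struct Mesh { vertex_count: usize }
struct Texture { width: u32, height: u32 }

// An "object" keeps its mesh and texture alive by holding Arc references,
// mirroring how reference counting interlocks resource lifetimes.
struct Object {
    mesh: Arc<Mesh>,
    texture: Arc<Texture>,
}

fn main() {
    let mesh = Arc::new(Mesh { vertex_count: 36 });
    let tex = Arc::new(Texture { width: 256, height: 256 });

    // Creating the object shares ownership of the mesh and texture.
    let obj = Object { mesh: Arc::clone(&mesh), texture: Arc::clone(&tex) };
    assert_eq!(Arc::strong_count(&mesh), 2);

    // Dropping the object releases its references; a renderer can free the
    // GPU resources once the last handle goes away.
    drop(obj);
    assert_eq!(Arc::strong_count(&mesh), 1);
}
```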

All those layers create problems. WGPU tries to support web browsers, Vulkan, Metal, DX11 (recently dropped), DX12, Android, and OpenGL. So it needs a big dev team and changes are hard. WGPU's own API is mostly like Vulkan - you still have to do your own GPU memory allocation and synchronization.

WGPU has lowest-common-denominator problems. Some of those platforms can't support some functions. WGPU doesn't support multiple threads updating GPU memory without interference, which Vulkan supports. That's how you get content into the GPU without killing the frame rate. Big-world games and clients need that. Also, having to deal with platforms with different concurrency restrictions results in lock conflicts that can kill performance.

Rend3 is supposed to be a modest level of glue code to handle synchronization and allocation. Those are hard to do in a general way. Especially synchronization. Rend3 also does frustum culling (which is a big performance win; you're not rendering what's behind you) and tried to do occlusion culling (which was a performance loss because the compute to do that slowed things down). It also does translucency, which means a depth sort. (Translucent objects are a huge pain. I really need them; I work on worlds with lots of windows, which you can see out of and see in.)
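The depth sort is conceptually simple - translucent objects have to be drawn back-to-front each frame. A minimal sketch (types and fields are illustrative, not any real engine's):

```rust
// Translucent objects must be drawn back-to-front, so each frame they are
// sorted by distance from the camera, farthest first.
#[derive(Debug)]
struct Translucent {
    name: &'static str,
    position: [f32; 3],
}

// Squared distance is enough for ordering; skips the sqrt.
fn dist_sq(a: [f32; 3], b: [f32; 3]) -> f32 {
    (0..3).map(|i| (a[i] - b[i]) * (a[i] - b[i])).sum()
}

/// Sort back-to-front: farthest from the camera first.
fn depth_sort(objects: &mut [Translucent], camera: [f32; 3]) {
    objects.sort_by(|x, y| {
        dist_sq(y.position, camera)
            .partial_cmp(&dist_sq(x.position, camera))
            .unwrap()
    });
}

fn main() {
    let camera = [0.0, 0.0, 0.0];
    let mut objs = vec![
        Translucent { name: "near window", position: [0.0, 0.0, 1.0] },
        Translucent { name: "far window", position: [0.0, 0.0, 10.0] },
        Translucent { name: "mid window", position: [0.0, 0.0, 5.0] },
    ];
    depth_sort(&mut objs, camera);
    // Farthest drawn first: far, mid, near.
    assert_eq!(objs[0].name, "far window");
    assert_eq!(objs[2].name, "near window");
}
```

The pain is that this has to happen every frame the camera moves, and intersecting translucent objects have no single correct order at all.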

The Rust 3D stack people are annoyed with me because I've been pounding on them to fix their stack for three years now. That's all volunteer. Vulkan has money behind it and enough users to keep it maintained. Rend3 was recently abandoned by its creator, so now I have to go inside that and fix it. Few people do anything elaborate on WGPU - mostly it's 2D games you could have done in Flash, or simple static 3D scenes. Commercial projects continue to use Unity or UE5.

If I went directly to Vulkan, I'd still have to write synchronization, allocation, frustum culling, and translucency. So that's a big switch.

Incidentally, Vulkano, the wrapper over Vulkan and Metal, has lowest-common-denominator problems too. It doesn't allow concurrent updating of assets in the GPU. Both Vulkan and Metal support that. But, of course, Apple does it differently.



> WGPU doesn't support multiple threads updating GPU memory without interference

WGPU uses WebGPU and AFAIK no browser so far supports "threads". https://gpuweb.github.io/gpuweb/explainer/#multithreading https://github.com/gpuweb/gpuweb/issues/354

And OpenGL never supported "threads", so anything using OpenGL can't either.


OpenGL can do threads with shared contexts but caveats apply so it is not popular.

But even more common is mapping memory in the "OpenGL thread" and then letting another thread fill the memory. Quite common is mapping buffers with persistent/coherent flags at init, and then leaving them mapped.


WGPU is an implementation of WebGPU. This is more accurate than saying it uses WebGPU; WebGPU is not software, you can't use it.

WGPU goes beyond WebGPU in many ways already, and could also support threads.


> WGPU is an implementation of WebGPU

No.

    wgpu is a safe and portable graphics library for Rust based on the WebGPU API. 
    ...
    and browsers via WebAssembly on WebGPU and WebGL2.
https://wgpu.rs/

> WebGPU is not software, you can't use it

It's an API that you can use, like OpenGL, Vulkan, Metal, .... Here you can see the current Browser support: https://caniuse.com/webgpu


And what are you calling when you call that API? wgpu in Firefox and Dawn in Chrome.

wgpu is an implementation of WebGPU; Firefox calls it through the raw Rust bindings rather than the C API it also provides.

Unless you want to dig into the implementation details of the various subcrates.


> wgpu in Firefox

Ah, sorry, I didn't know that, thanks. On Chrome it uses either WebGL2 or WebGPU.

    When running in a web browser (by compilation to WebAssembly) without the "webgl" feature enabled, wgpu relies on the browser's own WebGPU implementation. 
https://github.com/gfx-rs/wgpu?tab=readme-ov-file#tracking-t...


The wgpu crate in Rust can do two things. The first is that it is an implementation of the webgpu spec built upon OpenGL/Vulkan/DirectX/WebGL/Metal. This is what Firefox uses. The other is that it can just generate calls to the webgpu API. This is what it does when targeting wasm.

So if you're writing Rust with wgpu and you target wasm and run it in Firefox, you will have a wasm program that calls the browser's implementation of webgpu, which is the same wgpu crate you used, just built the other way.

If you run it in Chrome instead it will use Dawn.

And in native land, wgpu and Dawn are incompatible and have different C APIs. One uses uint32_t everywhere and the other size_t, so they're the same on 32-bit platforms like WASM but not on modern native platforms.


> All those layers create problems. WGPU tries to support web browsers, Vulkan, Metal, DX11 (recently dropped), DX12, Android, and OpenGL. So it needs a big dev team and changes are hard. WGPU's own API is mostly like Vulkan - you still have to do your own GPU memory allocation and synchronization.

The first part is true, but the second part is not. Allocation and synchronization are automatic.


Vulkan does not allocate GPU memory for you. Well, it gives you a big block, and then it's the problem of the caller to allocate little pieces from that. It's like "sbrk" in Linux/Unix, which gets memory from the OS. You usually don't use "sbrk" directly. Something like "malloc" is used on top of that.
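The malloc-on-top-of-sbrk idea can be sketched as a toy sub-allocator over one big block (a bump allocator only, for illustration; real GPU allocators like VMA also handle freeing, memory types, and fragmentation):

```rust
// Toy sub-allocator: the driver hands you one big block, and the caller
// parcels out aligned pieces from it, like malloc on top of sbrk.
struct BumpAllocator {
    block_size: u64, // the one big allocation from the driver
    offset: u64,     // next free byte
}

impl BumpAllocator {
    fn new(block_size: u64) -> Self {
        Self { block_size, offset: 0 }
    }

    /// Carve `size` bytes, aligned to `align` (nonzero, a power of two in
    /// practice), out of the big block. Returns the offset, or None if the
    /// block is exhausted and another big allocation would be needed.
    fn alloc(&mut self, size: u64, align: u64) -> Option<u64> {
        let start = (self.offset + align - 1) / align * align;
        if start + size > self.block_size {
            return None;
        }
        self.offset = start + size;
        Some(start)
    }
}

fn main() {
    let mut gpu_heap = BumpAllocator::new(256 * 1024 * 1024);
    let a = gpu_heap.alloc(100, 256).unwrap();
    let b = gpu_heap.alloc(4096, 256).unwrap();
    assert_eq!(a, 0);
    assert_eq!(b, 256); // 100 rounded up to the next 256-byte boundary
}
```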


Vulkan doesn't, but WebGPU does do synchronization and memory allocation for you.


WGPU or WebGPU? The former is the Rust crate being discussed in the quote.


Both.


> Both Vulkan and Metal support that. But, of course, Apple does it differently.

Metal is older than Vulkan. So really, Vulkan does it differently.


Vulkan is a continuation of AMD's Mantle, which in turn is older than Metal.


I wasn't aware of that. Fair enough!


> WGPU doesn't support multiple threads updating GPU memory without interference, which Vulkan supports.

This is really helpful for me to learn about; this is a key thing I want to get right for a good experience. I really hope WGPU can find a way to add something for this as an extension.


Do you know if these things I found offer any hope for being able to continue rendering a scene smoothly while we handle GPU memory management operations on worker threads?

https://gfx-rs.github.io/2023/11/24/arcanization.html

https://github.com/gfx-rs/wgpu/issues/5322


The actual issue is not CPU-side. The issue is GPU-side.

The CPU feeds commands (CommandBuffers) telling the GPU what to do over a Queue.

WebGPU/wgpu/dawn only have a single general purpose queue. Meaning any data upload commands (copyBufferToBuffer) you send on the queue block rendering commands from starting.

The solution is multiple queues. Modern GPUs have a dedicated transfer/copy queue separate from the main general purpose queue.

WebGPU/wgpu/dawn would need to add support for additional queues: https://github.com/gpuweb/gpuweb/issues?q=is%3Aopen+is%3Aiss...

There's also ReBAR/SMA, and unified memory (UMA) platforms to consider, but that gets even more complex.
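The single-queue vs. multi-queue difference can be modeled with plain threads and channels (no real GPU here - just an analogy): commands sent to separate workers don't sit in front of each other the way they do on one shared queue.

```rust
use std::sync::mpsc;
use std::thread;

// Analogy only: two independent command queues, modeled as channels with
// their own consumer threads. A slow transfer command no longer blocks
// render commands from starting, which is the point of a dedicated
// transfer/copy queue.
fn main() {
    let (render_tx, render_rx) = mpsc::channel::<String>();
    let (transfer_tx, transfer_rx) = mpsc::channel::<String>();

    let render = thread::spawn(move || render_rx.iter().collect::<Vec<_>>());
    let transfer = thread::spawn(move || transfer_rx.iter().collect::<Vec<_>>());

    // On a single shared queue these would be serialized; here they are
    // independent streams of work.
    transfer_tx.send("copyBufferToBuffer(big_asset)".to_string()).unwrap();
    render_tx.send("draw(frame)".to_string()).unwrap();

    drop(render_tx);
    drop(transfer_tx);
    assert_eq!(render.join().unwrap(), vec!["draw(frame)"]);
    assert_eq!(transfer.join().unwrap(), vec!["copyBufferToBuffer(big_asset)"]);
}
```

The real thing also needs cross-queue synchronization (semaphores, ownership transfers in Vulkan terms) before the render queue may read what the transfer queue wrote, which is where the complexity lives.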


> The solution is multiple queues. Modern GPUs have a dedicated transfer/copy queue separate from the main general purpose queue.

Yes. This is the big performance advantage of Vulkan over OpenGL. You can get the bulk copying of textures and meshes out of the render thread. So asset loading can be done concurrently with rendering.

None of this matters until you're rendering something really big. Then it dominates the problem.


I believe you can do loading texture data onto the GPU from another thread with OpenGL with pixel buffer objects: https://www.khronos.org/opengl/wiki/Pixel_Buffer_Object

I haven't tried it yet, but will try soon for my open-source metaverse Substrata: https://substrata.info/.


It is possible but managing asynchronous transfers in OpenGL is quite tricky.

You either need to use OpenGL sync objects very carefully or accept the risk of unintended GPU stalls.


Yeah you need to make sure the upload has completed before you try and use the texture, right?


Yes, and you need to make sure that the upload has completed before you reuse the pixel buffer too.

And the synchronization API isn't very awesome, it can only wait for all operations until a certain point have been completed. You can't easily track individual transfers.


Thank you. I hope to see progress in these areas when I visit later. I was hoping to be able to go all in on wgpu but if there are still legitimate reasons like this one to build a native app, then so be it.


It depends on your requirements and experience level. Using WebGPU is _much_ easier than Vulkan, so if you don't have a lot of prior experience with all of computer graphics theory / graphics APIs / engine design, I would definitely start with WebGPU. You can still get very far with it, and it's way easier.


Short version: hope, yes. Obtain now, no.

Long version: https://github.com/gfx-rs/wgpu/discussions/5525

There's a lock stall around resource allocation. The asset-loading threads can stall out the rendering thread. I can see this in Tracy profiling, but don't fully understand the underlying problem. Looks like one of three locks in WGPU, and I'm going to have to build WGPU with more profiling scopes to narrow the problem.

Very few people have gotten far enough with 3D graphics in Rust to need this. Most Rust 3D graphics projects are like the OP's here - load up a mostly static scene and do a little with it. If you load all the content before displaying, and don't do much dynamic modification beyond moving items around, most of the hard problems can be bypassed. You can move stuff just by changing its transform - that's cheap. So you can do a pretty good small-world game without hitting these problems. Scale up to a big world that won't all fit in the GPU at once, and things get complicated.

I'm glad to hear from someone else who's trying to push on this. Write me at "nagle@animats.com", please.

For a sense of what I'm doing: https://video.hardlimit.com/w/7usCE3v2RrWK6nuoSr4NHJ


What I do currently is just limit the amount of data uploaded per frame. Not ideal but works.


That works better in game dev where you have control over the content. Metaverse dev is like writing a web browser - some people are going to create excessively bulky assets, and you have to do something reasonable with them.


It works with large assets too. Just split the upload into chunks.
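A minimal sketch of that chunking (the upload itself is stubbed out; only the splitting logic is shown, and the sizes are hypothetical):

```rust
// Split a large asset into fixed-size pieces so that each frame uploads at
// most one piece, instead of stalling a frame on one huge copy.
// `chunk` must be nonzero.
fn chunk_ranges(total: usize, chunk: usize) -> Vec<std::ops::Range<usize>> {
    (0..total)
        .step_by(chunk)
        .map(|start| start..(start + chunk).min(total))
        .collect()
}

fn main() {
    // Hypothetical 10 MiB texture uploaded in 4 MiB chunks.
    let ranges = chunk_ranges(10 * 1024 * 1024, 4 * 1024 * 1024);
    assert_eq!(ranges.len(), 3);
    assert_eq!(ranges[2], (8 * 1024 * 1024)..(10 * 1024 * 1024));
    // Each frame, copy one range into the destination buffer (e.g. with
    // wgpu's Queue::write_buffer at the matching offset), then continue
    // with the next range next frame.
}
```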


Do you have any references? I thought all wgpu objects are wrapped with an Arc<Mutex<>>.


That sounds wild to me. In C++, I remember a time when I increased the frame rate of a particle renderer from 20fps to 60fps+ simply by going from passing shared_ptr (the Arc equivalent) to passing references.


Never mind. Just an Arc<>.


I thought your earlier thread on URLO very interesting https://users.rust-lang.org/t/game-dev-in-rust-some-notes-on...



