It's getting much more complicated. Here's an hour-long talk on how Unreal Engine 5's Nanite works.[1] This is a huge breakthrough in level of detail theory. Things that used to be O(N) are now O(1). With enough pre-computation, rendering cost does not go up with scene size. See this demo.[2] You can download the demo for Xbox Series X/S and PlayStation 5, and soon, in source form for PCs. Explore 16 square kilometers of photorealistic city in which you can see a close-up view of anything.
The format of assets in memory for Nanite becomes far more complicated. GPUs need a redesign again; UE5 has to do more of the rendering on the CPUs than they'd like. What they need to do is simple but not well suited to current GPU designs. It's such a huge win that this approach will take over despite that.
New game engines will need to have something comparable. Maybe using the same format. Epic doesn't seem to be making patent claims, and the code for all this is in the Unreal Engine 5 download.
(Basic concept of how this works: a mesh becomes a hierarchy of meshes, allowing zooming in and out across levels of detail. The approach to subdivision is seamless. You divide the mesh into cells of about 20 triangles. Then each cell is cut by one or two new long edges across the cell. These become the boundaries of a simpler cell with about half as many triangles, but no new vertices. The lower-resolution cell has no more edges than the higher-resolution cell. There's a neat graph theory result on where to cut to make this work. This process can be repeated recursively. So one big mesh can be rendered at different levels of resolution depending on how close the camera is to each part. The end result is that the number of rendered triangles is determined by the number of pixels on the screen, not the complexity of the model. Hence, O(1).)
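To make the cluster-hierarchy idea concrete, here's a minimal C++ sketch of how per-cluster LOD selection might work. Everything here (Cluster, geometricError, the one-pixel threshold) is invented for illustration; Nanite's actual data structures and error metric are far more involved:

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Hypothetical cluster in a Nanite-style hierarchy: a group of ~20
    // triangles plus coarser merged versions above it. Names are made up.
    struct Cluster {
        float centerDist;              // distance from camera to cluster center
        float geometricError;          // world-space error of this simplification
        std::vector<Cluster> children; // finer-resolution clusters; empty at leaves
    };

    // Project a world-space error into pixels at the cluster's distance.
    float errorInPixels(const Cluster& c, float screenHeight, float fovY) {
        float pxPerUnit = screenHeight / (2.0f * std::tan(fovY * 0.5f) * c.centerDist);
        return c.geometricError * pxPerUnit;
    }

    // Walk the hierarchy: stop at the coarsest cluster whose error is under
    // a pixel; otherwise descend into the finer children. Triangle count
    // tracks screen resolution, not total model size -- the O(1) claim.
    void selectClusters(const Cluster& c, float screenHeight, float fovY,
                        std::vector<const Cluster*>& out) {
        if (c.children.empty() || errorInPixels(c, screenHeight, fovY) < 1.0f) {
            out.push_back(&c);
            return;
        }
        for (const Cluster& child : c.children)
            selectClusters(child, screenHeight, fovY, out);
    }

    int main() {
        // Tiny two-level hierarchy: a coarse root with two finer children.
        Cluster root{10.0f, 0.05f, {{10.0f, 0.01f, {}}, {10.0f, 0.01f, {}}}};
        std::vector<const Cluster*> visible;
        selectClusters(root, 1080.0f, 1.0f, visible);
        std::printf("rendering %zu clusters\n", visible.size());
    }

The point of the traversal is that the cut through the hierarchy adapts per region, so the near parts of one big mesh render fine while its distant parts render coarse.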
Then all of that needs to stream in over the network. That's probably Unreal Engine 6. In UE5, the whole model has to be downloaded to local SSD first. Doing it dynamically is going to require precise download scheduling, and edge servers doing partial mesh reduction. Fun problems.
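Purely as speculation about what "precise download scheduling" could look like: prioritize chunk fetches by how many screen pixels they would affect, so bandwidth goes to the visually largest content first. ChunkRequest and pixelCoverage are made-up names, not anything in UE5:

    #include <cstdio>
    #include <queue>
    #include <vector>

    // Speculative sketch of download scheduling for streamed geometry:
    // fetch whatever would cover the most screen pixels first.
    struct ChunkRequest {
        int chunkId;
        float pixelCoverage; // estimated on-screen pixels this chunk affects
    };

    struct ByCoverage {
        bool operator()(const ChunkRequest& a, const ChunkRequest& b) const {
            return a.pixelCoverage < b.pixelCoverage; // max-heap on coverage
        }
    };

    int main() {
        std::priority_queue<ChunkRequest, std::vector<ChunkRequest>, ByCoverage> queue;
        // Requests generated as the camera moves; coverage re-estimated per frame.
        queue.push({101, 250000.0f}); // building filling a quarter of the screen
        queue.push({205, 900.0f});    // distant detail, few pixels
        queue.push({307, 40000.0f});  // mid-distance facade

        while (!queue.empty()) {
            std::printf("fetch chunk %d (~%.0f px)\n",
                        queue.top().chunkId, queue.top().pixelCoverage);
            queue.pop();
        }
    }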
Then the metaverse starts to look like Ready Player One.
The closest thing to the metaverse in the form of open worlds right now is flight simulators. They already have an enormous amount of content modeling the entire planet. A lot of that content is generated rather than designed. MS obviously did a nice job with this, and e.g. X-Plane and FlightGear have pretty detailed worlds as well. Most of these worlds are generated from various data sources (satellite imagery, altitude data, OpenStreetMap, etc.). MS does the same but adds machine learning to that mix to generate plausible-looking buildings and objects from the massive amount of data they have access to.
Fusing that with what Epic is doing is basically going to enable even more detailed game worlds where a lot of the content is generated rather than designed. Flight simulator scenery is bottlenecked on the enormous number of triangles needed to model the entire world. Storing them is not the issue; rendering them is. Removing that limitation will allow much higher levels of detail. X-Plane is actually pretty good at generating e.g. roads, rails, and buildings from OpenStreetMap data, but the polygon counts are kept low to keep performance up. Check out e.g. simheaven.com for some examples of what X-Plane can do with open data. Not as good as what MS has delivered, but not that far off either. And MS actually provides some open data as well, which simheaven recently integrated for X-Plane.
FlightGear actually pioneered the notion of streaming scenery content on demand rather than pre-installing it manually. X-Plane does not have this, and MS has now shifted to doing both: installable scenery plus downloading higher-resolution content on demand while you are flying.
Once you hit a certain size, pre-installing all the content is no longer feasible, or even needed. Once these worlds become so large that exploring all of them would take a lifetime (or more), it's better to deliver only those bits that you actually need, when you need them. Doing that in real time means that even caching locally becomes optional. And even the rendering process itself does not have to be local: Nvidia and a few others have been providing streamed games with cloud-based rendering. That makes a lot more sense once you hit a few petabytes or exabytes of scenery data.
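A minimal sketch of the "local caching becomes optional" point: a small bounded LRU over streamed scenery tiles, where eviction is harmless because anything evicted can simply be re-fetched. TileCache and fetchFromServer are placeholders, not any real engine's API:

    #include <cstdio>
    #include <list>
    #include <string>
    #include <unordered_map>

    // Bounded LRU cache over streamed scenery tiles. Because every tile can
    // be re-fetched, eviction loses nothing permanent -- the local cache is
    // purely an optimization.
    class TileCache {
        size_t capacity_;
        std::list<int> order_; // most recently used at front
        std::unordered_map<int, std::pair<std::string, std::list<int>::iterator>> tiles_;

        static std::string fetchFromServer(int tileId) { // placeholder network fetch
            return "mesh-data-for-tile-" + std::to_string(tileId);
        }

    public:
        explicit TileCache(size_t capacity) : capacity_(capacity) {}

        const std::string& get(int tileId) {
            auto it = tiles_.find(tileId);
            if (it != tiles_.end()) { // hit: move to front of LRU order
                order_.splice(order_.begin(), order_, it->second.second);
                return it->second.first;
            }
            if (tiles_.size() == capacity_) { // full: evict least recently used
                tiles_.erase(order_.back());
                order_.pop_back();
            }
            order_.push_front(tileId);
            auto res = tiles_.emplace(
                tileId, std::make_pair(fetchFromServer(tileId), order_.begin()));
            return res.first->second.first;
        }
    };

    int main() {
        TileCache cache(2);
        std::printf("%s\n", cache.get(1).c_str()); // miss: fetched
        std::printf("%s\n", cache.get(2).c_str()); // miss: fetched
        std::printf("%s\n", cache.get(1).c_str()); // hit
        std::printf("%s\n", cache.get(3).c_str()); // miss: evicts tile 2
    }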
Another interesting thing is generating photorealistic scenery from simple sketches, using machine learning trained on massive amounts of images. This too is becoming a thing, and it's going to be much more efficient than manually crafting models in excruciating amounts of detail. MS did this with Flight Simulator. Others will follow.
I think we give the graphics part of the metaverse too much credit. As long as people can immerse themselves in X, I'd say it's good enough. We have had metaverses since the 80s. For my part, I enjoy classic games much more than modern ones; maybe I'm just weird.
Nanite doesn't do any rendering on the CPU; in fact, it moves a lot of work that is traditionally done on the CPU (culling) to the GPU. Nanite does use a software rasterizer for most triangles, but it runs on the GPU.
Oh, I think you're right. The nested FOR loops of the "Micropoly software rasterizer" are running on the GPU.[1] I think. I thought that was CPU code at first. They're iterating over one or two pixels, not a big chunk of screen. For larger triangles, they use the GPU fill hardware. The key idea is to have triangles be about pixel-sized. If triangles are bigger than 1-2 pixels and there's more detail available, use smaller triangles.
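For flavor, here's the textbook shape of such a loop: an edge-function rasterizer scanning a triangle's screen-space bounding box. This is the standard technique, not Nanite's actual code; in Nanite's case it runs inside a GPU compute shader, roughly one tiny triangle per thread:

    #include <algorithm>
    #include <cmath>
    #include <cstdio>

    // Textbook edge-function rasterizer of the kind the talk's "micropoly"
    // loop resembles. For a ~1-2 pixel triangle, the bounding box covers
    // only a handful of iterations, which is why tiny nested loops can beat
    // fixed-function fill hardware at that size.
    struct Vec2 { float x, y; };

    float edge(const Vec2& a, const Vec2& b, const Vec2& p) {
        // Signed area: positive when p is left of edge a->b (counter-clockwise).
        return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
    }

    void rasterize(const Vec2& v0, const Vec2& v1, const Vec2& v2) {
        int minX = (int)std::floor(std::min({v0.x, v1.x, v2.x}));
        int maxX = (int)std::ceil(std::max({v0.x, v1.x, v2.x}));
        int minY = (int)std::floor(std::min({v0.y, v1.y, v2.y}));
        int maxY = (int)std::ceil(std::max({v0.y, v1.y, v2.y}));

        for (int y = minY; y <= maxY; ++y) {
            for (int x = minX; x <= maxX; ++x) {
                Vec2 p{x + 0.5f, y + 0.5f}; // sample at pixel center
                if (edge(v0, v1, p) >= 0 && edge(v1, v2, p) >= 0 && edge(v2, v0, p) >= 0)
                    std::printf("shade pixel (%d, %d)\n", x, y);
            }
        }
    }

    int main() {
        // A triangle roughly one pixel across, counter-clockwise winding.
        rasterize({10.0f, 10.0f}, {11.5f, 10.2f}, {10.3f, 11.5f});
    }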
[1] https://www.youtube.com/watch?v=eviSykqSUUw
[2] https://www.youtube.com/watch?v=WU0gvPcc3jQ