> Your web apps could truly one day be generated frame by frame by a video model. Really. The amount of compute we’ll need will be staggering.
We've technically been able to play board games by entering our moves into our telephones, sending them to a CPU to be combined, then printing out a new board on paper to reflect the new board state. We do not do this because it would be stupid. We cannot count on people starting to do this to save the paper, printer, and ink industries. Some things are not done because they are worthless.
You know that N people can now point a webcam at their boards and have a multimodal LLM understand everyone’s board state, right? Literally zero programming involved, you just have to point a camera at the damn thing and maybe write some glue code.
If you’re a board game player then you are more than capable of imagining possibilities well beyond this.
The parent comment's point isn't that we can't do these things, or that these things are difficult; it's that we don't want to do them. They aren't beneficial.