I recently did a deep dive on open-endedness, and my favorite example of its power is Picbreeder from 2008 [1]. It was a simple website where users could breed pictures generated by tiny neural networks, keeping whichever offspring looked interesting. Most images were garbage, but a few resembled real objects. The best part is that attempts to replicate those images with a traditional hill-climbing method (optimizing directly toward the target) produced drastically more complicated solutions, or no solution at all.
It's a helpful analogy to understand the contrast between today's gradient descent vs open-ended exploration.
This video was fascinating. I didn't know about "open-endedness" as a concept, but now that I see it, of course it's an approach.
One thought... in the video, Ken makes the observation that it takes way more complexity and steps to find a given shape with SGD vs. open-endedness. Which is certainly fascinating. However...
Intuitively, this feels like a similar dynamic to the "birthday paradox": in a room of just 23 people, there is a greater than 50% chance that two of them share a birthday. This surprises most people; it seems like you should need far more people (253, in fact, for a better-than-even chance that someone matches one *specific* birthday). The paradox resolves once you realize your intuition is asking how many people it takes for someone to share *your* birthday, while a room of 23 people implicitly asks for a match between any two people. So you don't have 23 chances, you have 23 × 22 / 2 = 253 chances, one per pair.
I think the same thing is at work here. With the open-ended approach, humans can find any pattern at any generation. With the SGD approach, you can only look for one pattern. So it's just not an apples-to-apples comparison, and it's somewhat misleading or unfair to say that open-endedness is way more "efficient," because you aren't asking it to do the same task.
Said another way, with open-endedness you are effectively searching for thousands (or even millions) of shapes simultaneously. With SGD, you flip that around: you search for exactly one shape, but give it thousands of generations to find it.
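The pairwise counting above is easy to check numerically. A quick sketch (assuming independent, uniform birthdays and ignoring leap years):

```python
from math import comb, prod

n = 365  # days in a year (ignoring leap years)
k = 23   # people in the room

# The real "number of chances" is the number of unordered pairs,
# C(23, 2) = 253 -- not 23**2, which double-counts and includes self-pairs.
pairs = comb(k, 2)

# Exact probability that at least two of the k people share a birthday:
# 1 - P(all distinct) = 1 - (365/365) * (364/365) * ... * ((365-k+1)/365)
p_collision = 1 - prod((n - i) / n for i in range(k))

print(pairs)                   # 253
print(round(p_collision, 3))   # 0.507 -- just over the 50% mark
```

With k = 23 the collision probability lands at about 50.7%, which is why 23 is the canonical threshold in the puzzle.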
I used to think that typing speed was not really that important, especially when now we have so many LLMs doing the typing for us. But honestly, now I think it's even more important because the specificity and detail in your prompts are paramount to getting a good response, and something like a dictation tool (which is what I'm using right now) is really good for generating very specific prompts.
In fact, I wrote all this out using a dictation tool in ~20 seconds (258 WPM).
Agreed. I installed Whisper on my Linux computer with a program called SpeechNote. The dictation is all done offline, and it is astonishingly good.
I also have a whisper dictation app on my Android phone (the app's ID string is 'org.woheller69.whisper', there's a few Whisper apps with the same name "Whisper", but this one is my favorite).
FWIW this was typed by hand on my phone, but these apps are both amazing.
Built one for myself. It's context-aware and promptable.
It's well tested on Linux, less so on other platforms, though in theory it should support them.
It's a bit meta but I wrote it mostly using Claude Code. Once I had an MVP, I was able to prompt much faster by just speaking out what I wanted it to change.
> Search "io" on Google right now and see what comes up...
I don't know about you, but neither comes up for me. Google I/O has always been something you have to include the "Google" part to find, and this news is all about Jony Ive, not the nondescript company name.
Wow. Thanks for posting the direct link to examples. Those sound incredibly good and would be impressive for a frontier lab. For two people over a few months, it's spectacular.
A little overacted; it reminds me of the voice acting in those Flash cartoons from the early days of YouTube. That's not to say it isn't good work, as it still sounds remarkably human. Just silly humans :)
Sounds great. One of the female examples has convincing uptalk. There must be a way to manipulate the latent space to control uptalk, vocal fry, smoker’s voice, lispiness, etc.
Thanks. I was wondering why my open-webui said that I already had the model. I bet a lot of people are making the same mistake I did and downloading just the old, post-quantized 27B.
Cursor is likely very tuned for Claude (prompt-wise and otherwise) due to its dominance with 3.5 and now 3.7. Still, Gemini 2.5's tool calling, which Cursor relies on heavily, has been pretty poor in my experience.
[1] First half of https://www.youtube.com/watch?v=T08wc4xD3KA
More notes from my deep dive: https://x.com/jinaycodes/status/1932078206166749392