Hacker News | jinay's comments

I recently did a deep dive on open-endedness, and my favorite example of its power is Picbreeder from 2008 [1]. It was a simple website where users could iteratively select and breed pictures generated by very simple neural networks. Most images were garbage, but a few resembled real objects. The best part is that attempts to replicate these images with a traditional hill-climbing method would produce drastically more complicated solutions, or no solution at all.

It's a helpful analogy to understand the contrast between today's gradient descent vs open-ended exploration.

[1] First half of https://www.youtube.com/watch?v=T08wc4xD3KA

More notes from my deep dive: https://x.com/jinaycodes/status/1932078206166749392


This video was fascinating. I didn't know about "open endedness" as a concept but now that I see it, of course it's an approach.

One thought... in the video, Ken makes the observation that it takes way more complexity and steps to find a given shape with SGD vs. open-endedness. Which is certainly fascinating. However...

Intuitively, this feels like a similar dynamic is at play with the "birthday paradox". That's where if you take a room of just 23 people, there is a greater than 50% chance that two of them have the same birthday. This is very surprising to most people; it seems like you should need way more people. The paradox is resolved when you realize that your intuition is answering a different question: how many people it takes for someone to share your birthday. But the situation with a room of 23 people is implicitly asking for just one match between any two people. Thus you don't have 23 chances, you have 23 choose 2 = 253 chances.
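(Aside: the standard birthday-problem calculation backs this up; this is a quick sketch, not anything from the video:)

```python
from math import prod

def p_shared_birthday(n: int) -> float:
    # Probability that at least two of n people share a birthday,
    # assuming 365 equally likely birthdays: 1 minus the probability
    # that all n birthdays are distinct.
    p_all_distinct = prod((365 - i) / 365 for i in range(n))
    return 1 - p_all_distinct

print(round(p_shared_birthday(23), 4))  # ≈ 0.5073, just over 50%
print(round(p_shared_birthday(22), 4))  # below 50%, so 23 is the threshold
```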

I think the same thing is at work here. With the open-ended approach, humans can find any pattern at any generation. With the SGD approach, you can only look for one pattern. So it's just not an apples to apples comparison and sort of misleading / unfair to say that open-endedness is way more "efficient", because you aren't asking it to do the same task.

Said another way, I think with the open-endedness, it seems like you are looking for thousands (or even millions) of shapes simultaneously. With SGD, you're kinda flipping that around, and looking for exactly 1 shape, but giving it thousands of generations to achieve it.


Did you see their recent paper building on this? Throwback to Picbreeder!

https://x.com/kenneth0stanley/status/1924650124829196370


Ooh I haven't, but this is exactly the kind of follow-up I was looking for. Thanks for sharing!


Timestamped link to the YouTube video: https://youtu.be/T08wc4xD3KA?t=124


Please add an llms.txt file! https://llmstxt.org/

I'd love to see how far I can take this by giving it to an LLM and asking it to format for me with Quarkdown.
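For anyone unfamiliar, an llms.txt is just a markdown file served at the site root; a minimal sketch per the llmstxt.org spec (the project name and links below are placeholders) looks like:

```markdown
# Project Name

> One-sentence summary of what this site or project is about.

## Docs

- [Getting started](https://example.com/start.md): what this page covers
- [API reference](https://example.com/api.md): detailed API documentation

## Optional

- [Changelog](https://example.com/changelog.md): secondary material an LLM can skip
```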


> Converting complex HTML pages with navigation, ads, and JavaScript into LLM-friendly plain text is both difficult and imprecise.

Oh my god. It just occurred to me that LLMs may have a better experience “browsing the internet” than humans do.

That is so tragically depressing.


On the other hand, the requirement of stuff like this and MCP for LLMs might just bring back open APIs and stuff like RSS back to the web at large!


Soon we'll start having a humans.txt format to account for this

https://xkcd.com/927/


I used to think that typing speed was not really that important, especially now that we have so many LLMs doing the typing for us. But honestly, now I think it's even more important, because the specificity and detail in your prompts are paramount to getting a good response, and something like a dictation tool (which is what I'm using right now) is really good for generating very specific prompts.

In fact, I wrote all this out using a dictation tool in ~20 seconds (258 WPM).


Agreed. I installed Whisper on my Linux computer with a program called SpeechNote. The dictation is all done offline, and it is astonishingly good.

I also have a whisper dictation app on my Android phone (the app's ID string is 'org.woheller69.whisper', there's a few Whisper apps with the same name "Whisper", but this one is my favorite).

FWIW this was typed by hand on my phone, but these apps are both amazing.


Curious, what dictation tool are you using?


https://github.com/JinayJain/dictator

Built one for myself. It's context-aware and promptable.

Well tested on Linux; less so on other platforms, though in theory it supports them.

It's a bit meta but I wrote it mostly using Claude Code. Once I had an MVP, I was able to prompt much faster by just speaking out what I wanted it to change.


Same, 258wpm is something.


FYI I wrote a comment in the same thread where I described the tools I use (TLDR: Whisper).


Likely no coincidence that they announce their company, io, during Google I/O.

Search "io" on Google right now and see what comes up...


I'm pretty sure this was named years ago


I'm more referring to launch timing


> Search "io" on Google right now and see what comes up...

I don't know about you, but neither of them comes up. Google I/O has always been something you have to search for including the "Google" part and this news is all about Jony Ive, not the nondescript company name.


Went in an incognito window and searched "io" and this announcement was shown right above Google IO [1].

[1] https://i.imgur.com/xNKjFXa.png


That's not the only concern though. The web is by far the most accessible way to distribute something, which is paramount in getting people to use it.


They really need to fix the kerning on some of their charts


I wonder if you could use the Chrome local AI API to build something like this: https://developer.chrome.com/docs/ai


Interesting, I'll look into it


For anyone who wants to listen, it's on this page: https://yummy-fir-7a4.notion.site/dia


Wow. Thanks for posting the direct link to examples. Those sound incredibly good and would be impressive for a frontier lab. For two people over a few months, it's spectacular.


A little overacted, it reminds me of the voice acting in those flash cartoons you'd see in the early days of YouTube. That's not to say it isn't good work, it still sounds remarkably human. Just silly humans :)


Overacted and silly humans indeed: https://www.youtube.com/watch?v=gO8N3L_aERg


> flash cartoons you'd see in the early days of YouTube

Wouldn't those be straight from Newgrounds?


Thank you! I couldn't remember the name Newgrounds for some reason!!


Reminded me of the Fenslerfilm G.I. Joe sketch where the kids have something on the stove burning


Stop all the downloading!


This is an instant classic. Sesame comparison examples all sound like clueless rich people from The White Lotus.


Sounds great. One of the female examples has convincing uptalk. There must be a way to manipulate the latent space to control uptalk, vocal fry, smoker’s voice, lispiness, etc.


Make sure you're using the "-it-qat" suffixed models like "gemma3:27b-it-qat"



Thanks. I was wondering why my open-webui said that I already had the model. I bet a lot of people are making the same mistake I did and downloading just the old, post-quantized 27B.


Cursor is likely very tuned for Claude (prompts-wise and all) due to its dominance with 3.5 and now 3.7. Still, Gemini 2.5's tool calling, which Cursor relies on heavily, has been pretty poor in my experience.


Yep. Tool calling is terrible across all Gemini models. I’m not sure why, when the model itself is so good.

