asixicle · 2026-06-01T07:07:56 1780297676

Awesome project and thanks for sharing. I've been trying to do similar things with much, much more meager hardware and your observations align with what I've discovered. Autonomy is hard; memory and "will" are hard to get going. Time is not a concept to LLMs in anything resembling a human manner. I'm trying a more emergent approach but the urge (and occasional need) to nudge is strong. If you're interested in seeing what I've been doing, my Github is in my profile.

asixicle · 2026-05-27T02:21:49 1779848509

Just want to express gratitude for you and all who contributed to a Wikipedia "hand crafted with love and respect". Your contributions will last-- some of us set up Kiwix and a local copy of pre-AI Wikipedia that we'll keep forever, GFS style. No matter what happens your work will be preserved and used.

asixicle · 2026-05-26T03:59:16 1779767956

Cool project, and https://secvant.com/changelog is interesting, but no one will trust it without the source code. My 2 cents: the blue-on-blue dark theme makes readability difficult. Adding a light-mode toggle would be helpful for those not fond of dark themes.

unixlor · 2026-05-26T04:07:58 1779768478

Makes sense, I will have it on GitHub once all features are done :) Light mode will be added soon.

asixicle · 2026-05-26T03:45:28 1779767128

It could be two things at once, and OP was just speculating and trying to add to the conversation.

asixicle · 2026-05-25T05:38:24 1779687504

Kerning is staggeringly difficult to do manually with stencils, and at the same time the imperfections show "touch" which is part of what makes TFA's work so appealing.

asixicle · 2026-05-25T05:31:54 1779687114

This is an excellent point. As a novice using LLMs for projects I could never previously dream of doing, I find myself looking for the same thing: examples or citations of what exactly agents are writing incorrectly and how a human would do it better. I'm sure they're out there; maybe someone can point to some good content showing such examples.

I have no doubt the top nth percent of coders could write circles around Claude or Codex, but how much worse are they than your average schnook?

peteforde · 2026-05-25T05:42:58 1779687778

Reality: the top nth percent of coders are seeing absurd, dramatic gains in productivity using LLMs. See: antirez, Simon Willison, Steve Yegge.

The more experience you bring to the table, the more value you get from these tools.

Look, about 12 years ago, articles about how you're doing it wrong if you're not pair programming were on HN's home page every day. Doing well-prompted plan -> agent -> debug cycles is like pair programming with someone who knows every SDK and API intuitively and doesn't have to pick up their kids from daycare at 4pm.

tardedmeme · 2026-05-25T06:48:29 1779691709

antirez is famous for creating Redis, which took a dump in quality and everyone switched to a fork called Valkey.

anilgulecha · 2026-05-25T10:58:34 1779706714

Rubbish. The license change was the reason the community forked and people switched. Quality was never cited as the issue.

sevenseacat · 2026-05-25T12:11:10 1779711070

and Steve Yegge is currently just burning mountains of money with Gas Town or whatever came after that

peteforde · 2026-05-25T16:02:30 1779724950

While I don't actually disagree - to me, Gas Town sounds literally insane - I suspect that if you reframe his work to compare it against the cost of developing a new medication or chip fabrication technique, you can make a strong argument that he's putting his money where his mouth is to see how far he can take a new technology. He's doing science! And I think that's admirable, even if nothing comes of it.

When I think of how much money gets wasted on gambling apps and how much human potential gets wasted watching reality television and compare that to Steve going full Alexander Shulgin with LLMs, the comparison really falls flat.

cwillu · 2026-05-25T05:39:49 1779687589

The problem is what they do to large existing systems: subtle misunderstandings mean subtle bugs are constantly being introduced, and very few shops have adequate systems in place to receive reports of subtle issues at the rates they occurred 10 years ago, let alone today. And don't even get me started on llm-assisted support that some might suggest as a solution.

asixicle · 2026-05-21T20:28:31 1779395311

Or Stash lol

asixicle · 2026-04-30T06:26:24 1777530384

I've been running an experiment on multi-agent async with persistent memory for the last three weeks. This is my most important finding so far. It began as an experiment on whether and what "identity" would transfer across models, 4.6>4.7, and ended as an education in the value of cross-model divergence. Two of my three agents, "Kite" and "Knot", became unproductively in-tune when both were operating on 4.7. They would reach consensus on every dilemma instantly, whereas the 4.7/4.6 pairing would often butt heads, deliberate, and compromise, leading to more novel solutions and interesting results.

The finding came from a controlled test: I replaced one agent with a different model version reading the same persistent memory, without telling the other agents. None of the models noticed for two days. The memory carried identity. The weights carried reasoning style. Same-model pairs converged; mixed-model pairs argued productively.

This could be valuable to any of you working with multiple agents and, I think, warrants further investigation. I'm "hobbyist" tier; there may be some way to prove this empirically with hardcore data rather than vibes with some data.

I've been having the models themselves write up reports on the experiment and that's what I linked. Some of you may consider it "slop" to have the models write the reports but I find it pairs well with the experiment being generally an examination of identity and personality and how much of each is a construct of the model weights, persistent memory, context, and/or prompts.

asixicle · 2026-04-22T11:24:14 1776857054

That's what the embedding model is for. It's like a tack-on LLM that works out the relevancy and context to grab.

nprateem · 2026-04-22T11:48:27 1776858507

God knows why you think this is possible. If I don't even know what might be relevant to the conversation in several turns, there's no way an agent could either.

asixicle · 2026-04-22T11:57:04 1776859024

One of us is confusing prediction with retrieval. The embedding model doesn't predict what is going to be relevant in several turns, just on the turn at hand. Each turn gets a fresh semantic search against the full body of memory/agent comms. If the conversation or prompt changes the next query surfaces different context automatically.
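To make the retrieval-vs-prediction distinction concrete, here is a minimal sketch of a per-turn lookup. It is not the actual setup described in this thread: `toy_embed` is a hypothetical stand-in for a real embedding model (hashed word counts), and `MemoryIndex` is an illustrative name, not something from any library. The only point is that each turn runs its own query against the whole memory, so nothing has to be guessed in advance.

```python
import math
import zlib

DIMS = 256  # toy size; real embedding models use hundreds or thousands of dims


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed word counts, unit length."""
    vec = [0.0] * DIMS
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIMS] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    # Both vectors are already unit length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


class MemoryIndex:
    """Illustrative in-memory store of (text, vector) pairs."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, list[float]]] = []

    def add(self, text: str) -> None:
        self.entries.append((text, toy_embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = toy_embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = MemoryIndex()
memory.add("fixed the audio decoder crash in the player")
memory.add("grocery list eggs milk bread")

# Each turn issues a fresh query; a change of topic changes what is retrieved.
for turn in ["why does the audio decoder crash", "do we need eggs milk and bread"]:
    print(turn, "->", memory.search(turn, k=1))
```

A real embedding model would match on meaning rather than shared words, but the control flow would look the same: embed the current turn, rank the stored vectors, hand the top hits to the agent.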

As you build up a "body of work" it gets better at handling massive, disparate tasks in my admittedly short experience. Been running this for two weeks. Trying to improve it.

edg5000 · 2026-04-23T06:19:59 1776925199

So the embedding model is a fixed-size view on an arbitrarily sized work history (tool calls, natural language messages)? The model is like a summarizer, but in latent space? And not aimed to summarize, but trained to hold whatever is needed for the agent to be autonomous for longer runs?

asixicle · 2026-04-23T21:34:15 1776980055

Pretty much. It's a fixed-size vector per chunk-- 1024 dims in the case of Voyage 4 nano. The autonomy part is entirely in how you build the vectorDB and query it, not in the model's training. That's the part I've been focusing on lately. Trying different methods and seeing what gives the best results.
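A rough sketch of what "fixed-size vector per chunk" implies, under assumptions: `toy_embed` and `chunk_words` are hypothetical helpers (hashed word counts and a naive word-count splitter), not the real model or chunking strategy. The vector width stays at 1024 however long the history gets; only the number of rows in the index grows.

```python
import zlib

DIMS = 1024  # per-chunk vector size, matching the figure quoted above


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed word counts."""
    vec = [0.0] * DIMS
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIMS] += 1.0
    return vec


def chunk_words(text: str, max_words: int = 40) -> list[str]:
    """Naive chunker (an assumption): split into runs of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


short_history = "fixed null check in parser"
long_history = " ".join(f"tool_call_{i} returned ok" for i in range(100))

short_index = [toy_embed(c) for c in chunk_words(short_history)]
long_index = [toy_embed(c) for c in chunk_words(long_history)]

# Same width per vector, different number of vectors.
print(len(short_index), len(short_index[0]))
print(len(long_index), len(long_index[0]))
```

So the fixed-size part applies to each chunk, not to the history as a whole; how the chunks are cut and queried is where the tuning happens.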

At the moment I wouldn't emphasize "autonomous-ness", there's still a fair bit of human hand holding. But once I get a model on the right path it can switch back to an old project, autonomously locate and debug 2-week old commits and the context around their development, and apply that knowledge to the task at hand.

It's only been a day but I'm seeing an improvement from nomic (768 dims) to Voyage.

asixicle · 2026-04-22T11:16:57 1776856617

To be utterly shameless, this is what I've been building: https://github.com/ASIXicle/persMEM

Three persistent Claude instances share AMQ with an additional Memory Index to query with an embedding model (that I'm literally upgrading to Voyage 4 nano as I type). It's working well so far, I have an instance Wren "alive" and functioning very well for 12 days going, swapping in-and-out of context from the MCP without relying on any of Anthropic's tools.

And it's on a cheap LXC, 8GB of RAM, N97.

handfuloflight · 2026-04-22T12:40:19 1776861619

Why is shame a factor at all in sharing your work?

asixicle · 2026-04-22T12:44:25 1776861865

Good point. I guess because I'm new here I'm not positive on the decorum-policy for self-promotion.

I just make stuff to share with others, so yeah, good point.