Just claims with nothing to back them up. Steal people's years of work, then turn around and say you made it "so much better". Support this compiler for 20 years first.
What I missed when trying it was a simple way of accessing private repositories. There does not seem to be SSH agent forwarding, or is there? What do people use?
I realize this is all very fresh, but still wondering…
It is really important that such posts exist. There is a risk that we only hear about the wild successes and never the failures, but it is from the failures that we learn the most.
One difference between this story and the various success stories is that the latter all had comprehensive test suites as part of the source material that agents could use to gain feedback without human intervention. This doesn’t seem to exist in this case, which may simply be the deal breaker.
>> This doesn’t seem to exist in this case, which may simply be the deal breaker.
Perhaps, but perhaps not. The reason tests are valuable in these scenarios is that they are actually a kind of system spec. LLMs can look at them to figure out how a system should (and should not) behave, and use that to guide the implementation.
I don’t see why regular specs (e.g. markdown files) could not serve the same purpose. Of course, most GitHub projects don’t include such files, but maybe that will change as time goes on.
There is one feature in Claude Code that is often overlooked, and that I haven't seen in any of the other agentic tools: a tool called "sub-agent", which creates a fresh context window in which the model can independently work on a clearly defined sub-task. This effectively turns Claude Code from a single-agent model into a hierarchical multi-agent model (I am not sure if the hierarchy goes to depths >2).
I wonder if it is a conscious decision not to include this (I imagine it opens up a lot of possibilities for going crazy, but it also seems to be the source of a great amount of Claude Code's power). I would very much like to play with this if it appears in gemini-cli.
The next step would be the possibility to define custom prompts, toolsets and contexts for specific recurring tasks, and have these appear as tools to the main agent. An example of such a task: create_new_page. The prompt could describe the steps one needs to follow to create a page. The main agent could then simply delegate this as a well-defined task, without cluttering its own context with the operational details.
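To make that concrete, here is a rough sketch of what such a reusable task definition might look like. Everything here is invented for illustration: the file location, field names, tool names and referenced files are assumptions, not something Claude Code or gemini-cli is known to support.

    # hypothetical file: .agent/tasks/create_new_page.yaml (invented for illustration)
    name: create_new_page
    description: >
      Create a new page in the web app, including the route, the view component,
      and a basic smoke test. Exposed to the main agent as a single tool call.
    prompt: |
      You are creating a new page. Follow these steps:
      1. Add a route entry for the new page.
      2. Create the view component in the pages directory.
      3. Register the page in the navigation menu.
      4. Add a smoke test that renders the page.
    tools:            # the only tools this sub-agent may use
      - read_file
      - write_file
      - run_tests
    context:          # files pre-loaded into the sub-agent's fresh context window
      - src/routes.ts
      - docs/page-conventions.md

The main agent would then see create_new_page as just another tool and delegate to it, while the operational details stay inside the sub-agent's own fresh context window.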
Possibly. One could think about hooking this in as a tool or a simple shell command. But then there is nothing to coordinate multiple tools modifying the codebase simultaneously.
Still, it is worth a try, and it may be possible with some prompting and duct tape.
One thing I'd really like to see in coding agents is this: As an architect, I want to formally define module boundaries in my software, in order to have AI agents adhere to and profit from my modular architecture.
Even with 1M context, it makes sense to define boundaries for large projects. These will typically be present in some form, but they are not available to the coding agent in a precise way. Imagine there was a simple YAML format where I could specify the modules, where each one can be found in the source tree, and the APIs of the other modules it interacts with. Then it would be trivial to turn this into a context that would very often fit into 1M tokens. When an agent decides something needs to be done in the context of a specific module, it could then create a new context window containing exactly that module, effectively turning a large codebase into a small codebase, for which Gemini is extraordinarily effective.
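Purely as an illustration, such a file could look like the sketch below. The format, file name and paths are all made up here; no current tool reads anything like this.

    # hypothetical modules.yaml -- format invented for illustration
    modules:
      - name: billing
        path: src/billing/                 # where the module lives in the source tree
        public_api: src/billing/api.ts     # the surface other modules are allowed to use
        depends_on:                        # APIs of other modules this one interacts with
          - src/accounts/api.ts
          - src/notifications/api.ts
      - name: accounts
        path: src/accounts/
        public_api: src/accounts/api.ts
        depends_on: []

An agent working on billing would then load only src/billing/ plus the two declared API files into a fresh context window, instead of the whole repository.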
I would be interested in reading what tools are made available to the LLM, and how everything is wired together to form an effective analysis loop. It seems like this is a key ingredient here.
- All prompts used
- The structure of the agent team (which agents / which roles)
- Any other material that went into the process
This would be a good source for learning, even though I'm not ready to spend $20k just to replicate the experiment.