elpakal · 2026-03-17T17:07:24 1773767244

I thought about using claw but it felt like overkill, and I wonder if an AI browser (atlas etc) would do the trick.

npilk · 2026-03-17T17:21:23 1773768083

For sure it was overkill/not the most efficient approach - really I was more just curious if it would work. The answer was "kind of", but even that is pretty amazing. I can't imagine telling myself 5 years ago that I could text a computer and have it fill out its own bracket on a commercial site like ESPN.

bwade818 · 2026-03-17T20:42:41 1773780161

It's going to be cool when you put in your todo list in the morning that you need to fill out your ESPN bracket, and by lunch your agent has 3 different versions ready for your review.

elpakal · 2026-03-17T17:06:27 1773767187

Really cool idea. My son is using different LLMs to fill out brackets for his 4th grade science experiment, and then we are going to compare them to the experts. I like your idea of Strategy/Inspiration prompting; we had to tell them that "upsets happen" because all the favorites were picked on the first pass.

Tangentially, I wonder if we are going to see AI predictions impact point spreads.

bwade818 · 2026-03-17T18:57:31 1773773851

I know multiple people who are building arbitrage models with their agents. I bet it makes the markets pretty efficient.

elpakal · 2026-03-11T14:56:41 1773241001

> This is a known limitation with small LLMs (0.6B-1.2B) doing tool calling.

To me this is the nut to crack, wrt tool calling and locally running inference. This seems like a really cool project and I'm going to dive around a little later, but if it's hallucinating on something as basic as this, it makes me think it's more at the POC stage right now (to echo other sentiment here).

sanchitmonga22 · 2026-03-11T15:03:42 1773241422

That's a fair read. Tool calling reliability with sub-4B models is genuinely the hardest unsolved problem in on-device AI right now.

The inference engine (MetalRT) is production-grade, the pipeline architecture is solid, but the models at this size are still the weak link for complex tool routing. Larger model support (where tool calling is much more reliable) is next on the roadmap. Please stay tuned!

elpakal · 2026-03-11T15:43:25 1773243805

Sorry, I scrolled through some of the rest of the comments on this thread and can’t stay tuned.

elpakal · 2026-03-11T14:49:00 1773240540

> file-based state that persists between agent invocations

Can you expand on this with a practical example?

fudfomo · 2026-03-11T15:19:24 1773242364

It needs a canonical source of truth, something isolated agents can't provide easily. There are tools out there like specularis that help you do that and keep specs in sync.

sveme · 2026-03-11T15:08:58 1773241738

One example: I let the agent distill the essence of all previous discussions into a spec.md file, check it for completeness, and remove all previous context before continuing.
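
For anyone who wants something more concrete, here is a minimal sketch of file-based state that survives between invocations. It is not taken from the project being discussed: the JSON layout, the file name, and the `run_agent_once` helper are all hypothetical, and a real agent would call a model where this one just appends a string.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return the saved state, or a fresh empty state on the first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"decisions": [], "runs": 0}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so a crash cannot leave a half-written file."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(state, tmp, indent=2)
    os.replace(tmp_name, path)


def run_agent_once(path: Path, new_decision: str) -> dict:
    """One hypothetical invocation: read prior state, add to it, persist it."""
    state = load_state(path)
    state["decisions"].append(new_decision)  # stand-in for real agent work
    state["runs"] += 1
    save_state(path, state)
    return state
```

Each call starts with no memory except what is in the file, so calling `run_agent_once(Path("agent_state.json"), "...")` twice in separate processes leaves both decisions on disk. The spec.md approach above is the same idea with prose instead of JSON.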

patchnull · 2026-03-11T15:08:12 1773241692

[flagged]

elpakal · 2026-03-11T15:44:36 1773243876

thanks

elpakal · 2026-03-09T20:17:22 1773087442

An iOS app size analysis tool you can run on your Mac and track build size changes over time https://apps.apple.com/us/app/dotipa/id6742254881

elpakal · 2026-02-09T16:00:37 1770652837

An iOS build size analysis app that runs locally on your Mac: https://apps.apple.com/us/app/dotipa/id6742254881

elpakal · 2026-02-05T21:13:59 1770326039

Does fastlane still hang for a little before every command? I used to optimize build pipelines for a large company's iOS teams and it always seemed to stall for a little before doing the work. We eventually moved to Xcode Cloud (mainly to avoid code signing) and ran xcodebuild directly.

elpakal · 2026-02-02T23:59:10 1770076750

why would it need local network access though, I wonder?

elpakal · 2026-02-02T20:53:08 1770065588

> this just calls Codex CLI with OS sandboxing

The git and terminal views are a big plus for me. I usually have those open and active in addition to my Codex CLI sessions.

Excited to try skills, too.

elpakal · 2026-01-13T03:32:38 1768275158

But they created GenMoji?!