Hacker News | mrbonner's comments

We have had a service to add two numbers. What makes you think this is not realistic? :-)

I too have witnessed an "add two numbers" service! Turns out you can be too extreme with rules for isolating business logic...

Same! It had validation on each number before adding them. Poor design, but that's how it worked.

I find this so hard to believe, but I've nearly always worked in small groups/companies. Can you, or any of the commenters above, explain why the reasoning that leads to such a service isn't rejected by, well, common sense? Some super-special requirements?

Sure. In this case, it started as a method with two parameters; each was validated internally before addition.

The validation was long-running, as it required checking two other services to confirm both of the numbers were OK.

Because of issues calling those services, instead of two nasty synchronous calls, it turned into calling a microservice asynchronously and using a callback. Then that microservice was owned by the team that owned those two other services.

Don't underestimate the power of Conway's law.
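To make the story above concrete, here is a hypothetical reconstruction (all names invented) of what the original shape looked like: a plain method whose two operands each had to be validated by a separate owning service, stubbed here as local checks. It is this validation dependency, not the addition, that later pulled the whole thing out into an async microservice.

```rust
// Stand-in for the first owning service's slow remote validation call.
fn check_with_service_a(n: i64) -> Result<(), String> {
    if n >= 0 { Ok(()) } else { Err(format!("service A rejected {n}")) }
}

// Stand-in for the second owning service's check.
fn check_with_service_b(n: i64) -> Result<(), String> {
    if n < 1_000_000 { Ok(()) } else { Err(format!("service B rejected {n}")) }
}

// The "add two numbers" method: both operands validated before addition.
fn validated_add(a: i64, b: i64) -> Result<i64, String> {
    check_with_service_a(a)?;
    check_with_service_b(b)?;
    Ok(a + b)
}

fn main() {
    assert_eq!(validated_add(2, 3), Ok(5));
    assert!(validated_add(-1, 3).is_err());
    println!("2 + 3 = {:?}", validated_add(2, 3));
}
```

Once each `check_with_service_*` call is a flaky network round trip, wrapping the whole thing in its own asynchronous service with a callback stops looking absurd and starts looking like Conway's law.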


In the case I mentioned at https://news.ycombinator.com/item?id=48062322, it was because the Infrastructure org had grown out of what had previously been Datacenter Operations.

So they had a team of SWEs who knew the system they were responsible for was absurd, but they weren't able to adequately explain that to the senior management folk who came from that DCOps culture and held asset management & configuration tracking to be paramount. The uniqueness was seen less as an inherent property, and more as a constraint that needed to be enforced.

My team of DevOps-y proto-Platform Engineers struggled with the org's culture in similar ways, so I had a lot of sympathy for the situation they found themselves in and how they were handling it. I believe their Zookeeper-based system was intended to be more of a generic lightweight config registry which would eventually have replaced the gigantic SOAP-based CMDB nightmare - basically Consul a year or two before Consul existed.

The reason why they struggled to get it into production was that it would have been so obviously useful that they kept having additional requirements and use cases forced into their "MVP". That sort of scope creep, driven by tech leadership wanting to make their mark on a successful project, is also pretty common in large orgs.


Fortunately, I've never encountered that. But still, I can see the usefulness of a guaranteed globally unique UUID, at least for certain purposes. However, a service to add numbers baffles me. The operations needed to create, send, receive and check the message are so much more complex than the addition itself...

I must say, I did experience some lousy tech+sales leadership at one company, which was indeed the biggest I ever worked at. A decent product with a well-understood scope was completely scrapped and rewritten. Some team spent more than a year on the (waterfall) design of the new system, which was then scrapped too. When I joined, there was an 8-man team for just the message bus of the new new system. Which didn't even work correctly. The whole thing was flexible, but in nearly every other aspect inferior to the original product. And it needed much heavier hardware.


good luck and take care of yourself!

Your method of combining models to strengthen the implementation reminds me of how we form stronger alloys by combining metals!

it also sounds like a lot to manage. Do you have some sort of agentic framework that treats all of these LLMs you have access to as inputs that it optimizes?

Unfortunately not. I'm using plain Kimi, opencode (with DeepSeek, GPT, MiniMax, whatever) and Claude. Claude is the best, but only for some hours. The trick is a good AGENTS.md file, plus good test cases and a test runner to repro, like seamless docker and qemu calls. GNU autotools would be easiest, but here I'm using plain makefiles. Also, for the clangd LSP an up-to-date compile_commands.json is important. git worktrees helped with developing the arm port and fixing c-testsuite cases in parallel. I wanted to keep the costs down. About $15-$30 I think.

And those models are much better at low-level problems, like the ARM calling convention in asm, than at simple algorithmic Python problems. Only for the hardest problem did I need the big expensive gun, but never Opus. This helps in deciding what to do with my next JIT project.


Not OP, but I wrote llm-consortium to prompt multiple models and create a synthesis. And it can run behind an OpenAI-style endpoint using llm-model-gateway. It's expensive, naturally, but for situations where you absolutely must get max intelligence it's hard to beat.

e.g.

  Pelican Riding a Bicycle — Engineering Study by DeepSeek v4 Pro, Kimi K2.6, and GLM-5.1 (1 iteration in synthesis mode with DeepSeek v4 flash as judge)
https://htmlpreview.github.io/?https://gist.githubuserconten...
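The consortium idea itself is simple to sketch. This is an illustrative toy only, not llm-consortium's actual API (the `Model` trait, `Stub` client, and `consortium` function are all invented here): fan one prompt out to several models, collect the candidate answers, and hand all of them to a "judge" model that produces a single synthesis.

```rust
// A minimal model interface; a real client would make an HTTP call.
trait Model {
    fn answer(&self, prompt: &str) -> String;
}

// Stand-in for a real model client; the &'static str is the model name.
struct Stub(&'static str);

impl Model for Stub {
    fn answer(&self, prompt: &str) -> String {
        format!("[{}] {}", self.0, prompt)
    }
}

// Fan the prompt out, then ask the judge for one merged answer.
fn consortium(models: &[Box<dyn Model>], judge: &dyn Model, prompt: &str) -> String {
    let candidates: Vec<String> = models.iter().map(|m| m.answer(prompt)).collect();
    judge.answer(&format!("synthesize: {}", candidates.join(" | ")))
}

fn main() {
    let models: Vec<Box<dyn Model>> = vec![
        Box::new(Stub("deepseek")),
        Box::new(Stub("kimi")),
        Box::new(Stub("glm")),
    ];
    let judge = Stub("judge");
    let out = consortium(&models, &judge, "draw a pelican");
    assert!(out.contains("deepseek") && out.contains("kimi") && out.contains("glm"));
    println!("{out}");
}
```

Running extra iterations just means feeding the synthesis back in as the next round's prompt; that loop is what costs the money.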

Don’t listen to anyone saying it is fine for reading or writing extensively with the Xreal. I have one and it is a PITA to do that over a long period. You'd better just stick to watching videos or playing games with it.


It’s all for show I guess. But at this point, why would anyone be surprised about it?


So this is the norm: the quantized version of the SOTA model becomes the previous model, and the full model becomes the latest model. Rinse and repeat.


Cool! I checked the source and noticed that even the LLM prefers a simplified, high-level Rust coding style: use value types such as String, use smart pointers such as reference counting, clone liberally, etc… instead of fighting the borrow-checker gatekeepers.

It's the style I prefer when using Rust. Coming from Python, TypeScript and even Java, even this high-level Rust yields an incredible improvement already.
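For anyone unfamiliar with what that style looks like, here's a tiny made-up example (names invented): owned `String` values, `Rc` for shared data, and liberal `.clone()` calls instead of threading lifetimes through every signature.

```rust
use std::rc::Rc;

// Owned fields everywhere: no lifetime parameters on the struct.
#[derive(Clone)]
struct User {
    name: String,
    tags: Rc<Vec<String>>,
}

// Takes ownership of a clone instead of borrowing.
fn greet(u: User) -> String {
    format!("hello, {} ({} tags)", u.name, u.tags.len())
}

fn main() {
    let tags = Rc::new(vec!["admin".to_string()]);
    let u = User { name: "mrbonner".to_string(), tags: Rc::clone(&tags) };
    // Cloning here is a String copy plus an Rc refcount bump - cheap enough.
    let msg = greet(u.clone());
    assert_eq!(msg, "hello, mrbonner (1 tags)");
    // tags and u.tags remain; greet's temporary copy was already dropped.
    assert_eq!(Rc::strong_count(&tags), 2);
}
```

You give up some performance and some compile-time guarantees about aliasing, but coming from GC'd languages the ergonomics feel familiar while still getting Rust's type system and tooling.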


> Cool! I checked the source and noticed that even the LLM prefers a simplified, high-level Rust coding style: use value types such as String, use smart pointers such as reference counting, clone liberally, etc… instead of fighting the borrow-checker gatekeepers.

Yeah, that tracks, because the AI is dumb as a bag of bricks. It can apply patterns off Stack Overflow, but can hardly understand the borrow checker.


An unverifiable software stack, now amplified by LLM nondeterminism. This whole thing starts to feel like we are building on top of a giant house of cards!


You're talking about the Aladeen or that Aladeen? I don't understand which Aladeen you're talking about.


This is great. I think Apple bought Kuzu, an in-memory graph database, in late 2025 to support RAG in combination with their FM, like this. Even with such a small model, a comprehensive context of our personal data in a graph RAG would be sufficient for a PA system. Do we know if we can have access to this RAG data?

