Hacker News | new | past | comments | ask | show | jobs | submit | shad42's comments

In some ways: we use their product and they use Mendral

100% and LLMs have tons of related training data

very interesting, curious if there is any downside to running this at scale (compute?)

I'd assume it probably depends on how large and varied your logs are?

But my guess is an algorithm like that would be very fast. It's basically doing a form of compression, so I'd ballpark the cost at roughly the same as just zipping the log.

Can't be anything CLOSE to the compute cost of running any part of the file through an LLM haha
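To make the "it's basically compression" intuition concrete, here's a toy sketch using normalized compression distance (NCD) with stdlib zlib. The log lines are invented for illustration; this isn't from any real system, just a demonstration of why the compute cost sits near "zipping the log":

```python
import zlib

def csize(data: bytes) -> int:
    """Compressed size in bytes; the stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data))

def ncd(a: str, b: str) -> float:
    """Normalized compression distance: near 0 for near-duplicates,
    closer to 1 for unrelated text."""
    x, y = a.encode(), b.encode()
    cx, cy, cxy = csize(x), csize(y), csize(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# A baseline of "normal" log traffic (made up).
baseline = "\n".join([
    "INFO request handled in 12ms",
    "INFO request handled in 9ms",
    "INFO request handled in 15ms",
])

lines = [
    "INFO request handled in 11ms",
    "PANIC: segfault in worker 3 at 0xdeadbeef",
]

# Routine lines compress well against the baseline; surprising lines don't.
scores = [(line, ncd(baseline, line)) for line in lines]
```

The only work per line is a couple of zlib calls, which is why the cost stays in the same ballpark as compressing the file once, orders of magnitude below pushing the log through an LLM.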


We did not want to make the post engineering-focused, but we have 18 companies in production today (we wrote about PostHog in the blog). At some point we should post some case studies. The metric we track for usefulness is our monthly revenue :)

Mendral is replacing a human Platform Engineer. It debugs the CI logs, looks at the associated commit, looks at the implementation of the tests, etc. It then proposes fixes and takes care of opening a PR.

We wrote about how this works for PostHog: https://www.mendral.com/blog/ci-at-scale


There is a cost associated with each investigation (that the Mendral agent is doing), and we spend time tuning the orchestration between agents. Yes, it's expensive, but we're making money on top of what it costs us. So far we've been able to bring the cost down while increasing the relevance of each root cause analysis.

We're writing another post about that specifically; we'll publish it sometime next week.


What is your pricing like? Do you do usage based pricing by any chance?

I agree. We automated in the Mendral agent what is time-consuming for a human (like debugging a flaky test), but it will need permission to confirm the remediation and open a PR.

But it's night and day to fix your CI when someone (in this case an agent) has already dug into the logs and the test code and proposed options for a fix. We have several customers asking us to automate the rest (all the way to merging code), but we haven't done it for the reasons you mention. Although I'm sure we'll get there sometime this year.


Shameless plug here for Lexega—a deterministic policy enforcement layer for SQL in CI/CD :) https://lexega.com

There are bridges here that the industry has yet to figure out. There is absolutely a place for LLMs in these workflows, and what you've done here with the Mendral agent is very disciplined, which is, I'd venture to say, uncommon. Leadership wants results, which pushes teams to ship things that maybe shouldn't be shipped quite yet. IMO the industry is moving faster than it can keep up with the implications.


LLMs are better now at pulling in context themselves (as opposed to you feeding everything you can into the prompt). So you can expose enough query primitives to the LLM that it's able to filter out the noise.

I don't think implementing filtering at log ingestion is the right approach, because you don't know what is noise at that stage. We spent more time thinking about the schema and indexes to make sure complex queries perform at scale.
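A minimal sketch of what "keep everything at ingest, expose narrow query primitives" could look like. To be clear: sqlite3 here is just a convenient stand-in for whatever store is actually used, and every table, column, and function name is hypothetical, not Mendral's actual schema or API:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ci_logs (
    id      INTEGER PRIMARY KEY,
    run_id  TEXT NOT NULL,
    step    TEXT NOT NULL,
    level   TEXT NOT NULL,
    ts      INTEGER NOT NULL,
    message TEXT NOT NULL
);
-- Indexes chosen for the agent's query shapes, not for ingest-time filtering:
CREATE INDEX idx_logs_run_step ON ci_logs (run_id, step, ts);
CREATE INDEX idx_logs_level    ON ci_logs (level, ts);
""")

conn.executemany(
    "INSERT INTO ci_logs (run_id, step, level, ts, message) VALUES (?, ?, ?, ?, ?)",
    [
        ("run-1", "test", "INFO", 1, "collected 128 tests"),
        ("run-1", "test", "ERROR", 2, "FAILED test_checkout_flow - TimeoutError"),
        ("run-1", "deploy", "INFO", 3, "skipped: previous step failed"),
    ],
)

def search_logs(run_id, level=None, contains=None, limit=50):
    """A narrow query primitive an LLM tool call could invoke. Nothing was
    filtered at ingest; the agent decides what is noise at query time."""
    sql = "SELECT step, level, message FROM ci_logs WHERE run_id = ?"
    args = [run_id]
    if level:
        sql += " AND level = ?"
        args.append(level)
    if contains:
        sql += " AND message LIKE ?"
        args.append(f"%{contains}%")
    sql += " ORDER BY ts LIMIT ?"
    args.append(limit)
    return conn.execute(sql, args).fetchall()

errors = search_logs("run-1", level="ERROR")
```

The design point is in the indexes: because raw logs are kept in full, the composite `(run_id, step, ts)` and `(level, ts)` indexes are what keep the agent's repeated, selective queries fast at scale.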


I'm sure you've heard this before: there are only two hard things in CS: cache invalidation and naming things.

In the history of this company, I can honestly say that this SQL/LLM thing wasn't the hardest :)


And the other of the two problems is off-by-one errors.

From my own experience that's true, and I think it's due to the amount of SQL content (docs, best practices, code) you can find online, which is now in every LLM's training corpus.

Same applies when picking a programming language nowadays.

