jey's comments | Hacker News

(2014)


have you tried vim or neovim?

I don't think the point was to say "look, AI can just take care of writing a browser now". I think it was to show just how far the tools have come. It's not meant to be production quality, it's meant to be an impressive demo of the state of AI coding. Showing how far it can be taken without completely falling over.

EDIT: I retract my claim. I didn't realize this had servo as a dependency.


This is entirely too charitable. Basically all this proves is that the agent could run in a loop for a week or so, did anyone doubt that?

They marketed it as if we were really close to having agents that could build a browser on their own. They rightly deserve the blowback.

This is an issue that is very important because of how much money is being thrown at it, and that affects everyone, not just the "stakeholders". At some point, if it does become true that you can ask an agent to build a browser and it actually does, that is very significant.

At this point in time I personally can't predict whether that will happen or not, but the consequences of it happening seem pretty drastic.


> This is entirely too charitable. Basically all this proves is that the agent could run in a loop for a week or so, did anyone doubt that?

yes, every AI skeptic publicly doubted that right up until they started doing it.


I find it hard to believe after running agents fully autonomously for a week you'd end up with something that actually compiles and at least somewhat functions.

And I'm an optimist, not one of the AI skeptics heavily present on HN.

From the post it sounds like the author would also doubt this when he talks about "glorified autocomplete and refactoring assistants".


You don't run coding agents for a week and THEN compile their code. The best available models would have no chance of that working - you're effectively asking them to one-shot a million lines of code with not a single mistake.

You have the agents compile the code every single step of the way, which is what this project did.
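As a rough, hypothetical sketch of what that loop looks like (the helper and the build command here are made up for illustration, not the harness this project actually used):

    import subprocess

    def agent_edit(feedback: str) -> None:
        """One LLM editing step: reads the feedback, writes files to the working tree."""
        ...

    feedback = "start the next task"
    for step in range(10_000):
        agent_edit(feedback)
        # compile at every step and feed compiler output back into the next prompt
        build = subprocess.run(["cargo", "build"], capture_output=True, text=True)
        feedback = "build ok, keep going" if build.returncode == 0 else build.stderr

The point is that compile errors get caught and fixed within a step or two, not discovered at the end of the week.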


With the agent running autonomously for a long time, I'd have feared it would break my build/verification tasks in an attempt to fix something.

My confidence in running an agent unsupervised for a long time is low, but to be fair that's not something I tried. I worked mostly with the agent in the foreground, at most I had two agents running at once in Antigravity.


It did not compile [1], so your belief was correct.

[1] https://news.ycombinator.com/item?id=46649046


It did compile - the coding agents were compiling it constantly.

It didn't have correctly configured GitHub Actions so the CI build was broken.


Then you should have no difficulty providing evidence for your claim. Since you have been engaging in language lawyering in this thread, it is only fair that your evidence be held to the same standard: it must be incontrovertible, with zero wiggle room.

Even though I have no burden of proof to debunk claims for which you have provided no evidence, I will point out that another commenter [1] indicates there were build errors, and the developer agrees there were build errors [2] that they resolved.

[1] https://news.ycombinator.com/item?id=46627675

[2] https://news.ycombinator.com/item?id=46650998


I mean I interviewed the engineer for 47 minutes and asked him about this and many other things directly. I think I've done enough homework on this one.

I take back the implication I inadvertently made here that it compiled cleanly the whole time - I know that's not the case, we discussed that in our interview: https://simonwillison.net/2026/Jan/23/fastrender/#intermitte...

I'm frustrated at how many people are carrying around a mental model that the project "didn't even compile" implying the code had never successfully compiled, which clearly isn't true.


Okay, so the evidence you are presenting is that the entity pushing intentionally deceptive marketing with a direct conflict of interest said they were not lying.

I am frustrated at people loudly and proudly "releasing" a system they claim works when it does not. They could have pointed at a specific version that worked, but chose not to, indicating they are either intentionally deceptive or clueless. Arguing they had no opportunity for nuance and thus had no choice but to make false statements for their own benefit is ethical bankruptcy. If they had no opportunity for nuance, then they could make a statement that errs against their benefit; that is ethical behavior.


See my comment here: https://news.ycombinator.com/context?id=46771405

I do not think Cursor's statements about this project were remotely misleading enough to justify this backlash.

Which of those things would you classify as "false statements"? The use of "from scratch"?


> Arguing they had no opportunity for nuance and thus had no choice but to make false statements for their own benefit is ethical bankruptcy.

absolutely

and clueless managers seeing these headlines will almost certainly lead to people losing their jobs


That is a good point. It is impressive. LLMs from two years ago were impressive, LLMs a year ago were impressive, and LLMs from a month ago even more so.

Still, getting "something" to compile after a week of work is very different from getting the thing you wanted.

What is being sold, and invested in, is the promise that LLMs can accomplish "large things" unaided.

But as of yet they cannot, unless something is happening in one of the SOTA labs that we don't know about.

They can, however, accomplish small things unaided, though there is an upper bound, at least functionally.

I just wish everyone was on the same page about their abilities and their limitations.

To me they understand context well (e.g. the task "build a browser" doesn't need some huge specification because specifications already exist).

They can write code competently (this is my experience anyway)

They can accomplish small tasks (my experience again, "small" is a really loose definition I know)

They cannot understand context that doesn't exist (they can't magically know what you mean, but they can bring to bear considerable knowledge of pre-existing work and conventions that helps them make good assumptions, and the agentic loop prompts them to ask for clarification when needed)

They cannot accomplish large tasks (again my experience)

It seems to me there is something akin to the context window into which a task can fit. They have this compaction feature, which I suspect is where the limitation lies. I.e. a person can't hold an entire browser codebase in their head, but they can create a general top-level mapping of the whole thing so they know where to reach, where improvement is needed, how things fit together, and what has and hasn't been implemented. I suspect this compaction doesn't work super well for agents because it is a best-effort, tacked-on feature.
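Something like this rough, hypothetical sketch is what I have in mind (the names are invented; I don't know how any particular agent actually implements compaction):

    # speculative: once the transcript outgrows the context budget,
    # collapse older messages into a summarized top-level "map" and
    # keep only the most recent turns verbatim
    def compact(messages: list[str], budget: int, summarize) -> list[str]:
        if sum(len(m) for m in messages) <= budget:
            return messages
        older, recent = messages[:-5], messages[-5:]
        codebase_map = summarize("\n".join(older))  # lossy, best-effort
        return [codebase_map] + recent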

I say all this speculatively, and I am genuinely interested in whether this next level of capability is possible. To me it could go either way.


Maybe so, but I don't think 3 million lines of code to ultimately call `servo.render()` is a great way to demonstrate how good AI coding is.


`servo.render()` does not appear to exist in the code base. Would you please point it out to us?


lmao okay, touché. I did not realize it had servo as a dependency.


Yeah, but starting with a codebase that is (at least approaching) production quality and then mangling it into something that's very far from production quality... isn't very impressive.


It didn't have Servo as a dependency.

Take a look in the Cargo.toml: https://github.com/wilsonzlin/fastrender/blob/19bf1036105d4e...


I haven't really looked at the fastrender project to say how much of a browser it implements itself, but it does depend on at least one servo crate: cssparser (https://github.com/servo/rust-cssparser).

There may also be a main servo crate out there that fastrender doesn't depend on, but at least in my mind fastrender depends on some Servo browser functionality.

EDIT: fastrender also includes the servo HTML parser: html5ever (https://github.com/servo/html5ever).


Yes, it depends on cssparser and html5ever from Servo, and also uses Taffy which is a dependency shared with Servo.

I do not think that makes it a "Servo wrapper", because calling it that implies it has no rendering code of its own.

It has plenty of rendering code of its own; that's why the rendered pages are slow and have visual glitches you wouldn't get with Servo!


> I think it was to show just how far the tools have come.

In… terms of sheer volume of production of useless crap?


Dan Romik has a nice intro on the moving sofa problem: https://www.math.ucdavis.edu/~romik/movingsofa/


Yikes. When you’re taking the window frame out to get furniture in, I’m arguing for smaller furniture.


Yeah maybe time to send a sample to the “mass spec everything” guy



Here's a less editorialized article from Reuters: https://www.reuters.com/business/finance/banks-tap-record-li...


Thanks. Here's an alternate Reuters link: http://archive.today/GWDYr


Thank you. From that link: "Borrowing from the Fed at year-end is also tied to market forces, where an upward drift in money market rates can make it cheaper to borrow from the Fed compared to private sources. Most expect Wednesday's borrowing surge will dissipate over coming days as more normal trading conditions reassert themselves. The activity at the standing repo operation is highly unlikely to signal any sort of market trouble."


The breathlessly reported OP credits their man with scouring the halls of the Fed for a report that they post on their website every single day, which literally everyone on Wall Street knows how to find.

https://www.newyorkfed.org/markets/desk-operations/repo


That makes sense, but how do you efficiently evaluate the Gaussian kernel based approach ("operator-based data structures (OBDS)")? Presumably you want to do it in a way that keeps a dynamically updating data structure instead of computing a low rank approximation to the kernel etc? In my understanding the upside of the kNN-based approaches is fast querying and the ability to dynamically insert additional vectors..?


Thank you for the thoughtful comment. Your questions are valid given the title, which I used to make the post more accessible to a general HN audience. To clarify: the core distinction here is not kernelization vs kNN, but field evaluation vs point selection (or selection vs superposition as retrieval semantics). The kernel is just a concrete example.

FAISS implements selection (argmax ⟨q,v⟩), so vectors are discrete atoms and deletion must be structural. The weighted formulation represents a field: vectors act as sources whose influence superposes into a potential. Retrieval evaluates that field (or follows its gradient), not a point identity. In this regime, deletion is algebraic (append -v for cancellation), evaluation is sparse/local, and no index rebuild is required.
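To make the contrast concrete, here is a toy numpy sketch (just an illustration of the two retrieval semantics, not FAISS's API and not our implementation):

    import numpy as np

    rng = np.random.default_rng(0)
    V = rng.normal(size=(100, 8))   # stored vectors acting as field sources
    s = np.ones(len(V))             # +1 weight per source

    def select(q):
        # selection: argmax <q, v> picks one discrete stored vector
        return int(np.argmax(V @ q))

    def field(q, sigma=1.0):
        # superposition: evaluate the Gaussian-kernel potential at q
        d2 = np.sum((V - q) ** 2, axis=1)
        return float(np.sum(s * np.exp(-d2 / (2 * sigma ** 2))))

    # algebraic deletion: append -v (a -1 weighted copy of source 3) so its
    # field contribution cancels, with no index restructuring
    V = np.vstack([V, V[3]])
    s = np.append(s, -1.0)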

The paper goes into this in more detail.


> When I ran this program, I expected the `CF_OEMTEXT` string to have the byte 44, but it didn't. It had the byte 90. We will start unraveling this mystery next time.

Whoa there exists something Raymond Chen didn’t know about Windows core APIs?


This seems to be incorporated into current LLM generations already -- when code execution is enabled, both GPT-5.x and Claude 4.x automatically seem to execute Python code to help with reasoning steps.



I remember seeing that GPT-5 had two Python tools defined in its leaked prompt; one of them would hide the output from the user-visible chain-of-thought UI.


Same with CoT prompting.

If you compare the outputs of a CoT input vs a control input, the outputs will have the reasoning step either way for the current generation of models.


Yeah, this is honestly one of the coolest developments of new models.

