
This is great. Wondering what the author's experience has been using this framework for real projects.


Thanks! I started building this after running into the problem myself. On one project we had five developers, each using AI tools, and everyone ended up structuring things differently. After a few weeks the codebase felt like five mini-projects stitched together.

I wanted something that kept the architecture consistent without everyone having to stop and redraw diagrams all the time. That’s how SpecMind started. We’ve been using it in real projects, and it’s been much easier to keep track of how everything fits together.


Reporting a bug: 4123262 matches for Google.


How is it technically possible?



Can confirm. dang once pinged me directly by email saying that my story was re-upped. The story went to the front page again and the date was adjusted (IIRC), but the comments were kept:

---

Hi denysvitali,

The submission "PostmarketOS-Powered Kubernetes Cluster" that you posted to Hacker News (https://news.ycombinator.com/item?id=42352075) looks good, but hasn't had much attention so far. We put it in the second-chance pool, so it will get a random placement on the front page some time in the next day or so.

This is a way of giving good HN submissions multiple chances at the front page. If you're curious, you can read about it at https://news.ycombinator.com/item?id=26998308 and other links there. And if you don't want these emails, sorry! Let us know and we won't do it again.

Thanks for posting good things to HN!

Daniel (moderator)



I also created a mobile-friendly version of the transcript:

https://gist.github.com/sleaze/bf74291b4072abadb0b4109da3da2...

And here's the related submission:

Former Google CEO Eric Schmidt's Leaked Stanford Talk - https://news.ycombinator.com/item?id=41263143 (2 days ago, 466 comments)

Edit: Broken gist link fixed. Thanks @ryanwhitney!


Your first link is missing a character, so it 404s.

Working link: https://gist.github.com/sleaze/bf74291b4072abadb0b4109da3da2...



Locating and manipulating snippets of information in huge LLMs is surely impressive, but it is hard to believe that it can be scaled to more complex structures without using even bigger models.


Looking forward to a document leak about OpenAI using YouTube data to train their models. When asked if they use it, Murali (CTO) said she doesn't know, which makes you 99% sure they are using it.


I would say 100%, simply because there is no other reasonable source of video data.


I use multiple websites that have hundreds of thousands of free stock videos that are much easier to label than YouTube videos.


The number of videos is less relevant than the total duration of high-quality videos (quality can be approximated on YouTube with metrics such as view and subscriber counts). Also, while YouTube videos are not labelled directly, you can extract signal from the title, the captions, and perhaps even the comments. Lastly, many sources online use YouTube to host videos and embed them on their pages, which probably provides more text data that can be used as labels.


To be fair, I don't think Google deserves exclusive rights to content created by others just because they own a monopolistic video platform. However, I do think it should be the content owner's right to decide whether anyone, including Google, gets to use their content for AI.


Any other company can start a video platform. In fact, a few have, and failed.

Nobody has to use YouTube either.

If you want change in the video platform space, either be willing to pay a subscription or watch ads.

Consumers don't want to do either, and hence no one wants to enter the space.


*Murati


I am surprised to see a pro-copyright take on HN :)


"Aready won position" or "99% win rate" is statistics given by Stockfish (or professional chess player). It is weird to assume that the same statement is true for the trained LLM since we are assessing the LLM itself. If it is using during the game then it is searching, thus the title doesn't reflect the actual work.


It's quite clear from the article that the 99% is the model's predicted win rate for a position, not its evaluation by Stockfish (which doesn't return evaluations in those terms).

It's true that this is a relatively large deficiency in practice: how strong would a player be if he played the middlegame at grandmaster strength but couldn't reliably mate with king and rook?

The authors overcame the practical problem by just punting to Stockfish in these few cases. However, I think it's clearly solvable with LLM methods too. Their model performs poorly because of an artifact in the training process where mate-in-one is valued as highly as mate-in-fifteen. Train another instance of the model purely on checkmate patterns - it can probably be done with many fewer parameters - and punt to that instead.


Human players have this concept of progress. I couldn't give a good succinct description of exactly what that entails, but basically: if you are trading off pieces, that's progress; if your king is breaking through the defensive formation of the pawn endgame, that's progress; if you are pushing your passed pawn up the board, that's progress; if you are slowly constricting the other king, that's progress.

When we have a won position we want to progress and convert it to an actual win.

I think the operational definition I would use for progress is a prediction of how many more moves the game will last. A neural network can be used for that.
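
A rough sketch of what I mean (purely illustrative, nothing from the paper; win_prob, predicted_plies_left and board.after are made-up helpers): train a small model to predict how many plies are left, then use it to break ties between moves the value model already considers equally winning.

    # Hypothetical tie-breaker: among moves with (roughly) the same predicted
    # win probability, prefer the one whose resulting position is predicted
    # to end the game soonest.
    def pick_move(board, candidate_moves, win_prob, predicted_plies_left):
        def key(move):
            value = round(win_prob(board, move), 2)  # coarse value bucket
            # Fewer predicted plies remaining = more "progress" toward the win.
            progress = -predicted_plies_left(board.after(move))
            return (value, progress)
        return max(candidate_moves, key=key)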


They do use Stockfish for playing though …

“To prevent some of these situations, we check whether the predicted scores for all top five moves lie above a win percentage of 99% and double-check this condition with Stockfish, and if so, use Stockfish’s top move (out of these) to have consistency in strategy across time-steps.”


The context of that sentence:

> Indecisiveness in the face of overwhelming victory

> If Stockfish detects a mate-in-k (e.g., 3 or 5) it outputs k and not a centipawn score. We map all such outputs to the maximal value bin (i.e., a win percentage of 100%). Similarly, in a very strong position, several actions may end up in the maximum value bin. Thus, across time-steps this can lead to our agent playing somewhat randomly, rather than committing to one plan that finishes the game quickly (the agent has no knowledge of its past moves). This creates the paradoxical situation that our bot, despite being in a position of overwhelming win percentage, fails to take the (virtually) guaranteed win and might draw or even end up losing since small chances of a mistake accumulate with longer games (see Figure 4). To prevent some of these situations, we check whether the predicted scores for all top five moves lie above a win percentage of 99% and double-check this condition with Stockfish, and if so, use Stockfish’s top move (out of these) to have consistency in strategy across time-steps.
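
In code, the fallback they describe amounts to roughly this (my paraphrase, not the authors' implementation; model_win_prob, top_five and stockfish_best_of are assumed helpers):

    WIN_THRESHOLD = 0.99

    def choose_move(board, top_five, model_win_prob, stockfish_best_of):
        candidates = top_five(board)  # the model's five highest-value moves
        if all(model_win_prob(board, m) > WIN_THRESHOLD for m in candidates):
            # Everything already looks winning, so defer to Stockfish
            # (restricted to these five moves) for a consistent plan across
            # time-steps. The paper also double-checks the >99% condition
            # with Stockfish; that check is omitted here.
            return stockfish_best_of(board, candidates)
        return candidates[0]  # otherwise play the model's own top move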


They should try to implement some kind of resolute agent in that case. Might be hard to do if it needs to be "not technically search" though.


But only to complete a winning position.


That's a crucial part of chess that can't simply be swept under the rug. If I had won all the winning positions I've had over the years I'd be hundreds of points higher rated.

What if a human only used Stockfish in winning positions? Is it cheating? Obviously it is.


> That's a crucial part of chess that can't simply be swept under the rug.

Grandmasters very literally do it all the time.

> What if a human only used Stockfish in winning positions? Is it cheating? Obviously it is.

Yes, but this isn't that.

This is a computer that is playing chess. And FYI (usually) without search.


The process of converting a completely winning position (typically one with a large material advantage) is a phase change relative to normal play, which is the struggle to achieve such a position. In other words, you are doing something different at that point. For example, as a weak FIDE CM (Candidate Master) I could not compete with a top grandmaster in a game of chess, but I could finish off a trivial win.

Edit: Recently I brought some ancient (1978) chess software back to life: https://github.com/billforsternz/retro-sargon. These two phases of chess, basically two different games, were quite noticeable with that program, which is chess software stripped back to the bone. Sargon 1978 could play decently well, but it absolutely did not have the technique to convert winning positions (because this is a different challenge from regular chess). For example, it could not in general mate with rook (or even queen) and king against a bare king. The technique of squeezing the enemy king into a progressively smaller box was unknown to it.
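
For anyone curious, the "shrinking box" idea can be expressed as a tiny evaluation bonus. A toy sketch (my own illustration, nothing to do with Sargon's actual code), with squares given as (file, rank) pairs from 0-7:

    # Toy K+R vs K heuristic: the rook's file and rank cut the board into a
    # rectangle that traps the defending king (assuming the king is not on
    # the rook's file or rank). A smaller rectangle earns a bigger bonus,
    # nudging the engine to keep squeezing the king toward the edge.
    def box_area(enemy_king, rook):
        kf, kr = enemy_king
        rf, rr = rook
        width = rf if kf < rf else 7 - rf
        height = rr if kr < rr else 7 - rr
        return width * height

    def squeeze_bonus(enemy_king, rook):
        return 64 - box_area(enemy_king, rook)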


That 'only' usage in winning positions could be decisive for gaining a GM rating.


Positions with 99% win percentage are not decisive for GM vs non-GM rating.


From the paper:

If Stockfish detects a mate-in-k (e.g., 3 or 5) it outputs k and not a centipawn score. We map all such outputs to the maximal value bin (i.e., a win percentage of 100%). Similarly, in a very strong position, several actions may end up in the maximum value bin. Thus, across time-steps this can lead to our agent playing somewhat randomly, rather than committing to one plan that finishes the game quickly (the agent has no knowledge of its past moves). This creates the paradoxical situation that our bot, despite being in a position of overwhelming win percentage, fails to take the (virtually) guaranteed win and might draw or even end up losing since small chances of a mistake accumulate with longer games (see Figure 4). To prevent some of these situations, we check whether the predicted scores for all top five moves lie above a win percentage of 99% and double-check this condition with Stockfish, and if so, use Stockfish’s top move (out of these) to have consistency in strategy across time-steps.

So they freely admit that their thing will draw or even lose in these positions. It's not merely making the win a little cleaner.


> So they freely admit that their thing will draw or even lose in these positions.

Yeah, they didn't use Stockfish for the lols.

They created a search-less engine for chess, and then used a search-based engine to play a small minority of the game.


Yes. So how is this irrelevant for qualifying as GM-level play then? Being able to play these positions is a clear prerequisite for even being in the ballpark of GM strength. If you regularly choke in completely winning endgames, you'll never get there.

This is cheating, plain and simple. It would never fly in human play or competitive computer play. And it's most definitely disingenuous research. They made an engine, it plays at a certain level, and then they augmented it with preexisting software they didn't even write themselves to beef up their claims about it.


> If you regularly choke in completely winning endgames, you'll never get there.

Except we're talking about moves where no human player would choke because they are basically impossible to lose except by playing at random (which is what the bot does).

It makes no sense to try and compare to a human player in the same situation because no human player could at the same time end up in such a position against a strong opponent and be unable to exploit them once there…

It's basically a bug, and what they did is just working around this particular bug in order to have a releasable paper.


They are once your opponents know you’re very bad at converting them.


Proof?

To win any game, at some point (at the end of the game) there will be a position with >99% winning chances. The moves that follow are decisive.


That's not how chess works. The moves that follow aren't usually decisive unless you don't know how to play the game and make enormous mistakes.

Anyone that knows how to play can beat a GM with a big enough advantage at the end of the game (which is what's reflected in the win probability).


Search isn't used to play/win here. Just for training.


It looks like it does use search here in the sense that Stockfish's top move is generated using search.


From the abstract:

> We annotate each board in the dataset with action-values provided by the powerful Stockfish 16 engine, leading to roughly 15 billion data points.

So some of the training data comes from Stockfish.


The original comment was "for playing."

In training, traditional search is absolutely used to score positions.

In playing, search is not used. (*Except to finish out an already-won position.)


The post would have found a more fitting home on Twitter rather than the 'forgotten' realms of FB. The very individuals and entities mentioned (or hinted at) might have had the chance to see the story and share their perspectives. Otherwise it just sounds like a rant.


Looks interesting, thanks for sharing!

