I agree that a "long term fractional g spin test" is one of the most valuable things a LEO station can do. But there are others too.
For example, medical interventions against zero-g decay can be tested in any microgravity, spin or no spin. Development of in-space manufacturing and assembly can happen on any sufficiently capable space station.
All of that, however, requires a good amount of ambition. And I'm not sure if NASA under the current political system can deliver ambition.
"Body" is a pile of elaborate biochemistry. The muscles don't somehow evaporate when you stop exercising - it's the processes of the body itself that trim the "excess" muscle tissue.
And if it's the body doing that, you can, in theory, find a biochemical way to make it stop doing that.
"Physical damage and weakness can’t be stopped by a pill."
If you rephrase that into correct English, it would make sense. We aren't trying to stop physical damage or weakness; we are trying to prevent it from happening. Pills can prevent many of the things that cause it.
Right. Claude models seem to have had very limited prohibitions in this area baked in via RLHF. They seem to rely on the system prompt as the main defense, possibly reinforced by an API-side system prompt too. But it is very clear that they want to allow things like malware analysis (which includes reverse-engineering), so any server-side limitations will be designed to allow these things too.
The relevant client-side system prompt is:
IMPORTANT: Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes. Dual-use security tools (C2 frameworks, credential testing, exploit development) require clear authorization context: pentesting engagements, CTF competitions, security research, or defensive use cases.
----
There is also this system reminder that shows upon using the read tool:
<system-reminder>
Whenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse to improve or augment the code. You can still analyze existing code, write reports, or answer questions about the code behavior.
</system-reminder>
May I ask how the current generation of language models are jailbroken? I'm aware the previous generation had 'do anything now' prompts. Mostly curious from a psychological perspective.
> it's widely believed they are doing something to degrade service quality (quantizing?) in order to stretch resources
God, I wish this inane bullshit would just fucking die already.
Models are not "degrading". They're not being "secretly quantized". And no one is swapping out your 1.2T frontier behemoth for a cheap 120B toy and hoping you wouldn't notice!
It's just that humans are completely full of shit, and can't be trusted to measure LLM performance objectively!
Every time you use an LLM, you learn its capability profile better. You start using it more aggressively at what it's "good" at, until you find the limits and expose the flaws. You start paying attention to the more subtle issues you overlooked at first. Your honeymoon period wears off and you see that "the model got dumber". It didn't. You got better at pushing it to its limits, exposing the ways in which it was always dumb.
Now, will the likes of Anthropic just "API error: overloaded" you on any day of the week that ends in Y? Will they reduce your usage quotas and hope that you don't notice because they never gave you a number anyway? Oh, definitely. But that "they're making the models WORSE" bullshit lives in people's heads way more than in any reality.
It's possible though - there was a bug where a model pool instance wasn't updated properly and served a very old model for several months; whoever hit this instance would receive a response from a previous version of the model.
While it's true that people are naturally predisposed to invent the "secret quantizing" conspiracy regardless of whether the actual conspiracy exists or not, I think there's more to the story.
I've seen Sonnet consistently start hallucinating on the exact same inputs for a couple hours, and then just go back to normal like nothing ever happened. It may just be a combination of hardware malfunction + session pinning. But at the end of the day the effects are indistinguishable from "secret quantizing".
"Raven's progressive matrices" is "infer and generalize rules". Performance there also improves once "you kinda get used to the style", which is why training for IQ tests can improve human performance on IQ tests, including on unseen examples. This is well known and well documented.
Yep. Behavior composition. If you train an LLM to do A and to do B, separately, chances are, it'll be decent at A+B despite not being trained for the combination.
It's kind of the point? To test AI where it's weak instead of where it's strong.
"Sample efficient rule inference where AI gets to control the sampling" seems like a good capability to have. Would be useful for science, for example. I'm more concerned by its overreliance on humanlike spatial priors, really.
ARC has always had that problem, but for this round the score is just too convoluted to be meaningful. I want to know how well the models can solve the problems. I may want to know how 'efficient' they are, but really, as long as they're solving them in reasonable clock time and/or cost, I don't care. I certainly do not want that jumbled into one messy, convoluted score.
'Reasoning steps' here is just arbitrary and meaningless. Not only is there no utility to it, unlike the above two, but it's just incredibly silly to me to think we should be directly comparing something like that across entities operating on wildly different substrates.
If I can't look at the score and immediately get a good idea of where things stand, then throw it away. 5% here could mean anything from 'solving only a tiny fraction of problems' to "solving everything correctly but with more 'reasoning steps' than the best human scores." Literally wildly different implications. What use is a score like that?
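To make the ambiguity concrete, here is a toy sketch. This is not the actual ARC-AGI-3 formula; the blended_score function, its efficiency discount, and both profiles are assumptions invented purely to illustrate how an efficiency-weighted composite can map very different performance profiles onto the same 5%:

```python
# Toy illustration only: NOT the real ARC-AGI-3 scoring formula. The function,
# its weighting, and both profiles below are made-up assumptions, used to show
# why a single blended number is hard to interpret.

def blended_score(solve_rate: float, agent_steps: float, human_steps: float) -> float:
    """Hypothetical score: fraction of games solved, discounted by step
    efficiency relative to a human baseline (capped at 1.0)."""
    efficiency = min(1.0, human_steps / agent_steps)
    return solve_rate * efficiency

# Profile A: solves only 5% of games, but just as efficiently as a human.
print(blended_score(solve_rate=0.05, agent_steps=100, human_steps=100))  # 0.05

# Profile B: solves 25% of games, but takes 5x as many steps as the human baseline.
print(blended_score(solve_rate=0.25, agent_steps=500, human_steps=100))  # 0.05
```

Both profiles print the same 5%, yet they describe very different capabilities - which is exactly the problem with reading anything off a blended score.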
The measurement metric is in-game steps. Unlimited reasoning between steps is fine.
This makes sense to me. Most actions have some cost associated, and as another poster stated it's not interesting to let models brute-force a solution with millions of steps.
Same thing in this case. No utility, and just as arbitrary. None of the issues with the score change.
Models do not brute force solutions in that manner. If they did, we'd wait the lifetimes of several universes before we could expect a significant result.
Regardless, since there's a 5x step cutoff, 'brute forcing with millions of steps' was never on the table.
Cost has utility in the real world and this doesn't. That's the only reason I would tolerate thinking about cost, and even then, I would never bundle it into the same score as the intelligence, because that's just silly.
It's an interesting point but I too find it questionable. Humans operate differently than machines. We don't design CPU benchmarks around how humans would approach a given computation. It's not entirely obvious why we would do it here (but it might still be a good idea, I am curious).
You control the mirroring by moving the axis; it's what reflects your shapes. So my first move was always to identify the symmetries in the target shape and position the axis accordingly.
This is the correct strategy for this particular game (center the mirrors between the yellow squares, move the black squares). I didn't realize it until about round 6 or 7.
They stacked the deck. If v2 was still rule inference + spatial reasoning, a bit like juiced-up Raven's progressive matrices, then v3 adds a whole new multi-turn explore/exploit agentic dimension to it.
Given how hard even pure v2 was for modern LLMs, I'm not surprised to see v3 crush them. But that won't last.