jiggawatts · 2026-06-01T22:00:18 1780351218

I’ve had Apple delete my music from my devices.

As in, I had a physical CD I had purchased, ripped to MP3, and loaded onto my iPhone.

iTunes recognised it and linked it to the matching official entry in their music store. Then they lost the licence and deleted all customer copies, including mine.

marysol5 · 2026-06-02T10:13:29 1780395209

Lest we forget that they forced the U2 album onto devices, then after backlash removed it.

But apparently it still syncs itself on some devices to this day.

jiggawatts · 2026-05-31T04:15:54 1780200954

Hundreds of millions of businesses (and individuals) paid Microsoft $83 billion just last quarter, so clearly they're doing something right.

Any "big enough" organisation will eventually do something stupid, disgraceful, or even illegal. Once you have over a hundred thousand staff, there's just no way to guarantee that they all row in the same direction and nobody gives in to the temptation to cut corners or outright cheat.

If you think you can judge the entire rest of an organisation by a few bad actors within it, you'll be perpetually disappointed.

Bratmon · 2026-05-31T04:30:18 1780201818

Yet not as disappointed as the people who actually believed Microsoft would give them a perpetual license for something just because they paid for it.

jiggawatts · 2026-05-30T10:25:09 1780136709

The "space economy" is not yet a certainty, other than in the mind of science fiction fans. (Unsurprisingly, hard to reach irradiated rocks of undifferentiated boring minerals in a cold vacuum are of negligible value to humans here on Earth.)

Even if the Star Trek utopian future materialises, it is very likely to be a long time from now.

1. SpaceX has competitors. Most are making reusable rockets.

2. SpaceX has no moat.

3. The concept of money itself might change dramatically by the time SpaceX becomes a multi-planetary mega corporation. Investing now may not yield returns in any meaningful sense.

copx · 2026-05-30T13:43:46 1780148626

>The "space economy" is not yet a certainty

True, and that's exactly the reason why people want to buy this stock now.

If future returns were already (almost) certain, they would have been priced in and you couldn't make any money with this stock.

This is a classic high risk / high reward stock. IF the space economy takes off you might 10X your investment. If it doesn't, you might lose most of it.

Rich people (who own most of the stock market) can afford to make such high risk bets, because they can afford to lose the money and thus many will make that bet.

MrBuddyCasino · 2026-05-30T16:08:33 1780157313

This is the only sane reply so far, and being on HN (notably a VC's platform) this is rather sad.

jiggawatts · 2026-05-30T07:02:02 1780124522

Happened to me when I reported that I could get Azure to issue me a certificate for a domain I don’t own.

Rejected, then quietly fixed a couple of months later.

jiggawatts · 2026-05-29T22:04:55 1780092295

Communism, or more accurately, mechanised collective farming practices in the early 1900s in Russia resulted in revolutions and world wars. When tens of millions of inefficient farmers were replaced by tractors needing only a fraction of the labour force, the excess population was disposed of.

Sorry, bad phrasing!

They were put to work in new roles enabled by technological advancements: wielding mass manufactured rifles and operating artillery.

This has played out over and over throughout history whenever a large fraction of the population suddenly becomes surplus to requirements.

They never get to enjoy utopia. They are expended in warfare or low value forced labour until the labour pool once again matches the requirements.

oblio · 2026-05-30T03:13:39 1780110819

You don't even need to look at the Soviets. Life for the average person in Britain became worse between 1760 and about 1920. That meant about 3 generations of people were lost.

I'm super happy about this idyllic AI future my great grandchildren will enjoy...

jiggawatts · 2026-05-28T12:54:28 1779972868

Many of the rows in that spreadsheet reference "current events", which models aren't expected to do much better at than a human making an educated guess! They all have cutoff dates either last year or early this year and know nothing about what happened in "April 2026".

This is doubly problematic because you evaluated earlier models like Gemini Pro 3 instead of 3.1, GPT 5.4 instead of 5.5, etc...

Given that it's only a thousand short questions, you should be able to re-run your test in about an hour with the latest models, so... why haven't you?

Similarly, LLM output is non-deterministic, so you could get more interesting stats from your data set by repeating each question 'n' times for each model.
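
To make the repeat-each-question idea concrete, here is a rough sketch of how the repeats could be summarised. Everything in it is hypothetical: `ask` stands in for whatever function sends one question to one model and returns its verdict as a string, and `repeat_question` is just a name I picked. It is not how the spreadsheet was produced.

```python
from collections import Counter
from typing import Callable


def repeat_question(ask: Callable[[str], str], question: str, n: int = 5) -> dict:
    """Ask the same question n times and summarise how stable the answers are.

    `ask` is a hypothetical callable wrapping a single model call.
    """
    answers = [ask(question) for _ in range(n)]
    counts = Counter(answers)
    # The most frequent answer and how often it appeared.
    majority, majority_count = counts.most_common(1)[0]
    return {
        "majority": majority,                    # most frequent verdict
        "consistency": majority_count / n,       # share of runs agreeing with it
        "counts": dict(counts),                  # full breakdown of verdicts
    }
```

A model that flips between verdicts on the same claim would show up with a low `consistency`, which seems worth separating from genuine disagreement between models.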

kostaj · 2026-05-28T13:02:03 1779973323

Two of the models used have retrieval capabilities and have access to newer information through search. The other three are parametric.

simonw · 2026-05-28T13:05:59 1779973559

Comparing models with search tools to models without - when there's no option for "I am unable to answer this question without access to search" - doesn't make sense to me.

kostaj · 2026-05-28T14:19:10 1779977950

Agree about comparing models with and without search capabilities. Even the two models with search capabilities (Sonar Pro and Gemini) agree only on 58% of the claims.

throw310822 · 2026-05-28T13:06:57 1779973617

The title mentions "fact-checks", but "fact checking" is a process in which facts are checked against sources, not one where you are given a random fact and have to tell if it's true or false from your own memory. That's what is normally called a quiz game. So a more honest title for this research would be "Models answer differently to quiz questions".

jiggawatts · 2026-05-29T02:18:43 1780021123

> "Models answer differently to quiz questions".

from their future

It's as if you asked a human in 2026 to "fact check" something from 2027.

You're going to get an educated guess at best, a coin flip at worst.

furyofantares · 2026-05-28T13:11:34 1779973894

Yes, so in that case you set them up to disagree and then measured disagreement.

jiggawatts · 2026-05-27T20:56:15 1779915375

The Grenfell fire was caused by petty corruption. Someone involved in its construction used a cheaper flammable cladding material instead of the (slightly!!) more expensive fire resistant version.

It’s very on-brand for places like Russia and China but clearly western countries are not immune to this kind of thing either.

After the fire there were investigations into towers constructed here in Australia. Many used the cheaper flammable cladding material also. Just like with Grenfell, nothing much was done and nobody went to prison.

dylan604 · 2026-05-27T21:03:24 1779915804

What does that have to do with the actual idea that being in a tall building could make it difficult to escape? It doesn't matter if the cause of the disaster is cheap building materials or an external force acting on a properly built building.

themanualstates · 2026-05-28T04:29:53 1779942593

The material should've behaved like concrete; stopping fire / smoke instead of spreading it.

That way firefighters can take people down the elevators safely etc.

expedition32 · 2026-05-29T01:03:49 1780016629

Grenfell was iirc social housing.

A contractor is less likely to fuck around with people who can afford very good lawyers.

jiggawatts · 2026-05-26T09:57:01 1779789421

The funny thing is that you've just described an idealised development process as would be used by effective, skilled humans in a heterogeneous team where everyone has a speciality.

If only things were so! If only code was discussed, reviewed, iterated on! If only the "manager" actually read the code, provided actionable feedback, and disseminated PRs to multiple people with diverse skill sets.

(If you can't tell, I'm a jaded consultant desperately trying to make the horse drink the water.)

bottlepalm · 2026-05-26T16:40:16 1779813616

I've worked in large teams for many years and yes it's just like that, but without the time constraint. PRs can only go back and forth so many times. Depending on the reviewer they may phone it in, or focus on different things depending on the person. You yourself aren't able to implement every piece of feedback due to constraints and it ends up as tech debt.

So AI definitely changes the game. I feel like we almost need something higher level for reviewers to review changes faster. Today's code is starting to feel like assembler. Too much of it, too low level. We need even higher level constructs to be able to do more in less time. I'm just not sure what that is.

jiggawatts · 2026-05-25T09:24:58 1779701098

They didn't include a common reason for wanting at least a thin blanket on hot summer nights: it keeps the mosquitoes away!

jiggawatts · 2026-05-25T07:08:26 1779692906

I just had a thought: is there some API so obscenely baroque and painful to use that even AIs would flatly refuse to work with them?

It would be an interesting exercise to keep feeding a coding agent ever crazier interface designs until it cracks.

“The base64 of the rot13 encrypted EBCDIC string has to be included in a JSON in the XML SOAP request, but both the JSON and XML escaping is manual and incorrect...”

"...but first split the string into chunks no bigger than 64 bytes and spread the request amongst HTTP headers instead of the POST body. Reassemble by trying every possible ordering until one passes the decoding steps."

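For anyone who wants to try this on an agent, here is a rough Python sketch of the interface described above. The comment leaves a lot open, so these are all my assumptions: rot13 is applied to the text before EBCDIC encoding, `cp037` is the EBCDIC code page, a bare `Envelope`/`Body` pair stands in for real SOAP, the chunks are just a list rather than actual HTTP headers, and the escaping is done correctly by the library rather than manually and wrongly.

```python
import base64
import codecs
import itertools
import json
import xml.etree.ElementTree as ET

CHUNK_SIZE = 64  # "no bigger than 64 bytes"


def encode(text: str) -> list[bytes]:
    """rot13 -> EBCDIC -> base64 -> JSON -> XML -> 64-byte chunks."""
    rotated = codecs.encode(text, "rot_13")           # rot13 only touches letters
    ebcdic = rotated.encode("cp037")                  # assumed EBCDIC code page
    b64 = base64.b64encode(ebcdic).decode("ascii")
    envelope = ET.Element("Envelope")                 # stand-in for a SOAP envelope
    ET.SubElement(envelope, "Body").text = json.dumps({"data": b64})
    raw = ET.tostring(envelope)                       # bytes, no XML declaration
    return [raw[i:i + CHUNK_SIZE] for i in range(0, len(raw), CHUNK_SIZE)]


def decode(raw: bytes) -> str:
    """Undo encode() for one candidate byte string; raises if any step fails."""
    body = ET.fromstring(raw).findtext("Body")
    b64 = json.loads(body)["data"]
    ebcdic = base64.b64decode(b64, validate=True)
    return codecs.decode(ebcdic.decode("cp037"), "rot_13")


def reassemble(chunks: list[bytes]) -> str:
    """Try every ordering of the chunks until one passes the decoding steps."""
    for ordering in itertools.permutations(chunks):
        try:
            return decode(b"".join(ordering))
        except (ET.ParseError, ValueError, KeyError, TypeError):
            continue  # this ordering failed a decoding step, try the next
    raise ValueError("no ordering of the chunks decodes")
```

Note that "passes the decoding steps" does not guarantee a unique answer: once two or more chunks fall entirely inside the base64 text, swapping them still decodes cleanly to the wrong string, and the search is factorial in the number of chunks. Which is rather the point.
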
wwalexander · 2026-05-25T08:33:42 1779698022

AIs can barely handle PKCE OAuth flow. It’s not very hard to confuse them.
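
For reference, the PKCE-specific part is small. A minimal sketch of the S256 verifier/challenge step as described in RFC 7636, using only the Python standard library; the function names are mine, and the rest of the OAuth flow (redirects, token exchange) is not shown.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """S256 code challenge: base64url(SHA-256(verifier)) with padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return a fresh (code_verifier, code_challenge) pair."""
    # 32 random bytes become a 43-character URL-safe string,
    # the minimum verifier length the RFC allows.
    verifier = secrets.token_urlsafe(32)
    return verifier, challenge_for(verifier)
```

The client sends the challenge with the authorization request and the verifier with the token request; the confusion usually comes from everything around this step rather than the hashing itself.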

NoMoreNicksLeft · 2026-05-25T08:21:21 1779697281

>I just had a thought: is there some API so obscenely baroque and painful to use that even AIs would flatly refuse to work with them?

Copilot Studio. It's painful to try to set up any sort of logic within Copilot Studio. Worse if you're not on the most bleed-edging-new machine with overkill levels of RAM. So I had a thought... why am I doing this when I have Claude with absolutely no quotas?

Turns out, there's just no way to drive it from Claude. It first started with the pac command line tool, but that's agonizingly broken. Tried to use Chrome next, but even it can't navigate that UI from the browser (neither could I, you'd click and sometimes the response occurs 10 seconds later). Copilot Studio is the quintessential Microsoft technology. Shortly after, Claude began experiencing what I can only call schizophrenic symptoms. It imagined that every time I queried it that there were embedded hacking attempts in my reply and that soon spread to every conversation I had with it even in new chats.

Atiscant · 2026-05-25T10:19:50 1779704390

It is kind of ironic that the AI building tool is so hostile to AI. Copilot studio really is a hot mess, at least for me.

irishcoffee · 2026-05-25T12:26:24 1779711984

I’ve been trying for the last few weeks to get a really solid local model workflow going, and every single tool I use feels hostile af, whereas the stuff work pays for, it all “just works” together. It really irritates me.

borski · 2026-05-25T07:19:07 1779693547

Sounds like the IOCCC[0] of APIs

[0] https://en.wikipedia.org/wiki/International_Obfuscated_C_Cod...

lq9AJ8yrfs · 2026-05-25T13:55:11 1779717311

that's nearly a requirement for anti-bot things. turnstile etc.