This is actually a good thing for personal autonomy. Instead of accidentally having kids you can't afford, due to modern science, it's completely optional.
The article alludes to this, but the government previously promoted smaller families. Just a few generations ago the birthrate was considered too high.
Realistically, to have a growing population you probably want an average of around 3 children per couple.
This is economically impossible for most people though. No one has a stable job anymore. We're all temps and gig workers.
If you just do it anyway and find it's a struggle... society blames you and calls you careless.
The path of least resistance is to just skip having a family.
[Not a sales pitch - just answering the questions]
This AWS EC2 site is just an open-source project and site we maintain for the benefit of the community. So it's not directly our business but it promotes our brand and is just a helpful site that I think should exist. It's very popular and has been around for about 15 years now.
Our main business hosted on the main domain of https://www.vantage.sh/ is around cloud cost management across 25 different providers (AWS, Azure, Datadog, OpenAI, Anthropic, etc) and the use-cases there about carving up one bill and showing specific costs back to engineers for them to be held accountable to, take action on, etc. Cloud costs and their impact on companies' margins is a big enough problem for vendors like us to exist and we're one player in a larger market.
Fragile against upgrades, tons of unmaintained plugins, admin panel UX is a mess where you struggle to find what you are looking for, half-baked transition to a nicer UI (Blue Ocean) that has been ongoing for years, too many ways to set up jobs and integrate with repos, poor resource management (disk space, CPU, RAM), sketchy security patterns inadvertently encouraged.
This stuff is a nightmare to manage, and with large code bases/products, you need a dedicated "devops" just to babysit the thing and avoid it becoming a liability for your devs.
I'm actually looking forward to our migration to GHEC from on-prem, just because GitHub Actions, as shitty as they are, are far less of a headache than Jenkins.
This article is a bit negative. Claude gets close; it just can't get the order right, which is something OP can manually fix.
I prefer GitHub Copilot because it's cheaper and integrates with GitHub directly. I'll have times where it'll get it right, and times when I have to try 3 or 4 times.
what if the LLM gets something wrong that the operator (a junior dev perhaps) doesn't even know it's wrong? that's the main issue: if it fails here, it will fail with other things, in not such obvious ways.
>what if the LLM gets something wrong that the operator (a junior dev perhaps) doesn't even know it's wrong?
the same thing that always happens if a dev gets something wrong without even knowing it's wrong - either code review/QA catches it, or the user does, and a ticket is created
>if it fails here, it will fail with other things, in not such obvious ways.
is infallibility a realistic expectation of a software tool or its operator?
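The review/QA backstop described above can be sketched as a plain unit test. The function and values here are invented for illustration; the point is that the test encodes an edge case the author may not have known to worry about.

```python
def last_n(items, n):
    # The naive version, `items[-n:]`, silently returns the whole
    # list when n == 0 -- the kind of subtle bug an author (human,
    # junior dev, or LLM) can ship without knowing it's wrong.
    return items[-n:] if n > 0 else []

# The backstop: tests that encode the expected edge cases.
assert last_n([1, 2, 3], 2) == [2, 3]
assert last_n([1, 2, 3], 0) == []
```

Whether a human or an LLM wrote `last_n`, the failing edge case is caught the same way.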
I think that's the main problem with them. It is hard to figure out when they're wrong.
As the post shows, you can't trust them when they think they solved something, but you also can't trust them when they think they haven't[0]. The things are optimized for human preference, which ultimately results in their being optimized to hide mistakes. After all, we can't penalize mistakes in training when we don't know the mistakes are mistakes. The de facto bias is that we prefer mistakes we don't know are mistakes over mistakes we do[1].
Personally I think a well-designed tool makes errors obvious. As a tool user, that's what I want and what makes tool use effective. But LLMs flip this on its head, making errors difficult to detect. Which is incredibly problematic.
[0] I frequently see this in a thing it thinks is a problem but actually isn't, which makes steering more difficult.
[1] Yes, conceptually unknown unknowns are worse. But you can't measure unknown unknowns, they are indistinguishable from knowns. So you always optimize deception (along with other things) when you don't have clear objective truths (most situations).
All AIs are overconfident. It's impressive what they can do, but at the same time it's extremely unimpressive what they can't do while passing it off as the best thing since sliced bread. 'Perfect! Now I see the problem.' 'Thank you for correcting that, here is a perfect recreation of problem x that will work with your hardware.' (Never mind the 10 glaring mistakes.)
I've tried these tools a number of times and spent a good bit of effort on learning to maximize the return. By the time you know what prompt to write you've solved the problem yourself.
“Bad” seems extreme. The only way to pass the litmus test you’ve described is for a tool to be 100% perfect, so then the graph looks like 99.99% “bad tool” until it reaches 100% perfection.
It’s not that binary imo. It can still be extremely useful and save a ton of time if it does 90% of the work and you fix the last 10%. Hardly a bad tool.
It’s only a bad tool if you spent more time fixing the results than building it yourself, which sometimes used to be the case for LLMs but is happening less and less as they get more capable.
If you show me a tool that does a thing perfectly 99% of the time, I will stop checking it eventually. Now let me ask you: How do you feel about the people who manage the security for your bank using that tool? And eventually overlooking a security exploit?
I agree that there are domains for which 90% good is very, very useful. But 99% isn't always better. In some limited domains, it's actually worse.
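To put a number on why 99% isn't always enough: assuming independent uses, the chance of at least one miss compounds quickly. A quick sketch with illustrative numbers only:

```python
def p_at_least_one_failure(p_correct: float, n_uses: int) -> float:
    # Complement of getting every single use right,
    # assuming each use succeeds independently with p_correct.
    return 1 - p_correct ** n_uses

# After 100 uses of a 99%-reliable tool:
print(round(p_at_least_one_failure(0.99, 100), 3))  # ~0.634
```

So a tool that is "almost always right" is nearly certain to slip past a complacent reviewer over enough uses, which is exactly the bank-security worry above.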
That is a true and useful component of analyzing risk, but the point is that human behaviour isn't a simple risk calculation. We tend to over-guard against things that subjectively seem dangerous, and under-guard against things that subjectively feel safe.
This isn't about whether AI is statistically safer, it's actually about the user experience of AI: If we can provide the same guidance without lulling a human backup into complacency, we will have an excellent augmented capability.
I wouldn't go that far, but I do believe good tool design tries to make its failure modes obvious. I like to think of it similar to encryption: hard to do, easy to verify.
All tools have failure modes and truthfully you always have to check the tool's work (which is your work). But being a master craftsman is knowing all the nuances behind your tools, where they work, and more importantly where they don't work.
That said, I think that also highlights the issue with LLMs and most AI. Their failure modes are inconsistent and difficult to verify. Even with agents and unit tests you still have to verify and it isn't easy. Most software bugs are created from subtle things, often which compound. Which both those things are the greatest weaknesses of LLMs: nuance and compounding effects.
So I still think they aren't great tools, but I do think they can be useful. But that also doesn't mean it isn't common for people to use them well outside the bounds of where they are generally useful. It'll be fine a lot of times, but the problem is that it is like an alcohol fire[0]; you don't know what's on fire because it is invisible. Which, after all, isn't that the hardest part of programming? Figuring out where the fire is?
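The "hard to do, easy to verify" analogy above can be made concrete with a hash: producing an input matching a given digest is infeasible, but checking a claimed input is one cheap call. A minimal sketch:

```python
import hashlib

# Publishing the digest commits to the content without revealing
# how to forge it; verification is a single hash and comparison.
expected = hashlib.sha256(b"hello world").hexdigest()

def verify(claimed: bytes) -> bool:
    # Easy to verify: recompute and compare.
    return hashlib.sha256(claimed).hexdigest() == expected

assert verify(b"hello world")
assert not verify(b"hello wrold")
```

LLM output is the opposite shape: cheap to produce, expensive to verify.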
That's my thinking. If I need to check up on the work, then I'm equally capable of writing the code myself. It might go faster with an LLM assisting me, and that feels perfectly fine. My issue is when people use the AI tools to generate something far beyond their own capabilities. In those cases, who checks the result?
ya, this is true. Another commenter also pointed out that my intention was to one-shot it. I didn't really go too deeply into trying multiple iterations.
This is also fairly contrived, you know? It's not a realistic limitation to rebuild HTML from a screenshot because of course if I have the website loaded I can just download the HTML.
This is precisely the workflow when a traditional graphic designer mocks up a web/app design, which still happens all the time.
They sketch a design in something like Photoshop or Illustrator, because they're fluent in these tools and many have been using them for decades, and somebody else is tasked with figuring out how to slice and encode that design in the target interactive tech (HTML+CSS, SwiftUI, QT, etc).
Large companies, design agencies, and consultancies with tech-first design teams have a different workflow, because they intentionally staff graphic designers with a tighter specialization/preparedness, but that's a much smaller share of the web and software development space than you may think.
There's nothing contrived at all about this test and it's a really great demonstration of how tools like Claude don't take naturally to this important task yet.
"Willing" is an interesting word choice. There was quite a bit of resistance in the Python world despite the clear benefits. (2.x really could not be fixed, because the semantics were fundamentally broken in many places.)
I don't have much faith in Arm Linux. Tuxedo gave up.
Cheap Windows Arm laptops are flooding the market; if someone can pick ONE laptop to support, they could easily buy them on sale, refurbish them with Linux, and make a profit.
Looks like there are some challenges with doing this.
I was about to comment to say that unless Valve is prepared to invest significant effort into an x86 -> ARM translation layer, that's not going to happen, but a quick search for "linux x86 to arm translation" led me to an XDA article[1] proving me wrong. The recently announced Steam Frame runs on ARM and can run x86 games directly using something called FEX.
Now we just need to be as good as (or better than) Apple's Rosetta.
Apple Silicon actually has microarchitectural quirks implementing certain x86-isms in hardware for Rosetta 2 to use. I doubt any other ARM SoC would do such a thing, so I doubt third-party translation will ever get quite as efficient.
They are. You're mad that Valve isn't militantly enforcing Linux-native games, which is nonsense. The OG Steam Machine did that and was DOA.
Thousands of game studios are gone now, and supporting their software is important legacy work. You don't have to appreciate that, but I do. I do not give the faintest fuck about the opportunity cost you bemoan towards native UNIX games when I do this. That's your problem, not mine.
I have seen VR headsets trying to break into significant market share since 1994; they have never been anything other than a niche, with customers having too much money to throw around.
They probably gave up on their Snapdragon X efforts as Snapdragon X2 Elite was nipping at their heels and they'd have a redundant device by the time their efforts came to market.
> I don't have much faith in Arm Linux. Tuxedo gave up.
I was also slowly losing hope, although I do still run some NixOS ARM Raspberry Pis. But with the recent Valve backing, I'm back on the train again, eagerly awaiting the slow but steady improvements and figuring out where I can contribute back.
Not really. The drivers are not upstream, so it only works well on specially made Ubuntu spins that carry out-of-tree patches and random binary blobs. It is really still quite a mess at the moment.
Integration, testing, and support are all expensive. Right or wrong, that's a reason why if a laptop "just works" (like a Mac, Windows Thinkpad, or a Chromebook), it probably has proprietary binaries.
Also, if you aren't paying for the OS (via the hardware it's coupled with), you can't expect the OS to have the benefits of tight hardware integration.
Even Framework laptops use proprietary boot firmware, and they've been pretty clear that they only provide support for Ubuntu and Fedora, not the alphabet soup of other Linux desktop distros.
Honestly, I don't have much faith in Linux anymore, and it has everything to do with the explosion of the kernel's codebase, driven by the explosion of cheaper devices running Linux and the (admittedly difficult) management issues surrounding the kernel. I feel like from a security perspective, macOS is a better choice, and that pains me as a long-time Linux user.
Can we please move on to microkernels already? I'm fine with a tiny performance hit, I just don't want to get rooted because I plugged in the wrong USB stick.
You can use microkernels whenever you want. Just be aware that they typically have the same issues with zombie/cruft code and aren't necessarily more secure for every application.
I think the point is that even drivers could be non-trusted and live outside of the kernel and just provide the exact service required with minimal access.
That said, why do we still need drivers in 2025? Most regular printers should be dumb, USB mass storage should be dumb, webcams should be dumb, monitors are dumb, etc. Very few new devices really need custom drivers anymore (even with many customizations, we could provide class-specific descriptors that drivers could adhere to).
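Those "dumb" categories mostly map to standard USB class codes from the USB-IF class list, where one generic class driver covers every conforming device. A few of the well-known ones:

```python
# Standard USB class codes (per the USB-IF defined class list).
# A device advertising one of these can be handled by a generic
# class driver instead of a vendor-specific one.
USB_CLASS_CODES = {
    0x01: "Audio",
    0x03: "HID (keyboards, mice)",
    0x07: "Printer",
    0x08: "Mass Storage",
    0x0E: "Video (webcams)",
}

for code, name in sorted(USB_CLASS_CODES.items()):
    print(f"0x{code:02X}: {name}")
```

This is the existing mechanism the comment is gesturing at: most of the listed device types already have a standard class, so "needing a driver" is usually a vendor choice, not a technical necessity.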
If you don't want to go macOS route and want to leave Linux world, your destination would be FreeBSD or OpenBSD.
On the other hand, if you're not running Wine, you can't get autorun viruses from USB drives; the Windows viruses just live there and can't do anything.
Plan9 is like ocean yacht racing. If you have to ask about the "cost" you aren't the target market.
Plan9 is like writing. You either do it, or talk about doing it. I'm talking, not doing, btw. I tried, but I got stuck on trivial things, and the barrier to asking for help over 2+2= is high. (No offence intended. The 9 heads aren't interested in running a kindergarten.)
Out of all the OS's on which you'd have to hack on a bluetooth implementation, I feel like a mostly vanilla linux is the best one you could hope for. [edit] If it's not obvious from my previous phrasing, I'm referring to Sailfish OS.
Within 3 years I went from a college dropout with nothing going on to making 6 figures.
That was a long time ago and I've been comfortable ever since.
2 evictions before I turned 19 and I haven't been evicted since.
Life is good.