More

studmuffin650 · 2025-11-06T17:19:54 1762449594

Also important to remember that Google is years ahead of most other AI shops in that they're running on custom silicon. This makes their inference (and maybe training) cheaper then almost any other company. People don't realize this when compared to OpenAI/Anthropic where most folks are utilizing NVIDIA GPUs, Google is completely different in that aspect with their custom TPU platform.

xnx · 2025-11-06T17:23:36 1762449816

> Also important to remember that Google is years ahead of most other AI shops in that they're running on custom silicon.

Not just the chips, Google's entire datacenter setup seems much more mature (e.g. liquid cooling, networking, etc.). I saw some video of new Amazon datacenter (https://www.youtube.com/watch?v=vnGC4YS36gU) and it looks like a bunch of server racks in a warehouse.

stego-tech · 2025-11-06T17:55:06 1762451706

Google’s datacenters are excellent, from what I’ve seen in my career. They genuinely had so many amazingly talented SMEs pushing boundaries for decades without executive intervention or deterrence, and that’s paid dividends in the subsequent tenure under Pichai and external shareholders (in that they have “infinite” runway and cash reserves to squander on moonshots before risking the company’s core businesses). That said, nothing lasts forever, and if their foray into LLMs don’t pay off, their shareholders are going to be pissed.

lokar · 2025-11-06T18:09:56 1762452596

And not just pushing the boundaries, working with the HW vendors to define them, asking for features and design elements that others don't really even see the point of.

cma · 2025-11-06T17:27:28 1762450048

Anthropic uses TPUs as well as nvidia. Compiler bugs in the tooling around the platform caused most of their quality issues and customer churn this year, but I think they've since announced a big expansion in use:

https://www.anthropic.com/engineering/a-postmortem-of-three-...

studmuffin650 · 2025-09-10T23:57:14 1757548634

Where I work, we primarily use Ceph for the a K8s Native Filesystem. Though we still use OpenEBS for block store and are actively watching OpenEBS mayastor

__turbobrew__ · 2025-09-11T00:13:19 1757549599

I looked into mayastor and the NVME-of stuff is interesting, but it is so so so far behind ceph when it comes to stability and features. One ceph has the next generation crimson OSD with seastore I believe it should close a lot of the performance gaps with ceph.

dilyevsky · 2025-09-11T02:16:29 1757556989

> One ceph has the next generation crimson OSD with seastore I believe it should close a lot of the performance gaps with ceph.

only been in development for what like 5 years at this point? =) i have no horse in this race but seems to me openebs will close the gap sooner.

__turbobrew__ · 2025-09-11T14:19:07 1757600347

soon™

studmuffin650 · 2025-07-15T20:08:44 1752610124

Interesting to me that this doesn’t directly integrate with Bedrock. Seems like an easy way to slowly onboard folks to that platform.

studmuffin650 · 2025-02-12T03:10:59 1739329859

This is a feature that’s required in Government environments. You need a check at runtime to ensure that FIPS is set or you run the risk of breaking compliance. Which leads to inevitable audits and endless meetings. I would much prefer a panic causing an issue for 30 minutes vs. endless days of meetings to set up new controls and validations that will make your life more miserable.

studmuffin650 · on Dec 22, 2024

As someone who works for a company that’s transitioning out of the cloud into their own data centers, the supply chain factor is difficult. You have to be really good at forecasting and planning with a large upfront cost. But the savings are substantial (up to 70%)

studmuffin650 · on Aug 7, 2024

Also eBPF is still in beta for windows and is nowhere near parity with Linux.

studmuffin650 · on Aug 6, 2024

Sounds more like a off by 1 bug that was hidden by regexs if I'm reading correctly

darylteo · on Aug 7, 2024

Very easily hidden. Something obtuse like

    (.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)|(.+)

or even this

    (.{4})(.{7})(.{3})(.{6})(.{9})(.{4})(.{7})(.{3})(.{6})(.{9})(.{4})(.{7})(.{3})(.{6})(.{9})(.{4})(.{7})(.{3})(.{6})(.{9})(.{1})

would simply fail to match.

And I wouldn't necessarily blame the developer in either scenario - they received a card that says "hey the channel file will now have an extra field in it's schema"... noone said "btw it's optional".

Calling it a "first year programming mistake" like I'm reading in some media is somewhat incendiary. I see unmarshalling errors happen all the time.

The forest that we must not miss is the kernel-level driver simply dies with no error recovery and bricks the system.

studmuffin650 · on Aug 7, 2024

I think that’s just the nature of kernel programming. Once you’re running in kernel space, there are essentially no safety guards, which is why kernel programming is so difficult. Any faults that occur in user space causing a seg fault + core dump do not exist in kernel space. Especially since kernel code generally has to be written in C, it can be quite difficult even for the best engineers to get everything right.

cptskippy · on Aug 7, 2024

Yeah, my read was that they changed an interface to include an optional parameter but never actually tested the underlying code by providing said optional parameter.

The bug in clients (sensors) wasn't due to regex, the regex was in their integration unit testing which also had a bug and was never supplying the 21st parameter to the client code.

flamedoge · on Aug 7, 2024

regex isn't probably a good thing in a kernel boot code considering it's NP hard

cptskippy · on Aug 8, 2024

That's true statement but what does it have to do with the RCA? From what I read it appears the regex was in the integration tests for the template.

studmuffin650 · on June 28, 2024

What you would be looking for is actually the forks of redis that came about after the license change. This has existed for awhile as a redis alternative, but not a step-in replacement.

The 2 main ones are:

Valkey - run by most of the large corporations (AWS, Google, Microsoft, Alibaba, etc.) that used to have developers assigned to the Redis project doing Open source work and they just run this fork now

Redict - Another fork that seems to have quite a bit of engineers behind it

studmuffin650 · on March 14, 2024

AWS is putting up good fight

studmuffin650 · on Nov 28, 2023

As a follow-on question, is there plans to integrate the AWS CRT under the hood or does the existence of Tokio and async/await negate the need for it?