It's not that Nvidia 'keeps winning'. Nvidia failed at mobile computing, failed at embedded computing, had no market share in the server or data center market, and was left in the niche market of gaming as a companion to the PC. It was not that impressive all along.
Then its parallel GPUs got "lucky": first Bitcoin mining, then AI. It probably did not expect or plan for this; to some extent, it got super lucky.
Credit must be given to its CUDA ecosystem and its ability to better itself when opportunity knocked at its door. So far it has left all competitors in the dust; its showtime finally arrived.
Nvidia got lucky because at every quarterly all-hands meeting Jensen repeated that he kept investing in CUDA and adding more silicon to the GPUs than strictly needed because, one day, an application would come along that would make it all worth it.
TFA says that Nvidia started aggressively seeding CUDA and GPUs for research in the early 2010s. It was much earlier than that: it started pretty much immediately after CUDA was introduced in late 2006. And with every new generation, hardware features were added to make GPU programming and the porting of applications less painful. The first Nvision conference, precursor of GTC, was in 2008. That's how you make your own luck.
I’ll never forget when, sometime around 2012?, he answered the question: “aren’t you afraid of Intel?”
His answer: “Not at all. Intel should be afraid of us. We will be bigger than them.” There was not a trace of doubt.
> "His answer: 'Not at all. Intel should be afraid of us. We will be bigger than them.' There was not a trace of doubt."
Given all the times that HN readers have derided grandiose executive pronouncements preceding flops, more people should recognize the above for what it is: not profundity but just puffery that happened to pan out. Not that skill and effort weren't involved in making it pan out but that any of a zillion things could have gone wrong to make that statement false and part of any manager's job is to project confidence and instill motivation despite knowing that.
I think he had a strategy - utilizing the massively parallel computation of GPUs for more general-purpose compute as Moore's law tailed off - and he noticed that Intel couldn't even see this coming in its rear-view mirror.
Everybody has known that Moore's law was on its way out, for speed increases at least, since the mid 2000s - the seminal article was by Herb Sutter [1]. So hardware needed to get more parallel. But multicore is a distinctly different paradigm from CUDA, which is closer to SIMD but on a completely different order of magnitude. So Intel was never going to skate to where the puck was going.
That’s the point, though. This is no different than any other statement made by a CEO with good engineers behind them.
This time it worked out; we can't discount survivorship bias here. I don't personally mind CEOs being encouraging, but at least understand that they don't really ever know.
IMO one big factor is that Nvidia is still fully engineering driven - it's engineers all the way to the top making the calls. Intel was like that as well, and then lost it (until Gelsinger). IMO you need domain experts in charge of companies, or they can't thrive in the long run, not unless there is an actual, almost unsurpassable moat.
It's called leadership. George Washington wasn't a brilliant general but he was able to convince people they were going to win against an empire. Whether he actually believed it himself we'll never know.
> Nvidia started aggressively seeding CUDA and GPUs for research in the early 2010s
I was at a niche graphics app startup circa 2000-2005 and even then NVidia invested enough to be helpful with info and new hardware, certainly better than other GPU companies. Post 2010 I was at a F500, industry leading tech company and an NVidia Biz Dev person came to meet with us every quarter usually bearing info and sometimes access to new hardware.
It's also worth noting that NVidia has consistently invested more than their peers in their graphics drivers. While the results aren't always perfect, NVidia usually has the best drivers in their class.
Oh interesting. I remember that Folding@Home way back then (ca 2009) was already testing protein folding on GPUs and it took advantage of CUDA. I never really thought much of it other than how cool it was that my mid-tier Nvidia graphics card could be used for something else other than games, but this explains how this ended up happening.
(Bit of a tangent, but that project was very influential in getting me interested in computer science because, wow, how cool is it that we can use GPUs to do insane parallel computing. So I guess, very very indirectly, Nvidia had a part in me being a software engineer today.)
By 2007/2008 there was a trend in HPC research called GPGPU. This involved hacky techniques to get the shaders to do the computations you wanted.
CUDA started appearing in 2008 with a framework (compiler, debugger) to do GPGPU in a proper way. It got a monopoly. They've been benefiting from first-mover advantage ever since.
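To make "the proper way" concrete for anyone who missed that era, here is a minimal sketch of my own (not from any particular SDK sample) of what early CUDA code looked like: an ordinary C kernel plus explicit host/device copies, no pixel-shader tricks required.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Each thread handles one element; no textures, framebuffers or shader hacks involved.
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *hx = new float[n], *hy = new float[n];
        for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

        float *dx, *dy;
        cudaMalloc(&dx, bytes); cudaMalloc(&dy, bytes);
        cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);   // explicit upload
        cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

        saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);    // one thread per element
        cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);   // explicit download

        printf("y[0] = %f\n", hy[0]);                        // expect 5.0
        cudaFree(dx); cudaFree(dy); delete[] hx; delete[] hy;
    }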
GPGPU was a thing well before that! In 2004 it was already covered in a few chapters of GPU Gems 1, increasing to 18 chapters in the 2005 GPU Gems 2, which included an FFT implementation.
Excellent point! Jensen is very focused and the company has worked incredibly hard on whatever they've put out there. The Shield is a testament to this focus. They find a budding niche and double down on building it from nothing. Most "self-driving" cars have Nvidia gear for a reason.
It's the best Android TV/gaming device around, even by today's standards. It's stable and does what it was designed to do. With zero marketing from Google or Nvidia, the mass market obviously didn't care for this type of product and category, but the device itself is great and works flawlessly. The bundled Nvidia Games app also pushed the Nvidia game-streaming concept built around GeForce Now. Overall, Nvidia put their best foot forward with this device, providing standout support for both HW and SW.
The chip line they made for it powers the most popular console on the market and basically locked its manufacturer into Nvidia chips until they're willing to drop compatibility, so financially it probably worked out for them, even if the Shield line itself wasn't extremely financially successful.
I don't know about its market success, but it's a great product. We use it as the frontend for all of the streaming platforms and Plex, as well as running some stuff directly from a NAS and IPTV.
I use them everywhere and have a big pile of Chromecasts, satellite boxes, remote controls and Apple TVs now ready for eBay!
Some years before CUDA there was a lot of hype when the first GPGPU papers were published in 2003, showing significantly increased performance using parallel computation on consumer graphics cards. At the time, it looked like competing on general-purpose computation was a solid strategy: multi-core CPUs from Intel were still years away, showing up in 2005; and starting from 2000 the rate of increase of clock speeds started slumping. We saw Intel start releasing more variants of processors, but the clock speeds weren't advancing exponentially anymore. The new battle for core supremacy was on the horizon.
It's a bit unfair to ascribe it to sheer luck. They were focused on the compute related possibilities all the way back with the GeForce 3 in 2001. Presentations from its launch were already talking about the potential of a "parallel compute monster" [1].
They saw the potential of GPU compute very early on, invested in it long term and as a result eventually ended up dominating the market. The others didn't seriously commit and so they fell behind. AMD still can't seem to commit, while Intel seems to be working hard on catching up but isn't quite there yet.
The actual quote from your link is: "Expect a massively programmable, massively parallel and pipelined graphics monster", not a "compute monster".
While I agree that Nvidia positioned themselves well, they were not looking as far forward as you suggest. As I recall it, they seemed surprised by BrookGPU, though they moved quickly to embrace the model.
> then its parallel GPU got "lucky", first the bitcoin mining, then the AI. it probably did not expect and plan for this, to some extent, it got super lucky.
I feel like saying they got "lucky" after trying and failing in multiple other endeavors requires a special definition of luck. If someone rolls a six-sided die six times and rolls a six once, did they get lucky?
Being good also has the facade of luck: every venture is a gamble, none of which are guaranteed, but putting yourself in the most optimal positions (including diversifying) will result in some successes and some failures. You don't end up with a $1T valuation based purely on luck.
For corporations, just staying alive that long without a huge hit is very lucky. Look how many other chip companies failed or were acquired cheaply during that period.
Given how AMD/ATI has fared with their shitty software ecosystem after close to 20 years (the first paper on using GPUs for CNNs was around '05/'06, using shaders), calling Nvidia 'lucky' is quite unfair.
Those of us who were unlucky enough to buy an AMD GPU based on 'flop count' were hurt quite badly. Nvidia's software compute infrastructure is simply unmatched.
Having a competitor that seemingly refuses to compete is pretty damn lucky. There's not a single thing nVidia could have legally done to make that happen.
>then its parallel GPU got "lucky", first the bitcoin mining...credit must be given to its CUDA ecosystem
This seems like a bit of a contradiction to me: either they got lucky as you claim, or CUDA was good enough to keep things going. From my experience, CUDA was the only way to go for GPU-accelerated processing; specifically, my experience was with Resolve color correction, and using CUDA was the only way to go. Memory is fuzzy, but the crypto timeline was around the same time. Because of the audience, this forum is naturally going to trend toward crypto rather than niche video post workflows, but CUDA was definitely kicking Radeon/AMD ass in other areas besides useless crypto.
Same. I wanted to buy a really chunky graphics card for BMD Fusion, but the price got pushed up by the cryptobro wankers mining their Dunning-Krugerrand, and now the price is getting pushed up by all the chatgpt wankers trying to run hardware-accelerated Eliza bots.
It's not pure luck though. GPGPU research started in the early 2000s IIRC. Nvidia had invested in it earlier than anyone and more than anyone. That's how they got CUDA. Nvidia was ready for the next generation of computing. It's just that no one, including Nvidia, knew when it was going to hit the market.
The problem isn't C. The problem is that the buggy OpenCL garbage doesn't work at all. I once used hashcat with OpenCL in a security course on an AMD GPU and it made my system completely unstable. I couldn't care less what the kernel is written in. I'm never going to use hashcat on AMD GPUs ever again.
The other problem is that companies invest in some alternative to OpenCL which fragments the non-CUDA ecosystem.
I thought that I should mention a book, because it has come up a lot in discussions of Nvidia's "lucky success". The book is titled Why Greatness Cannot Be Planned: The Myth of the Objective (https://link.springer.com/book/10.1007/978-3-319-15524-1#toc).
I don't really think Nvidia's luck is just luck. After all, they had the vision for CUDA, bet money on it, and succeeded. They could have stopped the effort at any point during the adventure, but they chose to continue. That's not luck, that's vision.
BTW, the same book got picked up by some Chinese talking heads as a perfect demonstration of the good side of capitalism, with which I totally agree. After all, if Nvidia were a Chinese company, they'd probably have put their efforts into developing some mass-surveillance, citizen-incriminating, police-wrongdoing-never-seeing camera with funding provided by the government, and thus an almost guaranteed "success". I'm really glad that Nvidia operates in a country where "making hardware so everyday people can use their computers to play games" is not something to be mocked by a government acting like an abusive parent. Now, look who's paying smugglers double the price for those A100 chips while crying? A well-deserved fate in my book.
I started reading that book (seems great!) and stumbled upon this: "And it's hardly clear that computer scientists will succeed in creating a convincingly-human artificial intelligence any time soon." Made me smile. It took but 8 years; the book was published in 2015. :)
(BTW. I'm super concerned about all the incredible amount of power nvidia has now and in the future. They get to decide the fate of the human race, I feel. And their incentive is just to make as much money as possible, while all negative externalities including the likes of extinction is left to the society to deal with. Sigh.)
There's a lot of hard work that goes into luck, but one aspect of the Nvidia story is of AMD's mismanagement. There's an alternate reality where OpenCL became the default instead of CUDA but that's not our reality.
> Nvidia failed mobile computing, failed embedded computing,
Really? What do you think high-end drones and self-driving cars are using? They invested in generic robotics software and hardware probably more than all the others put together. When you go beyond the Raspberry Pi, it's only NVidia. Their day is coming. Another similar ecosystem will be hard to create.
It allows for machine vision and heavy maths to be executed very fast. There is basically nothing similar in an embedded format in the market. The remainder would be SoCs for Android phones (Samsung Exynos?).
These SoCs have lots of hacky drivers and mostly support Android, which is not a very good fit for the real-time behavior and heavy customization required for drones and self-driving cars (the AOSP build system is a Google-class piece of crap).
They barely exist in the network switch market. The new NVLink switches may change that for tier one inter-chassis transport to replace Infiniband RDMA, though.
Reality is Mellanox has never been a substantial networking player. Yes, in HPC, but that's a tiny market even today.
It doesn't help that Mellanox acquired Cumulus just prior to Nvidia acquiring Mellanox, and Cumulus was basically DOA. Now they are split SONiC/Cumulus with a lot of internal infighting trying to keep Cumulus relevant despite industry trends.
Don't forget the drivers. As someone who just switched from Nvidia to AMD, it is downright painful how bad AMD's implementations of Vulkan and OpenGL are. I might be getting more bang for my buck but damn do I miss not having unfixable glitches.
AMD is good at being the underdog; hopefully it will focus more on its software, the ROCm thing, which really needs some love, a lot of love indeed. The software ecosystem for AMD's RDNA (GPU) and CDNA (MI2xx, MI300) is at best a mess.
AMD should own OpenCL, boost it heavily, and make it a central piece of its ROCm framework as the preferred backend, in my opinion.
ROCm does not compile PTX IR to AMD assembly; if it did, you'd be able to run nearly any compiled CUDA program on an AMD GPU without the source. ROCm is a source-level compiler for CUDA (technically HIP is the actual compiler, but whatever); it lets you use a substantial fraction of the CUDA APIs. This notably also means that any Nvidia open source library that uses inline PTX assembly won't work, but fortunately AMD does have alternatives to many of the Nvidia libraries.
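To illustrate what a source-level port means in practice, here's a rough sketch of my own (not AMD documentation); the kernel name and the inline PTX line are made-up examples:

    // CUDA source: the kernel is plain C++, the runtime calls are the CUDA ones.
    __global__ void scale(float *v, float a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) v[i] *= a;
    }
    //   cudaMalloc(&d, n * sizeof(float));
    //   scale<<<blocks, 256>>>(d, 2.0f, n);

    // After a HIP source port the kernel body is untouched; only the runtime
    // prefix changes (hipMalloc, hipMemcpy, ...), and the same <<<>>> launch
    // syntax is accepted by the HIP compiler:
    //   hipMalloc(&d, n * sizeof(float));
    //   scale<<<blocks, 256>>>(d, 2.0f, n);

    // What cannot be ported mechanically is inline PTX, e.g.:
    //   asm volatile("ld.global.ca.f32 %0, [%1];" : "=f"(x) : "l"(p));
    // PTX is Nvidia's virtual ISA; ROCm targets AMD's own GCN/RDNA ISA instead.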
Both oneAPI and ROCm do, or should, or could provide higher-level APIs to isolate you from OpenCL's C APIs, or just leverage SPIR-V. Other than CUDA, I have yet to see any other open alternative for heterogeneous computing; OpenCL is the only one on the table as far as I can tell for now. Yes, there are Vulkan compute shaders etc., but they're still pretty far behind, and they could be made OpenCL-compatible too.
While CUDA is great, I hope oneAPI and ROCm can make the OpenCL side more open-source friendly.
By reputation, Jensen doesn't really have much in the way of SW understanding, and in that sense he is like a lot of former chip guys. They got _insanely_ lucky that CUDA took off and there's an amazing irony in SW being their primary lock on their current market.
To win, they just needed to show up. Which is more than can be said about the competition.
AMD alternatives to CUDA are/were fumbling in the dark for many years, and more open alternatives like OpenCL are too limited (by design?).
To me the situation looks quite clear: a GPU has vastly more compute than a CPU. As time goes, we will need and use more and more compute. You just need a way (general purpose language or API) to use that GPU.
For some reason, other companies in this space did not see this.
They keep failing to see this. While CUDA is a polyglot programming model, a couple of years ago at an OpenCL conference (IWOCL) someone asked the panel when Fortran support was going to happen.
Everyone on the panel seemed surprised that it would be something anyone would want to do, and most of the answers were of the "talk to us later" kind.
Meanwhile PGI was already shipping Fortran for CUDA; this was before they were acquired by NVidia.
I agree with this. All they had to do was the bare minimum and actually keep it alive for a few years.
This pattern is pretty common in industry. Almost all the huge companies that are winners in technology are those that got on the market and kept the thing alive - that's not sufficient, but it is necessary.
Nvidia gpus were massively inferior for Bitcoin mining, in fact, because ATI/AMD had some integer operations that allowed SHA256 to be several times more efficient.
Ironically, that's what ultimately made Nvidia the winner for GPU mining: ATI GPUs had been massively deployed for Bitcoin mining prior to the dominance of mining ASICs. When people created new altcoins they specifically designed their work functions so that they, the inventors, could have an advantage over the general public, so they designed them for Nvidia GPUs rather than for what was already deployed. This let them buy up GPUs before shortages came into effect and delayed competition from the installed base.
Sure, you can trivially port whatever to whatever, but outside of the startup effect, mining is naturally perfectly competitive. Being 20% less efficient relative to costs means bankruptcy.
> Nvidia's moat is AI.
Nvidia had a GPU computing moat before the current AI fad, due to the maturity of the CUDA ecosystem. At least AI codes are generally pretty easy to port to other architectures, similar to mining in that sense -- but the AI designers don't have a profit motive to make sure they choose algorithms that are more efficient on their hardware than yours, and your AI hardware doesn't become useless if it does happen to be a few percent less efficient.
The first big mining wave was 2014-2015. This was done on 5850s, and GCN 1 and 2 GPUs. This was Bitcoin, so it was computation-focused, and I think in this era it was definitively AMD dominated due to VLIW allowing very dense execution resources plus early GCN having a large amount of raw integer processing power.
The next was 2017-2018. By this era bitcoin itself had moved off to FPGAs and ASICs, so this was around Ethereum which primarily worked based on proof of memory bandwidth. AMD GPUs were falling behind Maxwell/Pascal in terms of their memory compression (although they did use it) so they were equipped with more memory bandwidth to compensate. So for a given AMD card (Polaris, Vega, etc) you got more memory bandwidth per $, but in terms of the actual compute efficiency, NVIDIA had already pushed ahead even in Ethereum. NVIDIA was usually superior per-watt with cards like 1060, 1070, 1070 Ti, and 1080 Ti, and AMD just let you burn more watts.
However, when it came to altcoins, where it was not just raw bandwidth, NVIDIA's superior compute/GPGPU efficiency took over, and there were some coins where NVIDIA was 2x or more efficient per watt and also the winner in absolute performance.
(the thing to remember is that compute is the part that ASICs can do efficiently, and I always questioned whether those altcoins were really ASIC-resistant. But the ProgPOW-style algorithms doing better on NVIDIA cards never bothered/confused me, the reality is that most GPGPU programs "favored NVIDIA" during this era and Ethereum's proof-of-bandwidth model was an exception. Pascal was an efficiency beast and Polaris and Vega were only ok at best, outside their enormous, dangling memory buses.)
This state of affairs persisted throughout the 5700 series until AMD launched the 6000 series, where they shifted to a design with smaller memory buses and more cache, which put them in the inverse situation of 2014-2015 where they were getting more out of a weaker memory subsystem than NVIDIA. And I think they did this on purpose because they wanted to "opt out" of the mining boom/bust cycle, and NVIDIA made a similar approach with Ada that has been extremely unpopular (despite AMD leading the way on this a few years before on their cards too).
Isn't it wonderfully coincidental that out of any amount NVIDIA could have put into the LHR to slow down when it detected mining, that they put in the exact amount that dropped their cards to the same relative mining performance as AMD, and moved to the same cache-based approaches in the next generation as well? That has always been my take around LHR - it's not that they didn't like mining revenue, it's that they didn't want NVIDIA cards to be disproportionately pulled off shelves like happened to AMD in the 2014 and 2017 mining booms. People remember the "AMD was $1600, NVIDIA was $2400" situation already, they didn't want that to persist and turn into actual marketshare.
I won't comment on the actual facts, I'll just say that I couldn't bear reading more than the first few paragraphs of the article because both guys sounded like such rabid fanbois...
NVidia has always been the first choice for graphics work, and especially now with large complex VFX pipelines there really isn't an alternative - particularly since most of that work is done on Linux which has always had excellent support from NVidia.
A big reason that Nvidia keeps winning is that AMD doesn’t bother to compete.
Competition means giving customers exciting, fast, low cost GPUs.
Nvidia, as #1, no longer needs to compete and has stopped winning via low cost, high performance GPUs. This opened a giant opportunity for AMD GPUs to give those things to consumers and start winning against Nvidia.
But instead AMD has just followed Nvidia into making slow, uncompetitive GPUs at high prices.
It has been the years of investing, and especially the software (CUDA), that keep Nvidia as the "king". Hardware-wise, AMD is pretty much up there, or winning in a few cases.
> AMD has the patents for putting a SSD directly on the card, they could be killing it in the home AI market, but.... they just can't get it together.
Radeon SSG was just a PCIe switch chip on-card, with the SSD+GPU behind it, so functionally it is the same as having the SSD on the motherboard.
I'm not sure if they got patents, but either way the problem it solved was not the one people think it solved. It wasn't "using the SSD as memory", and in the way that it did work (block storage), any other SSD+GPU combination can function identically without needing the SSG tech. Putting the SSD on your mobo performs just as well in every way on any GPU (GPUDirect RDMA has been around for a while, since Fermi/Kepler at least).
It was basically just a convenience thing of getting an SSD mountpoint built into your GPU. They even showed up as an HBA in Windows and were treated as a striped disk.
More recently this idea has re-surfaced with some of the modern GPUs with x8 interfaces (6600XT, 4060 Ti, etc) getting a couple M.2 drives in the other x4x4 via bifurcation, and this functionally performs the same as SSG but without the switch chip (needs hardware support instead). And if there are patents around using a switch chip that may be how the patent is evaded. But combination cards utilizing more than one type of hardware in general is not that novel (network card/ssd combos are another popular one) and I'm not sure AMD patented it or it's defensibly novel against prior art.
Last time AMD was 9 USD per share was Q4 2016. That is a LONG time ago. What made you bet on AMD at that time? I'm still shocked how fast AMD turned around. And I cannot see the end of Intel's slow slide into dinosaur tech company, similar to IBM, HP & Compaq.
>Hardware wise AMD it's pretty much up there or winning in a few cases.
Hard disagree as someone who's had two 7900 XTXs and is now looking to buy an Nvidia card. If I need to pay 1100€ for a product, I'd like it to work and AMD cannot deliver on that.
The lack of a recall for broken hardware also makes me distrust them from a customer support point of view.
Bumpgate was just RoHS solder and nobody recalled for that, including AMD. Seen a lot of pictures of people baking their 7850 as well, it just didn't get a fancy name like bumpgate.
Bumpgate was an industry-wide problem and pretty much nobody recalled for it unless they were sued into doing it.
> But instead AMD has just followed Nvidia into making slow, uncompetitive GPUs at high prices.
I'm curious: do you use GPUs? It's hard to take this claim seriously. Yes NVIDIA GPUs are expensive, but I don't think we live in the same matrix when you claim that their GPUs are slow and uncompetitive... Unless you have an alternative in mind, in which case I'm all ears.
Yeah I'm not sure how a GPU can be both slow and uncompetitive and also the best GPU available at any price. That would imply that they're extremely competitive.
Some people just have an ax to grind and will contort their facts to agree with preexisting thoughts and feelings. I think it's called cognitive dissonance.
There are two senses of 'competitive' in play here: the sense in which a market can be 'competitive' or not by featuring alternatives that are comparable and whose successors leapfrog one another in various virtues over time, and the sense in which a product can be 'competitive' within a market by being more or less at least as good as whatever else is out there.
NVIDIA is definitely not 'competitive' in the sense of participating in a competitive market, or in the sense of being characterized by furious striving for rapid and continuous improvement. Many of their newer cards are barely 'competitive' with their own cards from a generation or two ago.
It's extremely clear to me what people mean when they say that NVIDIA or their GPUs are not meaningfully 'competitive'. I don't think it's that hard to see, actually.
> Many of their newer cards are barely 'competitive'
The 4060 is the only card this generation that's a flop, and I wouldn't consider one in four being "many".
> I don't think it's that hard to see, actually.
It is if you're not looking or trying to be disingenuous. Every generation has introduced new features and had noticeable impacts on the consumer products in its category (games). DLSS alone has been a game-changer, pun intended.
That’s… very debatable. Don’t get me wrong, they’re definitely competitive at all levels but AMD absolutely is too & very arguably is better in some cases.
When you say AMD is extremely popular still, yet Nvidia has what like 82-84% market share.. what are we actually talking about here in the context of "market talks"?
Not how that works, friend. Neither data nor debate. You can't hoist the conversation to another sphere and continue, but even if we do, there's a way back. Let's dive in, shall we? The data is accurate, reported by numerous providers. If you don't trust any of the data providers, you can always compare the revenue of the two companies, where AMD's number includes CPUs and its deals with console manufacturers as well, and it still doesn't come close; in fact, the relative percentage still holds. Consoles are B2B deals; people don't buy consoles because of AMD. The same goes for mobile or Apple (to a degree, because Apple sells Apple), as well as integrated Intel GPUs. Where consumers and businesses have a direct pick and choose, they did pick, and it's Nvidia for the better part, the 80%+ part.
Yes, and that's largely ArtX's doing, a legacy tracing its roots to SGI and the Nintendo 64 cooperation, which ATI bought (into). However, it's not at all relevant, since consumers aren't buying those for the AMD GPUs; the chips are specialized (to consoles), and people buy them for the console brands themselves, not because they have an AMD GPU inside. The same parallel could be drawn for Intel's integrated GPUs. We're talking discrete GPU cards, where the market directly picks and chooses, and it chose Nvidia to the point that it's not even a contest anymore, it seems. Even the HPC area is almost done for.
Right, at the moment, and that game changes all the time. However, we're talking share here, where 5 out of the top 10 are Nvidia (which also changes, yes, and not in favor of AMD but of those "other" systems which are mostly under sanctions and can't buy Nvidia at scale). The top five accelerators/co-processors in the TOP500 are Nvidia per https://en.wikipedia.org/wiki/TOP500 . If you ask Nvidia they'll tell you 64-68% of the TOP500 is Nvidia, but they count the interconnect as well, not just the GPUs. The truth is somewhere in between, probably in the 1/3-1/2 range. It's really not hard to see AMD is putting up a fight, but against a giant. Nvidia outclassed AMD's sales in B2B as well as to consumers by a big margin, which is reflected both in revenue and ultimately market cap. The reason why, since both are fabless, lies primarily in the R&D Nvidia executed on rather well - but that's just, like, my opinion, man. Everything else is facts.
That's not how it works. But I'll bite, since you seem to follow a religion here - you can approach "the market talks" from the other end, revenue. It tells the same story, eerily so in fact.
That's their newest mid-tier consumer card. On the data-center side they can't make H100s fast enough; all the AI startups are clamoring for them. You also won't see any reviews like that for the 4090; the only complaint I've seen there is that it's hard to take advantage of it if you're just gaming (but it's still great for CUDA development).
Nvidia knows that the profit margins at mid-tier GPUs are tiny - so they're pushing people to buy the 4070, 4080, and 4090.
Wafer prices have increased drastically in the last decade - to the point that midrange cards are no longer midrange priced.
It simply isn't really profitable to make real midrange cards anymore.
In addition, the discrete GPU market is a declining market and has been for 15 years or so. In any declining market, the low-end and mid-range products get squeezed and the high end gets pushed. The remaining buyers of discrete GPUs are willing to pay higher prices. Those who aren't will buy a laptop with an Nvidia GPU already in it or some sort of SoC (like Apple Silicon/AMD APUs/Intel APUs).
>The market for GPUs is declining because crypto is over and because the GPUs are so underpowered and overpriced that people don’t want to buy them.
No, this is not true. DIY PCs aren't as popular as before due to the advancements in laptops and interest swinging to mobile. People are buying far more laptops than 10-20 years ago. In fact, gaming laptops outsell gaming desktops 2:1 now.[0] The gap is expected to widen.
The market for discrete GPUs has been declining long before crypto. Crypto just slowed the decline.
Notice the word discrete. The entire GPU market isn't declining. Only discrete.
>And if GPUs cost so much to make, why do they get discounted so drastically when the manufacturers eventually decide to compete?
Part of it is because of price inflation due to COVID and the crypto bubble. So prices are just going back to a more normal level.
Based on desktop CPU sales or based on pre-built sales? I don't know what qualifies as a gaming laptop these days but the best laptops for software development often have discrete graphics cards in them e.g. 1660 Ti
I'm guessing it's having a non-APU AMD/Nvidia GPU inside the laptop.
But it makes sense to me. Laptops have gotten much better, and most gamers aren't buying Nvidia 4090. The most common GPU is just a GeForce GTX 1650 which many laptops have and have thermals that easily fit inside a laptop.
It's not surprising at all that gaming laptops outsell gaming desktops. This isn't the 2000s anymore where if you want to play PC games, you build a DIY desktop tower and buy a discrete GPU.
It's a myth that most PC gamers use top of the range GPUs. Most of them use low-end or mid-range GPUs.
The 4060 is a pretty unusual case. In general NVidia cards fit on a very predictable price/performance curve where you don't get any performance for free, but consequently you pay more and get more performance.
(admittedly I have zero interest in them as gaming hardware, but on the GPU compute side this is definitely the case)
It's kinda amazing how year after year reviewers release the "new products are crap, buy the thing we told you was crap last year" and people don't catch on.
The low-end is crawling along due to fixed cost overheads from PHY area that doesn't shrink. 4N is ~3x the price per area as Samsung, and PHY area becomes relatively much larger. The incentive is to cut every cost in PHY area - fewer memory channels to reclaim PHY area and using cache instead, cutting PCIe width to reclaim that PHY area, even cannibalizing the media engine/encoders to claw back that last little bit of space. AMD has actively been engaged in this battle with 6600XT (x8 PCIe, 128b memory bus) and 6500XT (x4 PCIe, 64b memory bus, no encoder) and RDNA2 generally shrinking memory bus and replacing it with cache in this same way (and 6600/6600XT and 5700/5700XT also regressed performance vs their predecessors in some situations just like 4060/4060 Ti). On top of that you have big fixed increases in manufacturing, shipping, etc, and the rest of the BOM is ballooning over time too (forget VRAM spot prices, ask automakers if their BOM is higher or lower than 2019... and it's not a small difference, it's probably 2x!)
In this world of organically-low performance increase within the low-end product segment, the impact of clearance sales exceeds the impact of new product generations. And this means we end up in this situation where reviewers do the "the new products suck, buy the ones we told you sucked last year" rubber-chicken routine year after year after year. But if everything is constantly bad, kinda nothing is, really. That's just how this product segment is now.
And when you look at things like the 3060 Ti being $275 on clearance, or the 6700XT being $300-320... the market consensus is that the old products are Good Enough. And the new products in fact may be worse in some ways, unless you are willing to step up in price to maintain the same product segment rather than the same price segment. Because TSMC is cranking the prices 25-50% every generation and Samsung was abnormally cheap to begin with, so there is a definite price step happening.
Hard to see how people don't understand that the $200-300 product segment is dying in the same way the $100-150 product segment already died. 6 months after launch you could buy a 7850 2GB for $150, or a 7750 for $100. What does $100 buy you in the new/retail GPU market these days? Does anybody actually seriously think the inability to deliver new $100-150 GPUs with enthusiast-tier performance is due to "greed" or "agreeing not to compete" as opposed to market realities? But people will defend it to the death that the privilege of selling them a $200 GPU is some massive profit opportunity that AMD and NVIDIA are just choosing to ignore and sandbag.
We are in the end of silicon in some ways. Post-Moore's Law the costs have been spiraling, and now we are to the point where consumers are deciding that no, it's not worth the cost increases. And companies aren't going to cut their throats and run zero margin or sell at a loss either. They will make the products they can make, and if the demand is low enough that it's not economical to continue developing products for that segment, they'll stop and continue in the segments that are still profitable. And that eventually flows through to TSMC and ASML, and node research will slow (hyper-NA is already effectively canceled due to excessive costs) and silicon research becomes incremental and iterative and slower rather than continuing to make even N7->N5->N3 sized progress. Yes, it can get slower.
And again, this doesn't mean "NVIDIA is leaving gaming", any more than they left gaming after the $100-150 market died. You can still make something in the 4070-class product very profitably at $600 and slimmer margins at $500, and the market access the gaming products provide is foundational for capturing the innovation happening in the other segments (it's why NVIDIA keeps being showered in money with stuff like AI while AMD is left out of the rain, you don't get to be in the segments like AI or OptiX without doing the work in gaming). But they can’t do the 4070 at $329 like it’s 2014 on a mature 28nm anymore. And the $200 market is as toast as the $100 market before it, if that's your budget then buy a console and benefit from the cost-reductions of integration and reduced modularity and fixed/stable hardware specs.
But gamers are "emotionally unprepared" for living in a world where there isn't automatic progress at each price point. People think of it as the privilege of AMD and Nvidia getting to sell you an upgrade... the vendors see it as the "privilege" of selling you the lowest-margin product in their lowest-margin product family (all of that 6nm wafer is 10x as profitable for AMD doing literally anything else already). If it sucks, oh well, and if you don't buy it then they'll stop making it; it's not malicious or conspiratorial, it's just not where the tech is going in that price segment. People want cost-inefficient Lego-style modular product design even in the lowest-end product segments, then get mad when it's expensive and start spewing conspiracies about collusion etc. That's easier for people emotionally than just admitting they're being irrational.
Also there’s really no segment where any product here regressed. The predecessor to the 4060 isn’t the 3060 ti, a card that was a $400 MSRP / $450 street price even after mining. It’s the 3060, and the 4060 is way faster in all scenarios. The predecessor to the 7600 is the 6600, the 6700XT is a $480 MSRP card. People love to do this “it’s actually 2% slower (at a res nobody plays on those cards) than a card that’s only 20% faster” bit - and again this includes reviewers too. Nothing is actively regressing, it’s just not advancing as fast as people want, but they’re so emotionally immature they have to turn that into “2% slower than a card that’s 20% faster / 50% higher msrp” to express their frustration. It’s not slower than anything other than your expectations, but people emotionally love the framing of it somehow being a regression.
After 5 years of these reviews and this discourse (everything post-Pascal, and even Pascal itself tbh) I'm just kinda tired of it. If you apply Moore's-law-era standards then everything is going to suck going forward, period, with a rare "this one is ok" for the truly great ones. If everything is awful, nothing is.
I use GPUs for FP64 (64-bit floating point) compute, and the best AMD GPU right now is still Radeon VII, which was released in 2019. Since then AMD split the GPU business into two lines, CDNA and RDNA, the point being to be able to charge much higher prices for the GPUs that provide good compute (i.e. CDNA).
The RDNA line is targeted at gamers. The reasoning apparently being "let's be careful that these RDNA GPUs are bad enough so they can't really be used for anything else but games; if the user is into compute, they should pay the big money for CDNA GPUs".
This is also reflected in the ROCm situation. ROCm has good support for CDNA GPUs, which is the hardware deployed in the huge national-lab GPU projects, where money is not an issue. The problem with that approach? Normal people do not have access to CDNA GPUs that cost multiples of 10,000 USD apiece.
Indeed, you do need kind of a loss leader weak compute GPU to get people interested and experimenting with your platform.
Cloud does not fulfill that purpose being too expensive, too limited and not secret enough.
Normal gaming nVidia GPUs fill that role. You need big guns, you buy a lot of their compute cards next.
AMD had decent compute capabilities in their CDNA line, but they didn't follow through with drivers. The cards were pretty bad to try to use for compute with early ROCm, and just as bad for gaming.
> I'm curious: do you use GPUs? It's hard to take this claim seriously. Yes NVIDIA GPUs are expensive, but I don't think we live in the same matrix when you claim that their GPUs are slow and uncompetitive... Unless you have an alternative in mind, in which case I'm all ears.
I think he meant Nvidia GPUs are more expensive than Nvidia GPUs from previous generations.
> Competition means giving customers exciting, fast, low cost GPUs.
You actually need to do more. The ML community chose nVidia because, a decade ago, nVidia donated GPUs to universities for free. People like Hinton hacked CUDA as a poor man's HPC.
For example, the original AlexNet used 224x224-resolution pictures to fit in the GTX 580's 3GB of memory.
>A big reason that Nvidia keeps winning is that AMD doesn’t bother to compete.
AMD definitely wants to compete. Why wouldn't they want a piece of a trillion dollar market?
The problem is that AMD was almost bankrupt, had poor leadership until Lisa Su, and they were just trying to survive making console chips for Sony and Microsoft until Zen2.
> But instead AMD has just followed Nvidia into making slow, uncompetitive GPUs at high prices.
It's true that modern GPUs are ridiculously expensive compared to where they were 10 years ago, but there still is a good sub-$200 GPU available: the RX 6600, from AMD. It's not the fastest, but from what I understand, it gives the most power for your buck of any modern GPU.
I was so disappointed with the 7970 and 680 era of cards because the one time AMD launched first they priced a mid size die as if it was a top end card and Nvidia followed suit renaming the x70 card to the 680 as competition. It marked the beginning of the period where cards have got more expensive and renamed over and over. The 580 in pedigree looked a lot more like a Titan card/x90 does today than any x80. An x80 today is a x60 class card of that era in terms of die size, memory bus width and power consumption targets and a host of other core big measures of GPU design.
They have both been at it, gradually increasing prices to the point where a mid-range card of a decade ago now costs 5x as much and twice the price of an entire console. It's got pretty insane this generation, with marginal gains, massive price hikes, and another round of renaming cards to step the dies down again. Finally no one is buying them, but these companies are big enough that they will just blame the overall economy anyway.
> But instead AMD has just followed Nvidia into making slow, uncompetitive GPUs at high prices.
This seems like a needlessly conspiratorial and complicated cope to avoid recognizing that the low end is simply suffering from fixed overhead and rising costs.
The alternative/null hypothesis is that both companies are responding to the market options that TSMC and BOM costs provide them, and the technology is simply moving slower in some segments. In this null hypothesis, it's not an active conspiracy to screw anyone, and nobody has deliberately "followed anyone into noncompetitiveness"; there simply isn't a big market opportunity to sell $150 enthusiast GPUs anymore (7850 HD!) and make a decent profit based on the actual costs of building the product and getting it through design/validation and manufacturing/shipping cycles.
Like, it's kinda facially absurd that gamers think that there's some golden opportunity to make low-margin $150 GPUs that everyone is just choosing to ignore because "they'd rather do AI/because they hate gamers". That's a shitty low-margin product and everyone is choosing to ignore it because it's not profitable, gamers are just kinda operating under this fallacy that $150 is a lot of money for an enthusiast-tier dGPU. And of course gamers will not touch a card with less than 8GB, and will scream a fit with 4060/6600 style 128b buses not being enough width. They don't like any of the compromises that are necessary to get down to this price point, they want a $150 card that's no-compromises quasi-midrange.
Shockingly, most companies are not interested in chasing the customer who wants the corvette for $25k. And yes, you can sustainably build a car for $25k, or maybe even a bit less... but it's not gonna be a corvette either. It's gonna be a midrange that you've "cut down" with feature lockout, or it's gonna be the shitbox econo-model that was bad to start with.
> Nvidia, as #1, no longer needs to compete and has stopped winning via low cost, high performance GPUs. This opened a giant opportunity for AMD GPUs to give those things to consumers and start winning against Nvidia.
> But instead AMD has just followed Nvidia into making slow, uncompetitive GPUs at high prices.
I am not really sure they could do that. If Nvidia cards are cheaper to produce, then for AMD just ordering more cards could end in massive losses short term, and much less profit long term, as customers demand cheaper cards. Nvidia, as the top dog, could always lower their prices, hurting AMD really badly.
But yes, in retrospect, knowing that the demand for GPUs would remain ridiculously high for years, the best move for AMD was ordering and producing many, many more cards. Just as the best move for us customers was jumping into crypto. (Not necessarily the best strategy.)
I'm not sure that's true when even Intel's low-cost GPU offerings aren't doing any better. AMD's cheaper RX 7600 is probably a better value than Intel's more expensive A770. Nvidia's newest RTX 4060 is in that price range too and also performs similarly, but will probably have more mindshare.
Intel from what I can tell has better hardware than AMD, but they have even worse software than AMD.
It's really just Nvidia firmly winning the entire market the good old fashioned way. Those who can compete still can't hold a candle, and the market at large is content to keep buying Nvidia because their stuff is simply That Fucking Good(tm).
It's Nvidia's market to lose, and they clearly aren't losing any time soon.
I'd say AMD still has worse software than Nvidia too, even setting aside GPGPU/CUDA. About a year ago, I swapped my 1080 Ti for a friend's 5700 XT because he was having crashing issues with games. I use it fine now with Linux drivers, so I doubt it was a hardware issue.
AMD drivers exist in this weird state of tautological pseudo-goodness where they're good as long as you ignore all the times they aren't. And if you point it out, people dig in with the "well I've never had a problem in 15 years now" and "go look at NVIDIA's tech support forum, they have bugs too".
NVIDIA has not had a sustained generational instability problem like 5700XT or Vega drivers in the modern era, and they haven't even had a more short-term shitstorm like RDNA3 launch drivers that lingered for nearly as long. And there have been multiple instances of top-5 e-sports titles being flatly broken on AMD drivers (often resolveable by going back to much older drivers) for prolonged periods of time (quarters/years) that simply don't happen on NVIDIA cards.
But of course there is a low-level stew of problems on both brands constantly. Power-saving with multimonitor is a great example of one that's been perma-broken for 10 years on both brands now. But there is also a high-level stew of Radeon-specific problems that pretty constantly churns and it gets discarded out of hand because "I've never had a problem" like that means 5700XT didn't exist. And it's people who you absolutely know are aware of the 5700XT issues.
Even working from a baseline assumption of intellectual honesty it's a super frustrating discourse overall, and I think in many cases the assumption may be unwarranted.
I'm not a pro in this field, but I would think that with AMD acquiring Xilinx they see a better way into the AI competition through FPGAs. I would suspect others are in the same boat. GPUs for AI are a nice hack, but still a hack. Make hardware that is specifically for that job.
Microsoft has historically been the biggest proponent for using FPGAs for AI.
It’s not that you can’t use them, and there may be narrow cases where they have a benefit. There are always trade-offs to be made.
But that doesn’t make them the best hardware for AI (there’s silicon overhead that will never be used) and it doesn’t change the fact that they’re much harder and slower than bare metal CUDA to optimize for, in a fast moving field.
Look at it this way: if you have a fixed DL architecture and a fixed set of network weights, there’s no question that you could come up with an ASIC that only does that, that will be cheaper than an FPGA or GPU, faster, and use lower power too. It will be the perfect inference solution for that specific problem.
It would also be obsolete in a few months, before the chip comes back from the fab.
Huh :o) What are your thoughts about Intel on this? It was (and still is?) the market leader in CPUs and decided to push FPGAs for machine learning!
In hindsight, it seems like a serious misstep.
It seems to me that Nvidia has largely abandoned the consumer market in order to chase the seemingly-bottomless demand for AI hype. While AMD is more than happy to stay in their lane, continue making record profits on their same market, and not waste billions trying to catch up on 10 years of CUDA development. By the time they have anything remotely comparable, the AI grift will be over just like crypto. Tech grifts only have a 1-3 year cycle these days.
Really surprised to see such low effort commentary akin to Reddit/twitter on here.
In what way has Nvidia abandoned the consumer market? They have something like 80%+ market share on discrete gaming GPUs and still rake in billions a year from GeForce, which at their revenue is a sizable fraction.
And what makes you say that current AI hype is a grift? Do you truly think that the work with transformers is just a fad? I swear people on this site were saying the same thing about AI in general 3+ years ago before transformers blew up the scene.
It's kinda strange seeing some people link advancements in transformer models with stuff like cryptos and NFTs. You've gotta ask, where are these thoughts coming from? Plus, the ongoing use of the 'stochastic parrot' argument is starting to feel a bit repetitive in these discussions.
Their consumer stuff barely competes with itself. They can only get away with that because nobody is interested in that market. I’d also say current AI hype is hard to tell from a grift now - my team is doing some useless crap because CEO basically has tech FOMO. It’s so uncalled for and I’m only guessing not uncommon.
Companies doing useless crap with AI has been a thing long before the current AI popularity spike. I've heard a specialist complain about that trend 5 years ago, where clients couldn't say what they wanted to do with an AI but they "knew" they needed one. That doesn't make it a grift, though.
Nvidia literally doubled the prices of their entire product line overnight and then spent their entire GDC keynote talking about how AI was going to eliminate the need for game developers.
Like crypto, fraud is the only use case that will pan out for transformers in the end.
We’ve been here before. ChatGPT isn’t functionally all that different from ELIZA from the 1960s. It’s tricking people just the same, into thinking it’s more than it really is.
I’m also eagerly awaiting the discovery process that will show that all of this generative AI was built on mass copyright infringement. OpenAI certainly ingested the entire z-library. They’d have to be stupid not to, right? If they didn’t use it, less ethical competitors would have beaten them to market.
Oh, I’m sure they used some sketchy subcontractor to do it. We’ve all seen a variation of this in our careers, where a dataset was acquired under questionable circumstances, and you don’t ask and don’t tell if you want to keep your job, right?
And that’s just speculation. There’s hard evidence for the Getty Images lawsuit.
Comparing ELIZA to GPT because it's not quite there yet is like comparing horse carriages to autonomous cars because they still have accidents. People have already found actual real-world uses for GPT, meanwhile ELIZA's most popular use-case is an obscure feature in Emacs.
Regarding the questionable training data I agree, but the likely outcome is that those AI models will be used regardless.
I'm guessing it's a gut feeling based on seeing a lot of the same types of influencers who used to endlessly hype up crypto now making similar noise about AI. But I agree that that assessment is unfair to AI, even if the hype is ahead of current abilities.
Best kept secret in the AI boom really is that you don't need a ton of local horsepower to run a model and do something useful, you only really need the grunt once, to do the training in a reasonable amount of time.
Combine that with the stuff that Mythic was working on in texas (before they ran out of burn) and we're looking at a world where, most people won't have a need for heavy GPU compute.
It’s gonna take another decade before the suits at F500 companies figure that out, though. You wouldn’t believe the wasteful nonsense that’s in production already…
Oh hell yeah, super happy for them. They've got the right idea for edgeAI stuff, I hope they can survive, or, at least sell their IP to someone big and buy a few acres to DGAF on.
As long as sanctions and ransomware exist, I don’t think it will ever fully die. But I don’t think the price will ever hit an ATH again without SBF, tether, and their co-conspirators pumping the price by printing stablecoins out of thin air.
Pretty much everyone has heard of crypto by now. The Ponzi scheme is out of marks. There will be no next time. Sell now if you’re still holding the bag.
Then in 2018, they prohibited usage of GeForce GPUs in data centers: https://www.datacenterdynamics.com/en/news/nvidia-updates-ge... And during the 5 years that followed, they reached the point where, for many use cases, nVidia is dramatically worse value compared to competitors.
On my day job I do CAM/CAE, we compute a lot of FP64 numbers. Because we use D3D tech for GPGPU, we recently bought some computers with AMD 7900 XTX. These GPUs were sold for about $1000. An nVidia equivalent is L40: the AMD is 1.459 TFlops and 0.96 TB/sec memory bandwidth, the nVidia is 1.414 TFlops and 0.864 TB/sec memory bandwidth, but the nVidia is sold for about $9000. That’s an order of magnitude difference in cost efficiency.
I agree the cost difference is substantial, but something seems off here. I think you can buy nVidia cards that can beat anything AMD has for ~$1500-2000. That is gaming focused, but it makes me extremely skeptical of the overall numbers.
If the AMD chip really is better for your job, that's not that crazy a claim, but it makes no sense that nVidia wouldn't have something for 1.5-2x as much. That seems to be the going rate, currently.
If you're getting comparable speed for 1/9 the cost, I don't think that product would exist for very long. I've been looking at graphics cards and nVidia clearly beats AMD in every category with around a 50-100% markup.
Gamers don’t care about FP64 performance, and it seems nVidia is using that for market segmentation. The FP64 performance for RTX 4090 is 1.142 TFlops, for RTX 3090 Ti 0.524 TFlops. AMD doesn’t do that, FP64 performance is consistently better there, and have been this way for quite a few years. For example, the figure for 3090 Ti (a $2000 card from 2022) is similar to Radeon Vega 56, a $400 card from 2017 which can do 0.518 TFlops.
And another thing: nVidia forbids usage of GeForce cards in data centers, while AMD allows that. I don’t know how specifically they define datacenter, whether it’s enforceable, or whether it’s tested in courts of various jurisdictions. I just don’t want to find out answers to these questions at the legal expenses of my employer. I believe they would prefer to not cut corners like that.
I think nVidia only beats AMD due to the ecosystem: for GPGPU that’s CUDA (and especially the included first-party libraries like BLAS, FFT, DNN and others), and also due to the support in popular libraries like TensorFlow. However, it’s not that hard to ignore the ecosystem and instead write some compute shaders in HLSL. Here’s a non-trivial open-source project unrelated to CAE where I managed to do just that with decent results: https://github.com/Const-me/Whisper That software even works on Linux, probably due to Valve’s work on DXVK 2.0 (a compatibility layer which implements D3D11 on top of Vulkan).
> However, it’s not that hard to ignore the ecosystem
I'd say this will only work if you have stable results or models to target, like Whisper; "going your own way" can help improve portability and such, and llama.cpp is another good example. But a lot of the software demand is not driven by that; it's driven by continuously evolving models and needs, and a lot of the bloat, or whatever you want to call it, is a result of that.
Besides that, the programming models are moving on. The open source Nvidia Linux driver now enables fully heterogeneous memory management on x86 across the CPU and GPU. This means the GPU and CPU do not need the programmer to enforce memory coherency, perform device-specific allocations, or copy memory; migrations, page table/TLB flushes, etc. all work out of the box with no modifications to userspace software. So now your io_uring asynchronous loop can write training data to memory that is implicitly available to the GPU, no matter what memory allocator you're using. It basically means arbitrary CPU compute and arbitrary GPU compute are now composable, using the memory substrate (and OS kernel) as a coherent transport/storage layer. On x86/Nvidia this works at the granularity of a page, but on the Grace Hopper superchip it will take place at the level of a cache line. Multiple Hopper superchips can be NVLink'd together over InfiniBand, so this works across the cluster. You can drive an entire rack of systems this way and it works.
For people actually doing a lot of GPU-specific programming, or deploying models on servers (e.g. for API usage), this is going to be a big deal in the long run, and it started way back when they first introduced unified virtual memory. AMD is moving this way too for their compute stacks, I assume. The compute shader model just isn't evolving for these kinds of needs and it isn't clear it's going to anytime soon.
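A minimal sketch of what that programming model looks like from the CUDA side (the kernel, sizes, and file name are made up for illustration, and this only runs on a driver/GPU combination where system-allocated memory is GPU-accessible, e.g. the open kernel module with HMM on x86 or a Grace Hopper system): there is no cudaMalloc and no cudaMemcpy; the kernel just dereferences a plain malloc'd pointer and the pages migrate or fault in behind the scenes.

    // scale.cu - hypothetical HMM example; needs an HMM-capable driver/GPU
    #include <cstdio>
    #include <cstdlib>

    __global__ void scale(float *data, int n, float factor) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;   // GPU reads/writes CPU-allocated pages directly
    }

    int main() {
        const int n = 1 << 20;
        float *buf = (float *)malloc(n * sizeof(float));  // ordinary system allocator
        for (int i = 0; i < n; ++i) buf[i] = 1.0f;

        scale<<<(n + 255) / 256, 256>>>(buf, n, 2.0f);    // no explicit allocation or copy
        cudaDeviceSynchronize();

        printf("buf[0] = %g\n", buf[0]);                  // CPU sees the GPU's result
        free(buf);
        return 0;
    }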
I’m not sure heterogeneous memory is such a huge deal, due to the performance numbers. PCI Express is relatively slow in terms of bandwidth, and especially latency. To compensate, GPUs have a dedicated piece of hardware to asynchronously copy blocks of data over PCIe. In modern low-level APIs that hardware is even directly exposed to programmers, as the transfer queue in Vulkan and the copy command queue in D3D12.
Manually moving data with APIs like cudaMemcpy or ID3D11DeviceContext.CopyResource complicates the code, but it is much faster than unified memory, especially if you do it correctly with pipelining, so the GPU computes something else (like the previous batch of work) while the new data is being copied.
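For contrast, here's a minimal sketch of that pipelined style in CUDA terms (the kernel, sizes, and batch count are placeholders): pinned host buffers, per-buffer streams, and double buffering, so the copy engine moves batch k+1 over PCIe while the GPU is still computing batch k.

    // pipeline.cu - hypothetical double-buffered transfer/compute overlap
    #include <cuda_runtime.h>

    __global__ void process(const float *in, float *out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = in[i] * 2.0f;
    }

    int main() {
        const int n = 1 << 20, nbatches = 8, nbuf = 2;
        float *h_in, *h_out;
        cudaMallocHost(&h_in,  (size_t)nbatches * n * sizeof(float));  // pinned memory, needed for real async copies
        cudaMallocHost(&h_out, (size_t)nbatches * n * sizeof(float));

        float *d_in[nbuf], *d_out[nbuf];
        cudaStream_t stream[nbuf];
        for (int b = 0; b < nbuf; ++b) {
            cudaMalloc(&d_in[b],  n * sizeof(float));
            cudaMalloc(&d_out[b], n * sizeof(float));
            cudaStreamCreate(&stream[b]);
        }

        for (int k = 0; k < nbatches; ++k) {
            int b = k % nbuf;  // double buffering: two batches in flight
            cudaMemcpyAsync(d_in[b], h_in + (size_t)k * n, n * sizeof(float),
                            cudaMemcpyHostToDevice, stream[b]);       // handled by the copy engine
            process<<<(n + 255) / 256, 256, 0, stream[b]>>>(d_in[b], d_out[b], n);
            cudaMemcpyAsync(h_out + (size_t)k * n, d_out[b], n * sizeof(float),
                            cudaMemcpyDeviceToHost, stream[b]);
        }
        cudaDeviceSynchronize();
        return 0;
    }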
Speaking of new features, I would rather expect GPGPU users to be interested in the DirectStorage technology, which allows GPUs to efficiently load data from an SSD. It is currently Windows-only but supported by all three GPU vendors. Because it was implemented primarily for videogames, it works just fine with compute shaders.
I think it’s money. nVidia is selling the same chips on different boards for different prices.
An example from the current generation: the $800 GeForce 4070 Ti and the $2500 L4 are based on the same chip. The differences between these two cards are the amount of VRAM, some settings which control clock frequencies, and these legal/software limitations. Without the legal limitations, people who need to compute stuff that fits in 12 GB of VRAM would only pay $800 for the GeForce model.
Because AMD's GPGPU offerings have been egregiously buggy for at least a decade. Now that AMD has money I'm hoping they can turn this around, but geohot hitting driver crashes from a loop around a demo suggests to me that things are still pretty bad.
On a personal level, that YouTube stream doesn't make him come off looking that good... people are trying to get patches to him and generally soothe him / do damage control, and he's just being a bit of a manchild. And it sounds like that's the general course of events around a lot of his "efforts".
On the other hand, he's not wrong either: having this private build inside AMD, and not even validating officially supported configurations for the non-private builds they show to the world, isn't a good look, and that's just the very start of the problems around ROCm. AMD's OpenCL runtime was never stable or good either, and every experience I've heard with it was "we spent so much time fighting AMD-specific runtime bugs and spec jank that what we ended up with was essentially vendor-proprietary anyway".
On the other other hand, it sounds like AMD knows this is a mess and has some big stability/maturity improvements in the pipeline. It seems clear from some of the smoke coming out of the building that they're cooking on more general ROCm support for RDNA cards, and generally working to patch the maturity and stability issues he's talking about. I hate the "wait for drivers / the new software release, bro, it's gonna fix everything" attitude that surrounds AMD products, but in this case I'm at least hopeful: they seem to understand the problem, even if it's absurdly late.
Some of what he was viewing as "the process happening in secret" was likely people doing rush patches on the latest build to accommodate him, and he comes off as berating them over it. Again, that stream just comes off as "mercurial manchild", not coding genius. And everyone knew the driver situation was bad; that's why there's notionally alpha for him to realize here in the first place. He's bumping into moneymakers, and getting mad about it.
That aligns closely with when AMD bought ATI. Before that ATI had a long history of egregiously buggy GPGPUs. I can't say what's gone wrong with Radeon, but I can say it's been going wrong for nearly my entire life.
Honestly, I am amazed by all the complaints on here regarding AMD crashing. I have not seen this once while owning my card! All the while my friends on Discord keep complaining about their Nvidia cards while we’re playing Warzone.
You don’t need these buggy AMD-specific technologies to leverage AMD’s GPGPU hardware. Compute shaders have been used by videogames for a decade now; for example, GTA5 from 2013 already used compute shaders to render stuff. At least in my experience, the tech is very reliable regardless of GPU vendor.
I haven’t tested that, but I would expect Vulkan compute on AMD hardware to be comparable by now. After all, people have bought a couple million Steam Decks, which use Vulkan as the native GPU API and emulate D3D on top of that.
Yeah, my GPU isn't so terrible lately, but that's been a big draw. Eventually, you just get fed up and start looking elsewhere. Some of these AMD cards are complete trash.
I mean, silicon aside, isn't it all basically CUDA lock-in?
CUDA is the de facto parallel computing platform, and exciting tech moves VERY fast with early mover advantage - so no researcher/computer scientist is going to bother learning another platform, or risk wasting time and effort on their stuff breaking when ported to OpenCL/AMD.
Unless a miracle happens, any new parallel-computing requirement that seizes the popular imagination in the next decade (or more) will be done in CUDA.
The sad part is that most people are using a higher-level API like PyTorch. So AMD or Intel just need a simple low-level API to access their hardware, and then to write and tune some kernels for PyTorch (and I suspect the community, or AI, will gladly help with the tuning). So basically there does not seem to be a real CUDA lock-in; it's just that the competition still seems unable to do even this.
Apple was able to break the CUDA lock-in with their Metal compute API (MPS) in no time at all. Within months, major AI libraries like PyTorch and TensorFlow all supported Apple GPUs without a hitch.
What's taking AMD so long? They just can't do software I guess?
> I mean, silicon aside, isn't it all basically CUDA lock-in?
Everything is and always will be some form of lock-in. Look at what’s happened with CentOS or even Ubuntu. Even what were solid open source choices before have turned into some form of exploited lock-in. I’ve come to the realization that it’s impossible to optimize your way out of lock-in; instead, it’s better to cost-optimize for current best practices while continuously running test applications on different platforms and technologies when you have the resources to do so.
The amount of money being spent on ML compute is pretty high. If it stays high, cost will be a relevant axis that others can compete on.
GPUs are also not super available atm. We'll see if this is a temporary issue or one that persists for years. If you can't get A100s, you'll try the AMD/TPU competitors.
In the post Moore's law era my hunch is that the eventual "winner", if there is one, will be decided based on software not hardware.
So far the main use cases that have pushed the computational performance envelope have been hype-driven. In such a climate people will just accept development costs, lock-in risks, etc. as the cost of entering the bubble early enough. In "normalized" conditions the focus shifts back to developer and user productivity and cost-efficiency.
We are slowly entering the commoditised HPC era, but HPC was always a volatile domain precisely because of its highly specialized nature. The broad, general-purpose use and (relatively) easy set of tools of the x86 CPU era provided a stability that is unlikely to repeat.
The vector accelerator market is (conceptually) for the CPU vendors to lose, because they can create a more productive development environment that unifies CPU/GPU. To paraphrase: users don't care about GPUs, they care about performant applications.
They just keep adding support for what the market demands; adding FP8 to the H100 and the 40xx series was such a good move. Their tooling is top notch: what is AMD's alternative to RAPIDS? Their drivers are so stable that you can rely on them not crashing in projects that need to train for weeks and months, and SXM is another big win for Nvidia.
They also have skin in the game and do a lot of research in data science.
A single 4090 can do 0.66 PFLOPS, which is mind-blowing to me: that's borderline having a supercomputer from the early 2000s in your room for a few thousand dollars.
I recently found an article by Apple explaining how they optimized a transformer to work on the Apple Neural Engine [1]. This was before their recent hardware refresh with the 192 GB unified memory upgrade.
Apple could have a real chance to push their boxes as an alternative to dedicated GPUs.
I'd love to see more discussion of the assertion that Google is the only real competitor to Nvidia. I get the angle, but it still feels weird to think of them as competitors, especially given how Google appears to be struggling with their AI story at the moment.
It also wouldn't hurt if Google showed anywhere near the discipline that Nvidia has shown over the years. About the only thing Nvidia has done to upset anyone is to not have open source drivers. Even there, though, many folks are far happier with their closed drivers than they are with many of the open alternatives.
I think the article is making the (reasonable?) assumption that Google's search and advertising will use Google's own in-house Bard or something similar. The article mentioned something about Google having the full vertical stack: hardware, networking, software, ML model, product, and the users.
As of right now, Bard is separate. Or if Google has integrated ML models into search and ads, it hasn't been widely discussed.
From HPC industry experience with Nvidia, my take is that they are the smartest guys in the room. Their solution architects, who help customers implement and optimize CUDA, are by far the best SAs I’ve ever worked with. If that hiring ethos is pervasive, they’ll keep winning.
If the GCC compiler has an issue, I'm in a lot of trouble too, even though it's open source. I guess I could scan the code and figure out any minor problems (or figure out why I was misusing it) but I'm not an expert in compilers, or even at tensor programming.
Obviously it would be better if they were open source, since someone would be able to fix them, and it helps to know how your tools really work. But a better closed source driver is probably the right tool for most problems.
I've had no problems with Nvidia Linux drivers themselves in the last 15 years or so. Yes, they don't play well with the rest of the ecosystem. Using Optimus laptops was a nightmare but you could still get hardware acceleration if you needed it, with performance on par with Windows. ATI on the other hand is a choose your own adventure kind of a deal.
I love open source, but this just isn't true. If you are a company, you find an actual problem with Nvidia, and you just spent some multiple of $100k or more on graphics cards for a cluster, you are able to get an ear and get that driver fixed.
They both need to get this 1000% better stat. I think it's probably the best ROI the CEO of AMD can get - put a team of engineers on Linux drivers full time, increase your AI market share... and triple your market cap.
As soon as you are outside of what they care about, your support will become abysmal. And they don't care about the modern desktop use case (such as Wayland support, which they lacked for years).
So if you are a Linux user - simply avoid Nvidia. They don't care about you.
Nvidia's Linux compute solution is far better than AMD's compute solution, and their Linux graphics solution is far worse than AMD (and Intel)'s graphics solution. So the recommendation comes down to why you want that GPU.
No, they are "famous" for having the best drivers but not open sourcing them.
NVIDIA's drivers are practically the same size as an OS anyway. Arguably they are an OS, for the GPU instead of the CPU. So it's not a huge surprise they don't give it all away for free, especially as software is a competitive advantage for them.
CUDA on Linux just works. And works very well. If anything most CUDA workloads run on Linux machines, and it's completely pain free compared to setting up ROCm even on Linux.
From what I understand, that was about them not cooperating with the kernel devs, and the fact that their driver was a black box that had to interact with half the kernel didn't help.
Driver-wise, my worst experiences were all solidly in the ATI/AMD binary camp, with the constant need to exorcise nouveau from various systems coming second.
I'm really disturbed by all of this - it doesn't quite check out with me that there are some quite capable players with kagillions of dollars who haven't managed to, or, more disturbingly, won't even put up the investment or effort to break into the AI shenanigans space within the GPU market.
I don't think you understand just how insanely difficult it is to break into that market. There's a lot that goes into GPUs that makes it a very difficult industry to get going in. Even with Apple money or something like that, it's a losing prospect, because in the time it'll take you to get up and off the ground (which is FOREVER) your competition will crush you.
Agreed. Nvidia’s success comes from their unrelenting drive and ability to pivot to new markets. They didn’t independently discover that GPUs would be super useful for DL but they did see the incredible potential and invested big time into it.
So while competition will come, Nvidia isn’t going to pull an Intel and sit by collecting datacenter revenue on incrementally better hardware. The way I see it, Nvidia will really only be outdone by some major paradigm shift in their biggest markets, e.g., a breakthrough in AI computing that no longer relies on digital ICs.
> I don't think you understand just how insanely difficult it is to break into that market.
You're right, I have no clue nor have I ever tried myself.
> Even with apple money or something like that, it's a losing prospect because in the time it'll take you to get up and off the ground (which is FOREVER) your competition will crush you.
This I find hard to believe; do you have a source or reference for that claim? Companies with that amount of cash are hardly going to be crushed by competition, be it direct or indirect. Anyway, I'm talking more about the Intels and AMDs of this world.
We have very lacklustre efforts from players I won't name <cough Intel> with their Zluda library (https://github.com/vosen/ZLUDA) which I got REALLY excited about, until I read the README.txt. Four contributors, last commit early 2021.
ZLUDA is a library written by a hobbyist. The comparison you are looking for is oneAPI. They don't have the messaging down, but Intel is good at software.
We really, really tried to use oneAPI, for FPGA, working with Intel directly.
Every release would completely change some approach, requiring restructuring. The only card with “free” compatibility was discontinued the week after we bought ours, and the drivers stopped working on the same OS that oneAPI required.
We constantly encountered showstopper bugs that led us to conclude that nobody outside Intel was using it. Some of them appeared to be live-patched on their DevCloud.
Documentation was reasonably good. But every few months Intel would move or restructure documentation links such that it was impossible to persistently store a link to a useful document.
Intel may be good at software, but they are utterly crap at actually maintaining a consistent ecosystem that isn’t painful to try and follow.
If it had been "easy", China would have been able to do it. For them it is existential at the national level. Money alone isn't enough to beat decade(s) of effort.
AFAIK the "most powerful" TPU that you can get is the CRL-G18U-P3DF, and that is only capable of running quantized TensorFlow Lite models. Not sure how that is supposed to compete with those 700W TDP superchips.
That nobody else seems to be trying hard enough, despite being well-funded and respectable companies. And until somebody does, we're stuck with Nvidia/CUDA, or half-working stuff that's way more hassle to prototype or build a product from.
To answer the question of why there is not much competition, I would like to point out that the required expertise is very rare. Not chip-design expertise specifically, but expertise spanning both the chip hardware and the low-level software for that chip-specific hardware: everything necessary to create the compilers, frameworks, and libraries, and to plug in well with the existing AI/ML ecosystem such as PyTorch.
The US is putting restrictions on this, so it feels like a short-term workaround:
“Jordan Schneider: Chinese firms, however, are not restricted in accessing cloud services overseas. Nothing is stopping a Chinese company from buying top-of-the-line Nvidia compute from AWS. Does that alter the dynamic?”
I thought it was strange that the history of GPUs jumped from graphics to AI with nary a mention of crypto, but then I realized this was China-centric, where this kind of GPU use case is against the law.
"The main reasons for Nvidia's success, are its creation of a GPU ecosystem, lack of significant competitors, and the benefits derived from its compute and software ecosystem."
ROCm is a disaster. It doesn't even work with most AMD GPUs, and the software is awful (difficult to set up, very poorly supported). AMD keeps occasionally releasing new stuff in the GPGPU space, then leaves it unsupported and continues to fall further behind Nvidia.