NVIDIA’s Moat Is Deeper Than It Looks - Except in One Place
On building a moat that wins everywhere, and the one place it quietly stops working
Sometime around 2006, NVIDIA did something to its own products that looked like sabotage.
At the time, NVIDIA sold graphics cards: the chip that draws the picture in a video game. It was a maker of a single component, a part that slots into someone else’s machine, and making that part well was the whole of the business.
The act of sabotage was this: NVIDIA built a piece of software called CUDA that let its gaming chips run general-purpose mathematics, not just graphics, and baked it into every card it sold, whether the buyer wanted it or not. CUDA made the cards more expensive to build, and gamers had no use for it and would not pay a penny more because of it. By Jensen Huang’s own account, the cost of carrying CUDA swallowed most of the company’s profit, and for a while the market valued the entire business at barely more than the cash in its bank account.
NVIDIA kept doing it for six years anyway: giving the tools away, teaching courses on CUDA, seeding the language into university labs that had no obvious reason to need it. In 2012 a neural network called AlexNet, trained on two NVIDIA gaming cards, won the ImageNet image-recognition contest by a margin that embarrassed everyone else in it. The field that was about to swallow the next decade of technology discovered that the cheapest way to do its enormous arithmetic was already sitting in NVIDIA’s chips, in a language NVIDIA had spent six years quietly teaching to every lab on Earth. The timing was not luck. NVIDIA had built the road before there was any traffic.
The trap, and the move that escaped it
Huang has a clean way of describing the trap NVIDIA was caught in. A specialist does one narrow thing superbly, and that is its strength and its ceiling at once. The narrower your speciality, the smaller your market; the smaller your market, the smaller your research budget; the smaller your budget, the less you can ever shape the industry at large.
The usual escape is to widen out and become a generalist, and it is a trap of its own, because a generalist is optimal at nothing, and widening dilutes the very thing that made you worth choosing. So most companies pick a side. They stay sharp and small, or grow broad and ordinary, and either way the ceiling holds.
Excellence at a narrow job carries the seed of its own irrelevance.
CUDA was how NVIDIA refused to pick. It kept the speed of a specialist chip and gave it the reach of a generalist platform, and the way it pulled that off is the whole game.
Co-design wins the lead. Openness turns it into a moat.
There are two moves here, and they are easy to mistake for one because they happened at the same time.
The first move is engineering. Huang calls it “extreme co-design”: doing the hard, specific work across every layer of the stack at once (the chip, the system, the software, the training methods) so that nothing else in the world matches it on raw speed. This is what wins the lead. It is genuinely hard, and it is genuinely NVIDIA’s.
The second move is different in kind, and it is the one that actually built the moat. NVIDIA opened the surrounding layers (the tools, the libraries, the courses) and let a whole industry build its work on top of CUDA rather than something else. Crucially, the core itself was never given away. It stayed proprietary, NVIDIA’s alone, which is exactly why building on it binds you to NVIDIA. What gets opened is everything around the thing that locks you in.
The engineering buys the lead. The openness turns the lead into an install base. The install base is the moat.
Here is the strangeness of it: everywhere else, you defend a position by closing it, guarding the technology, controlling access, making it hard to copy. NVIDIA’s moat is built from the opposite material. Nobody is locked in. They are simply home.
The moat is millions of developers who, over fifteen years, freely chose to put their software on top of CUDA, and who are held not by a contract or a wall but by the plain fact that they built their working lives there. And the longer they stay, the harder it is to leave, because the thing they are standing on keeps thickening: not just CUDA itself but the mature layers grown over it, the tuned libraries, the debuggers, the years of accumulated answers to obscure questions, each one another reason the next person starts here too.
You can read what that is worth in the accounts. NVIDIA earns gross margins near seventy-five per cent. A typical chipmaker lives well below that, and Intel in its prime, the most dominant chip company of the previous era, topped out around sixty. NVIDIA clears all of them, and not because its silicon is that much better: its customers are really paying for the ecosystem they cannot leave, and the chip is just the part of it that happens to have a price tag.
Huang is blunt about this being the real prize. If a rival turned up tomorrow with a perfect copy of CUDA, call it GUDA or TUDA, it would change nothing, because it was never only about the technology. A developer deciding where to build chooses the platform that runs on the most machines and carries the richest ecosystem, and for years the answer has been the same one.
You can watch the same thing happen elsewhere in computing, most clearly in the chips that run the world’s servers. They nearly all speak a decades-old design called x86, which engineers have mocked for almost as long, because cleaner, faster ways to build a chip have been known for years. Those better designs kept coming, and kept losing, because the software the world depends on was already written for x86, and rewriting all of it was never worth it. Intel sank billions and a decade into one such replacement, Itanium, and it failed for exactly this reason. Elegance loses to incumbency, and incumbency is only install base with a longer memory.
But the NVIDIA moat is not uniform: strong everywhere you tend to look, thin in one place you do not, and that thin place is where the story gets interesting.
Where the moat runs shallow
A moat is not one uniform ring of water. Some stretches stay deep because the ground keeps shifting too fast for anyone to learn it. Others, worn down by years of ordinary use, silt up enough to wade across. NVIDIA’s moat has both kinds, and the giant cloud companies now building their own chips have found the shallow one.
The shallow stretch is inference: the everyday act of running a model that has already been trained, the work behind every answer ChatGPT gives you. It is the same calculation, over and over, for as long as the model stays in service, and repetition like that can eventually be learned well enough that a narrower, cheaper chip built for that one calculation alone will beat a general-purpose one on cost. Anthropic has announced plans to run Claude on up to a million of Google’s custom TPU chips, among the largest commitments of its kind by any single customer. OpenAI has gone further and is designing its own chip with Broadcom, reportedly aimed at serving finished models rather than training them. Whichever way NVIDIA’s own market-share numbers move from one quarter to the next, the pattern in who builds what has not changed: when a company is large enough to justify designing its own silicon, it reaches for inference first.
The deep stretch of the moat is training: the months-long run across tens of thousands of chips that builds a new model in the first place, before anyone has typed a word to it. Nobody can build a cheap, narrow chip in advance for a problem that has not been solved yet, and each training run is its own new problem: a different architecture, a different trick for moving data between chips, a different way of splitting the work across a machine the size of a small town. NVIDIA still supplies something close to nine chips in ten for this kind of work, by most counts, and that number has barely moved, even as the cloud giants spend billions wading into the shallows on the other side. A few of the very largest are starting to test the deep water too, just barely, at a cost only they could carry.
Google is the one company that has actually waded across, and its case proves the depth rather than denying it. Google has built and run its own chips, called TPUs, for the better part of a decade, longer than anyone else, and now trains some of its largest models on them instead of NVIDIA’s. But that did not come quickly or cheaply. Most of the software the world’s AI researchers use was built up around CUDA over the better part of fifteen years, written to run on NVIDIA’s chips first and best, and getting that same software to run properly on Google’s own hardware took years of a parallel engineering effort that nobody outside Google ever saw. Google paid for its crossing in a decade of its own engineers’ time rather than in a purchase order, which is a price only a handful of companies on Earth could meet, and is itself the clearest evidence that the water there runs deep.
There is an obvious objection here: if its biggest customers are now building their own chips, why doesn’t NVIDIA fight back in kind, by becoming a cloud provider itself or backing one AI lab over the others? Because that would be the fastest way to lose them entirely. The moment NVIDIA started competing with a cloud company or an AI lab, instead of simply selling chips to it, that company would gain a second and far stronger reason to stop buying from NVIDIA altogether: nobody wants to keep funding a rival’s growth. So NVIDIA stays deliberately neutral. It does not build cloud infrastructure to rent out, and it does not bet on which lab wins, because either move would hand its biggest customers a reason to leave that has nothing to do with price or speed, only with refusing to keep arming a competitor.
Two kinds of water
The useful lesson here is not really about chips. Any large moat is rarely dug to one depth. Part of it holds because of sheer numbers: so many customers have already built their own work on top of it that no single one of them has enough reason, on its own, to tear everything up and start again somewhere else. The rest of it holds for a completely different reason. Whatever sits at the centre of the moat, the one capability a rival would actually have to copy, is simply hard to do, and that difficulty does not go away just because a rival wants in. The first kind of defence is cheap to keep, but breaks once enough customers decide, separately, that leaving is worth it. The second kind is expensive to keep, but almost impossible to take, because nobody can simply buy their way past a problem that nobody has solved yet.
A moat dug by numbers holds until someone has a reason to leave. A moat dug by difficulty holds until someone solves it. They are not the same wall, and they do not fail on the same day.
NVIDIA’s numbers, for now, read like a misprint. The bet that once reduced the market’s view of the whole company to little more than the cash sitting in its account helped build, two decades later, the most valuable company on Earth, worth something close to five trillion dollars, with roughly ninety per cent of last year’s revenue drawn from the very data centres whose owners are already wading in up to their knees. Whether they ever reach the far bank is the only question that matters now. They can see it from where they are standing. Whether the water in front of them keeps getting deeper, every year, faster than they can learn to swim, is the thing that will actually decide how this ends.
Disclosure: I work in semiconductors, at a company in NVIDIA’s supply chain rather than at NVIDIA itself. Nothing in this piece draws on non-public information from my employer; all of it comes from public interviews, filings, and reporting. This is analysis and storytelling, not financial advice, and nothing here should be read as a recommendation to buy, sell, or hold any security. Do your own research, or speak to a licensed financial adviser, before making investment decisions.



