CUDA proves that Nvidia is a software company

Originally published by WIRED.

Forgive me for starting with a cliché, a piece of financial jargon that has recently seeped into the tech lexicon, but I’m afraid I have to talk about “moats.” The word was coined decades ago by Warren Buffett to refer to a company’s durable competitive advantage, and it found its way into Silicon Valley presentations after a leaked Google memo titled “We Have No Moat, And Neither Does OpenAI” expressed the fear that open-source AI might pillage Big Tech’s citadel.

A few years later, the castle walls are still safe. Aside from a brief panic attack when DeepSeek debuted, open-source AI models have not significantly outperformed proprietary ones. And yet none of the leading labs — OpenAI, Anthropic, Google — has a moat to speak of.

The company that has a moat is Nvidia. CEO Jensen Huang described it as his most precious “treasure.” It’s not a piece of hardware, as you might assume for a chipset company. It’s something called CUDA. What sounds like a chemical compound banned by the Food and Drug Administration may be the only real moat in artificial intelligence.

CUDA technically stands for Compute Unified Device Architecture, but no one bothers to expand the acronym; we just say “KOO-duh.” So what does this all-important treasure actually do? If I had to give a one-word answer: parallelism.

Here’s a simple example. Suppose we task a machine with filling in a 9×9 multiplication table. A single-core computer executes all 81 operations faithfully, one after the other. But a nine-core GPU can divide the work so that each core takes a different column — one from 1×1 to 1×9, another from 2×1 to 2×9, and so on — for a nine-fold speedup. Modern GPUs can be smarter still. If one were programmed to recognize commutativity — 7×9 = 9×7 — it could avoid duplicate work, reducing 81 operations to 45 and cutting the workload by nearly half. When a single training run costs $100 million, every improvement counts.
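The two tricks above — splitting the table across cores by column, and exploiting commutativity to skip mirrored entries — can be sketched in plain Python. This is a toy model of the idea, not actual CUDA; the function names are invented for illustration:

```python
# Toy model of the 9x9 multiplication-table example: column-per-"core"
# parallelism, and a commutativity optimization (a*b == b*a).

from concurrent.futures import ThreadPoolExecutor

def fill_column(a):
    """One 'core' computes an entire column: a*1 through a*9."""
    return {(a, b): a * b for b in range(1, 10)}

def parallel_table():
    """Nine workers, one column each -- the nine-fold split in the text."""
    table = {}
    with ThreadPoolExecutor(max_workers=9) as pool:
        for column in pool.map(fill_column, range(1, 10)):
            table.update(column)
    return table

def commutative_table():
    """Compute a*b only for a <= b, then mirror it into (b, a).

    That is 9 diagonal entries plus 36 off-diagonal pairs:
    45 multiplications instead of 81.
    """
    table = {}
    ops = 0
    for a in range(1, 10):
        for b in range(a, 10):
            table[(a, b)] = table[(b, a)] = a * b
            ops += 1
    return table, ops
```

Both routines produce the same 81-entry table; the second performs only 45 multiplications, which is the "nearly half" savings the example describes.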

Nvidia’s GPUs were originally designed to render graphics for video games. In the early 2000s, a PhD student at Stanford University named Ian Buck, who first got into GPUs as a gamer, realized that their architecture could be repurposed for general high-performance computing. He created a programming language called Brook, was hired by Nvidia, and, along with John Nickolls, led the development of CUDA. If AI ushers in an era of permanent underclasses and autonomous weapons, just know that it will all have happened because someone, somewhere, wanted the demon viscera in Doom to pulsate at 60 frames per second.

CUDA is not a programming language per se but a “platform.” I use this elusive word because, in the same way that The New York Times is also a gaming company, CUDA has over the years become a nested package of software libraries for artificial intelligence. Each function shaves nanoseconds off individual calculations — combined, they make GPUs, in industry parlance, go brrr.

A modern graphics card is not just a circuit board crammed with chips, memory, and fans. It is an elaborate combination of cache hierarchies and specialized units called “tensor cores” and “streaming multiprocessors.” In this sense, what chip companies sell is like a professional kitchen, and more cores are like more grilling stations. But even a kitchen with 30 grilling stations won’t run faster without a chef able to deftly assign tasks — which is what CUDA does with GPU cores.

Extending the metaphor, hand-tuned CUDA libraries optimized for a single array operation are the equivalent of kitchen tools designed for one function and no more — the cherry pitter, the shrimp deveiner — indulgences for a home cook, but not if you have 10,000 shrimp to devein. Which brings us back to DeepSeek. Its engineers went below this already deep layer of abstraction to work directly in PTX, a kind of assembly language for Nvidia’s GPUs. Suppose the task is peeling garlic. An unoptimized GPU might be told: “Peel the skin off with your fingernails.” CUDA can command: “Crush the cloves with the flat of a knife.” PTX lets you dictate each sub-instruction: “Lift the blade 2.35 inches above the cutting board, keep it parallel to the equator of the clove, and strike down with your palm with a force of 36.2 newtons.”
