Level 1 - Absolute Beginner
Nvidia is a company that makes computer chips. Its boss is Jensen Huang. He showed a new chip called Vera Rubin at a big event in Taipei.
The new chip is very fast. It is ten times faster than the old chip for AI tasks. AI means artificial intelligence. It helps computers think and learn.
The chip will be used in data centers. A data center is a big building full of computers. These computers help run AI programs around the world.
- chip: a small piece of silicon inside a computer that does calculations
- artificial intelligence (AI): computer systems that can do tasks that normally need human thinking
- fast: able to do something in a short amount of time
- data center: a large building full of computers that store and process information
- event: a planned occasion where people come together for a purpose
- computer: a machine that can process and store information
- boss: the leader or head of a company
- program: a set of instructions that tells a computer what to do
Level 2 - Elementary
Nvidia CEO Jensen Huang announced the Vera Rubin AI chip at COMPUTEX in Taipei on June 1, 2026. The new chip is ten times faster than Nvidia's previous chip, called Blackwell, for AI inference tasks. Inference means using a trained AI model to produce answers. It is different from training, which is how the model learns.
The Vera Rubin chip comes in a large rack system called NVL72. This rack holds 72 Rubin GPUs and 36 Vera CPUs all connected together by a fast connection called NVLink 6. Companies can use these racks to run large AI programs like chatbots and image generators.
Nvidia also revealed its roadmap for future chips. After Vera Rubin, the company will release Rubin Ultra and then a chip named Feynman by 2028. These announcements show that Nvidia plans to improve its chips every year to stay ahead of competitors.
- CEO: Chief Executive Officer; the top leader of a company
- inference: running an AI model to produce outputs, as opposed to training it
- GPU: Graphics Processing Unit; a chip originally for graphics, now widely used for AI
- CPU: Central Processing Unit; the main chip that controls a computer
- rack: a frame holding many computer components together in a data center
- chatbot: an AI program that can have conversations with people
- competitor: a company that sells similar products and competes for the same customers
- roadmap: a plan showing future products or goals and when they will arrive
Level 3 - Intermediate
Nvidia CEO Jensen Huang delivered one of his signature theatrical keynotes at COMPUTEX Taipei on June 1, 2026, unveiling the Vera Rubin platform, which pairs the new Rubin GPU architecture with a custom CPU named Vera. The platform delivers ten times the inference throughput of the previous Blackwell generation, a leap driven by a new transformer engine, 1.5 terabytes per second of memory bandwidth from HBM4, and NVLink 6, which connects chips within a rack at 1.8 terabits per second.
The flagship product is the NVL72, a rack-scale system packing 72 Rubin GPUs and 36 Vera CPUs. Huang said the system is specifically optimized for large language model inference at data-center scale, allowing cloud providers like Google, Microsoft, and Amazon to serve AI queries at lower cost and latency than before. Several hyperscaler CEOs appeared by video to confirm orders already placed for 2027 delivery.
Huang also outlined Nvidia's multi-year roadmap: Rubin Ultra ships in late 2027, followed by the Feynman architecture in 2028. Named after Nobel laureate Richard Feynman, the new architecture is expected to incorporate photonic interconnects. Analysts noted that the cadence of annual architecture upgrades puts pressure on rivals AMD and Intel to accelerate their own AI chip programs, while startups such as Groq and Cerebras face an increasingly difficult landscape.
- throughput: the amount of data or work processed by a system in a given time period
- transformer engine: dedicated hardware inside a GPU designed to accelerate transformer-based AI models
- memory bandwidth: the rate at which data can be read from or written to a chip's memory
- latency: the delay between a request and its response in a computing system
- hyperscaler: a very large cloud computing company such as Google, Amazon, or Microsoft
- photonic interconnects: connections that use light instead of electrical signals to transfer data faster
- architecture: the overall design and organization of a computer chip or system
- keynote: the main speech or presentation at a conference, usually by a senior leader
Level 4 - Advanced
Jensen Huang's COMPUTEX keynote on June 1, 2026, followed his now-established ritual of arriving in a leather jacket to an arena crowd and announcing performance claims that, however theatrical in delivery, have consistently been borne out in third-party benchmarks. Vera Rubin, Nvidia's successor to Blackwell, delivers a tenfold jump in AI inference throughput by combining a redesigned transformer engine with HBM4 memory at 1.5 terabytes per second of bandwidth, NVLink 6 at 1.8 terabits per second, and a custom companion CPU, also called Vera, that handles orchestration workloads previously handled by Intel Xeon or AMD EPYC host processors.
The architectural cornerstone is the NVL72, a rack-scale compute module housing 72 Rubin GPUs and 36 Vera CPUs fully interconnected via NVLink fabric, functioning as a single logical accelerator with a unified memory space. This addresses one of the principal bottlenecks in serving trillion-parameter language models: the need for high-bandwidth, low-latency chip-to-chip communication that conventional PCIe fabrics cannot sustain. Hyperscalers including Google Cloud, Microsoft Azure, and Amazon Web Services confirmed multi-billion-dollar purchase commitments, consistent with the capital expenditure guidance each company had given on its most recent quarterly earnings call.
Nvidia's roadmap, extending to the Feynman architecture in 2028, telegraphs an intent to incorporate silicon photonics for intra-rack communication, a technology that would push interconnect bandwidth into the range of tens of terabits per second while slashing power consumption per bit. Nvidia's dominance, measured by a data-center GPU revenue share exceeding 85 percent, remains under challenge from AMD's MI400 series, custom silicon programs at Google (TPU v6) and Amazon (Trainium 3), and a cohort of well-capitalized startups including Groq, Cerebras, and Tenstorrent. However, Nvidia's software moat, anchored by the CUDA ecosystem and tens of thousands of optimized kernel libraries, continues to exact a switching cost that even architecturally superior alternatives have struggled to overcome.
- orchestration: the automated coordination of complex computing tasks across multiple processors or systems
- unified memory space: a design where multiple chips share a single address space, allowing them to act as one large processor
- silicon photonics: technology that uses light rather than electricity to move data inside or between chips
- capital expenditure: money spent by a company on acquiring or upgrading physical assets such as servers and chips
- switching cost: the expense or difficulty a customer faces when changing from one product or platform to another
- kernel libraries: pre-written code routines optimized for specific hardware, enabling faster AI and graphics computations