NVIDIA Boosts AI Performance with GB200 NVL72 and OpenAI gpt-oss Models

Zach Anderson
Aug 05, 2025 23:50

NVIDIA collaborates with OpenAI to optimize the gpt-oss models, reaching up to 1.5 million tokens per second (TPS) on its GB200 NVL72 system.

NVIDIA, in collaboration with OpenAI, has announced significant gains in AI inference performance on the NVIDIA GB200 NVL72 system. According to NVIDIA, the newly launched OpenAI gpt-oss-20b and gpt-oss-120b models can be served at up to 1.5 million tokens per second (TPS) on that system.

Enhanced AI Capabilities

The gpt-oss models are text-reasoning models built on a mixture-of-experts (MoE) architecture with SwiGLU activations. Their attention layers use rotary position embeddings (RoPE) and support a 128k context length. The models are released in FP4 precision, which lets them fit on an 80 GB data center GPU, and they are optimized for NVIDIA’s Blackwell architecture.
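To make the MoE and SwiGLU terms concrete, here is a minimal pure-Python sketch of a single MoE feed-forward layer with SwiGLU experts. It is illustrative only: the toy dimensions, the number of experts, the top-2 routing, and the choice to apply softmax over the selected router scores are all assumptions for the example, not the published gpt-oss configuration.

```python
import math
import random


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(v):
    """Numerically stable softmax over a list of floats."""
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def swiglu_expert(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down( silu(gate(x)) * up(x) )."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def moe_layer(x, router_w, experts, top_k=2):
    """Route one token to its top_k experts and mix their outputs."""
    logits = matvec(router_w, x)
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Assumption: routing weights are a softmax over the selected scores only.
    weights = softmax([logits[i] for i in top])
    out = [0.0] * len(x)
    for w, i in zip(weights, top):
        y = swiglu_expert(x, *experts[i])
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy sizes chosen for readability, not taken from the gpt-oss models.
rng = random.Random(0)
d_model, d_hidden, n_experts = 4, 8, 4
router_w = random_matrix(n_experts, d_model, rng)
experts = [
    (
        random_matrix(d_hidden, d_model, rng),  # gate projection
        random_matrix(d_hidden, d_model, rng),  # up projection
        random_matrix(d_model, d_hidden, rng),  # down projection
    )
    for _ in range(n_experts)
]

token = [0.1, -0.2, 0.3, 0.4]
output, chosen = moe_layer(token, router_w, experts, top_k=2)
print("experts used:", chosen)
print("output vector:", output)
```

Only the selected experts run for a given token, which is why an MoE model can carry a large total parameter count while spending far less compute per token.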

Collaborative Developments

NVIDIA’s collaboration with OpenAI extends to several open-source frameworks, including Hugging Face Transformers and NVIDIA TensorRT-LLM, to improve model performance and developer accessibility. Training the gpt-oss-120b model alone took over 2.1 million GPU hours.

Technical Specifications

The gpt-oss-20b and gpt-oss-120b models differ in their transformer block counts, total parameter counts, and expert configurations, which are designed to optimize inference performance on NVIDIA’s platforms.
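A rough back-of-the-envelope calculation shows why 4-bit weights matter for the 80 GB GPU mentioned earlier. The parameter counts below are nominal values read off the model names (20 billion and 120 billion), which is an assumption. The estimate covers weights only and ignores the KV cache, activations, and any tensors kept at higher precision.

```python
def weight_footprint_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: nominal sizes inferred from the model names, not exact counts.
NOMINAL_PARAMS = {"gpt-oss-20b": 20e9, "gpt-oss-120b": 120e9}
GPU_MEMORY_GB = 80

for name, n_params in NOMINAL_PARAMS.items():
    fp4_gb = weight_footprint_gb(n_params, 4)
    bf16_gb = weight_footprint_gb(n_params, 16)
    verdict = "fits" if fp4_gb < GPU_MEMORY_GB else "does not fit"
    print(f"{name}: ~{fp4_gb:.0f} GB at 4 bits ({verdict} in {GPU_MEMORY_GB} GB), "
          f"~{bf16_gb:.0f} GB at 16 bits")
```

Under these assumptions the larger model needs roughly 60 GB for weights at 4 bits, compared with roughly 240 GB at 16 bits, which would not fit on one 80 GB device.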

Deployment Options

Developers have multiple deployment options, including vLLM and TensorRT-LLM, for server setup and performance optimization. According to NVIDIA, the GB200 NVL72 system is designed for high throughput and can serve up to 50,000 concurrent users.
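The two headline figures can be combined into a per-user rate. The sketch below assumes that the 1.5 million TPS and 50,000-user figures describe the same operating point and that throughput is shared evenly. The article does not state either assumption, and the 500-token response length is a hypothetical value chosen for illustration.

```python
def per_user_tps(system_tps: float, concurrent_users: int) -> float:
    """Average tokens per second available to each user under even sharing."""
    if concurrent_users <= 0:
        raise ValueError("concurrent_users must be positive")
    return system_tps / concurrent_users


SYSTEM_TPS = 1_500_000       # peak system throughput quoted in the article
CONCURRENT_USERS = 50_000    # concurrent users quoted in the article
RESPONSE_TOKENS = 500        # hypothetical response length

rate = per_user_tps(SYSTEM_TPS, CONCURRENT_USERS)
seconds = RESPONSE_TOKENS / rate

print(f"{rate:.0f} tokens/s per user")
print(f"{seconds:.1f} s to stream a {RESPONSE_TOKENS}-token response")
```

Under those assumptions each user would see about 30 tokens per second, so a 500-token response would take roughly 17 seconds to stream.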

Future Prospects

With the introduction of these models, NVIDIA aims to support a broad spectrum of AI applications from cloud to edge. Its work to integrate the gpt-oss models across platforms reflects a commitment to improving AI infrastructure and the developer experience.

For more details on the deployment and capabilities of these models, visit the NVIDIA blog.

