
OpenAI, Broadcom Debut Custom ‘Jalapeño’ AI Chip for Faster LLM Inference
OpenAI and Broadcom have unveiled Jalapeño, OpenAI’s first custom AI accelerator chip, designed from the ground up to handle large language model inference rather than adapted from general-purpose hardware. Engineering samples are already running production workloads in the lab, including GPT-5.3-Codex-Spark, at target frequency and power. Early testing shows performance per watt substantially ahead of current state-of-the-art accelerators, though OpenAI says a detailed technical report is months away.
Development was split across three companies. Broadcom handled silicon implementation and networking, while Celestica managed board, rack, and system integration. OpenAI designed the architecture around its own models, kernels, and serving systems, optimizing for the memory movement, networking, and compute patterns specific to frontier LLM inference. Jalapeño is also intended to be compatible with LLMs beyond OpenAI’s own.
“By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access,” said Greg Brockman, president and co-founder, OpenAI.
Broadcom CEO Hock Tan framed the partnership as a long-term infrastructure commitment, with gigawatt-scale data center deployments planned with Microsoft and other partners beginning this year. Jalapeño is the first chip in what both companies describe as a multi-generation roadmap, with subsequent generations already in development.
OpenAI’s own models helped accelerate the chip’s nine-month development cycle, a detail the company highlighted as a proof of concept that AI-assisted chip design can lower the cost of compute across the industry over time.

