
DeepSeek V4 is the latest large language model from DeepSeek, the Hangzhou-based Chinese AI startup that disrupted global tech markets last year with its surprisingly capable and cost-efficient R1 reasoning model. Released as a preview on April 24, 2026, V4 arrives in two open-source variants: DeepSeek-V4-Pro, a massive 1.6-trillion-parameter model, and DeepSeek-V4-Flash, a leaner 284-billion-parameter alternative. Both support a one-million-token context window and use a Mixture-of-Experts (MoE) architecture, meaning only a fraction of their total parameters activate for any given task. The result is a pair of models that deliver near-frontier performance at a fraction of the cost charged by U.S. competitors.
In this article, we’ll discuss everything you need to know about the DeepSeek V4 preview release. We’ll cover the technical innovations that set V4 apart from its predecessors, compare its benchmark performance and pricing against leading models from OpenAI, Anthropic, and Google, and examine the broader implications for the AI industry, including the growing competition between Chinese and American labs, DeepSeek’s partnership with Huawei, and the ongoing controversy around model distillation.
TL;DR Snapshot
DeepSeek V4 is the company’s most ambitious model release to date. With its 1.6-trillion-parameter Pro variant (49 billion active per token), it’s the largest open-weight model currently available, and its aggressive pricing undercuts every major frontier model on the market. DeepSeek claims V4-Pro trails the very best closed-source models by only three to six months of development, while outperforming every other open-source model across coding, math, and reasoning benchmarks.
Key takeaways include…
- DeepSeek V4-Pro is the largest open-weight model available, with 1.6 trillion total parameters (49 billion active), a one-million-token context window, and benchmark scores that come within striking distance of GPT-5.4 and Gemini 3.1-Pro.
- V4-Pro’s output pricing sits at $3.48 per million tokens, roughly 7x cheaper than comparable models from Anthropic and OpenAI, while V4-Flash costs just $0.28 per million output tokens.
- The release is notable for its use of Huawei’s Ascend AI chips in training, signaling a potential shift away from reliance on Nvidia hardware and deepening China’s push for AI sovereignty.
Who should read this: AI engineers, startup founders, enterprise decision-makers, and anyone following the U.S.-China technology rivalry.
What’s New Under the Hood: V4’s Architectural Innovations
DeepSeek V4 introduces several meaningful upgrades over its predecessor, V3.2. According to the technical report published on Hugging Face, the model features a hybrid attention mechanism that combines Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA). This design dramatically improves efficiency when handling long-context tasks. At its one-million-token context setting, V4-Pro requires only 27% of the single-token inference compute and 10% of the key-value (KV) cache that V3.2 needed for the same workload.
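To make the scale of those savings concrete, here is a back-of-envelope KV cache calculation. All model dimensions below are hypothetical placeholders, not DeepSeek's actual configuration; only the 10% ratio comes from the report.

```python
# Back-of-envelope KV cache sizing at a one-million-token context.
# Model dimensions are hypothetical; only the 10% ratio is from the report.

def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Standard (uncompressed) KV cache: two tensors, K and V, each of
    shape [seq_len, n_kv_heads, head_dim] per layer, stored in fp16/bf16."""
    return 2 * seq_len * n_layers * n_kv_heads * head_dim * bytes_per_elem

# A hypothetical dense-attention baseline at a one-million-token context:
baseline = kv_cache_bytes(seq_len=1_000_000, n_layers=60, n_kv_heads=8, head_dim=128)
compressed = baseline * 0.10  # the 10%-of-V3.2 figure reported for V4-Pro

print(f"baseline KV cache:   {baseline / 1e9:.1f} GB")    # 245.8 GB
print(f"compressed KV cache: {compressed / 1e9:.1f} GB")  # 24.6 GB
```

At these (made-up) dimensions, a dense cache would consume hundreds of gigabytes per sequence, which is why compressing it by 10x is the difference between a million-token context being impractical and merely expensive.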
The model also introduces what DeepSeek calls Manifold-Constrained Hyper-Connections (mHC), a technique designed to strengthen the conventional residual connections found in transformer architectures. The goal is to improve the stability of signal propagation across the model’s layers without sacrificing expressivity. Both V4 variants were pre-trained on more than 32 trillion diverse tokens, a considerable step up from earlier iterations.
Post-training follows a two-stage approach. First, domain-specific experts are cultivated through supervised fine-tuning and reinforcement learning using a technique called GRPO. Then, those specialists are consolidated into a single unified model through on-policy distillation, integrating proficiencies across coding, math, reasoning, and general knowledge into one cohesive system.
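GRPO (Group Relative Policy Optimization) is DeepSeek's reinforcement-learning method; its defining trick is scoring a group of sampled responses per prompt and normalizing rewards within the group, so no separate learned value function is needed. A minimal sketch of that advantage computation (illustrative only, not the full training loop):

```python
# Group-relative advantages, the core of GRPO: each sampled response is
# judged against the other samples for the same prompt. Illustrative only.

from statistics import mean, pstdev

def grpo_advantages(group_rewards, eps=1e-8):
    """Normalize each reward against its group's mean and std deviation."""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Four sampled answers to one prompt, scored by a reward model:
print(grpo_advantages([1.0, 0.0, 0.5, 0.5]))
# best answer gets a positive advantage, worst a negative one,
# average answers land near zero
```

The consolidated-model step then uses on-policy distillation: the unified student generates its own outputs and is trained toward the domain experts' corrections, rather than passively imitating a fixed dataset.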
Benchmark Performance and Pricing: The Numbers That Matter

The headline story for V4 is its price-to-performance ratio. According to a detailed benchmark review from Build Fast with AI, V4-Pro scores 80.6% on SWE-bench Verified, landing within 0.2 points of Claude Opus 4.6, and it actually outperforms Claude on Terminal-Bench 2.0 (67.9% vs 65.4%) and LiveCodeBench (93.5% vs 88.8%). The Pro model’s Codeforces rating reportedly reaches 3206, a score that on Codeforces’ human ladder corresponds to the top Legendary Grandmaster tier, reached by only a handful of competitive programmers worldwide.
That said, V4 doesn’t top every leaderboard. Google’s Gemini 3.1-Pro and OpenAI’s GPT-5.4 maintain leads on knowledge-heavy benchmarks. DeepSeek’s own tech report acknowledges this candidly, as reported by Fortune, noting that V4’s trajectory trails frontier models by roughly three to six months.
Where V4 really turns heads is on cost. V4-Pro charges $3.48 per million output tokens and $1.74 per million input tokens. As TechCrunch noted, this undercuts GPT-5.4, Gemini 3.1-Pro, and Claude Opus. V4-Flash is even more aggressive at just $0.14 per million input tokens and $0.28 per million output tokens, making it cheaper than even OpenAI’s smallest model, GPT-5.4 Nano. For teams running high-volume coding agents or other inference-heavy workloads, these numbers change the math considerably.
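The rates quoted above are easy to turn into workload math. The sketch below uses the per-token prices from this article; the traffic volumes are hypothetical, chosen only to illustrate a high-volume agent workload.

```python
# Monthly cost at the quoted V4 rates (USD per million tokens).
# Workload volumes are hypothetical; only the rates come from the article.

PRICES = {  # model: (input $/M tokens, output $/M tokens)
    "V4-Pro":   (1.74, 3.48),
    "V4-Flash": (0.14, 0.28),
}

def monthly_cost(model, input_tokens, output_tokens):
    """Total monthly spend for a given token volume on one model."""
    inp_rate, out_rate = PRICES[model]
    return (input_tokens / 1e6) * inp_rate + (output_tokens / 1e6) * out_rate

# A hypothetical coding agent consuming 2B input and 500M output tokens/month:
print(f"V4-Pro:   ${monthly_cost('V4-Pro',   2e9, 5e8):,.2f}")   # $5,220.00
print(f"V4-Flash: ${monthly_cost('V4-Flash', 2e9, 5e8):,.2f}")   # $420.00
```

At a nominal 7x multiple, the same Pro-tier workload on a comparably priced U.S. frontier model would run well past $35,000 a month, which is the gap that makes these prices newsworthy.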
Both models are text-only at this time. Unlike many closed-source competitors that offer multimodal capabilities (audio, video, and image understanding), V4 sticks to text, which is worth noting for teams evaluating it for multimodal use cases.
The Huawei Factor: Building AI Without Nvidia
One of the most consequential aspects of V4’s release has nothing to do with benchmarks or pricing. According to CNN, DeepSeek partnered with Huawei for V4’s computing needs, relying on Huawei’s Ascend 950 chips and “Supernode” technology to provide the necessary compute clusters. This represents a significant departure from earlier models. DeepSeek’s R1, the model that rocked markets in early 2025, was trained primarily on Nvidia hardware.
Washington’s export controls have restricted Chinese companies from purchasing Nvidia’s most advanced AI chips, and Beijing has been actively pushing tech companies to adopt domestic alternatives. Wei Sun, principal analyst at Counterpoint Research, told CNN that V4’s ability to run natively on local chips could have major implications for AI development globally, helping Beijing pursue greater AI sovereignty while reducing dependence on Nvidia.
Huawei confirmed on Friday that its latest AI computing cluster supports DeepSeek V4. However, as Fortune reported, it remains unclear exactly how extensively Huawei’s chips were used in the full training process compared to Nvidia hardware.
The Competitive Landscape: A Crowded Market With Higher Stakes

V4 arrives into a much more competitive environment than the one R1 disrupted. As CNBC reported, Chinese players like Alibaba, ByteDance, Moonshot AI, and MiniMax have all released strong models in 2026, and DeepSeek now frames these companies as direct competitors. Ivan Su, senior equity analyst at Morningstar, told CNBC that V4 probably won’t create the same market shock as R1, because investors have already priced in the reality that Chinese AI is competitive and cheaper.
The timing is also notable. OpenAI released GPT-5.5 just one day before DeepSeek’s V4 announcement. Meanwhile, the White House accused Chinese entities of conducting industrial-scale distillation attacks against American AI companies. Both OpenAI and Anthropic have publicly alleged that DeepSeek and other Chinese developers have extracted capabilities from their proprietary models. China’s foreign ministry called these claims groundless.
Despite the controversy, V4 reinforces a pattern that’s hard to ignore: open-source models from China continue to close the gap with the best proprietary models from the U.S., and they’re doing it at dramatically lower price points. Whether that gap continues to narrow will be one of the defining questions of the AI industry over the next year.
Frequently Asked Questions
What is DeepSeek?
DeepSeek is a Chinese AI startup based in Hangzhou, founded in 2023. It’s owned by High-Flyer, a Chinese hedge fund. DeepSeek gained international attention in late 2024 and early 2025 with the release of its V3 and R1 models, which demonstrated frontier-level AI performance at significantly lower costs than competing U.S. models. The company follows an open-source approach, publishing its model weights under the MIT License.
What is a Mixture-of-Experts (MoE) architecture?
Mixture-of-Experts is a neural network design where a model contains many “expert” sub-networks, but only a small subset of them activates for any given input. This approach allows a model to have a very large total parameter count (giving it broad capability) while keeping the computational cost of each individual inference relatively low, since only the active parameters are used. Both V4-Pro (49B active out of 1.6T total) and V4-Flash (13B active out of 284B total) use this approach.
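A toy top-k router makes the "only a few experts fire" idea concrete. Real MoE layers route every token through learned gating networks; this sketch just shows the selection-and-mixing step with made-up scores and scalar expert outputs.

```python
# Toy top-k MoE routing: pick the k highest-scoring experts and mix their
# outputs by softmax weight over the selected logits. Scores are made up.

import math

def top_k_moe(gate_logits, expert_outputs, k=2):
    """Select the top-k experts by gate logit, then return their outputs
    combined with softmax weights computed over just those k logits."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    exps = [math.exp(gate_logits[i]) for i in top]
    total = sum(exps)
    return sum((e / total) * expert_outputs[i] for e, i in zip(exps, top))

# 8 experts with scalar outputs; only experts 1 and 4 contribute:
logits = [0.1, 2.0, -1.0, 0.3, 1.5, -0.2, 0.0, 0.4]
outputs = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
print(top_k_moe(logits, outputs, k=2))  # weighted blend of 20.0 and 50.0
```

With 8 experts and k=2, only a quarter of the expert parameters run per token; scale that idea up and you get V4-Pro's 49B-active-of-1.6T ratio.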
What is a context window?
A context window refers to the maximum amount of text (measured in tokens) that an AI model can process in a single interaction. DeepSeek V4 supports a one-million-token context window, roughly equivalent to 750,000 words. A larger context window lets users feed the model longer documents, bigger codebases, or more extended conversation histories without losing earlier information.
What are Huawei’s Ascend chips?
Huawei’s Ascend chips are AI processors designed by the Chinese tech giant as an alternative to Nvidia’s GPUs for training and running AI models. With U.S. export controls restricting Chinese companies’ access to Nvidia’s most powerful chips, Huawei’s Ascend line has become a key part of China’s effort to build a self-sufficient AI hardware supply chain. DeepSeek used Huawei’s Ascend 950 chips and Supernode clustering technology to support V4’s compute needs.
What is model distillation?
Model distillation is a technique where a smaller model is trained to replicate the behavior of a larger, more capable model. In the context of the U.S.-China AI rivalry, American companies like OpenAI and Anthropic have accused Chinese developers, including DeepSeek, of using distillation to extract capabilities from their proprietary models. The method typically involves feeding a frontier model thousands of prompts, collecting its responses, and using that data to train a new model. The practice is a source of ongoing geopolitical tension.
What is SWE-bench?
SWE-bench is a benchmark designed to evaluate AI models on real-world software engineering tasks. It tests a model’s ability to resolve actual GitHub issues from popular open-source repositories. SWE-bench Verified is a curated subset of the benchmark that focuses on confirmed, well-defined issues. DeepSeek V4-Pro scored 80.6% on SWE-bench Verified, placing it close to the top of current model rankings.
