DeepSeek Launches LLM V4 with Advanced Features

Global AI Watch · 3 min read · Le Monde Informatique
DeepSeek has introduced its latest LLM V4 models, building on the previous V3.2, which had 685 billion parameters. V4 ships in two distinct beta versions: V4-Pro, with a massive 1.6-trillion-parameter architecture, and V4-Flash, a more cost-effective 284-billion-parameter model. Notably, both versions extend the context window to 1 million tokens, improving their handling of complex queries.

Preliminary benchmarks show V4-Pro performing exceptionally well on reasoning tasks, directly challenging leading proprietary models from the United States. LLM V4 marks a notable technical step in DeepSeek's effort to improve AI capability while minimizing cost: by combining a hybrid attention architecture with a new optimizer called Muon, the company claims to reduce memory footprint significantly, achieving a substantial efficiency gain.

This strategic development may bolster China's position in the global AI landscape, potentially reducing dependence on foreign technology and strengthening national AI autonomy.
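For readers curious about the optimizer mentioned above: Muon's publicly described core idea is to orthogonalize the momentum-accumulated gradient of each 2-D weight matrix before applying it, so every singular direction of the update contributes with roughly equal magnitude. The sketch below is illustrative only: it uses the classic cubic Newton-Schulz iteration for clarity, whereas public Muon implementations use a tuned quintic polynomial, and DeepSeek's exact V4 variant has not been disclosed. All function names and hyperparameters here are assumptions, not DeepSeek's values.

```python
import numpy as np

def orthogonalize(g, steps=20):
    """Approximately orthogonalize a matrix with the classic cubic
    Newton-Schulz iteration (illustrative; public Muon code uses a
    tuned quintic polynomial instead)."""
    # Normalize so the spectral norm is <= 1 (Frobenius bounds spectral),
    # which guarantees the iteration converges.
    x = g / (np.linalg.norm(g) + 1e-7)
    transposed = x.shape[0] > x.shape[1]
    if transposed:  # iterate on the wide orientation
        x = x.T
    for _ in range(steps):
        x = 1.5 * x - 0.5 * (x @ x.T) @ x
    return x.T if transposed else x

def muon_step(w, grad, buf, lr=0.02, beta=0.95):
    """One Muon-style update for a 2-D weight matrix:
    momentum accumulation, then orthogonalization of the buffer."""
    buf = beta * buf + grad
    return w - lr * orthogonalize(buf), buf

# Usage: the orthogonalized update is close to a semi-orthogonal matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 6))
grad = rng.standard_normal((4, 6))
w_new, buf = muon_step(w, grad, np.zeros_like(grad))
u = orthogonalize(grad)
print(np.max(np.abs(u @ u.T - np.eye(4))))  # near 0
```

Because the orthogonalization needs only a few matrix multiplications and a single momentum buffer per weight, this family of optimizers keeps per-parameter state small, which is consistent with the reduced-memory-footprint claim in the announcement.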