News
German firm TNG has released DeepSeek-TNG R1T2 Chimera, an open-source variant twice as fast as its parent model thanks to a ...
This gain is made possible by TNG’s Assembly-of-Experts (AoE) method, a technique for building LLMs by selectively merging the weight tensors ...
The updated version of DeepSeek-R1 tied for first place with Google’s Gemini-2.5 and Anthropic’s Claude Opus 4 on the WebDev Arena leaderboard, which evaluates large language models (LLMs) on ...
Say hello to DeepSeek-TNG R1T2 Chimera, a large language model built by German firm TNG Technology Consulting, using three different ...
DeepSeek quietly updates its R1 reasoning model
Chinese AI startup DeepSeek has released an update to its R1 reasoning model. The new version, named R1-0528, was published on developer platform ...
Chinese AI upstart MiniMax released a new large language model, joining a slew of domestic peers inspired to surpass DeepSeek in the field of reasoning AI.
Benchmark results cited by DeepSeek show that R1-0528 now surpasses Alibaba’s Qwen 3 and matches the performance of OpenAI’s and Google’s best models.
A new report says that Huawei's CloudMatrix 384 outperforms Nvidia processors running DeepSeek R1, which is to be expected given the energy use involved.