Chinese AI startup DeepSeek has presented its own large language model, which outperformed competing models from Meta and OpenAI in benchmark tests.
🚀 Introducing DeepSeek-V3!
Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers
🐋 1/n pic.twitter.com/p1dV9gJ2Sd
— DeepSeek (@deepseek_ai) December 26, 2024
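The "API compatibility intact" point means existing V2 integrations should keep working unchanged. As a minimal sketch (assuming DeepSeek's OpenAI-compatible endpoint at https://api.deepseek.com and the model name `deepseek-chat`, which are not stated in the article; check DeepSeek's docs for current values), a call might look like this:

```python
# Minimal sketch of calling DeepSeek V3 via its OpenAI-compatible API.
# Assumptions (not from the article): base URL https://api.deepseek.com
# and model name "deepseek-chat".
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",  # issued via DeepSeek's developer platform
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # assumed name of the V3 chat model
    messages=[{"role": "user", "content": "Summarize DeepSeek-V3 in one sentence."}],
)
print(response.choices[0].message.content)
```

Because the endpoint mimics OpenAI's API, only the base URL, API key, and model name need to change in existing client code.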
DeepSeek V3 has 671 billion parameters; for comparison, Meta's Llama 3.1 405B has 405 billion. Parameter count is a rough proxy for a model's capacity to handle more complex tasks and produce more accurate answers. (DeepSeek V3 is a mixture-of-experts model, however, so only about 37 billion of those parameters are active for any given token.)
Comparison of DeepSeek V3 with competitors. Source: DeepSeek.
The Hangzhou-based company trained the neural network in two months for $5.58 million, using significantly fewer computing resources (2,048 GPUs) than larger tech companies typically deploy. It promises the best price-to-quality ratio on the market.
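A rough back-of-the-envelope check of those figures (the day count and the assumption of near-full GPU utilization are illustrative, not from the article):

```python
# Back-of-the-envelope check of the reported training budget.
# Assumptions (not from the article): ~61 days of wall-clock time
# ("two months") and near-full utilization of all 2,048 GPUs.
gpus = 2048
days = 61
gpu_hours = gpus * days * 24   # ~3.0 million GPU-hours
cost_usd = 5_580_000

print(f"GPU-hours: {gpu_hours:,}")                            # 2,998,272
print(f"Implied rate: ${cost_usd / gpu_hours:.2f}/GPU-hour")  # ~$1.86
```

An implied rate of under $2 per GPU-hour is consistent with the overall claim of a very lean training budget.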
💰 API Pricing Update
🎉 Until Feb 8: same as V2!
🤯 From Feb 8 onwards:
Input: $0.27/million tokens ($0.07/million tokens with cache hits)
Output: $1.10/million tokens
🔥 Still the best value in the market!
🐋 3/n pic.twitter.com/OjZaB81Yrh
— DeepSeek (@deepseek_ai) December 26, 2024
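To put the post-February-8 rates from the tweet in concrete terms, a small cost calculator (the function name and example request sizes are hypothetical):

```python
# Cost of a single API call at the post-Feb-8 rates listed above.
INPUT_PER_M = 0.27         # USD per million input tokens (cache miss)
INPUT_CACHED_PER_M = 0.07  # USD per million input tokens (cache hit)
OUTPUT_PER_M = 1.10        # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int, cached: bool = False) -> float:
    """Return the USD cost of one call; token counts here are hypothetical."""
    in_rate = INPUT_CACHED_PER_M if cached else INPUT_PER_M
    return (input_tokens * in_rate + output_tokens * OUTPUT_PER_M) / 1_000_000

# Example: a 2,000-token prompt with a 500-token reply.
print(f"${request_cost(2_000, 500):.6f}")  # $0.001090 without cache hits
```

At these rates, even a long prompt with a substantial reply costs roughly a tenth of a cent.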
Multimodality and “other advanced features” are planned for future releases.
Former OpenAI researcher Andrej Karpathy noted that DeepSeek had demonstrated impressive research and engineering under tight resource constraints.
DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M).
For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being… https://t.co/EW7q2pQ94B
— Andrej Karpathy (@karpathy) December 26, 2024
“Does that mean you don’t need large GPU clusters for frontier LLMs? No, but you have to make sure you don’t waste what you have. This looks like a good demonstration that there is still a lot to be done with both data and algorithms,” he added.
Earlier, DeepSeek introduced DeepSeek-R1-Lite-Preview, a “super powerful” reasoning model positioned as a competitor to OpenAI’s o1.
As a reminder, in July the Chinese company Kuaishou opened its Kling video-generation AI model to all users.