The DeepSeek R1 hype explained for normal people: What you need to know
R1 from DeepSeek challenges AI norms with smarter training, rivaling OpenAI’s o1. It’s 18x more efficient, raising questions on scaling, costs, and censorship.
Your entire timeline is suddenly filled with just one model: R1 from DeepSeek. But Large Language Models have been outperforming each other by a few percent in benchmarks almost daily for months now. So what makes this model stand out if it doesn’t surpass OpenAI’s market-leading o1? Is it just because it’s from China? No—its hype comes from delivering nearly the same performance as o1 but at a fraction of the cost. Let’s break it down!
🧠 What Is a Reasoning Model?
R1 is a Reasoning model, similar to OpenAI’s o1. This means it "thinks" before it responds: instead of immediately emitting the most probable answer, it first writes out intermediate reasoning steps (a "chain of thought"), weighing different possibilities before committing to a final answer.
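To make this concrete, here is a minimal Python sketch (not DeepSeek’s code) that separates the "thinking" part of an R1-style response from the final answer. R1 wraps its reasoning in <think> tags; the sample output below is invented for illustration.

```python
import re

# Hypothetical raw output from a reasoning model.
# R1-style models wrap their internal reasoning in <think> ... </think>
# tags before giving the final answer; this example is invented.
raw_output = """<think>
The user asks for 17 * 24. 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.
Double-check: 408 / 24 = 17. Correct.
</think>
17 * 24 = 408"""

def split_reasoning(text: str) -> tuple[str, str]:
    """Separate the hidden 'thinking' part from the visible answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return reasoning, answer

reasoning, answer = split_reasoning(raw_output)
print("Reasoning steps:\n", reasoning)
print("Final answer:", answer)
```

A chat front-end can then decide whether to show the reasoning block to the user or only the final answer.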
🌏 Attention and Censorship
This model is getting a lot of attention, mainly because it came out of nowhere and from China. That naturally raises questions about censorship. Early tests clearly show that R1 is politically aligned on certain topics and refuses critical discussions of them. However, from a purely technical perspective, it doesn’t really matter where R1 comes from or how its content is regulated. The model weights are openly available, and DeepSeek published a paper alongside them detailing how the model was trained. Its handling of ethically sensitive topics could be changed by adjusting the training data and re-training (or fine-tuning) the model. For DeepSeek’s end-user product, however, censorship remains a notable concern.
💡 The Real Innovation
But this is not the main reason DeepSeek is getting so much attention. The groundbreaking development that is shaking up Western stock markets is something else: R1 is significantly more resource-efficient than its competitors, both in training and inference.
There are doubts about whether the reported training cost of just $5.6 million is accurate or whether heavy subsidies were involved; the figure also covers only the final training run, not the research that led up to it. But ultimately, it matters less exactly how much smaller DeepSeek’s budget was. What’s exciting is how technical limitations can drive innovation: due to U.S. export restrictions, China only has access to less powerful NVIDIA GPUs, such as the H800, instead of the high-end H100. Necessity breeds innovation, and the DeepSeek team developed a new training approach.
📉 Breaking the Scaling Laws with Reinforcement Learning
The result is a custom training approach developed by DeepSeek that leans almost entirely on Reinforcement Learning (the paper’s pure-RL variant is called R1-Zero; the released R1 adds only a small supervised "cold start" on top). Until now, this was considered impractical, because training a reasoning model usually involves large amounts of labeled data and a separately trained reward model acting as a coach, both of which are very expensive. Despite skipping most of that, R1 achieves performance comparable to OpenAI’s market-leading o1 model, demonstrating that smarter training techniques can rival brute-force scaling.
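The R1 paper calls this method Group Relative Policy Optimization (GRPO): for each prompt the model samples several answers, scores them with simple rule-based rewards (is the final answer correct? is the reasoning properly formatted?), and judges each sample against the average of its own group rather than against a learned reward model. Here is a minimal Python sketch of that scoring step; it is not DeepSeek’s training code, and the sample completions and reward values are invented for illustration.

```python
from statistics import mean, pstdev

# Illustrative sketch of rule-based rewards and group-relative scoring
# in the spirit of GRPO. Not DeepSeek's code; values are invented.

def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Score a sampled answer with simple rules instead of a learned reward model."""
    reward = 0.0
    if reference_answer in completion:                          # accuracy reward
        reward += 1.0
    if "<think>" in completion and "</think>" in completion:    # format reward
        reward += 0.1
    return reward

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Each sample is judged relative to the other samples for the same prompt."""
    mu, sigma = mean(rewards), pstdev(rewards) or 1.0
    return [(r - mu) / sigma for r in rewards]

# Several answers sampled from the current model for one math prompt ("2 + 2 = ?"):
completions = [
    "<think>2 + 2 = 4.</think> The answer is 4.",
    "The answer is 5.",
    "<think>Adding 2 and 2 gives 4.</think> The answer is 4.",
]
rewards = [rule_based_reward(c, "4") for c in completions]
print(rewards)                                # [1.1, 0.0, 1.1]
print(group_relative_advantages(rewards))     # positive for good answers, negative for bad
```

The actual policy update that uses these advantages is omitted here; the point is that the expensive ingredients (labeled datasets and a coach model) are replaced by cheap, verifiable rules.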
Until now, the common belief was that powerful models could only be improved by following Scaling Laws: the more computing power thrown at training, the better the model. This assumption has fueled NVIDIA’s dominance in the AI market. But R1 seems to challenge it. If LLMs can be improved through smarter training instead of brute-force scaling, this could start a “race to the bottom,” where training becomes ever more efficient and therefore much cheaper.
This is the real reason why the R1 model is so groundbreaking, and why the stock market value of companies like NVIDIA or OpenAI is suddenly being called into question. On the other hand, this could open new markets for low-end GPUs.
OpenAI does not disclose the details of how it trains its reasoning models. It has said that o1 is trained with large-scale Reinforcement Learning, which rewards the model for productive "thinking" during training. Whether it relies exclusively on this method or still depends on costly Supervised Learning remains unknown.
💰 Cost Efficiency Through Mixture of Experts
Another important factor: training AI models is expensive, but so is running them. OpenAI has to maintain enormous server infrastructure just to handle user queries from ChatGPT. This is where R1 shines again. One of R1’s biggest efficiency levers is its Mixture of Experts (MoE) architecture, which splits the model into many smaller, specialized expert networks and routes each token to only a few of them. Although R1 has 671 billion parameters in total, only about 37 billion are active for each generated token. Compared with a dense model of the same size, that cuts the compute per token by roughly a factor of 18 (671 / 37 ≈ 18), making the model far cheaper to run and letting the same hardware serve many times more requests.
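The sketch below shows the core idea of top-k expert routing in a few lines of NumPy. All sizes are tiny and invented for illustration; R1’s real router, expert count, and load-balancing tricks are far more elaborate.

```python
import numpy as np

# Minimal sketch of Mixture-of-Experts routing with top-k gating.
# All sizes are invented for illustration; R1's real configuration is far larger.
rng = np.random.default_rng(0)

n_experts, d_model, top_k = 8, 16, 2
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token through only its top-k experts instead of all of them."""
    logits = token @ router                       # router scores every expert
    chosen = np.argsort(logits)[-top_k:]          # keep only the best-matching experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                      # softmax over the chosen experts
    # Only top_k expert matrices are used, so compute scales with top_k, not n_experts.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)   # (16,): same output shape as a dense layer, less compute
```

Because only the chosen experts do any work, serving cost grows with the number of active parameters per token, not with the total parameter count.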
🔮 Conclusion: A New AI Era?
In the end, R1 demonstrates that the future of AI isn’t just about building bigger models and throwing more computing power at them—it’s also about smarter training techniques. DeepSeek has shown that technical constraints can inspire innovation. Whether this new approach will be widely adopted and how it will impact the market remains to be seen. But one thing is certain: R1 sends a strong message to the AI world and fundamentally challenges previous assumptions about scaling and efficiency.
It’s important to remember that despite the hype around new models, the real value for end-users and businesses will likely come from the Application Layer, where AI models drive concrete business cases, rather than chasing every new model.