DeepSeek-V3.2 vs The Giants
By Vinh Nguyen
Summary
Topics Covered
- Open-Source Closes Performance Gap
- DSA Enables Photographic Memory
- 10% Compute Gambles on Refinement
- Specialist Wins Olympiad Golds
- Democratizes Olympiad-Level AI
Full Transcript
Welcome to the explainer. So, today we're diving into a new open-source AI model that's, well, not just making waves, it's actually challenging the biggest names in the game. And look, this isn't just hype. This comes straight from the official technical report about a special version of this new model, DeepSeek-V3.2, and it's one of the boldest claims I think we've ever seen from an open-source project. Seriously. So, yeah, that immediately brings up the million-dollar question, right? How in the world is a freely available open-source model managing to match the performance of the most advanced, most secretive, and, you know, best-funded AI labs on the entire planet? All right, let's get into it.
Okay, section one: an open-source challenger. To really get what's going on here, we first need to set the scene a little bit. So, the model we're talking about is DeepSeek-V3.2, and it comes from the team at DeepSeek-AI. And their whole goal wasn't just raw power. It was also about incredible efficiency. And that's a huge deal, because efficiency is the one thing that has always held back open-source models. All right.
Next up, the widening AI gap. So, what exactly is the problem that DeepSeek is trying to fix here? Well, it's a big one that the whole open-source community has been facing. You see, for the last several months, a pretty big performance gap has been opening up. On one side, you have these powerful closed models from companies like OpenAI and Google, and on the other, you have the open-source community's models. And the top-tier closed models weren't just getting a little better, they were pulling away fast. So, the DeepSeek team basically nailed it down to three key issues causing this gap. First, a lot of open-source models were stuck on older, kind of clunky architectures that just weren't very efficient. Second, they just weren't spending enough computing power on refining the model after its initial training. And third, they were falling behind in what we call agent capabilities. You know, the ability to actually use tools to get things done.
Okay, part three: DeepSeek's three breakthroughs. So, to try and close this gap, the team knew they couldn't just build a bigger model. No, they engineered three really core breakthroughs to hit those problems head-on. So, here they are. First, something called DeepSeek Sparse Attention, or DSA, for crazy efficiency. Second, a new way to do reinforcement learning that can scale up and make the model way smarter. And third, a method for creating tons of training data for those complex agent tasks. So, let's
look at each one of these. All right, first up is DSA. So, think about it like this: imagine a regular AI trying to read a really long book. It has to reread every single word on every single page, every time. It's super slow, super expensive. DSA, on the other hand, is like giving that AI a photographic memory and a perfect index. It just instantly finds and pulls only the most relevant info, which makes things way faster without losing the plot. And this chart right here from the paper just shows you the impact perfectly. I mean, look at these two curves. The one for the older model, that's its cost, and it just skyrockets as you feed it more text. But the one for DeepSeek-V3.2 with this new DSA stays incredibly low. This is not just a small improvement. This is a total game-changer for making this kind of powerful AI actually affordable.
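To make that "photographic index" intuition concrete, here is a toy sketch of sparse top-k attention in NumPy. This is illustrative only: picking the top-k keys by raw similarity score is an assumption standing in for DSA's actual learned indexer, and every name here is made up for the example.

```python
import numpy as np

def sparse_topk_attention(q, K, V, k=4):
    """Toy sparse attention: one query attends to only its top-k keys.

    Full attention would softmax over all L keys (cost growing with L
    for every query, L^2 overall); here we keep just the k highest-scoring
    keys. This mirrors the *idea* of DSA, not its real selection mechanism.
    """
    scores = K @ q                      # (L,) similarity of the query to every key
    top = np.argsort(scores)[-k:]       # indices of the k most relevant keys
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                        # softmax over the selected keys only
    return w @ V[top]                   # weighted sum of just those values

rng = np.random.default_rng(0)
L, d = 64, 8                            # sequence length, head dimension
q = rng.normal(size=d)
K = rng.normal(size=(L, d))
V = rng.normal(size=(L, d))
out = sparse_topk_attention(q, K, V, k=4)
print(out.shape)                        # (8,)
```

The point of the sketch: the dense path touches all 64 keys, the sparse path softmaxes over only 4, and that per-query saving is what keeps the cost curve in the paper's chart flat as the context grows.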
Okay, the second big breakthrough was a complete shift in their strategy. This is kind of wild. The team took a huge gamble and dedicated more than 10% of their entire compute budget to the post-training phase, not the initial training. This massive investment in just refining the model's brain is something you almost never see in the open-source world. And then finally, they went after that agent problem. See, they didn't just teach the model how to reason about a problem in the abstract. No, they taught it how to actually do something about it. They created over 85,000 complex tasks and 1,800 different tool-use environments, so the model could learn to use things like a web search or a code interpreter to go out and solve problems on its own.
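That tool-use training setup can be pictured as a simple agent loop. The sketch below is a minimal illustration under assumptions: the scripted `fake_model`, the two stub tools, and the message format are hypothetical stand-ins, not DeepSeek's actual environment API.

```python
# Minimal agent-loop sketch: the "model" either requests a tool or answers,
# and each tool result is fed back into the conversation until it finishes.
# Everything here (the tools, the scripted model) is illustrative only.

def web_search(query: str) -> str:
    return f"stub results for: {query}"      # stand-in for a real search API

def code_interpreter(src: str) -> str:
    return str(eval(src))                    # toy interpreter (unsafe for real use)

TOOLS = {"web_search": web_search, "code_interpreter": code_interpreter}

def fake_model(history):
    """Scripted stand-in for the LLM: first asks for a tool, then answers."""
    if not any(turn[0] == "tool" for turn in history):
        return ("tool", "code_interpreter", "6 * 7")
    return ("answer", "The result is 42.")

def run_agent(task: str, max_steps: int = 5):
    history = [("user", task)]
    for _ in range(max_steps):
        move = fake_model(history)
        if move[0] == "answer":
            return move[1]
        _, tool_name, arg = move
        result = TOOLS[tool_name](arg)       # execute the requested tool
        history.append(("tool", result))     # feed the result back to the model
    return "gave up"

print(run_agent("What is 6 * 7?"))
```

In the real setup, `fake_model` would be the LLM proposing tool calls, and each of the 1,800 environments would expose its own tools and success criteria.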
Section four, putting it to the test.
So, you've got this more efficient architecture, a smarter training process, better agent skills, but how does it actually perform? Let's check
out the results. So, the team actually put out two different versions. First,
you have the standard DeepSseek V3.2.
Think of this as the all-arounder, the generalist. It's great at reasoning,
generalist. It's great at reasoning, doing agent tasks, and just having a normal conversation. But then, then
normal conversation. But then, then there's Deepseek V3.2 Specialize. This
one is a laser focused specialist trained only on reasoning data to tackle the absolute hardest problems you can throw at it. And look, the generalist model is already hanging with the best of them. But it's what this specialist
of them. But it's what this specialist special model can do that's where things get really, really interesting. Here's
the headline, the main takeaway.
Deepseek v3.2 special hit gold medal level performance. And not in just any
level performance. And not in just any test, but in some of the toughest competitions for human intelligence on the planet.
I mean, we are not talking about your standard AI benchmarks here. We are talking about getting gold-medal scores on problems from the International Mathematical Olympiad; the International Olympiad in Informatics, that's for computer science; and the ICPC World Finals for programming. These are literally the Olympics for the human mind, the absolute peak. All right, our final section: the future is open.
Because this kind of performance, it's a huge moment. It really brings us to the biggest takeaway of this whole thing and what it might mean for the future of AI. For the first time in what feels like a very long time, we have an open-source model, one that anyone can access, that isn't just playing catch-up with the giants. It's actually beating them in some of the most difficult challenges out there. This isn't just progress, folks. This is like the democratization of top-tier AI. Now, to be fair, the team is totally upfront about the model's limitations. For example, the specialist model often has to kind of think out loud. It generates a lot more text to get to those brilliant answers. So, making it more efficient, improving its intelligence density, as they call it, that's the next big challenge. But that honestly doesn't take away from how huge this achievement is. And that leaves us with a final, really powerful question. For years, right, the absolute cutting edge of AI was locked away inside a few giant companies. But now, now that this level of reasoning power is being released into the wild, into the hands of developers, researchers, and creators everywhere, you just have to wonder: what new applications, what new discoveries, what incredible breakthroughs are we going to see next?