(2025-01-29) ZviM Deepseek Lemon, It's Wednesday
Zvi Mowshowitz: DeepSeek: Lemon, It's Wednesday. It’s been another *checks notes* two days, so it’s time for all the latest DeepSeek news. You can also see my previous coverage of the r1 model and, from Monday, various reactions including the Panic at the App Store.
Table of Contents
- First, Reiterating About Calming Down About the $5.5 Million Number
- OpenAI Offers Its Congratulations
- Scaling Laws Still Apply
- Other r1 and DeepSeek News Roundup
- People Love Free
- Investigating How r1 Works
- Nvidia Chips are Highly Useful
- Welcome to the Market
- Ben Thompson Weighs In
- Import Restrictions on Chips WTAF
- Are You Short the Market
- DeepSeeking Safety
- Mo Models Mo Problems
- What If You Wanted to Restrict Already Open Models.
- So What Are We Going to Do About All This?
First, Reiterating About Calming Down About the $5.5 Million Number
OpenAI Offers Its Congratulations
Altman handled his response to r1 with grace.
OpenAI plans to ‘pull up some new releases.’
Meaning, oh, you want to race? I suppose I’ll go faster and take fewer precautions.
Since it is both free & getting a ton of attention, I think a lot of people who were using free “mini” models are being exposed to what an early 2025 reasoner AI can do & are surprised.
I’m not saying that’s the baseline scenario, but I do expect the world to be quite amazed at the next generation of models, and they could now be more primed for that.
Scaling Laws Still Apply
Should we now abandon all our plans to build gigantic data centers because DeepSeek showed we can run AI cheaper?
No. Of course not. We’ll need more. Jevons Paradox and all that.
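To make the Jevons point concrete, here is a hypothetical back-of-the-envelope sketch (the elasticity number is an assumption for illustration, nothing measured): if inference gets 10x cheaper and demand for intelligence is sufficiently price-elastic, total spending on compute rises rather than falls.

```latex
% Hypothetical Jevons-paradox arithmetic; k and \epsilon are illustrative assumptions.
% p = price per token, q(p) = tokens demanded, S(p) = total spend on inference.
\[
q(p) = k\,p^{-\epsilon}, \qquad S(p) = p \cdot q(p) = k\,p^{1-\epsilon}
\]
\[
\frac{S(p_0/10)}{S(p_0)} = 10^{\,\epsilon-1} > 1 \quad \text{for } \epsilon > 1,
\qquad \text{e.g. } \epsilon = 1.5 \;\Rightarrow\; \text{total spend rises by} \approx 3.2\times
\]
```

Whether demand is actually that elastic is the empirical question, but the arithmetic shows why cheaper inference does not automatically mean less total compute.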
Another question is compute governance. Does DeepSeek’s model prove that there’s no point in using compute thresholds for frontier model governance?
My answer is no. DeepSeek did not mean the scaling laws stopped working. DeepSeek found new ways to scale and economize, and also to distill. But doing the same thing with more compute would have gotten better results.
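As a sketch of what ‘the scaling laws still work’ means here, consider the standard Chinchilla-style loss formula from the scaling-law literature (the functional form and symbols below are the published convention, not anything taken from DeepSeek’s papers):

```latex
% Chinchilla-style scaling law (illustrative; E, A, B, \alpha, \beta are constants fit per model family).
% N = parameter count, D = training tokens, E = irreducible loss.
\[
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
\]
```

On this reading, an efficiency breakthrough effectively buys more usable N and D per dollar (or better constants); it shifts the cost curve but does not flatten it, so the same recipe run with more compute still lands at lower loss.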
The big mystery today in the AI world is why NVIDIA dropped despite R1 demonstrating that GPUs are even more valuable than we thought they were.
Ethan Mollick: The most unnerving part of the DeepSeek reaction online has been seeing folks take it as a sign that AI capability growth is not real.
It signals the opposite: large improvements are possible, and this is almost certain to kick off an acceleration in AI development through competition.
One can think of all this as combining multiple distinct scaling laws. Mark Chen above talked about two axes, but one could point to at least four?
Other r1 and DeepSeek News Roundup
McKay Wrigley emphasizes the point that visible chain of thought (CoT) is a prompt debugger. It’s hard to go back to not seeing CoT after seeing CoT.
People Love Free
We had a brief period where DeepSeek would serve you up r1 free and super fast.
Yes, of course one consideration is that if you use DeepSeek’s app it will collect all your data including device model, operating system, keystroke patterns or rhythms, IP address and so on and store it all in China.
This doesn’t appear to rise to TikTok 2.0 levels of rendering your phone and data insecure, but let us say that ‘out of an abundance of caution’ I will be accessing the model through their website, not the app, thank you very much.
I’m not going so far as to use third party providers for now, because I’m not feeding any sensitive data into the model, and DeepSeek’s implementation here is very nice and clean, so I’ve decided lazy is acceptable. I’m certainly not laying out ~$6,000 for a self-hosting rig, unless someone wants to buy one for me in the name of science.
Investigating How r1 Works
Wordgrammer thread on the DeepSeek technical breakthroughs. Here’s his conclusion, which seems rather overdetermined.
Nvidia Chips are Highly Useful
DeepSeek and what happened yesterday: Probably the largest positive one-day change in the present discounted value of total factor productivity growth in the history of the world.
Welcome to the Market
I would not discount the role of narrative and vibes in all this. I don’t think that’s the whole Nvidia drop or anything.
Palmer Luckey: The markets are not smarter on AI. The free hand is not yet efficient because the number of legitimate experts in the field is near-zero.
The One True Newsletter, Matt Levine’s Money Stuff, is of course on the case of DeepSeek’s r1 crashing the stock market, and asking what cheap inference for everyone would do to market prices. He rapidly shifts focus to non-AI companies, asking which ones benefit. It’s great if you use AI to make your management company awesome, but not if you get cut out because AI replaces your management company.
Ben Thompson Weighs In
He explicitly endorses Jevons Paradox and expects compute use to rise.
Q: So are we close to AGI?
A: It definitely seems like it. This also explains why Softbank (and whatever investors Masayoshi Son brings together) would provide the funding for OpenAI that Microsoft will not: the belief that we are reaching a takeoff point where there will in fact be real returns towards being first.
Masayoshi Son feels the AGI. Masayoshi Son feels everything. He’s king of feeling it. His typical open-model-stanning arguments on existential risk later in the post are, as always, disappointing, but in no way new or unexpected.
Import Restrictions on Chips WTAF
The diffusion regulations are largely there to force companies to build their data centers at home.
we really, really want the data centers at home.
Dhiraj: Taiwan made the largest single greenfield FDI in US history through TSMC. Now, instead of receiving gratitude for helping the struggling US chip industry, Taiwan faces potential tariffs. In his zero-sum worldview, there are no friends.
In this case, what Trump wants is presumably for TSMC to announce they are building more new chip factories in America.
I presume Trump is mostly bluffing, in that he has no intention of actually imposing these completely insane tariffs.
Are You Short the Market
DeepSeeking Safety
It would be a good sign if DeepSeek had a plan for safety, even if it wasn’t that strong?
[DeepSeek] signed the Artificial Intelligence Safety Commitment by CAICT (a government-backed institute). You can see the whale sign at the bottom if you can't read their name in Chinese.
Even fully abiding by commitments like this won’t remotely be enough.
Mo Models Mo Problems
How committed is DeepSeek to its current path?
Q: DeepSeek, right now, has a kind of idealistic aura reminiscent of the early days of OpenAI, and it’s open source. Will you change to closed source later on? Both OpenAI and Mistral moved from open-source to closed-source.
There is a theory that DeepSeek’s ascent took China’s government by surprise, and they had no idea what v3 and r1 were as they were released. Going forward, China is going to be far more aware. In some ways, DeepSeek will have lots of support. But there will be strings attached.
That starts with the ordinary censorship demands of the CCP.
What If You Wanted to Restrict Already Open Models
That does not mean we would have literally zero options.
So What Are We Going to Do About All This?
Adam Ozimek was the first I saw point out this time around with DeepSeek (I and many others echo this a lot in general) that the best way for the Federal Government to ensure American dominance of AI is to encourage more high-skilled immigration and brain drain the world.
Donald Trump: The release of DeepSeek AI from a Chinese company should be a wake-up call for our industries that we need to be laser-focused on competing to win.
We’re going to dominate. We’ll dominate everything.
This is the biggest danger of all - that we go full Missile Gap jingoism and full-on race to ‘beat China,’ and act like we can’t afford to do anything to ensure the safety of the AGIs and ASIs we plan on building.
I believe that alignment, and getting a good outcome for humans, was already going to be very hard. It’s going to be a lot harder if we actively try to get ourselves killed like this.