(2025-01-28) ZviM Deepseek Panic At The App Store
Zvi Mowshowitz: DeepSeek Panic at the App Store. DeepSeek released v3. Market didn’t react.
DeepSeek released r1. Market didn’t react.
DeepSeek released a f**ing app of its website. Market said I have an idea, let’s panic.
Nvidia was down 11%, the Nasdaq down 2.5%, and the S&P down 1.7% on the news.
Shakeel: The fact this is happening today, and didn’t happen when r1 actually released last Wednesday, is a neat demonstration of how the market is in fact not efficient at all.
That is exactly the market’s level of situational awareness. No more, no less.
To avoid confusion: r1 is clearly a pretty great model. It is the best by far available at its price point, and by far the best open model of any kind. I am currently using it for a large percentage of my AI queries.
Table of Contents
- Current Mood
- DeepSeek Tops the Charts
- Why Is DeepSeek Topping the Charts?
- What Is the DeepSeek Business Model?
- The Lines on Graphs Case for Panic
- Everyone Calm Down About That $5.5 Million Number
- Is The Whale Lying?
- Capex Spending on Compute Will Continue to Go Up
- Jevons Paradox Strikes Again
- Okay, Maybe Meta Should Panic
- Are You Short the Market
- o1 Versus r1
- Additional Notes on v3 and r1
- Janus-Pro-7B Sure Why Not
- Man in the Arena
- Training r1, and Training With r1
- Also Perhaps We Should Worry About AI Killing Everyone
- And We Should Worry About Crazy Reactions To All This, Too
- The Lighter Side
Current Mood
DeepSeek Tops the Charts
Lots of people are rushing to download the DeepSeek app.
Switching costs are basically zero.
Then regular people started to notice DeepSeek.
DeepSeek’s mobile app has entered the top 10 of the U.S. App Store.
It’s getting ~300k global daily downloads.
Claude had ~300k downloads last month, but that’s a lot less than 300k per day.
Then it went all the way to #1 on the iPhone App Store.
Kevin Xu: Two weeks ago, RedNote topped the download chart.
Today, it’s DeepSeek.
Why Is DeepSeek Topping the Charts?
Because:
- It’s completely free.
- It has no ads.
- It’s a damn good model, sir.
- It lets you see the chain of thought, which is a lot more interesting and fun, and also inspires trust.
- All the panic about it only helped people notice, getting it on the news and so on.
- It’s the New Hotness that people hadn’t downloaded before, and that everyone is talking about right now because see the first five.
- No, this mostly isn’t about ‘people don’t trust American tech companies but they do trust the Chinese.’ But there aren’t zero people who are wrong enough to think this way, and China actively attempts to cultivate this including through TikTok.
- The Open Source people are also yelling about how this is so awesome and trustworthy and virtuous and so on, and being even more obnoxious than usual, which may or may not be making any meaningful difference.
I suspect we shouldn’t be underestimating the value of showing the CoT here, as I also discuss elsewhere in the post.
That doesn’t mean it is ‘worth’ sharing the CoT, even if it adds a lot of value – it also reveals a lot of valuable information, including as part of training another model. So the answer isn’t obvious.
What Is the DeepSeek Business Model?
Meta is pursuing open weights primarily because they believe it maximizes shareholder value. DeepSeek seems to be doing it primarily for other reasons.
Several possibilities. The most obvious ones are, in some combination:
- They don’t need a business model. They’re idealists looking to give everyone AGI.
- They’ll pivot to the standard business model, same as everyone else.
From where I sit, there is very broad uncertainty about which of these dominates, or will dominate in the future, no matter what they believe about themselves today.
I also believe they intend to build and open source AGI.
The CCP is doubtless all for DeepSeek having a hit app. And they’ve been happy to support open source in places where open source doesn’t pose existential risks.
That’s very different from an intent to open source AGI. China’s strategy on AI regulation so far has focused on content moderation for topics they care about. That approach won’t stay compatible with their objectives over time.
The question now becomes: “What countermove will the CCP make now?”
The CCP wants to stay in control. What DeepSeek is doing is incompatible with that. If they are not simply asleep at the wheel, they understand this.
The Lines on Graphs Case for Panic
Yishan takes the opposite perspective, seeing newcomers like DeepSeek who come out with killer products like this as being on steep upward trajectories, similar to Internet Explorer 3, Firefox, or the iPhone 1.
I think the example list here illustrates why I think DeepSeek probably (but not definitely) doesn’t belong on that list.
To those who are saying that ‘China has won’ or ‘China is in the lead now,’ or other similar things: seriously, calm the **** down.
No, China isn’t suddenly the one with the engineering secrets.
I worry about this because I worry about a jingoist ‘we must beat China and we are behind’ reaction causing the government to do some crazy ass stuff that makes us all much more likely to get ourselves killed, above and beyond what has already happened. There’s a lot of very strong Missile Gap vibes here.
Everyone Calm Down About That $5.5 Million Number
Dean Ball offers notes on DeepSeek and r1 in the hopes of calming people down. Because we have such different policy positions yet see this situation so similarly, I’m going to quote him in full.
I fully agree with #1 through #6.
I can verify the bet in #7 was very on point.
I think ‘there are no moats’ is too strong.
For #8 we of course have our differences on regulation, but we do agree on a lot of this. Dean doubtless would count a lot more things as ‘awful state laws’ than I would, but we agree that the proposed Texas law would count. At this point, given what we’ve seen from the Trump administration, I think our best bet is the state law path.
Nabeel Qureshi: Everyone is way overindexing on the $5.5m final training run number from DeepSeek.
- GPU capex probably $1BN+
- Running costs are probably $X00M+/year
- ~150 top-tier authors on the v3 technical paper, $50m+/year
Danielle Fong: When Tesla claimed that they were going to have batteries < $100/kWh, practically all funding for American energy storage companies tanked.
Tesla still won’t sell you a Powerwall or Powerpack for $100/kWh. It’s like $1000/kWh, and $500/kWh for a Megapack.
The entire VC sector in the US was bluffed and spooked by Elon Musk. Don’t be stupid in this way again.
Is it impressive that they (presumably) did the final training run with only $5.5M in direct compute costs? Absolutely. Is it impressive that they’re relevant while plausibly spending only hundreds of millions per year total instead of tens of billions? Damn straight.
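To make that order-of-magnitude point concrete, here is a minimal back-of-the-envelope sketch in Python. Only the $5.5 million final-run figure comes from DeepSeek’s own reporting; the capex, amortization, running-cost, and payroll numbers below are illustrative assumptions in the spirit of Nabeel’s guesses, not known values.

```python
# Back-of-the-envelope: final training run vs. total annual spend.
# All figures below except FINAL_RUN are illustrative assumptions.

FINAL_RUN = 5.5e6          # reported direct compute cost of the v3 final run

gpu_capex = 1.0e9          # assumed cluster capex (in the spirit of "$1BN+")
capex_years = 4            # assumed amortization period
running_costs = 3.0e8      # assumed power/ops/inference spend per year
payroll = 5.0e7            # assumed payroll for ~150 top-tier authors

annual_total = gpu_capex / capex_years + running_costs + payroll
print(f"Assumed annual spend: ${annual_total / 1e6:.0f}M")
print(f"Final run as share of that: {FINAL_RUN / annual_total:.1%}")
# With these assumptions the $5.5M run is on the order of 1% of annual spend.
```

Under assumptions like these, the headline number is roughly one percent of what it costs to be DeepSeek for a year, which is the point.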
Is The Whale Lying?
The following all seem clearly true:
- A lot of this is based on misunderstanding the ‘$5.5 million’ number.
- People have strong motive to engage in baseless cope around DeepSeek.
- DeepSeek had strong motive to lie about its training costs and methods.
They are very importantly being constrained by access to compute, even if they’ve smuggled in a bunch of chips they can’t talk about. As Tim Fist points out, the export controls have been tightened, so they’ll have more trouble accessing the next generations than they are having now; no, this did not stop being relevant, and they risk falling rather far behind.
Capex Spending on Compute Will Continue to Go Up
We’re barely one new chip generation into the export controls, so it’s not surprising China “caught up.” The controls will only really start to bind and drive a delta in the US-China frontier this year and next.
DeepSeek is not a Sputnik moment. Their models are impressive but within the envelope of what an informed observer should expect.
Imagine if US policymakers responded to the actual Sputnik moment by throwing their hands in the air and saying, “ah well, might as well remove the export controls on our satellite tech.” Would be a complete non-sequitur.
In my opinion, open-source models are a bit of a red herring on the path to acceptable ASI futures. Free model weights still do not distribute power to all of humanity; they distribute it to the compute-rich.
I don’t think Roon is right that it matters ‘even more,’ and I think who has what access to the best models for what purposes is very much not a red herring, but compute definitely still matters a lot in every scenario that involves strong AI.
Imagine if the ones going ‘I suppose we should drop the export controls then’ or ‘the export controls only made us stronger’ were mostly the ones looking to do the importing and exporting. Oh, right.
Jevons Paradox Strikes Again
What does this mean for Nvidia?
DeepSeek is definitely not at ‘run on your laptop’ level, and these are reasoning models, so when we first crack AGI or otherwise want the best results, I am confident you will want to be using some GPUs or other high-powered hardware, even if lots of other AI is also happening locally.
Does the Jevons Paradox (which is not really a paradox at all, but hey) apply here to Nvidia in particular?
I believe it will on net drive demand up rather than down, although I also think Nvidia would have been able to sell as many chips as it can produce either way, given the way it has decided to set prices.
You could reach the opposite conclusion if you think that there is a rapidly approaching limit to how good AI can be, and that throwing more compute at training or inference won’t improve that by much.
That’s a view that doesn’t believe in AGI, let alone ASI, and likely doesn’t even factor in what current models (including r1!) can already do.
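A minimal sketch of the Jevons logic, with purely illustrative numbers (the efficiency gain and demand multiplier are assumptions, not forecasts): total spend rises whenever demand grows faster than cost per query falls.

```python
# Illustrative Jevons-style arithmetic (all numbers are assumptions).
# If cost per query falls 10x and usage rises more than 10x in response,
# total compute spend goes up, not down.

cost_per_query_before = 1.00   # assumed baseline cost (arbitrary units)
queries_before = 1_000_000     # assumed baseline demand

efficiency_gain = 10           # cost per query falls 10x
demand_multiplier = 25         # assumed demand response to cheaper queries

cost_per_query_after = cost_per_query_before / efficiency_gain
queries_after = queries_before * demand_multiplier

spend_before = cost_per_query_before * queries_before
spend_after = cost_per_query_after * queries_after

print(spend_before, spend_after)  # 1,000,000 vs 2,500,000: total spend rises 2.5x
```

The hinge is whether the demand multiplier exceeds the efficiency gain; the ‘approaching limit to how good AI can be’ view is effectively a bet that it will not.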
Okay, Maybe Meta Should Panic
Or at least, if you’re in their GenAI department, you should definitely panic.
It should have been an engineering-focused small organization, but since a bunch of people wanted to join for the impact and artificially inflate hiring in the organization, everyone loses.
It also illustrates why DeepSeek may have made a major mistake revealing as much information as it did, but then again if they’re not trying to make money and instead are driven by ideology of ‘get everyone killed’ (sorry I meant to say ‘open source AGI’) then that is a different calculus than Meta’s.
Are You Short the Market
If you were short on Friday, you’re rather happy about that now. Does it make sense?
The timing is telling. To the extent this does have impact, all of this really should have been mostly priced in.
There are obvious reasons to think this is rather terrible for OpenAI in particular, although it isn’t publicly traded, because a direct competitor is suddenly putting up some very stiff new competition, and also the price of entry for other competition just cratered, and more companies could self-host or even self-train.
I totally buy that.
That same logic goes for other frontier labs like Anthropic or xAI, and to Google and Microsoft and everyone else.
This is obviously potentially bad for Meta, since Meta’s plan involved being the leader in open models, and they’ve been informed they’re not the leader in open models.
This is obviously bad for existential risk, but I have not seen anyone else even joke about the idea that this could be explaining the decline in the market. The market does not care or think about existential risk, at all.
My diagnosis is that this is about, fundamentally, ‘the vibes.’ It’s about Joe’s sixth point and investor MOMO and FOMO.
I do know I will continue to be long, and I bought more Nvidia today.
o1 Versus r1
Rynuck: Now when it comes to prompting these models, I suspected it with O1 but R1 has completely proven it beyond a shadow of a doubt: prompt engineering is more important than ever.
We can see now with R1’s reasoning that these models are like a probe that you send down some “idea space”. If your idea-space is undefined and too large, it will diffuse its reasoning and not go into depth on one domain or another.
It’s still early, but for now I would say that R1 is perhaps a little bit weaker with coding.
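A minimal sketch of the ‘idea space’ point, assuming DeepSeek’s OpenAI-compatible API (the base URL and the deepseek-reasoner model name are taken from their documentation at the time and may change): a tightly scoped prompt bounds the region the reasoning probe explores, while a broad one diffuses it.

```python
# Sketch: scoping a prompt for a reasoning model (r1 via DeepSeek's
# OpenAI-compatible API). Endpoint and model name are assumptions based
# on DeepSeek's docs at the time of writing and may change.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

broad = "Tell me about improving database performance."
narrow = (
    "Our Postgres 15 instance spends 80% of query time on a single "
    "JOIN between orders (50M rows) and customers (2M rows). "
    "Propose and compare exactly three indexing or query-rewrite fixes, "
    "with expected trade-offs."
)

# The narrow prompt bounds the idea space, so the chain of thought goes
# deep on one domain instead of diffusing across many.
resp = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": narrow}],
)
print(resp.choices[0].message.content)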
Additional Notes on v3 and r1
Teortaxes: What I also love about R1 is it gives no fucks about the user – only the problem. It’s not sycophantic, like, at all, autistic in a good way; it will play with your ideas, it won’t mind if you get hurt. It’s your smart helpful friend who’s kind of a jerk. Like my best friends.
Give me NYC Nice over SF Nice every time.
Janus-Pro-7B Sure Why Not
Man in the Arena
Needless to say, the details of these ratings are increasingly absurdist.
It’s still not nothing – this list does tend to put better things ahead of worse things, even with large error bars.
Training r1, and Training With r1
Peter Schmidt-Nielsen explains why r1 and its distillations, or going down the o1 path, are a big deal – if you can go on a loop of generating expensive thoughts then distilling them to create slightly better quick thoughts, which in turn generate better expensive thoughts, you can potentially bootstrap without limit into recursive self-improvement. And end the world. Whoops.
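A conceptual sketch of that loop in Python, with every function a stand-in stub rather than anyone’s actual training pipeline; the point is only the shape of the bootstrap.

```python
# Conceptual sketch of the expensive-thoughts -> distillation bootstrap
# loop described above. Every function is a stand-in stub, not a real
# training pipeline.

def generate_long_reasoning(model, problem):
    # Stand-in for spending lots of inference compute on a long chain of thought.
    return {"problem": problem, "trace": f"{model} reasons at length about {problem}"}

def verify(trace):
    # Stand-in for verifiers, unit tests, or graders that filter bad traces.
    return True

def finetune(model, traces):
    # Stand-in for distilling verified traces back into faster weights.
    return f"{model}+distilled({len(traces)} traces)"

def bootstrap(fast_model, problems, rounds=3):
    for _ in range(rounds):
        expensive_traces = [generate_long_reasoning(fast_model, p) for p in problems]
        good_traces = [t for t in expensive_traces if verify(t)]
        fast_model = finetune(fast_model, good_traces)
        # The slightly better fast model now produces better expensive
        # thoughts next round, which distill into an even better fast model.
    return fast_model

print(bootstrap("base-reasoner", ["prove X", "solve Y"]))
```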
Also Perhaps We Should Worry About AI Killing Everyone
As in, DeepSeek intends to create and then open source AGI.
How do they intend to make this end well?
As far as we can tell, they don’t. The plan is Yolo.
These people really think that the best thing humanity can do is create things smarter than ourselves with as many capabilities as possible, make them freely available to whoever wants one, and see what happens, and assume that this will obviously end well and anyone who opposes this plan is a dastardly villain.
And We Should Worry About Crazy Reactions To All This, Too
The government could well decide to go down what is not technologically an especially wise or pleasant path. There is a long history of the government attempting crazy interventions into tech, or what looks crazy to tech people, when they feel national security or public outrage is at stake, or in the EU because it is a day that ends in Y.
The United States could also go into full jingoism mode. Some tried to call this a ‘Sputnik moment.’ What did we do in response to Sputnik, in addition to realizing our science education might suck (and if we decide to respond to this by fixing our educational system, that would be great)? We launched the Space Race and spent 4% of GDP or something to go to the moon and show those communist bastards.
What I worry about is the opposite – that this locks us into a mindset of a full-on ‘race to AGI’ that causes all costly attempts to have it not kill us to be abandoned, and that this accelerates the timeline. We already didn’t have any (known to me) plans with much of a chance of working in time, if AGI and then ASI are indeed near.