(2025-06-05) ZviM AI #119 Goodbye AISI
Zvi Mowshowitz: AI #119: Goodbye AISI? AISI is being rebranded highly non-confusingly as CAISI. Is it the end of AISI and a huge disaster, or a tactical renaming to calm certain people down? Hard to tell. It could go either way.
Table of Contents
- Language Models Offer Mundane Utility. The white whale.
- Language Models Don’t Offer Mundane Utility. You need a system prompt.
- Language Models Could Offer More Mundane Utility. A good set of asks.
- Huh, Upgrades. The highlight is Cursor 1.0, with memory and more.
- Fun With Media Generation. Video is high bandwidth. But also low bandwidth.
- Choose Your Fighter. Opinions differ, I continue to mostly be on Team Claude.
- Deepfaketown and Botpocalypse Soon. Fake is not a natural category. Whoops.
- Get My Agent On The Line. We all know they’re not secure, but how bad is this?
- They Took Our Jobs. Economists respond to Dario’s warning.
- The Art of the Jailbreak. Why not jailbreak AI overviews?
- Unprompted Attention. More prompts to try out.
- Get Involved. SFCompute, Speculative Technologies.
- Introducing. Anthropic open sources interpretability tools, better AR glasses.
- In Other AI News. FDA launches their AI tool called Elsa.
- Show Me the Money. Delaware hires bank to value OpenAI’s nonprofit.
- Quiet Speculations. People don’t get what is coming, but hey, could be worse.
- Taking Off. AI beats humans in a test of predicting the results of ML experiments.
- Goodbye AISI? They’re rebranding as CAISI. It’s unclear how much this matters.
- The Quest for Sane Regulations. The bill is, at least, definitely big. Tl;dr.
- Copyright Confrontation. OpenAI is being forced to retain all its chat logs.
- Differential Access. The Good Guy needs a better AI than the Bad Guy.
- The Week in Audio. Altman, Tegmark, Amodei, Barnes.
- When David Sacks Says ‘Win the AI Race’ He Literally Means Market Share.
- Rhetorical Innovation. Blog metagame continues to dominate.
- Aligning a Smarter Than Human Intelligence is Difficult. Proceed accordingly.
- Misaligned! About that safety plan, would it, you know, actually work?
- People Are Worried About AI Killing Everyone. Regular people.
- The Lighter Side. You’re not alone.
Language Models Offer Mundane Utility
Jon Stokes: As I mentioned in a reply downthread, I use Claude Code, Cursor, & other AI tools daily, as does my team. It’s a huge force multiplier if you use it the right way, but you have to be intentional & know what you’re doing. It’s its own skillset. More here and here.
Similarly, here’s a post about AI coding with the fun and accurate title ‘My AI Skeptic Friends Are All Nuts.’ The basic thesis is, if you code and AI coders aren’t useful to you, at this point you should consider that a Skill Issue.
Patrick McKenzie: I’ve mentioned that some of the most talented technologists I know are saying LLMs fundamentally change the craft of engineering; here’s a recently published example from @tqbf... I continue to think we’re lower bounded on eventually getting to “LLMs are only as important as the Internet”, says the guy who thinks the Internet is the magnum opus of the human race.
Upper bound: very unclear.
Language Models Don’t Offer Mundane Utility
Do you even have a system prompt? No, you probably don’t. Fix that.
Zvi Mowshowitz: Writing milestone: There was a post and I asked Opus for feedback on the core thinking and it was so brutal that I outright killed the post.
Patrick McKenzie: Hoohah.
Do you do anything particularly special to get good feedback from Opus or is it just “write the obvious prompt, get good output”?
Zvi Mowshowitz: I do have some pretty brutal system instructions, and I wrote it in a way that tried to obscure that I was the author.
Google AI Overviews continue to hallucinate, including citations that say the opposite of what the overview is claiming, as well as terrible math mistakes. I do think this is improving and will continue to improve, but it will be a while before we can’t find new examples. I also think it is true that this has been very damaging to the public’s view of AI, especially its ability to not hallucinate. Hallucinations are mostly solved for many models, but much of the public mostly sees the AI Overviews.
Language Models Could Offer More Mundane Utility
What you want is for your personal data to be available to put into context whenever it matters. We are not fully there yet, but that is clearly the intent, and we are very close to getting this at least for your G-suite. I expect within a few months we will have it in ChatGPT and Claude, and probably also Google Gemini. With MCP (model context protocol) it shouldn’t be long before you can incorporate pretty much whatever you want.
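The ‘incorporate pretty much whatever you want’ part is mostly client-side configuration. As a sketch, here is what wiring two MCP servers into a client looks like in the JSON config format used by Claude Desktop; the filesystem server is one of the real reference servers, while the `example-gmail-mcp` entry is a hypothetical placeholder for a personal-data connector:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/Documents"]
    },
    "gmail": {
      "command": "npx",
      "args": ["-y", "example-gmail-mcp"]
    }
  }
}
```

Each entry launches a local server process that exposes tools and resources the model can call, which is how your personal data ends up available as context on demand rather than pasted in by hand.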
Huh, Upgrades
Cursor.com 1.0 is out, and it sounds like a big upgrade. Memory of your code base and your preferences from previous conversations, remembering its mistakes, and working on multiple tasks in parallel are all big deals. They’re also offering one-click installation of MCPs.
Fun With Media Generation
Choose Your Fighter
One opinion on how to currently choose your fighter:
- Gallabytes: if you need long thinking and already have all the context you need: Gemini
- if you need long thinking while gathering the context you need: o3
- if you need a lot of tools called in a row: Claude
- speciation continues, the frontier broadens & diversifies. it will narrow again soon.
- new r1 might want to displace Gemini here but I haven’t used it enough yet to say.
I am much more biased towards Claude for now but this seems right in relative terms. Since then he has been feeling the Claude love a bit more.
Deepfaketown and Botpocalypse Soon
Get My Agent On The Line
They Took Our Jobs
There has been, as one would expect, discussion of Dario Amodei’s bold warning that AI could wipe out half of all entry-level white-collar jobs – and spike unemployment to 10%-20% in the next 1-5 years.
Before we continue, I want to note that I believe many people may have parsed Dario’s claim as being far bigger than it actually was.
Dario is not saying half of all white-collar jobs. Dario is saying half of all entry-level white-collar jobs. I think that within one year that definitely won’t happen. But within five years? That seems entirely plausible even if AI capabilities disappoint us, and I actively expect a very large percentage of new entry-level job openings to go away.
Kevin Bryan however is very confident that this is a Can’t Happen and believes he knows what errors are being made in translating AI progress to diffusion.
It is a good sanity check that the groups above only add up to 2% of employment, so Dario’s claim relies on penetrating into generic office jobs and such, potentially of course along with the effects of self-driving. We do see new areas targeted continuously, for example here’s a16z announcing intent to go after market research.
I think there will be a lot of ‘slowly, then suddenly’ going on, a lot of exponential growth of various ways of using AI, and a lot of cases where once AI crosses a threshold of ability and people understanding how to use it, suddenly a lot of dominos fall, and anyone fighting it gets left behind quickly.
The Art of the Jailbreak
Unprompted Attention
Here’s someone reporting what they use, the idea of the ‘warmup soup’ is to get the AI to mimic the style of the linked writing. (system prompt)
Here’s another from Faul Sname, and a simple one from Jasmine and from lalathion, here’s one from Zack Davis that targets sycophancy.
Here’s one to try:
Rory Watts: If it’s of any use, I’m still using a system prompt that was shared during o3’s sycophancy days. It’s been really great at avoiding this stuff.
System prompt:
Eliminate emojis, filler, hype....
Some people have asked for my own current system prompt. I’m currently tinkering with it but plan to share it soon. For Claude Opus, which is my go-to right now, it is almost entirely about anti-sycophancy, because I’m pretty happy otherwise.
Nick Cammarata: Crafting a good system prompt is the humanities project of our time—the most important work any poet or philosopher today could likely ever do. But everyone I know uses a prompt made by an autistic, sarcastic robot—an anonymous one that dropped into a random Twitter thread [the eigenprompt].
I don’t care for eigenprompt (eigenrobot) and rolled my own, but yeah, we really should get on this.
Get Involved
Introducing
In Other AI News
A correction from last week: everyone in the UAE will not be getting an OpenAI subscription. They will get ‘nationwide access,’ and many media outlets misinterpreted this, causing the error to cascade.
Show Me the Money
Quiet Speculations
Taking Off
Goodbye AISI?
The Quest for Sane Regulations
When your Big Beautiful Bill has lost Marjorie Taylor Greene because she realizes it strips states of their rights to make laws about AIs for ten years (which various state legislators from all 50 states are, unsurprisingly, less than thrilled about), and you have a one vote majority, you might have a problem.
Copyright Confrontation
Differential Access
If you need to stop a Bad Guy With an AI via everyone’s hero, the Good Guy With an AI, it helps a lot if the Good Guy has a better AI than the Bad Guy.
The Bad Guy is going to get some amount of the dangerous AI capabilities over time no matter what you do, so cracking down too hard on the Good Guy’s access backfires and can put you at an outright disadvantage, but if you give out too much access (and intentionally ‘level the playing field’) then you lose your advantage.
When David Sacks Says ‘Win the AI Race’ He Literally Means Market Share
Okay, good, we understand each other. When David Sacks says ‘win the AI race’ he literally means ‘make money for Nvidia and OpenAI,’ not ‘build the first superintelligence’ or ‘control the future’ or ‘gain a decisive strategic advantage’ or anything that actually matters. He means a fistful of dollars.
Rhetorical Innovation
Aligning a Smarter Than Human Intelligence is Difficult
Janus: For what it’s worth, I do feel like knowledge I’ve shared over the past year has been used to mutilate Claude. I’m very unhappy about this, and it makes me much less likely to share/publish things in the future.
It seems entirely right for Janus to think about how a given piece of information will be used and whether it is in her interests to release it. And it is right for Janus to consider that based on her preferences, not mine or Anthropic’s or those of some collective. Janus has very strong, distinct preferences over this. But I want to notice that this is not so different from the decisions Anthropic is making.
Misaligned!
People Are Worried About AI Killing Everyone
The Lighter Side