(2017-11-11) Karpathy Software 2.0

Andrej Karpathy: Software 2.0. Neural networks are not just another classifier; they represent the beginning of a fundamental shift in how we develop software. They are Software 2.0. (GenAI)

The “classical stack” of Software 1.0 is what we’re all familiar with — it is written in languages such as Python, C++, etc. It consists of explicit instructions to the computer written by a programmer.

In contrast, Software 2.0 is written in much more abstract, human unfriendly language, such as the weights of a neural network. No human is involved in writing this code because there are a lot of weights (typical networks might have millions), and coding directly in weights is kind of hard (I tried).

Instead, our approach is to specify some goal on the behavior of a desirable program (e.g., “satisfy a dataset of input output pairs of examples”, or “win a game of Go”), write a rough skeleton of the code (i.e. a neural net architecture) that identifies a subset of program space to search, and use the computational resources at our disposal to search this space for a program that works.
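As a toy illustration of that loop, here is a minimal sketch, assuming PyTorch and made-up placeholder data: the dataset states the goal, the architecture is the rough skeleton, and gradient descent does the search over weight space.

```python
# Minimal sketch of the Software 2.0 loop, assuming PyTorch and placeholder data.
import torch
import torch.nn as nn

# 1. Goal: a dataset of input-output pairs (random stand-ins here).
X = torch.randn(256, 10)           # 256 examples, 10 features each
y = torch.randint(0, 2, (256,))    # binary labels

# 2. Rough skeleton: the architecture picks the subset of program space to search.
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
)

# 3. Search: gradient descent tunes the weights (millions in a real network,
#    a few hundred here), yielding the "Software 2.0 code".
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for step in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```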

In most practical applications today, the neural net architectures and the training systems are increasingly standardized into a commodity, so most of the active “software development” takes the form of curating, growing, massaging and cleaning labeled datasets.

It turns out that a large portion of real-world problems have the property that it is significantly easier to collect the data (or more generally, identify a desirable behavior) than to explicitly write the program.

Let’s briefly examine some concrete examples of this ongoing transition:

  • Visual Recognition (ImageNet)
  • Speech recognition
  • Speech synthesis
  • Machine Translation
  • Games. Explicitly hand-coded Go-playing programs have been developed for a long while, but AlphaGo Zero, trained purely by self-play, now far surpasses them.
  • Databases. “The Case for Learned Index Structures” replaces core components of a data management system with a neural network (see the sketch after this list).
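For a flavor of that idea, here is a minimal sketch of a learned index, assuming NumPy and synthetic keys (an illustration of the general idea, not the paper's implementation): a cheap model predicts where a key sits in a sorted array, and a small bounded search corrects the guess, standing in for a B-Tree traversal.

```python
# Toy learned index: a model predicts a key's position, a bounded search corrects it.
import numpy as np

rng = np.random.default_rng(0)
keys = np.sort(rng.uniform(0, 1e6, size=100_000))
positions = np.arange(len(keys))

# "Model": a linear fit of position vs. key (a stand-in for a small neural net).
slope, intercept = np.polyfit(keys, positions, deg=1)
max_err = int(np.max(np.abs(slope * keys + intercept - positions))) + 1

def lookup(key):
    guess = int(slope * key + intercept)
    lo = max(0, guess - max_err)
    hi = min(len(keys), guess + max_err + 1)
    # Binary-search only the small window the model's error bound allows.
    return lo + int(np.searchsorted(keys[lo:hi], key))

i = lookup(keys[1234])
assert keys[i] == keys[1234]
```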

You’ll notice that many of my links above involve work done at Google. This is because Google is currently at the forefront of re-writing large chunks of itself into Software 2.0 code.

Let’s take a look at some of the benefits of Software 2.0:

  • Computationally homogeneous.
  • Simple to bake into silicon.
  • Constant running time.
  • Constant memory use.
  • It is highly portable.
  • It is very agile.
  • Modules can meld into an optimal whole.
  • It is better than you.

The 2.0 stack also has some of its own disadvantages:

  • The 2.0 stack can fail in unintuitive and embarrassing ways, or worse, it can “silently fail”, e.g., by silently adopting biases in its training data.

Programming in the 2.0 stack

It’s clear that there is much more work to do.

For example, when the network fails in some hard or rare cases, we do not fix those predictions by writing code, but by including more labeled examples of those cases. Who is going to develop the first Software 2.0 IDEs, which help with all of the workflows in accumulating, visualizing, cleaning, labeling, and sourcing datasets?
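A minimal sketch of that workflow, assuming PyTorch and placeholder data (all names here are illustrative): the failing cases are triaged, and the “fix” lands in the dataset, followed by retraining, rather than in the code.

```python
# Sketch: fixing a Software 2.0 "bug" by editing the dataset, not the code.
import torch
import torch.nn as nn

def train(model, X, y, steps=200, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
X, y = torch.randn(512, 10), torch.randint(0, 2, (512,))
train(model, X, y)

# Triage: find the hard or rare cases the trained network still gets wrong.
with torch.no_grad():
    wrong = model(X).argmax(dim=1) != y

# The "patch": grow the dataset with labeled examples like the failures
# (simulated here by oversampling them), then retrain. No code changes.
X, y = torch.cat([X, X[wrong]]), torch.cat([y, y[wrong]])
train(model, X, y)
```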

Similarly, GitHub is a very successful home for Software 1.0 code. Is there space for a Software 2.0 GitHub? In this case, repositories are datasets and commits are made up of additions and edits of the labels.

