(2022-11-23) Geoffrey Litt: Dynamic Documents // LLMs + End-User Programming

Potluck: Dynamic documents as personal software

We recently published an essay about Potluck, a research project I worked on together with Max Schoening, Paul Shen, and Paul Sonnentag at Ink & Switch.

Potluck originated from a desire to make it easier for people to build little pieces of personal software. We ended up building a prototype that enables people to gradually enrich text notes into interactive tools by extracting structured data from freeform text, running computations on that data, and then injecting the results back into the text as annotations.

We found that starting with familiar text notes seems to make it easier to think of places to use computation; instead of thinking “what app should I make?” you can just notice places in your notes app where you’d like a small extra bit of functionality.

We’re not planning on developing this particular prototype any further, or turning it into a product or anything. But we do plan to carry the lessons we learned from this prototype into future computing environments we’re thinking about at Ink & Switch. (One main reason for this approach is that Potluck really works better as an OS feature more than an isolated app.)

LLMs + end-user programming

I’m trying my best to keep up with all the changes and think about how AI might fit in for end-user programming. Here are some messy reflections.

I use GitHub Copilot every day in my programming work.

One of the biggest barriers to making computers do stuff is learning to write code in traditional programming languages, with their rigid reasoning and delicate syntax. People have been thinking for a long time about ways to get around this. For many decades people have been chasing the dream of having the user just demonstrate some examples, and automatically synthesizing the program from them.

With AI on the scene, there’s been sudden progress.

I think there’s a blurry but useful distinction to be drawn between “tools” and “machines”:

In their essay Using Artificial Intelligence to Augment Human Intelligence, Shan Carter and Michael Nielsen call this idea “artificial intelligence augmentation,” or AIA.

Linus Lee recently posted some demos of dragging a slider to change the length or emotional tone of a text summary, which I think has a similar sense of “tool” rather than “machine.”

We didn’t use LLMs in Potluck, but it’d be a natural extension, as we discuss briefly in the Future Work section of the essay.

The general vibe of AIA: the human is in the driver’s seat, precisely wielding a tool, but supported by AI capabilities.

When it comes to AI, I’m much more interested in using AI to amplify human capabilities than I am in cheaply automating tasks that humans were already able to do.

In Potluck, the AI could help with extracting structured data from messy raw text, while still leaving the user in control of deciding what kinds of computations to run over that structured data.

It’s also exactly the split that Nardi, Miller and Wright envisioned when they invented data detectors at Apple.

Recently I’ve been using a wonderful video editing app called Descript which lets you edit a video by editing the text transcript.

Maybe the Descript example suggests that “automating away the tedious part” is a reliable recipe for making tools that support human abilities, but it’s not obvious to me how to tell in advance what counts as the tedious part.

Interpreter vs compiler

“AI as fuzzy interpreter”: give instructions, and just have the AI directly do stuff for you.
“AI as compiler”: have the AI spit out code in Python, JavaScript, or whatever language, that you can run yourself.

There are serious tradeoffs here. The AI can do soft reasoning that’s basically impossible to do in traditional code. No need to deal with pesky programming, just write instructions and let ‘er run. On the other hand, it’s harder to rely on the AI.

Andrej Karpathy’s Software 2.0 blog post covers some of these tradeoffs in more depth.

I expect reliability to improve dramatically over the coming years as models and prompting techniques get more mature, but the last 5-10% will be really hard.

In any domain where code is already being used today, code will remain dominant for a while, even if it’s increasingly AI-generated.

One of my favorite interaction ideas in this area comes from a traditional program-synthesis paper, User Interaction Models for Disambiguation in Programming by Example.

