(2023-02-09) Willison: Training nanoGPT Entirely on Content from My Blog

Simon Willison: Training nanoGPT entirely on content from my blog. This is a follow-up to Running nanoGPT on a MacBook M2 to generate terrible Shakespeare. I used nanoGPT by Andrej Karpathy to train a GPT model entirely against content from my blog!

I mostly used the same technique described in my previous article, but I started by creating my own training set rather than using the Shakespeare example.

Initial setup

Fetching the data

Each line of the file is a JSON array containing the title and body.
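To make that format concrete, here is a minimal sketch of reading such lines back into plain text. The function name `entries_to_text`, the sample lines, and the choice to join title and body with blank lines are all assumptions for illustration, not necessarily what I ran:

```python
import json


def entries_to_text(lines):
    """Turn lines of JSON arrays [title, body] into one plain-text string.

    Hypothetical helper: the separator between entries is an assumption.
    """
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        title, body = json.loads(line)  # each line is a two-item JSON array
        chunks.append(f"{title}\n\n{body}")
    return "\n\n".join(chunks)


# Made-up sample data in the same one-array-per-line shape
sample_lines = [
    '["First post", "Hello world."]',
    '["Second post", "More text here."]',
]
text = entries_to_text(sample_lines)
```

With a real file you could pass the open file object straight in, since iterating over it yields one line at a time.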

Splitting the data into training and validation sets

I split the entries into 90% training and 10% validation.
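A 90/10 split like this can be sketched in a few lines. This version takes the first 90% of entries in order as training data; whether to shuffle first is a separate choice, and the name `split_entries` and the sample list are hypothetical:

```python
def split_entries(entries, train_percent=90):
    """Split a list into (training, validation) by an integer percentage.

    Hypothetical helper: keeps entries in their original order (an assumption).
    Integer arithmetic avoids floating point surprises at the cut point.
    """
    cut = len(entries) * train_percent // 100
    return entries[:cut], entries[cut:]


# Made-up entries standing in for blog posts
entries = [f"entry {i}" for i in range(10)]
train, val = split_entries(entries)
```

For ten entries this gives nine for training and one for validation.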

Training the model

Getting to 20,000 iterations took around 45 minutes.

Sampling the model

It's just my blog entries... not an example of fine-tuning an existing model using my own content. I'm literally training a complete model here from scratch!

