(2023-02-09) Willison: Training nanoGPT Entirely on Content from My Blog
Simon Willison: Training nanoGPT entirely on content from my blog. This is a follow-up to Running nanoGPT on a MacBook M2 to generate terrible Shakespeare. I used nanoGPT by Andrej Karpathy to train a GPT model entirely against content from my blog!
I mainly used the same technique described in my previous article - but I started by creating my own training set, rather than using the Shakespeare example.
Initial setup
Fetching the data
Each line of the file is a JSON array containing the title and body.
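As a minimal sketch of reading that format, assuming each line is a two-element JSON array of `[title, body]`: the function name `parse_entries` and the sample strings are hypothetical and are not taken from the original post.

```python
import json
from typing import Iterable, List, Tuple


def parse_entries(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse newline-delimited JSON where each line is a [title, body] array."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline at end of file
        title, body = json.loads(line)  # assumes exactly two elements per line
        entries.append((title, body))
    return entries


# Hypothetical sample data standing in for the exported blog entries.
sample_lines = [
    '["First post", "<p>Hello</p>"]',
    '["Second post", "<p>World</p>"]',
]
entries = parse_entries(sample_lines)
print(len(entries), "entries; first title:", entries[0][0])
```

Because the function accepts any iterable of strings, an open file handle can be passed in directly, so the whole file never has to be read into memory as one string.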
Splitting the data into training and validation sets
I split the entries into 90% training and 10% validation.
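A 90/10 split like this can be sketched as follows. It assumes the entries are cut in their existing order, with no shuffling; the note does not say which was used, and `split_entries` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_entries(
    entries: Sequence[T], train_fraction: float = 0.9
) -> Tuple[List[T], List[T]]:
    """Return (training, validation) lists, cutting at train_fraction of the input."""
    cut = int(len(entries) * train_fraction)  # index of the first validation entry
    return list(entries[:cut]), list(entries[cut:])


# Hypothetical stand-in data: ten placeholder entries.
entries = [f"entry {i}" for i in range(10)]
train, val = split_entries(entries)
print(len(train), "training entries,", len(val), "validation entries")
```

Splitting on whole entries keeps each post entirely in one set, so no individual post appears in both the training and validation data.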
Training the model
Getting to 20,000 iterations took around 45 minutes.
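For a rough sense of throughput, those two figures can be turned into a per-iteration time. This is back-of-the-envelope arithmetic on the approximate numbers above ("around 45 minutes"), not a measurement, and `seconds_per_iteration` is a hypothetical helper.

```python
def seconds_per_iteration(total_minutes: float, iterations: int) -> float:
    """Average wall-clock seconds spent on each training iteration."""
    return total_minutes * 60 / iterations


per_iter = seconds_per_iteration(45, 20_000)
print(f"{per_iter:.3f} s/iteration, {1 / per_iter:.1f} iterations/s")
# prints "0.135 s/iteration, 7.4 iterations/s"
```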
Sampling the model
It's just my blog entries... not an example of fine-tuning an existing model using my own content. I'm literally training a complete model here from scratch!
- Hmm, could I do this with my ebook/EPUB library/antilibrary?
- I'll probably try RAG first