
NanoGPT Speedrun

This follows https://github.com/KellerJordan/modded-nanogpt, done for fun and as a learning exercise.

Run Info

baseline/

Run on Lightning cloud, using a single L40S GPU
Batch size: 32
VRAM usage: 26.95 GB (25698 MiB as reported by nvidia-smi)
Speed: 4 seconds per step, 3200 steps in total
Checkpoints: saved every 320 steps
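
As a quick sanity check on the figures above, the arithmetic can be sketched in a few lines. The variable names are illustrative only. The checkpoint count assumes one checkpoint at every multiple of 320 steps, including the final step and excluding step 0, which the run info does not state explicitly.

```python
# Figures quoted in the run info; the names are illustrative.
SECONDS_PER_STEP = 4
TOTAL_STEPS = 3200
CHECKPOINT_EVERY = 320
VRAM_MIB = 25698  # nvidia-smi reports memory in MiB

# Wall-clock time implied by a constant 4 s/step.
total_seconds = SECONDS_PER_STEP * TOTAL_STEPS
total_hours = total_seconds / 3600

# Assumption: checkpoints at steps 320, 640, ..., 3200 (none at step 0).
num_checkpoints = TOTAL_STEPS // CHECKPOINT_EVERY

# Convert MiB (2**20 bytes) to decimal GB (10**9 bytes).
vram_gb = VRAM_MIB * 1024**2 / 1e9

print(f"{total_hours:.2f} h, {num_checkpoints} checkpoints, {vram_gb:.2f} GB")
# prints: 3.56 h, 10 checkpoints, 26.95 GB
```

The MiB-to-GB conversion accounts for the two VRAM numbers: 25698 MiB and 26.95 GB are the same amount of memory in different units.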

Training loss

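To check the neural scaling law experimentally, a line was fitted to the training loss in log-log space:

(Fitted line: log y = -0.11 * log x + 0.9, where x is the step (0 to 3200) and y is the training loss. Equivalently, y is proportional to x^(-0.11), i.e. a power law in the step count.)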
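
A fit of this kind can be sketched with an ordinary least-squares line in log-log space. This is a minimal illustration, not the script used for the run. The function `fit_power_law` is a hypothetical helper, natural logarithms are assumed (the base is not stated above), and the loss values are synthetic ones generated from the quoted line rather than real training logs. Step 0 is excluded because log 0 is undefined.

```python
import numpy as np


def fit_power_law(steps, losses):
    """Fit log(loss) = slope * log(step) + intercept by least squares.

    Hypothetical helper; assumes natural logs and strictly positive inputs.
    """
    log_x = np.log(np.asarray(steps, dtype=float))
    log_y = np.log(np.asarray(losses, dtype=float))
    slope, intercept = np.polyfit(log_x, log_y, deg=1)
    return slope, intercept


# Synthetic data for illustration only: an exact power law built from
# the quoted coefficients, NOT the actual training loss of this run.
steps = np.arange(1, 3201)  # start at 1, since log(0) is undefined
losses = np.exp(0.9) * steps ** -0.11

slope, intercept = fit_power_law(steps, losses)
print(f"log y = {slope:.2f} * log x + {intercept:.2f}")
```

With real logs, `steps` and `losses` would come from the training output instead, and the early warm-up steps may need to be dropped before fitting, since they often do not follow the power law.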

Demo

Available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo

(WIP)
