HuggingFaceFW/fineweb
This follows https://github.com/KellerJordan/modded-nanogpt as a learning exercise, for fun.
baseline/
nvidia-smi)

To experimentally check the neural scaling law:
(Fitted line: log y = -0.11 * log x + 0.9, where x is the training step (0 to 3200) and y is the training loss.)
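A straight line in log-log space corresponds to a power law, y = C * x^slope. The sketch below shows one way such a line could be fitted with ordinary least squares. It is not the code used for this model card: the function name `fit_log_log_line` is hypothetical, the natural logarithm is assumed (the base is not stated above), and the data is synthetic, generated from the quoted coefficients rather than taken from the actual training run. Step 0 is skipped because log 0 is undefined.

```python
import math


def fit_log_log_line(steps, losses):
    """Least-squares fit of log(loss) = slope * log(step) + intercept.

    Hypothetical helper for illustration. Pairs with a non-positive
    step or loss are dropped, since their logarithm is undefined.
    """
    pairs = [
        (math.log(x), math.log(y))
        for x, y in zip(steps, losses)
        if x > 0 and y > 0
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two points with positive step and loss")

    mean_x = sum(px for px, _ in pairs) / n
    mean_y = sum(py for _, py in pairs) / n
    # Closed-form simple linear regression on the log-transformed data.
    sxx = sum((px - mean_x) ** 2 for px, _ in pairs)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, noise-free losses generated from the fitted line quoted above
# (assumption: natural log). Real training losses would be noisy.
steps = list(range(1, 3201))
losses = [math.exp(0.9) * s ** -0.11 for s in steps]

slope, intercept = fit_log_log_line(steps, losses)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

With real logged losses in place of the synthetic list, the same call returns the slope and intercept of the fitted line; a different log base would change the intercept but not the slope.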
Demo available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo
(WIP)
Base model
openai-community/gpt2