Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to larger models with one parameter change (feature currently in alpha).

tysam-code/hlb-gpt
