Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to larger models with one parameter change (feature currently in alpha).
tysam-code/hlb-gpt