arilotter on Dec 2, 2024 | on: Pre-training a 15B parameter language model over t...
This specific model is only trained on 100 billion tokens, so it's not SOTA by any means, but we've got designs on larger training runs later :)