What he's getting at is single-level storage. RAM isn't a separate place you load data into to work on it; RAM is cache. The size of your disk defines the "size" of your system.
This existed in Lisp and Smalltalk systems. Since there's no disk/running-program split, you don't have to serialize your data. You just pass around Lisp sexprs or Smalltalk code/ASTs. No more sucking your data out of Postgres through a straw, or between microservices, or ... (a rough sketch of the idea is below).
These systems are orders of magnitude smaller and simpler than what we've built today. I'd love to see them exist again.
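To make the "no serialization" point concrete, here's a rough Python analogy for image-style persistence: one pickled object graph stands in for the whole system. The Image class and the world.img file are invented for illustration; a real Smalltalk image is much more than a pickled dict.

    # Toy analogy: the whole live object graph is the "system",
    # and saving/loading it replaces per-object serialization.
    import pickle

    class Image:
        """Stand-in for a Smalltalk-style image: one root object graph."""
        def __init__(self):
            self.objects = {}  # everything lives here, as live objects

        def save(self, path):
            # Persist the entire graph at once: no per-record schema,
            # no ORM, no wire format between "program" and "storage".
            with open(path, "wb") as f:
                pickle.dump(self, f)

        @staticmethod
        def load(path):
            with open(path, "rb") as f:
                return pickle.load(f)

    img = Image()
    img.objects["config"] = {"retries": 3, "hosts": ["a", "b"]}
    img.save("world.img")
    restored = Image.load("world.img")
    assert restored.objects["config"]["retries"] == 3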
There has always been pressure to do so, but there are fundamental performance bottlenecks when it comes to model size.
What I can think of is that there may be a push toward training exclusively on search-based rewards, so that the model isn't required to compress a large proportion of the internet into its weights. But this is likely to be much slower and to come with initial performance costs that frontier model developers won't want to incur.
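For a feel of what "search-based rewards" might mean, here's a toy sketch: reward only the part of an answer that is supported by retrieved text, so recall-from-weights earns nothing. The search stub and the scoring rule are invented for illustration, not how any frontier lab actually trains.

    # Hypothetical reward: score an answer by how much of it is supported
    # by retrieved documents, rather than by knowledge baked into weights.

    def search(query: str) -> list[str]:
        # Stand-in retriever; a real system would hit an index or the web.
        corpus = {"capital of france": ["Paris is the capital of France."]}
        return corpus.get(query.lower(), [])

    def grounded_reward(query: str, answer: str) -> float:
        """Fraction of answer tokens that appear in the retrieved evidence."""
        support = " ".join(search(query)).lower()
        tokens = answer.lower().split()
        if not tokens or not support:
            return 0.0
        return sum(tok in support for tok in tokens) / len(tokens)

    print(grounded_reward("capital of France", "Paris"))       # 1.0
    print(grounded_reward("capital of France", "Lyon maybe"))  # 0.0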
> so that the model isn't required to compress a large proportion of the internet into its weights.
The knowledge compressed into an LLM is a byproduct of training, not a goal. Training on internet data is what teaches the model to talk at all. The knowledge and the ability to speak are intertwined.
I wonder if this would maintain the natural-language capabilities, which are what make LLMs magic to me. There is probably some middle ground, but not having to know which expressions or idioms an LLM will understand is really powerful from a user-experience point of view.
SEEKING WORK - Data scientist, consulting & fractional leadership, US/remote worldwide, email in profile.
All I want for Christmas is some gnarly problems to chew on; otherwise it's coal for Christmas.
I'm a data scientist with 20+ years of experience who enjoys gnarly, avant-garde problems. I saved a well-known German automaker from lemon-law recalls. I've worked with a major cloud vendor to predict when servers would fail, allowing them to shed load in time.
Some of the things I've done:
- Live chip counting for estimating betting in casinos.
- Automotive part failure prediction (lemon-law recalls).
- Server fleet failure prediction, allowing load shedding.
- Shipping piracy risk prediction - routing ships away from danger.
- Oil reservoir & well engineering: production forecasting.
- Realtime routing (CVRP-PD-TW, shifts) for on-demand delivery.
- Legal entity and contract term extraction from documents.
- Wound identification & tissue classification.
- The usual LLM and agent stuff. (I'd love to work on effective executive functioning)
- Your nasty problem here.
I use the normal stacks you'd expect: Python, PyTorch, Spark/Ray, Jupyter/Marimo, AWS, Postgres, Mathematica, and whatever else is needed to get the job done. Ultimately it's about the problem, not the tools.
I have years of experience helping companies plan, prototype, and productionize sane data science solutions.
Get in touch if you have a problem, my email is in my profile.
> Ideally, that thing should be related to your major, like programming competitions for CS. You need an accomplishment you bring to the table that no other applicant exceeds.
I wonder how Linus Torvalds would do in a programming competition.
Obviously he has better taste in food than I do, so those too. I will shit like a mink and love it.