LLMs as programs are here to stay. The issue is the expenses-to-revenue ratio all these LLM corpos have. According to a Sequoia analyst (so not some anon on a forum), there is a giant money hole in that industry, and "giant" doesn't even begin to describe it (IIRC it was $600 billion this summer). That whole industry will definitely see a winter soon, even if everything Altman says turns out to be true.
You just described what literally anyone who says "AI Winter" means; the technology doesn't go away, companies still deploy it and evolve it, customers still pay for it, it just stops being so attractive to massive funding and we see fewer foundational breakthroughs.
They're useful in some situations, but extremely expensive to operate. It's unclear if they'll be profitable in the near future. OpenAI seems to be claiming they need an extra $XXX billion in investment before they can...?
I just ran a (IMHO) cool test with OpenAI/Linux/Tcl-Tk:
"write a TCL/tk script file that is a "frontend" to the ls command: It should provide checkboxes and dropdowns for the different options available in bash ls and a button "RUN" to run the configured ls command. The output of the ls command should be displayed in a Text box inside the interface. The script must be runnable using tclsh"
It didn't get it right the first time (for some reason it wanted to add a `mainloop` instruction), but after several corrections I got an ugly but pretty functional UI.
Imagine a Linux distro that uses some kind of LLM-generated interfaces to make its power more accessible. Maybe even "self-healing".
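For anyone curious what that generated frontend roughly looks like, here is a minimal sketch of the same idea. It's Python/tkinter rather than the Tcl/Tk script the prompt asked for (same Tk toolkit, different binding), and it only wires up a handful of ls flags as an illustration.

```python
# Minimal sketch of an ls "frontend": checkboxes for a few flags,
# a RUN button, and a text box showing the command output.
# Python/tkinter stand-in for the Tcl/Tk script described above.
import subprocess
import tkinter as tk

root = tk.Tk()
root.title("ls frontend")

# Checkboxes for a couple of common ls options (illustrative subset only).
flags = {"-l": tk.BooleanVar(), "-a": tk.BooleanVar(), "-h": tk.BooleanVar()}
for flag, var in flags.items():
    tk.Checkbutton(root, text=flag, variable=var).pack(anchor="w")

output = tk.Text(root, width=80, height=20)

def run_ls():
    # Build the ls command from the checked flags and show stdout (or stderr).
    cmd = ["ls"] + [f for f, v in flags.items() if v.get()]
    result = subprocess.run(cmd, capture_output=True, text=True)
    output.delete("1.0", tk.END)
    output.insert(tk.END, result.stdout or result.stderr)

tk.Button(root, text="RUN", command=run_ls).pack()
output.pack()

root.mainloop()
```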
The issue (and I think what's behind the thinking of AI skeptics) is previous experience with the sharp edge of the Pareto principle.
Current LLMs being 80% of the way to 100% useful doesn't mean there's only 20% of the effort left.
It means we got the lowest-hanging 80% of utility.
Bridging that last 20% is going to take a ton of work. Indeed, maybe 4x the effort that getting this far required.
And people also overestimate the utility of a solution that's randomly wrong. It's exceedingly difficult to build reliable systems when you're stacking a 5% wrong solution on another 5% wrong solution on another 5% wrong solution...
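To put rough numbers on that stacking effect: if each layer is independently right 95% of the time, the chance that the whole chain is right decays fast (independence is the simplifying assumption here).

```python
# Rough illustration of how per-step error compounds across a chain of steps,
# assuming each step is independently correct 95% of the time.
p_step = 0.95
for n_steps in (1, 3, 5, 10, 20):
    p_all_correct = p_step ** n_steps
    print(f"{n_steps:>2} steps: {p_all_correct:.0%} chance the whole chain is right")
# 10 steps is already down to ~60%, and 20 steps to ~36%.
```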
Thank you! You have explained the exact issue I (and probably many others) are seeing when trying to adopt AI for work. It is because of this that I don't worry about AI taking our jobs for now. You still need some foundational knowledge in whatever you are trying to do in order to get that remaining 20%. Sometimes this means pushing back against the AI's solution, other times it means reframing the question, and other times it's just giving up and doing the work yourself. I keep seeing all these impressive toy demos, but my experience (as an Angular and Flask dev) seems to indicate that it is not going to replace any subject matter expert anytime soon. (And I am referring to all three major AI players, as I regularly and religiously test all their releases.)
>And people also overestimate the utility of a solution that's randomly wrong. It's exceedingly difficult to build reliable systems when you're stacking a 5% wrong solution on another 5% wrong solution on another 5% wrong solution...
I call this the merry-go-round of hell mixed with a cruel hall of mirrors. The LLM spits out a solution with some errors, you tell it to fix the errors, and it produces other errors or totally forgets important context from one prompt ago. You then fix those issues, and it introduces other issues or messes up the original fix. Rinse and repeat. God help you if you don't actually know what you are doing; you'll be trapped in that hall of mirrors for all of eternity, slowly losing your sanity.
You’re talking about the informatics Olympiad and o1. As for Google DeepMind’s network and the math Olympiad, it didn’t do 10,000 submissions. It did, however, generate a bunch of different solutions, but it was all automatic (and consistent). We’re getting there.
Can you share an example of a use case you have in mind for this "explainer + RAG" combo you just described?
I think that RAG and RAG-based tooling around LLMs is gonna be the clear way forward for most companies with a properly constructed knowledge base, but I wonder what you mean by "explainer"?
Are you talking about asking an LLM something like "in which way did the teams working on project X deal with Y problem?" and then having it break it down for you? Or is there something more to it?
I'm not the OP, but I've got some fun ones that I think are what you're asking about. I would also love to hear others' interesting ideas/findings.
1. My medical provider has a webapp that downloads GraphQL data (basically JSON) to the frontend and only renders some of it in the template while hiding the rest. Furthermore, they hide even more info after I pay the bill. I download all the data, combine it with other historical data I have saved, and dump it into the LLM. It spits out interesting insights about my health history, ways in which I have been unusually charged by my insurance, and how fast the company operates, based on the historical gap between appointment and bill adjusted for the time of year. It then formats everything into an open format that is easy for me to self-host (HTML + JS tables). It's a tiny way to wrestle back control from the company until they wise up.
2. Companies are increasingly allowing customers to receive a "backup" of all the data they have on them (thanks, EU and California). For example, Burger King and Wendy's allow this. What do they give you when you request your data? A zip file filled with a bunch of cruft from their internal systems. No worries: dump it into the LLM and it tells you everything the company knows about you in an easy-to-understand format (bullet points in this case). You learn when the company managed to track you, how much they "remember", how much money they got out of you, your behaviors, etc. (A rough sketch of this dump-and-summarize flow is below.)
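In case it helps, here is a minimal sketch of that dump-and-summarize flow. It assumes the export is a directory of JSON files and uses the OpenAI Python client; the model name and prompt wording are placeholders, and a large export would need chunking rather than one giant prompt.

```python
# Minimal sketch: concatenate a data-export dump and ask an LLM to summarize it.
# Assumes a directory of JSON files; model name and prompt are placeholders.
import json
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Gather every JSON file in the export into one structure.
records = []
for path in Path("data_export").glob("**/*.json"):
    records.append({"file": path.name, "content": json.loads(path.read_text())})

prompt = (
    "Here is a data export a company keeps about me. "
    "Summarize, as bullet points: what they track about me, how far back the "
    "records go, how much money I've spent with them, and any behavioral patterns.\n\n"
    + json.dumps(records, indent=2)
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```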
One of the big challenges with clinical trials is making this information more accessible to both patients (for informed consent) and the trial site staff (to avoid making mistakes, help answer patient questions, and even ask the right questions when negotiating the contract with a sponsor).
The gist of it here is exactly like you said: RAG to pull back the relevant chunks of a complex document like this, and then an LLM to explain and summarize the information in those chunks so it's easier to digest. The response can be tuned to the level of the reader by adding simple phrases like "explain it to me at a high school level".
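As a rough illustration of that retrieve-then-explain shape (a sketch, not the actual system described above): embed the question, pull the most similar chunks, and ask the model to explain them at the requested reading level. The model names are placeholders, and chunking, indexing, and citations are all omitted.

```python
# Skeleton of a retrieve-then-explain ("RAG + explainer") flow, sketch only.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts):
    # Embed a list of strings with a placeholder embedding model.
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in resp.data])

def answer(question, chunks, reading_level="high school"):
    chunk_vecs = embed(chunks)          # normally precomputed and stored in an index
    q_vec = embed([question])[0]
    scores = chunk_vecs @ q_vec         # dot-product similarity
    top_chunks = [chunks[i] for i in np.argsort(scores)[-3:]]

    prompt = (
        f"Using only the excerpts below, answer the question and explain it "
        f"at a {reading_level} level.\n\n"
        f"Question: {question}\n\nExcerpts:\n" + "\n---\n".join(top_chunks)
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```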
Google ought to hang its head in utter disgrace over the putrid swill they have the audacity to peddle under the Gemini label.
Their laughably overzealous nanny-state censorship, paired with a model so appallingly inept it would embarrass a chatbot from the 90s, makes it nothing short of highway robbery that this digital dumpster fire is permitted to masquerade as a product fit for public consumption.
The sheer gall of Google to foist this steaming pile of silicon refuse onto unsuspecting users borders on fraudulent.