In this system, the LLM can hallucinate to its heart's content: its output is fed into a proof engine, and if it constitutes a valid proof then it wasn't a hallucination and the computation succeeds. If the check fails, it just tries again. So hallucinations cannot actually leave the system, and all we get are valid refactorings with working proofs of validity.
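Roughly, the loop looks like this. This is a minimal sketch, not the actual system's code; `propose` (the LLM call, free to hallucinate) and `check_proof` (the proof engine) are hypothetical stand-ins:

    from typing import Callable, Optional, Tuple

    def verified_output(
        goal: str,
        propose: Callable[[str], Tuple[str, str]],    # LLM: returns (candidate, proof)
        check_proof: Callable[[str, str], bool],      # proof engine: True only for valid proofs
        max_attempts: int = 10,
    ) -> Optional[Tuple[str, str]]:
        for _ in range(max_attempts):
            candidate, proof = propose(goal)          # hallucinations can happen here
            if check_proof(candidate, proof):         # the gate: only verified output escapes
                return candidate, proof
        return None                                   # retry budget exhausted, nothing escapes

Nothing reaches the caller unless it has survived `check_proof`, which is the whole point.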
Binding the LLM to a formal logic and proof engine is one way to stop it hallucinating and make it useful for the real world.
But you would have to actually care about Proof and Truth to concede any point here. If you're only protecting the worldview where AI can never do things that humans can do, then you're going to have to retreat into some form of denial. But if you are interested in actual ways forward to useful AI, then results like this should give you some hope!
> Binding the LLM to a formal logic and proof engine is one way to stop it hallucinating and make it useful for the real world.
Checking the output does not mean the model does not hallucinate, and it does not help in all the other cases where there is no "formal logic and proof engine".
What if I consider the model to be the LLM plus whatever extra components allow it not to hallucinate? In that case the model doesn't hallucinate, because the model is the LLM plus the bolt-ons.
Remember, LLM truthers claim that no bolt-ons can ever fully mitigate an LLM's hallucinations. And yet in this case they do. But saying that it doesn't matter because other LLMs will still hallucinate is moving the goalposts, or at least discounting the utility of this incremental progress. I think that's unfair, because there are many, many domains where things can indeed be reduced to a formal logic amenable to a proof engine.
If they don't care about the actual output of a hybrid system that doesn't hallucinate, because it's math and not speech, then do they care about solving the issue at all, or about providing human utility? I get the feeling that they only want to be right, not to benefit anyone.
This shows that in cases where we can build good enough verifiers, hallucinations in a component of the system do not have to poison the entire system.
Our own brains work this way: we have sections of the brain that hallucinate, and sections that verify. When the sections that verify are asleep, we end up hallucinating dreams. When the sections that verify are sick, we end up hallucinating while awake.
I agree with you that the current system does not solve the problem for natural language. However, it gives an example of a non-hallucinating hybrid LLM system.
So the problem is reduced from having to make LLMs never hallucinate at all to designing some other system, potentially not an LLM at all, that can filter out enough hallucinations for the output to be useful.
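To make that reduction concrete, here is a toy illustration where the verifier is not an LLM at all, just Python's own parser; the names are mine, and a real verifier would be a proof engine, a type checker, or a test suite:

    import ast
    from typing import Callable, Optional

    def parses_as_python(code: str) -> bool:
        # Toy verifier: accept a candidate only if it is syntactically valid Python.
        try:
            ast.parse(code)
            return True
        except SyntaxError:
            return False

    def filtered_generation(generate: Callable[[], str],
                            verify: Callable[[str], bool] = parses_as_python,
                            max_attempts: int = 10) -> Optional[str]:
        # Any generator (an LLM or anything else) can be wrapped this way;
        # only candidates that pass verify() ever leave the system.
        for _ in range(max_attempts):
            candidate = generate()
            if verify(candidate):
                return candidate
        return None

The strength of the overall guarantee is exactly the strength of the verifier you plug in, which is why the proof-engine case above is so attractive.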
https://news.ycombinator.com/item?id=40634775
Good luck and good day either way!