Is there a name for the phenomenon where you expect someone to be as knowledgeable as you, or more so, when discussing a topic? The Dunning-Kruger effect is when one overestimates their own ability. But is there an effect of overestimating someone else's? I feel as though that explains the reasoning.
Overestimated Familiarity Window. There you go - an abstruse term that you can use assuming other people will also understand it, and therefore the reference fulfills the referent.
Aren't these industry awards essentially participation trophies for whoever is willing to pay? Like the notorious "Who's Who Among American High School Students" in the US.
Aimco Properties has been requiring tenants' driver's licenses or passports to be scanned with facial recognition technology by their contractors in India and then sold to advertisers. They have been doing this since 2010; you cannot live without selling your soul to landlords and advertisers.
Sorry for the tl;dr; here's the whole rambling:
Large language models can exhibit "hallucinatory behavior" and generate artificial content that does not correspond to facts.
This does not truly anthropomorphize the models by imbuing them with consciousness, however. They are generating outputs based on the statistical patterns in their training data, not through any internal experience or self-awareness.
The response to "how much opportunity is still in front of us for adversarial LLM systems that try to detect/control for hallucinations." is by nature infinite or none (as in its futile). As "hallucinations" are whatever the developer deems to be a "hallucination". To hallucinate anthropomorphizes the model to be a human actor and leads "treatment" like a drug to be administered. A physician saying that "oh my patient is hallucinating" they have a mental disorder. This implies that there is a ground truth the developer knows to "not hallucinations". To make a model with such procedures would inherently contain any bias from the development team. Using techniques like Constitutional AI to align models with ethical values, relies on someone making that "ethical value".
"Statistical artifacts" or plain incorrectness in responses are more accurate terms for this research. Adopting a "bias mitigation" mindset, viewing bias reduction as an ongoing process of detection and correction rather than a one-time fix that produces its own errors or inconsistencies, is a better solution, because the red tape sits outside the model itself. Treat every model as rogue, similar to zero trust in a computer system; a rough sketch of what that might look like is below. If the solution is not also an AI model, you also avoid a sort of Inventor's Paradox of dehumanizing people into agents.
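To make the "zero trust" idea concrete, here is a minimal sketch of my own (none of these names come from a real library; call_llm is a hypothetical stand-in for whatever generation API you use): model output is treated as untrusted input and only passes downstream after deterministic, non-AI checks.

    # Illustrative sketch: treat model output as untrusted, the way a
    # zero-trust network treats any request. call_llm is a placeholder.

    def call_llm(prompt: str) -> str:
        # Stand-in for a real model call; returns a canned string so the
        # sketch runs on its own.
        return "Paris is the capital of France."

    def validate(text: str) -> bool:
        """Deterministic, non-AI checks applied to the untrusted output."""
        if not text or len(text) > 2000:     # reject empty or oversized output
            return False
        if "DROP TABLE" in text.upper():     # toy example of a content rule
            return False
        return True

    def guarded_generate(prompt: str, retries: int = 2) -> str | None:
        """Only pass model output downstream if it survives validation."""
        for _ in range(retries + 1):
            candidate = call_llm(prompt)
            if validate(candidate):
                return candidate
        return None  # the caller decides what a refusal looks like

    print(guarded_generate("What is the capital of France?"))

The point is only that the guard rail lives outside the model, in plain middleware, rather than being another model you have to trust.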
Both of these ideas, at the current state of AI, are a social dilemma that people have been warning about for years. The words we use change our mental model and perception of the tools we create. History shows it is human nature to anthropomorphize items and tools like cars and boats, but those do not talk back in a human-readable format. If my car started to "hallucinate," I would think I was driving Herbie or some other living car. The parallels drawn between silicon and carbon are similar on the surface but profoundly inaccurate by our current understanding, though going down that path is off topic. As an engineer, please do not anthropomorphize your creations; it is unhealthy and may lead to superficial relationships. Controlling "statistical artifacts" or "hallucinations" is contextually the same thing, and there is always middleware and interface management, but the word "hallucinate" changes how one perceives the AI's functions. Please do not anthropomorphize LLMs.
I wasn't, and I'm not sure how you got that out of what I said. I'm not claiming "understanding," "sentience," etc.
I'm claiming there's a great deal of research work that I'm excited by: research I expect to be done by humans. I do expect / am supposing that the result may be LLMs of a different nature, used to apply guardrails around generative LLM systems, but that's not to anthropomorphize them, just to suppose their purpose.
The term "hallucination" does anthropomorphize LLMs, but I think that's now accepted nomenclature in the industry, at least for the time being, and it's helpful to have some standard nomenclature to describe some of the benefits and problems.
"The term 'hallucination' does anthropomorphize LLMs"
It does not, as hallucinations are not something only humans experience. Beyond that, it is now an accepted term of art used to describe a specific behavior exhibited by an LLM, separate from the biological one.
The problem I have with the term is that we already have one that describes much more accurately what these models are doing: it's called guessing. Guessing is simply reporting information one does not know to be true. When a model does not have data points regarding certain information, each token it returns comes with lower and lower confidence. It's literally guessing. But since we aren't exposed to the confidence score of the completion, it's taken to be full confidence, when that is not the case.
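For what it's worth, that hidden per-token confidence is recoverable if you run a model locally. A minimal sketch, assuming a Hugging Face causal LM (gpt2 is just a small public stand-in and the prompt is arbitrary), that prints the probability the model assigned to each token it emitted:

    # Surface the per-token probabilities ("confidence") behind a completion.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # any causal LM works; gpt2 is small and public
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    prompt = "The capital of Australia is"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=5,
            do_sample=False,                 # greedy, so the numbers are easy to read
            return_dict_in_generate=True,
            output_scores=True,
        )

    # out.scores holds one logit tensor per generated token; softmax gives the
    # probability of the token the model actually picked.
    generated = out.sequences[0][inputs["input_ids"].shape[1]:]
    for tok_id, scores in zip(generated, out.scores):
        probs = torch.softmax(scores[0], dim=-1)
        print(f"{tokenizer.decode(int(tok_id))!r}: p = {probs[tok_id].item():.3f}")

Hosted APIs typically hide these numbers (or expose them only as optional log-probabilities), which is exactly the "taken to be full confidence" problem above.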
That framing fails to describe the case where the model is confident in a response (at the token level) and is wrong, which I think is still considered hallucinating.
Misconceptions. There's no inherent reason a false statement would have lower probability than a true one.
To be clear, I'm referring to things like GPT-3.5 reportedly consistently messing up on statements like "what's heavier, two pounds of feathers or a pound of bricks". Being consistently wrong in the same way implies to me (but I don't know for sure) that the class of response is high probability in an absolute sense.
I can't find the article that demonstrated the sort of things that GPT consistently gets wrong, but it was things like common misconceptions and sayings.
Very interesting. So it could produce, with high confidence, common and real-world guesses found in its dataset.
So in that case it's not guessing and not wrong; it's indeed producing something that is "correct" in the sense of being the most likely completion, but still false. Now we're really getting into the weeds here, though.
~8 years at best if everything goes well (it rarely does). There are outliers: polio took ~4 years, and Covid took ~1 year. But those are much different from RA.