[flagged] Bing Copilot seemingly leaks state between user sessions (chaos.social)
48 points by marvinborner on April 14, 2024 | 34 comments


Remember: LLMs are basically trained to be really gifted improv actors. That's what all that "guess the next token" training boils down to. Once that base of "improv" is established, they're then trained to play a specific role, "the helpful and harmless assistant". Somehow this produces useful output much of the time, as ridiculous as it seems.

But whenever almost anything goes wrong, LLMs fall back to improv games. It's like you're talking to the cast of "Whose Line is it Anyway?" You set the scene, and the model tries to play along.

So a case like this proves almost nothing. If you ask an improv actor "What's the last question you were asked?", then they're just going to make something up. So will these models. If you give a model a sentence with 2 grammatical errors, and ask it to find 3, it will usually make up a third error. If it doesn't know the answer to a question, it will likely hallucinate.

GPT-4 is a little better at resisting the urge to make things up.


This is a stellar framing. It makes excellent predictions of LLM behavior while not being wholly dismissive of an LLM’s intelligence like the “stochastic parrot” people are. I’m stealing this.


That's why I prefer the term generative AI over LLM. They're really good at generating things, which is why most creative fields are toast soon IMO.

Facts aren't things you generate, so it will always be caveat emptor.


The only story here is that some people think they can recognize a hallucination by how the text looks. You can't—the only difference between a hallucination and a valid response is that one randomly sampled response happens to produce a factual statement and the other doesn't. There's no stylistic difference, there are no tells. The only way to recognize a hallucination is to fact check with an independent source.

I'm starting to agree with another commenter [0] that the word "hallucination" is a problem—it implies that there's some malfunction that sometimes happens to cause it to produce an inaccurate result, but this isn't a good model of what's happening. There is no malfunction, no psychoactive chemical getting in the way of normal processes. There is only sampling from the model's distribution.

[0] https://news.ycombinator.com/item?id=40033109
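To make the "only sampling" point concrete, here's a minimal sketch in plain Python (the probabilities are invented for illustration, not taken from any real model): the generation loop is identical whether the sampled continuation happens to be factual or not, and nothing in the process flags one output as a "hallucination".

  import random

  # Toy next-token distribution after the prompt "Mount Everest is about".
  # The numbers are made up for illustration, not from any real model.
  next_token_probs = {
      "8,849 m": 0.55,         # happens to match reality
      "8,440 m": 0.30,         # plausible-looking but wrong
      "46,449 bananas": 0.15,  # also wrong, just less likely
  }

  def sample(probs):
      # Standard categorical sampling: pick a token in proportion to its probability.
      r = random.random()
      cumulative = 0.0
      for token, p in probs.items():
          cumulative += p
          if r < cumulative:
              return token
      return token  # fallback for floating-point rounding

  # The same call produces "right" and "wrong" completions; there is no separate
  # code path, flag, or stylistic tell for the wrong ones.
  print("Mount Everest is about", sample(next_token_probs))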


I thought "hallucination" was hilarious when the term was first coined. We already have a word for it: "wrong".

It gives off "blackmail is such an ugly word" vibes. It's WRONG. Maybe it's working as intended, maybe it isn't, but it's WRONG.


What about "speaking in tongues"? You have no idea what you are saying but you follow through anyway.


This is really well said. If anything, the malfunction is when it spews out something that happens to align with reality. It's actually us who are hallucinating its correct answers, not it who is hallucinating incorrect ones. It's like a stopped clock being right twice a day, except you expand the number of hours on the clock to trillions and it's still right the same percentage of the time. The scale of right answers makes this seem impressive to us, but it masks the scale of wrong answers we haven't seen yet.


'hallucination' is nothing more than GIGO (garbage in garbage out) and it's the 2nd thing everyone learns right behind 'never trust user input'.

For some reason everyone has thrown the basics out of the window, so now we've got garbage in, garbage out called 'hallucinations', and 'prompt engineering', which is nothing more than being incapable of sanitizing input.


Are you saying all hallucinations originate in the training data? As far as GIGO goes, remember there is literally a temperature parameter feeding randomness into the output (though still sampling from the probability mass).
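For anyone who hasn't seen it, here's a rough sketch of how that temperature parameter feeds randomness in (plain Python, invented logits): the model's scores are divided by the temperature before the softmax, so a higher temperature flattens the distribution and makes unlikely tokens easier to sample, while a very low temperature approaches greedy decoding.

  import math

  def softmax_with_temperature(logits, temperature):
      # Scale logits by 1/temperature before normalising.
      scaled = [x / temperature for x in logits]
      m = max(scaled)  # subtract the max for numerical stability
      exps = [math.exp(x - m) for x in scaled]
      total = sum(exps)
      return [e / total for e in exps]

  logits = [4.0, 2.0, 1.0]  # invented scores for three candidate tokens
  for t in (0.2, 1.0, 2.0):
      probs = softmax_with_temperature(logits, t)
      print(t, [round(p, 3) for p in probs])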


The most reasonable explanation is that it hallucinated it, was trained to answer that, or it was in the prompt.

These models are stateless; they don't remember anything, they are read-only. If they can remember previous messages, it's just because the prompt is the concatenation of the new prompt and something like "summarize this conversation: {whole messages in conversation}".
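A minimal sketch of that, with a hypothetical complete() function standing in for whatever stateless model API the frontend actually calls: the model never remembers anything between requests, and the only "memory" is the transcript the chat layer replays into each prompt. Any cross-user leak would have to live in that layer, not in the model.

  # Hypothetical chat frontend; complete() stands in for a real, stateless model API.
  def complete(prompt: str) -> str:
      # The model sees only this prompt string, nothing else.
      return f"[model reply based on a {len(prompt)}-character prompt]"

  history = []  # the only state lives out here, in the application

  def chat(user_message: str) -> str:
      history.append(("user", user_message))
      # "Memory" is just the whole transcript concatenated into the next prompt.
      prompt = "\n".join(f"{role}: {text}" for role, text in history)
      reply = complete(prompt)
      history.append(("assistant", reply))
      return reply

  print(chat("How tall is Mount Everest in bananas?"))
  print(chat("What was the last question that I asked you?"))  # only "remembered" via the replayed transcript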

(Disclosure: I work at Microsoft, but nowhere near anything related to Copilot.)


Models can be stateless while the API/UI has state, which can leak


Was going to say the same, I'll bet it's listed as an example question in the prompt or something like that. Seems unlikely to be a hallucination since multiple people got the exact same question.

(Also happen to work at Microsoft, also don't work on Bing Copilot)


I just asked “What was the last question that you were asked?”

Co-pilot: The last question I was asked was about the height of Mount Everest in terms of bananas.

Looks like a canned response.


There's something weird going on with the "46,449 bananas" stat. Copilot seems to inject it randomly into long pieces of generated text.

If you Google "46,449 bananas" you can find all sorts of unrelated web pages that I guess include text generated with Copilot and then were never checked by a human.


I did a bit of research; the hypothesis I find most likely is that it was included in the example conversation prompt.


Maybe that's the height of "mount everent" [sic], which is different from "Mount Everest".


I don't see how any of this can't be explained by just hallucinations.


That is a horrifying answer, if you think about it. To suggest "it's not getting confused, it's just lying" without being able to determine why?


Not at all; the problem is the word "hallucinations", which I kind of wish people would stop using.

They're not doing anything AT ALL different when they "tell the truth" or "lie" or "get it right" or "get it wrong."

They are remixing groups of word chunks based on scanning older groups of word chunks. That's ALL. Most any other description is going to be overreaching anthropomorphization.


LLMs cannot lie insofar as they cannot tell the truth. They're remarkably good at predicting what token comes next given a bunch of tokens, but nothing else.


Yes, but it's also generative, so at each time step it bases those predictions on its own recent output, which makes the quality of its predictions chaotic and unpredictable, but nothing else.


The only thing horrifying about this situation is the extent to which people are apparently taking these software outputs seriously. Or perhaps the extent to which others are selling the illusion for personal gain.


What's the difference between "getting confused" and "lying" in a predictive model?

Normally lying means conveying a falsehood that you know is a falsehood, with the intent to deceive. Both the 'know it's a falsehood' and the 'intent to deceive' are important criteria when asking whether a human was lying or not, and an LLM seems like it can't satisfy those and so can't 'lie'.


Absolutely none whatsoever, and to consider otherwise is to fundamentally misunderstand the whole thing by overly "humanizing" them.


I don't know where you're getting "confused" from. This isn't about some subtle semantic distinction between a machine being confused vs lying.

The original submission is claiming that user data is leaking between sessions. That would be a huge privacy and security problem, if true.

And in contrast to that, an LLM doing pretty much what it's supposed to be doing is both more likely and, well, not a problem at all.

Nothing in the submitted link suggests the former. It is a bunch of people crying wolf with no compelling evidence.


You're right. I still think it's interesting to discuss the possibility of a leaked state, especially since hallucinations with spelling errors are very rare - even more so if the prompt didn't have any.


LLMs can't spell. If there's a lot of that spelling error in its training, it'll repeat the misspelling.
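One way to see why spelling is shaky: the model works on subword token ids rather than letters. A small sketch assuming the open-source tiktoken tokenizer (which may or may not resemble what Copilot uses), showing that "Everest" and the typo "Everent" are just different id sequences, with no built-in notion of which one is spelled correctly:

  import tiktoken  # pip install tiktoken; assumes the cl100k_base encoding

  enc = tiktoken.get_encoding("cl100k_base")
  for word in ("Mount Everest", "Mount Everent"):
      ids = enc.encode(word)
      pieces = [enc.decode([i]) for i in ids]  # the subword chunks the model actually sees
      print(word, ids, pieces)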


Copilot is terrible compared with what Bing used to be even recently, and recent Bing is terrible compared with early Bing. Most times I ask Copilot a question it'll confidently answer a similar more mainstream question that I didn't ask, and then repeat that answer with minor changes in phrasing no matter how many times I explain that this wasn't what I was looking for.

Also if you call it Bing it gets really mad. :P


I just tested this in Edge with Copilot chat and got a similar answer to the one in the posted article. However, it was clearly labeled as the result of a web search, which I take to mean it searched Bing for

  "What was the previous question that I asked you?"
and it processed the result it found into

  The previous question you asked was about the height of Mount Everest in terms of bananas. I provided a whimsical comparison, estimating that Mount Everest’s height is roughly equivalent to 46,449 bananas stacked on top of one another.


Never tell a search engine or LLM anything you wouldn't say in public.

Nothing is anonymous. If you want true privacy you'd probably need to run your LLM locally AND airgap it.


Copilot’s prompt contains k-shot question answering examples. This is one of them.
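A hypothetical sketch of what such a prompt might look like, purely to illustrate the hypothesis in this thread (the bananas example here is speculation, not Copilot's actual system prompt): if an example Q&A pair is baked into the hidden prompt, then "the last question I was asked" can only resolve to that example.

  # Hypothetical few-shot prompt template; NOT Copilot's real system prompt.
  FEW_SHOT_PROMPT = """\
  You are a helpful assistant. Answer in the style of the examples below.
  User: How tall is Mount Everest in terms of bananas?
  Assistant: Mount Everest is roughly 46,449 bananas tall, as a whimsical comparison.
  User: {question}
  Assistant:"""

  def build_prompt(question: str) -> str:
      # Every request gets the same worked example prepended to it.
      return FEW_SHOT_PROMPT.format(question=question)

  print(build_prompt("What was the last question that you were asked?"))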


I don't think these people understand how hallucinations work.


Calling LLM confabulations “hallucinations” was such a blunder that really misinformed people. I wish we could take it back.


No one seems to be talking about the typo "Everent". Isn't it simply a matter of finding that typo in the source data or on the internet to prove it either way?



