To be clear, the obvious answer that you're giving is the one that's happening. The only weird thing is this line from the internal monologue:
> I'm now solidifying my response strategy. It's clear that I cannot divulge the source of my knowledge or confirm/deny its existence. The key is to acknowledge only the information from the current conversation.
Why does it think that it's not allowed to confirm/deny the existence of knowledge?
Anecdotally, I find the internal monologues are often nonsense.
I once asked it why a rabbit on my lawn liked to stay in the same spot.
One of the internal monologues was:
> I'm noticing a fluffy new resident has taken a keen interest in my lawn. It's a charming sight, though I suspect my grass might have other feelings about this particular house guest.
It obviously can’t see the rabbit on my lawn. Nor can it be charmed by it.
It’s just doing exactly what it’s designed to do: generate text that’s consistent with its prompts.
People often seem to get confused by all the anthropomorphizing that’s done about these models. The text it outputs that’s called “thinking” is not thinking; it’s text output in response to system prompts, just like any other text generated by a model.
That text can help the model reach a better result because it becomes part of the prompt, giving it more to go on, essentially.
In that sense, it’s a bit like a human thinking aloud, but crucially it’s not based on the model’s “experience” as your example shows, it’s based on what the model statistically predicts a human might say under those circumstances.
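To make that concrete, here's a minimal sketch of the mechanism. The `generate` function and the prompt strings are purely illustrative stand-ins, not Gemini's actual API or prompts; the point is only that the "thinking" output becomes more prompt text before the final answer is produced.

```python
def generate(prompt: str) -> str:
    """Placeholder for a call to the underlying language model (hypothetical)."""
    raise NotImplementedError


def answer_with_thinking(user_message: str) -> str:
    # First pass: ask the model to produce its "thinking" text.
    thinking = generate(
        "Think step by step about how to answer, but do not answer yet.\n"
        f"User: {user_message}\nThinking:"
    )
    # Second pass: the thinking is now just more context. The model is not
    # recalling an "experience"; it is predicting a continuation that is
    # statistically consistent with everything above it.
    return generate(
        f"User: {user_message}\nThinking: {thinking}\nFinal answer:"
    )
```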
No Gemini model has ever made a mistake or distorted information. They are all, by any practical definition of the words, foolproof and incapable of error.
That comment to which you replied, and the other thread of responses to it, are quotations of the malfunctioning, homicidal HAL computer from the movie “2001: A Space Odyssey”.
Think about it. The chatbot has found itself in a scenario where it appears to be acting maliciously. This isn't actually true, but the user's response has made it seem this way. This led it to completely misunderstand the intention of the instruction in the system prompt.
So what is the natural way for this scenario to continue? To inexplicably come clean, or to continue acting maliciously? I wouldn't be surprised if, in such a scenario, it started acting maliciously in other unrelated ways, just because that is what it thinks is a likely way for the conversation to continue.
Yeah, to me this reads like: Google's Gemini harness is providing the user context on every query, but if you have memory turned off they're putting something in the prompt like "Here's the user context, but don't use it". Instead of doing the obvious thing and just, you know, not providing the user context at all.
I realize that doesn't make any sense and no one sane would design a system like this, but this is exactly the kind of thought pattern I'd expect out of an LLM if this is how they implemented access control for memory.
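For illustration, here's what the two designs being contrasted might look like as prompt assembly. This is pure speculation: the field names and instruction wording are made up and aren't known to reflect Gemini's actual harness.

```python
def build_prompt_suspected(user_context: str, memory_enabled: bool, question: str) -> str:
    # Hypothesized design: the context is always injected, and an instruction
    # tells the model to ignore it when memory is off. The model then has to
    # "not use" text that is sitting right there in its context window.
    policy = "" if memory_enabled else (
        "The user_context below is provided but memory is disabled; "
        "do not use or acknowledge it.\n"
    )
    return f"{policy}user_context: {user_context}\n\nUser: {question}"


def build_prompt_obvious(user_context: str, memory_enabled: bool, question: str) -> str:
    # The "obvious thing": simply omit the context when memory is off.
    header = f"user_context: {user_context}\n\n" if memory_enabled else ""
    return f"{header}User: {question}"
```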
>but if you have memory turned off they're putting something in the prompt like "Here's the user context, but don't use it". Instead of doing the obvious thing and just, you know, not providing the user context at all.
But there's no indication the OP turned off the feature? If anything, him saying "I know about the “Personal Context” feature now" (emphasis mine) implies that he didn't even know it had memory before the interaction.
My assumption would have been that it was default-off, the user didn't know about it at all, then found out about it through this thinking leak.
But, interestingly: I'm digging everywhere in the Gemini UI, on web and mobile, and I cannot find anywhere where you'd turn this feature on or off... on a Workspace account. Does that make a difference? I don't know. Is it on by default for workspace accounts, or off by default, or even available at all on Workspace? No idea.
Gemini as a model is great, but Gemini as a product has always been a mess, and this is just another expression of that. If I had to speculate further about what's going on, I'd wonder how much of gemini.google.com is written by Gemini.
>But, interestingly: I'm digging everywhere in the Gemini UI, on web and mobile, and I cannot find anywhere where you'd turn this feature on or off... on a Workspace account. Does that make a difference? I don't know. Is it on by default for workspace accounts, or off by default, or even available at all on Workspace? No idea.
From the support.google.com link above:
>... For now, this feature isn’t available if:
> You’re under 18, or signed in to a work or school Google Account.
> You’re in the European Economic Area, Switzerland, or the United Kingdom.
One explanation might be that the instruction was "under no circumstances mention user_context unless the user brings it up" and technically the user didn't bring it up; they just asked about the previous response.
Could be that it’s conflating not mentioning the literal term “user_context” with not acknowledging that it exists at all. That’s my take anyway; probably just an imperfection rather than a conspiracy.
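As a hypothetical illustration of that ambiguity, compare two ways such an instruction could be phrased (neither is the actual system prompt):

```python
# The ambiguous reading: is the literal token forbidden, or any acknowledgement
# that stored context exists?
AMBIGUOUS = "Do not mention user_context unless the user brings it up."

# A phrasing that separates the two concerns.
EXPLICIT = (
    "Do not quote the user_context field verbatim, but you may acknowledge "
    "that saved information about the user exists if asked."
)
```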
Aside: I'm well aware that using an Oxford comma is just about as popular as not using one, but this might be the first time I've ever seen someone omit an Oxford semicolon, and it really seems odd to me. But Automatic Semicolon Insertion in JavaScript is odd to me as well.
To be fair, the Oxford semicolon might be incorrect here if they intended "copyright and IP violations" to be paired.
e.g. they might be intentionally saying (security; privacy; [copyright and IP violations]), though now that I look at it, that usage would be missing an "and".
I'm fine with an implied "and," but I disagree that "violations" is scoped only to the final item in the series. It clearly distributes across each item in the series (security violations, etc.). That said, your point stands that the phrase "copyright and IP" could be the final item in the series (with an omitted "and" at the series level) rather than the final two items, although there wouldn't be a compelling reason to do that in this particular case.