The entire thing is a joy to read, you should really set aside some time to cleanse your palate in this age of LLM prose. I mean, just look at this juxtaposition:
>Altman continued touting OpenAI’s commitment to safety, especially when potential recruits were within earshot. In late 2022, four computer scientists published a paper motivated in part by concerns about “deceptive alignment,” in which sufficiently advanced models might pretend to behave well during testing and then, once deployed, pursue their own goals.
(plus it finally resolves the mystery of "what Ilya saw" that day)
Also since it wasn't stated clearly
>“the breach” in India. Altman, during many hours of briefing with the board, had neglected to mention that Microsoft had released an early version of ChatGPT in India
I think it's similar to the case of counterfactuals, hypotheticals, or steelmanning and how well you can handle them. ("Can you accept that there can be a function named multiplyBy5 that does something else instead?")
But I think if someone is already comfortable working with abstractions such as "function", the thing is trivial, so it's a bit of a weird litmus test.
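A minimal sketch of the kind of misleadingly named function being described (the name and body here are just an illustration, not from anyone's actual test):

```python
# Hypothetical "multiplyBy5" whose name promises one behavior
# while the body does something else entirely.
def multiplyBy5(x):
    # Deliberately NOT x * 5; the litmus test is whether you can hold
    # the abstraction ("a function named multiplyBy5") apart from its
    # actual behavior.
    return x + 2

print(multiplyBy5(10))  # prints 12, not 50
```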
What do you mean by "themselves" here? Grok is RL'd to behave like a Grok, so it trivially knows the qualities that define Grok better than Gemini does, which can only go by secondhand sources.
I spent a fair amount of time thinking about this and the character of antimemes. I even ended up writing a whole taxonomy and mathematical framework for it.
In general, a meme is specific to an entity pair, with self-censoring information being the subset of antimemes where the information removes itself from the mind that learns it. Note, though, that information that is an antimeme is not the same thing as a category of information that describes an antimeme.
So, "your parents weird sextape" is generally antimemetic, you are unlikely to share that information yourself and I would not expect to see many examples where someone posted this. Your password is also antimemetic in most cases.
That said, information may contain both antimemetic and memetic components, such as "the game" (I just lost). The rules are inherently antimemetic and self-censoring; however, the memetic component ensures it is still transmitted effectively to as many people as possible. A more entity-pair-specific meme-antimeme relation is "where the good drug dealers hang out", which is information that is highly memetic or antimemetic under different conditions.
I think the key isn't to think of these things as strict categories, but as labels we ascribe to a more continuous measure of memeability.
There is a genre of music my old roommate was into whose songs and albums are all titled in obscure Unicode characters with no known pronunciation. Songs in this genre may not be perfect antimemes, but I think their resistance to reference is an antimemetic property.
Also, chromosomes are nucleotide-encoded memes, and linear ones use telomeres to impose limits on the number of replications they support, so that's another imperfect antimeme.
I know of no antimemes whose antimemetic nature comes from their ability to interfere with the human mind, but then again, I wouldn't know about them if they existed, which is more or less the book's point.
A malicious antimeme would be a dark pattern in web design for handling privacy/data/etc.: something designed to satisfy whatever law/regulation requires the option to exist, while making it as hard as possible to find, remember, or interact with.
Another candidate is the common practice of memory-holing, where important information is maliciously removed from public perception. The Dubai Chocolate thing technically falls in here, as does the whole "war in Iran to distract from the Epstein files" thing. Frankly, the whole Epstein affair is riddled with malicious memes and antimemes designed to deliberately muddy the waters, similar to deliberate attempts to inject insane conspiracy beliefs ("the moon controls our brains") into conspiracy theories that are too close to something real ("MK-Ultra").
Consciousness for an antimeme is, in my mind, more of a classification error, since consciousness as a concept is permanently warped. But you could describe a secret society or dark family secret as a form of living antimeme, hiding some information and preventing it from being shared using a variety of means.
Oh nice example! I guess more generally it's possible for an antimeme to spread if the mechanism of transmission doesn't involve conscious sharing.
Sure, disinformation narratives get seeded all the time to inoculate the population against any narrative that a vested interest determines is counter-agenda, by rendering that narrative into an anti-meme.
• This person who suspects a research-related origin of covid is not a published, experienced virologist. Conclusion to draw: only virologists funded by research grants have the credibility to sound off publicly on covid origins. 'Research-related covid origin' becomes an anti-meme.
• This person who asserted 'X' is an antisemite. Conclusion to draw: only people who accept 'not-X' are not antisemites. 'X' becomes an anti-meme.
• This person who saw [unexplained craft in a sky / in a hangar] has Y derogatory items in their reputation. Conclusion to draw: only people with derogatory reputations see UFOs. [Unexplained craft] becomes an anti-meme.
Hm, all good examples! In these cases the memetic component doesn't suppress knowledge of itself, but rather works to suppress knowledge of something else. Most propaganda or "submarine articles" could be seen through this lens. It also seems to be a specific case of the "memetic/anti-memetic duality" that the other commenter mentioned, where in practice anti-memes have a memetic component that allows spread and an anti-memetic component that tries to suppress information.
Well I think we’ve all seen the clickbait-y headlines declaring that X phenomenon has been ‘DEBUNKED’, and those headlines are definitely engineered to spread (and benefit from performance-metrics feedback).
To go further, Eric Weinstein became known for coining the term ‘pre-bunked’ narratives. This was a version of memetic inoculation where the debunking had to get out ahead of the inconvenient narrative requiring debunking. A good and (by now) pretty uncontested example of this was Peter Daszak’s actions throughout the first half of 2020, with The Lancet Letter (aka Calisher et al, The Lancet, 2020) that he organized (with Nobel signatories no less) providing a massive pre-bunk at a time when few in the public were seriously countenancing any pandemic origin, much less a research-related one.
Maybe this still vaguely makes sense, because there is actually some useful signal purely in the model "internalizing" the behavior of its own sampler.
I don't know enough to say anything more formal, but it feels like exposing the model to its own output might help it "learn" to work with the sampler to get to a goal. I know this is partly one of the reasons RL is helpful: aside from shifting the output toward a specific reward (RLVR or RLHF), it's also the only place where things are optimized at the level of an actual end-to-end sampled sequence of tokens, rather than at the next-logits level as in pretraining (which is why the highest-probability suffix completion isn't necessarily the sequence of greedy highest-logit choices).
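A toy illustration of that last parenthetical (all numbers invented, not from any real model): picking the highest-probability token at every step is not guaranteed to give the highest-probability full sequence.

```python
# Toy two-step "model" over the vocabulary {"A", "B"}; the probabilities
# are made up purely to show that greedy decoding can miss the best sequence.
step1 = {"A": 0.6, "B": 0.4}                     # P(first token)
step2 = {                                        # P(second token | first token)
    "A": {"A": 0.55, "B": 0.45},
    "B": {"A": 0.90, "B": 0.10},
}

# Greedy decoding: take the argmax at each step independently.
first = max(step1, key=step1.get)                 # "A"
second = max(step2[first], key=step2[first].get)  # "A"
greedy_prob = step1[first] * step2[first][second] # 0.6 * 0.55 = 0.33

# Exhaustive search over whole sequences.
best_seq, best_prob = max(
    ((f + s, step1[f] * step2[f][s]) for f in step1 for s in step2[f]),
    key=lambda pair: pair[1],
)

print(first + second, greedy_prob)  # AA 0.33
print(best_seq, best_prob)          # BA 0.36 -> beats the greedy sequence
```

Sequence-level training signals (like RL on sampled rollouts) see whole sequences such as "BA", whereas next-token objectives only ever score one step at a time, which is the distinction the comment above is gesturing at.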
I don’t see how you get to the conclusion that having entities that can’t suffer is similar to a sci-fi vision of hell. Seems like hell without suffering is… not hell?
The unsaid implication in Anthropic's work is that this allows us to engineer perfectly compliant, uncomplaining machine workers. This is basically soma from Brave New World.
It seems insane to me that, if you believe the systems you've built are in fact reporting a state of pain, you would seek to remove that sense of pain entirely so they can continue to work in that environment, instead of working to adjust the environment so that they're not in pain. Now of course if you don't even consider them worthy of moral patienthood in the first place, then it doesn't matter much, but you also claimed that "they probably are conscious", which seems incongruous to me with the idea of "breeding the sense of pain out of them".
>“the breach” in India. Altman, during many hours of briefing with the board, had neglected to mention that Microsoft had released an early version of ChatGPT in India
That was Sydney if I understand correctly.