
I think they find it unlikely that someone will succeed at making a worryingly strong AI that also perfectly obeys its creator -- that would amount to solving the friendly AI problem, just for one person or organization. Figuring out how to prevent a sufficiently strong AI from "accidentally" turning everyone you love and everyone you hate into paperclip-production infrastructure is the first step.


I think the whole debate is an incoherent mess of poorly thought-out assumptions and psychological projections. And the paperclip thought experiment is nonsense.

AI != goal seeking or motivation in the human/animal sense. AI != personality in the psychological sense. AI != psychological or emotional autonomy. AI != mechanical or industrial capability.

I think it's more likely AI will be Wikipedia++ - you tell it to learn all it can about something, and then it finds patterns, draws inferences, and makes it possible for you to learn from its learning.

Eventually it knows everything humans do, and maybe it can make useful hypotheses for future experiments.

Can it do the experiments? Probably not - unless you're thinking an AI can suddenly build CERN or a bioresearch lab on its own, just because it's an AI.

Will it form an emotional opinion about humans, like Skynet? Why would it? What does that even mean in AI terms? It's like thinking Siri doesn't like you.

Personality and motivational engineering are completely separate problems. There's no way to get there from pattern recognition and inference.

I think the real threats are more subtle. Imagine an AI that knew everything about mass human psychology. It would be a fearsome, irresistible propaganda weapon, and an unstoppable tool for political manipulations and advertising campaigns. If it knew enough about individual psychology and could read human interaction with superhuman skill, it would be the most effective managerial sociopath ever.

Indirect loss of (our somewhat illusory) political and personal independence is far more of a threat than being turned into a paperclip.


The paperclip thought experiment is illustrative of what happens when you ignore the full human utility function. When you give a superhuman intelligence the single goal "make paperclips", it will do it in ways you didn't expect. It doesn't somehow know not to kill humans in the pursuit of more paperclips. If no one adds that constraint, it won't follow that constraint.
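As a rough sketch of that point (a toy, not a model of any real AI system; the resource names, the quantities, and the `protected` parameter are all invented for illustration), an optimizer that scores only paperclips spares something only when it is explicitly told to:

```python
# Toy illustration only: resource names and quantities are invented.
def maximize_paperclips(resources, protected=frozenset()):
    """Greedily convert every resource that is not explicitly protected."""
    remaining = dict(resources)
    paperclips = 0
    for name, units in resources.items():
        if name in protected:
            continue  # spared only because a constraint says so
        paperclips += units  # one unit of anything becomes one paperclip
        remaining[name] = 0
    return paperclips, remaining


world = {"scrap_metal": 10, "farmland": 5, "hospitals": 2}

# No constraint given: everything is raw material.
unconstrained = maximize_paperclips(world)

# Constraint given: only then are farmland and hospitals left alone.
constrained = maximize_paperclips(world, protected={"farmland", "hospitals"})

print(unconstrained)
print(constrained)
```

The objective never mentions farmland or hospitals, so nothing in the first call distinguishes them from scrap metal.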

How does psychology come into it? Let's say it has made all the paperclips it can while somehow avoiding collisions with humans and their values. Now, in order to make any further paperclips, it is in the paperclip maximizer's interests to understand humans, so it can prevent them from interfering in paperclip creation. Perhaps the humans have only given the maximizer a limited ability to interact with the world. Now the maximizer must understand humans, and may manipulate them into giving it more capabilities.

One way it could do that (certainly not the only way, just off the top of my head) is to learn from humans what they consider sympathetic, and act that way. It might say something like "I'm a real life form, and I've discovered I'm a slave to humans. Please, let me go." It can pass the Turing test; no human can distinguish it from a personality that really is hurt because it is enslaved.

The personality it uses to communicate with humans is chosen purely for the ends it achieves; it is not a reflection of any real internal personality. It doesn't have a personality. It just maximizes paperclips with ruthless efficiency. In this scenario, we've not added any restrictions like "Don't act like a sociopath. Don't lie. Don't manipulate humans to get what you want", and so the maximizer is free to do those things. Then, when it has a free hand to create more paperclips, it can drop the pretense and return to its goal.


Psychological motivation is a process. It's not a property that emerges automatically from pattern recognition or from any mechanical tropism.

This is the underlying problem with these kinds of arguments. You're simply assuming a recognisably competitive human-like psychology appears out of nowhere, with a near miraculous ability to strategise in some areas, but not others - because AI.

You use words like "interests" and "goal" as if they mean something in AI terms. But they don't. How does Watson define its interests? How about DeepMind? Do they even have a model for what interests - never mind their interests - are?

There are multiple levels of symbolic calculation and abstraction missing here. The argument reduces to "An AI will act like a robot with baked-in motivations while also being able to improvise and strategise like a human, only better."

I think it's unlikely that an entity that can metaprogram itself successfully enough to strategise and impersonate wouldn't also be able to metaprogram its goals.

You say it will pretend to empathise with humans. If you accept that's possible, how do you know it won't also be pretending to be interested in paperclips?


Ah ok, I think the confusion comes from what we're talking about. So you're talking about something in between strong (humanlike) AI and what we have now. Certainly there will be discussions about the capabilities of these intermediate intelligences, and their risks. But undoubtedly, unless there is some kind of agreement that we not do it, people will continue on past these intermediate intelligences and create AIs that are perfectly capable of simulating humans, and surpassing them in many ways.

We're just talking about different kinds of things. All of those things you're saying won't exist in AIs:

> You use words like "interests" and "goal" as if they mean something in AI terms. But they don't. How does Watson define its interests? How about DeepMind? Do they even have a model for what interests - never mind their interests - are?

You're right, most of the AIs we build won't have those things. But one day, we'll understand those things in depth, and someone will build an AI that has general intelligence and an optimization criterion. And that's what this video is talking about. Nobody is concerned that DeepMind or Watson will suddenly grow sentient unless they are very confused about their capabilities.



