Look at how Google does spell checking: it's not based on dictionaries; it's based on word usage statistics of the entire Internet, which is why Google knows how to correct my misspelled name and Microsoft Word doesn't.
It's also why Google knows "Jews should be wiped out" and "Muslims should be exterminated" and "blacks are ruining America" and "whites are neanderthals", all suggestions based on the first two words of each phrase. Yes, I'm provoking Google - but surely people also encounter these by accident.
If Microsoft shipped a version of Office that suggested any of the above as "corrections," they would be lambasted for it, and rightly so. Why does Google get a pass? Is it because our standard of decency is so much lower on the Internet? Or is it because we know that Google is merely reflecting popular sentiment on the Internet, and so the true villain is ourselves?
(In fairness, Bing happily suggests equally monstrous ideas.)
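For the curious, the mechanics are roughly this: generate candidate strings near the typo and rank them by how often they were actually seen in real text. The sketch below is a toy illustration of that idea only (the frequency table and the single-edit candidates are my own simplifications; Google's actual system obviously also uses query logs, context, and much more):

    # Toy sketch of statistics-based correction (Norvig-style), not Google's pipeline.
    # CORPUS_COUNTS is assumed to be a word -> frequency table built from crawled text.
    from collections import Counter

    CORPUS_COUNTS = Counter("the usage statistics of the entire internet go here".split())  # stand-in

    LETTERS = "abcdefghijklmnopqrstuvwxyz"

    def edits1(word):
        # All strings one edit (delete, transpose, replace, insert) away from word.
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [a + b[1:] for a, b in splits if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
        replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
        inserts = [a + c + b for a, b in splits for c in LETTERS]
        return set(deletes + transposes + replaces + inserts)

    def correct(word):
        # Pick the most frequently observed candidate; no dictionary involved.
        candidates = ({word} & CORPUS_COUNTS.keys()) or (edits1(word) & CORPUS_COUNTS.keys()) or {word}
        return max(candidates, key=lambda w: CORPUS_COUNTS[w])

    print(correct("statistcs"))  # -> "statistics", because the corpus has seen that word

A word never seen in any dictionary still gets corrected, as long as enough people have written it; that's the whole trick, and also the whole problem.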
Because unethical sentences aren't a criminal offence? I mean, if you gave me the first two words from those sentences and asked "what are the most probable next words", I would give you pretty much the same answer. Let me try some more and put my answers next to google's:
"gays should" -> "be killed" (actual suggestion: be killed)
"marijuana should" -> "be legal" (actual suggestion: be legalized)
"drug users should" -> "get help" (actual suggestion: be shot)
"macs are" -> "better than windows" (actual suggestion: better than pcs)
As you can see, google's guess at the largest cluster of opinions on the subject aligns with what I expect in 4 out of 5 cases. Also note how you're getting both the suggestion that marijuana should be legal and that people who smoke it should be shot. These suggestions aren't google's opinion on the matter; they're what google expects you to think. Thankfully, they're wrong most of the time.
In the end, it's not impossible to write an ethical filter the same way google has a spam filter; the only reason they haven't done it is that there's no pressure to do so. If you don't want to see such suggestions on google, you have two options:
1) Make people not talk about killing gays or how all lawyers are scum
2) Pressure google into actually building that ethical filter
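To make the "ethical filter" idea concrete, here's a rough sketch of what I mean. The patterns and the pass/fail check below are entirely my own stand-ins; a real system would presumably use a trained classifier, the same way spam filters do, rather than a hand-written list:

    # Illustrative sketch of filtering autocomplete suggestions before showing them.
    # SENSITIVE_PATTERNS and the rule-based check are made-up stand-ins; a real
    # system would presumably use a trained classifier, like spam filters do.
    import re

    SENSITIVE_PATTERNS = [
        r"\bshould be (shot|killed|exterminated|wiped out)\b",
        r"\bare ruining\b",
    ]

    def is_acceptable(suggestion):
        # Reject any suggestion matching one of the sensitive patterns.
        return not any(re.search(p, suggestion, re.IGNORECASE) for p in SENSITIVE_PATTERNS)

    def filter_suggestions(candidates):
        # Keep only suggestions that pass the filter, preserving their ranking order.
        return [s for s in candidates if is_acceptable(s)]

    print(filter_suggestions(["drug users should get help",
                              "drug users should be shot",
                              "marijuana should be legalized"]))
    # -> ['drug users should get help', 'marijuana should be legalized']

The hard part isn't the filtering step itself, it's deciding what belongs on the list, which is exactly why nobody does it without pressure.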
It's not necessarily what google expects you to think, but rather what you are most likely to be searching for.
Sometimes people search for content that they might not agree with, because they want to see what is being said there out of curiosity. Not every search is someone submitting their opinion to google, I'd expect that most are not.
You're right, I didn't phrase that right. I should have written "what google expects you to think of", like you said. Still, isn't that the same as "if we divide people into groups based on what they think about topic A, what does the largest group think?" Which to me sounds the same as "what you're most likely to be thinking".
All the people who want the addicts dead, let's say they're 30% of all people who think seriously about drug addicts, will happily rally under "should be shot", while the ones who want them rehabilitated would form many smaller groups around specific kinds of rehabilitation programs, how those should be administered, and what the best program for fixing these people really is. Though, when you look at it like that, you're really most likely to think "should be rehabilitated" or maybe "should... I don't really have an opinion one way or the other". But then, if google actually did high-level clustering, that is, extracting opinions that are all at the same level of specificity, would those suggestions be useful for a search engine?
I guess the really right way to put it is -- That's what the google crawler has seen written most frequently -- and assume it doesn't really mean what you or I think about things.
> I guess the really right way to put it is -- That's what the google crawler has seen written most frequently -- and assume it doesn't really mean what you or I think about things.
Not what the crawler has seen most, but what people typing the same thing as you have ended up searching for most frequently. (We may be thinking the same thing and just confusing the words.)
I don't believe it's supposed to be "what you're most likely to be thinking"; it's just a commonly searched-for phrase. I don't think Google's trying to autocomplete with your opinion, because people aren't just searching for their own opinion; they're searching for words that will hopefully return the information they want.
Exactly. The "drug users" one, for example, leads to an article explaining how a police officer said that. Anyone hearing secondhand about that story and wanting to learn more about the incident would probably Google that phrase.
There's a difference between suggesting corrections (like Office) and trying to guess your next word, though. Google does both, but unless you misspell "blacks are ruining America" it's not going to suggest that as a correction. Since they apparently expect it to be searched, they suggest it as you're typing, but I don't think Office does any sort of word prediction as you type? As you said, Bing does the same. The standard is higher for Office and other actual spell checkers because they shouldn't be changing something that isn't meant to say "black people are ruining America" into that. Prediction is entirely different.
Very neat. I wonder if Office online does the same using Bing's predictive features. It's still not the same as filling in "are ruining America" if you type "black men", though; Google Docs doesn't do prediction like that. It's 'just' a way more clever spell checker. My point is that comparing Google search to Office isn't a sensible comparison to make, and the fact that Google's version of Office behaves similarly to Office, while Microsoft's search works similarly to Google in prediction/spell checking, is pretty much exactly what I was getting at.
It would be neat to see prediction as a feature in Office/Docs etc., though; there's a pretty huge corpus of essays available, and it'd be interesting to see how accurately a new one can be predicted.
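At its crudest, that kind of prediction is just counting which word tends to follow which. The tiny bigram sketch below is purely illustrative (the corpus string is a placeholder, and a real feature would train on an enormous document collection and use far more context than one preceding word):

    # Crude sketch of next-word prediction from a corpus using bigram counts.
    # The corpus string is a placeholder; a real feature would train on a huge
    # collection of documents and use much longer context than one word.
    from collections import Counter, defaultdict

    def train_bigrams(text):
        # Count how often each word is followed by each other word.
        words = text.lower().split()
        counts = defaultdict(Counter)
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
        return counts

    def predict_next(counts, word, k=3):
        # Return the k continuations seen most often after `word`.
        return [w for w, _ in counts[word.lower()].most_common(k)]

    corpus = "the cat sat on the mat and the cat ate the fish"  # stand-in corpus
    model = train_bigrams(corpus)
    print(predict_next(model, "the"))  # -> ['cat', 'mat', 'fish']

Feed it a pile of student essays instead of that toy string and you'd get a rough sense of how predictable the next essay really is.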
Both. Office has the enterprise as its target market, people know this, and they set their expectations in that context; the same thing happens on the Internet, where the context is "everyone in the world" and it's assumed that most things are popularity-driven (likes, re-tweets, etc).