> I think it's also important to remember a lot of human models were abandoned because of the inherent bias

The industry needs to stop misusing the term bias this way. Virtually every attempt to find this supposed human bias has failed; the latest public example was Amazon's hiring tool[1].

Bias is the tendency to consistently misclassify toward a certain class, or to consistently over- or under-estimate.
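
In that sense, bias is a measurable property of an estimator. A minimal sketch in Python (synthetic numbers, nothing to do with any particular system): the ddof=0 variance estimator consistently underestimates, the ddof=1 one does not.

    # Statistical bias: one estimator systematically undershoots.
    import numpy as np

    rng = np.random.default_rng(0)
    true_var = 4.0
    biased, unbiased = [], []
    for _ in range(10_000):
        sample = rng.normal(0.0, true_var ** 0.5, size=5)
        biased.append(sample.var(ddof=0))    # divides by n
        unbiased.append(sample.var(ddof=1))  # divides by n - 1
    print(np.mean(biased))    # ~3.2: systematically below 4.0 -> biased
    print(np.mean(unbiased))  # ~4.0: unbiased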

Somehow the term has been hijacked to mean 'discriminates on factors that are politically incorrect'. You can have a super racist model that's bias-free, and most models blinded to protected factors are in fact statistically biased.

It's not constructive to conflate actual bias with political incorrectness.

Operational decision making, whether by AI, human, or statistical model, faces an inherent trilemma: it's impossible to simultaneously treat everyone the same way, have a useful model, and have uniformly distributed, 'bias'-free outcomes. At best a model can achieve two of the three.
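
A toy demonstration of the conflict, assuming calibrated risk scores and made-up base rates (see Kleinberg et al. 2016 and Chouldechova 2017 for the formal impossibility results):

    # Two groups with different base rates. Scores are calibrated by
    # construction (P(y=1 | score s) = s), yet error rates at a fixed,
    # equal-treatment threshold still differ between groups.
    import numpy as np

    rng = np.random.default_rng(1)

    def simulate(alpha, beta, n=200_000, thresh=0.5):
        s = rng.beta(alpha, beta, n)   # risk scores for one group
        y = rng.random(n) < s          # outcomes drawn from the scores
        pred = s >= thresh
        fpr = pred[~y].mean()          # false positive rate
        fnr = (~pred[y]).mean()        # false negative rate
        return y.mean(), fpr, fnr

    for name, (a, b) in {"A": (4, 2), "B": (2, 4)}.items():
        base, fpr, fnr = simulate(a, b)
        print(name, round(base, 2), round(fpr, 2), round(fnr, 2))
    # Equal treatment + calibration != equal error rates across groups.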

See: https://www.youtube.com/watch?v=Zn7oWIhFffs

[1] https://www.reuters.com/article/us-amazon-com-jobs-automatio...



Hold up. Bias is being used in two different ways because it has two different meanings. When you use bias in industry in the mathematical sense, you're talking about something like the constant term added to a neuron's weighted sum, or an estimator's systematic error. When we talk about human bias, we aren't. The term was never hijacked; it just has multiple meanings.


I'm not sure he was using any of those established definitions of bias, though. I wouldn't define bias as the consistent act of misclassification.


Yes, mathematicians mean something different and specific when using the word bias. The average non-expert is not misusing the word. They are using the word to express a different -- and far more popular -- meaning.

Neither is wrong, but insisting that a naming clash carries any substantive significance for the underlying issue is just silly. Similarly, insisting that non-mathematicians stop using a certain word unless they use it the way mathematicians do is a tad ridiculous.

If anything, it's more reasonable for mathematicians to change their language. After all, their intended meaning is far less commonly understood.


If an algorithm is clearly sorting on irrelevant criteria, especially a black-box algorithm, we normally assume it's a bug. It's not reasonable to reverse that: to assume the code is incapable of being mistaken and say that obviously irrelevant criteria are somehow correct in some unknown way.

Amazon's problem is a bug; they even describe its nature. And given how flawed their recommendation algorithms are, it's especially unreasonable to assume this one is infallible.

So the linked Reuters article does not show a failure to find bias; if anything, it shows a design error.


You say "obviously irrelevant criteria"

The data says the criterion is a strong latent factor: no matter how hard Amazon tried to blind the solution to it, the ML system kept finding ways to infer it, because it was that strongly correlated with the fitness function.
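
A hedged sketch of that dynamic: drop the protected column entirely, and a correlated proxy feature still lets a model reconstruct it (all data and feature names here are invented):

    # Blinding by deletion fails when a proxy remains in the data.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    n = 20_000
    protected = rng.integers(0, 2, n)          # the attribute we "blind" on
    proxy = protected + rng.normal(0, 0.3, n)  # e.g. a correlated word/hobby feature
    other = rng.normal(0, 1, (n, 3))           # unrelated features
    X = np.column_stack([proxy, other])        # protected column excluded

    X_tr, X_te, y_tr, y_te = train_test_split(X, protected, random_state=0)
    clf = LogisticRegression().fit(X_tr, y_tr)
    print(clf.score(X_te, y_te))  # ~0.95: the "hidden" attribute is recoverable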

This is the difference between political-newspeak 'bias' and actual bias. Amazon scrapped the model despite it performing just fine and being bias-free, because it kept finding ways to discriminate on a protected attribute, which is a PR nightmare in the age of political outrage and cancel culture. It's fine to explicitly decide that some attributes should not be discriminated upon, but this comes at a cost, either in model utility or in discrimination against other demographics. There's no way around this. In designing operational decision-making systems, one must explicitly choose which demographic bears the cost, or not implement the system at all. There's no everyone-wins scenario.

The harm of the newspeak version of 'bias' is that it misleads people into thinking that making system inputs uniform somehow makes the system bias-free, when the opposite is typically true. Worse, it creates the impression that some magical bias-free system can exist in which everyone is treated fairly, even though that has been formally demonstrated to be impossible.

No amount of white-boxing or model transparency will get around this trilemma. The sooner the industry comes to grips with it and learns to explicitly wield it when required, the better.


>No amount of white-boxing or model transparency will get around this trilemma. The sooner the industry comes to grips with it and learns to explicitly wield it when required, the better.

Agreed. The optima of multiple criteria will essentially never coincide.

But for Amazon, there is no evidence the tool was accurately selecting the best candidates; they themselves never claimed it was. After all, altering word choices in a trivial way dramatically affected rankings (toy sketch below). On the points you mention, why should we assume their data was relevant, or that their fitness function was even measuring what they thought? If they were naive enough, they could just have been building something that predicts what the staff of an e-commerce monopoly in a parallel universe would look like.
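
Something like this, with invented weights (the specific word effects echo the Reuters reporting, but the numbers are made up):

    # Hypothetical linear bag-of-words scorer: a trivial word swap
    # swings the score, which is hard to square with "best candidate".
    weights = {"executed": 1.2, "managed": 0.1, "women's": -0.8}  # invented

    def score(text):
        return sum(weights.get(w, 0.0) for w in text.lower().split())

    print(score("managed chess club"))   # 0.1
    print(score("executed chess club"))  # 1.2 -- same activity, higher rank
    print(score("women's chess club"))   # -0.8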

The most likely story is that they failed at what they were doing. Part of that failure happened to be controversial and so got unwanted attention. I would guess the tool "discovered" quite a few implausible correlations that never made it to press.

At any rate, their recommendation engine is more important and has been worked on longer, yet it is conspicuously flawed. When their recommendation tool inspires awe, then maybe we can take their recruiting engine seriously enough to imagine it has found deep socio-psychological-genetic secrets.


Supervised learning algorithms assume the training data are drawn i.i.d. from the same distribution as the future data they will be applied to. This does not hold in most real applications. The observation that we see more men than women in programming does not necessarily generalize to the future. That's why online learning provides an exploration-vs-exploitation mechanism to minimize bias in hindsight. In many applications, people simply forget this strategy and blame the resulting bias of supervised learning on the black-box model.
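
A minimal epsilon-greedy sketch of that exploration/exploitation idea (two arms with made-up reward rates):

    # Epsilon-greedy bandit: keep exploring so estimates track reality,
    # instead of freezing on whatever historical data happened to say.
    import numpy as np

    rng = np.random.default_rng(3)
    true_rates = [0.5, 0.6]          # arm 1 is actually better
    counts = np.zeros(2)
    values = np.zeros(2)             # running mean reward per arm
    eps = 0.1

    for t in range(10_000):
        if rng.random() < eps:
            arm = int(rng.integers(2))    # explore a random arm
        else:
            arm = int(np.argmax(values))  # exploit the current best estimate
        reward = float(rng.random() < true_rates[arm])
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    print(values)  # both estimates converge near the true rates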

Of course, black-box AI itself is not the right solution. As more and more cross-domain, multi-task settings emerge, open-box AI will gradually take off. It is about compositional capability, like functors and monads in functional languages. Explainable or not is just a communication problem, which is parallel to the ultimate intelligence problem. It is very possible that human intelligence is bounded.



