
I worked with Nick a few times in a previous job and he was a reviewer on a paper I wrote; he was really nice and professional, and his feedback was constructive and incisive.

The movement he's part of is fundamentally disagreeable (in the 'big five' sense [0]), and some of its prominent characters, as you might expect, have reputations for being interpersonally difficult [1]. Collectively they've been labeled "data thugs" [2] and accused of "methodological terrorism" [3], which is why I think Nick's fundamental goodness is an especially valuable asset. We'd gain a ton of social value from certain reforms to the scientific research apparatus, and, IMO, being kind when delivering pointed critiques helps a difficult pill go down. That's different than pulling your punches; it just means staying focused on the thing you actually want to change.

[0] https://en.wikipedia.org/wiki/Big_Five_personality_traits

[1] https://medium.com/@OmnesRes/the-dumbest-fucking-thing-ive-e...

[2] https://www.science.org/content/article/meet-data-thugs-out-...

[3] https://statmodeling.stat.columbia.edu/2016/09/21/what-has-h...



> being kind when delivering pointed critiques helps a difficult pill go down

While I am sympathetic to this, I think it fails to recognize a very important thing that we expect, or at least should expect, from people who claim to be doing science: that every scientist is responsible for sanity checking their own work before making any claims based on it. As Richard Feynman said, your first duty as a scientist is to not fool yourself--and you are the easiest person to fool.

That means every scientist needs to be an expert in whatever fields are relevant to the work they're doing. If a scientist is going to make claims based on some purported correspondence between human psychology and fluid dynamics, they need to be an expert, not just in human psychology, but also in fluid dynamics; and if they're not, they need to not make the claims, no matter how enthusiastic they are about them. And the scientists who published the claims that Nick and his colleagues debunked were not experts in fluid dynamics, and knew it, yet they chose to publish anyway.

And that kind of thing, if science as an institution is going to be trustworthy, cannot be handled by kind words when delivering pointed critiques. It has to be labeled as what it is: not just being mistaken, but being scientifically dishonest, by making claims that you do not have the expertise to make. In a sane world, scientists who did that would be stripped of the label "scientist", the same way we disbar dishonest lawyers or revoke the medical licenses of dishonest doctors. At the very least, it justifies language that does not include kind words, but the opposite.

I have no problem with using kind words when calling out a scientist who is just mistaken. But I don't think kind words are called for when a scientist knowingly publishes claims that they don't have the expertise to evaluate for themselves.


> ... I don't think kind words are called for when a scientist knowingly publishes claims that they don't have the expertise to evaluate for themselves.

But, until you can establish beyond all reasonable doubt that that is what happened, you should be polite because there is a chance you are the one who is wrong. Then, after you have established it beyond reasonable doubt, you should be polite because trying to destroy someone will only make them dig in and fight you til the bitter end. And, because it's the right thing to do.

You can be kind and tenacious, and forceful, and not take no for an answer. I think that's what the original commenter was implying.


> until you can establish beyond all reasonable doubt that that is what happened

In the case under discussion, the key scientist involved (Fredrickson) admitted that she wasn't an expert in fluid dynamics. That is sufficient to establish beyond reasonable doubt that yes, what I said happened is what happened.

> You can be kind and tenacious, and forceful, and not take no for an answer.

I'm not sure that "kind" is consistent with all of those other things, in the situation under discussion. But maybe we have different interpretations of what is "kind".


I thought I was going insane before reading your comment. This world is such a nightmare to live in.

Honestly, the idea that this guy even discovered something is just as ridiculous as the fact that the ruse had gone on for as long as it did. The only real story here is about the work that wasn’t done to begin with.


This is an interesting stance to take on HN, of all places. "Be kind. Don't be snarky. Have curious conversation; don't cross-examine. Please don't fulminate. Please don't sneer, including at the rest of the community."

This extends to other places, too.

On a more practical level, not being kind doesn't buy you anything. (I want to be clear here that "kindness" doesn't mean "freedom from consequences". Of course there should be consequences. But there's no need to be a jerk about administering them - it achieves nothing)


> This is an interesting stance to take on HN, of all places.

I am not talking about internet forum discussions in which the default assumption is that all participants are having the discussion in good faith. I am talking about what the response should be to a scientist who wilfully violates the norms that are required of all science if science is to be reliable and trustworthy.

The closest analogy in the context of an internet forum would be how a forum moderator should deal with a participant who wilfully violates the norms that are required to have a useful, good faith discussion. We normally call these people "trolls", and for ordinary participants the best thing to do is usually to ignore them, but a moderator has to maintain the forum's signal to noise ratio, which at some point is going to mean shutting the troll down, and doing it visibly and publicly, so that the norms of the forum can be seen to be enforced. Kindness would not be appropriate in that situation either (although since the situation is not as serious as a scientist wilfully violating the norms of science, one would not expect the response to be as vehement either).

> Of course there should be consequences. But there's no need to be a jerk about administering them

Publicly enforcing norms that are required for an institution to function, and making it explicit that that is what you are doing, in language that reflects the seriousness of the violation, is not "being a jerk". Granted, it's also not being kind. But "kind" and "jerk" are not the only available options.


To be honest, some people get an emotional satisfaction out of being unkind.

As a matter of practical strategy, being kind is better for all the reasons you suggest. See also Slate Star Codex, eg https://slatestarcodex.com/2014/02/23/in-favor-of-niceness-c... or https://slatestarcodex.com/2016/05/02/be-nice-at-least-until...


Basic civility, respect and decorum go a lot further than mere "kindness" IME. The latter I can't even properly define in the context of a computer-mediated debate. I suppose it's largely a way of saying "don't personally attack other users, or you'll get booted" which ought to be plain common sense.


As I see it, being kind means being empathetic, asking ourselves how the other person is going to feel when receiving our communication, and therefore adjusting in a way to avoid emotional damage. I believe this is especially important when being critical. The goal of a criticism should be to help the other/the community/the discourse grow, keeping our ego out of the equation.


Agree. Expanding on ‘emotional damage’ - I’d say principally this would be softening the blow by not putting someone’s sense of who they are/sense of worth under threat.

Failing that, you get one of two counterproductive effects: a) They feel a compulsive need to deny the threat and defend against it, doing something dumb or unwarranted as a result. b) They lose that sense of worth, and become impotent. Recovery from this depends on their environment and ability to build themselves back up.

That said, if you’re too ‘nice’, there’s a chance you’re being too subtle and the message doesn’t get through.


Oh, I agree for sure that we shouldn't taunt or bait other users, in ways that would hurt them emotionally and tempt them to attack in turn. That's just as bad as an overt attack - it destroys the spirit of a robust debate. But this all falls under a proper understanding of respect and decorum, at least as far as I'm concerned. These words just feel more precise and accurate when referencing these things. Which helps make the norm stick.



Interesting. The author of that piece isn't exactly known for producing what's commonly understood as kind communication himself.


All the more reason for the self reflection that went into that document ;-). I think these days, he pretty much follows it, at least in written communications. In person he used to get upset easily, which could lead to unkindness. I don't know if he is still like that, N years later.


Yes. And in any case: even a hypocrite could give good advice sometimes.


That isn't nice. Not always succeeding at something is different than saying others should do it but you don't have to.


Sorry, I didn't mean to imply that the author was a hypocrite.

I just meant, even if he was a hypocrite, that wouldn't necessarily undermine the text. A world class coach doesn't have to be a world class athlete. Or someone not practicing what they preach doesn't necessarily invalidate what they preach.


If this generalizes, it seems like someone with a graph database, some basic calculus, and an archive of papers' citation data could pull the rug out from under entire disciplines, specifically ones that are informing public policy these days. These "data thugs," they seem like what we hoped hackers would be.


They have, but public policy doesn't respond.

For example one of the "data thugs" (James Heathers) wrote a tool that could detect impossible numbers in psych papers, like means/std devs that couldn't possibly have been the result of any allowable combination of results. Some very high percentage of papers failed, I think it was 40%.

And of course psychology isn't the worst. Epidemiology is worse than psychology. The methodological problems there are terrifying. Good luck getting public policy to even accept that it's happened, let alone do anything about it.


For anyone curious, this seems to be a re-implementation of said tool: https://github.com/QuentinAndre/pysprite .


What is an example of a kind of impossible statistic?


https://peerj.com/preprints/2064v1/ - note that Nick Brown also worked on this.

e.g.

"Specifically, the mean of the 28 participants in the experimental condition, reported as 5.19, cannot be correct. Since all responses were integers between 1 and 7, the total of the response scores across all participants must also be an integer in the range 28–196. The two integers that give a result closest to the reported mean of 5.19 (which will typically have been subjected to rounding) are 145 and 146. However, 145 divided by 28 is 85714217.5, which conventional rounding returns as 5.18. Likewise, 146 divided by 28 is 42857121.5, which rounds to 5.21. That is, there is no combination of responses to the question that can give a mean of 5.19 when correctly rounded."

(that example is a fictional one but the same issue arises elsewhere)
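For the curious, here is a minimal sketch of that rounding check in Python (the same idea as Brown and Heathers' GRIM test; illustrative only, not the pysprite implementation linked earlier in the thread). The scale bounds and sample size are taken from the quoted example, and the function name is just something I made up for illustration.

    # Can a reported mean arise from n integer responses on a bounded scale?
    # Illustrative sketch only -- not the real GRIM/SPRITE code.
    # Note: Python's round() uses banker's rounding; exact .xx5 ties would
    # need extra care, but none occur in this example.

    def mean_is_possible(reported_mean, n, lo=1, hi=7, decimals=2):
        """True if some integer total of n responses in [lo, hi] rounds to reported_mean."""
        for total in range(n * lo, n * hi + 1):
            if round(total / n, decimals) == round(reported_mean, decimals):
                return True
        return False

    print(mean_is_possible(5.19, 28))  # False: 145/28 rounds to 5.18, 146/28 to 5.21
    print(mean_is_possible(5.18, 28))  # True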


Thanks for that. It's statistics that are impossible given the data, makes a ton of sense.

Copy paste didn't come in 100%, but the meaning was clear; the values are 5.17857... and 5.21428...


Were the papers that failed retracted???


No. Virtually no scientific papers with errors are ever retracted, in any field. That's how you can end up with fields where more than half of all claims are probably false. We're told science is "self correcting" but that's just one more lie on top of so many others. In reality science doesn't get corrected even when people go above and beyond to try and correct it, as in this story.


And people wonder why faith is lost in science. I wonder if we're at the point yet where anecdotal experience is more likely on average to be correct than a study.


Ironically, one of the areas of psychology the field has struggled the most to accept is also one with the most robust results and largest effect sizes. That area is ... stereotype accuracy.

It's exactly what it sounds like and it shows that, surprise, anecdotal stereotypes people have about other people are actually pretty accurate when tested. This is not a politically correct conclusion so the hard-left academic world struggled for a long time to accept this (and arguably still does).

https://psycnet.apa.org/record/2015-19097-002

"This chapter discusses stereotype accuracy as one of the largest and most replicable effects in all of social psychology. This chapter is divided into three major sections. The first, History of Obstacles to Social Psychology Accepting Its Own Data on Stereotype Accuracy, reviews some of the obstacles social psychology has faced with respect to accepting that stereotype (in)accuracy is an empirical question, and that the empirical data do not justify assumptions, definitions, or declarations that stereotypes are inaccurate. The second, The Empirical Assessment of Stereotype (In)Accuracy, summarizes what is now an impressive body of literature assessing the (in)accuracy of racial, gender, age, national, ethnic, political, and other stereotypes. The third, Stereotype (In)Accuracy: Knowns, Unknowns, and Emerging Controversies, summarizes broad and emerging patterns in that body of literature, highlighting unresolved controversies, and identifying important directions for future research. (PsycInfo Database Record (c) 2020 APA, all rights reserved)"


Not that simple, and that was something missed in the original article - not all citations are equal. You have to know the context of each. Like, it could've been 1 positive citation, and 349 calling it bullshit... or vice versa.


Indeed, the beauty of using a graph is you can see the direct citations, and then secondary and tertiary ones, and then pull apart the clusters to see if there is any opportunity to pull on mathematical threads.

A query like: given paper X, show papers that cite it, papers that cite those, and then ones that cite both. Cycles in the graph might show citation rings as well (as I think has happened in the past), and this would be a good challenge for young people to apply their high school calculus skills to verifying claims in social science research.

A breadth-first search across papers using citations selects which paths you want to do your depth-first validation on. We do the same thing in vulnerability research with software dependencies, and picking a bad-math vuln in a discipline and tracing its impact through, or picking clusters of tightly dependent research and finding vulns in them, seems like an opportunity for an enterprising outsider who never wants a job from any of these people.
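As a toy illustration of that breadth-first idea, here is a rough Python sketch over a hypothetical citation graph. The paper IDs, edges, and the downstream_of helper are all made up for the example; a real version would pull citation data from an archive such as OpenAlex or Semantic Scholar, and cycle detection for citation rings would be the obvious next step.

    # Breadth-first walk over *incoming* citations from a suspect paper X.
    # cites[p] is the set of papers that paper p cites; all IDs and edges
    # here are hypothetical toy data.
    from collections import deque

    cites = {
        "X": set(),            # the suspect paper
        "A": {"X"},
        "B": {"X", "A"},
        "C": {"A", "B"},
    }

    def downstream_of(graph, root, max_depth):
        """Papers that cite `root` directly or via intermediaries, with hop counts."""
        cited_by = {}
        for paper, refs in graph.items():
            for ref in refs:
                cited_by.setdefault(ref, set()).add(paper)

        depth, queue = {root: 0}, deque([root])
        while queue:
            node = queue.popleft()
            if depth[node] >= max_depth:
                continue
            for citer in cited_by.get(node, ()):
                if citer not in depth:
                    depth[citer] = depth[node] + 1
                    queue.append(citer)
        depth.pop(root)
        return depth

    print(downstream_of(cites, "X", 2))   # {'A': 1, 'B': 1, 'C': 2}

The depth-first validation step would then be a manual pass over the papers this surfaces - the "pulling on mathematical threads" part.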


How can a citation ring happen if papers are written one after another?


Preprints. It happens quite often in deep learning papers, these days.


Scientists almost never negatively cite papers, and actually never do for incompetence or bad methodologies, so this is not a problem in practice [1]. In fact they don't even notice when citing retracted papers [2] so good luck getting them to notice deeper problems.

The reality is, a citation is always either neutral or a positive reference to other work.

[1] https://www.pnas.org/content/112/45/13823 - only ~2.5% of citations are negative. Of those virtually all are about the findings, not the methodology.

[2] https://pubmed.ncbi.nlm.nih.gov/18974415/


Good points, and good to see there's some data behind that as well. That's how it's seemed to me in my areas. I'm not sure to what degree it applies for things like psychology - prob never as extreme as my original example, but it would be interesting to see a similar study.


Hahaha, it is a bit of a catch-22 though isn't it. In debunking a paper like this you have to understand that you are cutting away the foundation that people built their careers on, and you may be fundamentally damaging their academic trajectory. Doesn't mean you shouldn't do it, just that being a bit of an asshole probably helps.


Hopefully this will make future scientists more wary of building their careers on a foundation of shaky work that could disappear - and thus engage more critically.


Maybe. But as a matter of strategy, it also helps to appear kind.

(If nothing else, that way your criticism cuts deeper.)


There's a time for kindness, and then there's a time for brutal honesty. The Nick Brown story does usually feature another person who was too kind and ended up regretting it:

In his email, Guastello included a list of errors he had found in Fredrickson and Losada’s application of the math. “Ironically,” he wrote, “I did send American Psychologist a comment on some of the foregoing points, which they chose not to publish because ‘there wasn’t enough interest in the article.’ In retrospect, however, I see how I could have been more clearly negative and less supportive of any positive features of the original article.”

So. The people in the end who brought the problem to light were people like Sokal, who freely use terms like "bullshit" to describe what Losada and Fredrickson did. And the people who tried to be polite and kind and balance out their criticism with the positives, were blown off and got nowhere.


I don't think you have to balance out your criticism. But you can stay polite and keep attacking the work only.


Perhaps consider that the choice is not "be a bit of an asshole or not" but "be considerate to everyone except the one person, or only to that one person"?


I'm unsure what your second citation is for, they don't use the word "data thugs" but the author certainly seems annoying (although likely correct about the points they're making as far as I read).

e: the parent comment has now been edited for clarity, keeping my comment for posterity


The piece is written by a member of the posse [0] who can be, as I think the above quoted piece demonstrates, somewhat abrasive

[0] https://medium.com/@OmnesRes/the-circle-of-data-thug-life-81...


If that's considered abrasive, we've really lost out on actual communication for the sake of bullshit etiquette.


It is hard for me to comprehend a point of view where that article is not considered abrasive in tone. Whether or not it bothers you is another question.

The title is in all caps, "THE DUMBEST FUCKING THING I’VE EVER READ", and the piece challenges someone to a cage fight at the end. The entire rest of the article could have been honeyed sweetness (which clearly it's not), and that could be enough to call it abrasive.

One person I showed it to responded very favorably, commenting that the author clearly considered it "...important to get the basic points across violently because no one is listening to the non-violent arguments." i.e. this is abrasive, but abrasive to a purpose.


> If that's considered abrasive, we've really lost out on actual communication for the sake of bullshit etiquette.

Did we read the same article? I'm confident that we can have actual communication that is both less obnoxious and not hamstrung by "bullshit etiquette."


I wanted to hate [1], but find myself agreeing quite a lot with Jordan's substance. Well done to the writer!

What is this about data thugs and posses?


> its prominent characters, as you might expect, have reputations for being interpersonally difficult [1]. Collectively they've been labeled "data thugs" [2] and accused of "methodological terrorism" [3]

Ahh the "everyone I don't agree with/like are nazis" trope.


I suspect he would come after the lack of rigor in models like Big 5 personality factors. :-)


What’s funny is the whole field of Big Five is fundamentally a fraud because it has both correlation and dimensionality problems.

And using it to describe Nick, who apparently took down another positive psychology finding, is ironically funny.


> the whole field of Big Five is fundamentally a fraud

Do you have more reading? The Big 5 is hardly perfect and merits critique, but to call it "fundamentally a fraud" frankly feels like the conclusion of a "topic tourist" that read 1-2 critiques and made up their mind. It's not theoretically-derived, it relies on factor analysis, its utility does not extend to all cultures, the five traits are not fully orthogonal. I'm not here to champion it. But comparing Big 5 to Positivity Ratios or Power Posing seems hyperbolic at best, particularly given its usage in situations where validity must be demonstrated.

But I've changed my mind on a lot of things. I'm prepared to change my mind on this if you've seen or read something I haven't.


It’s been a while since I was studying this topic but if I remember correctly, there is a fundamental problem of a lack of justification for operational definitions in the whole field of psychometrics. This applies to most (all) personality tests including the big 5.

Because the whole theory is based around cluster analysis, it is pretty important that you can justify the data that is going into the analysis if you want it to reveal any truth outside of the model; otherwise you end up with what is called “junk-in-junk-out”. As far as I know, this justification is still lacking 40 years after this theory surfaced.

I think that the big 5—and personality psychology in general—might not have the same glaring issues as positive psychology. But solid science it is not. Fraud might be overreaching, but I would definitely categorize it as pseudoscience.


Lack of justification for operational definition?

That's pretty subjective and could be leveled against almost anything behavioral or psychological.

The thing about the Big Five is that it has surfaced in all sorts of contexts. You can maybe say the 5 per se is not well justified (as opposed to 4 or 6 or 7, for example), but if you take enough ratings of a person, some variant of those 5 will probably work as reasonable summaries of the ratings, and they will account for a substantial chunk of their predictive variance. The thing, too, is that if you take other types of variables, like clinical symptom ratings, or diagnoses, you start to see roughly similar types of attributes become prominent.

The Big Five is a descriptive model of how people perceive others. There's a lot of evidence for certain mechanistic processes being heavily involved in some (e.g., positive emotion in extraversion, negative emotion in neuroticism, behavioral control with conscientiousness, etc.) but I'm not sure the original idea behind the Big Five was mechanistic -- it was a hypothesis about major dimensions that could summarize social perceptual data. It's like classification in biology pre-DNA era. People have some ideas of how things go together, and find it useful for organizing descriptions and measurements.

It's like if you did unsupervised DL modeling of all the videos involving humans you can find on the web, and found that their classification could be accounted for by 5 major vectors, almost all the time, regardless of sampling. Wouldn't you want to know that?
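To make the "few major vectors" point concrete, here is a small synthetic sketch using numpy with invented data and dimensions. Actual Big Five work uses factor analysis on questionnaire items rather than this toy PCA, but the shape of the argument, many ratings collapsing onto a handful of axes, is the same.

    # Generate ratings driven by 5 hidden traits plus noise, then check how
    # much variance a handful of principal components captures. The data and
    # dimensions are invented purely for illustration.
    import numpy as np

    rng = np.random.default_rng(0)
    n_people, n_items, n_latent = 500, 40, 5

    latent = rng.normal(size=(n_people, n_latent))      # hidden "trait" scores
    loadings = rng.normal(size=(n_latent, n_items))     # item-trait weights
    ratings = latent @ loadings + rng.normal(scale=0.5, size=(n_people, n_items))

    centered = ratings - ratings.mean(axis=0)
    _, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / (s**2).sum()
    print(round(explained[:5].sum(), 3))   # most of the variance sits in ~5 components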

Many other measures are very mechanistically well-justified but lack ecological validity, in the sense that they are very narrow predictively and do not hold up well outside of laboratory contexts. That's fine, there's a tension there between predictive bandwidth and depth, but if you want any kind of rating of a human being's behavior and experience, you enter at your own risk if you think you'll measure something that's radically different from the Big Five (or something subsumed by cognitive measures). Can you do it? Sure, but a lot of the time no (see: grit).


What constitutes an adequate justification for use of an operational definition within a model is subjective indeed. However there is usually a point in the life of a theory where the gathered evidence is sufficient that a scientific consensus starts to form that the operational definition is justified. I’m not aware that that has happened in the 40-odd year history of the Big 5 personality theory.

The 5 personality traits may be overarching within the field of psychometrics and they may indeed be useful to describe behavior; however, you still need to justify that said behavior is not more easily described using different models, and this is where personality psychologists usually fail in justifying their operands.

Works criticizing the model range from using totally different constructs (such as priming, positive reinforcement, universal grammar, brain dopamine levels, socio-economic status, etc.)—which don’t rely on psychometrics at all—to claiming that the behavior psychometricians are predicting is actually not that useful (e.g. predicting ‘high confidence’ is not that useful if ‘high confidence’ does not result in a significant behavior which isn’t better predicted without made-up operands).

If you were an early astronomer and you constructed the notion of ‘epicycles’ to simplify your model of planetary motion. You may use these ‘epicycles’ to justify your prediction, however you may not use a successful prediction to justify the existence of epicycles. Your epicycles may be useful until someone comes along and deems them unnecessary since planetary motion is better described by using elliptical orbits.

Of course this could go the other way, as is the case with particle physics and the atom. However, given the amount of research, the success of rival theories, and the failure of psychometricians to make useful predictions outside of their narrow field that aren't better explained with alternative theories, I have high doubts that the Big-5 personality traits (and any theory of personality using psychometrics for that matter) are anything but pseudoscience.


> If you were an early astronomer and you constructed the notion of ‘epicycles’ to simplify your model of planetary motion. You may use these ‘epicycles’ to justify your prediction, however you may not use a successful prediction to justify the existence of epicycles. Your epicycles may be useful until someone comes along and deems them unnecessary since planetary motion is better described by using elliptical orbits.

I couldn't comprehend the discussion until I read this metaphor. Thanks for the detailed explanation.


I appreciate your response, but think I'd have to read more to be nearly as convinced as you. I'm familiar with the critiques you mention, but your conclusion as to the model's merits goes beyond what I've seen other critiques make. As a model grounded in theory it is lacking, but as an explanatory construct to predict patterns of behavior and outcomes in specific settings (e.g., knowledge work) it persists for a reason. I've peer reviewed papers criticizing it, but none "debunking" it.

My biggest gripe personally is that it is represented as a model of total personality, but that's definitely not true. It's just representing a larger "personality space" than most other constructs. At least the outcome isn't placing one into a discrete category.


When I was doing psychology over a decade ago, personality psychology was actually my biggest gripe. The way that I saw it, it wasn’t explaining anything which didn’t have a better explanation using a different theory.

Now a decade later—being a little more class conscious—I can actually see how this is problematic. When a better explanation can come from sociology and has to do with class and economic status, measuring people based on a theory that lacks justification in order to justify one hire over another is problematic. When you ascribe it to an operationally defined concept (read: made up; based on data analysis) and name it personality, that seems like a lame excuse to hire from your ingroup and exclude your outgroup.

These tests are robust, I know that; however, robustness alone is not enough to justify a theory. These tools might be useful (or they may be dangerous) but while there is no justification for the operationally defined terms, they cannot be used to justify terms outside of the model. That is pseudoscience. You can only explain things inside of the model, and as such it is pretty limited as a theory.


I agree that the label "personality" unreasonably implies some sort of real (possibly for many, "genetic") truth.

> When a better explanation can come from sociology

Explanation for what? Behavior generally? In that case, the Big 5 is less of the predictor and more the criterion. The Big 5 is not a theory, it's a taxonomy that emerged from trait theory. It describes, it clusters, it predicts, but in and of itself it doesn't explain.

> ... seems like a lame excuse to hire from your ingroup and excluding your outgroup.

When it comes to hiring in the United States this is literally the opposite of the biggest use-case for personality testing for the past 50 years. But that's the only domain I can speak confidently on, and that doesn't include class or SES as a subgroup.

> there is no justification for the operationally defined terms

This is subjective, and I do not agree. If your critique of The Big 5 is actually a critique of trait theory more generally, I'm there for that. But a taxonomy for observable behaviors that shows reliability as well as content, convergent, and predictive validity across many populations and contexts seems justified to me barring a superior option.

The Big 5 is not and never would be a grand theory of human behavior, but it does describe actual behavior in a way that is interpretable and connected to the world we actually inhabit.


I don’t know how people interpret results from personality tests in a way that gives them a criterion for what constitutes a good hire. However I have a strong feeling there is no sound science behind whichever criteria recruiters are using. And that risks placing arbitrarily high weights on whichever traits correlate with your in-group. This, however, is a falsifiable claim, and if anyone has ever done research which shows that personality tests actually reduce bias (as opposed to enhancing it) then I’m open to being proven wrong. I do also question the efficacy of using personality tests as a tool to scope for good hires. How do companies actually measure that? There are so many biases that come to mind that could make a company overvalue the efficacy of these tests.

My critique applies to all theories of personality. I don’t see personality traits as a useful categorization to predict behavior. By far the most research I see in personality psychology is about correlation with other terms inside a very narrow scope (this also applies to other sub-fields of psychometrics, including positive psychology). The behavior that personality tests predict does not further our understanding of the human mind. A theory of behavior that fails to do that is a poor theory at best.

But I want to go further: I claim that personality psychology is not just poor science but pseudo-science.

>> there is no justification for the operationally defined terms

The personality traits in the big-5 are operationally defined. That is, they are defined in terms of what the tools are designed to detect. This is useful inside your models (as evidenced by the success of big-5) but this does not tell us anything outside of our model. Now if you go to the real world and find evidence that these terms exist outside of your model, then I would change my mind. That would be pretty good grounds for a theory which describes something that your model predicts accurately. If you don’t, then you have at most a useful construct that you can use in other theories (think atom before they proved its existence). If you can’t even use your constructs outside of your scope, then there is not much value in it outside of a narrow scope and you are most likely doing pseudo-science.

My critique actually extends to all personality traits. Personality traits (if they exist) can at best describe a proportion of the variability within a narrow scope. Outside of that narrow scope it is actually just better to describe a person as calm and courteous as opposed to speculating where they stand on the agreeableness axis of the Big-5 personality traits. And this is what I mean when I say “lack of justification”. ‘Agreeableness’ has been operationally defined within a certain model. When you use it elsewhere you need to justify doing so. And there should probably be a scientific consensus about whether the justification is good enough. If not, you are most likely doing pseudo-science.


Note that "operationalization" implies a fairly specific set of epistimolgical and ontological approaches which do not necessarily require that what is being measured has a one-to-one correspondence to a 'real' entity.


Indeed. You can operationally define anything you want within your model. If done carefully, a good operational definition may simplify your model quite a bit. (A bad operational definition, on the other hand, will almost certainly make your model overly complex and can be quite detrimental).

When you use your model to infer about a real world phenomenon you have to be careful how you treat your operational definition. If you use it to make a prediction, you cannot claim that your operand caused the outcome, not until you go into the real world and find it. If your model is successful you may use your operand to describe your prediction, but you have to justify why your operand is necessary; a better model may exist which doesn’t use an operand at all.

A successful model is neither a sufficient nor necessary condition for proving an operand exists.


Consider the whole type A personality thing - it's pure nonsense from a scientific perspective (created by tobacco companies to explain why certain people [smokers] had more heart attacks.)

But at this point it's still a useful cultural shorthand to describe certain characteristics we subjectively experience in others.


Yes, but that's a) not the Big 5, and b) the worst example of personality tests/models where you're put into a category (which will show very low reliability over time, even if your responses show some).


This seems like the perfect time to pull out the old phrase:

"All models are wrong. Some models are useful."

Big 5 seems useful.


When I cite such things, it's because "all models are wrong, but mine are useful" :)

And yeah I think big five is useful in the above context, and also for things like this: https://arnoldkling.substack.com/p/keeping-up-with-the-fits-...

"Empirically, men and women to tend to differ on the trait that personality psychology calls agreeableness. More women show up as high in agreeableness than men.

[As an aside, I once wrote Nassim Taleb and the Disagreeables.

Nassim Taleb’s latest book heaps praise on the trait that personality psychologists call low agreeableness. . . I am pretty far out on the disagreeable end of the spectrum myself, but Taleb makes me look like a goody two-shoes.

Taleb came across the essay and tweeted this response:

There is this BS in this "disagreebleness" scale used by psychologists, unconditional of domain. Like most psych categorizations, BS. Many are socially gentle but intellectually rigorous & no-nonsense: others nasty in person but appear gentle in public . BS!

I rest my case.]"


Useful for what?


People need to understand that different personality types exist and have some idea about the different traits that people have.

It's common to assume everyone thinks like you, then to read spite into their actions.

Here's a good article loosely related: https://www.lesswrong.com/posts/baTWMegR42PAsH9qJ/generalizi...


Can you share more or point to further reading? I originally came across the Big Five because a psychologist was using it to debunk the idea of "grit" as being a predictor of success. (The claim was grit was not a new concept and essentially just a repackaging of the conscientiousness Big Five trait.)

It would be interesting (and another example of irony) if the Big Five itself was debunked.


I go into many reasons why big-5 (and personality psychology in general) might be a pseudoscience in a nibling comment. Here I would like to add that what you might have seen big-5 proponents do to grit (a trait from positive psychology), other fields of psychology (such as cognitive psychology, neuroscience, social psychology, behavioral psychology, etc.) do to personality psychology (and psychometrics in general).

I’m not aware of any debunking claims in particular, but skepticism is plenty, and the usual accusations range from pseudoscience to poor science.

In my opinion the worst critiques go for an alternative theory of personality traits, e.g. 7 axes instead of 5, or neuroticism is actually something else, etc. These are attempts to debunk big-5 and in my opinion they always fall short. It doesn’t take long for proponents of big-5 to rebut these critiques, as the statistics behind the theory of big 5 are solid. Personality tests are robust and they consistently yield 5 disjoint axes in cluster analysis.

A better criticism goes into the foundation of the theory, and attacks the fact that personality (and the personality traits) are operationally defined with little evidence they exist outside of the models. The behavior that personality tests claim to predict often has a better correlation to non-personality traits, such as your brain dopamine levels, class, education or economic status, anxiety or depression, prior stimuli and response, etc. Behavior is better predicted by looking at the person, not how the person responds on a piece of paper. Their tests might be robust, but their science is lacking.

I’m not a believer in personality traits in general, neither “grit” nor “agreeableness”. To me the findings of any psychometrician are unremarkable at best and racist at worst (see Stephen J. Gould (1996) “The Mismeasure of Man”). In a given situation how you respond is never dictated by your personality traits. If it explains any variability in behavior the effect size is hardly ever significant, and if it is, there are probably other factors—not measurable with a questionnaire—that will explain it better.


Great post and thank you for going in-depth. What types of behavior were you thinking of when you typed "The behavior that personality tests claim to predict often has a better correlation to non-personality traits, such as your brain dopamine levels, class, education or economic status..."? I'm just curious about what has been studied.


It’s been a while since I was doing psychology, but the following is a list off the top of my head (note that I’m likely to misremember or misrepresent some of these):

* One good study demonstrated that a common drug used to treat Parkinson's disease has an unfortunate side-effect of limiting dopamine levels in your frontal cortex. People who used that drug are way more likely to engage in anti-social behavior after they start treatment than before. It seems like dopamine levels in your pre-frontal cortex can severely affect your personality.

* Social scientists often create amazing studies where they are easily able to manipulate the environment in such a way that they get participants to e.g. lie, cheat, etc. Of course there is variability within these studies and you can argue that personality traits explain this variability; however I don’t know if that has been done, and if it has, I wouldn’t be surprised if you would find that other factors such as religiousness, socio-economic status, gender, age etc. explain a bigger part of the variability in those social science studies.

* If you are looking for a better general theory of behavior then cognitive science has a number of constructs which don’t have this problem of operational definition. For example, research has shown that you can prime people with a stimulus to increase the chances the participants will respond to similar stimuli in a short timespan afterwards.

* Behaviorism is more like a philosophy than a science and posits that you can explain behavior by looking at the reinforcement history of an individual. That theory has been somewhat debunked by cognitive science; however behaviorists are still doing some impressive research which shows how you can modify behavior by offering some reinforcement contingencies. The same variability applies, though, as with the social science studies.

I tried to stay away from specific studies because it has been a minute since I was studying psychology. But I hope this list is still relevant and accurate, and that it inspires you to go look for the individual studies which back these claims.

Do note that what I’m doing here is a bit unfair. I’m making grand claims backed by evidence I think exists, but pushing the responsibility of looking for this evidence onto the reader. This is not how science communication is supposed to work, and my only excuse is that I’m lazy.



