Delanceyplace: Statistics Versus Judgement

In his book Clinical vs. Statistical Prediction: A Theoretical Analysis and a Review of the Evidence, psychologist Paul Meehl presented evidence that statistical models almost always yield predictions and diagnoses at least as good as, and often better than, the judgment of trained professionals.
In fact, writes Daniel Kahneman, experts frequently give different answers when shown the same information twice, even when only a few minutes separate the two evaluations.

In the slim volume that he later called 'my disturbing little book,' [Paul] Meehl
reviewed the results of 20 studies that had analyzed whether clinical predictions
based on the subjective impressions of trained professionals were more accurate
than statistical predictions made by combining a few scores or ratings according
to a rule. In a typical study, trained counselors predicted the grades of freshmen
at the end of the school year. The counselors interviewed each student for forty-five
minutes. They also had access to high school grades, several aptitude tests, and
a four-page personal statement. The statistical algorithm used only a fraction
of this information: high school grades and one aptitude test. Nevertheless, the
formula was more accurate than 11 of the 14 counselors. Meehl reported generally
similar results across a variety of other forecast outcomes, including violations
of parole, success in pilot training, and criminal recidivism.
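
To make "combining a few scores or ratings according to a rule" concrete, here is a minimal sketch in Python. It is not the formula from the study Kahneman describes, which the excerpt does not give. It assumes the simplest possible rule: put each predictor on a common scale and average them with equal weights. The function names and the four students' numbers are invented for illustration.

```python
from statistics import mean, pstdev

def zscores(values):
    """Rescale values to mean 0 and standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

def formula_scores(hs_grades, aptitude):
    """Equal-weight rule (an assumption): average the two standardized predictors."""
    return [(g + a) / 2 for g, a in zip(zscores(hs_grades), zscores(aptitude))]

# Hypothetical data for four students, not taken from any study.
hs_grades = [3.9, 2.8, 3.4, 3.1]      # high school grade averages
aptitude = [1350, 1100, 1280, 1010]   # scores on one aptitude test

scores = formula_scores(hs_grades, aptitude)

# Rank students from highest to lowest predicted freshman performance.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The rule never sees the interview or the personal statement, and it gives the same answer every time it is shown the same inputs. The excerpt returns to that second property further on.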

Not surprisingly, Meehl's book provoked shock and disbelief among clinical psychologists,
and the controversy it started has engendered a stream of research that is still
flowing today, more than fifty years after its publication. The number of studies
reporting comparisons of clinical and statistical predictions has increased to roughly
two hundred, but the score in the contest between algorithms and humans has not
changed. About 60% of the studies have shown significantly better accuracy for the
algorithms. The other comparisons scored a draw in accuracy, but a tie is tantamount
to a win for the statistical rules, which are normally much less expensive to use
than expert judgment. No exception has been convincingly documented.

The range of predicted outcomes has expanded to cover medical variables such as
the longevity of cancer patients, the length of hospital stays, the diagnosis of
cardiac disease, and the susceptibility of babies to sudden infant death syndrome;
economic measures such as the prospects of success for new businesses, the evaluation
of credit risks by banks, and the future career satisfaction of workers; questions
of interest to government agencies, including assessments of the suitability of
foster parents, the odds of recidivism among juvenile offenders, and the likelihood
of other forms of violent behavior; and miscellaneous outcomes such as the evaluation
of scientific presentations, the winners of football games, and the future prices
of Bordeaux wine. Each of these domains entails a significant degree of uncertainty
and unpredictability. We describe them as 'low-validity environments.' In every
case, the accuracy of experts was matched or exceeded by a simple algorithm.

As Meehl pointed out with justified pride thirty years after the publication of
his book, 'There is no controversy in social science which shows such a large body
of qualitatively diverse studies coming out so uniformly in the same direction as
this one.' ...

Why are experts inferior to algorithms? One reason, which Meehl suspected, is that
experts try to be clever, think outside the box, and consider complex combinations
of features in making their predictions. Complexity may work in the odd case, but
more often than not it reduces validity. Simple combinations of features are better.
Several studies have shown that human decision makers are inferior to a prediction
formula even when they are given the score suggested by the formula! They feel that
they can overrule the formula because they have additional information about the
case, but they are wrong more often than not. ...

Another reason for the inferiority of expert judgment is that humans are incorrigibly
inconsistent in making summary judgments of complex information. When asked to evaluate
the same information twice, they frequently give different answers. The extent of
the inconsistency is often a matter of real concern. Experienced radiologists who
evaluate chest X-rays as 'normal' or 'abnormal' contradict themselves 20% of the
time when they see the same picture on separate occasions. A study of 101 independent
auditors who were asked to evaluate the reliability of internal corporate audits
revealed a similar degree of inconsistency. A review of 41 separate studies of the
reliability of judgments made by auditors, pathologists, psychologists, organizational
managers, and other professionals suggests that this level of inconsistency is typical,
even when a case is re-evaluated within a few minutes. Unreliable judgments cannot
be valid predictors of anything.
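
A little arithmetic shows why inconsistency alone caps accuracy. The sketch below is an illustration, not a calculation from the book. It assumes each X-ray has one correct label, 'normal' or 'abnormal'. When two readings of the same picture disagree, exactly one of them must be wrong, so the error rate per reading is at least half the disagreement rate. The second function adds a stronger and purely hypothetical assumption: each reading errs independently with the same probability p, so the two readings disagree with probability 2p(1 - p).

```python
import math

def min_error_rate(disagreement):
    """Lower bound on the error rate per reading for a two-way judgment.

    Each disagreeing pair contains exactly one wrong reading, so at least
    disagreement / 2 of all readings are wrong.
    """
    return disagreement / 2

def error_rate_if_independent(disagreement):
    """Error rate p solving 2 * p * (1 - p) = disagreement (smaller root).

    Assumes independent, equally likely errors on each reading, which is
    a simplification. Only defined for disagreement <= 0.5.
    """
    return (1 - math.sqrt(1 - 2 * disagreement)) / 2

d = 0.20  # radiologists contradict themselves 20% of the time
print(round(min_error_rate(d), 3))
print(round(error_rate_if_independent(d), 3))
```

With a 20% self-contradiction rate, at least 10% of readings must be wrong, and about 11% under the independence assumption. No amount of expertise in the judge can remove that floor.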

Author: Daniel Kahneman
Title: Thinking, Fast and Slow
Publisher: Farrar, Straus and Giroux
Date: Copyright 2011 by Daniel Kahneman
Pages: 222-225

If you wish to read further: Buy Now http://r20.rs6.net/tn.jsp?e=001Ac5KcGyzRQ2tZysCebJsAGck-qrLg_L1woc_XySqfL7pN0CL9A44cQzP7u-yGDsWjx66hAOMT1l9cC_PSyZqUv6jhAxnxPQEexfyLuLOaayhRdtEa5c25XfCZdJkOC4IQcqAFGnzf_RnjgiovujA8DPgRKEcqBThANMd2khu7D--jk1Jie7xJmisdcuVhg8YeODfsyIN4_4WGM7YahG_A14KIb5ByuqBOn-3H-TCVvbq57fb2Vtl-H9fVbohUgKiMUuFRyrSkoLJfdCxjcKwxFcjNihpAC2hzfrHYbADyl5Br4xZaw2X1wp184kzdKLdff6wr1XNhN3-N5a5kVy8r-H7BKdcSkz4e-8lO4T7_I_MzVT6BmaQaR1H0ftrJauoYYuSHMG70HaxhCZj_C83gA==

If you use the above link to purchase a book, delanceyplace proceeds from your purchase
will benefit a children's literacy project. All delanceyplace profits are donated
to charity.


This website is licensed under a Creative Commons License.