When should researchers be more suspicious of their statistical intuitions? – Chapter 10

A study of cancer diagnoses in the United States showed a pattern: the rate of diagnoses was lowest in sparsely populated, rural, Republican areas. Making sense of this information, by searching memory and formulating hypotheses, is an operation of System 2. System 1 was also involved: System 2 depends on the suggestions and facts retrieved from associative memory. You probably focused on the rural aspect and did not link the low rate to Republican policies. It seems sensible to attribute it to the rural environment, with fresher food and cleaner air. However, the rate of diagnoses was also highest in rural, sparsely populated areas. You might link this to poverty, poor access to good healthcare or smoking. But living in a rural area cannot explain both findings. The key factor was not that the areas were Republican or rural, but that they had small populations. This example shows the complex relationship between statistics and our mind.

System 1 excels at one form of thinking: it effortlessly and automatically detects causal links between events. System 1 fails when it has to deal with merely statistical information, which changes the probability of an outcome but does not cause it to happen.

Imagine a large jar filled with balls: half of them are black, half are yellow. Without looking, you draw 4 balls from the jar, record the number of black balls and put them back. You repeat this many times. If you summarize the outcomes, you will find that ‘2 black, 2 yellow’ occurs about six times as often as ‘4 black’, and likewise six times as often as ‘4 yellow’. This is a mathematical fact.

Now imagine the US population as balls in a jar. Some balls are marked CD (cancer diagnosis). You draw samples of balls to populate each area. Rural samples are the smallest, so extreme outcomes (very many or very few diagnoses) will be found in sparsely populated areas. Fact: the rate of cancer diagnoses varies across areas. Explanation: extreme outcomes are more likely to be found in smaller samples. This explanation is statistical, not causal. The small population of an area neither prevents nor causes cancer. There is nothing to explain: the rate of diagnoses is not actually higher or lower than normal, it just looks that way because of an accident of sampling. The differences in sample size resulted in ‘artifacts’: observations that are produced exclusively by an aspect of the research method. Outcomes of large samples are more trustworthy than those of small samples, a principle known as the law of large numbers. The ‘sparsely populated’ part probably did not seem relevant to you, and it took some effort to realize that two statements mean the same thing: large samples are more precise than small samples, and extreme outcomes are found more often in small samples. Even researchers have a poor understanding of sampling effects.
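The jar example can be checked with a few lines of Python. The following is a minimal sketch, not part of the original study: it assumes the jar is large enough that every draw is black with probability 1/2, and the function names, the true rate of 10% and the sample sizes of 50 and 2000 are illustrative assumptions.

```python
import random
from math import comb


def prob_black(k, n=4, p=0.5):
    """Exact binomial probability of drawing k black balls in n draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# '2 black, 2 yellow' compared with '4 black' (assumes 50/50 draws).
ratio = prob_black(2) / prob_black(4)


def share_extreme(sample_size, true_rate=0.1, n_areas=500, seed=0):
    """Fraction of simulated areas whose observed rate differs from the
    true rate by more than half of the true rate.

    Every area has the SAME true rate; only the sample size differs.
    """
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_areas):
        cases = sum(rng.random() < true_rate for _ in range(sample_size))
        observed = cases / sample_size
        if abs(observed - true_rate) > 0.5 * true_rate:
            extreme += 1
    return extreme / n_areas


small_areas = share_extreme(50)     # hypothetical sparsely populated areas
large_areas = share_extreme(2000)   # hypothetical densely populated areas
print(f"ratio of '2 black, 2 yellow' to '4 black': {ratio}")
print(f"share of extreme small areas: {small_areas:.2f}")
print(f"share of extreme large areas: {large_areas:.2f}")
```

Under these assumptions the small areas produce far more extreme rates than the large ones, even though nothing about the areas themselves differs.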

Research psychologists see sampling variation as an unpleasant obstacle in their research projects. If the hypothesis to be tested is “The vocabulary of seven-year-old girls is greater than the vocabulary of seven-year-old boys”, you must use a large enough sample to avoid wasting effort and time. In the whole population, the hypothesis is true: girls in general have a more developed vocabulary. Boys and girls vary greatly though, so you could draw a sample in which the boys score higher or in which no difference is detected. Picking too small a sample puts you at the mercy of sampling luck. It is possible to estimate the risk of error for any given sample size, but psychologists tend to skip this procedure and rely on their often flawed judgment.

Psychologists commonly make the mistake of choosing samples so small that they run a 50% risk of failing to confirm hypotheses that are actually true. A likely explanation is that these researchers have intuitive misconceptions about the extent of sampling variation. Instead of determining the sample size by computation, researchers tend to trust their intuition and tradition. A study among researchers, including statisticians, showed that the majority made sample size mistakes. In their article ‘Belief in the Law of Small Numbers’, Kahneman and Amos Tversky argue that researchers should be more suspicious of their statistical intuitions and recommend replacing impressions with computations.
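What ‘replacing impressions with computations’ can look like is sketched below. This is a minimal illustration under stated assumptions, not the procedure from the article: it uses a normal approximation for a two-sided comparison of two group means, and the effect size of 0.5 standard deviations, the function names and the target power of 80% are all hypothetical choices.

```python
from statistics import NormalDist


def power_two_groups(n_per_group, effect_size, alpha=0.05):
    """Approximate chance of detecting a true difference between two groups.

    effect_size is the true difference in means divided by the common
    standard deviation. Uses a two-sided z-test (normal approximation).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def required_n(effect_size, target_power=0.8, alpha=0.05):
    """Smallest group size whose approximate power reaches the target."""
    n = 2
    while power_two_groups(n, effect_size, alpha) < target_power:
        n += 1
    return n


# Hypothetical effect size of 0.5 standard deviations.
for n in (10, 32, 64, 128):
    print(f"n per group = {n:>3}: power = {power_two_groups(n, 0.5):.2f}")
print("group size needed for 80% power:", required_n(0.5))
```

With these assumed numbers, a study with about 32 children per group has only roughly an even chance of confirming a hypothesis that is true, which is the kind of risk the paragraph above describes.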

What is the bias of confidence over doubt?

In a poll of 400 senior citizens, 60% approve of the president’s actions. You will probably conclude that ‘older people support the president’ and not pay much attention to the sample size. Your conclusion would be the same for a different sample size, unless it were an extreme number like 4 or 40 million. This shows that we are not ‘adequately sensitive to sample size’. We automatically focus on the story, not on the reliability of the data. The principle of WYSIATI (What You See Is All There Is) indicates that System 1 is not able to distinguish degrees of belief and is not inclined to doubt. It generates coherent stories and associations that make a statement seem true. System 2 has the ability to doubt, although doubting requires some effort. The law of small numbers is an example of the bias that makes us favor certainty over doubt. The belief that a small sample closely resembles the population from which it is drawn is part of the tendency to exaggerate the coherence and consistency of what we witness. This exaggerated faith in the value of a few observations is related to the halo effect. System 1 creates a story on the basis of fragments of evidence, running ahead of the facts. It generates a representation of reality that makes too much sense.
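How much the sample size matters for a poll result of 60% can be made concrete with a rough confidence interval. The sketch below is an illustration only: it assumes a simple random sample and uses the normal approximation, which is poor for very small samples such as 4; the function name and the list of sample sizes are hypothetical.

```python
from statistics import NormalDist


def approval_interval(n, share=0.60, confidence=0.95):
    """Approximate confidence interval for an observed share in a poll of n people."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * (share * (1 - share) / n) ** 0.5
    # Clip to the valid range of a proportion.
    return max(0.0, share - margin), min(1.0, share + margin)


for n in (4, 40, 400, 4000):
    low, high = approval_interval(n)
    print(f"n = {n:>4}: plausible approval between {low:.0%} and {high:.0%}")
```

Under these assumptions, 400 respondents give an interval of roughly 55% to 65%, while with 40 respondents the interval already includes 50%, so the same headline figure would no longer show majority support.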

The associative machine searches for causes: how did something come to be? The statistical approach focuses instead on what could have happened. Nothing in particular caused the outcome to be what it is; chance selected it from among the alternatives. Our preference for causal thinking makes us susceptible to serious mistakes when we evaluate the randomness of truly random events.

Consider the following example. Six women each give birth to a baby. The sequence of male (M) and female (F) babies is random: the births are independent of each other, and the number of male or female babies born earlier that day has no effect on the sex of the next baby. Are the following three possible sequences equally likely?

MMMFFF
FFFFFF
MFMMFM

Your intuitive answer ‘no’ is wrong. Because the outcomes M and F are equally likely and the births are independent, any possible sequence is as likely as any other. Even with this knowledge, only the last option seems random and is therefore believed to be more likely. We seek patterns and believe in a coherent world, in which regularities are the result of intention or mechanical causality and do not occur by accident. We refuse to believe that regularities can be the result of randomness. This misconception, and the ease with which we see patterns where there are none, can have serious consequences. The rocket bombing of London in World War II was believed not to be random, but a statistical analysis showed that the pattern of hits was what a random process would produce.
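The claim that every sequence is equally likely can be verified by listing all possibilities. The sketch below assumes, as the example does, that M and F are equally likely and that births are independent; the three sequences and the variable names in the code are illustrative choices for this sketch.

```python
from itertools import product
from math import comb

# All possible sequences of six births, each M or F.
all_sequences = ["".join(s) for s in product("MF", repeat=6)]
prob_each = 1 / len(all_sequences)

# Three illustrative sequences: two that look 'regular', one that looks 'random'.
examples = ["MMMFFF", "FFFFFF", "MFMMFM"]
for seq in examples:
    assert seq in all_sequences
    print(f"{seq}: probability {prob_each:.4f}")

# Intuition mixes up a specific sequence with a whole category of sequences:
# many sequences contain three boys and three girls, only one is all girls.
three_each = comb(6, 3)
all_girls = comb(6, 0)
print(f"sequences with 3 M and 3 F: {three_each}, sequences with only F: {all_girls}")
```

Each specific sequence has probability 1/64. A mixed-looking result feels more likely because there are many mixed sequences, not because any single one of them is more probable.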

A study of misperceptions of randomness in basketball had a surprising outcome. The so-called ‘hot hand’ is considered a fact by coaches, fans and players. Several successful shots in a row lead to the causal judgment that the player is ‘hot’ and likely to make more shots. Teammates pass to this player more often. However, researchers found that the sequence of missed and successful shots is random. The hot hand is just a cognitive illusion. The public response to this finding was disbelief, due to the strong tendency to see patterns in randomness (the illusion of pattern). This illusion affects our lives in various ways.
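A small simulation shows how often impressive streaks appear in purely random shooting. It does not reproduce the basketball study; it is a sketch in which every shot is an independent coin flip, and the hit rate of 50%, the 100 shots per player and the function names are assumptions made for illustration.

```python
import random


def longest_streak(shots):
    """Length of the longest run of consecutive successful shots."""
    best = current = 0
    for made in shots:
        current = current + 1 if made else 0
        best = max(best, current)
    return best


def simulate_streaks(n_shots=100, hit_rate=0.5, n_players=1000, seed=1):
    """Longest streak for each simulated player; shots are independent."""
    rng = random.Random(seed)
    return [
        longest_streak([rng.random() < hit_rate for _ in range(n_shots)])
        for _ in range(n_players)
    ]


streaks = simulate_streaks()
share_hot = sum(s >= 5 for s in streaks) / len(streaks)
print(f"simulated players with a streak of 5 or more hits: {share_hot:.0%}")
```

In this simulation most players produce at least one streak of five hits in a row, although no player is ever ‘hot’: each shot has the same chance of success regardless of what came before.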
