Inferential Statistics, Howell Chapter 4,5,6
Also called “chance variability”: variability in findings that is due to chance
Reason: data are ambiguous → means differ from sample to sample
Goal: find out whether the difference is big or small, in other words → statistically significant
What degree of sample-to-sample variability can we expect in the data?
Tells us what variability we can expect under certain conditions (e.g. if the population means are equal).
Can also be done with other measures of variability (e.g. the range).
Sampling distribution of differences between means:
Compares the distribution of means
Expected standard deviation of a measured statistic across repeated samples.
Theory of Hypothesis Testing
- Reporting statistical significance alone (p < .05) is no longer sufficient
→ Need to inform the reader about power, confidence limits and effect size
- Try to find out whether the difference in sample means is likely if both samples were drawn from populations with equal means
1. Set up the research hypothesis. E.g. parking takes longer if someone watches
2. Collect random samples under the 2 conditions
3. Set up Ho = null hypothesis = the population means of the 2 samples are equal
4. Derive the sampling distribution of the difference between the 2 means under the condition that Ho is true
5. Calculate the probability of a mean difference at least as large as the one obtained
6. Reject or fail to reject Ho (rejecting means Ho is assumed false – never proven false!)
1. Research Hypothesis
2. Collect random sample
3. Set up null hypothesis
4. Sampling distribution under Ho=true
5. Compare sample statistic to distribution
6. Reject or retain Ho
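The six steps can be sketched as a small simulation (all numbers here are invented for illustration; under Ho, both samples come from the same normal population):

```python
import random

random.seed(42)

def mean(xs):
    return sum(xs) / len(xs)

# Step 4: build the sampling distribution of the difference between two
# sample means under Ho (both samples drawn from the same population).
def sampling_distribution(pop_mean, pop_sd, n, reps=10_000):
    diffs = []
    for _ in range(reps):
        a = [random.gauss(pop_mean, pop_sd) for _ in range(n)]
        b = [random.gauss(pop_mean, pop_sd) for _ in range(n)]
        diffs.append(mean(a) - mean(b))
    return diffs

# Step 5: probability of a difference at least as large as the one observed.
def p_value(diffs, observed):
    extreme = sum(1 for d in diffs if abs(d) >= abs(observed))
    return extreme / len(diffs)

# Hypothetical numbers: population mean 40, sd 10, samples of n = 25.
diffs = sampling_distribution(pop_mean=40, pop_sd=10, n=25)
p = p_value(diffs, observed=1.0)   # small observed difference -> large p

# Step 6: reject Ho only if p falls below the chosen significance level.
reject = p < 0.05
```

This mirrors the logic of the steps: the distribution is built under the assumption that Ho is true, and the observed difference is compared against it.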
- Usually the opposite of the research hypothesis
→ set up in order to be disproven (because we can never prove a hypothesis, only disprove it)
- Options are to reject Ho or to suspend judgement about it.
→ If Ho cannot be rejected, judgement about it has to be suspended.
(e.g. the school experiment continues)
- Options are to reject Ho or to accept that it is true.
→ If Ho cannot be rejected, Ho is considered true until disproven.
(e.g. the school experiment stops, until new evidence forces reconsideration)
- Confusion between the probability of the hypothesis given the data and the probability of the data given the hypothesis.
→ p = .045 is the probability of the data given that Ho is true → p(D | Ho)
- Descriptives (mean, range, variance, correlation coefficient)
- Describe characteristics of the samples
- Statistical procedures with their own sampling distributions (t, F, χ²)
Decisions about the Null-Hypothesis
Rejection level / significance level:
- Sample score falls inside the 5% tail of the assumed distribution → rejection region
→ If it falls there, a result this extreme would occur by chance only 5% of the time if Ho were true
→ Therefore it is statistically significant
Type I and Type II Errors
Type I : (Jackpot Error)
- Rejecting Ho when it is actually true
→ The probability of making this error is expressed as alpha
→ With alpha = .05 we will make this error 5% of the time (when Ho is true)
Type II :
- Failing to reject Ho when it is actually false
→ The probability of making this error is expressed as beta
→ How often we make it depends, among other things, on the size of the rejection region
- Less Type I error = more Type II error (a trade-off)
Power = 1 − beta: the probability of correctly rejecting a false Ho. The larger the distance between the sample mean and the population mean (and the smaller beta), the greater the power.
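A quick Monte-Carlo sketch of these two error rates, using an invented example (a z-test with a known population standard deviation; all numbers are hypothetical):

```python
import random

random.seed(1)

def mean(xs):
    return sum(xs) / len(xs)

def reject_h0(sample, mu0, sd, n):
    # Two-tailed z-test at alpha = .05 (critical value 1.96).
    z = (mean(sample) - mu0) / (sd / n ** 0.5)
    return abs(z) > 1.96

def rejection_rate(true_mu, mu0, sd=10, n=25, reps=5_000):
    # Draw many samples from the *true* population and count rejections.
    rejections = sum(
        reject_h0([random.gauss(true_mu, sd) for _ in range(n)], mu0, sd, n)
        for _ in range(reps)
    )
    return rejections / reps

alpha_hat = rejection_rate(true_mu=50, mu0=50)   # Ho true: Type I rate ~ .05
power_hat = rejection_rate(true_mu=56, mu0=50)   # Ho false: power = 1 - beta
beta_hat = 1 - power_hat                          # Type II rate
```

With these made-up numbers the Type I rate hovers near .05, and power rises as the true mean moves further from the hypothesized one.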
One and Two Tailed Tests
One-tailed / directional test:
- tests only one direction of the distribution, at the 5% level
Two-tailed / nondirectional test:
- tests for both negative and positive deviations, at the 2.5% level on each side
- Reasons: researchers have no clue what the data will look like,
and cover themselves in case the prediction was wrong.
One-tailed tests are hard to define if there are more than two groups.
→ Try to keep the significance level low.
2 Questions to deal with any new statistic
1. How and with what assumptions is the statistic calculated?
2. What does the statistic's sampling distribution look like under Ho?
Alternative view of hypothesis testing
- Null hypothesis: m1 = m2; alternative hypothesis: m1 ≠ m2 (two-tailed)
According to Jones, Tukey and Harris
- 3 possible conclusions
1. m1 < m2
2. m1 > m2
3. m1 = m2
- Conclusion 3 is ruled out, because two population means are never exactly the same. So we test both directions at the same time. This allows us to keep 5% at each end of the distribution, because whichever tail the result does not fall in is simply discarded.
Basic Concepts of Probability
Analytic view: the common definition of probability. An event can occur in A ways and fail to occur in B ways.
→ all possible ways are equally likely (definite probability, e.g. 50%)
Probability of occurrence: A/(A+B) → p(blue)
Probability of failure to occur: B/(A+B) → p(green)
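A minimal sketch of the analytic definition, using an invented bag of 100 sweets (24 blue, 16 green, 60 other; counts chosen to match the .24 and .16 used in the later examples):

```python
# Analytic view: an event occurs in A ways and fails in B ways,
# with all ways equally likely.
def p_occur(a, b):
    return a / (a + b)

def p_fail(a, b):
    return b / (a + b)

# Hypothetical bag of 100 sweets: 24 are blue, 76 are not.
p_blue = p_occur(24, 76)        # 0.24
p_not_blue = p_fail(24, 76)     # 0.76

# Same bag: 16 green, 84 not green.
p_green = p_occur(16, 84)       # 0.16
```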
Frequentist view: probability is the limit of the relative frequency of occurrence
→ over many throws, a die will land on each side approx. 1/6 of the time (proportions)
Subjective probability: an individual's subjective estimate (opposite of the frequentist view)
→ used with Bayes' theorem
→ usually disagrees with the general hypothesis-testing orientation
Event: The occurrence of “something”
Independent events: a set of events that have no effect on each other's occurrence
Mutually exclusive events: the occurrence of one event precludes the occurrence of the alternative event.
Exhaustive events: all possible outcomes (e.g. the sides of a die) are considered.
(Sampling with replacement: before drawing a new sweet, the previous draw is put back.)
Additive law of probability: (events must be mutually exclusive)
The probability that one of several events occurs is equal to the sum of their separate probabilities.
p(blue or green) = p(blue) + p(green) = .24 + .16 = .40
→ one outcome (occurrence)
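The additive law can be verified by brute-force enumeration over a hypothetical bag of 100 sweets matching the .24 and .16 in the example:

```python
from fractions import Fraction

# Invented bag: 24 blue, 16 green, 60 other sweets.
bag = ["blue"] * 24 + ["green"] * 16 + ["other"] * 60

def p(event):
    # Exact probability of an event over one draw from the bag.
    return Fraction(sum(1 for s in bag if event(s)), len(bag))

# Blue and green are mutually exclusive, so the additive law holds exactly:
p_or = p(lambda s: s in ("blue", "green"))
assert p_or == p(lambda s: s == "blue") + p(lambda s: s == "green")  # .40
```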
Multiplicative rule: (events must be independent)
The probability of their joint (successive / co-) occurrence is the product of their individual probabilities.
p(blue, blue) = p(blue) * p(blue) = .24 * .24 = .0576
→ minimum of 2 outcomes (occurrences)
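The multiplicative rule can likewise be checked by enumerating all ordered pairs of draws with replacement (same invented bag; replacement keeps the draws independent):

```python
from fractions import Fraction

# Invented bag: 24 blue, 16 green, 60 other sweets.
bag = ["blue"] * 24 + ["green"] * 16 + ["other"] * 60

p_blue = Fraction(24, 100)

# Enumerate every equally likely ordered pair (drawing WITH replacement).
joint = Fraction(0)
for first in bag:
    for second in bag:
        if first == "blue" and second == "blue":
            joint += Fraction(1, 100 * 100)

# Independence makes the joint probability the product of the parts:
assert joint == p_blue * p_blue   # 576/10000 = .0576
```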
Joint probability: the probability of the co-occurrence of two or more events
- If the events are independent, it can be calculated with the multiplicative rule
- If not independent, the procedure is much more complicated (not given in the book)
Denoted as: p(A, B) → p(blue, green)
Conditional probability: the probability that an event occurs given that another event has occurred.
→ hypothesis testing: if Ho is true, the probability of this result is …
→ a conditional can be read as: if … is true, then …
Denoted as: p(A | B) → p(Aids | drug user)
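A sketch of a conditional probability with an invented 2×2 table of counts (the numbers are made up purely for illustration):

```python
from fractions import Fraction

# Hypothetical counts for 10,000 people:
#                 drug user   not drug user
# has Aids             30            10
# no Aids             970          8990
aids_and_user = 30
drug_users = 30 + 970          # everyone in the "drug user" column

# p(A | B) = p(A, B) / p(B): restrict attention to the given condition B.
p_aids_given_user = Fraction(aids_and_user, drug_users)   # 30/1000 = .03
```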
Discrete variable: can take on only specific values → 1, 2, 3, 4, 5
→ Probability distribution:
Proportions translate directly into probabilities
→ can be read from the ordinate (Y-axis) – relative frequency
Continuous variable: can take on infinitely many values → 1.234422, 2.234, 4 …
→ A variable in an experiment can be treated as
continuous if it is at least on an ordinal scale (e.g. IQ)
Density: height of the curve at point X
→ Probability distribution:
The probability of one specific score is not useful, because p(X = exactly 2)
is essentially zero; an observation will rather be something like 2.1233
→ measure an interval instead: e.g. 1.5 – 2.5
→ the area under the curve from a to b = our probability → use distribution tables (later chapters)
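For example, the area from a to b can be computed from the cumulative distribution function; this sketch assumes a normal distribution and uses the standard erf-based formula for its CDF (the mean and interval are invented):

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    # Standard identity: Phi(x) = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def p_interval(a, b, mu=0.0, sigma=1.0):
    # p(a <= X <= b) = area under the density between a and b.
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# p(X = exactly 2) is zero, but an interval around 2 has real probability:
p_exact = p_interval(2.0, 2.0, mu=2.0)   # zero-width interval -> 0
p = p_interval(1.5, 2.5, mu=2.0, sigma=1.0)
```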
Measurement data (also quantitative data): each observation is a score on a continuum (e.g. mean, st. dev.)
Categorical data (also frequency data): data consist of frequencies of observations that fall into 2 or more
categories. → remember frequency tables
Chi-square (χ²): 2 different meanings: 1. a mathematical distribution that stands on its own,
or 2. Pearson's chi-square: a statistical test whose result is distributed in
approximately the same way as χ²
Assumptions of the chi-square test: observations need to be independent of each other
+ the aim is to test the independence of variables (significance of findings)
6.1 Chi-Square Distribution
Gamma function: a generalization of the factorial.
When the argument of gamma (k/2) is an integer, gamma(k/2) = [(k/2) – 1]!
→ Gamma functions are needed because the arguments are not always integers
- Chi-square has only one parameter, k. (≠ two-parameter functions with µ and σ)
- Everything else is either a constant, e, or another value of χ²
(- χ²₃ is read as “chi-square with 3 degrees of freedom = df” (explained later))
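The factorial connection can be checked directly with Python's math.gamma (a quick illustration, not from the book):

```python
import math

# For an integer argument n, gamma(n) = (n - 1)!
assert math.gamma(5) == math.factorial(4)   # both are 24

# For the chi-square density the argument is k/2, which is often a
# half-integer, so the factorial alone is not enough:
k = 3
g = math.gamma(k / 2)   # gamma(1.5) = sqrt(pi) / 2, no integer factorial
```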
Chi-square test: - based on the χ² distribution.
- can be used for one-dimensional tables and two-dimensional tables (contingency tables)
!!!! Beware: we need large expected frequencies: the χ² distribution is continuous and cannot provide a good
approximation if there are only a few possible expected frequencies, which are discrete.
→ Each expected frequency should be ≥ 5, otherwise there is low power to reject Ho.
(e.g. flipping a coin only 3 times cannot sensibly be compared with the distribution because the
expected frequencies are just too small)
Nonoccurrences: have to be included in the table. We cannot compare 2 variables using only the occurrences.
Goodness-of-fit test: tests whether the differences between observed and expected scores are big enough
to question whether they arose by chance, i.e. whether they are significant.
observed frequency: the data actually collected
expected frequency: the frequency expected if Ho were true.
6.2.1 Tabled Chi-Square Distribution
We have obtained a value for χ² and now we compare it to the χ² distribution to get a probability,
so we can decide whether our χ² is significant (reject Ho) or not.
For this we use the tabled distribution of χ²:
it depends on df = degrees of freedom → df = k − 1 (number of categories − 1)
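A goodness-of-fit sketch with invented die-roll counts (the .05 critical value 11.07 for df = 5 is taken from a standard χ² table):

```python
# Did 120 die rolls depart from a fair die? (Counts are hypothetical.)
observed = [25, 17, 15, 23, 24, 16]       # N = 120 rolls
expected = [120 / 6] * 6                  # 20 per face if Ho (fair die) is true

# Pearson's chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1                    # k - 1 = 5 categories minus one
critical_05 = 11.07                       # tabled chi-square, df = 5, alpha = .05
reject_h0 = chi_sq > critical_05          # here chi_sq = 5.0, so retain Ho
```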
We want to know whether a variable is contingent or conditional on a second variable.
We do this with a contingency table, whose row and column sums are the
marginal totals. Expected cell frequency: (row total * column total) / N
See also: the formula for the joint occurrence of independent events (Chapter 5)
Now continue with calculation of the chi-square to determine significance of findings.
Now, to assess whether our χ² is significant, we first have to calculate the degrees of freedom (df) to know
where to look in the χ² distribution table
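The contingency-table computation can be sketched end to end with an invented 2×2 table (for contingency tables, df = (rows − 1) × (columns − 1); the .05 critical value 3.841 for df = 1 comes from a standard χ² table):

```python
# Invented 2x2 contingency table of frequencies:
#              condition A   condition B
# success            30            20
# failure            10            40
table = [[30, 20], [10, 40]]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Expected cell frequency under independence: (row total * column total) / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(2)
)

df = (2 - 1) * (2 - 1)          # (rows - 1) * (columns - 1) = 1
critical_05 = 3.841             # tabled chi-square, df = 1, alpha = .05
significant = chi_sq > critical_05
```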
Yates's correction for continuity: reduce the absolute value of each numerator (O − E) by 0.5 before squaring
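Applied to an invented 2×2 table, the correction shrinks χ² slightly (counts are hypothetical; the correction is meant for df = 1 tables):

```python
# Yates's correction: shrink each |O - E| by 0.5 before squaring.
def chi_sq_yates(table):
    n = sum(sum(row) for row in table)
    row_t = [sum(row) for row in table]
    col_t = [sum(col) for col in zip(*table)]
    total = 0.0
    for i in range(2):
        for j in range(2):
            e = row_t[i] * col_t[j] / n        # expected cell frequency
            o = table[i][j]                     # observed cell frequency
            total += (abs(o - e) - 0.5) ** 2 / e
    return total

# Hypothetical table; the uncorrected statistic would be about 16.67.
corrected = chi_sq_yates([[30, 20], [10, 40]])   # slightly smaller
```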
Fisher's Exact Test: is mentioned, but I think not exam material. If it is, I will update the summary.
Kappa (κ): a statistic that measures interjudge agreement using contingency tables (not based on chi-square) → a measure of reliability
→ corrects for chance agreement
1. First calculate the expected frequencies for the diagonal cells (the cells in which the judges agree = relevant)
2. Apply the formula. Result = κ (kappa)
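The two steps can be sketched with an invented agreement table for two judges (the standard formula κ = (observed agreement − expected agreement) / (N − expected agreement), on frequencies):

```python
# Invented 2x2 agreement table: rows = judge 1's rating, cols = judge 2's.
table = [[20, 5],
         [10, 15]]

n = sum(sum(row) for row in table)
row_t = [sum(row) for row in table]
col_t = [sum(col) for col in zip(*table)]

# Step 1: expected frequencies for the diagonal (agreement) cells.
observed_agree = sum(table[i][i] for i in range(2))
expected_agree = sum(row_t[i] * col_t[i] / n for i in range(2))

# Step 2: kappa corrects the observed agreement for chance agreement.
kappa = (observed_agree - expected_agree) / (n - expected_agree)
```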