Inferential Statistics, Howell Chapter 4,5,6

Sampling error:
Also “chance variability”: variability in findings that is due to chance


Hypothesis testing:
Reason: Data are ambiguous → sample means differ, but that alone tells us little
Goal: Find out whether the difference is large or small, i.o.w. whether it is statistically significant

Sampling distributions:
What degree of sample-to-sample variability can we expect in the data?
Tells us what variability we can expect under certain conditions (e.g. if the population means are equal).
Can also be constructed for other statistics: range, variance, etc.

Sampling distribution of differences between means:
The distribution of the difference between two sample means over repeated sampling

Standard error:
The standard deviation of the sampling distribution of a statistic, i.e. how much the statistic is expected to vary when samples are drawn repeatedly.
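A minimal simulation sketch of this idea (the population, sample size, and number of samples below are all made up for illustration):

```python
import random
import statistics

random.seed(42)

# Hypothetical population: assume scores are uniform integers 0-100.
population = [random.randint(0, 100) for _ in range(10_000)]

# Draw many samples of n = 25 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 25))
    for _ in range(2_000)
]

# The standard deviation of those sample means is the empirical standard error.
empirical_se = statistics.stdev(sample_means)

# Theory predicts SE = sigma / sqrt(n).
theoretical_se = statistics.pstdev(population) / 25 ** 0.5

print(f"empirical SE:   {empirical_se:.2f}")
print(f"theoretical SE: {theoretical_se:.2f}")
```

The two values should agree closely, which is the point: the standard error can be derived from one sample's statistics instead of resampling.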

Theory of Hypothesis Testing

                - Reporting statistical significance alone is no longer sufficient (p < .05)
                → also inform the reader about power, confidence limits and effect size

                - Try to find out whether the difference in sample means is likely if the samples were drawn from populations with an equal mean

                1. Set up the research hypothesis, e.g. parking takes longer if someone watches
                2. Collect random samples under the 2 conditions
                3. Set up Ho = null hypothesis = the population means of the 2 samples are equal
                4. Derive the sampling distribution of the mean difference under the condition that Ho is true
                5. Calculate the probability of a mean difference at least as large as the one obtained
                6. Reject or fail to reject Ho (failing to reject does not prove Ho true !!!!)

                1. Research Hypothesis
                2. Collect random sample
                3. Set up null hypothesis
                4. Sampling distribution under Ho=true
                5. Compare sample statistic to distribution
                6. Reject or retain Ho
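The six steps above can be sketched as a randomization test (the parking-time data below are invented for illustration; Howell's actual example uses different numbers):

```python
import random
import statistics

random.seed(1)

# Steps 1-2: hypothetical data, seconds to leave a parking spot.
watched   = [42, 55, 48, 51, 60, 47, 53, 58]
unwatched = [39, 44, 41, 50, 43, 46, 40, 45]

observed_diff = statistics.mean(watched) - statistics.mean(unwatched)

# Steps 3-4: build the sampling distribution of the mean difference under Ho
# by repeatedly shuffling the pooled scores into two arbitrary groups.
pooled = watched + unwatched
diffs = []
for _ in range(10_000):
    random.shuffle(pooled)
    diffs.append(statistics.mean(pooled[:8]) - statistics.mean(pooled[8:]))

# Step 5: probability of a difference at least as large as the one obtained
# (two-tailed, so we count both directions).
p = sum(abs(d) >= abs(observed_diff) for d in diffs) / len(diffs)

# Step 6: reject Ho if p is below the chosen significance level.
print(f"observed difference: {observed_diff:.2f}, p = {p:.4f}")
```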

                Null hypothesis:
- Usually the opposite of the research hypothesis
                → set up in order to be disproven (because we can never prove a hypothesis, only disprove it)

Statistical conclusions

                - Fisher:
                - Options are to reject Ho or suspend judgement about it.
                → If Ho cannot be rejected, judgement about it has to be suspended.
                     (e.g. the school experiment continues)

                - Neyman-Pearson:
                - Options are to reject or accept that Ho is true.
                → If Ho cannot be rejected, Ho has to be considered true until disproven.
                    (e.g. the school experiment stops, until the evidence has to be reconsidered)

                Conditional Probabilities:
- Confusion between the probability of the hypothesis given the data and the probability of the data given the hypothesis.
                → p = .045 is the probability of the data given that Ho is true → p(D | Ho)

Test Statistics

                Sample statistics:
                - Descriptives (mean, range, variance, correlation coefficient)
                - Describe characteristics of the samples

                Test statistics:
                - Results of statistical procedures, each with its own sampling distribution (t, F, X²)


Decisions about the Null-Hypothesis

                Rejection level / significance level:
                - Sample statistic falls inside the most extreme 5% of the assumed distribution → rejection region
                → If it falls there, a result at least this extreme would occur only 5% of the time if Ho were true
                → Therefore it is statistically significant

Type I and Type II Errors

                Type I : (Jackpot Error)
                - Rejecting Ho when it is actually true
                → Probability of making this error is expressed as alpha
→ With alpha = .05 we will make this error 5% of the time

                Type II :
- Failing to reject Ho when it is actually false
                → Probability of making this error is expressed as beta
→ How often we make this error depends, among other things, on the size of the rejection region

                - Less Type I error = more Type II error

If beta is smaller, the probability of detecting a real difference increases (power = 1 − beta). Beta shrinks, for example, when the true distance between the population means is bigger. → More power

One and Two Tailed Tests

                One-tailed / directional test:
- tests only one direction (tail) of the distribution, at the 5% level

                Two-tailed / nondirectional test:
- tests for both negative and positive extremes, at the 2.5% level per tail
                - Reasons: No clue what the data will look like
                                     Researchers cover themselves in the event the prediction was wrong
                                     One-tailed tests are hard to define (if more than two groups)

                → When in doubt, use the two-tailed test to keep the overall significance level at 5%.

                2 Questions to deal with any new statistic

                1. How and with what assumptions is the statistic calculated?
                2. What does the statistic's sampling distribution look like under Ho?
                → compare the obtained value to that distribution

Alternative view of hypothesis testing

                Traditional way:
                - Null hypothesis: µ1 = µ2; alternative hypothesis: µ1 ≠ µ2 (two-tailed)

                According to Jones, Tukey and Harris
                - 3 possible conclusions
                1. µ1 < µ2
                2. µ1 > µ2
                3. µ1 = µ2

  • 3. is ruled out, because two population means are essentially never exactly equal. So we test both directions at the same time. This allows us to keep 5% levels at both ends of the distribution, because we will only ever act on one of them.

Basic Concepts of Probability

1.0 Probability

1.1 3 concepts:

1.2 Basic Terminology and Rules

1.3 Basic Laws:

2.0 Discrete vs Continuous Variables

2.1 Definitions


1.0 Probability

1.1 3 concepts:

 Analytic view: Common definition of probability. An event can occur in A ways and fail to occur in B ways.
                                → all possible ways are equally likely (definite probability, e.g. 50%)

                                Probability of occurrence:            A/(A+B) → p(blue)
                                Probability of failure to occur:     B/(A+B) → p(green)

Frequentist view: Probability is the limit of the relative frequency of occurrence
                                    → a die will land on a given side approx. 1/6th of the time over many throws (proportions)

Subjective probability: An individual's subjective estimate (opposite of the frequentist view)
                                             → use of Bayes' theorem
                                              → usually disagrees with the general hypothesis-testing orientation

1.2 Basic Terminology and Rules

Event: The occurrence of “something”

Independent events: Set of events that have no effect on each other's occurrence

Mutually exclusive events: The occurrence of one event precludes the occurrence of the alternative event.

Exhaustive events: All possible occurrences/outcomes (e.g. of a die) are considered.

Theorem: Rule

(Sampling with replacement: before drawing a new sweet (occurrence), the previous draw is put back.)

1.3 Basic Laws:

Additive law of probability: (events must be mutually exclusive)
                                                        The probability that one or the other event occurs is equal to the sum of their separate probabilities.

                                                        p(blue or green) = p(blue) + p(green) = .24 + .16 = .40

                                                        → one outcome (occurrence)

Multiplicative rule: (events must be independent)
                                        The probability of their joint (successive / co-) occurrence is the product of their individual probabilities

                                        p(blue, blue) = p(blue) * p(blue) = .24 * .24 = .0576

                                        → minimum 2 outcomes (occurrences)

Joint probability: Probability of the co-occurrence of two or more events
                                   - If independent, p can be calculated with the multiplicative rule
                                   - If not independent, the procedure is more complicated (not given in the book)

                                   Denoted as:    p(A, B)  →  p(blue, green)

Conditional probability: Probability that an event occurs given that another event has occurred.
                                                 → hypothesis testing: if Ho is true, the probability of this result is …
                                                 → "conditional" can be read as: if … is true, then …

                                                 Denoted as:       p(A | B) → p(Aids | drug user)
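The three laws can be checked numerically with the sweets probabilities from the text (the joint probability used for the conditional example is a made-up value):

```python
# Probabilities from the sweets example in the text.
p_blue, p_green = 0.24, 0.16

# Additive law (mutually exclusive colours): one draw, either colour.
p_blue_or_green = p_blue + p_green      # .24 + .16 = .40

# Multiplicative rule (independent draws, sampling with replacement).
p_blue_blue = p_blue * p_blue           # .24 * .24 = .0576

# Conditional probability via its definition: p(A | B) = p(A, B) / p(B).
# The joint and marginal probabilities here are hypothetical numbers.
p_A_and_B = 0.03
p_B = 0.10
p_A_given_B = p_A_and_B / p_B           # 0.30

print(p_blue_or_green, p_blue_blue, p_A_given_B)
```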


2.0 Discrete vs Continuous Variables

2.1 Definitions

                Discrete variable: Can take on only specific values → 1, 2, 3, 4, 5

                                → Probability distribution:
                                      Proportions translate directly to probabilities
                                 → can be read off the ordinate (Y-axis) – relative frequency

                Continuous variable: Can take on infinitely many values → 1.234422 , 2.234 , 4 …
                                                          → A variable in an experiment can be considered
                                                                  continuous if it is at least on an ordinal scale (e.g. IQ)

                                Density: height of the curve at point X

                                → Probability distribution:
                                      The likelihood of one specific score is not useful, because p(X = exactly 2)
                                      is essentially zero; an observed value will rather be something like 2.1233
                                      → Measure an interval instead, e.g. 1.5 – 2.5
                                      → The area under the curve over the interval a to b = our probability → use distribution tables (later chapters)
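A sketch of the interval idea, assuming (hypothetically) that X is normally distributed; the CDF is built from Python's math.erf instead of a distribution table:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical assumption: X ~ Normal(mu=2, sigma=1).
# p(X == 2 exactly) is zero; instead take the area over the interval 1.5 - 2.5.
p_interval = normal_cdf(2.5, mu=2, sigma=1) - normal_cdf(1.5, mu=2, sigma=1)
print(f"P(1.5 <= X <= 2.5) = {p_interval:.4f}")   # about 0.3829
```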



6.0 Basics for Chi-Square tests

6.1 Chi-Square Distribution

6.2 Chi-Square Goodness of Fit Test – One-way Classification

6.2.1 Tabled Chi-Square Distribution

6.3 Two Classification Variables: Contingency Table Analysis

6.3.2 Correcting for Continuity (for 2 x 2 tables + expected frequency is small)

6.3.3 Fisher's Exact Test (another test, besides the chi-square test)

6.12 Kappa - Measure of Agreement

6.13 How to write down findings – see book !!!!


6.0 Basics for Chi-Square tests

                Measurement data (also quantitative data): Observations represent scores on a continuum (e.g. mean, st. dev.)

                Categorical data (also frequency data): Data consist of frequencies of observations that fall into 2 or more
                                                    categories → remember frequency tables

                Chi-square X²: 2 different meanings:       1. A mathematical distribution that stands on its own
                or Pearson's chi-square                               2. A statistical test whose result is distributed in
                                                                                                     approximately the same way as X²

                Assumptions of the chi-square test: Observations need to be independent of each other
                                                                                + Aim is to test the independence of variables (significance of findings)

6.1 Chi-Square Distribution
                Chi-square distribution (density, for k degrees of freedom):

                                                                f(X²) = [1 / (Γ(k/2) · 2^(k/2))] · (X²)^((k/2) − 1) · e^(−X²/2)

                                                                Gamma function: Γ(n) = (n − 1)! when n is an integer
                                                                → when the argument k/2 is an integer, Γ(k/2) = [(k/2) – 1]!
                                                                → gamma functions are needed because the argument is not always an integer

                                                                - Chi-square has only one parameter, k.    (≠ two-parameter functions with µ and σ)

                                                                - Everything else in the formula is either a constant (2, e) or a value of X² itself

                                                                (- X²3 is read as "chi-square with 3 degrees of freedom" = df    (expl. later))

6.2 Chi-Square Goodness of Fit Test – One-way Classification



                Chi-square test: - based on the X² distribution.
                                                 - can be used for one-dimensional tables and two-dimensional (contingency) tables

                !!!! Beware: We need large expected frequencies: the X² distribution is continuous and cannot provide a good
                                        approximation if we have only a few possible expected frequencies, which are discrete.
                                      → Each expected frequency should be at least 5 (Efreq. ≥ 5), otherwise low power to reject Ho.
                                        (e.g. flipping a coin only 3 times cannot sensibly be compared with the X² distribution because
                                        the frequencies are just too small) – It could be compared but this is stupid :P

                                       Nonoccurrences: Have to be included in the table. We cannot analyse a variable by recording
                                                                        only the occurrences.

                Goodness-of-fit test: Tests whether the differences between observed and expected frequencies are big enough to
                                                          question whether they arose by chance. A significance test (or independence test).

                                observed frequency: Actual data collected
                                expected frequency: Frequency expected if Ho were true

                                X² = Σ (O − E)² / E
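A minimal sketch of a one-way goodness-of-fit calculation, X² = Σ(O − E)²/E, using invented die-roll data:

```python
# Hypothetical one-way example: 60 rolls of a die, testing whether it is fair.
observed = [8, 12, 9, 11, 13, 7]
expected = [60 / 6] * 6          # Ho: each face equally likely -> E = 10 per face

# X² = sum of (O - E)^2 / E over all categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1           # k - 1 categories

# Compare chi_sq to the tabled X² distribution with df = 5
# (critical value at the .05 level is 11.07, so 2.8 is not significant).
print(f"chi-square = {chi_sq:.2f}, df = {df}")
```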





6.2.1 Tabled Chi-Square Distribution
                We have obtained a value for X² and now we have to compare it to the X² distribution to get a probability,
                so we can decide whether our X² is significant (reject Ho, supporting H1) or not.
                For this we use the tabled distribution of X²:
depends on df = degrees of freedom → df = k − 1 (number of categories − 1)


6.3 Two Classification Variables: Contingency Table Analysis

                We want to know whether one variable is contingent (conditional) on a second variable.
                We do this by using a

                contingency table:

                Marginal totals: the row totals and the column totals of the table.

                Expected cell frequency:  E = (Rowtotal * Columntotal) / N

                                                                     See also: Formula for the joint occurrence of independent events (chapter 5)

                df = (rows − 1) * (columns − 1)  → for a 2 x 2 table: 1 df

                Now continue with the calculation of the chi-square to determine the significance of the findings.

                To assess whether our X² is significant, we first have to calculate the degrees of freedom (df) to know
                where to look in the X² distribution table
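A sketch of the contingency-table calculation with an invented 2 x 2 table; expected frequencies come from (row total * column total) / N:

```python
# Hypothetical 2 x 2 contingency table: e.g. condition (rows) x outcome (columns).
table = [[30, 10],
         [20, 40]]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Expected frequency per cell under Ho (independence): E = (row * column) / N.
chi_sq = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        chi_sq += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (len(table[0]) - 1)   # (rows - 1) * (columns - 1)
print(f"chi-square = {chi_sq:.2f}, df = {df}")
```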








6.3.2 Correcting for Continuity (for 2 x 2 tables + expected frequency is small)

                Yates' correction for continuity: reduce the absolute value of each numerator |O − E| by 0.5 before squaring
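A sketch of the corrected statistic for an invented small-frequency 2 x 2 table (each |O − E| is reduced by 0.5 before squaring):

```python
def chi_sq_yates(table):
    """Chi-square for a 2 x 2 table with Yates' correction for continuity:
    each |O - E| is reduced by 0.5 before squaring."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            exp = row_totals[i] * col_totals[j] / n   # (row * column) / N
            stat += (abs(table[i][j] - exp) - 0.5) ** 2 / exp
    return stat

# Hypothetical table with small expected frequencies.
print(round(chi_sq_yates([[8, 2], [4, 6]]), 3))
```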


            6.3.3 Fisher's Exact Test (another test, besides the chi-square test)

                 Fisher's Exact Test: Is mentioned, but I think not exam material. If it is, I will update the summary.


6.12 Kappa - Measure of Agreement

                Kappa (κ): Statistic that measures interjudge agreement by using contingency tables (not based on chi-square)
                                      → a measure of reliability
                                      → corrects for chance agreement

1. First calculate the expected frequencies for the diagonal cells (= the cells in which the judges agree = relevant)
2. Apply the formula:  κ = (Σ f_O − Σ f_E) / (N − Σ f_E)  → the result is kappa
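The two steps can be sketched as follows, with an invented 3 x 3 agreement table:

```python
# Hypothetical agreement table: judge A (rows) vs judge B (columns),
# three categories; the diagonal cells are the agreements.
table = [[20,  5,  0],
         [ 3, 15,  2],
         [ 1,  4, 10]]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Step 1: observed and expected frequencies for the diagonal (agreement) cells.
observed_agree = sum(table[i][i] for i in range(3))
expected_agree = sum(row_totals[i] * col_totals[i] / n for i in range(3))

# Step 2: kappa corrects the observed agreement for chance agreement.
kappa = (observed_agree - expected_agree) / (n - expected_agree)
print(f"kappa = {kappa:.3f}")
```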


6.13 How to write down findings – see book !!!!
