What is a confidence interval in null hypothesis significance testing?

A confidence interval is an interval produced by an algorithm that, under repeated use, has an X% chance of containing the true population value.
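The "repeated use" part of this definition can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: a normal population with a known standard deviation, a hypothetical true mean of 100, and a 95% z-interval. None of these numbers come from the course material.

```python
import random
from statistics import NormalDist


def z_interval(sample, sigma, level=0.95):
    """Return a z-based confidence interval for the mean (sigma assumed known)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / len(sample) ** 0.5
    centre = sum(sample) / len(sample)
    return centre - half_width, centre + half_width


def coverage(true_mean=100.0, sigma=15.0, n=30, repeats=5000, level=0.95, seed=1):
    """Proportion of repeatedly computed intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, level)
        hits += low <= true_mean <= high  # True counts as 1
    return hits / repeats


# Hypothetical settings; the proportion should come out close to the chosen level.
print(coverage())
```

The X% describes the procedure over many repetitions. Any single interval either contains the true value or it does not.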

WSRt, critical thinking - a summary of all articles needed in the fourth block of second year psychology at the uva

This is a summary of the articles and reading materials that are needed for the fourth block in the course WSR-t. This course is given to second year psychology

Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs - summary of an article by Borsboom, Rhemtulla, Cramer, van der Maas, Scheffer and Dolan

Critical thinking
Article: Borsboom, Rhemtulla, Cramer, van der Maas, Scheffer and Dolan (2016)
Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs

The present paper reviews psychometric modelling approaches that can be used to investigate the question whether psychopathology constructs are discrete or continuous dimensions through application of statistical models.

Introduction
Measurement theoretical definitions of kinds and continua
Kinds and continua as psychometric entities
Alternative latent variable models

Introduction

The question of whether mental disorders should be thought of as discrete categories or as continua represents an important issue in clinical psychology and psychiatry.

The DSM-V typically adheres to a categorical model, in which discrete diagnoses are based on patterns of symptoms.

However, such categorizations often involve apparently arbitrary conventions.

Measurement theoretical definitions of kinds and continua

All measurement starts with categorization, the formation of equivalence classes.
Equivalence classes: sets of individuals who are exchangeable with respect to the attribute of interest.
We may not succeed in finding an observational procedure that in fact yields the desired equivalence classes.

We may find that individuals who have been assigned the same label are not indistinguishable with respect to the attribute of interest.
Because there are now three classes rather than two, in addition to the relation between individuals within classes (equivalence), we may also represent systematic relations between members of different classes.
One may do so by invoking the concept of order.
But, we may find that within these classes, there are non-trivial differences between individuals that we wish to represent.

If we break down the classes further, we may represent them with a scale that starts to approach continuity.

The continuity hypothesis formally implies that:

in between any two positions lies a third that can be empirically instantiated
there are no gaps in the continuum.

In psychological terms, categorical representations line up naturally with an interpretation of disorders as discrete disease entities, while continuum hypotheses are most naturally consistent with the idea that a construct varies continuously in a population.

In a continuous interpretation, the distinction between individuals depends on the imposition of a cut-off score that does not reflect a gap that is inherent in the attribute itself.

Kinds and continua as psychometric entities

In psychology, we have no way to decide conclusively whether two individuals are ‘equally depressed’.
This means we cannot form the equivalence classes necessary for measurement theory to operate.
The standard approach to dealing with this situation in psychology is to presume that, even though equivalence classes for theoretical entities like depression

Access:

Public

Toward a Model-Based Approach to the Clinical Assessment of Personality Psychopathology - summary of an article by Eaton, Krueger, Docherty, and Sponheim

Critical thinking
Article: Eaton, Krueger, Docherty, and Sponheim (2013)
Toward a Model-Based Approach to the Clinical Assessment of Personality Psychopathology

This paper illustrates how new statistical methods can inform conceptualization of personality psychopathology and therefore its assessment.

The relationship between structure and assessment
Distributional assumptions of personality constructs
Model-based tests of distributional assumptions
Discussion

The relationship between structure and assessment

Structural assumptions about personality variables are inextricably linked to personality assessment.

Reliable assessment of normal-range personality traits and of personality disorder categories frequently takes different forms, given that the constructs of interest are presumed to have different structures.
When assessing personality traits, the assessor needs to measure the full range of the trait dimension to determine where an individual falls on it.
When assessing the presence or absence of a DSM-V personality disorder, the assessor needs to evaluate the presence or absence of the binary categorical diagnosis.
Given the polythetic nature of criterion sets, the purpose of the assessment is to determine which criteria are present, calculate the number of present criteria, and note whether this sum meets or exceeds a diagnostic threshold.
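The count-and-compare logic of a polythetic criterion set can be sketched in a few lines. This is a minimal illustration only: the criterion names, the ratings and the threshold of five are hypothetical and are not taken from any actual diagnostic manual.

```python
def meets_threshold(criteria_present, threshold):
    """Count the criteria rated as present and compare the sum with the threshold."""
    count = sum(1 for present in criteria_present.values() if present)
    return count, count >= threshold


# Hypothetical ratings for a nine-criterion set with an assumed threshold of five.
observed = [True, False, True, True, False, True, True, False, False]
ratings = {f"criterion_{i}": present for i, present in enumerate(observed, start=1)}

count, diagnosed = meets_threshold(ratings, threshold=5)
print(count, diagnosed)
```

Note that two individuals can reach the same sum with different criteria, which is what "polythetic" implies.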

The nature of the personality assessment instrument reflects assumptions about the distributional characteristics of the construct of interest.

Items on DSM-oriented inventories are usually intended to gather converging pieces of information about each criterion to determine whether or not it is present.

Distributional assumptions of personality constructs

Historically, many assumptions about the distributions of data reflecting personality constructs resulted from expert opinion or theory.
Both ‘type’ theories and dimensional theories have been proposed.
Assessment instruments have reflected this bifurcation in conceptualization.

The resulting implications for assessment are far from trivial.
A personality test designed to determine whether an individual is one of two personality types needs only to assess the two characteristics, as opposed to assessing characteristics that are more indicative of the mid-range.
- There is no middle ground in type theory, so items covering the middle ground are not relevant.

Because the structure of personality assessment is reflective of the underlying distributional assumptions of the personality constructs of interest, reliance solely on expert opinion about these distributions is potentially problematic.

Model-based tests of distributional assumptions

It is critical for personality theory and assessment that underlying distributional assumptions of symptomatology be correct and justifiable.

Different distributions impact the way clinical and research constructs are conceptualized, measured, and applied to individuals.
Characterizing these latent constructs properly is a prerequisite for efforts to assess them.
- It is of limited value to assess an improperly conceived construct with high reliability.

Bayes and the probability of hypotheses - summary of Chapter 4 of Understanding Psychology as a science by Dienes

Critical thinking
Chapter 4 of Understanding Psychology as a science by Dienes
Bayes and the probability of hypotheses

Objective probability: a long-run relative frequency.
Classic (Neyman-Pearson) statistics can tell you the long-run relative frequency of different types of errors.

Classic statistics do not tell you the probability of any hypothesis being true.

An alternative approach to statistics is to start with what Bayesians say are people’s natural intuitions.
People want statistics to tell them the probability of their hypothesis being right.
Subjective probability: the subjective degree of conviction in a hypothesis.

Subjective probability
Bayes’ theorem
Bayesian analysis

Subjective probability

Subjective or personal probability: the degree of conviction we have in a hypothesis.
Probabilities are in the mind, not in the world.

The initial problem to address in making use of subjective probabilities is how to assign a precise number to how probable you think a proposition is.
The initial personal probability that you assign to any theory is up to you.
Sometimes it is useful to express your personal convictions in terms of odds rather than probabilities.

Odds(theory is true) = probability(theory is true)/probability(theory is false)
Probability = odds/(odds +1)
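The two conversion formulas above can be checked with a short sketch. The function names are chosen here for illustration, and the example value of 0.8 is arbitrary.

```python
def odds_from_probability(p):
    """Odds(theory is true) = P(true) / P(false), with P(false) = 1 - P(true)."""
    if not 0 <= p < 1:
        raise ValueError("p must lie in [0, 1) for the odds to be finite")
    return p / (1 - p)


def probability_from_odds(odds):
    """Probability = odds / (odds + 1)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (odds + 1)


# Arbitrary example: a personal probability of 0.8 corresponds to odds of about 4 to 1.
odds = odds_from_probability(0.8)
print(odds, probability_from_odds(odds))
```

Converting a probability to odds and back returns the value you started with, which is a quick way to confirm that the two formulas are inverses of each other.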

These numbers we get from deep inside us must obey the axioms of probability.
This is the stipulation that ensures the way we change our personal probability in a theory is coherent and rational.

People’s intuitions about how to change probabilities in the light of new information are notoriously bad.

This is where the statistician comes in and forces us to be disciplined.

There are only a few axioms, each more-or-less self-evidently reasonable.

Two axioms effectively set limits on what values probabilities can take.
All probabilities will lie between 0 and 1.
P(A or B) = P(A) + P(B), if A and B are mutually exclusive.
P(A and B) = P(A) x P(B|A)
- P(B|A) is the probability of B given A.

Bayes’ theorem

H is the hypothesis
D is the data

P(H and D) = P(D) x P(H|D)
P(H and D) = P(H) x P(D|H)

P(D) x P(H|D) = P(H) x P(D|H)

Moving P(D) to the other side

P(H|D) = P(D|H) x P(H) / P(D)

This last one is Bayes’ theorem.
It tells you how to go from one conditional probability to its inverse.
We can simplify this equation if we are interested in comparing the probability of different hypotheses given the same data D.
Then P(D) is just a constant for all these comparisons.

P(H|D) is proportional to P(D|H) x P(H)
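A minimal numerical sketch of this proportionality is given below. It assumes a set of mutually exclusive and exhaustive hypotheses, so that P(D) can be obtained by summing P(D|H) x P(H) over them. The hypothesis labels and all numbers are made up for illustration.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem to every hypothesis.

    priors:      dict mapping hypothesis -> P(H)
    likelihoods: dict mapping hypothesis -> P(D|H)
    Assumes the hypotheses are mutually exclusive and exhaustive.
    """
    unnormalised = {h: likelihoods[h] * priors[h] for h in priors}
    p_data = sum(unnormalised.values())  # P(D), the same constant for every H
    return {h: value / p_data for h, value in unnormalised.items()}


# Hypothetical priors and likelihoods for two competing hypotheses.
priors = {"H1": 0.2, "H2": 0.8}
likelihoods = {"H1": 0.9, "H2": 0.3}
print(posterior(priors, likelihoods))
```

Because P(D) is the same for every hypothesis, dividing by it only rescales the products so that the posterior probabilities sum to 1.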

P(H) is called the prior.
It is how probable you

Bayesian Versus orthodox statistics: which side are you on? - summary of an article by Dienes, 2011

Critical thinking
Article: Dienes, Z, 2011
Bayesian Versus orthodox statistics: which side are you on?
doi: 10.1177/1745691611406920

The contrast: orthodox versus Bayesian statistics
The probabilities of data given theory and theory given data
Problems with the Neyman Pearson approach
The rationality of the Bayesian approach
Effect size
How to calculate a Bayes factor
Multiple testing and cheating
Weaknesses of the Bayesian approach

The contrast: orthodox versus Bayesian statistics

The orthodox logic of statistics starts from the assumption that probabilities are long-run relative frequencies.
A long-run relative frequency requires an indefinitely large series of events that constitutes the collective. The probability of some property (q) occurring is then the proportion of events in the collective with property q.

The probability applies to the whole collective, not to any one person.
- One person may belong to two different collectives that have different probabilities
Long run relative frequencies do not apply to the truth of individual theories because theories are not collectives. They are just true or false.
- Thus, when using this approach to probability, the null hypothesis of no population difference between two particular conditions cannot be assigned a probability.
Given both a theory and a decision procedure, one can determine a long-run relative frequency with which certain data might be obtained. We can symbolize this as P(data| theory and decision procedure).

The logic of Neyman Pearson (orthodox) statistics is to adopt decision procedures with known long-term error rates and then control those errors at acceptable levels.

Alpha: the error rate for false positives, the significance level
Beta: the error rate for false negatives

Thus, setting significance and power controls long-run error rates.

An error rate can be calculated from the tail area of test statistics.
An error rate can be adjusted for factors that affect long-run error rates
These error rates apply to decision procedures, not to individual experiments.
- An individual experiment is a one-time event, so does not constitute a long-run set of events
- A decision procedure can in principle be considered to apply over an indefinitely long run of experiments.
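The idea that an error rate belongs to a decision procedure rather than to a single experiment can be illustrated by simulating the procedure many times. The sketch below makes several assumptions that are not in the article: a two-sided z-test with known standard deviation 1, a true null hypothesis (population mean 0), and 20 observations per experiment.

```python
import random
from statistics import NormalDist


def false_positive_rate(alpha=0.05, n=20, experiments=10000, seed=7):
    """Long-run proportion of rejections when the null hypothesis is true."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cut-off
    rejections = 0
    for _ in range(experiments):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # the null is true
        z = (sum(sample) / n) * n ** 0.5  # sample mean divided by its standard error
        rejections += abs(z) > critical_z
    return rejections / experiments


# The proportion of false positives should be close to alpha.
print(false_positive_rate())
```

Each simulated experiment either rejects or does not. Only the proportion over the whole series approximates alpha.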

The probabilities of data given theory and theory given data

The probability of a theory being true given data can be symbolized as P(theory|data).
This is what orthodox statistics does not tell us; it only gives P(data|theory).
One cannot infer one conditional probability just by knowing its inverse. (So P(theory|data) remains unknown.)

Bayesian statistics starts from the premise that we can assign degrees of plausibility to theories, and what we want our

Network Analysis: An Integrative Approach to the Structure of Psychopathology - summary of an article by Borsboom and Cramer (2013)

Critical thinking
Article: Borsboom, D. and Cramer, A. O. J. (2013)
Network Analysis: An Integrative Approach to the Structure of Psychopathology
doi: 10.1146/annurev-clinpsy-050212-185608

Introduction
Symptoms and disorders in psychopathology
Complex psychopathology networks
Constructing and analysing psychopathology networks
The many roads to disorder: individual networks

Introduction

The current dominant paradigm of the disease model of psychopathology is problematic.
Current handling of psychopathology data is predicated on traditional psychometric approaches that are the technical mirror of this paradigm.
In these approaches, observables (clinical symptoms) are explained by means of a small set of latent variables, just like symptoms are explained by disorders.

From this psychometric perspective, symptoms are regarded as measurements of a disorder, and in accordance, symptoms are aggregated in a total score that reflects a person’s stance on that latent variable.
The dominant paradigm is not merely a matter of theoretical choice, but also of methodological and pragmatic necessity.

In this review, we argue that complex network approaches, which are currently being developed at the crossroads of various scientific fields, have the potential to provide a way of thinking about disorders that does justice to their complex organisation.

In such approaches, disorders are conceptualized as systems of causally connected symptoms rather than as effects of a latent disorder.
Using network analysis techniques, such systems can be represented, analysed, and studied in their full complexity.
In addition, network modeling has the philosophical advantage of dropping the unrealistic idea that symptoms of a single disorder share a single causal background, while it simultaneously avoids the relativistic consequence that disorders are merely labels for an arbitrary set of symptoms.
- It provides a middle ground in which disorders exist as systems, rather than as entities.

Symptoms and disorders in psychopathology

We know for certain that people suffer from symptoms and that these symptoms cluster in a non-arbitrary way.
For most psychopathological conditions, the symptoms are the only empirically identifiable causes of distress.

Mental disorders are themselves not empirically identifiable in that they cannot be diagnosed independently of their symptoms.
- It is impossible to identify any of the common mental disorders as conditions that exist independently of their symptoms.

In order for a disease model to hold, it should be possible to conceptually separate conditions from symptoms.

It must be possible (or at least imaginable) that a person should have a condition/disease without the associated symptoms.

This isn’t possible for mental disorders.
As an important corollary, this means that disorders cannot be causes of these symptoms.
This strongly suggests that the treatment of disorders as

Introduction to qualitative psychological research - an article by Coyle (2015)

Critical thinking
Article: Coyle, A (2015)
Introduction to qualitative psychological research

Introduction

This chapter examines the development of psychological interest in qualitative methods in historical context and points to the benefits that psychology gains from qualitative research.
It also looks at some important issues and developments in qualitative psychology.

Epistemology and the ‘scientific method’
Resistance to the ‘scientific method’: alternative epistemologies and research foci
Reflexivity in qualitative research
Evaluative criteria for qualitative research
Combining research methods and approaches

Epistemology and the ‘scientific method’

At its most basic, qualitative psychological research may be regarded as involving the collection and analysis of non-numerical data through a psychological lens in order to provide rich descriptions and possibly explanations of people’s meaning-making, how they make sense of the world and how they experience particular events.

Qualitative research is bound up with particular sets of assumptions about the bases or possibilities of knowledge.
Epistemology: particular sets of assumptions about the bases or possibilities of knowledge.
Epistemology refers to a branch of philosophy that is concerned with the theory of knowledge and that tries to answer questions about how we can know what we know.
Ontology: the assumptions we make about the nature of being, existence or reality.

Different research approaches and methods are associated with different epistemologies.
The term ‘qualitative research’ covers a variety of methods with a range of epistemologies, resulting in a domain that is characterized by difference and tension.

The epistemology adopted by a particular study can be determined by a number of factors.

A researcher may have a favoured epistemological outlook or position and may locate their research within this, choosing methods that accord with that position.
Alternatively, the researcher may be keen to use a particular qualitative method in their research and so they frame their study according to the epistemology that is usually associated with that method.

Whatever epistemological position is adopted in a study, it is usually desirable to ensure that you maintain this position consistently throughout the write-up to help produce a coherent research report.

Positivism: holds that the relationship between the world and our sense perception of the world is straightforward. There is a direct correspondence between things in the world and our perception of them provided that our perception is not skewed by factors that might damage that correspondence.
So, it is possible to obtain accurate knowledge of things in the world, provided we can adopt an impartial, unbiased, objective viewpoint.

Empiricism: holds that our knowledge of the world must arise from the collection and categorization of our sense perceptions/observations of the world.
This categorization allows us to develop more complex knowledge of the world and to develop theories to explain the world.

Surrogate Science: The Idol of a Universal Method for Scientific Inference - summary of an article by Gigerenzer & Marewski

Critical thinking
Article: Gigerenzer, G. & Marewski, J. N. (2015)
Surrogate Science: The Idol of a Universal Method for Scientific Inference
doi: 10.1177/0149206314547522

Introduction

Scientific inference should not be made mechanically.
Good science requires both statistical tools and informed judgment about what model to construct, what hypotheses to test, and what tools to use.

This article is about the idol of a universal method of statistical inference.

In this article, we make three points:

There is no universal method of scientific inference, but rather a toolbox of useful statistical methods. In the absence of a universal method, its followers worship surrogate idols, such as significant p values.
- The inevitable gap between the ideal and its surrogate is bridged with delusions.
- These mistaken beliefs do much harm, among other things by promoting irreproducible results.
If the proclaimed ‘Bayesian revolution’ were to take place, the danger is that the idol of a universal method might survive in a new guise, proclaiming that all uncertainty can be reduced to subjective probabilities.
Statistical methods are not simply applied to a discipline. They change the discipline itself, and vice versa.

Dreaming up a universal method of inference
Bayesianism and the new quest for a universal method
How statistics change research, surrogate science
Conclusion: Leibniz’s dream or Bayes’ nightmare?

Dreaming up a universal method of inference

The null ritual

The most prominent creation of a seemingly universal inference method is the null ritual:

Set up a null hypothesis of ‘no mean difference’ or ‘zero correlation’. Do not specify the predictions of your own research hypothesis.
Use 5% as a convention for rejecting the null. If significant, accept your research hypothesis. Report the result as p<.05, p<.01, p<.001, whichever comes next to the obtained p value.
Always perform this procedure.

Level of significance has three different meanings:

A mere convention
The alpha level
The exact level of significance

Three meanings of significance

The alpha level: the long-term relative frequency of mistakenly rejecting hypothesis H₀ if it is true, also known as the Type I error rate.
The beta level: the long-term frequency of mistakenly rejecting H₁ if it is true.

Two statistical hypotheses need to be specified in order to be able to determine both alpha and beta.
Neyman and Pearson rejected a mere convention in favour of an alpha level that required a rational scheme.

Set up two statistical hypotheses, H₁, H₂, and decide on alpha, beta and the sample size before the experiment, based on subjective cost-benefit considerations.
If the data fall into the rejection region of H₁, accept H₂; otherwise accept H₁.
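Why two hypotheses are needed to determine beta can be seen in a small calculation. The sketch below is an illustration under assumptions that are not in the article: both hypotheses are point hypotheses about a population mean, the standard deviation is known, and a one-sided z-test is used. All numbers are hypothetical.

```python
from statistics import NormalDist


def beta_for_design(mu_h1, mu_h2, sigma, n, alpha):
    """Type II error rate of a one-sided z-test of H1 (mu_h1) against H2 (mu_h2 > mu_h1)."""
    se = sigma / n ** 0.5  # standard error of the sample mean
    # Sample means above this cut-off fall into the rejection region of H1.
    critical_mean = mu_h1 + NormalDist().inv_cdf(1 - alpha) * se
    # beta: probability that the sample mean stays below the cut-off although H2 is true
    return NormalDist(mu_h2, se).cdf(critical_mean)


# Hypothetical design: H1 says mu = 0, H2 says mu = 0.5, sigma = 1, n = 25, alpha = .05.
beta = beta_for_design(mu_h1=0.0, mu_h2=0.5, sigma=1.0, n=25, alpha=0.05)
print(round(beta, 3), round(1 - beta, 3))  # beta and power
```

Without a specified H₂ there is no value to plug in for `mu_h2`, so beta cannot be computed. With it, alpha, beta and the sample size can be traded off before the experiment.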

WSRt, critical thinking, a list of terms used in the articles of block 4

This is a list of the important terms used in the articles of the fourth block of WSRt, with the subject alternative approaches to psychological research.

Article: Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs
Toward a Model-Based Approach to the Clinical Assessment of Personality Psychopathology
Bayes and the probability of hypotheses
Bayesian Versus orthodox statistics: which side are you on?
Network Analysis: An Integrative Approach to the Structure of Psychopathology
Introduction to qualitative psychological research
Surrogate Science: The Idol of a Universal Method for Scientific Inference - summary of an article by Gigerenzer & Marewski

Article: Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs

Equivalence classes: sets of individuals who are exchangeable with respect to the attribute of interest.

Taxometrics: distinguishing between kinds and continua by inspecting particular consequences of the model for specific statistical properties of (subsets of) items, such as the patterns of bivariate correlations expected to hold in the data.

Toward a Model-Based Approach to the Clinical Assessment of Personality Psychopathology

Latent trait models: posit the presence of one or more underlying continuous distributions.

Zones of rarity: locations along the dimension that are unoccupied by some individuals.

Discrimination: the measure of how strongly the item taps into the latent trait.

Quasi-continuous: the construct would be bounded at the low end by zero, a complete absence of the quality corresponding with the construct.

Latent class models: based on the supposition of a latent group (class) structure for a construct’s distribution.

Conditional independence: the assumption that inter-item correlations solely reflect class membership.

Hybrid models (or factor mixture models): combine the continuous aspects of latent trait models with the discrete aspects of latent class models.

EFMA: exploratory factor mixture analysis.

Bayes and the probability of hypotheses

Objective probability: a long-run relative frequency.

Subjective probability: the subjective degree of conviction in a hypothesis.

The likelihood principle: the notion that all the information relevant to inference contained in data is provided by the likelihood.

Probability density distribution: the distribution used if the dependent variable can be assumed to vary continuously.

Credibility interval: the Bayesian equivalent of a confidence interval

The Bayes factor: the Bayesian equivalent of null hypothesis testing

Flat prior or uniform prior: you have no idea what the population value is likely to be

Bayesian Versus orthodox statistics: which side are you on?

Alpha: the error rate for

Everything you need for the course WSRt of the second year of Psychology at the Uva

This magazine contains all the summaries you need for the course WSRt at the second year of psychology at the Uva.

WSRt, critical thinking, a list of terms used in the articles of block 2

WSRt, critical thinking - a summary of all articles needed in the second block of second year psychology at the uva

WSRt, critical thinking, a list of terms used in the articles of block 3

WSRt using SPSS, manual for tests in the third block of the second year of psychology at the uva

WSRt, critical thinking - a summary of all articles needed in the third block of second year psychology at the uva

WSRt, critical thinking, a list of terms used in the articles of block 4

WSRt, critical thinking - a summary of all articles needed in the fourth block of second year psychology at the uva

Sharon Klinkenberg explains SPSS on YouTube

Summary of Discovering statistics using IBM SPSS statistics by Field - 5th edition

Critical thinking: A concise guide by Bowell & Kemp (4th edition) - a summary

What is a confidence interval in null hypothesis significance testing?

What is the difference between a p-value and Bayes likelihood?

What are important elements of Bayesian statistics?

What is the Bayes factor?

What are weaknesses of the Bayesian approach?

What is qualitative psychological research?

What criteria should be held by good qualitative research?

Year 2 of psychology at the uva

What is a confidence interval in null hypothesis significance testing?

A confidence interval is an interval produced by an algorithm that, under repeated use, has an X% chance of containing the true population value.

What are important elements of Bayesian statistics?

The three most important elements of Bayesian statistics are:

The Prior: the relative plausibility of a hypothesis, before seeing the data
Likelihood: the predictive updating factor
The Posterior: the relative plausibility of a hypothesis, after seeing the data

For more information about Bayesian statistics, check out my summary of the fourth block of WSRt

What is the Bayes factor?

The Bayes factor (B) compares how well the data are predicted by an experimental theory with how well they are predicted by the null hypothesis.
It gives the means of adjusting your odds in a continuous way.

If B is greater than 1, your data support the experimental hypothesis over the null
If B is less than 1, your data support the null over the experimental hypothesis
If B is about 1, then your experiment was not sensitive
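How the Bayes factor adjusts your odds can be sketched as follows, using the standard relation posterior odds = B x prior odds. The two likelihood values and the prior odds are hypothetical numbers chosen for illustration.

```python
def bayes_factor(p_data_given_theory, p_data_given_null):
    """B: how much better the experimental theory predicts the data than the null."""
    return p_data_given_theory / p_data_given_null


def update_odds(prior_odds, b):
    """Posterior odds (theory versus null) = B x prior odds."""
    return b * prior_odds


# Hypothetical values: the data are three times as likely under the theory as under the null.
b = bayes_factor(p_data_given_theory=0.06, p_data_given_null=0.02)
posterior_odds = update_odds(prior_odds=1.0, b=b)
posterior_probability = posterior_odds / (posterior_odds + 1)
print(b, posterior_odds, posterior_probability)
```

With B equal to 1 the posterior odds equal the prior odds, which is why a B of about 1 means the experiment was not sensitive enough to change anyone's mind.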

For more information, look at the (free) summary of 'Bayes and the probability of hypotheses' or 'Bayesian versus orthodox statistics: which side are you on?'

What are weaknesses of the Bayesian approach?

Weaknesses of the Bayesian approach are:

The prior is subjective
Bayesian analyses force people to consider what a theory actually predicts, but specifying the predictions in detail may be contentious
Bayesian analyses escape the paradoxes of violating the likelihood principle, but in doing so they no longer control for Type I and Type II errors

For more information, look at the (free) summary of 'Bayesian versus orthodox statistics: which side are you on?'

What is qualitative psychological research?

At its most basic, qualitative psychological research can be seen as involving the collection and analysis of non-numerical data through a psychological lens in order to provide rich descriptions and possibly explanations of people’s meaning-making, how they make sense of the world and how they experience particular events.

For more information, look at the (free) summary of 'Introduction to qualitative psychological research'

What criteria should be held by good qualitative research?

Criteria that good qualitative research should meet are:

Sensitivity to context
Commitment
Rigour
Transparency
Coherence
Impact and importance

For more information about these criteria, look at my (free) summary of 'Introduction to qualitative psychological research, Coyle (2015)'.

Summary of Discovering statistics using IBM SPSS statistics by Field - 5th edition

This is a summary of the book "Discovering statistics using IBM SPSS statistics" by A. Field. In this summary, everything students at the second year of psychology at the

Why is my evil lecturer forcing me to learn statistics? - summary of chapter 1 of statistics by A. Field (5th edition)

Statistics
Chapter 1
Why is my evil lecturer forcing me to learn statistics?

The research process
Collecting data: research design
Analysing data
Reporting data

The research process

Initial observation: finding something that needs explaining

To see whether an observation is true, you need to define one or more variables to measure that quantify the thing you’re trying to measure.

Generating and testing theories and hypotheses

A theory: an explanation or set of principles that is well substantiated by repeated testing and explains a broad phenomenon.

A hypothesis: a proposed explanation for a fairly narrow phenomenon or set of observations.
An informed, theory-driven attempt to explain what has been observed.

A theory explains a wide set of phenomena with a small set of well-established principles.
A hypothesis typically seeks to explain a narrower phenomenon and is, as yet, untested.
Both theories and hypotheses exist in the conceptual domain, and you cannot observe them directly.

To test a hypothesis, we need to operationalize it in a way that enables us to collect and analyse data that have a bearing on it.
Predictions emerge from a hypothesis. A prediction tells us something about the hypothesis from which it is derived.

Falsification: the act of disproving a hypothesis or theory.

Collecting data: measurement

Independent and dependent variable

Variables: things that can change

Independent variable: a variable thought to be the cause of some effect.

Dependent variable: a variable thought to be affected by changes in an independent variable.

Predictor variable: a variable thought to predict an outcome variable. (independent)

Outcome variable: a variable thought to change as a function of changes in a predictor variable (dependent)

Levels of measurement

The level of measurement: the relationship between what is being measured and the numbers that represent what is being measured.

Variables can be categorical or continuous, and can have different levels of measurement.

A categorical variable is made up of categories.
It names distinct entities.
In its simplest form it names just two distinct types of things (like male or female).
Binary variable: there are only two categories.
Nominal variable: there are more than two categories.

Ordinal variable: when categories are ordered.
Tell us not only that things have occurred, but also the order in which they occurred.
These data tell us nothing about the size of the differences between values on the scale.

Continuous variable: a variable that gives us a score for each person and can take on any value on the measurement scale that we are using.
Interval variable: to say that data are interval, we must be certain that equal intervals on the scale represent equal differences in the property being measured.
Ratio variables: in addition to

The spine of statistics - summary of chapter 2 of Statistics by A. Field (5th edition)

Statistics
Chapter 2
The spine of statistics

What is the spine of statistics?

The spine of statistics: (an acronym for)

Standard error
Parameters
Interval estimates (confidence intervals)
Null hypothesis significance testing
Estimation

Statistical models
Populations and samples
P is for parameters
Standard error
(Confidence) interval
Null hypothesis significance testing

Statistical models

Testing hypotheses involves building statistical models of the phenomenon of interest.
Scientists build (statistical) models of real-world processes to predict how these processes operate under certain conditions. The models need to be as accurate as possible so that the predictions we make about the real world are accurate too.
The degree to which a statistical model represents the data collected is known as the fit of the model.

The data we observe can be predicted from the model we choose to fit plus some amount of error.

Populations and samples

Scientists are usually interested in finding results that apply to an entire population of entities.
Populations can be very general or very narrow.
Usually, scientists strive to infer things about general populations rather than narrow ones.

We collect data from a smaller subset of the population known as a sample, and use these data to infer things about the population as a whole.
The bigger the sample, the more likely it is to reflect the whole population.

P is for parameters

Statistical models are made up of variables and parameters.
Parameters are not measured and are (usually) constants believed to represent some fundamental truth about the relations between variables in the model.
(Like mean and median).

We can predict values of an outcome variable based on a model. The form of the model changes, but there will always be some error in prediction, and there will always be parameters that tell us about the shape or form of the model.

To work out what the model looks like, we estimate the parameters.

The mean as a statistical model

The mean is a hypothetical value and not necessarily one that is observed in the data.

Estimates are marked with a hat (^).

Assessing the fit of a model: sums of squares and variance revisited.

The error or deviance for a particular entity is the score predicted by the model for that entity subtracted from the corresponding observed score.

Degrees of freedom (df): the number of scores used to compute the total adjusted for the fact that we’re trying to estimate the population value.
The degrees of freedom relate to the number of observations that are free to vary.
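
The ideas above (the mean as a model, errors, and degrees of freedom) can be sketched in a few lines of Python. The scores are hypothetical, and the variable names are illustrative only.

```python
# Hypothetical scores for five entities (illustrative data only).
scores = [1, 2, 3, 3, 4]

n = len(scores)
mean = sum(scores) / n  # the model: a single parameter, the mean

# Error (deviance) for each entity: observed score minus the model's prediction.
errors = [score - mean for score in scores]

# Raw errors cancel out, so they are squared before being summed.
sum_of_errors = sum(errors)
sum_squared_errors = sum(e ** 2 for e in errors)

# Dividing by the degrees of freedom (N - 1) gives the mean squared error,
# which for the mean as a model is the variance.
df = n - 1
variance = sum_squared_errors / df

print(f"mean = {mean:.2f}, SS = {sum_squared_errors:.2f}, variance = {variance:.2f}")
```

Note that the mean here (2.6) is not a value that occurs in the data, which illustrates why the mean is called a hypothetical value.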

We can use the sum of squared errors and the mean squared error

The beast of bias - summary of chapter 6 of Statistics by A. Field (5th edition)

Statistics
Chapter 6
The beast of bias

What is bias?
Outliers
Overview of assumptions
SPSS
Reducing bias

What is bias?

Bias: the summary information is at odds with the objective truth.

An unbiased estimator: an estimator that yields an expected value that is the same as the thing it is trying to estimate.

We predict an outcome variable from a model described by one or more predictor variables and parameters that tell us about the relationship between the predictor and the outcome variable.
The model will not predict the outcome perfectly, so for each observation there is some amount of error.

Statistical bias enters the statistical process in three ways:

things that bias the parameter estimates (including effect sizes)
things that bias standard errors and confidence intervals
things that bias test statistics and p-values

Outliers

An outlier: a score very different from the rest of the data.

Outliers have a dramatic effect on the sum of squared error.
If the sum of squared errors is biased, the associated standard error, confidence interval and test statistic will be too.

Overview of assumptions

The second bias is ‘violation of assumptions’.

An assumption: a condition that ensures that what you’re attempting to do works.
If any of the assumptions are not true then the test statistic and p-value will be inaccurate and could lead us to the wrong conclusion.

The main assumptions that we’ll look at are:

additivity and linearity
normality of something or other
homoscedasticity/ homogeneity of variance
independence

Additivity and linearity

The assumption of additivity and linearity: the relationship between the outcome variable and the predictors is accurately described by the equation of the linear model.
The scores on the outcome variable are, in reality, linearly related to any predictors. If you have several predictors then their combined effect is best described by adding their effects together.

If the assumption is not true, even if all the other assumptions are met, your model is invalid because your description of the process you want to model is wrong.

Normally distributed something or other

The assumption of normality relates in different ways to things we want to do when fitting models and assessing them:

Parameter estimates.
The mean is a parameter and extreme scores can bias it.
Estimates of parameters are affected by non-normal distributions (such as those with outliers).
Parameter estimates differ in how much they are biased in a non-normal distribution.
Confidence intervals
We use values of the standard normal distribution to compute the confidence interval around a parameter estimate. Using values of the standard normal distribution makes sense only if the parameter estimates come from

Non-parametric models - summary of chapter 7 of Statistics by A. Field (5th edition)

Statistics
Chapter 7
Non-parametric models

When to use non-parametric tests
Comparing two independent conditions: the Wilcoxon rank-sum test and Mann-Whitney test
Comparing two related conditions: the Wilcoxon signed-rank test
Differences between several independent groups: the Kruskal-Wallis test
Differences between several related groups: Friedman’s ANOVA

When to use non-parametric tests

Sometimes you can’t correct problems in your data.
This is especially irksome if you have a small sample and can’t rely on the central limit theorem to get you out of trouble.

The historical solution is a small family of models called non-parametric tests or assumption-free tests that make fewer assumptions than the linear model.

The four most common non-parametric procedures:

the Mann-Whitney test
the Wilcoxon signed-rank test
Friedman’s test
the Kruskal-Wallis test

All four tests overcome distributional problems by ranking the data.

Ranking the data: finding the lowest score and giving it a rank of 1, then finding the next highest score and giving it a rank of 2, and so on.
This process results in high scores being represented by large ranks, and low scores being represented by small ranks.
The model is then fitted to the ranks and not to the raw scores.

By using ranks we eliminate the effect of outliers.
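
As a rough illustration of ranking, the sketch below assigns ranks to a set of hypothetical scores. Giving tied scores the average of the ranks they would occupy is an assumption here (it is a common convention, but the summary does not discuss ties), and `rank_scores` is an illustrative helper, not a library function.

```python
def rank_scores(scores):
    """Return the rank of each score (1 = lowest).

    Assumption: tied scores share the average of the ranks they occupy.
    """
    ordered = sorted(scores)
    ranks = []
    for score in scores:
        first_position = ordered.index(score) + 1  # 1-based position of first tie
        tie_count = ordered.count(score)
        ranks.append(first_position + (tie_count - 1) / 2)
    return ranks


# Hypothetical data with one extreme score (120).
scores = [15, 35, 16, 18, 16, 120]
ranks = rank_scores(scores)
print(ranks)
```

The extreme score of 120 simply receives the largest rank (6), so its distance from the other scores no longer influences the model.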

Comparing two independent conditions: the Wilcoxon rank-sum test and Mann-Whitney test

There are two choices to compare the distributions in two conditions containing scores from different entities:

the Mann-Whitney test
the Wilcoxon rank-sum test

Both tests are equivalent.
There is also a second Wilcoxon test that does something different.

Theory

If you were to rank the data from lowest to highest, ignoring the group to which a person belonged, and there’s no difference between the groups, then you should find a similar number of high and low ranks in each group.

If you added up the ranks, then you’d expect the summed total of ranks in each group to be about the same.

If you were to rank the data from lowest to highest, ignoring the group to which a person belonged, and there’s a difference between the groups, then you should not find a similar number of high and low ranks in each group.

If you added up the ranks, then you’d expect the summed total of ranks in each group to be different.

The Mann-Whitney and Wilcoxon rank-sum tests use the principles above.

When the groups have unequal numbers of participants in them, the test statistic (W_s) for the Wilcoxon rank-sum test is simply the sum of ranks in the group that contains the fewer people.
- then the group sizes
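
A minimal sketch of the rank-sum idea for unequal group sizes follows. The data are hypothetical, `rank_scores` is an illustrative helper that averages tied ranks (an assumption), and the equal-group-size case is deliberately not handled because the rule for it is not given above.

```python
def rank_scores(scores):
    """Rank scores from lowest (1) to highest; ties share the average rank."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 + (ordered.count(s) - 1) / 2 for s in scores]


# Hypothetical scores from two groups of different entities.
group_a = [15, 35, 16, 18]      # 4 people (the smaller group)
group_b = [19, 17, 27, 39, 30]  # 5 people

# Rank all scores together, ignoring group membership.
ranks = rank_scores(group_a + group_b)
sum_ranks_a = sum(ranks[:len(group_a)])
sum_ranks_b = sum(ranks[len(group_a):])

# W_s: the sum of ranks in the group with fewer people (unequal sizes only).
assert len(group_a) != len(group_b), "equal group sizes are not covered here"
w_s = sum_ranks_a if len(group_a) < len(group_b) else sum_ranks_b

print(sum_ranks_a, sum_ranks_b, w_s)
```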

Correlation - summary of chapter 8 of Statistics by A. Field (5th edition)

Statistics
Chapter 8
Correlation

Modeling relationships
Partial and semi-partial correlation
Comparing correlations
Calculating the effect size
How to report correlation coefficients

Modeling relationships

The data we observe can be predicted from the model we choose to fit the data plus some error in prediction.

Outcome_i= (model) + error_i
Thus
outcome_i= (b₁X_i)+error_i

z(outcome)_i = b₁z(X_i)+error_i

z-scores are standardized scores.

A detour into the murky world of covariance

The simplest way to look at whether two variables are associated is to look whether they covary.
If two variables are related, then changes in one variable should be met with similar changes in the other variable.

Covariance(x,y) = Σⁿ_i=1 (x_i - x̄)(y_i - ȳ) / (N-1)

The equation for covariance is the same as the equation for variance, except that instead of squaring the deviances, we multiply them by the corresponding deviance of the second variable.

A positive covariance indicates that as one variable deviates from the mean, the other variable deviates in the same direction.
A negative covariance indicates that as one variable deviates from the mean, the other deviates from the mean in the opposite direction.

The covariance depends upon the scales of measurement used: it is not a standardized measure.
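
The covariance calculation, and its dependence on the measurement scale, can be sketched as follows. The data and the `covariance` helper are illustrative assumptions.

```python
def covariance(x, y):
    """Sum of cross-product deviations divided by N - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (n - 1)


# Hypothetical paired observations for five entities.
x = [5, 4, 4, 6, 8]
y = [8, 9, 10, 13, 15]

cov_xy = covariance(x, y)

# Re-expressing x in units ten times smaller changes the covariance,
# even though the relationship itself is unchanged.
x_rescaled = [xi * 10 for xi in x]
cov_rescaled = covariance(x_rescaled, y)

print(f"cov = {cov_xy:.2f}, cov after rescaling x = {cov_rescaled:.2f}")
```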

Standardization of the correlation coefficient

To overcome the problem of dependence on the measurement scale, we need to convert the covariance into a standard set of units → standardization.
Standard deviation: a measure of the average deviation from the mean.
If we divide any distance from the mean by the standard deviation, it gives us that distance in standard deviation units.
We can express the covariance in standard units of measurement if we divide it by the standard deviation. But there are two variables and hence two standard deviations, so we divide by the product of both.

Correlation coefficient: the standardized covariance

r = cov_xy/(s_x s_y)

s_x is the standard deviation for the first variable
s_y is the standard deviation for the second variable.

By standardizing the covariance we end up with a value that has to lie between -1 and +1.
A coefficient of +1 indicates that the two variables are perfectly positively correlated.
A coefficient of -1 indicates a perfect negative relationship.
A coefficient of 0 indicates no linear relationship at all.
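
As a sketch, the correlation coefficient can be computed by dividing the covariance by the product of the two standard deviations. The data and the helper name `pearson_r` are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Standardized covariance: cov_xy / (s_x * s_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    return cov_xy / (s_x * s_y)


# Hypothetical paired observations for five entities.
x = [5, 4, 4, 6, 8]
y = [8, 9, 10, 13, 15]

r = pearson_r(x, y)

# Unlike the covariance, r does not change when a variable is rescaled.
r_rescaled = pearson_r([xi * 10 for xi in x], y)

print(f"r = {r:.3f}, r after rescaling x = {r_rescaled:.3f}")
```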

The significance of the correlation coefficient

We can test the hypothesis that the correlation is different from zero.
There are two ways of testing this hypothesis.

We can adjust r so that its sampling distribution is normal:

z_r = ½ log_e((1+r)/(1-r))

The resulting z_r has a standard error given by:

SE_zr = 1/√(N-3)

We can then convert this adjusted r into a z-score:

z = z_r/SE_zr
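
The three steps above can be sketched as a small function. The values r = 0.5 and N = 28 are hypothetical, and `fisher_z_test` is an illustrative name.

```python
import math


def fisher_z_test(r, n):
    """Return (z_r, standard error of z_r, z-score) for a correlation r from n cases."""
    z_r = 0.5 * math.log((1 + r) / (1 - r))  # adjust r so its sampling distribution is normal
    se_zr = 1 / math.sqrt(n - 3)             # standard error of z_r
    z = z_r / se_zr                          # z-score for testing r against zero
    return z_r, se_zr, z


# Hypothetical example: r = 0.5 observed in a sample of 28.
z_r, se_zr, z = fisher_z_test(0.5, 28)
print(f"z_r = {z_r:.4f}, SE = {se_zr:.2f}, z = {z:.2f}")
```

The resulting z can be compared with the critical values of the standard normal distribution (for example 1.96 for a two-tailed test at the .05 level).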

The linear model - summary of Chapter 9 by A. Field 5th edition

Statistics
Chapter 9
The linear model (regression)

An introduction to the linear model (regression)
Bias in linear models?
Generalizing the model
Sample size and the linear model
The linear model with two or more predictors (multiple regression)

An introduction to the linear model (regression)

The linear model with one predictor

outcome_i = (b₀ + b₁x_i) + error_i

This model uses an unstandardised measure of the relationship (b₁) and consequently we include a parameter b₀ that tells us the value of the outcome when the predictor is zero.

Any straight line can be defined by two things:

the slope of the line (usually denoted by b₁)
the point at which the line crosses the vertical axis of the graph (the intercept of the line, b₀)

These parameters are regression coefficients.

The linear model with several predictors

The linear model expands to include as many predictor variables as you like.
An additional predictor can be placed in the model and given a b to estimate its relationship to the outcome:

Y_i = (b₀ +b₁X_1i +b₂X_2i+ … b_nX_ni) + Ɛ_i

b_n is the coefficient of the nth predictor (X_ni)

Regression analysis is a term for fitting a linear model to data and using it to predict values of an outcome variable from one or more predictor variables.
Simple regression: with one predictor variable
Multiple regression: with several predictors

Estimating the model

No matter how many predictors there are, the model can be described entirely by a constant (b₀) and by parameters associated with each predictor (bs).

To estimate these parameters we use the method of least squares.
We could assess the fit of a model by looking at the deviations between the model and the data collected.

Residuals: the differences between what the model predicts and the observed values.

To calculate the total error in a model we square the differences between the observed values of the outcome, and the predicted values that come from the model:

total error: Σⁿ_i=1(observed_i-model_i)²

Because we call these errors residuals, this is called the residual sum of squares (SS_R).
It is a gauge of how well a linear model fits the data.

if the SS_R is large, the model is not representative
if the SS_R is small, the model is representative of the data

The least SS_R gives us the best model.
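
A minimal sketch of least squares with one predictor follows. The closed-form expressions for b₁ and b₀ are the standard least-squares solution for a single predictor (they are not given in the summary above), and the data are hypothetical.

```python
# Hypothetical predictor and outcome scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Standard closed-form least-squares estimates for one predictor.
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
b0 = mean_y - b1 * mean_x


def residual_sum_of_squares(intercept, slope):
    """SS_R: squared differences between observed and predicted outcomes."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


ss_r = residual_sum_of_squares(b0, b1)

# Any other line (here an arbitrary alternative) has a larger SS_R.
ss_r_other = residual_sum_of_squares(2.5, 0.5)

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SS_R = {ss_r:.2f}, other line SS_R = {ss_r_other:.2f}")
```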

Assessing the goodness of fit, sums of squares R and R²

Goodness of fit: how well the model fits the observed data

Total sum of squares (SS_T): how good the mean is as a model of the observed outcome scores.

We can use the values of SS_T and SS_R to calculate how much better the linear model is than the baseline model of ‘no relationship’.
The improvement in prediction
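
As a sketch of how SS_T and SS_R can be combined, the code below uses the standard definitions SS_M = SS_T - SS_R and R² = SS_M / SS_T. These definitions are assumed here because the text above is cut off before stating them. The data are hypothetical.

```python
# Hypothetical predictor and outcome scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares estimates for one predictor.
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
b0 = mean_y - b1 * mean_x

# SS_T: error when the mean is used as the (baseline) model.
ss_t = sum((yi - mean_y) ** 2 for yi in y)

# SS_R: error that remains when the linear model is used.
ss_r = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# SS_M: improvement from using the linear model instead of the mean.
ss_m = ss_t - ss_r

# R squared: proportion of the total variation explained by the model.
r_squared = ss_m / ss_t

print(f"SS_T = {ss_t:.2f}, SS_R = {ss_r:.2f}, SS_M = {ss_m:.2f}, R^2 = {r_squared:.2f}")
```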

Comparing two means - summary of chapter 10 of Statistics by A. Field (5th edition)

Statistics
Chapter 10
Comparing two means

Categorical predictors in the linear model

If we want to compare differences between the means of two groups, all we are doing is predicting an outcome based on membership of two groups.
This is a linear model with one dichotomous predictor.

The t-test
SPSS

The t-test

Independent t-test: used when you want to compare two means that come from conditions consisting of different entities (this is sometimes called the independent-measures or independent-means t-test)
Paired-samples t-test: also known as the dependent t-test. Is used when you want to compare two means that come from conditions consisting of the same or related entities.

Rationale for the t-test

Both t-tests have a similar rationale:

Two samples of data are collected and the sample means calculated. These might differ by either a little or a lot.
If the samples come from the same population, then we expect their means to be roughly equal. Although it is possible for the means to differ because of sample variation, we would expect large differences between sample means to occur very infrequently. Under the null hypothesis we assume that the experimental manipulation has no effect on the participant’s behaviour: therefore, we expect means from two random samples to be very similar.
We compare the difference between the sample means that we collected to the difference between the sample means that we would expect to obtain (in the long run) if there were no effect. We use the standard error as a gauge of the variability between sample means. If the standard error is small, then we expect most samples to have very similar means. When the standard error is large, large differences in sample means are more likely. If the difference between the samples we have collected is larger than we would expect based on the standard error then one of two things has happened:
- There is no effect but sample means from our population fluctuate a lot and we happen to have collected two samples that produce very different means.
- The two samples come from different populations, which is why they have different means, and this difference is indicative of a genuine difference between the samples.
The larger the observed difference between the sample means, the more likely it is that the second explanation is correct.

Most test statistics are a signal-to-noise ratio: the ‘variance explained by the model’ divided by the ‘variance that the model can’t explain’.
Effect divided by error.
When comparing two means, the model we fit is the difference between the two group means. Means vary from sample to sample (sampling variation) and we can use the standard error as a measure of how much means fluctuate. Therefore, we can use the standard error of the differences between the
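
A rough sketch of the 'effect divided by error' idea for two independent groups is shown below. The specific standard-error formula (the square root of the summed variance-over-sample-size terms) is an assumption, since the text above is cut off before giving one. With equal group sizes, as here, it gives the same value as the pooled-variance version. The data are hypothetical.

```python
import math
import statistics

# Hypothetical scores from two conditions with different entities.
group_1 = [4, 5, 6, 7, 8]
group_2 = [1, 2, 3, 4, 5]

# Signal: the difference between the two group means.
mean_difference = statistics.mean(group_1) - statistics.mean(group_2)

# Noise: standard error of the difference between means
# (assumed formula: sqrt(s1^2 / n1 + s2^2 / n2)).
se_difference = math.sqrt(
    statistics.variance(group_1) / len(group_1)
    + statistics.variance(group_2) / len(group_2)
)

t = mean_difference / se_difference
print(f"difference = {mean_difference}, SE = {se_difference:.2f}, t = {t:.2f}")
```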

Moderation, mediation, and multi-category predictors - summary of chapter 11 of Statistics by A. Field (5th edition)

Statistics
Chapter 11
Moderation, mediation, and multi-category predictors

Moderation: interactions in the linear model
Mediation
Categorical predictors in regression

Moderation: interactions in the linear model

The conceptual model

Moderation: for a statistical model to include the combined effect of two or more predictor variables on an outcome.
This is in statistical terms an interaction effect.

A moderator variable: one variable that affects the relationship between two others.
Can be continuous or categorical.
We can explore this by comparing the slope of the regression plane for X at low and high levels of Y.

The statistical model

Moderation is conceptually.

In the statistical model, we predict the outcome from the predictor variable, the proposed moderator, and the interaction of the two.
It is the interaction effect that tells us whether moderation has occurred, but we must include the predictor and moderator for the interaction term to be valid.

Outcome_i = (model) + error_i

Y_i = (b₀ + b₁X_1i + b₂X_2i + … + b_nX_ni) + Ɛ_i

To add variables to a linear model we literally just add them in and assign them a parameter (b).
Therefore, if we had two predictors labelled A and B, a model that tests for moderation would be expressed as:

Y_i = (b₀ + b₁A_i + b₂B_i + b₃AB_i) + Ɛ_i

The interaction is AB_i

Centring variables

When an interaction term is included in the model the b parameters have a specific meaning: for the individual predictors they represent the regression of the outcome on that predictor when the other predictor is zero.

But there are situations where it makes no sense for a predictor to have a score of zero. So the interaction term makes the bs for the main predictors uninterpretable in many situations.
For this reason, it is common to transform the predictors using grand mean centring.
Centring: the process of transforming a variable into deviations around a fixed point.
This fixed point can be any value that you choose, but typically it’s the grand mean.
The grand mean centring for a given variable is achieved by taking each score and subtracting from it the mean of all scores (for that variable).

Centring the predictors has no effect on the b for the highest-order predictor, but will affect the bs for the lower-order predictors.
Order: how many variables are involved.
When we centre variables, the bs represent the effect of the predictor when the other predictor is at its mean value.
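
Grand mean centring, and building the interaction term from the centred predictors, can be sketched as follows. The predictors A and B hold hypothetical values, and `centre` is an illustrative helper.

```python
def centre(values):
    """Grand mean centring: subtract the mean of all scores from each score."""
    grand_mean = sum(values) / len(values)
    return [value - grand_mean for value in values]


# Hypothetical scores on two predictors.
a = [2, 4, 6, 8]
b = [1, 3, 5, 11]

a_centred = centre(a)
b_centred = centre(b)

# The interaction term (AB) is the product of the two centred predictors.
interaction = [ai * bi for ai, bi in zip(a_centred, b_centred)]

print(a_centred, b_centred, interaction)
```

After centring, a score of zero on either predictor corresponds to its mean, which is what makes the lower-order bs interpretable.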

Centring is important when your model contains an interaction term because it makes the bs for lower-order effects interpretable.
There are good reasons for not caring about the lower-order effects when the higher-order interaction involving these effects is significant.

when it is

Comparing several independent means - summary of chapter 12 of Statistics by A. Field (5th edition)

Statistics
Chapter 12
Comparing several independent means

Using a linear model to compare several means
Assumptions when comparing means
Planned contrast (contrast coding)
Post hoc procedures
Comparing several means
Calculating the effect size
Reporting results from one-way independent ANOVA

Using a linear model to compare several means

ANOVA: analysis of variance
the same thing as the linear model or regression.

In designs in which the group sizes are unequal, it is important that the baseline category contains a large number of cases to ensure that the estimates of the b-values are reliable.

When we are predicting an outcome from group membership, predicted values from the model are the group means.
If the group means are meaningfully different, then using the group means should be an effective way to predict scores.

Prediction_i= b₀ + b₁X + b₂Y + Ɛ_i

Control = b₀
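
The sketch below illustrates dummy coding with a control group and two other groups (hypothetical data). With dummy coding, the least-squares b-values equal the control mean and the differences between each group mean and the control mean, so the code computes them directly from the group means instead of fitting a regression.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical outcome scores for three groups of different entities.
control = [3, 2, 1, 1, 4]
group_x = [5, 2, 4, 2, 3]
group_y = [7, 4, 5, 3, 6]

# Dummy coding: control = (0, 0), group X = (1, 0), group Y = (0, 1).
b0 = mean(control)                  # intercept: mean of the baseline (control) group
b1 = mean(group_x) - mean(control)  # difference between group X and control
b2 = mean(group_y) - mean(control)  # difference between group Y and control


def predict(dummy_x, dummy_y):
    """Predicted outcome for a case with the given dummy codes."""
    return b0 + b1 * dummy_x + b2 * dummy_y


print(predict(0, 0), predict(1, 0), predict(0, 1))
```

Each prediction reproduces the mean of the corresponding group.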

Dummy coding is only one of many ways to code dummy variables.

An alternative is contrast coding, in which you code the dummy variables in such a way that the b-values represent differences between groups that you specifically hypothesized before collecting data.

The F-test is an overall test that doesn’t identify differences between specific means. But, the model parameters do.

Logic of the F-statistic

The F-statistic tests the overall fit of a linear model to a set of observed data.
F is the ratio of how good the model is compared to how bad it is.
When the model is based on group means, our predictions from the model are those means.

if the group means are the same then our ability to predict the observed data will be poor (F will be small)
if the means differ we will be able to better discriminate between cases from different groups (F will be large).

F tells us whether the group means are significantly different.

The same logic as for any linear model:

the model that represents ‘no effect’ or ‘no relationship between the predictor variable and the outcome’ is one where the predicted value of the outcome is always the grand mean
we can fit a different model to the data that represents our alternative hypotheses. We compare the fit of this model to the fit of the null model
the intercept and one or more parameters (b) describe the model
the parameters determine the shape of the model that we have fitted.
in experimental research the parameters (b) represent the differences between group means. The bigger the differences between group means, the greater the difference between the model and the null model (grand mean)
if the differences between group means are large enough, then the resulting model will be a better fit to the data than the null model
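
The comparison of the group-means model with the null model (grand mean) can be sketched as below. The data are hypothetical, and the use of mean squares (sums of squares divided by their degrees of freedom) to form F follows the standard one-way ANOVA definition, which is assumed here because it is not spelled out above.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical outcome scores for three independent groups.
groups = {
    "control": [3, 2, 1, 1, 4],
    "group_x": [5, 2, 4, 2, 3],
    "group_y": [7, 4, 5, 3, 6],
}

all_scores = [s for scores in groups.values() for s in scores]
n_total = len(all_scores)
k = len(groups)
grand_mean = mean(all_scores)

# Null model: every score is predicted by the grand mean.
ss_t = sum((s - grand_mean) ** 2 for s in all_scores)

# Model: every score is predicted by its group mean.
ss_m = sum(len(scores) * (mean(scores) - grand_mean) ** 2 for scores in groups.values())
ss_r = sum((s - mean(scores)) ** 2 for scores in groups.values() for s in scores)

# Mean squares: sums of squares divided by their degrees of freedom.
ms_m = ss_m / (k - 1)
ms_r = ss_r / (n_total - k)

f = ms_m / ms_r
print(f"SS_M = {ss_m:.2f}, SS_R = {ss_r:.2f}, F = {f:.2f}")
```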

Analysis of covariance - summary of chapter 13 of Statistics by A. Field (5th edition)

Statistics
Chapter 13
Comparing means adjusted for other predictors (analysis of covariance)

What is ANCOVA?
ANCOVA and the general linear model
Assumptions and issues in ANCOVA
Interpreting ANCOVA
Calculating the effect size
Reporting results

What is ANCOVA?

The linear model to compare means can be extended to include one or more continuous variables that predict the outcome (or dependent variable).
Covariates: the additional predictors.

ANCOVA: analysis of covariance.

Reasons to include covariates in ANOVA:

To reduce within-group error variance
Elimination of confounds

ANCOVA and the general linear model

For example:

Happiness_i = b₀ + b₁Long_i + b₂Short_i + b₃Covariate_i + Ɛ_i

We can add a covariate as a predictor to the model to test the difference between group means adjusted for the covariate.

With a covariate present, the b-values represent the differences between the means of each group and the control adjusted for the covariate(s).

Assumptions and issues in ANCOVA

Independence of the covariate and treatment effect

When the covariate and the experimental effect are not independent, the treatment effect is obscured, spurious treatment effects can arise, and at the very least the interpretation of the ANCOVA is seriously compromised.

When treatment groups differ on the covariate, putting the covariate into the analysis will not ‘control for’ or ‘balance out’ those differences.
This problem can be avoided by randomizing participants to experimental groups, or by matching experimental groups on the covariate.

We can see whether this problem is likely to be an issue by checking whether experimental groups differ on the covariate before fitting the model.
If they do not significantly differ then we might consider it reasonable to use it as a covariate.

Homogeneity of regression slopes

When a covariate is used we look at its overall relationship with the outcome variable; we ignore the group to which a person belongs.
We assume that this relationship between covariate and outcome variable holds true for all groups of participants: homogeneity of regression slopes.

There are situations where you might expect regression slopes to differ across groups and that variability may be interesting.

What to do when assumptions are violated

bootstrap for the model parameters
post hoc tests

But bootstrap won’t help for the F-tests.

There is a robust variant of ANCOVA.

Interpreting ANCOVA

The main analysis

The format of the ANOVA table is largely the same as without the covariate, except that there is an additional row of information about the covariate.

looking first at the significance values,

Factorial designs - summary of chapter 14 of statistics by A. Field (5th edition)

Statistics
Chapter 14
Factorial designs

Factorial designs
Independent factorial designs and the linear model
Model assumptions in factorial design
Output from factorial design
Calculating effect sizes
Reporting results of factorial design

Factorial designs

Factorial design: when an experiment has two or more independent variables.
There are several types of factorial designs:

Independent factorial design: there are several independent variables or predictors and each has been measured using different entities (between groups).
Repeated-measures (related) factorial design: several independent variables or predictors have been measured, but the same entities have been used in all conditions.
Mixed design: several independent variables or predictors have been measured: some have been measured with different entities, whereas others used the same entities.

We can still fit a linear model to the design.
Factorial ANOVA: the linear model with two or more categorical predictors that represent experimental independent variables.

Independent factorial designs and the linear model

The general linear model takes the following general form:

Y_i =b₀ + b₁X_1i+b₂X_2i+... +b_nX_ni+Ɛ_i

We can code participants’ category membership on variables with zeros and ones.

For example:

Attractiveness_i = b₀+b₁A_i+b₂B_i+b₃AB_i+Ɛ_i

b₃AB is the interaction term. AB is the A dummy variable multiplied by the B dummy variable.
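
To illustrate how the dummy variables and their product work in a 2 × 2 design, the sketch below derives the b-values from hypothetical cell means. For a dummy-coded model with all four terms, the bs can be read off the cell means directly, so no regression routine is needed.

```python
# Hypothetical cell means, keyed by (A dummy, B dummy).
cell_means = {
    (0, 0): 6.0,
    (1, 0): 5.0,
    (0, 1): 6.5,
    (1, 1): 3.0,
}

b0 = cell_means[(0, 0)]       # mean of the cell where both dummies are 0
b1 = cell_means[(1, 0)] - b0  # effect of A when B = 0
b2 = cell_means[(0, 1)] - b0  # effect of B when A = 0
# Interaction: how much the effect of A changes when B goes from 0 to 1.
b3 = (cell_means[(1, 1)] - cell_means[(0, 1)]) - (cell_means[(1, 0)] - cell_means[(0, 0)])


def predict(a, b):
    """Predicted outcome; the interaction variable is the product of the dummies."""
    ab = a * b
    return b0 + b1 * a + b2 * b + b3 * ab


for cell, observed_mean in cell_means.items():
    print(cell, predict(*cell), observed_mean)
```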

Behind the scenes of factorial designs

Calculating the F-statistic with two categorical predictors is very similar to when we had only one.

We still find the total sum of squared errors (SS_T) and break this variance down into variance that can be explained by the model/experiment (SS_M) and variance that cannot be explained (SS_R)
The main difference is that with factorial designs, the variance explained by the model/experiment is made up of not one predictor, but two.

Therefore, the sum of squares gets further subdivided into

variance explained by the first predictor/independent variable (SS_A)
variance explained by the second predictor/independent variable (SS_B)
variance explained by the interaction of these two predictors (SS_AxB)

Total sum of squares (SS_T)

We start off by calculating how much variability there is between scores when we ignore the experimental condition from which they came.

The grand variance: the variance of all scores when we ignore the group to which they belong.
We treat the data as one big group.
The degrees of freedom are: N-1

SS_T = s²_Grand(N-1)
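
As a quick check of this formula, the sketch below computes SS_T from the grand variance and compares it with the sum of squared deviations from the grand mean. The scores are hypothetical and the cell labels are illustrative.

```python
import statistics

# Hypothetical scores from four cells of a 2 x 2 design.
scores_by_cell = {
    "A0_B0": [6, 7],
    "A1_B0": [5, 5],
    "A0_B1": [6, 7],
    "A1_B1": [3, 3],
}

# Treat the data as one big group, ignoring cell membership.
all_scores = [s for scores in scores_by_cell.values() for s in scores]
n = len(all_scores)

grand_variance = statistics.variance(all_scores)  # sample variance (divides by N - 1)
ss_t = grand_variance * (n - 1)

# The same quantity computed directly as squared deviations from the grand mean.
grand_mean = statistics.mean(all_scores)
ss_t_direct = sum((s - grand_mean) ** 2 for s in all_scores)

print(f"SS_T = {ss_t:.2f}, direct = {ss_t_direct:.2f}")
```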

The model sum of squares (SS_M)

The model sum of squares is broken down into the variance attributable to the first independent variable, the variance attributable to the second independent variable, and the variance attributable to the interaction of those two.

The model sum of squares: the difference between what the model predicts and the overall mean of the outcome variable.
What the model predicts is the group mean.
We

Repeated measures designs - summary of chapter 15 of Statistics by A. Field (5th edition)

Statistics
Chapter 15
Repeated measures designs

Introduction to repeated-measures designs
Repeated measures and the linear model
The ANOVA approach to repeated-measures designs
The F-statistic for repeated-measures designs
Assumptions in repeated-measures designs
One-way repeated-measures designs
Effect sizes for one-way repeated-measures designs
Reporting one-way repeated-measures designs
Factorial repeated-measures designs
Effect sizes for factorial repeated-measures designs
Reporting the results from factorial repeated-measures designs

Introduction to repeated-measures designs

Repeated measures: when the same entities participate in all conditions of an experiment or provide data at multiple time points.

Repeated measures and the linear model

Repeated measures can also be considered as a variation of the general linear model.

For example:

Y_gi = b_0i +b_1iX_gi +Ɛ_gi

b_0i = b₀ + u_0i

b_1i = b₁ + u_1i

Y_gi is the outcome at level g within person i, predicted from the specific predictor X_gi with the error Ɛ_gi

g stands for the level of the treatment condition
i stands for the individual

u_0i for the deviation of the individual’s intercept from the group-level intercept

The ANOVA approach to repeated-measures designs

The way that people typically handle repeated measures in IBM SPSS is to use a repeated-measures ANOVA approach.

The assumption of sphericity

The assumption that permits us to use a simpler model to analyse repeated-measures data is sphericity.

Sphericity: assuming that the relationship between scores in pairs of treatment conditions is similar.

It is related to compound symmetry, which holds true when both the variances across conditions are equal and the covariances between pairs of conditions are equal.
We assume that the variation within conditions is similar and that no two conditions are any more dependent than any other two.
Sphericity is a more general, less restrictive form of compound symmetry and refers to the equality of variances of the differences between treatment levels.

For example:

variance_A-B = variance_A-C = variance_B-C
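
The variances of the differences between conditions can be computed as in the sketch below. The scores are hypothetical: each position in the lists is the same entity measured in conditions A, B and C.

```python
import statistics

# Hypothetical scores: five entities, each measured in three conditions.
condition_a = [10, 15, 25, 35, 30]
condition_b = [12, 15, 30, 30, 27]
condition_c = [8, 12, 20, 28, 20]


def variance_of_differences(first, second):
    """Sample variance of the per-entity difference scores."""
    differences = [f - s for f, s in zip(first, second)]
    return statistics.variance(differences)


var_a_b = variance_of_differences(condition_a, condition_b)
var_a_c = variance_of_differences(condition_a, condition_c)
var_b_c = variance_of_differences(condition_b, condition_c)

print(f"A-B: {var_a_b:.1f}, A-C: {var_a_c:.1f}, B-C: {var_b_c:.1f}")
```

Sphericity concerns whether these three variances are roughly equal. Judging how much they may differ is what a formal test is for.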

Assessing the severity of departures from sphericity

Mauchly’s test: assesses the hypothesis that the variances of the differences between conditions are equal.
If the test is statistically significant, it implies that there are significant differences between the variances of differences and, therefore, sphericity is not met.
If it is not significant, the implication is that the variances of differences are roughly equal and sphericity is met.
It depends upon sample size.

What’s the effect of violating the assumption of sphericity?

A lack of sphericity creates a loss of power and an F-statistic that doesn’t have the distribution that it’s supposed to have.
It also causes some complications for post hoc tests.

What do you do if you violate sphericity?

Adjust

Mixed designs - summary of chapter 16 of Statistics by A. Field (5th edition)

Statistics
Chapter 16
Mixed designs

Mixed designs
Assumptions in mixed designs
Mixed designs
Calculating effect sizes

Mixed designs

Situations where we combine repeated-measures and independent designs.

Mixed designs: when a design includes some independent variables that were measured using different entities and others that used repeated measures.
A mixed design requires at least two independent variables.

Because by adding independent variables we’re simply adding predictors to the linear model, you can have virtually any number of independent variables if your sample size is big enough.

We’re still essentially using the linear model.
Because there are repeated measures involved, people typically use an ANOVA-style model: mixed ANOVA.

Assumptions in mixed designs

All the sources of potential bias in chapter 6 apply.

homogeneity of variance
sphericity

You can apply the Greenhouse-Geisser correction and forget about sphericity.

Mixed designs

Mixed designs compare several means when there are two or more independent variables, and at least one of them has been measured using the same entities and at least one other has been measured using different entities.
Correct for deviations from sphericity for the repeated-measures variable(s) by routinely interpreting the Greenhouse-Geisser corrected effects.
The table labelled Tests of Within-Subjects Effects shows the F-statistic(s) for any repeated-measures variables and all of the interaction effects. For each effect, read the row labelled Greenhouse-Geisser or Huynh-Feldt. If the value in the Sig. column is less than 0.05 then the means are significantly different.
The table labelled Tests of Between-Subjects Effects shows the F-statistic(s) for any between-group variables. If the value in the Sig. column is less than 0.05 then the means of the groups are significantly different.
Break down the main effects and interaction terms using contrasts. These contrasts appear in the table labelled Tests of Within-Subjects Contrasts. Again, look at the column labelled Sig.
Look at the means, or draw graphs, to help you interpret contrasts.

Calculating effect sizes

Effect sizes are more useful when they summarize a focused effect.

A straightforward approach is to calculate effect sizes for your contrasts.

Multivariate analysis of variance (MANOVA) - summary of chapter 17 of Statistics by A. Field (5th edition)

Statistics
Chapter 17
Multivariate analysis of variance (MANOVA)

Introducing MANOVA
Introducing matrices
The theory behind MANOVA
Practical issues when conducting MANOVA
Summary
Reporting results from MANOVA

Introducing MANOVA

Multivariate analysis of variance (MANOVA) is used when we are interested in several outcomes.

The principles of the linear model extend to MANOVA in that we can use MANOVA when there is one independent variable or several, we can look at interactions between independent variables, and we can do contrasts to see which groups differ.

Univariate: the model when we have only one outcome variable.
Multivariate: the model when we include several outcome variables simultaneously.

We shouldn’t fit separate linear models to each outcome variable.

Separate models can tell us only whether groups differ along a single dimension, whereas MANOVA has the power to detect whether groups differ along a combination of dimensions.

Choosing outcomes

It is a bad idea to lump outcome measures together in a MANOVA unless you have a good theoretical or empirical basis for doing so.
Where there is a good theoretical basis for including some, but not all, of your outcome measures, fit separate models: one for the outcomes being tested on a heuristic basis and one for the theoretically meaningful outcomes.

The point here is not to include lots of outcome variables in a MANOVA just because you measured them.

Introducing matrices

A matrix: a grid of numbers arranged in columns and rows.
A matrix can have many columns and rows, and we specify its dimensions using numbers.
For example: a 2 x 3 matrix is a matrix with two rows and three columns.

The values within a matrix are components or elements.
The rows and columns are vectors.

A square matrix: a matrix with an equal number of columns and rows.

An identity matrix: a square matrix in which the diagonal elements are 1 and the off-diagonal elements are 0.

The matrix that represents the systematic variance (or the model sum of squares for all variables) is denoted by the letter H and is called the hypothesis sum of squares and cross-products matrix (or hypothesis SSCP).

The matrix that represents the unsystematic variance (the residual sums of squares for all variables) is denoted by the letter E and called the error sum of squares and cross-products matrix (or error SSCP).

The matrix that represents the total amount of variance present for each outcome variable is denoted by T and is called the total sum of squares and cross-products matrix (or total SSCP).

Cross-products represent a total value for the combined error between two variables, whereas the sum of squares of a variable is the total squared difference between the observed values and the mean.
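
To make the three matrices concrete, here is a minimal sketch that builds T, E and H for two outcome variables measured in two groups and shows that T = H + E. The data, group names and helper functions are hypothetical and only serve the illustration; the diagonal of each matrix holds sums of squares and the off-diagonal elements hold cross-products.

```python
# Hypothetical data: each tuple is one participant's scores on two outcome variables.
groups = {
    "control": [(1.0, 4.0), (2.0, 5.0), (3.0, 6.0)],
    "treatment": [(5.0, 5.0), (6.0, 7.0), (7.0, 9.0)],
}


def mean_vector(rows):
    """Mean of each outcome variable (column) across the given rows."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def sscp(rows, centre):
    """Sums of squares (diagonal) and cross-products (off-diagonal) around `centre`."""
    p = len(centre)
    matrix = [[0.0] * p for _ in range(p)]
    for row in rows:
        deviations = [value - c for value, c in zip(row, centre)]
        for i in range(p):
            for j in range(p):
                matrix[i][j] += deviations[i] * deviations[j]
    return matrix


def add(a, b):
    """Element-wise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


all_rows = [row for rows in groups.values() for row in rows]
grand_mean = mean_vector(all_rows)
p = len(grand_mean)

# Total SSCP: deviations of every score from the grand mean.
T = sscp(all_rows, grand_mean)

E = [[0.0] * p for _ in range(p)]
H = [[0.0] * p for _ in range(p)]
for rows in groups.values():
    group_mean = mean_vector(rows)
    # Error SSCP: deviations of scores from their own group mean.
    E = add(E, sscp(rows, group_mean))
    # Hypothesis SSCP: deviation of the group mean from the grand mean,
    # counted once for every participant in the group.
    H = add(H, sscp([tuple(group_mean)] * len(rows), grand_mean))

print("T =", T)
print("H =", H)
print("E =", E)
```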

Exploratory factor analysis - summary of chapter 18 of Statistics by A. Field (5th edition)

Statistics
Chapter 18
Exploratory factor analysis

In factor analysis, we take a lot of information (variables) and a computer effortlessly reduces this into a simple message (fewer variables).

When to use factor analysis
Factors and components
Discovering factors
Preliminary analysis
Factor extraction
Interpretation
How to report factor analysis
Reliability analysis
Reliability

When to use factor analysis

Latent variable: something that cannot be accessed directly.

Factor analysis asks which observable measures are driven by the same underlying variable.

Factor analysis and principal component analysis (PCA) are techniques for identifying clusters of variables.
Three main uses:

To understand the structure of a set of variables.
To construct a questionnaire to measure an underlying variable.
To reduce a data set to a more manageable size while retaining as much of the original information as possible.

Factors and components

If we measure several variables, or ask someone several questions about themselves, the correlation between each pair of variables can be arranged in a table.

This table is sometimes called the R-matrix.
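
A minimal sketch of such a correlation table, assuming three hypothetical questionnaire items answered by five people. The item names, scores and helper functions are invented for the example.

```python
import math

# Hypothetical responses: each list is one item, each position one respondent.
items = {
    "item1": [1, 2, 3, 4, 5],
    "item2": [2, 4, 6, 8, 10],
    "item3": [5, 3, 4, 1, 2],
}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_matrix(data):
    """Correlation between every pair of variables, arranged as a nested dict."""
    names = list(data)
    return {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}


R = r_matrix(items)
for name, row in R.items():
    print(name, [round(value, 2) for value in row.values()])
```

Each row and column stands for one variable, the diagonal contains 1s (each variable correlates perfectly with itself) and the table is symmetric.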

Factor analysis attempts to achieve parsimony by explaining the maximum amount of common variance in a correlation matrix using the smallest number of explanatory constructs.
Explanatory constructs are known as latent variables (or factors) and they represent clusters of variables that correlate highly with each other.

PCA differs in that it tries to explain the maximum amount of total variance in a correlation matrix by transforming the original variables into linear components.

Factor analysis and PCA both aim to reduce the R matrix into a smaller set of dimensions.

In factor analysis these dimensions, or factors, are estimated from the data and are believed to reflect constructs that can’t be measured directly.
PCA transforms the data into a set of linear components. It doesn’t estimate unmeasured variables, it just transforms measured ones.

Graphical representation

Factors and components can be visualized as the axes of a graph along which we plot variables.
The coordinates of variables along each axis represent the strength of relationship between that variable and each factor.
In an ideal world a variable will have a large coordinate for one of the axes and small coordinates for any others.

This scenario indicates that this particular variable is related to only one factor.
Variables that have large coordinates on the same axis are assumed to measure different aspects of some common underlying dimension.

Factor loading: the coordinate of a variable along a classification axis.

If we square the factor loading for a variable we get a measure of its substantive importance to a factor.

Mathematical representation

A component in PCA can be described as:

Component_i = b₁Variable_1i

Categorical outcomes: chi-square and loglinear analysis - summary of chapter 19 of Statistics by A. Field

Statistics
Chapter 19
Categorical outcomes: chi-square and loglinear analysis

Analysing categorical data

Sometimes we want to predict categorical outcome variables. We want to predict into which category an entity falls.

Associations between two categorical variables
Associations between several categorical variables: loglinear analysis
Assumptions when analysing categorical data
Interpreting the chi-square test
SPSS
Interpreting loglinear analysis in SPSS
Reporting the results of loglinear analysis

Associations between two categorical variables

With categorical variables we can’t use the mean or any similar statistic because the mean of a categorical variable is meaningless: the numeric values you attach to different categories are arbitrary, and the mean of those numeric values will depend on how many members each category has.

When we’ve measured only categorical variables, we analyse the number of things that fall into each combination of categories (the frequencies).

Pearson’s chi-square test

To see whether there’s a relationship between two categorical variables we can use the Pearson’s chi-square test.
This statistic is based on the simple idea of comparing the frequencies you observe in certain categories to the frequencies you might expect to get in those categories by chance.

χ² = Σ (observed_ij − model_ij)² / model_ij

i represents the rows in the contingency table
j represents the columns in the contingency table.

As model we use ‘expected frequencies’.

To adjust for unequal row and column totals, we calculate expected frequencies for each cell in the table using the column and row totals for that cell.
By doing so we factor in the total number of observations that could have contributed to that cell.

Model_ij = E_ij = (row total_i x column total_j) / n

χ² has a distribution with known properties called the chi-square distribution. This has a shape determined by the degrees of freedom: (r − 1)(c − 1)

r = the number of rows

c = the number of columns
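
The calculation above can be sketched in a few lines. The 2 x 2 table of counts and the function names are illustrative assumptions; the code computes the expected frequencies from the row and column totals, then Pearson's chi-square and its degrees of freedom.

```python
# Illustrative 2 x 2 contingency table of observed frequencies (rows x columns).
observed = [
    [28, 10],
    [48, 114],
]


def expected_frequencies(table):
    """E_ij = (row total_i x column total_j) / n for every cell."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


def pearson_chi_square(table):
    """Return the chi-square statistic and its degrees of freedom (r - 1)(c - 1)."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for observed_row, expected_row in zip(table, expected)
        for o, e in zip(observed_row, expected_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


expected = expected_frequencies(observed)
statistic, df = pearson_chi_square(observed)
print(f"chi-square = {statistic:.2f}, df = {df}")
```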

Fisher’s exact test

The chi-square statistic has a sampling distribution that is only approximately a chi-square distribution.
The larger the sample is, the better this approximation becomes. In large samples the approximation is good enough not to worry about the fact that it is an approximation.
In small samples, the approximation is not good enough, making significance tests of the chi-square statistic inaccurate.

Fisher’s exact test: a way to compute the exact probability of the chi-square statistic in small samples.

The likelihood ratio

An alternative to Pearson’s chi-square.
Based on maximum-likelihood theory.

General idea: you collect some data and create a model for which the probability of obtaining the observed set of data is maximized, then you compare this model to the probability of obtaining those data under the null hypothesis.
The resulting statistic is based on comparing observed frequencies with those predicted by the model.

Lχ² = 2 Σ observed_ij ln(observed_ij / model_ij)

ln = the natural logarithm
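
A minimal sketch of the likelihood ratio statistic, assuming the same kind of 2 x 2 table of counts and using the expected frequencies as the model. The counts and function names are illustrative; cells with an observed count of zero are skipped because they contribute nothing to the sum.

```python
import math

# Illustrative 2 x 2 contingency table of observed frequencies (rows x columns).
observed = [
    [28, 10],
    [48, 114],
]


def expected_frequencies(table):
    """Model_ij = (row total_i x column total_j) / n for every cell."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


def likelihood_ratio(table):
    """Two times the sum of observed x ln(observed / model) over all cells."""
    model = expected_frequencies(table)
    total = 0.0
    for observed_row, model_row in zip(table, model):
        for o, m in zip(observed_row, model_row):
            if o > 0:  # a cell with zero observations adds nothing to the sum
                total += o * math.log(o / m)
    return 2 * total


statistic = likelihood_ratio(observed)
print(f"likelihood ratio = {statistic:.2f}")
```

Like Pearson's statistic, the result is compared against a chi-square distribution with (r − 1)(c − 1) degrees of freedom.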

WSRt using SPSS, manual for tests in the third block of the second year of psychology at the uva

Here is a short explanation of how to run tests in SPSS. These are the tests needed for the third block of WSRt in the second year of psychology at the UvA.

Correlation analysis (two continuous variables)
Partial correlation (three continuous variables and you want to know the correlation between two variables, corrected for a third variable)
Multiple regression analysis
Principal component analysis
Reliability analysis
Mediation analysis (using PROCESS)
Moderation analysis (using PROCESS)

Correlation analysis (two continuous variables)

Open the data
Go to analyse, correlate, bivariate
Place the variables of which you want to know the correlation under ‘variables’
Click on ‘paste’ and run the syntax

Partial correlation (three continuous variables and you want to know the correlation between two variables, corrected for a third variable)

Open the data
Go to analyse, correlate, partial
Place the two variables of which you want to know the correlation under ‘variables’
Place the variable for which you want to control under ‘controlling for’
Click on ‘options’
Select ‘zero-order correlations’ (this is the correlation without controlling for one variable)
Click on ‘continue’
Click on ‘paste’ and run the syntax
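
As a cross-check on the SPSS output, a first-order partial correlation can be computed by hand from the three zero-order correlations. The sketch below is not SPSS syntax; the function name and the correlation values are hypothetical, with x and y the two variables of interest and z the variable that is controlled for.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the influence of z on both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Hypothetical zero-order correlations read from the SPSS output.
r_partial = partial_correlation(r_xy=0.6, r_xz=0.5, r_yz=0.5)
print(f"partial correlation = {r_partial:.3f}")
```

When z is unrelated to both variables (r_xz = r_yz = 0), the partial correlation equals the zero-order correlation.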

Multiple regression analysis

Open the data
Go to analyse, regression, linear
Place the dependent variable under ‘dependent’
Place the independent variables under ‘independent’
If you want to run more models, you can put the first variable under ‘independent’, click on ‘next’ and put the next variable under ‘independent’ (this way you can compare the models)
Click on ‘statistics’ and select:
Model fit
R squared change (if you have multiple models)
Descriptives
Part and partial correlations
Collinearity diagnostics
Click on ‘plots’
Put ZRESID under Y
Put ZPRED under X
(This is for testing homoscedasticity)
Click on ‘save’ and select:
Unstandardised
(for expected values)
Mahalanobis
Cook’s
Leverage values
(for outliers)
Click on paste and run the syntax

Principal component analysis

Open the data
Go to analyse, dimension-reduction, Factor
Put the items which you want to analyse under ‘variables’
Click on ‘descriptives’ and select:
Univariate descriptives
Initial solution
Coefficients
Significance levels
Anti-image (for assumptions)
KMO and Bartlett’s test of sphericity (also for assumptions)
Click on Extraction
Choose Principal component analysis
Select:
Scree plot
Choose to extract components with an eigenvalue greater than 1
Click on rotation and select:
Varimax
Click on options and select:
Suppress

Everything you need for the course WSRt of the second year of Psychology at the Uva

This magazine contains all the summaries you need for the course WSRt at the second year of psychology at the Uva.

WSRt, critical thinking, a list of terms used in the articles of block 2

WSRt, critical thinking - a summary of all articles needed in the second block of second year psychology at the uva

WSRt, critical thinking, a list of terms used in the articles of block 3

WSRt using SPSS, manual for tests in the third block of the second year of psychology at the uva

WSRt, critical thinking - a summary of all articles needed in the third block of second year psychology at the uva

WSRt, critical thinking, a list of terms used in the articles of block 4

WSRt, critical thinking - a summary of all articles needed in the fourth block of second year psychology at the uva

Sharon Klinkenberg explains SPSS on YouTube

Summary of Discovering statistics using IBM SPSS statistics by Field - 5th edition

Critical thinking: A concise guide by Bowell & Kemp (4th edition) - a summary

What is a confidence interval in null hypothesis significance testing?

What is the difference between a p-value and Bayes likelihood?

What are important elements of Bayesian statistics?

What is the Bayes factor?

What are weaknesses of the Bayesian approach?

What is qualitative psychological research?

What criteria should be held by good qualitative research?

Year 2 of psychology at the uva

Categorical outcomes: logistic regression - summary of (part of) chapter 20 of Statistics by A. Field

Discovering statistics using IBM SPSS statistics
Chapter 20
Categorical outcomes: logistic regression

This summary contains the information from chapter 20.8 and forward, the rest of the chapter is not necessary for the course.

What is logistic regression?
Theory of logistic regression
Testing assumptions
Predicting several categories: multinomial logistic regression

What is logistic regression?

Logistic regression is a model for predicting categorical outcomes from categorical and continuous predictors.

A binary logistic regression is when we’re trying to predict membership of only two categories.
Multinomial logistic regression is when we want to predict membership of more than two categories.

Theory of logistic regression

The linear model can be expressed as: Y_i = b₀ + b₁X_i + error_i

b₀ is the value of the outcome when the predictors are zero (the intercept).
The bs quantify the relationship between each predictor and outcome.
X is the value of each predictor variable.

One of the assumptions of the linear model is that the relationship between the predictors and outcome is linear.
When the outcome variable is categorical, this assumption is violated.
One way to solve this problem is to transform the data using the logarithmic transformation, where you can express a non-linear relationship in a linear way.

In logistic regression, we predict the probability of Y occurring, P(Y), from known (log-transformed) values of X₁ (or Xs).
The logistic regression model with one predictor is:
P(Y) = 1 / (1 + e^(−(b₀ + b₁X_1i)))
The value of the model will lie between 0 and 1.
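
The model with one predictor can be sketched as a small function. The coefficient values below are hypothetical and chosen only to show that the predicted probability always stays between 0 and 1.

```python
import math


def logistic_probability(x, b0, b1):
    """P(Y) = 1 / (1 + e^-(b0 + b1 * x)) for a single predictor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# Hypothetical intercept and slope, not estimated from any real data.
b0, b1 = -1.5, 0.8
probabilities = [logistic_probability(x, b0, b1) for x in range(-5, 6)]
for x, p in zip(range(-5, 6), probabilities):
    print(f"X = {x:2d}  P(Y) = {p:.3f}")
```

With a positive b₁, the probability rises towards 1 as X increases and falls towards 0 as X decreases, without ever reaching either limit.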

Testing assumptions

You need to test for

Linearity of the logit
You need to check that each continuous variable is linearly related to the log of the outcome variable. This is tested with the interaction between each predictor and the log of itself.
If this interaction is significant, it indicates that the main effect has violated the assumption of linearity of the logit.
Multicollinearity
This has a biasing effect

Predicting several categories: multinomial logistic regression

Multinomial logistic regression predicts membership of more than two categories.
The model breaks the outcome variable into a series of comparisons between two categories.
In practice, you have to set a baseline outcome category.

Concerned countries and regions

The Netherlands

Side road:

The Netherlands

Side road:

Samenvattingen voor psychologie en gedrag

Tip: type

Advice & Instructions

Tip: date of posting

16-01-2019
