Statistics
Chapter 6
The beast of bias
Bias: the summary information is at odds with the objective truth.
An unbiased estimator: an estimator whose expected value is the same as the thing it is trying to estimate.
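To make the definition concrete, here is a minimal Python sketch (my own illustration, not part of the chapter): averaged over many samples, the variance estimator that divides by n − 1 recovers the true population variance, while the one that divides by n systematically underestimates it.

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw many small samples from a population whose true variance is 1.
biased, unbiased = [], []
for _ in range(50_000):
    sample = rng.normal(loc=0, scale=1, size=5)
    biased.append(sample.var(ddof=0))    # divide by n:   expected value < 1 (biased)
    unbiased.append(sample.var(ddof=1))  # divide by n-1: expected value = 1 (unbiased)

print(f"mean of biased estimator:   {np.mean(biased):.3f}")   # ~0.80
print(f"mean of unbiased estimator: {np.mean(unbiased):.3f}") # ~1.00
```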
We predict an outcome variable from a model described by one or more predictor variables and parameters that tell us about the relationship between each predictor and the outcome variable.
The model will not predict the outcome perfectly, so for each observation there is some amount of error.
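One common way to write this idea down (the coefficient names and subscripts here are illustrative, not taken from the notes) is:

```latex
\text{outcome}_i = (\text{model}) + \text{error}_i,
\qquad \text{e.g.} \qquad
Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \varepsilon_i
```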
Bias can enter the statistical process in three ways:
- things that bias the parameter estimates (including effect sizes)
- things that bias standard errors and confidence intervals
- things that bias test statistics and p-values
An outlier: a score very different from the rest of the data.
Outliers have a dramatic effect on the sum of squared errors.
If the sum of squared errors is biased, the associated standard error, confidence interval and test statistic will be too.
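A quick Python sketch of this effect (the numbers are my own illustration, not from the chapter): adding a single extreme score pulls the mean and massively inflates the sum of squared errors around it.

```python
import numpy as np

def sse(x):
    """Sum of squared errors around the mean (the simplest model of the data)."""
    return float(np.sum((x - x.mean()) ** 2))

scores = np.array([5.0, 6, 5, 7, 6, 5, 6])
with_outlier = np.append(scores, 25)  # add one extreme score

print(f"no outlier:   mean = {scores.mean():.2f}, SSE = {sse(scores):.1f}")   # ~5.71, ~3.4
print(f"with outlier: mean = {with_outlier.mean():.2f}, SSE = {sse(with_outlier):.1f}")  # ~8.12, ~328.9
```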
The second source of bias is ‘violation of assumptions’.
An assumption: a condition that ensures that what you’re attempting to do works.
If any of the assumptions are not true then the test statistic and p-value will be inaccurate and could lead us to the wrong conclusion.
The main assumptions that we’ll look at are:
- additivity and linearity
- normality of something or other
- homoscedasticity/homogeneity of variance
- independence
Additivity and linearity
The assumption of additivity and linearity: the relationship between the outcome variable and the predictors is accurately described by the linear model equation.
The scores on the outcome variable are, in reality, linearly related to any predictors. If you have several predictors then their combined effect is best described by adding their effects together.
If this assumption is not true then, even if all the other assumptions are met, your model is invalid because your description of the process you are trying to model is wrong.
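As an illustration of what a violated linearity assumption looks like, here is a minimal Python sketch (my own example, not from the chapter): fitting a straight line to a truly curvilinear relationship leaves systematic, non-random structure in the errors.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 0.5 * x**2 + rng.normal(0, 2, size=x.size)  # the true relationship is curvilinear

# Fit a straight line anyway, violating the linearity assumption.
b1, b0 = np.polyfit(x, y, deg=1)
residuals = y - (b0 + b1 * x)

# The leftover error is systematic rather than random: positive at the
# extremes of x and negative in the middle, because the model's
# description of the process is wrong.
for i, chunk in enumerate(np.array_split(residuals, 3), start=1):
    print(f"third {i} of x range: mean residual = {chunk.mean():+.2f}")
```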