## Discovering statistics using IBM SPSS statistics by Andy Field, fifth edition – Summary chapter 6

Bias can be detrimental to (1) the parameter estimates, (2) the standard errors and confidence intervals, and (3) the test statistics and p-values. Outliers and violations of assumptions are sources of bias. An outlier is a score that is very different from the rest of the data. Outliers bias parameter estimates and affect the error associated with those estimates. They have a strong effect on the sum of squared errors, which in turn biases the standard deviation.

The linear model has several assumptions:

- Additivity and linearity: The scores on the outcome variable are linearly related to any predictors. If there are multiple predictors, their combined effect is best described by adding their effects together.
- Normality: Violations of normality can affect the parameter estimates, and the residuals of the model should be normally distributed. What matters is normality within each level of the predictor variable. Normality is also important for confidence intervals and for null hypothesis significance testing.
- Homoscedasticity / homogeneity of variance: The variance of the outcome variable should not change between levels of the predictor variable. This assumption affects the parameter estimates and null hypothesis significance testing, and violating it biases the standard error.
- Independence: The errors in the model are not related to each other, so the data have to be independent.

The assumption of normality is mainly relevant in small samples.

Outliers can be spotted using graphs (e.g. histograms or boxplots). Z-scores can also be used to find outliers. The P-P plot can be used to check whether a distribution is normal. It plots the expected z-score of each score against its actual z-score. If the expected z-scores match the actual z-scores (the points fall along the diagonal), the data are approximately normally distributed. The Q-Q plot is like the P-P plot but it plots the quantiles of the data...
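The z-score checks described above can be sketched in a few lines of Python. This is a minimal illustration rather than SPSS's procedure. The function names, the sample data, the cutoff of 3 for flagging outliers and the plotting position `(rank - 0.5) / n` used for the expected z-scores are all assumptions made for this example.

```python
from statistics import NormalDist, mean, stdev


def z_scores(values):
    """Convert raw scores to z-scores using the sample mean and standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def flag_outliers(values, threshold=3.0):
    """Return the scores whose absolute z-score exceeds the threshold.

    The threshold of 3 is an illustrative assumption, not a fixed rule.
    """
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


def expected_vs_actual_z(values):
    """Pair the z-score expected under normality with each sorted actual z-score.

    Expected z-scores use the plotting position (rank - 0.5) / n, which is
    one common convention (assumed here); other conventions exist.
    """
    n = len(values)
    actual = sorted(z_scores(values))
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    expected = [standard_normal.inv_cdf((rank - 0.5) / n) for rank in range(1, n + 1)]
    return list(zip(expected, actual))


# Hypothetical sample: 19 scores close to 10 and one extreme score.
scores = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10,
          11, 9, 10, 12, 8, 10, 11, 9, 10, 100]

outliers = flag_outliers(scores)
pairs = expected_vs_actual_z(scores)
print(outliers)
```

Plotting the pairs returned by `expected_vs_actual_z` as points gives the kind of picture the P-P plot description refers to: points near the diagonal suggest approximate normality, while an extreme score pulls the last point far from it. Note that in very small samples no z-score can exceed a cutoff of 3, because the largest possible z-score is limited by the sample size.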
