Learning Statistics with R - Navarro - 2014 - Article

Useful tools for data analysis go beyond what is covered in undergraduate classes. Even within the standard topics, several essential extensions are often left out:

  • Other types of correlations. There are correlations apart from Spearman and Pearson. Both of those methods measure the correlation between two continuous variables. Other types of correlations are used when one or both variables are on a nominal scale.

  • Effect sizes. There are more ways to think about effect size than just the most popular one.

  • Dealing with violated assumptions: bootstrapping, Bayesian probability and cross-validation are tools to analyse data when assumptions are violated.

  • Interaction terms for regression. Interaction terms can be included in a regression model, just as in ANOVA, to enhance data analysis.

  • Method of planned comparisons. A post-hoc correction (like Tukey's HSD) is not always appropriate. When a clear and limited set of comparisons was specified ahead of time, planned comparisons can be used instead.

  • Multiple comparison methods. It is not necessary to stick to only one comparison method. Several other methods for correcting for multiple comparisons exist.
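
To illustrate the last item, the sketch below compares two well-known corrections for multiple comparisons, Bonferroni and Holm. It is only a minimal example in Python using the standard library; the function names and the p-values are made up for illustration and do not come from the book.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm's step-down adjustment; returns p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03]  # made-up p-values for three comparisons
print(bonferroni(raw))
print(holm(raw))
```

The Holm-adjusted p-values are never larger than the Bonferroni-adjusted ones, which is one reason to look beyond a single correction method.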

What are important non-traditional statistical methods?

Many statistical models exist beyond the traditional ones. Some important ones are described here.

  • Analysis of covariance. ANOVA can be viewed as a kind of regression model. The analysis of covariance (ANCOVA) is a method where some of the predictors are continuous and others are categorical.

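  • Nonlinear regression. The relationship between predictors and outcome does not always have to be linear. For example, the relationship may be assumed to be monotonic (isotonic regression), smooth but not necessarily monotonic (Lowess regression), or of a known nonlinear form (polynomial regression).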

  • Logistic regression. When the outcome variable is binary but the predictors are continuous, logistic regression is used.

  • The generalised linear model (GLM) is a family of models that includes regression. It allows for the idea that data might not be normally distributed. It also allows for non-linear relationships between predictors and outcomes.

  • Survival analysis is used when data from a study is systematically missing on one side (censored data). For example, the longest durations are not observed because of (time) restrictions on the study. It is often used in the medical field.

  • Repeated measures ANOVA is used when participants are measured in multiple conditions. Repeated measures violate the assumption of independence: observations from the same participant are more closely related to each other than to observations from other participants. Part of the variation in the data can be attributed to individual differences.

  • Mixed models are used when repeated measures ANOVA is insufficient. This can happen when people’s changes over time are measured. Mixed models are designed to learn about individual units as well as overall effects.

  • Reliability analysis is used to check the correlation between questions within a questionnaire. A reliability measure (e.g. Cronbach’s α) checks the assumption that questions covering the same topic are correlated with each other.

  • Factor analysis is useful when more than a single construct is measured. For example, an IQ test measures several things at once. Factor analysis helps to see what these things are. It attempts to express the pattern of correlations between variables using a smaller number of variables.

  • Multidimensional scaling (MDS) is used when variables cannot be divided into predictors and outcomes. It is an example of an unsupervised learning model. It is used when analysing similarities between items, objects or people. In MDS, the goal is to find a geometric representation of the data. Each item is plotted as a point in a space (for example, a two-dimensional one), and the distance between two points represents how dissimilar the items are.

  • Clustering is another example of an unsupervised learning model. It is also referred to as classification, and the idea is to figure out what groups exist in the data. There are different types of clustering: unsupervised clustering (such as k-means clustering), semi-supervised clustering and supervised clustering.

  • Causal models are useful tools to learn about causal relationships between variables. A correlation between variables does not by itself show causation, but when there are three or more events it is possible to say something about the causal relationships between them. For example, did event A happen prior to event B or C? Causal models such as structural equation modelling (SEM) can be used to clarify causal relationships.
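
As an example of one of the models above, the reliability analysis item mentions Cronbach’s α. A minimal sketch of how it could be computed is given below, using the standard formula α = k/(k−1) × (1 − sum of item variances / variance of total scores). The function name and the answers are made up for illustration; this is not code from the book.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: one row per respondent, one column per questionnaire item."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up answers of four respondents to two items.
answers = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(cronbach_alpha(answers))
```

When all items give exactly the same answers for every respondent, α equals 1. The less the items agree, the lower α becomes.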

What other ways of inferential statistics can be used?

Besides the traditional approach of null hypothesis significance testing with p-values, other methods of inference are used for data analysis.

  • Bayesian methods. In the Bayesian interpretation, probability is a degree of belief. This makes it possible to assign a probability to one-off events, rather than restricting probability to events that can be replicated. It leads to different tools for data analysis.

  • Bootstrapping is useful when not all underlying assumptions for your data are met. This often happens for small sample sizes. It is a simple method where the results of the study are simulated many times under the assumptions that (a) the null hypothesis is true and (b) the unknown population distribution looks like the raw data.

  • Cross-validation is a method to check how well a model generalises beyond the data sample. Divide the data into two subsets X1 and X2. Use subset X1 to train the model and see how well it performs on subset X2. This gives an indication of how well the model generalises from one dataset to another, and so of how well the model is performing.

  • Robust statistics is used when data is messier than it is supposed to be: variables are not normally distributed, and relationships are non-linear. Some statistical inferences are robust and still work on data where the underlying assumptions are not met. Robust statistics is about how to make safe inferences based on data when faced with contamination.
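
The bootstrapping idea from the list above can be sketched as follows. This is only a minimal illustration in Python: the function name, the default number of resamples and the example data are assumptions, not something taken from the book. The data is first shifted so that the null hypothesis (the population mean equals `null_mean`) is true, and new samples are then drawn with replacement from the shifted data, as if it were the population.

```python
import random


def bootstrap_mean_test(sample, null_mean, n_resamples=2000, seed=0):
    """Two-sided bootstrap test of the null hypothesis: mean == null_mean."""
    rng = random.Random(seed)  # fixed seed so the result can be reproduced
    n = len(sample)
    observed = sum(sample) / n
    observed_diff = abs(observed - null_mean)
    # (a) Make the null hypothesis true by shifting the data to the null mean.
    shifted = [x - observed + null_mean for x in sample]
    extreme = 0
    for _ in range(n_resamples):
        # (b) Treat the shifted data as the population: resample with replacement.
        resample = rng.choices(shifted, k=n)
        if abs(sum(resample) / n - null_mean) >= observed_diff:
            extreme += 1
    # Proportion of simulated studies at least as extreme as the real one.
    return extreme / n_resamples


data = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9]  # made-up measurements
print(bootstrap_mean_test(data, null_mean=10.05))
```

The returned proportion plays the role of a p-value. More resamples give a more stable estimate at the cost of more computing time.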

What are miscellaneous topics in statistics?

  • Missing data can be handled by making a plausible guess about what the missing values should be (imputation).

  • Power analysis is used to check how likely a study is to find an effect if that effect exists.

  • Data analysis using theory-inspired models is the use of psychological theory to build better statistical analyses.
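
To make the power analysis item above concrete, the sketch below computes the power of a simple test. It assumes a two-sided one-sample z-test with a known standard deviation, which is a simplifying assumption chosen for illustration; the function name and the example numbers are also made up. The effect size is the true difference from the null value, expressed in standard deviations.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known standard deviation)."""
    std_normal = NormalDist()
    # Critical value: the test rejects when |z| is larger than this.
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is centred on this value.
    shift = effect_size * sqrt(n)
    # Probability of landing in the upper or the lower rejection region.
    return std_normal.cdf(-critical + shift) + std_normal.cdf(-critical - shift)


# Made-up example: a medium effect size of 0.5 at several sample sizes.
for n in (10, 20, 32, 50):
    print(n, round(z_test_power(0.5, n), 3))
```

Power increases with the sample size and with the effect size. When the effect size is zero, the result equals alpha, the rate of false positives.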

Why should all the basics in statistics be learned?

The pragmatism argument states that all the basics should be learned because they are widely used. The incremental knowledge argument is that understanding the basics helps in understanding more advanced statistics. The extensibility argument is that the basics can be extended to new situations. This extensibility is the biggest payoff.
