## Interaction terms in regression analysis

Today you will use SPSS to run several regressions with interaction terms, using our class survey data to test some hypotheses about support for the welfare state.

Recall the definition of an interaction term:

The effect of x1 on y is moderated by a third variable, x2.

Another way of saying this is that the presence/absence of x2 influences the relationship between x1 and y.

The first step is always to think theoretically: what interactions might you expect based on the state of your theoretical knowledge? A potential interaction for the purposes of assignment 3 (opdracht 3) would be that partisanship and advantage from the welfare state interact in determining support for the welfare state.

Or:

Support for the welfare state is higher among residents of small towns than among residents of cities.  Political preference interacts with the size of one’s home town in determining support for the welfare state.

Or:

One hypothesis concerning the welfare state is that gender interacts with political ideology.  In other words, females are more likely to support the welfare state than men, and this relationship becomes stronger/weaker at different locations on a left-right scale of individual political preference.  This is another way of saying that females support the welfare state more than males do, and that this relationship becomes stronger as a person moves left on the left-right scale.

In the previous example, without the interaction term, the OLS model would be:

WS support = A + political ideology + gender

When we introduce an interaction term, it is helpful to refer to the original IVs as the “main effects” variables in order to distinguish them from the interaction term.  In the above example, ‘political ideology’ and ‘gender’ are the main effects variables.

The new model that includes the interaction term is:

WS support = A + political ideology + gender + polideology*gender

Before we run any regressions, let’s expand the model, since we know that all relevant variables should be included in the model!  We will include ‘age’ as a way of measuring individual advantage from the welfare state (welfare state ‘benefit’ increases with age).

We now arrive at the following model:

WS support = A + political ideology + gender + age + polideology*gender
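Written out with explicit coefficients, the same model shows why the interaction term makes the effect of ideology depend on gender. The symbols a, b1 to b4 and e are labels introduced here for illustration; they do not appear in the handout's own notation.

```latex
\begin{align*}
\text{WS support} &= a + b_1\,\text{ideology} + b_2\,\text{gender} + b_3\,\text{age}
                     + b_4\,(\text{ideology}\times\text{gender}) + e \\
                  &= a + (b_1 + b_4\,\text{gender})\,\text{ideology}
                     + b_2\,\text{gender} + b_3\,\text{age} + e
\end{align*}
```

With gender coded 0/1, the slope of ideology is b1 for the group coded 0 and b1 + b4 for the group coded 1.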

1.  Check your data and make sure that you have recoded variables as necessary.

- create a dummy variable for gender (recode into 1-0)

- for the other variables (ideology and age), make sure that they are coded correctly and that you have dealt with ‘system missings.’
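The recoding in step 1 is done in SPSS, but the logic can be sketched in a few lines of Python. This is only an illustration: the raw codes (1 = male, 2 = female, 9 = no answer) and the missing-value code 999 are assumptions, so check the codebook of the class survey before copying them.

```python
# Assumed raw codes (hypothetical, check your codebook):
# gender: 1 = male, 2 = female, anything else = no valid answer
# other variables: 999 = user-defined missing value
MISSING_CODES = {999}


def to_male_dummy(raw_gender):
    """Recode gender into a 1-0 dummy; invalid answers become None (system-missing)."""
    if raw_gender == 1:
        return 1
    if raw_gender == 2:
        return 0
    return None


def clean(value, missing_codes=MISSING_CODES):
    """Turn user-defined missing codes into None so they drop out of the analysis."""
    if value is None or value in missing_codes:
        return None
    return value


# Made-up example values, not taken from the class survey
raw_gender = [1, 2, 2, 9, 1]
raw_age = [23, 999, 47, 51, None]

maledum = [to_male_dummy(g) for g in raw_gender]
age = [clean(a) for a in raw_age]

print(maledum)
print(age)
```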

2.   First do a preliminary analysis of the interactive relationship between gender and political preference.  Examine the mean of the dependent variable for each value of political preference, controlling for gender.

analyze--->compare means--->means

(follow the same sequence of steps as in Pollock, page 186)

Now inspect your results and ask yourself whether the relationship between political ideology and support for the WS depends on gender.

If your hunch has merit (you DO see that gender moderates the relationship between political ideology and support for the WS), then carry out a regression analysis that includes this interaction term.

3. Create an interaction term, which is essentially a new variable. See Pollock page 187 for detailed instructions.

4.  Now run the regression with FOUR independent variables: the two ‘main effects’ variables (gender and political ideology), age, and the interaction term (gender*polideology).

WS support = A + political ideology + gender + age + gender*polideology

Now interpret your results, keeping in mind that:

1. interpretation of interaction terms is tricky!

2. the best way to interpret the results for the main effects variables and the interaction term is to choose several examples, such as a male who scores 1 on political ideology, and then compare the results with a female who scores 1 on political ideology. Then do the same for males and females scoring 8 or 9 on political ideology.

Now spend a few minutes thinking about potential interaction terms based on your knowledge of the literature on the welfare state.  What other interaction effects might you expect to find in our class survey data?  What kinds of interaction effects are consistent with Pierson and which are consistent with alternative theories? Choose one interaction effect, construct an interaction variable, and then run a regression that incorporates it.

## Handout: interactions

Here are some hints for using interaction terms in your regression analysis. The examples are drawn from our class survey dataset.

We would like to test the following model (based on our reading of the literature):

steun = A + gebruikwelvaart + polideology + maledum + leeftijd

‘gebruikwelvaart’ is used to test one of Pierson’s core claims (that one’s own benefit from the welfare state is one of the best predictors of support for the welfare state)

‘polideology’ is included because the main theoretical approach that Pierson challenges is the ‘power resources’ school that claims that partisanship (individual political preference) is the best predictor of attitudes concerning the welfare state, including support for the WS.

‘maledum’ and ‘leeftijd’ are included as control variables; existing research confirms that these two factors shape support for the WS, so we include them, even though they are not our main ‘research variable.’  If we do not include all known relevant variables in our model, our estimates (coefficients) might not be accurate.

Note the following:  we can describe the above model as one that tests the influence of ‘individual WS advantage’ on WS support, controlling for political partisanship, gender and age.

We can also describe it as:

-a model that tests the influence of ‘partisanship’ on WS support, controlling for individual WS advantage, gender and age.

-a model that tests the influence of ‘gender’ on WS support, controlling for individual WS advantage, political partisanship and age.

-a model that tests the influence of ‘age’ on WS support, controlling for individual WS advantage, political partisanship and gender.

KEEP YOUR EYE ON YOUR RESEARCH GOALS: we are interested in the debate between the ‘old’ and ‘new’ politics theorists, so the main variables of interest are ‘individual WS advantage’ and ‘partisanship.’ But that does not mean we can omit other relevant variables from the model!

[note: it is a good idea to develop ONE overall model to start with. This model should include ALL RELEVANT variables, and should have a dependent variable measured at the INTERVAL level.  Ordinal DVs are usually fine as long as they have at least 5 categories and are ‘interval-like’, meaning that the distance between the categories is comparable (for example, extremely disagree (5) through extremely agree (1), but not educational levels).  Once you have formulated the overall model, you can adjust it by using different operationalizations of key variables, adding an interaction term, or adding/dropping variables if there is a theoretical reason to do so.]

The first thing we do is inspect our dataset and choose appropriate variables.

For the variable ‘political ideology’ we see that there are some strange things going on: we know the scale is 1-10 but there are several ‘4.5s’, ‘5.5s’ and ‘7.5s.’  These values can cause problems with our results, so they need to be recoded, either as SYSMIS or as the closest value on the scale. We choose the latter, and decide to round upwards so that 4.5 becomes 5, and so on. We inspect the frequencies before and after the recode to make sure that everything went ok.

Statistics: Political ideology self-placement (1 = left; 10 = right)

| N | Count |
| --- | --- |
| Valid | 421 |
| Missing | 20 |

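Political ideology self-placement (1 = left; 10 = right)

| | Value | Frequency | Percent | Valid Percent | Cumulative Percent |
| --- | --- | --- | --- | --- | --- |
| Valid | 1 | 4 | .9 | 1.0 | 1.0 |
| | 2 | 32 | 7.3 | 7.6 | 8.6 |
| | 3 | 79 | 17.9 | 18.8 | 27.3 |
| | 4 | 89 | 20.2 | 21.1 | 48.5 |
| | 4.5 | 2 | .5 | .5 | 48.9 |
| | 5 | 70 | 15.9 | 16.6 | 65.6 |
| | 5.5 | 4 | .9 | 1.0 | 66.5 |
| | 6 | 54 | 12.2 | 12.8 | 79.3 |
| | 7 | 43 | 9.8 | 10.2 | 89.5 |
| | 7.5 | 1 | .2 | .2 | 89.8 |
| | 8 | 35 | 7.9 | 8.3 | 98.1 |
| | 9 | 8 | 1.8 | 1.9 | 100.0 |
| | Total | 421 | 95.5 | 100.0 | |
| Missing | no answer | 13 | 2.9 | | |
| | System | 7 | 1.6 | | |
| | Total | 20 | 4.5 | | |
| Total | | 441 | 100.0 | | |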

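newpoltideology

| | Value | Frequency | Percent | Valid Percent | Cumulative Percent |
| --- | --- | --- | --- | --- | --- |
| Valid | 1 | 4 | .9 | 1.0 | 1.0 |
| | 2 | 32 | 7.3 | 7.6 | 8.6 |
| | 3 | 79 | 17.9 | 18.8 | 27.3 |
| | 4 | 89 | 20.2 | 21.1 | 48.5 |
| | 5 | 72 | 16.3 | 17.1 | 65.6 |
| | 6 | 58 | 13.2 | 13.8 | 79.3 |
| | 7 | 43 | 9.8 | 10.2 | 89.5 |
| | 8 | 36 | 8.2 | 8.6 | 98.1 |
| | 9 | 8 | 1.8 | 1.9 | 100.0 |
| | Total | 421 | 95.5 | 100.0 | |
| Missing | System | 20 | 4.5 | | |
| Total | | 441 | 100.0 | | |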
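The rounding rule used for the recode (half-points always go up) can be checked with a short Python sketch. The function name `round_half_up` is made up for this illustration; the counts are copied from the frequency table of the original ideology variable. Note that Python's built-in `round()` rounds halves to the nearest even number, so `round(4.5)` gives 4, which is not what we want here.

```python
import math
from collections import Counter


def round_half_up(score):
    """Round to the nearest whole scale point, with .5 always going up."""
    return math.floor(score + 0.5)


# Frequencies of the original ideology variable (valid cases only)
old_counts = {
    1: 4, 2: 32, 3: 79, 4: 89, 4.5: 2, 5: 70,
    5.5: 4, 6: 54, 7: 43, 7.5: 1, 8: 35, 9: 8,
}

# Apply the recode and add up the cases that end up in each category
new_counts = Counter()
for score, n in old_counts.items():
    new_counts[round_half_up(score)] += n

for score in sorted(new_counts):
    print(score, new_counts[score])
```

The recoded counts for 5, 6 and 8 should match the newpoltideology table (72, 58 and 36), and the number of valid cases should still be 421.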

Now we need the best possible operationalization of ‘individual WS advantage’ and we decide to use variable 6_1-8 to construct an index.

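gebruikwelvaart

| | Value | Frequency | Percent | Valid Percent | Cumulative Percent |
| --- | --- | --- | --- | --- | --- |
| Valid | 0 | 115 | 26.1 | 27.1 | 27.1 |
| | 1 | 275 | 62.4 | 64.7 | 91.8 |
| | 2 | 32 | 7.3 | 7.5 | 99.3 |
| | 3 | 1 | .2 | .2 | 99.5 |
| | 4 | 1 | .2 | .2 | 99.8 |
| | 5 | 1 | .2 | .2 | 100.0 |
| | Total | 425 | 96.4 | 100.0 | |
| Missing | System | 16 | 3.6 | | |
| Total | | 441 | 100.0 | | |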
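The handout does not spell out how the index was built from the survey items, so the following Python sketch rests on assumptions: each item is taken to be a 0/1 indicator (1 = the respondent uses that program), the index is the simple count of programs used, and a respondent with any missing item gets a missing index. The function name and the example answers are invented for illustration.

```python
def usage_index(items):
    """Count the programs a respondent uses.

    items: list of 0/1 indicators, one per program (None = missing).
    Returns None if any item is missing (an assumed, conservative rule).
    """
    if any(value is None for value in items):
        return None
    return sum(items)


# Made-up respondents with six program indicators each
respondents = [
    [0, 0, 0, 0, 0, 0],     # uses no programs
    [1, 0, 0, 0, 0, 0],     # uses one program
    [1, 1, 0, 0, 1, 0],     # uses three programs
    [1, None, 0, 0, 0, 0],  # one item missing
]

gebruikwelvaart = [usage_index(r) for r in respondents]
print(gebruikwelvaart)
```

A different rule for missing items (for example, counting only the answered items) would give a different index, so the choice should be reported with the results.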

Before we run the first regression, we double check that we have inspected all our variables and dealt with ‘don’t knows’ etc.  Make sure these are coded “SYSMIS”.

Now we will run the first regression:

steun = A + gebruikwelvaart + polideology + maledum + leeftijd

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .416 (a) | .173 | .164 | 4.93864 |

a. Predictors: (Constant), maledum, Leeftijd in jaren (age in years), newpoltideology, gebruikwelvaart

ANOVA (b)

| Model | | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 2020.724 | 4 | 505.181 | 20.713 | .000 (a) |
| | Residual | 9682.888 | 397 | 24.390 | | |
| | Total | 11703.612 | 401 | | | |

a. Predictors: (Constant), maledum, Leeftijd in jaren (age in years), newpoltideology, gebruikwelvaart

b. Dependent Variable: SteunWVS

Coefficients (a)

| Model | | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 11.540 | .969 | | 11.913 | .000 |
| | gebruikwelvaart | .258 | .404 | .030 | .638 | .524 |
| | Leeftijd in jaren (age in years) | .101 | .015 | .321 | 6.913 | .000 |
| | newpoltideology | -.729 | .134 | -.249 | -5.424 | .000 |
| | maledum | -1.640 | .498 | -.151 | -3.291 | .001 |

a. Dependent Variable: SteunWVS

Interpretations:

First we recall the scale of the DV, steunwelvaart: 1= very low support, 21= very high support.

To interpret “gebruik”, recall that it has 7 categories, 0-6, that refer to the number of social programs a person is using.  The first thing we notice about gebruik is that its p-value (.524) is far above our alpha of .05, so the coefficient is not statistically significant. We interpret the coefficient anyway to see what it tells us: an increase in welfare state usage of one program, out of a total of 6 programs, is associated with an increase in support for the welfare state of .258 points on the 21 point scale.  The sign (+) of the coefficient is in the predicted direction, but the relationship between welfare state use and support for the WS is weak (about 1/4 of a point increase on the 21 point scale of support for each program that a person uses).  Because the p-value is so high, we cannot reject the null hypothesis that there is no relationship in the population: if there were no relationship at all, we would still find a coefficient at least this large in more than 50% of samples. (Recall that our sample is NOT representative, which means we have to be careful about our significance tests. Strictly speaking, the significance tests only allow us to generalize to the population that was actually sampled, which consists of the social environment of second-year political science students at Radboud University; we can’t make strong claims about the Dutch population or about people beyond the borders. For the moment, we proceed as if our sample is representative enough to make inferences about significance.)

Now we turn to our political ideology variable:  the sign of the coefficient is in the predicted direction. As a person becomes more ‘right’, support for the welfare state should decrease, and this is exactly what happens.  Recall that the polideology scale is 1-10. A one point increase on this scale results in (about) a ¾ point decrease in support for the WS on the 21 point scale.  I would say that this variable has a moderate effect on the dependent variable, but this is a matter of interpretation.  The coefficient is highly significant (p < .001), so we can reject the null hypothesis of no relationship in the population with more than 99% confidence.

Now we turn to the male dummy (male=1; female=0): the sign of the coefficient is in the predicted direction (men support the WS less than women) and the size of the effect is, I would say, moderate.  Maleness decreases support for the welfare state by 1.64 points on the 21 point scale. The coefficient is highly significant.

Now we turn to the leeftijd variable: again the sign of the coefficient is in the predicted direction, and the size of the effect is weak to moderate: an increase in age of 10 years results in a one point increase in WS support on the 21 point scale.  The coefficient is highly significant.

Overall, the results confirm our expectations based on the theory, except for the “gebruik” variable which we put in the model to test Pierson’s ‘new politics’ thesis that those who benefit from the welfare state are more likely to support the welfare state, regardless of political ideology. Moreover, when we look at the standardized coefficients we see that ‘age’ and ‘political ideology’ have the most influence on the dependent variable.
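To make the coefficients concrete, we can plug them into the regression equation and compute predicted support for a few imaginary respondents. The sketch below uses the B values from the Coefficients table; the function name `predict_steun` and the example respondents (age 40, one program, ideology 3) are made up for illustration.

```python
# Unstandardized coefficients (B) from the first regression
B = {
    "const": 11.540,
    "gebruikwelvaart": 0.258,
    "leeftijd": 0.101,
    "newpoltideology": -0.729,
    "maledum": -1.640,
}


def predict_steun(gebruik, leeftijd, ideology, male):
    """Predicted WS support (1-21 scale) for one respondent."""
    return (
        B["const"]
        + B["gebruikwelvaart"] * gebruik
        + B["leeftijd"] * leeftijd
        + B["newpoltideology"] * ideology
        + B["maledum"] * male
    )


# Hypothetical respondents: same age, usage and ideology, different gender
woman = predict_steun(gebruik=1, leeftijd=40, ideology=3, male=0)
man = predict_steun(gebruik=1, leeftijd=40, ideology=3, male=1)
older_woman = predict_steun(gebruik=1, leeftijd=50, ideology=3, male=0)

print(round(woman, 3))                # predicted support for the woman
print(round(man - woman, 3))          # gender gap: equals the maledum coefficient
print(round(older_woman - woman, 3))  # effect of ten extra years of age
```

Because this model has no interaction term, the gender gap (-1.640) is the same at every ideology score, and ten extra years of age always add 1.01 points.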

The adjusted r square of the model is .164, which means that the model explains about 16% of the variance in welfare state support.  However, we are not that concerned about the value of the r square because the goal of our research is to test the viability of Pierson’s ‘new politics’ thesis by using linear regression. So for us, the size and significance of the coefficients are the most important places to look for confirmation/disconfirmation of Pierson’s arguments.
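The R square, adjusted R square and F value reported by SPSS can all be reproduced from the ANOVA table. A minimal Python sketch, using the sums of squares and degrees of freedom from the first regression (the variable names are chosen here for readability):

```python
# Values from the ANOVA table of the first regression
ss_regression = 2020.724
ss_residual = 9682.888
df_regression = 4    # number of independent variables
df_residual = 397

ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1  # number of cases in the analysis

# Share of the variance in the DV explained by the model
r_square = ss_regression / ss_total

# Adjusted R square penalizes the model for each added predictor
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - df_regression - 1)

# F is the ratio of the two mean squares
f_value = (ss_regression / df_regression) / (ss_residual / df_residual)

print(round(r_square, 3))
print(round(adj_r_square, 3))
print(round(f_value, 2))
```

Rounded to three decimals this gives .173 and .164, the values in the Model Summary.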

Now we would like to include an interaction term in our model, because we suspect that the relationship between political ideology and WS support is moderated by gender. More specifically, we suspect that the relationship between political ideology and WS support is stronger among men than among women.  We theorized, and our first regression confirmed, that men are less likely to support the WS than women, which leads us to think that men are more influenced by their political ideology than women are.

We construct the following model:

steun = A + gebruikwelvaart + polideology + maledum + leeftijd + polideology*maledum

Before running the regression, we use “compare means” in SPSS to see if there is some evidence for our hypothesis:

Don’t forget to click ‘next’ after you enter the first part of the interaction (maledum) into the menu. This brings you to the next ‘layer’ of the computation, where you enter the second part of the interaction term (newpoltideology). Then click OK.

SPSS produces the following table:

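Report: SteunWVS

| maledum | newpoltideology | Mean | N | Std. Deviation |
| --- | --- | --- | --- | --- |
| female | 1 | 12.5000 | 2 | 12.02082 |
| | 2 | 12.6429 | 14 | 5.21252 |
| | 3 | 12.7429 | 35 | 5.10676 |
| | 4 | 11.4318 | 44 | 4.78563 |
| | 5 | 12.3438 | 32 | 5.16570 |
| | 6 | 13.5769 | 26 | 4.16819 |
| | 7 | 11.9500 | 20 | 3.85903 |
| | 8 | 7.8000 | 10 | 7.33030 |
| | 9 | 15.0000 | 4 | 9.52190 |
| | Total | 12.1711 | 187 | 5.17673 |
| male | 1 | 17.5000 | 2 | 2.12132 |
| | 2 | 11.7059 | 17 | 4.76661 |
| | 3 | 13.0682 | 44 | 5.02736 |
| | 4 | 10.5814 | 43 | 5.35960 |
| | 5 | 10.5143 | 35 | 5.64838 |
| | 6 | 9.0645 | 31 | 5.15710 |
| | 7 | 9.1739 | 23 | 5.43266 |
| | 8 | 7.5000 | 26 | 4.99800 |
| | 9 | 6.2500 | 4 | 5.67891 |
| | Total | 10.4178 | 225 | 5.48404 |
| Total | 1 | 15.0000 | 4 | 7.61577 |
| | 2 | 12.1290 | 31 | 4.91082 |
| | 3 | 12.9241 | 79 | 5.03264 |
| | 4 | 11.0115 | 87 | 5.06583 |
| | 5 | 11.3881 | 67 | 5.46048 |
| | 6 | 11.1228 | 57 | 5.21012 |
| | 7 | 10.4651 | 43 | 4.91523 |
| | 8 | 7.5833 | 36 | 5.62837 |
| | 9 | 10.6250 | 8 | 8.63444 |
| | Total | 11.2136 | 412 | 5.41135 |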

Now we inspect the table and ask whether there is evidence that the relationship between polideology and WS support is moderated by gender. This means asking: is the relationship stronger/weaker for men than for women? Recall that men are coded 1 and women 0.  You see that the predicted relationship between political ideology and WS support holds very well for men:  as we move higher (more ‘right’) on the left-right scale, the mean for WS support for men drops from 17.5 to 6.25.

Now the question is whether this relationship (between polideology and WS support) is stronger or weaker for women.  We predicted that the relationship should be weaker, because women are more likely to support the welfare state regardless of their political ideology because they are women (caring tasks, work in the care sector, etc.).  What does the table tell us?  We see that as we move higher on the left-right scale, the mean level of support for women is fairly stable: there is no clear downward trend in the level of support for the welfare state as there was for men. Instead, women’s support hovers between about 11.4 and 13.6 for scores 1 through 7. It drops to 7.8 at score 8 and rises to 15.0 at score 9, but these two means are based on only 10 and 4 women respectively, so we should not read too much into them. Thus there is evidence of an interaction effect, and we can move on to the regression with the interaction term added.
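Eyeballing the table can be backed up with a quick calculation. The Python sketch below fits a straight line through the group means for each gender, weighting each mean by its N; this gives the same slope as a simple regression of WS support on ideology within each gender, without any control variables. The means and Ns are copied from the Report table; the function name `weighted_slope` is made up for this illustration.

```python
def weighted_slope(xs, means, ns):
    """Slope of a least-squares line through group means, weighted by group size."""
    total = sum(ns)
    x_bar = sum(n * x for x, n in zip(xs, ns)) / total
    y_bar = sum(n * m for m, n in zip(means, ns)) / total
    s_xy = sum(n * (x - x_bar) * (m - y_bar) for x, m, n in zip(xs, means, ns))
    s_xx = sum(n * (x - x_bar) ** 2 for x, n in zip(xs, ns))
    return s_xy / s_xx


scores = list(range(1, 10))  # ideology scores 1 through 9

female_means = [12.5000, 12.6429, 12.7429, 11.4318, 12.3438,
                13.5769, 11.9500, 7.8000, 15.0000]
female_n = [2, 14, 35, 44, 32, 26, 20, 10, 4]

male_means = [17.5000, 11.7059, 13.0682, 10.5814, 10.5143,
              9.0645, 9.1739, 7.5000, 6.2500]
male_n = [2, 17, 44, 43, 35, 31, 23, 26, 4]

female_slope = weighted_slope(scores, female_means, female_n)
male_slope = weighted_slope(scores, male_means, male_n)

print("slope for women:", round(female_slope, 2))
print("slope for men:  ", round(male_slope, 2))
```

The slope for men comes out several times steeper (more negative) than the slope for women, which is the pattern an interaction between gender and ideology would produce.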

We now compute the interaction term or interaction variable (Allison calls these ‘product terms’).
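The product term is simply the two main effects variables multiplied case by case. A minimal Python sketch of that computation, with made-up example values and a hypothetical helper `product_term`; a case that is missing on either variable is also missing on the product:

```python
def product_term(dummy, score):
    """Multiply the two main effects variables; missing in, missing out."""
    if dummy is None or score is None:
        return None
    return dummy * score


# Made-up example cases (None = system-missing)
maledum = [0, 1, 1, None, 0]
newpoltideology = [3, 3, 8, 5, None]

maledumXpolideology = [
    product_term(d, s) for d, s in zip(maledum, newpoltideology)
]
print(maledumXpolideology)
```

For women (maledum = 0) the product is always 0; for men it equals their ideology score.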

Now we run our regression with the added interaction term. SPSS gives us the following results.

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .432 (a) | .187 | .177 | 4.90224 |

a. Predictors: (Constant), maledumXpolideology, Leeftijd in jaren (age in years), gebruikwelvaart, newpoltideology, maledum

ANOVA (b)

| Model | | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 2186.956 | 5 | 437.391 | 18.200 | .000 (a) |
| | Residual | 9516.656 | 396 | 24.032 | | |
| | Total | 11703.612 | 401 | | | |

a. Predictors: (Constant), maledumXpolideology, Leeftijd in jaren (age in years), gebruikwelvaart, newpoltideology, maledum

b. Dependent Variable: SteunWVS

Coefficients (a)

| Model | | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 9.675 | 1.195 | | 8.097 | .000 |
| | gebruikwelvaart | .202 | .401 | .023 | .503 | .615 |
| | Leeftijd in jaren (age in years) | .100 | .014 | .318 | 6.891 | .000 |
| | newpoltideology | -.314 | .206 | -.108 | -1.523 | .129 |
| | maledum | 1.755 | 1.382 | .162 | 1.270 | .205 |
| | maledumXpolideology | -.709 | .269 | -.371 | -2.630 | .009 |

a. Dependent Variable: SteunWVS

The first thing we notice is that the statistical significance of our interaction term is high, p<.01.  Following Allison’s advice, we decide that the interaction term does indeed belong in the model, so we proceed with our interpretations. Now the tricky part begins.

One strategy is to cautiously interpret the coefficients themselves, although this is not always as informative as it is for regression models that do NOT contain interaction terms.  Of course we can interpret ‘leeftijd’ and ‘gebruikwelvaart’ in the normal way, since they are not part of an interaction term.  But we have to be very careful about interpreting the coefficients for the ‘main effects’ variables (maledum and newpoltideology).

The first thing we see is that the sign of the interaction term is negative, the sign of maledum is positive and the sign of polideol is negative.

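The second thing we can do is to interpret the coefficients of the main effects variables:

For newpoltideology, the coefficient is -.314.  This number is the effect of ideology when maledum = 0, i.e. for women; it is not the overall effect of ideology. In the same way, the coefficient for maledum (1.755) is the difference between men and women when newpoltideology = 0, a value that lies outside the scale, so it is not very helpful to interpret this ‘main effects’ coefficient on its own.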

Another approach is to estimate the effect of political ideology for different levels of the gender variable, i.e. for men (1) and for women (0).

To calculate the effect of newpolideology on WS support at a given value of maledum, use the following formula:

b(polideology) at a specific value of maledum = b(polideology) + b(maledumXpolideology) * value of maledum

SO: the effect of polideology when maledum is 1 (male) =  -.314 + (-.709)*1

Thus, the effect of political ideology for men is -.314 - .709 = -1.023

(among men, each one point increase on the political ideology scale results in a 1.023 point decrease on the 21 point WS support scale)

The effect of polideology for women is = -.314 + 0 = -.314 (i.e. the newpolideology main effects coefficient)
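The same calculation, and the comparison of example men and women at low and high ideology scores, can be sketched in Python. The coefficients come from the second Coefficients table; the function names and the example respondents (age 40, one program) are made up for illustration.

```python
# Unstandardized coefficients (B) from the regression with the interaction term
B = {
    "const": 9.675,
    "gebruikwelvaart": 0.202,
    "leeftijd": 0.100,
    "newpoltideology": -0.314,
    "maledum": 1.755,
    "maledumXpolideology": -0.709,
}


def ideology_effect(male):
    """Effect of a one point move to the right, for men (1) or women (0)."""
    return B["newpoltideology"] + B["maledumXpolideology"] * male


def predict_steun(gebruik, leeftijd, ideology, male):
    """Predicted WS support (1-21 scale) including the interaction term."""
    return (
        B["const"]
        + B["gebruikwelvaart"] * gebruik
        + B["leeftijd"] * leeftijd
        + B["newpoltideology"] * ideology
        + B["maledum"] * male
        + B["maledumXpolideology"] * male * ideology
    )


print("effect for women:", round(ideology_effect(0), 3))
print("effect for men:  ", round(ideology_effect(1), 3))

# Compare a hypothetical man and woman (age 40, one program) at two ideology scores
gaps = {}
for ideology in (1, 8):
    woman = predict_steun(1, 40, ideology, male=0)
    man = predict_steun(1, 40, ideology, male=1)
    gaps[ideology] = man - woman
    print(ideology, round(woman, 3), round(man, 3), round(gaps[ideology], 3))
```

At ideology score 1 the predicted support of the man is about 1.05 points higher than that of the woman; at score 8 it is about 3.92 points lower. The gender gap is no longer a single number but depends on ideology, which is exactly what the interaction term expresses.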

This confirms our expectations: our hypothesis was that political ideology matters for men’s support of the welfare state but not very much for women’s. We conclude that political ideology matters more for men; being male strengthens the influence of ideology on WS support. We can also state this the other way around: being female weakens the negative influence of political ideology.

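***We notice that the variables maledum and newpoltideology are no longer statistically significant at the p < .05 level. This does not mean that they have stopped mattering: in a model with an interaction term, each main effects coefficient describes the effect only for cases that score 0 on the other variable. If we run the regression using femdum (female=1, male=0), the coefficient for newpoltideology becomes the effect of ideology among men (-1.023) and is statistically significant, as is the interaction term femdumXpolid (SPSS output is not shown here). The coefficient for femdum is simply the maledum coefficient with its sign reversed, so its significance level does not change.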
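Switching from maledum to femdum does not change the model, only the way the same predictions are split over the coefficients. The Python sketch below works this out algebraically from the reported coefficients; it is not SPSS output, and age and gebruikwelvaart are left out because the recoding does not affect them. The dictionary keys and function names are chosen here for illustration.

```python
# Reported coefficients with men coded 1 (maledum)
male_coded = {
    "const": 9.675,
    "ideology": -0.314,          # effect of ideology for women (maledum = 0)
    "maledum": 1.755,
    "maledumXideology": -0.709,
}

# The same model rewritten with women coded 1 (femdum = 1 - maledum)
fem_coded = {
    "const": male_coded["const"] + male_coded["maledum"],
    "ideology": male_coded["ideology"] + male_coded["maledumXideology"],
    "femdum": -male_coded["maledum"],
    "femdumXideology": -male_coded["maledumXideology"],
}


def predict_male_coded(ideology, male):
    c = male_coded
    return (c["const"] + c["ideology"] * ideology
            + c["maledum"] * male + c["maledumXideology"] * male * ideology)


def predict_fem_coded(ideology, female):
    c = fem_coded
    return (c["const"] + c["ideology"] * ideology
            + c["femdum"] * female + c["femdumXideology"] * female * ideology)


# Both codings give identical predictions for every man and woman
for ideology in range(1, 11):
    for male in (0, 1):
        a = predict_male_coded(ideology, male)
        b = predict_fem_coded(ideology, 1 - male)
        assert abs(a - b) < 1e-9

print({name: round(value, 3) for name, value in fem_coded.items()})
```

With femdum, the ideology coefficient is the effect for men (-1.023) and the interaction term changes sign (+.709), but every predicted value stays the same.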
