Interaction terms in regression analysis
Today you will use SPSS to run several regressions using interaction terms. You will use our class survey data to test some hypotheses about support for the welfare state using interaction terms.
Recall the definition of an interaction term:
The effect of x1 on y is moderated by a third variable, x2.
Another way of saying this is that the presence/absence of x2 influences the relationship between x1 and y.
The first step is always to think theoretically: what interactions might you expect based on the state of your theoretical knowledge? A potential interaction for the purposes of assignment 3 (opdracht 3) would be that partisanship and advantage from the welfare state interact in determining support for the welfare state.
Or:
Support for the welfare state is higher among residents of small towns than among residents of cities. Political preference interacts with the size of one’s home town in determining support for the welfare state.
Or:
One hypothesis concerning the welfare state is that gender interacts with political ideology. In other words, women are more likely to support the welfare state than men, and this relationship becomes stronger or weaker at different locations on a left-right scale of individual political preference. A more specific version of this hypothesis is that women support the welfare state more than men do, and that this difference becomes larger as a person moves left on the left-right scale.
In the previous example, without the interaction term, the OLS model would be:
WS support = A + political ideology + gender
When we introduce an interaction term, it is helpful to refer to the original IVs as the ‘main effects’ variables in order to distinguish them from the interaction term. In the above example, ‘political ideology’ and ‘gender’ are the main effects variables.
The new model that includes the interaction term is:
WS support = A + political ideology + gender + polideology*gender
Before we run any regressions, let’s expand the model, since we know that all relevant variables should be included in the model! We will include ‘age’ as a way of measuring individual advantage from the welfare state (welfare state ‘benefit’ increases with age).
We now arrive at the following model:
WS support = A + political ideology + gender + age + polideology*gender
1. Check your data and make sure that you have recoded variables as necessary.
 create a dummy variable for gender (recode into 1/0)
 for the other variables (ideology and age), make sure that they are coded correctly and that you have dealt with ‘system missings.’
2. First do a preliminary analysis of the interactive relationship between gender and political preference. Examine the mean of the dependent variable for each value of political preference, controlling for gender.
analyze>compare means>means
(follow the same sequence of steps as in Pollock, page 186)
Now inspect your results and ask yourself whether the relationship between political ideology and support for the WS depends on gender.
If your hunch has merit (you DO see that gender moderates the relationship between political ideology and support for the WS), then carry out a regression analysis that includes this interaction term.
3. Create an interaction term, which is essentially a new variable. See Pollock page 187 for detailed instructions.
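As a rough illustration of what steps 1 and 3 produce, here is a minimal Python sketch (not SPSS syntax). The mini-dataset, the field names and the choice of male = 1 are assumptions made for the example; in SPSS you would follow Pollock's instructions.

```python
# Hypothetical mini-dataset; field names and values are illustrative only.
respondents = [
    {"gender": "female", "ideology": 3},
    {"gender": "male", "ideology": 3},
    {"gender": "male", "ideology": 8},
    {"gender": "female", "ideology": None},  # missing ideology score
]

def add_interaction(row):
    """Return a copy of the row with a 1/0 gender dummy and the product term."""
    out = dict(row)
    # Dummy coding (assumed here): 1 = male, 0 = female (reference category).
    out["maledum"] = 1 if row["gender"] == "male" else 0
    # The product term stays missing when the ideology score is missing.
    if row["ideology"] is None:
        out["maledumXpolideology"] = None
    else:
        out["maledumXpolideology"] = out["maledum"] * row["ideology"]
    return out

coded = [add_interaction(r) for r in respondents]
for r in coded:
    print(r["maledum"], r["ideology"], r["maledumXpolideology"])
```

Note that the product term is 0 for every respondent in the reference category, which is why the ‘main effect’ of ideology ends up describing that category only.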
4. Now run the regression with FOUR independent variables: the two ‘main effects’ variables (gender and political ideology), age, and the interaction term (gender*polideol).
Recall that your model is:
WS support = A + political ideology + gender + age + gender*polideology
Now interpret your results, keeping in mind that:
1. interpretation of interaction terms is tricky!
2. the best way to interpret the results for the main effects variables and the interaction term is to choose several examples, such as a male who scores 1 on political ideology, and then compare the results with a female who scores 1 on political ideology. Then do the same for males and females scoring 8 or 9 on political ideology.
Now spend a few minutes thinking about potential interaction terms based on your knowledge of the literature on the welfare state. What other interaction effects might you expect to find in our class survey data? What kinds of interaction effects are consistent with Pierson and which are consistent with alternative theories? Choose one interaction effect, construct an interaction variable, and then run a regression that incorporates it.
HANDOUT INTERACTIONS
Here are some hints for using interaction terms in your regression analysis. The examples are drawn from our class survey dataset.
We would like to test the following model (based on our reading of the literature):
steun = A + gebruikwelvaart + polideology + maledum + leeftijd
‘gebruikwelvaart’ is used to test one of Pierson’s core claims (that one’s own benefit from the welfare state is one of the best predictors of support for the welfare state)
‘polideology’ is included because the main theoretical approach that Pierson challenges is the ‘power resources’ school that claims that partisanship (individual political preference) is the best predictor of attitudes concerning the welfare state, including support for the WS.
‘maledum’ and ‘leeftijd’ are included as control variables; existing research confirms that these two factors shape support for the WS, so we include them, even though they are not our main ‘research variable.’ If we do not include all known relevant variables in our model, our estimates (coefficients) might not be accurate.
Note the following: we can describe the above model as one that tests the influence of ‘individual WS advantage’ on WS support, controlling for political partisanship, gender and age.
We can also describe it as:
a model that tests the influence of ‘partisanship’ on WS support, controlling for individual WS advantage, gender and age.
a model that tests the influence of ‘gender’ on WS support, controlling for individual WS advantage, political partisanship and age.
a model that tests the influence of ‘age’ on WS support, controlling for individual WS advantage, political partisanship and gender.
KEEP YOUR EYE ON YOUR RESEARCH GOALS: we are interested in the debate between the ‘old’ and ‘new’ politics theorists, so the main variables of interest are ‘individual WS advantage’ and ‘partisanship.’ But that does not mean we can omit other relevant variables from the model!
[note: it is a good idea to develop ONE overall model to start with. This model should include ALL RELEVANT variables, and should have a dependent variable measured at the INTERVAL level. Ordinal DVs are usually fine as long as they have at least 5 categories and are ‘interval-like’, meaning that the distance between the categories is comparable (for example extremely disagree (5) through extremely agree (1), but not educational levels). Once you have formulated the overall model, you can adjust it by using different operationalizations of key variables, adding an interaction term, or adding/dropping variables if there is a theoretical reason to do so.]
The first thing we do is inspect our dataset and choose appropriate variables.
For the variable ‘political ideology’ we see that there are some strange things going on: we know the scale is 1-10, but there are several ‘4.5s’, ‘5.5s’ and ‘7.5s.’ These values cause problems with our results, so they need to be recoded, either as SYSMIS or as the closest value on the scale. We choose the latter, and decide to round upwards so that 4.5 becomes 5, and so on. We inspect the frequencies before and after the recode to make sure that everything went ok.
Statistics
Political ideology self-placement (1 = left; 10 = right)
N  Valid    421
   Missing   20

Political ideology self-placement (1 = left; 10 = right)
                        Frequency  Percent  Valid Percent  Cumulative Percent
Valid    1                      4       .9            1.0                 1.0
         2                     32      7.3            7.6                 8.6
         3                     79     17.9           18.8                27.3
         4                     89     20.2           21.1                48.5
         4.5                    2       .5             .5                48.9
         5                     70     15.9           16.6                65.6
         5.5                    4       .9            1.0                66.5
         6                     54     12.2           12.8                79.3
         7                     43      9.8           10.2                89.5
         7.5                    1       .2             .2                89.8
         8                     35      7.9            8.3                98.1
         9                      8      1.8            1.9               100.0
         Total                421     95.5          100.0
Missing  No answer             13      2.9
         System                 7      1.6
         Total                 20      4.5
Total                         441    100.0


newpoltideology
                        Frequency  Percent  Valid Percent  Cumulative Percent
Valid    1                      4       .9            1.0                 1.0
         2                     32      7.3            7.6                 8.6
         3                     79     17.9           18.8                27.3
         4                     89     20.2           21.1                48.5
         5                     72     16.3           17.1                65.6
         6                     58     13.2           13.8                79.3
         7                     43      9.8           10.2                89.5
         8                     36      8.2            8.6                98.1
         9                      8      1.8            1.9               100.0
         Total                421     95.5          100.0
Missing  System                20      4.5
Total                         441    100.0
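The rounding-up recode described above can be checked with a short Python sketch (an illustration, not SPSS syntax). The function name is made up for the example; the frequencies are the valid counts of the original ideology variable.

```python
import math
from collections import Counter

def round_half_up(score):
    """Round x.5 upwards (4.5 -> 5); whole numbers stay unchanged."""
    return math.floor(score + 0.5)

# Valid frequencies of the original ideology variable.
original = {1: 4, 2: 32, 3: 79, 4: 89, 4.5: 2, 5: 70, 5.5: 4,
            6: 54, 7: 43, 7.5: 1, 8: 35, 9: 8}

recoded = Counter()
for score, count in original.items():
    recoded[round_half_up(score)] += count

for score in sorted(recoded):
    print(score, recoded[score])
```

The recoded counts for 5, 6 and 8 should match the newpoltideology table, and the valid N should still be 421. The sketch uses `math.floor(score + 0.5)` on purpose: Python's built-in `round` rounds halves to the nearest even number, so `round(4.5)` gives 4 rather than 5.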


Now we need the best possible operationalization of ‘individual WS advantage’ and we decide to use variable 6_18 to construct an index.
gebruikwelvaart
                        Frequency  Percent  Valid Percent  Cumulative Percent
Valid    0                    115     26.1           27.1                27.1
         1                    275     62.4           64.7                91.8
         2                     32      7.3            7.5                99.3
         3                      1       .2             .2                99.5
         4                      1       .2             .2                99.8
         5                      1       .2             .2               100.0
         Total                425     96.4          100.0
Missing  System                16      3.6
Total                         441    100.0


We look at our frequencies above and see that ‘5’ is the highest score in our dataset. When writing up the results of our research, we have to include the exact scale of the variable, which is tricky in this case because the survey asked respondents if they used any of 8 social programs. Obviously a person receiving WW (unemployment benefit) cannot receive AOW (state pension) at the same time, but the question asks respondents about their CURRENT situation and the situation in the last two years. It is possible for someone to have received WW within the past year and to currently receive AOW. We make a preliminary decision that ‘6’ is the highest possible score, based on the reasoning that a person who receives studiefinanciering (student finance) now or did so in the last 2 years is not likely to receive AOW; the same holds for WAJONG (disability benefit for young people) and AOW. But given the question wording (current situation and last two years), all other combinations of benefit receipt seem plausible. Determining the highest possible score of the scale is not a problem for SPSS (SPSS does not care about this), but we want to be clear about what we tell the reader later in our research report. In other words, SPSS will produce the same results whether the highest score is 5, 6, 7 or 8, but given that it is good scientific practice to provide the necessary information to the reader about how a variable is measured, we need to make a decision. [Another option is to recode the variable into two dummies: 0 is the reference category, and the two other categories are ‘receiving 1 benefit’ and ‘receiving more than 1 benefit.’]
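The two-dummy option mentioned in the note can be sketched in Python as follows. This is only an illustration of the coding logic; the function and label names are made up, and the counts are the valid frequencies of gebruikwelvaart.

```python
def benefit_dummies(n_programs):
    """Return (one_benefit, multiple_benefits); (0, 0) is the reference group."""
    one_benefit = 1 if n_programs == 1 else 0
    multiple_benefits = 1 if n_programs > 1 else 0
    return one_benefit, multiple_benefits

# Valid frequencies of gebruikwelvaart.
frequencies = {0: 115, 1: 275, 2: 32, 3: 1, 4: 1, 5: 1}

group_sizes = {"no benefit (reference)": 0,
               "one benefit": 0,
               "more than one benefit": 0}
for n_programs, count in frequencies.items():
    one_benefit, multiple_benefits = benefit_dummies(n_programs)
    if one_benefit:
        group_sizes["one benefit"] += count
    elif multiple_benefits:
        group_sizes["more than one benefit"] += count
    else:
        group_sizes["no benefit (reference)"] += count

print(group_sizes)
```

With this coding the ‘more than one benefit’ group contains only 35 respondents, which is worth keeping in mind when judging how precisely its coefficient can be estimated.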
Before we run the first regression, we double check that we have checked our variables and dealt with ‘don’t knows’ etc. Make sure these are coded “SYSMIS”.
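A minimal Python sketch of what this clean-up amounts to is shown below. The raw rows and the numeric code 999 are hypothetical; "geen antwoord" (no answer) is the label that appears in our ideology frequencies. The last step mimics listwise deletion, in which a respondent with a missing value on any variable in the model drops out of the regression.

```python
# Assumed missing-value codes: the 'geen antwoord' label and a hypothetical 999.
MISSING_CODES = {"geen antwoord", 999}

def to_missing(value):
    """Map 'don't know'-style codes to None (the analogue of SYSMIS here)."""
    return None if value in MISSING_CODES else value

# Hypothetical raw answers for four respondents.
raw = [
    {"ideology": 4, "age": 35},
    {"ideology": "geen antwoord", "age": 52},
    {"ideology": 7, "age": 999},
    {"ideology": 2, "age": 61},
]

cleaned = [{name: to_missing(value) for name, value in row.items()} for row in raw]

# Listwise deletion: keep only respondents with no missing values.
complete = [row for row in cleaned if None not in row.values()]
print(len(raw), len(complete))
```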
Now we will run the first regression:
steun = A + gebruikwelvaart + polideology + maledum + leeftijd
Model Summary
Model      R  R Square  Adjusted R Square  Std. Error of the Estimate
1      .416a      .173               .164                     4.93864
a. Predictors: (Constant), maledum, Leeftijd in jaren (age in years), newpoltideology, gebruikwelvaart

ANOVA (b)
Model           Sum of Squares   df  Mean Square       F   Sig.
1  Regression         2020.724    4      505.181  20.713  .000a
   Residual           9682.888  397       24.390
   Total             11703.612  401
a. Predictors: (Constant), maledum, Leeftijd in jaren, newpoltideology, gebruikwelvaart
b. Dependent Variable: SteunWVS

Coefficients (a)
                          Unstandardized        Standardized
Model                        B  Std. Error      Beta         t   Sig.
1  (Constant)           11.540        .969               11.913   .000
   gebruikwelvaart        .258        .404      .030       .638   .524
   Leeftijd in jaren      .101        .015      .321      6.913   .000
   newpoltideology       -.729        .134     -.249     -5.424   .000
   maledum              -1.640        .498     -.151     -3.291   .001
a. Dependent Variable: SteunWVS




Interpretations:
First we recall the scale of the DV, steunwelvaart: 1= very low support, 21= very high support.
To interpret “gebruik”, recall that it has 7 categories, 0-6, that refer to the number of social programs a person is using. The first thing we notice about gebruik is that the significance (p = .524) is far away from an alpha of .05. We interpret the coefficient anyway to see what it tells us: an increase of welfare state usage by one program, out of a total of 6 programs, results in an increase in support for the welfare state of .258 points on a 21 point scale. The sign (+) of the coefficient is in the predicted direction, but the relationship between welfare state use and support for the WS is weak (about 1/4 of a point increase on the 21 point scale of support for each program that a person uses). Because the p-value is so far above our alpha, we cannot reject the null hypothesis that there is no relationship in the population. (Recall that our sample is NOT representative, which means we have to be careful about our significance tests. Strictly speaking, the significance tests tell us about the population that our sampling technique actually reaches, which consists of the social environment of second-year political science students at Radboud University; we cannot make strong claims about the Dutch population or about people beyond its borders. For the moment, we proceed as if our sample is representative enough to make inferences about significance.)
Now we turn to our political ideology variable: the sign of the coefficient is in the predicted direction. As a person becomes more ‘right’, support for the welfare state should decrease, and this is exactly what happens. Recall that the polideology scale is 1-10. A one point increase on this scale results in (about) a ¾ point decrease in support for the WS on the 21 point scale. I would say that this variable has a moderate effect on the dependent variable, but this is a matter of interpretation. The coefficient is highly significant (p < .001), so we can be confident that the relationship holds for the population.
Now we turn to the male dummy (male=1; female=0): the sign of the coefficient is in the predicted direction (men support the WS less than women) and the size of the effect is, I would say, moderate. Maleness decreases support for the welfare state by 1.64 points on the 21 point scale. The coefficient is highly significant.
Now we turn to the leeftijd variable: again the sign of the coefficient is in the predicted direction, and the size of the effect is weak to moderate: an increase in age of 10 years results in a one point increase in WS support on the 21 point scale. The coefficient is highly significant.
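The arithmetic behind these interpretations is simply coefficient times change in the independent variable. A small Python sketch using the unstandardized coefficients from the first regression (the function name is made up for the example):

```python
# Unstandardized coefficients (B) from the first regression.
b = {
    "gebruikwelvaart": 0.258,
    "leeftijd": 0.101,
    "newpoltideology": -0.729,
    "maledum": -1.640,
}

def change_in_support(variable, delta):
    """Predicted change in SteunWVS when `variable` changes by `delta` units,
    holding the other variables in the model constant."""
    return b[variable] * delta

print(round(change_in_support("leeftijd", 10), 3))        # ten years older
print(round(change_in_support("newpoltideology", 1), 3))  # one point to the right
print(round(change_in_support("newpoltideology", 8), 3))  # from score 1 to score 9
print(round(change_in_support("maledum", 1), 3))          # male versus female
```

Comparing the change over the whole observed range of a variable (here ideology scores 1 through 9) is one way to judge whether an effect is weak or moderate on the 21 point scale.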
Overall, the results confirm our expectations based on the theory, except for the “gebruik” variable which we put in the model to test Pierson’s ‘new politics’ thesis that those who benefit from the welfare state are more likely to support the welfare state, regardless of political ideology. Moreover, when we look at the standardized coefficients we see that ‘age’ and ‘political ideology’ have the most influence on the dependent variable.
The adjusted r square of the model is .164, which means that the model explains about 16% of the variance in welfare state support. However, we are not that concerned about the value of the r square because the goal of our research is to test the viability of Pierson’s ‘new politics’ thesis by using linear regression. So for us, the size and significance of the coefficients are the most important places to look for confirmation/disconfirmation of Pierson’s arguments.
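For readers who want to see where the fit statistics come from, the following Python sketch recomputes R square, adjusted R square and F from the sums of squares in the ANOVA table of the first regression. It uses the standard textbook formulas; small differences in the last decimal can arise from rounding in the printed output.

```python
# Values taken from the ANOVA table of the first regression.
ss_regression = 2020.724
ss_residual = 9682.888
df_regression = 4    # number of independent variables
df_residual = 397

ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1  # cases used in the regression

# Share of the variance in SteunWVS explained by the model.
r_squared = ss_regression / ss_total

# Adjusted R square penalizes the number of independent variables.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# F is the ratio of the regression mean square to the residual mean square.
f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)

print(n, round(r_squared, 3), round(adjusted_r_squared, 3))
print(f_statistic)
```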
Now we would like to include an interaction term in our model, because we suspect that the relationship between political ideology and WS support is moderated by gender. This means that we suspect that the relationship between political ideology and WS support is stronger among men than among women. We theorized, and our first regression confirmed, that men are less likely to support the WS than women, which leads us to think that men are more influenced by their political ideology than women are.
We construct the following model:
steun = A + gebruikwelvaart + polideology + maledum + leeftijd + polideology*maledum
Before running the regression, we use “compare means” in SPSS to see if there is some evidence for our hypothesis:
Don’t forget to click ‘next’ after you enter the first part of the interaction (maledum) into the menu. This brings you to the next ‘layer’ of the computation, where you enter the second part of the interaction term (newpoltideology). Then click OK.
SPSS produces the following table:
Report
SteunWVS
maledum  newpoltideology     Mean    N  Std. Deviation
female   1                12.5000    2        12.02082
         2                12.6429   14         5.21252
         3                12.7429   35         5.10676
         4                11.4318   44         4.78563
         5                12.3438   32         5.16570
         6                13.5769   26         4.16819
         7                11.9500   20         3.85903
         8                 7.8000   10         7.33030
         9                15.0000    4         9.52190
         Total            12.1711  187         5.17673
male     1                17.5000    2         2.12132
         2                11.7059   17         4.76661
         3                13.0682   44         5.02736
         4                10.5814   43         5.35960
         5                10.5143   35         5.64838
         6                 9.0645   31         5.15710
         7                 9.1739   23         5.43266
         8                 7.5000   26         4.99800
         9                 6.2500    4         5.67891
         Total            10.4178  225         5.48404
Total    1                15.0000    4         7.61577
         2                12.1290   31         4.91082
         3                12.9241   79         5.03264
         4                11.0115   87         5.06583
         5                11.3881   67         5.46048
         6                11.1228   57         5.21012
         7                10.4651   43         4.91523
         8                 7.5833   36         5.62837
         9                10.6250    8         8.63444
         Total            11.2136  412         5.41135
Now we inspect the table and ask whether there is evidence that the relationship between polideology and WS support is moderated by gender. This means asking: is the relationship stronger or weaker for men than for women? Recall that men are coded 1 and women 0. You see that the predicted relationship between political ideology and WS support holds very well for men: as we move higher (more ‘right’) on the left-right scale, the mean for WS support for men drops from 17.5 to 6.25. Now the question is whether this relationship (between polideology and WS support) is stronger or weaker for women. We predicted that the relationship should be weaker, because women are more likely to support the welfare state regardless of their political ideology (caring tasks, work in the care sector, etc.). What does the table tell us? We see that as we move higher on the left-right scale, the mean level of support for women is fairly stable: there is no clear downward trend in the level of support for the welfare state as there was for men. Instead, women’s mean support stays between roughly 11.4 and 13.6 for scores 1 through 7; the dip to 7.8 at score 8 (N = 10) and the jump to 15.0 at score 9 (N = 4) rest on very few respondents. Thus there is evidence of an interaction effect, and we can move on to the regression with the interaction term added.
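One way to make this eyeball comparison a bit more systematic is to summarize each gender's column of means with a single slope. The Python sketch below computes an N-weighted least-squares slope of mean support on ideology score, separately for women and men, from the cell means in the Report table. This is an illustration added for comparison purposes and ignores the control variables (age and welfare state use), so the slopes will not equal the regression coefficients.

```python
def weighted_slope(cells):
    """N-weighted least-squares slope of mean support on ideology score.

    Each cell is (ideology score, mean SteunWVS, N).
    """
    total_n = sum(n for _, _, n in cells)
    x_bar = sum(x * n for x, _, n in cells) / total_n
    y_bar = sum(y * n for _, y, n in cells) / total_n
    sxy = sum(n * (x - x_bar) * (y - y_bar) for x, y, n in cells)
    sxx = sum(n * (x - x_bar) ** 2 for x, _, n in cells)
    return sxy / sxx

# (ideology score, mean SteunWVS, N) taken from the Report table.
female = [(1, 12.5000, 2), (2, 12.6429, 14), (3, 12.7429, 35),
          (4, 11.4318, 44), (5, 12.3438, 32), (6, 13.5769, 26),
          (7, 11.9500, 20), (8, 7.8000, 10), (9, 15.0000, 4)]
male = [(1, 17.5000, 2), (2, 11.7059, 17), (3, 13.0682, 44),
        (4, 10.5814, 43), (5, 10.5143, 35), (6, 9.0645, 31),
        (7, 9.1739, 23), (8, 7.5000, 26), (9, 6.2500, 4)]

slope_female = weighted_slope(female)
slope_male = weighted_slope(male)
print("female slope:", slope_female)
print("male slope:  ", slope_male)
```

Weighting by N keeps the very small cells (scores 1 and 9) from dominating the picture. If the interaction hypothesis has merit, the male slope should be clearly more negative than the female slope.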
We now compute the interaction term or interaction variable (Allison calls these ‘product terms’).
Now we run our regression with the added interaction term. SPSS gives us the following results.
Model Summary
Model      R  R Square  Adjusted R Square  Std. Error of the Estimate
1      .432a      .187               .177                     4.90224
a. Predictors: (Constant), maledumXpolideology, Leeftijd in jaren (age in years), gebruikwelvaart, newpoltideology, maledum

ANOVA (b)
Model           Sum of Squares   df  Mean Square       F   Sig.
1  Regression         2186.956    5      437.391  18.200  .000a
   Residual           9516.656  396       24.032
   Total             11703.612  401
a. Predictors: (Constant), maledumXpolideology, Leeftijd in jaren, gebruikwelvaart, newpoltideology, maledum
b. Dependent Variable: SteunWVS

Coefficients (a)
                            Unstandardized        Standardized
Model                          B  Std. Error      Beta         t   Sig.
1  (Constant)              9.675       1.195                8.097   .000
   gebruikwelvaart          .202        .401      .023       .503   .615
   Leeftijd in jaren        .100        .014      .318      6.891   .000
   newpoltideology         -.314        .206     -.108     -1.523   .129
   maledum                 1.755       1.382      .162      1.270   .205
   maledumXpolideology     -.709        .269     -.371     -2.630   .009
a. Dependent Variable: SteunWVS




The first thing we notice is that the statistical significance of our interaction term is high, p<.01. Following Allison’s advice, we decide that the interaction term does indeed belong in the model, so we proceed with our interpretations. Now the tricky part begins.
One strategy is to cautiously interpret the coefficients themselves, although this is not always as informative as it is for regression models that do NOT contain interaction terms. Of course we can interpret ‘leeftijd’ and ‘gebruikwelvaart’ in the normal way, since they are not part of an interaction term. But we have to be very careful about interpreting the coefficients for the ‘main effects’ variables (maledum and polideology).
The first thing we see is that the sign of the interaction term is negative, the sign of maledum is positive and the sign of polideol is negative.
The second thing we can do is to interpret the coefficient of the main effects variables:
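For newpolid, the coefficient is -.314. Because the model contains the product term, this number is the effect of ideology when maledum = 0, i.e. for women; it is not very helpful to read it as the overall effect of political ideology. In the same way, the maledum coefficient (1.755) is the difference between men and women when newpolideology = 0, a value that lies outside the 1-10 scale.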
Another approach is to estimate the effect of political ideology for different levels of the gender variable, i.e. for men (1) and for women (0).
To calculate the effect of newpolideology on WS support at a given value of maledum, use the following formula:
b(polideology) at a specific value of maledum = b(polideology) + b(maledumXpolideology) * value of maledum
SO: the effect of polideology when maledum is 1 (male) = -.314 + (-.709)*1
Thus, the effect of political ideology for men is -.314 - .709 = -1.023
(among men, each one point increase on the political ideology scale results in a 1.023 point decrease on the 21 point WS support scale)
The effect of polideology for women is -.314 + (-.709)*0 = -.314 (i.e. the newpolideology main effects coefficient)
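The formula above is easy to wrap in a small function. The Python sketch below uses the two relevant coefficients from the regression with the interaction term; the function and constant names are made up for the example.

```python
# Coefficients from the regression that includes the interaction term.
B_IDEOLOGY = -0.314      # newpoltideology (main effect)
B_INTERACTION = -0.709   # maledumXpolideology

def ideology_effect(maledum):
    """Effect of a one point move to the right on SteunWVS,
    at a given value of the gender dummy (1 = male, 0 = female)."""
    return B_IDEOLOGY + B_INTERACTION * maledum

print("women:", round(ideology_effect(0), 3))
print("men:  ", round(ideology_effect(1), 3))
```

The same function works for any moderator coded 0/1; for a continuous moderator you would plug in several meaningful values instead of just 0 and 1.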
This confirms our expectations: our hypothesis is that political ideology matters for men’s support of the welfare state but not very much for women’s. We conclude that political ideology matters more for men; being male strengthens the influence of ideology on WS support. We can also state this the other way around: being female weakens the negative influence of political ideology.
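A second way to read the results is to compute predicted scores for a few example respondents, such as a man and a woman who both score 1 on political ideology, and a man and a woman who both score 8. The Python sketch below does this with the coefficients from the regression with the interaction term. The values for age (40) and welfare state use (1 program) are assumptions chosen only to hold those variables constant; any other fixed values would shift all four predictions by the same amount.

```python
# Unstandardized coefficients from the regression with the interaction term.
COEF = {
    "constant": 9.675,
    "gebruikwelvaart": 0.202,
    "leeftijd": 0.100,
    "newpoltideology": -0.314,
    "maledum": 1.755,
    "maledumXpolideology": -0.709,
}

def predicted_support(ideology, maledum, age=40, gebruik=1):
    """Predicted SteunWVS; age=40 and gebruik=1 are illustrative assumptions."""
    return (COEF["constant"]
            + COEF["gebruikwelvaart"] * gebruik
            + COEF["leeftijd"] * age
            + COEF["newpoltideology"] * ideology
            + COEF["maledum"] * maledum
            + COEF["maledumXpolideology"] * ideology * maledum)

for ideology in (1, 8):
    woman = predicted_support(ideology, maledum=0)
    man = predicted_support(ideology, maledum=1)
    print(ideology, round(woman, 3), round(man, 3), round(man - woman, 3))
```

The predicted gap between men and women equals 1.755 - .709 * ideology, so it is small and positive at the far left and grows increasingly negative toward the right. This is what the interaction term means in substantive terms.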
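***We notice that the variables maledum and newpolitideology are no longer statistically significant at the p < .05 level. This is because, with the product term in the model, each ‘main effects’ coefficient is the effect of that variable when the other one equals 0. If we run the regression using femdum (female=1, male=0), we see that newpolid (which now gives the effect of ideology for men) and the interaction term femdumXpolid are statistically significant (SPSS output is not shown here); the femdum coefficient is simply the maledum coefficient with its sign reversed.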
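Why reversing the dummy changes the ‘main effects’ coefficients can be seen from simple algebra: substituting maledum = 1 - femdum into the model gives a new constant, a new ideology coefficient and sign-reversed dummy and interaction coefficients, while every prediction stays the same. The Python sketch below works this out from the reported coefficients. It is a derivation under the assumption that the same cases and the same other variables are used, not a reproduction of the SPSS output that is not shown; age and welfare state use are left out because their coefficients do not change.

```python
# Coefficients from the regression with maledum (male = 1, female = 0).
male_coding = {"constant": 9.675, "ideology": -0.314,
               "dummy": 1.755, "interaction": -0.709}

def reverse_dummy(c):
    """Coefficients implied when the dummy is reversed (femdum = 1 - maledum)."""
    return {
        "constant": c["constant"] + c["dummy"],
        "ideology": c["ideology"] + c["interaction"],  # effect for the new reference group
        "dummy": -c["dummy"],
        "interaction": -c["interaction"],
    }

def predict(c, ideology, dummy):
    """Prediction from the gender and ideology part of the model."""
    return (c["constant"] + c["ideology"] * ideology
            + c["dummy"] * dummy + c["interaction"] * ideology * dummy)

female_coding = reverse_dummy(male_coding)
print({name: round(value, 3) for name, value in female_coding.items()})

# Both codings give identical predictions for every respondent type.
for ideology in range(1, 10):
    for maledum in (0, 1):
        femdum = 1 - maledum
        difference = (predict(male_coding, ideology, maledum)
                      - predict(female_coding, ideology, femdum))
        assert abs(difference) < 1e-9
```

With femdum, the ideology coefficient becomes the slope for men, which is the larger effect computed earlier; the model fit and the predictions are unchanged.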