Does it make sense to interact 2 dummy variables?

I am running a logistic regression where the dependent variable is marital status. I assume that highly educated women differ from women with, e.g., only a high school degree and are less likely to marry. Does it make sense to interact two binary variables? I remember that we usually interacted a continuous and a binary variable, which is why I am hesitant to include this interaction and unsure how to interpret it. Please help.

asked Apr 29, 2017 at 17:40

I just find one binary covariate: Education. Where is the second one? Commented Apr 29, 2017 at 20:27

What is the interpretation of the coefficients in a case where only one is significant? Let's say my two variables are gender (men = 0, women = 1) and marital status (single = 0, married = 1), and I regress life satisfaction on them. How can I understand the data if none of my coefficients is significant except the intercept? Does that mean that only single men differ from the other groups? Thank you!

Commented Feb 19, 2021 at 11:30

3 Answers


Sure, you can include an interaction between categorical variables in your regression. The interpretation is particularly easy if the categorical variables are binary (i.e. have only two categories).

Let's look at your example and how to interpret it. You told us only one of the binary variables, $\mathrm{Education}$ ($0:$ high school, $1:$ more than high school). For the sake of illustration, I'm going to assume another binary variable, $\mathrm{Age}$ ($0:$ $<35$ years old, $1:$ $\geq 35$ years old). The logistic model containing an interaction between $\mathrm{Education}$ and $\mathrm{Age}$ is:

$$ \mathrm{logit}(p_i) = \beta_0 + \beta_1 \mathrm{Education}_i + \beta_2 \mathrm{Age}_i + \beta_3 \mathrm{Education}_i \times \mathrm{Age}_i $$

where $p_i$ is the probability of being married for the $i$th subject. We have four possibilities to consider. Below is a table of all four possibilities and the coefficients that remain in each. Note that if a binary dummy variable is 0, the corresponding term vanishes.

$$
\begin{array}{l|cc|c}
 & \text{Age} = 0 & \text{Age} = 1 & \text{Difference} \\
\hline
\text{Education} = 0 & \beta_0 & \beta_0 + \beta_2 & (\beta_0 + \beta_2) - \beta_0 = \beta_2 \\
\text{Education} = 1 & \beta_0 + \beta_1 & \beta_0 + \beta_1 + \beta_2 + \beta_3 & \beta_2 + \beta_3 \\
\hline
\text{Difference} & \beta_1 & \beta_1 + \beta_3 &
\end{array}
$$

To summarize the interpretation:

$\beta_0$ is the log-odds of being married for women with only a high school education who are below 35.
$\beta_1$ is the difference in the log-odds between women with a higher education and women with only a high school education, among women below 35.
$\beta_2$ is the difference in the log-odds between women aged 35 or older and women below 35, among women with only a high school education.
$\beta_3$ is the additional difference in the log-odds between older women and women below 35 when education changes from 0 to 1.
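To make the four interpretations concrete, here is a minimal sketch in Python. It relies on the fact that, with two binary predictors and their interaction, the model has one parameter per cell, so (as long as no cell count is zero) the fitted log-odds equal the observed cell log-odds. The counts below are made up purely for illustration, and the names `counts`, `cell_logit` and `linear_predictor` are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

# Hypothetical counts (married, not married) per (education, age) cell.
# These numbers are invented for illustration only.
counts = {
    (0, 0): (30, 70),  # high school, < 35
    (0, 1): (60, 40),  # high school, >= 35
    (1, 0): (20, 80),  # more than high school, < 35
    (1, 1): (45, 55),  # more than high school, >= 35
}

# Observed log-odds of being married in each cell.
cell_logit = {cell: logit(m / (m + n)) for cell, (m, n) in counts.items()}

# Coefficients read off the cells, following the interpretation list.
b0 = cell_logit[(0, 0)]                       # reference cell
b1 = cell_logit[(1, 0)] - cell_logit[(0, 0)]  # education effect among the young
b2 = cell_logit[(0, 1)] - cell_logit[(0, 0)]  # age effect among high school only
b3 = (cell_logit[(1, 1)] - cell_logit[(1, 0)]) - b2  # extra age effect if education = 1

def linear_predictor(educ, age):
    """Log-odds implied by the interaction model for one cell."""
    return b0 + b1 * educ + b2 * age + b3 * educ * age

for cell in counts:
    # The two printed numbers agree: the model reproduces every cell.
    print(cell, round(linear_predictor(*cell), 4), round(cell_logit[cell], 4))
```

Reading the coefficients off the cells this way mirrors the table: each $\beta$ is a difference (or a difference of differences) of cell log-odds.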

This may still be cryptic. Have a look at the graphic below, which illustrates all coefficients. We can draw an important conclusion from the picture: the interaction tests whether the lines are parallel. If $\beta_3$ is $0$ or very small, we can conclude that the lines are more or less parallel on the log-odds scale. A corresponding hypothesis test helps to quantify the evidence against parallelism.

[Figure: graphical illustration of the four coefficients]
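As a rough sketch of what such a hypothesis test could look like, the snippet below computes a large-sample (Wald) test of the interaction directly from cell counts. It assumes the usual approximation that the log-odds in a cell has variance 1/married + 1/not married, and that the cells are independent. The counts are invented for illustration; a real analysis would normally take the estimate and standard error from the fitted model.

```python
import math

# Hypothetical counts (married, not married) per (education, age) cell.
counts = {
    (0, 0): (30, 70),
    (0, 1): (60, 40),
    (1, 0): (20, 80),
    (1, 1): (45, 55),
}

def log_odds(married, not_married):
    """Observed log-odds of being married in one cell."""
    return math.log(married / not_married)

# Interaction = difference of the two age effects (a difference of differences).
age_effect_educ0 = log_odds(*counts[(0, 1)]) - log_odds(*counts[(0, 0)])
age_effect_educ1 = log_odds(*counts[(1, 1)]) - log_odds(*counts[(1, 0)])
b3 = age_effect_educ1 - age_effect_educ0

# Approximate standard error: sum of reciprocals of all eight counts.
se_b3 = math.sqrt(sum(1.0 / c for cell in counts.values() for c in cell))

z = b3 / se_b3
# Two-sided p-value under a standard normal reference distribution.
p_value = math.erfc(abs(z) / math.sqrt(2.0))

print(f"b3 = {b3:.4f}, SE = {se_b3:.4f}, z = {z:.3f}, p = {p_value:.3f}")
```

With these made-up counts the interaction is small relative to its standard error, which corresponds to nearly parallel lines.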

This analysis easily extends to categorical variables with more than two categories.

answered Apr 29, 2017 at 19:05 COOLSerdash

It makes sense, but only when every possible combination of the two variables occurs in the data, that is, when there are cases with 0;0, 0;1, 1;0 and 1;1 (the first number is the value of the first dummy variable and the second number is the value of the second).

In that situation, each of these combinations can have a different effect on the dependent variable. Otherwise (say, in the most common situation, where there are no 1;1 cases) the interaction variable is a linear combination of the base variables (here it is zero for every case), so its coefficient is not identified and a logistic regression without regularization will not converge to a unique solution.

The interaction also makes no sense if there are no 0;0, no 0;1, or no 1;0 cases, because then the intercept and the two main-effect coefficients are already enough to fit every conditional mean of the dependent variable across the combinations that do occur.
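The point about missing combinations can be checked with a small sketch: the design matrix (intercept, first dummy, second dummy, product) has full column rank only when all four combinations are present. The helper names `matrix_rank` and `design_matrix` are hypothetical, and exact fractions are used so that no numerical tolerance is needed.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def design_matrix(cells):
    """Columns: intercept, x1, x2, x1*x2; one row per (x1, x2) case."""
    return [[1, x1, x2, x1 * x2] for x1, x2 in cells]

all_cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
no_11 = [(0, 0), (0, 1), (1, 0), (0, 1), (1, 0)]  # repeats do not add rank
no_00 = [(0, 1), (1, 0), (1, 1)]

print(matrix_rank(design_matrix(all_cells)))  # 4: all four coefficients identified
print(matrix_rank(design_matrix(no_11)))      # 3: product column is all zeros
print(matrix_rank(design_matrix(no_00)))      # 3: product equals x1 + x2 - 1
```

A rank of 3 with four columns means one coefficient is redundant, which is the linear dependence described above.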

This would also make no sense logically. If there are, say, employed and unemployed women and employed and unemployed men, it makes sense to build a model that predicts a different value for an employed woman than for an employed man. But if no men have given birth to a child, it makes no sense for the model to predict different values for a man who has given birth and for a woman who has given birth.

In a model with two dummy variables $x_1$ and $x_2$ and no interaction, the effect of any combination is just the sum of the effects of the individual variables:

$$ \mathrm{logit}(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 $$

For a case with both variables equal to one, such a model predicts just the sum of the effects of the two variables.

With an interaction term, there is also a separate effect of having both of them, carried by the product $x_1 x_2$ of the two dummies:

$$ \mathrm{logit}(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 $$

Here the model predicts a separate value for a case with both variables equal to one, which can differ from the plain sum of the effects of the two variables.
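The contrast between the two models can be sketched numerically. The coefficients below are hypothetical and held fixed across both models purely for illustration (in a real fit, dropping the interaction would also change the other estimates). The only combination where the predictions differ is 1;1.

```python
import math

def inv_logit(eta):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients on the log-odds scale (illustration only).
b0, b1, b2, b3 = -1.0, 0.5, 0.8, -0.6

def additive(x1, x2):
    """Log-odds from the model without interaction."""
    return b0 + b1 * x1 + b2 * x2

def with_interaction(x1, x2):
    """Log-odds from the model with the x1*x2 interaction term."""
    return additive(x1, x2) + b3 * x1 * x2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    p_add = inv_logit(additive(x1, x2))
    p_int = inv_logit(with_interaction(x1, x2))
    print(f"{x1};{x2}  additive p = {p_add:.3f}  interaction p = {p_int:.3f}")
```

Because the product of the dummies is zero unless both equal one, the interaction coefficient shifts only the prediction for the 1;1 case.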