# Serial Mediation for Count Data (with R code)

## Warning:

1. The correctness and quality of the R code presented in this tutorial are not guaranteed.
2. This tutorial is inspired by a paper called “Accommodating binary and count variables in mediation: A case for conditional indirect effects“. However, the author of this tutorial is not one of the authors of that paper.
3. The author of this tutorial cannot guarantee that the R code in this tutorial correctly reflects the idea of that paper. Please do read the paper by yourself and make your own judgment.
4. Please do NOT cite this tutorial as a reference, if you are writing an academic paper. Further, the author of this tutorial does not provide any consulting service. But, you can leave a comment down below, and I will try my best to help.

## Introduction

This tutorial shows how to do serial mediation for count data. Complete R code is provided.

The following are the theoretical models and conceptual framework, assuming IV, M1, M2 are close to normally distributed, and DV is the count data.

$$IV \sim N(4, 5)$$
$$M1 = 0.3+0.5 \times IV$$
$$M2 = 0.1+0.7 \times IV +0.5 \times M1$$
$$log(E(DV | IV)) = 0.2+0.3 \times M1 +0.3 \times M2 + 0.01 \times IV$$

## Step 1 (Optional): Data Simulation

The following is the code to generate the data based on the models stated above. Note that, if you already have data ready, you do not need this simulation step.

# set sample size
n=1000
# set seed
set.seed(123)

# simulate x (normal distribution), namely IV
X <- rnorm(n, 5, 4)

# simulate residual for M_1
residual_1<-rnorm(n,0,1)
# simulate M_1
M_1<-0.3+0.5*X+residual_1

# simulate residual for M_2
# Note that, residuals for M_1 and M_2 should be different.
# Otherwise, there will be problem of singularities.
residual_2<-rnorm(n,0,1.1)
# simulate M_2
M_2<-0.1+0.7*X+0.5*M_1+residual_2

# mu for Poisson regression via a log link
mu_1 <- exp(0.2 + 0.3*M_1+0.3*M_2+0.01*X)
# use rpois to generate Y
Y <- rpois(n, lambda=mu_1)

# combine into a dataframe and print out the first 6 rows
data <- data.frame(X=X, M_1=M_1,M_2=M_2, Y=Y)


## Step 2: R code for serial mediation for count data

The following is the R code that can be used to calculate the indirect effect.

library(boot)

Serial_Mediation_poisson<-function(data_used,i,x_predetermined=0)
{
# Sample a data
data_temp=data_used[i,]

# Deciding which X value to use
if(x_predetermined==0){x_predetermined=mean(data_temp$X)} else if (x_predetermined==-1){x_predetermined=mean(data_temp$X)-sd(data_temp$X)} else(x_predetermined=mean(data_temp$X)+sd(data_temp$X)) # a path result_a<-lm(M_1~X, data = data_temp) a_0<-result_a$coefficients
a_1<-result_a$coefficients # d path result_d<-lm(M_2~X+M_1, data = data_temp) d_0<-result_d$coefficients
d_1<-result_d$coefficients d_2<-result_d$coefficients

# b path
result_b<-glm(Y~M_1+M_2+X, data = data_temp, family = quasipoisson)
b_0<-result_b$coefficients b_1<-result_b$coefficients
b_2<-result_b$coefficients c_1_apostrophe<-result_b$coefficients

#calculating the indirect effect
M_1_estimated=a_0+a_1*x_predetermined
M_2_estimated=d_0+d_1*x_predetermined+d_2*M_1_estimated
indirect_effect<-a_1*d_2*b_2*exp(b_0+b_1*M_1_estimated+b_2*M_2_estimated+c_1_apostrophe*x_predetermined)
return(indirect_effect)
}

# use boot() to do bootstrapping mediation analysis
boot_mediation <- boot(data, Serial_Mediation_poisson, R=1000)
boot_mediation

# print out confidence intervals
boot.ci(boot.out = boot_mediation, type = c("norm", "perc"))

The following is the output:

> boot_mediation
ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = data, statistic = Serial_Mediation_poisson, R = 1000)

Bootstrap Statistics :
original      bias
t1*   1.1202 0.006848951
std. error
t1*   0.1039129

> boot.ci(boot.out = boot_mediation, type = c("norm", "perc"))
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = boot_mediation, type = c("norm", "perc"))

Intervals :
Level      Normal             Percentile
95%   ( 0.910,  1.317 )   ( 0.928,  1.341 )
Calculations and Intervals on Original Scale

## Step 3 (Optional): Check the model

We can also check the theoretical value of the indirect effect using the following R code. The value is 1.028849. It is not exactly the same as the the value of 1.1202. This is due to three randomness in the data simulation, 2 in the residuals in normal distribution and 1 in the simulation of Poisson data.

x_mean=mean(data\$X)
M_1=0.3+0.5*x_mean
M_2=0.1+0.7*x_mean+0.5*M_1
0.5*0.5*0.3*exp(0.2 + 0.3*M_1+0.3*M_2+0.01*x_mean)

## Appendix: Raw Code in GitHub

You can click this link for complete raw R code stored in GitHub.