1 Generate Data for Choice Model

First we generate data in the following procedure.

Generate \(\lambda_{imt}\). \(\lambda_{imt}\) is one of the most significant variables that represents consumer’s risk (observable for consumers but not econometrician)
Generate choice in \(t = 0\) according to the choice model. In \(t = 0\), consumers have four options: the firm with and without monitoring, competitor 1 and 2.
Generate realization of cost (claims count, severity and monitoring score) in \(t = 0\) according to the cost model.
Generate choice in \(t = 1\) based on renewal prices realized by claims and scores. Now consumers have three options: the firm, competitor 1 and 2.

1.1 Generate \(\lambda\)

The rate parameter, \(\lambda_{imt}\) of Poisson distributed claims count, has mean \(\mu_{\lambda, imt}\) and an additive error \(\varepsilon_{\lambda, i}\) which is log-normally distributied with spread \(\sigma_{\lambda}\)
We assume that \(\mu_{\lambda, imt} = \theta_{\lambda1}x_{1, imt} + \cdots + \theta_{\lambda4}x_{4, imt}\)

rm(list = ls())

N <- 10^3
T <- 1
D <- 4

# set the seed
set.seed(1)

## True Parameters
sigma_lambda <- 0.1
theta_lambda <- c(-3, -0.5, 1, -1, 1, -0.5)

## Covariates
mu_x_t0 <- rep(0, 4)
mu_x_t1 <- rep(0, 4)
sigma_x <- diag(0.25, 4)

X_t0 <- mvrnorm(N, mu_x_t0, sigma_x)
X_t1 <- 0.5*X_t0 + 0.5*mvrnorm(N, mu_x_t1, sigma_x)

X <- X_t0 %>% 
  rbind(X_t0) %>% 
  rbind(X_t1) %>% 
  rbind(X_t1)

colnames(X) <- c(paste("x", 1:4, sep = "_"))

X_lambda <- expand.grid(i = 1:N, m = 0:1, t = 0:1) %>% 
  tibble::as_tibble() %>% 
  cbind(X)

## mu_lambda
mu_lambda <- X_lambda %>% 
  dplyr::mutate(mu_lambda = theta_lambda[1]
                + theta_lambda[2]*x_1 + theta_lambda[3]*x_2
                + theta_lambda[4]*x_3 + theta_lambda[5]*x_4
                + theta_lambda[6]*ifelse(m == 1 & t == 0, 1, 0))

## epsilon_lambda
epsilon_lambda <- expand.grid(i = 1:N) %>% 
  tibble::as_tibble() %>% 
  dplyr::mutate(epsilon_lambda = rnorm(N, 0, sigma_lambda))

## lambda
lambda <- mu_lambda %>% 
  dplyr::left_join(epsilon_lambda, by = c("i")) %>% 
  dplyr::mutate(lambda = exp(mu_lambda + epsilon_lambda))

The distribution of generated \(\lambda\) is as follows

## Visualize 
g_lambda <- lambda %>% 
  ggplot(aes(x = lambda)) + 
  geom_histogram(binwidth = 0.01)

plot(g_lambda)

We also check opt-in drivers tend to have lower risk

## Visualize lambda and monitoring opt-in
g_lambda_monitoring <- lambda %>% 
  dplyr::mutate(monitoring = ifelse(m == 1, "Monitored", "Unmonitored")) %>% 
  ggplot(aes(x = monitoring, y = lambda)) + 
  geom_boxplot() + 
  scale_y_continuous(limits=c(0, 0.1))

plot(g_lambda_monitoring)

1.2 Generate (Expectation of) Monitoring Score and Out-of-Pocket

Then, we generate expectation of monitoring score and severity, which is necessary for expectation of out-of-pocket, \(e(C, y_{d})\) and renewal price multiplier \(R_{idt}(C, s)\).
Score \(s\) follows log-normal distribution with an individual mean \(\mu_{s, i}\) and precision \(\sigma_{s}\)
Assuming that \(\mu_{s, i} = \theta_{s, 1} + \theta_{s, 2}\log(\lambda_{i}) + \theta_{s, 3}x_{1, i} + \theta_{s, 4}x_{2, i}\), the expectation of score is given by \(\exp(\mu_{s, i} + \frac{\sigma_{s}^{2}}{2})\)
Out-of-pocket (OOP), is that consumers have to pay when an accident occurs due to deductible and/or policy limit
We assume that every plan has only policy limit
In the paper, for simplicity, consumers only consider the possibility of one claim occurrence per term in expectation
Then, expectation of OOP is given by:

\[ \begin{align} E[e(C, y_{d})] &= \Pr(C = 1) \times \int_{y_{0}}^{\infty} (x - y_{0})a_{l} l_{0}^{a_{l}} x^{- a_{l} - 1}dx \\ &= \lambda \exp(-\lambda) \times \left\{ \frac{l_{0}^{a_{l}}}{a_{l} - 1} y_{0}^{-a_{l} + 1} \right\} \end{align} \]

## Generate expectations of score
theta_score <- c(-3, -0.5, 2, 1)
sigma_score <- 0.25

score <- lambda %>% 
  dplyr::filter(t == 0, m == 1) %>% 
  dplyr::mutate(mu_score = theta_score[1] + theta_score[2]*log(lambda)
                + theta_score[3]*x_1 + theta_score[4]*x_2,
                E_score = exp(mu_score + sigma_score^2 / 2)) %>% 
  dplyr::select(i, m, t, mu_score, E_score)

## Visualize Score
g_mu_score <- score %>% 
  ggplot(aes(x = mu_score)) + 
  geom_histogram(binwidth = 0.1)

plot(g_mu_score)

## Expectation of Out-of-Pocket
policy_limit <- 1000
l0 <- 500
a_l <- 2.5

E_oop <- lambda %>% 
  dplyr::mutate(prob_C1 = lambda * exp(-lambda),
                E_oop = prob_C1 * (l0^(a_l)/(a_l - 1))*policy_limit^(-a_l + 1),
                E_oop_sq = prob_C1 * ((2 * l0^(a_l)) / ((a_l - 1) * (a_l - 1))) * policy_limit^(-a_l + 2)) %>% 
  dplyr::select(i, m, t, E_oop, E_oop_sq)

## Visualize OOP
g_oop <- E_oop %>% 
  ggplot(aes(x = E_oop)) + 
  geom_histogram(binwidth = 0.5)

plot(g_oop)

1.3 Generate Choice in \(t = 0\)

In the paper, consumer’s realized choice utility is modeled as follows:

\[ \begin{align} u_{idt}(C, s) &= u_{\gamma}(w_{it} + h_{idt}(C, s)) \\ h_{idt}(C, s) &= -p_{idt} - \mathbf{1}_{d, t-1} \cdot \psi_{idt} - e(C, y_{d}) - p_{idt} \cdot R_{idt}(C, s) \\ \text{where } \psi_{idt} &= \mathbf{1}_{d, t-1} \cdot \eta_{0} + \mathbf{1}_{f_{d}, t-1} \cdot \eta_{it} + \mathbf{1}_{m_{d}} \cdot \mathbf{1}_{t=0} \cdot \xi_{it} \end{align} \]

At each period \(t\), consumer \(i\) chooses \(d\) so as to maximize her expected utility:

\[ \begin{align} d_{it} &= \mathop{\rm argmax}\limits_{d \in D_{it}} \left\{ v_{idt} + \varepsilon_{idt} \right\} \\ \text{where } v_{idt} &= \mathbb{E}_{C, s}[u_{idt}(C, s)] = \mathbb{E}[h_{idt}] -\frac{\gamma}{2}\mathbb{E}[h_{idt}^{2}] \end{align} \]

Thus, we start with generating each component of \(h_{idt}\).

price_par_t0_d1 <- c(15, 3, 2, 3, -2)
price_par_t0_d2 <- c(15, 3, 2, 3, -2)
price_par_t0_d3 <- c(15, 1, 3, 2, -2)
price_par_t0_d4 <- c(15, 4, 5, 1, -2)

price_gen_d <- function(choice_t0, D, price_par_t0){
  output <- choice_t0 %>% 
    dplyr::filter(d == D) %>% 
    dplyr::mutate(price = 
                    price_par_t0[1] +
                    price_par_t0[2]*x_1 + 
                    price_par_t0[3]*x_2 + 
                    price_par_t0[4]*x_3 +
                    price_par_t0[5]*m +
                    rnorm(N, 1, 1)) %>% 
    dplyr::select(i, d, t, m, f, price)
}

## Generate Price
price_gen_t0 <- function(choice_t0,
                         price_par_t0_d1, price_par_t0_d2,
                         price_par_t0_d3, price_par_t0_d4){
  
  price <- price_gen_d(choice_t0, 1, price_par_t0_d1) %>% 
    rbind(price_gen_d(choice_t0, 2, price_par_t0_d2)) %>% 
    rbind(price_gen_d(choice_t0, 3, price_par_t0_d3)) %>% 
    rbind(price_gen_d(choice_t0, 4, price_par_t0_d4))
  
  output <- choice_t0 %>% 
    dplyr::left_join(price, by = c("i", "d", "t", "m", "f"))
  
  return(output) 
}


## Generate Prior Firm
prior_firm_t0 <- expand.grid(i = 1:N) %>% 
  tibble::as_tibble() %>% 
  dplyr::mutate(prior_firm = sample(1:3, size = N, replace = T, prob = c(0.4, 0.35, 0.25)))


## True Parameters
### Inertia
eta_f <- c(5, 1, 2, 2)
xi <- c(1, -2.5)

### Renewal Price
alpha_m0 <- c(20, 5, 3)
alpha_m1 <- c(20, 5, 3, -1)
beta <- 20

### Risk Aversion
risk_aversion <- 10^(-4)


## Generate Choice
choice_t0 <- expand.grid(i = 1:N, d = 1:D, t = 0) %>% 
  tibble::as_tibble() %>% 
  dplyr::mutate(m = ifelse(d == 1, 1, 0),
                f = dplyr::case_when(
                  d <= 2 ~ 1,
                  d == 3 ~ 2,
                  d == 4 ~ 3)) %>% 
  dplyr::left_join(lambda, by = c("i", "m", "t")) %>% 
  dplyr::left_join(score, by = c("i", "m", "t")) %>% 
  dplyr::left_join(E_oop, by = c("i", "m", "t")) %>% 
  price_gen_t0(price_par_t0_d1, price_par_t0_d2, price_par_t0_d3, price_par_t0_d4) %>% 
  dplyr::left_join(prior_firm_t0, by = c("i")) %>% 
  dplyr::mutate(inertia = eta_f[1] + eta_f[2]*x_1 + eta_f[3]*x_2 + eta_f[4]*x_3,
                monitoring_disutility = xi[1] + xi[2]*log(lambda),
                demand_friction = ifelse(f == prior_firm, 0, inertia) + 
                  ifelse(d == 1, monitoring_disutility, 0)) %>% 
  dplyr::mutate(E_R_s = ifelse(m == 0, 
                           (alpha_m0[1] + alpha_m0[2]*x_1 + alpha_m0[3]*x_2) / beta,
                           (alpha_m1[1] + alpha_m1[2]*x_1 + alpha_m1[3]*x_2 + alpha_m1[4]*E_score) / beta),
                E_R_s_sq = E_R_s / beta,
                E_R_C = 0.98 * exp(-lambda) + 1.2 * (1 - exp(-lambda)),
                E_R_C_sq = 0.98^2 * exp(-lambda) + 1.2^2 * (1 - exp(-lambda)),
                E_renewal_price = price * E_R_s * E_R_C,
                E_renewal_price_sq = price^2 * E_R_s_sq * E_R_C_sq) %>% 
  dplyr::mutate(h = - price - demand_friction - E_oop - E_renewal_price,
                h_sq = price^2 - demand_friction^2 + E_oop_sq + E_renewal_price_sq
                + 2*price*demand_friction + 2*price*E_oop + 2*price*E_renewal_price
                + 2*demand_friction*E_oop + 2*demand_friction*E_renewal_price
                + 2*E_oop*E_renewal_price)

# draw idiosyncratic shocks
e_t0 <- evd::rgev(dim(choice_t0)[1])

choice_t0 <- cbind(choice_t0, e_t0) %>% 
  dplyr::mutate(v = h - (risk_aversion / 2)*h_sq,
                u = v + e_t0) %>% 
  dplyr::group_by(i, t) %>% 
  dplyr::mutate(choice = ifelse(u == max(u), 1, 0)) %>% 
  dplyr::ungroup()

summary(choice_t0)

summarize_choice <- choice_t0 %>% 
  dplyr::group_by(d) %>% 
  dplyr::summarise(share = mean(choice))

summarize_choice

The figure shows generated prices of each plan.
Monitoring plan provided by firm 1 is slightly lower than other plans, representing the opt-in discount

## Visualize Price in t = 0
g_price_t0 <- choice_t0 %>% 
  dplyr::mutate(plan = dplyr::case_when(d == 1 ~ "Firm1 Monitoring",
                                        d == 2 ~ "Firm1 No Monitoring",
                                        d == 3 ~ "Firm2",
                                        d == 4 ~ "Firm3")) %>% 
  ggplot(aes(x = price, fill = plan)) + 
  geom_histogram(position = "identity", alpha = 0.6, binwidth = 0.25)

plot(g_price_t0)

The following figure shows the expectation of \(R_{idt}(s)\), which implies that monitored consumers tend to have smaller expected renewal prices, incentivizing them to opt-in

## Visualize E_R_s in t = 0
g_E_R_s_t0 <- choice_t0 %>% 
  dplyr::filter(f == 1) %>% 
  dplyr::mutate(plan = dplyr::case_when(m == 1 ~ "Monitoring",
                                        m == 0 ~ "No Monitoring")) %>% 
  ggplot(aes(x = E_R_s, fill = plan)) + 
  geom_histogram(position = "identity", alpha = 0.6, binwidth = 0.01)

plot(g_E_R_s_t0)

1.4 Generate Realization of Cost in \(t = 0\)

Since consumer’s choice is path-dependent, i.e. price in next period depends on claims and monitoring score, we need to generate realization of costs in \(t = 0\)
Claim occurs according to Poisson distribution with mean \(\lambda\)
The monitoring score \(s\) is drawn according to a log-normal distribution with mean \(\mu_{s, i}\) and precision \(\sigma_{s}\)
Note that we do not need to generate realization of severity since it does not affect consumer’s choice in \(t = 1\)

## Claims Count
cost_t0 <- choice_t0 %>% 
  dplyr::filter(choice == 1) %>% 
  dplyr::mutate(claims = rpois(N, lambda = lambda))

## Severity
# max_claims <- max(cost_t0$claims)
# 
# claims_mat <- matrix(rep(0, N * max_claims), nrow = N)
# severity <- matrix(l0*runif(N * max_claims)^(-1/a_l), nrow = N)
# 
# colnames(severity) <- paste("sev_", 1:max_claims, sep = "")
# 
# cost_t0 <- cbind(cost_t0, severity)

## Monitoring Score
cost_t0 <- cost_t0 %>% 
  dplyr::mutate(score = rlnorm(N, mu_score, sigma_score),
                score = ifelse(m == 1, score, 1),
                log_score = log(score)) %>% 
  dplyr::select(i, claims, matches("sev."), score, log_score)

## Visualize realized claims count
g_realized_claims_count <- cost_t0 %>%
  ggplot(aes(x = claims)) + 
  geom_histogram(binwidth = 0.5)

plot(g_realized_claims_count)

The following figure shows realized monitoring scores (in logarithm) for opt-in drivers

## Visualize realized score
g_realized_score <- cost_t0 %>%
  dplyr::filter(score != 1) %>% 
  ggplot(aes(x = log_score)) + 
  geom_histogram(binwidth = 0.25)

plot(g_realized_score)

1.5 Generate Choice in \(t = 1\)

## Prior Choice and Firm
prior_firm_t1 <- choice_t0 %>% 
  dplyr::filter(choice == 1) %>% 
  dplyr::mutate(prior_choice = d * choice,
                prior_firm = f * choice) %>% 
  dplyr::select(i, prior_choice, prior_firm)

### Prior Price
prior_price_t1 <- choice_t0 %>% 
  dplyr::left_join(prior_firm_t1, by = c("i")) %>% 
  dplyr::mutate(prior_price = price,
                flg = dplyr::case_when(prior_choice == 1 & d != 2 ~ 1,
                                       prior_choice > 1 & d != 1 ~ 1,
                                       TRUE ~ 0)) %>% 
  dplyr::filter(flg == 1) %>% 
  dplyr::select(i, f, prior_price) %>% 
  dplyr::arrange(i)

## Choice in t = 1
choice_t1 <- expand.grid(i = 1:N, d = 1:(D - 1), t = 1) %>% 
  tibble::as_tibble() %>% 
  dplyr::mutate(m = 0,
                f = d) %>% 
  dplyr::left_join(prior_firm_t1, by = c("i")) %>% 
  dplyr::left_join(prior_price_t1, by = c("i", "f")) %>% 
  dplyr::left_join(lambda, by = c("i", "m", "t")) %>% 
  dplyr::left_join(E_oop, by = c("i", "m", "t")) %>% 
  dplyr::left_join(cost_t0, by = c("i")) %>% 
  dplyr::mutate(gamma_alpha = ifelse(score == 1,
                                     alpha_m0[1] + alpha_m0[2]*x_1 + alpha_m0[3]*x_2,
                                     alpha_m1[1] + alpha_m1[2]*x_1 + alpha_m1[3]*x_2 + alpha_m1[4]*score)) %>% 
  dplyr::mutate(R_s = rgamma(N * (D - 1), gamma_alpha, beta),
                R_C = ifelse(claims == 0, 0.98, 1.2),
                price = prior_price * R_s * R_C) %>% 
  dplyr::mutate(inertia = eta_f[1] + eta_f[2]*x_1 + eta_f[3]*x_2 + eta_f[4]*x_3,
                demand_friction = ifelse(d == prior_firm, 0, inertia)) %>%   
  dplyr::mutate(E_R_s = ifelse(m == 0, 
                           (alpha_m0[1] + alpha_m0[2]*x_1 + alpha_m0[3]*x_2) / beta,
                           (alpha_m1[1] + alpha_m1[2]*x_1 + alpha_m1[3]*x_2 + alpha_m1[4]*E_score) / beta),
                E_R_s_sq = E_R_s / beta,
                E_R_C = 0.98 * exp(-lambda) + 1.2 * (1 - exp(-lambda)),
                E_R_C_sq = 0.98^2 * exp(-lambda) + 1.2^2 * (1 - exp(-lambda)),
                E_renewal_price = price * E_R_s * E_R_C,
                E_renewal_price_sq = price^2 * E_R_s_sq * E_R_C_sq) %>% 
  dplyr::mutate(h = - price - demand_friction - E_oop - E_renewal_price,
                h_sq = price^2 - demand_friction^2 + E_oop_sq + E_renewal_price_sq
                + 2*price*demand_friction + 2*price*E_oop + 2*price*E_renewal_price
                + 2*demand_friction*E_oop + 2*demand_friction*E_renewal_price
                + 2*E_oop*E_renewal_price) %>% 
  dplyr::arrange(i)

# draw idiosyncratic shocks
e_t1 <- evd::rgev(dim(choice_t1)[1])

choice_t1 <- cbind(choice_t1, e_t1) %>% 
  dplyr::mutate(v = h - (risk_aversion / 2)*h_sq,
                u = v + e_t1) %>% 
  dplyr::group_by(i, t) %>% 
  dplyr::mutate(choice = ifelse(u == max(u), 1, 0)) %>% 
  dplyr::ungroup()

summary(choice_t1)

summarize_choice_t1 <- choice_t1 %>% 
  dplyr::group_by(d) %>% 
  dplyr::summarise(share = mean(choice))

summarize_choice_t1

## Visualize Price in t = 1
g_price_t1 <- choice_t1 %>% 
  dplyr::mutate(plan = dplyr::case_when(d == 1 ~ "Firm1",
                                        d == 2 ~ "Firm2",
                                        d == 3 ~ "Firm3"),
                prior_plan = dplyr::case_when(prior_choice == 1 ~ "Firm1 Monitoring",
                                              prior_choice == 2 ~ "Firm1 No Monitoring",
                                              prior_choice == 3 ~ "Firm2",
                                              prior_choice == 4 ~ "Firm3")) %>% 
  ggplot(aes(x = price, fill = prior_plan)) + 
  geom_histogram(position = "identity", alpha = 0.6, binwidth = 0.25)

plot(g_price_t1)

## Visualize Realized R_s in t = 1
g_R_s_t1 <- choice_t1 %>% 
  dplyr::mutate(plan = dplyr::case_when(d == 1 ~ "Firm1",
                                        d == 2 ~ "Firm2",
                                        d == 3 ~ "Firm3"),
                prior_plan = dplyr::case_when(prior_choice == 1 ~ "Firm1 Monitoring",
                                              prior_choice == 2 ~ "Firm1 No Monitoring",
                                              prior_choice == 3 ~ "Firm2",
                                              prior_choice == 4 ~ "Firm3")) %>% 
  ggplot(aes(x = R_s, fill = prior_plan)) + 
  geom_histogram(position = "identity", alpha = 0.4, binwidth = 0.025)

plot(g_R_s_t1)

1.6 Merge Choice Data

df_t0 <- choice_t0 %>% 
  dplyr::select(i, d, t, m, f, x_1, x_2, x_3, x_4, price, prior_firm, choice)
df_t1 <- choice_t1 %>% 
  dplyr::select(i, d, t, m, f, x_1, x_2, x_3, x_4, price, prior_firm, choice, claims, score, R_s, R_C)

df <- dplyr::bind_rows(df_t0, df_t1) %>% 
  dplyr::arrange(t, i, d)

Monte Carlo Simulation: Jin and Vasserman (2021)