---
title: "Multiple ITS control introduction for level change (two-stage)"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Multiple ITS control introduction for level change (two-stage)}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE, 
  echo = FALSE,
  warning = FALSE,
  message = FALSE,
  comment = "#>"
)
```

```{r setup}
library(multipleITScontrol)
library(dplyr)
library(ggplot2)
library(lubridate)
library(stringi)
library(rlang)
library(purrr)

phei_calendar <- function(df,
                          date_column = NULL,
                          factor_column = NULL,
                          colours = NULL,
                          title = "Placeholder: Please supply title or 'element_blank()' to `title` argument",
                          subtitle = "Placeholder: Please supply subtitle or 'element_blank()' to `subtitle` argument",
                          caption = "PH.Intelligence@hertfordshire.gov.uk",
                          ncol,
                          ...) {


  date_column <- rlang::sym(date_column)
  factor_column <- rlang::sym(factor_column)

  df <- df |> dplyr::mutate(
    mon = lubridate::month(!!date_column, label = T, abbr = F),
    wkdy = weekdays(!!date_column,
                    abbreviate =
                      T
    ) |> forcats::fct_relevel("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
    day = lubridate::mday(!!date_column),
    week = stringi::stri_datetime_fields(!!date_column)$WeekOfMonth,
    year = lubridate::year(!!date_column),
    year_mon = zoo::as.yearmon(!!date_column, "%Y %m")
  ) |>
    dplyr::mutate(across(week, ~ dplyr::case_when(wkdy == "Sun" ~ week - 1,
                                           .default = as.numeric(week)
    )))
  
  df %>%
    ggplot2::ggplot(., ggplot2::aes(wkdy, week)) +
    # custom theme stuff below
    # geom_tile and facet_wrap will do all the heavy lifting
    ggplot2::geom_tile(
      alpha = 0.8,
      ggplot2::aes(fill = !!factor_column),
      color = "black", ...
    ) +
    ggplot2::facet_wrap(~year_mon, scales = "free_x", ncol = ncol) +
    ggplot2::geom_text(ggplot2::aes(label = day)) +
    # put your y-axis down, flip it, and reverse it
    ggplot2::scale_y_reverse(breaks = NULL) +
    # manually fill scale colors to something you like...
    ggplot2::scale_fill_manual(
      values = colours,
      na.value = "white",
      na.translate = FALSE
    ) +
    ggpubr::theme_pubclean() +
    ggplot2::theme(legend.position = "bottom") +
    ggplot2::labs(
      fill = "",
      x = "",
      y = "",
      title = element_blank(),
      caption = "PH.Intelligence@hertfordshire.gov.uk"
    )
}
```

## Usage

This is a basic example which shows how to solve a common problem with a **two‑stage interrupted time series with a control**, for a **level (step) change hypothesis**.

**Background:**

\
*MetroBike City* and *RiverCycle District* operate bicycle‑sharing networks in neighbouring urban areas. Both systems use similar bicycles, have comparable ridership patterns, and maintain similar maintenance schedules.

In early 2023, MetroBike City implemented two sequential interventions aimed at reducing **weekly maintenance incidents** (e.g., damaged gears, brake faults, electronic lock failures). RiverCycle District made *no* fleet-wide changes during the same period and serves as the **control**.

This example illustrates a scenario where **each intervention leads to a significant immediate “step” decrease** in the number of maintenance incidents, without slope changes.

### **Intervention 1: Smart‑Lock Upgrade (Fleet‑wide)**

-   **Objective:** Reduce lock‑related maintenance tickets.
-   **Start Date:** April 3, 2023
-   **Duration:** 2 months
-   **Description:** All MetroBike bicycles received new smart‑locks with improved sensors, reducing jammed/failed locking events.
-   **Measurement:** Weekly total maintenance incidents.

### **Intervention 2: Predictive Maintenance Algorithm**

-   **Objective:** Reduce mechanical faults through early detection.
-   **Start Date:** June 5, 2023
-   **Duration:** 2 months
-   **Description:** The operator introduced an AI‑supported predictive system identifying bicycles likely to require maintenance before breakdowns occur.
-   **Measurement:** Weekly total maintenance incidents.

### Controlled Interrupted Time Series Design (2 stage)

**Step 1: Baseline Period**

-   Duration: January 1, 2023 – April 2, 2023

-   Weekly maintenance incident counts collected.

**Step 2: Intervention 1 Period**

-   Duration: April 3, 2023 – June 4, 2023

**Step 3: Intervention 2 Period**

-   Duration: June 5, 2023 – December 31, 2023

The calendar plot below summarises the timeline of the interventions:

```{r calendar, echo = FALSE, warning = FALSE, message = FALSE, fig.align="center", fig.height=10, fig.width=7, fig.retina=3}

bike_calendar <- its_data_bike_programme |> 
    group_by(group_var) |>
    arrange(group_var, Date) |> 
    tidyr::complete(Date = seq(min(Date), max(Date), by = "day")) |> 
    tidyr::fill(Period, .direction = "down")

plot <- phei_calendar(
  bike_calendar,
  date_column = "Date",
  "Period",
  colours = c("#3b5163", "#80bb77", "#afd0f0"),
  ncol = 3
) +
  theme(strip.text = element_text(size = rel(0.5)),
        axis.text = element_text(size = rel(0.5)),
        plot.caption = element_text(size = rel(0.5)),
        legend.text = element_text(size = rel(0.5)))
  

plot$layers[[2]]$aes_params$size <- 3

plot


```

# Step 1) Loading data

Sample data can be loaded from the package for this scenario through the bundled dataset `its_data_bike_programme`.

<br></br>

```{r step_1_load_data}
DT::datatable(its_data_bike_programme, options = list(dom = 'tip'), rownames = FALSE)
```

<br></br>

This sample dataset demonstrates the format your own data should be in.

You can observe that in the `Date` column, that the dates are of equal distance between each element, and that there are two rows for each date, corresponding to either `control` or `treatment` in the `group_var` variable. `control` and `treatment` each have three periods, a `Pre-intervention period` detailing measurements of the outcome prior to any intervention, the first intervention detailed by `Intervention 1) Smart-Lock Upgrade`, and the second intervention, detailed by `Intervention 2) Predictive Maintenance Algorithm`.

<br></br>

# Step 2) Transforming the data

The data frame should be passed to `multipleITScontrol::tranform_data()` with suitable arguments selected, specifying the names of the columns to the required variables and starting intervention time points.

```{r, echo = TRUE, results='hide'}
intervention_dates <- c(as.Date("2023-04-03"), as.Date("2023-06-05"))
transformed_data <- 
  multipleITScontrol::transform_data(df = its_data_bike_programme,
               time_var = "Date",
               group_var = "group_var",
               outcome_var =  "score",
               intervention_dates = intervention_dates)
```

Returns the initial data frame with a few transformed variables needed for interrupted time series.

```{r}
transformed_data
```

# Step 3) Fitting ITS model

The transformed data is then fit using `multipleITScontrol::fit_its_model()`. Required arguments are `transformed_data`, which is simply an unmodified object created from `multipleITScontrol::transform_data()` in the step above; a defined impact model, with current options being either '*slope*', \`*level*, or '*levelslope*', and the number of interventions.

```{r, echo = TRUE, results='hide'}
fitted_ITS_model <-
  multipleITScontrol::fit_its_model(transformed_data = transformed_data,
                                    impact_model = "level",
                                    num_interventions = 2)

fitted_ITS_model
```

Gives a conventional model output from `nlme::gls()`.

```{r}
fitted_ITS_model
```

# Step 4) Analysing ITS model

However, the coefficients given do not make intuitive sense to a lay person. We can call the package's internal `multipleITScontrol::summary_its()` which modifies the summary output by renaming the coefficients, variable names, and other model-related terms to make them easier to interpret in the context of interrupted time series (ITS) analysis.

```{r, echo = TRUE, results='hide'}
my_summary_its_model <- multipleITScontrol::summary_its(fitted_ITS_model)

my_summary_its_model
```

```{r}
my_summary_its_model
```

```{r, echo = TRUE, results='hide'}
summary(my_summary_its_model)
```

```{r}
summary(my_summary_its_model)
```

```{r, echo = TRUE, results='hide'}
sjPlot::tab_model(
  my_summary_its_model,
  dv.labels = "Weekly Total Maintenance Incidents",
  show.se = TRUE,
  collapse.se = TRUE,
  linebreak = FALSE,
  string.est = "Estimate (std. error)",
  string.ci = "95% CI",
  p.style = "numeric_stars"
)
```

```{r}
sjPlot::tab_model(
  my_summary_its_model,
  dv.labels = "Weekly Total Maintenance Incidents",
  show.se = TRUE,
  collapse.se = TRUE,
  linebreak = FALSE,
  string.est = "Estimate (std. error)",
  string.ci = "95% CI",
  p.style = "numeric_stars"
)

a <- coef(my_summary_its_model)[[which(names(coef(my_summary_its_model)) == "A) Control y-axis intercept")]] |> round(2)
# c <- coef(my_summary_its_model)[[which(names(coef(my_summary_its_model)) == "C) Control pre-intervention slope")]] |> round(2)
# d <- coef(my_summary_its_model)[[which(names(coef(my_summary_its_model)) == "D) Pilot pre-intervention slope difference to control")]] |> round(2)
g <- coef(my_summary_its_model)[[which(names(coef(my_summary_its_model)) == "G) Control intervention 1 level")]] |> round(2)
h <- coef(my_summary_its_model)[[which(names(coef(my_summary_its_model)) == "H) Pilot intervention 1 level difference to control")]] |> round(2)
k <- coef(my_summary_its_model)[[which(names(coef(my_summary_its_model)) == "K) Control intervention 2 level")]] |> round(2)
l <- coef(my_summary_its_model)[[which(names(coef(my_summary_its_model)) == "L) Pilot intervention 2 level difference to control")]] |> round(2)

```

The predictor coefficients elucidate a few things:

## **First intervention**:

***G) Control intervention 1 level*** describes the step change that occurs at the intervention break point in the control group at the start of the first intervention (`r g`).

***H) Pilot intervention 1 level*** describes the *difference* in the step change that occurs at the intervention timepoint in the control group for the first intervention compared to the pilot (`r h`).

To ascertain the level for the pilot data, we add to the control coefficient step of the pilot data, the coefficients ***G) Control intervention 1 level*** and ***H) Pilot intervention 1 level difference to control***. ***G*** (`r g`) + H (`r h`) = `r g+h` is the step change during the first intervention for the pilot data.

The coefficient ***H) Pilot intervention 1 level difference to control***, in essence acts a t-test of whether there is a statistically significant difference between the step of the pilot data and the control. Mathemetically it is equivalent to a two-sample t-test.

In the regression model results above, We can see there is a statistically significant difference via the p-value for coefficient H.

## **Second intervention:**

***K) Control intervention 2 level*** describes the step change that occurs at the intervention break point in the control group at the start of the second intervention (`r k`).

***L) Pilot intervention 2 level difference to control"*** describes the *difference* in the step change that occurs at the intervention timepoint in the pilot group for the second intervention compared to the control (`r l`).

In the regression model results above, We can see there is a statistically significant difference via the p-value for coefficient L.

# Step 5) Fitting Predictions

We can fit predictions with the created model which project the pre-intervention period into the post-intervention period by using the model coefficients using `multipleITScontrol::generate_predictions()`.

```{r, echo = TRUE, results='hide'}
transformed_data_with_predictions <- generate_predictions(transformed_data, fitted_ITS_model)

transformed_data_with_predictions
```

```{r}
DT::datatable(transformed_data_with_predictions, options = list(dom = 'tip', scrollX = TRUE), rownames = FALSE)
```

## Step 6) Plotting the results

We can use the predicted values and map the segmented regression lines which compare whether an intervention had a statistically significant difference.

```{r, echo = TRUE, fig.align="center", fig.width=7, fig.height=7, fig.retina=3}
its_plot(model = my_summary_its_model,
         data_with_predictions = transformed_data_with_predictions, 
         time_var = "time",
         intervention_dates = intervention_dates, 
         y_axis = "Weekly Total Maintenance Incidents")
```