---
title: "Getting started with lineagefreq"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with lineagefreq}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4
)
```

## Overview

lineagefreq models pathogen lineage frequency dynamics from genomic surveillance count data. Given a table of lineage-resolved sequence counts over time, the package estimates relative growth advantages, generates short-term frequency forecasts, and provides tools for evaluating model accuracy.

This vignette demonstrates the core workflow using simulated SARS-CoV-2 surveillance data.

## Preparing data

The entry point is `lfq_data()`, which validates and standardizes a count table. The minimum input is a data frame with columns for date, lineage name, and sequence count.

```{r setup}
library(lineagefreq)
data(sarscov2_us_2022)
head(sarscov2_us_2022)
```

```{r lfq-data}
x <- lfq_data(sarscov2_us_2022,
              lineage = variant, date = date,
              count = count, total = total)
x
```

The function computes frequencies, flags low-count time points, and returns a validated `lfq_data` object.

## Fitting a model

`fit_model()` provides a unified interface. The default engine is multinomial logistic regression (MLR).

```{r fit}
fit <- fit_model(x, engine = "mlr")
fit
```

The print output shows each lineage's estimated growth rate relative to the pivot (reference) lineage, which is auto-selected as the most prevalent lineage early in the time series.

## Extracting growth advantages

`growth_advantage()` converts growth rates into interpretable metrics. Four output types are available.

```{r growth-advantage}
ga <- growth_advantage(fit, type = "relative_Rt", generation_time = 5)
ga
```

A relative Rt above 1 indicates a lineage growing faster than the reference. The confidence intervals are derived from the Fisher information matrix.

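As a rough illustration of how a growth rate maps onto a relative Rt, the sketch below uses the common fixed-generation-time approximation, relative Rt = exp(growth rate × generation time), together with a Wald-type interval. The lineage names, growth rates, and standard errors are made-up values, and the per-day scale of the growth rate is an assumption. `growth_advantage()` may use a different scale or parameterization internally, so treat this as a sketch and not as the package's implementation.

```{r relative-rt-sketch}
# Hypothetical per-day growth rates relative to the pivot lineage, with
# hypothetical standard errors. These are NOT output from fit_model().
growth_rate <- c(lineage_A = 0.05, lineage_B = -0.02)
std_error   <- c(lineage_A = 0.01, lineage_B = 0.01)
generation_time <- 5  # days, matching the value used above

# Fixed-generation-time approximation: relative Rt = exp(rate * generation time)
relative_rt <- exp(growth_rate * generation_time)

# Wald-type 95% interval on the growth-rate scale, mapped to the Rt scale
z <- qnorm(0.975)
lower <- exp((growth_rate - z * std_error) * generation_time)
upper <- exp((growth_rate + z * std_error) * generation_time)

data.frame(lineage = names(growth_rate), relative_rt, lower, upper,
           row.names = NULL)
```

Under this approximation a positive growth rate always gives a relative Rt above 1, and a longer assumed generation time pushes the estimate further from 1 in either direction.
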
## Visualizing the fit

`autoplot()` supports four plot types for fitted models.

```{r plot-frequency}
autoplot(fit, type = "frequency")
```

```{r plot-advantage}
autoplot(fit, type = "advantage", generation_time = 5)
```

## Forecasting

`forecast()` projects frequencies forward with uncertainty quantified by parametric simulation.

```{r forecast}
fc <- forecast(fit, horizon = 28)
autoplot(fc)
```

## Detecting emerging lineages

`summarize_emerging()` tests each lineage for statistically significant frequency increases.

```{r emergence}
summarize_emerging(x)
```

## Next steps

- Compare multiple engines with `backtest()`: see `vignette("model-comparison")`.
- Run a full surveillance workflow: see `vignette("surveillance-workflow")`.