--- title: "Getting started with agriReg" author: "agriReg Authors" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true toc_depth: 3 fig_width: 7 fig_height: 4.5 vignette: > %\VignetteIndexEntry{Getting started with agriReg} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.align = "centre", warning = FALSE, message = FALSE ) library(agriReg) ``` ## Overview `agriReg` provides a streamlined workflow for the most common regression tasks in agricultural research: | Task | Function | |------|----------| | Clean field data | `clean_agri_data()` | | Normalise yield | `yield_normalize()` | | Flag outliers | `outlier_flag()` | | Linear / mixed model | `fit_linear()` | | Polynomial selection | `fit_polynomial()` | | Nonlinear growth curves | `fit_nonlinear()` | | Optimum dose / rate | `optimum_dose()` | | Dose-response (LL.4/5) | `fit_dose_response()` | | Effective dose (ED50…) | `ed_estimates()` | | Compare models | `compare_models()` | | Goodness-of-fit stats | `gof_stats()` | | Residual diagnostics | `residual_check()` | --- ## 1 · Data preparation Load one of the bundled example datasets and inspect it. ```{r load-data} wheat <- load_example_data("wheat_trial") head(wheat) ``` ### Cleaning and outlier detection ```{r clean} wheat_clean <- clean_agri_data(wheat, yield_col = "yield", flag_outliers = TRUE) summary(wheat_clean) ``` ### Normalisation Normalise yield within each block using z-scoring: ```{r normalise} wheat_norm <- yield_normalize(wheat_clean, yield_col = "yield", method = "zscore", group_by = "block") head(wheat_norm[, c("block", "yield", "yield_norm")]) ``` ### Advanced outlier flagging The standalone `outlier_flag()` supports three methods: ```{r outlier} wheat_clean <- outlier_flag(wheat_clean, col = "yield", method = "modz") table(wheat_clean$yield_outlier) ``` --- ## 2 · Linear models ### Simple OLS ```{r lm-simple} m_lm <- fit_linear(wheat_clean, "yield ~ nitrogen + phosphorus") print(m_lm) ``` ### Diagnostic plots The `plot()` method returns `ggplot2` objects — pipe them, save them, or patch them together with `patchwork`. ```{r lm-plots, fig.height=4} plot(m_lm, type = "fit") plot(m_lm, type = "residuals") plot(m_lm, type = "qq") ``` ### Full residual panel ```{r residual-panel, fig.height=6} residual_check(m_lm) ``` ### Mixed-effects model Add a random intercept per block to account for field heterogeneity: ```{r lmer} m_lmer <- fit_linear(wheat_clean, formula = "yield ~ nitrogen + phosphorus + variety", random = "(1|block)") summary(m_lmer) ``` ### Polynomial selection Automatically test degrees 1–3 and return the best-fitting model: ```{r poly} m_poly <- fit_polynomial(wheat_clean, x_col = "nitrogen", y_col = "yield", max_degree = 3) coef(m_poly) ``` --- ## 3 · Nonlinear models Load the maize growth dataset: ```{r maize-load} maize <- load_example_data("maize_growth") # Use well-watered plants only maize_ww <- maize[maize$treatment == "WW", ] head(maize_ww) ``` ### Logistic growth The 3-parameter logistic is the most common growth curve in agronomy. Parameter interpretation: - **Asym**: upper asymptote (maximum biomass) - **xmid**: inflection point (day of maximum growth rate) - **scal**: scale parameter (related to the steepness) ```{r logistic} m_log <- fit_nonlinear(maize_ww, x_col = "days", y_col = "biomass_g", model = "logistic") summary(m_log) ``` ```{r logistic-plot, fig.height=4} plot(m_log) ``` ### Gompertz curve The Gompertz is often a better fit for biological growth that is asymmetric around the inflection point: ```{r gompertz, eval = FALSE} m_gomp <- fit_nonlinear(maize_ww, "days", "biomass_g", "gompertz") coef(m_gomp) ``` ### Asymptotic (Mitscherlich) model Classic model for nutrient-response data where yield approaches a plateau: ```{r asymptotic} m_asym <- fit_nonlinear(wheat_clean, x_col = "nitrogen", y_col = "yield", model = "asymptotic") plot(m_asym) ``` ### Quadratic and optimum dose The quadratic polynomial is widely used to find the economic optimum nitrogen rate: ```{r quadratic} m_quad <- fit_nonlinear(wheat_clean, x_col = "nitrogen", y_col = "yield", model = "quadratic") opt <- optimum_dose(m_quad) cat(sprintf("Optimum nitrogen rate : %.1f kg/ha\n", opt["x_opt"])) cat(sprintf("Predicted max yield : %.2f t/ha\n", opt["y_max"])) ``` ### Linear-plateau model Suitable when yield increases linearly up to a critical input level and then plateaus (common for phosphorus response): ```{r lp} m_lp <- fit_nonlinear(wheat_clean, x_col = "nitrogen", y_col = "yield", model = "linear_plateau") cat("Critical point (plateau onset):", optimum_dose(m_lp)["x_opt"], "kg/ha\n") ``` --- ## 4 · Dose-response models Load the herbicide dataset: ```{r herb-load} herb <- load_example_data("herbicide_trial") # Subset to one species amaranth <- herb[herb$species == "Amaranth", ] ``` ### Fit a 4-parameter log-logistic curve The LL.4 model is the industry standard for herbicide dose-response: $$y = c + \frac{d - c}{1 + \exp(b(\log(x) - \log(e)))}$$ where *b* = slope, *c* = lower asymptote, *d* = upper asymptote, *e* = ED50. ```{r drc} m_dr <- fit_dose_response(amaranth, dose_col = "dose_g_ha", resp_col = "weed_control_pct", fct = "LL.4") summary(m_dr) ``` ```{r drc-plot, fig.height=4} plot(m_dr, log_dose = TRUE) ``` ### Effective dose estimates ```{r ed} ed_estimates(m_dr, respLev = c(10, 50, 90)) ``` --- ## 5 · Model comparison Compare all models fitted to the wheat data: ```{r compare} cmp <- compare_models( ols = m_lm, mixed = m_lmer, asymptotic = m_asym, quadratic = m_quad, lp = m_lp ) print(cmp) ``` The table is ranked by AIC. `delta_AIC` shows the difference from the best model — values below 2 are considered equivalent; above 10 is strong evidence against the weaker model. --- ## 6 · Extracting results for reports All `agriReg` objects are compatible with `broom` for tidy output: ```{r broom} library(broom) tidy(m_lm$fit) glance(m_lm$fit) ``` ### Saving plots ```{r save-plot, eval=FALSE} p <- plot(m_log) ggplot2::ggsave("logistic_growth.png", p, width = 7, height = 4.5, dpi = 300) ``` --- ## Session information ```{r session} sessionInfo() ```