---
title: "Visualization Guide"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Visualization Guide}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 8,
  fig.height = 5
)
```

## Overview

taxdiv provides **7 plot types** built on ggplot2, each designed to answer
a specific analytical question. All plot functions return ggplot objects
that can be further customized.

```{r setup}
library(taxdiv)

# Mediterranean forest community
community <- c(
  Quercus_coccifera    = 25,
  Quercus_infectoria   = 18,
  Pinus_brutia         = 30,
  Pinus_nigra          = 12,
  Juniperus_excelsa    = 8,
  Juniperus_oxycedrus  = 6,
  Arbutus_andrachne    = 15,
  Styrax_officinalis   = 4,
  Cercis_siliquastrum  = 3,
  Olea_europaea        = 10
)

tax_tree <- build_tax_tree(
  species = names(community),
  Genus   = c("Quercus", "Quercus", "Pinus", "Pinus",
              "Juniperus", "Juniperus", "Arbutus", "Styrax",
              "Cercis", "Olea"),
  Family  = c("Fagaceae", "Fagaceae", "Pinaceae", "Pinaceae",
              "Cupressaceae", "Cupressaceae", "Ericaceae", "Styracaceae",
              "Fabaceae", "Oleaceae"),
  Order   = c("Fagales", "Fagales", "Pinales", "Pinales",
              "Pinales", "Pinales", "Ericales", "Ericales",
              "Fabales", "Lamiales")
)
```

## Quick Reference

| Plot | Function | Question it answers |
|------|----------|-------------------|
| Taxonomic Tree | `plot_taxonomic_tree()` | How are species related? |
| Heatmap | `plot_heatmap()` | Which species pairs are closest/farthest? |
| Bubble Chart | `plot_bubble()` | Which species contribute most to diversity? |
| Radar Chart | `plot_radar()` | How do communities compare across all indices? |
| Iteration Plot | `plot_iteration()` | How stable is pTO across resampling? |
| Rarefaction Curve | `plot_rarefaction()` | Is my sampling effort sufficient? |
| Funnel Plot | `plot_funnel()` | Is my community's AvTD/VarTD significant? |

## 1. Taxonomic Tree (Dendrogram)

**Question:** *How are the species in my community taxonomically related?*

Shows the hierarchical classification as a dendrogram. Species on the same
branch share closer taxonomic classification.

```{r tree, fig.width=9, fig.height=5.5, fig.alt="Dendrogram showing species grouped by family"}
plot_taxonomic_tree(tax_tree, community = community,
                    color_by = "Family", label_size = 3.5,
                    title = "Mediterranean Forest - Taxonomic Tree")
```

**How to read:**

- Species emerging from the same branch share a taxonomic group
- Numbers in parentheses show abundance
- Colors indicate family membership
- Longer branches = greater taxonomic distance

**When to use:** At the start of any analysis, to understand the taxonomic
structure of your community before computing indices.

## 2. Taxonomic Distance Heatmap

**Question:** *How distant is each species pair in the taxonomic hierarchy?*

Displays the full pairwise taxonomic distance matrix as a color grid.

```{r heatmap, fig.width=9, fig.height=8, fig.alt="Heatmap of pairwise taxonomic distances"}
plot_heatmap(tax_tree, label_size = 2.8,
             title = "Pairwise Taxonomic Distances")
```

**How to read:**

- Dark red cells = distant species pairs (different orders)
- Light/white cells = closely related species (same genus or family)
- Diagonal is always zero (species compared to itself)
- Symmetric matrix (distance from A to B = distance from B to A)

**When to use:** To identify clusters of closely related species and to
understand which species pairs drive the AvTD and Delta values.

## 3. Bubble Chart

**Question:** *Which species contribute most to taxonomic diversity?*

Each bubble represents a species, positioned by abundance (x-axis) and
average taxonomic distance to all other species (y-axis). Bubble size
reflects the combined contribution.

```{r bubble, fig.width=10, fig.height=7, fig.alt="Bubble chart of species contributions"}
plot_bubble(community, tax_tree, color_by = "Family",
            title = "Species Contributions to Diversity")
```

**How to read:**

- **Upper right**: High abundance + taxonomically distinct = major
  contributor to diversity
- **Upper left**: Rare but taxonomically unique = important for taxonomic
  breadth despite low numbers
- **Lower right**: Abundant but taxonomically common = contributes to
  evenness but not taxonomic distinctness
- **Lower left**: Rare and taxonomically common = minimal contribution

**When to use:** To identify keystone species for conservation priority ---
species in the upper portion of the chart contribute most to taxonomic
diversity and their loss would have the greatest impact.

## 4. Radar Chart (Spider Plot)

**Question:** *How do two or more communities compare across all indices?*

Overlays multiple communities on a single polar coordinate plot where each
axis represents a different diversity index.

```{r radar_data}
# Degraded community for comparison
dominant_community <- c(
  Quercus_coccifera   = 80, Quercus_infectoria  = 5,
  Pinus_brutia        = 3,  Pinus_nigra         = 2,
  Juniperus_excelsa   = 2,  Juniperus_oxycedrus = 1,
  Arbutus_andrachne   = 3,  Styrax_officinalis  = 1,
  Cercis_siliquastrum = 2,  Olea_europaea       = 1
)

communities <- list(
  Diverse  = community,
  Dominant = dominant_community
)
```

```{r radar, fig.width=8, fig.height=8, fig.alt="Radar chart comparing two communities"}
plot_radar(communities, tax_tree,
           title = "Diverse vs Dominant Community")
```

**How to read:**

- Each axis is one diversity index (normalized to 0--1)
- Larger polygon area = higher overall diversity
- Overlapping axes = communities score similarly on that index
- Divergent axes = communities differ on that dimension

**Key insight:** If polygons overlap on AvTD/VarTD but diverge on
Shannon/Simpson, the communities have the same species list but different
abundance distributions. If they diverge on everything, the species
composition itself is different.

**When to use:** For publication-ready multi-community comparisons. The
radar chart reveals which *dimensions* of diversity differ, not just
whether communities are "more" or "less" diverse overall.

## 5. Iteration Plot (Run 2)

**Question:** *How stable are pTO values across stochastic resampling?*

Shows the pTO value at each iteration of Run 2, where different species
subsets are randomly included or excluded.

```{r iteration_data}
run2 <- ozkan_pto_resample(community, tax_tree, n_iter = 101, seed = 42)
```

```{r iteration, fig.width=9, fig.height=5, fig.alt="Scatter plot of TO values across iterations"}
plot_iteration(run2, component = "TO",
               title = "Run 2: TO Values Across Iterations")
```

**How to read:**

- **Grey dots**: pTO value for each random species subset
- **Red line**: Deterministic value (Run 1, all species included)
- **Blue line**: Maximum value found across all iterations

**Key insight:** Points above the red line indicate subcommunities that
are *more diverse* than the full community. This happens when removing
taxonomically redundant species improves the ratio of between-group to
within-group diversity.

**When to use:** After running the Ozkan pipeline, to understand the
distribution of possible diversity values and whether the maximum is
an outlier or representative of many subsets.

## 6. Rarefaction Curve

**Question:** *Is my sampling effort sufficient to capture the community's
diversity?*

Shows how the diversity estimate changes as you increase the number of
individuals sampled, with bootstrap confidence intervals.

```{r rarefaction}
rare <- rarefaction_taxonomic(community, tax_tree,
                               index = "shannon",
                               steps = 10, n_boot = 50, seed = 42)
```

```{r rarefaction_plot, fig.width=8, fig.height=5, fig.alt="Rarefaction curve with confidence interval"}
plot_rarefaction(rare)
```

**How to read:**

- **X-axis**: Number of individuals sampled
- **Y-axis**: Estimated diversity index
- **Shaded band**: 95% bootstrap confidence interval
- **Plateau**: Curve levels off = sampling is sufficient
- **Steep at right edge**: More sampling needed

**When to use:** Before any analysis, to check whether your sampling
effort is adequate. If the curve has not plateaued, additional sampling
would likely reveal new species and change your diversity estimates.

## 7. Funnel Plot

**Question:** *Is my community's AvTD/VarTD significantly different from
random expectation?*

Plots observed values against simulated 95% confidence intervals from
random subsamples of a master species pool.

```{r funnel_data}
data(anatolian_trees)

sim <- simulate_td(
  tax_tree = anatolian_trees,
  s_range = c(3, 15),
  n_sim = 99,
  index = "avtd",
  seed = 42
)
```

```{r funnel, fig.width=9, fig.height=6, fig.alt="Funnel plot with 95% confidence bands"}
spp <- names(community)
obs_avtd <- avtd(spp, tax_tree)

plot_funnel(sim,
            observed = data.frame(
              site = "Mediterranean",
              s = length(spp),
              value = obs_avtd
            ),
            index = "avtd",
            title = "AvTD Significance Test")
```

**How to read:**

- **Blue line**: Mean expected value for each species richness level
- **Grey band**: 95% confidence interval from simulations
- **Red point**: Your observed community

**Inside the funnel** = not significantly different from random expectation.
**Below the funnel** = taxonomically impoverished (fewer higher-level groups
than expected). **Above the funnel** = taxonomically enriched (unusually
high distinctness).

**When to use:** Whenever you need to assess whether a community's
taxonomic structure is statistically unusual. This is the only test in
taxdiv that provides formal significance assessment.

## Customizing Plots

All plot functions return ggplot objects. You can customize them with
standard ggplot2 functions:

```{r custom, fig.width=8, fig.height=5, fig.alt="Customized rarefaction curve with modified theme"}
library(ggplot2)

plot_rarefaction(rare) +
  theme_minimal() +
  labs(subtitle = "Mediterranean forest community") +
  theme(plot.title = element_text(face = "bold", size = 14))
```

Common customizations:

```{r custom_examples, eval=FALSE}
# Change theme
p + theme_classic()

# Modify colors
p + scale_color_brewer(palette = "Set2")

# Adjust text size
p + theme(axis.text = element_text(size = 12))

# Save to file
ggsave("my_plot.png", p, width = 10, height = 6, dpi = 300)
```