Calculates the Shannon-Wiener diversity index (H') for a community, optionally applying a bias correction for small samples.
Details
The naive (MLE) Shannon index is calculated as: $$H' = -\sum_{i=1}^{S} p_i \ln(p_i)$$ where \(p_i = n_i / N\) is the proportion of species \(i\), \(N\) is the total number of individuals, and \(S\) is the number of species observed.
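The formula above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: `shannon_mle_sketch` is a hypothetical name, and dropping zero counts is an assumption made so that \(0 \ln 0\) never has to be evaluated.

```r
# Hypothetical helper for illustration; not the package's shannon() code.
shannon_mle_sketch <- function(n) {
  n <- n[n > 0]       # assumption: absent species (zero counts) are dropped
  p <- n / sum(n)     # proportions p_i = n_i / N
  -sum(p * log(p))    # natural logarithm, as in the formula above
}

comm <- c(10, 5, 8, 3, 12)
shannon_mle_sketch(comm)
#> [1] 1.510657
```

For a perfectly even community the result reduces to \(\ln(S)\), which is a quick sanity check on any implementation.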
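The MLE systematically underestimates the true entropy, and the bias is largest for small samples: to first order it is approximately \(-(S - 1) / (2N)\), so it shrinks only as \(N\) grows relative to the number of species. Three bias-correction methods are available: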
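Miller-Madow (1954): Adds a first-order bias correction term to the naive index: $$H_{MM} = H_{MLE} + \frac{S_{obs} - 1}{2N}$$ where \(H_{MLE}\) is the naive index \(H'\) defined above and \(S_{obs}\) is the number of species observed (\(S\) above).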
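Grassberger (2003): Replaces the logarithm of each count with the digamma function. Written in terms of counts, the naive index is \(\ln(N) - \frac{1}{N} \sum_i n_i \ln(n_i)\); substituting \(\psi(n_i)\) for \(\ln(n_i)\) gives: $$H_G = \ln(N) - \frac{1}{N} \sum_i n_i \psi(n_i)$$ where \(\psi\) is the digamma function.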
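Chao-Shen (2003): Applies a Good-Turing coverage correction with Horvitz-Thompson weighting. Sample coverage is first estimated as $$\hat{C} = 1 - f_1 / N$$ where \(f_1\) is the number of singletons (species observed exactly once). The index is then $$H_{CS} = -\sum_i \frac{\hat{p}_i \ln \hat{p}_i}{1 - (1 - \hat{p}_i)^N}$$ where \(\hat{p}_i = \hat{C} \cdot n_i / N\) is the coverage-adjusted proportion of species \(i\).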
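The three corrections can be written directly from the formulas above. The sketch below uses only base R (`digamma()` is part of base R); the function names are hypothetical and the package's internals may differ. Zero counts are dropped as an assumption, and the edge case where every species is a singleton (so \(\hat{C} = 0\)) is deliberately left unhandled.

```r
# Hypothetical helpers for illustration; not the package's implementation.
mle_sketch <- function(n) {
  p <- n / sum(n)
  -sum(p * log(p))
}

miller_madow_sketch <- function(n) {
  n <- n[n > 0]                             # assumption: drop absent species
  mle_sketch(n) + (length(n) - 1) / (2 * sum(n))
}

grassberger_sketch <- function(n) {
  n <- n[n > 0]
  N <- sum(n)
  log(N) - sum(n * digamma(n)) / N          # psi(n_i) in place of ln(n_i)
}

chao_shen_sketch <- function(n) {
  n <- n[n > 0]
  N <- sum(n)
  f1 <- sum(n == 1)                         # number of singletons
  C_hat <- 1 - f1 / N                       # Good-Turing coverage estimate
  p_hat <- C_hat * n / N                    # coverage-adjusted proportions
  # Horvitz-Thompson weighting: divide by the probability that species i
  # appears in a sample of size N. Not defined when C_hat is 0.
  -sum(p_hat * log(p_hat) / (1 - (1 - p_hat)^N))
}

comm <- c(10, 5, 8, 3, 12)
miller_madow_sketch(comm)
grassberger_sketch(comm)
chao_shen_sketch(comm)
```

With `comm` as defined here, these sketches should reproduce the values shown in the Examples section. Because `comm` contains no singletons, \(\hat{C} = 1\) and the Chao-Shen result differs from the naive index only through the Horvitz-Thompson denominator.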
Bias corrections require integer abundance counts. A warning is
issued if non-integer values are supplied and correction is anything other than "none".
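One way such a check could look is sketched below. This is an assumption about the general approach, not the package's actual validation code: the function name, the tolerance, and the warning text are all hypothetical.

```r
# Hypothetical sketch of an integer-count check; the real check may differ.
check_counts_sketch <- function(n, correction = "none") {
  # Compare against rounded values with a small tolerance so that counts
  # stored as doubles (e.g. 3.0) are still treated as integers.
  non_integer <- any(abs(n - round(n)) > sqrt(.Machine$double.eps))
  if (correction != "none" && non_integer) {
    warning("Bias corrections assume integer counts; non-integer values detected.")
  }
  invisible(non_integer)
}

check_counts_sketch(c(10, 5, 8), correction = "miller_madow")  # silent
check_counts_sketch(c(10.5, 5, 8), correction = "none")        # silent
```

Under this sketch, passing `c(10.5, 5, 8)` with `correction = "chao_shen"` would raise the warning, while relative abundances or biomass values pass silently as long as no correction is requested.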
References
Miller, G.A. & Madow, W.G. (1954). On the maximum likelihood estimate of the Shannon-Wiener measure of information. AFCRC-TR-54-75.
Grassberger, P. (2003). Entropy estimates from insufficient samplings. arXiv:physics/0307138.
Chao, A. & Shen, T.-J. (2003). Nonparametric estimation of Shannon's index of diversity when there are unseen species in sample. Environmental and Ecological Statistics, 10, 429-443.
See also
simpson() for Simpson diversity, deng_entropy_level() for
Deng entropy (a generalization of Shannon).
Examples
comm <- c(10, 5, 8, 3, 12)
shannon(comm)
#> [1] 1.510657
shannon(comm, correction = "miller_madow")
#> [1] 1.563289
shannon(comm, correction = "grassberger")
#> [1] 1.578282
shannon(comm, correction = "chao_shen")
#> [1] 1.521172