Evaluate Statistics with Null Models of Biological Communities
oecosimu.Rd
Function evaluates a statistic or a vector of statistics in
community and evaluates its significance in a series of simulated
random communities. The approach has been used traditionally for
the analysis of nestedness, but the function is more general and can
be used with any statistics evaluated with simulated
communities. Function oecosimu
collects and evaluates the
statistics. The Null model communities are described in
make.commsim
and permatfull
/
permatswap
, the definition of Null models in
nullmodel
, and nestedness statistics in
nestednodf
(which describes several alternative
statistics, including nestedness temperature, \(N0\), checker
board units, nestedness discrepancy and NODF).
Arguments
- comm
Community data, or a Null model object generated by
nullmodel
or an object of classsimmat
(array of permuted matrices fromsimulate.nullmodel
). Ifcomm
is a community data, null model simulationmethod
must be specified. Ifcomm
is anullmodel
, the simulationmethod
is ignored, and ifcomm
is asimmat
object, all other arguments are ignored exceptnestfun
,statistic
andalternative
.- nestfun
Function analysed. Some nestedness functions are provided in vegan (see
nestedtemp
), but any function can be used if it accepts the community as the first argument, and returns either a plain number or a vector or the result in list item with the name defined in argumentstatistic
. See Examples for defining your own functions.- method
Null model method: either a name (character string) of a method defined in
make.commsim
or acommsim
function. This argument is ignored ifcomm
is anullmodel
or asimmat
object. See Details and Examples.- nsimul
Number of simulated null communities (ignored if
comm
is asimmat
object).- burnin
Number of null communities discarded before proper analysis in sequential methods (such as
"tswap"
) (ignored with non-sequential methods or whencomm
is asimmat
object).- thin
Number of discarded null communities between two evaluations of nestedness statistic in sequential methods (ignored with non-sequential methods or when
comm
is asimmat
object).- statistic
The name of the statistic returned by
nestfun
.- alternative
a character string specifying the alternative hypothesis, must be one of
"two.sided"
(default),"greater"
or"less"
. Please note that the \(p\)-value of two-sided test is approximately two times higher than in the corresponding one-sided test ("greater"
or"less"
depending on the sign of the difference).- batchsize
Size in Megabytes of largest simulation object. If a larger structure would be produced, the analysis is broken internally into batches. With default
NA
the analysis is not broken into batches. See Details.- parallel
Number of parallel processes or a predefined socket cluster. With
parallel = 1
uses ordinary, non-parallel processing. The parallel processing is done with parallel package. If you define anestfun
in Windows that needs other R packages than vegan or permute, you must set up a socket cluster before the call.- x
An
oecosimu
result object.- ...
Other arguments to functions.
Details
Function oecosimu
is a wrapper that evaluates a statistic
using function given by nestfun
, and then simulates a series
of null models based on nullmodel
, and evaluates the
statistic on these null models. The vegan packages contains
some nestedness functions that are described separately
(nestedchecker
, nesteddisc
,
nestedn0
, nestedtemp
,
nestednodf
), but many other functions can be used as
long as they are meaningful with simulated communities. An
applicable function must return either the statistic as a plain
number or a vector, or as a list element "statistic"
(like
chisq.test
), or in an item whose name is given in the
argument statistic
. The statistic can be a single number
(like typical for a nestedness index), or it can be a vector. The
vector indices can be used to analyse site (row) or species (column)
properties, see treedive
for an example. Raup-Crick
index (raupcrick
) gives an example of using a
dissimilarities.
The Null model type can be given as a name (quoted character string)
that is used to define a Null model in make.commsim
.
These include all binary models described by Wright et al. (1998),
Jonsson (2001), Gotelli & Entsminger (2003), Miklós &
Podani (2004), and some others. There are several quantitative Null
models, such those discussed by Hardy (2008), and several that are
unpublished (see make.commsim
,
permatfull
, permatswap
for
discussion). The user can also define her own commsim
function (see Examples).
Function works by first defining a nullmodel
with
given commsim
, and then generating a series of
simulated communities with simulate.nullmodel
. A
shortcut can be used for any of these stages and the input can be
Community data (
comm
), Null model function (nestfun
) and the number of simulations (nsimul
).A
nullmodel
object and the number of simulations, and argumentmethod
is ignored.A three-dimensional array of simulated communities generated with
simulate.nullmodel
, and argumentsmethod
andnsimul
are ignored.
The last case allows analysing several statistics with the same simulations.
The function first generates simulations with given
nullmodel
and then analyses these using the
nestfun
. With large data sets and/or large number of
simulations, the generated objects can be very large, and if the
memory is exhausted, the analysis can become very slow and the
system can become unresponsive. The simulation will be broken into
several smaller batches if the simulated nullmodel
objective will be above the set batchsize
to avoid memory
problems (see object.size
for estimating the size of
the current data set). The parallel processing still increases the
memory needs. The parallel processing is only used for evaluating
nestfun
. The main load may be in simulation of the
nullmodel
, and parallel
argument does not help
there.
Function as.ts
transforms the simulated results of sequential
methods into a time series or a ts
object. This allows
using analytic tools for time series in studying the sequences (see
examples). Function toCoda
transforms the simulated results
of sequential methods into an "mcmc"
object of the
coda package. The coda package provides functions for
the analysis of stationarity, adequacy of sample size,
autocorrelation, need of burn-in and much more for sequential
methods, and summary of the results. Please consult the
documentation of the coda package.
Function permustats
provides support to the standard
density
, densityplot
,
qqnorm
and qqmath
functions for
the simulated values.
Value
Function oecosimu
returns an object of class
"oecosimu"
. The result object has items statistic
and
oecosimu
. The statistic
contains the complete object
returned by nestfun
for the original data. The
oecosimu
component contains the following items:
- statistic
Observed values of the statistic.
- simulated
Simulated values of the statistic.
- means
Mean values of the statistic from simulations.
- z
Standardized effect sizes (SES, a.k.a. the \(z\)-values) of the observed statistic based on simulations.
- pval
The \(P\)-values of the statistic based on simulations.
- alternative
The type of testing as given in argument
alternative
.- method
The
method
used innullmodel
.- isSeq
TRUE
ifmethod
was sequential.
References
Hardy, O. J. (2008) Testing the spatial phylogenetic structure of local communities: statistical performances of different null models and test statistics on a locally neutral community. Journal of Ecology 96, 914–926.
Gotelli, N.J. & Entsminger, N.J. (2003). Swap algorithms in null model analysis. Ecology 84, 532–535.
Jonsson, B.G. (2001) A null model for randomization tests of nestedness in species assemblages. Oecologia 127, 309–313.
Miklós, I. & Podani, J. (2004). Randomization of presence-absence matrices: comments and new algorithms. Ecology 85, 86–92.
Wright, D.H., Patterson, B.D., Mikkelson, G.M., Cutler, A. & Atmar, W. (1998). A comparative analysis of nested subset patterns of species composition. Oecologia 113, 1–20.
Note
If you wonder about the name of oecosimu
, look at journal
names in the References (and more in nestedtemp
).
The internal structure of the function was radically changed in
vegan 2.2-0 with introduction of commsim
and
nullmodel
and deprecation of
commsimulator
.
See also
Function oecosimu
currently defines null models with
commsim
and generates the simulated null model
communities with nullmodel
and
simulate.nullmodel
. For other applications of
oecosimu
, see treedive
and
raupcrick
.
See also nestedtemp
(that also discusses other
nestedness functions) and treedive
for another
application.
Examples
## Use the first eigenvalue of correspondence analysis as an index
## of structure: a model for making your own functions.
data(sipoo)
## Traditional nestedness statistics (number of checkerboard units)
oecosimu(sipoo, nestedchecker, "r0")
#> oecosimu object
#>
#> Call: oecosimu(comm = sipoo, nestfun = nestedchecker, method = "r0")
#>
#> nullmodel method ‘r0’ with 99 simulations
#>
#> alternative hypothesis: statistic is less or greater than simulated values
#>
#> Checkerboard Units : 2767
#> C-score (species mean): 2.258776
#>
#> statistic SES mean 2.5% 50% 97.5% Pr(sim.)
#> checkerboards 2767 -18.742 8017.0 7469.0 8021.0 8450.5 0.01 **
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
## sequential model, one-sided test, a vector statistic
out <- oecosimu(sipoo, decorana, "swap", burnin=100, thin=10,
statistic="evals", alt = "greater")
out
#> oecosimu object
#>
#> Call: oecosimu(comm = sipoo, nestfun = decorana, method = "swap",
#> burnin = 100, thin = 10, statistic = "evals", alternative = "greater")
#>
#> nullmodel method ‘swap’ with 99 simulations
#> options: thin 10, burnin 100
#> alternative hypothesis: statistic is greater than simulated values
#>
#>
#> Call:
#> nestfun(veg = comm)
#>
#> Detrended correspondence analysis with 26 segments.
#> Rescaling of axes with 4 iterations.
#> Total inertia (scaled Chi-square): 2.4436
#>
#> DCA1 DCA2 DCA3 DCA4
#> Eigenvalues 0.3822 0.2612 0.1668 0.08723
#> Additive Eigenvalues 0.3822 0.2609 0.1631 0.07650
#> Decorana values 0.4154 0.2465 0.1391 0.04992
#> Axis lengths 2.9197 2.5442 2.7546 1.78074
#>
#>
#> statistic SES mean 50% 95% Pr(sim.)
#> DCA1 0.382249 2.04544 0.32960 0.33249 0.3677 0.01 **
#> DCA2 0.261208 1.77368 0.21549 0.21404 0.2587 0.05 *
#> DCA3 0.166788 0.63257 0.15363 0.15405 0.1907 0.22
#> DCA4 0.087226 -1.69622 0.12533 0.12788 0.1636 0.96
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
## Inspect the swap sequence as a time series object
plot(as.ts(out))
lag.plot(as.ts(out))
acf(as.ts(out))
## Density plot
densityplot(permustats(out), as.table = TRUE, layout = c(1,4))
## Use quantitative null models to compare
## mean Bray-Curtis dissimilarities
data(dune)
meandist <- function(x) mean(vegdist(x, "bray"))
mbc1 <- oecosimu(dune, meandist, "r2dtable")
mbc1
#> oecosimu object
#>
#> Call: oecosimu(comm = dune, nestfun = meandist, method = "r2dtable")
#>
#> nullmodel method ‘r2dtable’ with 99 simulations
#>
#> alternative hypothesis: statistic is less or greater than simulated values
#>
#> statistic SES mean 2.5% 50% 97.5% Pr(sim.)
#> statistic 0.64565 13.84 0.46601 0.44155 0.46756 0.4928 0.01 **
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
## Define your own null model as a 'commsim' function: shuffle cells
## in each row
foo <- function(x, n, nr, nc, ...) {
out <- array(0, c(nr, nc, n))
for (k in seq_len(n))
out[,,k] <- apply(x, 2, function(z) sample(z, length(z)))
out
}
cf <- commsim("myshuffle", foo, isSeq = FALSE, binary = FALSE,
mode = "double")
oecosimu(dune, meandist, cf)
#> oecosimu object
#>
#> Call: oecosimu(comm = dune, nestfun = meandist, method = cf)
#>
#> nullmodel method ‘myshuffle’ with 99 simulations
#>
#> alternative hypothesis: statistic is less or greater than simulated values
#>
#> statistic SES mean 2.5% 50% 97.5% Pr(sim.)
#> statistic 0.64565 3.8862 0.63505 0.62995 0.63508 0.64 0.01 **
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
## Use pre-built null model
nm <- simulate(nullmodel(sipoo, "curveball"), 99)
oecosimu(nm, nestedchecker)
#> oecosimu object
#>
#> Call: oecosimu(comm = nm, nestfun = nestedchecker)
#>
#> nullmodel method ‘curveball’ with 99 simulations
#> options: thin 1, burnin 0
#> alternative hypothesis: statistic is less or greater than simulated values
#>
#> Checkerboard Units : 2767
#> C-score (species mean): 2.258776
#>
#> statistic SES mean 2.5% 50% 97.5% Pr(sim.)
#> checkerboards 2767 1.4459 2710.6 2635.0 2723.0 2762.7 0.05 *
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
## Several chains of a sequential model -- this can be generalized
## for parallel processing (see ?smbind)
nm <- replicate(5, simulate(nullmodel(sipoo, "swap"), 99,
thin=10, burnin=100), simplify = FALSE)
## nm is now a list of nullmodels: use smbind to combine these into one
## nullmodel with several chains
## IGNORE_RDIFF_BEGIN
nm <- smbind(nm, MARGIN = 3)
nm
#> An object of class “simmat”
#> ‘swap’ method (binary, sequential)
#> 18 x 50 matrix
#> Number of permuted matrices = 495
#> Start = 110, End = 1090, Thin = 10 (5 chains)
#>
oecosimu(nm, nestedchecker)
#> oecosimu object
#>
#> Call: oecosimu(comm = nm, nestfun = nestedchecker)
#>
#> nullmodel method ‘swap’ with 495 simulations
#> options: thin 10, burnin 100, chains 5
#> alternative hypothesis: statistic is less or greater than simulated values
#>
#> Checkerboard Units : 2767
#> C-score (species mean): 2.258776
#>
#> statistic SES mean 2.5% 50% 97.5% Pr(sim.)
#> checkerboards 2767 0.71842 2698.1 2572.0 2676.0 2904.7 0.4577
## IGNORE_RDIFF_END
## After this you can use toCoda() and tools in the coda package to
## analyse the chains (these will show that thin, burnin and nsimul are
## all too low for real analysis).