Fits an Environmental Vector or Factor onto an Ordination

The function fits environmental vectors or factors onto an ordination. The projections of points onto vectors have maximum correlation with corresponding environmental variables, and the factors show the averages of factor levels. For continuous variables this is equivalent to fitting a linear trend surface (plane in 2D) for a variable (see ordisurf); this trend surface can be presented by showing its gradient (direction of steepest increase) using an arrow. The environmental variables are the dependent variables that are explained by the ordination scores, and each dependent variable is analysed separately.

Usage

# Default S3 method
envfit(ord, env, permutations = 999, strata = NULL, 
   choices=c(1,2),  display = "sites", w, na.rm = FALSE, ...)
# S3 method for class 'formula'
envfit(formula, data, ...)
# S3 method for class 'envfit'
plot(x, choices = c(1,2), labels, arrow.mul, at = c(0,0), 
   axis = FALSE, p.max = NULL, r2.min = NULL, col = "blue", bg, add = TRUE, ...)
# S3 method for class 'envfit'
scores(x, display, choices, arrow.mul=1, tidy = FALSE, ...)
vectorfit(X, P, permutations = 0, strata = NULL, w, ...)
factorfit(X, P, permutations = 0, strata = NULL, w, ...)

Arguments

ord: An ordination object or other structure from which the ordination scores can be extracted (including a data frame or matrix of scores).
env: Data frame, matrix or vector of environmental variables. The variables can be of mixed type (factors, continuous variables) in data frames.
X: Matrix or data frame of ordination scores.
P: Data frame, matrix or vector of environmental variable(s). These must be continuous for vectorfit and factors or characters for factorfit.
permutations: a list of control values for the permutations as returned by the function how, or the number of permutations required, or a permutation matrix where each row gives the permuted indices. Set permutations = 0 to skip permutations.
formula, data: Model formula and data.
na.rm: Remove points with missing values in ordination scores or environmental variables. The operation is casewise: the whole row of data is removed if there is a missing value and na.rm = TRUE.
x: A result object from envfit. For ordiArrowMul and ordiArrowTextXY this must be a two-column matrix (or matrix-like object) containing the coordinates of arrow heads on the two plot axes, and other methods extract such a structure from the envfit results.
choices: Axes to be plotted.
tidy: Return scores that are compatible with ggplot2: all scores are in a single data.frame, score type is identified by factor variable scores ("vectors" or "factors"), the names by variable label. These scores are incompatible with conventional plot functions, but they can be used in ggplot2.
labels: Change plotting labels. The argument should be a list with elements vectors and factors which give the new plotting labels. If either of these elements is omitted, the default labels will be used. If there is only one type of element (only vectors or only factors), the labels can be given as a vector. The default labels can be displayed with the labels command.
arrow.mul: Multiplier for vector lengths. The arrows are automatically scaled similarly as in plot.cca if this is not given in plot and add = TRUE. However, in scores it can be used to adjust arrow lengths when the plot function is not used.
at: The origin of fitted arrows in the plot. If you plot arrows in places other than the origin, you probably have to specify arrow.mul.
axis: Plot axis showing the scaling of fitted arrows.
p.max, r2.min: Maximum estimated \(P\) value and minimum \(r^2\) for displayed variables. To use p.max, you must calculate \(P\) values by setting permutations to a value greater than zero.
col: Colour in plotting.
bg: Background colour for labels. If bg is set, the labels are displayed with ordilabel instead of text. See Examples for using semitransparent background.
add: Results added to an existing ordination plot.
strata: An integer vector or factor specifying the strata for permutation. If supplied, observations are permuted only within the specified strata.
display: In fitting functions these are ordinary site scores or linear combination scores ("lc") in constrained ordination (cca, rda, dbrda). In the scores function they are either "vectors" or "factors" (with synonyms "bp" or "cn", respectively).
w: Weights used in fitting (concerns mainly cca and decorana results which have nonconstant weights).
...: Parameters passed to scores.

Details

Function envfit finds vectors or factor averages of environmental variables. Function plot.envfit adds these to an ordination diagram. If env is a data.frame, envfit uses factorfit for factor variables and vectorfit for other variables. If env is a matrix or a vector, envfit uses only vectorfit. Alternatively, the model can be defined with a simplified model formula, where the left hand side must be an ordination result object or a matrix of ordination scores, and the right hand side lists the environmental variables. The formula interface can be used for easier selection and/or transformation of environmental variables. Only the main effects will be analysed even if interaction terms were defined in the formula.

The ordination results are extracted with scores, and all extra arguments are passed to scores. The fitted models only apply to the results defined when extracting the scores in envfit. For instance, scaling in constrained ordination (see scores.rda, scores.cca) must be set in the same way in envfit and in the plot of the ordination results (see Examples).

The printed output of continuous variables (vectors) gives the direction cosines which are the coordinates of the heads of unit length vectors. In plot these are scaled by their correlation (square root of the column r2) so that “weak” predictors have shorter arrows than “strong” predictors. You can see the scaled relative lengths using command scores. The plotted (and scaled) arrows are further adjusted to the current graph using a constant multiplier: this will keep the relative r2-scaled lengths of the arrows but tries to fill the current plot. You can see the multiplier using ordiArrowMul(result_of_envfit), and set it with the argument arrow.mul.
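The relation between the printed direction cosines and the scaled arrows can be checked by hand. The following is only a sketch: it assumes that site scores from an unconstrained rda of varespec are an acceptable stand-in for any two-column score matrix, and the object names (X, unit_heads, manual, scaled) are illustrative.

```r
library(vegan)
data(varespec, varechem)

# Deterministic site scores from a PCA (assumption: any two-column
# matrix of ordination scores would serve equally well)
X <- scores(rda(varespec), display = "sites", choices = 1:2)

# permutations = 0 skips the permutation test
fit <- envfit(X, varechem, permutations = 0)

unit_heads <- fit$vectors$arrows   # direction cosines (unit length)
r2 <- fit$vectors$r                # squared correlations

# Multiply row i of the unit vectors by sqrt(r2[i])
manual <- unit_heads * sqrt(r2)
scaled <- scores(fit, display = "vectors")

# Should report TRUE if scores() applies the sqrt(r2) scaling
all.equal(manual, scaled, check.attributes = FALSE)

# Squared direction cosines of each variable should sum to one
rowSums(unit_heads^2)
```

The multiplier that fits the arrows to the current plot is not part of this check, because it depends on the dimensions of the open graphics device.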

Functions vectorfit and factorfit can be called directly. Function vectorfit finds directions in the ordination space towards which the environmental vectors change most rapidly and to which they have maximal correlations with the ordination configuration. Function factorfit finds averages of ordination scores for factor levels. Function factorfit treats ordered and unordered factors similarly.
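As an illustration of a direct call, the sketch below compares vectorfit with an ordinary least squares plane fitted by lm. It assumes unweighted fitting and PCA site scores from rda; the names X, plane and b are illustrative. Under these assumptions the direction cosines are expected to match the regression slopes normalised to unit length, and the goodness of fit is expected to match the R-squared of the plane.

```r
library(vegan)
data(varespec, varechem)

# Two-column matrix of site scores (assumption: PCA via rda)
X <- scores(rda(varespec), display = "sites", choices = 1:2)

# Fit a single continuous variable; permutations default to 0
fit <- vectorfit(X, varechem[, "Al", drop = FALSE])

# Linear trend surface: Al as a plane over the two ordination axes
plane <- lm(varechem$Al ~ X)
b <- coef(plane)[-1]               # slopes, i.e. the gradient of the plane

as.vector(fit$arrows)              # direction cosines from vectorfit
as.vector(b / sqrt(sum(b^2)))      # gradient normalised to unit length

unname(fit$r)                      # goodness of fit from vectorfit
summary(plane)$r.squared           # R-squared of the fitted plane
```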

If permutations \(> 0\), the significance of fitted vectors or factors is assessed using permutation of environmental variables. The goodness of fit statistic is the squared correlation coefficient (\(r^2\)). For factors this is defined as \(r^2 = 1 - ss_w/ss_t\), where \(ss_w\) and \(ss_t\) are within-group and total sums of squares. See permutations for additional details on permutation tests in vegan.
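The factor statistic can be reproduced by hand as a rough check. This sketch assumes equal weights and assumes that the sums of squares are pooled over all ordination axes used in the fit; the helper ss and the object names are hypothetical and not part of vegan.

```r
library(vegan)
data(dune, dune.env)

# Unweighted site scores on two axes (assumption: PCA via rda)
X <- scores(rda(dune), display = "sites", choices = 1:2)
g <- dune.env$Moisture

# Hypothetical helper: squared deviations from the column means,
# summed over all axes
ss <- function(m) sum(scale(m, center = TRUE, scale = FALSE)^2)

ss_t <- ss(X)                                   # total sum of squares
ss_w <- sum(vapply(split(seq_len(nrow(X)), g),  # within-group sum of squares
                   function(i) ss(X[i, , drop = FALSE]),
                   numeric(1)))

r2_manual <- 1 - ss_w / ss_t
r2_vegan <- unname(factorfit(X, dune.env["Moisture"])$r)

# Should report TRUE under the assumptions stated above
all.equal(r2_manual, r2_vegan)
```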

The user can supply a vector of prior weights w. If the ordination object has weights, these will be used. In practice this means that the row totals are used as weights with cca or decorana results. If you do not like this, but want to give equal weights to all sites, you should set w = NULL. The fitted vectors are similar to biplot arrows in constrained ordination only when fitted to LC scores (display = "lc") and you set scaling = "species" (see scores.cca). The weighted fitting gives similar results to biplot arrows and class centroids in cca.

The lengths of arrows for fitted vectors are automatically adjusted for the physical size of the plot, and the arrow lengths cannot be compared across plots. For similar scaling of arrows, you must explicitly set the arrow.mul argument in the plot command; see ordiArrowMul and ordiArrowTextXY.
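One possible way to keep arrows comparable is to take the multiplier from one plot and reuse it in another. The sketch below assumes an rda ordination and two arbitrary subsets of varechem; the object names (fit_soil, fit_nutri, mul) are illustrative. Because the multiplier was chosen for the first plot, arrows in the second plot may not fill the plotting region.

```r
library(vegan)
data(varespec, varechem)

ord <- rda(varespec)
fit_soil  <- envfit(ord, varechem[, c("Al", "Fe", "Mn")], permutations = 0)
fit_nutri <- envfit(ord, varechem[, c("N", "P", "K")], permutations = 0)

# First plot: ordiArrowMul needs an open plot, because the multiplier
# is derived from the current plot region
plot(ord, display = "sites")
mul <- ordiArrowMul(fit_soil)
plot(fit_soil, arrow.mul = mul)

# Second plot: reuse the same multiplier so that arrow lengths are
# on the same scale as in the first plot
plot(ord, display = "sites")
plot(fit_nutri, arrow.mul = mul)
```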

The results can be accessed with the scores.envfit function, which returns either the fitted vectors scaled by the correlation coefficient or the centroids of the fitted environmental variables, or a named list of both.

Value

Functions vectorfit and factorfit return lists of classes vectorfit and factorfit which have a print method. The result objects have the following items:

arrows: Arrow endpoints from vectorfit. The arrows are scaled to unit length.
centroids: Class centroids from factorfit.
r: Goodness of fit statistic: Squared correlation coefficient
permutations: Number of permutations.
control: A list of control values for the permutations as returned by the function how.
pvals: Empirical P-values for each variable.

Function envfit returns a list of class envfit with results of vectorfit and factorfit as items.

Function plot.envfit scales the vectors by correlation.

Author

Jari Oksanen. The permutation test derives from the code suggested by Michael Scroggie.

Note

Fitted vectors have become the method of choice in displaying environmental variables in ordination. Indeed, they are the optimal way of presenting environmental variables in Constrained Correspondence Analysis cca, since there they are the linear constraints. In unconstrained ordination the relation between external variables and ordination configuration may be less linear, and therefore other methods than arrows may be more useful. The simplest is to adjust the plotting symbol sizes (cex, symbols) by environmental variables. Fancier methods involve smoothing and regression methods that abound in R, and ordisurf provides a wrapper for some.

Examples

library(vegan)
data(varespec, varechem)
library(MASS)
ord <- metaMDS(varespec)
#> Square root transformation
#> Wisconsin double standardization
#> Run 0 stress 0.1843196 
#> Run 1 stress 0.1825658 
#> ... New best solution
#> ... Procrustes: rmse 0.04163027  max resid 0.1518284 
#> Run 2 stress 0.2152867 
#> Run 3 stress 0.2048307 
#> Run 4 stress 0.220789 
#> Run 5 stress 0.1967393 
#> Run 6 stress 0.2245479 
#> Run 7 stress 0.2096935 
#> Run 8 stress 0.2292645 
#> Run 9 stress 0.1858401 
#> Run 10 stress 0.1852397 
#> Run 11 stress 0.2419377 
#> Run 12 stress 0.2144309 
#> Run 13 stress 0.2032569 
#> Run 14 stress 0.2419374 
#> Run 15 stress 0.2088293 
#> Run 16 stress 0.196245 
#> Run 17 stress 0.2061122 
#> Run 18 stress 0.1976151 
#> Run 19 stress 0.2291377 
#> Run 20 stress 0.1825658 
#> ... New best solution
#> ... Procrustes: rmse 9.102961e-06  max resid 3.217625e-05 
#> ... Similar to previous best
#> *** Best solution repeated 1 times
(fit <- envfit(ord, varechem, perm = 999))
#> 
#> ***VECTORS
#> 
#>             NMDS1    NMDS2     r2 Pr(>r)    
#> N        -0.05730 -0.99836 0.2536  0.047 *  
#> P         0.61971  0.78483 0.1938  0.129    
#> K         0.76644  0.64232 0.1809  0.134    
#> Ca        0.68518  0.72837 0.4119  0.006 ** 
#> Mg        0.63251  0.77455 0.4270  0.005 ** 
#> S         0.19137  0.98152 0.1752  0.161    
#> Al       -0.87161  0.49020 0.5269  0.001 ***
#> Fe       -0.93603  0.35193 0.4450  0.002 ** 
#> Mn        0.79872 -0.60171 0.5231  0.002 ** 
#> Zn        0.61754  0.78654 0.1879  0.126    
#> Mo       -0.90308  0.42947 0.0609  0.516    
#> Baresoil  0.92489 -0.38022 0.2508  0.036 *  
#> Humdepth  0.93283 -0.36031 0.5200  0.001 ***
#> pH       -0.64799  0.76164 0.2308  0.057 .  
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> Permutation: free
#> Number of permutations: 999
#> 
#> 
scores(fit, "vectors")
#>                NMDS1      NMDS2
#> N        -0.02885647 -0.5027950
#> P         0.27283633  0.3455360
#> K         0.32601005  0.2732133
#> Ca        0.43972415  0.4674409
#> Mg        0.41332776  0.5061499
#> S         0.08009763  0.4108201
#> Al       -0.63270239  0.3558383
#> Fe       -0.62443838  0.2347789
#> Mn        0.57766223 -0.4351797
#> Zn        0.26770638  0.3409640
#> Mo       -0.22294695  0.1060239
#> Baresoil  0.46316368 -0.1904068
#> Humdepth  0.67270727 -0.2598321
#> pH       -0.31130341  0.3659022
plot(ord)
plot(fit)
plot(fit, p.max = 0.05, col = "red")

## Adding fitted arrows to CCA. We use "lc" scores, and hope
## that arrows are scaled similarly in cca and envfit plots
ord <- cca(varespec ~ Al + P + K, varechem)
plot(ord, type="p")
fit <- envfit(ord, varechem, perm = 999, display = "lc")
plot(fit, p.max = 0.05, col = "red")

## 'scaling' must be set similarly in envfit and in ordination plot
plot(ord, type = "p", scaling = "sites")
fit <- envfit(ord, varechem, perm = 0, display = "lc", scaling = "sites")
plot(fit, col = "red")


## Class variables, formula interface, and displaying the
## inter-class variability with ordispider, and semitransparent
## white background for labels (semitransparent colours are not
## supported by all graphics devices)
data(dune)
data(dune.env)
ord <- cca(dune)
fit <- envfit(ord ~ Moisture + A1, dune.env, perm = 0)
plot(ord, type = "n")
with(dune.env, ordispider(ord, Moisture, col="skyblue"))
with(dune.env, points(ord, display = "sites", col = as.numeric(Moisture),
                      pch=16))
plot(fit, cex=1.2, axis=TRUE, bg = rgb(1, 1, 1, 0.5))

## Use shorter labels for factor centroids
labels(fit)
#> $vectors
#> [1] "A1"
#> 
#> $factors
#> [1] "Moisture1" "Moisture2" "Moisture4" "Moisture5"
#> 
plot(ord)
plot(fit, labels=list(factors = paste("M", c(1,2,4,5), sep = "")),
   bg = rgb(1,1,0,0.5))