Averaged Subsampled Dissimilarity Matrices
The function computes the dissimilarity matrix of a dataset multiple
  times using vegdist while randomly subsampling the
  dataset each time. The dissimilarity matrices from all subsampling
  iterations are then averaged (mean by default) into a single
  distance matrix. This emulates the behavior of the
  distance matrix calculator within the Mothur microbial ecology toolkit.
Usage
avgdist(x, sample, distfun = vegdist, meanfun = mean,
    transf = NULL, iterations = 100, dmethod = "bray",
    diag = TRUE, upper = TRUE, ...)
Arguments
- x
- Community data matrix. 
- sample
- The subsampling depth to be used in each iteration. Samples that do not meet this threshold will be removed from the analysis, and their identity reported to the user in a warning. 
- distfun
- The dissimilarity matrix function to be used. Default is the vegan function vegdist.
- meanfun
- The calculation to use for the average (mean or median). 
- transf
- Option for transforming the count data before calculating the distance matrix. Any base R transformation function can be used (e.g. sqrt).
- iterations
- The number of random iterations to perform before averaging. Default is 100 iterations. 
- dmethod
- Dissimilarity index to be used with the specified dissimilarity matrix function. Default is Bray-Curtis ("bray"). 
- diag, upper
- Return dissimilarities with diagonal and upper triangle. NB: the default differs from vegdist and returns a symmetric "dist" structure instead of the lower triangle only. However, the object cannot be accessed with matrix indices unless cast to a matrix with as.matrix.
- ...
- Any additional arguments to pass to the distance function or the mean/median function specified. 
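The remark under diag, upper about matrix indices can be illustrated with a small hand-made "dist" object. This is only a sketch in base R: the matrix `m` and the labels `s1` to `s3` are made up for illustration and stand in for a real avgdist result.

```r
# Hypothetical 3 x 3 dissimilarity values standing in for an avgdist() result
m <- matrix(c(0.0, 0.2, 0.5,
              0.2, 0.0, 0.4,
              0.5, 0.4, 0.0),
            nrow = 3,
            dimnames = list(c("s1", "s2", "s3"), c("s1", "s2", "s3")))

# diag = TRUE and upper = TRUE only affect how the object is printed
d <- as.dist(m, diag = TRUE, upper = TRUE)

# A "dist" object is a plain vector of pairwise values, so matrix-style
# indexing fails with an error
indexing_failed <- inherits(try(d["s1", "s3"], silent = TRUE), "try-error")

# Cast to a matrix first, then index by row and column label
dm <- as.matrix(d)
dm["s1", "s3"]
```

After the cast, `dm` is an ordinary symmetric matrix, so subsetting, `head()` and similar tools behave as usual.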
Note
The function builds on the function rrarefy and an
  additional distance matrix function (e.g. vegdist) to
  add more meaningful representations of distances among randomly
  subsampled datasets by presenting the average of multiple random
  iterations. By default, the dissimilarities are computed with vegdist. This
  functionality has been utilized in the Mothur standalone microbial
  ecology toolkit, see https://mothur.org/wiki/Dist.shared.
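The procedure described in this note can be sketched in base R as follows. This is a simplified illustration, not the vegan implementation: the helpers `rarefy_row`, `bray_curtis` and `avg_dist_sketch` are hypothetical names, the subsampling is a plain draw without replacement rather than a call to rrarefy, and Bray-Curtis is computed by hand instead of through vegdist.

```r
# Hypothetical helper: subsample one row of counts to `depth` individuals
rarefy_row <- function(counts, depth) {
  individuals <- rep(seq_along(counts), times = counts)
  picked <- individuals[sample.int(length(individuals), depth)]
  tabulate(picked, nbins = length(counts))
}

# Hypothetical helper: Bray-Curtis dissimilarity between all pairs of rows
bray_curtis <- function(m) {
  n <- nrow(m)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(m[i, ] - m[j, ])) / sum(m[i, ] + m[j, ])
    }
  }
  d
}

# Sketch of the averaging: drop shallow rows, then repeat subsample + distance
avg_dist_sketch <- function(x, depth, iterations = 100, meanfun = mean) {
  x <- as.matrix(x)
  keep <- rowSums(x) >= depth
  if (any(!keep)) {
    warning("removed sampling units below depth: ",
            paste(rownames(x)[!keep], collapse = ", "))
  }
  x <- x[keep, , drop = FALSE]
  runs <- lapply(seq_len(iterations), function(i) {
    sub <- x
    for (r in seq_len(nrow(x))) sub[r, ] <- rarefy_row(x[r, ], depth)
    bray_curtis(sub)
  })
  # Stack the per-iteration matrices and summarise each cell
  stack <- array(unlist(runs), dim = c(nrow(x), nrow(x), iterations))
  out <- apply(stack, c(1, 2), meanfun)
  dimnames(out) <- list(rownames(x), rownames(x))
  as.dist(out, diag = TRUE, upper = TRUE)
}

# Made-up counts: row "c" has only 4 individuals and is dropped at depth 10
set.seed(1)
counts <- matrix(c(10, 0, 5, 5,
                    3, 7, 6, 4,
                    1, 1, 1, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("a", "b", "c"), paste0("sp", 1:4)))
res <- suppressWarnings(avg_dist_sketch(counts, depth = 10, iterations = 5))
```

Passing `meanfun = median` or `meanfun = sd` to the sketch summarises the same stack of matrices cell by cell, mirroring how the meanfun argument is used in the examples.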
Examples
# Import an example count dataset
data(BCI)
# Test the base functionality
mean.avg.dist <- avgdist(BCI, sample = 50, iterations = 10)
# Test the transformation function
mean.avg.dist.t <- avgdist(BCI, sample = 50, iterations = 10, transf = sqrt)
# Test the median functionality
median.avg.dist <- avgdist(BCI, sample = 50, iterations = 10, meanfun = median)
# Print the resulting tables
head(as.matrix(mean.avg.dist))
#>       1     2     3     4     5     6     7     8     9    10    11    12    13
#> 1 0.000 0.554 0.606 0.648 0.616 0.628 0.612 0.608 0.658 0.622 0.642 0.646 0.708
#> 2 0.554 0.000 0.570 0.566 0.626 0.566 0.526 0.552 0.582 0.576 0.590 0.564 0.676
#> 3 0.606 0.570 0.000 0.582 0.578 0.618 0.600 0.568 0.610 0.574 0.610 0.622 0.738
#> 4 0.648 0.566 0.582 0.000 0.624 0.654 0.610 0.572 0.594 0.594 0.596 0.636 0.672
#> 5 0.616 0.626 0.578 0.624 0.000 0.602 0.662 0.626 0.670 0.606 0.678 0.716 0.754
#> 6 0.628 0.566 0.618 0.654 0.602 0.000 0.598 0.566 0.616 0.644 0.576 0.568 0.706
#>      14    15    16    17    18    19    20    21    22    23    24    25    26
#> 1 0.628 0.648 0.602 0.648 0.748 0.684 0.626 0.604 0.644 0.702 0.658 0.618 0.618
#> 2 0.596 0.590 0.558 0.560 0.658 0.584 0.594 0.608 0.546 0.640 0.576 0.624 0.594
#> 3 0.628 0.612 0.616 0.646 0.730 0.660 0.646 0.568 0.636 0.722 0.652 0.672 0.660
#> 4 0.602 0.602 0.618 0.634 0.666 0.622 0.636 0.656 0.568 0.642 0.614 0.654 0.638
#> 5 0.610 0.642 0.630 0.760 0.766 0.698 0.654 0.642 0.738 0.702 0.652 0.688 0.652
#> 6 0.618 0.696 0.626 0.618 0.668 0.634 0.664 0.648 0.630 0.694 0.608 0.682 0.674
#>      27    28    29    30    31    32    33    34    35    36    37    38    39
#> 1 0.702 0.684 0.678 0.632 0.682 0.726 0.692 0.684 0.744 0.668 0.694 0.716 0.698
#> 2 0.630 0.588 0.564 0.578 0.648 0.606 0.618 0.632 0.722 0.636 0.592 0.588 0.618
#> 3 0.668 0.660 0.654 0.676 0.660 0.674 0.694 0.700 0.764 0.684 0.628 0.684 0.708
#> 4 0.644 0.596 0.612 0.608 0.662 0.594 0.610 0.632 0.730 0.648 0.562 0.588 0.636
#> 5 0.712 0.752 0.728 0.734 0.676 0.710 0.698 0.768 0.830 0.678 0.690 0.752 0.760
#> 6 0.628 0.640 0.630 0.668 0.668 0.636 0.662 0.682 0.768 0.678 0.658 0.664 0.700
#>      40    41    42    43    44    45    46    47    48    49    50
#> 1 0.716 0.686 0.630 0.690 0.652 0.668 0.670 0.654 0.678 0.692 0.674
#> 2 0.638 0.672 0.640 0.648 0.626 0.662 0.648 0.634 0.650 0.676 0.668
#> 3 0.716 0.702 0.634 0.672 0.668 0.646 0.720 0.730 0.712 0.698 0.704
#> 4 0.680 0.652 0.618 0.680 0.698 0.696 0.658 0.666 0.658 0.696 0.708
#> 5 0.828 0.688 0.654 0.658 0.690 0.654 0.772 0.720 0.704 0.692 0.656
#> 6 0.724 0.686 0.678 0.670 0.666 0.698 0.734 0.692 0.714 0.734 0.704
head(as.matrix(mean.avg.dist.t))
#>           1         2         3         4         5         6         7
#> 1 0.0000000 0.5073247 0.5451162 0.5511308 0.5863431 0.5693314 0.5561033
#> 2 0.5073247 0.0000000 0.5076177 0.5454488 0.5590474 0.5361407 0.5075526
#> 3 0.5451162 0.5076177 0.0000000 0.5169507 0.5442375 0.5536933 0.5580432
#> 4 0.5511308 0.5454488 0.5169507 0.0000000 0.5556915 0.6029668 0.5894129
#> 5 0.5863431 0.5590474 0.5442375 0.5556915 0.0000000 0.5730661 0.5904097
#> 6 0.5693314 0.5361407 0.5536933 0.6029668 0.5730661 0.0000000 0.5161496
#>           8         9        10        11        12        13        14
#> 1 0.5673559 0.5627535 0.5792425 0.5729159 0.6020539 0.7016723 0.5490011
#> 2 0.5366592 0.5380985 0.5623811 0.5321790 0.5515688 0.6779465 0.5572373
#> 3 0.5073755 0.5536056 0.5389882 0.5801096 0.5868026 0.7260347 0.5816036
#> 4 0.5259962 0.5706750 0.5613498 0.5523447 0.5753112 0.6804780 0.5838035
#> 5 0.5484180 0.5855322 0.5881431 0.6281267 0.6276212 0.7377994 0.5875698
#> 6 0.5529963 0.5552535 0.6215621 0.5618841 0.5336650 0.6716372 0.5641578
#>          15        16        17        18        19        20        21
#> 1 0.6050878 0.5605091 0.6262520 0.6790944 0.6158975 0.5894641 0.6238942
#> 2 0.5492945 0.5571279 0.5683865 0.6719160 0.5864794 0.5817164 0.5785775
#> 3 0.5651808 0.5907658 0.5986803 0.7180656 0.6208023 0.5831755 0.6141896
#> 4 0.5846365 0.5995058 0.6019378 0.6723126 0.6185437 0.5788246 0.6204008
#> 5 0.5886468 0.6313116 0.6908350 0.7310610 0.6738026 0.6056946 0.6248085
#> 6 0.6245763 0.5734551 0.5897524 0.6553604 0.5945132 0.6253476 0.6236726
#>          22        23        24        25        26        27        28
#> 1 0.6087932 0.6864876 0.5674543 0.6124061 0.6000094 0.6287013 0.6246696
#> 2 0.5851407 0.6506247 0.5554757 0.5519726 0.6143138 0.5810440 0.5887911
#> 3 0.6301892 0.7101043 0.5608707 0.5964616 0.6536689 0.6015372 0.6254962
#> 4 0.6116276 0.6570546 0.5806733 0.5942875 0.6756654 0.6170196 0.6088559
#> 5 0.6692645 0.7281868 0.6019467 0.6240673 0.6717579 0.6824310 0.6781349
#> 6 0.5779917 0.6752783 0.5474248 0.6048853 0.6297599 0.6032439 0.6025756
#>          29        30        31        32        33        34        35
#> 1 0.5877126 0.5931557 0.6122546 0.6435203 0.6114740 0.6243358 0.7012874
#> 2 0.5493477 0.5852872 0.5988185 0.6153332 0.5921568 0.6244034 0.6905708
#> 3 0.5678376 0.6210073 0.6158757 0.6377790 0.6371270 0.6555867 0.7323038
#> 4 0.5759931 0.5912841 0.5983615 0.6066556 0.6016788 0.6087876 0.6950808
#> 5 0.6276522 0.6379576 0.6034350 0.6457655 0.6563073 0.6866576 0.7235042
#> 6 0.5816252 0.6242943 0.6152092 0.6137642 0.6116231 0.6246531 0.7312678
#>          36        37        38        39        40        41        42
#> 1 0.5974168 0.6467898 0.6716840 0.6643855 0.6683070 0.6353745 0.6004874
#> 2 0.5903695 0.5752574 0.6187077 0.6192344 0.6573943 0.6571387 0.6027516
#> 3 0.5687077 0.6126409 0.6617885 0.6959650 0.6909293 0.6525653 0.6014759
#> 4 0.6194344 0.5761296 0.6102164 0.6295049 0.6560977 0.5989171 0.6281164
#> 5 0.6056532 0.6441953 0.6593381 0.7110348 0.7252049 0.6568578 0.6156351
#> 6 0.6109414 0.6067715 0.6656711 0.6870593 0.6686455 0.6769139 0.6426746
#>          43        44        45        46        47        48        49
#> 1 0.6310309 0.5900500 0.5774603 0.6474425 0.6488217 0.6157111 0.6070289
#> 2 0.6028820 0.6126007 0.6208170 0.6652427 0.6134982 0.6243098 0.6319083
#> 3 0.6162330 0.6398741 0.6039060 0.7021972 0.6677691 0.6678864 0.6696487
#> 4 0.6197236 0.6379939 0.6156272 0.6582549 0.6062615 0.6139683 0.6466872
#> 5 0.6203780 0.6542059 0.6040162 0.7417980 0.6946591 0.6954255 0.6537119
#> 6 0.6386277 0.6418793 0.6306854 0.6948586 0.6326248 0.6678749 0.6692019
#>          50
#> 1 0.5867459
#> 2 0.6227903
#> 3 0.6365131
#> 4 0.6260672
#> 5 0.6250031
#> 6 0.6423431
head(as.matrix(median.avg.dist))
#>      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
#> 1 0.00 0.56 0.60 0.62 0.66 0.63 0.58 0.62 0.65 0.61 0.59 0.65 0.74 0.64 0.64
#> 2 0.56 0.00 0.53 0.59 0.64 0.63 0.60 0.58 0.58 0.60 0.59 0.62 0.68 0.58 0.61
#> 3 0.60 0.53 0.00 0.61 0.63 0.58 0.56 0.57 0.60 0.55 0.60 0.64 0.76 0.66 0.61
#> 4 0.62 0.59 0.61 0.00 0.64 0.66 0.62 0.58 0.57 0.58 0.63 0.62 0.66 0.55 0.57
#> 5 0.66 0.64 0.63 0.64 0.00 0.57 0.65 0.64 0.62 0.62 0.68 0.67 0.76 0.66 0.65
#> 6 0.63 0.63 0.58 0.66 0.57 0.00 0.58 0.62 0.61 0.66 0.60 0.59 0.74 0.69 0.70
#>     16   17   18   19   20   21   22   23   24   25   26   27   28   29   30
#> 1 0.63 0.65 0.76 0.67 0.65 0.65 0.66 0.68 0.66 0.66 0.62 0.66 0.65 0.66 0.66
#> 2 0.62 0.59 0.69 0.61 0.56 0.64 0.62 0.67 0.60 0.58 0.62 0.64 0.63 0.56 0.63
#> 3 0.61 0.64 0.72 0.69 0.64 0.62 0.62 0.74 0.65 0.65 0.66 0.64 0.61 0.60 0.64
#> 4 0.63 0.62 0.67 0.61 0.60 0.68 0.59 0.67 0.62 0.60 0.66 0.60 0.62 0.59 0.62
#> 5 0.65 0.73 0.81 0.70 0.65 0.64 0.72 0.69 0.71 0.67 0.71 0.68 0.72 0.70 0.67
#> 6 0.63 0.60 0.68 0.63 0.66 0.68 0.63 0.70 0.61 0.64 0.66 0.63 0.65 0.62 0.69
#>     31   32   33   34   35   36   37   38   39   40   41   42   43   44   45
#> 1 0.65 0.69 0.66 0.66 0.74 0.65 0.68 0.69 0.70 0.68 0.67 0.63 0.63 0.64 0.67
#> 2 0.62 0.61 0.61 0.63 0.75 0.66 0.60 0.60 0.67 0.68 0.68 0.62 0.64 0.64 0.69
#> 3 0.63 0.61 0.69 0.68 0.78 0.67 0.61 0.65 0.68 0.72 0.71 0.65 0.66 0.64 0.65
#> 4 0.59 0.65 0.61 0.66 0.79 0.61 0.57 0.61 0.65 0.66 0.68 0.63 0.66 0.68 0.71
#> 5 0.65 0.71 0.70 0.75 0.82 0.70 0.65 0.73 0.79 0.76 0.72 0.64 0.73 0.70 0.68
#> 6 0.68 0.68 0.67 0.68 0.82 0.72 0.71 0.69 0.71 0.75 0.74 0.69 0.67 0.69 0.72
#>     46   47   48   49   50
#> 1 0.69 0.66 0.66 0.66 0.67
#> 2 0.69 0.62 0.67 0.63 0.65
#> 3 0.73 0.70 0.70 0.68 0.72
#> 4 0.68 0.65 0.65 0.69 0.64
#> 5 0.76 0.70 0.75 0.72 0.70
#> 6 0.72 0.71 0.73 0.72 0.71
# Run example to illustrate low variance of mean, median, and stdev results
# Mean and median std dev are around 0.05
sdd <- avgdist(BCI, sample = 50, iterations = 100, meanfun = sd)
summary(mean.avg.dist)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>  0.4640  0.6180  0.6540  0.6543  0.6900  0.8360 
summary(median.avg.dist)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>  0.4600  0.6100  0.6500  0.6501  0.6900  0.8600 
summary(sdd)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#> 0.03849 0.05617 0.05921 0.05919 0.06208 0.07661 
# Test for when subsampling depth excludes some samples
# Return samples that are removed for not meeting depth filter
depth.avg.dist <- avgdist(BCI, sample = 450, iterations = 10)
#> Warning: The following sampling units were removed because they were below sampling depth: 1, 2, 6, 7, 8, 9, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 31, 33, 34, 36, 37, 38, 39, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
# Print the result
depth.avg.dist
#>            3         4         5        10        15        30        32
#> 3  0.0000000 0.3282222 0.3635556 0.2942222 0.3562222 0.4555556 0.4726667
#> 4  0.3282222 0.0000000 0.3775556 0.3173333 0.3480000 0.3908889 0.4273333
#> 5  0.3635556 0.3775556 0.0000000 0.3900000 0.3995556 0.4968889 0.5231111
#> 10 0.2942222 0.3173333 0.3900000 0.0000000 0.3135556 0.4266667 0.4200000
#> 15 0.3562222 0.3480000 0.3995556 0.3135556 0.0000000 0.4535556 0.4657778
#> 30 0.4555556 0.3908889 0.4968889 0.4266667 0.4535556 0.0000000 0.3811111
#> 32 0.4726667 0.4273333 0.5231111 0.4200000 0.4657778 0.3811111 0.0000000
#> 35 0.6591111 0.6266667 0.6966667 0.6817778 0.6573333 0.5157778 0.5893333
#> 40 0.5624444 0.5286667 0.6413333 0.5240000 0.5628889 0.4540000 0.4057778
#>           35        40
#> 3  0.6591111 0.5624444
#> 4  0.6266667 0.5286667
#> 5  0.6966667 0.6413333
#> 10 0.6817778 0.5240000
#> 15 0.6573333 0.5628889
#> 30 0.5157778 0.4540000
#> 32 0.5893333 0.4057778
#> 35 0.0000000 0.4280000
#> 40 0.4280000 0.0000000