| Title: | Spatial Phylogenetic Analysis |
|---|---|
| Description: | Analyze spatial phylogenetic diversity patterns. Use your data on an evolutionary tree and geographic distributions of the terminal taxa to compute diversity and endemism metrics, test significance with null model randomization, analyze community turnover and biotic regionalization, and perform spatial conservation prioritizations. All functions support quantitative community data in addition to binary data. |
| Authors: | Matthew Kling [aut, cre, cph] (ORCID: <https://orcid.org/0000-0001-9073-4240>) |
| Maintainer: | Matthew Kling <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.4.0.9000 |
| Built: | 2026-05-28 18:07:44 UTC |
| Source: | https://github.com/matthewkling/phylospatial |
Nonlinear function that converts proportion of range conserved into conservation "benefit."
benefit(x, lambda = 1)benefit(x, lambda = 1)
x |
Fraction of taxon range protected (value between 0 and 1). |
lambda |
Shape parameter. |
Value between 0 and 1.
This function runs ape::dist.nodes() with some additional filtering and sorting. By default,
it returns distances between every pair of non-nested clades, i.e. every pair of collateral (non-lineal) nodes
including terminals and internal nodes. Package phytools is required for this function.
clade_dist(tree, lineal = FALSE, edges = TRUE)clade_dist(tree, lineal = FALSE, edges = TRUE)
tree |
A phylogeny of class |
lineal |
Logical indicating whether to retain distances for pairs of nodes that are lineal ancestors/descendants.
If |
edges |
Logical indicating whether to return a distance matrix with a row for every edge in |
A matrix of pairwise distances between nodes.
if(requireNamespace("phytools", quietly = TRUE)){ clade_dist(ape::rtree(10)) }if(requireNamespace("phytools", quietly = TRUE)){ clade_dist(ape::rtree(10)) }
Get example phylospatial data set based on a phylogeny and modeled distributions of 443 moss species
across California. This data set is a coarser version of data from Kling et al. (2024). It contains
occurrence probabilities, and is available in raster or polygon spatial formats.
moss(format = "raster")moss(format = "raster")
format |
Either "raster" (default) or "polygon" |
a phylospatial object
Kling, Gonzalez-Ramirez, Carter, Borokini, and Mishler (2024) bioRxiv, https://doi.org/10.1101/2024.12.16.628580.
moss()moss()
This function creates a phylospatial object. This is the core data type in the phylospatial library, and
is a required input to most other functions in the package. The two essential components of a spatial phylogenetic object
are a phylogenetic tree and an community data set.
phylospatial( comm, tree = NULL, spatial = NULL, data_type = c("auto", "probability", "binary", "abundance", "other"), clade_fun = NULL, build = TRUE, check = TRUE, area_tol = 0.01, rescale = c("sum1", "tip1", "raw") )phylospatial( comm, tree = NULL, spatial = NULL, data_type = c("auto", "probability", "binary", "abundance", "other"), clade_fun = NULL, build = TRUE, check = TRUE, area_tol = 0.01, rescale = c("sum1", "tip1", "raw") )
comm |
Community data representing the distribution of terminal taxa across sites. Can be a matrix with a column per terminal and
a row per site, a SpatRaster with one layer per terminal, or a |
tree |
Phylogeny of class phylo. Terminals whose names do not match |
spatial |
An optional |
data_type |
Character giving the data type of |
clade_fun |
Function to calculate the local community weight for a clade based on community weights for tips found in a given location.
Must be either NULL (the default, in which case the default function for the selected |
build |
Logical indicating whether |
check |
Logical indicating whether community data should be validated. Default is TRUE. |
area_tol |
Numeric value giving tolerance for variation in the area of sites. Default is |
rescale |
Character giving the branch-length rescaling method applied
to the tree during construction. Must be |
This function formats the input data as a phylospatial object. Beyond validating, cleaning, and restructing the data, the main operation
it performs is to compute community occurrence data for every internal clade on the tree.
Unoccupied sites (rows where no taxon occurs) are automatically removed from the community matrix during construction to improve
performance. The original site indices of occupied rows are stored in ps$occupied, and the total number of sites (including
unoccupied) in ps$n_sites, enabling reconstruction of full-extent spatial outputs. Functions that return spatial results
automatically expand occupied-only data back to the full spatial extent.
If your data are in the form of occurrence point localities (e.g. from GBIF or BIEN) rather
than a gridded community data set, use ps_grid() to rasterize the points onto a spatial grid
before passing the result to this function.
A phylospatial object, which is a list containing the following elements:
Character indicating the community data type
Phylogeny of class phylo
Community matrix containing only occupied sites, including a column for every terminal taxon and every larger clade. Column order corresponds to tree edge order.
A SpatRaster or sf providing spatial coordinates for all sites (including unoccupied).
May be missing if no spatial data was supplied.
Integer vector of row indices identifying which sites in the original data are occupied.
Total number of sites in the original data, including unoccupied.
A community dissimilarity matrix of class dist indicating pairwise phylogenetic dissimilarity
between occupied sites. Missing unless ps_add_dissim() is called.
Character indicating which branch-length rescaling was applied.
ps_grid() to convert occurrence point data into a binary or abundance raster that
can be used with phylospatial.
# load species distribution data and phylogeny comm <- terra::rast(system.file("extdata", "moss_comm.tif", package = "phylospatial")) tree <- ape::read.tree(system.file("extdata", "moss_tree.nex", package = "phylospatial")) # construct `phylospatial` object ps <- phylospatial(comm, tree) ps# load species distribution data and phylogeny comm <- terra::rast(system.file("extdata", "moss_comm.tif", package = "phylospatial")) tree <- ape::read.tree(system.file("extdata", "moss_tree.nex", package = "phylospatial")) # construct `phylospatial` object ps <- phylospatial(comm, tree) ps
Show a plot illustrating alternative values for the lambda parameter in ps_prioritize. Lambda determines the shape of
the "benefit" function that determines the conservation value of protecting a given proportion of the geographic range of a
species or clade. Positive values place a higher priority on protecting additional populations of largely unprotected taxa,
whereas negative values place a higher priority on protecting additional populations of relatively well-protected taxa. The
default value used by ps_prioritize is 1.
plot_lambda(lambda = c(-1, -0.5, 0, 0.5, 2, 1))plot_lambda(lambda = c(-1, -0.5, 0, 0.5, 2, 1))
lambda |
A vector of lambda values to plot |
Plots a figure
plot_lambda() plot_lambda(seq(0, 3, .1))plot_lambda() plot_lambda(seq(0, 3, .1))
phylospatial objectPlot a phylospatial object
## S3 method for class 'phylospatial' plot(x, y = c("tree", "comm"), max_taxa = 12, ...)## S3 method for class 'phylospatial' plot(x, y = c("tree", "comm"), max_taxa = 12, ...)
x |
|
y |
Either |
max_taxa |
Integer giving the maximum number of taxon ranges to plot if |
... |
Additional arguments passed to plotting methods, depending on |
A plot of the tree or community data.
ps <- ps_simulate(20, 20, 20) plot(ps, "tree") plot(ps, "comm")ps <- ps_simulate(20, 20, 20) plot(ps, "tree") plot(ps, "comm")
phylospatial objectThis function calculates pairwise phylogenetic dissimilarity between communities and returns the phylospatial
object with the dissimilarity data added as an element called dissim. See ps_dissim for details.
ps_add_dissim(ps, method = "sorensen", ...)ps_add_dissim(ps, method = "sorensen", ...)
ps |
|
method |
Dissimilarity metric; see ps_dissim for details. |
... |
Additional arguments passed to ps_dissim, such as |
ps with a new dissim element added.
ps <- ps_simulate(data_type = "prob") ps_add_dissim(ps) ps_add_dissim(ps, fun = "vegdist", method = "jaccard", endemism = TRUE)ps <- ps_simulate(data_type = "prob") ps_add_dissim(ps) ps_add_dissim(ps, fun = "vegdist", method = "jaccard", endemism = TRUE)
This function classifies sites into areas of significant endemism according to the scheme of Mishler et al. (2014). Categorization is based on randomization quantile values for PE, RPE, and CE (which Mishler et al. call "PE on the comparison tree").
ps_canape(rand, alpha = 0.05)ps_canape(rand, alpha = 0.05)
rand |
An object returned by running |
alpha |
Numeric value between 0 and 1 giving the one-tailed p-value threshold to use when determining significance. |
Endemism significance categories are defined as follows:
Endemism not significant: neither PE nor CE are significantly high at alpha.
Significant neoendemism: PE or CE are significantly high at alpha; RPE significantly low at alpha / 2 (two-tailed test).
Significant paleoendemism: PE or CE are significantly high at alpha; RPE significantly high at alpha / 2 (two-tailed test)..
Significant mixed-endemism: PE or CE are significantly high at alpha; RPE not significant.
Significant super-endemism: PE or CE are significantly high at alpha / 5; RPE not significant.
An object of the same class as rand containing a variable called "canape", with values 0-4
corresponding to not-significant, mixed-, super-, neo-, and paleo-endemism, respectively.
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
# classic CANAPE using binary data and the curveball algorithm # (note that a real analysis would require a much higher `n_rand`) set.seed(123456) ps <- ps_simulate(data_type = "binary") rand <- ps_rand(ps, metric = c("PE", "RPE", "CE"), fun = "nullmodel", method = "curveball", n_rand = 25, burnin = 10000, progress = FALSE) canape <- ps_canape(rand) terra::plot(canape)# classic CANAPE using binary data and the curveball algorithm # (note that a real analysis would require a much higher `n_rand`) set.seed(123456) ps <- ps_simulate(data_type = "binary") rand <- ps_rand(ps, metric = c("PE", "RPE", "CE"), fun = "nullmodel", method = "curveball", n_rand = 25, burnin = 10000, progress = FALSE) canape <- ps_canape(rand) terra::plot(canape)
This function is a wrapper around canaper::cpr_rand_test(). It only works with binary community data.
It is largely redundant with ps_rand() and ps_canape(), which are more flexible in supporting data sets
with non-binary community data. However, this function runs faster, and supports custom null models via
make.commsim.
ps_canaper(ps, null_model = "curveball", spatial = TRUE, ...)ps_canaper(ps, null_model = "curveball", spatial = TRUE, ...)
ps |
phylospatial object |
null_model |
see |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a vector (FALSE). |
... |
further arguments passed to |
This function runs canaper::cpr_rand_test(); see the help for that function for details.
It also runs canaper::cpr_classify_endem() on the result, and includes the resulting classification as an
additional variable, 'endem_type', in the output. 'endem_type' values 0-4 correspond to not-significant, neo,
paleo, mixed, and super endemisim, respectively.
A matrix or SpatRaster, or sf with a column or layer for each metric.
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
Nitta, J. H., Laffan, S. W., Mishler, B. D., & Iwasaki, W. (2023). canaper: categorical analysis of neo‐and paleo‐endemism in R. Ecography, 2023(9), e06638.
if(requireNamespace("canaper")){ ps <- ps_simulate(data_type = "binary") terra::plot(ps_canaper(ps)$pd_obs_p_upper) }if(requireNamespace("canaper")){ ps <- ps_simulate(data_type = "binary") terra::plot(ps_canaper(ps)$pd_obs_p_upper) }
This function calculates pairwise phylogenetic dissimilarity between communities. It works with both binary and
quantitative community data sets. A wide range of phylogentic community dissimilarity metrics are supported,
including phylogenetic Sorensen's and Jaccard's distances, turnover and nestedness components of Sorensen's distance
(Baselga & Orme, 2012), and phylogenetic versions of all community distance indices provided through the vegan library.
The function also includes options to scale the community matrix in order to focus the analysis on endemism and/or
on proportional differences in community composition. The results from this function can be visualized using
ps_rgb or ps_regions, or used in a variety of statistical analyses.
ps_dissim( ps, method = "sorensen", fun = c("vegdist", "designdist", "chaodist"), endemism = FALSE, normalize = FALSE, tips_only = FALSE, n_cores = NULL, ... )ps_dissim( ps, method = "sorensen", fun = c("vegdist", "designdist", "chaodist"), endemism = FALSE, normalize = FALSE, tips_only = FALSE, n_cores = NULL, ... )
ps |
phylospatial object. |
method |
Character indicating the dissimilarity index to use:
|
fun |
Character indicating which general distance function from the |
endemism |
Logical indicating whether community values should be divided by column totals (taxon range sizes) to derive endemism before computing distances. |
normalize |
Logical indicating whether community values should be divided by row totals (community sums) before
computing distances. If |
tips_only |
Logical indicating whether to compute dissimilarity using only terminal taxa ( |
n_cores |
Integer controlling the computation backend. The default |
... |
Additional arguments passed to |
A pairwise phylogenetic dissimilarity matrix of class dist.
Graham, C. H., & Fine, P. V. (2008). Phylogenetic beta diversity: linking ecological and evolutionary processes across space in time. Ecology Letters, 11(12), 1265-1277.
Baselga, A., & Orme, C. D. L. (2012). betapart: an R package for the study of beta diversity. Methods in Ecology and Evolution, 3(5), 808-812.
Pavoine, S. (2016). A guide through a family of phylogenetic dissimilarity measures among sites. Oikos, 125(12), 1719-1732.
# example data set: ps <- ps_simulate(n_tips = 50) # The default arguments give Sorensen's quantitative dissimilarity index # (a.k.a. Bray-Curtis distance): d <- ps_dissim(ps) # Specifying a custom formula explicitly via `designdist`; # (this is the Bray-Curtis formula, so it's equivalent to the prior example) d <- ps_dissim(ps, method = "(b+c)/(2*a+b+c)", fun = "designdist", terms = "minimum", abcd = TRUE) # Alternative arguments can specify a wide range of dissimilarity measures; # here's endemism-weighted Jaccard's dissimilarity: d <- ps_dissim(ps, method = "jaccard", endemism = TRUE)# example data set: ps <- ps_simulate(n_tips = 50) # The default arguments give Sorensen's quantitative dissimilarity index # (a.k.a. Bray-Curtis distance): d <- ps_dissim(ps) # Specifying a custom formula explicitly via `designdist`; # (this is the Bray-Curtis formula, so it's equivalent to the prior example) d <- ps_dissim(ps, method = "(b+c)/(2*a+b+c)", fun = "designdist", terms = "minimum", abcd = TRUE) # Alternative arguments can specify a wide range of dissimilarity measures; # here's endemism-weighted Jaccard's dissimilarity: d <- ps_dissim(ps, method = "jaccard", endemism = TRUE)
This function calculates a variety of alpha phylogenetic diversity metrics, including measures of richness, regularity, and divergence. If continuous community data (probabilities or abundances) are provided, they are used in calculations, giving quantitative versions of the classic binary metrics.
ps_diversity(ps, metric = c("PD", "PE", "CE", "RPE"), spatial = TRUE)ps_diversity(ps, metric = c("PD", "PE", "CE", "RPE"), spatial = TRUE)
ps |
phylospatial object (created by |
metric |
Character vector containing the abbreviation for one or more diversity metrics listed in
the details below. Can also specify |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE)? |
The function calculates the following metrics. Endemism-weighted versions of most metrics are available. All metrics are weighted by occurrence probability or abundance, if applicable.
Richness measures:
TD—Terminal Diversity, i.e. richness of terminal taxa (in many cases these are species):
TE—Terminal Endemism, i.e. total endemism-weighted diversity of terminal taxa, a.k.a. "weighted endemism":
CD—Clade Diversity, i.e. richness of taxa at all levels (equivalent to PD on a cladogram):
CE—Clade Endemism, i.e. total endemism-weighted diversity of taxa at all levels (equivalent to PE on a cladrogram):
PD—Phylogenetic Diversity, i.e. total branch length occurring in a site:
PE—Phylogenetic Endemism, i.e. endemism-weighted PD:
ShPD—Shannon Phylogenetic Diversity, a.k.a. "phylogenetic entropy" (this version is the log of the "effective diversity" version based on Hill numbers):
ShPE—Shannon phylogenetic Endemism, an endemism-weighted version of ShPD:
SiPD—Simpson Phylogenetic Diversity:
SiPE—Simpson Phylogenetic Endemism, an endemism-weighted version of SiPD:
Divergence measures:
RPD—Relative Phylogenetic Diversity, i.e. mean branch segment length (equivalent to PD / CR):
RPE—Relative Phylogenetic Endemism, i.e. mean endemism-weighted branch segment length (equivalent to PE / CE):
MPDT—Mean Pairwise Distance between Terminals, i.e. the classic MPD metric. This is the average of cophenetic distances, weighted by .
MPDN—Mean Pairwise Distance between Nodes, an experimental version of MPD that considers distances between every pair of non-nested clades, putting more weight on deeper branches than does MPDT.
This is the mean of distances between all collateral (non-lineal) node pairs including terminal and internal nodes, weighted by .
Note that divergence can also be assessed by using ps_rand() to run null model analyses of richness measures like PD.
Regularity measures:
VPDT—Variance in Pairwise Distances between Terminals, i.e. the classic VPD metric, weighted by .
VPDN—Variance in Pairwise Distances between Nodes, i.e. MPDN but variance.
In the above equations, indexes all taxa including terminals and larger clades; indexes terminals only; is the occurrence value
(binary, probability, or abundance) of clade/terminal in a given community; is the
length of the phylogenetic branch segment unique to clade ; and is the sum of across all sites. For Shannon
and Simpson indices, only nonzero elements of are used, , and .
A matrix, sf data frame, or SpatRaster with a column or layer for each requested diversity metric.
Faith, D. P. (1992). Conservation evaluation and phylogenetic diversity. Biological Conservation, 61(1), 1-10.
Laffan, S. W., & Crisp, M. D. (2003). Assessing endemism at multiple spatial scales, with an example from the Australian vascular flora. Journal of Biogeography, 30(4), 511-520.
Rosauer, D. A. N., Laffan, S. W., Crisp, M. D., Donnellan, S. C., & Cook, L. G. (2009). Phylogenetic endemism: a new approach for identifying geographical concentrations of evolutionary history. Molecular Ecology, 18(19), 4061-4072.
Allen, B., Kon, M., & Bar-Yam, Y. (2009). A new phylogenetic diversity measure generalizing the Shannon index and its application to phyllostomid bats. The American Naturalist, 174(2), 236-243.
Chao, A., Chiu, C. H., & Jost, L. (2010). Phylogenetic diversity measures based on Hill numbers. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1558), 3599-3609.
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
Tucker, C. M., Cadotte, M. W., Davies, T. J., et al. (2016) A guide to phylogenetic metrics for conservation, community ecology and macroecology. Biological Reviews, 92(2), 698-715.
Kling, M. M., Mishler, B. D., Thornhill, A. H., Baldwin, B. G., & Ackerly, D. D. (2019). Facets of phylodiversity: evolutionary diversification, divergence and survival as conservation targets. Philosophical Transactions of the Royal Society B, 374(1763), 20170397.
ps <- ps_simulate() div <- ps_diversity(ps) terra::plot(div)ps <- ps_simulate() div <- ps_diversity(ps) terra::plot(div)
Takes a matrix or vector computed on occupied sites only and expands it
back to the full site grid, inserting NA for unoccupied sites. This is
useful when performing custom analyses on ps$comm (which contains only
occupied sites) and mapping the results back to the full raster or spatial
object.
ps_expand(ps, x, spatial = FALSE)ps_expand(ps, x, spatial = FALSE)
ps |
A |
x |
A matrix (or vector) with |
spatial |
Logical: if |
If spatial = FALSE, a matrix with ps$n_sites rows and NA for unoccupied
sites. If spatial = TRUE, a SpatRaster or sf object.
ps <- ps_simulate() # custom analysis on the occupied-only community matrix site_totals <- matrix(rowSums(ps$comm), ncol = 1) colnames(site_totals) <- "total" # expand to full extent as a matrix ps_expand(ps, site_totals) # expand and convert to spatial ps_expand(ps, site_totals, spatial = TRUE)ps <- ps_simulate() # custom analysis on the occupied-only community matrix site_totals <- matrix(rowSums(ps$comm), ncol = 1) colnames(site_totals) <- "total" # expand to full extent as a matrix ps_expand(ps, site_totals) # expand and convert to spatial ps_expand(ps, site_totals, spatial = TRUE)
Calculate pairwise geographic distances between occupied sites in a phylospatial object.
The result is a dist object with the same dimensions and site ordering as ps_dissim(),
making it straightforward to compare phylogenetic dissimilarity with geographic distance
(e.g., for distance-decay analyses or Mantel tests).
ps_geodist(ps)ps_geodist(ps)
ps |
A |
For raster data, distances are computed between cell centroids. For polygon or line sf data,
distances are computed between geometry centroids. Great-circle distances are used automatically
when the data has a geographic (lon/lat) CRS; Euclidean distances are used for projected data
or data without a CRS. Units are meters when a CRS is defined, or unitless coordinate
distances otherwise.
A pairwise geographic distance matrix of class dist, with one entry per pair
of occupied sites. Units are meters when a CRS is defined, or raw coordinate units otherwise.
ps <- moss() geo <- ps_geodist(ps) phy <- ps_dissim(ps) # distance-decay plot plot(as.vector(geo), as.vector(phy), xlab = "Geographic distance (m)", ylab = "Phylogenetic dissimilarity", pch = ".", col = "#00000020")ps <- moss() geo <- ps_geodist(ps) phy <- ps_dissim(ps) # distance-decay plot plot(as.vector(geo), as.vector(phy), xlab = "Geographic distance (m)", ylab = "Phylogenetic dissimilarity", pch = ".", col = "#00000020")
phylospatial community dataGet phylospatial community data
ps_get_comm(ps, tips_only = TRUE, spatial = TRUE)ps_get_comm(ps, tips_only = TRUE, spatial = TRUE)
ps |
|
tips_only |
Logical indicating whether only the terminal taxa (TRUE, the default) or all taxa (FALSE) should be returned. |
spatial |
Logical indicating whether a spatial ( |
If spatial = TRUE, a SpatRaster or sf object with a
layer/column for every taxon, with NA for unoccupied sites. If
spatial = FALSE, a matrix containing only occupied sites (i.e.,
with nrow equal to the number of occupied sites, not the total
number of grid cells). Use ps_expand() to expand an occupied-only
matrix back to the full spatial extent if needed.
ps <- ps_simulate() # the defaults return a spatial object of terminal taxa distributions: ps_get_comm(ps) # get distributions for all taxa, as a matrix pcomm <- ps_get_comm(ps, tips_only = FALSE, spatial = FALSE)ps <- ps_simulate() # the defaults return a spatial object of terminal taxa distributions: ps_get_comm(ps) # get distributions for all taxa, as a matrix pcomm <- ps_get_comm(ps, tips_only = FALSE, spatial = FALSE)
This function takes a set of occurrence point localities (e.g. from GBIF or BIEN)
and rasterizes them onto a spatial grid to produce a community data set suitable for
passing to phylospatial(). Each species' point records are aggregated within grid cells
to produce either binary presence-absence or count data.
ps_grid( x, grid = NULL, res = NULL, crs = NULL, cols = c(1, 2, 3), data_type = c("binary", "count") )ps_grid( x, grid = NULL, res = NULL, crs = NULL, cols = c(1, 2, 3), data_type = c("binary", "count") )
x |
A data frame or |
grid |
An optional |
res |
Numeric giving the grid cell resolution when generating a new grid. Units are
meters if |
crs |
Optional CRS specification (any format accepted by |
cols |
A vector of length 1 (for |
data_type |
Character indicating the type of community data to produce. Either
|
When grid is NULL, a grid is automatically generated from the point data. If crs
is also NULL, an Albers Equal Area (AEA) projection is created based on the geographic
extent of the points, with the central meridian and latitude at the midpoint of the
coordinate ranges, and standard parallels at 1/6 and 5/6 of the latitude range. This
ensures that grid cells are equal-area, which is an assumption of various functions in the
phylospatial package. Users working with other spatial layers (e.g., environmental rasters)
may want to supply a grid or crs that matches their existing data.
When res is NULL and no grid is supplied, the resolution is automatically chosen so
that the grid contains approximately 1000 cells. Specifically, the resolution is set to
sqrt(extent_area / 1000), where extent_area is the area of the bounding box of the
projected points. The auto-generated grid is buffered by half a grid cell on each side
to ensure that edge points are not excluded.
The res parameter is interpreted in the units of the output CRS. When the auto-generated
AEA projection is used, units are meters. If a geographic (lon/lat) CRS is supplied, res
is in degrees and the resulting cells will not be equal-area; a warning is issued in this
case.
Coordinate cleaning and taxonomic name matching are outside the scope of this function.
Users are encouraged to clean occurrence data beforehand (e.g., using the CoordinateCleaner
package) and to ensure that species names in the occurrence data match the tip labels of
their phylogeny before proceeding to phylospatial().
A SpatRaster with one layer per species, suitable for passing directly to
phylospatial() as the comm argument. Layer names correspond to unique values in the
species column.
phylospatial() for constructing a spatial phylogenetic object from the output.
# simulate some occurrence records set.seed(42) occ <- data.frame( species = sample(paste0("sp", 1:10), 500, replace = TRUE), x = runif(500, -122, -118), y = runif(500, 34, 38) ) # grid the occurrences with auto resolution (~1000 cells) comm <- ps_grid(occ) terra::plot(comm[[1:4]]) # specify columns by name comm <- ps_grid(occ, cols = c("species", "x", "y")) # grid at a specific resolution (50 km) comm <- ps_grid(occ, res = 50000) # use a custom grid template <- terra::rast(res = 0.5, xmin = -123, xmax = -117, ymin = 33, ymax = 39, crs = "EPSG:4326") comm2 <- ps_grid(occ, grid = template) # from sf points occ_sf <- sf::st_as_sf(occ, coords = c("x", "y"), crs = 4326) comm3 <- ps_grid(occ_sf, cols = "species")# simulate some occurrence records set.seed(42) occ <- data.frame( species = sample(paste0("sp", 1:10), 500, replace = TRUE), x = runif(500, -122, -118), y = runif(500, 34, 38) ) # grid the occurrences with auto resolution (~1000 cells) comm <- ps_grid(occ) terra::plot(comm[[1:4]]) # specify columns by name comm <- ps_grid(occ, cols = c("species", "x", "y")) # grid at a specific resolution (50 km) comm <- ps_grid(occ, res = 50000) # use a custom grid template <- terra::rast(res = 0.5, xmin = -123, xmax = -117, ymin = 33, ymax = 39, crs = "EPSG:4326") comm2 <- ps_grid(occ, grid = template) # from sf points occ_sf <- sf::st_as_sf(occ, coords = c("x", "y"), crs = 4326) comm3 <- ps_grid(occ_sf, cols = "species")
Perform an ordination that reduces a spatial phylogenetic data set into k dimensions, using one of
several alternative ordination algorithms.
ps_ordinate(ps, method = c("cmds", "nmds", "pca"), k = 3, spatial = TRUE)ps_ordinate(ps, method = c("cmds", "nmds", "pca"), k = 3, spatial = TRUE)
ps |
A |
method |
Ordination method, either |
k |
Positive integer giving the desired number of output dimensions; default is |
spatial |
Logical indicating whether a spatial object (inherited from |
A matrix or spatial object with k variables.
For visualization using ordination onto RGB color space, see ps_rgb().
ps <- ps_add_dissim(ps_simulate(50, 5, 5)) ord <- ps_ordinate(ps, method = "cmds", k = 4) terra::plot(ord)ps <- ps_add_dissim(ps_simulate(50, 5, 5)) ord <- ps_ordinate(ps, method = "cmds", k = 4) terra::plot(ord)
Create a ranking of conservation priorities using optimal or probabilistic forward stepwise selection. Prioritization accounts for the occurrence quantities for all lineages present in the site, including terminal taxa and larger clades; the evolutionary branch lengths of these lineages on the phylogeny, which represent their unique evolutionary heritage; the impact that protecting the site would have on these lineages' range-wide protection levels; the compositional complementarity between the site, other high-priority sites, and existing protected areas; the site's initial protection level; the relative cost of protecting the site; and a free parameter "lambda" determining the shape of the conservation benefit function.
ps_prioritize( ps, init = NULL, cost = NULL, lambda = 1, protection = 1, max_iter = NULL, method = c("optimal", "probable"), trans = function(x) replace(x, which(rank(-x) > 25), 0), n_reps = 100, n_cores = 1, summarize = TRUE, spatial = TRUE, progress = interactive() )ps_prioritize( ps, init = NULL, cost = NULL, lambda = 1, protection = 1, max_iter = NULL, method = c("optimal", "probable"), trans = function(x) replace(x, which(rank(-x) > 25), 0), n_reps = 100, n_cores = 1, summarize = TRUE, spatial = TRUE, progress = interactive() )
ps |
phylospatial object. |
init |
Optional numeric vector or spatial object giving the starting protection status of each site across the study area. Values should be between 0 and 1 and represent the existing level of conservation effectiveness in each site. If this argument is not specified, it is assumed that no existing reserves are present. |
cost |
Optional numeric vector or spatial object giving the relative cost of protecting each site. Values should be positive, with greater values indicating higher cost of conserving a site. If this argument is not specified, cost is assumed to be uniform across sites. |
lambda |
Shape parameter for taxon conservation benefit function. This can be any real number. Positive values, such as the default
value |
protection |
Degree of protection of proposed new reserves (number between 0 and 1, with same meaning as |
max_iter |
Integer giving max number of iterations to perform before stopping, i.e. max number of sites to rank. |
method |
Procedure for selecting which site to add to the reserve network at each iteration:
|
trans |
A function that transforms marginal values into relative selection probabilities; only used if |
n_reps |
Number of random repetitions to do; only used if |
n_cores |
Number of compute cores to use for parallel processing; only used if |
summarize |
Logical: should summary statistics across reps (TRUE, default) or the reps themselves (FALSE) be returned? Only relevant
if |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE)? |
progress |
Logical: should a progress bar be displayed? |
This function uses the forward stepwise selection algorithm of Kling et al. (2019) to generate a ranked conservation prioritization.
Prioritization begins with the starting protected lands network identified in init, if provided. At each iteration, the marginal
conservation value of fully protecting each site is calculated, and a site is selected to be added to the reserve network. Selection can
happen either in an "optimal" or "probable" fashion as described under the method argument. This process is repeated until all sites
are fully protected or until max_iter has been reached, with sites selected early in the process considered higher conservation
priorities.
The benefit of the probabilistic approach is that it relaxes the potentially unrealistic assumption that protected land will actually be added in the optimal order. Since the algorithm avoids compositional redundancy between high-priority sites, the optimal approach will never place high priority on a site that has high marginal value but is redundant with a slightly higher-value site, whereas the probabilistic approach will select them at similar frequencies (though never in the same randomized run).
Every time a new site is protected as the algorithm progresses, it changes the marginal conservation value of the other sites. Marginal
value is the increase in conservation benefit that would arise from fully protecting a given site, divided by the cost of protecting the
site. This is calculated as a function of the site's current protection level, the quantitative presence probability or abundance of all
terminal taxa and larger clades present in the site, their evolutionary branch lengths on the phylogeny, the impact that protecting the
site would have on their range-wide protection levels, and the free parameter lambda. lambda determines the relative importance of
protecting a small portion of every taxon's range, versus fully protecting the ranges of more valuable taxa (those with longer
evolutionary branches and smaller geographic ranges).
Matrix or spatial object containing a ranking of conservation priorities. Lower rank values represent higher
conservation priorities. All sites with a lower priority than max_iter have a rank value equal to the number
of sites in the input data set (i.e. the lowest possible priority).
method = "optimal". the result contains a single variable "priority" containing the ranking.
method = "probable" and summarize = TRUE, the "priority" variable gives the average rank across reps, variables labeled "pctX" give the Xth percentile of the rank distribution for each site, variables labeled "topX" give the proportion of reps in which a site was in the top X highest-priority sites, and variables labeled "treX" give a ratio representing "topX" relative to the null expectation of how often "topX" should occur by chance alone.
method = "probable" and summarize = FALSE, the result contains the full set of n_rep solutions,
each representing the the ranking, with low values representing higher priorities..
Kling, M. M., Mishler, B. D., Thornhill, A. H., Baldwin, B. G., & Ackerly, D. D. (2019). Facets of phylodiversity: evolutionary diversification, divergence and survival as conservation targets. Philosophical Transactions of the Royal Society B, 374(1763), 20170397.
# simulate a toy `phylospatial` data set set.seed(123) ps <- ps_simulate() # basic prioritization p <- ps_prioritize(ps) # specifying locations of initial protected areas # (can be binary, or can be continuous values between 0 and 1) # here we'll create an `init` raster with arbitrary values ranging from 0-1, # using the reference raster layer that's part of our `phylospatial` object protected <- terra::setValues(ps$spatial, seq(0, 1, length.out = 400)) cost <- terra::setValues(ps$spatial, rep(seq(100, 20, length.out = 20), 20)) p <- ps_prioritize(ps, init = protected, cost = cost) # using probabilistic prioritization p <- ps_prioritize(ps, init = protected, cost = cost, method = "prob", n_reps = 1000, max_iter = 10) terra::plot(p$top10)# simulate a toy `phylospatial` data set set.seed(123) ps <- ps_simulate() # basic prioritization p <- ps_prioritize(ps) # specifying locations of initial protected areas # (can be binary, or can be continuous values between 0 and 1) # here we'll create an `init` raster with arbitrary values ranging from 0-1, # using the reference raster layer that's part of our `phylospatial` object protected <- terra::setValues(ps$spatial, seq(0, 1, length.out = 400)) cost <- terra::setValues(ps$spatial, rep(seq(100, 20, length.out = 20), 20)) p <- ps_prioritize(ps, init = protected, cost = cost) # using probabilistic prioritization p <- ps_prioritize(ps, init = protected, cost = cost, method = "prob", n_reps = 1000, max_iter = 10) terra::plot(p$top10)
Generates a randomized version of a phylospatial object by extracting the
tip community matrix, permuting it using nullcat::quantize(), and rebuilding
the phylospatial object using the permuted tip matrix.
ps_quantize(ps, wt_row = NULL, wt_col = NULL, ...)ps_quantize(ps, wt_row = NULL, wt_col = NULL, ...)
ps |
Object of class |
wt_row, wt_col
|
Optional square numeric weight matrices controlling which pairs of
rows (sites) or columns (species) are likely to exchange values during randomization.
Enables spatially constrained or functional-group constrained null models; e.g. a
geographic distance decay matrix from ps_geodist can be transformed and used as
|
... |
Additional arguments passed to quantize, such as |
The nullcat quantize routine involves three steps: converting a
quantitative matrix to categorical strata, permuting the resulting categorical
matrix using one of several categorical null model algorithms, and mapping the
randomized categories back to quantitative values. Supply arguments via ...
to control options for each of these stages.
For repeated randomizations to generate a null distribution, it is more efficient
to use ps_rand(fun = "quantize"), which is structured to avoid unnecessarily
recomputing overhead that is shared across randomizations.
A randomized version of ps
if (requireNamespace("nullcat", quietly = TRUE)) { ps <- ps_simulate(data_type = "prob") ps_rand <- ps_quantize(ps, n_strata = 4, n_iter = 1000, method = "curvecat", fixed = "cell") # spatially constrained randomization geo <- as.matrix(ps_geodist(ps)) W <- exp(-geo / median(geo)) ps_rand <- ps_quantize(ps, n_strata = 4, n_iter = 1000, method = "curvecat", fixed = "cell", wt_row = W) }if (requireNamespace("nullcat", quietly = TRUE)) { ps <- ps_simulate(data_type = "prob") ps_rand <- ps_quantize(ps, n_strata = 4, n_iter = 1000, method = "curvecat", fixed = "cell") # spatially constrained randomization geo <- as.matrix(ps_geodist(ps)) W <- exp(-geo / median(geo)) ps_rand <- ps_quantize(ps, n_strata = 4, n_iter = 1000, method = "curvecat", fixed = "cell", wt_row = W) }
This function compares phylodiversity metrics calculated in ps_diversity to their null distributions computed by randomizing the community matrix or shuffling the tips of the phylogeny, indicating statistical significance under the assumptions of the null model. Various null model algorithms are available for binary, probability, and count data.
ps_rand( ps, metric = c("PD", "PE", "RPE", "CE"), fun = "tip_shuffle", method = NULL, n_iter = 1000, n_rand = 100, summary = "quantile", spatial = TRUE, n_cores = 1, progress = interactive(), wt_row = NULL, wt_col = NULL, ... )ps_rand( ps, metric = c("PD", "PE", "RPE", "CE"), fun = "tip_shuffle", method = NULL, n_iter = 1000, n_rand = 100, summary = "quantile", spatial = TRUE, n_cores = 1, progress = interactive(), wt_row = NULL, wt_col = NULL, ... )
ps |
|
metric |
Character vector giving one or more diversity metrics to calculate; see ps_diversity
for options. Can also specify |
fun |
Null model function to use. Must be either "tip_shuffle", "nullmodel", "nullcat", "quantize", or an actual function:
|
method |
Method used by the selected function.
|
n_iter |
Integer giving the number of swap iterations per randomized matrix. Controls
how thoroughly each null matrix is mixed before sampling. Default is |
n_rand |
Integer giving the number of random communities to generate. |
summary |
Character indicating which summary statistic to return. If |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE). |
n_cores |
Integer giving the number of compute cores to use for parallel processing. |
progress |
Logical: should a progress bar be displayed? |
wt_row, wt_col
|
Optional square numeric weight matrices controlling which pairs of
rows (sites) or columns (species) are likely to exchange values during randomization.
Only used with |
... |
Additional arguments passed to the selected function: quantize,
nullcat, simulate.nullmodel, or a custom function.
Note that the |
A matrix with a row for every row of x, a column for every metric specified in metric, and
values for the summary statistic. Or if spatial = TRUE, a sf or SpatRaster object containing these data.
# simulate a `phylospatial` data set and run randomization with default settings ps <- ps_simulate(data_type = "prob") rand <- ps_rand(ps) # using the `quantize` function with the `curvecat` algorithm if(requireNamespace("nullcat")){ rand <- ps_rand(ps, fun = "quantize", method = "curvecat", transform = sqrt, n_strata = 4, fixed = "cell") } # binary data with nullcat's curvecat algorithm ps2 <- ps_simulate(data_type = "binary") if(requireNamespace("nullcat")){ rand <- ps_rand(ps2, fun = "nullcat", method = "curvecat", n_iter = 1000) } # spatially constrained randomization using geographic distance weights if(requireNamespace("nullcat")){ geo <- as.matrix(ps_geodist(ps2)) W <- exp(-geo / median(geo)) rand <- ps_rand(ps2, fun = "nullcat", method = "curvecat", n_iter = 1000, wt_row = W) } # using binary data, with a vegan `nullmodel` algorithm rand <- ps_rand(ps2, "PD", "nullmodel", "r2") # using abundance data ps3 <- ps_simulate(data_type = "abund") rand <- ps_rand(ps3, metric = c("ShPD", "SiPD"), fun = "nullmodel", method = "abuswap_c")# simulate a `phylospatial` data set and run randomization with default settings ps <- ps_simulate(data_type = "prob") rand <- ps_rand(ps) # using the `quantize` function with the `curvecat` algorithm if(requireNamespace("nullcat")){ rand <- ps_rand(ps, fun = "quantize", method = "curvecat", transform = sqrt, n_strata = 4, fixed = "cell") } # binary data with nullcat's curvecat algorithm ps2 <- ps_simulate(data_type = "binary") if(requireNamespace("nullcat")){ rand <- ps_rand(ps2, fun = "nullcat", method = "curvecat", n_iter = 1000) } # spatially constrained randomization using geographic distance weights if(requireNamespace("nullcat")){ geo <- as.matrix(ps_geodist(ps2)) W <- exp(-geo / median(geo)) rand <- ps_rand(ps2, fun = "nullcat", method = "curvecat", n_iter = 1000, wt_row = W) } # using binary data, with a vegan `nullmodel` algorithm rand <- ps_rand(ps2, "PD", "nullmodel", "r2") # using abundance data ps3 <- ps_simulate(data_type = "abund") rand <- ps_rand(ps3, metric = c("ShPD", "SiPD"), fun = "nullmodel", method = "abuswap_c")
Perform a clustering analysis that categorizes sites into biogeographic regions based on phylogenetic community compositional similarity.
ps_regions(ps, k = 5, method = "average", endemism = FALSE, normalize = TRUE)ps_regions(ps, k = 5, method = "average", endemism = FALSE, normalize = TRUE)
ps |
A |
k |
Number of spatial clusters to divide the region into (positive integer). See ps_regions_eval to help choose a value of k by comparing the variance explained by different numbers of regions. |
method |
Clustering method. Options include all methods listed under hclust, and |
endemism |
Logical indicating whether community values should be divided by column totals (taxon range sizes)
to derive endemism. Only used if |
normalize |
Logical indicating whether community values should be divided by row totals (community sums). If |
A raster or matrix with an integer indicating which of the k regions each site belongs to.
Daru, B. H., Elliott, T. L., Park, D. S., & Davies, T. J. (2017). Understanding the processes underpinning patterns of phylogenetic regionalization. Trends in Ecology & Evolution, 32(11), 845-860.
ps <- ps_simulate() # using kmeans clustering algorithm terra::plot(ps_regions(ps, method = "kmeans")) # to use a hierarchical clustering method, first we have to `ps_add_dissim()` terra::plot(ps_regions(ps_add_dissim(ps), k = 7, method = "average"))ps <- ps_simulate() # using kmeans clustering algorithm terra::plot(ps_regions(ps, method = "kmeans")) # to use a hierarchical clustering method, first we have to `ps_add_dissim()` terra::plot(ps_regions(ps_add_dissim(ps), k = 7, method = "average"))
This function compares multiple potential values for k, the number of clusters in to use in ps_regions(),
to help you decide how well different numbers of regions fit your data set. For each value of k, it performs
a cluster analysis and calculates the proportion of total variance explained (SSE, the sum of squared pairwise
distances explained). It also calculates second-order metrics of the relationship between k and SSE. While
many data sets have no optimal value of k and the choice is often highly subjective, these evaluation metrics
can help you identify potential points where the variance explained stops increasing quickly as k increases.
ps_regions_eval(ps, k = 1:20, plot = TRUE, ...)ps_regions_eval(ps, k = 1:20, plot = TRUE, ...)
ps |
A |
k |
Vector of positive integers giving possible values for |
plot |
Logical indicating whether to print a plot of the results ( |
... |
Further arguments passed to ps_regions. |
The function generates a data frame with the following columns. If plot = FALSE the data frame is
returned, otherwise the function prints a plot of the latter variables as a function of k:
"k": The number of clusters.
"sse": The proportion of total variance explained, with variance defined as squared pairwise community phylogenetic dissimilarity between sites.
"curvature": The local second derivative. Lower (more negative) values indicate more attractive break-point values of k.
"dist11": The distance from the point to the 1:1 line on a plot of k vs sse in which k values over the
interval from 1 to the number of sites are rescaled to the unit interval. Higher values indicate more
attractive values for k.
ps <- ps_add_dissim(ps_simulate()) ps_regions_eval(ps, k = 1:15, plot = TRUE)ps <- ps_add_dissim(ps_simulate()) ps_regions_eval(ps, k = 1:15, plot = TRUE)
Perform an ordination that reduces a spatial phylogenetic data set into three dimensions that can be
plotted as the RGB bands of color space to visualize spatial patterns of community phylogenetic composition.
This function is a wrapper around ps_ordinate().
ps_rgb(ps, method = c("nmds", "cmds", "pca"), trans = identity, spatial = TRUE)ps_rgb(ps, method = c("nmds", "cmds", "pca"), trans = identity, spatial = TRUE)
ps |
A |
method |
Ordination method, either "pca" (principal component analysis implemented via |
trans |
A function giving a transformation to apply to each dimension of the ordinated data.
The default is the identity function. Specifying |
spatial |
Logical indicating whether a spatial object (inherited from |
A matrix or spatial object with three variables containing RGB color values in the range 0-1.
ps <- ps_add_dissim(ps_simulate(50, 20, 20)) RGB <- ps_rgb(ps, method = "cmds") terra::plotRGB(RGB * 255, smooth = FALSE)ps <- ps_add_dissim(ps_simulate(50, 20, 20)) RGB <- ps_rgb(ps, method = "cmds") terra::plotRGB(RGB * 255, smooth = FALSE)
This function generates a simple phylospatial object that can be used for testing other functions
in the package. It is not intended to be realistic.
ps_simulate( n_tips = 10, n_x = 20, n_y = 20, data_type = c("probability", "binary", "abundance"), spatial_type = c("raster", "none"), seed = NULL )ps_simulate( n_tips = 10, n_x = 20, n_y = 20, data_type = c("probability", "binary", "abundance"), spatial_type = c("raster", "none"), seed = NULL )
n_tips |
Number of terminals on phylogeny. |
n_x |
Number of raster cells in x dimension of landscape. |
n_y |
Number of raster cells in y dimension of landscape. |
data_type |
Community data type for simulated ranges: either "probability" (default), "binary", or "abundance". |
spatial_type |
Either "raster" or "none". |
seed |
Optional integer to seed random number generator. |
phylospatial object, comprising a random phylogeny and community matrix in which each terminal has a
circular geographic range with a random radius and location. The spatial reference data is a SpatRaster.
# using all the defaults ps_simulate() # specifying some arguments plot(ps_simulate(n_tips = 50, n_x = 30, n_y = 40, data_type = "abundance"), "comm")# using all the defaults ps_simulate() # specifying some arguments plot(ps_simulate(n_tips = 50, n_x = 30, n_y = 40, data_type = "abundance"), "comm")
Estimates the number of iterations needed for the null model randomization in
ps_rand() to reach its stationary distribution, given a dataset and algorithm.
This is a convenience wrapper around nullcat::suggest_n_iter() that extracts the
appropriate community matrix from a phylospatial object. Use this before running
ps_rand() with fun = "nullcat" or fun = "quantize" to choose an appropriate
value for n_iter. The function runs multiple independent chains of the randomization
algorithm, records a mixing diagnostic at each step, and identifies the point at which
chains stabilize.
ps_suggest_n_iter(ps, fun = c("nullcat", "quantize"), plot = TRUE, ...)ps_suggest_n_iter(ps, fun = c("nullcat", "quantize"), plot = TRUE, ...)
ps |
A |
fun |
Character: |
plot |
Logical: if |
... |
Additional arguments passed to |
An integer giving the suggested minimum number of iterations, with additional
diagnostic information as attributes. See nullcat::suggest_n_iter() for details.
if (requireNamespace("nullcat", quietly = TRUE)) { set.seed(123) # binary data with nullcat ps_bin <- ps_simulate(data_type = "binary") ps_suggest_n_iter(ps_bin, fun = "nullcat", method = "curvecat", n_iter = 5000, n_chains = 3, plot = TRUE) # quantitative data with quantize ps <- ps_simulate(data_type = "prob") ps_suggest_n_iter(ps, fun = "quantize", method = "curvecat", n_strata = 4, fixed = "cell", n_iter = 2000, n_chains = 3, plot = TRUE) }if (requireNamespace("nullcat", quietly = TRUE)) { set.seed(123) # binary data with nullcat ps_bin <- ps_simulate(data_type = "binary") ps_suggest_n_iter(ps_bin, fun = "nullcat", method = "curvecat", n_iter = 5000, n_chains = 3, plot = TRUE) # quantitative data with quantize ps <- ps_simulate(data_type = "prob") ps_suggest_n_iter(ps, fun = "quantize", method = "curvecat", n_strata = 4, fixed = "cell", n_iter = 2000, n_chains = 3, plot = TRUE) }
Runs multiple independent chains of the null model randomization algorithm and
records a mixing diagnostic at each step, producing trace plots to assess convergence.
This is a convenience wrapper around nullcat::trace_cat() that extracts the
appropriate community matrix from a phylospatial object.
ps_trace(ps, fun = c("nullcat", "quantize"), plot = TRUE, ...)ps_trace(ps, fun = c("nullcat", "quantize"), plot = TRUE, ...)
ps |
A |
fun |
Character: |
plot |
Logical: if |
... |
Additional arguments passed to |
An object of class "cat_trace". See nullcat::trace_cat() for details.
ps_suggest_n_iter(), ps_rand()
if (requireNamespace("nullcat", quietly = TRUE)) { ps_bin <- ps_simulate(data_type = "binary") tr <- ps_trace(ps_bin, fun = "nullcat", method = "curvecat", n_iter = 2000, n_chains = 5, plot = TRUE) ps <- ps_simulate(data_type = "prob") tr <- ps_trace(ps, fun = "quantize", method = "curvecat", n_strata = 4, fixed = "cell", n_iter = 2000, n_chains = 5, plot = TRUE) }if (requireNamespace("nullcat", quietly = TRUE)) { ps_bin <- ps_simulate(data_type = "binary") tr <- ps_trace(ps_bin, fun = "nullcat", method = "curvecat", n_iter = 2000, n_chains = 5, plot = TRUE) ps <- ps_simulate(data_type = "prob") tr <- ps_trace(ps, fun = "quantize", method = "curvecat", n_strata = 4, fixed = "cell", n_iter = 2000, n_chains = 5, plot = TRUE) }
This is a simple wrapper around nullcat::quantize(), included in
phylospatial mainly for backward compatibility.
quantize(x = NULL, ...)quantize(x = NULL, ...)
x |
Community matrix with species in rows, sites in columns, and nonnegative quantities in cells. |
... |
Additional arguments passed to |
The nullcat quantize routine involves three steps: converting a
quantitative matrix to categorical strata, permuting the resulting categorical
matrix using one of several categorical null model algorithms, and mapping the
randomized categories back to quantitative values. Supply arguments via ...
to control options for each of these stages.
A randomized version of x.
if (requireNamespace("nullcat", quietly = TRUE)) { # example quantitative community matrix comm <- matrix(runif(2500), 50) # examples of different quantize usage rand <- quantize(comm) rand <- quantize(comm, n_strata = 4, transform = sqrt, fixed = "row") rand <- quantize(comm, method = "swapcat", n_iter = 500) }if (requireNamespace("nullcat", quietly = TRUE)) { # example quantitative community matrix comm <- matrix(runif(2500), 50) # examples of different quantize usage rand <- quantize(comm) rand <- quantize(comm, n_strata = 4, transform = sqrt, fixed = "row") rand <- quantize(comm, method = "swapcat", n_iter = 500) }
Convert a site-by-variable matrix into a SpatRaster or sf object
to_spatial(m, template)to_spatial(m, template)
m |
Matrix or vector with the same number of rows as sites in |
template |
|
SpatRaster with a layer for every column in m, or sf data frame with
a variable for every column in m, depending on the data type of template.
ps <- moss() # ps$comm contains only occupied sites, so expand before converting: to_spatial(ps_expand(ps, ps$comm[, 1:5]), ps$spatial)ps <- moss() # ps$comm contains only occupied sites, so expand before converting: to_spatial(ps_expand(ps, ps$comm[, 1:5]), ps$spatial)
A family of functions to modify the branch lengths of a phylogeny in ways
relevant to spatial phylogenetic analyses. These fall into two conceptual
categories. Affine rescaling (rescale_tree) applies a single
multiplicative factor to every branch, changing units without changing
relative branch lengths. Differential transforms (slice_tree, delta_tree,
uniform_tree) change the shape of the branch-length distribution by
treating different branches differently based on their position or length.
rescale_tree(tree, method = c("raw", "sum1", "tip1")) slice_tree(tree, min = -Inf, max = Inf, reference = NULL, from = "tips") delta_tree(tree, delta) uniform_tree(tree)rescale_tree(tree, method = c("raw", "sum1", "tip1")) slice_tree(tree, min = -Inf, max = Inf, reference = NULL, from = "tips") delta_tree(tree, delta) uniform_tree(tree)
tree |
Phylogeny of class ape::phylo. |
method |
Character giving the rescaling method for |
min, max
|
Numeric giving the lower and upper bounds of the depth
window for slicing, in the same units as |
reference |
Optional phylogeny of class ape::phylo whose branch
lengths define the slicing window. Must share topology with |
from |
Character: |
delta |
Numeric exponent for |
The two categories of operations serve different purposes and can be composed. A typical workflow is to apply a differential transform first (if any), and then optionally an affine rescaling (if any).
slice_tree retains only the portions of each branch that fall within
a specified window along the tree, setting other portions to zero. This
implements the "time-sliced" approach to phylodiversity analysis, enabling
questions about divergence within a specific depth range. The window can
be defined on the tree itself, or on a separate reference tree with
identical topology (e.g., slicing a phylogram by a chronogram, to isolate
molecular divergence within a specific age range). See Araujo et al. (2024)
for discussion of related approaches. Note that slicing modifies branch
lengths, which changes the depth coordinate system of the resulting tree
(because zeroed-out branches near the root collapse, shifting the retained
portion toward the root). If you need to apply multiple operations that
depend on the original tree's depth structure, pass the original tree as
reference on subsequent calls.
delta_tree applies Pagel's (1999) delta transformation, raising each
node depth (root-to-node distance) to the power delta, then recomputing
branch lengths from the transformed depths. The resulting tree is then
rescaled to match the original tree's total height, so that only the
distribution of branch length along the tree changes, not the total.
Values of delta < 1 lengthen deep branches at the expense of shallow
ones, emphasizing deeper divergence; values of delta > 1 do the opposite,
emphasizing recent divergence. Delta is the continuous analog of
slice_tree: both operate on position within the tree rather than on
individual branch lengths. The transformation is most interpretable on
ultrametric trees, where node depths have a single consistent meaning.
See Saladin et al. (2019) for an application to spatial phylogenetics.
uniform_tree sets all branch lengths equal to 1, producing a
cladogram. Phylogenetic diversity computed on a uniform tree is equivalent
to clade richness (CR), giving equal weight to every splitting event.
rescale_tree applies an affine rescaling:
"raw"—no rescaling (identity operation).
"sum1"—divides all branch lengths so they sum to 1. Branch lengths
then represent fractions of total tree length, and PD values represent
the fraction of total evolutionary history present in a site.
"tip1"—divides all branch lengths by the longest root-to-tip path,
so that the longest path equals 1. For ultrametric trees, every tip has
path length 1 (following Pavoine et al. 2005). For non-ultrametric trees,
only the longest path equals 1.
Rescaling does not change relative branch lengths, so it does not affect spatial patterns, correlations, or statistical significance of diversity metrics—only their numeric scale. Differential transforms, by contrast, change what the metrics are measuring.
A phylogeny of class ape::phylo with modified branch lengths.
Topology is always preserved; tree$tip.label, tree$node.label,
tree$edge, and tree$Nnode are unchanged.
Araujo, M. L., Silva, V., and Cassemiro, F. A. S. (2024). 'treesliceR': a package for slicing phylogenies and inferring phylogenetic patterns over evolutionary time. Ecography, e07364.
Pagel, M. (1999). Inferring the historical patterns of biological evolution. Nature, 401(6756), 877-884.
Pavoine, S., Ollier, S., and Dufour, A.-B. (2005). Is the originality of a species measurable? Ecology Letters, 8(6), 579-586.
Saladin, B., Thuiller, W., Graham, C. H., Lavergne, S., Maiorano, L., Salamin, N., and Zimmermann, N. E. (2019). Environment and evolutionary history shape phylogenetic turnover in European tetrapods. Nature Communications, 10(1), 249.
moss_tree <- ape::read.tree(system.file( "extdata", "moss_tree.nex", package = "phylospatial")) par(mfrow = c(1, 4), mar = c(2, 1, 2, 1)) plot(moss_tree, show.tip.label = FALSE, main = "original") plot(uniform_tree(moss_tree), show.tip.label = FALSE, main = "cladogram") plot(slice_tree(rescale_tree(moss_tree, "tip1"), min = 0, max = 0.25, from = "tips"), show.tip.label = FALSE, main = "sliced (recent 25%)") plot(delta_tree(moss_tree, delta = 1/3), show.tip.label = FALSE, main = "delta-transformed")moss_tree <- ape::read.tree(system.file( "extdata", "moss_tree.nex", package = "phylospatial")) par(mfrow = c(1, 4), mar = c(2, 1, 2, 1)) plot(moss_tree, show.tip.label = FALSE, main = "original") plot(uniform_tree(moss_tree), show.tip.label = FALSE, main = "cladogram") plot(slice_tree(rescale_tree(moss_tree, "tip1"), min = 0, max = 0.25, from = "tips"), show.tip.label = FALSE, main = "sliced (recent 25%)") plot(delta_tree(moss_tree, delta = 1/3), show.tip.label = FALSE, main = "delta-transformed")