A new method and tool (ExDet) to evaluate novelty environmental conditions

A common objective of correlative species distribution model is to be able to project the potential distribution of the target species into a new environmental space. This can be a new geographic space (e.g., invasive species) or projected future conditions.

One should be very careful in interpreting results if extrapolating to areas with conditions that fall outside the range of reference environmental variation. There are several methods to visualize this uncertainty. On this blog I have for example mentioned the multi-environmental similarity tool in Maxent (also implemented in R in amongst others the dismo package and as an addon for GRASS GIS), which allows you to create maps that provides the similarity of each point to a set of reference points (Elith et al. 2010) and thus provide a quick overview of areas with ‘novel’ conditions.

A disadvantage of this and other methods is that they only consider the ranges of the individual predictors, and ignore the correlation structure of the covariates used to build the model. In reality, it is not unlikely that at a given locations values of each univariate factor fall within the original range of values, but the combination of environmental conditions is new.

This is one of the main points of the recently published article ‘Here be dragons: a tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models‘ by Mesgaran et al. (2014) in Diversity and Distributions. The authors propose a new method to measure similarity between reference and projected domains by accounting for both the deviation from the mean and the correlation between variables, which includes a multivariate similarity index based on the mahalanobis distance between all covariates.

In short, they compute the mahalanobis distances of any point in the reference data to the centre of the environmental space (r) formed by all reference data points. The maximum distance (Dmax) represents the ‘boundary’ of the reference environmental space. Now, any new point that falls outside the boundary of this environmental space represents a novel combination of covariates. To determine if a points falls outside the reference environmental space one just needs to calculate the distance to r and compare this to the Dmax. A Dnew > Dmax thus represents a novel combination of covariate (Mesgaran et al propose to use the ratio of Dnew and Dmax, which they call the multivariate combination novelty index (NT2). NT2 > 1 represent novel multivariate conditions.

Anyway, you better check out the above cited article for the details. They provide a clear description and helpful graphs. What the authors also provide is the software to implement compute the NT2 (and a few other things), the ExDet tool. The software provides an user friendly GUI and runs really fast. But for me it would be even better if I could integrate it in my normal workflow, which means, running it from the command line and preferably in conjunction with GRASS GIS. The main part I would be interested in is to be able to calculate this NT2. And the main part of doing that is to be able to calculate the Mahalanobis distance for set of reference raster layers.

In GRASS GIS there is no such function as far as I am aware, but in R there is, with the perhaps unsurprising name mahalanobis. Below I’ll give an example how to use this to calculate the NT2 in R and create the map below:

NT2 (dis)similarity of conditions in South Africa compared to reference conditions in Australia
NT2 (dis)similarity of conditions in South Africa compared to reference conditions in Australia

I’ll use the sample data provided on their website. It consists of one set of layers describing the conditions in Austria (reference set) and another set describing the conditions in South Africa (projection data).

# Names of the reference (ref) and projection (pro) data
ref <- c("AusBio13.asc", "AusBio14.asc", "AusBio5.asc", "AusBio6.asc")
pro <- c("SaBio13.asc", "SaBio14.asc", "SaBio5.asc", "SaBio6.asc")

# Import the data in R using the read.asciigrid function 
# of the sp package
refdat <- prodat <- list()
for(i in 1:4){
    refdat[[i]] <- read.asciigrid(fname=ref[i])@data
    prodat[[i]] <- read.asciigrid(fname=pro[i])@data
refdat <- do.call(cbind, refdat)
prodat <- do.call(cbind, prodat)

# Calculate the average and covariance matrix of the variables 
# in the reference set
ref.av  <- colMeans(refdat, na.rm=TRUE)
ref.cov <- var(refdat, na.rm=TRUE)

# Calculate the mahalanobis distance of each raster 
# cell to the environmental center of the reference 
# set for both the reference and the projection data 
# set and calculate the ratio between the two.
mah.ref    <- mahalanobis(x=refdat, center=ref.av, cov=ref.cov)
mah.pro   <- mahalanobis(x=prodat, center=ref.av, cov=ref.cov)
mah.max <- max(mah.ref[is.finite(mah.ref)])
nt2 <- as.data.frame(mah.pro / mah.max)

# Create and plot the raster layer
NT2 <- read.asciigrid(fname=pro[1])
NT2@data <- nt2
NT2rast <- raster(NT2)
plot(NT2rast, col=rainbow(100), ylim=c(-35,-20), xlim=c(15,35))

So, the ExDet tool is a very convenient tool, but if you rather do the calculations in R, the above example shows that that is fairly easy too.

Citation: Mesgaran, M.B., Cousens, R.D. & Webber, B.L. (2014) Here be dragons: a tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models. Diversity & Distributions, 20: 1147–1159, DOI: 10.1111/ddi.12209


3 thoughts on “A new method and tool (ExDet) to evaluate novelty environmental conditions

  1. Great – this will be handy. You can simplify your refdat (and similarly prodat) creation with: library(raster); refdat <- as.matrix(stack(ref)). Not sure how efficiency compares with your approach, though.

  2. Matthew Bayly

    Hi this is probably not the most efficient coding, but for anyone else playing around with the tool here is some code for plotting the NT1 values (univariate extrapolation).

    # UD(ij) = min(
    MaxMin <- matrix(NA, nrow=2, ncol=ncol(refdat))
    colnames(MaxMin) <- colnames(refdat); rownames(MaxMin) = c("max", "min")
    for(i in 1:ncol(refdat)){
    MaxMin[1,i] <- max(refdat[,i], na.rm=TRUE) # max value of each variable in the reference matrix
    MaxMin[2,i] <- min(refdat[,i], na.rm=TRUE) # min value of each variable in the reference matrix
    UDs <- matrix(NA, nrow(prodat), ncol(prodat))
    for(j in 1:ncol(prodat)){
    for(i in 1:nrow(prodat)){
    UDs[i,j] <- (min(c(
    (prodat[i,j] – MaxMin[2,j]),(MaxMin[1,j] – prodat[i,j]), 0), na.rm=TRUE))/
    (MaxMin[1,j] – MaxMin[2,j])
    UDs <- data.frame(UDs)
    NT1 <- rowSums(UDs, na.rm=TRUE)

    # Create and plot the raster layer NT1
    mah.proR <- read.asciigrid(fname=pro[1]) # coordinates from a single grid
    NT1 <- data.frame(NT1)
    mah.proR@data <- NT1 # assing data values from above
    NT1rast <- raster(mah.proR)
    plot(NT1rast, col=rainbow(100), ylim=c(-35,-20), xlim=c(15,35))

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s