GRASS GIS: creating cross products of multiple raster map layers

I am looking at vegetation distribution in east Africa, using a map we have developed based on historical vegetation maps from the region, and I wanted to know the surface area per vegetation type per country. In ‘spreadsheet terms’, I wanted to create a cross- or pivot table. GRASS GIS offers several options to do this. Note that below I am using the examples provided in the help files of the functions. Check out these help files for further details.

r.report

This function reports statistics for one or more raster map layers. Selecting more then one layer will give you the statistics for each unique combination of the input raster values.

A strong point of r.report is that it allows you to select one or more specific units of measure in which statistics will be reported, including miles and acres (for those few remaining non-metric persons), meters, kilometers, hectares, cell count and percentage cover. Following is the result of a r.report run on the raster map layer geology (located in the Spearfish, SD sample data base), with the units expressed in square miles and acres (example copied from the help file);

As the example shows, the output is nicely, if not a bit old fashioned (but what is wrong with that) formatted table which is easy to read. The output can be directed into a text file or printing it out on screen.

Output table from r.report (from the r.report help file)

Although easy on the eye, it is much more difficult to further process the output. If you want to further work on the output data, e.g., in a spreadsheet or R, you better consider the next function.

r.stats

Similarly to r.report, r.stats generates area statistics for raster map layers. To quote the help file, “r.stats calculates the area present in each of the categories of user-selected raster map layer(s). Area statistics are given in units of square meters and/or cell counts”. I.e., the number of statics that can be calculated are much smaller. But really, who cares, it is easy enough to calculate the other statistics. Especially because the output is easy to read into a spreadsheet or your favorite statistical program.

For example, if three raster map layers a, b, and c, were specified, the output would look like (again, I am copying the example from the help file):

          0:0:0:8027500.00
          0:1:0:1152500.00
          1:0:0:164227500.00
          1:1:1:3355000.00
          2:0:0:31277500.00
          2:1:0:24207500.00
          2:1:1:1752500.00
          3:0:0:17140000.00
          3:1:1:2500.00

Within each grouping, the first field represents the category value of map layer a, the second represents the category values associated with map layer b, the third represents category values for map layer c, and the last field gives the area in square meters for the particular combination of these three map layers’ categories. Fields are separated by colons, but you can define your own separator.

The help file furthermore states that the output lists the results in square meters, ” assuming the map’s coordinate system is in meters”. I didn’t try out, but I assume that refers to projected layers with map coordinates in alternative units such as miles. It does work for layers in a geographic Coordinate System (this is also true for r.report).

r.cross

The function r.cross is different from the two functions above in that it creates an output raster map. The map represents all unique combinations of category values in the raster input layers (maximum number of layers is 10). To get statistics on the surface areas for each combination, you will need to use r.stats or r.report.

Let me again show the example from the help file. Suppose that, using two raster map layers, the following combinations occur:

          map1   map2
          ___________
           0      1
           0      2
           1      1
           1      2
           2      4

r.cross would produce a new raster map layer with 5 categories:

          map1   map2   output
          ____________________
           0      1       1
           0      2       2
           1      1       3
           1      2       4
           2      4       5

The category file created for the output raster map layer describes the combinations of input map layer category values which generated each category. In the above example, the category labels would be:

          category   category
          value      label
          ______________________________
             1       layer1(0) layer2(1)
             2       layer1(0) layer2(2)
             3       layer1(1) layer2(1)
             4       layer1(1) layer2(2)
             5       layer1(2) layer2(4)

So, if you simply want to check the surface area for each each unique combination of the input raster values, use r.report. If you want to further analyze the results in your favorite statistical package, use r.stats (you could for example run this function from within R, piping the results into a R data frame). And if you want to use the combined raster maps in further spatial analysis, use r.cross.

p.s. I couldn’t find a similar function in QGIS. But of course, you have the full GRASS functionality available from within QGIS using the GRASS toolbox.

p.s. It would be fairly easy to do this kind of overlays in R, but you might run into trouble with larger data sets (out of the several solutions, working with a combination of GRASS and R happens to be my favorite solution).

4 thoughts on “GRASS GIS: creating cross products of multiple raster map layers

  1. Michael Curran

    Fantastic, I second that comment. for spatial data combinations (e.g. which crops are grown on which land cover classes), r.cross looks likely to work wonders…

  2. Michael Curran

    In a similar vein, but deviating slightly, I am interested in the type of Raster Terrain Analysis found in QGIS. As far as I can tell, there is no Ruggedness Index in GRASS, or anything similar. A workaround that I considered, but have not tried yet, would be to create a vector grid of the desired resolution (e.g. 1km), and then run v.rast.stats with an underlying elevation model (e.g. srtm 90m), and rasterize one of the attribute columns such as standard deviation or something similar.

    However, perhaps this thread shows other methods that might be useful using only raster layers. Any thoughts would be appreciated, as the QGIS plugin is fixed at 3 pixels for the window (regardless of resolution), therefore you cannot produce a ruggedness index at a desired resolution.

    1. pvanb

      To start with your last point, I am not sure if I understand it. The number of neighboring cells used in gdaldem (the gdal function used by QGIS to calculate Terrain Ruggedness Index) does not affect or determine the resolution of your input and output raster layers. If you refer to the neighborhood size used in the analysis, see below.

      I am not aware of a Terrain Ruggedness Index (TRI) function in GRASS, similar to the one in gdal (which is what is used by QGIS). However, it might not be too difficult to calculate it with r.mapcalc. The method is explained in Wilson et al. (2007), which you can download from http://www.informaworld.com/index/778065046.pdf. I didn’t check the method, but assuming you can adapt it to use with different neighborhood sizes, you can calculate it with r.mapcalc, using the neighborhood modifier option.

      I am not too familiar with the subject, so I can’t really comment on the suitability of stdv as a measure for the terrain variability or complexity. But concerning the implementation, using vector grids might run you into trouble in GRASS GIS if they are very large. In my experience, GRASS GIS excels in raster analysis, but has more trouble when it comes to very large and complex vector layers.

      In any case, if I understand what you want to do, you better use the r.neighbour function in GRASS, that allows you to calculate various statistics, including stdv for any neighborhood size.

Leave a comment