Climate data sets, which one to select?

For species or vegetation modelling, one of the first choices to make is the selection of explanatory variables, which in most cases will include climatic or bioclimatic data sets. One of the most widely used global climate data sets in biogeographic and ecological research is from Worldclim (Hijmans et al., 2005). Alternative rainfall data sets are from TAMSAT TARCAT (Maidment et al., 2014) and CHIRPS (Funk et al., 2015). The Worldclim data layers are based on an interpolation of average monthly climate data from weather stations. The other two data sets combine weather station data with satellite observations to improve accuracy where in situ rainfall measurements are sparse. All three data sets are available from the KITE resources website as part of the Africlim dataset (Platts et al., 2015).

Data sets based on interpolation of weather station data can be highly uncertain, especially in mountainous and poorly sampled areas (Hijmans et al., 2005). This is certainly an issue in eastern Africa, which is a topographically diverse region with relatively poor coverage of weather stations. On the other hand, rainfall estimates based on satellite imagery have issues as well. I am not a climatologist and I don’t find it easy to determine which data set I should use. But I can of course start by comparing the data sets. Below, I compare the long-term average annual rainfall data. Note that the Worldclim data set is representative of the period 1950-2000, while the other two data sets are based on data from 1983-2012.


The images above show the mean annual rainfall. It is immediately evident that the average rainfall distribution as estimated by the TAMSAT data set deviates considerably from the other two estimates. In particular, the low rainfall estimates for three of the five so-called water towers of Kenya (Mount Kenya, the Aberdare Range and the Mau Forest range) and for Mount Kilimanjaro in Tanzania raise questions.

In GRASS GIS it is easy to quickly compare two maps using the bivariate scatterplot tool in the Map display toolbar. Just select two raster layers and click the tool. You can further tweak the graph using the plot and text settings, and export it as a PNG image or print it. Note that if you print to file, you’ll get a PostScript (PS) file, which you can edit further in, e.g., Inkscape.


Below you see the scatterplots of Worldclim versus TAMSAT, Worldclim versus CHIRPS and TAMSAT versus CHIRPS. They illustrate that there are large discrepancies in the estimated mean annual rainfall, with R² values between 0.73 and 0.8.
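
The same kind of comparison can also be made numerically. The sketch below, in plain Python, computes the squared Pearson correlation (which equals the R² of a simple linear regression) and the mean difference between two sets of cell values. The function name and the sample values are made up for illustration; in practice the values would come from the two raster layers, and this is not necessarily how the scatterplot tool itself computes its statistics.

```python
from math import sqrt


def compare_layers(a, b):
    """Return (r_squared, mean_difference) for two equally long value lists.

    Pairs in which either value is None (no data) are skipped.
    The mean difference is mean(a) - mean(b) over the valid pairs.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid cell pairs")

    mean_a = sum(x for x, _ in pairs) / n
    mean_b = sum(y for _, y in pairs) / n

    # Sums of cross-products and squared deviations
    cov = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    var_a = sum((x - mean_a) ** 2 for x, _ in pairs)
    var_b = sum((y - mean_b) ** 2 for _, y in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant layer")

    r = cov / sqrt(var_a * var_b)
    return r * r, mean_a - mean_b


# Hypothetical mean annual rainfall values (mm) for the same six cells;
# None marks a cell without data in that layer.
layer_a = [450, 800, 1200, None, 1900, 650]
layer_b = [500, 700, 1000, 900, 1400, 600]

r2, bias = compare_layers(layer_a, layer_b)
print(f"R2 = {r2:.2f}, mean difference = {bias:.0f} mm")
```

A high R² only says the two layers rank the cells similarly; the mean difference shows whether one data set is systematically wetter or drier than the other.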


Another convenient tool, available from the Map display toolbar, is the profile analysis tool. With this tool you can display the values of one or more raster layers along a line that you draw on the map canvas. This is particularly handy for seeing how two or more maps differ.


Below you can see the rainfall values along a transect I drew across the Kenyan highlands. The peaks in the graph are where the transect crosses Mount Kenya, the Aberdares and the Mau forest complex. The blue, red and green lines give the values of the Worldclim, TAMSAT and CHIRPS data sets, respectively. The rainfall profile of TAMSAT suggests there is not much difference in annual rainfall between the mountain tops and the lowlands in between. The Worldclim and CHIRPS profiles are more alike, but Worldclim gives considerably higher estimates for the mountain peaks than CHIRPS.

Mean annual rainfall values along a transect across the Kenyan highlands
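
To make the idea of a profile concrete, here is a minimal sketch in plain Python that samples two grids at evenly spaced points along a straight line. The function name, the nearest-cell sampling and the small grids are assumptions for illustration only; they are not how the GRASS profile tool is implemented.

```python
def profile(grid, start, end, n_samples):
    """Sample grid values at n_samples evenly spaced points from start to end.

    grid is a list of rows; start and end are (row, col) tuples with
    non-negative indices. Each sample takes the value of the nearest
    cell, without interpolation.
    """
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")
    (r0, c0), (r1, c1) = start, end
    values = []
    for i in range(n_samples):
        t = i / (n_samples - 1)  # 0.0 at start, 1.0 at end
        # int(x + 0.5) rounds half up for non-negative x
        row = int(r0 + t * (r1 - r0) + 0.5)
        col = int(c0 + t * (c1 - c0) + 0.5)
        values.append(grid[row][col])
    return values


# Hypothetical rainfall grids (mm) standing in for two data sets;
# the centre cell plays the role of a mountain peak.
layer_a = [[600, 700, 650], [800, 1800, 900], [650, 750, 600]]
layer_b = [[620, 680, 640], [750, 1300, 820], [640, 700, 610]]

start, end = (0, 0), (2, 2)
for a, b in zip(profile(layer_a, start, end, 5), profile(layer_b, start, end, 5)):
    print(a, b, a - b)
```

Printing the two profiles side by side, together with their difference, shows where along the transect the data sets diverge most.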

It would be good to find out more about the differences between the Worldclim and CHIRPS estimates. For example, are these differences all due to data errors (in one or both data layers), or was the period 1983-2012 in fact drier than the period 1950-2000? But that is a question I might get into later. For now it seems clear, to me at least, that the TAMSAT data has some issues, especially for the Kenyan highlands, which suggests it is unsuitable for ecological or biogeographic studies in eastern Africa.


  • Funk, Chris, Pete Peterson, Martin Landsfeld, Diego Pedreros, James Verdin, Shraddhanand Shukla, Gregory Husak, James Rowland, Laura Harrison, Andrew Hoell & Joel Michaelsen (2015). The climate hazards infrared precipitation with stations—a new environmental record for monitoring extremes. Scientific Data 2, 150066.
  • Hijmans, R.J., S.E. Cameron, J.L. Parra, P.G. Jones and A. Jarvis, 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25: 1965-1978.
  • Maidment, R., D. Grimes, R.P. Allan, E. Tarnavsky, M. Stringer, T. Hewison, R. Roebeling and E. Black (2014). The 30 year TAMSAT African Rainfall Climatology And Time series (TARCAT) data set. Journal of Geophysical Research 119 (18), 10,619–10,644.
  • Platts PJ, Omeny PA, Marchant R (2015). AFRICLIM: high-resolution climate projections for ecological applications in Africa. African Journal of Ecology 53, 103-108.


Use R to get gbif data into a GRASS database



The Global Biodiversity Information Facility (GBIF) is an international open data infrastructure that allows anyone, anywhere to access data about all types of life on Earth, shared across national boundaries via the Internet. GBIF provides a single point of access to species records shared freely by hundreds of institutions worldwide. The data accessible through GBIF relate to evidence about more than 1.6 million species, collected over three centuries of natural history exploration and including current observations from citizen scientists, researchers and automated monitoring programs.

There are various ways to import GBIF data, including downloading it directly from the website as a comma-delimited (CSV) file and using the addon for GRASS (I’ll post an example using this addon at a later stage). Here, however, I’ll use the rgbif package for R to obtain the data. In the link section some tutorials are listed that illustrate the use of other R packages.

Finding open data for the Netherlands

Open data is becoming increasingly important, and it offers considerable advantages, such as accountability, cost and time savings for users, easier knowledge sharing and increased efficiency in public services.

The importance of open data is increasingly recognized (see e.g., this blog article (in Dutch) and this and this report). However, to bank on such advantages, there is a need to increase awareness about open data and to make it easy to find and use.

Picture Pile; play and help science

There is a successor of Cropland Capture, Picture Pile, from the people behind Geo-Wiki. Like Cropland Capture, this tool / game uses a citizen science approach, in this case to track deforestation.

The game presents a series of side-by-side images of the same location several years apart and asks you: “Do you see tree loss over time?”. Options are yes, no or maybe.

Perhaps a bit to my surprise, this is fairly addictive and I love the idea behind it. And you can play it on your computer, tablet or phone. If you want to give it a try, go to .

Importing GLCF MODIS woody plant cover

The data set

The Global Land Cover Facility offers, amongst many other data sets, the MODIS Vegetation Continuous Fields data set for download. These are layers that contain proportional estimates for vegetative cover types (woody vegetation, herbaceous vegetation, and bare ground). As such they are very suitable for depicting areas of heterogeneous land cover.

Their MODIS products differ from the DAAC editions in that they come in GeoTIFF format, in geographic coordinates on the WGS84 datum, and in a tiling system designed to fit well with Landsat imagery. Currently collection 5 is available, which contains proportional estimates of woody vegetation cover for the years 2000 to 2010. It can be downloaded as tiles (195 in total) via an FTP server.

Below I’ll provide an example …

Importing data in GRASS GIS – an example


ISRIC, the Earth Institute at Columbia University, the World Agroforestry Centre (ICRAF) and the International Center for Tropical Agriculture (CIAT) have recently released a new data set of raster layers with various predicted soil properties. This data set is referred to as the “AfSoilGrids250m” data set. It supersedes the SoilGrids1km data set and comes at a resolution of 250 meters. The AfSoilGrids250m data (GeoTIFFs) are available for download under the Attribution 4.0 International (CC BY 4.0) license. See this page for download information.

In this post I’ll show you how you can import this data set into a GRASS GIS database.

Online data sources: the global width database for large rivers

I came across this interesting data source, and thought I might as well share it.

Description: A global database of the width of large rivers (GWD-LR). The river width is derived from “satellite-based water masks and flow direction maps … by applying the algorithm to the SRTM Water Body Database (WBD) and the HydroSHEDS flow direction map. Both bank-to-bank river width and effective river width excluding islands are calculated for river channels between 60S and 60N”. The results are evaluated against the existing data on the river width of the Congo and Mississippi Rivers.