ISRIC, Earth Institute, Columbia University, World Agroforestry Centre (ICRAF) and the International Center for Tropical Agriculture (CIAT) have recently released a new data set of raster layers with various predicted soil properties. This data set is referred to as the “AfSoilGrids250m” data set. It supersedes the SoilGrids1km data set and comes at a resolution of 250 meter. The AfSoilGrids250m data (GeoTIFFs) are available for download under the Attribution 4.0 International (CC BY 4.0) license. See this page for download information.
In this post I’ll show you how you can import this data set in a GRASS GIS database. I’ll assume you have already downloaded all data. The data comes as compressed geotiffs accompanied with xml metadata files. Now, how do I get the data in GRASS? Well, the first question would be; where do you want to import it in? In an already existing GRASS database? And if yes, in which location and mapset? If you have by this time no clue what I am talking about, it is time to read a bit more about the GRASS GIS database structure. This wiki page is a good start. It may seem a bit complicated at first, but trust me, it offers significant advantages later on when you have to work with the data.
Getting to know the data
So, now that you have a better idea of how data is organized in a GRASS GIS database, let’s move on. Here I will assume you want to create a new location and mapset for your data in an existing database (let’s call it GRASSdb). The first step is to decompress one of the compressed geotifs. Be warned, some of the compressed files are over 400MB, decompressed the largest are over 3GB! We use the gzip command line tool to unzip the file (you may want to use your own favourite tool).
cd AFsoil gzip -d af_ALUM3S_T__M_xd1_250m.tif.gz
As a first step, it may be a good idea to get some more information about the file. You can use gdalinfo for that. This is a gdal command line tool. GRASS GIS depends on GDAL, so this tool should be available from your command line (Note for Window users; make sure to use the GRASS command line console).
This will give you information about amongst others the projection, the raster extent and resolution. Below the first few lines to give you an impression.
Driver: GTiff/GeoTIFF Files: af_ALUM3S_T__M_xd1_250m.tif Size is 29501, 31505 Coordinate System is: PROJCS["unnamed", GEOGCS["WGS 84", DATUM["WGS_1984", SPHEROID["WGS 84",6378137,298.257223563, AUTHORITY["EPSG","7030"]], AUTHORITY["EPSG","6326"]], PRIMEM["Greenwich",0], UNIT["degree",0.0174532925199433], AUTHORITY["EPSG","4326"]], PROJECTION["Lambert_Azimuthal_Equal_Area"], ...
Cut out the region of interest (optional)
If you want the data for the whole of Africa, you just skip this step. But what if you only need data for a specific region or country? You can cut out the area you need using pre-processing tools (e.g., gdal_translate), or post-process your data in GRASS GIS after importing the whole data set.
Below, I will use the command line tool gdal_translate to create a subset of the raster before importing the data in GRASS. To determine the region of interest I’ll use the coordinates of the corner points of my region of interest (eastern Africa). You can use your favourite tool to find out those coordinates, but make sure they are in Lambert Azimuthal Equal Area projection!
in=af_ALUM3S_T__M_xd1_250m.tif out=`echo $in | sed -e 's/__/_/g'` gdal_translate -projwin 217400 1133200 3054750 -2536250 $in $out
Note that I used the command line tool sed to create the output name based on the input name. A bit over the top perhaps, but it will come in handy when we need to repeat this for all layers.
Create a new location and mapset
OK, now we know a bit more about the data, and (optionally) have created a subset of the data, let’s move on and create a new location. You can do this using the following steps:
1) Open GRASS and select the option to create a new location
2) Enter a name and optionally a description. In the example below, I named the new location Datasets_LAEA, which is the location I will keep all source data in Lambert azimuthal equal-area projection. Note that we learned about the projection using the gdalinfo tool earlier.
3) Select the method to create a new location. In this case, we want to read the projection and datum terms from the georeferenced file.
4) In the next window, you can select the georefernced file. This is the geotif file we decompressed at the start.
5) In the next window you are presented with a summary of the new location you are about to create. If it all looks good, click finish.
6) Now you are asked if you want to import the data layer that you used to create the location. If you select yes, the data layer will be imported in the PERMANENT mapset. The default region extent and resolution will be set based on the imported data layer.
If you select no, for example because you want to import the data in another mapset, you will be presented with the option to set the extent and resolution manually.
Select yes and you will be presented with the screen below. The pre-filled extent and resolution are based on those of the layer you used to create the location. But you can use your own values if you like.
7) As a last step, you are asked if you want to create a new mapset. Below I create the mapset ‘AfSoilGrids250m’.
And that was it, you just created a new location and mapset. Now we can continue with importing the data layers in your newly created mapset.
Importing the data using the GUI
If you have all your raster layers in one folder, GRASS GIS makes it easy to import them all in one go. In the menu, go to File – Import raster data – Common format imports, and you will see the following screen. There are various options for importing data, including importing all layers in a directory. This is the option I have selected below. It will list all files with the selected extension (.tif here) and create Raster names based on the file names (you can change the names by clicking on them). To select all layers for import, right click on any of the layers and you’ll get the option to select or deselect all layers at once.
Although this is a very easy way to import layers, it is not the way I’ll import the AfSoilGrids250m layers. Remember that each of these layers is somewhere between 1 and 3.8 GB when uncompressed. With 130 layers that is roughly 200GB. I don’t have that much free space on my hard disk, so I will need another approach.
Importing the data using the command line (batch import)
First step is to open a terminal and go to the directory where you keep the downloaded data. The shell script below will take the compressed layers one by one, (i) decompress them, (ii) cut out the region of interest, (iii) import that in the GRASS GIS database and (iv) delete the temporary tif files.
cd /media/HD2/Data/AFsoil/ FILES=`ls -a *.gz` for f in $FILES do in=`echo $f | sed -e 's/.gz//g'` out1=`echo $in | sed -e 's/__/_/g'` out2=`echo $out1 | sed -e 's/.tif//g'` gzip -d $f gdal_translate -projwin 216000 1134000 3055500 -2537000 $in $out1 r.in.gdal input=$out1 output=$out2 memory=2047 rm $in $out1 done
Now just wait a bit, and you’ll have all your data in your GRASS GIS database. And yes, it is here that using sed to create the output name based on the input name comes in handy.
And a final screenshot
And let’s end this tutorial with a screenshots of one of the data layers I have just imported.
As a side note, doesn’t that histogram look great? One of those options I knew about but never really tried. You can simply select to include it when you add a legend to the map canvas; just tick the option to add a histogram.