The latest and greatest GRASS GIS 7.8.3 has been released. See the announcement on https://grass.osgeo.org/news/89/15/GRASS-GIS-7-8-3-released/. And if you haven’t updated to the 7.8 release series yet, make sure to check out all the great new features in GRASS GIS 7.8 on https://trac.osgeo.org/grass/wiki/Grass7/NewFeatures78.
Tag: GRASS GIS
Data exploration in GRASS GIS – boxplots
I am currently working on some exercises for which I need data about municipalities in the Netherlands. A good place to look for such data is the CBS (Dutch Central Bureau of Statistics). Among the data it offers are vector layers of the Dutch municipalities and neighborhoods, which include demographic data.
One of the first things I normally do when exploring new data is to look at the distribution of the data, for example by creating a histogram using the d.vect.colhist addon (see my earlier post). But what if I want to compare the distribution of different groups or samples? In such a case I find boxplots more convenient. However, there is no tool in GRASS GIS to create boxplots, so I had a look at the d.vect.colhist addon code and adapted it to create boxplots instead of histograms.
An example
Let’s for example look at the average population densities of the municipalities.
What if I want to compare the distribution of the average population density per Dutch province? You can install the addon (see the end of this post) and run d.vect.colbp on the command line or the console. This will open a window with different tabs.
In the first tab, you can define a column in the attribute table to plot (here BEV_DICHTH, which is the column with the population density) and a column that will be used to group the data (here provincie, which gives the name of the province each municipality belongs to). As you can see in the screenshot above, you have a few options to change the plot layout. In this case, I chose to rotate the x-axis labels so they do not overlap. The resulting plot looks like:
You can of course also use the command line. In this case I will plot the boxplots horizontally using the -h flag.
d.vect.colbp -h map=gemeenten@CBS column=BEV_DICHTH \
    where="AANT_INW > 1" plot_output=example_1.png \
    group_by=provincie order=ascending --overwrite
This will give you the plot below.
The addon does not provide further options to change the appearance of the plot, as the main idea is to use it for quick exploration of your data, similar to the other plotting tools in GRASS GIS. However, you can save the plot as an SVG file and edit it further in, e.g., Inkscape.
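If you want the numbers behind such a grouped boxplot rather than the picture, the grouping idea can be sketched in a few lines of plain Python. This is only an illustration and not the addon's code: the function name grouped_boxplot_stats, the group names and the density values are all made up, only the column names BEV_DICHTH and provincie come from the example above, and the quartile method used here is an assumption that may differ from what the plot shows.

```python
from collections import defaultdict
from statistics import quantiles


def grouped_boxplot_stats(records, value_key, group_key):
    """Return a five-number summary of value_key for each value of group_key.

    Hypothetical helper for illustration; each group needs at least two values.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec[value_key])

    stats = {}
    for name, values in groups.items():
        # Quartiles with linear interpolation between the sorted data points.
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        stats[name] = {
            "min": min(values),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(values),
        }
    return stats


# Invented toy rows mimicking an attribute table (not real CBS data).
rows = [
    {"provincie": "Province A", "BEV_DICHTH": 100},
    {"provincie": "Province A", "BEV_DICHTH": 200},
    {"provincie": "Province A", "BEV_DICHTH": 300},
    {"provincie": "Province A", "BEV_DICHTH": 400},
    {"provincie": "Province A", "BEV_DICHTH": 500},
    {"provincie": "Province B", "BEV_DICHTH": 1000},
    {"provincie": "Province B", "BEV_DICHTH": 3000},
]

summary = grouped_boxplot_stats(rows, "BEV_DICHTH", "provincie")
for province, s in summary.items():
    print(province, s)
```

The box of a boxplot spans q1 to q3 with a line at the median. Whiskers are often drawn at 1.5 times the interquartile range instead of at the minimum and maximum, which this sketch does not do.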
You can install the addon using g.extension:
g.extension d.vect.colbp
Any feedback will be most welcome. If you try it out and run into problems, please let me know (suggestions for improvements are of course also welcome).
Draw a histogram of vector attribute column in GRASS GIS
GRASS GIS has convenient tools to draw histograms of raster values. A similar tool to draw a histogram of values in a vector attribute table is lacking. But you can easily add this functionality by installing the d.vect.colhist addon by Moritz Lennert. Read this short post on Ecodiv.earth tutorials.
Hands-on course to GIS and Remote Sensing with GRASS GIS
The hands-on GRASS GIS course at ITC – University of Twente on November 3rd, 2017 was a great success. The course, organized by ITC and OSGeo.nl, offered a very nice introduction to GRASS GIS by Veronica Andreo and a guided tour about working with GRASS GIS by Sajid Pareeth.
As part of the course, we also developed three modules with hands-on exercises on raster time series processing, remote sensing image processing and spatial interpolation in GRASS GIS.
All the course materials are available online, so check them out and enjoy 🙂
GRASS GIS Jupyter notebooks
A great source of information about GRASS GIS is the GRASS Wiki. One example is this list with GRASS GIS Jupyter notebooks which was just added by Markus Neteler (no introduction needed I guess). There are some really nice tutorials there, which alone is reason enough to check out this list.
K-fold cross validation in GRASS GIS
A common technique to estimate the accuracy of a predictive model is k-fold cross-validation. In k-fold cross-validation, the original sample is randomly partitioned into a number of sub-samples with an approximately equal number of records. Of these sub-samples, a single sub-sample is retained as the validation data for testing the model, and the remaining sub-samples are combined to be used as training data. The cross-validation process is then repeated as many times as there are sub-samples, with each of the sub-samples used exactly once as the validation data (Table 1).
The k evaluation results can then be averaged (or otherwise combined) to produce a single estimation. The advantage of this method is that all observations are used for both training and validation, and each observation is used for validation exactly once.
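The procedure described above can be sketched in plain Python. This is a minimal illustration and not the code from the tutorial: the names k_fold_indices, cross_validate and fit_and_score are hypothetical, and fit_and_score stands in for whatever model fitting and scoring you want to evaluate (for example, an interpolation run followed by an error measure).

```python
import random


def k_fold_indices(n_records, k, seed=42):
    """Randomly partition the indices 0..n_records-1 into k folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_records, k, fit_and_score, seed=42):
    """Run k-fold cross-validation; return the k scores and their mean.

    fit_and_score(train_idx, test_idx) is a user-supplied callable
    (an assumption of this sketch) that trains on train_idx and
    returns one numeric score for test_idx.
    """
    folds = k_fold_indices(n_records, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # All folds except fold i are combined into the training set.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        scores.append(fit_and_score(train_idx, test_idx))
    return scores, sum(scores) / k


def held_out_fraction(train_idx, test_idx):
    """Placeholder score: the share of records held out in this round."""
    return len(test_idx) / (len(train_idx) + len(test_idx))


scores, mean_score = cross_validate(10, 5, held_out_fraction)
print(scores, mean_score)
```

Each index ends up in exactly one fold, so every record is used for validation exactly once and for training in the other k - 1 rounds.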
Functions for modelling and machine learning in, e.g., R and Python’s Scikit-learn often contain built-in cross-validation routines. But it is also fairly easy to build such a routine yourself. This tutorial shows how one can easily build a k-fold cross-validation routine in GRASS GIS, e.g., to evaluate the predictive performance of two interpolation techniques: inverse distance weighting and bilinear spline interpolation.
This tutorial is available on https://tutorials.ecodiv.earth.
GRASS GIS 7.2.1 released
After four months of development, the new update release GRASS GIS 7.2.1 is available. It provides more than 150 stability fixes and manual improvements compared to the first stable release version 7.2.0. An overview of new features in this release series is available at New Features in GRASS GIS 7.2. See the original announcement on the GRASS GIS website.
Plotting GRASS data in Python
GRASS GIS offers some useful but basic plotting options for raster data. However, for plotting data in attribute tables and for more advanced graphs, we need to use other software tools. In this tutorial I explore some of the possibilities offered by Pandas plot() and how we can further tune plots using the matplotlib / pyplot library.
GRASS and Pandas – from attribute table to pandas dataframe
Introduction
In this post I show how to import an attribute table of a vector layer in a GRASS GIS database into a Pandas data frame. Pandas stands for Python Data Analysis Library, which provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. For people familiar with R, the Pandas data frame is an object similar to the R data frame. Data frames are a lot like the most common way in which spreadsheets are used, with the data presented in rectangular form with columns holding variables and rows holding observations. An important characteristic is that the data frame, like a spreadsheet, can hold different types of data in different columns: numbers, character data, dates and so on.
Terrain attribute selection in environmental studies
Exploring species-environment relationships is important for, amongst others, habitat mapping, biogeographical classification, conservation, and management. And it has become easier with (i) the advance of a wide range of tools, including many open source tools, and (ii) the availability of more relevant data sources. For example, there are many tools with which it is relatively easy to create a wide range of derived terrain variables using digital elevation (DEM) or bathymetric (DBM) models. However, the ease of use of many of these tools, especially when used by non-experts, may lead to the selection of an arbitrary or sub-optimal set of variables. In addition, derived variables will often be highly correlated (Lecours et al. 2017).