PostGIS and QGIS: Powerful Partners
QGIS Edinburgh, May 2015
This workshop aims to introduce users to working with OS data within PostGIS and QGIS.
Prerequisites
We will start with PostGIS and get some data loaded before continuing on to use QGIS.
I have found over the years there are always a few different ways to do anything when it comes to software and computers, in this guide we will generally stick with using the GUIs rather than the command line.
1) Creating a PostGIS Enbaled Database
To begin with we should create a PostGIS enabled database which will hold our spatial data. To do this you should open PGAdminIII.
Connect to your PostgreSQL server by double clicking the "PostgreSQL 9.3 (localhost:5432)" Server (this entry may be slightly different). It may ask you for your Postgres password which you entered in during the installation.
Depending on your view of security and whether this is just a local development platform you could click the box to save the password.
Once you have successfully connected to the server you can expand the database and you should see a sole entry called "postgres"
N.B I have already created my new OS database
Right hand mouse click on the "Databases" and select "New Database"
Call the new database "osdata" and set the owner to "postgres"
This will now create the database
We now need to enable the PostGIS extension on the database to give it all the extra spatial functionality.
Select your 'osdata' database and then click the SQL button on the top toolbar
A new window will open and look like this
This is our SQL Editor, where we type any SQL queries.
To add the PostGIS extension type the following into the SQL Editor
CREATE EXTENSION postgis;
To run this we use the green play button on the toolbar
You should get the following response
Query returned successfully with no result in 1174 ms.
If you do not then your installation of PostGIS may have not worked, so try and install that again and then run the above SQL again.
2) Loading data into your PostGIS database
Depending on your data there are many different methods for loading your spatial data into PostGIS. This guide will not cover them all. A good starting point is this summary post from Matt Walker found here . This is a summary of the PostGIS Day event held in November 2014.
My three methods of choice are the following
-
Use the shp2pgsql-gui that is installed when PostGIS is installed. Here is a link to a good tutorial found here
-
Write batch scripts or python loading scripts that use the GDAL/OGR library.
-
Restore Postgres dump files - data that has already been loaded into a PostGIS database can be exported to a dump file. This is very useful for doing backups. However, these dump files can be restored to a different database.
2.1) shp2pgsql-gui Loader
Let us start by using the shp2pgsql GUI loader to load some background context data. We will use Ordnance Survey's Strategi dataset for this. You can order and then download this dataset from OS OpenData download portal
Once you have downloaded the zip file, extract it to a folder.
Now in PGAdminIII click the 'Plugins' toolbar drop down and select 'PostGIS Shapefile and DBF Loader'. The loader will then open.
To load data, click the 'Add File' button and browse to the Strategi ESIR Shapefiles.
As you can see from the image above the data is all listed but before we can click the 'Import' button we need to change a few things to make sure the data loads properly.
We need to change the SRID code from '0' to '27700'
We also need to change the encoding from the standard UTF-8 to LATIN1. Click the 'Options' button and in the new dialogue
change the DBF file character encoding to LATIN1. This is because some of the Strategi data layers, text, contain special characters for Welsh names.
Once you have made the changes click the 'Import' button
It should only take a few moments to load all the data into our database.
Lets now turn to use QGIS to look at the data
3) Loading data into QGIS
Start by firing up QGIS
On the left hand side toolbar click the 'PostGIS' button
The 'Add PostGIS Table' dialogue box will open
Click 'New' and the 'Create a New PostGIS connection' dialogue box will open
And fill in the following details:
- Name: Local PostGIS
- Service: leave blank
- Host: localhost
- Port: 5432
- Database: osdata
- Username: postgres
- Password: enter your password
Click 'Test Connection' and if all your details are correct it should tell you that QGIS successfully connected to your PostGIS database.
Tick the 'Save Username' and 'Save Password' check boxes.
Then click 'Ok'
Back at the 'Add PostGIS Table' dialogue box, click the 'Connect' button and it will list the PostGIS tables that we created for Strategi above.
Select the following layers and rearrnage them in the QGIS Layers window:
- Settlement Seed
- Motorway
- Primary Road
- A Road
- Railway Line
- Rivers Line
- Coastline
- Lakes Region
- Urban Region
- Woodland Region
- National Park
Now that is not a very inspiring map!
It could do with some proper cartographic styling!!!!
Luckily Ordnance Survey did not just release the data they have also released stylesheets for the data as well. These can be found on the Ordnance Survey GitHub page. Download the zip file containing the stylesheets for Strategi and unzip the folder.
Back in QGIS we can now attach the relevent stylesheets to each data layer.
Right hand mouse click a layer and click 'Properties' and the 'Layer Properties' dialogue box will open.
Towards the bottom left of the gray area there is a 'Style' drop(up) down list, click that and click 'Load Style > Load from file'. Use the browser dialogue window to browse to the downloaded stylesheets and select the correct stylesheet for the layer.
Repeat that process for all layers until you get something like this:
Wow that is now a proper map!!
However, how time consuming was it to keep adding those stylesheets for each layer. And that is just for Strategi, what happens if you had Meridian2, OS VectorMap District, OS OpenMap Local etc to do as well.
Surely there must be a better way.....
3.1) Saving styles to PostGIS
Luckily the clever QGIS developers have realised that there should be a better way.
Since version 2.0 of QGIS you can save the styles to the database which is extremely useful as when you load in a data layer the data will already be styled.
Lets give this a try:
Open the 'Layer Properties' like we did previously but this time click 'Style' and then 'Save Style > Save in database' a new dialogue box will appear.
Give the style a name, for example
- 'BD_Strategi_SettlementSeed'
and then click the 'Use as default style for this layer' if that is what it should be.
You should receive a dialogue box saying 'Style Saved'.
Warning
In the past I have received some errors when QGIS tries to save the style. However, the most recent version of QGIS seems to have fixed some of the bugs
Repeat the above process for the other layers.
Now to test this out start a new blank project in QGIS.
Then load back in the layers you just had
HOOORAAAAHHHH
Pre-styled data!!
4) Load more data
As I mentioned before one of the other methods I use to load data into PostGIS is to restore PostgreSQL dump files. Within OS we use this technique to make sure we all have the latest data on our laptops.
I have already create a PostgreSQL dump file of one of our new OpenData datasets called OS Open Names. This can be downloaded from here
This file is 150Mb so should download pretty quickly.
Once downloaded we now need to restore this to our database.
4.1) Restore PostgreSQL dump file
There are two methods to load in this type of file,
- use pg_restore from the command line,
- use PGAdminIII
We will stick to using PGAdminIII, so lets reopen it if you closed it previously.
Now to restore the file we right hand mouse click on the database we are using, in this case 'osdata', and choose the 'Restore' option.
Click the '...' button on the same line as 'Filename' and a browser window will open and navigate to where you saved the dump file you just downloaded. To actually see the file you will need to change the file type it is looking for from 'Backup' to 'All Files'.
Select the 'os_open_names.dump.gz' file, then click 'Restore'
On my laptop this restore took a little over 3 minutes.
To see the data expand the 'osdata' database and you will see a new schema called 'osnames' and if you expand that you should see a single table.
Lets take a look at this data by right hand mouse clicking the table, and choosing 'View Data' and then 'View Top 100 Rows', and a new window will appear showing you the data.
The beauty of this method is that if the original data included indexes on any attributes or geometry then these will also be re created during the restore process.
To test this lets run an SQL to find some records.
4.2) Querying the data
So lets open an SQL window like we did earlier in the workshop and run the following command:
SELECT * FROM osnames.names WHERE name1 = 'Southampton'
It should return a single record in around 31ms.
Wow thats quick.
Why is it so quick, well the answer is that the field 'name1' as had an index built on it.
What are Indexes?
Indexes are a common way to enhance database performance. An index allows the database server to find and retrieve specific rows much faster than it could do without an index. But indexes also add overhead to the database system as a whole, so they should be used sensibly. (Source: PostgreSQL documentation)
So what happens if we didn't have that index. To test this let us remove the existing index.
Expand the table within PGAdminIII and expand the Indexes and you will see we have three indexes.
Right hand mouse click on the 'osnames_name1_idx' and choose 'Drop Cascaded' and select 'Yes' from the pop up window.
Let us run the same query above and see how long it now takes without an index.
My query took almost 2 secs.
That is a huge difference and that is on a dataset of only 2.5 million records.
Let us add the index back in by running the following SQL
CREATE INDEX osnames_names1_idx ON osnames.names (name1);
COMMIT;
4.3) Modifying the data
Within OS Open Names there is a field called 'name1' which is defined in the User Guide as:
*The proper noun that applies to the real world entity
You will find the names of locations, postcodes and road names within this one field.
Let us search for Cardiff
SELECT * FROM osnames.names WHERE name1 = 'Cardiff'
Oh dear no results? Why?
That is because the field name1 contains the local name, for example Welsh, whilst name2 will contain the English version, if required.
So let us try that query again.
SELECT * FROM osnames.names WHERE name2 = 'Cardiff'
Excellent, we get a result and we can see the Welsh name for Cardiff is 'Caerdydd'
When it comes to displaying this information on the map we may often see
Caerdydd/Cardiff
Can we achieve this?
Yes, we can by using a few simple SQL queries
ALTER TABLE osnames.names ADD COLUMN name varchar;
COMMIT;
UPDATE osnames.names SET name = ARRAY_TO_STRING(ARRAY[name1, name2], '/');
COMMIT;
CREATE INDEX osnames_name_idx ON osnames.names (name);
COMMIT;
The key SQL here is the ARRAY_TO_STRING which concatenates any fields together with a delimiter you specificy. For example
- Caerdydd/Cardiff
If however a field is blank it is clever and will miss it out, so if you look at the data there will not be values in the new name field like
- /Southampton
4.4) LIKE Queries
Sometimes when we search for data from a table we may not know the exact value to search. For example, if I had known that the first 5 letters of Cardiff in Welsh were 'Caerd' I could have used the following query to find the result.
SELECT * FROM osnames.names WHERE name1 LIKE 'Caerd%';
This return a result for Cardiff and also for Caerdon.
But did you notice how long it took? Over 3 seconds
And yet name1 still has an index on it is clear using a LIKE query is extremely slow in comparison to a '=' query.
We could have also found Cardiff by querying our new name field with this
SELECT * FROM osnames.names WHERE name LIKE '%/Cardiff';
This took closer to 4 seconds which for any application is too slow when we have instantaneous search results from Google.
So how can we make this better?
4.5) Full Text Search (FTS)
A good introduction to FTS can be found in the PostgreSQL documentation found here. Some features of FTS are:
- lexical analysis or tokenization—breaking a block of unstructured text into individual words, phrases, and special tokens
- morphological analysis, or stemming—collapsing variations of a given word into one index term; for example, treating "mice" and "mouse", or "electrification" and "electric" as the same word
- ranking—measuring the similarity of a matching record to the query string
Source: Stack Overflow
Essentially it provides a better and more efficient way to search for text and PostgreSQL supports FTS by using a data type called ts_vector.
I have already created a FTS tsvector data column and populated it using 'name1' as it does quite some time to build and create the index.
I used the following code:
ALTER TABLE osnames.names ADD COLUMN search_fts tsvector;
COMMIT;
UPDATE osnames.names SET search_fts = to_tsvector('english', name1);
COMMIT;
CREATE INDEX osnames_search_fts_idx ON osnames.names USING GIN(search_fts);
COMMIT;
Now lets try our previous search and see how much quicker it is using FTS.
SELECT * FROM osnames.names WHERE search_fts @@ to_tsquery('english', 'Caerd:*');
The results should appear instantly.
So we now have lightning fast queries in PostGIS, wouldn't it be great to get that same functionality in QGIS???!!
5) PostGIS Search
Last year I wrote a QGIS plugin to provide me a way to query my database using FTS.
So to use it we need to install the plugin.
Launch QGIS
From the top toolbar click 'Plugins' then 'Manage and Install Plugins'
Then click 'Settings' on the left hand side
And tick the 'Show also experimental plugins'
Once you have done that then select 'All' on the left hand side and in the top search bar type 'PostGIS'
You should see 'PostGIS Search' appear and then select it and then click the 'Install Plugin' button in the bottom right hand corner.
A dialogue box will appear saying the plugin has successfully been installed.
Before you can use the plugin you need to setup the configuration so it know what database, table and column to search
5.1) Configure PostGIS Search
When you install any QGIS plugins there are installed to the following directories:
- UNIX/Mac: ~/.qgis2/python/plugins and (qgis_prefix)/share/qgis/python/plugins
- Windows: ~/.qgis2/python/plugins and (qgis_prefix)/python/plugins
Home directory (denoted by above ~) on Windows is usually something like C:\Documents and Settings(user).
Within the PostGIS_Search folder there is a postgis.ini file. This file contains the configuration that the plugin uses to know what database and column to query.
Open the postgis.ini in a text editor and the file should look like this:
Using the details of your PostGIS database and the osnames table update the ini file with the details. It should look something like this:
Save your edits and then re launch QGIS.
You should see the PostGIS Search toolbar button
5.2) Querying using PostGIS Search
We should now have everything we need to use FTS within QGIS.
Let us load in the data we had loaded previously which should come in styled thanks to saving the stylesheets to the database.
Click the 'PostGIS Search' button
And lets us type in 'SO16 0AS' and it should autocomplete to leave one result.
Click this and it will ask you to declare the data projection
Select OSGB 1936 / British National Grid EPSG:27700
QGIS will then move and zoom to the new location and add a feature to the 'Layers'
It is by now means perfect and if you would like to help improve the plugin the get in touch.
END
Updated less than a minute ago