PostGIS

From Wildsong
Jump to navigationJump to search

Given: I have to be able to manage data in a SQL database server. I can't hack geoprocessing on the desktop any more. The datasets are too large. And there are too many! I need a catalog.

Building enterprise geodatabase server

The hardware

At home: a measly Intel Atom with 2GB of RAM, a 32GB SSD, and a WDC 1TB Green drive.

At work: 2 dual core Xeon processors, 12 GB of RAM, 3Ware RAID with four 1TB drives.

The software

Operating system: I first tried Debian, which was fine for PostgreSQL/Postgis but when I wanted to install ESRI ArcGIS Server to test it, I changed over to CentOS 5.5. ESRI requires Redhat Enterprise Linux but CentOS works just fine.

At home I am using Ubuntu Server 10.04, and experimenting with the latest stuff. At work I run the more staid editions so that I can be more confident that ESRI ArcGIS stuff will work.

Overall, what I need is

PostgreSQL 8.4
phpPgAdmin
Postgis
Proj4
GDAL, OGR
UMN Mapserver
python-mapscript
php5-mapscript

Ubuntu option

Installed from packages,

sudo apt-get install postgresql php5-pgsql\
 pgadmin3 pgadmin3-data\
 postgresql-client postgresql-contrib postgresql-doc\
 libproj-dev postgresql-dev-8.4 pidentd

The following extra packages will be installed:
  libossp-uuid16 pgagent postgresql-8.4 postgresql-client-8.4 postgresql-client-common postgresql-common postgresql-contrib-8.4 postgresql-doc-8.4

Not bad, but I want a newer postgis than the one in the Ubuntu repository (1.4).

So I downloaded an SVN snapshot of 1.52 and built it. I had already installed the GEOS package. http://trac.osgeo.org/geos/

 -------------- Dependencies -------------- 
  GEOS config:          /usr/local/bin/geos-config
  GEOS version:         3.2.2
  PostgreSQL config:    /usr/bin/pg_config
  PostgreSQL version:   PostgreSQL 8.4.4
  PROJ4 version:        47
  Libxml2 config:       /usr/bin/xml2-config
  Libxml2 version:      2.7.6
  PostGIS debug level:  0

 -------- Documentation Generation -------- 
  xsltproc:             /usr/bin/xsltproc
  xsl style sheets:     /usr/share/xml/docbook/stylesheet/nwalsh
  dblatex:              
  convert:              /usr/bin/convert

A couple utilities install into /usr/lib/postgressql/8.4 (shp2pgsql and pgsql2shp) but the bulk of Postgis is a set of SQL files that end up in /usr/share/postgresql/8.4/contrib

CentOS option

need to copy this from intranet wiki

Server management

Create user and password for phppgadmin

From command line... if you want to use database authentication

psql -U postgres
CREATE USER bwilson WITH PASSWORD 'jackalope';

To see the user table

SELECT * FROM pg_authid

Now user bwilson should be able to use phppgadmin

Template

http://geospatial.nomad-labs.com/2006/12/24/postgis-template-database/

Loading data into PostGIS

Lots of our data comes in the form of shapefiles, for this I can use shp2pgsql


For ESRI data then I think the best tack is to write a geoprocessing Python script - It can use the ESRI proprietary code to read any feature class using a cursor and load them into postgis.

To do this I need to be able to write to the PostGIS database from my Python script.

I guess I need a new page here.... Loading data into PostGIS

Setting up a new PostGIS database

su - postgres
psql
CREATE DATABASE "WA_South_King" WITH OWNER=postgres TABLESPACE=pg_default;

After creating the database you have to enable it for storing geographic data. Enable PL/pgSQL procedural language extension.

createlang plpgsql -U postgres -d WA_South_King

Did it work?

createlang -l -d WA_South_King
Procedural Languages
  Name   | Trusted?
---------+----------
 plpgsql | yes

Yes! Add the postgis extensions to the database.

The old way

psql -U postgres -d WA_South_King -f /usr/share/postgresql-8.3-postgis/lwpostgis.sql

The new way (postgis 1.5) (where I put the file anyway LOL)

psql -U postgres -d test -f /usr/local/share/postgis.sql

Load some data

Convert city polygon shapefile into SQL commands

shp2pgsql /export/kilchis/temp/Ann/IncidentView/Data/WA/South_King_Fire/AGI_shapefiles/Cities_revised.shp cities > cities.sql

Load the data now by executing the SQL

psql -U postgres -d WA_South_King < cities.sql

Query data

Did the load work? Try a simple query.

psql -U postgres -d WA_South_King
SELECT name FROM cities;
     name
------------------
Shoreline
Lake Forest Park
Kenmore
Bothell
...

The glory! Now try a geometric query

SELECT name, AREA(the_geom) FROM cities ORDER BY AREA(the_geom);
      name       |       area
------------------+------------------
Beaux Arts       | 2308400.44773863
RUSTON           | 7069686.79758783
Hunts Point      | 8537598.97906794
Skykomish        | 9349760.74397126
...

Setting the SRID (Spatial reference ID)

First I look at the shapefiles to see what the PRJ portion says. Then I look up the SRID in the Proj file /usr/share/proj/esri

Cities is WA North = 102348 and Counties is WA South = 102349

New short way

UpdateGeometrySRID([<schema_name>], <table_name>, <column_name>, <srid>)

Long old way

ALTER TABLE cities DROP CONSTRAINT "enforce_srid_the_geom" RESTRICT;
UPDATE cities SET the_geom=ST_SetSRID(the_geom, 102348);
ALTER TABLE cities ADD CONSTRAINT "enforce_srid_the_geom" CHECK(SRID(the_geom)=102348);
UPDATE geometry_columns set SRID=102348 WHERE f_table_name='cities';
ALTER TABLE WA_Counties DROP CONSTRAINT "enforce_srid_the_geom" RESTRICT;
UPDATE wa_counties SET the_geom=ST_SetSRID(the_geom, 102349);
ALTER TABLE wa_counties ADD CONSTRAINT "enforce_srid_the_geom" CHECK(SRID(the_geom)=102349);
UPDATE geometry_columns set SRID=102349 WHERE f_table_name='wa_counties';

Spatial queries

I have also added counties, WA_Counties table. So I can see what cities fall inside King county. The problem is that these tables are now in different projections, so the query fails.

SELECT cities.name, WA_Counties.county_nm FROM cities INNER JOIN WA_Counties ON (cities.the_geom && WA_Counties.the_geom AND intersects(cities.the_geom, WA_Counties.the_geom));
ERROR:  Operation on two geometries with different SRIDs

I cannot use the transform() function though because my SPATIAL_REF_SYS table is not loaded with the ESRI values! Que mal!

Use this GDAL util to convert the ESRI codes to Proj.4 codes

cd src/FWTools-2.0.6
export PYTHONPATH=`pwd`/pymod
export LD_LIBRARY_PATH=`pwd`/lib
cd share
../bin/epsg_tr.py -postgis -list esri_extra.wkt > ~/esri.sql

Load them into the database.

su postgres
psql -U postgres -d WA_South_King < /home/AGI/bwilson/esri.sql

Now transform() should work.

Reprojection

CREATE TABLE "wa_counties_p" (gid serial PRIMARY KEY,
"county_cod" int2,"county_fip" varchar(3),"county_nm" varchar(15),
"ecy_region" varchar(4),"air_region" varchar(46));

SELECT AddGeometryColumn(,'wa_counties_s','the_geom','-1','MULTIPOLYGON',2);

ALTER TABLE wa_counties_p DROP CONSTRAINT "enforce_srid_the_geom" RESTRICT;

INSERT INTO wa_counties_p SELECT gid,county_cod, county_fip, county_nm, ecy_region, air_region, transform(the_geom, 102348) FROM wa_counties;

ALTER TABLE wa_counties_p ADD CONSTRAINT "enforce_srid_the_geom"CHECK(SRID(the_geom)=102348);

Exporting data from PostGIS

pgsql2shp

Integration with ArcGIS Desktop using ZigGIS

Resources

Postgis wiki http://trac.osgeo.org/postgis/wiki/UsersWikiMain

An article of interest

http://www.bostongis.com/PrinterFriendly.aspx?content_name=sqlserver2008_postgis_mysql_compare

Other

From some forum or other

"574 post(s)

  1. 23-Mar-09 20:06

I suppose I could share a few things that I have found out through trial and error. I am certainly no expert.

I am running on windows server 2003 32 bit, so if you are running 64 bit or *nix versions, the experiences could be very different.

1. Make sure you have the latest version of postgres / postGIS. In another thread you mentioned that you have only just installed it, so I would think that this is the case. Version Postgres 8.3 has a nice easy windows installer that makes it a piece of cake to install. The application Stack Builder picks up the latest postGIS version, so there is no need to worry too much about installing postGIS seperately.

2. I found that installing the Tuning wizard that can be found in the Application Stack Builder helps to configure the memory setting quite well. For my purposes, I didn't need to tweak any of the memory setting into the postgresql.conf file.

3. If you are planning on doing spatial queries on any datasets that you export to postgres, then make sure you have spatial indexes on the geometry field. When you export from manifold using the Postgres type, then these indexes are created automatically. I would suggest reading the postgis user manual, it is pretty comprehensive, and fairly easy to follow. It will give you a good idea of what is possible with spatial queries. It is available http://postgis.refractions.net/download/postgis-1.3.5.pdf

4. The postGIS email list is very helpful. See http://postgis.refractions.net/mailman/listinfo/postgis-users to join.

5. If you are doing joins between tables in the database, then you will have to manually set up the indexes. If you are using pgAdmin III (which I like), then before you run queries, you can see if your indexes are being used by using explain.

--sql (for use within postgresql (or in the database manager console in manifold)


explain select * from test limit 1000;


This will tell you what type of scan is being done. In the above case it is always going to be a sequential scan, due to the nature of the query. If you are doing a query like:

select * from "PARCEL_MAPPING" INNER JOIN "100K index" ON st_intersects("PARCEL_MAPPING".geometry, "100K index".geometry)

where "100K index"."Sheet_Name" = 'DANYO'


There is an explain button in the query editor in pgAdmin III which shows this in a graphic form so you can see pretty quickly if you aren't taking advantage of your indexes.

6. Also as mentioned above, it is a good idea to do vaccuum analyzes often as these help indexes perform better.

7. If you are using large drawings that have lots of drawing objects, then make use of the AOI windowing when linking the drawings. Also see http://forum.manifold.net/forum/t64537.18 for a handy tool to help out.

8. Change the extents() aggregate function in postgres. Manifold uses this when linking drawings (for which reason I don't know). It is VERY slow on large datasets. If you use the tool mentioned above, you don't need the extents function at all. All I did was rename the function so I could still use it if I had to in code and postGIS, but Manifold wasn't automatically trying to use it.

--sql (in postgres)

DROP AGGREGATE extent(geometry);


CREATE AGGREGATE _extent(geometry) (

 SFUNC=public.st_combine_bbox,
 STYPE=box2d

);

That is about all I can think of for the moment, but I am sure others have more tips.

Hope this helps

James"