Loading data into PostGIS
This page is now getting a little more attention and organization.
Refer to "PostGIS in Action" (Obe and Hsu) Chapter 7
EPSG codes that I regularly use
- EPSG:2913: NAD83(HARN) / Oregon North (ft)
- EPSG:2914: NAD83(HARN) / Oregon South (ft)
- EPSG:32026: NAD27 / Oregon North
- EPSG:32027: NAD27 / Oregon South
- EPSG:2992: Oregon Lambert (ft)
- EPSG:2994: NAD83(HARN) / Oregon Lambert (ft)
See http://spatialreference.org
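To see exactly what one of these codes means inside PostGIS itself, the spatial_ref_sys table can be queried (a quick check, assuming the database already has PostGIS installed):
psql -U postgres -d crbc_spatial -c "SELECT srid, srtext FROM spatial_ref_sys WHERE srid = 2913;"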
Loading shapefiles
First method: Loading shapefiles with shp2pgsql
Use the shp2pgsql command to create and load the table and then manually define the projection. Example:
shp2pgsql lowerclack_subbasins.shp | psql -U postgres -d crbc_spatial
The shp2pgsql command appears to ignore the PRJ file that defines projection, so I manually enter a command to define the projection like this:
echo "SELECT UpdateGeometrySRID('lowerclack_subbasins', 'the_geom', 2992);" | psql -U postgres -d crbc_spatial
There is also shp2pgsql-gui, included with the PostGIS source code, which can be used as a pgAdmin III plugin to load shapefiles directly from the pgAdmin tool. This looks elegant but I have not tested it yet.
Second method: Loading almost any data with ogr2ogr
I have information about using ogr2ogr to load GPX data into PostGIS on the GPX page.
Using ogr2ogr is convenient because you can load the data and reproject it in one step. Note that the destination comes before the source in this command. It is also convenient because you can load any format that ogr2ogr knows about; run ogr2ogr --formats for the list.
# Define desired projection using EPSG code, this one is Web Mercator.
PRJ="EPSG:900913"
# Define the source and destination
SRC=lowerclack_subbasins_merge.shp
DST="PG:user=postgres dbname=crbc_spatial"
# Do the work
ogr2ogr -t_srs $PRJ -f PostgreSQL "$DST" $SRC -lco GEOMETRY_NAME=the_geom
Do note the order is "destination source", not "source destination".
For Mapping Vietnam I want to load data from GIS Data Depot into PostGIS. At the moment I only want to do this so I have some data for testing. Later on I might even want these layers in a map.
The downloaded data are in ESRI "E00" format and have been compressed.
# Uncompress all downloaded files
for file in *.E00.GZ; do gunzip $file; done
# put the data into "Web Mercator" projection
PRJ="EPSG:900913"
DB=vietnam
# The original files were coverages, so I have to know something about
# what I want to transfer. For example for this file I want only PAL.
ogrinfo PPPOLY.E00
1: ARC (Line String)
2: CNT (Point)
3: LAB (Point)
4: PAL (Polygon)   <- This is the one I want.
ogr2ogr -t_srs $PRJ -f PostgreSQL \
    PG:"user=postgres dbname=$DB" \
    PPPOLY.E00 PAL \
    -lco GEOMETRY_NAME=the_geom -nln political_boundaries
Broken shapefiles
First off, if you have access to ArcGIS, run your shapefiles through the "Repair Geometry" tool to repair or remove broken features.
Secondly, apparently it is common for shapefiles to have columns that exceed the field widths declared in their DBF file. Use -lco PRECISION=NO to force them to load.
Thirdly, use -skipfailures to drop features of the wrong geometry type, for example a MultiLineString feature in a LineString shapefile.
ogr2ogr -skipfailures -t_srs $PRJ -f PostgreSQL "$DST" $SRC -lco GEOMETRY_NAME=the_geom -lco PRECISION=NO
I recall from the PostGIS book there is a better way to deal with this! I really don't want to simply drop the benighted features.
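One alternative that I believe avoids dropping them, untested here and not necessarily what the book suggests: force the layer to a multi geometry type with ogr2ogr's -nlt option, so mixed LineString/MultiLineString inputs all fit.
# Assumption: the troublesome layer is line work; promote everything to MULTILINESTRING.
ogr2ogr -t_srs $PRJ -f PostgreSQL "$DST" $SRC -nlt MULTILINESTRING \
    -lco GEOMETRY_NAME=the_geom -lco PRECISION=NO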
Check your work
The ogrinfo command can peek into your PostGIS database just as it can look into any spatial file it knows how to read. The -so option gives summary output instead of dumping every row to the screen.
ogrinfo -so PG:"user=postgres dbname=vietnam" political_boundaries
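A couple of other checks I find handy, done with plain psql (assuming the PostGIS metadata tables are in place):
psql -U postgres -d vietnam -c "SELECT f_table_name, srid, type FROM geometry_columns;"
psql -U postgres -d vietnam -c "SELECT count(*) FROM political_boundaries;"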
Loading data from OpenStreetMap
Using the excellent instructions from http://bostongis.com/, in particular the "Almost Idiot's Guide"...
First, get the data from CloudMade for California and Oregon
mkdir /green/GISData/OSM && cd /green/GISData/OSM
wget http://downloads.cloudmade.com/americas/northern_america/united_states/california/california.osm.bz2
wget http://downloads.cloudmade.com/americas/northern_america/united_states/oregon/oregon.osm.bz2
Create a lovely PostGIS database
createdb -U postgres osm -W -T template_postgis
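A quick sanity check that the new database really did inherit PostGIS from the template (my own habit, not part of the Boston GIS instructions):
psql -U postgres osm -c "SELECT postgis_version();"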
My installed-from-source PostgreSQL did not have hstore.sql, so I built it:
cd ~/src/GIS/postgresql-9.0.6/contrib/hstore
USE_PGXS=1 make && sudo USE_PGXS=1 make install
cd /usr/local/pgsql/share/contrib
Do this each time you create a database: add hstore support to the database
psql -U postgres osm < hstore.sql
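To confirm hstore actually made it into the database, a trivial cast is enough (again my own check):
psql -U postgres osm -c "SELECT 'highway=>residential'::hstore;"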
Build osm2pgsql
No need for any special configuration options, just build and install!
sudo apt-get install libxml2-dev libbz2-dev
cd ~/src/GIS
svn co http://svn.openstreetmap.org/applications/utils/export/osm2pgsql/
cd osm2pgsql
./autogen.sh
./configure && make && sudo make install
Using osm2pgsql
osm2pgsql oregon.osm.bz2 -d osm -U postgres \
    -S /green/bwilson/src/GIS/osm2pgsql/default.style --hstore
osm2pgsql california.osm.bz2 --append -d osm -U postgres \
    -S /green/bwilson/src/GIS/osm2pgsql/default.style --hstore
On my little server [[Bellman]], it took 628 seconds to load the Oregon data and 3302 seconds to load California. Conclusion: California is much bigger; in fact there is probably room for me there in Nevada County.
Nice to have some data in my PostGIS server.
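To see what osm2pgsql actually created, a quick look at the tables (assuming the default planet_osm_ prefix):
psql -U postgres osm -c "\dt planet_osm_*"
psql -U postgres osm -c "SELECT count(*) FROM planet_osm_point;"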
Now I need to get it showing up in Geoserver
In the ESRI world
- I want to be able to move data from ESRI proprietary formats into PostGIS.
- I want to be able to hand a set of ESRI compatible tools to my GIS Analyst co-workers so they can get the data into my fabled PostGIS data warehouse without my help. (Using ESRI "toolboxes" and ESRI Model Builder for example)
Note: I might be able to use an ArcSDE license (aka ArcGIS Server) to access PostGIS, but I regard that as so much extra work that it's not worth it. It also requires an ArcInfo license on the desktop, which I would prefer to avoid.
You only need an ArcGIS license here to read the proprietary ESRI formats. If your data are in shapefiles you don't need it; scroll up.
In fact, if your data are in newer (version 10) ESRI File Geodatabases, you can use GDAL 1.9.0 to access them; I can even do this with GDAL on Linux. But most of my FGDBs are older.
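For the record, this is roughly the ogr2ogr invocation I would expect to use against a v10 file geodatabase, assuming GDAL 1.9 was built with the ESRI FileGDB driver; the geodatabase and layer names here are made up:
ogrinfo -so Parcels.gdb                      # list the layers first
ogr2ogr -f PostgreSQL PG:"user=postgres dbname=crbc_spatial" \
    Parcels.gdb parcels -lco GEOMETRY_NAME=the_geom -nln parcels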
In fact you only need the license for a few things these days.
Iterate a feature class
Here is a simple Python script (ArcGIS 9.3 geoprocessor) to iterate over a feature class and read its geometry. The nice thing about using the ESRI code for this is that it does not matter what the data source is; it's just a 'feature class' that can be stored in a shapefile or a personal geodatabase or ArcSDE... etc...
#!/usr/bin/env python
import arcgisscripting

gp = arcgisscripting.create(9.3)

# Hard coded data source, for simplicity.
workspace = 'D:/AGIProducts/IncidentView_Data/Data/TEMPORARY_WORKSPACE/WA_King.gdb'
featureclass = 'test_Addresses_points'
gp.workspace = workspace

desc = gp.describe(featureclass)  # Get some metadata
shapefieldname = desc.ShapeFieldName
print "ShapeType %s" % desc.ShapeType
print "Indexed =", desc.HasSpatialIndex
print "Shape field", shapefieldname

fields = gp.ListFields(featureclass, '*')  # Get a list of the attributes

rows = gp.SearchCursor(featureclass)
row = rows.Next()
i = 0
while row:
    feature = row.GetValue(shapefieldname)
    # If this is a point, we can just grab its shape
    point = feature.GetPart()
    print i, point.x, point.y
    row = rows.Next()
    i += 1

del gp  # Release the geoprocessing object
# That's all!
Write the feature to PostGIS
So now I want to put the points into PostGIS... how to do that?
I probably could use the GDAL Python bindings, or I can probably figure out how to write the geometry directly to PostgreSQL.
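One way that should work, sketched below: skip the GDAL bindings and use psycopg2, letting PostGIS build the geometry with ST_MakePoint. This is my own untested sketch; the table name, SRID, and coordinates are invented for the example.
#!/usr/bin/env python
# Sketch only: push (x, y) points into PostGIS via psycopg2.
# Assumes a table created something like:
#   CREATE TABLE test_points (id serial PRIMARY KEY);
#   SELECT AddGeometryColumn('test_points', 'the_geom', 2913, 'POINT', 2);
import psycopg2

points = [(7649000.0, 685000.0), (7649100.0, 685050.0)]   # made-up coordinates

conn = psycopg2.connect("dbname=crbc_spatial user=postgres")
cur = conn.cursor()
for x, y in points:
    # Let PostGIS construct the point and stamp it with the SRID.
    cur.execute(
        "INSERT INTO test_points (the_geom) "
        "VALUES (ST_SetSRID(ST_MakePoint(%s, %s), 2913))",
        (x, y))
conn.commit()
cur.close()
conn.close()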
References
Paulo Corti's notes on Migrating shapefiles to PostGIS from "Thinking in GIS"