Geodatabases
Introduction to Geodatabases
These are notes written as I attempt to learn to use ESRI geodatabases, specifically the variation known as the "personal geodatabase" (or PGDB).
Brian Wilson 16:10, 15 January 2006 (PST)
Scope
I am not attempting to duplicate the ESRI documentation. As I learn new things about geodatabases, I will write about them here to cement them more firmly in my head, and so that I can refer back to them later.
Links to ESRI geodatabase stuff
ArcGIS Desktop documentation -- The "Geodatabase Workbook" contains a quickstart guide and exercises. The "Building Geodatabases" book is more complete.
These two books are the sources for most of my knowledge of geodatabases. There is also a book from ESRI Press, "Designing Geodatabases", but you should really get the above (free) docs under your belt before looking at it.
Two main types of ESRI geodatabase
Multiuser Geodatabase -- if you use ArcSDE-based geodatabases, you get version control and multiuser features. ArcSDE is the 'application tier': a layer of software that contains the geospatial features required to implement a geodatabase on top of various backend databases (Oracle, SQL Server, DB2, etc.), which might or might not have any built-in spatial features.
Personal Geodatabase -- based on the MS Access ("Jet") database format; allows many of the same features as SDE for an individual user. No support for multiuser or versioned access. (In fact, I have seen Jet files blown apart when more than one user writes to them... I have heard people claim that multiuser access to Jet files is fine, but my own bad experiences belie this.)
Terminology
An informal comparison of geodata formats
Data model = a template that you can use to build a geodatabase, including documentation and suggested feature classes and topology rules
Feature Dataset = Feature class(es) + topology + network objects. A PGDB can contain multiple Feature Datasets.
Feature Class = a table containing spatial data + attributes
Table = rows and columns of data; can contain spatial fields
Relationship class = binds tables together; can contain additional data
Topology = rules defining requirements for data to be stored in a dataset
Geometric network = topology rules defining how spatial features are connected
PGDB -- you can keep all the data for a project in one PGDB.
Vector data -- Network Survey data
Raster data (ArcGIS 9.0+) --
Raster catalog -- You can have multiple rasters in a feature class, for example to store aerial photos when you don't want to create a mosaic.
Raster time series -- You can have multiple rasters in a feature class, ordered as layers and sorted by time. You can also use the ordering to control how overlapping rasters are layered.
Annotation --
Metadata --
Tabular data --
Topology rules --
XML (ArcGIS 9.0+) -- XML can be used to import and export data. The geodatabase XML schema is documented on the ESRI site.
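To keep the terminology above straight in my head, here is a minimal sketch of how the pieces nest, written as plain Python dataclasses. This is not the ESRI API; every class, field, and file name below is hypothetical and chosen only to mirror the definitions in this section (a PGDB holds Feature Datasets and tables, and a Feature Dataset holds feature classes plus topology rules).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeatureClass:
    """A table containing spatial data plus attributes (hypothetical model)."""
    name: str
    geometry_type: str  # e.g. "point", "line", "polygon"
    attribute_fields: List[str] = field(default_factory=list)


@dataclass
class FeatureDataset:
    """Feature class(es) plus the topology rules that apply to them."""
    name: str
    feature_classes: List[FeatureClass] = field(default_factory=list)
    topology_rules: List[str] = field(default_factory=list)


@dataclass
class PersonalGeodatabase:
    """One container for a whole project's data."""
    path: str
    feature_datasets: List[FeatureDataset] = field(default_factory=list)
    tables: List[str] = field(default_factory=list)

    def feature_class_names(self) -> List[str]:
        # Walk every Feature Dataset and collect the names of its feature classes.
        return [fc.name for ds in self.feature_datasets for fc in ds.feature_classes]


def build_example() -> PersonalGeodatabase:
    # Made-up sample data, only to show the nesting.
    parcels = FeatureClass("parcels", "polygon", ["owner", "zoning"])
    streets = FeatureClass("streets", "line", ["name"])
    cadastral = FeatureDataset(
        name="cadastral",
        feature_classes=[parcels, streets],
        topology_rules=["parcels must not overlap"],
    )
    return PersonalGeodatabase(
        path="project.mdb",
        feature_datasets=[cadastral],
        tables=["owners"],
    )


if __name__ == "__main__":
    gdb = build_example()
    print(gdb.feature_class_names())
```

The point of the sketch is only the containment: topology rules live with the Feature Dataset rather than with any single feature class, which is what lets them describe how feature classes relate to one another.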
Here we go
You can think of a PGDB as simply a container into which you shove a bunch of shapefiles. That's fine, you can use them that way. But if you do, you are missing out on half the fun.
Shapefiles are actually very limited. When you are trying to use GIS to describe geospatial data, does one shapefile do it for you? No; you have to have a set of shapefiles, tied together by an ArcView project (MXD file) and some extra notes and documentation (in a file or on Post-it notes on your monitor frame).
Tips and techniques
Importing CAD data
Using spatial adjustment tools
Attribute transfer
Topology rules