Geocoding

From Wildsong
Brian Wilson

Latest revision as of 21:35, 14 April 2021

I've done work before on geocoding and geocoders but somehow never wrote a separate page for it. Recapturing what I've done now...

Geocoding Services

In alphabetic order. After a little testing I am going to try LocationIQ first.

Esri

Two options:

  1. Build your own and host it in ArcGIS Enterprise.
  2. Use their service: https://developers.arcgis.com/features/geocoding/

Google

I am just not a Google fan. I cannot update their database, so if a search fails, I cannot fix it in a timely or controllable fashion. It's hard to see what the usage limits and rates are. I know I could find them but I don't care right now. I've tried it out in the past; in fact I have a page with a demo in Wiki Maps. Go forth and research it yourself: https://developers.google.com/maps/documentation/geocoding/intro

LocationIQ

I signed up here using my day job email.

Home page: https://locationiq.com/

API: https://locationiq.org/docs#forward-geocoding

Based on Nominatim. They also offer map services, but I am not concerned about that right now.

The free tier allows 5,000 requests/day, with a rate limit of 2 per second.

Nominatim

Nominatim is the OpenStreetMap geocoder. If I put the right data into OSM, it works great for me. For example, when I lived in the Anderson Ungerman duplex, I added it to OSM. I moved to the Skopal Wilson residence and added it so I can still find my way home! Updates are near instantaneous; if you add an address you can immediately start searching for it in Nominatim. I have been adding historic houses around Astoria as a hobby.

Usage: https://operations.osmfoundation.org/policies/nominatim/

No fees, but limit your requests to one per second. Most likely not a problem.

API: https://nominatim.org/release-docs/develop/api/Overview/
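
A minimal sketch of a forward search against the Nominatim API, using only the Python standard library. The function names and the User-Agent string are my own placeholders, not anything from the Nominatim docs. The `base_url` parameter is there so the same code can point at a self-hosted instance instead of the public one.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public instance; swap in your own host if you run Nominatim yourself.
NOMINATIM_BASE = "https://nominatim.openstreetmap.org"


def build_search_url(query, limit=5, base_url=NOMINATIM_BASE):
    """Return a /search URL for a free-form query, asking for JSON output."""
    params = {"q": query, "format": "jsonv2", "limit": limit}
    return f"{base_url}/search?{urlencode(params)}"


def search(query, user_agent, base_url=NOMINATIM_BASE):
    """Send one search request and return the decoded list of matches.

    The usage policy asks for an identifying User-Agent, so the caller
    has to supply one (placeholder value, pick your own).
    """
    request = Request(build_search_url(query, base_url=base_url),
                      headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return json.load(response)


print(build_search_url("Battery Russell"))
```

Calling `search("Battery Russell", "my-geocoding-test/0.1")` would make one real request, so keep to the one-per-second limit when looping over a list of addresses.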

If you need to do a lot of geocoding, it's very, VERY easy to set up a Docker container with a geographic region in it. It took me about 30 minutes. After setting it up I made updates in OSM, so my local data is out of date already. And I found out how poorly reverse geocoding works in Clatsop County, LOL! See below on reverse.

OpenCage

I set up an account using GitHub (personal account).

OpenCage is a fusion of many services.

You can use up to 2500 geocodes per day for free, rate limited to 1 per second; not a problem for my current use cases.
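
A rough sketch of staying inside free-tier limits like these on the client side. The `Throttle` class is hypothetical (my own name and design), and the daily counter never resets on its own, so a long-running process would need to create a fresh one each day. The numbers plugged in at the bottom are the limits quoted on this page.

```python
import time


class Throttle:
    """Space calls at least `min_interval` seconds apart, up to a daily cap."""

    def __init__(self, min_interval, daily_cap,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.daily_cap = daily_cap
        self.clock = clock      # injectable so the class can be tested
        self.sleep = sleep
        self.used = 0
        self._last = None       # time of the previous call, if any

    def wait(self):
        """Block until the next request is allowed, then count it."""
        if self.used >= self.daily_cap:
            raise RuntimeError("daily quota used up")
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now
        self.used += 1


# Limits as quoted on this page for the free tiers.
opencage_throttle = Throttle(min_interval=1.0, daily_cap=2500)
locationiq_throttle = Throttle(min_interval=0.5, daily_cap=5000)
```

Call `opencage_throttle.wait()` right before each request; it sleeps only when the previous call was less than a second ago.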

Because it's a fusion, you can search for many things, not just OpenStreetMap addresses.

My extensive test results

A mix of addresses and descriptions, biased for Clatsop County, Oregon.

data                      | situs | OpenCage | Esri  | Google | Nominatim | LocationIQ | notes
32534 River Point Dr, OR  | yes   | yes      | Wrong | Wrong  | yes       | yes        | wrong house #
36534 Riverpoint Dr, OR   | no    | yes      | Wrong | Wrong  | yes       | yes        | wrong spelling
Anderson Ungerman         | *     | yes      | Wrong | 0      | yes       | yes        |
97103                     | *     | yes      | yes   | yes    | yes       | yes        |
Sutton Mountain           | *     | yes      | Wrong | yes    | yes       | yes        |
Soapstone Lake            | *     | yes      | Wrong | yes    | yes       | yes        |
Knappa                    | *     | yes      | yes   | yes    | yes       | yes        |
Hammond, OR               | *     | yes      | yes   | yes    | yes       | yes        |
Battery Russell           | *     | yes      | yes   | yes    | yes       | yes        |
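
Counting the "yes" cells per service makes the table easier to compare at a glance. This is just a tally sketch over the values transcribed from the table above; the situs column is left out because most of its rows are not applicable, and the variable names are my own.

```python
SERVICES = ["OpenCage", "Esri", "Google", "Nominatim", "LocationIQ"]

# One row per test query, cells in the same order as SERVICES.
RESULTS = {
    "32534 River Point Dr, OR": ["yes", "Wrong", "Wrong", "yes", "yes"],
    "36534 Riverpoint Dr, OR":  ["yes", "Wrong", "Wrong", "yes", "yes"],
    "Anderson Ungerman":        ["yes", "Wrong", "0",     "yes", "yes"],
    "97103":                    ["yes", "yes",   "yes",   "yes", "yes"],
    "Sutton Mountain":          ["yes", "Wrong", "yes",   "yes", "yes"],
    "Soapstone Lake":           ["yes", "Wrong", "yes",   "yes", "yes"],
    "Knappa":                   ["yes", "yes",   "yes",   "yes", "yes"],
    "Hammond, OR":              ["yes", "yes",   "yes",   "yes", "yes"],
    "Battery Russell":          ["yes", "yes",   "yes",   "yes", "yes"],
}


def tally(results):
    """Count how many queries each service answered with "yes"."""
    counts = {name: 0 for name in SERVICES}
    for row in results.values():
        for name, outcome in zip(SERVICES, row):
            if outcome == "yes":
                counts[name] += 1
    return counts


for name, hits in tally(RESULTS).items():
    print(f"{name}: {hits}/{len(RESULTS)}")
# OpenCage: 9/9, Esri: 4/9, Google: 6/9, Nominatim: 9/9, LocationIQ: 9/9
```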

* The situs address search is not designed to handle anything like this.

"yes" means the result showed up correctly either on a map or as a list item if there were multiple results.

"Wrong" means it gave me only one result that was clearly incorrect (for example, Sutton Mountain but in the wrong state)

Thoughts on these results:

  • I know I could make some of the results better by tuning queries, but that means more work for me!
  • I know that I can improve any service based on OSM by improving the data myself.
  • Conversely I have no way of improving the purely commercial/closed Esri and Google services.
  • I could set up my own Esri geocoding engine but that's more work for me and still costs money.
  • The base Nominatim services and the "fused" OpenCage service gave the same 100% result so at this time I don't see a benefit there.
  • I think I can easily set up a React component that will source any of the Nominatim-based services and then select which one I want at run time. They should all return the same results.
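
The run-time selection idea in the last bullet can be sketched without any UI at all. This is Python rather than a React component, purely to show the shape: one small table of providers and one function that builds the request URL for whichever one is picked. The `PROVIDERS` table and `build_url` are hypothetical names, and the LocationIQ endpoint and its `key` parameter are my assumption of how their API is laid out, so check them against the LocationIQ docs before relying on this.

```python
from urllib.parse import urlencode

# Nominatim-based services only; each entry says where to send the query
# and whether an API key is needed (and under which parameter name).
PROVIDERS = {
    "nominatim": {
        "url": "https://nominatim.openstreetmap.org/search",
        "key_param": None,
        "extra": {"format": "jsonv2"},
    },
    "locationiq": {
        # Assumed endpoint, verify against the LocationIQ API docs.
        "url": "https://us1.locationiq.com/v1/search",
        "key_param": "key",
        "extra": {"format": "json"},
    },
}


def build_url(provider, query, api_key=None):
    """Build the search URL for the provider chosen at run time."""
    config = PROVIDERS[provider]
    params = {"q": query, **config["extra"]}
    if config["key_param"]:
        if not api_key:
            raise ValueError(f"{provider} needs an API key")
        params[config["key_param"]] = api_key
    return f"{config['url']}?{urlencode(params)}"


print(build_url("nominatim", "Knappa"))
```

Adding another Nominatim-based service (a self-hosted instance, say) would then be one more entry in the table rather than new code.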

This echoes what I have found throughout my explorations of closed/proprietary services versus open data and open source: the closed services might be free or inexpensive, but the open services give me more control over quality and final results. I also find the amount of work is about the same either way. Phone support is available from the expensive services, but I generally don't find it helps to struggle along with a mediocre commercial service and then fall back on phone support when I can work everything out for myself with free/open source solutions.

Reverse Geocoding

Nominatim and OpenStreetMap data

Using Nominatim for reverse geocoding gives sketchy results. Ideally, you click on a house and it gives you the house number. In reality it gives you a nearby street name with no house number, and it frequently gets the street wrong. This is because it's using ancient TIGER data and the nearest OSM object. If someone has entered the polygon for the house and tagged it correctly, great! So if you live in a place where the mappers have entered and tagged every structure, it will work well. Where I am (currently Clatsop County), it does not.

I have to upload good structure data first. :-) Once I do that, it will be perfect.
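
A small sketch of how to spot the "street but no house number" case described above in a Nominatim reverse result. It assumes the JSON response carries an `address` object with `house_number` and `road` keys when those are known; the function names are my own and the sample dictionaries are made up for illustration, not real responses.

```python
from urllib.parse import urlencode


def build_reverse_url(lat, lon, base_url="https://nominatim.openstreetmap.org"):
    """Return a /reverse URL for a coordinate, asking for JSON output."""
    params = {"lat": lat, "lon": lon, "format": "jsonv2"}
    return f"{base_url}/reverse?{urlencode(params)}"


def describe(result):
    """Summarize how much of a street address a reverse result contains."""
    address = result.get("address", {})
    number = address.get("house_number")
    road = address.get("road")
    if number and road:
        return f"{number} {road}"
    if road:
        return f"{road} (no house number)"
    return "no street match"


# Made-up sample results, shaped like the "address" part of a response.
tagged_building = {"address": {"house_number": "123", "road": "Example Street"}}
road_only = {"address": {"road": "Example Street"}}

print(describe(tagged_building))  # 123 Example Street
print(describe(road_only))        # Example Street (no house number)
```

Running `describe` over a batch of clicked points gives a quick count of how many locations in an area have properly tagged structures.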

Esri ArcGIS

Accuracy in my admittedly difficult tests is poor. I am thinking I might need to set up my own geocoding service using county roads data, sigh.