1

Hello,

In a project I'm currently working on, I need to reverse geocode about 15,000 to 20,000 latitude/longitude coordinates. To be more specific, I'm not interested in the exact street address, only in the administrative entity for each coordinate: city or town level at most.

The Nominatim usage policy discourages bulk geocoding, and my number of requests is fairly large. However, I have a very simple single-threaded script for the job, which seems to stay within the limits of the usage policy. Speed is not critical, and setting up a local Nominatim server would be overkill for this task. Unfortunately, even commercial geocoding solutions are aimed at regular application use, whereas I'm really interested in the data itself.

So here are my main questions:

  1. Is it possible to reverse geocode about 20,000 coordinates using the public API? If not, what number of requests is acceptable under the usage policy?
  2. If it is, should I enforce something like a 1-second wait between requests to keep the load on the public servers low?
  3. Should I notify anyone about my bulk geocoding?

Sorry if exactly the same question has already been asked somewhere; I couldn't find one.

asked 22 Jan '14, 06:17

seqai

edited 22 Jan '14, 12:17

aseerel4c26 ♦

1
(24 Sep '16, 10:49) SomeoneElse ♦

2 Answers:
7

The Nominatim Usage Policy for OSM's own Nominatim instance says

As a general rule, bulk geocoding of larger amounts of data is not encouraged. If you have regular geocoding tasks, please, look into alternatives below. Smaller one-time bulk tasks may be permissible, if these additional rules are followed

  • limit your requests to a single thread

  • limited to 1 machine only, no distributed scripts (including multiple Amazon EC2 instances or similar)

  • Results must be cached on your side. Clients sending repeatedly the same query may be classified as faulty and blocked.

If you say that speed is not critical for you, then to be absolutely safe I'd insert a longer pause between requests - say, 5 seconds - which would take your total run time to about 1.5 days, and you could be sure not to tax OSM's servers too much.
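The rules quoted above (single thread, one machine, cached results, a pause between requests) can be sketched roughly as follows. This is only an illustration, not the asker's actual script: the endpoint URL, the `zoom=10` city-level setting, the contact string in `USER_AGENT`, and the names `fetch_place` and `reverse_geocode_all` are all assumptions chosen for the example.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint and contact string; replace with your own details.
NOMINATIM_REVERSE = "https://nominatim.openstreetmap.org/reverse"
USER_AGENT = "bulk-reverse-geocode-sketch (you@example.org)"


def fetch_place(lat, lon):
    """Ask Nominatim for the place at (lat, lon), at roughly city level."""
    params = urllib.parse.urlencode(
        {"format": "jsonv2", "lat": lat, "lon": lon, "zoom": 10}
    )
    request = urllib.request.Request(
        f"{NOMINATIM_REVERSE}?{params}",
        headers={"User-Agent": USER_AGENT},  # identify yourself to the server
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def reverse_geocode_all(coords, cache, fetch=fetch_place, pause=5.0,
                        sleep=time.sleep):
    """Resolve coordinates one at a time, pausing after every real request."""
    for lat, lon in coords:
        key = f"{lat:.5f},{lon:.5f}"
        if key in cache:
            continue  # already answered: never send the same query twice
        cache[key] = fetch(lat, lon)
        sleep(pause)  # single thread plus a fixed pause keeps the load low
    return cache
```

Writing `cache` to a JSON file every so often would let an interrupted run resume without repeating requests. With a 5-second pause, 20,000 requests spend about 27.8 hours in pauses alone, before any response time is counted.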

You could also look into MapQuest's Nominatim server, whose usage policy is somewhat more relaxed than OpenStreetMap's (they have bigger machines).

answered 22 Jan '14, 06:48

Frederik Ramm ♦

edited 22 Jan '14, 06:48

2

I think that's currently the best solution for me. If constant requests for almost 2 days are not a problem, I'll go for it. The script is really tiny, and I can run it for a couple of days even on my ancient notebook. From MapQuest's usage policy I understood that they're oriented more towards software and web developers, so I'm not sure that one-time bulk geocoding falls under it. Thank you very much!

(22 Jan '14, 07:02) seqai

@seqai: By the way, there seem to be problems with the osm.org Nominatim server today, so I suggest not starting today. Here are the stats if you want to look for low-load times or problems.

(22 Jan '14, 12:16) aseerel4c26 ♦

@seqai: Oh, and since you say that you have a simple script: take care to set a proper user agent (maybe your email address or a reference to this question here?). This is mentioned in the "requirements" on the policy page which also mentions one request per second as maximum.

(22 Jan '14, 12:24) aseerel4c26 ♦

@aseerel4c26 Thank you for letting me know about the problems. My script was already running, though, and it has loaded 1510 coordinates with a 5-second pause between requests without any problems. I've included exactly the information you mentioned (email and a link to this question, as the policy requests) in the User-Agent. For now I'll pause and wait until everything is OK.

(22 Jan '14, 12:43) seqai
5

To find out which country and city a coordinate is in, you can use the Overpass API. Please see this example. You can run it with the "Execute" button.
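The linked example is not preserved in this text, so the following is not the answerer's query, only a minimal sketch of the same idea. It assumes the Overpass QL `is_in` statement filtered to areas that carry an `admin_level` tag, and the public endpoint `https://overpass-api.de/api/interpreter`; the function names are made up for the illustration.

```python
import urllib.parse

# Assumed public Overpass instance; other instances exist.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_is_in_query(lat, lon):
    """Build an Overpass QL query for the administrative areas around a point."""
    return (
        "[out:json];"
        f"is_in({lat},{lon});"    # all areas that contain the coordinate
        "area._[admin_level];"    # keep only areas tagged with admin_level
        "out tags;"               # tags only, no geometry
    )


def build_request_body(lat, lon):
    """Encode the query as the form body for a POST to OVERPASS_URL."""
    query = build_is_in_query(lat, lon)
    return urllib.parse.urlencode({"data": query}).encode("ascii")
```

The body returned by `build_request_body` would be sent as a POST to `OVERPASS_URL`; the `admin_level` and `name` tags in each returned element then indicate the country, region, or municipality.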

answered 22 Jan '14, 06:52

Roland Olbricht

question asked: 22 Jan '14, 06:17

question was seen: 11,903 times

last updated: 24 Sep '16, 10:49
