Getting specific POI data - and keeping them up to date.

This is a follow-up to another question on this site, where someone asked about getting (specific) POIs from the OSM data. The suggested ideal method is to get a recent copy of the Planet file and use osmosis to extract the data you're interested in.

I heartily agree that this is the preferred approach, especially since XAPI has been only intermittently available and so slow that it is no longer usable, although the new Java implementation may improve the situation.

One of the unique qualities of OpenStreetMap is its continuous updating by all those thousands of contributors around the globe. Ideally you would want to reflect that in your POI extract. My question is:

What is the workflow for keeping an up-to-date OSM-based POI database that performs well?

To make this a little more concrete, here's what I currently do:

1. Get an initial planet file from OSM.
2. Extract the POIs that I want to be available using the osmosis --tag-filter option.
3. In the same operation, write the POIs to a PostGIS database using --write-pgsql.
4. Initialize a replication environment using --read-replication-interval-init, following the instructions on the wiki.
5. Set up periodic replication using osmosis --rri --wpc in crontab.

This works, but the database grows because the --rri task replicates all changes and not just the POIs I'm interested in. So derived questions are:

Is there a way to filter change streams in osmosis before writing them to an output stream?
Is the workflow described a good way to approach this challenge?

asked 25 Mar '11, 10:20

mvexel

Dear mvexel, this is a very good outline of your needs. Let us know if you have any success, and please share your insights and any further needs.

(20 May '14, 21:17) say_hello_to...

One Answer:

There is currently no ready-made solution for what you want to do. If you are a programmer then the easiest way to accomplish what you want is to write a small parser for OSM/OSMChange files yourself (you will be able to re-use the code for both) and follow this logic:

1. Get an initial planet file from OSM.
2. Extract the POIs that you want to be available using your own program and write them to PostGIS.
3. Initialize a replication environment using --read-replication-interval-init.
4. Periodically, retrieve updates and process them.
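As an illustration of the "own program" step, here is a minimal Python sketch that streams an OSM XML file and keeps only nodes carrying a tag of interest. The `WANTED` set, the function names, and the sample data are hypothetical; it assumes uncompressed OSM XML input (a .osm.pbf planet would need a PBF-capable library), and it leaves the PostGIS write out entirely.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical filter: (key, value) pairs that mark a node as a POI we want.
WANTED = {("amenity", "restaurant"), ("man_made", "surveillance")}


def tags_of(elem):
    """Collect the <tag k="..." v="..."/> children of an OSM element into a dict."""
    return {t.get("k"): t.get("v") for t in elem.findall("tag")}


def is_interesting(tags):
    """True if at least one tag matches the WANTED set."""
    return any((k, v) in WANTED for k, v in tags.items())


def extract_poi_nodes(source):
    """Yield (id, lat, lon, tags) for each node in an OSM XML stream with a wanted tag."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "node":
            tags = tags_of(elem)
            if is_interesting(tags):
                yield (int(elem.get("id")), float(elem.get("lat")),
                       float(elem.get("lon")), tags)
        if elem.tag in ("node", "way", "relation"):
            elem.clear()  # drop the element's children and attributes once handled


# Made-up sample data standing in for a real extract.
SAMPLE = b"""<osm version="0.6">
  <node id="1" lat="52.1" lon="4.3"><tag k="amenity" v="restaurant"/></node>
  <node id="2" lat="52.2" lon="4.4"/>
  <node id="3" lat="52.3" lon="4.5"><tag k="highway" v="crossing"/></node>
</osm>"""

for poi in extract_poi_nodes(io.BytesIO(SAMPLE)):
    print(poi)
```

Only node 1 passes the filter here. Ways are ignored on purpose, which is the limitation discussed further down.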

The logic to apply when processing updates is:

- Change file contains node deletion: delete the node on your side.
- Change file contains node creation: check if the tags are of interest and create the node on your side if applicable.
- Change file contains node modification: check if the tags are of interest. If not, delete the node in your local database if it is there (this means somebody has removed the tags of interest from the node). If the node has interesting tags, update or create it on your side.
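The three rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the local database is stood in for by a plain dict keyed by node id, the `WANTED` set and all function names are hypothetical, and the sample OsmChange document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical filter: (key, value) pairs that mark a node as a POI we want.
WANTED = {("amenity", "restaurant")}


def is_interesting(tags):
    """True if at least one tag matches the WANTED set."""
    return any((k, v) in WANTED for k, v in tags.items())


def apply_osmchange(xml_text, db):
    """Apply the node changes of an OsmChange document to db (id -> (lat, lon, tags))."""
    root = ET.fromstring(xml_text)
    for action in root:  # <create>, <modify> or <delete> blocks
        for node in action.findall("node"):
            node_id = int(node.get("id"))
            if action.tag == "delete":
                db.pop(node_id, None)  # deletion: drop it if we have it
                continue
            tags = {t.get("k"): t.get("v") for t in node.findall("tag")}
            if is_interesting(tags):
                # creation or modification with wanted tags: insert or update
                db[node_id] = (float(node.get("lat")), float(node.get("lon")), tags)
            else:
                # wanted tags were removed (or never present): make sure it is gone
                db.pop(node_id, None)
    return db


# Made-up change file and starting state.
CHANGE = """<osmChange version="0.6">
  <create>
    <node id="10" lat="1.0" lon="2.0"><tag k="amenity" v="restaurant"/></node>
    <node id="11" lat="1.5" lon="2.5"/>
  </create>
  <modify>
    <node id="5" lat="3.0" lon="4.0"><tag k="amenity" v="bench"/></node>
    <node id="6" lat="5.0" lon="6.0"><tag k="amenity" v="restaurant"/></node>
  </modify>
  <delete>
    <node id="7" lat="0.0" lon="0.0"/>
  </delete>
</osmChange>"""

db = {
    5: (3.0, 4.0, {"amenity": "restaurant"}),
    7: (0.0, 0.0, {"amenity": "restaurant"}),
}
apply_osmchange(CHANGE, db)
print(sorted(db))
```

Creation and modification share one code path because both end in the same decision: keep the node if its current tags are of interest, otherwise make sure it is absent.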

You could of course also amend Osmosis to do what you want.

This procedure has one disadvantage: it is hard to cope with POIs that are modeled as ways in OSM. You might theoretically receive an OsmChange file that says "way #1234 now has these tags" and you go "wow, I need that in my database", but you don't know that way's geometry because you ignored it on import, since it didn't have the right tags at the time.

One way to avoid programming, but at the cost of more processing overhead, is this:

1. Get an initial planet file from OSM.
2. Extract the POIs that you want to be available using the osmosis --tag-filter option and save them to a new file (e.g. --write-pbf myfile.osm.pbf).
3. Import the new file into some kind of database using whatever process you fancy (this could also be osm2pgsql or imposm, both of which may be faster than osmosis).
4. Initialize a replication environment using --read-replication-interval-init.
5. Periodically call Osmosis to apply an update to your local file and tag-filter the result in the same step (i.e. --rri --simc --read-pbf myfile.osm.pbf --ac --tag-filter ... --write-pbf myfile-new.osm.pbf), then make a full import of the resulting file into a database of your choice. Afterwards, rename myfile-new.osm.pbf to myfile.osm.pbf and use that as the basis onto which you apply the next update.
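The periodic step in the list above could be driven by a small Python wrapper along these lines. It is a sketch under assumptions: the function names are hypothetical, the task order simply mirrors the command quoted in step 5, and the example filter arguments (`accept-nodes man_made=surveillance`) stand in for whatever goes in place of the "..." there. Nothing is executed at import time; the command is only assembled.

```python
import os


def build_update_command(master, updated, tag_filter_args):
    """Assemble the osmosis call from step 5 as an argument list (not executed here)."""
    return [
        "osmosis",
        "--rri",                # read the pending replication changes
        "--simc",               # simplify the change stream
        "--read-pbf", master,   # current filtered master file
        "--ac",                 # apply the changes to it
        "--tag-filter", *tag_filter_args,  # placeholder for your own filter
        "--write-pbf", updated,
    ]


def promote(updated, master):
    """Rename the freshly written file over the master file for the next run."""
    os.replace(updated, master)


cmd = build_update_command(
    "myfile.osm.pbf",
    "myfile-new.osm.pbf",
    ["accept-nodes", "man_made=surveillance"],  # example filter only
)
print(" ".join(cmd))
```

A scheduled job would pass `cmd` to `subprocess.run(cmd, check=True)` from the directory holding the replication state (assuming osmosis is on the PATH), run the full database import, and call `promote()` only after both succeeded, so that a failed run never replaces the master file.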

This wastes resources by always doing a full import, but the import will always be "clean" and only contain the things you really want. Plus, it has the capacity to work with way-POIs as well.

When using a local .osm.pbf file as your master database, as in this example, it is advisable to pass the compress=none option to --write-pbf, which speeds up writing.

answered 25 Mar '11, 12:27

Frederik Ramm ♦

edited 08 Apr '12, 19:03

I wrote a tutorial based on my own experiences and the second method you described.

(27 Mar '11, 20:28) mvexel

mvexel: Do you import the geometry of the ways? Just using --tf accept-nodes man_made=surveillance will remove the nodes of the ways.

(30 Mar '11, 15:33) emj

many thanks for this answer..

(20 May '14, 22:53) say_hello_to...

Question tags:

osmosis ×228
update ×173
poi ×163

question was seen: 14,162 times

last updated: 20 May '14, 22:53
