-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Refactored OSM Shapefile importer #874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks, will review! |
karussell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, had only some minor notes. Will test now with real data.
reader-shp/.gitignore
Outdated
| @@ -0,0 +1,2 @@ | |||
| #Don't ignore my Malta test data used in the automated test!! | |||
| !malta-latest.osm.pbf No newline at end of file | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind to move this to the root .gitignore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in the next commit.
reader-shp/pom.xml
Outdated
| <name>GraphHopper Reader for Shapefile Data</name> | ||
|
|
||
| <properties> | ||
| <geotools.version>13.2</geotools.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As commented here #616 (comment) we could upgrade to 15.x (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to 15.2 in next commit.
reader-shp/pom.xml
Outdated
| </configuration> | ||
| </plugin> | ||
| </plugins> | ||
| </build> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section can be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in the next commit.
| import com.graphhopper.reader.shp.OSMShapeFileReader.EdgeAddedListener; | ||
| import com.graphhopper.storage.GraphHopperStorage; | ||
|
|
||
| public class GraphhopperShp extends GraphHopper { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it is a bit nitpicking but would you mind to rename to GraphhopperSHP as we have GraphHopperOSM
Also would you add the GraphHopper license header to all new files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in the next commit. Probably not gramatically correct though! OSM is an acronymn, SHP isn't. Licence file added.
| * OSMShapeFileReader for files present at : http://download.geofabrik.de/ It extracts the data as per the structure of | ||
| * shape files | ||
| * | ||
| * @author Vikas Veshishth, Philip Welch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here also a minor edit:
@author Vikas Veshishth
@author Philip Welch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in the next commit.
|
|
||
| } | ||
| } catch (Exception e) { | ||
| LOGGER.error("Exception while processing for roads: ", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we have to throw an exception, e.g. throw new RuntimeException("some info", e)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in the next commit. I've changed the exceptions to be unchecked (as I hate checked exceptions!) and don't catch them anymore (other than wrapping them in a runtime exception to uncheck them if needed). The original exception is likely to be more informative than a rethrown one anyway.
| /** | ||
| * ShapeFileReader takes care of reading a shape file and writing it to a road network graph | ||
| * | ||
| * @author Vikas Veshishth |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to add yourself here too :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in the next commit
| return initDataReader(reader); | ||
| } | ||
|
|
||
| public void addListener(EdgeAddedListener l) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did I understood the purpose of the listeners: adding custom attributes to the graph? Why are multiple listeners necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The listeners don't modify anything, they're only used to help me output debugging information when checking over the building of the graph. They're not actually called from the commited code (just an external main I was using to run a debug from). They are useful though, so I'd prefer keep them in temporarily.
| import com.vividsolutions.jts.geom.GeometryFactory; | ||
|
|
||
| public class Utils { | ||
| public static String toWKT(PointList list) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to create a new factory for every call? Or is performance overhead neglectable and problematic stuff to ugly like non-thread safety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is only called a couple of times a unit test (and some of my other debugging code like below which I've left out of the project). Performance overhead is negligible.
| new File(tempOutputDirFromShp).mkdirs(); | ||
| new File(tempOutputDirFromPbf).mkdirs(); | ||
|
|
||
| System.out.println("Building from shapefile"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No logging necessary in the tests IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer to keep the printlns if OK. There's some console output from the tests I like to check : )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that if every test does this we would be overwhelmed by log statements (we have >1K tests ;)). And every test should just print ok to command line if succeeded. For development purposes you can use this (and throw away afterwards) although I always recommend to formulate the things you check on command line with your eyes into an assert statement. Do you know what I mean :) ?
|
When trying 'sachsen' I get: Caused by: java.lang.NullPointerException
at com.graphhopper.reader.shp.ShapeFileReader.getFeatureIerator(ShapeFileReader.java:56)
at com.graphhopper.reader.shp.OSMShapeFileReader.processJunctions(OSMShapeFileReader.java:82)Will look into it tomorrow. |
|
Re: sachsen. I've just tried this file and its fine. Route built below using this code. The exception suggests the shapefile can't be loaded? You've unzipped and referenced the roads file yes? |
|
Works now surprisingly well - thanks a bunch!
I specified the folder not the shp file ... was too late yesterday ;) ... we should still add a NPE check (?) |
|
It looks like geotools is not JDK9 ready, which is a bit ugly: Do you have an idea if we somehow can workaround this without changing geotools? This was the only thing I found in the web but maybe this need a change to geotools directly: levigo/jbig2-imageio#18 |
|
Ah, it is already known to them: https://osgeo-org.atlassian.net/browse/GEOT-5289 and here geotools/geotools#1064 |
|
hmm.. Probably best to wait until they've fixed it then. Dr Philip WelchOpen Door Logistics LimitedSpecialists in open source solutions for transport logistics. Email: info@opendoorlogistics.com ---- On Thu, 17 Nov 2016 12:19:27 +0000 Peter<notifications@github.com> wrote ---- Ah, it is already known to them: https://osgeo-org.atlassian.net/projects/GEOT/issues/GEOT-5289?filter=allopenissues |
|
For now I've just excluded the shp module from JDK9 build and will still merge. |
|
Merged it via this commit 1960b0f The ugliness here is that I wanted it in a single commit (for simplicity) but now it looks like a change only from vvikas. Still when you look at 'git log' the author is "vvikas and [your name]" although it did not pick up the second email, obviously. I hope this is okay and have added you as a contributor in our list and hopefully more changes are coming :) (?) |
|
Cool. Don't worry about who the change shows up from - it's not a problem. |

Fixes / changes completed
Highway tag – we’re now using this (not used before, previously the code was using a hardcoded speed or max speed if non-zero).
Max speed tag now passed properly to the encoders. Added a filter for value “0” as in Geofabrik shapefiles 0 means no tag, but 0 will probably just mean 0 speed in the Graphhopper flag encoders…
One way tags now used properly. Previously the code was only checking for “0” and treating this as boolean. You also have to check for against the direction of digitisation (e.g. value “-1” if its one-way against the directory of digitisation). Also Geofabrik is using an odd (unfortunately non-standard) convention for their shapefile oneway field; I now read this convention but turn into the standard convention in the ReaderWay I pass to the flag encoders. This could break in the future if Geofabrik clean this up (the code will throw an exception with a useful message if it breaks).
Encoding manager now used properly. I now fill a ReaderWay with the tags and pass to the EncodingManager (the EncodingManager was not used previously), so we should get proper behaviour over multiple speed profile types (car, bike etc). Previously there was some hardcoding to “car”.
Fixed partially connected graph bug. With a shapefile from Geofabrik we don’t have node objects, only the geometry of the road (they don’t have a separate node export shapefile). Worse still, the road geometry generally isn’t split at junctions / intersections. Previously the code was assuming that any point at the start or the end of a road geometry line needed to be a node, and was splitting edges wherever one of their geometry points corresponded to the start or end of another edge. This fails badly though, because many times you get two edges crossing each other (e.g. at crossroads) where neither edges are broken at the crossroads – resulting in edges not being split at the crossroads, the graph being only half-connected and hence some pretty odd routes. Instead I’ve changed the code to create a node – and therefore split edges – based on any identical coordinate shared in two or more edges. This is (I believe) the best we can do with Geofabrik shapefile data but may fail in some odd / rare circumstances (e.g. as no z-level considered). This also means each coordinate (pillar or tower) gets stored in a hashmap during the build, which is not memory-friendly for very large files. However for a HERE importer I believe the data would be setup differently (so we’d be able to do this differently).
Edge length was being rounded down to int metres, I’ve kept it in double instead.
Added the estimated_distance and estimated_centre artificial tags
Ensured code handles shapefile multilinestring or linestring geometry (both are possible, previously multilinestring was assumed).
Unit tests - two tests for travel times, comparing between building using pbf (the existing reader project) and the shapefile project. One test is between some predefined points with known travel times, the other chooses many pairs of random points (200 currently). For the random pairs of points, we’re getting a mean agreement in travel times between the pbf and shp of around 99%, which is pretty good as the shapefile isn’t taking into account lots of other tags : ) Obviously in some cases we get less agreement presumably due to tags we’re not using…
Unit tests - we now have a (very basic) geometry test
Unit tests - we now have a test for a single one way example
Small amounts of code tidying (no code conventions applied though).
Improves #616