Merged
Conversation
…s work in gurka though
…a bytes for a 64bit wayid
This was
linked to
issues
Jun 12, 2020
…e fixed now and were just a symptom of edgeinfo wayid bugs
gknisely
previously approved these changes
Jun 16, 2020
This was referenced Jun 29, 2020
Closed
Closed
gknisely
approved these changes
Jun 30, 2020
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
this fixes #2410 #2415 #2414 and supercedes them as well.
osm wayids are technically 64bits. while in practice wayids dont usually appear above 32bits (yet) its entirely within spec to support this. we have encountered osm extracts for example who have ids above 32bits in size. this PR works by optionally adding 4 more bytes to the edgeinfo struct when it encounters a wayid that is larger than 32bits. for the osm planet this should mean that we see no increase in data size but only on extracts that actually make use of ids wider than 32bits
This pr also does the same for nodes. The task is quite a bit different. Here we dont actually store the node ids other than at parsing time. We only use them to mark the nodes that occur on ways that are interesting for routing. After we are done finding those nodes (via their ids that we got from the ways they were part of) we throw them away. In the end all we really need to do is be able to is remember that we have seen a node before and if we haven't put it into memory.
This pr supercedes #2414. In that PR we tried to straight up use hashmaps directly but after looking at the total number of nodes it was clear that such an approach would use way too much ram. Ive measured approximately
1300000000nodes on ways that we care about in the OSM dataset. If those ids are sequential we'll see about 500MB of ram usage. I suspect that we might be double that in practice which is resonable. I also removed the whole reading and writing of this info to file from the places where it happens and encapsulated them in the idtable class. The unit tests were also updated. Remove some that no longer made sense in the context of hashed containers. I've also added a test to the unit test that tests the serializing and deserializing to file and tested it with the max number of ids we should expect to see. Though in practice we probably will see a larger key space, this simple test (which sets/gets 1.3bn, serializes, deserializes and checks for equality) runs in 17 seconds.