
64bit osmid support #2422

Merged
kevinkreiser merged 61 commits into master from 64bit on Jun 30, 2020


@kevinkreiser
Member

This fixes #2410, #2415 and #2414 and supersedes them as well.

OSM way ids are technically 64 bits. While in practice way ids don't usually appear above 32 bits (yet), it's entirely within spec to support this. For example, we have encountered OSM extracts that have ids wider than 32 bits. This PR works by optionally adding 4 more bytes to the edgeinfo struct when it encounters a way id that is larger than 32 bits. For the OSM planet this should mean that we see no increase in data size; only extracts that actually make use of ids wider than 32 bits will grow.
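To illustrate the "4 optional bytes" idea, here is a minimal sketch. It is not the actual edgeinfo layout: `EncodedWayId`, `encode_way_id` and `decode_way_id` are hypothetical names, and the little-endian byte order is an assumption made only for this example. The low 32 bits are always written, and the high 32 bits are appended only when they are non-zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record, NOT Valhalla's real edgeinfo struct.
struct EncodedWayId {
  bool has_extended = false;   // true when the 4 extra bytes are present
  std::vector<uint8_t> bytes;  // 4 bytes, or 8 when has_extended is set
};

// Append a 32-bit value in little-endian order (an assumption of this sketch).
inline void append_u32(std::vector<uint8_t>& out, uint32_t v) {
  for (int shift = 0; shift < 32; shift += 8)
    out.push_back(static_cast<uint8_t>((v >> shift) & 0xFF));
}

// Read a little-endian 32-bit value starting at offset.
inline uint32_t read_u32(const std::vector<uint8_t>& in, std::size_t offset) {
  assert(offset + 4 <= in.size());
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i)
    v |= static_cast<uint32_t>(in[offset + i]) << (8 * i);
  return v;
}

inline EncodedWayId encode_way_id(uint64_t way_id) {
  EncodedWayId e;
  const uint32_t high = static_cast<uint32_t>(way_id >> 32);
  e.has_extended = high != 0;
  append_u32(e.bytes, static_cast<uint32_t>(way_id & 0xFFFFFFFFu));
  if (e.has_extended)
    append_u32(e.bytes, high);  // only ids wider than 32 bits pay for this
  return e;
}

inline uint64_t decode_way_id(const EncodedWayId& e) {
  uint64_t id = read_u32(e.bytes, 0);
  if (e.has_extended)
    id |= static_cast<uint64_t>(read_u32(e.bytes, 4)) << 32;
  return id;
}
```

With this layout, an id such as 123 still costs 4 bytes, while an id such as 5000000000 costs 8, which is why data sets without wide ids should not grow.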

This PR also does the same for nodes, though the task is quite a bit different. Here we don't actually store the node ids other than at parsing time. We only use them to mark the nodes that occur on ways that are interesting for routing. After we are done finding those nodes (via their ids that we got from the ways they were part of) we throw the ids away. In the end all we really need is to remember whether we have seen a node before and, if we haven't, put it into memory.
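The "have we seen this node id before" check can be sketched as a sparse bitset over the 64-bit id space. This is only one way to get that behaviour and is not necessarily how the idtable class does it: `SparseIdSet` and its members are hypothetical names, and grouping ids into 64-bit words stored in a `std::unordered_map` is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical sparse set of 64-bit ids: one bit per id, 64 ids per word,
// and only words that contain at least one marked id are stored.
class SparseIdSet {
public:
  // Marks id as seen and returns whether it had already been marked.
  bool set(uint64_t id) {
    uint64_t& word = words_[id / 64];  // creates a zeroed word if absent
    const uint64_t bit = uint64_t{1} << (id % 64);
    const bool seen_before = (word & bit) != 0;
    word |= bit;
    return seen_before;
  }

  // Returns whether id has been marked, without modifying the set.
  bool get(uint64_t id) const {
    const auto it = words_.find(id / 64);
    return it != words_.end() &&
           (it->second & (uint64_t{1} << (id % 64))) != 0;
  }

  // Number of 64-bit words currently held in memory.
  std::size_t word_count() const { return words_.size(); }

private:
  std::unordered_map<uint64_t, uint64_t> words_;
};
```

Because neighbouring ids share a word, densely numbered ids cost far less than one hash entry each, while an isolated id anywhere in the 64-bit range costs a single entry.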

This PR supersedes #2414. In that PR we tried to straight up use hashmaps directly, but after looking at the total number of nodes it was clear that such an approach would use way too much RAM. I've measured approximately 1,300,000,000 nodes on ways that we care about in the OSM dataset. If those ids are sequential we'll see about 500MB of RAM usage. I suspect that we might be at double that in practice, which is reasonable. I also removed the reading and writing of this info to file from the places where it happened and encapsulated it in the idtable class. The unit tests were also updated: I removed some that no longer made sense in the context of hashed containers, and added a test that serializes to and deserializes from file, run with the max number of ids we should expect to see. Though in practice we will probably see a larger key space, this simple test (which sets/gets 1.3bn ids, serializes, deserializes and checks for equality) runs in 17 seconds.
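A serialize, deserialize and compare round trip of the kind described above might look like the following sketch. The on-disk format of the idtable class is not given here, so everything in this block is an assumption: the `IdWords` alias, the function names, the "count followed by key/value pairs" layout, and the use of the host's native byte order (which makes the output non-portable across architectures). It uses an in-memory stream so that it is self-contained; a `std::fstream` opened in binary mode would work the same way.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <unordered_map>

// Hypothetical in-memory form: word index -> 64-bit mask of marked ids.
using IdWords = std::unordered_map<uint64_t, uint64_t>;

inline void write_u64(std::ostream& out, uint64_t v) {
  // Raw bytes in host byte order (an assumption of this sketch).
  out.write(reinterpret_cast<const char*>(&v), sizeof(v));
}

inline uint64_t read_u64(std::istream& in) {
  uint64_t v = 0;
  in.read(reinterpret_cast<char*>(&v), sizeof(v));
  if (!in)
    throw std::runtime_error("truncated id table");
  return v;
}

// Layout: entry count, then (key, value) pairs, 16 bytes per entry.
inline void serialize(const IdWords& words, std::ostream& out) {
  write_u64(out, words.size());
  for (const auto& kv : words) {
    write_u64(out, kv.first);
    write_u64(out, kv.second);
  }
}

inline IdWords deserialize(std::istream& in) {
  IdWords words;
  const uint64_t count = read_u64(in);
  for (uint64_t i = 0; i < count; ++i) {
    const uint64_t key = read_u64(in);    // read key first, then value,
    const uint64_t value = read_u64(in);  // so the order is well defined
    words[key] = value;
  }
  return words;
}

// Serializes to an in-memory buffer and reads the result back.
inline IdWords round_trip(const IdWords& words) {
  std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
  serialize(words, buffer);
  return deserialize(buffer);
}
```

In this sketch's format, 1.3bn sequential ids would pack into about 20.3 million words and serialize to roughly 325MB; comparing `round_trip(words)` with `words` using `==` gives the equality check.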

kevinkreiser and others added 30 commits June 5, 2020 10:13
@kevinkreiser kevinkreiser mentioned this pull request Jun 15, 2020
gknisely previously approved these changes Jun 16, 2020

Development

Successfully merging this pull request may close these issues.

OSM way Id exceeds 32 bit maximum
OSM Id exceeds max specified when building admins
OSM ID > 10**15 exceeds maximum
