@louis-e louis-e commented Jan 6, 2026

Motivation

  • Avoid building an intermediate untyped JSON tree to reduce memory pressure and parsing overhead.
  • Keep OSM data typed during parsing to simplify downstream processing and error handling.
  • Reduce expensive cloning of way/node data when assembling relations.
  • Free large temporary maps earlier so memory can be reused during generation.

Description

  • Parse Overpass/file responses with streaming deserialization using serde_json::Deserializer::from_reader(...) directly into OsmData in src/retrieve_data.rs and return OsmData instead of Value.
  • Make OsmData a public typed struct (including remark) and update parse_osm_data to accept OsmData in src/osm_parser.rs.
  • Replace ways_map storage with HashMap<u64, Arc<ProcessedWay>> and change ProcessedMember to hold Arc<ProcessedWay> to avoid cloning way node vectors repeatedly.
  • Drop nodes_map and ways_map once relations have been processed so their memory is released earlier, and update call sites (src/test_utilities.rs) to use the new typed return value of the fetch functions.
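
As a rough illustration of the Arc change described above, here is a minimal sketch of sharing ways between a lookup map and relation members. The struct fields, the `build_members` helper, and the `(way id, role)` input shape are assumptions made for this example and are not taken from src/osm_parser.rs; only the names ProcessedWay, ProcessedMember, and ways_map come from the description.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-ins for the project's types (field layout is assumed).
#[derive(Debug)]
pub struct ProcessedWay {
    pub id: u64,
    pub nodes: Vec<(i32, i32)>, // placeholder for the processed node data
}

#[derive(Debug)]
pub struct ProcessedMember {
    pub role: String,
    pub way: Arc<ProcessedWay>,
}

/// Hypothetical helper: resolve (way id, role) pairs against the shared map.
/// `Arc::clone` only increments a reference count, so the node vector of a
/// way is never copied, no matter how many relations reference it.
/// Ids that are missing from the map are skipped.
pub fn build_members(
    ways_map: &HashMap<u64, Arc<ProcessedWay>>,
    refs: &[(u64, &str)],
) -> Vec<ProcessedMember> {
    refs.iter()
        .filter_map(|(id, role)| {
            ways_map.get(id).map(|way| ProcessedMember {
                role: role.to_string(),
                way: Arc::clone(way),
            })
        })
        .collect()
}
```

With this shape, two members that reference the same way id point at the same allocation, which can be checked with `Arc::ptr_eq`.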

Testing

  • Ran cargo fmt to apply formatting and keep code style consistent; it completed successfully.
  • No automated unit/integration tests were executed as part of this change set.

Codex Task

Copilot AI review requested due to automatic review settings January 6, 2026 14:42
Copilot AI left a comment

Pull request overview

This PR refactors OSM data parsing to use streaming deserialization and reduce memory usage by minimizing cloning. The key changes include:

  • Streaming deserialization of OSM data directly into typed structures instead of building intermediate JSON trees
  • Wrapping ProcessedWay in Arc to enable sharing across relations without cloning large node vectors
  • Explicit memory cleanup by dropping large temporary maps after relation processing
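
The explicit cleanup mentioned in the last point can be sketched as follows. This is an assumed, simplified shape: the `Way` struct, the `assemble` function, and the coordinate tuple type are hypothetical, and only the map names nodes_map and ways_map come from the PR. The point it illustrates is that calling `drop` on the maps frees every way that no relation member still holds, before later work runs.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-in for a processed way (assumed shape).
pub struct Way {
    pub nodes: Vec<u64>,
}

/// Hypothetical assembly step: collect the ways referenced by a relation,
/// then release the large temporary maps before returning.
pub fn assemble(
    nodes_map: HashMap<u64, (f64, f64)>,
    ways_map: HashMap<u64, Arc<Way>>,
    relation_refs: &[u64],
) -> Vec<Arc<Way>> {
    // Cloning the Arc shares the way; the node vector is not duplicated.
    let members: Vec<Arc<Way>> = relation_refs
        .iter()
        .filter_map(|id| ways_map.get(id).cloned())
        .collect();

    // Free the temporary maps now instead of at the end of the scope.
    // Ways still held by `members` stay alive; all others are deallocated.
    drop(nodes_map);
    drop(ways_map);

    // Any further processing placed here runs without the maps in memory.
    members
}
```

Because both maps are taken by value, the caller cannot keep using them after the call, which makes the earlier release visible in the function signature.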

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/test_utilities.rs | Updated to work with new OsmData return type instead of serde_json::Value |
| src/retrieve_data.rs | Modified fetch functions to stream-deserialize directly into OsmData struct, eliminating intermediate JSON tree construction |
| src/osm_parser.rs | Made OsmData public with typed fields, changed way storage to use Arc<ProcessedWay> for zero-copy sharing in relations, and added explicit cleanup of temporary maps |

@louis-e louis-e merged commit 7ec90b4 into main Jan 6, 2026
2 checks passed
@louis-e louis-e deleted the codex/refactor-data-parsing-and-memory-usage branch January 6, 2026 15:46