-
-
Notifications
You must be signed in to change notification settings - Fork 764
Prallelize AWS terrain tile downloads #688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
⏱️ Benchmark run finished in 0m 31s 📈 Compared against baseline: 30s 🟢 Generation time is unchanged. 📅 Last benchmark: 2026-01-02 12:36:13 UTC You can retrigger the benchmark by commenting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds parallel downloading of AWS terrain tiles to improve performance when fetching elevation data. The changes refactor the tile download logic to use the rayon library for concurrent downloads.
- Introduced rayon for parallel processing of tile downloads
- Refactored tile download and caching logic into separate functions
- Changed error handling from
Box<dyn std::error::Error>toStringfor better compatibility with parallel operations - Modified tile processing to download all tiles in parallel first, then process them sequentially
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Fetches a tile from cache or downloads it if not available | ||
| fn fetch_or_load_tile( | ||
| tile_x: u32, | ||
| tile_y: u32, | ||
| zoom: u8, | ||
| tile_path: &Path, | ||
| ) -> Result<image::ImageBuffer<Rgb<u8>, Vec<u8>>, String> { | ||
| if tile_path.exists() { | ||
| // Check if the cached file has a reasonable size (PNG files should be at least a few KB) | ||
| let file_size = std::fs::metadata(tile_path).map(|m| m.len()).unwrap_or(0); | ||
|
|
||
| if file_size < 1000 { | ||
| eprintln!( | ||
| "Warning: Cached tile at {} appears to be too small ({} bytes). Refetching tile.", | ||
| tile_path.display(), | ||
| file_size | ||
| ); | ||
|
|
||
| // Remove the potentially corrupted file | ||
| let _ = std::fs::remove_file(tile_path); | ||
|
|
||
| // Re-download the tile | ||
| return download_tile(tile_x, tile_y, zoom, tile_path); | ||
| } | ||
|
|
||
| // Try to load cached tile, but handle corruption gracefully | ||
| match image::open(tile_path) { | ||
| Ok(img) => { | ||
| println!( | ||
| "Loading cached tile x={tile_x},y={tile_y},z={zoom} from {}", | ||
| tile_path.display() | ||
| ); | ||
| Ok(img.to_rgb8()) | ||
| } | ||
| Err(e) => { | ||
| eprintln!( | ||
| "Cached tile at {} is corrupted or invalid: {}. Re-downloading...", | ||
| tile_path.display(), | ||
| e | ||
| ); | ||
|
|
||
| // Remove the corrupted file | ||
| let _ = std::fs::remove_file(tile_path); | ||
|
|
||
| // Re-download the tile | ||
| download_tile(tile_x, tile_y, zoom, tile_path) | ||
| } | ||
| } | ||
| } else { | ||
| // Download the tile for the first time | ||
| download_tile(tile_x, tile_y, zoom, tile_path) | ||
| } | ||
| } |
Copilot
AI
Jan 2, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a potential race condition when multiple threads check if a cached tile exists and then attempt to download it. Two threads could simultaneously determine that a tile doesn't exist or is corrupted, both attempt to delete it, and both attempt to download and write to the same file path. While the file system operations might succeed, this wastes bandwidth and could result in corrupted files if writes interleave. Consider adding file-based locking (e.g., using the fs2 crate which is already a dependency) or using atomic file operations to prevent concurrent downloads of the same tile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot The race condition is technically possible but has minimal impact:
- Worst case: two threads download the same tile simultaneously, wasting some bandwidth
- File writes are atomic at the OS level, so no corruption risk
- The probability is low (requires exact timing on the same tile)
- Adding file-based locking would introduce complexity and potential deadlocks in parallel code.
I've added a doc comment to fetch_or_load_tile noting this behavior is intentional and harmless.
| let response = client.get(&url).send().map_err(|e| e.to_string())?; | ||
| response.error_for_status_ref().map_err(|e| e.to_string())?; | ||
| let bytes = response.bytes().map_err(|e| e.to_string())?; | ||
| std::fs::write(tile_path, &bytes).map_err(|e| e.to_string())?; | ||
| let img = image::load_from_memory(&bytes).map_err(|e| e.to_string())?; |
Copilot
AI
Jan 2, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting all errors to String using map_err(|e| e.to_string()) loses the original error type and stack trace information. While this simplifies error handling for parallel operations, it makes debugging harder since the error chain is lost. This is particularly problematic for network errors, file I/O errors, and image decoding errors which have rich context. Consider using a custom error type that preserves error context or at least includes more specific error information in the string conversion.
| let response = client.get(&url).send().map_err(|e| e.to_string())?; | |
| response.error_for_status_ref().map_err(|e| e.to_string())?; | |
| let bytes = response.bytes().map_err(|e| e.to_string())?; | |
| std::fs::write(tile_path, &bytes).map_err(|e| e.to_string())?; | |
| let img = image::load_from_memory(&bytes).map_err(|e| e.to_string())?; | |
| let response = client | |
| .get(&url) | |
| .send() | |
| .map_err(|e| format!("HTTP request to {url} failed: {e}"))?; | |
| response | |
| .error_for_status_ref() | |
| .map_err(|e| format!("HTTP error status for {url}: {e}"))?; | |
| let bytes = response | |
| .bytes() | |
| .map_err(|e| format!("Reading response body from {url} failed: {e}"))?; | |
| std::fs::write(tile_path, &bytes).map_err(|e| { | |
| format!( | |
| "Writing tile to {} failed: {e}", | |
| tile_path.display() | |
| ) | |
| })?; | |
| let img = image::load_from_memory(&bytes) | |
| .map_err(|e| format!("Decoding image from {url} failed: {e}"))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot This is an intentional tradeoff. The function returns Result<_, String> to simplify parallel collection with rayon. While custom error types would preserve the error chain, the benefits are marginal for this use case:
- Error messages still contain the relevant information (URL, file path, HTTP status, etc.)
- This is internal tile-fetching code, not a public API
- The added complexity of custom error types isn't justified for a simple download helper
The current approach is pragmatic and the error messages provide sufficient context for debugging.
No description provided.