Skip to content

Conversation

@XianlinSheng
Copy link
Contributor

  1. Added a map_editing step in the workflow, after parsing element to ProcessedElement, before generating_world(...). This enables development for map transformation, e.g. translating (Suggestion: Co-ordinate sync with reality [OPT-IN] #29), rotating ([FEATURE] The ability to rotate maps #97), snapping to grid ([FEATURE] Snap roads and features to 90°  #274). To define operations, edit map_editing.json, which is a list of operation config dict that will be deserialized to construct specific types of Operator. The operation is applied on i32 coordinate, and i suppose 1m resolution is accurate enough for a city.
  2. Enabled world generation at non-origin, which is a necessary feature for map editing.
    journeymap
    remotesight

@louis-e louis-e mentioned this pull request Apr 13, 2025
3 tasks
@louis-e
Copy link
Owner

louis-e commented Apr 13, 2025

Thanks a lot for your effort! I'll take some time to look into this to make sure that I fully understand the changes and then follow up with merging it!

…sian coordinate_system::cartesian for extension of spherical coordinate submod
@XianlinSheng
Copy link
Contributor Author

XianlinSheng commented Apr 20, 2025

update: the architecture of cartesian is slightly modified for integration of a future spherical coordinate system. Prev: src/cartesian.rs; Now: src/coordinate_system/cartesian/*; The GeoCoord and BBox in #377 can be integrated in a unified way into a submodule src/coordinate_system/spherical_or_other_preferred_names/*. Coordinate transformations, where a lot of customizations and extensions can be done, can be implemented in coordinate_system/transformation/*

@benjamin051000
Copy link
Contributor

Whoaaaaaa this is cool. I'll do a code review when I finish BBox PR

Copy link
Contributor

@benjamin051000 benjamin051000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall AWESOME idea and implementation. I've left many notes because I'm passionate about this change and want to see it perfected and merged!

Note

I am a Rust noob. If some of these comments are wrong, feel free to tell me I'm wrong and hit resolve. No hard feelings :D

Overall thoughts

  • I like the Operator design. Very nice and flexible. How hard would it be to add things like repetition operators (e.g., repeat a transformation/rotation several times)? Looking forward to seeing how simple adding rotation will be :D
  • Please add unit tests wherever you see fit. I know they're pretty simple data structures, but it gives us additional confidence & debug info when weird edge cases arise.
  • It would also be useful to be able to pass transformations via CLI arguments. Something like --transform translate vector 500 500 or something. That may be complex and not that great though so we can discuss separately

I'm willing to approve this* if you review my comments for important suggestions first.

* I am not the maintainer so my words have 0 weight

@XianlinSheng
Copy link
Contributor Author

Thanks a lot for your time. I will response to every comment thoroughly and give you my thoughts on some of the designs when I have enough time on this, maybe on this weekend.

@XianlinSheng
Copy link
Contributor Author

Overall AWESOME idea and implementation. I've left many notes because I'm passionate about this change and want to see it perfected and merged!

Note

I am a Rust noob. If some of these comments are wrong, feel free to tell me I'm wrong and hit resolve. No hard feelings :D

Overall thoughts

  • I like the Operator design. Very nice and flexible. How hard would it be to add things like repetition operators (e.g., repeat a transformation/rotation several times)? Looking forward to seeing how simple adding rotation will be :D
  • Please add unit tests wherever you see fit. I know they're pretty simple data structures, but it gives us additional confidence & debug info when weird edge cases arise.
  • It would also be useful to be able to pass transformations via CLI arguments. Something like --transform translate vector 500 500 or something. That may be complex and not that great though so we can discuss separately

I'm willing to approve this* if you review my comments for important suggestions first.

  • I am not the maintainer so my words have 0 weight

Repetition is very simple: repetition itself is an operator, just a special one whose operation is to call another underlying operator several times.

Rotation on the elements is vector mathematics 101. The relatively trickier thing is the xzbbox. This thing needs to be rotated too. The idea is to make XZBBox an enum too, which can be Rect, Polygon, or even other shapes. The implementation is not difficult.

I try to add those tests following the style in your BBox PR.

Yes lets discuss the CLI implementation separately and currently just make it support reading a config file. Btw, if I decide to make the process highly programmable and able to run in batch, I choose to wrap it up as a python shell - everything connected to python will have the access to everything.

@louis-e
Copy link
Owner

louis-e commented Apr 26, 2025

Woah you guys are amazing, I can't wait to see this merged!
Since there are still a few open comments, ping me when you are done from your side and ready to merge. I don't have any open comments myself, @benjamin051000 you already covered most of it with your review. Big thanks to you both!

…e namings; fix water area dependencies on new xzbbox (add get_min_coords)
@XianlinSheng
Copy link
Contributor Author

XianlinSheng commented Apr 27, 2025

  • Validity check on XZBBox
  • Operator trait
  • Unit tests
  • CLI file name argument (separate PR)
  • File simplification (translator in trait style, XZPoint and XZVector keeps separate files for details clarity and safety)
  • JSON or TOML

I reserved the places for previously merged BBox GeoCoord modules and coordinate transformations in coordinate_system. This will be handled in a separate PR.

@benjamin051000
Copy link
Contributor

As you continue work on this feel free to mark my notes as "resolved" (there's a button to do this) to clean up the PR. Thanks! Excellent work so far. It seems like you are still working on various things here so I'll wait to look at your updates

@XianlinSheng
Copy link
Contributor Author

XianlinSheng commented Apr 28, 2025

A comment on BBox: I am trying to add unit test on my transformation module so I copied the example bbox coordinate in bbox test to fetch data. It failed because API did not return anything. I changed to another coordinate that has longitude > 90 and realized that this example coordinate is not in the correct order, which is instead located in the middle of Arabian Sea so of course it is returning nothing.

// Arnis, Germany (but not, this is in Arabian Sea)
assert!(BBox::new(9.927928, 54.627053, 9.937563, 54.634902).is_ok());

where

pub fn new(min_lat: f64, min_lng: f64, max_lat: f64, max_lng: f64)

I changed it to the correct order and added three more cities that have longitude > 90 or < -90, two having latitude > 0, one < 0, so that will not be confusing.

@XianlinSheng
Copy link
Contributor Author

OK, the operator module is changed to trait style as suggested; XZBBox is created using factory method and it checks for validity inside; all public member functions/data of XZBBox ensures its validity; tests are added for xzbbox, operators json reading, and translation.

@louis-e
Copy link
Owner

louis-e commented May 1, 2025

retrigger-benchmark

@github-actions
Copy link

github-actions bot commented May 1, 2025

⏱️ Benchmark run finished in 1m 58s
🧠 Peak memory usage: 1964 MB

📈 Compared against baseline: 130s
🧮 Delta: -12s
🔢 Commit: 922e752

✅ This PR improves generation time.

You can retrigger the benchmark by commenting retrigger-benchmark.

@XianlinSheng
Copy link
Contributor Author

I think it's ready for review again. Most of the review comments are taken care of, except XZPoint and XZVector simplification, which I still think they should be in separate files and structures because of potential algorithm safety issue and the clarity when trying to tell apart their nuances for users.

Copy link
Contributor

@benjamin051000 benjamin051000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for submitting those changes.

Approving. Note that I have not actually tested this code locally.

@louis-e
Copy link
Owner

louis-e commented May 11, 2025

retrigger-benchmark

@louis-e
Copy link
Owner

louis-e commented May 11, 2025

Hi there, first of all thanks for the great work @XianlinSheng and thanks for your extensive review @benjamin051000! I wanted to review it earlier already but got sick - so sorry for the delay. There are no more comments from my side, this is amazing to build on in the future. Is this ready to merge from your side @XianlinSheng?

@XianlinSheng
Copy link
Contributor Author

No problems on my side. Take your time and hope you get well soon.

@louis-e louis-e merged commit b13eb1f into louis-e:main May 11, 2025
3 checks passed
@louis-e louis-e mentioned this pull request Sep 13, 2025
Comment on lines +7 to +10
#[derive(Clone, Debug)]
pub enum XZBBox {
Rect(XZBBoxRect),
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XianlinSheng Hey, sorry, I know this PR is months old at this point. Just wondering, what is the purpose of this being an enum? Is there anything else that could really go in here other than a Rect? Just wondering as I'm adding functionality around this and want to know how generic I should keep things

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for asking this question first. XZBBox is the mask of the region of generation. Normally it is a rectangle aligned with grid lines, but after rotation or other transformations this region can no longer be simply described like this. At this moment I already have a complete general polygon implementation in my branch that is necessary for rotation, so you can go to my dev-xzpoly to check out how it is designed. One can choose a form to instantiate as long as the shape can fully describe what you have now. For users of this structure, they only call the 'contains' (check a point is included). 'bounding_bbox' (circumscribed grid-aligned rectangle) method, display/add/sub traits, and possible transformations on it, so adding a new shape needs 1. 'contains' method. 2. 'bounding_bbox' method. 3. display/add/sub traits. 4. supports for your shape in all map transformation operators (it won't compile if the new shape is not covered anywhere that needs to modify the xzbbox).

The purpose of using a XZBBox is to truncate the OSM data and let it generate only inside the region. The fetched OSM data is not just the points inside the LLBBox you selected on the map but includes the complete continuous elements that are partially covered - like selecting objects in modeling software, you will select the whole object as long as the region touches it. An element may extend even much further than the llbbox dimension, and this is the reason why impl #504 found that the generation was still in the original rectangle after the rotation transform on existing coordinates - the existing data is not just the points inside the selected region, and we need to use XZBBox to restrict the generation of the redundant elements outside of it.

This design will build the necessary blocks for features like generation at non origin #29 (using rectangle), rotation #97, clip #97 (using polygon), custom shape, projection-corrected #29 (using other shapes), non-continuous space #459 (general mask). The polygon impl already has a mask field inside for the inference performance of method 'contains', and the rotation code for elevation data gives the idea to rotate a field, so a general mask box is also in fact ready-to-go, simply rearranging some existing methods.

May I know your plan so we can reach the optimal implementation?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, makes sense!

Basically I'm working on generating the world in batches, where each batch is an n chunk by n chunk square. This is to lower RAM usage, something that users have been complaining about (and even trying to script their way around batching, which kind of works, but has issues)

The way I'm doing things right now is to divide the XZBBoxRect up into these chunks, which are then saved to disk. Doing it this way is quite nice because it means we'll never touch the same chunk twice. Of course, that invariant doesn't hold if XZBBox isn't axis-aligned. Maybe you have some thoughts here?

Maybe there could be an Intersection(XZBBox, XZBBox) variant to XZBBox`. Then the first one could be the user-defined shape, and the second one could be the n chunk by n chunk batch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me understand this situation completely and better before I give you my thoughts. Which process and which data structure is taking up the most RAM? By dividing you mean generating it in multiple steps so it won't store everything at the same time?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to batch the entire pipeline, i.e. only fetch what we need from overpass for that specific batch, then run it through the entire thing, and save it out to the region file. There will be duplicated work, especially if the batch size is too small, but it would make it feasible to create huge worlds (on the order of entire provinces/states)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK it's completely alright to have different ideas and find which is the really the optimal and suitable for this case, but that is after both proposals are really understood clearly.

The role of Python is not a second programming language, it's not part of the program, but another user interface, CLI, GUI, Python Shell, and a programmable interface has much greater abilities than these two. It will not be the developers to write Python scripts but users. Still, developers can provide some commonly needed scripts ahead of time as presets. The reason I propose this batch can be written as a Python preset is that I saw some ideas that need a programmable interface will be benefited from such a preset. Like the world generation considering projection corrections (or any approach to generate large world as you wish) in #29 needs a highly customized and programmable way to arrange the pipelines of how to combine the small jobs and form a large world. If such a preset exists, not just the users with resource problems can call the function, world designers can also call it to generate it without knowing dividing details. Projection and city design professionals will write their Python scripts and perform tests and studies to generate optimal combinations of small jobs. They don't care the rust details in generation a job, their scripts are not part of this project and they should have the freedom to choose not to focus too much on computer science but on their professionals, which is the strongest power of Python as a glue.

What you proposed makes sense and is in fact not contradictory to a Python shell. It's just a matter of on which side we realize it of the interface. I do not insist which side it should be, what I care is to make it useful to as more features as possible. So of course we can still write this in rust, but we better consider to extend the ability of the interface too (it can be just the GUI/CLI now) to make full use of this feature.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I'm still concerned with the amount of lift and complexity it will add, plus how much it will slow down other feature development (we'll need to consider all sorts of edge cases as a result of allowing arbitrary user pipelines)

I definitely don't think this is something that'd be needed for #29. It needs a few extra settings at most. Maybe if they had the ability to construct arbitrary pipelines they could use them to solve #29, but it's still a lot of work, just given to the user as a Python script, instead of in the Rust core that can be more beneficial to everyone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what you mean and definitely we won't start with too big an architecture. These are just some inputs that might be useful when you decide how to expose the batch feature to user, it might be some possibilities if you are thinking how to make it work for as many cases as possible, but don't feel stressed with the information and these are not rules.

Let's back to the original question because either way some problems always need to be solved. The current pipeline includes the map transformation and what's your current idea to make it fit in batches? The map transformation is defined for one pipeline only, if defining it to work on the original map selection, each batches will need to perform different operations, like rotating an original selection does not mean the same rotation on small jobs.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, might be something we need to figure out for the future. For now I am probably going to only support batching on axis-aligned bounding boxes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK you can keep working on your code and I'll try to figure out a way to let this support. I'll let you know first once I have some ideas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants