Demo of merging pygeos source into Shapely repo#1221

Closed
jorisvandenbossche wants to merge 392 commits intoshapely:mainfrom
jorisvandenbossche:merge-pygeos

Conversation

@jorisvandenbossche
Member

@jorisvandenbossche jorisvandenbossche commented Nov 6, 2021

Nothing final; this is just an experiment with a possible workflow for #962 (comment), to see what it produces and to decide what exactly we want (cc @caspervdw).

I followed the approach outlined in https://stackoverflow.com/a/10548919/653364. In the end this doesn't use git subtree, but a simple merge of a remote. It also doesn't add the pygeos source to a subdirectory; instead, I renamed the few conflicting files/dirs in the pygeos source in advance.

This results in a cleanly merged repo with history. You can see the git log of the created branch at https://github.com/jorisvandenbossche/Shapely/commits/merge-pygeos (and e.g. see recent commits of both the shapely repo and the pygeos repo intertwined).

The following commands completely reproduce the result (so this can easily be repeated whenever we are ready for the merge), starting from two fresh clones:

git clone https://github.com/pygeos/pygeos.git pygeos-clean
cd pygeos-clean
# update commit messages
git-filter-repo --message-callback 'return re.sub(b"#([0-9]+)", b"pygeos/pygeos#\\1", message)'
git-filter-repo --message-callback 'return message.removeprefix(b"[Done] ")'
# rename conflicting files / dirs
git mv MANIFEST.in MANIFEST_pygeos.in 
git mv pyproject.toml pyproject_pygeos.toml 
git mv README.rst README_pygeos.rst 
git mv setup.cfg setup_pygeos.cfg
git mv setup.py setup_pygeos.py
git mv .gitignore .gitignore_pygeos
git mv docs/ docs_pygeos
git mv ci/install_geos.cmd ci/install_geos_pygeos.cmd
git mv ci/install_geos.sh ci/install_geos_pygeos.sh
git commit -am "Migration to Shapely: rename conflicting files"
# go back to the parent directory, so that both clones are siblings
cd ..
git clone https://github.com/Toblerity/Shapely.git shapely-clean
cd shapely-clean
# adding pygeos as a remote and fetching it
git remote add pygeos ../pygeos-clean
git fetch pygeos --tags
git merge --allow-unrelated-histories -m "Merge PyGEOS source code into Shapely" pygeos/master
# for now this is pushed to my fork; once final, it would be pushed to shapely:main
git push https://github.com/jorisvandenbossche/Shapely.git HEAD:merge-pygeos
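To preview what the two `--message-callback` steps above do to a commit message, the following is a minimal sketch that applies the same two rewrites to plain `bytes` outside of git-filter-repo. The function name `rewrite_message` and the `#12` sample are illustrative assumptions, not part of the workflow; the prefix is stripped with `startswith` so the sketch also runs on Python versions older than 3.9, where `bytes.removeprefix` is not available.

```python
import re


def rewrite_message(message: bytes) -> bytes:
    """Apply both commit-message rewrites from the commands above in one pass.

    Hypothetical helper for experimenting; git-filter-repo itself receives
    only the body of such a function through --message-callback.
    """
    # Qualify bare "#123" references so GitHub links them to the pygeos repo.
    message = re.sub(rb"#([0-9]+)", rb"pygeos/pygeos#\1", message)
    # Drop a leading "[Done] " marker (same effect as bytes.removeprefix).
    prefix = b"[Done] "
    if message.startswith(prefix):
        message = message[len(prefix):]
    return message


if __name__ == "__main__":
    samples = (
        b"[Done] Separating Missing (NaG) and Empty",  # title from this PR's commit list
        b"Fix error handling in STRtree (#12)",        # "#12" is a made-up reference
    )
    for sample in samples:
        print(rewrite_message(sample).decode())
```

Running the rewrites on a handful of real commit titles like this is a cheap way to check the regular expression before rewriting the whole history.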

caspervdw and others added 30 commits August 30, 2019 20:50
Split single ufuncs.c file in multiple files
[Done] Separating Missing (NaG) and Empty
…adme

Update README to include conda-forge installation instructions
@coveralls

coveralls commented Nov 6, 2021

Coverage remained the same at 85.362% when pulling 177524e on jorisvandenbossche:merge-pygeos into 1f60441 on Toblerity:main.

@jorisvandenbossche
Member Author

One thing I already noticed: the commits from pygeos have the #xxx issue or PR reference in their commit messages, and those now link to the wrong, unrelated Shapely issues.

We could rewrite the commit messages in the pygeos source before merging, renaming each #xxx to pygeos/pygeos#xxx, which should then preserve the correct links to the PRs/issues with the relevant discussion. (Example showing that GitHub links this form: pygeos/pygeos#1)

Currently the workflow above preserves the commit hashes as they are in the pygeos repo (since it's a clean merge), whereas editing the commit messages would of course change the hashes. So it is a trade-off between preserving hashes and preserving correct links in the git log on GitHub.

(I think I personally would prefer getting correct links)
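To make the hash side of this trade-off concrete, here is a small sketch of how git derives an object id: the SHA-1 of a `<type> <size>\0` header followed by the object body. It assumes a repository using the default SHA-1 object format; the author identity and timestamp in `commit_body` are made up, and the tree id is git's well-known empty tree.

```python
import hashlib


def git_object_id(kind: bytes, body: bytes) -> str:
    """Return the id git assigns to an object of the given type and body."""
    header = kind + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(message: bytes) -> bytes:
    """Build a root-commit body; identity and timestamp are hypothetical."""
    return (
        b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"  # the empty tree
        b"author A U Thor <author@example.com> 1636200000 +0000\n"
        b"committer A U Thor <author@example.com> 1636200000 +0000\n"
        b"\n" + message + b"\n"
    )


if __name__ == "__main__":
    before = git_object_id(b"commit", commit_body(b"Fix STRtree (#12)"))
    after = git_object_id(b"commit", commit_body(b"Fix STRtree (pygeos/pygeos#12)"))
    print(before)
    print(after)
    print("hash changed:", before != after)
```

Because each commit body also records the ids of its parents, rewriting one message changes the id of that commit and of every commit that descends from it.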

@jorisvandenbossche
Member Author

Such a rewrite of the commit messages could easily be achieved with:

git-filter-repo --message-callback 'return re.sub(b"#([0-9]+)", b"pygeos/pygeos#\\1", message)'

using https://github.com/newren/git-filter-repo, the recommended alternative to git's own git filter-branch (see the warning at https://git-scm.com/docs/git-filter-branch#_warning).
See https://www.mankier.com/1/git-filter-repo#Callbacks for the documentation about the callback.
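One caveat worth checking before running the rewrite: the pattern `#([0-9]+)` also matches references that are already qualified with a repository name. The sketch below compares it with a guarded variant that uses a negative lookbehind. The guarded pattern and the sample messages are assumptions for illustration; whether any pygeos commit message actually contains an already-qualified reference has not been verified here.

```python
import re

# Pattern from the command above: matches every "#<digits>".
NAIVE = re.compile(rb"#([0-9]+)")
# Hypothetical variant: skip "#<digits>" that directly follows a word
# character or "/", i.e. references that already carry an "owner/repo" prefix.
GUARDED = re.compile(rb"(?<![\w/])#([0-9]+)")

REPLACEMENT = rb"pygeos/pygeos#\1"


def qualify(pattern, message: bytes) -> bytes:
    """Rewrite issue/PR references in a commit message using the given pattern."""
    return pattern.sub(REPLACEMENT, message)


if __name__ == "__main__":
    for sample in (b"Fix bug (#12)", b"See pygeos/pygeos#424"):
        print(sample, qualify(NAIVE, sample), qualify(GUARDED, sample))
```

The guarded variant leaves text such as `GH#12` untouched as well, so which behavior is preferable depends on the reference styles that actually occur in the history.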

mwtoews and others added 9 commits November 9, 2021 12:53
* Fix error handling in STRtree

* Formatting

* Update CHANGELOG.rst

Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
* Document known issue with to/from shapely (pygeos/pygeos#424)

* Document lazy evaluation in points (pygeos/pygeos#397)

* Document behavior of get_dimensions (pygeos/pygeos#289)

* Update pygeos/creation.py

Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
@jorisvandenbossche jorisvandenbossche force-pushed the merge-pygeos branch 2 times, most recently from efd7a31 to 780020f Compare November 29, 2021 09:12
@jorisvandenbossche
Member Author

OK, the merge is done!

@jorisvandenbossche jorisvandenbossche deleted the merge-pygeos branch December 1, 2021 08:47