Make md5 hash independent of platform line endings #722

Merged
larsoner merged 4 commits into sphinx-gallery:master from sdhiscocks:md5sum_universal on Jul 7, 2020

@sdhiscocks
Contributor

Currently, the md5 hash for Python files depends on the platform the file is on, because line endings differ. This change reads Python files in the hashing function with universal line endings, so that the md5 hash is independent of the platform being used.

Surrogate escapes are used to ensure that any decoding errors are handled without data loss.

This is useful when adding complex or long-running examples to a repository that may be used across platforms.
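
A minimal sketch of the approach described above, assuming UTF-8 source files. The helper name `md5_universal` and the sample file contents are hypothetical and are not taken from the sphinx-gallery code base:

```python
import hashlib
import os
import tempfile


def md5_universal(path):
    """Hash a text file so that LF and CRLF copies give the same digest."""
    # Text mode with the default newline=None translates "\r\n" and "\r"
    # to "\n" on read (universal newlines). errors="surrogateescape" maps
    # undecodable bytes to lone surrogates instead of raising an error.
    with open(path, "r", encoding="utf-8", errors="surrogateescape") as f:
        text = f.read()
    # Encoding with the same error handler restores the original bytes,
    # so nothing is lost apart from the normalised line endings.
    data = text.encode("utf-8", errors="surrogateescape")
    return hashlib.md5(data).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    unix_path = os.path.join(tmp, "unix.py")
    windows_path = os.path.join(tmp, "windows.py")
    # Same content, different line endings; \xe9 is not valid UTF-8 here.
    with open(unix_path, "wb") as f:
        f.write(b"print('hi')\n# caf\xe9\n")
    with open(windows_path, "wb") as f:
        f.write(b"print('hi')\r\n# caf\xe9\r\n")
    unix_digest = md5_universal(unix_path)
    windows_digest = md5_universal(windows_path)

print(unix_digest == windows_digest)
```

Both copies should hash as if they had been written with `\n` line endings, and the stray `\xe9` byte survives the round trip instead of raising a `UnicodeDecodeError`.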

@codecov
codecov bot commented Jun 30, 2020

Codecov Report

Merging #722 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #722   +/-   ##
=======================================
  Coverage   97.47%   97.47%           
=======================================
  Files          32       32           
  Lines        3560     3571   +11     
=======================================
+ Hits         3470     3481   +11     
  Misses         90       90           
| Impacted Files | Coverage Δ |
|---|---|
| sphinx_gallery/backreferences.py | 96.31% <100.00%> (ø) |
| sphinx_gallery/gen_gallery.py | 95.04% <100.00%> (ø) |
| sphinx_gallery/gen_rst.py | 97.58% <100.00%> (ø) |
| sphinx_gallery/tests/test_full.py | 99.77% <100.00%> (+<0.01%) ⬆️ |
| sphinx_gallery/tests/test_gen_rst.py | 99.13% <100.00%> (+<0.01%) ⬆️ |
| sphinx_gallery/utils.py | 96.59% <100.00%> (+0.12%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0022f0e...df69b7d.

@lucyleeow
Contributor

Thanks @sdhiscocks. Maybe we could also add a test in test_full.py that checks the value of the md5 hash of an example? Since it will be run on Windows and Linux in our CI, it would check this. WDYT @larsoner?

@larsoner
Contributor

Sure, seems like that would work

lucyleeow added the bug label on Jun 30, 2020
@lucyleeow
Contributor

Thanks @sdhiscocks, @larsoner merge when you're happy/CI's green!

This ensures consistent md5sums for text files, e.g. Python files and
IPython Notebooks, while keeping md5sums valid for binary files.
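
The text/binary split in this commit message could look roughly like the sketch below. The function name `file_md5` and its `text` flag are hypothetical names chosen for illustration, not the signature used in sphinx-gallery:

```python
import hashlib
import os
import tempfile


def file_md5(path, text=False):
    """Return the md5 hex digest of a file.

    With text=True the content is read with universal newlines, so the
    digest ignores line-ending differences. With text=False the raw
    bytes are hashed unchanged, which is what binary files need.
    """
    if text:
        with open(path, "r", encoding="utf-8", errors="surrogateescape") as f:
            data = f.read().encode("utf-8", errors="surrogateescape")
    else:
        with open(path, "rb") as f:
            data = f.read()
    return hashlib.md5(data).hexdigest()


# The 8-byte PNG signature contains "\r\n", which must not be rewritten.
payload = b"\x89PNG\r\n\x1a\n"
with tempfile.TemporaryDirectory() as tmp:
    png_path = os.path.join(tmp, "signature.png")
    with open(png_path, "wb") as f:
        f.write(payload)
    binary_digest = file_md5(png_path)
    text_digest = file_md5(png_path, text=True)

print(binary_digest == hashlib.md5(payload).hexdigest())
print(text_digest == binary_digest)
```

Hashing the binary file in text mode collapses the `\r\n` inside the signature, so its digest no longer matches the bytes on disk. That is why only text files should go through the newline-normalising path.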
larsoner merged commit 358c4c0 into sphinx-gallery:master on Jul 7, 2020
@larsoner
Contributor

larsoner commented Jul 7, 2020

Thanks @sdhiscocks !

sdhiscocks deleted the md5sum_universal branch on July 8, 2020 07:33
sdhiscocks added a commit to sdhiscocks/sphinx-gallery that referenced this pull request on Aug 3, 2023
This resolves an issue where the system default encoding for opening a
file is not UTF-8, whereas the default encoding of the string encode
method is UTF-8, so the hash differs depending on the OS.

Related to sphinx-gallery#722, which originally attempted to resolve
inconsistencies across OSs.
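
The encoding mismatch described in this follow-up commit can be reproduced with a short sketch. Here `cp1252` stands in for a non-UTF-8 system default, and the helper name `md5_text` is hypothetical:

```python
import hashlib
import os
import tempfile


def md5_text(path, encoding):
    """Hash a text file that is decoded with the given encoding."""
    with open(path, "r", encoding=encoding, errors="surrogateescape") as f:
        text = f.read()
    # str.encode() defaults to UTF-8 on every platform; it is spelled out
    # here to make the asymmetry with the decoding step visible.
    data = text.encode("utf-8", errors="surrogateescape")
    return hashlib.md5(data).hexdigest()


payload = "# caf\u00e9\n".encode("utf-8")
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")
    with open(path, "wb") as f:
        f.write(payload)
    # Decoding and encoding both use UTF-8: the bytes round-trip exactly.
    utf8_digest = md5_text(path, "utf-8")
    # Decoding as cp1252 (assumed here as a stand-in for a platform default)
    # and encoding as UTF-8 produces different bytes.
    cp1252_digest = md5_text(path, "cp1252")

raw_digest = hashlib.md5(payload).hexdigest()
print(utf8_digest == raw_digest)
print(cp1252_digest == raw_digest)
```

Passing the same explicit encoding to both `open()` and `str.encode()` removes the dependence on the platform default, which is the property the commit message is after.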