Skip to content

BLD: create a LICENSE_BUNDLED.txt file from third_party licenses#50745

Closed
mattip wants to merge 3 commits intopytorch:masterfrom
mattip:license
Closed

BLD: create a LICENSE_BUNDLED.txt file from third_party licenses#50745
mattip wants to merge 3 commits intopytorch:masterfrom
mattip:license

Conversation

@mattip
Copy link
Copy Markdown
Contributor

@mattip mattip commented Jan 19, 2021

Fixes #50695.

Rather than maintain a LICENSE_BUNDLED.txt by hand, this build it out of the subrepos.

I copied and adapted the sdist handling from Numpy added a separate file, so the LICENSE.txt file of the repo remains in pristine condition and the GitHub website still recognizes it. If we modify the file, the website will no longer recognize the license.

This is not enough, since the license in the wheel wheel and sdist is not modified. Numpy has a separate step when preparing the wheel to concatenate the licenses. I am not sure where/if the conda-forge numpy-feedstock also fixes up the license.

Should I commit commited the artifact to the repo and add added a test that reproducing the file is consistent.

Edit: now the file is part of the repo.

Edit: rework the mention of sdist. After this is merged another PR is needed to make the sdist and wheel ship the proper merged license.

@mattip
Copy link
Copy Markdown
Contributor Author

mattip commented Jan 19, 2021

Here is the file the new command creates

LICENSES_BUNDLED.txt

Edit: link to latest version of the file

@mattip
Copy link
Copy Markdown
Contributor Author

mattip commented Jan 19, 2021

There are a few quirks in the file. There are some third_party projects PyTorch included multiple times (benchmark has 4 copies, clog has 3, googletest has 10, there are more), I included each one separately since they are not all on the same version. Most of the licenses are copies of the original BSD/MIT/Boost/Apache, except:

  • neon2sse removes one of the BSD clause "Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution." Strange, and it makes the license unapproved. uses a BSD-Source-Code license that I had not seen before.
  • nccl removes the phrase "AND CONTRIBUTORS" from the last clause's "BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'', which also makes the license non-compliant. is a bit different but is fine.
  • fmt adds an additional exception after the license.
  • eigen offers a numer of license files COPYING*. I chose the BSD one.

@seemethere seemethere self-requested a review January 19, 2021 18:51
@seemethere seemethere added this to the 1.8.0 milestone Jan 19, 2021
@seemethere seemethere added the module: build Build system issues label Jan 19, 2021
@mattip
Copy link
Copy Markdown
Contributor Author

mattip commented Jan 21, 2021

@rgommers do you have an idea how/where to merge the LICENSE files when generating a wheel? I guess I could expand the bdist_wheel command like this does by wrapping the sdist command?

Comment thread third_party/LICENSES_BUNDLED.txt Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 21, 2021

Codecov Report

Merging #50745 (6f77914) into master (a469336) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #50745      +/-   ##
==========================================
- Coverage   80.65%   80.65%   -0.01%     
==========================================
  Files        1912     1912              
  Lines      208048   208048              
==========================================
- Hits       167810   167808       -2     
- Misses      40238    40240       +2     

Copy link
Copy Markdown
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to modify the NOTICE file too since it also contains paragraphs discussing thrid-party license

Comment thread third_party/LICENSES_BUNDLED.txt Outdated
@mattip
Copy link
Copy Markdown
Contributor Author

mattip commented Jan 21, 2021

do we need to modify the NOTICE file too since it also contains paragraphs discussing third-party license

Each snippet taken from somewhere else and merged into the pytorch sources (not the third party directory) should be attributed in the source. I think another file should be used to maintain the required attributions for such code. Is that the NOTICE file? The point of the current PR is to add attributions for third party libraries in the third_parties directory which are maintained separately from pytorch itself. If things are missing from NOTICES maybe it can be fixed in a different PR?

Copy link
Copy Markdown
Collaborator

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, thanks @mattip. Main comment on dealing with sdist/wheels in other PRs.

Comment thread third_party/LICENSES_BUNDLED.txt Outdated
Comment thread setup.py Outdated
Comment thread third_party/LICENSES_BUNDLED.txt Outdated
@mattip
Copy link
Copy Markdown
Contributor Author

mattip commented Jan 26, 2021

Rebased and squashed. Ready for review, assuming CI passes.

@mattip mattip changed the title WIP: create a LICENSE_BUNDLED.txt file from third_party licenses BLD: create a LICENSE_BUNDLED.txt file from third_party licenses Jan 26, 2021
Copy link
Copy Markdown
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@walterddr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Copy Markdown
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. Thanks for the contribution @mattip!

Comment thread third_party/build_bundled.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont know if we have a guidance but was wondering if we should put this in tools/ or scripts/ instead of third_party/. CC @seemethere @malfet

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference, but it would be good to have a shebang at the head of the file

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted

Comment thread third_party/build_bundled.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference, but it would be good to have a shebang at the head of the file

Comment thread third_party/build_bundled.py
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
@facebook-github-bot
Copy link
Copy Markdown
Contributor

@walterddr merged this pull request in 24eab1d.

seemethere added a commit that referenced this pull request Apr 25, 2022
This script was hanging up on a couple of licenses not being parsed
correctly / not accounted for in the [original pass through](#50745)

Should resolve issues with tests not running 100% correctly. (originally reported in #76128)

Also adds an option to run the script and specify a different output file:
* CLI Argument `--out-file`
* Environment Variable: `PYTORCH_THIRD_PARTY_BUNDLED_LICENSE_FILE`

Signed-off-by: Eli Uriegas <eliuriegasfb.com>

[ghstack-poisoned]
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
…orch#50745)

Summary:
Fixes pytorch#50695.

Rather than maintain a LICENSE_BUNDLED.txt by hand, this build it out of the subrepos.

I ~copied and adapted the sdist handling from Numpy~ added a separate file, so the LICENSE.txt file of the repo remains in pristine condition and the GitHub website still recognizes it. If we modify the file, the website will no longer recognize the license.

This is not enough, since the license in the ~wheel~ wheel and sdist is not modified. Numpy has a [separate step](https://github.com/MacPython/numpy-wheels/blob/master/patch_code.sh) when preparing the wheel to concatenate the licenses. I am not sure where/if the [conda-forge numpy-feedstock](https://github.com/conda-forge/numpy-feedstock/) also fixes up the license.

~Should~ I ~commit~ commited the artifact to the repo and ~add~ added a test that reproducing the file is consistent.

Edit: now the file is part of the repo.

Edit: rework the mention of sdist. After this is merged another PR is needed to make the sdist and wheel ship the proper merged license.

Pull Request resolved: pytorch#50745

Reviewed By: seemethere, heitorschueroff

Differential Revision: D26074974

Pulled By: walterddr

fbshipit-source-id: bacd5d6870e9dbb419a31a3e3d2fdde286ff2c94
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

License file in wheels and conda packages does not contain any third-party licenses

8 participants