Conversation
|
After reading the discussion over on #2691, I'm really not sure what benefit using a submodule has over just having the pruned vendor/ directory in this repo, and I can see a number of issues with having a submodule.
The point raised in #2691 by @jwendell also still stands - I cloned istio a few days ago, and it took ~10-15m to get into a state where I could build. We could switch to a world where everything is gated behind scripts, and Istio becomes a 'special' project that cannot be built/worked with using plain ol' tooling, but I don't (personally) think this is a desirable state. (Just my 2c as an outside collaborator who has worked closely with the k/k dep-managed pieces, including the staging repo/bot that 'exports' subdirectories from the k/k mono-repo so they can be consumed by 3rd-party projects.) |
|
have you tried this branch? there is no symlink, no issue, and it's smaller and faster, without polluting the main repo |
|
let me answer your points one by one:
there is no symlink and go get works fine with submodules
there is nothing special/extra to understand - it's just git. And if you "go get istio.io/istio/" you don't even need any setup. Also, today you have to grok our overly complex makefiles and init scripts; this would just make things simpler, if anything
When someone wants to look at istio's source code - maybe a README/doc change... plenty of reasons but that's not actually the driver for this (will come back to that at the end)
There are - some docs and samples ask to clone istio when the release tar falls behind the documentation (but again this isn't the issue)
Did you try? I did, and it just works fine
I agree, and that's in part why I don't want to check in vendor in the main repo - but again, this option/PR is an 8 MB download |
I agree with you - this being said, this is actually where we are now, and this PR would help simplify things. If we do our job right, "make" is all you need to do after cloning.

So, back to why we should not mix millions of lines of code we don't author with code we do author in the same repo/revision history/set of PRs, when there is an easy way to separate the two yet still achieve the goals of a hermetic and faster start... While I would hope it's self-evident, I can add some points:
|
|
Re: "extra barrier of entry for new users to understand": any such users besides myself should know that you'll need to run a command like "git submodule update --init --recursive" to populate the vendor/ directory. I didn't find any docs that explained how git correlates commit SHAs across the repos (i.e., for a given commit SHA on the istio repo, which SHA of the vendor repo to use). There's only 1 branch using this scheme, so I couldn't readily experiment with how git handles changing branches (I could experiment with this branch vs. master [and back], which isn't a smooth transition, but the annoyance would decrease over time). From what I read it seems like people have to explicitly request a submodule update (e.g., "git submodule update"). I suppose this is akin to needing to re-run the dep tool, and it could be hooked up to the makefile, but it'll probably catch a few people off-guard that vendor/ doesn't update like the rest of the tree. |
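For anyone wondering how git correlates the SHAs: the superproject stores the submodule's exact commit SHA as a special "gitlink" tree entry, which is what `git submodule update` checks out. A minimal local sketch (repo names are invented for illustration; this is not the istio setup itself):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# A stand-in for the vendor repo, with a single commit.
git init -q vendor-src
git -C vendor-src -c user.email=a@b -c user.name=a commit -q --allow-empty -m "deps v1"

# The superproject adds it as a submodule at vendor/.
git init -q super && cd super
git -c protocol.file.allow=always submodule add -q "$tmp/vendor-src" vendor
git -c user.email=a@b -c user.name=a commit -q -m "add vendor submodule"

# Mode 160000 marks a gitlink: the recorded SHA pins the exact vendor
# revision for this superproject commit.
git ls-tree HEAD vendor
```

So every commit of the superproject unambiguously answers "which SHA of the vendor repo to use" - but nothing updates the working tree automatically, hence the explicit `git submodule update`.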
costinm
left a comment
Let's see how it works - if problems are found with submodule we can checkin vendor directly.
Thanks Laurent !
/lgtm
|
/test all [submit-queue is verifying that this PR is safe to merge] |
|
/lgtm cancel Let's please discuss this properly. I only see 3 people out of a repo of 70 people. |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: costinm. The full list of commands accepted by this bot can be found here.
Details
Needs approval from an approver in each of these OWNERS files:
You can indicate your approval by writing |
|
My two cents: If I have to update dependencies (where they are no longer compatible with the existing code in istio/istio), how will this work? I can't send a PR to istio/vendor because it will break ongoing tests/PRs in istio/istio. Nor can I simply send an actual code PR that depends on updated dependencies, as it's not going to pass without the dependency update. We would then have to force-merge one to let the other go through. That becomes even more of a problem when the PR in question requires a long review.

Is there a solid technical disadvantage to having a large repo (80M or whatever)? Does it have any impact on the final product we ship, or does it impact our day-to-day development? The big downside I can see is that our PRs might be bigger if we update vendor and code together. PRs with goldens are already hard to review; a PR with 100s of files will be equally hard.

On the other hand, this proposal of storing dependencies in a separate repo is not really helping either. It just moves the problem elsewhere, where we often end up having to coordinate across two repos for a dependency update. Imagine the consternation when one PR updates a dependency in istio/vendor and another PR undoes the update. None of us knows much about the code in the dependencies, so it's going to be really hard to tell whether one PR is reverting another PR's update or not. This problem gains more importance when we have concurrent PRs in istio/istio, where one PR requires updated dependencies for the code it adds, while the other is still using an older version of the same code and cannot pass with updated dependencies.

@jwendell and co - I understand the use case RedHat has. But does it have to involve committing the entire vendor tree in istio/istio?
Can't you track istio/istio and have a redhat/istio-vendor repo that literally commits all the dependencies, such that on any machine where you want to compile istio you could essentially clone these two git repositories alone? You just run |
|
@rshriram Thanks for your comments. You summarized the situation well. The biggest cons of having a single repo (repo size, and PRs that update dependencies) are small compared to the benefits. About this PR in particular: as you and @mattdelco said, I don't see a difference between having this submodule stuff and having to run

Regarding our downstream building: yep, this is the situation we have right now - having to maintain 2 repos, or a fork of the istio repo with vendor in. While not the ideal solution, it might work (at the cost of some extra work). |
|
My point: we can submit, test how it works with tools like go get and dep update, and see if we're happy. If we're not happy after a few days, we can revert - minimal harm done. This PR shouldn't break anything, and it's hard to evaluate how this will work without trying it. The main benefit of this PR is that it allows us to play with checked-in vendor and evaluate. |
|
Re. developer workflow: there should be no change, except a decrease in time for the initial build. Git submodules are a standard git feature - it's not a symlink. If things go well, I would be happy to add istio/api as a submodule as well. If the submodule creates problems, we can revisit. It's pretty clear that people have very different and strong opinions, so let's get more data. I'm of the strong opinion that we should check vendor into istio, but happy to see other options. |
|
When I was reading up on submodules I came across various warnings about git's poor handling of transitioning between a submodule and a directory with the same name, so the experiment might not be so lightweight to try and/or revert. Regardless, someone would first need to update init.sh to make calls to git rather than dep (and this person, or another, would need to think about how to allow building from a source tar). I don't feel like I have enough data to compare how the options behave once there's a fair amount of change history in both, nor do I have enough background on why git might have less transfer overhead with a separate repo (nor do I know how branching or reproducibility is handled). If being lightweight were a high priority, then perhaps personal branches on istio/istio should be banned in practice, because those also seem to add overhead. |
|
As requested, I cooked up some changes to hook up the Makefile. I only focused on the common case of a dev trying to build the go binaries (i.e., I punted on tests and how someone updates the vendor repo with newer content): vendor-sub...mattdelco:vendor-sub Some of the key points in the Makefile:
Some of the key points in bin/init.sh:
release/store_artifacts.sh and tools/istio-docker.mk is just bring in a fix from another PR so I could do a test release build. |
|
In my experience, having make pull dependencies during the build leads to a lot of frustration and problems. Make is supposed to build. 'git pull' and similar tools are intended for getting the sources - when a developer does 'git pull' he may fetch the corresponding deps as well. We can wrap it in a make target or script - but this is a distinct operation from building. A build should only use local files and ideally not download anything from the net.

We spent quite a bit of time optimizing the build time for presubmit - the current Makefile attempts to keep the targets a developer typically uses ('make pilot', etc.) separate from the targets used in presubmit/CICD, and I would appreciate adding new behavior in new targets.

Vendor doesn't solve all problems - we still need init to download an envoy (which could also be the result of a build, so a user who does a local build and copies the binary should end up using what he built), and there are a few tools and helpers that need to be downloaded, but as part of the 'getting stuff' stage. Once you start building, it should be possible to build and test offline.
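The fetch/build split being argued for here can be illustrated with a toy, network-free sketch (the `fetch`/`build` function names and file layout are invented for illustration, not actual istio targets):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# Stand-in for the network: where deps live before anyone fetches them.
mkdir -p remote && echo "dep v1" > remote/dep.txt

# "Getting stuff" stage: the only step allowed to reach outside the tree.
fetch() { mkdir -p vendor && cp remote/dep.txt vendor/; }

# Build stage: must succeed (or fail) using local files only,
# never downloading anything itself.
build() {
  [ -f vendor/dep.txt ] || { echo "error: run fetch first" >&2; return 1; }
  echo "built against $(cat vendor/dep.txt) (offline)"
}

build || true   # fails cleanly: nothing has been fetched yet
fetch           # one-time "getting stuff" step
build           # now works, and would keep working with the net unplugged
```

The point is that the second `build` (and any `test` that follows it) could run fully offline, which is exactly the hermeticity property being discussed.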
|
|
It seems that many are assuming there are no downsides to submodules, so I'm going to try to enumerate the ones I'm aware of.

1 - git clone no longer just works as expected with vendor support

Many developers used to vendored projects will be used to the following steps
In order for this to work, developers will now need to replace 'git clone' with either
I'm sure many will argue this is a minor change, but either way it is a change in the developer workflow.

2 - git pull no longer works as expected

Many developers will expect to update their projects by following a workflow similar to
Developers will expect this to fetch the latest updates from the repo and merge the changes so their copy is up to date; unfortunately, when using submodules this doesn't happen, as the submodule is left in the prior state (remember this fact, it is relevant in the next issue). The new workflow for developers becomes
Again, some may argue this is a minor change, but yet again it is another change in the developer workflow.

3 - the potential for losing submodule changes

Remember that developer who only ran 'git pull upstream'? He has a local copy of the Istio repository which is up to date with master (after all, he issued the git pull before working on this change), has made changes to his local istio repo, and now wants to commit them. The developer adds all the files - unfortunately also including the vendor submodule - and pushes this to his own repository (no conflicts raised, since he is already up to date) before submitting his PR. The PR gets reviewed, nobody catches the vendor submodule being included by mistake, and it is accepted. You now have a commit history which has lost a change: this developer has essentially reverted a previous change, and nothing in the git tooling prevented him from doing so. To me this is one of the most serious issues with submodules; had the vendor directory been in the same repo, the developer's merge would have updated everything and, if he was behind, the git tooling would have noticed and warned him.

4 - Managing multiple repos

I think this is an obvious issue, so I won't do anything other than raise awareness.

In summary, we seem to be choosing between two different approaches to vendoring, since I believe we already have consensus that vendoring is an appropriate way forward. On one hand you have the option to check the vendor directory into the same repository, with the downside of adding to its size (8MB compressed, from what Laurent said earlier) but with a much simpler and, at least IMO, more reliable workflow for everyone. On the other side you have the option to use submodules, which has the benefit of keeping the vendor directory separate from the main code but comes with a number of issues, including one which I believe is serious enough to question the approach.

There may be other issues lurking in submodules - to be honest, I've just spent an hour playing around with the basic workflows looking for issues, and I've already found the ones mentioned above. I am happy to discuss this with anyone before we try to reach a decision; although I'm travelling again next week, I will definitely attend the test/release meeting, which I hope will prioritise this issue. |
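Issues 2 and 3 are easy to reproduce locally. A small sketch (all repo names invented) showing that `git pull` moves the *recorded* submodule SHA but leaves the submodule's working tree at the old revision until `git submodule update` is run:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
g() { git -c user.email=a@b -c user.name=a -c protocol.file.allow=always "$@"; }

# Upstream "vendor" repo, and a superproject that pins it as a submodule.
g init -q vendor-src
g -C vendor-src commit -q --allow-empty -m "deps v1"
g init -q upstream && cd upstream
g submodule add -q "$tmp/vendor-src" vendor
g commit -q -m "add vendor submodule"
cd ..

# A developer clones recursively, as the new workflow would require.
g clone -q --recurse-submodules "$tmp/upstream" dev

# Upstream bumps the vendored deps to a new revision.
g -C vendor-src commit -q --allow-empty -m "deps v2"
g -C upstream/vendor pull -q
g -C upstream add vendor && g -C upstream commit -q -m "bump vendor"

# The developer pulls: the recorded SHA moves, but the checkout does not.
g -C dev pull -q
recorded=$(g -C dev ls-tree HEAD vendor | awk '{print $3}')
checked_out=$(g -C dev/vendor rev-parse HEAD)
[ "$recorded" != "$checked_out" ] && echo "vendor/ is stale after git pull"

# Only an explicit submodule update re-syncs the working tree.
g -C dev submodule update
[ "$(g -C dev/vendor rev-parse HEAD)" = "$recorded" ] && echo "vendor/ synced"
```

Note the stale state after `git pull` is exactly what lets a developer re-stage the old submodule SHA and silently revert someone else's vendor bump (issue 3).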
|
@rshriram We have already discussed alternatives to vendoring within the community; however, it does involve work on our side, and I would expect other companies in similar positions would also need to duplicate this within their environments. At the moment we are maintaining a fork of istio which includes the vendor directory and, in order to simplify the tracking, we have discussed creating branches off the main codebase, containing just the vendor content, whenever we need to push to our internal git repo. So yes, we can build tooling to work around this, but our obvious preference is to have this supported in the community, since it benefits not only ourselves but others. Even with the submodule approach to vendoring we will likely still need tooling, since we will have to handle the management of the submodule. |
|
@knrc thanks for your comments/insights!

1 & 2: how about "go get istio.io/istio" working for the initial setup (it'll get the submodule; I tested it with fortio), or "make" (which is needed anyway for various setup, getting envoy, etc.)? I would think people who want to contribute can be comfortable with one fairly standard command to type.

3: I don't understand the issue:

4: we already have multiple repos: api/, tools/, contrib/, gogo, mixerclient, proxy, website, etc., and again the vendor submodule is transparent, with 1 line in the makefile |
|
@knrc istio needs and produces containers; go build doesn't, so we need some minimal "make" targets in any case. Also, it's really not that hard to notice that you need a recursive clone or |
- attach_workspace:
    at: /go
- run: bin/testEnvRootMinikube.sh start
- - restore_cache:
We still need the cache for envoy and other downloads.
I thought you mentioned that's dealt with through the attach_workspace
|
attach_workspace works for build -> integration. That means test / lint / etc. will not have access to it. Let's keep the cache for now (without vendor). Right now I think envoy will be the only cached component - but we may cache other things (depending on how well go 1.10 compilation works; I was considering trying incremental builds).
|
|
@ldemailly PR needs rebase |
|
/test e2e-suite-rbac-auth |
|
@ldemailly: The following test failed, say
Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
mixer test failure is unrelated. |
|
@ldemailly I haven't checked out either of these branches to try the code, although I think the main requirement in my mind (air-gapped reproducible builds) is met by either solution. I'm a fan of executing technical work as two-way doors - in other words, things that can easily be reversed if problems are found. A vendor commit is somewhat permanent without a history rewrite (as already covered in this PR). Cheers |
|
Approved by Test & Release WG and Sven. Merging |
|
writing an FAQ and email before merging actually |
Alternative to #2691 with all the benefits and more, without the downsides:
Submodule is only:
Receiving objects: 100% (4145/4145), 8.26 MiB | 930.00 KiB/s, done.
Note: adding pruning instruction to Gopkg.toml makes vendor be 54Mb
instead of 407Mb
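The pruning note above presumably refers to dep's `[prune]` section; a sketch of what such a Gopkg.toml stanza looks like (the exact options enabled in this PR are an assumption):

```toml
# Hypothetical pruning stanza; on "dep ensure" these are removed from vendor/.
[prune]
  go-tests = true        # drop *_test.go files
  unused-packages = true # drop packages that are never imported
  non-go = true          # drop non-Go files (docs, CI configs, etc.)
```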
vendor checked in as https://github.com/istio-releases/vendor
also fixes #3483