Skip to content

DO NOT MERGE: fix PHP build issue with upb textformat (version2).#23634

Closed
jtattermusch wants to merge 10 commits intogrpc:masterfrom
jtattermusch:xds_logging2
Closed

DO NOT MERGE: fix PHP build issue with upb textformat (version2).#23634
jtattermusch wants to merge 10 commits intogrpc:masterfrom
jtattermusch:xds_logging2

Conversation

@jtattermusch
Copy link
Copy Markdown
Contributor

Better version #23628 (if it works).

@jtattermusch
Copy link
Copy Markdown
Contributor Author

CC @stanley-cheung

@jtattermusch
Copy link
Copy Markdown
Contributor Author

Local run of DOCKERHUB_ORGANIZATION=grpctesting tools/run_tests/task_runner.py -f php linux artifact has also passed.

@jtattermusch
Copy link
Copy Markdown
Contributor Author

CC @markdroth @stanley-cheung

Copy link
Copy Markdown
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as in #23628: Will this approach make it harder to move forward with #23122 when the time comes?

# See https://github.com/grpc/grpc/issues/23307

# move all .upbdefs.c files from under src/core/ext/upb-generated to src/core/ext/upbdefs-generated
UPBDEFS_OUTPUT_DIR=$PWD/src/core/ext/upbdefs-generated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wind up going with this approach, you will also need to change tools/distrib/check_upb_output.sh to check the upbdefs directory as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will check the situation with #23122 and come back with an opinion.

@stanley-cheung
Copy link
Copy Markdown
Contributor

Several observations and concerns:

  1. I verified that this PR, as is, works for PHP. I am able to build the PHP extension and pass unit test (so that means all the relevant symbols are properly compiled and accessible at run time. I had a previous trouble with this - at one of my earlier attempts, I am able to compile the extension but some symbols are missing at run time. This PR, as is, does not have that problem).

  2. So for this PR, here's the end state:

  • the existing upb-generated directory will contain all the .upb.c, .upb.h and .upbdefs.h files.
  • a new upbdefs-genearted directory will contain all the .upbdefs.c files.
  • That means, there will be no renaming of any files. Just moving the .upbdefs.c files. So there is also no need to change any #include references.
  1. I am a little unsure how this all works. Here's my understanding, please correct if I am wrong:
  • The #include statements only ever reference .upbdefs.h files. So as long as they stay in the same location (i.e. upb-generated directory), there is no need to change any of these references.
  • The .upbdefs.c files provide the implementations of all the symbols required from these .upbdefs.h files, and more importantly, these .upbdefs.c files never reference each other, nor by their corresponding .upbdefs.h file. So as long as each language's "extension"/'gems'/nugets/etc compile these upbdefs.c files, that would be enough to provide the symbols at runtime, and it doesn't matter where they live.
  1. I have a slight concern on step 3 in this PR. Just to clarify, this PR has 4 steps / commits, as of now:
    1. add another operation to gen_upb_api.sh, to move the .upbdefs.c files from upb-generated to upbdefs-generated. So this step will be executed automatically every time gen_upb_api.sh is run
    2. this is just checking in the moved files from step 1. (so this step will be combined with step 1 in the future).
    3. there's a manual step to fix the BUILD files because of these moved .upbdefs.c files
    4. run generate_projects.sh
  • So for step 3 here, does it mean whenever next time there's a change caused by gen_upb_api.sh, that someone will need to remember to fix the BUILD file manually too? That seems a bit unsafe. Can this step be automated somehow too? Coz looks like there will be plenty of churn on these upb generated files in the near future? Or did I misunderstand that, even these changes will be part of the future step 1 too?
  1. Overall I also agree that this is a better solution because there is no need for a specific fix for PHP, and there's no need to rename any files. If we can make this as safe as possible for future churn, I agree this should be the way to go.

# move all .upbdefs.c files from under src/core/ext/upb-generated to src/core/ext/upbdefs-generated
UPBDEFS_OUTPUT_DIR=$PWD/src/core/ext/upbdefs-generated
rm -rf $UPBDEFS_OUTPUT_DIR
cp -r $UPB_OUTPUT_DIR $UPBDEFS_OUTPUT_DIR
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to not have a mkdir -p $UPBDEFS_OUTPUT_DIR prior to this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it works. The cp command takes the entire upb-generated and stores a copy as upbdefs-generated (if the destination directory would exist, the outcome would be different (upb-generated would end up under upbdefs-generated/upb-generated), but we do rm -rf $UPBDEFS_OUTPUT_DIR in the previous step.

@markdroth
Copy link
Copy Markdown
Member

  • So for step 3 here, does it mean whenever next time there's a change caused by gen_upb_api.sh, that someone will need to remember to fix the BUILD file manually too? That seems a bit unsafe. Can this step be automated somehow too? Coz looks like there will be plenty of churn on these upb generated files in the near future? Or did I misunderstand that, even these changes will be part of the future step 1 too?

Note that it's already the case that when the set of files we're generating changes, we need to make the corresponding changes in the BUILD file and then re-run generate_projects.sh. This PR does not make that any worse, as far as I can tell.

@stanley-cheung
Copy link
Copy Markdown
Contributor

  • So for step 3 here, does it mean whenever next time there's a change caused by gen_upb_api.sh, that someone will need to remember to fix the BUILD file manually too? That seems a bit unsafe. Can this step be automated somehow too? Coz looks like there will be plenty of churn on these upb generated files in the near future? Or did I misunderstand that, even these changes will be part of the future step 1 too?

Note that it's already the case that when the set of files we're generating changes, we need to make the corresponding changes in the BUILD file and then re-run generate_projects.sh. This PR does not make that any worse, as far as I can tell.

Ah I see. Thanks for clarifying. I was actually looking more into this and was coming to a similar conclusion - that if someone is adding/changing the list of .protos being generated in gen_upb_api.sh, that same someone would likely be looking into changing the BUILD file anyway.

So the slightly new cognitive load is that .upbdefs.c files will now be generated into a different directory upbdefs-generated.


# move all .upbdefs.c files from under src/core/ext/upb-generated to src/core/ext/upbdefs-generated
UPBDEFS_OUTPUT_DIR=$PWD/src/core/ext/upbdefs-generated
rm -rf $UPBDEFS_OUTPUT_DIR
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier in the script, there's an option of invoking this script and providing a custom location for the upb-generated directory name as $1. Just want to point out that there won't be such option for this new upbdefs-generated directory.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I didn't want to get into details like that in the first iteration of the PR as it's just a technicality.
We could either:

  • stop supporting the $1 argument (which I'd be in favor of if this option it's used very often as these special cases add complexity to the script and I'd like to avoid unnecessary complexity as much as possible).
  • make sure the $1 still works fine (doable, it's just a technicality).

@jtattermusch
Copy link
Copy Markdown
Contributor Author

Several observations and concerns:

  1. I verified that this PR, as is, works for PHP. I am able to build the PHP extension and pass unit test (so that means all the relevant symbols are properly compiled and accessible at run time. I had a previous trouble with this - at one of my earlier attempts, I am able to compile the extension but some symbols are missing at run time. This PR, as is, does not have that problem).

Thanks for confirming!

  1. So for this PR, here's the end state:
  • the existing upb-generated directory will contain all the .upb.c, .upb.h and .upbdefs.h files.
  • a new upbdefs-genearted directory will contain all the .upbdefs.c files.
  • That means, there will be no renaming of any files. Just moving the .upbdefs.c files. So there is also no need to change any #include references.

That's accurate. If the idea of this PR looks good I can improve the comment to make this clearer.

I'd probably also change it to make ".upbdefs.h" files also land in upbdefs-generated (for consistency, after that you'd have one directory for plain upb messages - .upb.h, upb.c; and one directory for upb reflection upbdefs.h and upbdefs.c which is a layout that looks pretty reasonable to me).
That of course would require a bit of extra work (I don't include that part to avoid obscuring the idea of this PR), but should be pretty straightforward.

  1. I am a little unsure how this all works. Here's my understanding, please correct if I am wrong:
  • The #include statements only ever reference .upbdefs.h files. So as long as they stay in the same location (i.e. upb-generated directory), there is no need to change any of these references.
  • The .upbdefs.c files provide the implementations of all the symbols required from these .upbdefs.h files, and more importantly, these .upbdefs.c files never reference each other, nor by their corresponding .upbdefs.h file. So as long as each language's "extension"/'gems'/nugets/etc compile these upbdefs.c files, that would be enough to provide the symbols at runtime, and it doesn't matter where they live.

IMHO the location of .h files is not important at all (of course as long as all the build use the right include paths). .h files are not compilation units (.c files are, .h are just included in them by a preprocessor so their location is completely irrelevant) and the issue we're facing is purely about the resulting object files colliding with each other.

  1. I have a slight concern on step 3 in this PR. Just to clarify, this PR has 4 steps / commits, as of now:

    1. add another operation to gen_upb_api.sh, to move the .upbdefs.c files from upb-generated to upbdefs-generated. So this step will be executed automatically every time gen_upb_api.sh is run
    2. this is just checking in the moved files from step 1. (so this step will be combined with step 1 in the future).
    3. there's a manual step to fix the BUILD files because of these moved .upbdefs.c files
    4. run generate_projects.sh
  • So for step 3 here, does it mean whenever next time there's a change caused by gen_upb_api.sh, that someone will need to remember to fix the BUILD file manually too? That seems a bit unsafe. Can this step be automated somehow too? Coz looks like there will be plenty of churn on these upb generated files in the near future? Or did I misunderstand that, even these changes will be part of the future step 1 too?

That was explained by Mark. Currently the BUILD file needs to be updated manually (at least before #23122 is in place) so this is really only about knowing that upbdefs.c files will be generated under upbdefs-generated instead of upb-generated, which seems pretty straightforward.

  1. Overall I also agree that this is a better solution because there is no need for a specific fix for PHP, and there's no need to rename any files. If we can make this as safe as possible for future churn, I agree this should be the way to go.

I'll check if this trick seems to cause problems in conjunction with planned PR #23122.
We'd also need to check if this would cause any problems with import (I don't think it should, at least not until #23122 , but we'd need to check first).
Other than that, I am not aware of any additional problems this would cause.

@stanley-cheung
Copy link
Copy Markdown
Contributor

@jtattermusch We probably should move this forward. Anything I can help?

@jtattermusch
Copy link
Copy Markdown
Contributor Author

superseded by #23835.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants