All classes in backport providers are now importable in Airflow 1.10 #8991

potiuk · 2020-05-24T08:09:09Z

Make sure to mark the boxes below before creating PR: [x]

Description above provides context of the change
Unit tests coverage for changes (not needed for documentation changes)
Target Github ISSUE in description if exists
Commits follow "How to write a good git commit message"
Relevant documentation is updated including usage instructions.
I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

potiuk · 2020-05-24T08:10:30Z

@BasPH @kaxil @ashb as discussed -> I've added automated import for all classes in providers in 1.10. Few fixes were neeed but otherwise it looks good :)

airflow/providers/amazon/aws/hooks/batch_client.py

airflow/providers/odbc/hooks/odbc.py

backport_packages/import_all_provider_classes.py

potiuk · 2020-05-24T15:32:37Z

@ashb @kaxil @BasPH @mik-laj

I think I nailed it with import tests for backport packages. I removed all the README changes from code changes so that it will be easier to review that one. I will regenerate the READMES again in the separate commit on top of that one.

I implemented all the refactorings that were needed in automated way using Bowler. I actually started to like Bowler :).

I got to the point however that I had to refactor it - we had one huge method that did everything - so I separated all the refactoring code out from setup_backport_packages and I converted it into a separate class (this way Query was easier to build as a field in the class rather than passing the query around. @turbaszek -> you might want to take a look and see if you are ok with that, but I think the way it is done now is much more readable - we have separate refactoring methods for separate classes.

Now - we also have two separate scripts for testing packages - one is to install them one-by-one for airflow 1.10 and see if they can be installed and the second one installs all of them, finds all the relevant classes and imports them (and fails if some of those imports fail). It's done rather neatly and for us It is I think a great tool to keep backport packages usable in the future - as those tests will be running continuously in CI with every PR. @BasPH -> it was a really good idea. I see some of the changes coming (like functional DAGs) might unknowingly destroy the backward compatibility of provider packages (similar to lineage changes for Papermill) - but with this test, it will be a lot harder to introduce such problems.

One last comment for Papermill - it's AUTO lineage feature caused import of examples to fail (yes - we are also checking the examples!), So I added info that AUTO lineage will not work in 1.10 (and I updated the example so that it behaves correctly in Airflow 1.10 as well) @bolkedebruin - you might want to take a look at this part to see if I did not introduce any problems.

airflow/providers/papermill/ADDITIONAL_INFO.md

airflow/config_templates/config.yml

backport_packages/refactor_backport_packages.py

airflow/providers/apache/hive/operators/hive.py

airflow/config_templates/config.yml

backport_packages/import_all_provider_classes.py

backport_packages/refactor_backport_packages.py

requirements/requirements-python3.7.txt

scripts/ci/in_container/kubernetes/app/templates/configmaps.template.yaml

setup.py

potiuk · 2020-05-25T15:33:02Z

Hey @ashb I think you were the last one with comments - I pushed the last fixup with all the resulting explanations and changes and I hope the build will succeed this time - I had only one datacatalog example change that was failing. Please take a look - I have two more changes that are follow - up from that one - the sooner we merge it, the earlier I can cut RC3.

The only unresolved comment is about using stdout instead of print - which I am not sure why. IMHO print in python3 is much better than stdout or stderr because it's now a function and it even has the file parameter where you can pass file=sys.stderr to it, and you do not have use all those unneeded "\n"s. I understand it is not a blocker, but I would still love to know what's the rationale for using stdout now that we switched to python 3 ?

…w 1.10

potiuk · 2020-05-25T18:09:05Z

All Green @ashb

ashb · 2020-05-25T19:16:53Z

Hey @ashb I think you were the last one with comments - I pushed the last fixup with all the resulting explanations and changes and I hope the build will succeed this time - I had only one datacatalog example change that was failing. Please take a look - I have two more changes that are follow - up from that one - the sooner we merge it, the earlier I can cut RC3.

The only unresolved comment is about using stdout instead of print - which I am not sure why. IMHO print in python3 is much better than stdout or stderr because it's now a function and it even has the file parameter where you can pass file=sys.stderr to it, and you do not have use all those unneeded "\n"s. I understand it is not a blocker, but I would still love to know what's the rationale for using stdout now that we switched to python 3 ?

I I mean we should do print(..., file=sys.stdout) for anything that is an "error". Not a blocker

kaxil · 2020-05-25T19:21:24Z

Hey @ashb I think you were the last one with comments - I pushed the last fixup with all the resulting explanations and changes and I hope the build will succeed this time - I had only one datacatalog example change that was failing. Please take a look - I have two more changes that are follow - up from that one - the sooner we merge it, the earlier I can cut RC3.
The only unresolved comment is about using stdout instead of print - which I am not sure why. IMHO print in python3 is much better than stdout or stderr because it's now a function and it even has the file parameter where you can pass file=sys.stderr to it, and you do not have use all those unneeded "\n"s. I understand it is not a blocker, but I would still love to know what's the rationale for using stdout now that we switched to python 3 ?

I I mean we should do print(..., file=sys.stdout) for anything that is an "error". Not a blocker

Based on https://docs.python.org/3/library/functions.html#print , the default value of file argument is sys.stdout.

The file argument must be an object with a write(string) method; if it is not present or None, sys.stdout will be used.

ashb · 2020-05-25T19:50:50Z

Hey @ashb I think you were the last one with comments - I pushed the last fixup with all the resulting explanations and changes and I hope the build will succeed this time - I had only one datacatalog example change that was failing. Please take a look - I have two more changes that are follow - up from that one - the sooner we merge it, the earlier I can cut RC3.
The only unresolved comment is about using stdout instead of print - which I am not sure why. IMHO print in python3 is much better than stdout or stderr because it's now a function and it even has the file parameter where you can pass file=sys.stderr to it, and you do not have use all those unneeded "\n"s. I understand it is not a blocker, but I would still love to know what's the rationale for using stdout now that we switched to python 3 ?

I I mean we should do print(..., file=sys.stdout) for anything that is an "error". Not a blocker

Based on https://docs.python.org/3/library/functions.html#print , the default value of file argument is sys.stdout.

The file argument must be an object with a write(string) method; if it is not present or None, sys.stdout will be used.

I am saying, we are printing errors to stdout, which is "wrong". Errors should be printed on stderr.

kaxil · 2020-05-25T19:56:06Z

I mean we should do print(..., file=sys.stdout) for anything that is an "error"

I haven't followed the PR just read the line :D "I mean we should do print(..., file=sys.stdout) for anything that is an "error", hence my comment ;)

airflow/providers/papermill/example_dags/example_papermill.py

backport_packages/import_all_provider_classes.py

ashb · 2020-05-25T20:03:39Z

@potiuk Argh, sorry, my previous comment said stdout when I meant stderr.

backport_packages/import_all_provider_classes.py

backport_packages/refactor_backport_packages.py

setup.py

backport_packages/template_base_operator.py.txt

backport_packages/refactor_backport_packages.py

… Airflow 1.10

ashb

Looking good Jarek!

potiuk requested review from BasPH, ashb, kaxil and mik-laj May 24, 2020 08:09

boring-cyborg bot added provider:Apache provider:amazon AWS/Amazon - related issues labels May 24, 2020

potiuk mentioned this pull request May 24, 2020

Fixed name of 20 remaining wrongly named operators. #8994

Merged

6 tasks

potiuk force-pushed the import_all_backport_classes branch from f030935 to 2812bb2 Compare May 24, 2020 08:17

ashb reviewed May 24, 2020

View reviewed changes

airflow/providers/amazon/aws/hooks/batch_client.py Outdated Show resolved Hide resolved

BasPH approved these changes May 24, 2020

View reviewed changes

airflow/providers/odbc/hooks/odbc.py Outdated Show resolved Hide resolved

backport_packages/import_all_provider_classes.py Outdated Show resolved Hide resolved

potiuk force-pushed the import_all_backport_classes branch 2 times, most recently from 8422b01 to 441b604 Compare May 24, 2020 15:18

potiuk force-pushed the import_all_backport_classes branch 2 times, most recently from 571c445 to f2e885b Compare May 24, 2020 18:56

mik-laj reviewed May 24, 2020

View reviewed changes

airflow/providers/papermill/ADDITIONAL_INFO.md Outdated Show resolved Hide resolved

mik-laj approved these changes May 24, 2020

View reviewed changes

potiuk force-pushed the import_all_backport_classes branch 4 times, most recently from 6b9debc to d25d854 Compare May 25, 2020 08:51

ashb reviewed May 25, 2020

View reviewed changes

airflow/config_templates/config.yml Outdated Show resolved Hide resolved

turbaszek reviewed May 25, 2020

View reviewed changes

backport_packages/refactor_backport_packages.py Outdated Show resolved Hide resolved

turbaszek approved these changes May 25, 2020

View reviewed changes

All classes in backport providers are now importable in Airflow 1.10

e40e269

potiuk force-pushed the import_all_backport_classes branch from d25d854 to 290d2eb Compare May 25, 2020 12:59

ashb requested changes May 25, 2020

View reviewed changes

ashb reviewed May 25, 2020

View reviewed changes

requirements/requirements-python3.7.txt Outdated Show resolved Hide resolved

ashb reviewed May 25, 2020

View reviewed changes

requirements/requirements-python3.7.txt Outdated Show resolved Hide resolved

ashb reviewed May 25, 2020

View reviewed changes

scripts/ci/in_container/kubernetes/app/templates/configmaps.template.yaml Outdated Show resolved Hide resolved

ashb reviewed May 25, 2020

View reviewed changes

setup.py Outdated Show resolved Hide resolved

potiuk force-pushed the import_all_backport_classes branch 2 times, most recently from 8a078a0 to 0fadc3e Compare May 25, 2020 15:26

fixup! All classes in backport providers are now importable in Airflo…

71c7cf7

…w 1.10

potiuk force-pushed the import_all_backport_classes branch from 0fadc3e to 71c7cf7 Compare May 25, 2020 17:02

ashb requested changes May 25, 2020

View reviewed changes

airflow/providers/papermill/example_dags/example_papermill.py Outdated Show resolved Hide resolved

ashb requested changes May 25, 2020

View reviewed changes

airflow/providers/papermill/example_dags/example_papermill.py Outdated Show resolved Hide resolved

backport_packages/import_all_provider_classes.py Outdated Show resolved Hide resolved

ashb reviewed May 25, 2020

View reviewed changes

backport_packages/import_all_provider_classes.py Outdated Show resolved Hide resolved

ashb reviewed May 25, 2020

View reviewed changes

backport_packages/refactor_backport_packages.py Outdated Show resolved Hide resolved

ashb reviewed May 25, 2020

View reviewed changes

backport_packages/refactor_backport_packages.py Show resolved Hide resolved

ashb reviewed May 25, 2020

View reviewed changes

backport_packages/refactor_backport_packages.py Outdated Show resolved Hide resolved

ashb reviewed May 25, 2020

View reviewed changes

setup.py Show resolved Hide resolved

ashb requested changes May 25, 2020

View reviewed changes

backport_packages/template_base_operator.py.txt Outdated Show resolved Hide resolved

backport_packages/refactor_backport_packages.py Outdated Show resolved Hide resolved

potiuk force-pushed the import_all_backport_classes branch from 3a4b2cd to 8010554 Compare May 25, 2020 22:37

fixup! fixup! All classes in backport providers are now importable in…

22ce507

… Airflow 1.10

potiuk force-pushed the import_all_backport_classes branch from 8010554 to 22ce507 Compare May 25, 2020 22:39

ashb approved these changes May 25, 2020

View reviewed changes

potiuk merged commit cdb3f25 into apache:master May 26, 2020

potiuk deleted the import_all_backport_classes branch May 26, 2020 00:43

All classes in backport providers are now importable in Airflow 1.10 #8991

All classes in backport providers are now importable in Airflow 1.10 #8991

Uh oh!

Conversation

potiuk commented May 24, 2020

Uh oh!

potiuk commented May 24, 2020

Uh oh!

Uh oh!

Uh oh!

Uh oh!

potiuk commented May 24, 2020

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

potiuk commented May 25, 2020

Uh oh!

potiuk commented May 25, 2020

Uh oh!

ashb commented May 25, 2020

Uh oh!

kaxil commented May 25, 2020

Uh oh!

ashb commented May 25, 2020

Uh oh!

kaxil commented May 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ashb commented May 25, 2020

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ashb left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

kaxil commented May 25, 2020 •

edited

Loading