Skip to content

Conversation

@ashb
Copy link
Member

@ashb ashb commented Feb 22, 2020

This module imports form airflow.models, which in turn means we impor
all the other modules (which we can fix in a later issue) -- but there
is still no need to import that module here as it is only used for type
checking.

So lets use the already existing guard built in to the typing module.

On my machine this speeds up import airflow from 0.74s to 0.44s


Issue link: AIRFLOW-6883

Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
  • Unit tests coverage for changes (not needed for documentation changes)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

* For document-only changes commit message can start with [AIRFLOW-XXXX].


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

This module imports form airflow.models, which in turn means we impor
all the other modules (which we can fix in a later issue) -- but there
is still no need to import that module here as it is only used for type
checking.

So lets use the already existing guard built in to the typing module.

On my machine this speeds up `import airflow` from 0.74s to 0.44s
"""File logging handler for tasks."""
import logging
import os
from typing import Optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be handled by moving stuff out of models/init.py rather by hacking stuff for TYPE_CHECKING

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not both? This is what the TYPE_CHECKING was designed for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the #7484 for discussion/proposal of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if importing this one module was quick, it's still an extra import that is not needed at run time. That's a number of directory searches looking for the this one file, and a opening a file, parsing it etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really - it's a hack to hide architectural problems in legacy code.

Copy link
Member Author

@ashb ashb Feb 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And the number of directory searches goes up if we use implicit namespaces.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I know what the documentation says, it's just my view on what TYPE_CHECKING is often used for)

Copy link
Member

@potiuk potiuk Feb 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if importing this one module was quick, it's still an extra import that is not needed at run time. That's a number of directory searches looking for the this one file, and a opening a file, parsing it etc.

I believe removing init.py stuff is different (and much better) way of solving the long import time.

When we properly move stuff out of init.py's for models (and other places) no single import will be expensive because we will only import what is needed and nothing more. Yes some "types" will be imported needlessly but those will really be types that will be immediately loaded if anything from this function is used and they are probably already imported elsewhere.

By proper tree-like/lack-of-cycles/no implicit imports structure of the code we will achieve what we want to do now by manual TYPE_CHECKING removal of some stuf - avoid expensive imports where not needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And the number of directory searches goes up if we use implicit namespaces.)

Yeah. I already fully abandoned this idea. I simply stopped being fan of implicit packages. We should not use them. There were many problems in our code that we would have to solve and gain nothing in return, behavior of some code changes when there are implicit packages and there are still many tools/packages (like sphinx but many others) that do not work well with implicit packages yet. It makes little sense to use implicit packages now that we can install providers packages without them.

@ashb
Copy link
Member Author

ashb commented Feb 23, 2020

Fixed in other ways

@ashb ashb closed this Feb 23, 2020
@ashb ashb deleted the speedup-import branch February 23, 2020 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants