Move a few imports inside functions to improve import speed of the library by dschult · Pull Request #4296 · networkx/networkx

dschult · 2020-10-28T20:22:33Z

Some of the imports are at the module level still -- instead of inside a function.
This negates our efforts at lazy importing, thus increasing import time (by a very small amount but users are still complaining).

Fixes #2900

I will take another look at this in light of #4277 before it should be merged.

Feedback? @pb-cdunn

Erotemic

I made a few comments. Hope these help.

networkx/generators/community.py

networkx/linalg/algebraicconnectivity.py

networkx/readwrite/gexf.py

networkx/generators/geometric.py

pb-cdunn

I guess all I want is to avoid importing the most expensive packages (scipy and numpy?) when they are not needed.

But I think the pattern in this PR could be problematic. (See my other comment.)

networkx/generators/community.py

networkx/linalg/algebraicconnectivity.py

Erotemic · 2020-11-10T20:10:56Z

@pb-cdunn Nested imports do not import multiple times. There is a global sys.modules dictionary that maps module names to loaded modules. The second time around its as fast as a dict lookup (one of the fastest things in Python).

You are right about the not found case. This can be avoided by having some cached function like:

import functools

@functools.lru_cache(None)
def _scipy_module():
    try:
        import scipy
    except Exception:
        scipy = None
    return scipy

In [4]: with ubelt.Timer('foo'): 
   ...:     _scipy_module() 
   ...:                                                                                                                                

tic('foo')
...toc('foo')=0.0151s

In [5]: with ubelt.Timer('foo'): 
   ...:     _scipy_module() 
   ...:                                                                                                                                

tic('foo')
...toc('foo')=0.0000s

This avoids the issue with the not-found case. However, how many times are you calling this in the not found case when you don't have the ubiquitous scipy or numpy installed? And how large of graphs are those pure-python methods able to handle? Does this actually matter in the case where graph size is going to bottleneck you before file-system searches? If the answer is No, I think the current import style might be simpler and faster in the most common use case.

On numpy, I guess avoiding it is nice in some applications, but parallel computation on long lists of numbers is near-universally helpful. The numpy packages is also significantly smaller and more ubiquitous (a difficult feat to accomplish) than scipy. I typically put import scipy inside functions, but I think there is an argument for assuming numpy globally (because avoiding that dict lookup does still help in critical sections).

pb-cdunn · 2020-11-24T21:02:32Z

I typically put import scipy inside functions, but I think there is an argument for assuming numpy globally...

I completely disagree with that. This is a graph library, not a numerics library. Anyway, I'll look at this again someday after this is released.

Erotemic · 2020-11-24T21:31:28Z

This is a graph library, not a numerics library

There are deep connections between graph theory and linear algebra. In some cases I imagine its faster to compute the answer to a graph question by converting to a matrix representation, solving the reduced problem and converting back. My point is that graphs and numerics are not mutually exclusive. I'm not saying that numpy must be assumed to exist and globally imported, but I am saying there is an argument for it.

pb-cdunn · 2020-11-24T21:38:30Z

True. But it's unfortunate that we are forced to load things we do not need.

The real problem was that, long ago, all components were star-imported into the top module, and many others recursively, for convenience. That was a mistake, I think inarguably. Now we're looking for work-arounds. If scipy is no longer load, that should help.

Erotemic · 2020-11-25T19:59:49Z

Yeah, I agree that was a mistake. I used to do that with my libraries as well.

Sometimes, I still do like to have something close to that functionality, but I never use import * anymore. Instead I use a package I wrote called mkinit (https://github.com/Erotemic/mkinit) that autogenerates explicit __init__.py files that have a similar effect to import * (it statically introspects module attributes and includes them depending on command line arguments and special dunder variables in the __init__.py itself).

If you are looking for workarounds, you could try using mkinit to generate the __init__.py files, which would give you backwards compatible behavior that you could gradually move away from. In PR #4349 I create two new submodules and there is a comment that documents how I used mkinit to generate each of them: (e.g.

networkx/networkx/algorithms/minors/__init__.py

Line 20 in b6e6c46

mkinit ~/code/networkx/networkx/algorithms/minors/__init__.py -w

)

…brary (networkx#4296) * move scipy imports into functions * run black * Move xml and lxml import inside functions in graphml.py * Move imports inside functions * Add tests to graphml.py to appease the codecov CI * rearrange KDTree import handling * remove global np from alg_connectivity code * fix black * fix black

dschult added 5 commits October 7, 2020 16:18

move scipy imports into functions

526e644

run black

ddcf9fb

Move xml and lxml import inside functions in graphml.py

ad61fda

Move imports inside functions

77df89f

Add tests to graphml.py to appease the codecov CI

f3000e7

dschult added this to the networkx-2.6 milestone Oct 28, 2020

dschult added the type: Maintenance label Oct 28, 2020

Erotemic reviewed Nov 8, 2020

View reviewed changes

networkx/generators/community.py Show resolved Hide resolved

networkx/linalg/algebraicconnectivity.py Outdated Show resolved Hide resolved

networkx/readwrite/gexf.py Show resolved Hide resolved

networkx/generators/geometric.py Outdated Show resolved Hide resolved

pb-cdunn reviewed Nov 10, 2020

View reviewed changes

networkx/generators/community.py Show resolved Hide resolved

networkx/linalg/algebraicconnectivity.py Outdated Show resolved Hide resolved

dschult added 4 commits November 10, 2020 22:30

rearrange KDTree import handling

c443c35

remove global np from alg_connectivity code

f292cce

fix black

718d837

fix black

957e695

jarrodmillman merged commit ddf06e1 into networkx:master Nov 12, 2020

SeanDS mentioned this pull request Nov 12, 2020

Long import times #2900

Closed

dschult deleted the time_imports branch November 15, 2020 19:48

dschult mentioned this pull request Dec 7, 2020

Figure out how to best structure imports within the package #4426

Closed

Erotemic mentioned this pull request Dec 18, 2020

Reorganize minor submodule as subpackage #4349

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Move a few imports inside functions to improve import speed of the library#4296

Move a few imports inside functions to improve import speed of the library#4296
jarrodmillman merged 9 commits intonetworkx:masterfrom
dschult:time_imports

dschult commented Oct 28, 2020

Uh oh!

Erotemic left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pb-cdunn left a comment

Uh oh!

Uh oh!

Uh oh!

Erotemic commented Nov 10, 2020

Uh oh!

pb-cdunn commented Nov 24, 2020

Uh oh!

Erotemic commented Nov 24, 2020

Uh oh!

pb-cdunn commented Nov 24, 2020

Uh oh!

Erotemic commented Nov 25, 2020

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

dschult commented Oct 28, 2020

Uh oh!

Erotemic left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pb-cdunn left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Erotemic commented Nov 10, 2020

Uh oh!

pb-cdunn commented Nov 24, 2020

Uh oh!

Erotemic commented Nov 24, 2020

Uh oh!

pb-cdunn commented Nov 24, 2020

Uh oh!

Erotemic commented Nov 25, 2020

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

4 participants