Move a few imports inside functions to improve import speed of the library #4296
jarrodmillman merged 9 commits into networkx:master
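The import cost this PR targets can be measured directly with CPython's `-X importtime` flag (3.7+). As a hedged, self-contained sketch, the example below profiles the stdlib `json` module so it runs anywhere; substitute `networkx` to reproduce the slowdown discussed in this thread.

```shell
# Print per-module import timings (self and cumulative microseconds).
# Swap "json" for "networkx" to see which imports dominate its startup.
python -X importtime -c "import json" 2>&1 | tail -n 5
```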
Conversation
Erotemic
left a comment
I made a few comments. Hope these help.
pb-cdunn
left a comment
I guess all I want is to avoid importing the most expensive packages (scipy and numpy?) when they are not needed.
But I think the pattern in this PR could be problematic. (See my other comment.)
@pb-cdunn Nested imports do not re-import on every call: Python keeps a global module cache (`sys.modules`), so after the first import a nested `import` is effectively a dict lookup. You are right about the not-found case, though; a failing import repeats the file-system search each time. This can be avoided by having some cached function like:

```python
import functools

@functools.lru_cache(None)
def _scipy_module():
    try:
        import scipy
    except Exception:
        scipy = None
    return scipy
```

```
In [4]: with ubelt.Timer('foo'):
   ...:     _scipy_module()
   ...:
tic('foo')
...toc('foo')=0.0151s

In [5]: with ubelt.Timer('foo'):
   ...:     _scipy_module()
   ...:
tic('foo')
...toc('foo')=0.0000s
```

This avoids the issue with the not-found case. However, how many times are you calling this in the not-found case, when you don't have the ubiquitous scipy or numpy installed? And how large are the graphs those pure-Python methods can handle? Does this actually matter when graph size is going to bottleneck you before file-system searches do? If the answer is no, I think the current approach is fine.

On numpy, I guess avoiding it is nice in some applications, but parallel computation on long lists of numbers is near-universally helpful. The numpy package is also significantly smaller and more ubiquitous (a difficult feat to accomplish) than scipy. I typically put `import scipy` inside functions, but I think there is an argument for assuming numpy globally (because avoiding that dict lookup does still help in critical sections).
I completely disagree with that. This is a graph library, not a numerics library. Anyway, I'll look at this again someday after this is released.
There are deep connections between graph theory and linear algebra. In some cases I imagine it's faster to compute the answer to a graph question by converting to a matrix representation, solving the reduced problem, and converting back. My point is that graphs and numerics are not mutually exclusive. I'm not saying that numpy must be assumed to exist and globally imported, but I am saying there is an argument for it.
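A toy illustration of that connection (pure Python, no networkx or numpy APIs involved): the (i, j) entry of A^k counts walks of length k between vertices i and j, so the number of triangles falls out of trace(A^3)/6.

```python
# Multiply two n x n matrices given as nested lists (dependency-free).
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Adjacency matrix of the triangle graph 0-1-2-0.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

A3 = matmul(matmul(A, A), A)
# Each triangle is counted 6 times on the diagonal of A^3 (3 start
# vertices x 2 directions), hence trace(A^3) / 6.
triangles = sum(A3[i][i] for i in range(3)) // 6
print(triangles)  # 1
```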
True. But it's unfortunate that we are forced to load things we do not need. The real problem was that, long ago, all components were star-imported into the top module, and many others recursively, for convenience. That was a mistake, I think inarguably. Now we're looking for work-arounds. If scipy is no longer loaded, that should help.
Yeah, I agree that was a mistake. I used to do that with my libraries as well. Sometimes, I still do like to have something close to that functionality, but I never use … If you are looking for workarounds, you could try using …
Move a few imports inside functions to improve import speed of the library (networkx#4296)

* move scipy imports into functions
* run black
* Move xml and lxml import inside functions in graphml.py
* Move imports inside functions
* Add tests to graphml.py to appease the codecov CI
* rearrange KDTree import handling
* remove global np from alg_connectivity code
* fix black
* fix black
Some of the imports are still at the module level instead of inside a function.
This negates our efforts at lazy importing, increasing import time (by a very small amount, but users are still complaining).
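The pattern being applied can be sketched with the stdlib `xml` module, mirroring the graphml.py change in this PR; `parse_graphml_string` is a hypothetical function name used only for illustration.

```python
# Deferred import: `import mypackage` no longer pays for the XML parsing
# machinery unless this function is actually called.
def parse_graphml_string(text):
    from xml.etree.ElementTree import fromstring
    return fromstring(text)

root = parse_graphml_string("<graphml><graph id='G'/></graphml>")
print(root.tag)  # graphml
```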
Fixes #2900
I will take another look at this in light of #4277 before it is merged.
Feedback? @pb-cdunn