Skip to content

python: simple addition to simplify importing MRtrix3 libraries#1613

Merged
jdtournier merged 32 commits intodevfrom
python_clean_locate_mrtrix3
Jul 1, 2019
Merged

python: simple addition to simplify importing MRtrix3 libraries#1613
jdtournier merged 32 commits intodevfrom
python_clean_locate_mrtrix3

Conversation

@jdtournier
Copy link
Copy Markdown
Member

With this proposal, the code used to locate the MRtrix3 libraries is shifted into a python file (tentatively called bin/__locate_mrtrix3.py) in the bin/ folder, and applications only need to invoke:

import __locate_mrtrix3
from mrtrix3 import app
app.execute()

This could be simplified further by renaming __locate_mrtrix3.py to run_mrtrix3.py or something, and have that file also contain the app.execute() call. But that would preclude its use in other modules - not sure it's a big loss anyway, since modules would reside in other locations and so couldn't use this directly anyway.


Another potential addition would be to extend this file with functionality to operate from 3rd party modules. So all they'd need is to copy/paste it into their modules, and then we could offer a range of options to identify the MRtrix3 libraries:

  • check if a build symlink exists and follow that if found
  • find mrinfo or mrconvert in the PATH, and use their path to locate the corresponding lib/ folder
  • use an environment variable like MRTRIX_ROOT_DIR or something to locate the libraries
  • check the config file for an equivalent entry

None of these would be foolproof, but they'd at least allow external module operation without modifying the PYTHON_PATH, which might be desirable... (?)

With this proposal, the code used to locate the MRtrix3 libraries is
shifted into a python file (tentatively called bin/__locate_mrtrix3.py)
in the bin/ folder, and applications only need to invoke:

    import __locate_mrtrix3
    from mrtrix3 import app
    app.execute
@dchristiaens
Copy link
Copy Markdown
Member

I'm a big fan of any efforts to simplify this matter. Your commit looks OK to me, but best to wait for TravisCI to confirm.

This could be simplified further by renaming __locate_mrtrix3.py to run_mrtrix3.py or something, and have that file also contain the app.execute() call.

I would not quite go that far (as you said too), but I would advocate making it a true "gateway" module that also imports the relevant elements of mrtrix3. For instance

from mrtrix3runtime import app, run, MRtrixError    # import everything at once (PEP8)

def usage():
    pass

def execute():
    pass

app.execute()

The file mrtrix3runtime.py would then contain the current contents of __locate_mrtrix3.py, plus a line

from mrtrix3 import app, run, path, image, MRtrixError

added at the end.

@thijsdhollander
Copy link
Copy Markdown
Contributor

Added dwi2response. Also, there seems to be some problem for foreach:

Traceback (most recent call last):
  File "/home/thijs/mrtrix3/bin/foreach", line 280, in <module>
    app.execute()
  File "/home/thijs/mrtrix3/lib/mrtrix3/app.py", line 93, in execute
    module.usage(CMDLINE)
  File "/home/thijs/mrtrix3/bin/foreach", line 43, in usage
    index = next(i for i,s in enumerate(sys.argv) if s == ':')
NameError: global name 'sys' is not defined

Lestropie and others added 3 commits April 29, 2019 18:27
- Do more checks in attempts to locate MRtrix3 Python modules
- Wrap import of this module in a try clause so that external scripts can be executed without deletion of this line.
- Move revised description of execution of external Python scripts from FAQ page to external modules page.
@jdtournier
Copy link
Copy Markdown
Member Author

Also, there seems to be some problem for foreach:

I have to admit, I'm not sure how Python handles these things. All these commands ran fine on my system, but for some weird reason responsemean required an additional import os, sys statement, when none of the others did. Might be to do with my version of Python, or some other subtlety that I'm not aware of...

Anyway, it's easily fixed, as done in c45185a - but I'm none the wiser as to how this actually works.


I would advocate making it a true "gateway" module that also imports the relevant elements of mrtrix3

Sounds like a good idea. I assume you've tested it and it works...? Whether to go along with that is a decision I'll leave up to the heavy Python users. As long as there's a simple one-liner that can be used instead of that bulky chunk of code...

@dchristiaens
Copy link
Copy Markdown
Member

I would advocate making it a true "gateway" module that also imports the relevant elements of mrtrix3

Sounds like a good idea. I assume you've tested it and it works...? Whether to go along with that is a decision I'll leave up to the heavy Python users. As long as there's a simple one-liner that can be used instead of that bulky chunk of code...

Not yet, but it's easy to do. I'll give it a crack if there are no a priori objections.

Another potential addition would be to extend this file with functionality to operate from 3rd party modules. So all they'd need is to copy/paste it into their modules, and then we could offer a range of options to identify the MRtrix3 libraries:

  • check if a build symlink exists and follow that if found
  • find mrinfo or mrconvert in the PATH, and use their path to locate the corresponding lib/ folder
  • use an environment variable like MRTRIX_ROOT_DIR or something to locate the libraries
  • check the config file for an equivalent entry

None of these would be foolproof, but they'd at least allow external module operation without modifying the PYTHON_PATH, which might be desirable... (?)

Yes, that would be a nice addition, although of the options listed, I think the first one (checking for build symlink) is the only practical one for developers. The code would first need to check if the python library is modified (overwritten) in the module directory, and otherwise follow the symlink to the linked build.

The other options could lead to conflicts with multiple installed versions, for instance a master-build in the PATH with the module linked to a current dev-build in a different folder.

dchristiaens and others added 2 commits April 29, 2019 17:39
…ix3 into python_clean_locate_mrtrix3

Conflicts:
	bin/5ttgen
	bin/dwi2response
	bin/dwibiascorrect
	bin/dwigradcheck
	bin/dwinormalise
	bin/dwipreproc
	bin/dwishellmath
	bin/foreach
	bin/labelsgmfix
	bin/population_template
	bin/responsemean
@Lestropie
Copy link
Copy Markdown
Member

I've never been able to figure out a way of doing this that I'm completely happy with.

A few changes:

  • Wrapping the import __locate_mrtrix3 in a try clause for each executable script means that a third-party script can be executed without requiring deletion of the relevant code (currently the whole lib folder import code block has to be deleted).

  • If PYTHONPATH already contains lib/mrtrix3/, don't go through with trying to locate it.

  • If the modules still can't be found, search relative to the location of mrconvert within PATH as a last-ditch effort.

check if a build symlink exists and follow that if found

Would be worth adding I think; gives slightly better guarantee of software version matching than the above.

use an environment variable like MRTRIX_ROOT_DIR or something to locate the libraries

I don't think that provides any benefit over setting PYTHONPATH.

I would not quite go that far (as you said too), but I would advocate making it a true "gateway" module that also imports the relevant elements of mrtrix3.

I'm not at all familiar with the use of Python in this way; my only experience is with my own modules. If you can make a modification that makes this library and its usage "more like others", go for it: I simply don't know what that is.

@Lestropie
Copy link
Copy Markdown
Member

Proposed precedence for adding appropriate lib/mrtrix3/ directory to sys.path (opinions / corrections welcome):

  1. If file __locate_mrtrix3.py being executed is itself a softlink, follow the softlink from the module to the core MRtrix3 bin/ directory, then navigate to the corresponding lib/ directory. If it is not a softlink, simply navigate from bin/ to lib/ and see if the modules are there.

    • Documentation regarding use of the Python API should be updated to mention that including a softlink to __locate_mrtrix3.py in the same directory as the script being executed is an option.
  2. If the executing script resides in a directory called bin/, and in the parent directory there is a softlink called build, follow that softlink from the module to the core MRtrix3 and look for the lib/ directory within it.

    • This, as far as I can tell, is only for the case where the user has made a copy of __locate_mrtrix3.py rather than a softlink, but the module filesystem structure is otherwise adhered to.
  3. Try importing mrtrix3 without any modification to sys.path; if this works, the user has set PYTHONPATH.

    • Note that it will also be necessary to check sys.path prior to approaches 1 and 2, as otherwise the import attempts there could work due to PYTHONPATH rather than the actual mechanisms being tested that are supposed to take precedence.
  4. Check PATH for mrconvert; if found, navigate from bin/ to lib/ and try the import one last time.

@jdtournier
Copy link
Copy Markdown
Member Author

jdtournier commented Apr 30, 2019

At the risk of confusing matters further... I've just had a quick look into using the imp module to help with this. I just gave this a crack, and it seems to work out of the box, on both Python versions. Code lifted more or less directly from the docs.

Basic idea: add a tiny 'module' file called mrtrix3.py in the bin/ folder, which becomes the module to load if present, but doesn't otherwise require special handling from the application side - if it's not present, the main mrtrix3 module needs to be in the PYTHONPATH, as expected. The trick is for this new module file to allow itself to be overridden with the main module if found... Remarkably, this seems to work:

bin/mrtrix3.py:

import inspect, os, sys, imp
LIB_FOLDER = os.path.realpath(os.path.join(os.path.dirname(os.path.realpath(inspect.getfile(inspect.currentframe()))), os.pardir, 'lib'))
fp, pathname, description = imp.find_module("mrtrix3", [ LIB_FOLDER ])
try:
  imp.load_module("mrtrix3", fp, pathname, description)
finally:
  if fp:
    fp.close()

I just checked by removing the whole import __locate_mrtrix3 line from one of the commands (along with the surrounding try clause), and it works as expected. If the command is in the bin/ folder with this new bin/mrtrix3.py file alongside it, it finds and loads the right mrtrix3 module. If there's no mrtrix3.py file alongside the command, it only works if PYTHONPATH includes the right folder.

I have a feeling this ticks all the right boxes...? What I've pasted above is a quick proof of principle, it should be possible to extend it with the kinds of tricks we were discussing above. Using symlinks works out of the box thanks to the use of os.realpath(), and we could also add in the build symlink as an option (not as critical in my opinion, but why not?). And if all else fails, we could try looking in the PYTHONPATH directly (we can since imp.find_module() expects a list of locations as its path argument).

@dchristiaens
Copy link
Copy Markdown
Member

At the risk of confusing matters further... I've just had a quick look into using the imp module to help with this. I just gave this a crack, and it seems to work out of the box, on both Python versions.

Very cool, I really like this solution. As you say, it appears to check all the boxes, and makes for a very clean interface in the scripts. Can you go ahead and commit your proposal so we can test it?

What I've pasted above is a quick proof of principle, it should be possible to extend it with the kinds of tricks we were discussing above. Using symlinks works out of the box thanks to the use of os.realpath(), and we could also add in the build symlink as an option (not as critical in my opinion, but why not?).

Personally, I don't think checking whether build is symlinked is needed if we can also symlink the mrtrix3.py file itself. If you'd like to avoid having too many symlinked dependencies in external modules, we could perhaps include a simple script to initialise/setup a new module.

@jdtournier
Copy link
Copy Markdown
Member Author

Can you go ahead and commit your proposal so we can test it?

OK, done. But I've not deleted the __locate_mrtrix3.py file, since there's a fair bit of additional logic in there that I've not transferred over - like I said, it's more of a proof of concept at this stage.

Personally, I don't think checking whether build is symlinked is needed if we can also symlink the mrtrix3.py file itself.

I'm in two minds about this. We'd need to either copy or symlink this file over to reside alongside any additional scripts anyway. But the build symlink route might come in handy in situations where a whole module is set up (with both Python and C++ code), and the mrtrix3.py file is copied over into the bin/ folder of that module. At this point, the build symlink would provide a unique reference to the target installation of MRtrix3, and would allow the developer of the module to limit the number of steps required by users at install time to setting up a single symlink, which would then work both for building the C++ code and running the accompanying Python scripts. So I do see there might be value in it under some situations...

@maxpietsch
Copy link
Copy Markdown
Member

I quite like the symlink of mrtrix3.py approach, gives transparent and full control and makes for a clean and simple from mrtrix3 import app, ....

To avoid users setting up two symlinks to possibly conflicting versions, @dchristiaens suggested instructing users to only symlink build in external modules. ./build could create the symlink to mrtrix3.py, warn if it does exist and differs. The instructions would simply be, symlink build followed by ./build in the module directory to set up the python side and compile the c++ side of the module. On windows, I think we'd need to ask the user for admin privileges (see here) or create the symlink as documented for build.

@Lestropie
Copy link
Copy Markdown
Member

Lestropie commented May 3, 2019

Changes in c755207 work for me for a range of different scenarios, & on a Windows machine too. You can see the ordering of different tests and the logic behind them in the file.

Before merging, I would suggest that the "External modules" documentation should be updated to reflect this as much as possible.

I'm also wondering given these changes whether it should still be necessary to do:

from mrtrix3 import app
app.execute()

, or whether it would be possible for a "import mrtrix3" to be adequate,using the inspect module in both bin/mrtrix3.py and lib/mrtrix3/__init__.py to invoke app.execute().

(Note: From prior testing, the relevant import call must appear after definition of the usage() and execute() functions; otherwise the latter have not yet been parsed when traversing the library modules, since the Python interpreter is entirely linear, even when dealing with imports)

Lestropie added 3 commits May 4, 2019 16:54
- Silence pylint warnings arising from parsing of the helper script rather than the modules themselves.
- Alter syntax for invoking the app.execute() function.
- pylint: Don't run for bin/__pycache__/.
- Fix non-testing of contents of bin/ directory in run_pylint.
- Instruct python to ignore warnings related to the new trickery of "import mrtrix3".
- population_template: Move import commands to inside functions in order to support new module loading mechanism.
Lestropie added 3 commits May 9, 2019 16:30
This script will now only attempt to load the MRtrix3 Python libraries using two approaches:
- Relative path to itself, after having traversed any softlinks.
Relative path to build script after having traversed any softlinks.
@Lestropie
Copy link
Copy Markdown
Member

I'm still slightly undecided on the "from mrtrix3 import app; app.execute()" vs. "import mrtrix3" issue, and would appreciate input from anyone using other module functions but not execute(). The first option additionally means that within the usage() and execute() functions, other modules are imported via "from mrtrix3 import module"; but "from mrtrix3 import app" must be omitted, as it is already imported at the outer scope.

What I'm wondering instead is if those functionalities within mrtrix3.app that are only functional if execute() is used should appear in mrtrix3.command, for which the equivalent of execute() would be run upon import; each script would then have from mrtrix3 import command at the tail (appreciate this is still not 100% explicit that it's resulting in substantial code execution though). mrtrix3.app would then retain anything that could reasonably be used even if execute() is not; e.g. printing functions, progress bar, ...

I'd probably need to give this a go in order to figure out whether or not it would work sensibly. And it would mean another additional module file, which may be frowned upon. And "from mrtrix3 import command" is still not fantastic as the ultimate execution code.

@maxpietsch
Copy link
Copy Markdown
Member

maxpietsch commented May 9, 2019

My 2 pence: Having an import statement of the library or parts thereof execute the script is surprising (ok, I meant not pythonic, makes me shiver). Having a single line app.execute(), possibly guarded by a check for __main__ is much cleaner and explicit.

About looking for the python library location: I'd vote for links to mrtrix3.py taking precedence over any other setting. Not a big fan of fishing for our binaries in PATH to find the library location. I'd prefer to explicitly define the version to be used.

Checking PYTHONPATH feels unnecessary unless setting and following links is a pain (windows?) but not too fussed about that one as long as symlinks override PYTHONPATH. A limitation of relying on path variables is that if one wanted to call a script from within a script via run.command, one could use different library versions for the outer and inner script using links. To achieve the same via PATH variables would require passing environments to subprocess calls and hardcoding these into scripts.

I'd warn if PYTHONPATH is used and in any case, if the python and c++ library versions do not match.

@dchristiaens
Copy link
Copy Markdown
Member

My 2 pence: Having an import statement of the library or parts thereof execute the script is surprising (ok, I meant not pythonic, makes me shiver). Having a single line app.execute(), possibly guarded by a check for __main__ is much cleaner.

I have the exact same feeling. An import statement should only load stuff into the namespace, not invoke a function. I think this is one of these cases where shorter code is reducing readability; I strongly argue against it. I would also like to see a check for __main__ included.

Regarding the library localisation, I think we just need a sensible default and leave everything else up to the user or module developer. I don't mind if you want to use the path of mrinfo as a last resort (with a warning), but I don't think it's needed at all. I would rather not touch the PYTHONPATH; it is very unconventional to do so within a module and it risks confusing developers who rely on it for their work.

@jdtournier
Copy link
Copy Markdown
Member Author

jdtournier commented May 9, 2019

OK, more thoughts...

I see you've removed the fall-back mechanisms of looking through PYTHONPATH or the system PATH - I think that's the right call. If users place this mrtrix3.py file or symlink next to their script, we should consider it an explicit statement of intent: the version of the MRtrix3 library that this script is expected to use will be at one of the locations searched by mrtrix3.py - and if the library can't be found, that's unexpected, and an error. If the mrtrix3.py file is not present, then the script is expected to rely on the traditional PYTHONPATH mechanism, which users should already be used to. All good.

I also think executing the script upon import is very unexpected. I'd much rather stick to an explicit one-liner that very clearly say what it does - and importantly at what point in the script.

Just to add to the discussion: there is a potential problem on the Windows side given the difficulties in creating symbolic links. It's possible, but not trivial, and likely to complicate deployment of modules in the future. So I've had a quick look into alternatives, as least for the build script. Using Windows shortcuts is a possibility, but difficult/impossible to invoke from the MSYS2 command line. However, a much simpler trick is simply to:

$ echo "../mrtrix3/build" > build
$ chmod a+x ./build
$ ./build

This basically sorts out the lack of shortcut support on Windows, with a simpler approach than we're currently advocating, I reckon. For mrtrix3.py to support this would however require a touch more work, but nothing more than reading the first line, stripping out any comments, and using that path as a candidate to search (stripping any comments means the equivalent line for the build script itself would be empty). What do you reckon?

@Lestropie Lestropie self-requested a review May 9, 2019 13:30
Lestropie added 2 commits May 17, 2019 10:53
Following discussion in #1613.
Partly revert aa495aa: No longer execute using a simple "import" call, but move the principal execute() function to __init__.py; this means that the mrtrix3.app module can be imported in a similar fashion to all other modules, rather than requiring special treatment due to being imported in the outermost code loop.
This module can be a collection point for functions or classes that are of utility but don't sensibly fit into any of the other named modules.
@Lestropie
Copy link
Copy Markdown
Member

I think I've got the right compromise with respect to the import issue. With 88a97a5, the syntax changes to:

import mrtrix3
mrtrix3.execute()

So the "execute()" is explicit, but the mrtrix3.app module is not imported in a different fashion to the other modules as it used to be prior to this PR, making the import syntax overall more consistent.

Checking PYTHONPATH feels unnecessary unless setting and following links is a pain (windows?) but not too fussed about that one as long as symlinks override PYTHONPATH.

PYTHONPATH is not explicitly checked: If it is set, and there is not a file mrtrix3.py in the same directory as the executed script, then the "import mrtrix3" line at the bottom of the script will immediately parse lib/mrtrix3/__init__.py, without any softlink trickery.

I'd warn if PYTHONPATH is used

I'm not sure about issuing a warning: If the environment variable has been set, the user has made an explicit intervention to do so; the only criticism is that it is a less robust approach than softlinking mrtrix3.py in the situation of multiple parallel installations, but very few users will actually possess such.

and in any case, if the python and c++ library versions do not match.

Yes, I should have a go at that.

@Lestropie
Copy link
Copy Markdown
Member

and in any case, if the python and c++ library versions do not match.

Yes, I should have a go at that.

Actually, not sure if anything can be done here: The version number within Python is set by the build script. So the only way I can see this string varying from that of a binary is if that particular binary was not compiled on the most recent execution of build, which says more about that particular binary than it does the Python library.

Lestropie added 2 commits May 20, 2019 10:53
Rectification of issues arising from code branching between #1607 and #1613.
@Lestropie Lestropie added this to the MRtrix3 3.0 release milestone May 22, 2019
This supports the case where the build script in a module is not a
symbolic link, but a regular text file containing a single line with the
location of the build script.
@jdtournier
Copy link
Copy Markdown
Member Author

OK, I've just added support for using a text file containing the path to the build script in the module loading, and updated the docs to match. I reckon it's good to go...

Copy link
Copy Markdown
Member

@Lestropie Lestropie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick test of having a single line path as the contents of file build worked on my Windows machine. Had to update dwicat based on changes in this branch to pass CI. Overall I think we've converged on a decent solution.

@jdtournier jdtournier merged commit 6e7d1e7 into dev Jul 1, 2019
@jdtournier jdtournier deleted the python_clean_locate_mrtrix3 branch July 1, 2019 16:33
Lestropie added a commit that referenced this pull request Jan 26, 2020
Resolve conflict between #1550 and #1613, which resulted in compiled Python file "bin/mrtrix3.pyc" being moved to bin/purged_files/.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants