Skip to content

WIP: Use classes for readers/writers and use a metaclass for auto-registering#2063

Closed
astrofrog wants to merge 9 commits into
astropy:masterfrom
astrofrog:refactor-io-registry
Closed

WIP: Use classes for readers/writers and use a metaclass for auto-registering#2063
astrofrog wants to merge 9 commits into
astropy:masterfrom
astrofrog:refactor-io-registry

Conversation

@astrofrog

Copy link
Copy Markdown
Member

This is work in progress - do not merge!

This goes towards addressing the issue in #1046 of using classes to define readers/writers, and uses a metaclass adapted from the one currently in astropy.io.ascii to auto-register the readers/writers. I also implemented a FITS reader/writer for NDData and found that this resulted in additional circular imports which motivated some of the refactoring here.

A few things I'm still not happy with:

  • I don't like the use of partial to get around the fact that the methods require the first argument to be self but we are calling them like class methods.
  • Importing the connectors has now been moved to read and write to avoid circular imports. However, I find this still sub-optimal because I had to write the import_connectors function in registry which still has the hard-coded imports (this is to make sure the classes get defined, otherwise they don't and don't get auto-registered!).

In addition I still need to integrate astropy.io.ascii into this new system (if we go ahead with it).

Of course, there are tests, docs, etc. missing from here, but this is just a prototype at this stage.

@taldcroft - could you let me know what you think of this so far? How do you think we should deal with astropy.io.ascii? Should it use a separate metaclass, or should it be integrated into the new system?

Once this is all in place we can also try and see whether we can use it to improve the documentation situation, by building the Table.read/Table.write and NDData.read/NDData.write docstrings dynamically using the metaclass.

@astrofrog

Copy link
Copy Markdown
Member Author

To avoid seeing all the indent changes due to putting functions in classes:

https://github.com/astropy/astropy/pull/2063/files?w=0

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this maybe also include ''? Sometimes blank keywords are considered to contain "useful" metadata. Though in this case it might not be well preserved either. Hard to say...

@taldcroft

Copy link
Copy Markdown
Member

@astrofrog - Sorry, I'm a bit swamped with things at the moment but will try to take a look this week.

@taldcroft

Copy link
Copy Markdown
Member

@astrofrog - I've been looking at this in the background and thinking about it. I don't have any fully formulated ideas, but the first thing that strikes me is that it seems this doesn't take full advantage of using classes in the sense that it still just registers the reader/writer/identifier functions (though they are now methods). I was originally thinking more about registering classes or class instances. There is some opportunity to be lightweight in the sense of lazy loading where possible.

We still haven't solved the inline documentation problem. I'm stuck because at the moment because to do that would require essentially loading all of io.fits, io.ascii, etc in order to know all the available readers and their docs. I don't want this to happen as an immediate side effect if importing Table, but it could happen at the moment you ask Table.read what formats are supported.

Anyway, just want to let you know I haven't forgotten this.

@astrofrog

Copy link
Copy Markdown
Member Author

@taldcroft - I also was thinking about this and coming to the same conclusion that we could just register the classes and have the registry automatically read all the information from the class.

There is one solution to the documentation problem, which while not elegant, would certainly save us a lot of hassle. The solution is simply to move registry.py to a proper sub-package in astropy/io/registry and then move all the connect.py files to astropy/io/registry rather than have them live in the different sub-packages. Then it would be much easier to have dynamic documentation, while not necessarily importing the different sub-pacakges if the imports are run-time imports. Yes, this may not be 'ideal' but at the moment we have to hard-code the list of formats in the registry anyway. Anyway, at the moment import_connectors() (in registry.py) is importing the sub-packages already, which is not ideal either.

In some ways it'd also be easier to keep all the connectors in one place, as it's easier to make them more consistent with each others. Do you have any thoughts on this?

@embray

embray commented Feb 28, 2014

Copy link
Copy Markdown
Member

I think I'm also on the same page as @taldcroft on this, and it looks like you might be coming to the same conclusions too, so that's good. The way I saw it, I figured you were doing things the way you were for backwards compatibility with the original reader/writer registry. But I don't think that system is used much outside of Astropy itself...? (do any affiliated packages use it?)

I would want to keep it around a bit longer for backwards compatibility anyways, but the new registry should probably avoid it and take more advantage of the use of classes.

@embray

embray commented Feb 28, 2014

Copy link
Copy Markdown
Member

I'm also okay on the astropy.io.registry subpackage. I see the connect drivers as being rather separate and distinct from the packages like astropy.io.fits that implement the low-level interfaces for individual file types.

@astrofrog

Copy link
Copy Markdown
Member Author

Sounds good - I'll give the astropy.io.registry approach a try and will see if I can get the auto-documentation stuff to work at the same time.

@taldcroft

Copy link
Copy Markdown
Member

There is this conflict between keeping things light via external definition of readers vs. importing the actual code.

With io.ascii I took the approach of using the import of io.ascii to define the connect drivers. Part of the motivation is to allow for local definition of the allowed read/write kwargs within the relevant source code. We can dive into one of the most stressing cases, the LaTeX reader/writer. In addition to all the standard args in io.ascii.read, it takes several format-specific args.

So imagine having a BaseIO class, from which the io.ascii MetaBaseReader is derived. That MetaBaseReader can have specific knowledge of all the standard io.ascii.read/write args, and then the LaTeX class could provide (via class attributes?) the modifications to the standard args for purposes of generating docs. That keeps the definition of everything local to the code, which is good. For io.ascii it would also allow improved behavior by providing a mechanism for io.ascii.read(.., format='latex') explicitly know what the allowed args are. Putting all the arg definitions outside in the registry would be a bit of a problem.

@astrofrog - I might have an idea of how to get all this to congeal together, but it's a little fuzzy still. Can I take a crack at this? If you are psyched to do this then that's fine of course.

@astrofrog

Copy link
Copy Markdown
Member Author

@taldcroft - please do feel free to have a go at this! You can start from scratch and don't need to include the commits here. I'm short on time over the next week or so.

@astrofrog

Copy link
Copy Markdown
Member Author

@taldcroft - just to check, are you still planning to have a go at this?

@taldcroft

Copy link
Copy Markdown
Member

@astrofrog - errr, I got distracted before getting a chance at this and now I'm thinking about coordinates and table stuff.. I can give you back the baton on this.

@astrofrog

Copy link
Copy Markdown
Member Author

@taldcroft - no problem, just wanted to check. I'll see if I can work on it during a flight at the weekend, so if you have any other comments in addition to those you added above, let me know!

@mdboom

mdboom commented May 21, 2014

Copy link
Copy Markdown
Contributor

As I'm further investigating import times, one thing I've discovered is that importing Table, in order to support the generic reader/writer interface, must import all of io.fits, io.votable, io.ascii etc., just to be able to use a Table. One thing that might be nice as part of this refactoring (if you think it's related) would be to allow the registry callbacks to import the modules that support their formats lazily. As it stands, you can't really do anything with Table without importing the entire io enchilada, which adds a 300% overhead over just importing astropy.table.

@taldcroft

Copy link
Copy Markdown
Member

@mdboom - keeping the io enchilada out of the initial import is definitely in the plan. It will take some doing, especially for io.ascii, but it's the right thing to do.

@embray

embray commented May 22, 2014

Copy link
Copy Markdown
Member

@mdboom See, relatedly, #1402 . I've had that open for a while but haven't really addressed it.

@astrofrog

Copy link
Copy Markdown
Member Author

This is superseded by #3090

@astrofrog astrofrog closed this Jan 15, 2015
@astrofrog astrofrog removed this from the v1.0.0 milestone Jan 15, 2015
@astrofrog astrofrog deleted the refactor-io-registry branch July 5, 2016 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants