Skip to content

Remove dependency on six - we're Py3 only now!#11158

Merged
jklymak merged 5 commits intomatplotlib:masterfrom
Zac-HD:no-six
May 17, 2018
Merged

Remove dependency on six - we're Py3 only now!#11158
jklymak merged 5 commits intomatplotlib:masterfrom
Zac-HD:no-six

Conversation

@Zac-HD
Copy link
Copy Markdown
Contributor

@Zac-HD Zac-HD commented May 2, 2018

This should be pretty simple (if tedious) to review - six provides very simple translations so each snippet is obvious; there's just a lot of them!

@WeatherGod
Copy link
Copy Markdown
Member

WeatherGod commented May 2, 2018 via email

@jklymak
Copy link
Copy Markdown
Member

jklymak commented May 2, 2018

We discussed a bit on the weekly call.

I'm a little hazy on the exact implementation, but I think that the issue here is not the PR per se, which is against 3.0, but if someone writes code that is meant to be back ported as well as going against master they will not write python-2 compatible code if all the six stuff is taken out. Other devs piped up that they aren't dropping harmless py2 stuff until they end long-term-support of 2.0.

I'm not experienced enough with transitions like this to know whether this is all manageable, but my personal vote would be to hold off on this sort of change until we have it all sorted out.

@Zac-HD Zac-HD closed this May 2, 2018
@jklymak
Copy link
Copy Markdown
Member

jklymak commented May 2, 2018

Well, I didn't mean for you to close it yet - I was hoping someone more experienced would weigh in... I'll re-open just because we really should understand the py3 migration path...

@jklymak jklymak reopened this May 2, 2018
PyEval_RestoreThread: NULL state bug on win32
"""
_focus = windowing.FocusManager()
windowing.FocusManager()
Copy link
Copy Markdown
Contributor

@anntzer anntzer May 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That actually relies on _focus being garbage collected at the end of the function (IOW FocusManager has a non-trivial __del__, would better be rewritten as a contextmanager nowadays but that's another story).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... dear. It's actually not used in many places, but I'll convert it in a seperate pull.

return parts


def remove_coding(text):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deprecation, yada yada

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't public API, is it? It looks to me like an internal helper for the Sphinx extension, and nobody downstream should be importing that.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Far too many things that should not be public API unfortunately are.

@tacaswell
Copy link
Copy Markdown
Member

I am 👍 on this going in (assuming tests pass and public API is properly deprecated).

As per https://matplotlib.org/devel/coding_guide.html#branches-and-backports

We always will backport to 2.2.x

  • critical bug fixes (segfault, failure to import, things that the user can not work around)
  • fixes for regressions against 2.0 or 2.1

Everything else (regressions against 1.x versions, bugs/api inconsistencies the user can work around in their code) are on a case-by-case basis, should be low-risk, and need someone to advocate for and shepherd through the backport.

Mercifully the critical bugs have been rare and the rate of regressions being reported has gone down quite a bit.

If we are going to hedge on keeping master branch mostly py2 compatible we don't get much by going py3 only!

@tacaswell tacaswell added this to the v3.0 milestone May 2, 2018
@jklymak
Copy link
Copy Markdown
Member

jklymak commented May 2, 2018

I’m fine w that. But what’s the path if I need to fix a regression in 2.2.3? Code versus master and then fix the backport manually adding back in any six code needed?

@WeatherGod
Copy link
Copy Markdown
Member

WeatherGod commented May 3, 2018 via email

@ImportanceOfBeingErnest
Copy link
Copy Markdown
Member

If I recall correctly the main argument for dropping py2 support is to relieve strain from developpers so they can spend their time on implementing new features instead of working around incompatibility issues.
Now it seems people spend more time disassembling a previously working code, finding arguments why that would be useful and fighting with the issues brought up by the need to keep diverging branches in sync, instead of using the freedom of py3 to fix existing problems (which are hard to come by with py2) or implementing new features (without the need to think about backwards compatibility).
If the strategy here is to make backporting as hard as possible such that even the simplest documentation fixes cannot be brought to the 2.2 branch any more and the meaning of "critical bug" becomes closer and closer to "complete failure" the whole "LTS up to 2020" is just hypocritical.

@anntzer
Copy link
Copy Markdown
Contributor

anntzer commented May 3, 2018

If someone really wants to revert the removal of future imports that I did, go ahead, I honestly don't care that much.
I think the real question is the following: Do you expect new PRs to use six and/or to be Py2-compatible? If yes, then we haven't really gained anything. If no, then someone will have to do the work regarding backporting in any case...
Note that it is extremely easy to write code that's not Py2-compatible even though it may appear to be syntactically correct (e.g. str vs unicode vs six.text_types). Let's say I put up a PR that merges cleanly, but as someone tries to backport it to 2.2.x it turns out that there's a subtle bytes/unicode bug on Py2. Who's in charge of fixing that?

@efiring
Copy link
Copy Markdown
Member

efiring commented May 3, 2018

I'm also in favor of stripping out six now. It improves readability. I doubt it will substantially impede critical backports. (Many of the uses of six--many instances of 'six.iteritems' for example--are not even necessary for compatibility.)

@tacaswell
Copy link
Copy Markdown
Member

Again, I am not super worried about backport because they should only be for critical bugs and regressions. If the backports are getting harder (and they are going to get harder independent if we strip six out or not) we then we do less non-critical backports.

I think the real question is the following: Do you expect new PRs to use six and/or to be Py2-compatible? If yes, then we haven't really gained anything. If no, then someone will have to do the work regarding backporting in any case...

I would say no, we should not be making any concessions to py2 compatibility in new code.

@jklymak
Copy link
Copy Markdown
Member

jklymak commented May 3, 2018

Thanks @efiring, @WeatherGod , @tacaswell @anntzer and @ImportanceOfBeingErnest I think that clarifies things for me a fair bit; to paraphrase:

  1. build for 3.0, even critical bug fixes.
  2. be prepared to do a bit of fiddling during the backport. ...and/or ask for help.
  3. we should not let the backporting pain slow down adding py3 improvements (like this PR).

@Zac-HD
Copy link
Copy Markdown
Contributor Author

Zac-HD commented May 3, 2018

Thanks for all your comments - I will keep working on this as the majority opinion in in favor.

My main motivation is to remove a layer of unnecessary complexity - contributing to Matplotlib (or just reading the code to debug a plot) sometimes feels like death by a thousand papercuts. Removing six is a small step in the right direction, and if we keep making incremental progress it adds up to qualitative change!

@ImportanceOfBeingErnest
Copy link
Copy Markdown
Member

Mind that this is not only about code, it is also about documentation. An error in the docs can be as critical as an error in the code from a user's perspective. I recently submitted a PR to fix the documentation, which currently has a major part of it completely undocumented (the axes_grid1 and axisartist toolkits). This could not be backported to the current doc branch because some future imports conflicting - totally unrelated to the actual PR.
The problem is for such bug fixes that in principle they would be easily solved by the person doing the PR, yet due to the backporting it is a task that only a handful of people are capable of doing.
Rightfully they refuse to do that as it is more work for them, which they should better spend on reviewing all the other open PR etc.

Ideally there should be the option to pull/push from/to the p2 branch independently with a clear manual on how to do that. There should be ci testing available to make sure fixes don't break anything. You can then leave it to the people doing the PRs to make sure everything runs smoothly. On a by-case basis critical fixes can be brought to that branch by the dev team, but essentially you have the freedom to completely transfer responsibility (appart from reviewing) to the contributors. The question whether or not a bug is critical is then simply a question of whether someone is willing to do that extra work (which usually isn't that much anyways).

@Zac-HD
Copy link
Copy Markdown
Contributor Author

Zac-HD commented May 3, 2018

Surely anyone could git cherry-pick the commits, fix merge conflicts, and open a PR against the 2.2 branch though? I don't see what makes this harder than any other contribution.

Backports aren't automatic, and taking six out doesn't make them harder.

@WeatherGod
Copy link
Copy Markdown
Member

WeatherGod commented May 3, 2018 via email

@Zac-HD
Copy link
Copy Markdown
Contributor Author

Zac-HD commented May 3, 2018

I've seen that and it's great, but a clean merge is not always a correct merge - there should still be a human in the loop!

We're getting off topic though - I'd appreciate it if we could stick at least mostly to review of code. I'll push an update tomorrow with a first round of fixes...

@jklymak
Copy link
Copy Markdown
Member

jklymak commented May 3, 2018

I think a PR that changes 70 files is an appropriate place to discuss the practical implications of the PR for back porting.

@tacaswell
Copy link
Copy Markdown
Member

Mind that this is not only about code, it is also about documentation. An error in the docs can be as critical as an error in the code from a user's perspective. I recently submitted a PR to fix the documentation, which currently has a major part of it completely undocumented (the axes_grid1 and axisartist toolkits). This could not be backported to the current doc branch because some future imports conflicting - totally unrelated to the actual PR.

I agree errors in documentation as as or more important than bugs, however we should apply the same criteria to documentation as we do to code. Those modules have never been documented so adding them is not fixing a regression from 2.x series.

The problem is for such bug fixes that in principle they would be easily solved by the person doing the PR, yet due to the backporting it is a task that only a handful of people are capable of doing.

We should broaden the number of people who feel comfortable doing manual backports then. We have some minimal instructions at https://matplotlib.org/devel/coding_guide.html#manual-backports What do we need to add to that to make it clearer?

Ideally there should be the option to pull/push from/to the p2 branch independently with a clear manual on how to do that. There should be ci testing available to make sure fixes don't break anything.

You can checkout and open PRs against 2.2.x and CI is still enabled on that branch. However, if we start to accumulate lots of development on the v2.2.x branch we will then have to deal with forward porting and backporting will only become more difficult so I am not excited about this.

On a by-case basis critical fixes can be brought to that branch by the dev team, but essentially you have the freedom to completely transfer responsibility (appart from reviewing) to the contributors.

Reviewing is where our bottleneck is.

So far no one has volunteered to be the v2.2 release manager. If someone does I will be happy to defer to what ever they want to do on many of these positions.

@ImportanceOfBeingErnest
Copy link
Copy Markdown
Member

@tacaswell I will comment on the respective PR that I mentionned in a minute, not to clutter this discussion here.

I think there has to be a point in time from which on the two branches are simply coexisting and diverging. Up to that point it does not make sense to disassemble the compatible code for pure reason of beautification or ease of read. After that point, master branch can be completely freed from anything that people don't like about py2. (Although my argument about resources spent on that work would still apply.) If then some bugfix concerns both and is critical enough, it needs two PRs, one for each branch. While creating a double PR can be delegated to the contributor, merging can not. Also, if a bug is fixed in master, it should in most cases be trivial to fix it in the 2.2 branch as well, mostly replicating the fix and working around the differences between py2/py3.
I do not understand the manual backport section well enough. But what seems obvious is that this cannot be done by the contributor.

If reviewing is a bottleneck, one smight think about delegating more of that as well. Maybe it can be separated from the rights to merge?

@dopplershift
Copy link
Copy Markdown
Contributor

If reviewing is a bottleneck, one smight think about delegating more of that as well. Maybe it can be separated from the rights to merge?

There's nothing stopping non-committers from writing reviews, it works just fine today. They just don't appear as the green check mark.

I'm really not buying this argument that BUG FIXES, specifically regressions from >=2.0, are going to have such a large surface area that the removal of six is going to create this huge burden for back ports. Honestly, this feels like trepidation at the dropping of 2.7 (from those who still have to use it) rather than a solid technical concern.

@WeatherGod
Copy link
Copy Markdown
Member

WeatherGod commented May 3, 2018 via email

@efiring
Copy link
Copy Markdown
Member

efiring commented May 4, 2018 via email

else:
if six.PY2 and isinstance(s, six.text_type):
s = s.encode("utf-8")
s = str(s)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attn @anntzer

This logic does not seem identical to be, but I am not sure of the context.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The else block is a no-op on Python 3, so it can be removed. That leaves the former block, and we can simplify the condition by observing that str(some_string) is a no-op too, and therefore calling s = str(s) unconditionally.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure the call to str/six.text_types was only necessary because of unicode/str issues; now s should always be a str and we shoudn't even have to call str(s).

@Zac-HD
Copy link
Copy Markdown
Contributor Author

Zac-HD commented May 8, 2018

Ping @tacaswell - I've rebased on master to resolve a merge conflict. Would you mind reviewing the other half of the diff so we can merge this before another conflict comes up?

# save dir for next time
rcParams['savefig.directory'] = os.path.dirname(
six.text_type(fname))
str(fname))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should fit on one line now?

for argname, argdefault in zip(args[1:], defaults)]
spec = inspect.getfullargspec(cls.__init__)
if spec.defaults:
argstr = ",".join("%s=%s" % pair for pair in zip(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still break at the for here (and the , if necessary).

return base ** lx


def nearest_long(x):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

round does banker's rounding in Python 3; nearest_long does not.

Copy link
Copy Markdown
Contributor Author

@Zac-HD Zac-HD May 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...yep, that's not round. It's int(x +/- .5). /facepalm.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being unclear; it's not int either, it's whatever the other kind of rounding is.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, round-half-to-zero. Fixed by inlining the logic in the two places it was needed, and just using round when we know the value is very close to an integer.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, public api, yada yada.

@Zac-HD Zac-HD force-pushed the no-six branch 3 times, most recently from 6132674 to de30254 Compare May 8, 2018 11:13
return self._non_decade_format(sign_string, base, fx, usetex)
elif usetex:
return r'$%s%s^{%d}$' % (
sign_string, base, int(fx + (-.5 if fx < 0 else .5)))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually fx is close to int here (the "not is_x_decade" case is handled just above) so it has already been rounded above, so you don't need to call any of nearest_long, round, or your inlined version.

@Zac-HD Zac-HD force-pushed the no-six branch 2 times, most recently from 20ba8d4 to 245bba9 Compare May 11, 2018 13:38
@Zac-HD
Copy link
Copy Markdown
Contributor Author

Zac-HD commented May 11, 2018

Hi @anntzer & @tacaswell - I've rebased on master, split it into separate commits for easier reviewing, and deprecated instead of removing the various potentially-public functions. Ready for final review!

Please leave a fresh comment if I've missed anything - this has been a long thread and I'm keen to get this merged before I have to rebase again 😄

else:
builtins = sys.modules['__builtin__']

builtins = sys.modules['builtins']
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just "import builtins"

@anntzer
Copy link
Copy Markdown
Contributor

anntzer commented May 11, 2018

Technically you also need an entry in doc/next_api_changes/ but I'm not going to hold up the PR for that (and likewise I can handle the remaining comments myself in a separate PR, treat that as a future todo :p).

if spec.defaults:
argstr = ",".join("%s=%s" % pair
for pair in zip(spec.args[-len(spec.defaults):],
spec.defaults))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would write this as

",".join(map("{}={}".format, spec.args[-len(spec.defaults):], spec.defaults))

(the rarely seen but ever useful implicit zip-and-star in multi-arg map).

(would also leave a space after the comma, but that's a separate point)

else:
pwd = os.getcwd()
pwd = os.getcwd()
old_sys_path = list(sys.path)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would rewrite as sys.path.copy(), otherwise it's not immediately clear what's happening.

sys.stdout = io.StringIO()
else:
sys.stdout = cStringIO.StringIO()
sys.stdout = io.StringIO()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see if contextlib.redirect_stdout works here

Copy link
Copy Markdown
Contributor Author

@Zac-HD Zac-HD May 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably would, but that's a larger change so I'm leaving it for later.

six.exec_("__name__ = '__main__'", ns)
code = remove_coding(code)
six.exec_(code, ns)
exec("__name__ = '__main__'", ns)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ns["__name__"] = "__main__"

@Zac-HD
Copy link
Copy Markdown
Contributor Author

Zac-HD commented May 12, 2018

Thanks @anntzer! I've done the easy bits here 😄

I think contextlib.redirect_stdout would both work and remove the need for a _dummy_print function, but it's not obviously correct so I'm leaving that for a future PR as you suggested. Similarly I don't see what the API change would be, unless either deprecations count or users are accessing imported names (which are non-public, per PEP8).

So... merge please?

@jklymak
Copy link
Copy Markdown
Member

jklymak commented May 12, 2018

@Zac-HD Remember we a) have a really long review queue; this is 10-days old, @anntzer has some things that are 10 months old I'm sure he'd like reviewed as well; b) we require two reviews before things get merged. So you'll have to get someone else excited about going through this 😉 Its not going to be me, because my knowledge of py2->py3 is pretty minimal. @tacaswell started, but is often quite busy, but he seemed interested so I'm sure he'll get to it.

@jklymak
Copy link
Copy Markdown
Member

jklymak commented May 16, 2018

@Zac-HD I'll merge if you rebase (unless I see something that @anntzer missed, which is 99.9999% unlikley). Please ping when you do...

@Zac-HD
Copy link
Copy Markdown
Contributor Author

Zac-HD commented May 17, 2018

Ping @jklymak, with many thanks 😄

@jklymak jklymak merged commit 8168210 into matplotlib:master May 17, 2018
@jklymak
Copy link
Copy Markdown
Member

jklymak commented May 17, 2018

Merged. Thanks a lot, that looked like a lot of work! Hopefully our tests catch any possible ways this could go wrong.

@Zac-HD Zac-HD deleted the no-six branch May 17, 2018 23:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants