[MRG+1] Fix alerts found with lgtm#9278
Conversation
|
Thanks. The string comparison by reference fix is the only one that appears to actually be a bug fix. In some cases we intentionally avoid Is there a way to mark LGTM alerts as "Won't fix" (e.g. the first)? Your PR introduces flake8 errors. |
|
From a glance at the alert, it looks like a number of them are valid. What's the best way to get a permalink for an alert? |
|
Apart from Flake8 errors (please fix if you can), this LGTM. |
sklearn/decomposition/sparse_pca.py
Outdated
| method=method, random_state=random_state) | ||
| self.n_iter = n_iter | ||
| self.callback = callback | ||
| self.callback = callback |
There was a problem hiding this comment.
It seems to me that there is trailing whitespace here.
|
All these changes are good. I am 👍 for merge, but the style issues (flake8) should be addressed. |
…e/scikit-learn into fix-alerts-found-with-lgtm
|
Still failing flake8, says Travis: |
|
@jnothman hopefully that was the last one but will update if not - replying to your other questions now! |
|
@jnothman permalink to alerts are obtained by clicking the (diagonal up) arrow at the end of the line highlighting the alert - while the question mark symbol next to it takes you to the code query. e.g. for the reference vs value check the link is https://lgtm.com/projects/g/scikit-learn/scikit-learn/snapshot/1a981e87ba6f8d66ecfd6940184836446ded3f4c/files/benchmarks/bench_plot_randomized_svd.py#V185 |
|
yes, flake is happy. |
|
But I should add that an alert does not need to have found its way into your code base to be referenced and discussed: if you turn on PR integration any introduced alerts will get a similar link that can be shared (obviously pushing new commits will re-trigger the analysis and that might fix the alert) see for example: https://lgtm.com/projects/g/matplotlib/matplotlib/rev/pr-b7a573728fb8b7712d6aca3e99ef36d8f308c015 (this is an example of a fixed alert, but the idea is the same) |
|
Thanks for this. I've posted another couple of LGTM-derived issues. |
|
@jnothman you're welcome. And regarding marking alerts as "won't fix": this is something we're actively working on, in the future you will be able to dismiss individual alerts, turn particular queries on/off and decide which parts of your code base should not be analysed (we automatically detect library, generated and test code but we might miss some). We've recently come out of beta and we keep rolling out features, any feedback is welcome! |
|
Ability to reject false positives would definitely be high on or top of the list. Other usability things I have noted:
|
|
do we want LGTM CI or do we wait for flagging false positives? |
|
@jhelie: thanks for looping me in; @jnothman: thanks for the feedback! Regarding CI documentation: a red x will only appear if it is not possible to analyse a revision. For Python analysis this shouldn't happen (famous last words...), but for languages like Java it's very difficult to analyse source code that doesn't compile. However, you're right: this needs to be documented more clearly; I will pass that on to our engineering and documentation teams! I'll pass your report re. the back button on as well. Browsing and filtering alerts by path is very much on our schedule for the next few months — stay tuned! |
|
I don't understand how lgtm infers input types. Does it use the docstring? or just fuzz it? |
|
@amueller: the tech behind it is documented here: https://lgtm.com/docs/lgtm/about-lgtm. The queries are all open-sourced here: https://github.com/lgtmhq/lgtm-queries. Pull requests with improvements are really welcome! We will soon introduce functionality running your own queries as part of PR integration; you can already get started with your own queries on the query console here: https://lgtm.com/query/project:12060052/lang:python/ We pride ourselves on our extremely low false-positive levels, made possible by the query technology and our vision: code = data. If you find any FPs: do let us know (or file a PR!). And if you'd like to know more about the tech behind it, do have a read of some of papers:
Any other questions: shoot me a message! |
|
@sjvs awesome, thanks! |
|
@amueller: your question about input types appeared as I was typing my previous response. The quick answer: lgtm performs points-to analysis which tracks the possible types of a variable. This analysis is being improved on a daily basis (literally as we speak). To answer your original question:
I'd say: yes, you do want that 😄. On a more serious note: we launched lgtm.com fairly recently and are very keen to receive feedback from the community. So I'd really appreciate your thoughts and suggestions for improvements — including (especially!) on the continuous integration. |
|
The main reason I'd be tempted towards lgtm.com integration is that if we
don't integrate it takes a long time to update and show us that we've
successfully eradicated alerts! :P
…On 6 July 2017 at 01:29, Jean Helie ***@***.***> wrote:
@amueller <https://github.com/amueller>: what @sjvs
<https://github.com/sjvs> says, and please feel free to ping me if you
suspect an FP (I can see a few dubious cases atm but always keen to hear
from the devs!)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#9278 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6wd9RWY2STA0i09y13CQi6HoViAYks5sK6vogaJpZM4ONuv2>
.
|
|
Hehe — point taken, @jnothman. I'll make sure to look into increasing the update frequency soon! We're regularly polling > 50000 GitHub projects to keep the free results up-to-date for everybody, so we have to be a little bit careful 😉 |
|
well if the people visiting and subscribed to lgtm are core devs of some
project, that might affect priority.
…On 6 Jul 2017 7:57 am, "Bas van Schaik" ***@***.***> wrote:
Hehe — point taken, @jnothman <https://github.com/jnothman>. I'll make
sure to look into increasing the update frequency soon! We're regularly
polling > 50000 GitHub projects to keep the free results up-to-date for
everybody, so we have to be a little bit careful 😉
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#9278 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6yI6stiu3_FmzPiW1IghjsMms0CXks5sLAblgaJpZM4ONuv2>
.
|
|
How many of those 50k repositories have a commit every week, I'm curious?
|
|
Also @sjvs, have you considered supporting Cython? There's a shortage of quality control tools there, though its use is on the rise. |
|
@jnothman we're continuously monitoring language popularity and actively working on supporting more languages on lgtm.com so thanks for letting us know about your interest in Cython. At the moment I don't think it's scheduled just yet and I believe the C languages will come first. Regarding your question on project activity: we're curious about this kind of questions too and I'm hoping to have time to write up a short analysis of all the projects on lgtm - there's always more data than time! |
|
At the moment it should take < 48 hours for the results of the analysis to appear - and congrats on fixing 2 alert btw :) |
|
@jnothman <https://github.com/jnothman> we're continuously monitoring
language popularity and actively working on supporting more languages on
lgtm.com so thanks for letting us know about your interest in Cython. At
the moment I don't think it's scheduled just yet and I believe the C
languages will come first.
The point is that a lot of analysis will be a duplicate of Python.
Potentially much lower hanging fruit than C.
…On 7 July 2017 at 17:47, Jean Helie ***@***.***> wrote:
At the moment it should take < 48 hours for the results of the analysis to
appear - and congrats on fixing 2 alert btw :)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#9278 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6ww137c8zzPK56q2coFGZl7HduV0ks5sLeKtgaJpZM4ONuv2>
.
|
* compare values rather than references * add __ne__ to class * add missing self parameter * properly initialise child class using super() * remove useless loop variables and assigments * remove useless return statements * respect PEP8 style * fix another flake8 check
|
@jnothman: Cython certainly looks interesting. You're right, it's potentially easier to implement than C. Note that we've had C/C++ analysis support for a long time in our other products, so porting that to lgtm.com is less work than you may think it is. I'll certainly add Cython to the list! |
* compare values rather than references * add __ne__ to class * add missing self parameter * properly initialise child class using super() * remove useless loop variables and assigments * remove useless return statements * respect PEP8 style * fix another flake8 check
* compare values rather than references * add __ne__ to class * add missing self parameter * properly initialise child class using super() * remove useless loop variables and assigments * remove useless return statements * respect PEP8 style * fix another flake8 check
* compare values rather than references * add __ne__ to class * add missing self parameter * properly initialise child class using super() * remove useless loop variables and assigments * remove useless return statements * respect PEP8 style * fix another flake8 check
* compare values rather than references * add __ne__ to class * add missing self parameter * properly initialise child class using super() * remove useless loop variables and assigments * remove useless return statements * respect PEP8 style * fix another flake8 check
* compare values rather than references * add __ne__ to class * add missing self parameter * properly initialise child class using super() * remove useless loop variables and assigments * remove useless return statements * respect PEP8 style * fix another flake8 check
* compare values rather than references * add __ne__ to class * add missing self parameter * properly initialise child class using super() * remove useless loop variables and assigments * remove useless return statements * respect PEP8 style * fix another flake8 check
* compare values rather than references * add __ne__ to class * add missing self parameter * properly initialise child class using super() * remove useless loop variables and assigments * remove useless return statements * respect PEP8 style * fix another flake8 check
This PR fixes some alerts flagged by lgtm.com code queries. Most are minor improvements suggestions (mostly fixing some misconstructed child classes) but a few some should fix unintentional code behaviour.
The standard lgtm engineering analytics highlights dependencies, vulnerabilities and tracks introduced and fixed alerts as the code changes (and shows that scikit-learn keeps getting better) but you can investigate further by writing your own custom queries to explore your code base.
I personally find lgtm alerts most useful when using PR integration as it allows to check and deal with potential issues before they find their way into master (see for instance 5147fd0).
Full disclosure: I frequently use scikit-learn but also work on lgtm.com, hence my interest! Hope that helps, any question please ask.