Skip to content

New warning: warn on generic comparison#1642

Closed
madroach wants to merge 1 commit intoocaml:trunkfrom
madroach:warn_gen_comp
Closed

New warning: warn on generic comparison#1642
madroach wants to merge 1 commit intoocaml:trunkfrom
madroach:warn_gen_comp

Conversation

@madroach
Copy link
Copy Markdown
Contributor

@madroach madroach commented Mar 1, 2018

Add a new warning 63 which will trigger when the compiler fails to specialize a comparison primitive and will use the generic comparison function.

This is not finished yet. Still missing is:

  • make it compile
  • make [@warning -63] work, which it does not at the moment
  • maybe remove generic comparisons from compiler / stdlib / mark them with [@warning -63]

help is appreciatet as I am stuck at this point.

@Octachron
Copy link
Copy Markdown
Member

The fact that you failed to compile the compiler with the new warning enabled is not exactly encouraging in term of usability, and the various code rewrite, duplication or even straigthforward deletion(!) in this PR drives this point further. Have you any plan to improve the usability of this warning (by adding comparison combinators for instance)?

@madroach
Copy link
Copy Markdown
Contributor Author

madroach commented Mar 1, 2018

The fact that you failed to compile the compiler with the new warning enabled is not exactly encouraging in term of usability, and the various code rewrite, duplication or even straigthforward deletion(!) in this PR drives this point further.

Let me clarify: The compiler compiles fine (when using only the first two commits). The warning works as advertised, and only triggers on compiling stdlib, tools and ocamldoc.
The last two commits were an attempt to fix compiling stdlib and the other tools. I removed them for the time being. I do not propose to make the tools and stdlib codebase conform to the new warning.

My experience was actually quite the opposite of your observation. Most uses the warning triggered on were easy to fix by constraining the type and therefore speeding up the compiled code.

Have you any plan to improve the usability of this warning (by adding comparison combinators for instance)?

Yes, I do. The warning needs to support disabling by annotations like [@ocaml.warning -63]. With this supported one can enable the warning via the command-line option -w +63 and if encountered mark its use as safe via the annotation or fix the code.
One could of course also add comparison combinators. The most needed one would be for the option type.

@alainfrisch
Copy link
Copy Markdown
Contributor

I think the warning would make a lot of sense if we could provide a coherent story to actually avoid the generic comparison (perhaps with modular implicits). It is nice to detect "direct" use of the comparison, but the warning (in its current implementation strategy at least) would not allow to detect indirect uses through library functions such as List.assoc, Hashtbl.find, etc. Also, even for basic types, the stdlib would need to expose explicit comparison functions for those types (like the new Float.compare from the PR about the new Float module -- we would need that for all basic types). I don't think it is a particularly nice style to accept the generic comparison, but only in cases where the optimizer is clever enough to detect it applies on known basic types.

@madroach
Copy link
Copy Markdown
Contributor Author

madroach commented Mar 7, 2018

One way to avoid the warning it is by type constraints.
My motivation for this warning is that the Containers stdlib replacement removed the polymorphic comparison altogether (and all functions depending on it). So when using Containers this optional warning would make a lot of sense; even in its current implementation. What should be fixed is disabling the warning with attributes.

Copy link
Copy Markdown
Contributor

@Drup Drup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having the warning is useful. It might be enabled locally in some files to avoid the generic comparison (debugging where it is used is extremely annoying). Sure, it doesn't prevent external libraries to use it, but I do not think that's so bad, especially for alternate stdlibs.

I really don't think we should try to apply it to the ocaml compiler itself though. At least not for now. So I propose we drop the second commit.

The implementation looks fine, but it's really missing some tests, in particular to exercise the locality of the attribute (checking that e1; e2[@warning ... only warns in e1, not 2).

@madroach madroach force-pushed the warn_gen_comp branch 2 times, most recently from bf46672 to 8b64907 Compare July 21, 2018 21:42
@madroach
Copy link
Copy Markdown
Contributor Author

The implementation looks fine, but it's really missing some tests, in particular to exercise the locality of the attribute (checking that e1; e2[@warning ... only warns in e1, not e2).

I rebased on current trunk and added a test, which currently fails because disabling the warning using attributes doesn't work at all. So now still missing is:

  • Documentation
  • Changelog
  • Fix [@warning "-65"]

I can do the first two, but I am stuck regarding the [@warning ] attribute. I don't understand enough about the various compiler stages and how the [@warning ] framework is supposed to fit in.
Remarkably the documentation says this about the [@warning ] attribute:

“ocaml.warning” or “warning”, with a string literal payload. This can be used as floating attributes in a signature/structure/object/object type. The string is parsed and has the same effect as the -w command-line option, in the scope between the attribute and the end of the current signature/structure/object/object type. The attribute can also be attached to any kind of syntactic item which support attributes (such as an expression, or a type expression) in which case its scope is limited to that item. Note that it is not well-defined which scope is used for a specific warning. This is implementation dependant and can change between versions. Some warnings are even completely outside the control of “ocaml.warning” (for instance, warnings 1, 2, 14, 29 and 50).

So now I'm not even sure whether it is supposed to work with the new warning...

@gasche
Copy link
Copy Markdown
Member

gasche commented Jul 22, 2018

Yes, it is supposed to work.

You should look at warning_scope in parsing/builtin_attributes.ml, and the way it is used in the codebase.

@madroach madroach force-pushed the warn_gen_comp branch 4 times, most recently from 0914bed to b42e6f9 Compare September 3, 2018 22:16
@madroach
Copy link
Copy Markdown
Contributor Author

madroach commented Sep 3, 2018

Ok, I added enough calls to Builtin_attributes.warning_scope to make the w65.ml test work as expected, but certainly not enough to properly support other warnings that may be added in the future.
If you think another testcase should be added, please let me know.
Otherwise I believe this is ready for inclusion.

@madroach madroach force-pushed the warn_gen_comp branch 7 times, most recently from 25db463 to 473a21c Compare September 8, 2018 20:19
@Drup
Copy link
Copy Markdown
Contributor

Drup commented Oct 3, 2018

@alainfrisch Do you still think we should have a better stdlib story before adding this warning? Given that you're probably the one most familiar with that part of the codebase, what do you think of the implementation?

@alainfrisch
Copy link
Copy Markdown
Contributor

Yes, I'm still not a big fan of relying on specialization of primitives to allow uses of the generic comparison at specific types. The direction should be to provide type-specific comparisons functions in Stdlib (and longer term, perhaps rely on modular implicits) and warn on any use of the generic comparison, directly or indirectly. This could be implemented using #1804 (one would annotate the generic comparison functions, but also List.assoc, List.find, Hashtbl functions, etc).

In particular, this PR could give a false sense of security, as it will not detect indirect uses of the generic comparison.

That said:

  • The improvement on warning scopes during Typedtree->Lamba is nice and could be used in other contexts.
  • The new warning could be an incremental help for users, to detect some problems right now.

So I'm not against merging this PR. I only quickly looked at the implementation and it looks ok, but I haven't done a more thorough review yet. @Drup : if you feel confident in the implementation, I can merge the PR.

@madroach madroach force-pushed the warn_gen_comp branch 2 times, most recently from a92ffaa to e519652 Compare December 4, 2018 21:02
@madroach
Copy link
Copy Markdown
Contributor Author

madroach commented Dec 5, 2018

I'd love to see this merged soon because rebasing it again and again is a rather boring job. The actual implementation is quite simple and straightforward.
Most of the changes are only necessary to disable the warning in all parts of the compiler.

transl_structure loc fields cc rootpath final_env rem
| Tstr_attribute attr ->
Builtin_attributes.warning_scope ~ppwarning:false [attr]
(fun () -> transl_structure loc fields cc rootpath final_env rem)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to useBuiltin_attributes.warning_attribute here (in line with what Typemod does).

@alainfrisch
Copy link
Copy Markdown
Contributor

The improvement on warning scopes during Typedtree->Lamba is nice and could be used in other contexts.

Thinking more about it, I think this needs to be discussed. The approach in this PR is to try replicating the propagation of the warning-control attribute as done during type-checking. But there are 47 calls to Builtin_attributes.{warning_scope, warning_attribute} in typing/...

It would be more robust and future proof to process the attributes only once and store the resulting warning settings, either with new fields in the Typedtree (eg. exp_warnings: Warnings.state) or directly in Env.t. (Cf point "Consider storing warning settings (+other context) as part of Env.t?" in typing/TODO.md) If one wants to be serious about warnings raised post-type-checking, I don't think one can avoid such an approach.

@madroach
Copy link
Copy Markdown
Contributor Author

madroach commented Dec 5, 2018

Thanks for your review. I changed two usages of warning_scope to warning_attribute.
I don't feel competent to contribute substantially to a discussion about where to store warning settings.
Nevertheless for the warning I am proposing in this PR only a manageable number (10) of calls to warning_{scope,attribute} are added. It will not prevent any future redesign of the warning scope design since occurences of warning_{scope,attribute} can easily be found and removed.

Changes Outdated
### Language features:

- Add optional warning for (possibly unwanted) usage of the generic comparison
function.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less verbose: "Add warning for use of generic comparison functions."

| Unsafe_without_parsing (* 64 *)
| Redefining_unit of string (* 65 *)
| Unused_open_bang of string (* 66 *)
| Generic_comparison (* 67 *)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nicer to include the offending operation in question, perhaps like this:

| Generic_comparison of string

and then give more precise warnings:

this comparison will use the generic equality function
this comparison will use the generic comparison function
this comparison will use the generic inequality function

rather than calling everything "comparison". (For one thing, it's not quite accurate to say "the" generic comparison function, but I think it's generally better to be as precise as possible, if it's not a lot of extra work.)

Hint: Did you mean 'type %s = unit'?" name
| Generic_comparison ->
"this comparison will use the generic comparison function.\n\
It may have unexpected semantics in some cases."
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than the rather vague "It may have unexpected semantics in some cases" (which might be true of anything, depending on your expectations) I think it would be better to say what the unexpected semantics are. Perhaps "which operates on the low-level representations of values, ignoring types".

the warning is triggered when a structural comparison is not
optimized to int/float/string comparison, but the generic comparison
function is used.
@madroach
Copy link
Copy Markdown
Contributor Author

madroach commented Dec 6, 2018

Thanks @yallop for your review. I changed the pull request according to your proposals.
The warning message is now more informative.

@gasche
Copy link
Copy Markdown
Member

gasche commented Dec 6, 2018

In the future I think that it could be interesting to teach an implementation of this warning that most builtin parametrized types preserve comparability: t list is safely-comparable when t is, etc. This is doable as an extension of the current implementation, and it doesn't need to be included in this PR, but I checked the interface/wording of the warning to check that it would extend gracefully to that scenario, and I think it does.

Another thing I would like to see is a more complete documentation of the rationale for this warning in the Warnings Reference. The important point is that some datatypes (such as Set.t and Map.t) have a non-canonical representations for which the structural equality is too fine-grained, not the equality the user actually wants. This is too long to explain in a warning and possibly in warn-help, but it should be in the manual. Again, this can come in a later PR.

@yallop
Copy link
Copy Markdown
Member

yallop commented Dec 6, 2018

The important point is that some datatypes (such as Set.t and Map.t) have a non-canonical representations for which the structural equality is too fine-grained, not the equality the user actually wants.

Another point worth noting is that the structural equality doesn't fare well with "functional" values:

# (lazy 4) = (lazy 4);;
- : bool = true
# (lazy (3+1)) = (lazy 4);;
- : bool = false
# (lazy (3+1)) = (lazy (3+1));;
Exception: Invalid_argument "compare: functional value".

@jberdine
Copy link
Copy Markdown
Contributor

jberdine commented Dec 6, 2018

I'm not familiar enough with translcore and translmod to be sure from the code, but what is the expected behavior for use of polymorphic compare on types whose representations are not exposed fully in the interface? That is, any type whose representation is hidden by an interface, or exposed but private, has a high chance of needing a comparison function that is more knowledgable of the representation invariants than the underlying value representation. Conversely, if a module's clients are allowed/encouraged to construct arbitrary values of the representation type, then modulo functional values, etc., one might argue that comparing the low-level value representation induces the right function.

@alainfrisch
Copy link
Copy Markdown
Contributor

Recent remarks by @jberdine and @gasche suggest that it might be better to check whether type is "ok for generic comparison" rather than "the primitive specialization applies". What about marking some type variables (in the value declaration for generic comparison, but also possibly for List.assoc, etc) with an attribute, whose semantics is to check that post-type checking, each instance of that variable is unified with a "valid type"?

@gasche
Copy link
Copy Markdown
Member

gasche commented Dec 6, 2018

I thought at first that a completely different implementation would be required to check that a type is "ok for comparison", but I think that it is actually reasonable to do it as a small change over this PR: at the place in Translprim.specialize_primitive where this PR raises a warning, we can condition the warning on a type-directed check that the type is indeed not properly comparable. Specific types corresponding to static specialization opportunities have already been ruled out, but we can there have a richer check that knows about comparability of types constructed using (non-private) lists, arrays, options etc.

You are proposing a more ambitious change, which is basically to re-introduce SML's "equality types" in the type theory of OCaml. We could do this, but I would rather favor a low-tech option (as in this PR, plus maybe a more clever check), and wait for some sort of type-classes to have a flexible long-term solution.

@madroach
Copy link
Copy Markdown
Contributor Author

madroach commented Dec 6, 2018

This PR was inspired by the containers library which made the comparison operators monomorphic. What we are trying to do here is find some kind of compromise between dogmatic monomorphism and complete polymorphism.
The more complex the data structure, the higher the risk of introducing subtle bugs. The simpler the data structure, the more unnecessary work will be necessary to provide dumb comparators or somehow work around limitations placed by this warning.
Where to draw the line?
I think my approach of warning about the "black magic" of generic comparison is a quite reasonable approach, because

  • it is fair to assume that the programmer will know what is going on when primitive types or constant constructors are compared. -> therefore allow specialized comparison.
  • discriminating by the complexity of compound types is hard and will not be easily understood / will be hard to predict by the user. -> therefore warn about compound types.

@nojb nojb mentioned this pull request Mar 15, 2019
@hhugo
Copy link
Copy Markdown
Contributor

hhugo commented Apr 25, 2020

I'm a bit sad to not see progress here. As I understand it, people are mostly in favor. @madroach, do you still want to push this through ?

@madroach
Copy link
Copy Markdown
Contributor Author

Honestly, my motivation is limited. I could of course do a merge again. But I won't do any work on the code before consensus is reached on when the warning should trigger and my implementation is the way to go.
To me it seems @gasche tried to convince @alainfrisch that we should stay with my "low-tech" approach. So we are waiting for approval by @alainfrisch ?

@gasche wrote:

plus maybe a more clever check

I tried to explain my rationale about when to warn in my last comment, but let me give an example in addition. Even on integers the physical comparison may not yield the expected result if you are working on a residue system.

@alainfrisch
Copy link
Copy Markdown
Contributor

Don't let my comments block this PR. If a maintainer is happy with the proposal and willing to review the PR, I've no opposition to merging. I still don't believe this is the right approach, but I don't have any better ready to propose, and the warning might be better than nothing, at least for some users, and others can always disable it.

(But please don't mention me in the list of reviewers, I haven't looked at the code!)

@damiendoligez
Copy link
Copy Markdown
Member

I think it would be better if this was an alert rather than a warning. Then we'll be able to annotate stdlib functions like List.mem with the same alert. Make it disabled by default and we get a pretty good way to avoid using generic comparisons for the programmers who need that.

We can always add eqtypes to the type system later if we ever want to.

@alainfrisch
Copy link
Copy Markdown
Contributor

x-ref : #9928

@gasche
Copy link
Copy Markdown
Member

gasche commented Mar 10, 2023

Closing, as the discussion has moved to #9928.

@gasche gasche closed this Mar 10, 2023
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
…caml#1642)

---------

Co-authored-by: Sabine Schmaltz <sabineschmaltz@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants