Various changes for tools/check-typo#1287
Conversation
|
@dra27 yes, it was miscopied, change is OK. |
|
Indeed, I don't like the URL shortener idea -- thanks for asking. We win very little (the only strong positive I see is to respect the 80 columns limit), and we start depending on an extra external service that (1) introduces a new way our links may become dead (although it's reasonable to hope that even if Google killed the service (likely), it would preserve existing URLs) and (2) leaks information from people reading documentation to a third-party for no good reason¹. One thing that would be useful, I think, is to ask archive.org for a version of these pages, which has the side-effect of getting it to record them. (This is not failsafe, if the site later uses a robots.txt to disable archiving they will still be unavailable.) ¹: for the record, although I like privacy and believe that google is probably evil or may be in the future, I don't personally mind that much the use of a Google-supported (or someone-else-supported) url shortener, but I know that some people do mind; in particular the German-speaking school of free software and privacy activism is very sensitive about privacy issues, even little things, and I think it's nice to avoid moves that could turn them off OCaml contributions. |
| let pp_sep ppf () = Format.fprintf ppf ";@ " | ||
| let print_array pp ppf a = | ||
| Format.fprintf ppf "@[<hov>⟦%a⟧@]" | ||
| Format.fprintf ppf "@[<hov>\xe2\x9f\xa6%a\xe2\x9f\xa7@]" |
There was a problem hiding this comment.
May I suggest to replace the format string with a simple "[|%a|]"? (This is the failing path of the test, so there would be no need to update the reference file).
There was a problem hiding this comment.
Sure - I just (pedantically) mapped it to the characters you'd used!
There was a problem hiding this comment.
(I also find [|%a|] more readable.)
Makefile
Outdated
| $(call SUBST,CC_PROFILE) \ | ||
| $(call SUBST,EXT_ASM) \ | ||
| $(call SUBST,EXT_DLL) \ | ||
| -e 's|%%EXT_EXE%%|$(EXE)|' \ |
There was a problem hiding this comment.
Any reason why EXT_EXE doesn't get the same treatment?
There was a problem hiding this comment.
It's the only one where the substitution variable name doesn't match the Makefile variable name... I guess utils/config.mlp could be changed...
|
license header on |
|
David Allsopp (2017/08/14 11:22 +0000):
dra27 commented on this pull request.
> - -e 's|%%CCOMPTYPE%%|$(CCOMPTYPE)|' \
- -e 's|%%CC_PROFILE%%|$(CC_PROFILE)|' \
- -e 's|%%EXT_ASM%%|$(EXT_ASM)|' \
- -e 's|%%EXT_DLL%%|$(EXT_DLL)|' \
+ sed $(call SUBST,AFL_INSTRUMENT) \
+ $(call SUBST,ARCH) \
+ $(call SUBST,ARCMD) \
+ $(call SUBST,ASM) \
+ $(call SUBST,ASM_CFI_SUPPORTED) \
+ $(call SUBST,BYTECCLIBS) \
+ $(call SUBST,BYTERUN) \
+ $(call SUBST,CC) \
+ $(call SUBST,CCOMPTYPE) \
+ $(call SUBST,CC_PROFILE) \
+ $(call SUBST,EXT_ASM) \
+ $(call SUBST,EXT_DLL) \
-e 's|%%EXT_EXE%%|$(EXE)|' \
Why this change, actually? What does it bring compared to how things
were done before?
Autoconf will make all this part go away anyway so I am not sure this
really needs to be touched?
|
|
@shindere - for this pull request, it means that Semantically, it does also decrease the chance of generating invalid strings because it means that |
|
David Allsopp (2017/08/16 13:04 +0000):
@shindere - for this pull request, it means that `Makefile` now passes
`tools/check-typo` as the `WITH_SPACETIME_CALL_COUNTS` options was
tricky to split.
Okay
Semantically, it does also decrease the chance of generating invalid
strings because it means that `\` is always correctly escaped for each
variable. Previously this was done manually for `FLEXLINK_FLAGS`,
`MKDLL`, `MKEXE` and `MKMAINDLL` only but in theory one could have
paths in any of the compiler flags variables too.
Fine with me then. Thanks!
|
edwintorok
left a comment
There was a problem hiding this comment.
Changes to lintapidiff look fine
|
@gasche - thanks! I found a way to alter the ASCIIdoc output to reduce the line length with the URL. There's no equivalent in Markdown, and it seems a shame to enable |
testsuite/tests/lib-digest/md5.ml
Outdated
| Int32.(logor (shift_left (of_int (Bytes.get s (j+3) |> Char.code)) 24) | ||
| (logor (shift_left (of_int (Bytes.get s (j+2) |> Char.code)) 16) | ||
| (logor (shift_left (of_int (Bytes.get s (j+1) |> Char.code)) 8) | ||
| (of_int (Bytes.get s j |> Char.code))))) |
There was a problem hiding this comment.
Is there a way to format this code to make it readable?
let byte n = Bytes.get s (j+n) |> Char.code |> Int32.of_int in
let open Int32 in
byte 0
|> logor (shift_left (byte 1) 8)
|> logor (shift_left (byte 2) 16)
|> logor (shift_left (byte 3) 24)
There was a problem hiding this comment.
Readable test cases?! Whatever next!
Sorry, I'd missed this one previously, but I've changed that - far better to improve the readability of the code, than just reduce the line lengths below the maximum.
| typechecker: | ||
|
|
||
| http://caml.inria.fr/pub/docs/u3-ocaml/index.html[Using, Understanding, and Unraveling the OCaml Language by Didier Rémy] :: | ||
| http://caml.inria.fr/pub/docs/u3-ocaml/index.html[Using, Understanding, and Unraveling the OCaml Language by Didier Rémy] :: |
There was a problem hiding this comment.
Why is this necessary, is this file not valid UTF8?
There was a problem hiding this comment.
Funny fact: I've known Didier for more than 25 years, and I learned two weeks ago that his name officially doesn't have an accent...
This is slightly dangerous: I've once seen a file with UTF-8 in the comments and Latin-1 in the code. Emacs got confused by this and introduced mojibake in the code (which made it buggy). But if we make sure there's no non-ascii character anywhere else than in headers, I guess it's OK. |
6871c4c to
316319d
Compare
|
@damiendoligez - I've pushed an extra commit (316319d) which adds a |
|
This is now rebased. I have hopefully applied enough care in the root Makefile (the change for updating utils/config.ml) to reflect recent additions, but this must therefore be allowed to run through CI. All previous review points had been dealt with so as long as Damien is happy with the new UTF-8/non-ASCII check, I think this it then good to go, and I'll get #1288 updated. (I'm liking having tools/check-typo always check my commits, but I'm liking the fact it reveals other people's violations less!!) |
316319d to
6d40526
Compare
| BOOT_FLEXLINK_CMD = FLEXLINK_CMD="../boot/ocamlrun ../flexdll/flexlink.exe" | ||
| CAMLOPT := OCAML_FLEXLINK="boot/ocamlrun flexdll/flexlink.exe" $(CAMLOPT) | ||
| FLEXDLL_DIR=$(if $(wildcard flexdll/flexdll_*.$(O)),"+flexdll") | ||
| FLEXDLL_DIR=$(if $(wildcard flexdll/flexdll_*.$(O)),+flexdll) |
There was a problem hiding this comment.
Travis complains about this change. The value of FLEXDLL_DIR must be a string in OCaml syntax, so the quotes are needed.
There was a problem hiding this comment.
It's not quite this - it's a mistake in one of the macros further down. I've just pushed a fix. The issue is that FLEXDLL_DIR is non-empty then it must be a complete OCaml string (with the double quotes) but if it's empty then it should be empty - as the result goes into a string list into sys/config.mlp. The new generic SUBST macro escapes double-quotes which was what was causing the problem for this specific variable.
|
@dra27:
"Permit lines between 80-132 in .travis-ci.sh": why is that needed? Why
isn't it possible to have this file respect the 80 characters line
limit?
"Tweak .gitattributes for fakeclock.c in testsuite": same question
here, why this?
I thought we decided not to have copyright headers in test programs, but
perhaps I am mistaken here?
"Permit UTF-8 characters in copyright and authors": are we really sure
we want to go in this direction?
"Ignore very long URLs in tools/check-typo": typo in the commit message:
This is only to allow **a allow a** very long URL
Am I correct that you will get rid of the URL shortener, finally?
|
|
David Allsopp (2017/10/11 10:39 +0000):
This is now rebased. I have hopefully applied enough care in the root
Makefile (the change for updating utils/config.ml) to reflect recent
additions, but this must therefore be allowed to run through CI.
Will you use precheck?
|
6d40526 to
b914315
Compare
|
@shindere: yes, I will run this through precheck (once it also passes Travis - careless of me not to have checked Linux in the first place). |
|
David Allsopp (2017/10/11 15:03 +0000):
@shindere: yes, I will run this through precheck (once it also passes
Travis - careless of me not to have checked Linux in the first place).
Great, thanks. Please let me know how it feels with precheck, I had
re-build it recently.
|
9b54cd4 to
6f16a74
Compare
The particular line was the API query because the URL is very long. Specifically for the CI, it seemed easier just to allow this one to have a few overly long lines because it should change very infrequently. I can revisit this if you prefer.
We had, but fakeclock.c is a massive piece of unscrupulous C on which I wish to be identified (both for the "oh my God, who did this" reason and also for copyright!).
It's easy for me to say no one should have accents in their names! That said, I think we should permit it, yes, even though the support here is not perfect (part of the "TODO Ensure the file is valid UTF-8" part would also include "and ensure that the user has not used any combining characters", i.e. that it's normalised, but I think that then strays into over-engineering... and I'm certainly not planning on implementing Unicode normalisation in awk!).
Fixed 😊
I should have re-worded this when I did the fixup (I have now) - I've used as many markdown/ASCIIdoc tricks as possible to shorten those lines. |
|
David Allsopp (2017/10/11 15:16 +0000):
> "Permit lines between 80-132 in .travis-ci.sh": why is that needed? Why
isn't it possible to have this file respect the 80 characters line
limit?
The particular line was the API query because the URL is very long.
Specifically for the CI, it seemed easier just to allow this one to
have a few overly long lines because it should change very
infrequently. I can revisit this if you prefer.
Perhaps it would be possible in check-typo to allow lines above 80
characters only for URLs?
I'm sorroy, I confused this file and appveyor_build.sh which also seems
to have several lines that go beyond 80 characters and could benefit
from a re-formatting.
> "Tweak .gitattributes for fakeclock.c in testsuite": same question
here, why this?
>
>I thought we decided not to have copyright headers in test programs, but
perhaps I am mistaken here?
We had, but fakeclock.c is a massive piece of unscrupulous C on which
I wish to be identified (both for the "oh my God, who did this" reason
and also for copyright!).
I think file authors can already be identified through git blame and am
not very tnthousiastic about exceptions but if our fellow developers are
okay, fine with me.
> "Permit UTF-8 characters in copyright and authors": are we really sure
we want to go in this direction?
It's easy for me to say no one should have accents in their names!
That said, I think we should permit it, yes, even though the support
here is not perfect (part of the "TODO Ensure the file is valid UTF-8"
part would also include "and ensure that the user has not used any
combining characters", i.e. that it's normalised, but I think that
then strays into over-engineering... and I'm certainly not planning on
implementing Unicode normalisation in awk!).
Well I wouldn't have changed anything, although I am concernend. I think
I'd prefer to keep things as simple as possible.
> "Ignore very long URLs in tools/check-typo": typo in the commit message:
This is only to allow **a allow a** very long URL
Fixed 😊
> Am I correct that you will get rid of the URL shortener, finally?
I should have re-worded this when I did the fixup (I have now) - I've
used as many markdown/ASCIIdoc tricks as possible to shorten those
lines.
OK.
|
|
@dra27 Please write a Changes entry and I think you can merge this; @damiendoligez approved yesterday. |
|
Damien Doligez (2017/10/20 02:41 -0700):
> OK. Didn't we want to *remove* the accents, actually?
I've given up on removing accents from the names of contributors and
institutions.
:(
|
|
@dra27 Is there a reason this hasn't been merged to 4.06? You just need to write a Changes entry, or so I thought. |
Large piece of code which does have a header.
Allows names to include accented characters, but only in a comment.
Incorrectly copied from an old QPL header.
Reduce line lengths and tidy-up the code while at it.
Simultaneously deals with an overly-long line and also ensure that backslashes would be escaped no matter which variables they happened to appear in.
This is only to allow for a few specific instances, so for the time being implemented for long-long and not very-long-line.
Add a new warning non-ascii-utf8 displayed only if the non-ascii attribute is specified and UTF-8 characters were ignored in the copyright or authors lines in the header.
1104c58 to
f282223
Compare
|
Travis Changes test appears to be overloaded. Merged and cherry-picked to 4.06 as 4e03f87 |
This is a prerequisite for #1148 moving towards having the entire repository pass the
tools/check-typoscript.Review points definitely required:
tools/check-typoto permit UTF-8 characters in copyright and authors lines in the headers (allows several existing headers to retain accented characters and also means no author in future should feel compelled to use incorrect/alternate spellings of names just to satisfy a script)Makefileare worthwhile in their own right, but consequently should be reviewed as code, not reformattingCCOUTPUTinotherlibs/systhreads/Makefileis a merge artefact from @shindere's marvellous merge, but I didn't triple-check itThere are various people who should briefly sign off on (hopefully) trivial alterations:
tools/check-symbol-nameswas lacking a copyright and I assume that you're happy to be credited with your script 😉tools/lintapidiff.mlhas a mis-formatted header, but I believe you are the both author and copyright holder!tools/make_opcodes.mllhas a wrong licence, either because it was miscopied or possibly because the patch originally predated the change of OCaml's licence. Please confirm that switching the file to declare LGPL 2.1 as for the rest of the tree is OK.