Skip to content

Conversation

@faho
Copy link
Member

@faho faho commented Aug 4, 2021

In most cases, like math, we want C-semantics for floating point
numbers. In particular "." needs to be the decimal separator.

Instead, we pay the price in printf, which is currently the sole place
to output in locale-specific numbers and attempt to read them and
C-style ones.

This is very likely to contain GNUisms, not work on BSD, yadda yadda.

This is an RFC, intended to get people to check and to think about.

The results are... interesting.

All tests done on my Archlinux system with LANG=de_DE.UTF-8:

  • math is about 10% faster with the new system - this is understandable, since it no longer needs to switch locale
  • external_cmds.fish is about 1.4x faster than the old system! From 1.5s to 1.05s!

I have absolutely no idea why, but the same speedup is reproducible by just setting LC_NUMERIC=C, so it seems the multibyte locales are just slower? I have no idea where we'd be using them, tho, and cachegrind doesn't appear helpful either - it's possible wcs2string is faster (which takes ~50% of that benchmark, it's ridiculous).

cc @ridiculousfish

@faho faho added enhancement RFC performance Purely performance-related enhancement without any changes in black box output labels Aug 4, 2021
@faho faho added this to the fish-future milestone Aug 4, 2021
@faho
Copy link
Member Author

faho commented Aug 5, 2021

As a sidenote: printfs output being locale-dependent is super awkward, because that means

math (printf '%f' 5.2) + 2

might or might not work, depending on the locale. But since math cannot (and definitely SHOULD NOT) try to read numbers in the regional way (because it uses "," as argument separator) the only possible change is to make printf print numbers with ".", always. Which would be a break from e.g. bash (and probably POSIX), but not one I'm opposed to.

@ridiculousfish
Copy link
Member

I like the change. The switch to uselocale instead of setlocale is quite nice for thread safety reasons. This works fine on my Mac, though there's no apparent performance difference. We'll see which if any platforms are broken.

@faho
Copy link
Member Author

faho commented Aug 9, 2021

This works fine on my Mac, though there's no apparent performance difference

Not even faster math? This should be as fast as fba86fb, at least.

auto is_digit = [](wchar_t c) { return '0' <= c && c <= '9'; };
if (len_plus_0 <= sizeof narrow && std::all_of(str, str + len, is_digit)) {
auto is_ascii = [](wchar_t c) { return 0 <= c && c <= 127; };
if (len_plus_0 <= sizeof narrow && std::all_of(str, str + len, is_ascii)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I am not sure if the fast path actually helps anymore if we always use LC_NUMERIC=C.

I would appreciate benchmarks on this.

@faho faho force-pushed the putting-the-c-in-numeric branch from 04e3411 to dd5f4b6 Compare August 9, 2021 18:11
@faho
Copy link
Member Author

faho commented Aug 9, 2021

Ah, yes, forgot that NetBSD does not have uselocale.

So the ideal of thread-safe locale switching goes out of the window on that system. The alternatives are:

  • setlocale
  • printf just does "." always.

I'm tempted to just do the latter, first on NetBSD, and then everywhere, because that makes more sense anyway.

@faho
Copy link
Member Author

faho commented Aug 10, 2021

external_cmds.fish is about 1.4x faster than the old system! From 1.5s to 1.05s!

Okay, on WSL1 I get 1.8x, but only if I effectively set LC_NUMERIC to de_DE.UTF-8 - en_US.UTF-8 has about the same performance as C. Something here is causing massive slowdowns in either that locale or comma-using ones in general.

Edit: Tested with fr_FR.UTF-8, same slowdown. So something has issues with comma-using locales, and it's probably glibc. Unfortunately under valgrind the issue goes away, so I have no idea what becomes faster/slower here.

@faho
Copy link
Member Author

faho commented Aug 10, 2021

Okay, so this works on linux (with glibc), macOS, FreeBSD. It works with setlocale on NetBSD (because they believe that you should use printf_l and refuse to implement uselocale - yes, really). And it "works" on OpenBSD with the wonderful "CAVEAT":

On systems other than OpenBSD, the LC_NUMERIC locale(1) category can cause parsing failures; see CAVEATS in setlocale(3) for details.

Yeah, sure, OpenBSD. Removing a feature is totally the same as unbreaking things. (LC_NUMERIC is basically unused there, which means it'll never read localized numbers anywhere. Whatever)

For the record: I hate all of this.

@faho faho marked this pull request as ready for review August 14, 2021 08:55
@faho faho force-pushed the putting-the-c-in-numeric branch from dd5f4b6 to f7409af Compare August 14, 2021 09:56
@faho
Copy link
Member Author

faho commented Aug 17, 2021

Note: This doesn't need to be merged for 3.4.0.

I would prefer more testing and benchmarks (because this is such a portability mess!), so I'll delay merging it.

@faho faho force-pushed the putting-the-c-in-numeric branch from f7409af to d331ac0 Compare August 17, 2021 14:35
faho added 4 commits October 13, 2021 20:59
In most cases, like math, we want C-semantics for floating point
numbers. In particular "." needs to be the decimal separator.

Instead, we pay the price in printf, which is currently the sole place
to output in locale-specific numbers and attempt to read them and
C-style ones.

This is *very likely* to contain GNUisms.
We call dup*, we need to call free*. I hate C.
Blergh.

Simply blergh.

Pretty sure this isn't thread-safe, but I'm also pretty sure there *is
no* threadsafe way of doing this without uselocale, so whatever.
@faho faho force-pushed the putting-the-c-in-numeric branch from e778350 to a177824 Compare October 13, 2021 19:01
@faho
Copy link
Member Author

faho commented Oct 13, 2021

Okay, I am merging this now, because:

  1. Nobody seems to have found any issues with this
  2. I have found a system where comma-using locales are pathologically slow - by a factor of 20x or so?

@faho faho merged commit 8dc3982 into fish-shell:master Oct 13, 2021
@faho faho deleted the putting-the-c-in-numeric branch October 14, 2021 05:58
@faho
Copy link
Member Author

faho commented Oct 14, 2021

Okay, I figured out why it's soo much slower:

#if defined(__GLIBC__) && !defined(__UCLIBC__)
const char *version = gnu_get_libc_version();
result = version && strtod(version, nullptr) >= 2.24;
#endif

In comma-using locales on systems using glibc, that check fails!

The glibc version string is stored with a ., and so reading it as a float (which is wrong anyway!) fails, and so fish stops us from using posix spawn, which can be quite a bit slower (especially on my WSL1 test case!).

This check should be rewritten. @ridiculousfish should this not be using __GLIBC_MINOR__ and friends? Alternatively we could use strtod_l with a C locale here, or strtoi twice for the components?

@zanchey zanchey modified the milestones: fish-future, fish 3.4.0 Oct 14, 2021
@ridiculousfish
Copy link
Member

Oh wow. That must have been very annoying to track down. Yes __GLIBC_MINOR__ is what I should have been using all along, I'll add that.

ridiculousfish added a commit that referenced this pull request Oct 16, 2021
__GLIBC_PREREQ is the preferred way to conditionally enable features
based on glibc versions. Use it to avoid expensive parsing and
locale sensitivity. See #8204
@ridiculousfish
Copy link
Member

Looks like __GLIBC_PREREQ is the thing to use. f0d4fd8

krobelus added a commit to krobelus/fish-shell that referenced this pull request Dec 12, 2021
I used the command from fish-shell#8092 to list issues/PRs with missing changelog entries,
and went through most of them.

Each issue gets three lines
- subject
- URL
- verdict

If the verdict ends with ", ignoring", I added it the the ignored list in the changelog.

The issues are grouped by verdict, with the interesting/leftover ones
on top (obviously no need to double check *everything*).

The "gh" script is already a quantum leap we can still find better
ways to share the changelog burden.
I noticed that there are many minor updates that can probably be
ignored.  Filtering them out doesn't take much time but it adds up,
especially if it's a single person doing it.

---

Issue/PR title: Make --no-config mode more comfortable
fish-shell#8493
in-flight

Issue/PR title: Debian fish binary package difficult to install
fish-shell#7845
TODO

Issue/PR title: Work around `setpgid` error on older Apple platforms
fish-shell#8153
this extends 7474, which wasn't listed either, we should probably mention it?

Issue/PR title: Abbr -q return status inconsistent
fish-shell#8431
fixes return status of "abbr -q", should we mention this?

Issue/PR title: math: (n n): incorrect error
fish-shell#8511
improved error output, which is very nice but too minor?

Issue/PR title: Fish autocomplete error on iOS procursus
fish-shell#8205
niche fix, ignoring

Issue/PR title: Fix `fish_key_reader` wrapper check
fish-shell#8271
minor update to not create a harmless alias for fish_key_reader, ignoring

Issue/PR title: funced dosn't like backslash escapes in function names
fish-shell#8289
minor escaping fix, ignoring

Issue/PR title: Hide whatis database building from the user
fish-shell#8310
not something many users would notice in the first place, ignoring

Issue/PR title: Duplicated "Type 'help argparse' for related documentation" for argparse
fish-shell#8368
minor update to error message, ignoring

Issue/PR title: Variable highlight color does not span lines
fish-shell#8444
very obscure fix, ignoring

---

Issue/PR title: Add --function to `read`
fish-shell#8295
added to existing entry (565)

Issue/PR title: Added completions for ethtool
fish-shell#8283
added to existing entry

Issue/PR title: Add dart completion
fish-shell#8315
added to existing entry

Issue/PR title: Add common lisp completions(sbcl/roswell)
fish-shell#8330
added to existing entry

Issue/PR title: Fix st issue with shift+tab
fish-shell#8354
added to existing entry (8352)

Issue/PR title: Support vi-mode cursors in Foot Terminal 
fish-shell#8391
added to existing entry (8167)

Issue/PR title: Completions pager should redraw if the subbed completion wraps/unwraps the line
fish-shell#8405
added to existing entry (8509)

---

Issue/PR title: Binding escape as user binding breaks escape sequence bindings (arrows, etc)
fish-shell#8428
added new entry

Issue/PR title: Windows "color" command completion
fish-shell#8483
added new entry

Issue/PR title: Doesn't build when using netbsd curses on Linux
fish-shell#8087
added new entry

Issue/PR title: Don't override linker
fish-shell#8152
added new entry

Issue/PR title: Add completions for `git-sizer`
fish-shell#8156
added new entry

Issue/PR title: `d3ceba107e88b6c6e1a0358ebcb30366aeef653f` causes issues with repainting multi-line prompt
fish-shell#8163
added new entry

Issue/PR title: Builtin math ncr can be extremely slow
fish-shell#8170
added new entry

Issue/PR title: Completion sometimes missing the last token
fish-shell#8175
added new entry

Issue/PR title: `set -S` should mark read-only variables
fish-shell#8179
added new entry

Issue/PR title: Errors when trying to autocomplete (invalid) UTF-8 escapes
fish-shell#8195
added new entry

Issue/PR title: Always use LC_NUMERIC=C internally
fish-shell#8204
added new entry

Issue/PR title: Slow interaction between backgrounding, universal variables, and repainting
fish-shell#8209
added new entry

Issue/PR title: Unsetting `$fish_emoji_width` doesn't clear the cached width
fish-shell#8274
added new entry

Issue/PR title: If prompt ends in an empty line, the commandline is inserted at the width of the line before
fish-shell#8298
added new entry

Issue/PR title: assertion normal_exited() failed related to paged builtin help
fish-shell#8308
added new entry

Issue/PR title: colors don't kick in for ls on macOS Big Sur, Monterey (and maybe FreeBSD)
fish-shell#8309
added new entry

Issue/PR title: Adds sub-command clear-session to history command. Issue fish-shell#5791
fish-shell#8337
added new entry (as 5791)

Issue/PR title: Display local branches before unique remote branches in git completion
fish-shell#8338
added new entry

Issue/PR title: Fix delete-key in st
fish-shell#8352
added new entry

Issue/PR title: sigsegv on set --show variable (when LANG is set to fr_FR.utf8)
fish-shell#8358
added new entry

Issue/PR title: Add clasp completion
fish-shell#8373
added new entry

Issue/PR title: `cargo run --example` completions break with nested example directories
fish-shell#8429
added new entry

Issue/PR title: argparse completions
fish-shell#8434
added new entry

Issue/PR title: Stop linking to StackOverflow
fish-shell#8495
added new entry

Issue/PR title: fish_key_reader ^C warning isn't right
fish-shell#8510
added new entry

Issue/PR title: Use `--almost-all` in `la` function
fish-shell#8519
added new entry

---

Issue/PR title: improve the experience of using fish over mosh
fish-shell#1363
listed as 8376, ignoring

Issue/PR title: incomplete man page completions
fish-shell#8305
listed as 8309, ignoring

Issue/PR title: Support "$(cmd)" command substitution without line splitting
fish-shell#8059
listed as 159, ignoring

Issue/PR title: fish_config: Read colorschemes from .theme files
fish-shell#8127
listed as 8132, ignoring

Issue/PR title: funced: edit the whole file, not just the function definition
fish-shell#8130
listed as 391, ignoring

Issue/PR title: builtin cd: print error about broken symlink
fish-shell#8270
listed as 8264, ignoring

Issue/PR title: fix man completion for BSD's mandoc
fish-shell#8306
listed as 8305, ignoring

Issue/PR title: Don't escape tildes that come from custom completions
fish-shell#8441
listed as 8441, ignoring

Issue/PR title: Use `cargo run --example` to get list of examples
fish-shell#8446
listed as 8429, ignoring

---

Issue/PR title: Node completion: add v8 sparkplug option
fish-shell#8118
update to existing completions, ignoring

Issue/PR title: Add zypper subcommands completion
fish-shell#8183
update to existing completions, ignoring

Issue/PR title: completion nmap: suppress warning when local scripts folder exists
fish-shell#8184
update to existing completions, ignoring

Issue/PR title: add missing `git commit` completions
fish-shell#8191
update to existing completions, ignoring

Issue/PR title: Updated ping completions
fish-shell#8192
update to existing completions, ignoring

Issue/PR title: Add `--function` to `set` completion
fish-shell#8202
update to existing completions, ignoring

Issue/PR title: completion: support `--no` prefixes for mpv flag options
fish-shell#8219
update to existing completions, ignoring

Issue/PR title: complete "mpc load"
fish-shell#8241
update to existing completions, ignoring

Issue/PR title: Add and fix completions for new options
fish-shell#8243
update to existing completions, ignoring

Issue/PR title: Fix completions/ls.fish
fish-shell#8249
update to existing completions, ignoring

Issue/PR title: Fix completions/coredumpctl.fish and add new complete
fish-shell#8256
update to existing completions, ignoring

Issue/PR title: completions/git: Handle "1 .T" & "1 AT" files
fish-shell#8311
update to existing completions, ignoring

Issue/PR title: completions/xbps-query: add missing `-p` completions
fish-shell#8323
update to existing completions, ignoring

Issue/PR title: Update ldapsearch.fish
fish-shell#8326
update to existing completions, ignoring

Issue/PR title: small fix completions/duply.fish
fish-shell#8327
update to existing completions, ignoring

Issue/PR title: Update ip.fish
fish-shell#8334
update to existing completions, ignoring

Issue/PR title: Fix ant completion
fish-shell#8344
update to existing completions, ignoring

Issue/PR title: Update dmesg completions
fish-shell#8365
update to existing completions, ignoring

Issue/PR title: No hints for -g|--global and -U|--universal flags for abbr command
fish-shell#8367
update to existing completions, ignoring

Issue/PR title: Updated systemd-analyze completions
fish-shell#8381
update to existing completions, ignoring

Issue/PR title: vmctl completion function call needs to be quoted
fish-shell#8406
update to existing completions, ignoring

Issue/PR title: pabcnetcclear command completion update
fish-shell#8480
update to existing completions, ignoring

---

Issue/PR title: document `--no-config`
fish-shell#8176
doc update, ignoring

Issue/PR title: Theme demo needs to be adjusted so that only unmatched quote is an error
fish-shell#8260
doc update, ignoring

Issue/PR title: no error about wrong >>? redirection operator
fish-shell#8380
doc update, ignoring

Issue/PR title: set -l works outside of command block
fish-shell#8385
doc update, ignoring

Issue/PR title: Some enhancements to "for" and "while" loop pages
fish-shell#8409
doc update, ignoring

Issue/PR title: Html docs: Remove link underlines again?
fish-shell#8439
doc update, ignoring

Issue/PR title: Old-style options support "=" assignment operator in complete builtin
fish-shell#8457
doc update, ignoring

Issue/PR title: Document prompt_hostname
fish-shell#8522
doc update, ignoring

---

Issue/PR title: edit_command_buffer: use "command" to ignore any functions with the same name
fish-shell#8221
only helps broken systems, ignoring

Issue/PR title: Prepend command to cat
fish-shell#8287
only helps broken systems, ignoring

Issue/PR title: Make less version check compatible with older Fish
fish-shell#8299
only helps broken systems, ignoring

Issue/PR title: Skip leading `command` in `__fish_man_page`
fish-shell#8447
only helps broken systems, ignoring

Issue/PR title: fish_config doesn't work without curses module
fish-shell#8487
only helps broken systems, ignoring

---

Issue/PR title: fix 'socket file name too long' error
fish-shell#8128
test fix with long tempdirs (macOS), not really user-visible, ignoring

Issue/PR title: Give tests a more generic name
fish-shell#8449
not user-visible, ignoring

Issue/PR title: string tests sometimes failing on macOS (Github Actions)
fish-shell#8353
not user-visible, ignoring

Issue/PR title: history merge test fails on OpenBSD
fish-shell#6477
not user-visible, ignoring

---

Issue/PR title: Obtain Deno completions from itself
fish-shell#8471
update to an unreleased feature (7138), ignoring

Issue/PR title: `string length --visible` performance
fish-shell#8253
update to an unreleased feature, ignoring

Issue/PR title: Backspace character is ignored when calculating string widths
fish-shell#8277
update to an unreleased feature, ignoring

Issue/PR title: `fish_config choose` leaves previous right prompt in place
fish-shell#8314
update to an unreleased feature, ignoring

Issue/PR title: parenthesis characters outer of $(command substitution) in string cause error
fish-shell#8394
update to an unreleased feature, ignoring

Issue/PR title: Parser bug with command substitutions in strings inside parenthesis
fish-shell#8500
update to an unreleased feature, ignoring

Issue/PR title: fish_config: silently doesn't set color schemes.
fish-shell#8419
regression, not in any release, ignoring

Issue/PR title: :program: in sphinx doesn't link
fish-shell#8438
regression, not in any release, ignoring

Issue/PR title: __fish_seen_argument.fish throws exception when autocompleting
fish-shell#8478
regression, not in any release, ignoring

---

Issue/PR title: Fix typo in abbr docs
fish-shell#8280
typofix, ignoring

Issue/PR title: Fix typo in `set_colors` command documentation
fish-shell#8321
typofix, ignoring

Issue/PR title: Typo funcions -> functions
fish-shell#8257
typofix, ignoring

---

Issue/PR title: remove make_pair
fish-shell#8206
no behavior change, ignoring

Issue/PR title: replace push_back with emplate_back
fish-shell#8222
no behavior change, ignoring

Issue/PR title: clang-tidy: remove pointless virtual
fish-shell#8224
no behavior change, ignoring

Issue/PR title: change value to rvalue reference
fish-shell#8227
no behavior change, ignoring

Issue/PR title: convert const ref to rvalue ref
fish-shell#8228
no behavior change, ignoring

Issue/PR title: clang-tidy: use for range loops
fish-shell#8229
no behavior change, ignoring

Issue/PR title: fix deleted constructors
fish-shell#8230
nno behavior change, ignoring

Issue/PR title: clang-tidy: const reference conversions
fish-shell#8231
no behavior change, ignoring

Issue/PR title: clang-tidy: simplify two bool returns
fish-shell#8235
no behavior change, ignoring

Issue/PR title: clang-tidy: replace size comparisons with empty
fish-shell#8237
no behavior change, ignoring

Issue/PR title: clang-tidy: replace NULL with nullptr
fish-shell#8239
no behavior change, ignoring

Issue/PR title: add constexpr
fish-shell#8252
no behavior change, ignoring

Issue/PR title: __fish_seen_subcommand_from and __fish_seen_argument update
fish-shell#8430
no behavior change (apart from a regression that's fixed), ignoring

Issue/PR title: Run fish_indent on all non-test .fish files
fish-shell#8476
no behavior change, ignoring

Issue/PR title: Use test command instead of bracket command
fish-shell#8477
no behavior change, ignoring

Issue/PR title: Fix code scanning alert - Wrong type of arguments to formatting function
fish-shell#8521
no behavior change, ignoring

Issue/PR title: clang-tidy: replace push_back with emplace_back
fish-shell#8236
no behavior change, ignoring

---
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 17, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

enhancement performance Purely performance-related enhancement without any changes in black box output RFC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants