Add an RFC describing SV boolean semantics#3
Conversation
|
I think this is a terrible idea! We seem to be suddenly obsessed with ignoring perl's basic operating principle (that data is fluid and typeless - type interpretation being imposed by ops operating on the data) for the sake of the one use case of round trips for JSON and similar serialisers. Such a boolean flag adds extra overhead to lots of hot places in the perl core - for example pp_add(), which currently has to handle all permutations of two args being (int/num/string/other), would have to cope with (bool/int/num/string/other) x 2 |
This doesn't change the semantics of any existing operations: all values ("boolean-intent" or not) can still be tested for truth, can still be numified or stringified. Nothing at all changes there. The only thing that is new is that now there is an additional question that can be asked ("is this value boolean-intent?"), a question which answers "yes" to the result of any boolean predicate test operator, or any SV transitively initialised from such.
Indeed - it turned out for that and other reasons I didn't implement it in a flag. Instead, in my branch I have a small optimisation in
my @arr;
push @arr, 1 == 1 for 1 .. 100000;
will now consume less memory and take less CPU time to run.
As far as I can tell, this optimisation is already beneficial in terms of CPU and memory savings, even if we ignore the new ability we gain by having `SvISBOOL()`. You can see my current progress at https://github.com/leonerd/perl5/tree/stable-bool |
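The "additional question" described in this comment can be sketched in Perl. This is a minimal illustration, assuming perl 5.36 or later, where the predicate is exposed to Perl code as the experimental `builtin::is_bool`; that name was not yet settled at this point in the thread, and the variable names are purely illustrative.

```perl
use v5.36;
no warnings 'experimental::builtin';   # assumption: perl 5.36+, is_bool is experimental

my $from_test = (1 == 1);    # result of a boolean predicate operator
my $copy      = $from_test;  # transitively initialised from it
my @stored    = ($copy);     # stored in an aggregate
my $plain     = 1;           # an ordinary integer

# Existing semantics are unchanged: numeric and string use still work.
my $sum   = $from_test + 1;       # the boolean numifies as 1
my $label = "value=$from_test";   # and stringifies as "1"
my $empty = "" . !$from_test;     # false still stringifies as ""

# The only new thing is the extra question that can be asked.
for my $pair ([from_test => $from_test], [copy   => $copy],
              [stored    => $stored[0]], [plain  => $plain]) {
    my ($name, $value) = @$pair;
    say "$name: ", (builtin::is_bool($value) ? "boolean-intent" : "not boolean-intent");
}
```

Only `$plain` should answer "no" here; the arithmetic and string interpolation behave exactly as they would on a perl without the new predicate.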
|
On a broader note:
I don't know about "suddenly" there. Perl users have long cared about things like boolean round trips in serialisers. The entire JSON module ecosystem (consisting of multiple modules, by multiple authors) invented a special-case hack for this problem; see the context and history around https://metacpan.org/pod/JSON::PP::Boolean
This suggested change is part of a broader set of changes (e.g. Nicholas Clark's PV vs IV flags adjustments) to add what I am desperately trying not to call a "type system" to Perl. "Type system" usually brings to mind that the computer (either the compiler or the interpreter) will forbid certain operations that it doesn't think match up. That isn't what we're doing here. What we're doing is adding more of what I am trying to call "intention tracking": given a scalar value, can we know what the programmer intended it to mean? Sure, any value can be tested for boolean truth, for numerical value, or for stringy value, but what did the programmer really have in mind as "the" canonical shape of this data?
Looking at the wider picture, comparing a few other languages, Perl is starting to look somewhat lacking here:
I'm not saying this one change alone is sufficient, but this is one small facet of a much larger issue. Without fixing the larger issue, it becomes harder for Perl to stand alongside these other languages in modern commercial settings. Interoperability of data is an important concern these days; gone is the time of little standalone awk-like scripts running on the local developer's machine. There is no getting around it: we need abilities like this for Perl to remain relevant to the modern world. |
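For readers unfamiliar with the special-case hack mentioned above, here is a minimal sketch using the core `JSON::PP` module. It assumes default decoder settings; the variable names are illustrative only.

```perl
use strict;
use warnings;
use JSON::PP ();

# By default, decoding maps JSON true/false to JSON::PP::Boolean objects...
my $data  = JSON::PP::decode_json('[true,1]');
my $class = ref $data->[0];

# ...so re-encoding can tell the boolean apart from the number 1.
my $round_trip = JSON::PP::encode_json($data);
print "$class\n$round_trip\n";

# A plain Perl truth value has no such marker; how it is encoded
# depends on the perl and JSON::PP versions in use.
my $native = JSON::PP::encode_json([ 1 == 1 ]);
print "$native\n";
```

The round trip only works because the decoder wraps booleans in objects; a value produced by an ordinary Perl comparison carries no such wrapper, which is the gap the RFC aims to close.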
There's really nothing sudden about it. We have more cases where the community has been begging for a little "stricter" types for many many years. The old scalar value problem of characters vs bytes comes to mind, when dealing with I/O. There's a reason |
d0a1d4e to
075fde9
|
Given this doesn't have a negative performance impact, it seems like a great improvement for letting Perl interoperate more easily with modern devops tooling, which almost always involves JSON or YAML wrangling. |
|
On Sat, Aug 07, 2021 at 07:41:16AM -0700, Paul Evans wrote:
As far as I can tell, this optimisation is already beneficial in terms of CPU and memory savings, even if we ignore the new ability we gain by having `SvISBOOL()`
You can see my current progress at
https://github.com/leonerd/perl5/tree/stable-bool
The trouble with this is that it is making a shared read-only string
buffer effectively writeable. This SEGVs on your branch:
my $x = 1;
my $y = ($x == 1);
$y =~ s/1/0/;
…--
In England there is a special word which means the last sunshine
of the summer. That word is "spring".
|
Ah yes, indeed. :/ I wonder if setting the COW flag will solve this. Possibly, though it suggests there might be other similar troubles. In any case, while I think it over I'll add some tests and a fix for this one at least. |
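A regression test for the reported crash might look something like the following. This is only a sketch using the core `Test::More` module, not the test that was added to the branch; the test descriptions are illustrative.

```perl
use strict;
use warnings;
use Test::More;

my $x = 1;
my $y = ($x == 1);

# Modifying the copy in place must not crash, and must only affect $y.
$y =~ s/1/0/;
is($y, "0", 'in-place substitution on a boolean result modifies the copy');

# A fresh comparison result must be unaffected by the edit above.
my $z = ($x == 1);
is($z, "1", 'later boolean results still stringify as "1"');
ok($z, 'later boolean results are still true');

done_testing;
```

The second and third checks guard against the failure mode described in the report, where writing through one copy would corrupt a string buffer shared with every other true value.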
|
OK, I think I have made some progress there. I've also made a (draft) PR for the branch, so discussions on the implementation can be had over there: Perl/perl5#19040 That leaves this thread free for the abstract intent of the idea, aside from the impl. |
|
Implementation-wise, the code looks happy and solid, and so far seems to be waiting on this RFC process to continue before it gets merged. How do we resolve this stalemate? Can we "accept" the RFC? Or failing that, do we just merge the PR that implements it into perl core and call it done? |
rfcs/rfcTODO.md
Outdated
|
|
||
| The core immortals `PL_sv_yes` and `PL_sv_no` will always respond true to `SvBOOLOK()`, and such a flag will be reliably copied by `sv_setsv()` and friends. The result here is that the result of boolean-returning ops will be `SvBOOLOK()` and this flag will remain with any copies of that value that get made - either over the arguments stack or stored in lexicals or elements of aggregate structures. | ||
|
|
||
| These `SvBOOLOK()` values will still be subject to the usual semantics regarding macros like `SvPV` or `SvIV` - so numerically they will still be 1 or 0, and stringily they will still be "1" and "" (though see the Future Scope section below). |
You might want to clarify here to cover @iabyn's comment, that the POK, IOK etc flags on the SV won't be changing, so code like pp_add doesn't need any changes.
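The point about unchanged flags can be checked from Perl space with the core `B` module. The sketch below is an illustration only: `ok_flags` is a hypothetical helper, not part of any API, and it reports only the three public flags named in this comment.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list which of IOK/NOK/POK are set on a scalar.
sub ok_flags {
    my ($ref)  = @_;
    my $flags  = B::svref_2object($ref)->FLAGS;
    my %bit    = (IOK => B::SVf_IOK(), NOK => B::SVf_NOK(), POK => B::SVf_POK());
    return join ",", grep { $flags & $bit{$_} } qw(IOK NOK POK);
}

my $bool = (1 == 1);   # copy of a boolean-returning op's result
my $sum  = $bool + 41; # ordinary addition; no bool-specific handling needed

print "flags: ", ok_flags(\$bool), "\n";
print "sum:   $sum\n";
```

Because the value still carries the usual integer and string flags, ops such as addition take the same path they always have.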
|
|
||
| Obviously such a solution is specific to JSON encoding and does not apply to, for example, a message gateway between JSON and MsgPack, which would require some translation in between. A true in-core solution to this problem would have many benefits to interoperability of data handling between these various modules. | ||
|
|
||
| ((TODO: Add some comments about purely in-core handling as well that don't rely on serialisation)) |
Should this TODO be updated? (or removed)
|
With regard to the detail you are less sure about, the If they will live with Scalar::Util then they should all be there, and one can I believe having a But I believe the best solution to this will also be informed by the separate discussion going on in p5p/etc about having the new generic |
|
It seems reasonable to put true and false constants in the Until then, I think an |
|
@duncand Yes you're right about the |
|
If using |
|
@kraih That's fair. What I was specifically advocating was that we do NOT have |
|
Actually I would argue that |
|
Adding it as a feature is a prerequisite for "appearing automatically when one says |
075fde9 to
feea6f9
feea6f9 to
e6ecd22
While this is still in progress its ID number remains "TODO"; I will set it to the next free number just before merge.