Skip to content

Signed/unsigned names#83

Closed
sunfishcode wants to merge 3 commits intomasterfrom
signed-unsigned-names
Closed

Signed/unsigned names#83
sunfishcode wants to merge 3 commits intomasterfrom
signed-unsigned-names

Conversation

@sunfishcode
Copy link
Member

Some places said "Uint32" and some said "UInt32". I picked the latter here from personal preference, though I'm open to other preferences, and mainly just wish the naming to be consistent.

This patch also adds "S" to integer types used with explicit signedness.

@kripken
Copy link
Member

kripken commented May 26, 2015

I would actually prefer Uint, because that is how JavaScript typed arrays are named.

@jfbastien
Copy link
Member

+1 on consistency.
I have no preference on either capitalization, so lgtm either way.

@BrendanEich
Copy link

+1 on @kripken's point. It'll be a drag to have to type or even read UInt32 and Uint32Array.

/be

@sunfishcode
Copy link
Member Author

I have found it a drag to uncapitalize the I in Uint32 when it's capitalized in Int32, though I am perhaps biased because the convention in LLVM is to capitalize the I in UInt32 :). I won't hold out forever though if there's an overall preference for Uint32 :).

It's worth pointing out just for general awareness that we already differ from JS Typed Arrays in the convention of prepending 'S' for signed types, because we distinguish between Signed, Unsigned, and signedness-neutral. And because the U/S creates a nice symmetry and clearly highlights when there are signed vs unsigned things happening.

Some operators used "Uint32" and some "UInt32". Pick "Uint32" because
it's similar to JS Typed Array naming, although we also use "Sint32"
which is still different from JS Typed Array naming.
Add an 'S' for signed memory types, and use "Uint32" instead of "UInt32".
@sunfishcode sunfishcode force-pushed the signed-unsigned-names branch from 9e8ba43 to 272f9c6 Compare June 1, 2015 16:35
@sunfishcode
Copy link
Member Author

Ok, based on feedback here, I updated the patch to prefer lower case after the {S,U}. This seems to generalize to operators as well, so I also renamed Int32SDiv to Int32Sdiv and so on.

@pizlonator
Copy link
Contributor

FWIW, I prefer UInt32 over Uint32.

-Fil

On May 28, 2015, at 7:19 AM, Dan Gohman notifications@github.com wrote:

I have found it a drag to uncapitalize the I in Uint32 when it's capitalized in Int32, though I am perhaps biased because the convention in LLVM is to capitalize the I in UInt32 :). I won't hold out forever though if there's an overall preference for Uint32 :).

It's worth pointing out just for general awareness that we already differ from JS Typed Arrays in the convention of prepending 'S' for signed types, because we distinguish between Signed, Unsigned, and signedness-neutral. And because the U/S creates a nice symmetry and clearly highlights when there are signed vs unsigned things happening.


Reply to this email directly or view it on GitHub.

@BrendanEich
Copy link

If clean-slate, I prefer UInt32 also. Holding shift down (LH) while typing U and I (RH) is no less ergonomic and the result is prettier because regular in spelling of `Int.

If the slate here is clean enough because Typed Arrays are not closely coupled or in nearby content, no worries on that front. WebAssembly is master of its own naming domain ;-).

/be

@sunfishcode
Copy link
Member Author

Ok, a UInt32 version of the patch is available as #93, so we can pick which one we like.

@kripken
Copy link
Member

kripken commented Jun 1, 2015

I can see valid reasons for preferring either Uint or UInt. However, I think it will look like an annoying inconsistency in the web platform if people writing a website with both JS and wasm (which I think may become common) have to type Uint in one but read UInt in the other.

@BrendanEich
Copy link

@kripken: I agreed with your point in an earlier comment. But will people have to write JS and wasm by hand very often? Sorry to be dense, could you say more about that scenario?

/be

@kripken
Copy link
Member

kripken commented Jun 1, 2015

wasm probably wouldn't be written by hand much, but it would be read for debugging purposes not infrequently, I think, and in general we care about having a good View Source option for wasm.

Imagine someone with a site that uses js and wasm, and debugging some minor issue, through control flow that passes from js to wasm and back. It would feel odd to have Uint in some places and UInt in others.

@jfbastien
Copy link
Member

In think case, we don't have much reason to choose one or the other besides consistency with the web platform. Despite wasm being its own master w.r.t. naming, I think we should avoid being different just because, I'd therefore go with the same naming that the web paltform has.

@BrendanEich
Copy link

To add a bit more to the Web Platform side: SIMD and Value Types for ES7/2016 will as you'd hope match the Typed Array naming convention, so Uint32, Uint64, etc.

I'm back on the Web-first side, FWIW (stable, not flip-flopping again :-P). Our slate is not clean, and LLVM's slate is not overriding compared to the Web standards starting from WebGL, where typed arrays originated.

/be

@jfbastien
Copy link
Member

If anything, we want to move away from LLVM because we don't want the impression of a single-compiler spec! LLVM is really important to us, it's probably the first compiler we'll bring up, but we want other compilers on board.

@sunfishcode
Copy link
Member Author

I'm actually going to make the Web Platform argument in opposite direction. Matching the Web Platform for unsigned types makes it even more confusing that we're not matching it on signed types.

Also, the argument for UInt32 is not just "because LLVM", it's that it just looks nicer (see multiple comments above).

@pizlonator
Copy link
Contributor

On Jun 1, 2015, at 8:36 PM, Dan Gohman notifications@github.com wrote:

I'm actually going to make the Web Platform argument in opposite direction. Matching the Web Platform for unsigned types makes it even more confusing that we're not matching it on signed types.

Also, the argument for UInt32 is not just "because LLVM", it's that it just looks nicer (see multiple comments above).

+1

I don’t think that being consistent with other specifications is very important. We should just pick what we like. I’ve always thought that the WebGL/ES choice of Uint over UInt was awkward, and it was a rather innovative (ahem) convention.

-Filip

@kripken
Copy link
Member

kripken commented Jun 2, 2015

@sunfishcode : I don't understand your first paragraph. Where are we matching and we are we not, in the context you mean?

@sunfishcode
Copy link
Member Author

In wasm, signed things have an 'S', to clearly pair them with their 'U' counterparts. And since we have concepts where an Int32 is explicitly signed, explicitly unsigned, and neutral, S-, U-, and plain Int32 is a natural naming scheme to indicate which one we mean. That's incompatible with "Int32" meaning a signed int32.

@kripken
Copy link
Member

kripken commented Jun 2, 2015

I see. So JS has Int32, Uint32 and wasm can have Int32, Uint32, Sint32 and you are saying that the Int32 is not quite the same between those. That's a fair point.

However, signed is the "default" in JS, in the sense that

  1. almost all to-int convertions convert to a signed value (|,>>,~,&, all except for >>>)
  2. JS engines tend to optimize to signed integers
  3. asm.js uses signed for FFIs and returns etc.
  4. The issue that sparked this topic, that typed arrays default to signed, and a U appears for the unsigned.

So I would say that Int32 meaning "no sign specified" in wasm is represented most closely by the same label, Int32, in JS, where "no sign specified" can't exist and therefore when you want the "default" number without specifying a sign, you mean the signed value. It seems like a natural fit.

The bottom line for me is whether a web developer with an app containing both JS and wasm would prefer to see Int32, Uint32 in JS alongside either Int32, Uint32, Sint32 or Int32, UInt32, SInt32 in wasm. The former seems better to me: Uint32 maps directly; Int32 maps in every practical way - pass one back and forth between JS and wasm, and it looks as you'd expect, with the caveat that the sign "doesn't matter" in wasm, but devtools like the debugger would likely present it as signed, which makes it consistent; and Sint32 is the "new thing" in wasm that the dev needs to be aware of, that it maps to the same Int32 in JS. While I can appreciate the point that purposefully making the names different in wasm might help "separate" the two sides, and I see your point that there is some basic difference, perhaps justifying the separation, overall it just seems like a downside to me, compared to a developer seeing the same name on both sides, and things just working, which as argued above, I think they would.

@lukewagner
Copy link
Member

I agree with Dan's comment that using s/u/- to mean signed/unsigned/indifferent is a valuable convention. I have a hard time finding an aesthetic preference between Uint32 and UInt32, though. (I was about to suggest uint32, but this doesn't concatenate well in opcode names unless we switch to _ separation, a naming scheme I personally like but am not going to propose). I do think that there is greater-than-zero value to consistency, though, since it's not just JS, but all the specs/WebIDL that mention Uint32Array.

@sunfishcode
Copy link
Member Author

There's a lack of a strong current pulling either way, but I'd really like consistency here, and that requires we make a choice.

I'll add one more reason for preferring UInt32 to the reasons above: capital-after-the-{U,S} helps emphasize that the {U,S} are participating in the naming convention of identifying signed/unsigned things. In e.g. "SRem", it's slightly more clear that the abbreviation decomposes to "S" + "Rem".

Is this enough to push it over the top?

@kripken
Copy link
Member

kripken commented Jun 8, 2015

I agree that is a valid point. But equally compelling to me are arguments that we should just go lowercase: "srem", "sint". This is the most natural to me, likely because of LLVM IR. But it's just a subjective feeling of mine.

I could also see an argument for "Srem" to be compelling. I mean, it looks a little odd because I am used to LLVM conventions, but again, that's just a subjective gut feeling of mine. "Srem" is logically consistent in its own way ("signed remainder, and we capitalize just the first letter of the entire operation"), and while it feels odd to me in one way, it feels right in another because of all the times I have written "Sint" on the web.

All this was debated when typed arrays were designed for JS, and it could have gone either way. It went the way it did. IMO we should not fork the naming conventions of the web because of how a small number of us here might happen to feel right now. No subjective feeling of ours can be more important than overall consistency in the entire platform, can it? Surely such consistency is an objective value.

@sunfishcode
Copy link
Member Author

I'd be ok with lowercase names too if we can get consensus on that. #122 is a pull request for it.

@kripken
Copy link
Member

kripken commented Jun 8, 2015

I would prefer Sint, but I am ok with sint.

I think that sint is also not "going with the grain" of the current notation on the web, but on the other hand, Sint/SInt feels almost willful ("we picked the opposite bikeshed", "we also have a mix of upper and lower, and it follows different conventions") while Sint/sint feels like a natural separation ("this is a different domain", "we don't have a mix of upper and lower, we just have lower everywhere").

@jfbastien
Copy link
Member

Can we close this in favor of #122?

@sunfishcode
Copy link
Member Author

Closing. Based on discussion above, #122 seems most likely to gain consensus of the three PRs here, so I'll close this one and the other captializing one so that we can focus on #122.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants