Normative: Require the latest available Unicode version instead of a fixed version number#620
bterlson merged 1 commit into tc39:master from mathiasbynens:unicode-9
|
Should probably do a review of the changes before committing to this. Due diligence and all of that. Can you comment on whether/which changes are relevant? |
|
No due diligence, I want my Unicode power symbol now!! |
|
@bterlson Sure.
Did I miss anything? |
|
@mathiasbynens Additions don't seem worrying. Anything by way of "breaking changes" there? Removals from |
|
No removals from

I’ve checked all of the above using the Unicode data files directly, but it can all be verified quite easily by running

```js
// Look for removals in `ID_Continue`:
const a = require('unicode-8.0.0/Binary_Property/ID_Continue/code-points.js');
const b = new Set(require('unicode-9.0.0/Binary_Property/ID_Continue/code-points.js'));
const diff = new Set(a.filter(x => !b.has(x)));
console.log(diff);
// → Set { }
```

Once we’ve established there are no removals, we can easily count the number of new symbols:

```js
const a = require('unicode-8.0.0/Binary_Property/ID_Continue/code-points.js').length;
const b = require('unicode-9.0.0/Binary_Property/ID_Continue/code-points.js').length;
console.log(b - a);
// → 7339
```

The same goes for other properties, e.g.:

```js
// Look for removals in `S` case folding:
const a = Object.keys(require('unicode-8.0.0/Case_Folding/S/code-points.js'));
const b = new Set(Object.keys(require('unicode-9.0.0/Case_Folding/S/code-points.js')));
const diff = new Set(a.filter(x => !b.has(x)));
console.log(diff);
```
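The checks in this comment all follow one pattern, which could be factored into a small helper. What follows is only a sketch and not part of the original comment: `findRemovals` is a hypothetical name, and the two sample arrays are made-up stand-ins for the real code-point lists, so that the snippet runs without the Unicode data packages installed.

```javascript
// Hypothetical helper: returns the values present in `older`
// that are missing from `newer`.
function findRemovals(older, newer) {
  const newerSet = new Set(newer);
  return older.filter((x) => !newerSet.has(x));
}

// Made-up sample data standing in for two versions of a code-point list.
const oldCodePoints = [1, 2, 3];
const newCodePoints = [1, 2, 3, 4, 5];

const removed = findRemovals(oldCodePoints, newCodePoints);
console.log(removed); // an empty array means nothing was removed

// Only when nothing was removed does the length difference
// equal the number of additions.
const addedCount = newCodePoints.length - oldCodePoints.length;
console.log(addedCount);
```

With the real data, the two arrays would presumably be replaced by the `require(...)` calls for the two versions being compared.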
|
Good point! I will explore some and report back. |
|
Unicode now seems to be on a yearly update schedule that slightly lags ECMA-262. If our intent is to update these references every year, wouldn't it be better to use an open-ended reference to the current Unicode standard? In standards documents, a normative reference to another standard that does not include a specific version or date qualifier means the "current version". |
|
@allenwb That’s what I proposed three years ago: https://bugs.ecmascript.org/show_bug.cgi?id=2071#c0 |
|
Whichever update process is used, Ecma-402 will need to be updated as well |
|
@allenwb I think that is what we had consensus for as well. An open-ended reference seems fine but I was thinking a specific reference at least cued us to do the due diligence of looking for potential issues. I worry without that (and @mathiasbynens's expertise) we'd grow complacent :-P Happy to update to an open-ended reference though. |
|
When I proposed upgrading to Unicode 8.0, I had no idea that an open-ended reference was in the cards as a legal possibility for a spec to do, and didn't realize that we got consensus on that. I thought the consensus was annual bumps like this. I like Allen's idea. Let's keep doing the due diligence, but I don't think we need explicit bump commits to enforce that. |
|
Going through the notes I see that we had consensus for "8 or greater" which doesn't actually imply that we can use an unversioned reference. For now I won't take this PR, and will add to the agenda for Redmond that we discuss the unversioned reference. |
|
PR updated to refer to the latest available Unicode version, as per the July 27 meeting. |
|
@mathiasbynens Looks great, thanks so much! |
|
@mathiasbynens do you plan to update 402 as well? |
…fixed version number Ref. tc39/ecma262#620.
Instead of referring to a version snapshot, link to the latest version of UTR15. Ref. #620.
- 2016: the Unicode change affected what was considered whitespace (tc39#300 / 24dad16)
- 2017: the latest version of Unicode is mandated (tc39#620)
- 2018: changed tagged template literal objects to be cached per source location rather than per realm (tc39#890)
- 2019: Atomics.wake was renamed to Atomics.notify (tc39#1220)
- 2019: `await` was changed to require fewer ticks (tc39#1250)
…ns (tc39#1698)
- 2016: the Unicode change affected what was considered whitespace (tc39#300 / 24dad16)
- 2017: the latest version of Unicode is mandated (tc39#620)
- 2018: changed tagged template literal objects to be cached per source location rather than per realm (tc39#890)
- 2019: Atomics.wake was renamed to Atomics.notify (tc39#1220)
- 2019: `await` was changed to require fewer ticks (tc39#1250)
As of June 21st, Unicode 9.0.0 is the latest version.

Update July 27: This PR has been updated to refer to the latest available Unicode version rather than v9.0.0 specifically, as per the July 27 meeting.