Fix incorrect version, etc in "Collation and Unicode support" #6657
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This fixes #6618
Main purpose of this update is to correct the note stating: "Starting with SQL Server 2014 (12.x), all new _140 collations automatically support supplementary characters.". I accidentally introduced that error back in a5dd5b5#diff-bd26ae1d32f0764b643c46bbd2af1259a58e375debebe0d71b2cdb1583dfd754R153 on 2017-10-24, due to inconsistent naming conventions (at least at that time) for the include files. That was supposed to have been "SQL Server 2017 (14.x)".
Some of the include files had version numbers in their names referring to the internal version number (e.g. 11 = version 11.x / 110 — this is SQL Server 2012), while other include files had version numbers in their names referring to the common version number / product name (e.g. 14 = SQL Server 2014; this is version 12.x / 120).
Fixed data type used for
CONVERTin server collation query. It wasvarchar, which has two problems: 1) the base datatype isnvarchar, and even though only standard ASCII characters are used, it's still best to match the base data type, and 2) it's a bad practice to not specify a max size for variable length types as the default is situation-dependent. Here the default is 30 (instead of 1) which is still a problem given that as of SQL Server 2019, 1476 out of 5508 (27%) collation names are over 30 characters long and are thus silently truncated by usingvarchar.Please see "Arguments" list in documentation for SERVERPROPERTY.
Fixed data type used for
CONVERTin database collation query. It wasvarchar(50), which, while better than justvarchar, still has two problems: 1) the base datatype isnvarchar, and even though only standard ASCII characters are used, it's still best to match the base data type, and 2) while 50 is better than the default 30, this is still a problem given that as of SQL Server 2019, 48 collation names are over 50 characters long and are thus silently truncated by usingvarchar(50).Please see "Arguments" list in documentation for DATABASEPROPERTYEX.
Fixed the max number of code points for BMP and all of Unicode to be 65,536 and 1,114,112, respectively. Both were 1 less than that (in a few places), most likely using the max addressable code point value in each case, and not accounting for U+0000 (i.e. range vs quantity).
Removed "_140" from note regarding new collations automatically supporting supplementary characters. The starting SQL Server version is already mentioned, and making this version specific means one more place to update if/when a new series of collations is introduced (one more place that can be overlooked, leaving misleading documentation).
For consistency, removed "_" prefix from two remaining collation version number references that had them. The vast majority of collation version number references do not use that prefix, so now none of them do.
Under "GB18030 support", changed "100 level" to be "version 100" for consistency.
Under "Japanese collations ...", added "BIN and BIN2" for accuracy.
Under "Japanese collations ...", updated query to list new collations: 1) actual column names are not capitalized in the DB, and 2) since the Japanese collations were the only collations added in SQL Server 2017, using
COLLATIONPROPERTY(name, 'Version')is a more deterministic method of filtering as it doesn't rely on string parsing yet is logically equivalent.If not using
COLLATIONPROPERTY, then the following is preferred:Added "UCS-2" to keywords (meta data) as it was the only one missing from those combinations.
Take care,
Solomon...
https://SqlQuantumLift.com/
https://SqlQuantumLeap.com/
https://SQLsharp.com/