SORT: use strcoll() when appropriate. #1074

@antirez

If you search for the "change SORT order" thread on the Redis mailing list, you'll find that Redis up to version 2.6 uses binary comparison between strings for everything, except when you use BY and ALPHA at the same time in SORT. In that case strcoll() is used, but since no locale is ever set, the default "C" locale applies and the result is a binary comparison anyway.

So the net result is that Redis always compares strings byte by byte. That's bad, since SORT is designed to return results in a form that is ready to display to the user, so proper collation should be used.
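To make the point about the default locale concrete, here is a minimal sketch (not the Redis code) that checks whether strcoll() and strcmp() agree on the ordering of two strings. The helper names `collation_matches_binary` and `check_binary_order_in_c_locale` are made up for illustration. The C standard guarantees that a program starts in the "C" locale, so without a setlocale() call the two functions should order strings the same way.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Reduce a comparison result to -1, 0 or 1 so two comparators can be compared. */
static int sign(int v) { return (v > 0) - (v < 0); }

/* Hypothetical helper: returns 1 when strcoll() and strcmp() order a and b
 * the same way under the currently active locale. */
int collation_matches_binary(const char *a, const char *b) {
    return sign(strcoll(a, b)) == sign(strcmp(a, b));
}

/* Hypothetical self-check: under the "C" locale, collation is expected to be
 * plain byte order, so uppercase sorts before lowercase and the UTF-8 bytes
 * of "é" (0xC3 0xA9) sort after any ASCII letter. */
void check_binary_order_in_c_locale(void) {
    setlocale(LC_COLLATE, "C"); /* what a program gets when no locale is set */
    assert(collation_matches_binary("apple", "Zebra"));
    assert(collation_matches_binary("\xc3\xa9" "cole", "zebra"));
}
```

With a locale such as en_US.UTF-8 active, the same helper would be expected to return 0 for at least some of these pairs, which is exactly the difference a user-facing sort cares about.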

The no-brainer would be to use strcoll() everywhere in the context of the SORT command, and call:

setlocale(LC_COLLATE,""); // in order to activate the environment locale for strcoll().

However, SORT + STORE is a write command, and as such it should behave in a predictable way so that the AOF and the replication stream are consistent. There is no easy way to know whether a master and a slave are using the same collation, nor do I want to introduce a handshake for this, so the temporary solution for 2.8 could be the following:

  1. Use setlocale as specified above.
  2. Use strcoll() everywhere in the scope of SORT.
  3. However, if STORE is used, sort with strcmp() only, in a binary way.
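The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Redis implementation: `init_collation`, `sort_alpha`, and the `store` flag are hypothetical names, and the items are assumed to be plain NUL-terminated C strings.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Step 1 (hypothetical helper): activate the environment locale for
 * strcoll(). setlocale() returns NULL if the request cannot be honored,
 * in which case the previous locale stays in effect. */
const char *init_collation(void) {
    return setlocale(LC_COLLATE, "");
}

/* Binary comparison, independent of the locale. */
static int cmp_binary(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Locale-aware comparison, depends on the active LC_COLLATE. */
static int cmp_collate(const void *a, const void *b) {
    return strcoll(*(const char *const *)a, *(const char *const *)b);
}

/* Steps 2 and 3 (hypothetical helper): use strcoll() for display-only
 * sorts, but fall back to strcmp() when the result is going to be stored,
 * so that the written data does not depend on the host's locale. */
void sort_alpha(const char **items, size_t n, int store) {
    assert(items != NULL || n == 0);
    qsort(items, n, sizeof(items[0]), store ? cmp_binary : cmp_collate);
}
```

For example, calling `sort_alpha(items, n, 1)` on "b", "a", "C" yields "C", "a", "b" on any host, because the binary comparator ignores the locale. With `store` set to 0 the order depends on whatever locale `init_collation()` managed to activate.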

This at least means that for many uses (every time STORE is not involved) you can expect SORT + ALPHA to return results in the expected order.

Comments welcomed!