Allow normalisation of white space #648

@pfumagalli

One of the features I found extremely useful in XSLT/XPath/XQuery was whitespace normalisation of a string: on top of trimming the string, any run of consecutive whitespace characters is replaced by a single space.

For example, the string hello\r\nworld!\t would be normalised to hello world!.
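
A minimal sketch of the requested behaviour, written in Python purely for illustration. The function name `normalize_whitespace` is hypothetical and is not part of any existing API discussed in this issue. It relies on `str.split()` with no arguments, which splits on runs of whitespace and discards leading and trailing whitespace.

```python
def normalize_whitespace(text: str) -> str:
    """Trim the string and collapse each run of whitespace into one space.

    Hypothetical helper for illustration only.
    """
    # split() with no separator splits on any run of whitespace and
    # drops empty strings at both ends, so joining with a single space
    # trims and collapses in one step.
    return " ".join(text.split())


print(repr(normalize_whitespace("hello\r\nworld!\t")))  # 'hello world!'
```

Splitting and re-joining avoids a separate trim step, since leading and trailing whitespace never produces an element in the first place.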

This is extremely useful in non-Latin languages (read: Japanese, I live in Tokyo), where a number of characters can be used to separate words. In Japanese, people normally type the Unicode character \u3000 (ideographic space), as that's what is entered by default when hitting "space" on the keyboard, but that's not necessarily what one wants to retain.

For example, I'd love for the string 山田　太郎 (the space used here is the normal space I get when using the Japanese keyboard; copy and paste it into hexdump -C on a UTF-8 console to check) to be converted into the more normal 山田 太郎 (with the space replaced by ASCII 0x20).
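
One detail worth noting: XPath's normalize-space() only treats the four XML whitespace characters (space, tab, CR, LF) as whitespace, so a literal port would leave \u3000 untouched. The sketch below, again in Python and with hypothetical function names, contrasts an XML-style normaliser with a Unicode-aware one. It assumes Python's `re` module, where `\s` in a `str` pattern matches Unicode whitespace.

```python
import re

# The four characters XML (and therefore XPath normalize-space) counts as whitespace.
XML_WHITESPACE = re.compile(r"[ \t\r\n]+")
# For str patterns, \s matches Unicode whitespace, which includes U+3000.
UNICODE_WHITESPACE = re.compile(r"\s+")


def normalize_xml_style(text: str) -> str:
    """Collapse only space, tab, CR and LF (hypothetical helper)."""
    return XML_WHITESPACE.sub(" ", text).strip(" \t\r\n")


def normalize_unicode(text: str) -> str:
    """Collapse any Unicode whitespace to ASCII 0x20 (hypothetical helper)."""
    return UNICODE_WHITESPACE.sub(" ", text).strip()


name = "山田\u3000太郎"  # ideographic space between the two words

print(normalize_xml_style(name) == name)         # True: U+3000 is left as-is
print(normalize_unicode(name) == "山田\u0020太郎")  # True: replaced with ASCII 0x20
```

Whether the Unicode-aware behaviour should be the default or an opt-in is a design choice this sketch does not try to settle.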

Labels: feature (New functionality or improvement)