Implement robust string normalization #577

@mvorisek

Description

Fields should support these normalization options:

  1. normalize newlines (\r\n and \r to \n; currently always done, with no option to turn it on or off)
  2. replace newlines with a space ' '
  3. replace all whitespace characters (\s in regex, except newlines) with a single space ' '
  4. remove all leading and trailing whitespace characters
  5. remove all trailing whitespace characters on each line of the string
  6. allow excluding tabs (\t) from the whitespace characters

Users should have full control over all of these options and their combinations through a single field property such as normalizeText.

For non-binary fields, all of these options except rules 2 and 6 should be on by default. In atk4/ui it makes sense to turn rules 2 and 6 on as well for all fields except textarea.
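
To make the proposal easier to discuss, here is a rough, language-neutral sketch in Python (not atk4 code) of how the six rules could be combined in one flag-style property. The names `Normalize`, `DEFAULT`, `SINGLE_LINE` and `normalize_text` are hypothetical, and two points are assumptions rather than settled behaviour: rule 3 is read as collapsing each run of whitespace into one space, and the rules are applied in the order 1, 2, 3, 5, 4.

```python
import re
from enum import Flag, auto


class Normalize(Flag):
    """Hypothetical flags, one per rule in the list above."""
    NONE = 0
    NEWLINES = auto()           # rule 1: \r\n and \r become \n
    NEWLINES_TO_SPACE = auto()  # rule 2: newlines become a space
    COLLAPSE_SPACES = auto()    # rule 3: whitespace runs (except newlines) become one space
    TRIM = auto()               # rule 4: strip leading/trailing whitespace of the whole value
    TRIM_LINE_ENDS = auto()     # rule 5: strip trailing whitespace on each line
    KEEP_TABS = auto()          # rule 6: do not treat \t as whitespace


# Proposed default for non-binary fields: everything except rules 2 and 6.
DEFAULT = (Normalize.NEWLINES | Normalize.COLLAPSE_SPACES
           | Normalize.TRIM | Normalize.TRIM_LINE_ENDS)

# Proposed UI setting for fields other than textarea: rules 2 and 6 on as well.
SINGLE_LINE = DEFAULT | Normalize.NEWLINES_TO_SPACE | Normalize.KEEP_TABS


def normalize_text(value: str, flags: Normalize = DEFAULT) -> str:
    keep_tabs = Normalize.KEEP_TABS in flags
    # Whitespace excluding newline characters (and tabs, if rule 6 is on).
    inline_ws = r"[^\S\r\n\t]" if keep_tabs else r"[^\S\r\n]"
    # Whitespace including newlines, used for trimming the whole value.
    any_ws = r"[^\S\t]" if keep_tabs else r"\s"

    if Normalize.NEWLINES in flags:
        value = re.sub(r"\r\n?", "\n", value)
    if Normalize.NEWLINES_TO_SPACE in flags:
        value = re.sub(r"\r\n?|\n", " ", value)
    if Normalize.COLLAPSE_SPACES in flags:
        value = re.sub(inline_ws + "+", " ", value)
    if Normalize.TRIM_LINE_ENDS in flags:
        value = re.sub(inline_ws + r"+(?=[\r\n]|\Z)", "", value)
    if Normalize.TRIM in flags:
        value = re.sub(rf"\A{any_ws}+|{any_ws}+\Z", "", value)
    return value


raw = "  a \t b  \r\n c \r d  "
print(repr(normalize_text(raw)))               # default rules
print(repr(normalize_text(raw, SINGLE_LINE)))  # rules 2 and 6 added
print(repr(normalize_text(raw, Normalize.NONE)))  # value left untouched
```

A bit-flag property keeps every combination expressible in a single setting, which is the main point of the request; whether the real property would be an integer mask, an array of options, or something else is left open here.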

Feedback and help with the implementation are welcome.
