Skip to content

Separate functions to trim whitespace and normalize domain name#1316

Merged
matsduf merged 3 commits into
zonemaster:developfrom
mattias-p:normalization
Jun 2, 2025
Merged

Separate functions to trim whitespace and normalize domain name#1316
matsduf merged 3 commits into
zonemaster:developfrom
mattias-p:normalization

Conversation

@mattias-p

@mattias-p mattias-p commented Jan 8, 2024

Copy link
Copy Markdown
Member

Purpose

This PR separates the normalization of what's logically part of domain names from the white space stripping that is also specified in RequirementsAndNormalizationOfDomainNames.md to make it possible to apply them individually.

Context

This PR fixes item 3 in zonemaster/zonemaster-cli#364.

zonemaster/zonemaster-backend#1143 should be merged together with this one.

This is technically a major change because normalize_name() is part of the public API. However, even though we've included an implementation in prior releases, we've never actually called it (except from unit tests).

Changes

  • This PR splits out trim_space() from normalize_name().
  • As a drive-by change it also improves the quality of the associated unit test.

How to test this PR

Zonemaster CLI should present the following behaviors:

$ zonemaster-cli --show-testcase --no-ipv6 --test basic/basic01 'example .com'
Domain name has an ASCII label ("example ") with a character not permitted.

$ zonemaster-cli --show-testcase --no-ipv6 --test basic/basic01 'example '
Domain name has an ASCII label ("example ") with a character not permitted.

$ zonemaster-cli --show-testcase --no-ipv6 --test basic/basic01 ' '
Domain name has an ASCII label (" ") with a character not permitted.

$ zonemaster-cli --raw --level info --no-ipv6 --test basic01 zonemaster.fr --ns 'd.nic.fr. /194.0.9.1'
Invalid name in --ns argument:
    Domain name has an ASCII label (" ") with a character not permitted.

$ zonemaster-cli --raw --level info --no-ipv6 --test basic01 zonemaster.fr --ns 'd.nic.fr./ 194.0.9.1'
Invalid IP address in --ns argument:
    Invalid chars in IP  194.0.9.1

@mattias-p mattias-p added the V-Major Versioning: The change gives an update of major in version. label Jan 8, 2024
@mattias-p mattias-p changed the title Normalization Separate functions to trim whitespace and normalize domain name Jan 8, 2024
@mattias-p mattias-p added this to the v2023.2 milestone Jan 8, 2024
@matsduf matsduf modified the milestones: v2023.2, v2024.1 Jan 16, 2024
@mattias-p

Copy link
Copy Markdown
Member Author

@matsduf has expressed that he wants to update RequirementsAndNormalizationOfDomainNames.md before this is merged.

@matsduf matsduf left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If trimming of leading and trailing white space is removed from Engine, then the function must be in GUI or in Backend. Do we have the function in GUI?

Another option would be that Engine offers two alternatives, trimming of leading and trailing white space enabled and trimming disabled, respectively. That would make it easy for a new client to choose.

Also, we have zmb in Backend. Should that behave like GUI and trim? Or should that be like CLI and not trim? If zmb should trim (to make it more GUI like) then trimming should be done in Backend.

@matsduf matsduf modified the milestones: v2024.1, v2024.2 Jun 12, 2024
@mattias-p

Copy link
Copy Markdown
Member Author

@matsduf Regarding your comment above, as you remember, we discussed this topic at a work group meeting a while ago. I believe it was after your comment, but it seems we forgot to note our conclusions here. Instead of leaving (seemingly) unanswered questions hanging, it seems appropriate to summarize some clarifications along the lines of our discussion as I remember it.

  • The test case specifications should not specify the trimming of surrounding space as part of the definition of any of its data types (e.g. domain names). It is questionable if the test case specifications need to specify space trimming at all. But if they do, they should specify a common trimming algorithm that can be applied (or avoided) at the discretion of user interface implementations to conform whatever expectations apply to that particular kind of user interface.
  • GUI should trim spaces from values in any text field (e.g. domain name fields) before doing anything with them (e.g. before sending them to Backend). This is what you'd expect from a web application and we don't have any reasons to deviate.
  • RPCAPI should not trim space from any of its arguments. It exposes an application programming interface and clients are expected to say what they mean and not be sloppy in their requests.
  • Neither zmb nor zmtest should trim space from any of its arguments. You wouldn't expect a command line tool to do that.
  • Additionally, zmb should definitely not trim space from any of its arguments because we need it to be able to test how Backend responds to requests both with and without surrounding space in all request arguments.

@matsduf, @marc-vanderwal, @tgreenx, @MichaelTimbert do you guys agree with the above?

@matsduf

matsduf commented Jul 23, 2024

Copy link
Copy Markdown
Contributor

@mattias-p, please see zonemaster/zonemaster#1260.

@matsduf

matsduf commented Jul 23, 2024

Copy link
Copy Markdown
Contributor

@mattias-p, you do not trimming to be part of the normalization specification (you write "test case specification" but I think you mean the normalization document), but you want the normalization to happen in GUI. Where do you want that to be specified or do you prefer it to be unspecified.

I am not sure that I agree with your bullets, but I will not totally object. A requirement is that operations happening are documented and specified.

@mattias-p mattias-p modified the milestones: v2024.2, v2025.1 Nov 13, 2024
mattias-p added 3 commits May 27, 2025 08:25
Also, fix resulting compilation errors and make iteration order deterministic.
* Better diagnostics.
* More complete test for bad domain names.

@marc-vanderwal marc-vanderwal left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine; just one nit.

Comment thread t/normalization.t
use Test::More;
use Test::Exception;

#!perl

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.t files don’t need shebangs, do they?

@matsduf matsduf May 28, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not hurt, does it? It can help some editors to recognize it as a Perl script.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, I don't think it's required. I picked it up as a habit at some point, I don't remember exactly why. It seems at this point it's mostly a historical artifact. I noticed that it's present in examples in Intermediate Perl from 2012.

@matsduf

matsduf commented May 28, 2025

Copy link
Copy Markdown
Contributor

Before this PR is merged, zonemaster/zonemaster#1260 should be approved and merged.

@matsduf matsduf left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine given that zonemaster/zonemaster#1260 is approved and merged.

@matsduf

matsduf commented Jun 2, 2025

Copy link
Copy Markdown
Contributor

zonemaster/zonemaster#1260 has been merged. How is this PR related to zonemaster/zonemaster-backend#1143

@mattias-p

Copy link
Copy Markdown
Member Author

This is a breaking change, and zonemaster/zonemaster-backend#1143 depends on this change.

@matsduf matsduf merged commit 84ab5aa into zonemaster:develop Jun 2, 2025
3 checks passed
@mattias-p mattias-p deleted the normalization branch June 2, 2025 13:32
@tgreenx tgreenx added the S-ReleaseTested Status: The PR has been successfully tested in release testing label Jun 12, 2025
@tgreenx

tgreenx commented Jun 12, 2025

Copy link
Copy Markdown
Contributor

v2025.1 Release testing

Works as advertised in the "How to test" section.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-ReleaseTested Status: The PR has been successfully tested in release testing V-Major Versioning: The change gives an update of major in version.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improved handling of domain names in command line arguments

4 participants