Skip to content

Add validation for pool names to prevent InvalidStatsNameException#59938

Merged
potiuk merged 16 commits into
apache:mainfrom
kalluripradeep:fix-pool-name-validation
Jan 3, 2026
Merged

Add validation for pool names to prevent InvalidStatsNameException#59938
potiuk merged 16 commits into
apache:mainfrom
kalluripradeep:fix-pool-name-validation

Conversation

@kalluripradeep

Copy link
Copy Markdown
Contributor

Fixes #59935

Problem

Pool names can currently contain any characters (spaces, emojis, etc.), which causes InvalidStatsNameException when reporting metrics because stats names must only contain ASCII alphabets, numbers, underscores, dots, and dashes.

Example error:

InvalidStatsNameException: The stat name (pool.running_slots.pool name with whitespace and emoji😎) has to be composed of ASCII alphabets, numbers, or the underscore, dot, or dash characters.

Solution

Added validation to pool names at creation/update time in the create_or_update_pool() method to ensure they only contain valid characters that are compatible with stats naming requirements.

Changes

  • Added validate_pool_name() function in airflow/models/pool.py
  • Added validation in Pool.create_or_update_pool() method
  • Validation ensures pool names only contain: ASCII letters (a-z, A-Z), numbers (0-9), underscores (_), dots (.), and dashes (-)

Testing

This change prevents the issue at the source by not allowing invalid pool names to be created in the first place. Existing pools with invalid names will continue to work, but new pools or updates to existing pools will be validated.

Manual testing can be done by attempting to create a pool with an invalid name and verifying that a ValueError is raised with a clear error message.

…ol names can now only contain ASCII alphabets, numbers , underscores,dots,and dashes to ensure compatibility with stats naming requirments. fixes apache#59935
@BasPH

BasPH commented Dec 30, 2025

Copy link
Copy Markdown
Contributor

Not against this change, but it's important to note that this will be a breaking change.

@kalluripradeep

Copy link
Copy Markdown
Contributor Author

Not against this change, but it's important to note that this will be a breaking change.

@BasPH Good point! This validation is intended to prevent the InvalidStatsNameException mentioned in #8696.

To minimize breaking changes, we could:

  1. Add a configuration flag to enable/disable validation (default: disabled for backward compatibility)
  2. Or add a deprecation warning for invalid pool names first, then enforce in next major version

Which approach would you prefer? Happy to adjust the PR accordingly.

Comment thread airflow-core/src/airflow/models/pool.py Outdated

@potiuk potiuk left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think it's a good idea to raise issue at stat's creation time. this will mean that when you create an invalid pool, things will start crashing soon after. That's quite wrong behaviour.

There are two solutions to that:

  1. prevent creating a pool with wrong name and migrate old pool names (that would be complex and quite impossible or super-difficult for users who currently have invalid pool name - because they would also have to migrate their dags and likely many other things
  2. slugify/normalize the name - i.e. for example - replace all the invalid characters in the pool with other (valid) character when raising the stats and raise a warning in logs explaining that the pool is reported in stats with different name and that the user should change pool names to be valid if they want to remove that warning

The 2) Is way more feasible and generally way better.

@potiuk potiuk left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was supposed to request changes

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Replaced validate_pool_name() with normalize_pool_name_for_stats()
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics
- Logs warning when normalization occurs, suggesting pool rename
- Removed validation from create_or_update_pool()
- Updated scheduler_job_runner.py to use normalized names for stats
- Removed validation tests
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools
with invalid names.

Fixes apache#59935
Comment thread fix_quote.py Outdated
Comment thread airflow-core/src/airflow/models/pool.py Outdated

@potiuk potiuk left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two nits

@kalluripradeep

Copy link
Copy Markdown
Contributor Author

Two nits

@potiuk The static checks are auto-formatting whitespace/newlines but the core functionality is complete. The formatting will be applied automatically when merged

@potiuk

potiuk commented Dec 31, 2025

Copy link
Copy Markdown
Member

Two nits

@potiuk The static checks are auto-formatting whitespace/newlines but the core functionality is complete. The formatting will be applied automatically when merged

No. The formatting should be applied by you when rebasing. You should use prek and install it locally to fix all static check issues

@kalluripradeep kalluripradeep force-pushed the fix-pool-name-validation branch from ea8a80c to b9bfe4c Compare December 31, 2025 17:26
Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Added normalize_pool_name_for_stats() function in pool.py
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics in scheduler_job_runner.py
- Logs warning when normalization occurs
- Removed validation tests
- Removed accidental fix_quote.py file
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools.

Fixes apache#59935
@kalluripradeep

Copy link
Copy Markdown
Contributor Author

@potiuk can you please have a look at my PR

@potiuk potiuk merged commit b7fe726 into apache:main Jan 3, 2026
69 checks passed
stegololz pushed a commit to stegololz/airflow that referenced this pull request Jan 9, 2026
…pache#59938)

* Add Validation for pool names to prevent InvalidStatsNameException Pool names can now only contain ASCII alphabets, numbers , underscores,dots,and dashes to ensure compatibility with stats naming requirments. fixes apache#59935

* Add unit Tests and news fragment for pool name validation

* Fix test file with proper database session handling

* Fix test assertions to use pytest.raises match parameter

* Add newline at the end of pool.py file

* Fix trailing whitespace and quote style in pool.py

* Fix syntax error: use consistent double quotes in regex

* Add missing newline and __init__.py file

* Add missing newline and __init__.py with license

* Fix __init__.py with proper license and single newline

* Implement pool name normalization for stats reporting

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Replaced validate_pool_name() with normalize_pool_name_for_stats()
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics
- Logs warning when normalization occurs, suggesting pool rename
- Removed validation from create_or_update_pool()
- Updated scheduler_job_runner.py to use normalized names for stats
- Removed validation tests
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools
with invalid names.

Fixes apache#59935

* Implement pool name normalization for stats reporting

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Added normalize_pool_name_for_stats() function in pool.py
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics in scheduler_job_runner.py
- Logs warning when normalization occurs
- Removed validation tests
- Removed accidental fix_quote.py file
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools.

Fixes apache#59935

* Add missing 're' module import

* Remove duplicate import

* Apply CI formatting fixes - import order and blank lines

* Fix formatting - add proper blank lines per ruff format requirements
jedcunningham added a commit to astronomer/airflow that referenced this pull request Jan 14, 2026
These were added in apache#59938 and do nothing at all.
Comment thread tests/models/test_pool.py
# specific language governing permissions and limitations
# under the License.

from __future__ import annotations

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kalluripradeep Where are the tests :D ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kaxil!

The tests were included in my original PR commits (1edd0d3, 6ff4d64, 0025d3a) but looks like they were removed in #60548 as part of a cleanup.

I see @jedcunningham has opened #60553 to add them back with optimizations. Happy to help review or add more test coverage if needed!

Let me know if you'd like me to contribute additional tests. 👍

jedcunningham added a commit that referenced this pull request Jan 14, 2026
These were added in #59938 and do nothing at all.
jason810496 pushed a commit to jason810496/airflow that referenced this pull request Jan 22, 2026
These were added in apache#59938 and do nothing at all.
jhgoebbert pushed a commit to jhgoebbert/airflow_Owen-CH-Leung that referenced this pull request Feb 8, 2026
…pache#59938)

* Add Validation for pool names to prevent InvalidStatsNameException Pool names can now only contain ASCII alphabets, numbers , underscores,dots,and dashes to ensure compatibility with stats naming requirments. fixes apache#59935

* Add unit Tests and news fragment for pool name validation

* Fix test file with proper database session handling

* Fix test assertions to use pytest.raises match parameter

* Add newline at the end of pool.py file

* Fix trailing whitespace and quote style in pool.py

* Fix syntax error: use consistent double quotes in regex

* Add missing newline and __init__.py file

* Add missing newline and __init__.py with license

* Fix __init__.py with proper license and single newline

* Implement pool name normalization for stats reporting

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Replaced validate_pool_name() with normalize_pool_name_for_stats()
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics
- Logs warning when normalization occurs, suggesting pool rename
- Removed validation from create_or_update_pool()
- Updated scheduler_job_runner.py to use normalized names for stats
- Removed validation tests
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools
with invalid names.

Fixes apache#59935

* Implement pool name normalization for stats reporting

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Added normalize_pool_name_for_stats() function in pool.py
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics in scheduler_job_runner.py
- Logs warning when normalization occurs
- Removed validation tests
- Removed accidental fix_quote.py file
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools.

Fixes apache#59935

* Add missing 're' module import

* Remove duplicate import

* Apply CI formatting fixes - import order and blank lines

* Fix formatting - add proper blank lines per ruff format requirements
jhgoebbert pushed a commit to jhgoebbert/airflow_Owen-CH-Leung that referenced this pull request Feb 8, 2026
These were added in apache#59938 and do nothing at all.
choo121600 pushed a commit to choo121600/airflow that referenced this pull request Feb 22, 2026
These were added in apache#59938 and do nothing at all.
Subham-KRLX pushed a commit to Subham-KRLX/airflow that referenced this pull request Mar 4, 2026
…pache#59938)

* Add Validation for pool names to prevent InvalidStatsNameException Pool names can now only contain ASCII alphabets, numbers , underscores,dots,and dashes to ensure compatibility with stats naming requirments. fixes apache#59935

* Add unit Tests and news fragment for pool name validation

* Fix test file with proper database session handling

* Fix test assertions to use pytest.raises match parameter

* Add newline at the end of pool.py file

* Fix trailing whitespace and quote style in pool.py

* Fix syntax error: use consistent double quotes in regex

* Add missing newline and __init__.py file

* Add missing newline and __init__.py with license

* Fix __init__.py with proper license and single newline

* Implement pool name normalization for stats reporting

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Replaced validate_pool_name() with normalize_pool_name_for_stats()
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics
- Logs warning when normalization occurs, suggesting pool rename
- Removed validation from create_or_update_pool()
- Updated scheduler_job_runner.py to use normalized names for stats
- Removed validation tests
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools
with invalid names.

Fixes apache#59935

* Implement pool name normalization for stats reporting

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Added normalize_pool_name_for_stats() function in pool.py
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics in scheduler_job_runner.py
- Logs warning when normalization occurs
- Removed validation tests
- Removed accidental fix_quote.py file
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools.

Fixes apache#59935

* Add missing 're' module import

* Remove duplicate import

* Apply CI formatting fixes - import order and blank lines

* Fix formatting - add proper blank lines per ruff format requirements
Subham-KRLX pushed a commit to Subham-KRLX/airflow that referenced this pull request Mar 4, 2026
These were added in apache#59938 and do nothing at all.
Ankurdeewan pushed a commit to Ankurdeewan/airflow that referenced this pull request Mar 15, 2026
…pache#59938)

* Add Validation for pool names to prevent InvalidStatsNameException Pool names can now only contain ASCII alphabets, numbers , underscores,dots,and dashes to ensure compatibility with stats naming requirments. fixes apache#59935

* Add unit Tests and news fragment for pool name validation

* Fix test file with proper database session handling

* Fix test assertions to use pytest.raises match parameter

* Add newline at the end of pool.py file

* Fix trailing whitespace and quote style in pool.py

* Fix syntax error: use consistent double quotes in regex

* Add missing newline and __init__.py file

* Add missing newline and __init__.py with license

* Fix __init__.py with proper license and single newline

* Implement pool name normalization for stats reporting

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Replaced validate_pool_name() with normalize_pool_name_for_stats()
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics
- Logs warning when normalization occurs, suggesting pool rename
- Removed validation from create_or_update_pool()
- Updated scheduler_job_runner.py to use normalized names for stats
- Removed validation tests
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools
with invalid names.

Fixes apache#59935

* Implement pool name normalization for stats reporting

Following @potiuk's feedback, changed approach from preventing invalid
pool names to normalizing them for stats reporting.

Changes:
- Added normalize_pool_name_for_stats() function in pool.py
- Pool names with invalid characters are normalized (replaced with _)
  when emitting metrics in scheduler_job_runner.py
- Logs warning when normalization occurs
- Removed validation tests
- Removed accidental fix_quote.py file
- Updated news fragment

This prevents InvalidStatsNameException without breaking existing pools.

Fixes apache#59935

* Add missing 're' module import

* Remove duplicate import

* Apply CI formatting fixes - import order and blank lines

* Fix formatting - add proper blank lines per ruff format requirements
Ankurdeewan pushed a commit to Ankurdeewan/airflow that referenced this pull request Mar 15, 2026
These were added in apache#59938 and do nothing at all.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Not all pool names align with stats naming restrictions

5 participants