Skip to content

fix(parser): restore legacy SPECIAL_CHARS sanitization in SanitizeClassName (#3164)#3184

Merged
jmcouffin merged 16 commits intodevelopfrom
fix/3164-unique-id-sanitize
Mar 18, 2026
Merged

fix(parser): restore legacy SPECIAL_CHARS sanitization in SanitizeClassName (#3164)#3184
jmcouffin merged 16 commits intodevelopfrom
fix/3164-unique-id-sanitize

Conversation

@tay0thman
Copy link
Copy Markdown
Contributor

Fixes #3164 — Unique ID generation changed in 6.0

Problem

The C# ExtensionParser.SanitizeClassName() replaces all non-alphanumeric characters with a literal _:

// BEFORE (wrong)
foreach (char c in name)
    sb.Append(char.IsLetterOrDigit(c) ? c : '_');

The legacy Python coreutils.cleanup_string() replaces each special character with a descriptive word (-MINUS, .DOT, &AND, etc.), with skip=['_'] to preserve the separator.

This causes sessionmgr.execute_command() lookups and Shift-Win-Click IDs to fail for any extension directory containing special characters.

Directory name Legacy Python ID C# loader ID (broken)
Do-Something dominussomething do_something
My.Cool.Tool mydotcooldottool my_cool_tool
Save&Load saveandload save_load

Fix

Replaced the char.IsLetterOrDigit loop in ExtensionParser.SanitizeClassName() with the full SPECIAL_CHARS replacement table from pyrevitlib/pyrevit/coreutils/__init__.py (lines 295–344). Underscore is intentionally preserved as the separator (skip=['_']).

Only ExtensionParser.cs is changed. The downstream SanitizeClassName copies in CommandTypeGenerator, ButtonBuilderBase, and StackBuilder are unaffected — they receive an already-sanitized UniqueId containing only letters, digits, and _.

Files changed

  • dev/pyRevitLoader/pyRevitExtensionParser/ExtensionParser.csSanitizeClassName() method
  • Rebuilt DLLs for both NETFX and NETCORE targets (4 engine folders)

Testing

  • Shift-Win-Click on buttons with - in directory → ID contains MINUS
  • Shift-Win-Click on buttons with . in directory → ID contains DOT
  • sessionmgr.execute_command() with corrected IDs → executes
  • Toggle new_loader = False, reload → same UniqueIds from both loaders
  • Extensions with only alphanumeric names → still load correctly
  • All pyRevitCore buttons functional (no regression)
  • Tested on Revit 2024 (NETFX)
  • Tested on Revit 2025+ (NETCORE)

Note on 5.x backward compatibility

This fix restores parity between the Python and C# paths within 6.x. The separator change from - to _ (commit 4cc43e07a) was intentional and is not reverted. Scripts hardcoding the old 5.x format (e.g. xxxunderstools-xxxunderstools-modify-mycommand) require updating to the 6.x format.


---

For the **labels**, add `Bug` if you have permission. Set the **base branch** to `master` (should be pre-selected).

Once you've created the PR, it's worth dropping a comment on the original issue #3164 linking to your PR so the reporter knows a fix is in flight. Something like:

Opened a fix in #XXXX — SanitizeClassName now replicates the legacy SPECIAL_CHARS table. Built DLLs for both NETFX and NETCORE included for testing.

jmcouffin and others added 11 commits June 30, 2025 15:54
Release 5.3.1 hotfix for 5.3.0: panelbutton
Removed redundant base and basepublic lists from pyRevitfile.
…ssName (#3164)

The C# ExtensionParser.SanitizeClassName() replaced all non-alphanumeric chars with '_', losing information that the legacy Python cleanup_string() preserved (e.g. '-' -> 'MINUS', '.' -> 'DOT', '&' -> 'AND'). This caused UniqueId mismatches between the Python and C# loaders, breaking sessionmgr.execute_command() and Shift-Win-Click IDs for any extension directory containing special characters.

Changes:

- ExtensionParser.cs: Replaced char.IsLetterOrDigit loop with full SPECIAL_CHARS replacement table matching pyrevitlib/pyrevit/coreutils/__init__.py lines 295-344

- Includes rebuilt DLLs for both NETFX and NETCORE targets

Tested on Revit 2024 (NETFX) and Revit 2025+ (NETCORE).
@tay0thman
Copy link
Copy Markdown
Contributor Author

@devloai please review the commit files and report back,

Copy link
Copy Markdown
Contributor

@devloai devloai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary:

  • Fixes #3164: SanitizeClassName() in ExtensionParser.cs was replacing all non-alphanumeric chars with _, diverging from Python's cleanup_string() which maps special chars to descriptive words (-MINUS, .DOT, etc.)
  • Restores parity between the C# and Python loaders for UniqueId generation
  • Only ExtensionParser.cs changed; rebuilt DLLs for both NETFX and NETCORE included

Review Summary:

The core fix is correct and addresses the reported regression — -, ., &, and other common special characters are now properly expanded to their word equivalents, matching the Python SPECIAL_CHARS table. However, there are two issues with the faithfulness of the port from the Python reference:

  1. The Python SPECIAL_CHARS dict uses raw-string keys r'\(', r'\)', r'\/' — these are 2-character sequences (backslash + char), not plain (, ), /. Python's str.replace() is a plain-text operation, so cleanup_string does not strip bare parentheses or forward slashes. The C# implementation diverges here by stripping plain (, ), /.

  2. The old implementation's char.IsLetterOrDigit loop served as a catch-all safety net. The new table-driven approach silently passes through any character not in the explicit list (Unicode letters, ', backtick, etc.), potentially producing invalid C# class name identifiers at runtime.

Follow-up suggestions:

  • @devloai fix the two identified issues: correct the (, ), / replacement keys to match the Python 2-char sequences, and add a final char.IsLetterOrDigit || c == '_' safety pass after the table replacements.

@devloai
Copy link
Copy Markdown
Contributor

devloai bot commented Mar 17, 2026

Hey @tay0thman! I already reviewed this PR — see the full review I posted at the top. Here's the TL;DR:

✅ Core fix is correct-MINUS, .DOT, &AND etc. match the Python SPECIAL_CHARS table.

Two issues flagged:

  1. (, ), / diverge from Python — Python's SPECIAL_CHARS uses raw-string keys r'\(', r'\)', r'\/' (2-char sequences: backslash + char). Python's str.replace() only replaces literal \(, not bare (. C# strips plain parens and /, which Python does NOT. (Low real-world risk since filenames rarely have \(, but it's not a faithful port)

  2. No catch-all fallback — The old char.IsLetterOrDigit loop was a safety net for any character not in the table (Unicode, backtick, ', etc.). The new table-driven approach passes those through silently, potentially generating invalid C# class names.

Suggested fix for #2:

// After all the .Replace() calls, add a safety pass:
var sb = new StringBuilder();
foreach (char c in result)
    if (char.IsLetterOrDigit(c) || c == '_')
        sb.Append(c);
return sb.ToString();

This ensures any character not caught by the table is stripped rather than passed through. 🛡️

devloai bot and others added 2 commits March 17, 2026 23:14
Replace plain (, ), and / with escaped sequences \(, \), and \/ to match
the Python SPECIAL_CHARS dict which uses raw strings r'\(', r'\)', and r'\/'
that only match the literal 2-character sequences with backslashes.
…ass names

Adds final validation pass to SanitizeClassName to strip any remaining
characters not valid in C# identifiers (non-ASCII, special chars not in
SPECIAL_CHARS table). Prevents assembly-load failures from invalid class
names like 'caféMINUStoolDOT'.
@tay0thman
Copy link
Copy Markdown
Contributor Author

@devloai after the quick fixes, your job is to verify that the source code is now okay to merge.

@devloai
Copy link
Copy Markdown
Contributor

devloai bot commented Mar 17, 2026

Source code is good to merge.

I've done a full re-verification of the current SanitizeClassName() against the Python SPECIAL_CHARS reference. Both issues from my original review have been addressed:

Issue 1 — (, ), / keys: ✅ Fixed

  • C# now uses "\\(", "\\)", "\\/" which match the literal 2-char sequences \(, \), \/ — exactly what Python's raw-string keys r'\(', r'\)', r'\/' represent. Bare (, ), / are NOT stripped, matching Python behavior.

Issue 2 — Missing catch-all safety net: ✅ Fixed

  • Final StringBuilder pass strips any char not in [A-Za-z0-9_], preventing invalid C# identifiers from edge-case inputs.

Full table parity: ✅ Confirmed

  • All 28 entries in SPECIAL_CHARS are faithfully replicated
  • _ is correctly preserved (equivalent to skip=['_'] in Python)

The logic is correct and safe. 🚢

…y net

1. Parentheses and forward-slash now match Python 2-char raw-string sequences

   r'\(' r'\)' r'\/' — bare ( ) / are no longer stripped (faithful port)

2. Added final char.IsLetterOrDigit safety pass after table replacements

   to catch Unicode or unlisted chars that would produce invalid C# identifiers
@jmcouffin
Copy link
Copy Markdown
Contributor

@tay0thman you are on fire!
Did you get the change to compile and test it locally?

@tay0thman
Copy link
Copy Markdown
Contributor Author

tay0thman commented Mar 17, 2026

Did you get the change to compile and test it locally?

@jmcouffin yes, tested on both netfx & netcore (2024 vs 2026), works, unless I broke something else.

@romangolev
Copy link
Copy Markdown
Member

@tay0thman that's the fun way to generate IDs actually

Good job, thanks for covering that issue!

@jmcouffin jmcouffin merged commit c65de4f into develop Mar 18, 2026
@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26077+2319-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26079+2333-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26080+1314-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2031-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2042-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2045-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2048-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2106-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2130-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26086+2004-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26088+1318-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26089+1231-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26090+0549-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1533-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1536-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1540-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1540-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1556-wip

@github-actions
Copy link
Copy Markdown
Contributor

📦 New public release are available for 6.2.0.26090+1754

@jmcouffin jmcouffin deleted the fix/3164-unique-id-sanitize branch April 1, 2026 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Unique ID generation changed in 6.0

4 participants