fix(parser): restore legacy SPECIAL_CHARS sanitization in SanitizeClassName (#3164)#3184
fix(parser): restore legacy SPECIAL_CHARS sanitization in SanitizeClassName (#3164)#3184
Conversation
pyRevit Release v5.2.0
Release 5.3.0
Small bump for 5.3 release
Release 5.3.0
Release 5.3.1 hotfix for 5.3.0: panelbutton
Removed redundant base and basepublic lists from pyRevitfile.
pyRevit 6.0.0 Release
6.1.0 Release
…ssName (#3164) The C# ExtensionParser.SanitizeClassName() replaced all non-alphanumeric chars with '_', losing information that the legacy Python cleanup_string() preserved (e.g. '-' -> 'MINUS', '.' -> 'DOT', '&' -> 'AND'). This caused UniqueId mismatches between the Python and C# loaders, breaking sessionmgr.execute_command() and Shift-Win-Click IDs for any extension directory containing special characters. Changes: - ExtensionParser.cs: Replaced char.IsLetterOrDigit loop with full SPECIAL_CHARS replacement table matching pyrevitlib/pyrevit/coreutils/__init__.py lines 295-344 - Includes rebuilt DLLs for both NETFX and NETCORE targets Tested on Revit 2024 (NETFX) and Revit 2025+ (NETCORE).
|
@devloai please review the commit files and report back, |
There was a problem hiding this comment.
PR Summary:
- Fixes #3164:
SanitizeClassName()inExtensionParser.cswas replacing all non-alphanumeric chars with_, diverging from Python'scleanup_string()which maps special chars to descriptive words (-→MINUS,.→DOT, etc.) - Restores parity between the C# and Python loaders for UniqueId generation
- Only
ExtensionParser.cschanged; rebuilt DLLs for both NETFX and NETCORE included
Review Summary:
The core fix is correct and addresses the reported regression — -, ., &, and other common special characters are now properly expanded to their word equivalents, matching the Python SPECIAL_CHARS table. However, there are two issues with the faithfulness of the port from the Python reference:
-
The Python
SPECIAL_CHARSdict uses raw-string keysr'\(',r'\)',r'\/'— these are 2-character sequences (backslash + char), not plain(,),/. Python'sstr.replace()is a plain-text operation, socleanup_stringdoes not strip bare parentheses or forward slashes. The C# implementation diverges here by stripping plain(,),/. -
The old implementation's
char.IsLetterOrDigitloop served as a catch-all safety net. The new table-driven approach silently passes through any character not in the explicit list (Unicode letters,', backtick, etc.), potentially producing invalid C# class name identifiers at runtime.
Follow-up suggestions:
@devloaifix the two identified issues: correct the(,),/replacement keys to match the Python 2-char sequences, and add a finalchar.IsLetterOrDigit || c == '_'safety pass after the table replacements.
|
Hey @tay0thman! I already reviewed this PR — see the full review I posted at the top. Here's the TL;DR: ✅ Core fix is correct — Two issues flagged:
Suggested fix for #2: // After all the .Replace() calls, add a safety pass:
var sb = new StringBuilder();
foreach (char c in result)
if (char.IsLetterOrDigit(c) || c == '_')
sb.Append(c);
return sb.ToString();This ensures any character not caught by the table is stripped rather than passed through. 🛡️ |
Replace plain (, ), and / with escaped sequences \(, \), and \/ to match
the Python SPECIAL_CHARS dict which uses raw strings r'\(', r'\)', and r'\/'
that only match the literal 2-character sequences with backslashes.
…ass names Adds final validation pass to SanitizeClassName to strip any remaining characters not valid in C# identifiers (non-ASCII, special chars not in SPECIAL_CHARS table). Prevents assembly-load failures from invalid class names like 'caféMINUStoolDOT'.
|
@devloai after the quick fixes, your job is to verify that the source code is now okay to merge. |
|
✅ Source code is good to merge. I've done a full re-verification of the current Issue 1 —
Issue 2 — Missing catch-all safety net: ✅ Fixed
Full table parity: ✅ Confirmed
The logic is correct and safe. 🚢 |
…y net
1. Parentheses and forward-slash now match Python 2-char raw-string sequences
r'\(' r'\)' r'\/' — bare ( ) / are no longer stripped (faithful port)
2. Added final char.IsLetterOrDigit safety pass after table replacements
to catch Unicode or unlisted chars that would produce invalid C# identifiers
|
@tay0thman you are on fire! |
@jmcouffin yes, tested on both netfx & netcore (2024 vs 2026), works, unless I broke something else. |
|
@tay0thman that's the fun way to generate IDs actually Good job, thanks for covering that issue! |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26077+2319-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26079+2333-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26080+1314-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2031-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2042-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2045-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2048-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2106-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26083+2130-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26086+2004-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26088+1318-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26089+1231-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26090+0549-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1533-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1536-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1540-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1540-wip |
|
📦 New work-in-progress (wip) builds are available for 6.1.0.26090+1556-wip |
|
📦 New public release are available for 6.2.0.26090+1754 |
Fixes #3164 — Unique ID generation changed in 6.0
Problem
The C#
ExtensionParser.SanitizeClassName()replaces all non-alphanumeric characters with a literal_:The legacy Python
coreutils.cleanup_string()replaces each special character with a descriptive word (-→MINUS,.→DOT,&→AND, etc.), withskip=['_']to preserve the separator.This causes
sessionmgr.execute_command()lookups and Shift-Win-Click IDs to fail for any extension directory containing special characters.Do-Somethingdominussomethingdo_somethingMy.Cool.Toolmydotcooldottoolmy_cool_toolSave&Loadsaveandloadsave_loadFix
Replaced the
char.IsLetterOrDigitloop inExtensionParser.SanitizeClassName()with the fullSPECIAL_CHARSreplacement table frompyrevitlib/pyrevit/coreutils/__init__.py(lines 295–344). Underscore is intentionally preserved as the separator (skip=['_']).Only
ExtensionParser.csis changed. The downstreamSanitizeClassNamecopies inCommandTypeGenerator,ButtonBuilderBase, andStackBuilderare unaffected — they receive an already-sanitized UniqueId containing only letters, digits, and_.Files changed
dev/pyRevitLoader/pyRevitExtensionParser/ExtensionParser.cs—SanitizeClassName()methodTesting
-in directory → ID containsMINUS.in directory → ID containsDOTsessionmgr.execute_command()with corrected IDs → executesnew_loader = False, reload → same UniqueIds from both loadersNote on 5.x backward compatibility
This fix restores parity between the Python and C# paths within 6.x. The separator change from
-to_(commit4cc43e07a) was intentional and is not reverted. Scripts hardcoding the old 5.x format (e.g.xxxunderstools-xxxunderstools-modify-mycommand) require updating to the 6.x format.Opened a fix in #XXXX —
SanitizeClassNamenow replicates the legacySPECIAL_CHARStable. Built DLLs for both NETFX and NETCORE included for testing.