Long numeric HTML Entities in templates are compiled wrong #51274

@delbertooo

Which @angular/* package(s) are the source of the bug?

Don't know / other

Is this a regression?

No

Description

In HTML you can use numeric HTML entities (hex or decimal) to represent characters (e.g. emojis and other symbols). E.g. &#65; (or &#x41; in hex) for the capital letter "A".

When Angular compiles a template containing a numeric HTML entity that is 5 hexadecimal digits long (e.g. &#x1F6C8;), the resulting template contains the wrong Unicode character at that position.

E.g. to display the "Circled Information Source" glyph you can use the HTML code &#x1F6C8; or &#128712;. If you put one of these entities into your template, the rendered glyph is a different Unicode character. In UTF-8 encoding the template should contain (hex) F0 9F 9B 88 but contains (hex) EF 9B 88 instead. EF 9B 88 is the UTF-8 encoding of &#xF6C8;, i.e. the entity cropped to 2 bytes (missing the leftmost 1).
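The byte sequences can be checked by hand. The following is a minimal sketch, not part of the reproduction: utf8Bytes and toHex are hypothetical helper names, and the encoder is hand-rolled so the snippet has no dependencies (it does not reject surrogate code points).

```typescript
// Encode one Unicode code point as UTF-8 bytes.
// Hypothetical helper for illustration; no validation of surrogates.
function utf8Bytes(codePoint: number): number[] {
  if (codePoint < 0x80) {
    return [codePoint];
  }
  if (codePoint < 0x800) {
    return [0xc0 | (codePoint >> 6), 0x80 | (codePoint & 0x3f)];
  }
  if (codePoint < 0x10000) {
    return [
      0xe0 | (codePoint >> 12),
      0x80 | ((codePoint >> 6) & 0x3f),
      0x80 | (codePoint & 0x3f),
    ];
  }
  return [
    0xf0 | (codePoint >> 18),
    0x80 | ((codePoint >> 12) & 0x3f),
    0x80 | ((codePoint >> 6) & 0x3f),
    0x80 | (codePoint & 0x3f),
  ];
}

// Format bytes as space-separated, two-digit uppercase hex.
function toHex(bytes: number[]): string {
  return bytes
    .map((b) => (b < 16 ? "0" : "") + b.toString(16).toUpperCase())
    .join(" ");
}

// The intended code point U+1F6C8 ...
const intendedBytes = toHex(utf8Bytes(0x1f6c8)); // "F0 9F 9B 88"
// ... and the same value cropped to its low 16 bits (U+F6C8).
const croppedBytes = toHex(utf8Bytes(0x1f6c8 & 0xffff)); // "EF 9B 88"

console.log(intendedBytes, "vs", croppedBytes);
```

The second value matches the bytes observed in the compiled template, which is consistent with the leading 1 being dropped somewhere during compilation.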

Please provide a link to a minimal reproduction of the bug

https://stackblitz.com/edit/stackblitz-starters-pntjeb?file=src%2Fmain.ts

Please provide the exception or error you saw

The wrong glyph is rendered for long HTML entities. The correct glyph is rendered if the Unicode character is written directly in the template's content.

Please provide the environment you discovered this bug in (run ng version)

Angular CLI: 16.1.7
Node: 18.17.0
Package Manager: npm 9.6.7
OS: darwin arm64

Angular: 16.1.8
... common, compiler, compiler-cli, core, platform-browser

Package                         Version
---------------------------------------------------------
@angular-devkit/architect       0.1601.7
@angular-devkit/build-angular   16.1.7
@angular-devkit/core            16.1.7
@angular-devkit/schematics      16.1.7
@angular/cli                    16.1.7
@schematics/angular             16.1.7
rxjs                            7.8.1
typescript                      5.1.6
zone.js                         0.13.1

Anything else?

What I think is wrong

Angular seems to translate these HTML entities to Unicode escape sequences in JS strings (e.g. HTML &#xFFFF; becomes JS "\uFFFF"). From my observations, my blind guess is that Angular simply translates them into old-style JS Unicode escapes. These old-style escapes support only 4 hexadecimal digits ("\u0000" to "\uFFFF"), so longer values simply get chopped off after the 4 rightmost hexadecimal digits.

I guess Angular would compile HTML &#x1FFFF; into JS "\uFFFF" (missing the leading 1).
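The suspected truncation can be reproduced in plain TypeScript. This is only a sketch of the hypothesis, not Angular's actual code: entityCodePoint, decodeTruncating and decodeFull are hypothetical names, and the assumption is that the decoding step behaves like String.fromCharCode, which keeps only the low 16 bits of its argument.

```typescript
// Extract the numeric value from an entity like "&#x1F6C8;" or "&#128712;".
// Hypothetical helper; accepts hex (with "x") and decimal forms only.
function entityCodePoint(entity: string): number {
  const match = /^&#(x?)([0-9a-f]+);$/i.exec(entity);
  if (!match) {
    throw new Error("not a numeric entity: " + entity);
  }
  return parseInt(match[2], match[1] ? 16 : 10);
}

// Assumed buggy behaviour: String.fromCharCode converts its argument
// to a 16-bit code unit, so 0x1F6C8 silently becomes 0xF6C8.
function decodeTruncating(entity: string): string {
  return String.fromCharCode(entityCodePoint(entity));
}

// Astral-safe variant: code points above U+FFFF are split into a
// UTF-16 surrogate pair instead of being truncated.
function decodeFull(entity: string): string {
  const codePoint = entityCodePoint(entity);
  if (codePoint <= 0xffff) {
    return String.fromCharCode(codePoint);
  }
  const offset = codePoint - 0x10000;
  return String.fromCharCode(
    0xd800 + (offset >> 10), // high surrogate
    0xdc00 + (offset & 0x3ff) // low surrogate
  );
}

console.log(decodeTruncating("&#x1F6C8;") === "\uF6C8"); // true
console.log(decodeFull("&#x1F6C8;") === "\uD83D\uDEC8"); // true
console.log(decodeFull("&#128712;") === "\uD83D\uDEC8"); // true
```

The built-in String.fromCodePoint does the same job as decodeFull; the manual version is shown here only to make the surrogate arithmetic visible.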

Why do I think so?

It just looks like it in the compiled templates of my example. See the comments in the compiled templates below.

Production build:

const compiled = {
    // ...
    template: function (n, r) {
        if (1 & n && (gt(0, "div")(1, "h3"), Ue(2, "HTML Entity "), gt(3, "code"), Ue(4, "&#x1F6C8;"), 
            ut(), Ue(5, " in Angular templates"), ut(), gt(6, "p"), Ue(7, "Writing it as HTML Entity:"),
            
            //                                           vvvvvvvv HTML Entity text in template
            ut(), gt(8, "p")(9, "span", null, 0), Ue(11, "\uf6c8"), ut(), Ue(12, " (hex in browser "), 
            
            
            gt(13, "code"), Ue(14), ut(), Ue(15, ")"), ut(), gt(16, "p"), 
            Ue(17, "should be equal to writing it as Unicode String in Template:"), ut(), 
            
            //                                       vvvvvvvvvvv Unicode text in template
            gt(18, "p")(19, "span", null, 1), Ue(21, "\u{1f6c8}"), ut(), Ue(22, " (hex in browser "), 
            
            
            gt(23, "code"), Ue(24), ut(), Ue(25, ")"), ut(), 
            Gc(26, zA, 5, 0, "p", 2), Gc(27, ZA, 4, 0, "ng-template", null, 3, Jm), gt(29, "p"), 
            Ue(30, "Hacky variant by setting HTML at runtime:"), ut(), gt(31, "p"), qc(32, "span", null, 4), 
            Ue(34, " (hex in browser "), gt(35, "code"), Ue(36), ut(), Ue(37, ")"), ut()()), 2 & n) {
            const o = function Jp(e) {
                return function zr(e, t) {
                    return e[t]
                }(function Qv() {
                    return W.lFrame.contextLView
                }(), he + e)
            }(28);
            fi(14), wi(r.hexOne), fi(10), wi(r.hexTwo), fi(2), 
                zc("ngIf", r.hexOne != r.hexTwo)("ngIfElse", o), fi(10), wi(r.hexThree)
        }
    },
    // ...
}

... or in development build:

const compiled = {
  // ...
  template: function AddOneButtonComponent_Template(rf, ctx) {
    if (rf & 1) {
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](0, "div")(1, "h3");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](2, "HTML Entity ");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](3, "code");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](4, "&#x1F6C8;");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](5, " in Angular templates");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](6, "p");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](7, "Writing it as HTML Entity:");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](8, "p")(9, "span", null, 0);

      //                                                       vvvvvvvv HTML Entity text in template
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](11, "\uF6C8");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](12, " (hex in browser ");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](13, "code");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](14);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](15, ")");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](16, "p");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](17, "should be equal to writing it as Unicode String in Template:");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](18, "p")(19, "span", null, 1);

      //                                                       vvvvvvvvvvvvvv Unicode text in template
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](21, "\uD83D\uDEC8");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](22, " (hex in browser ");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](23, "code");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](24);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](25, ")");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtemplate"](26, AddOneButtonComponent_p_26_Template, 5, 0, "p", 2);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtemplate"](27, AddOneButtonComponent_ng_template_27_Template, 4, 0, "ng-template", null, 3, _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtemplateRefExtractor"]);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](29, "p");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](30, "Hacky variant by setting HTML at runtime:");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](31, "p");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelement"](32, "span", null, 4);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](34, " (hex in browser ");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementStart"](35, "code");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](36);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]();
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtext"](37, ")");
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵelementEnd"]()();
    }
    if (rf & 2) {
      const _r3 = _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵreference"](28);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵadvance"](14);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtextInterpolate"](ctx.hexOne);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵadvance"](10);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtextInterpolate"](ctx.hexTwo);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵadvance"](2);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵproperty"]("ngIf", ctx.hexOne != ctx.hexTwo)("ngIfElse", _r3);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵadvance"](10);
      _angular_core__WEBPACK_IMPORTED_MODULE_2__["ɵɵtextInterpolate"](ctx.hexThree);
    }
  }
  // ...
}
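The two builds write the correct character differently ("\u{1f6c8}" in production, "\uD83D\uDEC8" in development), but both denote the same code point, while the entity variant is "\uF6C8" in both. A small sketch to confirm this, where codePointsOf is a hypothetical helper that recombines UTF-16 surrogate pairs by hand:

```typescript
// List the code points of a JS string as "U+XXXX" labels,
// recombining valid surrogate pairs into a single code point.
function codePointsOf(text: string): string[] {
  const result: string[] = [];
  for (let i = 0; i < text.length; i++) {
    const high = text.charCodeAt(i);
    let codePoint = high;
    if (high >= 0xd800 && high <= 0xdbff && i + 1 < text.length) {
      const low = text.charCodeAt(i + 1);
      if (low >= 0xdc00 && low <= 0xdfff) {
        codePoint = 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
        i++; // the low surrogate has been consumed
      }
    }
    result.push("U+" + codePoint.toString(16).toUpperCase());
  }
  return result;
}

// String emitted for the literal Unicode character (development build).
console.log(codePointsOf("\uD83D\uDEC8")); // [ 'U+1F6C8' ]
// String emitted for the HTML entity (both builds).
console.log(codePointsOf("\uF6C8")); // [ 'U+F6C8' ]
```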

Labels: P3, area: compiler, bug, compiler: parser, help wanted
