Using emojis with blog plugin causes crash #5555

@perpil

Context

Including emojis in blog content (in my case 💻) causes crashes during both serve and build.

Bug description

If you include an emoji character like 💻 anywhere in your blog content, the build crashes with:

lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2

Using emojis on main site pages works fine. I tried to work around it by using the pymdownx.emoji extension, but I need the emoji in a code fence, and the extension does not replace emojis inside code fences (likely by design).
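
For reference, 💻 is U+1F4BB, which lies outside the Basic Multilingual Plane. Whether that property is what trips the parser is an assumption on my part and is not confirmed by the trace, but a small check like the sketch below can list such characters in a post, including ones inside code fences. The helper name `find_astral_chars` is hypothetical and only uses the standard library.

```python
def find_astral_chars(text):
    """Return (line, column, char) for every code point above U+FFFF.

    Assumption: characters outside the Basic Multilingual Plane are the
    ones worth inspecting; the issue itself only confirms that 💻 crashes.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 0xFFFF:
                hits.append((line_no, col_no, ch))
    return hits


# A made-up post body with the emoji inside a code fence.
sample = "# Hello\n\n```\necho 💻\n```\n"
print(find_astral_chars(sample))  # → [(4, 6, '💻')]
```

Running this over the contents of docs/blog/posts/hello-world.md should point at the line that needs attention while the crash is unresolved.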

Full trace:

INFO     -  DeprecationWarning: pkg_resources is deprecated as an API
              File
            "/Users/david/Documents/GitHub/mkdocs-material/material/plugins/info/plugin.py",
            line 33, in <module>
                from pkg_resources import get_distribution, working_set
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 121, in <module>
                warnings.warn("pkg_resources is deprecated as an API",
            DeprecationWarning)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('google')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('google.logging')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('mpl_toolkits')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('ruamel')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  Building documentation...
INFO     -  Cleaning site directory
INFO     -  The following pages exist in the docs directory, but are not
            included in the "nav" configuration:
              - index.md
ERROR    -  Error reading page 'blog/posts/hello-world.md': Document is empty
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 59, in fromstring
    result = getattr(etree, meth)(context)
  File "src/lxml/etree.pyx", line 3257, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1916, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1796, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1085, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 618, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 728, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 657, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/bin/mkdocs", line 8, in <module>
    sys.exit(cli())
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/__main__.py", line 234, in serve_command
    serve.serve(dev_addr=dev_addr, livereload=livereload, watch=watch, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/serve.py", line 83, in serve
    builder(config)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/serve.py", line 76, in builder
    build(config, live_server=live_server, dirty=dirty)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/build.py", line 308, in build
    _populate_page(file.page, config, files, dirty)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/build.py", line 177, in _populate_page
    page.markdown = config.plugins.run_event(
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/plugins.py", line 520, in run_event
    result = method(item, **kwargs)
  File "/Users/david/Documents/GitHub/mkdocs-material/material/plugins/blog/plugin.py", line 357, in on_page_markdown
    read = readtime.of_markdown(markdown, rate)
  File "/opt/homebrew/lib/python3.10/site-packages/readtime/api.py", line 40, in of_markdown
    return utils.read_time(markdown, format='markdown', wpm=wpm)
  File "/opt/homebrew/lib/python3.10/site-packages/readtime/utils.py", line 48, in read_time
    el = pq(html)
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 212, in __init__
    elements = fromstring(context, self.parser)
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 63, in fromstring
    result = getattr(lxml.html, meth)(context)
  File "/opt/homebrew/lib/python3.10/site-packages/lxml/html/__init__.py", line 873, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/opt/homebrew/lib/python3.10/site-packages/lxml/html/__init__.py", line 761, in document_fromstring
    raise etree.ParserError(
lxml.etree.ParserError: Document is empty
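
The trace shows the failure happening while the blog plugin computes the reading time (`readtime.of_markdown`), inside the HTML parse done by pyquery and lxml. As a rough illustration of one possible direction, and not necessarily how the project would fix it, a reading-time estimate can be computed from the raw Markdown without any HTML parsing. The function name `estimate_read_minutes` and the default of 265 words per minute are illustrative assumptions.

```python
import math
import re


def estimate_read_minutes(markdown_text, words_per_minute=265):
    """Estimate reading time in whole minutes from raw Markdown text.

    This is a simplification: it counts word-like tokens with a regex
    and never builds an HTML tree, so emoji are simply skipped.
    """
    words = re.findall(r"\w+", markdown_text)
    # Round up, and report at least one minute for very short posts.
    return max(1, math.ceil(len(words) / words_per_minute))


print(estimate_read_minutes("Hello 💻 world"))  # → 1
```

The count is cruder than what a real HTML-based estimate gives (markup such as link URLs is counted as words), so treat it as a stopgap sketch only.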

Reproduction

example.zip

Steps to reproduce

  1. mkdocs build

Note that if you delete the file docs/blog/posts/hello-world.md and build again, the build succeeds. index.md also contains an emoji (💻) but does not trigger the crash.

Labels

bug (issue reports a bug), resolved (issue is resolved, yet unreleased if open)