Error on Windows 10 when adding URL: UnicodeEncodeError: 'charmap' codec can't encode: character maps to <undefined> #678

@Leontking

[i] [2021-03-27 04:40:48] ArchiveBox v0.5.4: archivebox add https://youtube.com/
    > E:\ArchiveBox

[!] Warning: Missing 6 recommended dependencies
    ! WGET_BINARY: wget (unable to detect version)
    ! SINGLEFILE_BINARY: single-file (unable to detect version)
      Hint: npm install --prefix . "git+https://github.com/ArchiveBox/ArchiveBox.git"
            or archivebox config --set SAVE_SINGLEFILE=False to silence this warning

    ! READABILITY_BINARY: readability-extractor (unable to detect version)
      Hint: npm install --prefix . "git+https://github.com/ArchiveBox/ArchiveBox.git"
            or archivebox config --set SAVE_READABILITY=False to silence this warning

    ! MERCURY_BINARY: mercury-parser (unable to detect version)
      Hint: npm install --prefix . "git+https://github.com/ArchiveBox/ArchiveBox.git"
            or archivebox config --set SAVE_MERCURY=False to silence this warning

    ! CHROME_BINARY: unable to find binary (unable to detect version)
    ! RIPGREP_BINARY: rg (unable to detect version)

[+] [2021-03-27 04:40:52] Adding 1 links to index (crawl depth=0)...
    > Saved verbatim input to sources/E:\ArchiveBox\sources\1616820052-import.txt
    > Parsed 1 URLs from input (Plain Text)
    > Found 1 new URLs not already in index

[*] [2021-03-27 04:40:52] Writing 1 links to main index...
    √ E:\ArchiveBox\index.sqlite3

[▶] [2021-03-27 04:40:52] Starting archiving of 1 snapshots in index...
    ! Failed to archive link: UnicodeEncodeError: 'charmap' codec can't encode character '\u25be' in position 9443: character maps to <undefined>

Traceback (most recent call last):
  File "d:\python\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "d:\python\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Python\Scripts\archivebox.exe\__main__.py", line 7, in <module>
    from .cli import main
  File "d:\python\lib\site-packages\archivebox\cli\__init__.py", line 129, in main
    run_subcommand(
  File "d:\python\lib\site-packages\archivebox\cli\__init__.py", line 69, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "d:\python\lib\site-packages\archivebox\cli\archivebox_add.py", line 85, in main
    add(
  File "d:\python\lib\site-packages\archivebox\util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "d:\python\lib\site-packages\archivebox\main.py", line 592, in add
    archive_links(new_links, overwrite=False, **archive_kwargs)
  File "d:\python\lib\site-packages\archivebox\util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "d:\python\lib\site-packages\archivebox\extractors\__init__.py", line 173, in archive_links
    archive_link(to_archive, overwrite=overwrite, methods=methods, out_dir=Path(link.link_dir))
  File "d:\python\lib\site-packages\archivebox\util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "d:\python\lib\site-packages\archivebox\extractors\__init__.py", line 95, in archive_link
    write_link_details(link, out_dir=out_dir, skip_sql_index=False)
  File "d:\python\lib\site-packages\archivebox\util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "d:\python\lib\site-packages\archivebox\index\__init__.py", line 333, in write_link_details
    write_html_link_details(link, out_dir=out_dir)
  File "d:\python\lib\site-packages\archivebox\util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "d:\python\lib\site-packages\archivebox\index\html.py", line 79, in write_html_link_details
    atomic_write(str(Path(out_dir) / HTML_INDEX_FILENAME), rendered_html)
  File "d:\python\lib\site-packages\archivebox\util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "d:\python\lib\site-packages\archivebox\system.py", line 47, in atomic_write
    f.write(contents)
  File "d:\python\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u25be' in position 9443: character maps to <undefined>
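
The last two frames point at the likely cause: `atomic_write` calls `f.write(contents)` on a file that was apparently opened in text mode without an explicit encoding, so Python fell back to the Windows locale codec (cp1252 here), which has no mapping for '\u25be'. Below is a minimal sketch of the failure and one possible fix, assuming the file is opened without `encoding=`. The names `can_encode` and `atomic_write_utf8` are hypothetical and are not ArchiveBox's actual code.

```python
import os
import tempfile
from pathlib import Path


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in `text` maps to `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def atomic_write_utf8(path, contents: str) -> None:
    """Hypothetical stand-in for an atomic text write that pins UTF-8.

    Writes to a temporary file in the same directory, then renames it
    over the target, so readers never see a half-written file.
    """
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        # The explicit encoding is the important part: without it, Python
        # uses the locale's preferred encoding (cp1252 in the traceback).
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(contents)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temporary file if the write or rename failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


if __name__ == "__main__":
    sample = "menu \u25be"  # the character reported in the error
    print(can_encode(sample, "cp1252"))  # cp1252 cannot represent U+25BE
    print(can_encode(sample, "utf-8"))
```

As a workaround that needs no code change, setting the environment variable `PYTHONUTF8=1` before running `archivebox` enables Python's UTF-8 mode (Python 3.7+), which makes `open()` default to UTF-8 on Windows. Whether that is sufficient for every code path in ArchiveBox is untested here.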
