Closed
Labels
size: easy
status: wip (Work is in-progress / has already been partially completed)
Description
I'm encountering the same problem that user @jrruethe described some time ago. It appeared to be solved, but for some reason it recurred on my setup after I installed the latest update and ran the archivebox setup command.
- Ran archivebox update (several times; the crash reproduces)
- On a specific link it crashes with the following output:
[√] [2021-04-15 10:56:49] "The Long War on Objectivity | The New Republic"
https://newrepublic.com/article/158497/long-war-objectivity
√ ./archive/1617309812.979884
> readability
! Failed to archive link: Exception: Exception in archive_methods.save_readability(Link(url=https://newrepublic.com/article/158497/long-war-objectivity))
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/archivebox/extractors/__init__.py", line 114, in archive_link
    log_archive_method_finished(result)
  File "/usr/lib/python3/dist-packages/archivebox/logging_util.py", line 435, in log_archive_method_finished
    hints = hints if isinstance(hints, (list, tuple)) else hints.split('\n')
TypeError: a bytes-like object is required, not 'str'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/bin/archivebox", line 11, in <module>
    load_entry_point('archivebox==0.6.2', 'console_scripts', 'archivebox')()
  File "/usr/lib/python3/dist-packages/archivebox/cli/__init__.py", line 140, in main
    run_subcommand(
  File "/usr/lib/python3/dist-packages/archivebox/cli/__init__.py", line 80, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd) # type: ignore
  File "/usr/lib/python3/dist-packages/archivebox/cli/archivebox_update.py", line 119, in main
    update(
  File "/usr/lib/python3/dist-packages/archivebox/util.py", line 114, in typechecked_function
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/archivebox/main.py", line 783, in update
    archive_links(to_archive, overwrite=overwrite, **archive_kwargs)
  File "/usr/lib/python3/dist-packages/archivebox/util.py", line 114, in typechecked_function
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/archivebox/extractors/__init__.py", line 181, in archive_links
    archive_link(to_archive, overwrite=overwrite, methods=methods, out_dir=Path(link.link_dir))
  File "/usr/lib/python3/dist-packages/archivebox/util.py", line 114, in typechecked_function
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/archivebox/extractors/__init__.py", line 130, in archive_link
    raise Exception('Exception in archive_methods.save_{}(Link(url={}))'.format(
Exception: Exception in archive_methods.save_readability(Link(url=https://newrepublic.com/article/158497/long-war-objectivity))
ArchiveBox v0.6.2
Cpython Linux Linux-5.4.0-71-generic-x86_64-with-glibc2.29 x86_64
IN_DOCKER=False DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND_ENGINE=ripgrep
[i] Dependency versions:
√ ARCHIVEBOX_BINARY v0.6.2 valid /usr/bin/archivebox
√ PYTHON_BINARY v3.8.5 valid /usr/bin/python3.8
√ DJANGO_BINARY v2.2.12 valid /usr/lib/python3/dist-packages/django/bin/django-admin.py
√ CURL_BINARY v7.68.0 valid /usr/bin/curl
√ WGET_BINARY v1.20.3 valid /usr/bin/wget
√ NODE_BINARY v10.19.0 valid /usr/bin/node
√ SINGLEFILE_BINARY v0.3.16 valid ./node_modules/single-file/cli/single-file
√ READABILITY_BINARY v0.0.2 valid ./node_modules/readability-extractor/readability-extractor
√ MERCURY_BINARY v1.0.0 valid ./node_modules/@postlight/mercury-parser/cli.js
√ GIT_BINARY v2.25.1 valid /usr/bin/git
- YOUTUBEDL_BINARY - disabled /usr/bin/youtube-dl
√ CHROME_BINARY v89.0.4389.114 valid /usr/bin/chromium-browser
√ RIPGREP_BINARY v11.0.2 valid /usr/bin/rg
[i] Source-code locations:
√ PACKAGE_DIR 23 files valid /usr/lib/python3/dist-packages/archivebox
√ TEMPLATES_DIR 3 files valid /usr/lib/python3/dist-packages/archivebox/templates
- CUSTOM_TEMPLATES_DIR - disabled
[i] Secrets locations:
- CHROME_USER_DATA_DIR - disabled
- COOKIES_FILE - disabled
[i] Data locations:
√ OUTPUT_DIR 14 files valid /home/.../archivebox
√ SOURCES_DIR 27 files valid ./sources
√ LOGS_DIR 1 files valid ./logs
√ ARCHIVE_DIR 9024 files valid ./archive
√ CONFIG_FILE 291.0 Bytes valid ./ArchiveBox.conf
√ SQL_INDEX 105.7 MB valid ./index.sqlite3
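For anyone triaging this: the TypeError in the first traceback is what Python raises when split('\n') is called on a bytes value with a str separator, so hints was most likely bytes (for example, undecoded subprocess output) by the time it reached log_archive_method_finished. The snippet below is only a minimal sketch of that failure and one defensive way to handle it. The name normalize_hints and the UTF-8 decoding are assumptions made for illustration; they are not ArchiveBox's actual code or fix.

```python
from typing import List, Sequence, Union

Hint = Union[str, bytes]


def normalize_hints(hints: Union[Hint, Sequence[Hint]]) -> List[str]:
    """Hypothetical helper (not part of ArchiveBox): return hints as a list of str.

    Lists and tuples are kept item by item; a single str or bytes value is
    split on newlines. Any bytes are decoded first (UTF-8 is an assumption).
    """
    def to_text(value: Hint) -> str:
        if isinstance(value, bytes):
            return value.decode("utf-8", errors="replace")
        return value

    if isinstance(hints, (list, tuple)):
        return [to_text(item) for item in hints]
    return to_text(hints).split("\n")


# Reproduce the error from the traceback: bytes.split() with a str separator.
try:
    b"first hint\nsecond hint".split("\n")
    reproduced_message = ""
except TypeError as err:
    reproduced_message = str(err)

# The same input passes through the helper without raising.
safe_hints = normalize_hints(b"first hint\nsecond hint")
```

Decoding before splitting keeps the rest of the logging code working with str only, which is why the sketch normalizes at the boundary rather than switching the separator to b'\n'.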
srsudar