Describe the bug
After running for a while and/or snapshotting a certain number of URLs, the Chromium calls that produce the screenshot and DOM start failing. When I exec into the container and run the Chromium command again manually, it always works. But when doing "Pull" from the web UI, it always fails. It only starts working again if I stop the container, docker rm it, and then start it again.
It's impossible to see the exact Chromium error in the log when this happens, because every single Chromium call is prefixed by these error lines (and the ArchiveBox log only shows the first 5 lines):
find: ‘/root/.config/chromium/Crash Reports/pending/’: No such file or directory
[574:599:0714/201021.258704:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:603:0714/201021.766940:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:603:0714/201021.767034:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:599:0714/201021.770660:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:599:0714/201021.771017:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:599:0714/201021.771115:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:599:0714/201021.771276:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:574:0714/201021.783527:ERROR:chrome_browser_cloud_management_controller.cc(162)] Cloud management controller initialization aborted as CBCM is not enabled.
[574:599:0714/201021.800785:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:599:0714/201021.800842:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[609:609:0714/201021.834450:ERROR:angle_platform_impl.cc(43)] Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
ERR: Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
[609:609:0714/201021.834540:ERROR:gl_display.cc(520)] EGL Driver message (Critical) eglInitialize: Could not open the default X display.
[609:609:0714/201021.834551:ERROR:gl_display.cc(790)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[609:609:0714/201021.834564:ERROR:gl_display.cc(824)] Initialization of all EGL display types failed.
[609:609:0714/201021.834578:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[609:609:0714/201021.834653:ERROR:angle_platform_impl.cc(43)] Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
ERR: Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
[609:609:0714/201021.834676:ERROR:gl_display.cc(520)] EGL Driver message (Critical) eglInitialize: Could not open the default X display.
[609:609:0714/201021.834682:ERROR:gl_display.cc(790)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[609:609:0714/201021.834691:ERROR:gl_display.cc(824)] Initialization of all EGL display types failed.
[609:609:0714/201021.834698:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[609:609:0714/201021.836399:ERROR:viz_main_impl.cc(186)] Exiting GPU process due to errors during initialization
[574:574:0714/201021.876529:ERROR:object_proxy.cc(590)] Failed to call method: org.freedesktop.portal.Settings.Read: object_path= /org/freedesktop/portal/desktop: unknown error type:
[574:651:0714/201021.894184:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:651:0714/201021.894302:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:651:0714/201021.894554:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:651:0714/201021.894618:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:651:0714/201021.894702:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[650:650:0714/201021.930958:ERROR:angle_platform_impl.cc(43)] Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
ERR: Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
[650:650:0714/201021.930990:ERROR:gl_display.cc(520)] EGL Driver message (Critical) eglInitialize: Could not open the default X display.
[650:650:0714/201021.930994:ERROR:gl_display.cc(790)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[650:650:0714/201021.931003:ERROR:gl_display.cc(824)] Initialization of all EGL display types failed.
[650:650:0714/201021.931012:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[650:650:0714/201021.931059:ERROR:angle_platform_impl.cc(43)] Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
ERR: Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
[650:650:0714/201021.931071:ERROR:gl_display.cc(520)] EGL Driver message (Critical) eglInitialize: Could not open the default X display.
[650:650:0714/201021.931076:ERROR:gl_display.cc(790)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[650:650:0714/201021.931079:ERROR:gl_display.cc(824)] Initialization of all EGL display types failed.
[650:650:0714/201021.931084:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[650:650:0714/201021.932311:ERROR:viz_main_impl.cc(186)] Exiting GPU process due to errors during initialization
Despite these lines, the screenshot and DOM extraction still work when run manually. But they prevent me from seeing what's going on when Chromium does fail to produce the screenshot and DOM during the original run.
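To illustrate the kind of filtering that would make the real error visible, here is a minimal Python sketch. It is not part of ArchiveBox: `filter_chromium_noise` and `NOISE_PATTERNS` are hypothetical names, and the pattern list is an assumption built only from the messages quoted above, which also show up on runs that succeed.

```python
import re

# Startup messages quoted in this report that also appear on successful runs.
# This list is an assumption drawn only from the output above; adjust as needed.
NOISE_PATTERNS = [
    re.compile(r"ERROR:bus\.cc\(\d+\)\] Failed to connect to the bus"),
    re.compile(r"ERROR:(angle_platform_impl|gl_display|gl_ozone_egl|viz_main_impl)\.cc\(\d+\)\]"),
    re.compile(r"^ERR: Display\.cpp:\d+ \(initialize\)"),
    re.compile(r"ERROR:chrome_browser_cloud_management_controller\.cc\(\d+\)\]"),
    re.compile(r"ERROR:object_proxy\.cc\(\d+\)\] Failed to call method: org\.freedesktop\.portal"),
    re.compile(r"^find: .*Crash Reports/pending/.*No such file or directory"),
]


def filter_chromium_noise(stderr_text):
    """Return the non-empty stderr lines that match none of the known noise patterns."""
    kept = []
    for line in stderr_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if any(pattern.search(line) for pattern in NOISE_PATTERNS):
            continue  # skip messages that also occur on successful runs
        kept.append(line)
    return kept
```

Applying something like this before truncating the output to 5 lines would leave room for whatever message is specific to the failing run.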
Steps to reproduce
- Add a bunch of URLs
- After roughly 100 or more URLs, all screenshot and DOM saving starts to fail.
Screenshots or log output
When re-running "Pull" on the failed snapshots, it always fails again and produces this output:
archivebox | [▶] [2023-07-14 19:21:50] Starting archiving of 1 snapshots in index...
archivebox |
archivebox | [√] [2023-07-14 19:21:50] "Website Name and Title"
archivebox | https://domain.name.here/
archivebox | √ ./archive/1689326989.501584
archivebox | > screenshot
archivebox | Extractor failed:
archivebox | Failed to save screenshot
archivebox | [307222:307247:0714/192151.111727:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox | [307222:307250:0714/192151.131872:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox | [307222:307250:0714/192151.131911:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox | [307222:307247:0714/192151.138760:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
archivebox | [307222:307247:0714/192151.138842:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
archivebox | Run to see full output:
archivebox | cd /data/archive/1689326989.501584;
archivebox | /usr/bin/chromium --headless=new --no-sandbox --no-zygote --disable-dev-shm-usage --disable-software-rasterizer --run-all-compositor-stages-before-draw --hide-scrollbars --window-size=1440,2000 --autoplay-policy=no-user-gesture-required --no-first-run --use-fake-ui-for-media-stream --use-fake-device-for-media-stream --disable-sync "--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/605.1.15 ArchiveBox/{VERSION} (+https://github.com/ArchiveBox/ArchiveBox/)" --window-size=1440,2000 --screenshot https://domain.name.here/
archivebox |
archivebox | > dom
archivebox | Extractor failed:
archivebox | Failed to save DOM
archivebox | [307272:307298:0714/192151.313489:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox | [307272:307301:0714/192151.318865:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox | [307272:307301:0714/192151.318894:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox | [307272:307298:0714/192151.320545:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
archivebox | [307272:307298:0714/192151.320601:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
archivebox | Run to see full output:
archivebox | cd /data/archive/1689326989.501584;
archivebox | /usr/bin/chromium --headless=new --no-sandbox --no-zygote --disable-dev-shm-usage --disable-software-rasterizer --run-all-compositor-stages-before-draw --hide-scrollbars --window-size=1440,2000 --autoplay-policy=no-user-gesture-required --no-first-run --use-fake-ui-for-media-stream --use-fake-device-for-media-stream --disable-sync "--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/605.1.15 ArchiveBox/{VERSION} (+https://github.com/ArchiveBox/ArchiveBox/)" --window-size=1440,2000 --dump-dom https://domain.name.here/
archivebox |
archivebox | 192 files (5.4 MB) in 0:00:00s
archivebox |
archivebox | [√] [2023-07-14 19:21:51] Update of 1 pages complete (1.23 sec)
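Since the log above cuts the Chromium output off after 5 lines, one way to keep the complete stderr of a re-run is a small wrapper like the following Python sketch. `run_and_save_stderr` is a hypothetical helper, not an ArchiveBox function, and the 60-second timeout is an arbitrary assumption.

```python
import subprocess
from pathlib import Path


def run_and_save_stderr(cmd, cwd, log_path, timeout=60):
    """Run cmd in cwd, write its complete stderr to log_path, return the exit code.

    Raises subprocess.TimeoutExpired if the command runs longer than `timeout` seconds.
    """
    result = subprocess.run(
        cmd, cwd=cwd, capture_output=True, text=True, timeout=timeout
    )
    Path(log_path).write_text(result.stderr)
    return result.returncode


# Example call using a subset of the flags from the command printed in the log
# above (paths and URL are the placeholders from this report):
#
# run_and_save_stderr(
#     ["/usr/bin/chromium", "--headless=new", "--no-sandbox", "--screenshot",
#      "https://domain.name.here/"],
#     cwd="/data/archive/1689326989.501584",
#     log_path="/tmp/chromium-stderr.log",
# )
```

The saved file can then be compared between a failing "Pull" run and a manual run that succeeds.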
ArchiveBox version
find: '/.config/chromium/Crash Reports/pending/': No such file or directory
0.6.3
ArchiveBox v0.6.3 40ddd33 Cpython Linux Linux-6.1.0-10-amd64-x86_64-with-glibc2.31 x86_64
DEBUG=False IN_DOCKER=True IS_TTY=True TZ=UTC FS_ATOMIC=True FS_REMOTE=True FS_PERMS=644 0:0 SEARCH_BACKEND=ripgrep
[i] Dependency versions:
√ PYTHON_BINARY v3.11.4 valid /usr/local/bin/python3.11
√ SQLITE_BINARY v2.6.0 valid /usr/local/lib/python3.11/sqlite3/dbapi2.py
√ DJANGO_BINARY v3.1.14 valid /usr/local/lib/python3.11/site-packages/django/__init__.py
√ ARCHIVEBOX_BINARY v0.6.3 valid /usr/local/bin/archivebox
√ CURL_BINARY v7.74.0 valid /usr/bin/curl
√ WGET_BINARY v1.21 valid /usr/bin/wget
√ NODE_BINARY v18.16.1 valid /usr/bin/node
√ SINGLEFILE_BINARY v0.3.16 valid /node/node_modules/single-file/cli/single-file
√ READABILITY_BINARY v0.0.2 valid /node/node_modules/readability-extractor/readability-extractor
√ MERCURY_BINARY v1.0.0 valid /node/node_modules/@postlight/mercury-parser/cli.js
- GIT_BINARY - disabled /usr/bin/git
√ YOUTUBEDL_BINARY v2023.07.06 valid /usr/local/bin/yt-dlp
√ CHROME_BINARY v114.0.5735.198 valid /usr/bin/chromium
√ RIPGREP_BINARY v12.1.1 valid /usr/bin/rg
[i] Source-code locations:
√ PACKAGE_DIR 23 files valid /app/archivebox
√ TEMPLATES_DIR 3 files valid /app/archivebox/templates
- CUSTOM_TEMPLATES_DIR - disabled
[i] Secrets locations:
- CHROME_USER_DATA_DIR - disabled
- COOKIES_FILE - disabled
[i] Data locations:
√ OUTPUT_DIR 9 files @ valid /data
√ SOURCES_DIR 325 files valid ./sources
√ LOGS_DIR 2 files valid ./logs
√ ARCHIVE_DIR 1150 files valid ./archive
√ CONFIG_FILE 133.0 Bytes valid ./ArchiveBox.conf
√ SQL_INDEX 11.2 MB valid ./index.sqlite3