Skip to content

Enable core dump collection and analysis for system tests#5025

Merged
pwojcikdev merged 5 commits intonanocurrency:developfrom
pwojcikdev:system-tests-core-dumps
Feb 9, 2026
Merged

Enable core dump collection and analysis for system tests#5025
pwojcikdev merged 5 commits intonanocurrency:developfrom
pwojcikdev:system-tests-core-dumps

Conversation

@pwojcikdev
Copy link
Copy Markdown
Contributor

No description provided.

@gr0vity-dev-bot
Copy link
Copy Markdown

gr0vity-dev-bot commented Feb 6, 2026

Test Results for Commit c72bd18

Pull Request 5025: Results
Overall Status:

Test Case Results

  • 5n4pr_conf_10k_bintree: PASS (Duration: 118s)
  • 5n4pr_conf_10k_change: PASS (Duration: 178s)
  • 5n4pr_conf_change_dependant: PASS (Duration: 138s)
  • 5n4pr_conf_change_independant: PASS (Duration: 145s)
  • 5n4pr_conf_send_dependant: PASS (Duration: 147s)
  • 5n4pr_conf_send_independant: PASS (Duration: 126s)
  • 5n4pr_rocks_10k_bintree: PASS (Duration: 119s)
  • 5n4pr_rocks_10k_change: PASS (Duration: 151s)

Last updated: 2026-02-08 20:25:09 UTC

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR centralizes core dump enablement for CI test scripts and expands core dump collection/analysis to system tests to make post-failure debugging easier.

Changes:

  • Refactors core dump setup logic into a shared setup_core_dumps helper in ci/tests/common.sh.
  • Updates run-tests.sh and run-system-tests.sh to call setup_core_dumps.
  • Adds per-systest core dump analysis invocation in run-system-tests.sh.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
ci/tests/run-tests.sh Replaces inline core-dump configuration with a shared helper call.
ci/tests/run-system-tests.sh Enables core dumps for system tests and runs core dump analysis after each systest.
ci/tests/common.sh Introduces setup_core_dumps to consolidate OS-specific core dump configuration.
Comments suppressed due to low confidence (1)

ci/tests/run-system-tests.sh:52

  • The guard only checks that COREDUMP_DIR is non-empty, but setup_core_dumps may set it even if directory creation/core-pattern configuration failed. Consider also checking -d "$COREDUMP_DIR" (and maybe readability) before invoking show-core-dumps.sh to avoid noisy ls/analysis errors.
        overall_status=1
    else
        echo "::error::Systest failed: $name ($status)"
        overall_status=1
    fi

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +26 to +28
ulimit -c unlimited
echo "${DEFAULT_COREDUMP_DIR}/core-%e.%p" | sudo tee /proc/sys/kernel/core_pattern
export COREDUMP_DIR=${DEFAULT_COREDUMP_DIR}
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Linux this sets /proc/sys/kernel/core_pattern, which is a global kernel setting and can impact other processes on the runner. Consider saving the previous value and restoring it via a trap (or scoping core dumps to the workspace without changing core_pattern) so the job doesn’t leave the host in a modified state.

Copilot uses AI. Check for mistakes.
Comment on lines +22 to +27
# Ensure directory exists and is writable for core dumps
sudo mkdir -p "${DEFAULT_COREDUMP_DIR}"
sudo chmod a+w "${DEFAULT_COREDUMP_DIR}"
# Enable core dumps
ulimit -c unlimited
echo "${DEFAULT_COREDUMP_DIR}/core-%e.%p" | sudo tee /proc/sys/kernel/core_pattern
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setup_core_dumps unconditionally uses sudo and assumes it’s available and non-interactive. If sudo prompts for a password (or is missing), this can hang/fail CI; consider detecting root vs non-root and using sudo -n (fail fast) with a clear warning when core dumps can’t be enabled.

Copilot uses AI. Check for mistakes.
exit 1
;;
esac
setup_core_dumps
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setup_core_dumps does not set COREDUMP_DIR on Windows/unsupported OSes, but this script later tries to analyze core dumps on failures. Consider having setup_core_dumps return a status / set a flag, and gate core-dump analysis on that to avoid misleading "COREDUMP_DIR is not set" output when core dumps aren’t supported.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


# Show core dumps after each test
if [ -n "${COREDUMP_DIR-}" ]; then
"$(dirname "$BASH_SOURCE")/show-core-dumps.sh" "${NANO_NODE_EXE}"
Copy link

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In systests, multiple binaries can generate cores (e.g., nano_node and nano_rpc), but this always analyzes all core files using only ${NANO_NODE_EXE}. With core_pattern using %e, this can lead to incorrect backtraces (and show-core-dumps.sh also deletes cores after analysis), potentially losing useful dumps. Consider invoking analysis per executable (node + rpc) or updating show-core-dumps.sh to derive the executable from the core filename (%e) and map it to the correct binary path.

Suggested change
"$(dirname "$BASH_SOURCE")/show-core-dumps.sh" "${NANO_NODE_EXE}"
show_core_script="$(dirname "$BASH_SOURCE")/show-core-dumps.sh"
"$show_core_script" "${NANO_NODE_EXE}"
"$show_core_script" "${NANO_RPC_EXE}"

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

-e DEADLINE_SCALE_FACTOR \
nano-test:latest \
bash -c "cd build && ../ci/tests/run-${TEST_TYPE}-tests.sh"
bash -c "source ci/tests/setup-core-dumps.sh && cd build && ../ci/tests/run-${TEST_TYPE}-tests.sh"
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabling core dumps in the Docker test container is good, but core analysis on Linux currently relies on gdb in show-core-dumps.sh. The Docker test image (docker/tests/Dockerfile-tests) installs sudo jq netcat-openbsd but not gdb, so core dump analysis is likely to fail inside Docker. Consider installing gdb in the Docker test image or updating the analyzer to fall back to lldb when gdb isn’t available.

Suggested change
bash -c "source ci/tests/setup-core-dumps.sh && cd build && ../ci/tests/run-${TEST_TYPE}-tests.sh"
bash -c "if ! command -v gdb >/dev/null 2>&1; then sudo apt-get update && sudo apt-get install -y gdb; fi; source ci/tests/setup-core-dumps.sh && cd build && ../ci/tests/run-${TEST_TYPE}-tests.sh"

Copilot uses AI. Check for mistakes.
Comment on lines 1 to +5
#!/bin/bash
set -uo pipefail

EXECUTABLE=${1-}

Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

show-core-dumps.sh deletes core files unconditionally after invoking the debugger. Because the script doesn’t use set -e, a failed gdb/lldb invocation (e.g., debugger missing, non-zero exit) will still remove the core dump, making post-mortem debugging impossible. Consider capturing the debugger exit status and only deleting the core dump on successful analysis (or gate deletion behind an explicit flag).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

shopt -s nullglob
core_dumps=("${COREDUMP_DIR}"/core*)
core_dumps=("${COREDUMP_DIR}"/core-${exe_name}.*)

Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

show-core-dumps.sh now only matches dumps named core-<exe_name>.*. On Linux the core pattern uses %e, which can be truncated (comm length) and may not match basename "$EXECUTABLE" for longer binary names, causing core dumps to be missed. Consider adding a fallback (e.g., also scanning core* when no matches are found) or switching the Linux core pattern to one that can be derived reliably (and adjusting the matcher accordingly).

Suggested change
# Fallback: if no matching core-<exe_name>.* files are found, look for any core* files.
if [ ${#core_dumps[@]} -eq 0 ]; then
core_dumps=("${COREDUMP_DIR}"/core*)
fi

Copilot uses AI. Check for mistakes.
Comment on lines +20 to +24
# Enable core dumps for this process
if [ -n "${COREDUMP_DIR-}" ]; then
ulimit -c unlimited
fi

Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script captures status=$? from timeout to report failures and continue running remaining systests. If errexit is enabled in the environment (e.g., inherited via SHELLOPTS after sourcing another script), a non-zero timeout exit can terminate the script before status handling and core-dump analysis. Consider explicitly disabling errexit around the timeout call (or making the exit-code capture resilient).

Copilot uses AI. Check for mistakes.
Comment on lines +14 to +17
# Enable core dumps for this process
if [ -n "${COREDUMP_DIR-}" ]; then
ulimit -c unlimited
fi
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script may be invoked from wrappers that enable set -e (e.g., run-core-tests.sh / run-rpc-tests.sh). If errexit is inherited via SHELLOPTS, a failing test binary can terminate the script before it can capture $? and run core-dump analysis. Consider explicitly disabling errexit around the test invocation (and restoring it if desired) so failure handling is reliable.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,52 @@
#!/bin/bash
set -euo pipefail
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setup-core-dumps.sh is sourced in run-docker-tests.sh. Because it runs set -euo pipefail, sourcing it will change shell options in the caller and may leak errexit via SHELLOPTS into subsequently executed test scripts, breaking logic that expects to capture non-zero exit codes and continue (e.g., to analyze core dumps). Consider avoiding set -e in a script that may be sourced, or saving/restoring shell options when sourced.

Suggested change
set -euo pipefail
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
set -euo pipefail
fi

Copilot uses AI. Check for mistakes.
@pwojcikdev pwojcikdev force-pushed the system-tests-core-dumps branch from ac25804 to c72bd18 Compare February 8, 2026 19:15
@pwojcikdev pwojcikdev merged commit 11dfcf5 into nanocurrency:develop Feb 9, 2026
27 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants