
Fix job display name extraction: handle escaped quotes and identify correct Worker log on non-ephemeral runners#2089

Merged
rodrigo-roca merged 5 commits into DataDog:master from LudovicTOURMAN:fix/job-display-name-parsing
Feb 13, 2026

Conversation

@LudovicTOURMAN
Contributor

@LudovicTOURMAN LudovicTOURMAN commented Feb 3, 2026

Problems

This PR fixes two related issues (cf Issue) with GitHub Actions job display name extraction from Runner diagnostic logs:

Problem 1: Escaped Quotes and Special Characters ✅

The current regex for extracting job display names fails when job names contain escaped quotes or other special characters.

Failing Example:

  • Job name: End-to-End Tests (@org/backend, "features/a*", apps/backend)
  • Current result: End-to-End Tests (@org/backend, \ (truncated at first quote)

Root Cause:
In Worker diagnostic logs, job names are stored as JSON strings with proper escaping. The current regex /"jobDisplayName":\s*"([^"]+)"/ uses [^"]+ which stops at the first quote character, even when it's escaped as \".
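
The truncation can be reproduced in isolation. The snippet below is a minimal sketch, not code from the PR: the pattern is the one quoted above, while `jobName`, `logLine`, and `truncated` are illustrative names, and the log line is a hypothetical fragment built with JSON.stringify() to mimic the escaping described.

```typescript
// The pattern quoted above: [^"]+ stops at the first quote, escaped or not.
const oldPattern = /"jobDisplayName":\s*"([^"]+)"/

// Hypothetical Worker log fragment; JSON.stringify() escapes the inner quotes as \".
const jobName = 'End-to-End Tests (@org/backend, "features/a*", apps/backend)'
const logLine = '"jobDisplayName": ' + JSON.stringify(jobName)

// The capture ends at the backslash that precedes the first escaped quote.
const truncated = oldPattern.exec(logLine)?.[1]
console.log(truncated)
```

The captured value ends with a trailing backslash, which matches the "Current result" shown in the failing example.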

Problem 2: Wrong Worker Log on Non-Ephemeral Runners 🆕

On non-ephemeral (reusable) runners, multiple Worker_*.log files accumulate from different job executions. The current implementation iterates through logs and returns the first jobDisplayName found, which may belong to a previous job run rather than the current one.

Real-World Failing Scenario:
We encountered this issue in production where:

  • Runner had multiple logs from different job runs
  • Current job was build-amd64-base-rails-assets
  • But datadog-ci extracted Docker build / Docker build from an older log file
  • This caused incorrect job tagging in Datadog CI Visibility

Root Cause:
No filtering mechanism to identify which Worker log corresponds to the currently executing job.


Solutions

Solution 1: Proper Escape Handling ✅

  1. Updated regex: /"jobDisplayName":\s*"((?:[^"\\]|\\.)*)"/

    • Uses (?:[^"\\]|\\.)* to correctly match JSON-escaped strings
    • Handles all escape sequences: \", \\, \n, \t, \uXXXX, etc.
  2. Proper unescaping: Added unescapeJsonString() function using JSON.parse()

    • Correctly unescapes all JSON escape sequences
    • Handles edge cases like Unicode, backslashes, control characters
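
As a rough illustration of how the two pieces fit together, here is a minimal sketch. The regex is the one listed above; the body of `unescapeJsonString()` is not shown in this description, so the version below is an assumption (including its fallback to the raw capture when JSON.parse() throws), and `extractJobDisplayName` is a hypothetical wrapper name.

```typescript
// Updated pattern from the description: consume either a non-quote,
// non-backslash character or a backslash followed by any character.
const newPattern = /"jobDisplayName":\s*"((?:[^"\\]|\\.)*)"/

// Assumed shape of unescapeJsonString(): wrap the capture in quotes and let
// JSON.parse() decode every escape sequence. Returning the raw capture on
// failure is an assumption of this sketch, not confirmed by the PR.
function unescapeJsonString(raw: string): string {
  try {
    return JSON.parse('"' + raw + '"') as string
  } catch {
    return raw
  }
}

// Hypothetical wrapper: returns the decoded job name, or undefined if absent.
function extractJobDisplayName(logContent: string): string | undefined {
  const raw = newPattern.exec(logContent)?.[1]
  return raw === undefined ? undefined : unescapeJsonString(raw)
}
```

Delegating the unescaping to JSON.parse() avoids maintaining a hand-written table of escape sequences, at the cost of needing a decision about what to do when the captured text is not a valid JSON string body.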

Solution 2: Worker Log Identification 🆕

Use the ACTIONS_ORCHESTRATION_ID environment variable to identify the correct Worker log file:

  1. Read process.env.ACTIONS_ORCHESTRATION_ID (format: <guid>.<job-id>.__default)
  2. Extract the planId GUID (everything before first dot)
  3. Search Worker logs for one containing this GUID as "planId": "<guid>"
  4. Only extract jobDisplayName from the matched log
  5. Fall back to checking all logs if ACTIONS_ORCHESTRATION_ID unavailable or planId not found

Why this works:

  • ACTIONS_ORCHESTRATION_ID is available in the GitHub Actions environment (Runner v2.331.0+)
  • The planId GUID appears in Worker log JSON as "planId": "<guid>"
  • The planId is globally unique per job execution (type: Guid in Runner source code)
  • This ensures we always extract the job name from the correct log, even on non-ephemeral runners
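
The selection steps above can be sketched as a pure function. This is a hedged illustration under stated assumptions, not the merged implementation: `WorkerLog`, `readJobName`, and `findJobDisplayName` are hypothetical names, logs are passed in as already-read strings rather than read from disk, and the orchestration ID is a parameter rather than read from `process.env`. What happens when the matched log has no jobDisplayName is not specified in the description; this sketch simply returns undefined in that case.

```typescript
// Hypothetical in-memory representation of a Worker_*.log file.
interface WorkerLog {
  fileName: string
  content: string
}

const jobNamePattern = /"jobDisplayName":\s*"((?:[^"\\]|\\.)*)"/

function readJobName(content: string): string | undefined {
  const raw = jobNamePattern.exec(content)?.[1]
  if (raw === undefined) {
    return undefined
  }
  try {
    return JSON.parse('"' + raw + '"') as string
  } catch {
    return raw // assumption: keep the raw capture if it is not valid JSON
  }
}

function findJobDisplayName(logs: WorkerLog[], orchestrationId?: string): string | undefined {
  // Step 2: the planId GUID is everything before the first dot.
  const planId = orchestrationId ? orchestrationId.split('.')[0] : undefined
  if (planId) {
    // Step 3: look for "planId": "<guid>" (regex metacharacters escaped defensively).
    const escaped = planId.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
    const planIdPattern = new RegExp('"planId":\\s*"' + escaped + '"')
    for (const log of logs) {
      if (planIdPattern.test(log.content)) {
        // Step 4: only the matched log is consulted.
        return readJobName(log.content)
      }
    }
  }
  // Step 5: fallback, first job name found in any log (the previous behavior).
  for (const log of logs) {
    const found = readJobName(log.content)
    if (found !== undefined) {
      return found
    }
  }
  return undefined
}
```

With two logs on a reused runner, passing an orchestration ID whose GUID matches the second log returns that log's job name, while omitting the ID falls back to the first name found.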

Test Coverage

Existing Tests (Escape Sequences) ✅

Added comprehensive tests covering:

  • Job names with quotes (single and multiple)
  • Backslashes (Windows paths, regex patterns)
  • Emojis and Unicode (🚀, Chinese, Japanese, Arabic, Cyrillic)
  • Matrix jobs with various formats
  • Reusable workflows with slashes
  • Special characters and complex combinations

New Tests (Worker Log Identification) 🆕

Added 4 new test cases:

  1. Single log with ACTIONS_ORCHESTRATION_ID: Verifies correct planId matching
  2. Multiple logs on non-ephemeral runner: Ensures correct log selected among multiple Worker logs from different jobs
  3. Fallback without ACTIONS_ORCHESTRATION_ID: Falls back to checking all logs (backward compatibility)
  4. Fallback when planId not found: Falls back to checking all logs if planId doesn't match any log

Bonus Fix 🎁

Fixed test helpers to properly JSON-escape job names using JSON.stringify(), which resolved 8 pre-existing test failures.
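
To show why the helper change matters, here is a small sketch contrasting direct interpolation with JSON.stringify(). The helper names (`brokenMockLog`, `mockLog`, `isValidJson`) and the single-field log shape are hypothetical; the actual test helpers in the repository are not reproduced here.

```typescript
// Direct interpolation: a quote inside the job name ends the JSON string early.
function brokenMockLog(jobName: string): string {
  return '{"jobDisplayName": "' + jobName + '"}'
}

// JSON.stringify() emits the surrounding quotes and escapes the contents.
function mockLog(jobName: string): string {
  return '{"jobDisplayName": ' + JSON.stringify(jobName) + '}'
}

function isValidJson(text: string): boolean {
  try {
    JSON.parse(text)
    return true
  } catch {
    return false
  }
}

const quotedName = 'Deploy "staging"'
console.log(isValidJson(brokenMockLog(quotedName))) // false
console.log(isValidJson(mockLog(quotedName))) // true
```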

All 1012 tests now pass


Issues We Faced

During development and testing, we encountered several challenges:

  1. Test Failures Due to Invalid JSON: The test helper functions were directly interpolating job names into JSON templates without escaping, causing invalid JSON when job names contained quotes or special characters. Fixed by using JSON.stringify() in test helpers.

  2. Multiple File Reads: Initial test expectations were incorrect - the implementation reads logs multiple times (once while searching for planId, then again to extract job name). Adjusted test assertions to match actual behavior (3 reads instead of 2).

  3. Real Production Issue: On non-ephemeral runners with multiple Worker logs, datadog-ci was consistently extracting job names from old logs instead of the current job. This PR completely resolves this by using the unique planId from ACTIONS_ORCHESTRATION_ID.


Verification

Tested against real-world failing cases:

  1. ✅ Job names with escaped quotes now parse correctly
  2. ✅ Non-ephemeral runners now select the correct Worker log using planId matching
  3. ✅ Backward compatibility maintained: works without ACTIONS_ORCHESTRATION_ID
  4. ✅ All existing functionality preserved with fallback behavior

Compatibility

  • Requires: Runner v2.331.0+ for ACTIONS_ORCHESTRATION_ID environment variable
  • Fallback: Works with older Runner versions (checks all logs, as before)
  • No breaking changes: Fully backward compatible

This PR fixes critical issues that were causing incorrect job identification in Datadog CI Visibility, especially on non-ephemeral runners commonly used in production environments.

@LudovicTOURMAN LudovicTOURMAN requested a review from a team as a code owner February 3, 2026 23:40
@Drarig29 Drarig29 added the datadog-ci For PRs spanning multiple commands, and repo-wide changes label Feb 4, 2026
@LudovicTOURMAN LudovicTOURMAN changed the title Fix job display name parsing for names containing quotes and special characters Fix job display name extraction: handle escaped quotes and identify correct Worker log on non-ephemeral runners Feb 4, 2026
…runners

This PR addresses two critical issues in GitHub Actions job display name
extraction from Runner diagnostic logs:

1. Escaped Quotes: The regex now properly handles JSON-escaped strings
   containing quotes, backslashes, and other special characters using an
   improved pattern and proper JSON.parse() unescaping.

2. Non-Ephemeral Runners: Uses ACTIONS_ORCHESTRATION_ID environment variable
   to identify the correct Worker log file when multiple logs accumulate
   from different job executions, preventing extraction of job names from
   old logs.

The implementation includes comprehensive test coverage with 4 new test cases
for Worker log identification scenarios and fixes for test helpers to properly
escape job names in mock data. All 1012 tests now pass.

Changes are backward compatible with older Runner versions through graceful
fallback behavior when ACTIONS_ORCHESTRATION_ID is unavailable.
@Drarig29
Contributor

Drarig29 commented Feb 5, 2026

Hi @LudovicTOURMAN! Can you run yarn format?

@LudovicTOURMAN
Contributor Author

Oh yes, my bad.
This is fixed @Drarig29

Contributor

@rodrigo-roca rodrigo-roca left a comment

LGTM, just left a few minor comments

@LudovicTOURMAN
Contributor Author

Thanks for the feedback @rodrigo-roca
I've updated the code accordingly.

@rodrigo-roca rodrigo-roca merged commit fe48329 into DataDog:master Feb 13, 2026
27 checks passed
@LudovicTOURMAN LudovicTOURMAN deleted the fix/job-display-name-parsing branch February 13, 2026 13:10
@ava-silver ava-silver mentioned this pull request Feb 17, 2026

Labels

datadog-ci For PRs spanning multiple commands, and repo-wide changes


3 participants