Skip to content

Enforce mandatory analysis commitment across all agentic workflows#880

Merged
pethers merged 12 commits intomainfrom
copilot/improve-analysis-workflows
Mar 31, 2026
Merged

Enforce mandatory analysis commitment across all agentic workflows#880
pethers merged 12 commits intomainfrom
copilot/improve-analysis-workflows

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 31, 2026

All 10 agentic workflows were deleting analysis artifacts (rm -rf analysis/) before PR creation, discarding valuable intelligence every run. Noop runs lost analysis entirely.

ai-driven-analysis-guide.md — Rule 5: ALWAYS Commit Analysis

New absolute rule mandating every workflow run commits analysis. Scenario table covers article generation, quiet periods, existing analysis, translation, and MCP failures. Anti-pattern gallery explicitly rejects the old cleanup pattern.

All 10 workflow .md files — three changes each

  • Replaced analysis deletion with run-scoped, file-type-specific cleanup that removes only raw MCP payload files within the current run's analysis/${TODAY}/<slug>/ directory, preserving per-file analysis markdown (*.analysis.md) and never touching historically committed data:
# Before (REJECTED)
rm -rf analysis/ 2>/dev/null || true
git checkout HEAD -- analysis/ 2>/dev/null || true

# After — scoped to current run directory, preserves all markdown analysis artifacts
RUN_ANALYSIS_DIR="analysis/${TODAY}/breaking"
if [ -d "$RUN_ANALYSIS_DIR" ]; then
  find "$RUN_ANALYSIS_DIR" -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
  find "$RUN_ANALYSIS_DIR" -type d -name "data" -empty -delete 2>/dev/null || true
fi

Each workflow uses its own slug for scoping: breaking, week-in-review, month-in-review, week-ahead, month-ahead, committee-reports, motions, propositions, ${ARTICLE_TYPE_SLUG} (article-generator), and translate (news-translate).

  • Added "IMPROVE EXISTING ANALYSIS" mandate — check for prior analysis in analysis/${TODAY}/${SLUG}/ and extend it rather than replacing
  • Updated quiet-period noop guidance — create analysis-only PRs instead of discarding work via safeoutputs___noop

news-translate.md — new Step 4c: Translation Analysis + baseline summary

Translation workflow previously performed zero analysis. Now requires coverage matrix, terminology consistency, and quality assessment. Scope verification updated to allow analysis/ directory. A baseline summary.md template is auto-generated at the analysis directory creation point so that even noop runs (no articles found, all translations already exist) produce an analysis artifact satisfying Rule 5. Cleanup is scoped to analysis/${ARTICLE_DATE}/translate/ specifically, enforcing folder isolation so translation cannot delete other workflows' data for the same date.

analysis/README.md — Rule 7

Cross-references the new guide Rule 5 in the existing critical rules section. Explicitly states that the data/ directory must not be deleted wholesale and all per-file analysis markdown must always be preserved and committed.

- Add Rule 5 "ALWAYS Commit Analysis — No Workflow Run Wasted" to ai-driven-analysis-guide.md
- Replace rm -rf analysis/ cleanup with data-only cleanup in all 10 workflows
- Add "IMPROVE EXISTING ANALYSIS" mandate to all article workflows
- Update noop handling in breaking/article-generator to create analysis-only PRs
- Add translation analysis step (Step 4c) to news-translate.md
- Update scope verification in translate to allow analysis/ directory changes
- Add Rule 7 to analysis/README.md cross-referencing the new mandate

Agent-Logs-Url: https://github.com/Hack23/euparliamentmonitor/sessions/2a3bceb3-4d06-4759-8e21-e0d736358248

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
@github-actions github-actions bot added documentation Documentation updates infrastructure CI/CD and build infrastructure github_actions GitHub Actions updates workflows GitHub workflows labels Mar 31, 2026
Copilot AI and others added 2 commits March 31, 2026 12:10
Replace safeoutputs___noop on quiet periods with analysis-only PR creation
in committee-reports, motions, propositions, and week-ahead workflows.

Agent-Logs-Url: https://github.com/Hack23/euparliamentmonitor/sessions/2a3bceb3-4d06-4759-8e21-e0d736358248

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copilot AI changed the title [WIP] Add improved analysis for agentic workflows Enforce mandatory analysis commitment across all agentic workflows Mar 31, 2026
Copilot AI requested a review from pethers March 31, 2026 12:14
@pethers pethers marked this pull request as ready for review March 31, 2026 12:14
@pethers pethers requested a review from Copilot March 31, 2026 12:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Enforces a new requirement that every agentic workflow run persists and commits analysis artifacts (including quiet/noop scenarios), and updates workflows to preserve analysis while cleaning only large raw data outputs.

Changes:

  • Added “ALWAYS Commit Analysis” rules to the analysis guide and analysis README, including scenario-based enforcement guidance.
  • Updated 10 workflow instruction files to prefer analysis-only PRs during quiet periods and to “improve existing analysis” rather than overwrite.
  • Added mandatory translation-quality analysis step to the translation workflow and expanded its scope verification to allow analysis/.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
analysis/README.md Adds Rule 7 mandating committed analysis artifacts and analysis-only PRs on quiet runs.
analysis/methodologies/ai-driven-analysis-guide.md Adds Rule 5 enforcement checklist + anti-patterns for always committing analysis.
.github/workflows/news-weekly-review.md Preserves analysis (stops deleting) and mandates improving existing week-in-review analysis.
.github/workflows/news-week-ahead.md Switches quiet guidance to analysis-only PRs and preserves analysis artifacts.
.github/workflows/news-propositions.md Switches quiet guidance to analysis-only PRs and preserves analysis artifacts.
.github/workflows/news-motions.md Switches quiet guidance to analysis-only PRs and preserves analysis artifacts.
.github/workflows/news-monthly-review.md Preserves analysis artifacts; mandates improving existing month-in-review analysis.
.github/workflows/news-month-ahead.md Preserves analysis artifacts; mandates improving existing month-ahead analysis.
.github/workflows/news-committee-reports.md Switches quiet guidance to analysis-only PRs and preserves analysis artifacts.
.github/workflows/news-breaking.md Replaces noop-on-quiet guidance with analysis-only PRs; preserves analysis artifacts.
.github/workflows/news-article-generator.md Updates breaking-news gate to create analysis-only PRs when quiet; preserves analysis artifacts.
.github/workflows/news-translate.md Allows analysis/ changes in scope verification and adds mandatory translation analysis step.

Comment on lines +622 to +623
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup command removes entire data directories under analysis/, but per-file analysis artifacts are intentionally stored alongside MCP data as {id}.analysis.md (see analysis/README.md:461). This will delete those analysis markdown files and can also wipe tracked historical analysis/**/data/ content, causing large unintended deletions in the PR. Instead, scope cleanup to the current run’s directory and delete only raw data file types (e.g., JSON) while preserving *.analysis.md and other markdown artifacts.

Suggested change
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
find analysis/ -type f \( -name "*.json" -o -name "*.ndjson" -o -name "*.csv" \) -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data files; analysis markdown artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
Comment on lines +835 to +837
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup deletes all analysis/**/data directories, which removes per-file analysis outputs (*.analysis.md) that are stored alongside MCP data by design (analysis/methodologies/ai-driven-analysis-guide.md:422). It can also delete previously committed historical data directories, producing a huge PR diff. Please limit deletion to the current date/type directory and remove only raw MCP payload files (while keeping *.analysis.md/markdown).

Suggested change
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
# Remove only raw MCP payload files for TODAY's run; analysis markdown MUST be committed.
find "analysis" -type f -path "analysis/*/$TODAY/data/*" ! -name '*.md' ! -name '*.markdown' -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and today's raw MCP data payloads; analysis artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
Comment on lines +827 to +828
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing entire analysis/**/data directories will also remove per-file analysis markdown ({id}.analysis.md) that is supposed to live next to each downloaded MCP file (analysis/README.md:461). It may also delete tracked historical data and create unintended deletions in the PR. Recommend deleting only raw data file extensions inside the current run’s data directory, preserving *.analysis.md and other markdown artifacts.

Suggested change
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
find analysis/ -type f \( -name '*.json' -o -name '*.ndjson' -o -name '*.csv' -o -name '*.xml' -o -name '*.txt' \) -path '*/data/*' -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data files; analysis artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
Comment on lines +799 to +801
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The find ... -name "data" -exec rm -rf cleanup removes per-file analysis artifacts that are stored in the data/ tree as *.analysis.md (analysis/templates/per-file-political-intelligence.md:24). It can also wipe historical analysis/<date>/data already committed to the repo. Please adjust cleanup to preserve markdown analysis files and avoid deleting outside the current run’s directory.

Suggested change
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
# Remove only raw MCP data downloads for the CURRENT RUN to control PR size.
# Analysis markdown (*.analysis.md) MUST be committed and must NOT be deleted.
RAW_DATA_ROOT="${RAW_DATA_ROOT:-analysis/current-run}"
# Delete non-markdown raw data files under the current run's data/ directories
find "${RAW_DATA_ROOT}" -type f -path '*/data/*' ! -name '*.analysis.md' -delete 2>/dev/null || true
# Optionally remove now-empty data/ directories for the current run
find "${RAW_DATA_ROOT}" -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and current-run raw MCP data; analysis *.analysis.md artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
# ⚠️ MANDATORY: Commit analysis artifacts per ai-driven-analysis-guide.md Rule 5
# No workflow run should be wasted — analysis is ALWAYS persisted.
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command deletes all analysis/**/data directories, but the repo’s analysis protocol stores per-file analysis markdown alongside raw MCP data in data/ (analysis/README.md:435-462). That means this cleanup will delete required analysis artifacts and may also remove already-committed historical data. Please only delete raw data files (not *.analysis.md) and scope to the current run’s path.

Suggested change
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" -delete 2>/dev/null || true

Copilot uses AI. Check for mistakes.
Comment on lines +630 to +631
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup deletes every analysis/**/data directory, but the analysis methodology stores per-file analysis markdown in that data/ tree as {id}.analysis.md (analysis/methodologies/ai-driven-analysis-guide.md:422). It will therefore delete required analysis artifacts and may wipe tracked historical data. Please adjust to remove only raw MCP payload files (e.g., JSON) within the current run’s directory, preserving *.analysis.md and other markdown.

Suggested change
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
find analysis/ -type f -path "*/data/*" \( -name "*.json" -o -name "*.ndjson" -o -name "*.log" -o -name "*.tmp" \) -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
Comment on lines +680 to +684
# ⚠️ MANDATORY: Commit analysis artifacts per ai-driven-analysis-guide.md Rule 5
# No workflow run should be wasted — analysis is ALWAYS persisted.
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data; analysis artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup deletes analysis/**/data directories, but per-file analysis artifacts are saved alongside the raw data as {id}.analysis.md under data/ (analysis/README.md:461). This will remove required analysis markdown and can also delete committed historical data directories, creating unintended PR deletions. Please scope cleanup to the current date/type and delete only raw data file types while preserving *.analysis.md/markdown artifacts.

Copilot uses AI. Check for mistakes.
Comment on lines 531 to 534
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned raw data; translation analysis artifacts PRESERVED for commit"

ARTICLE_DATE="${ARTICLE_DATE:-$(date -u +%Y-%m-%d)}"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup removes analysis/**/data directories, but translation analysis artifacts may need to reference per-file analyses stored inside data/ as {id}.analysis.md (analysis/templates/per-file-political-intelligence.md:24). Deleting the whole directory will also delete those markdown artifacts and can wipe tracked historical analysis/<date>/data content. Please delete only raw payload files (e.g., JSON) and scope to the current run’s directory, preserving markdown.

Suggested change
find analysis/ -type d -name "data" -exec rm -rf {} + 2>/dev/null || true
echo "🧹 Cleaned raw data; translation analysis artifacts PRESERVED for commit"
ARTICLE_DATE="${ARTICLE_DATE:-$(date -u +%Y-%m-%d)}"
# Determine article date for scoping analysis cleanup to the current run
ARTICLE_DATE="${ARTICLE_DATE:-$(date -u +%Y-%m-%d)}"
# Clean only raw payload files (e.g., JSON) for this run's analysis directory,
# preserving markdown analysis artifacts and historical analysis data.
CURRENT_ANALYSIS_DIR="analysis/${ARTICLE_DATE}"
if [ -d "${CURRENT_ANALYSIS_DIR}" ]; then
find "${CURRENT_ANALYSIS_DIR}" -type f -name '*.json' -delete 2>/dev/null || true
fi
echo "🧹 Cleaned raw data payloads; translation analysis markdown artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.

**Enforcement checklist:**
- [ ] Analysis artifacts are included in `git add` alongside article files — never deleted before PR creation
- [ ] Raw MCP data files (`data/` subdirectories) may be cleaned to control PR size, but analysis markdown is ALWAYS committed
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rule 5 says raw MCP data/ subdirectories may be cleaned, but elsewhere in this guide per-file analysis is defined to live inside data/ as {id}.analysis.md (see “Conflict Resolution” table). As written, “clean data/ subdirectories” can be interpreted as deleting the entire directory and would remove required analysis artifacts. Please clarify that cleanup must only remove raw payload files (e.g., *.json) while preserving *.analysis.md and other markdown artifacts, or adjust the prescribed storage layout so analysis is not under data/.

Suggested change
- [ ] Raw MCP data files (`data/` subdirectories) may be cleaned to control PR size, but analysis markdown is ALWAYS committed
- [ ] Raw MCP data payload files (for example `data/**/*.json`) may be cleaned to control PR size, but **never delete `*.analysis.md` or other markdown artifacts** under either `analysis/` or `data/`; analysis markdown is ALWAYS committed

Copilot uses AI. Check for mistakes.
- On quiet days with no article, create an **analysis-only PR** instead of discarding analysis via noop
- Before producing new analysis, check for existing analysis and **improve/extend/correct/complete** it
- Even the translation workflow must perform translation coverage and quality analysis
- Raw MCP data files may be cleaned to control PR size, but analysis markdown is ALWAYS committed
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule allows cleaning raw MCP data/ files, but the analysis protocol stores per-file analysis markdown alongside the MCP data inside data/ (analysis/README.md “Save analysis alongside data file”). Please tighten the wording so “cleaning data” cannot be interpreted as deleting the whole data/ directory, and explicitly require preserving *.analysis.md/markdown artifacts.

Suggested change
- Raw MCP data files may be cleaned to control PR size, but analysis markdown is ALWAYS committed
- Raw MCP data payloads (e.g. large JSON/XML responses) **may be cleaned or pruned** to control PR size, but the `data/` directory itself MUST NOT be deleted or recreated as part of cleaning, and all per-file analysis markdown artifacts (e.g. `*.analysis.md` and other `.md` analysis files stored alongside the data) MUST ALWAYS be preserved and committed.

Copilot uses AI. Check for mistakes.
@pethers
Copy link
Copy Markdown
Member

pethers commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

…ctories

Replace `find -type d -name data -exec rm -rf` (deletes entire data/ dirs
including *.analysis.md) with file-type-specific deletion that only removes
raw payload files (JSON, NDJSON, CSV, XML) while preserving all markdown
analysis artifacts. Empty data/ dirs are cleaned up afterward.

Also clarify Rule 5 and Rule 7 documentation to explicitly state that
*.analysis.md and other markdown files in data/ must never be deleted.

Agent-Logs-Url: https://github.com/Hack23/euparliamentmonitor/sessions/85c0abe4-7c0f-4bff-9781-ef33c55a3907

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

Applied all 12 review comments in commit 00c7684. The cleanup in all 10 workflows now uses file-type-specific deletion (find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete) instead of directory deletion, preserving per-file *.analysis.md artifacts alongside MCP data. Empty data/ directories are cleaned up afterward. Rule 5 in ai-driven-analysis-guide.md and Rule 7 in analysis/README.md are also clarified to explicitly state that *.analysis.md and other markdown files must never be deleted.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.

Comment on lines +622 to +624
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup command deletes all non-Markdown files under analysis/**/data/**, which will also remove already-committed historical MCP payloads (e.g., analysis/2026-03-28/data/meps/person-*.json). That would create very large diffs and break existing analysis references. Scope cleanup to the current workflow’s directory (e.g., analysis/${TODAY}/week-in-review/data/) and/or delete only newly-downloaded/untracked payloads (so tracked historical files are not removed).

Suggested change
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"
find "analysis/${TODAY}/week-in-review" -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find "analysis/${TODAY}/week-in-review" -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads for this run; analysis markdown artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
Comment on lines +835 to +838
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The find analysis/ ... -delete cleanup will delete all non-Markdown files in every analysis/**/data/** folder, including historically committed MCP payloads. This risks huge PRs with mass deletions and leaves existing analysis referencing missing data files. Please scope deletion to analysis/${TODAY}/week-ahead/data/ (and/or only remove untracked/new downloads) instead of the whole analysis/ tree.

Suggested change
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"
# Remove only raw MCP data downloads for THIS run to control PR size. Analysis markdown MUST be committed.
DATA_DIR="analysis/$TODAY/week-ahead/data"
if [ -d "$DATA_DIR" ]; then
find "$DATA_DIR" -type f ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find "$DATA_DIR" -type d -empty -delete 2>/dev/null || true
fi
echo "🧹 Cleaned metadata files and raw MCP data payloads for today; analysis markdown artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
Comment on lines +652 to +655
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup deletes all non-Markdown files under analysis/**/data/** across the repo, including already-committed historical payloads. That can produce massive deletion diffs and break existing analysis files that reference those payload paths. Please scope deletion to the current run directory (e.g., analysis/${TODAY}/month-in-review/data/) or delete only newly-downloaded/untracked payloads.

Suggested change
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"
# Remove only raw MCP data downloads for the current run to control PR size. Analysis markdown MUST be committed.
RUN_DATA_DIR="analysis/${TODAY}/month-in-review/data"
if [ -d "$RUN_DATA_DIR" ]; then
find "$RUN_DATA_DIR" -type f ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find "$RUN_DATA_DIR" -type d -empty -delete 2>/dev/null || true
fi
echo "🧹 Cleaned metadata files and current-run raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
Comment on lines +120 to +123
5. **ANALYSIS-ONLY PR**: If analysis determines no breaking news significance, **still create an analysis-only PR** with `safeoutputs___create_pull_request` containing analysis artifacts in `analysis/${TODAY}/breaking/`.
- Per `ai-driven-analysis-guide.md` Rule 5, no workflow run should be wasted
- If existing analysis for this date exists, improve/extend it
- Use `safeoutputs___noop` ONLY when MCP server is completely unavailable and zero data was collected
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow now mandates analysis-only PRs, but also says to use safeoutputs___noop when MCP is completely unavailable and zero data is collected. That contradicts Rule 5’s “no workflow run wasted” requirement: even on MCP failure you can still commit an analysis artifact documenting the outage (and any troubleshooting context) via an analysis-only PR. Consider removing this exception or narrowing it to cases where PR creation itself is impossible.

Copilot uses AI. Check for mistakes.
Comment on lines +627 to +631
# ⚠️ MANDATORY: Commit analysis artifacts per ai-driven-analysis-guide.md Rule 5
# No workflow run should be wasted — analysis is ALWAYS persisted.
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The raw payload cleanup is scoped to the entire analysis/ tree, so it will delete non-Markdown files from historically committed analysis/**/data/** directories (not only today’s breaking run). This can generate massive deletion diffs and break existing analysis references. Please scope cleanup to analysis/${TODAY}/breaking/data/ and/or delete only newly-downloaded/untracked files.

Copilot uses AI. Check for mistakes.
Comment on lines +682 to 688
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"

TODAY=$(date -u +%Y-%m-%d)
BRANCH_NAME="news/articles-$TODAY"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup command operates on analysis/ globally and will delete non-Markdown files under all analysis/**/data/** directories, including payload files already committed in the repo for previous dates. That can create very large PR diffs and break existing analysis that references those files. Please scope deletion to analysis/${TODAY}/${ARTICLE_TYPE_SLUG}/data/ (and/or only remove untracked/new downloads).

Suggested change
# Remove only raw MCP data downloads to control PR size. Analysis markdown MUST be committed.
find analysis/ -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find analysis/ -type d -name "data" -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads; analysis markdown artifacts PRESERVED for commit"
TODAY=$(date -u +%Y-%m-%d)
BRANCH_NAME="news/articles-$TODAY"
# Remove only raw MCP data downloads for this run to control PR size. Analysis markdown MUST be committed.
TODAY=$(date -u +%Y-%m-%d)
BRANCH_NAME="news/articles-$TODAY"
ARTICLE_TYPE_SLUG="${ARTICLE_TYPE_SLUG:-all}"
DATA_DIR="analysis/${TODAY}/${ARTICLE_TYPE_SLUG}/data"
echo "Using scoped analysis data directory for cleanup: $DATA_DIR"
find "$DATA_DIR" -type f ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find "$DATA_DIR" -type d -empty -delete 2>/dev/null || true
echo "🧹 Cleaned metadata files and raw MCP data payloads for current run; analysis markdown artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
@pethers
Copy link
Copy Markdown
Member

pethers commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

Each workflow now only cleans raw MCP payload files within its own
analysis/${TODAY}/<slug>/ directory instead of the entire analysis/
tree. This prevents mass deletion of historically committed data and
avoids large PR diffs from touching unrelated analysis directories.

Workflow-specific scoping:
- news-breaking → analysis/${TODAY}/breaking/
- news-weekly-review → analysis/${TODAY}/week-in-review/
- news-monthly-review → analysis/${TODAY}/month-in-review/
- news-week-ahead → analysis/${TODAY}/week-ahead/
- news-month-ahead → analysis/${TODAY}/month-ahead/
- news-committee-reports → analysis/${TODAY}/committee-reports/
- news-motions → analysis/${TODAY}/motions/
- news-propositions → analysis/${TODAY}/propositions/
- news-article-generator → analysis/${TODAY}/${ARTICLE_TYPE_SLUG}/
- news-translate → analysis/${ARTICLE_DATE}/

Agent-Logs-Url: https://github.com/Hack23/euparliamentmonitor/sessions/7ce00342-f1fd-453a-abf0-12ca8d65428a

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…d baseline summary

- Cleanup now targets only analysis/${ARTICLE_DATE}/translate/ instead of the
  entire date directory, preventing deletion of other workflows' data
- Added minimal summary.md baseline template at analysis dir creation so even
  noop runs (no articles found, all translations exist) produce an analysis
  artifact per Rule 5

Agent-Logs-Url: https://github.com/Hack23/euparliamentmonitor/sessions/3ca37f25-3e4c-421a-9102-abeba633316a

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

Applied both review comments in commit 4701edd:

  1. Added a baseline summary.md template at analysis directory creation so even noop runs (no articles found, all translations exist) produce an analysis artifact per Rule 5.
  2. Scoped the cleanup from analysis/${ARTICLE_DATE} to analysis/${ARTICLE_DATE}/translate/ to enforce folder isolation and prevent deleting other workflows' data for the same date.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Comment on lines 631 to 638
RUN_ANALYSIS_DIR="analysis/${TODAY}/breaking"
if [ -d "$RUN_ANALYSIS_DIR" ]; then
find "$RUN_ANALYSIS_DIR" -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find "$RUN_ANALYSIS_DIR" -type d -name "data" -empty -delete 2>/dev/null || true
fi
echo "🧹 Cleaned raw MCP data payloads for ${TODAY}/breaking; analysis markdown artifacts PRESERVED for commit"

TODAY=$(date -u +%Y-%m-%d)
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this cleanup block, ${TODAY} is used to build RUN_ANALYSIS_DIR before TODAY is (re)set later in the same step. If $TODAY is unset or differs from the intended UTC date, cleanup may target the wrong directory (or analysis//breaking). Compute/export TODAY once before this block (prefer UTC) and avoid reassigning it after the cleanup so the directory, branch name, and PR title all align.

Suggested change
RUN_ANALYSIS_DIR="analysis/${TODAY}/breaking"
if [ -d "$RUN_ANALYSIS_DIR" ]; then
find "$RUN_ANALYSIS_DIR" -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find "$RUN_ANALYSIS_DIR" -type d -name "data" -empty -delete 2>/dev/null || true
fi
echo "🧹 Cleaned raw MCP data payloads for ${TODAY}/breaking; analysis markdown artifacts PRESERVED for commit"
TODAY=$(date -u +%Y-%m-%d)
TODAY=$(date -u +%Y-%m-%d)
RUN_ANALYSIS_DIR="analysis/${TODAY}/breaking"
if [ -d "$RUN_ANALYSIS_DIR" ]; then
find "$RUN_ANALYSIS_DIR" -type f -path "*/data/*" ! -name "*.analysis.md" ! -name "*.md" -delete 2>/dev/null || true
find "$RUN_ANALYSIS_DIR" -type d -name "data" -empty -delete 2>/dev/null || true
fi
echo "🧹 Cleaned raw MCP data payloads for ${TODAY}/breaking; analysis markdown artifacts PRESERVED for commit"

Copilot uses AI. Check for mistakes.
Comment on lines +496 to +500
## Step 4c: Translation Analysis (MANDATORY per ai-driven-analysis-guide.md Rule 5)

> **⚠️ MANDATORY**: Per `analysis/methodologies/ai-driven-analysis-guide.md` Rule 5, no workflow run should be wasted. The translation workflow MUST produce analysis artifacts documenting translation quality, coverage, and terminology consistency. If existing translation analysis exists for this date, improve and extend it.

Before creating the PR, read ALL methodology documents in `analysis/methodologies/` and produce a translation analysis report in `analysis/${ARTICLE_DATE}/translate/`:
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Step 4c adds mandatory translation analysis artifacts, but earlier steps still safeoutputs___noop and exit 0 when (a) no English articles exist for the date or (b) translations already exist. Those early exits happen before this step runs, so the workflow can still produce no committed analysis, contradicting Rule 5 and the PR description's “noop runs produce an analysis artifact”. Move baseline analysis creation earlier (before any noop/exit), and replace these noop paths with an analysis-only PR when any analysis/artifacts (even the baseline) are required.

Copilot uses AI. Check for mistakes.
@pethers
Copy link
Copy Markdown
Member

pethers commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

… before noop exits in translate.md

- news-breaking.md: Compute TODAY=$(date -u +%Y-%m-%d) before RUN_ANALYSIS_DIR
  uses it, preventing cleanup targeting wrong directory
- news-translate.md: Move baseline analysis summary creation to immediately
  after date context, before any safeoutputs___noop exits
- news-translate.md: Replace noop exits with analysis-only PR paths that
  append run-outcome details and call safeoutputs___create_pull_request
- news-translate.md: Simplify Step 4c to reference earlier baseline instead
  of duplicating the template

Agent-Logs-Url: https://github.com/Hack23/euparliamentmonitor/sessions/c1217b93-d97e-410b-966c-d73f4894f9dc

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

Both review comments applied in commit 0372a66:

  1. news-breaking.md: TODAY=$(date -u +%Y-%m-%d) is now computed before the cleanup block that references ${TODAY}, so RUN_ANALYSIS_DIR, BRANCH_NAME, and PR title all use the same date value.
  2. news-translate.md: Baseline analysis summary creation moved to immediately after the date context block (before any noop exits). The two early safeoutputs___noop + exit 0 paths are replaced with analysis-only PR paths that append run-outcome details to the baseline summary and call safeoutputs___create_pull_request. Step 4c now references the earlier baseline instead of duplicating the template.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

body: "## Translation Analysis Only\n\nNo English articles found for ${ARTICLE_DATE}. Baseline translation analysis committed per Rule 5.",
base: "main",
head: BRANCH_NAME
})
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the "no English articles found" branch, the workflow calls safeoutputs___create_pull_request but does not exit 0 (or otherwise stop) afterwards. Execution will continue with empty $ARTICLE_TYPES/$NEEDS_TRANSLATION, likely causing incorrect behavior and potentially attempting additional PR creation. Add an explicit exit/return immediately after creating the analysis-only PR.

Suggested change
})
})
exit 0

Copilot uses AI. Check for mistakes.
body: "## Translation Analysis Only\n\nAll articles for ${ARTICLE_DATE} already have translations. Baseline translation analysis committed per Rule 5.",
base: "main",
head: BRANCH_NAME
})
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the "all translations already exist" branch, the workflow calls safeoutputs___create_pull_request but does not exit 0 afterwards. The remaining steps will still run even though the analysis-only PR is already created, which can lead to errors or duplicate work. Add an explicit exit/return immediately after creating the analysis-only PR.

Suggested change
})
})
exit 0

Copilot uses AI. Check for mistakes.
@pethers
Copy link
Copy Markdown
Member

pethers commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

Both review comments applied in commit 0d2c82b. Added exit 0 after each safeoutputs___create_pull_request call in the two translate noop branches (no English articles found, all translations already exist) to prevent execution from continuing with empty variables or creating duplicate PRs.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

.github/workflows/news-article-generator.md:694

  • TODAY is recomputed here after it has already been used for analysis cleanup above. This can cause the branch name / PR title date to differ from the analysis directory that was cleaned/preserved. Reuse a single TODAY value throughout this step (and keep it UTC to match other workflows) so analysis paths and PR metadata stay aligned.
echo "🧹 Cleaned raw MCP data payloads for ${TODAY}/${ARTICLE_TYPE_SLUG}; analysis markdown artifacts PRESERVED for commit"

TODAY=$(date -u +%Y-%m-%d)
BRANCH_NAME="news/articles-$TODAY"
echo "Branch: $BRANCH_NAME"

Comment on lines +348 to +356
- **Action**: No new translations produced; baseline analysis committed
EOF
BRANCH_NAME="news/translate-${ARTICLE_DATE}"
safeoutputs___create_pull_request({
title: "chore: translate EU Parliament articles ${ARTICLE_DATE} (analysis-only)",
body: "## Translation Analysis Only\n\nAll articles for ${ARTICLE_DATE} already have translations. Baseline translation analysis committed per Rule 5.",
base: "main",
head: BRANCH_NAME
})
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the earlier noop branch: safeoutputs___create_pull_request({ ... }) is shown inside a bash block, which is likely to be executed as shell code and fail. Move the tool call into a javascript block (or otherwise ensure the instructions can be executed without mixing shell and tool-call syntax).

Copilot uses AI. Check for mistakes.
@pethers
Copy link
Copy Markdown
Member

pethers commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

…code blocks in translate workflow

Agent-Logs-Url: https://github.com/Hack23/euparliamentmonitor/sessions/12ee1831-e5e4-4cc6-b8d6-d90a91d20f1b

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 31, 2026

@copilot apply changes based on the comments in this thread

Applied in commit a1de089. Both safeoutputs___create_pull_request calls in the translate noop branches are now in javascript code blocks instead of being mixed into bash blocks. The bash blocks are split at the boundary points with prose transitions between them.

@pethers pethers merged commit c058546 into main Mar 31, 2026
14 checks passed
@pethers pethers deleted the copilot/improve-analysis-workflows branch March 31, 2026 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Documentation updates github_actions GitHub Actions updates infrastructure CI/CD and build infrastructure workflows GitHub workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants