Analytics Agents on AWS: When to Trust the Answer and When to Review the Query

Stop waiting on the analytics backlog. Start asking your data warehouse directly.

Marcelo Acosta Cavalero

Apr 28, 2026

A product manager needed churn rates segmented by pricing tier and region for a board deck due Friday.

She filed a Jira ticket with the analytics team on Monday.

By Wednesday, the analyst had questions about which date range to use and whether “churned” meant cancelled or downgraded.

Thursday morning the analyst delivered a CSV.

The product manager spent Thursday afternoon reformatting it into charts that told the wrong story because the segment definitions did not match what she meant.

The data existed in the warehouse.

The question was clear.

Four days and three handoffs turned a 10-minute query into a week-long project.

This is the fourth edition of a five-part series cataloging real AI architecture patterns running on AWS.

Edition 1 covered customer-facing agents.

Edition 2 covered internal knowledge and productivity agents.

Edition 3 covered workflow automation and process agents.

This edition focuses on data and analytics: agents that make data accessible to people who have questions but lack SQL skills, agents that monitor and manage data pipelines, and agents that turn raw data into decisions without a human analyst as the translation layer.

If you missed the earlier editions, go back to Edition 1 for the “Agent or Not?” scoring framework and the AgentCore vs Quick breakdown.

Edition 2 introduced Amazon Q Business for knowledge workloads.

Edition 3 introduced the hybrid Step Functions + AgentCore pattern for workflow automation.

Those mental models apply here.

One platform update before the cards.

Data and analytics agents benefit from capabilities that did not feature heavily in earlier editions.

Bedrock Knowledge Bases now supports structured data retrieval, connecting to structured data stores with Redshift as the query engine.

Three APIs handle the natural-language-to-SQL pipeline: GenerateQuery (standalone SQL generation), Retrieve (query and return data), and RetrieveAndGenerate (query, retrieve, and produce a natural language response).

This means you can build a basic analytics agent using Bedrock Knowledge Bases alone, without AgentCore, for straightforward question-to-answer workloads.

AgentCore earns its place when the agent needs multi-step reasoning, tool orchestration beyond a single Redshift query, or custom logic around how results are interpreted and delivered.

Redshift itself now integrates generative AI natively: Amazon Q generative SQL in the Query Editor handles natural language queries for analysts, SQL-based ML lets you build predictive models directly in Redshift, and LLM invocation functions call Bedrock models from within SQL for text summarization, entity extraction, and sentiment analysis on warehouse data.

SageMaker Unified Studio includes a Data Agent that generates code and SQL from natural language prompts with a built-in SQL editor spanning Athena, Redshift, EMR, and Glue.

For teams already invested in the SageMaker ecosystem, this provides analytics agent capabilities without additional platform adoption.

Several patterns below note where Bedrock Knowledge Bases structured retrieval, Redshift-native AI, or SageMaker Data Agent can substitute for a full AgentCore deployment.

This edition also adds a fifth consideration: when an analytics agent should generate the answer versus when it should generate the query and let the human validate.

The Answer vs Query Decision

Analytics agents face a trust problem that other agent categories do not.

When a support agent resolves a ticket incorrectly, the customer complains and you fix it.

When an analytics agent returns an incorrect number, someone might put it in a board deck, base a pricing decision on it, or report it to a regulator.

Wrong answers in analytics have a larger blast radius because they often travel far from the person who asked the question before anyone checks them.

Two modes address this.

Answer mode: the agent generates the final number, chart, or insight directly.

This works when the data model is well-documented, the question maps cleanly to known metrics with agreed-upon definitions, and the stakes of a wrong answer are low.

Dashboard refresh, standard KPI reporting, and exploratory data analysis fit answer mode well.

Query mode: the agent generates a SQL query, a dashboard configuration, or a data transformation plan, and a human reviews it before execution or before the results leave the analytics environment.

This works for ad hoc questions involving complex joins, financial reporting where auditability matters, and any scenario where the cost of a wrong answer exceeds the cost of a human review step.

The agent still removes about 80% of the manual effort by translating natural language into a candidate query, while keeping a human in the validation loop.

A third approach is gaining traction for high-stakes analytics: semantic-layer-first retrieval.

Instead of generating raw SQL against warehouse tables, the agent queries a governed semantic layer (dbt Semantic Layer, Cube, AtScale, MetricFlow, or equivalent) that defines certified metrics, allowed dimensions, and valid join paths.

The semantic layer generates SQL deterministically.

When a question falls outside the semantic layer’s coverage, it rejects the request instead of returning a plausible wrong number.

This fails more safely than raw text-to-SQL, which is critical when the output reaches a board deck or regulatory filing.

Current benchmarks still show a meaningful gap between automated text-to-SQL and human data engineers on realistic enterprise schemas, so the semantic layer acts as a guardrail until the technology matures further.
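
A minimal sketch of that fail-closed behavior, in Python. The metric registry, table name, and column names are hypothetical, and real semantic layers such as dbt Semantic Layer or Cube expose their own APIs. This only illustrates deterministic SQL compilation plus rejection of out-of-coverage requests.

```python
# Hypothetical registry of certified metrics (all names are illustrative).
METRICS = {
    "revenue": {
        "sql": "SUM(order_total)",
        "table": "fct_orders",
        "dimensions": {"region", "pricing_tier", "order_month"},
    },
}


class OutOfCoverage(Exception):
    """Raised when a request is not covered by certified definitions."""


def compile_metric(metric: str, dimensions: list[str]) -> str:
    """Build SQL from governed definitions, or refuse."""
    spec = METRICS.get(metric)
    if spec is None:
        raise OutOfCoverage(f"metric {metric!r} is not certified")
    unknown = [d for d in dimensions if d not in spec["dimensions"]]
    if unknown:
        raise OutOfCoverage(f"dimensions {unknown} not allowed for {metric!r}")
    cols = ", ".join(dimensions)
    select = f"{cols}, " if cols else ""
    group = f" GROUP BY {cols}" if cols else ""
    return f"SELECT {select}{spec['sql']} AS {metric} FROM {spec['table']}{group}"
```

The same request always compiles to the same SQL, and an unknown metric or dimension raises instead of producing a guess. The language model's job shrinks to mapping the question onto a metric name and a list of dimensions.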

Most patterns in this edition support both modes.

The card descriptions note which mode fits best for each use case.

Start with query mode for anything that touches financial reporting, regulatory data, or executive decision-making.

For executive, financial, regulatory, or board-level numbers, consider semantic-layer-first retrieval as the safer production path.

Graduate to answer mode after the agent demonstrates consistent reliability on your organization’s own schemas and metric definitions.
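
One way to encode the starting policy above as a routing rule. The three boolean fields are illustrative assumptions. In practice they would come from your own metadata about the question, the metric, and the agent's track record.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    ANSWER = "answer"
    QUERY = "query"
    SEMANTIC_LAYER = "semantic_layer"


@dataclass
class Request:
    high_stakes: bool             # financial, regulatory, executive, board-level
    certified_metric: bool        # maps to a metric with an agreed definition
    agent_proven_on_schema: bool  # agent has a track record on these tables


def choose_mode(req: Request) -> Mode:
    if req.high_stakes:
        # Certified metrics go through the governed layer; any other
        # high-stakes question gets a human reviewing the generated query.
        return Mode.SEMANTIC_LAYER if req.certified_metric else Mode.QUERY
    if req.certified_metric and req.agent_proven_on_schema:
        return Mode.ANSWER
    # Default to review until the agent has earned answer mode.
    return Mode.QUERY
```

The default branch is query mode on purpose: a question that matches no rule gets a human reviewer rather than an unreviewed number.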

Reference Architectures for Data and Analytics Agents

Data and analytics agents interact with different infrastructure than the agents in earlier editions.

Data warehouses, lakes, lakehouses, BI tools, ETL pipelines, and metadata catalogs replace the CRM and ticketing system APIs.

The reference architectures reflect this shift.

Reference Architecture J - Analytics Agent with Data Lake Access

Platform: AgentCore (or Bedrock Knowledge Bases structured retrieval for simpler workloads)

When to use: The agent translates natural language questions into SQL, executes against the data warehouse, and returns results as tables, charts, or narrative summaries.

AgentCore Gateway mediates access to data sources with authentication and query governance.

The Glue Data Catalog gives the agent table and column metadata, which reduces schema hallucination.

For production accuracy on business-critical metrics, pair the Glue Data Catalog with a semantic layer or metric catalog that defines business metrics, join paths, certified dimensions, ownership, caveats, and freshness.

Glue Data Catalog tells the agent what exists. The semantic layer tells the agent what the business means.

For straightforward single-query workloads, Bedrock Knowledge Bases structured retrieval (with Redshift as the query engine) handles natural-language-to-SQL without AgentCore.

Use the GenerateQuery API when you want just the SQL, Retrieve for SQL plus results, and RetrieveAndGenerate for a full natural language answer.
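
As a rough sketch of the GenerateQuery path (SQL only, nothing executed), the helper below wraps the call as exposed by the boto3 `bedrock-agent-runtime` client. The request and response shapes reflect my understanding of that API and should be checked against the current boto3 documentation. The helper name and the ARN in the usage note are placeholders.

```python
def generate_sql(client, knowledge_base_arn: str, question: str) -> list[str]:
    """Ask a structured-data knowledge base for SQL without running it.

    `client` is expected to be boto3.client("bedrock-agent-runtime");
    it is passed in so the function can be exercised with a stub.
    """
    response = client.generate_query(
        queryGenerationInput={"type": "TEXT", "text": question},
        transformationConfiguration={
            "mode": "TEXT_TO_SQL",
            "textToSqlConfiguration": {
                "type": "KNOWLEDGE_BASE",
                "knowledgeBaseConfiguration": {
                    "knowledgeBaseArn": knowledge_base_arn,
                },
            },
        },
    )
    # Each returned entry carries the generated SQL text.
    return [q["sql"] for q in response.get("queries", [])]
```

Typical use would be `generate_sql(boto3.client("bedrock-agent-runtime"), "<your-knowledge-base-arn>", "Revenue by product line last quarter")`, with the returned SQL shown to an analyst before anything runs against Redshift.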

AgentCore adds value when the agent needs multi-step reasoning, tool orchestration across multiple data sources, custom result formatting, or conversation memory across analysis sessions.

Covers self-service analytics, ad hoc reporting, data exploration, and any pattern where business users need answers from structured data without writing SQL.

The agent needs read access to the warehouse and awareness of the data model, but does not modify data.

Reference Architecture J2 - Governed Analytics Agent with Semantic Layer

Platform: AgentCore, Quick, or Bedrock Knowledge Bases (conversational layer) + semantic layer (metric correctness)

When to use: The agent answers questions about certified business metrics, KPIs, board-level numbers, financial reporting, and any scenario where metric definitions matter more than exploratory flexibility.

The conversational layer (AgentCore, Quick, or Bedrock KB) handles the natural language interaction.

The semantic layer translates metric requests into deterministic SQL based on governed definitions, join paths, certified dimensions, and allowed aggregations.

When a question falls outside the semantic layer’s coverage, the request is rejected rather than producing a plausible wrong number.

This is the safer production architecture for high-stakes analytics.

Raw text-to-SQL (Architecture J) works well for exploration and ad hoc analysis. Governed analytics through a semantic layer works for recurring metrics, executive reporting, and anything that reaches a board deck or regulatory filing.

Several patterns below note where J2 is the recommended starting point.

Reference Architecture K - Quick-Native Analytics Workspace

Platform: Quick

When to use: The primary need is AI-powered visualization, natural language querying of dashboards, automated report generation, or deep research across enterprise data.

Amazon Quick Suite combines five capabilities: Quick Sight for BI (dashboards, Q&A, forecasting, scenarios, stories, and executive summaries), Quick Research for long-form analysis across structured and unstructured sources, Quick Flows for UI and application workflow automation, Quick Automate for scheduled reporting and process automation, and Quick Index for knowledge indexing across enterprise content.

Quick provides a growing set of AWS, SaaS, Slack, Microsoft 365, and REST API integrations for connecting enterprise data sources.

No custom agent code required.

Covers standard BI workloads where the data sources connect through Quick’s native connectors and the business users need visualization, trend analysis, and scheduled reporting rather than custom analytical logic.

Reference Architecture L - Pipeline Orchestration Agent

Platform: AgentCore

When to use: The agent monitors data pipelines, diagnoses failures, manages data quality, and handles the operational aspects of keeping analytics infrastructure healthy.

The agent is triggered by pipeline events, schedule failures, or data quality alerts rather than by human questions.

It is similar to the event-driven process agents from Edition 3 but focused specifically on data infrastructure.

AgentCore’s Code Interpreter tool is particularly useful here: the agent can execute diagnostic scripts, parse log files, and run data profiling code in a sandboxed environment without needing a separate Lambda function for each analysis task.

Covers pipeline monitoring, data quality management, schema drift detection, and any pattern where the agent maintains the data platform rather than answering business questions.

Reference Architecture M - Multi-Agent Analytics Coordinator

Platform: AgentCore (multi-agent)

When to use: Complex analytical questions that require multiple steps: querying different data sources, combining results, generating visualizations, and producing narrative explanations.

A single agent trying to write SQL, create charts, and explain trends in one context window becomes unreliable.

Specialized agents behind a coordinator keep each task focused.

The coordinator decomposes the question, delegates to specialists, and synthesizes results.

The 25 Use Cases

Self-Service Analytics and Business Intelligence

#076 - Natural Language to SQL Analytics Agent

Pattern: New build

Platform: AgentCore

Complexity: Quick Win

Reference Architecture: J

What the agent does:

Accepts natural language questions from business users (“What was our revenue by product line last quarter, excluding returns?”) and translates them into SQL queries against the data warehouse.
Reads the Glue Data Catalog to understand table and column metadata, and consults the semantic layer or metric catalog for standard metric calculations, join paths, and certified dimensions.
Generates the SQL, executes it, and returns results as formatted tables or simple visualizations.
Operates in query mode by default: shows the generated SQL alongside results so analysts can verify correctness.
Maintains a library of validated query patterns that improve accuracy over time.
When the question is ambiguous (“What’s our churn rate?” could mean three different things depending on the definition), asks the user to clarify before executing.
For teams that want this capability without building a custom agent, three lighter-weight options exist: Bedrock Knowledge Bases structured retrieval with the GenerateQuery or RetrieveAndGenerate APIs handles single-turn natural-language-to-SQL against Redshift, Amazon Q generative SQL in the Redshift Query Editor provides the same capability for analysts already working in Redshift, and SageMaker Unified Studio’s Data Agent handles SQL generation within the SageMaker notebook environment.
The full AgentCore pattern adds value when you need multi-turn conversation, cross-source queries, custom business logic around metric definitions, or integration into Slack and other channels beyond a query editor.
Regardless of which path you choose, maintain a validated query library as an operational asset: approved SQL templates for known metrics, test cases with expected results, regression examples, rejected ambiguous questions, and canonical clarification prompts.
This library is the practical bridge between demo and production.

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Gateway, Bedrock Knowledge Bases (structured retrieval via Redshift), Glue Data Catalog (metadata), S3 (query history)

You need this if: Your analytics team spends more than 30% of their time fielding ad hoc data requests that could be answered with straightforward SQL, and business users wait days for answers to questions they could ask in seconds.
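
The validated query library described in #076 can be treated like a regression test suite. The sketch below uses an in-memory SQLite database as a stand-in for the warehouse. The table, the library entry, and the fixture rows are all hypothetical.

```python
import sqlite3

# Hypothetical library entry: approved SQL plus the result expected
# on a small, fixed fixture dataset.
QUERY_LIBRARY = [
    {
        "name": "revenue_by_product_excluding_returns",
        "sql": (
            "SELECT product_line, SUM(amount) FROM orders "
            "WHERE is_return = 0 GROUP BY product_line ORDER BY product_line"
        ),
        "expected": [("hardware", 300.0), ("software", 150.0)],
    },
]


def build_fixture() -> sqlite3.Connection:
    """Create a tiny orders table with one returned order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (product_line TEXT, amount REAL, is_return INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("hardware", 100.0, 0),
            ("hardware", 200.0, 0),
            ("software", 150.0, 0),
            ("software", 50.0, 1),  # a return, excluded by the approved SQL
        ],
    )
    return conn


def run_regressions(conn, library):
    """Return the names of library entries whose results no longer match."""
    failures = []
    for entry in library:
        rows = conn.execute(entry["sql"]).fetchall()
        if rows != entry["expected"]:
            failures.append(entry["name"])
    return failures
```

Running `run_regressions(build_fixture(), QUERY_LIBRARY)` after every prompt, model, or schema change gives an early signal that a previously trusted metric has drifted.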

#077 - Executive Dashboard Narrator Agent

Pattern: New build

Platform: Both (AgentCore backend + Quick dashboards)

Complexity: Quick Win

Reference Architecture: J2 + K

What the agent does:

Monitors executive dashboards in Quick Sight and generates narrative summaries of what changed and why.
Instead of executives staring at charts trying to spot trends, the agent produces a daily or weekly briefing: “Revenue is up 8% WoW driven by Enterprise tier in EMEA. Support ticket volume spiked 23% on Thursday, correlated with the v4.2 release. Churn rate dropped to 2.1%, the lowest since Q3.”
Pulls data from the same sources feeding the dashboards, runs comparison queries against prior periods, and identifies statistically significant changes versus normal variance.
Delivers summaries via email, Slack, or embedded in the Quick Sight dashboard itself.

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Sight, Redshift, EventBridge (scheduler), SES/SNS (delivery)

You need this if: Your executives glance at dashboards without extracting insights or skip the weekly metrics review because they do not have time to interpret 15 charts, and your analytics team writes the same “what happened this week” summary manually every Monday.

#078 - Metric Definition and Discovery Agent

Pattern: New build

Platform: AgentCore

Complexity: Quick Win

Reference Architecture: J2

What the agent does:

Serves as the single source of truth for metric definitions across the organization.
When someone asks “What’s our NRR?” the agent returns the official definition, the SQL logic behind it, which dashboard shows it, and the current value.
Maintains a metric catalog with ownership, calculation methodology, data sources, known caveats, and refresh frequency.
When two teams use different definitions for the same metric (marketing’s “active user” versus product’s “active user”), surfaces the discrepancy and links to the governance process for resolution.
Answers questions like “Who owns the churn metric?” and “When was the MRR calculation last updated?”

AWS services: Bedrock (Claude), AgentCore Runtime, Bedrock Knowledge Bases (metric catalog), Glue Data Catalog, DynamoDB (metric ownership and lineage)

You need this if: Different teams report different numbers for the same metric, nobody trusts the data because definitions are inconsistent, and your weekly leadership meeting wastes 20 minutes debating whose numbers are right.

#079 - Self-Service Cohort Analysis Agent

Pattern: New build

Platform: AgentCore

Complexity: Strategic Bet

Reference Architecture: J

What the agent does:

Enables product and marketing teams to run cohort analyses without SQL.
Users describe the cohort they want to analyze in natural language (“Users who signed up in January, completed onboarding, and used feature X at least 3 times in their first 30 days”).
The agent translates this into the correct event-based queries, handles the time windowing logic that makes cohort SQL complex, and returns retention curves, conversion funnels, or behavioral comparisons.
Uses AgentCore Code Interpreter to generate visualizations (retention heatmaps, funnel charts, comparison plots) directly from query results without requiring a separate BI tool for the visual output.
Supports standard cohort types: acquisition cohorts, behavioral cohorts, and feature adoption cohorts.
Generates the underlying SQL for analyst review and maintains a library of reusable cohort definitions.

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Code Interpreter, AgentCore Memory (analysis context), Redshift (event data), Glue Data Catalog, S3 (cohort definition library)

You need this if: Cohort analysis requests sit in your analytics backlog for weeks because the SQL is complex, and product managers make decisions based on aggregate metrics because they cannot get segmented views fast enough.
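
The time windowing that makes cohort logic in #079 awkward is that every user's clock starts on their own signup date. A small Python sketch of a weekly retention curve, with hypothetical input shapes (a signup map and a list of event tuples), shows the idea.

```python
from datetime import date


def weekly_retention(signups, events, weeks):
    """Fraction of the cohort active in each week after signup.

    signups: {user_id: signup_date}
    events:  iterable of (user_id, event_date)
    """
    active = [set() for _ in range(weeks)]
    for user_id, event_date in events:
        signup = signups.get(user_id)
        if signup is None:
            continue  # user is not part of this cohort
        offset = (event_date - signup).days
        if offset < 0:
            continue  # ignore activity recorded before signup
        week = offset // 7  # week 0 is the first 7 days after signup
        if week < weeks:
            active[week].add(user_id)
    size = len(signups)
    return [len(users) / size if size else 0.0 for users in active]
```

In a warehouse the same calculation becomes a join between signups and events with a date-difference bucket, which is the part the agent would generate and an analyst would review.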

#080 - Ad Hoc Report Builder Agent

Pattern: Modernization from chatbot

Platform: Both (AgentCore backend + Quick visualization)

Complexity: Quick Win

Reference Architecture: J + K

What the agent does:

Generates formatted reports from natural language requests.
A sales leader asks for “Q1 pipeline by stage, rep, and region with comparison to Q1 last year.”
The agent queries the data warehouse, builds the appropriate tables and charts, applies standard company formatting, and delivers a report ready for presentation.
Handles common report types: period-over-period comparisons, ranking tables, trend analyses, and distribution summaries.
For recurring requests, saves the report definition so it can be regenerated with fresh data on demand or on a schedule through Quick Automate or the organization’s existing reporting scheduler.
Operates in answer mode for standard reports with validated templates and query mode for novel requests.

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Suite (Quick Sight + Quick Automate), Redshift, S3 (report templates), SES (report delivery)

You need this if: Your team generates 20+ ad hoc reports per week by manually querying data and formatting results in spreadsheets, and 60% of those reports are variations of the same 5-10 templates.

#081 - Competitive Intelligence Analytics Agent

Pattern: New build

Platform: AgentCore

Complexity: Strategic Bet

Reference Architecture: J

What the agent does:

Aggregates and analyzes competitive data from internal and external sources.
Pulls win/loss data from the CRM, pricing intelligence from competitive tracking tools, feature comparison data from product marketing’s knowledge base, and market share estimates from analyst reports.
Answers questions like “What’s our win rate against Competitor X in the mid-market segment this quarter?” and “Which features do we lose deals on most often?”
Generates competitive briefings before sales calls by pulling the latest intelligence on the specific competitor.
Identifies trending competitive themes from recent deal notes and support escalations where customers mention competitors.

AWS services: Bedrock (Claude), AgentCore Runtime, Bedrock Knowledge Bases (competitive intelligence corpus), CRM API, Redshift (win/loss data), S3 (analyst reports)

You need this if: Your competitive intelligence lives in scattered documents and CRM fields that nobody analyzes systematically, and your sales team enters competitive deals without context on recent win/loss patterns.

Automated Reporting and Insights

#082 - Anomaly Detection and Explanation Agent

Pattern: New build

Platform: Both (AgentCore backend + Quick dashboards)

Complexity: Strategic Bet

Reference Architecture: J + K

What the agent does:

Monitors key business metrics for anomalies and explains what caused them.
Goes beyond threshold-based alerting: uses statistical models to identify when a metric deviates from its expected pattern given seasonality, day-of-week effects, and recent trends.
When an anomaly is detected, automatically investigates by drilling into dimensions (which product, region, channel, or customer segment drove the change), correlating with known events (deployments, marketing campaigns, pricing changes, outages), and checking whether the anomaly appears in related metrics.
Delivers an explanation, not just an alert: “Conversion rate dropped 15% on Tuesday.
The drop is concentrated in mobile web traffic from paid search.
The checkout page load time increased 2.3 seconds after the 2pm deployment.”

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Sight, Redshift, CloudWatch (application metrics), EventBridge (anomaly triggers), SNS (alerts)

You need this if: Your team gets metric alerts that say “Revenue dropped 12%” and then spends 2 hours investigating why, or worse, nobody investigates because there are too many alerts and most turn out to be normal variance.
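
For #082, the difference between a fixed threshold and an expected pattern can be illustrated with a day-of-week baseline. This is a deliberately simple stand-in for the statistical models the card mentions: it compares a value only against history from the same weekday. The threshold and minimum sample size are arbitrary assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, weekday, value, threshold=3.0):
    """Flag a value that deviates from the same-weekday baseline.

    history: list of (weekday, value) pairs, weekday as 0-6.
    """
    baseline = [v for d, v in history if d == weekday]
    if len(baseline) < 3:
        return False  # too little history to judge
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold
```

With this baseline, a quiet Saturday is not flagged just because it is far below a typical Tuesday, which is one source of the alert noise described above.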

#083 - Automated Weekly Business Review Agent

Pattern: New build

Platform: Both (AgentCore backend + Quick dashboards)

Complexity: Quick Win

Reference Architecture: J2 + K

What the agent does:

Produces a structured weekly business review package every Monday morning.
Pulls KPIs from the data warehouse, compares against targets and prior periods, identifies the top 3-5 items that warrant leadership attention, and generates a narrative summary with supporting data.
Adapts the summary to the audience: the CEO version focuses on revenue, growth, and strategic metrics.
The VP Engineering version focuses on velocity, reliability, and customer-reported issues.
The VP Sales version focuses on pipeline, conversion, and forecast accuracy.
Each version draws from the same underlying data but emphasizes what matters to that reader.

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Suite (Quick Sight + Quick Research), Redshift, EventBridge (scheduler), SES (distribution), S3 (report archive)

You need this if: Your WBR preparation takes an analyst 4+ hours every Monday, the review frequently surfaces stale data because the report was built on Friday’s numbers, and different leaders get inconsistent views of the same metrics.

#084 - Customer Analytics Storyteller Agent

Pattern: New build

Platform: AgentCore

Complexity: Strategic Bet

Reference Architecture: J

What the agent does:

Transforms raw customer data into narrative account profiles for customer success and sales teams.
For a given account, synthesizes usage data, support history, billing trends, feature adoption, NPS scores, and contract details into a coherent story: where the account started, how their usage evolved, what problems they encountered, which features they adopted or ignored, and what the data suggests about their trajectory.
Generates pre-meeting briefings (“This account’s usage dropped 30% after they lost their internal champion in March.
They opened 4 support tickets about the reporting feature.
Their contract renews in 60 days.”) and proactive health alerts when data patterns suggest risk.

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Memory (account analysis history), Redshift (usage and billing data), CRM API, support platform API

You need this if: Your customer success managers prepare for account reviews by manually checking 5+ systems, miss signals because data is fragmented, and write account summaries that are outdated by the time they finish.

#085 - Financial Close Reporting Agent

Pattern: Migration from RPA

Platform: AgentCore

Complexity: Foundation Build

Reference Architecture: J2

What the agent does:

Accelerates the financial close process by automating report generation and variance analysis.
Pulls trial balance data, subledger details, and intercompany transactions from the ERP.
Generates standard close reports: P&L by cost center, balance sheet reconciliation summaries, revenue waterfall, and expense variance analysis.
For each material variance, drills into the underlying transactions and drafts an explanation (“R&D expense is $180K over budget due to three unplanned contractor extensions approved in the last two weeks of the quarter”).
Operates in query mode for all financial outputs: generates the reports and explanations, but the controller reviews and approves before distribution.
Maintains an audit trail of every query, data source, and transformation.

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Policy (financial data access controls), ERP API, Redshift, S3 (report archive with audit trail), DynamoDB (variance tracking)

You need this if: Your financial close takes 10+ business days, analysts spend most of that time generating reports and writing variance explanations rather than analyzing results, and the CFO gets the final package too late to act on insights.

Data Pipeline Management and Operations

#086 - Pipeline Failure Diagnosis Agent

Pattern: New build

Platform: AgentCore

Complexity: Quick Win

Reference Architecture: L

What the agent does:

Monitors data pipelines for failures and diagnoses root causes automatically.
When a Glue job, MWAA DAG, or Step Functions workflow fails, the agent pulls the error logs, identifies the failure point, checks for common causes (schema changes in source systems, permission issues, resource exhaustion, data volume spikes, upstream pipeline delays), and classifies the failure as transient (retry), environmental (fix and retry), or structural (requires code change).
Uses AgentCore Code Interpreter to parse log files, run diagnostic queries against pipeline metadata, and profile data samples without requiring pre-built Lambda functions for each analysis type.
For transient failures, retries with appropriate backoff.
For environmental failures, applies pre-approved remediations from a bounded runbook (adjusting memory allocation within configured limits, refreshing credentials through the standard rotation path).
Each remediation runs with dry-run validation and rollback capability, and is audit-logged through the agent workflow, tool invocation records, CloudWatch metrics where available, and a dedicated remediation audit table.
Anything that changes infrastructure, credentials, permissions, or production data goes through approval workflows, with AgentCore Policy enforcing which tools and actions the agent is allowed to invoke.
For structural failures, files a ticket with the diagnosis and suggested fix.
Tracks failure patterns and identifies pipelines that need refactoring based on recurring failure modes.

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Code Interpreter, AgentCore Gateway (Glue API, MWAA API, Step Functions API), CloudWatch Logs, EventBridge (failure events), SNS (escalation), DynamoDB (failure pattern tracking)

You need this if: Your data engineering team starts every morning triaging pipeline failures from overnight runs, spends 30+ minutes diagnosing each failure, and 70% of failures follow patterns that could be diagnosed and remediated automatically.
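
The three-way classification in #086 and the retry schedule for transient failures can be sketched as follows. The keyword rules are hypothetical placeholders for whatever signals appear in your own logs, and a production agent would combine them with model-based diagnosis rather than rely on string matching alone.

```python
# Hypothetical keyword rules, checked in order. Structural causes win
# when a log contains several signals.
RULES = [
    ("structural", ("column not found", "schema mismatch", "syntaxerror")),
    ("environmental", ("outofmemory", "access denied", "expired credentials")),
    ("transient", ("timeout", "throttl", "connection reset")),
]


def classify_failure(log_text: str) -> str:
    text = log_text.lower()
    for category, needles in RULES:
        if any(needle in text for needle in needles):
            return category
    # Unknown failures are routed to a human via the ticket path.
    return "structural"


def backoff_delays(attempts: int, base: float = 2.0, cap: float = 60.0):
    """Exponential backoff delays in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

Defaulting unknown failures to the ticket path is a design choice in this sketch: the agent only retries or remediates when it recognizes the pattern.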

#087 - Schema Drift Detection and Impact Agent

Pattern: New build

Platform: AgentCore

Complexity: Strategic Bet

Reference Architecture: L

What the agent does:

Monitors source systems for schema changes that could break downstream pipelines and analytics.
Detects when a source table adds, removes, or renames columns, changes data types, or modifies constraints.
For each detected change, traces the impact through the data lineage graph: which ETL jobs read from this table, which warehouse tables depend on it, which dashboards and reports would be affected, and which metric definitions reference the changed columns.
Classifies each change by impact severity (breaking versus non-breaking) and urgency (pipeline runs in 2 hours versus next weekly refresh).
Notifies the relevant data engineers with a complete impact assessment and a suggested remediation plan.

AWS services: Bedrock (Claude), AgentCore Runtime, Glue Data Catalog (schema snapshots), EventBridge (schema change events), DynamoDB (lineage graph), SNS (notifications), S3 (schema history)

You need this if: Source system teams change schemas without notifying downstream consumers, your pipelines break silently and produce incorrect data before anyone notices, and schema migration coordination happens through Slack messages that get lost.
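
The two core steps of #087, diffing schema snapshots and walking the lineage graph, are small enough to sketch in Python. The snapshot format (column name to type) and the lineage format (node to list of direct consumers) are assumptions. A rename shows up here as a removal plus an addition, since a plain diff cannot tell them apart.

```python
from collections import deque


def diff_schema(old: dict, new: dict):
    """Return (change, column, is_breaking) tuples between two snapshots."""
    changes = []
    for col in sorted(old.keys() - new.keys()):
        changes.append(("removed", col, True))
    for col in sorted(new.keys() - old.keys()):
        changes.append(("added", col, False))
    for col in sorted(old.keys() & new.keys()):
        if old[col] != new[col]:
            changes.append(("type_changed", col, True))
    return changes


def downstream(lineage: dict, start: str):
    """Breadth-first list of everything that depends on `start`."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in visited:
                visited.add(child)
                order.append(child)
                queue.append(child)
    return order
```

Treating every removal and type change as breaking is a conservative simplification. A real classifier would also check whether any downstream job actually reads the affected column.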

#088 - Data Freshness and SLA Monitor Agent

Pattern: New build

Platform: Both (AgentCore backend + Quick dashboards)

Complexity: Quick Win

Reference Architecture: L + K

What the agent does:

Tracks data freshness across the warehouse and alerts stakeholders when data falls behind its expected refresh schedule.
Maintains an SLA registry: which tables should refresh hourly, daily, or weekly, and who depends on them.
When a table misses its SLA, traces the root cause upstream through the dependency chain to find the bottleneck (a source system delay, a slow ETL job, a queue backlog) and reports the issue with an estimated resolution time based on historical patterns.
Quick dashboards show a real-time view of data freshness across the warehouse, highlighting which business-critical datasets are current and which are stale.
Distinguishes between stale data that affects active decisions (a dashboard someone checks every morning) and stale data in rarely-accessed tables.

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Sight, Glue Data Catalog, CloudWatch (pipeline metrics), EventBridge (SLA timers), SNS (alerts), DynamoDB (SLA registry)

You need this if: Business users do not trust the data because they have been burned by stale numbers, your team has no systematic way to know whether the warehouse is current, and data freshness issues get discovered when someone reports a suspicious metric, not when the pipeline falls behind.
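
The SLA registry check at the heart of #088 reduces to comparing each table's last refresh against its expected cadence. A minimal sketch, with hypothetical table names and a registry held in plain dictionaries rather than DynamoDB:

```python
from datetime import datetime, timedelta

CADENCE = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def stale_tables(registry, last_refreshed, now):
    """Return {table: how far past its SLA it is} for late tables.

    registry:       {table: "hourly" | "daily" | "weekly"}
    last_refreshed: {table: datetime of last successful load}
    """
    late = {}
    for table, cadence in registry.items():
        refreshed = last_refreshed.get(table)
        if refreshed is None:
            late[table] = None  # never observed, so treat it as stale
            continue
        overdue = (now - refreshed) - CADENCE[cadence]
        if overdue > timedelta(0):
            late[table] = overdue
    return late
```

The agent's upstream tracing and impact ranking would start from this list of late tables.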

#089 - ETL Job Optimization Agent

Pattern: New build

Platform: AgentCore

Complexity: Strategic Bet

Reference Architecture: L

What the agent does:

Analyzes ETL job performance and recommends optimizations.
Profiles job execution history: runtime trends, resource utilization (memory, CPU, shuffle), data volume growth, and cost per run.
Identifies jobs that are slowing down (runtime increased 40% over the last month due to data growth), jobs that are over-provisioned (using 10 DPUs but only need 3), and jobs with inefficient patterns (full table scans where incremental loads would work, redundant transformations, unnecessary materializations).
Generates specific optimization recommendations with estimated impact: “Converting job X from full load to incremental based on the updated_at column would reduce runtime from 45 minutes to 8 minutes and cut cost by 82%.”
Complements Glue’s built-in generative AI assistance (which handles ETL code generation, Spark job modernization, and Spark troubleshooting from job metadata and logs) by adding cross-job portfolio analysis, cost modeling across the full pipeline fleet, historical trend analysis, and organization-specific optimization policies that Glue’s native AI does not cover.
Does not execute optimizations directly.
Produces recommendations for data engineers to review and implement.
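
The profiling arithmetic behind these findings is simple enough to sketch. The example below is an illustration under stated assumptions, not the agent's implementation: the job records, field names, and thresholds are hypothetical, and the cost estimate assumes cost scales linearly with runtime at a fixed DPU allocation.

```python
def runtime_growth(runtimes_min):
    """Fractional change between the first and last observed runtime."""
    return (runtimes_min[-1] - runtimes_min[0]) / runtimes_min[0]

def estimated_saving(current_min, projected_min):
    """Fractional cost reduction, assuming cost is proportional to runtime."""
    return (current_min - projected_min) / current_min

def flag_jobs(jobs, growth_threshold=0.40, utilization_threshold=0.5):
    """Return (job name, finding) pairs for slowing or over-provisioned jobs."""
    findings = []
    for job in jobs:
        if runtime_growth(job["runtimes_min"]) >= growth_threshold:
            findings.append((job["name"], "slowing_down"))
        if job["peak_dpus_used"] / job["allocated_dpus"] < utilization_threshold:
            findings.append((job["name"], "over_provisioned"))
    return findings

# Hypothetical execution history for two jobs.
jobs = [
    {"name": "orders_full_load", "runtimes_min": [30, 34, 39, 45],
     "allocated_dpus": 10, "peak_dpus_used": 9},
    {"name": "dim_customers", "runtimes_min": [20, 20, 21, 20],
     "allocated_dpus": 10, "peak_dpus_used": 3},
]
findings = flag_jobs(jobs)
saving_pct = round(estimated_saving(45, 8) * 100)
```

Under the linear-cost assumption, going from 45 minutes to 8 minutes is a reduction of 37/45, which is where a figure of roughly 82% comes from.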

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Code Interpreter, Glue API (job metrics and configuration), CloudWatch (performance data), S3 (optimization reports), Cost Explorer (job cost analysis)

You need this if: Your Glue or Spark jobs have grown organically without performance tuning, your data warehouse costs are climbing, and your data engineers lack time to profile and optimize because they are busy fighting pipeline fires.

#090 - Data Catalog Enrichment Agent

Pattern: New build

Platform: AgentCore

Complexity: Quick Win

Reference Architecture: L

What the agent does:

Improves data catalog quality by automatically generating and maintaining metadata that data engineers never have time to write.
Profiles tables to generate column descriptions based on data patterns, sample values, and naming conventions.
Infers business meaning by analyzing how columns are used in existing queries and dashboard definitions.
Tags tables and columns with business domains, sensitivity classifications, and ownership.
Identifies undocumented tables that are heavily queried (important but invisible) and documented tables that nobody queries (potentially stale).
Links related tables based on join patterns observed in query logs.
Produces a catalog coverage report showing which areas of the warehouse are well-documented and which are dark zones.
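
A coverage report of this kind reduces to a few set operations over catalog metadata and query counts. The following is a minimal sketch: the table records, the `queries_30d` field, and the threshold of 100 queries are assumptions for illustration, not values from Glue or DataZone.

```python
def coverage_report(tables, hot_query_threshold=100):
    """Summarize documentation coverage and highlight the two risky groups."""
    documented = [t for t in tables if t["description"]]
    return {
        "coverage": len(documented) / len(tables),
        # Heavily queried but undocumented: important but invisible.
        "important_but_invisible": sorted(
            t["name"] for t in tables
            if not t["description"] and t["queries_30d"] >= hot_query_threshold
        ),
        # Documented but never queried: potentially stale.
        "possibly_stale": sorted(
            t["name"] for t in documented if t["queries_30d"] == 0
        ),
    }

# Hypothetical catalog snapshot joined with 30-day query counts.
tables = [
    {"name": "sales.orders", "description": "One row per order", "queries_30d": 900},
    {"name": "sales.order_items_v2", "description": "", "queries_30d": 450},
    {"name": "tmp.export_2019", "description": "Legacy export", "queries_30d": 0},
    {"name": "ops.tickets", "description": "", "queries_30d": 12},
    {"name": "ops.sla_events", "description": "", "queries_30d": 0},
]
report = coverage_report(tables)
```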

AWS services: Bedrock (Claude), AgentCore Runtime, Glue Data Catalog, Amazon DataZone (business glossary and data product discovery), Athena (query log analysis), CloudWatch Logs (query patterns), DynamoDB (enrichment tracking)

You need this if: Your data catalog exists but coverage is below 30%, business users cannot find the data they need because descriptions are missing or outdated, and “ask the person who built the table” is the primary discovery mechanism.

Data Quality and Governance

#091 - Data Quality Monitoring and Remediation Agent

Pattern: New build

Platform: Both (AgentCore backend + Quick dashboards)

Complexity: Strategic Bet

Reference Architecture: L + K

What the agent does:

Continuously monitors data quality across the warehouse using a combination of deterministic rules and AI-detected anomalies.
Deterministic checks catch known issues: null rates exceeding thresholds, referential integrity violations, value range violations, uniqueness constraint breaches, and format inconsistencies.
The agent adds a reasoning layer that catches issues rules miss: detecting when a distribution shift in a column suggests a source system change, identifying when a sudden drop in row count correlates with a known upstream issue rather than a genuine business decline, and flagging when data patterns deviate from seasonal expectations.
For each quality issue, classifies severity, identifies the most likely root cause, and recommends a remediation path.
Quick dashboards provide a data quality scorecard across the warehouse.
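
The deterministic layer described above can be sketched with plain Python. The column names, the 5% null threshold, and the z-score comparison for row counts are illustrative assumptions; the reasoning layer (for example, deciding whether a row-count drop matches a known upstream issue) is deliberately out of scope here.

```python
import statistics

def null_rate(rows, column):
    """Share of rows where the column is missing."""
    return sum(1 for r in rows if r[column] is None) / len(rows)

def run_checks(rows, max_null_rate=0.05):
    """Run rule-based checks and return the names of the rules that failed."""
    issues = []
    if null_rate(rows, "customer_id") > max_null_rate:
        issues.append("null_rate:customer_id")
    order_ids = [r["order_id"] for r in rows]
    if len(order_ids) != len(set(order_ids)):
        issues.append("uniqueness:order_id")
    if any(r["amount"] is not None and r["amount"] < 0 for r in rows):
        issues.append("range:amount")
    return issues

def row_count_zscore(history, today):
    """How many standard deviations today's row count is from recent history."""
    return (today - statistics.mean(history)) / statistics.pstdev(history)

# Hypothetical sample batch with three deliberate defects.
rows = [
    {"order_id": 1, "customer_id": "c1", "amount": 20.0},
    {"order_id": 2, "customer_id": None, "amount": 35.5},
    {"order_id": 2, "customer_id": "c3", "amount": 12.0},
    {"order_id": 4, "customer_id": "c4", "amount": -5.0},
]
issues = run_checks(rows)
z = row_count_zscore([1000, 1020, 980, 1010, 990], today=600)
```

A strongly negative z-score only says the drop is unusual. Whether it reflects a pipeline fault or a genuine business decline is the judgment the card assigns to the agent.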

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Sight, Glue Data Catalog, Athena (quality queries), EventBridge (monitoring schedule), SNS (alerts), DynamoDB (quality metrics history)

You need this if: Data quality issues surface when a dashboard shows impossible numbers, your quality checks are limited to basic row counts and null checks, and nobody knows the overall health of the data warehouse until something breaks visibly.

#092 - PII Detection and Classification Agent

Pattern: New build

Platform: AgentCore

Complexity: Strategic Bet

Reference Architecture: L

What the agent does:

Scans the data warehouse and data lake for personally identifiable information that should not be there or should be masked.
Goes beyond pattern matching (regex for SSN formats, email patterns) by analyzing column content in context to detect PII that evades simple rules: a “notes” field containing customer phone numbers, a “description” column with embedded email addresses, or a “metadata” JSON field with names and addresses.
For warehouse-resident data, Redshift’s LLM invocation functions can call Bedrock models directly from SQL to classify column content at scale without extracting data from the warehouse.
For data lake scanning, the agent coordinates Macie and custom Bedrock calls across S3 buckets.
Classifies each finding by PII type (direct identifiers, quasi-identifiers, sensitive attributes) and regulatory relevance (GDPR personal data, HIPAA PHI, CCPA categories).
Generates a remediation report with specific recommendations: mask, tokenize, remove, or restrict access.
Tracks remediation progress and re-scans to verify fixes.
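
To make the baseline concrete, here is a sketch of the pattern-matching layer applied to free-text columns whose names give no hint of PII. It covers only the regex step that the card says the agent goes beyond; the patterns are simplified (US-style phone and SSN formats), and the sample rows and column names are hypothetical.

```python
import re

# Simplified patterns; real scanners use broader, locale-aware rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def scan_rows(rows):
    """Map each column name to the set of PII types found in its values."""
    found = {}
    for row in rows:
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # only free-text values are scanned in this sketch
            for pii_type, pattern in PATTERNS.items():
                if pattern.search(value):
                    found.setdefault(column, set()).add(pii_type)
    return found

# Hypothetical support-ticket rows sampled from the warehouse.
rows = [
    {"ticket_id": 1, "notes": "Customer asked for a callback at 555-867-5309",
     "description": "Login issue"},
    {"ticket_id": 2, "notes": "No contact details given",
     "description": "Forward invoice to jane.doe@example.com"},
]
pii_by_column = scan_rows(rows)
```

Names, addresses, and identifiers written in unusual formats would slip past these patterns, which is the gap the model-based classification is meant to close.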

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Policy (data access restrictions), Glue Data Catalog, Redshift (LLM invocation for in-warehouse scanning), Amazon Macie (S3 scanning), S3 (scan reports), DynamoDB (PII inventory)

You need this if: Your data governance team cannot confidently answer “where does PII live in our data warehouse?” and you have discovered PII in unexpected places during audits or incident investigations.

#093 - Data Lineage and Impact Analysis Agent

Pattern: New build

Platform: AgentCore

Complexity: Foundation Build

Reference Architecture: L

What the agent does:

Builds and maintains a data lineage graph that traces how data flows from source systems through transformations into the warehouse and out to dashboards and reports.
Parses ETL job definitions, SQL transformations, view definitions, and dashboard queries to construct the lineage automatically.
When someone asks “Where does the revenue number on the CEO dashboard come from?” the agent traces it back through every transformation to the source system tables.
When someone plans to modify a source table or ETL job, the agent provides a forward-looking impact analysis: every downstream table, dashboard, report, and metric definition that would be affected.
Keeps the lineage graph current by monitoring pipeline changes and re-parsing modified job definitions.
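
Both questions above, tracing a dashboard number back to its sources and listing everything a change would affect, are graph traversals in opposite directions. The sketch below uses an in-memory adjacency map with hypothetical table and dashboard names; a production graph would live in Neptune or DynamoDB and be built by parsing job definitions.

```python
from collections import deque

# Hypothetical lineage edges: upstream asset -> direct downstream assets.
EDGES = {
    "crm.opportunities": ["staging.opps_clean"],
    "billing.invoices": ["staging.invoices_clean"],
    "staging.opps_clean": ["marts.revenue"],
    "staging.invoices_clean": ["marts.revenue"],
    "marts.revenue": ["dash.ceo_revenue", "report.board_pack"],
}

def reachable(start, edges):
    """Breadth-first search returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

def invert(edges):
    """Flip edge direction so traversal walks upstream instead of downstream."""
    inverted = {}
    for source, targets in edges.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return inverted

# Impact analysis: what breaks if billing.invoices changes?
impact = reachable("billing.invoices", EDGES)
# Provenance: where does the CEO dashboard revenue number come from?
sources = reachable("dash.ceo_revenue", invert(EDGES))
```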

AWS services: Bedrock (Claude), AgentCore Runtime, Glue Data Catalog, Athena (query log parsing), Neptune or DynamoDB (lineage graph store), S3 (job definition archive)

You need this if: Nobody can explain how a number on a dashboard was derived, changes to upstream tables break downstream analytics unpredictably, and your data lineage documentation is either nonexistent or hopelessly outdated.

#094 - Access Control and Usage Auditor Agent

Pattern: New build

Platform: AgentCore

Complexity: Quick Win

Reference Architecture: L

What the agent does:

Monitors who accesses what data and whether their access patterns align with their role.
Analyzes query logs from Redshift, Athena, and other analytics tools to build user access profiles.
Identifies anomalies: a marketing analyst querying HR compensation tables, a former contractor’s credentials still accessing production data, an account running significantly more queries than its historical baseline.
Cross-references actual access patterns against defined access policies and flags misalignments in both directions: users accessing data they should not (security risk) and users lacking access to data they need for their role (productivity blocker).
AgentCore Policy, which compiles natural language policy definitions into Cedar authorization rules, can enforce data access boundaries for the agent itself, ensuring the auditor agent does not access data outside its audit scope.
Produces audit-support reports for SOC 2, HIPAA, and GDPR reviews.
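
Two of the checks above, out-of-policy access and query volume far above baseline, can be sketched as follows. The log format, the schema-level policy map, the user names, and the spike factor of 3 are assumptions for illustration; detecting users who lack access they need would require denied-request logs and is not covered here.

```python
from collections import Counter

def audit_access(access_log, policy, baseline_daily, spike_factor=3.0):
    """Return sorted (user, finding, detail) tuples for one day of query logs."""
    findings = set()
    query_counts = Counter(user for user, _ in access_log)
    for user, table in access_log:
        schema = table.split(".", 1)[0]
        # Users absent from the policy map are treated as having no grants.
        if schema not in policy.get(user, set()):
            findings.add((user, "out_of_policy", table))
    for user, count in query_counts.items():
        expected = baseline_daily.get(user, 0)
        if expected and count > spike_factor * expected:
            findings.add((user, "volume_spike", f"{count} vs baseline {expected}"))
    return sorted(findings)

# Hypothetical policy (allowed schemas per user) and daily query baselines.
policy = {"ana": {"marketing"}, "raj": {"finance", "marketing"}}
baseline_daily = {"ana": 2, "raj": 1}
access_log = [
    ("ana", "marketing.campaigns"),
    ("ana", "hr.compensation"),
    ("raj", "finance.ledger"),
    ("raj", "finance.ledger"),
    ("raj", "finance.ledger"),
    ("raj", "finance.ledger"),
    ("lee", "sales.accounts"),  # e.g. a former contractor with no policy entry
]
audit_findings = audit_access(access_log, policy, baseline_daily)
```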

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Policy (Cedar-based access rules), CloudTrail (API access logs), Redshift audit logs, Athena query logs, AWS Lake Formation (fine-grained access control), Amazon DataZone (governance context), DynamoDB (access profiles), S3 (audit reports)

You need this if: Your last audit required weeks of manual log analysis to demonstrate who accessed what data, you suspect access permissions have drifted from policy, and you lack visibility into actual data usage patterns.

Predictive Analytics and Forecasting

#095 - Demand Forecasting Explanation Agent

Pattern: New build

Platform: Both (AgentCore backend + Quick dashboards)

Complexity: Strategic Bet

Reference Architecture: J + K

What the agent does:

Wraps your existing demand forecasting models with an explanation and interaction layer.
Business users ask questions about the forecast in natural language: “Why is the Q3 forecast for Product X 20% higher than Q3 last year?” and the agent decomposes the forecast drivers.
Pulls feature importances from the underlying ML model, correlates forecast changes with specific input variables (promotional calendar, pricing changes, seasonal patterns, leading indicators), and generates human-readable explanations.
Lets planners run what-if scenarios: “What happens to the forecast if we move the promotion from July to August?” by adjusting model inputs and showing the resulting change.
Quick dashboards show forecast accuracy over time, bias trends, and confidence intervals by product and region.
For teams without a separate ML platform, Redshift ML provides an alternative path for simpler forecasting tasks: build and deploy predictive models using SQL within Redshift, and the agent wraps those Redshift-native models with the same explanation and scenario layer.
Whether Redshift ML is the right fit depends on the complexity of the forecasting task and on how the data is set up.
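
Driver decomposition and what-if analysis are easiest to see with a deliberately simple stand-in model. The sketch below assumes a linear forecast with made-up coefficients and inputs; real forecasting models are rarely linear, and the explanation layer would pull attributions from whatever the underlying model exposes. The what-if here changes the number of promotion weeks rather than promotion timing, which keeps the example small.

```python
# Hypothetical linear demand model: units = intercept + sum(coef * input).
COEFFICIENTS = {"promo_weeks": 120.0, "unit_price": -40.0, "seasonal_index": 500.0}
INTERCEPT = 1000.0

def forecast(inputs):
    """Forecast units for one period from the model inputs."""
    return INTERCEPT + sum(COEFFICIENTS[k] * inputs[k] for k in COEFFICIENTS)

def explain_change(before, after):
    """Attribute the forecast difference to each input (exact for linear models)."""
    return {k: COEFFICIENTS[k] * (after[k] - before[k]) for k in COEFFICIENTS}

q3_last_year = {"promo_weeks": 2, "unit_price": 10.0, "seasonal_index": 1.0}
q3_this_year = {"promo_weeks": 4, "unit_price": 9.0, "seasonal_index": 1.0}

# "Why is this year's Q3 forecast higher?" decomposed by driver.
drivers = explain_change(q3_last_year, q3_this_year)

# What-if: one fewer promotion week this year.
what_if = dict(q3_this_year, promo_weeks=3)
what_if_delta = forecast(what_if) - forecast(q3_this_year)
```

In this toy case the driver contributions sum exactly to the forecast difference, which is the property that makes the explanation easy for a planner to check.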

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Suite (Quick Sight + Quick Research), SageMaker or Redshift ML (forecast models), Redshift (historical data), S3 (forecast outputs)

You need this if: Your demand planning team does not trust the ML forecasts because they cannot understand why the model predicts what it predicts, and planners override the model 40%+ of the time based on gut feel rather than data.

#096 - Churn Prediction Insight Agent

Pattern: New build

Platform: Both (AgentCore backend + Quick dashboards)

Complexity: Strategic Bet

Reference Architecture: J + K

What the agent does:

Transforms churn prediction model outputs into actionable intelligence for customer success teams.
Instead of a raw churn probability score, delivers a contextualized assessment for each at-risk account: what changed in their behavior, which risk factors are driving the score, how this account compares to similar accounts that churned or retained, and what intervention has worked for accounts with a similar risk profile.
Generates weekly at-risk account lists prioritized by revenue impact and the likelihood that an intervention will succeed.
Tracks whether recommended interventions actually reduced churn, feeding effectiveness data back to improve future recommendations.
Quick dashboards show churn risk distribution, cohort-level trends, and intervention effectiveness metrics.
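
One plausible way to combine revenue impact with intervention likelihood is an expected-value score. This is a sketch under assumptions, not the method the pattern prescribes: the account data is invented, and `save_prob` (the estimated chance that an intervention retains the account) is a hypothetical field that would come from tracked intervention outcomes.

```python
def prioritize(accounts):
    """Rank accounts by expected revenue saved: ARR x churn risk x save chance."""
    scored = []
    for account in accounts:
        expected_saved = account["arr"] * account["churn_prob"] * account["save_prob"]
        scored.append((expected_saved, account["name"]))
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical at-risk accounts with model scores.
accounts = [
    {"name": "Initech", "arr": 500_000, "churn_prob": 0.1, "save_prob": 0.5},
    {"name": "Acme", "arr": 200_000, "churn_prob": 0.6, "save_prob": 0.5},
    {"name": "Globex", "arr": 50_000, "churn_prob": 0.9, "save_prob": 0.8},
]
weekly_list = prioritize(accounts)
```

The largest account does not top the list here because its churn risk is low, which illustrates why a raw probability or a raw revenue ranking alone can misdirect a customer success team.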

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Suite (Quick Sight + Quick Research), SageMaker (churn model), Redshift (customer data), CRM API, DynamoDB (intervention tracking)

You need this if: Your churn model produces scores that the customer success team ignores because the scores lack context, and your team cannot answer “why is this customer at risk?” without manual investigation.

#097 - Revenue Forecasting and Scenario Agent

Pattern: New build

Platform: AgentCore

Complexity: Foundation Build

Reference Architecture: J2

What the agent does:

Helps finance and revenue operations teams build and validate revenue forecasts through conversational interaction.
Pulls pipeline data from the CRM, historical close rates by segment and stage, seasonal patterns, and known future events (pricing changes, product launches, contract renewals).
Generates a bottoms-up forecast with confidence intervals and decomposes it by segment, product, and rep.
Supports scenario analysis: “What if close rates drop 10% due to the macro environment?” or “What if we accelerate the enterprise pipeline by 30 days?”
Compares the model’s forecast against rep-submitted forecasts and identifies where the biggest gaps exist between human judgment and data-driven prediction.
Operates in query mode for all financial outputs.
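
The core of a bottoms-up forecast is a weighted sum over open deals, and a close-rate scenario is the same sum with adjusted weights. The sketch below uses hypothetical deals and stage close rates, interprets “close rates drop 10%” as a relative drop, and leaves out confidence intervals entirely.

```python
def bottoms_up(deals, close_rates, rate_multiplier=1.0):
    """Expected revenue per segment: amount x historical close rate for the stage."""
    by_segment = {}
    for deal in deals:
        expected = deal["amount"] * close_rates[deal["stage"]] * rate_multiplier
        by_segment[deal["segment"]] = by_segment.get(deal["segment"], 0.0) + expected
    return by_segment

# Hypothetical pipeline snapshot and historical close rates by stage.
close_rates = {"proposal": 0.4, "negotiation": 0.7}
deals = [
    {"segment": "enterprise", "stage": "proposal", "amount": 200_000},
    {"segment": "enterprise", "stage": "negotiation", "amount": 100_000},
    {"segment": "smb", "stage": "negotiation", "amount": 50_000},
]

base = bottoms_up(deals, close_rates)
# Scenario: close rates drop 10% (relative) across all stages.
scenario = bottoms_up(deals, close_rates, rate_multiplier=0.9)
base_total = sum(base.values())
scenario_total = sum(scenario.values())
```

In a production setup this aggregation would be expressed as SQL and run in the warehouse, consistent with the query-mode requirement for financial outputs.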

AWS services: Bedrock (Claude), AgentCore Runtime, AgentCore Policy (financial data access), CRM API (pipeline data), Redshift (historical revenue), S3 (scenario outputs), DynamoDB (forecast versions)

You need this if: Your revenue forecast relies primarily on rep gut feel, accuracy is below 85%, and your FP&A team spends days building scenario models in spreadsheets that are stale by the time they present to the board.

Embedded Analytics and Decision Support

#098 - Product Analytics Question Agent

Pattern: New build

Platform: AgentCore

Complexity: Quick Win

Reference Architecture: J

What the agent does:

Gives product managers direct access to product analytics without depending on the data team.
Answers questions about feature usage, user flows, conversion funnels, and engagement metrics through natural language: “What percentage of users who start the onboarding wizard complete it within 7 days?” or “Show me the adoption curve for feature X by plan tier.”
Understands the product’s event taxonomy and translates business questions into the correct event-based queries.
Handles the time-series complexity that makes product analytics SQL hard: session stitching, funnel ordering, retention windows, and cohort boundaries.
Generates the SQL for analyst review on complex queries and returns answers directly for standard metric lookups.
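
The onboarding question quoted above shows why event-based SQL is harder than it looks: the query has to pair each user's first start with a later completion and apply a time window. The sketch below runs against an in-memory SQLite database as a stand-in for Redshift or Athena. The `events` table, the event names, and the sample rows are assumptions, and `julianday` is SQLite-specific, so the date arithmetic would differ in a real warehouse.

```python
import sqlite3

FUNNEL_SQL = """
WITH starts AS (
    SELECT user_id, MIN(event_ts) AS started_at
    FROM events WHERE event_name = 'onboarding_started'
    GROUP BY user_id
),
completions AS (
    SELECT user_id, MIN(event_ts) AS completed_at
    FROM events WHERE event_name = 'onboarding_completed'
    GROUP BY user_id
)
SELECT 100.0 * SUM(
    CASE WHEN c.completed_at >= s.started_at
          AND julianday(c.completed_at) - julianday(s.started_at) <= 7
         THEN 1 ELSE 0 END
) / COUNT(*)
FROM starts s LEFT JOIN completions c ON c.user_id = s.user_id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, event_name TEXT, event_ts TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    ("u1", "onboarding_started", "2026-01-01 09:00:00"),
    ("u1", "onboarding_completed", "2026-01-03 09:00:00"),   # within 7 days
    ("u2", "onboarding_started", "2026-01-01 09:00:00"),
    ("u2", "onboarding_completed", "2026-01-12 09:00:00"),   # too late
    ("u3", "onboarding_started", "2026-01-02 09:00:00"),     # never completed
    ("u4", "onboarding_started", "2026-01-05 09:00:00"),
    ("u4", "onboarding_completed", "2026-01-05 10:30:00"),   # same day
])
completion_rate = conn.execute(FUNNEL_SQL).fetchone()[0]
```

Choices such as using the first start per user, or counting users who never completed in the denominator, are exactly the definitions an analyst would want to see in the generated SQL before trusting the number.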

AWS services: Bedrock (Claude), AgentCore Runtime, Redshift/Athena (event data), Glue Data Catalog (event taxonomy), AgentCore Memory (analysis context), S3 (query pattern library)

You need this if: Your product managers file analytics tickets that take days to fulfill, make decisions without data because the wait is too long, and your analytics team is a bottleneck despite strong SQL skills because demand outstrips capacity.

#099 - Marketing Attribution Analysis Agent

Pattern: New build

Platform: AgentCore (multi-agent) + Quick dashboards

Complexity: Strategic Bet

Reference Architecture: M + K

What the agent does:

Makes marketing attribution data accessible and actionable for marketing teams.
A coordinator agent receives questions like “Which channels drove the most pipeline for Enterprise accounts this quarter?” and decomposes them across specialist agents.
A data collection agent queries marketing platforms (ad spend, impressions, clicks), web analytics (sessions, conversions), and the CRM (pipeline, closed-won).
A modeling agent runs multi-touch attribution calculations across the collected data, supporting multiple models (first touch, last touch, linear, time-decay, position-based).
A synthesis agent reconciles marketing platform data (which over-counts) against CRM-confirmed pipeline (which under-counts) and explains how the answer changes depending on the model chosen.
The multi-agent approach keeps each specialist focused on its data source and methodology rather than one agent juggling 6+ platform APIs and complex attribution math in a single context window.
Quick dashboards show channel performance trends, campaign ROI, and attribution model comparisons.
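
The five attribution models named above differ only in how they weight touches along a conversion path. The sketch below shows one common formulation of each for a single path. The 40/20/40 split for position-based and the 7-day half-life for time-decay are conventional defaults chosen for illustration, and the touch data is hypothetical.

```python
def attribute(touches, model, half_life_days=7.0):
    """Split one conversion's credit across channels.

    touches: list of (channel, days_before_conversion), oldest first.
    """
    n = len(touches)
    if model == "first_touch":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "last_touch":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "linear":
        weights = [1.0 / n] * n
    elif model == "time_decay":
        raw = [0.5 ** (days / half_life_days) for _, days in touches]
        weights = [r / sum(raw) for r in raw]
    elif model == "position_based":
        if n <= 2:
            weights = [1.0 / n] * n
        else:
            weights = [0.4] + [0.2 / (n - 2)] * (n - 2) + [0.4]
    else:
        raise ValueError(f"unknown model: {model}")
    credit = {}
    for (channel, _), weight in zip(touches, weights):
        credit[channel] = credit.get(channel, 0.0) + weight
    return credit

# Hypothetical path to one closed-won deal.
path = [("paid_search", 21), ("webinar", 14), ("email", 7), ("sales_call", 0)]
MODELS = ["first_touch", "last_touch", "linear", "time_decay", "position_based"]
results = {model: attribute(path, model) for model in MODELS}
```

Comparing `results` across models for the same path makes the synthesis agent's job concrete: paid search gets all the credit under first touch and very little under time-decay, so any channel ranking has to be reported together with the model that produced it.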

AWS services: Bedrock (Claude), AgentCore Runtime (multi-agent), AgentCore Memory (analysis context), Amazon Quick Suite (Quick Sight + Quick Research), Redshift (marketing and pipeline data), CRM API, marketing platform APIs, S3 (attribution model outputs)

You need this if: Your marketing team debates attribution methodology instead of optimizing spend, reporting takes a week because the data lives in 5 different platforms, and nobody trusts the numbers because different tools show different results.

#100 - Operational Capacity Planning Agent

Pattern: New build

Platform: Both (AgentCore backend + Quick dashboards)

Complexity: Strategic Bet

Reference Architecture: J + K

What the agent does:

Helps operations leaders plan staffing and resource allocation based on data-driven demand signals.
Analyzes historical patterns in ticket volume, order throughput, call volume, or transaction load to project future demand by hour, day, and week.
Correlates demand patterns with business events (product launches, marketing campaigns, seasonal peaks, day-of-week effects) and external factors (holidays, weather for logistics operations).
Generates staffing recommendations: how many contact-center agents, warehouse workers, or support reps are needed per shift to meet SLA targets.
Runs what-if scenarios: “What happens to wait times if we reduce the team by 2 FTEs?” or “How many seasonal hires do we need for the holiday peak?”
Quick dashboards show capacity utilization, SLA performance, and demand forecast accuracy.
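
A first-order staffing estimate follows from workload arithmetic: contacts per hour times handle time gives the hours of work arriving each hour, and a target occupancy converts that into headcount. The sketch below is a simplification with hypothetical volumes and a 75% occupancy target. It does not model queueing or wait times, which would need something like an Erlang C calculation.

```python
import math

def required_staff(contacts_per_hour, handle_minutes, target_occupancy=0.75):
    """Headcount needed so staff are busy no more than the target share of time."""
    workload_hours = contacts_per_hour * handle_minutes / 60
    return math.ceil(workload_hours / target_occupancy)

def implied_occupancy(contacts_per_hour, handle_minutes, staff):
    """Share of time staff would be busy at a given headcount."""
    return contacts_per_hour * handle_minutes / 60 / staff

# Hypothetical support queue: 120 contacts per hour, 6 minutes each.
normal_shift = required_staff(120, 6)
holiday_peak = required_staff(200, 6)

# What-if: run the normal shift with 2 fewer people.
reduced_occupancy = implied_occupancy(120, 6, normal_shift - 2)
```

The what-if shows occupancy rising above the target when two people are removed; translating that into wait-time impact is where a proper queueing model or historical SLA data comes in.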

AWS services: Bedrock (Claude), AgentCore Runtime, Amazon Quick Suite (Quick Sight + Quick Research), Redshift (operational data), EventBridge (demand signals), S3 (capacity models)

You need this if: Your capacity planning relies on last year’s numbers plus a growth factor, you regularly under-staff during unexpected peaks and over-staff during slow periods, and the operations team cannot model the impact of business decisions on staffing needs.

What These 25 Patterns Reveal

Different dynamics emerge when agents work with data instead of documents or processes.

Quick is the strongest AWS-native default for dashboards, business-user analysis, research, and automation.

AgentCore is the better fit when the workload needs custom tools, multi-step orchestration, policy-bounded actions, cross-system workflows, or code execution outside the BI workspace.

For organizations starting with analytics agents, Quick provides the fastest path to value with the least custom development.

The metadata catalog is necessary but not sufficient.

Eight patterns depend directly on the Glue Data Catalog or equivalent metadata layer.

An analytics agent that does not understand your data model hallucinates column names, joins tables incorrectly, and generates plausible-looking wrong answers.

But schema metadata alone does not prevent the agent from computing a metric incorrectly.

Production analytics agents need a semantic layer on top of the catalog: governed metric definitions, business glossary, certified dimensions, join rules, data quality status, and access-control context.

Amazon DataZone handles cataloging, governance, business glossary, and data product discovery. Pattern #078 (Metric Definition and Discovery) and Pattern #090 (Data Catalog Enrichment) exist specifically because these prerequisites are rarely met.

Query mode is the safe default for analytics. Unlike customer support agents where a wrong answer triggers immediate feedback (the customer says “that’s wrong”), analytics agents produce outputs that travel through the organization unchallenged. Generating the SQL query and showing it alongside results gives analysts a verification step that catches errors before they propagate.

Seven patterns explicitly recommend starting in query mode and graduating to answer mode after proving accuracy.

Pipeline agents and analytics agents are complementary.

The pipeline management patterns keep the data infrastructure healthy so the analytics patterns have reliable data to work with.

Organizations that build analytics agents without pipeline monitoring discover that their elegant natural-language-to-SQL agent returns stale data because nobody knew the pipeline broke at 3am.

Foundation models handle the translation layer, not the computation.

Every analytics pattern uses Bedrock for natural language understanding and SQL generation, not for performing calculations.

The warehouse does the math.

The model translates intent into queries and results into explanations.

This distinction matters for accuracy: letting the model compute revenue by summing numbers it read from a table produces unreliable results.

Generating a SQL SUM() query and letting Redshift compute the answer produces exact results.
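
The split of responsibilities can be shown with a small sketch in which the database computes the total and the response carries the SQL alongside the result, as query mode requires. SQLite stands in for Redshift here, the `orders` table and its rows are invented, and the hard-coded `generated_sql` string stands in for model output.

```python
import sqlite3

def answer_in_query_mode(conn, sql):
    """Run generated SQL in the database and return it with the result."""
    value = conn.execute(sql).fetchone()[0]
    return {"sql": sql, "result": value}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EMEA", 1999), ("EMEA", 250050), ("AMER", 99999)],
)

# Stand-in for SQL a model generated from "What was EMEA revenue?"
generated_sql = "SELECT SUM(amount_cents) FROM orders WHERE region = 'EMEA'"
response = answer_in_query_mode(conn, generated_sql)
```

The model never sees or adds the individual amounts. An analyst reviewing `response["sql"]` can verify the filter and the aggregation, while the arithmetic itself is done by the database engine.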

The build-versus-buy spectrum is wider for analytics agents than any other category in this series.

Bedrock Knowledge Bases structured retrieval handles simple natural-language-to-SQL against Redshift without custom code.

Amazon Q generative SQL in the Redshift Query Editor provides the same capability for analysts already in that environment. SageMaker Unified Studio’s Data Agent covers SQL generation within notebooks.

Quick Suite handles visualization, research, executive summaries, and many business-user analysis workflows natively.

A governed semantic layer handles certified metric retrieval deterministically. AgentCore earns its place when you need multi-turn reasoning, cross-source orchestration, custom business logic, or channel integration beyond the query editor.

The most common mistake is building a full AgentCore agent for a workload that Bedrock Knowledge Bases, Quick, or a governed semantic layer can handle with less custom code.

Self-service does not mean no governance.

Four patterns (#091 through #094) address data quality, PII detection, lineage, and access control. Self-service analytics without governance creates a new problem: more people accessing data means more opportunities for misinterpretation, unauthorized access, and compliance violations.

The governance agents in this edition are the guardrails that make self-service safe.

What Comes Next

One more edition to close the series:

Edition 5 - Compliance, security, and governance agents (high-stakes environments with strict audit and control requirements)

If you are building data and analytics agents, start with #076 (Natural Language to SQL) or #078 (Metric Definition and Discovery). Natural Language to SQL delivers the most visible impact because it directly addresses the “analyst backlog” problem every data team faces.

Metric Definition and Discovery is lower effort and solves the trust problem that undermines every other analytics investment.

Both require a well-maintained metadata catalog, which is why #090 (Data Catalog Enrichment) makes a strong parallel workstream.
