Skip to content

Conversation

@shruthilayaj
Copy link
Member

@shruthilayaj shruthilayaj commented Nov 27, 2025

Parallelize the stats query by explicitly passing a list of attributes to fetch.
Hoping to make the stats and ranked endpoints faster this way.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Nov 27, 2025
@shruthilayaj shruthilayaj marked this pull request as ready for review December 3, 2025 20:15
@shruthilayaj shruthilayaj requested review from a team as code owners December 3, 2025 20:15
@shruthilayaj shruthilayaj force-pushed the shruthi/feat/parallelize-trace-item-stats branch from 5a5b131 to 02db9ee Compare December 3, 2025 20:16
attrs_response = snuba_rpc.attribute_names_rpc(attrs_request)

# Chunk attributes and run stats query in parallel
chunked_attributes: dict[int, list[AttributeKey]] = defaultdict(list[AttributeKey])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: defaultdict is incorrectly instantiated with list[AttributeKey] (a type hint) instead of a callable list, causing a TypeError on first access.
Severity: CRITICAL | Confidence: High

🔍 Detailed Analysis

The defaultdict constructor is incorrectly instantiated with list[AttributeKey]. This is a type hint (a types.GenericAlias object in Python 3.9+), not a callable function. When a missing key is accessed in chunked_attributes, defaultdict attempts to call list[AttributeKey](), leading to a TypeError: 'types.GenericAlias' object is not callable. This will cause an immediate server crash in OrganizationTraceItemsStatsEndpoint.get() and OrganizationTraceItemsAttributesRankedEndpoint.get() when processing attribute distributions.

💡 Suggested Fix

Change defaultdict(list[AttributeKey]) to defaultdict(list). The type hint dict[int, list[AttributeKey]] is correct and should remain.

🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: src/sentry/api/endpoints/organization_trace_item_stats.py#L97

Potential issue: The `defaultdict` constructor is incorrectly instantiated with
`list[AttributeKey]`. This is a type hint (a `types.GenericAlias` object in Python
3.9+), not a callable function. When a missing key is accessed in `chunked_attributes`,
`defaultdict` attempts to call `list[AttributeKey]()`, leading to a `TypeError:
'types.GenericAlias' object is not callable`. This will cause an immediate server crash
in `OrganizationTraceItemsStatsEndpoint.get()` and
`OrganizationTraceItemsAttributesRankedEndpoint.get()` when processing attribute
distributions.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 5300574

@shruthilayaj shruthilayaj force-pushed the shruthi/feat/parallelize-trace-item-stats branch from 0242c8b to cee4d15 Compare December 3, 2025 20:23
Comment on lines +131 to +133
for stats in result:
for stats_type, data in stats.items():
stats_results[stats_type]["data"].update(data["data"])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The organization_trace_item_stats.py endpoint's response format changed, breaking API compatibility.
Severity: CRITICAL | Confidence: High

🔍 Detailed Analysis

The organization_trace_item_stats.py endpoint's response format has changed from a dictionary structure, {"data": {stat_type: {...}}}, to a list of single-key dictionaries, {"data": [{stat_type: {...}}, ...]}. This transformation is a breaking API change that will cause clients expecting the original dictionary format to fail.

💡 Suggested Fix

Revert the response format in organization_trace_item_stats.py line 135 to return Response({"data": stats_results}) instead of Response({"data": [{k: v} for k, v in stats_results.items()]}).

🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: src/sentry/api/endpoints/organization_trace_item_stats.py#L131-L133

Potential issue: The `organization_trace_item_stats.py` endpoint's response format has
changed from a dictionary structure, `{"data": {stat_type: {...}}}`, to a list of
single-key dictionaries, `{"data": [{stat_type: {...}}, ...]}`. This transformation is a
breaking API change that will cause clients expecting the original dictionary format to
fail.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 5302423

meta=attrs_meta,
limit=max_attributes,
type=attr_type,
intersecting_attributes_filter=cohort_2,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Suspect cohort attributes missed due to baseline-only attribute filtering

The TraceItemAttributeNamesRequest uses intersecting_attributes_filter=cohort_2 (the baseline cohort), which means only attributes present in the baseline are fetched and analyzed. These attributes are then used to query both the suspect cohort (cohort_1) and baseline cohort (cohort_2). Attributes unique to the suspect cohort are completely missed, which defeats the purpose of comparing cohorts to find differentiating attributes. The old code didn't pre-filter attributes, so it analyzed all attributes from both cohorts independently.

Fix in Cursor Fix in Web

Copy link
Contributor

@Abdkhan14 Abdkhan14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, I would look at that seer comment to make sure we don't break the FE

@shruthilayaj
Copy link
Member Author

shruthilayaj commented Dec 3, 2025

Lgtm, I would look at that seer comment to make sure we don't break the FE

stats results would have returned list[dict[str, Any]] originally

@shruthilayaj shruthilayaj merged commit a0836a8 into master Dec 3, 2025
68 checks passed
@shruthilayaj shruthilayaj deleted the shruthi/feat/parallelize-trace-item-stats branch December 3, 2025 20:53
@github-actions github-actions bot locked and limited conversation to collaborators Dec 19, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants