Skip to content

ref(relay): Stop sending unnecessary fields in project config#45124

Merged
jjbayer merged 25 commits intomasterfrom
test/validate-project-config
Mar 6, 2023
Merged

ref(relay): Stop sending unnecessary fields in project config#45124
jjbayer merged 25 commits intomasterfrom
test/validate-project-config

Conversation

@jjbayer
Copy link
Member

@jjbayer jjbayer commented Feb 27, 2023

Make sure that the project configs that Sentry generates are consistent with what Relay serializes for downstream usage:

  • Sampling rules in Relay do not have an active field, so stop sending it.
  • glob sampling rules do not have options, so stop sending them.
  • Sentry unnecessarily sends some empty/default project config fields that Relay can easily derive. Stop sending them.

Why?

  • feat(py): Expose validation function for project configs relay#1860 introduced a validation function that can be used Sentry-side to check if Sentry accidentally produces any fields that Relay cannot use. This has already surfaced some inconsistencies, but it requires that Sentry and Relay produce the same fields.
  • Omitting default values reduces the payload size both in redis and on the wire.

Note: This PR will fail a lot of tests until getsentry/relay#1887 has been merged and released to the Python library.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Feb 27, 2023
@jjbayer jjbayer changed the title Test/validate project config ref(relay): Stop sending unnecessary fields in project config Feb 28, 2023
jjbayer added a commit to getsentry/relay that referenced this pull request Mar 1, 2023
Some fields that are not being sent by Sentry when they are empty are
serialized by Relay. Align with Sentry and do not serialize these fields
if they have default values.

Reasons:

* The main reason is that #1860
introduced a validation function that can be used Sentry-side to check
if Sentry accidentally produces any fields that Relay cannot use. This
has already surfaced some inconsistencies (see
getsentry/sentry#45124), but it requires that
Sentry and Relay produce the same fields.
* Reduced payload size when sending configs to downstream Relays.
@jjbayer jjbayer changed the base branch from master to chore/relay-bump March 2, 2023 10:40
return []
else:
for rule in rules:
rule.pop("active")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relay's SamplingRule has no such field.

),
"end": datetime.utcfromtimestamp(
boosted_release.timestamp + boosted_release.platform.time_to_adoption
).strftime(self.date_format),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not necessary functionally, but it makes sure that Relay's serialized project config look the same as Sentry's serialized project config.

@codecov
Copy link

codecov bot commented Mar 2, 2023

Codecov Report

Merging #45124 (7645238) into master (f706a95) will decrease coverage by 0.07%.
The diff coverage is 92.85%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #45124      +/-   ##
==========================================
- Coverage   80.22%   80.15%   -0.07%     
==========================================
  Files        4724     4725       +1     
  Lines      198815   198865      +50     
  Branches    12006    12006              
==========================================
- Hits       159506   159408      -98     
- Misses      39047    39195     +148     
  Partials      262      262              
Impacted Files Coverage Δ
...c_sampling/rules/biases/boost_environments_bias.py 100.00% <ø> (ø)
...mpling/rules/biases/boost_key_transactions_bias.py 100.00% <ø> (ø)
...sampling/rules/biases/ignore_health_checks_bias.py 100.00% <ø> (ø)
...ntry/dynamic_sampling/rules/biases/uniform_bias.py 100.00% <ø> (ø)
src/sentry/relay/config/__init__.py 97.51% <85.71%> (-1.20%) ⬇️
src/sentry/dynamic_sampling/rules/base.py 100.00% <100.00%> (ø)
...ampling/rules/biases/boost_latest_releases_bias.py 100.00% <100.00%> (ø)
src/sentry/dynamic_sampling/rules/utils.py 100.00% <100.00%> (ø)
src/sentry/features/exceptions.py 60.00% <0.00%> (-40.00%) ⬇️
src/sentry/lang/native/processing.py 55.20% <0.00%> (-28.13%) ⬇️
... and 65 more

Base automatically changed from chore/relay-bump to master March 2, 2023 11:35
@jjbayer jjbayer marked this pull request as ready for review March 2, 2023 11:57
@jjbayer jjbayer requested review from a team as code owners March 2, 2023 11:57
@jjbayer jjbayer requested a review from a team March 2, 2023 11:57
@jjbayer jjbayer removed the request for review from a team March 2, 2023 14:07
Copy link
Contributor

@olksdr olksdr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm


class BoostLatestReleasesRulesGenerator(BiasRulesGenerator):

date_format = "%Y-%m-%dT%H:%M:%SZ"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
date_format = "%Y-%m-%dT%H:%M:%SZ"
datetime_format = "%Y-%m-%dT%H:%M:%SZ"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Comment on lines +106 to +107
class Rule(ActiveRule):
active: bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like this change. I understand the practicality, but this says a rule is a type of an active rule. Same with ActiveDecayingRule below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't know how to change this, except maybe by improving the names. This type's fields are a superset of the other type's, which is solved by inheritance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming here confuses me, especially considering we have inheritance in place. I would do something like this:

class Rule(TypedDict):
    samplingValue: SamplingValue
    type: str
    condition: Condition
    id: int

class StatefulRule(Rule):
   active: bool

What do you think?

Comment on lines +599 to +600
# Sort extractMetrics to be consistent with Relay serialization:
metrics.sort()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm not sure if this is a good change. I don't think having the same order as what relay serializes is important; and if this is for tests, that's something tests should care about (not prod).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, moved the sorting to tests now.

Comment on lines +71 to +73
else:
for rule in rules:
rule.pop("active") # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

active should just not be added to rules, instead of generating it and removing it later. I understand, however, that this may be out of the scope of your PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I found it hard to follow in what places it was still needed, so I went for this hack instead. @iambriccardo maybe you can improve on this in a follow-up PR if you have time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside rules/biases/ you can find all the rules formats, you can easily remove active from them. To my understanding of Relay sampling, the active field is unused.

Copy link
Contributor

@iambriccardo iambriccardo Mar 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# src/sentry/dynamic_sampling/rules/biases/boost_latest_releases_bias.py
class BoostLatestReleasesRulesGenerator(BiasRulesGenerator):
    def _generate_bias_rules(self, bias_data: BiasData) -> List[PolymorphicRule]:
        boosted_releases = bias_data["boostedReleases"]

        return cast(
            List[PolymorphicRule],
            [
                {
                    "samplingValue": {
                        "type": "factor",
                        "value": bias_data["factor"],
                    },
                    "type": "trace",
                    "active": True, # Remove this here
                    ...



def generate_rules(project: Project) -> List[PolymorphicRule]:
def generate_rules(project: Project) -> Sequence[PolymorphicActiveRule]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you change the return to a Sequence? Is there a specific reasoning?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy started complaining because List is invariant while Sequence is covariant (docs). With your suggestion to just remove active, I was able to revert this change though.

Copy link
Contributor

@iambriccardo iambriccardo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused of why we are keeping the active field in the TypedDicts definition. The active field should be removed and that shouldn't cause any issues.

class Condition(TypedDict):
op: str
inner: List[Inner]
op: Literal["and", "or"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment on lines +106 to +107
class Rule(ActiveRule):
active: bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming here confuses me, especially considering we have inheritance in place. I would do something like this:

class Rule(TypedDict):
    samplingValue: SamplingValue
    type: str
    condition: Condition
    id: int

class StatefulRule(Rule):
   active: bool

What do you think?

@jjbayer
Copy link
Member Author

jjbayer commented Mar 6, 2023

I am a bit confused of why we are keeping the active field in the TypedDicts definition. The active field should be removed and that shouldn't cause any issues.

@iambriccardo You are right, I thought it was being used internally but I confused it with ActivatableBias. Removing active from Rule will mean that the rule hash changes though, will that have any negative consequences in production?

@iambriccardo
Copy link
Contributor

I am a bit confused of why we are keeping the active field in the TypedDicts definition. The active field should be removed and that shouldn't cause any issues.

@iambriccardo You are right, I thought it was being used internally but I confused it with ActivatableBias. Removing active from Rule will mean that the rule hash changes though, will that have any negative consequences in production?

@jjbayer you are free to remove it also from the hash. The role of the hash is just to maintain an in-memory map to calculate rules diffs for logging purposes. Considering that the hash is recreated on deployment, we won't have any issues.

Copy link
Contributor

@iker-barriocanal iker-barriocanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

I'm curious how many GB (and $) we save with this PR 👀.

@jjbayer jjbayer merged commit 4f7e5fe into master Mar 6, 2023
@jjbayer jjbayer deleted the test/validate-project-config branch March 6, 2023 13:54
@github-actions github-actions bot locked and limited conversation to collaborators Mar 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants