worker: Reduce frequency of very frequently run jobs#62864
Conversation
I looked at background routines and noticed that many routines are running _very_ frequently, which can cause higher than necessary load, and also stress on redis to store all these runs. I tried to put in some more sensible numbers (aligning it with what most other routines seem to use). For many instances, this will mean a 30x reduction in job invocations. Test plan: Code review from owners, and will watch S2 and dotcom for potential issues.
|
This stack of pull requests is managed by Graphite. Learn more about stacking. |
| func (c *config) Load() { | ||
| c.EncryptionInterval = c.GetInterval("RECORD_ENCRYPTER_INTERVAL", "1s", "How frequently to encrypt/decrypt a batch of records in the database.") | ||
| c.MetricsInterval = c.GetInterval("RECORD_ENCRYPTER_METRICS_INTERVAL", "10s", "How frequently to update progress metrics related to encryption/decryption.") | ||
| c.EncryptionInterval = c.GetInterval("RECORD_ENCRYPTER_INTERVAL", "10s", "How frequently to encrypt/decrypt a batch of records in the database.") |
There was a problem hiding this comment.
This is only relevant for when encryption is initially enabled, or getting disabled. Records written while encryption is on will be encrypted right away. This should be more than sensible.
| c.EncryptionInterval = c.GetInterval("RECORD_ENCRYPTER_INTERVAL", "1s", "How frequently to encrypt/decrypt a batch of records in the database.") | ||
| c.MetricsInterval = c.GetInterval("RECORD_ENCRYPTER_METRICS_INTERVAL", "10s", "How frequently to update progress metrics related to encryption/decryption.") | ||
| c.EncryptionInterval = c.GetInterval("RECORD_ENCRYPTER_INTERVAL", "10s", "How frequently to encrypt/decrypt a batch of records in the database.") | ||
| c.MetricsInterval = c.GetInterval("RECORD_ENCRYPTER_METRICS_INTERVAL", "30s", "How frequently to update progress metrics related to encryption/decryption.") |
There was a problem hiding this comment.
Just metrics, should be fine to lag a bit more behind.
|
|
||
| func (c *Config) Load() { | ||
| c.Interval = c.GetInterval("CODEINTEL_RANKING_COORDINATOR_INTERVAL", "1s", "How frequently to run the ranking coordinator.") | ||
| c.Interval = c.GetInterval("CODEINTEL_RANKING_COORDINATOR_INTERVAL", "30s", "How frequently to run the ranking coordinator.") |
There was a problem hiding this comment.
This and the following ranking configs are not doing any work on instances as the feature is disabled everywhere, but we still run the background routines, so this will reduce frequency but not affect any functionality.
|
|
||
| func (c *Config) Load() { | ||
| c.Interval = c.GetInterval("CODEINTEL_UPLOAD_BACKFILLER_INTERVAL", "10s", "The frequency with which to run periodic codeintel backfiller tasks.") | ||
| c.Interval = c.GetInterval("CODEINTEL_UPLOAD_BACKFILLER_INTERVAL", "30s", "The frequency with which to run periodic codeintel backfiller tasks.") |
There was a problem hiding this comment.
Does this seem sensible to Graph?
| commitGraphUpdateTaskInterval := env.ChooseFallbackVariableName("CODEINTEL_UPLOAD_COMMITGRAPH_UPDATE_TASK_INTERVAL", "PRECISE_CODE_INTEL_COMMIT_GRAPH_UPDATE_TASK_INTERVAL") | ||
|
|
||
| c.Interval = c.GetInterval("CODEINTEL_UPLOAD_COMMITGRAPH_UPDATER_INTERVAL", "1s", "How frequently to run the upload commitgraph updater routine.") | ||
| c.Interval = c.GetInterval("CODEINTEL_UPLOAD_COMMITGRAPH_UPDATER_INTERVAL", "10s", "How frequently to run the upload commitgraph updater routine.") |
There was a problem hiding this comment.
Does this seem sensible to Graph?
|
|
||
| c.CommitBatchSize = c.GetInt(commitBatchSize, "100", "The number of commits to process per upload at a time.") | ||
| c.ExpirerInterval = c.GetInterval("CODEINTEL_UPLOAD_EXPIRER_INTERVAL", "1s", "How frequently to run the upload expirer routine.") | ||
| c.ExpirerInterval = c.GetInterval("CODEINTEL_UPLOAD_EXPIRER_INTERVAL", "30s", "How frequently to run the upload expirer routine.") |
There was a problem hiding this comment.
Does this seem sensible to Graph?
keynmol
left a comment
There was a problem hiding this comment.
I think all of these are fine - my guess is that all those 1s are remnants from the time when the feature was being developed, where local feedback loop needed to be tight.
Don't a see a reason for any of those frequencies to be 1s.

I looked at background routines and noticed that many routines are running very frequently, which can cause higher than necessary load, and also stress on redis to store all these runs.
I tried to put in some more sensible numbers (aligning it with what most other routines seem to use). For many instances, this will mean a 30x reduction in job invocations.
Test plan:
Code review from owners, and will watch S2 and dotcom for potential issues.