Conversation
c45365d to
1c93bdc
Compare
1c93bdc to
5f12226
Compare
5f12226 to
7def64a
Compare
|
To include a complete end-to-end benchmark I've done a simple promql query (
|
|
This PR is against Prometheus 1.7, which is not the current development version. Any changes of this nature should be proposed against master. |
|
@brian-brazil I can do that, I opened the PR here as the issue (#3601 ) was reported on 1.7. The changes here can (and should) be applied to any releases that are still supported (I'm not sure which ones those are, so I'll need some guidance on what are the currently supported releases). |
|
A change like this would be aimed for 2.1 currently, which is to say master. New features are not being added to 1.x. |
|
I can re-open the PR against master, but prometheus/common#112 needs to be merged first. Before moving the PR I'd like to see that merged and some feedback on this change (since it will be basicaly the same change against master). Is there also no mechanism to backport to previous releases? From looking at the releases page it seems that N-1 is supported (minor releases etc.)? |
|
All the relevant code changed with 2.0, you'd need to redo it all. Only bugfixes would be considered for backport. |
|
What classifies bug fix? From the end to end data this is ~10x perf
improvement.
Also, your latest response still has no feedback for a change like this.
…On Dec 27, 2017 4:48 PM, "Brian Brazil" ***@***.***> wrote:
All the relevant code changed with 2.0, you'd need to redo it all.
Only bugfixes would be considered for backport.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3608 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABi3nXHiLCCYI_2QTp4GzjKfuVV9Tt1Zks5tEuVhgaJpZM4RKaNd>
.
|
|
A performance improvement is not considered a bugfix. |
|
Ok, how about feedback on the change?
…On Dec 27, 2017 4:57 PM, "Brian Brazil" ***@***.***> wrote:
A performance improvement is not considered a bugfix.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3608 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABi3nR1QakyNWtnQ3j9B1mWlfCAoIqyuks5tEudsgaJpZM4RKaNd>
.
|
|
It's a highly invasive change that'd require significant ongoing error-prone maintenance, and it also violates layering. Such downsides require major major benefits to justify, and the benchmark numbers produced thus far are at odds with previous profiling in this area. |
I'm not sure if we're looking at the same diff? The reason this particular PR is so large is due to changes in the vendor directory. The actual change to the code is ~30-40 lines of change, which I'd hardly call "highly invasive".
I'm also not sure what maintenance you are expecting to perform? There is one codegen'd file -- which requires no human to write the code and the only other one is a small marshal method on the queryData struct -- which isn't exactly a large struct.
What are you implying with this? What layering is being broken? In golang marshal methods have to be in the package with the struct definition-- so if you don't want the marshal/unmarshal methods in this package then we need to move it to another package. Past that I have no idea what other layering you could be referring to, so please elaborate.
I agree, which is why I've spent a significant amount of time collecting a LOT of data (and creating a variety of benchmarks to demonstrate this)
Thats usually how performance improvements work? For all the benchmarks we're talking about the code is all open -- so it is comparable. The benchmarks you are referring to are all done by the jsoniter guys on a generic dataset. The benchmarks I've put in place are (1) with the prometheus structs and (2) a full end-to-end test (which is exactly what you asked for). At this point I'm at a loss for what additional information to provide. I have overwhelming data supporting my position, and all you've been able to counter with is "suspicion" -- with no recourse. What data or proof do you require to get such a change in? I'm growing weary of gathering data only to be met with 'suspicion' and uncertainty. This is a simple performance optimization (a HUGE one, but a simple one nonetheless). If not performance data, what data do you need? |
|
Obsoleted by #3536 |

This is a fix for #3601, this wouldn't be the actual PR (as we'll need to commit the changes in prometheus/common#112 first) but this gives an idea of the level of change required (very small).
I've run this locally with some test data, and here are the comparisons:
With this we can see the large performance gains even with small returns. Not only do we get these gains (~5-6 for "real" data) we also significantly reduce the memory required to marshal out the response (which means this scales significantly better into large response sizes.
Also, as a bonus, since this iteratively serializes it will also stop mid-serialization if the client disconnects!