Skip to content

1.7 easyjson#3608

Closed
jacksontj wants to merge 1 commit intoprometheus:release-1.7from
jacksontj:1.7_easyjson
Closed

1.7 easyjson#3608
jacksontj wants to merge 1 commit intoprometheus:release-1.7from
jacksontj:1.7_easyjson

Conversation

@jacksontj
Copy link
Contributor

@jacksontj jacksontj commented Dec 21, 2017

This is a fix for #3601, this wouldn't be the actual PR (as we'll need to commit the changes in prometheus/common#112 first) but this gives an idea of the level of change required (very small).

I've run this locally with some test data, and here are the comparisons:

pre-patch with-patch
Matrix 214ms 37ms
Vector 108ms 20ms
Scalar 20ms 11ms
String 18ms 13ms

With this we can see the large performance gains even with small returns. Not only do we get these gains (~5-6 for "real" data) we also significantly reduce the memory required to marshal out the response (which means this scales significantly better into large response sizes.

Also, as a bonus, since this iteratively serializes it will also stop mid-serialization if the client disconnects!

@jacksontj
Copy link
Contributor Author

jacksontj commented Dec 28, 2017

To include a complete end-to-end benchmark I've done a simple promql query (metricname[range]) over various time durations pre and post easyjson change, here are the results:

Time (minutes) Old New Improvement
0 0.121 0.086 -28.92561983
1 0.171 0.076 -55.55555556
10 1.226 0.119 -90.29363785
30 2.126 0.58 -72.7187206
60 5.173 0.441 -91.47496617
120 9.086 1.107 -87.81642087
180 13.719 1.795 -86.91595597
240 17.668 1.956 -88.92913742

A pretty graph version:
image

@brian-brazil
Copy link
Contributor

This PR is against Prometheus 1.7, which is not the current development version. Any changes of this nature should be proposed against master.

@jacksontj
Copy link
Contributor Author

@brian-brazil I can do that, I opened the PR here as the issue (#3601 ) was reported on 1.7. The changes here can (and should) be applied to any releases that are still supported (I'm not sure which ones those are, so I'll need some guidance on what are the currently supported releases).

@brian-brazil
Copy link
Contributor

A change like this would be aimed for 2.1 currently, which is to say master. New features are not being added to 1.x.

@jacksontj
Copy link
Contributor Author

I can re-open the PR against master, but prometheus/common#112 needs to be merged first. Before moving the PR I'd like to see that merged and some feedback on this change (since it will be basicaly the same change against master).

Is there also no mechanism to backport to previous releases? From looking at the releases page it seems that N-1 is supported (minor releases etc.)?

@brian-brazil
Copy link
Contributor

All the relevant code changed with 2.0, you'd need to redo it all.

Only bugfixes would be considered for backport.

@jacksontj
Copy link
Contributor Author

jacksontj commented Dec 28, 2017 via email

@brian-brazil
Copy link
Contributor

A performance improvement is not considered a bugfix.

@jacksontj
Copy link
Contributor Author

jacksontj commented Dec 28, 2017 via email

@brian-brazil
Copy link
Contributor

It's a highly invasive change that'd require significant ongoing error-prone maintenance, and it also violates layering. Such downsides require major major benefits to justify, and the benchmark numbers produced thus far are at odds with previous profiling in this area.

@jacksontj
Copy link
Contributor Author

It's a highly invasive change

I'm not sure if we're looking at the same diff? The reason this particular PR is so large is due to changes in the vendor directory. The actual change to the code is ~30-40 lines of change, which I'd hardly call "highly invasive".

that'd require significant ongoing error-prone maintenance

I'm also not sure what maintenance you are expecting to perform? There is one codegen'd file -- which requires no human to write the code and the only other one is a small marshal method on the queryData struct -- which isn't exactly a large struct.

and it also violates layering.

What are you implying with this? What layering is being broken? In golang marshal methods have to be in the package with the struct definition-- so if you don't want the marshal/unmarshal methods in this package then we need to move it to another package. Past that I have no idea what other layering you could be referring to, so please elaborate.

Such downsides require major major benefits to justify

I agree, which is why I've spent a significant amount of time collecting a LOT of data (and creating a variety of benchmarks to demonstrate this)

and the benchmark numbers produced thus far are at odds with previous profiling in this area.

Thats usually how performance improvements work? For all the benchmarks we're talking about the code is all open -- so it is comparable. The benchmarks you are referring to are all done by the jsoniter guys on a generic dataset. The benchmarks I've put in place are (1) with the prometheus structs and (2) a full end-to-end test (which is exactly what you asked for). At this point I'm at a loss for what additional information to provide. I have overwhelming data supporting my position, and all you've been able to counter with is "suspicion" -- with no recourse.

What data or proof do you require to get such a change in? I'm growing weary of gathering data only to be met with 'suspicion' and uncertainty. This is a simple performance optimization (a HUGE one, but a simple one nonetheless). If not performance data, what data do you need?

@brian-brazil
Copy link
Contributor

Obsoleted by #3536

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants