Skip to content

Add ParseNumeric and replace uses of ParseInt/Float#2364

Closed
isugimpy wants to merge 20 commits intokedacore:mainfrom
isugimpy:main
Closed

Add ParseNumeric and replace uses of ParseInt/Float#2364
isugimpy wants to merge 20 commits intokedacore:mainfrom
isugimpy:main

Conversation

@isugimpy
Copy link

@isugimpy isugimpy commented Nov 29, 2021

Signed-off-by: Jayme Howard g.prime@gmail.com

This is a potential implementation of a solution for the problem I described in #2299 with YAML conversion causing problematic behavior. The ParseNumeric function should be close to a drop-in replacement for strconv.Parse{Int,Float}, but with the ability to parse strings with a type hint that I've described in the issue. As an example, d(1.0) would hint at a decimal (float64) value to be returned. I did not make this change to any other scalers as I'm unsure that this change will be accepted as is and didn't want to make the investment. This should, practically speaking, retain backwards compatibility with all existing scaling behavior, as far as I can tell from my tests.

If the maintainers consider this acceptable, I can go through and update other things that currently use Parse{Int,Float} to use this function, update the changelog, and will open a PR to update the docs as well.

Checklist

  • Commits are signed with Developer Certificate of Origin (DCO - learn more)
  • Tests have been added
  • A PR is opened to update our Helm chart (repo) (if applicable, ie. when deployment manifests are modified)
  • A PR is opened to update the documentation on (repo) (if applicable)
  • Changelog has been updated

Fixes #2299

@isugimpy isugimpy requested a review from a team as a code owner November 29, 2021 14:54
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
@isugimpy
Copy link
Author

I don't think I have the ability to fix this last one, as it's a timeout on your linting steps.

Copy link
Member

@JorTurFer JorTurFer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relative to the code, some small nits.
Relative to the usage, I don't get the benefit of this specific change. I'm not an expert but I think that apart from this you should change also the CRD to support number values also and I'm not totally sure about if it's a good idea.
@kedacore/keda-contributors ?

Comment on lines +57 to +59
hintedRegexp := regexp.MustCompile(`^(?P<Type>[a-z])\((?P<Value>[\-]?(?:(?:0|[1-9]\d*)(?:\.\d*)?|\.\d+))\)$`)
floatRegexp := regexp.MustCompile(`^-?\d+\.\d+$`)
intRegexp := regexp.MustCompile(`^-?\d+$`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these regexes be placed at package level inside an init function?
Something like this https://github.com/kedacore/keda/blob/main/pkg/scalers/rabbitmq_scaler.go#L27-L29
Otherwise, the regex will be compiled every time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's reasonable to make the init function change. Will do that right now.

As far as the CRD change, in the absence of answers on the preferred way to do it, I opted for doing the thing that provides the least potential risk of breaking something that already works, and that requires changing the least existing code. I'm actually going to make one other change to the function signature for ParseNumeric in the same commit that adds the init to put the bitsize argument back as well, to further reduce the surface of changes.

@JorTurFer JorTurFer requested a review from a team November 29, 2021 20:12
@JorTurFer
Copy link
Member

BTW, Thanks a ton for the contribution ❤️

Signed-off-by: Jayme Howard <g.prime@gmail.com>
@isugimpy
Copy link
Author

BTW, Thanks a ton for the contribution heart

Glad to do it. Getting this solved would be a huge win for me and my team, so I've got a vested interest in making sure it gets done. If I can take a little burden off y'all and make it happen, well worth the time investment to me.

@tomkerkhove tomkerkhove changed the title Add ParseNumeric and configure cpu_memory_scaler to use for #2299 Add ParseNumeric and configure cpu_memory_scaler to use Nov 30, 2021
@JorTurFer
Copy link
Member

Hi,
Sorry for the slow response, the weeks previous x-mas are crazy...
I think that I'm still missing something :(
Your problem is that the automated tools that you use automatically converts for example "1" into 1, so the ScaledObject fails during to apply because of the type, right?
Your proposal basically is to add support to specify "1" in this other way "d(1)" in order to avoid automatic conversions using other tools during the process. Am I right?
In that case, shouldn't this change be applied in all scalers?

Personally, I'm not totally sure about this notation but in base of the tests we can still use the old one at the same time and we provide a fallback way for these automatizations 🤔 Maybe explaining it in a section inside the docs?
WDTY @zroubalik , @tomkerkhove @ahmelsayed ?

@isugimpy
Copy link
Author

isugimpy commented Dec 1, 2021

Apologies, I may not have explained it as well as I had previously thought.

When you're using anything that parses the YAML and outputs it again, with Kustomize being the best example, the object gets loaded in, parsed to native Go objects, and eventually output back as YAML. Now, let's say we start with a ScaledObject with quoted numeric values like "1". The YAML gets read, and a string with the contents 1 is created. At output time, what goes into the resulting output is a 1 with no quotes, because YAML implicitly treats strings as quoted. When the next thing goes to read the YAML, it sees an unquoted value which "looks" like a numeric, and parses it as a number, storing it as an integer. That integer no longer meets the OpenAPIv3 spec for the CRD, and the apply fails because of it.

There are two ways to solve this. One would be to rewrite the spec to allow either a numeric or string value, and adjust everything accordingly. The other would be the approach above, where type hinting is provided in the string. Because the value doesn't "look" like a numeric, a YAML parser will always treat it as a string. This provides a nice fallback mechanism for users who do use tools like Kustomize (or other tools which work the same way), allowing a mechanism to provide a value which clearly doesn't "look" like a number, while maintaining compatibility with the existing schema and behavior.

I agree this should go into other scalers. In fact, I'm willing to do the work to do so if that's desired. But I didn't want to make the change significantly larger and touch the other scalers without having gotten buy-in from maintainers first. Also very happy to write the relevant docs, but again didn't want to go down that road without ensuring that there would be buy-in first.

Comment on lines +51 to +64
func ParseNumeric(s string, bitSize int) (interface{}, error) {
/* This function is a wrapper around ParseInt and ParseFloat
mostly, but also allows for providing in-string type hinting
to ensure that numeric values that would be cast to numbers
by YAML parsers instead forcefully remain as strings until
they can get parsed at runtime.

For this purpose, we look for a hint which is defined as
a single letter, followed by a parentheses wrapped numeric
value. d or f for decimal/float, i or n for integer.

If we don't get a match, check if s looks like a float and
parse it as one if so, otherwise parse as an int.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments would be great as standard Go documentation:

Suggested change
func ParseNumeric(s string, bitSize int) (interface{}, error) {
/* This function is a wrapper around ParseInt and ParseFloat
mostly, but also allows for providing in-string type hinting
to ensure that numeric values that would be cast to numbers
by YAML parsers instead forcefully remain as strings until
they can get parsed at runtime.
For this purpose, we look for a hint which is defined as
a single letter, followed by a parentheses wrapped numeric
value. d or f for decimal/float, i or n for integer.
If we don't get a match, check if s looks like a float and
parse it as one if so, otherwise parse as an int.
*/
// ParseNumeric is a wrapper around ParseInt and ParseFloat
// mostly, but also allows for providing in-string type hinting
// to ensure that numeric values that would be cast to numbers
// by YAML parsers instead forcefully remain as strings until
// they can get parsed at runtime.
//
// For this purpose, we look for a hint which is defined as
// a single letter, followed by a parentheses wrapped numeric
// value. d or f for decimal/float, i or n for integer.
//
// If we don't get a match, check if s looks like a float and
// parse it as one if so, otherwise parse as an int.
func ParseNumeric(s string, bitSize int) (interface{}, error) {

return nil, err
}
utilizationNum := int32(valueNum)
utilizationNum, _ := valueNum.(int32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's doubtful that this type assertion would ever fail, but it would still be helpful to check it for success, as defensive programming:

Suggested change
utilizationNum, _ := valueNum.(int32)
utilizationNum, ok := valueNum.(int32)
if !ok {
return nil, fmt.Errorf("some error text")
}

@isugimpy
Copy link
Author

isugimpy commented Dec 2, 2021

@arschles Thank you for the feedback! Both addressed as requested. Hopefully that error message provides what you're looking for there.

Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Copy link
Contributor

@arschles arschles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isugimpy I left a possible enhancement to the error message, but it's really no big deal if that goes in or not. This looks great, thanks for looking through my earlier review!

Signed-off-by: Jayme Howard <g.prime@gmail.com>

Co-authored-by: Aaron Schlesinger <70865+arschles@users.noreply.github.com>
Copy link
Contributor

@arschles arschles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isugimpy this looks good to me! Thanks for going through all my comments etc...

@isugimpy
Copy link
Author

Just following up on this. I know it's hard to get time around stuff like this especially during the holidays, but I want to make sure it doesn't get lost in the shuffle.

Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
@isugimpy
Copy link
Author

You're sure right, sorry about that. I saw what looked like a different failure and completely missed that my own changes were failing tests. I've written a fix that should solve the problem, and found a linting error I had previously missed.

I can't seem to get the full test suite to run locally, but the tests for the code I touched pass on my machine at this point.

As previously offered, I'll still be glad to go through the other scalers I haven't touched and convert references from ParseInt/ParseFloat to ParseNumeric if this work is accepted.

Copy link
Contributor

@arschles arschles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one suggestion and one question left

return nil, fmt.Errorf("provided value for Utilization (%d) was not a valid 32-bit integer", valueNum)
return nil, fmt.Errorf("provided value for Utilization (%d) was not a valid integer", valueNum)
}
utilizationNum := int32(valueNum.(int64))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about converting to int64 above, then you don't have to do the valueNum.(int64) again here?

valueNum64, ok := valueNum.(int64)
if !ok { 
    //...
}
utilizationNum := int32(valueNum64)

Also, do we need to worry about overflow here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that and tried it and the issue is that there's the possibility that the user could pass a numeric that's a float, and ParseNumeric() will return a float64. The way you wrote the code above is how I had written it previously and it actually broke the tests. So the initial assertion on 63 is to make sure that we actually got an int64 back and can return the error if it wasn't one. We don't want to coerce something that the user says is a float into an int. Then, if it did assert as an int64, we need to convert it to int32 for it to actually be usable, which necessitates a type assertion because ParseNumeric() returns an interface{}.

We don't have to worry about overflow (and for the same reason as the original code). On 58, the last arg to ParseNumeric() is the bit length of the value to fit the result into. If you pass a number that's larger than the maximum size it'd allow for, ParseInt() inherently changes it to be the maximum integer that the data type would allow for. Under the hood, ParseNumeric() is just calling ParseInt() after validating that the passed value is an integer, so we inherit its behavior.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, apologies, I'm tracking what you're saying now. Sorry, memory's fuzzy from originally doing this like a month ago. Yeah, I think I can do that change and have it work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, actually, is that worth it? Is it preferred to allocate more memory for the new variable instead of just doing the runtime assertion a second time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

up to you. I was suggesting it for a bit of readability improvement, which is subjective. performance-wise, in the grand scheme of things, it's a very minimal amount of memory. without profiling, I couldn't really say if it has an impact at all.

Copy link
Contributor

@arschles arschles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

return nil, fmt.Errorf("provided value for Utilization (%d) was not a valid 32-bit integer", valueNum)
return nil, fmt.Errorf("provided value for Utilization (%d) was not a valid integer", valueNum)
}
utilizationNum := int32(valueNum.(int64))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

up to you. I was suggesting it for a bit of readability improvement, which is subjective. performance-wise, in the grand scheme of things, it's a very minimal amount of memory. without profiling, I couldn't really say if it has an impact at all.

Comment on lines +51 to +62
// ParseNumeric is a wrapper around ParseInt and ParseFloat
// mostly, but also allows for providing in-string type hinting
// to ensure that numeric values that would be cast to numbers
// by YAML parsers instead forcefully remain as strings until
// they can get parsed at runtime.
//
// For this purpose, we look for a hint which is defined as
// a single letter, followed by a parentheses wrapped numeric
// value. d or f for decimal/float, i or n for integer.
//
// If we don't get a match, check if s looks like a float and
// parse it as one if so, otherwise parse as an int.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for this great godoc. perhaps it would be helpful to add some examples of strings and how they would be parsed? just a suggestion to do if you want, it's not blocking this PR.

Signed-off-by: Jayme Howard <g.prime@gmail.com>
…alue looks like an int if needed

Signed-off-by: Jayme Howard <g.prime@gmail.com>
@isugimpy isugimpy changed the title Add ParseNumeric and configure cpu_memory_scaler to use Add ParseNumeric and replace uses of ParseInt/Float Jan 30, 2022
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Copy link
Contributor

@arschles arschles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @isugimpy! I left one nitpick suggestion about spacing, otherwise this is great

Comment on lines +57 to +62
valueNum, err := kedautil.ParseNumeric(config.TriggerMetadata[valueKey], 64)
if err != nil {
return nil, err
}
var ok bool
meta.value, ok = valueNum.(int64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just adjusting spacing

Suggested change
valueNum, err := kedautil.ParseNumeric(config.TriggerMetadata[valueKey], 64)
if err != nil {
return nil, err
}
var ok bool
meta.value, ok = valueNum.(int64)
valueNum, err := kedautil.ParseNumeric(config.TriggerMetadata[valueKey], 64)
if err != nil {
return nil, err
}
var ok bool
meta.value, ok = valueNum.(int64)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm parsing your suggestion. Won't that indentation cause linting errors? Weirdly, this isn't how it looks in my vim, there are proper tabs as far as I see.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when I made that comment, there was no indentation whatsoever in my diff for this section. now, it's indented "too much". no idea what happened, but anyway it's fine now!

@isugimpy isugimpy requested review from JorTurFer and removed request for a team February 14, 2022 10:50
Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very sorry for the delay in the review from my side, I have been super busy and I haven't been sure what to do with this :) Only thing I am concerned whether it could not break CRD validation. Do you think that you can add e2e test covering this? A test that will create a ScaledObject with numeric value? The existing tests are here: https://github.com/kedacore/keda/tree/main/tests

Thanks!

isugimpy added 3 commits May 28, 2022 15:34
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
@isugimpy
Copy link
Author

@zroubalik Apologies, time got away from me and I just realized I left this hanging this whole time. I don't know if this would satisfy what you're wanting, but I made a small change to one of the existing e2e tests which should exercise the behavior of parsing with the hint. Existing tests would use the hintless behavior already. As far as I know that should work transparently.

However, I can't make the e2e actually run because of the labels on the PR.

isugimpy added 2 commits May 28, 2022 16:01
Signed-off-by: Jayme Howard <g.prime@gmail.com>
Signed-off-by: Jayme Howard <g.prime@gmail.com>
@isugimpy isugimpy requested a review from zroubalik May 28, 2022 22:02
@zroubalik
Copy link
Member

zroubalik commented Jul 14, 2022

@isugimpy sorry for the delay, that e2e test has been migrated to go in the meantime. Do you think you can modify it? Also please don't change the original test, rather extend it to cover this new scenario.

I have been thinking about this for some time (sorry for that 😅 ), and I think that we can go ahead with this change.

@tomkerkhove
Copy link
Member

Any update on this?

@tomkerkhove
Copy link
Member

Closing due to inactivity for a long period, feel free to re-open if you have time to complete it anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scalers Should Use Numeric Types for Thresholds and Values

6 participants