Add auto conversion option to convert_type processor#5782
Add auto conversion option to convert_type processor#5782kkondaka merged 3 commits intoopensearch-project:mainfrom
Conversation
| } else { | ||
| this.type = null; | ||
| this.converter = null; | ||
| } |
There was a problem hiding this comment.
nit: Probably else block is not needed if we assign null at their initialization.
There was a problem hiding this comment.
I got error that variables are not initialized.
There was a problem hiding this comment.
if null values are assigned at their initialization, it won't complain
| final LocalDate localDateForDefaultValues = LocalDate.now(DEFAULT_ZONE_ID); | ||
|
|
||
| final DateTimeFormatterBuilder dateTimeFormatterBuilder = new DateTimeFormatterBuilder() | ||
| .appendPattern(pattern) |
There was a problem hiding this comment.
What if the pattern already includes zone?
If possible, avoiding the use of DEFAULT_ZONE_ID will make it more flexible
| Object result = null; | ||
| try { | ||
| result = autoConvert(event, entry.getValue(), keyPrefix+entry.getKey()+"/"); | ||
| if (result != null) { |
There was a problem hiding this comment.
if the value is null, do we want to still set null in that specific key? Looks like we are dropping that key here.
There was a problem hiding this comment.
What do you mean? autoConvert only converts strings to boolean/integer/float/long/double. If the string is "null", it won't be converted to null, that's not something auto conversion currently handles.
| for (DateTimeFormatter formatter : coerceDateTimeFormatters) { | ||
| try { | ||
| ZonedDateTime tmp = ZonedDateTime.parse(str, formatter); | ||
| long r = (long)tmp.toInstant().toEpochMilli(); |
There was a problem hiding this comment.
Returning Instant type is better here, I guess
There was a problem hiding this comment.
Why? We want long with milliseconds. How's Instant useful?
| return null; | ||
| } else if (lstr.contains(".") || lstr.contains("e")) { | ||
| Double d = Double.parseDouble(lstr); | ||
| if (d <= Float.MAX_VALUE && d >= Float.MIN_VALUE) { |
There was a problem hiding this comment.
nit: Probably keeping it as Double is better?
There was a problem hiding this comment.
It looks like TargetType doesn't have Float. I guess, for consistency, I will remove Float.
| } | ||
| return d; | ||
| } else if (Character.isDigit(firstChar) || firstChar == '-' || firstChar == '+') { | ||
| Long l = Long.parseLong(str); |
There was a problem hiding this comment.
parseLong could throw exception. We may want to catch and ignore?
There was a problem hiding this comment.
It is ignored in the caller of autoConvert
san81
left a comment
There was a problem hiding this comment.
Thank you for adding this enhancement. Just left a few comments.
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
| if (result != null) { | ||
| event.put(keyPrefix+entry.getKey(), result); | ||
| } | ||
| } catch (Exception ignored) {} |
There was a problem hiding this comment.
Might be worth having a metric here to debug if users have issues with using coerce_strings
There was a problem hiding this comment.
coercion is best effort. Not sure if metric here makes sense. If there is a string starting with a digit like 1234-sdfjh-34 It will fail one or more of coercions but that's not necessarily metric worthy
| } | ||
| } | ||
| return null; | ||
| } else if (lstr.contains(".") || lstr.contains("e")) { |
There was a problem hiding this comment.
Is it safe to assume this is double? Or if we try and fail it's just a no-op? What happens if parseDouble throws?
There was a problem hiding this comment.
Again, It is a best effort. Float/doubles will always have one of both of these and they be converted properly. It is possible that a string like "sdfsdf.eeee" fits this condition and the code tries to convert and it will fail. And that's OK because it is a no-op. Yes, there is some wastage of cycles but there is no way to make sure a string cannot be coerced without actually parsing it. We are reducing unnecessary parse calls using this checks.
| } | ||
|
|
||
| } else if (objValue instanceof Map) { | ||
| doAutoConversion(event, (Map<String, Object>)objValue, keyPrefix); |
There was a problem hiding this comment.
Should we put a limit on the recursion here?
There was a problem hiding this comment.
Why? It will be finite, right?
| String t3 = zonedDateTime.format(DateTimeFormatter.ofPattern(ConvertEntryTypeProcessorConfig.DEFAULT_TIME_STRING_FORMATS.get(2))); | ||
| String t4 = zonedDateTimePST.format(DateTimeFormatter.ofPattern(ConvertEntryTypeProcessorConfig.DEFAULT_TIME_STRING_FORMATS.get(3))); | ||
|
|
||
| final Map<String, Object> testData1 = new HashMap<>(); |
There was a problem hiding this comment.
Might be good to add coverage for nested maps
There was a problem hiding this comment.
There is a coverage for nested maps already. testData1 is inside testData
…ect#5782) * Add auto conversion option to convert_type processor Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Modified to coerse floats to double Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> --------- Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> Signed-off-by: Jonah Calvo <caljonah@amazon.com>
Description
Add auto conversion option to convert_type processor. Adds the following config option to
convert_typeprocessor to automatically convert strings.Issues Resolved
Resolves #5733
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.