[ML-Dataframe] Feature/fib multi aggs and sources#34525
[ML-Dataframe] Feature/fib multi aggs and sources#34525hendrikmuhs merged 5 commits intoelastic:feature/fibfrom
Conversation
|
Pinging @elastic/ml-core |
|
retest this please |
ed77c6a to
cdcfba8
Compare
There was a problem hiding this comment.
Seems like we are needlessly creating a leaky/incomplete abstraction. We should return a stream of objects that knows how to transform themselves into an XContentObject instead of creating a HashMap and then requiring the caller to know how to do the transformation (even thought the caller is us). This will afford us a nice separation of duties and an OK abstraction, though definitely not perfect as aggregations are so flexible.
There was a problem hiding this comment.
Could you elaborate on what is leaky/incomplete?
Using a map is intentional, will add some reasoning as documentation.
There was a problem hiding this comment.
It seems to me that all of this building could be self contained. As it stands, we have multiple functions passing around not well defined objects and just happen to know the inner workings of it. Maybe that is just how aggregations are, but seems to me a very easy way to introduce latent bugs.
There was a problem hiding this comment.
To answer the questions:
We have no schema for the output, it could be anything, having that said, this is an internal helper class, not exposed to the outside world. Before it goes out XContent::map ensures it is proper.
Regarding using maps as intermediate format: I checked for a more high-level construct and it turns out, there isn't one. The concept of a document in ingest is Map<String, Object>, so I choose the same here. Eventually we want to support running post-pivot ingest transformations. So this allows us to easily hook in ingest. (LBNL the map makes testing easier, but that isn't the primary reason.)
There was a problem hiding this comment.
As mentioned above, I think whatever object we return should know how to take a builder and add itself appropriately.
There was a problem hiding this comment.
If this class is only ever going to contain static utility methods it's usual to make it final and add a private no-op constructor.
There was a problem hiding this comment.
The declaration and initial assignment could all be one statement.
There was a problem hiding this comment.
If this ever did happen it's going to be frustrating as it doesn't say what the unsupported aggregation was. I think the message should include the type of the aggregation. Also, you could assert we don't get here. (The CI runs with assertions enabled but production runs with assertions disabled.)
There was a problem hiding this comment.
I think Arrays.asList with 1 element shows up as a warning in IntelliJ. You can avoid this by using Collections.singletonList for 1 element lists.
4af0d68 to
f331658
Compare
|
Merging this to get some progress, I expect this part to be revisited, not the final version. Note on CI: Broken as the new module is unknown and causes some tests to fail, so no green CI at the moment. |
FEATURE BRANCH PR
implement support for multiple sources and aggregations