Skip to content

feat: Add a configuration property for Spark ANSI mode#14375

Closed
philo-he wants to merge 3 commits intofacebookincubator:mainfrom
philo-he:add-ansi
Closed

feat: Add a configuration property for Spark ANSI mode#14375
philo-he wants to merge 3 commits intofacebookincubator:mainfrom
philo-he:add-ansi

Conversation

@philo-he
Copy link
Copy Markdown
Contributor

@philo-he philo-he commented Aug 8, 2025

This change prepares for supporting an ANSI-compliant dialect in Velox. The added
configuration property allows function implementations to adapt based on the user's
ANSI mode setting.

See ANSI support discussion: #3869.

@netlify
Copy link
Copy Markdown

netlify Bot commented Aug 8, 2025

Deploy Preview for meta-velox ready!

Name Link
🔨 Latest commit 861b09a
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/68aeaffa84a56800086e64cd
😎 Deploy Preview https://deploy-preview-14375--meta-velox.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 8, 2025
@mbasmanova mbasmanova changed the title feat: Add a configuration property for ANSI mode feat: Add a configuration property for Spark ANSI mode Aug 8, 2025
@mbasmanova mbasmanova requested a review from rui-mo August 8, 2025 03:43
Comment thread velox/core/QueryConfig.h Outdated
/// If true, Velox will use an ANSI-compliant dialect. For example, Velox will
/// throw a runtime exception instead of returning null results when the inputs
/// to a SQL operator or function are invalid.
static constexpr const char* kAnsiEnabled = "ansi_enabled";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is specific to Spark functions. Presto functions are ANSI compliant already.

If so, let's add spark. prefix like other Spark-specific configs.

There is also a question about how to implement this without causing confusion or wrong results. It seems this change would allow user to request ANSI mode, but the results will not be compiant.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova, thanks for your review. I will update the new config to indicate that it is intended for Spark use only.

There will be several upcoming PRs that rely on this configuration to adapt function implementations. The legacy mode and ANSI mode will coexist in Spark for a long period. We plan to introduce branching logic in relevant Velox functions, respecting this configuration to ensure ANSI compliance while minimizing code duplication. It's also possible to add separate ANSI-compliant functions if there isn't much shared code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, but my question is about something else. How do we ensure that when ANSI is on the query returns correct results. If we land this change, then turning ANSI on will produce wrong results (same as with ANSI off).

One option is to create an allow list of functions and special forms that support ANSI mode and fail the query if ANSI mode is ON, but query contains a function that's not in allow list. We would need to think about where to place such a check.

CC: @rui-mo

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova, currently there is a rule in Gluten to fall back to vanilla Spark for the entire query when ANSI is on. Our plan is to firstly support a few common ANSI-compliant functions in Velox before removing this rule to allow the offloading. After this rule is removed, it's quite possible that incorrect results may be produced for functions that either lack ANSI support or have existing bugs, as you pointed out. To address this, maybe we can maintain a blacklist in Gluten to prevent the offloading for these functions, similar to our current approach when ANSI is off. This approach appears more user-friendly, as it allows the query to be executed in fallback mode rather than failing on Velox side.

Copy link
Copy Markdown
Contributor

@rui-mo rui-mo Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ANSI mode-controlled scenarios are described in Spark's documentation at https://github.com/apache/spark/blob/v4.0.0/docs/sql-ref-ansi-compliance.md, and they have a variety of effects on the detailed implementation. Given this, it might be difficult to cover every aspect by adding a check in Velox.

Since Gluten currently fallback for ANSI on, it would more of an experimental feature. We can also add a document in Velox to list the ANSI-compatible functions when supporting them, which should be rather limited for now.

Copy link
Copy Markdown
Contributor

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

Comment thread velox/docs/configs.rst Outdated
- bool
- false
- If true, Velox will use an ANSI-compliant dialect. For example, Velox will throw a runtime exception instead of
returning null results when the inputs to a SQL operator or function are invalid.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please mention current status of this config? I suppose it won't take effect for now.

Copy link
Copy Markdown
Contributor

@mbasmanova mbasmanova Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a sense of a timeline for updating ALL Spark functions to support this property? How many functions need updating? How do we plan to test this? Can we setup Fuzzer that runs with ansi on and compares the results with Spark?

I'm concerned about users starting to use this property and run into data correctness issues.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova, my current estimate is at least one quarter to support all related Spark functions, depending on the number of contributors involved. I am still listing the functions that require ANSI support. Like non-ANSI functions, we will port Spark test cases into Velox and enable fuzzer tests to ensure consistency with Spark.

@mbasmanova
Copy link
Copy Markdown
Contributor

We can also add a document in Velox to list the ANSI-compatible functions when supporting them

That would be helpful. Let's come up with a convention for how to indicate in the docs whether a function supports ANSI mode or not.

Can we add an API that takes an Expr tree and returns true if all expressions are ANSI-compliant or not? This would allow non-Gluten users to check compliance easily.

@mbasmanova
Copy link
Copy Markdown
Contributor

Does ANSI mode affect the behavior of any operators? Or is it only for functions?

@philo-he
Copy link
Copy Markdown
Contributor Author

Can we add an API that takes an Expr tree and returns true if all expressions are ANSI-compliant or not? This would allow non-Gluten users to check compliance easily.

@mbasmanova, sounds great. I will evaluate the feasibility. Thanks for the suggestion.

Does ANSI mode affect the behavior of any operators? Or is it only for functions?

To the best of my knowledge, ANSI mode primarily affects expressions. Maybe, one exception is implicit type coercion during write operations, which it can also influence. However, this effect is indirect—ANSI actually controls the underlying expressions that the write operation relies on.

Copy link
Copy Markdown
Contributor Author

@philo-he philo-he left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova, I've updated this PR. Could you take a look again?

Comment thread velox/core/QueryConfig.h Outdated
static constexpr const char* kPrestoArrayAggIgnoreNulls =
"presto.array_agg.ignore_nulls";

/// If true, Velox will use an ANSI-compliant dialect. For example, Velox will
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not quite accurate to say "Velox will use" here because the setting applies only to Spark functions package, not to core engine. Let's change to something lilke

/// If true, Spark function's behavior is ANSI-compliant, e.g. throws runtime exception instead of returning null on invalid inputs.

Also, let's clarify that this setting is not implemented yet and provide guidance on how to tell which functions support this. Are we going to come up with some way to mark a function as ANSI-compliant in the docs?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova, The suggestion makes sense. I've updated the PR accordingly. We will document the ANSI behavior in the existing documentation for the specific function as part of the ANSI-support PR. Thanks.

Copy link
Copy Markdown
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

We will document the ANSI behavior in the existing documentation for the specific function as part of the ANSI-support PR.

Let's come up with a standard way of doing that before PRs start showing up to ensure all PRs do that in a consistent easy to understand way.

CC: @rui-mo

@mbasmanova mbasmanova added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Aug 27, 2025
@rui-mo
Copy link
Copy Markdown
Contributor

rui-mo commented Aug 27, 2025

Let's come up with a standard way of doing that before PRs start showing up to ensure all PRs do that in a consistent easy to understand way.

@mbasmanova @philo-he I assume we could document the "standard way" at below with another PR, and follow this way in subsequent PRs.

The semantics of Spark functions match Spark 3.5 with ANSI OFF.

The standard way could be to add a label (ANSI Compatible) at the function signature if the ANSI ON/OFF behaviors are both supported, and document the details of differences. Example:

Screenshot 2025-08-27 at 17 07 17

How do you think?

@mbasmanova
Copy link
Copy Markdown
Contributor

@rui-mo I like this proposal.

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@peterenescu has imported this pull request. If you are a Meta employee, you can view this in D81253191.

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@peterenescu merged this pull request in 0570617.

facebook-github-bot pushed a commit that referenced this pull request Sep 9, 2025
Summary:
This is follow-up work for #14375 (comment).

Pull Request resolved: #14698

Reviewed By: mbasmanova

Differential Revision: D81962221

Pulled By: Yuhta

fbshipit-source-id: 572ba75309b75a101410b569528037245f0d9178
WangGuangxin pushed a commit to WangGuangxin/bolt that referenced this pull request Mar 22, 2026
Summary:
This change prepares for supporting an ANSI-compliant dialect in Velox. The added
configuration property allows function implementations to adapt based on the user's
ANSI mode setting.

See ANSI support discussion: facebookincubator/velox#3869.

Pull Request resolved: facebookincubator/velox#14375

Reviewed By: Yuhta

Differential Revision: D81253191

Pulled By: peterenescu

fbshipit-source-id: a5f0212c95302721adaffdce7de512b160631646
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants