Skip to content

Conversation

@alexander-beedie
Copy link
Collaborator

@alexander-beedie alexander-beedie commented Nov 3, 2025

Closes #18654.

Updates

  • A long overdue rework of SQL "JOIN" constraint processing - much improved, in terms of clarity, robustness, and capability (for example, we now use sqlparser-rs' own Visitor functionality to associate left/right sides of the join condition; see FindTableIdentifier for details - this approach may be useful elsewhere, later).

  • Improved implementation of resolve_compound_identifier, adding additional handling for identifiers referring to columns created by multiple/consecutive joins that are not in the original/leftmost frame schema, and tidying up some other identifier resolution (mostly by preferring match over if/else chains).

These improvements finally allow us to support expressions (and literals) in join constraints, not just column names. All existing tests pass, and plenty of new test coverage was added.

Example

import polars as pl

df1 = pl.DataFrame({
    "code": ["HELLO", "WORLD", "FOO"],
    "value": [100, 200, 300],
})
df2 = pl.DataFrame({
    "code": ["hello", "world", "bar"],
    "value": [-1.2345, 4.5678, 2.2222],
})

Can now support a join condition on "LOWER(<col>)":

df_res = pl.sql("""
    SELECT
      df1.code AS code1,
      df2.code AS code2,
      (df1.value * df2.value) AS val
    FROM df1
    INNER JOIN df2 ON LOWER(df1.code) = df2.code
""").collect()

# shape: (2, 3)
# ┌───────┬───────┬─────────┐
# │ code1 ┆ code2 ┆ val     │
# │ ---   ┆ ---   ┆ ---     │
# │ str   ┆ str   ┆ f64     │
# ╞═══════╪═══════╪═════════╡
# │ HELLO ┆ hello ┆ -123.45 │
# │ WORLD ┆ world ┆ 913.56  │
# └───────┴───────┴─────────┘

Previously this would have raised the following exception...

SQLInterfaceError: 
  only equi-join constraints (on identifiers) are currently supported...

...but now we can better introspect into the SQL expression, we can support more.

@github-actions github-actions bot added A-sql Area: Polars SQL functionality enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars labels Nov 3, 2025
@codecov
Copy link

codecov bot commented Nov 3, 2025

Codecov Report

❌ Patch coverage is 78.94737% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.80%. Comparing base (53eaf6c) to head (1064016).
⚠️ Report is 14 commits behind head on main.

Files with missing lines Patch % Lines
crates/polars-sql/src/sql_expr.rs 54.16% 22 Missing ⚠️
crates/polars-sql/src/context.rs 96.96% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #25132      +/-   ##
==========================================
- Coverage   81.81%   81.80%   -0.01%     
==========================================
  Files        1708     1708              
  Lines      236102   236162      +60     
  Branches     3005     3005              
==========================================
+ Hits       193159   193202      +43     
- Misses      42175    42192      +17     
  Partials      768      768              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ritchie46 ritchie46 merged commit a4e27ff into pola-rs:main Nov 4, 2025
36 checks passed
@alexander-beedie alexander-beedie deleted the enhanced-sql-joins branch November 4, 2025 11:23
jqnatividad added a commit to dathere/qsv that referenced this pull request Nov 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-sql Area: Polars SQL functionality enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Not able to filter on JOIN level for SQL interface

2 participants