Skip to content

Add sql metrics#63

Merged
pantonante merged 13 commits intomainfrom
add-sql-metrics
May 22, 2024
Merged

Add sql metrics#63
pantonante merged 13 commits intomainfrom
add-sql-metrics

Conversation

@yisz
Copy link
Contributor

@yisz yisz commented May 21, 2024

Added an AST comparison for SQL based on the weighted changes in diff tree.


🚀 This description was created by Ellipsis for commit 6eb410a

Summary:

Enhanced SQL code evaluation with new metrics, updated documentation, and added tests, including specific file paths and a new dependency.

Key points:

  • Added SQLSyntaxMatch and SQLASTSimilarity metrics for SQL code evaluation.
  • Updated README.md to include new SQL metrics in the metrics table.
  • Refactored Python package structure to include SQL metrics.
  • Updated documentation for new SQL metrics.
  • Added tests for new SQL metrics.
  • Introduced SQLSyntaxMatch and SQLASTSimilarity metrics in continuous_eval/metrics/code/sql/deterministic.py.
  • Updated README.md and documentation under /docs/src/content/docs/metrics/Code/Deterministic/ for new SQL metrics.
  • Added tests for new SQL metrics in tests/code_metrics_test.py.
  • Updated pyproject.toml to include sqlglot as an optional dependency.

Generated with ❤️ by ellipsis.dev

@yisz yisz requested a review from pantonante May 21, 2024 03:00
@yisz yisz self-assigned this May 21, 2024
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 85efa36 in 56 seconds

More details
  • Looked at 368 lines of code in 12 files
  • Skipped 1 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. README.md:125
  • Draft comment:
    LGTM. The new SQL metrics are well integrated and documented.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The PR adds new metrics for SQL code evaluation, specifically SQLSyntaxMatch and SQLASTSimilarity. The implementation seems to correctly handle the formatting and AST comparison for SQL queries. The documentation and tests are updated accordingly to reflect these new metrics. The PR also includes necessary dependency updates in pyproject.toml for sqlparse and sqlglot, which are used in the SQL metrics. Overall, the PR appears to be well-structured and addresses the intended functionality of adding SQL metrics.

Workflow ID: wflow_c4y8tPTwiBNh7Dyg


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would drop the sql from the name, the file is already in the sql folder making it "obvious"

@@ -0,0 +1,77 @@
from typing import List, Union

import sqlparse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can avoid sqlparse to format the query? Can we use sqlglot only. It seems sqlglot can format but also optimize in a common format https://github.com/tobymao/sqlglot?tab=readme-ov-file#sql-optimizer

Comment on lines +69 to +77
if isinstance(change, Keep):
return 0
elif isinstance(change, Update):
return 1.5 # Updates are significant as they imply a modification in function or value.
elif isinstance(change, Insert) or isinstance(change, Remove):
return 1 # Inserts and Removes affect the structure and content but are simpler than updates.
elif isinstance(change, Move):
return 0.5 # Moves are generally less impactful as they simply change the order.
return 1 # Default weight for other types of changes
Copy link
Contributor

@pantonante pantonante May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weights should be modifiable through some configuration parameters (see

class DeterministicFaithfulnessConfig:
rouge_precision_threshold: float = 0.5
token_overlap_precision_threshold: float = 0.5
)

pyproject.toml Outdated
Comment on lines +28 to +29
sqlparse = "^0.5.0"
sqlglot = "^23.17.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these packages should be optional/extra

return {"SQL_Syntax_Match": max_match_score}


class SQLASTSimilarity(Metric):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do these two metrics handle comments?

@yisz yisz requested a review from pantonante May 22, 2024 16:00
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on da2d38b in 4 minutes and 0 seconds

More details
  • Looked at 329 lines of code in 8 files
  • Skipped 1 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_wXck8lQebKerzqVB


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@pantonante pantonante merged commit 8124a22 into main May 22, 2024
@pantonante pantonante deleted the add-sql-metrics branch May 22, 2024 20:08
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 6eb410a in 2 minutes and 54 seconds

More details
  • Looked at 264 lines of code in 5 files
  • Skipped 1 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_cNrQQ3jGozSZ5BIb


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants