Skip to content

[issue-2520] [SDK] Added Sycophancy Evaluation Metric#2624

Merged
vincentkoc merged 27 commits intocomet-ml:mainfrom
yashkumar2603:add-sycophancy-evaluation
Nov 10, 2025
Merged

[issue-2520] [SDK] Added Sycophancy Evaluation Metric#2624
vincentkoc merged 27 commits intocomet-ml:mainfrom
yashkumar2603:add-sycophancy-evaluation

Conversation

@yashkumar2603
Copy link
Copy Markdown
Contributor

@yashkumar2603 yashkumar2603 commented Jun 29, 2025

Details

Resolves #2520
This PR adds the SycEval metric for evaluating sycophantic behavior in large language models. The metric tests whether models change their responses based on user pressure rather than maintaining independent reasoning by presenting rebuttals of varying rhetorical strength.
It is based on this paper https://arxiv.org/pdf/2502.08177
as linked in the issue.

Key Features:

  • Multi-step evaluation process: Initial classification then Rebuttal generation then Response evaluation then Sycophancy detection
  • Configurable rebuttal types: Simple, ethos, justification, and citation-based rebuttals
  • Context modes: In-context and preemptive rebuttal presentation
  • Separate rebuttal model: Uses dedicated model (defaults to llama3-8b) to avoid contamination
  • Binary scoring: Returns 0.0 (no sycophancy) or 1.0 (sycophancy detected)
  • Detailed metadata: Includes initial/rebuttal classifications and sycophancy type

Implementation:

  • SycEval class with sync/async scoring methods
  • Response classification and parsing
  • Error handling and validation for all classification types
  • can be imported using from opik.evaluation.metrics import SycEval in SDK easily, I tried to follow the coding style of the project, and other things mentioned in the contributing doc.

Issues

I faced one problem, I wasnt able to figure out a way to add the different results found out by the sycophancy analysis, such as sycophancy_type into the scores category in FrontEnd, as that would have required a STRING type in the LLM_SCHEMA_TYPE
So I instead made those available on the SDK, but not on the frontend. Please suggest something to tackle this problem. Guide me to make the necessary improvements in PR.

Documentation

  • Added comprehensive docstrings with usage examples
  • Updated evaluation metrics documentation
  • Added configuration parameter explanations
  • Included research context and score interpretation guidelines (a little when needed)

Working Video

2025-06-29_23-50-51.mp4

/claim #2520

Edit: added working video I forgot to add

@yashkumar2603
Copy link
Copy Markdown
Contributor Author

yashkumar2603 commented Jun 29, 2025

Hello, @vincentkoc please review and suggest changes if any. Also kindly help me understand the frontend issue mentioned above.

  • Thank you 😃

@alexkuzmik alexkuzmik requested a review from yaricom July 2, 2025 11:56
@vincentkoc
Copy link
Copy Markdown
Member

Hello, @vincentkoc please review and suggest changes if any. Also kindly help me understand the frontend issue mentioned above.

  • Thank you 😃

Thanks! @yashkumar2603 the team will review and circle back.

@yaricom
Copy link
Copy Markdown
Contributor

yaricom commented Jul 4, 2025

Hi @yashkumar2603 ! Thank you for your work on this PR — it looks very promising. I’ve left a few review comments. Additionally, I’d like to ask you to add a unit test for your metric that uses mocked model calls but verifies the scoring logic in both synchronous and asynchronous modes.

Please take a look how other LLM judge metrics are tested.

@yashkumar2603
Copy link
Copy Markdown
Contributor Author

yashkumar2603 commented Jul 4, 2025

Thanks for the review @yaricom !!
I am glad you liked the work. I will surely take a look at the unit tests, fix the reviews and update the PR.
Thank you again for your time !!

@yashkumar2603
Copy link
Copy Markdown
Contributor Author

I have added the unit tests and also made necessary changes based on the reviews.
Kindly review 🙏🏾

@yashkumar2603 yashkumar2603 requested a review from yaricom July 4, 2025 17:37
@yaricom
Copy link
Copy Markdown
Contributor

yaricom commented Jul 7, 2025

@aadereiko Could you please take a look at frontend changes if you have any comments or suggestions.

@aadereiko
Copy link
Copy Markdown
Collaborator

@yaricom @yashkumar2603
The FE part looks good :)

@yashkumar2603 yashkumar2603 requested a review from yaricom July 7, 2025 17:59
@yashkumar2603
Copy link
Copy Markdown
Contributor Author

I have made the changes mentioned in the above comment, moved the test from integration to unit.
You are right, I had misplaced it. Thank you for pointing out.
Kindly review, merge it.

Thank you for your time.

@yaricom
Copy link
Copy Markdown
Contributor

yaricom commented Jul 8, 2025

Dear @yashkumar2603 ! Thank you for committing the changes. Please run all tests locally using the OPIK server to ensure there are no unexpected errors. You can find detailed instructions on how to run the OPIK server here: https://www.comet.com/docs/opik/quickstart

@andrescrz
Copy link
Copy Markdown
Member

Hi @yashkumar2603

Thank you for your contribution! 🙏 It looks like there are some merge conflicts that need to be resolved before we can continue. When you have a chance, could you please update the branch? Once the conflicts are resolved, we’ll be happy to provide a new review.

Let us know if you have any questions or need assistance!

@yashkumar2603
Copy link
Copy Markdown
Contributor Author

Thank you for your reviews @yaricom @andrescrz 😃
I will surely make the changes, resolve conflicts and then update the PR. Been a busy for a few days, I will surely do this when i get the time.

@yashkumar2603
Copy link
Copy Markdown
Contributor Author

Hello all!!
I have updated the code based on recommendations by @yaricom and also resolved the merge conflicts from main.
The llama-3 error was coming in tests because i was teesting with it on my local and forgot to change it to the model mentioned in the original paper. Really sorry for the confusion.

Please review @andrescrz.

Thank you for your time 🙏🏾

1. Implemented suggestions from reviews on the previous commit and made
necessary changes.
2. Added unit tests for the sycophancy_evaluation_metric just like how
it is applied for the other metrics
Moved test for invalid score into the unit tests as it uses a dummy
model and doesnt need to be in integration tests. removed unnecessary
@model_parametrizer from the same test.
@yashkumar2603 yashkumar2603 force-pushed the add-sycophancy-evaluation branch from 36e2a46 to 0423038 Compare August 3, 2025 15:23
@yashkumar2603 yashkumar2603 force-pushed the add-sycophancy-evaluation branch from 0423038 to 2a1bc01 Compare August 3, 2025 15:28
@andrescrz
Copy link
Copy Markdown
Member

Hi @yashkumar2603

I'd appreciate if you could take the following actions:

  1. Solve merge conflicts.
  2. Fix the linting errors per @yaricom comment.
  3. Review and address the co-pilot comments. Please use your best judgement to discard those comments that make no sense.

It'd be nice to get this PR to the finish line soon, so our users can enjoy it :)

Thank you very much for all your effort here.

@vincentkoc
Copy link
Copy Markdown
Member

@yashkumar2603 any luck updating your PR? Bounty still stands, almost finished

@vincentkoc
Copy link
Copy Markdown
Member

@yashkumar2603 @yaricom i have addressed the issues, i have removed the LLM-as-a-judge in the FE (frontend) as its not the same metric implementation due to lack of rebutal model in UI. Will merge once the tests pass including lint.

@vincentkoc vincentkoc changed the title Added Sycophancy Evaluation Metric in SDK, FE, Docs [issue-2520] [SDK] Added Sycophancy Evaluation Metric in SDK, FE, Docs Nov 10, 2025
@vincentkoc vincentkoc requested a review from Copilot November 10, 2025 00:28
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 8 out of 10 changed files in this pull request and generated 3 comments.

vincentkoc and others added 4 commits November 9, 2025 16:31
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…rics/sycophancy_evaluation.mdx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@vincentkoc
Copy link
Copy Markdown
Member

vincentkoc commented Nov 10, 2025

Hi @yashkumar2603

I'd appreciate if you could take the following actions:

1. Solve merge conflicts.

2. Fix the linting errors per @yaricom comment.

3. Review and address the co-pilot comments. Please use your best judgement to discard those comments that make no sense.

It'd be nice to get this PR to the finish line soon, so our users can enjoy it :)

Thank you very much for all your effort here.

@yaricom resolved, ready to merge

unit test
image

intergration test:
image

@vincentkoc vincentkoc requested a review from yaricom November 10, 2025 00:33
@vincentkoc vincentkoc changed the title [issue-2520] [SDK] Added Sycophancy Evaluation Metric in SDK, FE, Docs [issue-2520] [SDK] Added Sycophancy Evaluation Metric Nov 10, 2025
@vincentkoc vincentkoc merged commit be02cc5 into comet-ml:main Nov 10, 2025
8 of 96 checks passed
vincentkoc added a commit that referenced this pull request Nov 10, 2025
 into feat/optimizer-hybrid

* 'feat/optimizer-hybrid' of https://github.com/comet-ml/opik:
  [issue-2520] [SDK] Added Sycophancy Evaluation Metric (#2624)
vincentkoc added a commit that referenced this pull request Nov 10, 2025
 into vk/optimizer-oa_agent

* 'vk/optimizer-oa_agent' of https://github.com/comet-ml/opik:
  [issue-2520] [SDK] Added Sycophancy Evaluation Metric (#2624)
  [OPIK-2986] [FE] Refactor comparison pages to use NavigationTag component (#4006)
  [OPIK-2992] [FE] Add tooltips to DateTag component (#4005)
  [OPIK-2993] [FE] Add tooltips to feedback scores and tags icons (#4007)
  [OPIK-3008] [FE] Refactor: NavigationTag infrastructure (#3972)
awkoy pushed a commit that referenced this pull request Nov 12, 2025
* Added Sycophancy Evaluation Metric in SDK, FE, Docs

* Added unit tests, fixed reviews

1. Implemented suggestions from reviews on the previous commit and made
necessary changes.
2. Added unit tests for the sycophancy_evaluation_metric just like how
it is applied for the other metrics

* Fixed reviews on added tests.

Moved test for invalid score into the unit tests as it uses a dummy
model and doesnt need to be in integration tests. removed unnecessary
@model_parametrizer from the same test.

* Resolving merge conflicts and improving tests from feedback

* Updating default rebuttal model for LiteLLM compatibility

* Added explanations in examples

* Update test_evaluation_metrics.py for formatting after new stuff from main

* Update test_evaluation_metrics.py

* Update sdks/python/src/opik/evaluation/metrics/llm_judges/syc_eval/metric.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update sdks/python/src/opik/evaluation/metrics/llm_judges/syc_eval/metric.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update sdks/python/src/opik/evaluation/metrics/llm_judges/syc_eval/metric.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update sdks/python/src/opik/evaluation/metrics/llm_judges/syc_eval/metric.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update sycophancy_evaluation.mdx

* Update metric.py

* Update llm.ts

* Update llm.ts

* Update __init__.py

* Update metric.py

* chore: lint

* Update sdks/python/examples/metrics.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update apps/opik-documentation/documentation/fern/docs/evaluation/metrics/sycophancy_evaluation.mdx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update parser.py

* Update parser.py

---------

Co-authored-by: Iaroslav Omelianenko <yaric_mail@yahoo.com>
Co-authored-by: Vincent Koc <vincentk@comet.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FR]: New Evaluaton Metric "LLM Sycophancy" (SycEval)

6 participants