Conversation
There was a problem hiding this comment.
👍 Looks good to me! Reviewed everything up to e399a06 in 1 minute and 38 seconds
More details
- Looked at
1495lines of code in23files - Skipped
1files when reviewing. - Skipped posting
1drafted comments based on config settings.
1. continuous_eval/eval/__init__.py:9
- Draft comment:
The import ofEvaluationRunneris correctly added to reflect the new changes in the system architecture. - Reason this comment was not posted:
Confidence changes required:0%
The PR description mentions the removal of the Evaluation Manager and the creation of a new Pipeline Logger class. The code changes reflect these updates, including the addition of the Evaluation Runner and updates to the documentation. The PR also includes other bug fixes and updates to the documentation to reflect the new changes in the system architecture.
Workflow ID: wflow_BKTc0VGtqrOzCZod
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
There was a problem hiding this comment.
👍 Looks good to me! Incremental review on 068de1f in 1 minute and 50 seconds
More details
- Looked at
231lines of code in7files - Skipped
0files when reviewing. - Skipped posting
1drafted comments based on config settings.
1. docs/src/content/docs/pipeline/pipeline_logger.mdx:25
- Draft comment:
The usage of thevalueparameter here is inconsistent with its definition earlier in the document. Thevalueparameter is meant to store the output of a module, not the name of a tool. Iftool_nameis intended to be logged, it should be part of a different parameter or method designed for logging tool usage.
pipelog.log(uid=sample_uid, module="module_name", output="output_of_module", tool_name="tool_name", tool_args={"a": a, "b": b})
- Reason this comment was not posted:
Confidence of 0% on close inspection, compared to threshold of 50%.
Workflow ID: wflow_VLyCLCKwI51TlYET
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
|
@pantonante I updated examples in the docs & readme in couple more places to reflect the latest eval runner & logger. The examples repo and full examples in the docs still uses eval_manager; we can update those once we merge these changes in. Please check if my doc changes are good before merging into main! Thanks! |
There was a problem hiding this comment.
❌ Changes requested. Incremental review on c90711a in 1 minute and 16 seconds
More details
- Looked at
77lines of code in1files - Skipped
0files when reviewing. - Skipped posting
0drafted comments based on config settings.
Workflow ID: wflow_BQMFBWtw3BCY7Njj
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
| @@ -199,25 +200,24 @@ print(pipeline.graph_repr()) # optional: visualize the pipeline | |||
| Now you can run the evaluation on your pipeline | |||
There was a problem hiding this comment.
The example code in the README still references the old EvaluationManager. Since the PR description mentions its removal, the documentation should be updated to reflect the new EvaluationRunner usage consistently throughout.
| Now you can run the evaluation on your pipeline | |
| # Update the example code to use `EvaluationRunner` instead of the old `EvaluationManager`. |
Changelog:
Summary:
Transitioned from
EvaluationManagertoEvaluationRunner, introducedPipelineLogger, updated documentation includingREADME.md, and refactored the codebase.Key points:
EvaluationManagerEvaluationRunnerandPipelineLoggerREADME.mdGenerated with ❤️ by ellipsis.dev