Skip to content

Sequence prob computation#223

Merged
CharlesMoslonka merged 21 commits intomainfrom
sequence_prob_computation
Mar 9, 2026
Merged

Sequence prob computation#223
CharlesMoslonka merged 21 commits intomainfrom
sequence_prob_computation

Conversation

@CharlesMoslonka
Copy link
Copy Markdown
Collaborator

Big PR for the sequence probability computation.
Tagged for release.

BREAKING CHANGE: Since we introduce other types of parsing for the output, we change the name for it to be clearer.
We add a new parser (just for OpenAI objects, for now) for the sampled tokens probabilities, a new abstract class, and a new scorer that returns the probability of the genereted tokens and of the whole sequence.
@gemini-code-assist

This comment was marked as resolved.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces sequence probability computation and refactors the uncertainty detector base classes. While the overall structure is good, there are significant inconsistencies between the vLLM and OpenAI implementations for sampled token logprobs. Specifically, the OpenAI functions only process the first output/choice and return a 1D array, whereas the vLLM function returns a 2D array (list of sequences). This inconsistency will cause the SentenceProbabilityScorer to fail or produce incorrect results when used with OpenAI outputs. Additionally, there are some type hint mismatches and potential IndexError risks in the OpenAI parsing logic.

Comment thread src/artefactual/preprocessing/openai_parser.py Outdated
Comment thread src/artefactual/preprocessing/openai_parser.py Outdated
Comment thread src/artefactual/preprocessing/openai_parser.py Outdated
Comment thread src/artefactual/preprocessing/openai_parser.py Outdated
Comment thread src/artefactual/preprocessing/parser.py Outdated
Comment thread src/artefactual/preprocessing/vllm_parser.py Outdated
Comment thread src/artefactual/scoring/probability_methods/sentence_proba.py Outdated
@CharlesMoslonka CharlesMoslonka marked this pull request as draft February 23, 2026 16:56
@CharlesMoslonka CharlesMoslonka marked this pull request as ready for review February 23, 2026 17:32
@CharlesMoslonka
Copy link
Copy Markdown
Collaborator Author

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant refactoring to better distinguish between parsing log probabilities for entropy-based methods and for sequence probability calculations. It renames several functions for clarity (e.g., parse_model_outputs to parse_top_logprobs) and adds new functionality to compute sequence probabilities from sampled token logprobs. The class hierarchy for uncertainty detectors has been refined, and the test suite has been reorganized and expanded accordingly. Overall, these changes improve the structure and clarity of the codebase. I've provided a few suggestions to improve type hints and docstrings for better accuracy and developer experience.

Comment thread src/artefactual/preprocessing/parser.py Outdated
Comment thread src/artefactual/preprocessing/openai_parser.py Outdated
Comment thread src/artefactual/preprocessing/openai_parser.py Outdated
Comment thread src/artefactual/preprocessing/vllm_parser.py Outdated
Comment thread src/artefactual/preprocessing/vllm_parser.py Outdated
Comment thread src/artefactual/preprocessing/vllm_parser.py Outdated
Comment thread src/artefactual/preprocessing/openai_parser.py Outdated
Comment thread src/artefactual/preprocessing/openai_parser.py Outdated
Comment thread src/artefactual/scoring/probability_methods/sentence_proba.py Outdated
@CharlesMoslonka CharlesMoslonka marked this pull request as draft March 5, 2026 17:31
@CharlesMoslonka CharlesMoslonka marked this pull request as ready for review March 5, 2026 17:31
Copy link
Copy Markdown
Collaborator

@chicham chicham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small fixes before merge. Test coverage is handled by PR #224.


def compute_token_scores(self, inputs: list[NDArray[np.floating]]) -> list[NDArray[np.floating]]:
"""
Compute sentence-level probability scores by summing token log probabilities.
Copy link
Copy Markdown
Collaborator

@chicham chicham Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring is copy-pasted from compute. It says "sentence-level" and "summing" but this method exponentiates each token individually via np.exp(seq). It computes token-level probabilities, not sentence-level.

@chicham
Copy link
Copy Markdown
Collaborator

chicham commented Mar 6, 2026

Test coverage: This PR does not include unit tests for the new parser functions (vllm_sampled_tokens_logprobs, sampled_tokens_logprobs_responses_api, sampled_tokens_logprobs_chat_completion_api). These are covered by PR #224 which is now ready for review.

@CharlesMoslonka can you merge #224 after this one?

Comment thread src/artefactual/preprocessing/vllm_parser.py Outdated
@CharlesMoslonka CharlesMoslonka merged commit 856a555 into main Mar 9, 2026
10 checks passed
@CharlesMoslonka CharlesMoslonka deleted the sequence_prob_computation branch March 9, 2026 09:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants