Skip to content

fixed annotation deduplicate function#674

Merged
YukiMatsuzawa merged 2 commits intomasterfrom
fixed_annotation_duplicatefunction
Nov 28, 2025
Merged

fixed annotation deduplicate function#674
YukiMatsuzawa merged 2 commits intomasterfrom
fixed_annotation_duplicatefunction

Conversation

@htsugawa
Copy link
Copy Markdown
Contributor

No description provided.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes the annotation deduplication function by introducing a new unified MatchResultAnnotationDeduplicator class that deduplicates alignment spots by Library ID, InChIKey, and Name, and enhances the existing MspAnnotationDeduplicator with similar multi-criteria deduplication.

  • Replaces separate MSP and TextDB deduplicators with a unified MatchResultAnnotationDeduplicator
  • Adds InChIKey and Name-based deduplication to MspAnnotationDeduplicator
  • Simplifies deduplication logic in AlignmentRefiner and DimsAlignmentRefiner

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File Description
src/MSDIAL5/MsdialCore/Algorithm/Alignment/SpotAction.cs Adds new MatchResultAnnotationDeduplicator class and extends MspAnnotationDeduplicator with InChIKey and Name-based deduplication
src/MSDIAL5/MsdialCore/Algorithm/Alignment/AlignmentRefiner.cs Replaces separate MSP and TextDB deduplicators with unified MatchResultAnnotationDeduplicator
src/MSDIAL5/MsdialDimsCore/Algorithm/Alignment/DimsAlignmentRefiner.cs Replaces separate MSP and TextDB deduplicators with unified MatchResultAnnotationDeduplicator

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

}

// by InChIKey
spotList = spots.Where(n => n.MspBasedMatchResult.InChIKey.Length > 1).OrderByDescending(spot => spot.MspBasedMatchResult.InChIKey).ToList();
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing null or empty check for InChIKey. The filter only checks InChIKey.Length > 1 without first checking if InChIKey is null or empty, which could cause a NullReferenceException. Consider adding:

spotList = spots.Where(n => !string.IsNullOrEmpty(n.MspBasedMatchResult?.InChIKey) && n.MspBasedMatchResult.InChIKey.Length > 1).OrderByDescending(spot => spot.MspBasedMatchResult.InChIKey).ToList();
Suggested change
spotList = spots.Where(n => n.MspBasedMatchResult.InChIKey.Length > 1).OrderByDescending(spot => spot.MspBasedMatchResult.InChIKey).ToList();
spotList = spots.Where(n => !string.IsNullOrEmpty(n.MspBasedMatchResult?.InChIKey) && n.MspBasedMatchResult.InChIKey.Length > 1).OrderByDescending(spot => spot.MspBasedMatchResult.InChIKey).ToList();

Copilot uses AI. Check for mistakes.
if (_param.OnlyReportTopHitInTextDBSearch) {
var textDbDedupicator = new TextAnnotationDeduplicator();
textDbDedupicator.Process(alignments);
if (_param.OnlyReportTopHitInMspSearch) {
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OnlyReportTopHitInTextDBSearch parameter is no longer being checked after this change. If the new MatchResultAnnotationDeduplicator is intended to handle both MSP and TextDB deduplication, the parameter should either be removed from the codebase or the condition should be updated to:

if (_param.OnlyReportTopHitInMspSearch || _param.OnlyReportTopHitInTextDBSearch) {

Otherwise, the TextDB-specific parameter has become non-functional.

Suggested change
if (_param.OnlyReportTopHitInMspSearch) {
if (_param.OnlyReportTopHitInMspSearch || _param.OnlyReportTopHitInTextDBSearch) {

Copilot uses AI. Check for mistakes.
{
public void Process(IEnumerable<AlignmentSpotProperty> spots) {
var spotList = spots.OrderByDescending(spot => spot.MspID).ToList();

Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential NullReferenceException if spotList is empty. The first pass (lines 121-143) doesn't check if the list has any elements before accessing spotList[currentPeakId] at lines 124 and 131. Consider adding a guard:

if (spotList.Count == 0) return;
Suggested change
if (spotList.Count == 0) return;

Copilot uses AI. Check for mistakes.
for (int i = 1; i < spotList.Count; i++) {
if (spotList[i].MatchResults.Representative.Name.ToLower() != currentName) {
currentPeakId = i;
currentName = spotList[currentPeakId].MatchResults.Representative.Name;
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent case handling for Name comparison. Line 92 uses ToLower() when setting currentName, but line 96 doesn't apply ToLower() when updating currentName. This should be:

currentName = spotList[currentPeakId].MatchResults.Representative.Name.ToLower();
Suggested change
currentName = spotList[currentPeakId].MatchResults.Representative.Name;
currentName = spotList[currentPeakId].MatchResults.Representative.Name.ToLower();

Copilot uses AI. Check for mistakes.
}

public void Process(IEnumerable<AlignmentSpotProperty> spots) {
var spotList = spots.Where(n => n.IsReferenceMatched(_evaluator)).OrderByDescending(spot => spot.MatchResults.Representative.LibraryID).ToList();
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential NullReferenceException if spotList is empty. The code accesses spotList[currentPeakId] at line 35 without checking if the list has any elements. Consider adding a check:

if (spotList.Count == 0) return;
Suggested change
var spotList = spots.Where(n => n.IsReferenceMatched(_evaluator)).OrderByDescending(spot => spot.MatchResults.Representative.LibraryID).ToList();
var spotList = spots.Where(n => n.IsReferenceMatched(_evaluator)).OrderByDescending(spot => spot.MatchResults.Representative.LibraryID).ToList();
if (spotList.Count == 0) return;

Copilot uses AI. Check for mistakes.
@@ -48,13 +48,8 @@ public DimsAlignmentRefiner(MsdialDimsParameter param, IupacDatabase iupac, IMat

private void Deduplicate(IList<AlignmentSpotProperty> alignments) { // TODO: change deduplicate process (msp, textdb, metabolite name...)
if (_param.OnlyReportTopHitInMspSearch) { //to remove duplicate identifications
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OnlyReportTopHitInTextDBSearch parameter is no longer being checked after this change. If the new MatchResultAnnotationDeduplicator is intended to handle both MSP and TextDB deduplication, the parameter should either be removed from the codebase or the condition should be updated to check both parameters. Otherwise, the TextDB-specific parameter has become non-functional.

Suggested change
if (_param.OnlyReportTopHitInMspSearch) { //to remove duplicate identifications
if (_param.OnlyReportTopHitInMspSearch || _param.OnlyReportTopHitInTextDBSearch) { //to remove duplicate identifications

Copilot uses AI. Check for mistakes.
@YukiMatsuzawa YukiMatsuzawa merged commit b06064a into master Nov 28, 2025
9 checks passed
@YukiMatsuzawa YukiMatsuzawa deleted the fixed_annotation_duplicatefunction branch November 28, 2025 03:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants