Skip to content

[BI-2065] - Experiment overwrite missing phenotype validations#340

Merged
HMS17 merged 7 commits intodevelopfrom
bug/BI-2065
May 6, 2024
Merged

[BI-2065] - Experiment overwrite missing phenotype validations#340
HMS17 merged 7 commits intodevelopfrom
bug/BI-2065

Conversation

@HMS17
Copy link
Contributor

@HMS17 HMS17 commented Mar 19, 2024

Description

Story: BI-2065 - Experiment overwrite does not do all data validations on phenotype data

Updated logic in ExperimentProcessor.java:validateObservations to run validations in the overwrite case

Dependencies

n/a

Testing

  • Have one ontology term for each type of scale class
  • Create experiment files, one with valid data for a phenotype corresponding to each scale class, and valid data for at least one timestamp, and one with invalid data for said phenotypes/timestamp
  • Import first experiment file, should import successfully
  • Import second experiment file, should get corresponding error messages for each incorrect phenotype and timestamp value

Checklist:

  • I have performed a self-review of my own code
  • I have tested my code and ensured it meets the acceptance criteria of the story
  • I have tested that my code works with both the brapi-java-server and BreedBase
  • I have create/modified unit tests to cover this change
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to documentation
  • I have run TAF: <please include a link to TAF run>

@github-actions github-actions bot added the bug Something isn't working label Mar 19, 2024
@HMS17 HMS17 marked this pull request as ready for review March 20, 2024 02:58
@HMS17 HMS17 requested review from a team, davedrp and nickpalladino and removed request for a team March 20, 2024 02:58
@davedrp davedrp self-assigned this Mar 20, 2024
@HMS17 HMS17 requested review from a team, davedrp and mlm483 and removed request for a team and davedrp March 21, 2024 21:47
@mlm483 mlm483 self-assigned this Mar 26, 2024
@davedrp davedrp assigned davedrp and mlm483 and unassigned davedrp and mlm483 Mar 27, 2024
Copy link
Member

@nickpalladino nickpalladino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked a little more and saw that NAs are being skipped in validateObservationValue so the point I brought up shouldn't be an issue. I'll re-review after the changes you're working on are done.

Copy link
Contributor

@mlm483 mlm483 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found two overwrite scenarios which I think need to be handled differently, although that could happen in a different story.

Empty timestamps cause unhandled exception when overwriting

To reproduce:

  • Upload germplasm.xls, ontology.xls and minimal_experiment.xlsx. This experiment file has a TS column with some empty values.
  • After the upload succeeds, download the experiment file with timestamps for the "Some Phenotypes with TS" experiment you just created.
  • Reupload this file (you can change some of the pheno values if you want).
  • The empty timestamps, which were successfully uploaded initially, result in an unhandled DateTimeParseException.

Desired Behavior:

  • If empty timestamps are allowed in the initial upload, they should be allowed in the overwrite.

"NA" values are treated as "no change" when overwriting

To reproduce:

  • Upload germplasm.xls, ontology.xls and experiment.xlsx.
  • Download the resulting experiment.
  • Change some pheno values to "NA" and reupload to overwrite.
  • Upload succeeds, but pheno values changed to "NA" are treated as if they are unchanged.

Desired behavior:

  • Per Shawn's comment, "NA" should be saved to the database when overwriting.

} else if (existingObsByObsHash.containsKey(importHash) && !isObservationMatched(importHash, importObsValue, phenoCol, rowNum)) {

// different data means validations still need to happen
// TODO consider moving these two calls into a separate method since called twice together
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this referring to validateObservationValue and validateTimeStampValue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, since we have validation in two places in the method (one for the overwrite case and one for the non-overwrite case, it feels like it should be cleaned up in some manner)

@HMS17
Copy link
Contributor Author

HMS17 commented Apr 5, 2024

I found two overwrite scenarios which I think need to be handled differently, although that could happen in a different story.

Empty timestamps cause unhandled exception when overwriting

To reproduce:

  • Upload germplasm.xls, ontology.xls and minimal_experiment.xlsx. This experiment file has a TS column with some empty values.
  • After the upload succeeds, download the experiment file with timestamps for the "Some Phenotypes with TS" experiment you just created.
  • Reupload this file (you can change some of the pheno values if you want).
  • The empty timestamps, which were successfully uploaded initially, result in an unhandled DateTimeParseException.

Desired Behavior:

  • If empty timestamps are allowed in the initial upload, they should be allowed in the overwrite.

"NA" values are treated as "no change" when overwriting

To reproduce:

  • Upload germplasm.xls, ontology.xls and experiment.xlsx.
  • Download the resulting experiment.
  • Change some pheno values to "NA" and reupload to overwrite.
  • Upload succeeds, but pheno values changed to "NA" are treated as if they are unchanged.

Desired behavior:

  • Per Shawn's comment, "NA" should be saved to the database when overwriting.

I can look into seeing how extensive those changes might be, since it would be neater to have it all handled here if it doesn't increase scope too much rather than having to list still failing overwrite conditions.

HMS17 added 2 commits May 2, 2024 09:45
Handled case where present observation but blank associated timestamp led to an unhandled attempt to parse said blank timestamp
@mlm483
Copy link
Contributor

mlm483 commented May 2, 2024

@HMS17 The DateTimeParseException is fixed. 👍

I'm seeing odd behavior when overwriting observation values with the string "NA", it isn't shown in the preview and seems to be converted to null based on the observations table view after overwriting. You may want to talk to Shawn about the desired behavior, "NA" should be saved to the database when overwriting. I think she wants the literal string "NA" written to the database rather than null in this scenario.

@mlm483 mlm483 self-requested a review May 6, 2024 17:05
Copy link
Contributor

@mlm483 mlm483 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After getting bi-api to build with the updated tablesaw dependency, everything is working as expected.

@HMS17 HMS17 requested review from mlm483 and nickpalladino May 6, 2024 17:44
@HMS17 HMS17 merged commit 38f35d0 into develop May 6, 2024
@HMS17 HMS17 deleted the bug/BI-2065 branch May 6, 2024 20:21
@HMS17 HMS17 restored the bug/BI-2065 branch May 6, 2024 20:42
@mlm483
Copy link
Contributor

mlm483 commented May 8, 2024

I merged bug/BI-2065 into release/0.9.1, ed1fd02.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants