Skip to content

[BI-2046] Overwrite observations happening without reference to ObsUnitID#332

Merged
dmeidlin merged 40 commits intorelease/0.9from
bug/BI-2046
Apr 5, 2024
Merged

[BI-2046] Overwrite observations happening without reference to ObsUnitID#332
dmeidlin merged 40 commits intorelease/0.9from
bug/BI-2046

Conversation

@dmeidlin
Copy link
Contributor

@dmeidlin dmeidlin commented Feb 14, 2024

Description

Story: BI-2046

Validation errors are now returned for display in the experiment import preview for any rows referring to an existing environment and with the ObsUnitID blank.

Dependencies

bi-web release/0.9

Testing

Verify all tests pass in ExperimentFileImportTest

  1. import an experiment with observations
  2. download the exp data with ObsUnitId populated
  3. select multiple observations for updating and change phenotype values
  4. import and verify updates by viewing experiment details

Checklist:

  • I have performed a self-review of my own code
  • I have tested my code and ensured it meets the acceptance criteria of the story
  • I have tested that my code works with both the brapi-java-server and BreedBase
  • I have create/modified unit tests to cover this change
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to documentation
  • I have run TAF: <please include a link to TAF run>

@github-actions github-actions bot added the bug Something isn't working label Feb 14, 2024
@dmeidlin dmeidlin requested review from a team, mlm483 and nickpalladino and removed request for a team February 15, 2024 16:14
Copy link
Member

@nickpalladino nickpalladino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did @shawnyarnes want the Observation units to be looked up based on the ObsUnitID? It looks like in fetchOrCreateObsUnitPIO there is dependence on study information. I think ultimately she wanted updates to only rely on the ObsUnitID and none of the other data fields in the file.

If I try to update with a file like this:
exp4_download_obsunitupdate.xls

I get an exception:

java.lang.NumberFormatException: null
	at java.base/java.lang.Integer.parseInt(Integer.java:620)
	at java.base/java.lang.Integer.parseInt(Integer.java:776)
	at org.breedinginsight.brapps.importer.services.processors.ExperimentProcessor.yearToSeasonDbIdFromDatabase(ExperimentProcessor.java:1826)
	at org.breedinginsight.brapps.importer.services.processors.ExperimentProcessor.yearToSeasonDbId(ExperimentProcessor.java:1796)
	at org.breedinginsight.brapps.importer.services.processors.ExperimentProcessor.initNewBrapiData(ExperimentProcessor.java:560)
	at org.breedinginsight.brapps.importer.services.processors.ExperimentProcessor.process(ExperimentProcessor.java:233)
	at org.breedinginsight.brapps.importer.services.processors.ProcessorManager.process(ProcessorManager.java:63)
	at org.breedinginsight.brapps.importer.model.imports.experimentObservation.ExperimentImportService.process(ExperimentImportService.java:74)
	at org.breedinginsight.brapps.importer.services.FileImportService.lambda$processFile$9(FileImportService.java:427)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1771)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1763)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1016)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1665)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1598)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)

Base automatically changed from release/0.9 to develop March 12, 2024 16:11
@dmeidlin dmeidlin requested review from mlm483 and nickpalladino April 1, 2024 19:53
@dmeidlin dmeidlin changed the base branch from develop to release/0.9 April 1, 2024 19:59
Copy link
Contributor

@mlm483 mlm483 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested, working. Looks good, the only critical change is the error message text. Other than that I had a few thoughts and clarifying questions.

Error Message

  • There is duplication of "Import Cannot Proceed". See screenshot in one of my comments on ExperimentProcessor.java.
  • The error is a simple banner error stating "Experimental entities are missing ObsUnitIDs". I don't have an issue with this, but it wasn't clear if the error is intended to indicate row numbers or not. Would defer to you and Shawn on this.

"Reference" Observation Unit

  • Code and comments use the term Reference Observation Unit, could you clarify what "Reference" means in this context?


private static final String NAME = "Experiment";
private static final String MISSING_OBS_UNIT_ID_ERROR = "Experimental entities are missing ObsUnitIDs";
private static final String MISSING_OBS_UNIT_ID_ERROR = "Experimental entities are missing ObsUnitIDs. Import cannot proceed";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static final String MISSING_OBS_UNIT_ID_ERROR = "Experimental entities are missing ObsUnitIDs. Import cannot proceed";
private static final String MISSING_OBS_UNIT_ID_ERROR = "Experimental entities are missing ObsUnitIDs";

"Import cannot proceed." is duplicated, because ". Import cannot proceed." is appended by bi-web.
Screenshot 2024-04-02 at 12 43 21 PM

}

private void processAndCacheTrial(BrAPITrial existingTrial, Program program, String trialRefSource, Map<String, PendingImportObject<BrAPITrial>> trialByNameNoScope) {
private void processAndCacheTrial(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having "Cache" in the name suggests there might be interaction with our Redis cache. I get that it's caching in an instance variable, but if possible I might rename this and other similar methods to remove the word "Cache".

this.obsVarDatasetByName = initializeObsVarDatasetByName(program, experimentImportRows);
this.existingGermplasmByGID = initializeExistingGermplasmByGID(program, experimentImportRows);
// check for references to Deltabreed-generated observation units
referenceOUIds = collateReferenceOUIds(importRows);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example of instance variables mutated as side effects (collateReferenceOUIds mutates hasAllReferenceUnitIds and hasNoReferenceUnitIds). There is also the assignment which makes it even less obvious that there are side effects.

Comment on lines +199 to +200
initializeTrialsForExistingObservationUnits(program, trialByNameNoScope);
initializeStudiesForExistingObservationUnits(program, studyByNameNoScope);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are examples of mutating passed-by-reference variables.

Comment on lines +190 to +211
// check for references to Deltabreed-generated observation units
referenceOUIds = collateReferenceOUIds(importRows);

if (hasAllReferenceUnitIds) {
try {

// get all prior units referenced in import
pendingObsUnitByOUId = fetchReferenceObservationUnits(referenceOUIds, program);
observationUnitByNameNoScope = mapPendingObservationUnitByName(pendingObsUnitByOUId, program);
initializeTrialsForExistingObservationUnits(program, trialByNameNoScope);
initializeStudiesForExistingObservationUnits(program, studyByNameNoScope);
locationByName = initializeLocationByName(program, studyByNameNoScope);
obsVarDatasetByName = initializeObsVarDatasetForExistingObservationUnits(trialByNameNoScope, program);
existingGermplasmByGID = initializeGermplasmByGIDForExistingObservationUnits(observationUnitByNameNoScope, program);
for (Map.Entry<String, PendingImportObject<BrAPIObservationUnit>> unitEntry : pendingObsUnitByOUId.entrySet()) {
String unitId = unitEntry.getKey();
BrAPIObservationUnit unit = unitEntry.getValue().getBrAPIObject();
mapPendingTrialByOUId(unitId, unit, trialByNameNoScope, studyByNameNoScope, pendingTrialByOUId, program);
mapPendingStudyByOUId(unitId, unit, studyByNameNoScope, pendingStudyByOUId, program);
mapPendingLocationByOUId(unitId, unit, pendingStudyByOUId, locationByName, pendingLocationByOUId);
mapPendingObsDatasetByOUId(unitId, pendingTrialByOUId, obsVarDatasetByName, pendingObsDatasetByOUId);
mapGermplasmByOUId(unitId, unit, existingGermplasmByGID, pendingGermplasmByOUId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mixture of assignments, mutating instance variables as side effects, and mutating passed-by-reference variables confusing. I have to dig into every called method to understand what state is mutated where and the lack of a consistent pattern makes it harder for me to reason about what's happening.

Given it's working and we want to release v0.9 I don't recommend making large changes, but maybe the team can discuss what we want our best practices to be in an IT planning meeting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, I think we can address this as part of the v0.10 work and refactoring of the ExperimentProcessor

if (hasAllReferenceUnitIds) {
String unitName = obsUnitPIO.getBrAPIObject().getObservationUnitName();
String studyName = studyPIO.getBrAPIObject().getStudyName();
key = getObservationHash(studyName + unitName, variableName, studyName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is studyName prepended to unitName here?

Comment on lines +1349 to +1352
if (trialPio!=null && ImportObjectState.EXISTING==trialPio.getState() &&
(StringUtils.isBlank( importRow.getObsUnitID() )) && (envPio!=null && ImportObjectState.EXISTING==envPio.getState() ) ){
throw new UnprocessableEntityException(PREEXISTING_EXPERIMENT_TITLE);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conditional is pretty complex, a comment explaining the intention might be helpful.

Comment on lines +2227 to +2235
/**
* This function collates unique ObsUnitID values from a list of BrAPIImport objects.
* It iterates through the list and adds non-blank ObsUnitID values to a Set. It also checks for any repeated ObsUnitIDs.
*
* @param importRows a List of BrAPIImport objects containing ExperimentObservation data
* @return a Set of unique ObsUnitID strings
* @throws IllegalStateException if a repeated ObsUnitID is encountered
*/
private Set<String> collateReferenceOUIds(List<BrAPIImport> importRows) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to indicate that the hasAllReferenceUnitIds and hasNoReferenceUnitIds instance variables are set/mutated as well.

@dmeidlin
Copy link
Contributor Author

dmeidlin commented Apr 3, 2024

Updated the error message.
Reference OU ids are the ids supplied by the user via import, and the corresponding units are reference OUs. These are OUs referenced by the user in the import for updating.

@mlm483 mlm483 self-requested a review April 4, 2024 17:34
Copy link
Member

@nickpalladino nickpalladino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When testing I saw a few things but think they are unrelated to the work in this card. I noticed when doing an updating phenotypes for a single observation unit the statistics were showing 1 for the observation units but I thought it would be 0. I also noticed the changeLog was showing a blank string for originalValue and I was expecting the prior value. I'm good creating new cards for these if they're legitimate issues.
exp4.xls
exp4_no_refs_2.xls
Screenshot 2024-04-04 at 11 12 20 AM
Screenshot 2024-04-04 at 11 17 16 AM

Comment on lines +190 to +211
// check for references to Deltabreed-generated observation units
referenceOUIds = collateReferenceOUIds(importRows);

if (hasAllReferenceUnitIds) {
try {

// get all prior units referenced in import
pendingObsUnitByOUId = fetchReferenceObservationUnits(referenceOUIds, program);
observationUnitByNameNoScope = mapPendingObservationUnitByName(pendingObsUnitByOUId, program);
initializeTrialsForExistingObservationUnits(program, trialByNameNoScope);
initializeStudiesForExistingObservationUnits(program, studyByNameNoScope);
locationByName = initializeLocationByName(program, studyByNameNoScope);
obsVarDatasetByName = initializeObsVarDatasetForExistingObservationUnits(trialByNameNoScope, program);
existingGermplasmByGID = initializeGermplasmByGIDForExistingObservationUnits(observationUnitByNameNoScope, program);
for (Map.Entry<String, PendingImportObject<BrAPIObservationUnit>> unitEntry : pendingObsUnitByOUId.entrySet()) {
String unitId = unitEntry.getKey();
BrAPIObservationUnit unit = unitEntry.getValue().getBrAPIObject();
mapPendingTrialByOUId(unitId, unit, trialByNameNoScope, studyByNameNoScope, pendingTrialByOUId, program);
mapPendingStudyByOUId(unitId, unit, studyByNameNoScope, pendingStudyByOUId, program);
mapPendingLocationByOUId(unitId, unit, pendingStudyByOUId, locationByName, pendingLocationByOUId);
mapPendingObsDatasetByOUId(unitId, pendingTrialByOUId, obsVarDatasetByName, pendingObsDatasetByOUId);
mapGermplasmByOUId(unitId, unit, existingGermplasmByGID, pendingGermplasmByOUId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, I think we can address this as part of the v0.10 work and refactoring of the ExperimentProcessor

observations.add(this.observationByHash.get(getImportObservationHash(importRow, getVariableNameFromColumn(column))));
}
String observationHash;
if (hasAllReferenceUnitIds) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in v0.10 we can try to make these workflows more modular and break things out a bit more rather than having if-else blocks in a number of different places

@dmeidlin dmeidlin merged commit 0db3d2f into release/0.9 Apr 5, 2024
@dmeidlin dmeidlin deleted the bug/BI-2046 branch April 5, 2024 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants