feat: improve `Trainer` and `DeeprankDataset` logic for production testing by gcroci2 · Pull Request #515 · DeepRank/deeprank2

gcroci2 · 2023-10-19T13:03:00Z

Main changes:

DeeprankDataset now takes train_data as input, which was previously called dataset_train. train_data can be either a DeeprankDataset representing the training set (as before) or a pre-trained model (new feature). It needs to be set only if train is False (as before, i.e. only for validation/testing sets). This makes it possible to use a test dataset without the original model's training dataset: the attributes that need to be inherited are read from the information stored in the pre-trained model.
- Loading the lambda transformations from the torch-saved pre-trained model (testing-only use case) caused many issues. I ended up converting the lambdas to strings before saving them. When the pre-trained model is loaded in the GraphDataset class, the strings representing the lambdas are evaluated and converted back to functions.
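
The string round trip described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names, the feature name bsa and the use of pickle in place of torch.save are assumptions made to keep the example self-contained.

```python
import math
import pickle


def save_transforms(transform_sources: dict[str, str]) -> bytes:
    """Serialize the transforms as plain source strings (hypothetical helper)."""
    return pickle.dumps(transform_sources)


def load_transforms(blob: bytes) -> dict:
    """Evaluate the stored strings back into callables (hypothetical helper)."""
    sources = pickle.loads(blob)
    # Names used inside the lambda strings must be supplied to eval explicitly.
    namespace = {"math": math}
    return {name: eval(src, namespace) for name, src in sources.items()}


# "bsa" is only an illustrative feature name.
saved = save_transforms({"bsa": "lambda t: math.log(t + 1)"})
restored = load_transforms(saved)
```

Since eval executes arbitrary code, a scheme like this should only be used with model files from a trusted source.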

Secondary changes:

In the DeeprankDataset classes, if a target attribute is set (e.g., binary, inherited) but is not in the HDF5 file and we're not in the training phase, no error is raised anymore. It should indeed be possible to run a pre-trained model on data points whose target values are missing, for prediction only. This is actually a typical testing scenario, in which we have no labels for the new data points that we want to evaluate.
On the other hand, if we're in the training phase (train = True) and no target is set, or the set target is not in the HDF5 file(s), a ValueError is raised.
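
The two rules above can be summarized in a small sketch. The function name, its arguments and the error messages are hypothetical and only mirror the behavior described here, not the actual implementation in dataset.py.

```python
def targets_available(target, hdf5_targets, train):
    """Decide whether target values can be read for the current dataset.

    All names are hypothetical; only the rules come from the description above.
    """
    if train:
        if target is None:
            raise ValueError("A target must be set when train is True.")
        if target not in hdf5_targets:
            raise ValueError(f"Target {target!r} not found in the HDF5 file(s).")
        return True
    # Validation/testing: a missing target is allowed (prediction-only run).
    return target is not None and target in hdf5_targets
```
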
self.pretrained_model_path is now defined in the init of the Trainer class and defaults to pretrained_model.
self.model_load_state_dict is also defined in the init of the Trainer class and defaults to None. It is assigned a value only when a pre-trained model is loaded or at the end of the training phase. This way, the test() method can first verify that the model has actually been loaded (pre-trained case) or trained. If not, test() now raises an error.
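
A stripped-down sketch of this guard is shown below. TrainerSketch is a hypothetical stand-in for the real Trainer class, the state dict is a placeholder, and ValueError is an assumption, since the description only says that an error is raised.

```python
class TrainerSketch:
    """Hypothetical stand-in for Trainer, showing only the test() guard."""

    def __init__(self, pretrained_model=None):
        self.pretrained_model_path = pretrained_model
        self.model_load_state_dict = None
        if pretrained_model is not None:
            self._load_params()

    def _load_params(self):
        # Placeholder: the real class would read the state from the model file.
        self.model_load_state_dict = {"loaded_from": self.pretrained_model_path}

    def train(self):
        # Placeholder: the state dict is set once training has finished.
        self.model_load_state_dict = {"loaded_from": None}

    def test(self):
        if self.model_load_state_dict is None:
            # Assumed error type; the description does not name it.
            raise ValueError("No pre-trained model was loaded and no training was run.")
        return "ok"
```
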
In the Trainer class' init, there was a check for the target before the parameters and the pre-trained model were loaded. In the pre-trained model case the target may not be defined at all in the Trainer instance (it is saved in the model itself), so I removed the check.
I added the following attributes to _init_from_dataset: features_transform, means, devs, target_transform, classes, classes_to_index. They need to be saved in the model's file for cases in which we want to test the model on other data without redefining the training set. I also added them to the model's state dict, which is saved at the end of the training (_save_model; same for _load_params).
Now the saved model contains a key called data_type, needed for checking which type of dataset was used during the training of the model.
The warning about not having a validation set was emitted at every epoch during training. It is now printed only once, when train() is called.
When torch.load() is called on a model file that contains GPU tensors, those tensors are loaded to the GPU by default. If CUDA was not available, the code crashed. There is now a check for that, and if CUDA is not available the tensors are loaded onto the CPU.
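
One common way to express such a check is the map_location argument of torch.load. The sketch below assumes that approach; the function name is hypothetical and the actual code in trainer.py may differ.

```python
import torch


def load_checkpoint(path):
    """Load a saved model file on the GPU if available, else on the CPU."""
    if torch.cuda.is_available():
        device = torch.device("cuda")
    else:
        # Remap tensors that were saved on a GPU so that loading does not fail.
        device = torch.device("cpu")
    return torch.load(path, map_location=device)
```
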
target_filter was not working properly. I noticed that the functionality was broken while making other edits, and it is fixed now.

Still to solve:

For some reason, PyTorch raises an error related to the Adam optimizer, but only with Python 3.11. I tried pinning the versions of the relevant torch packages, but it still fails. I haven't touched anything related to the optimizer, though. Any idea? @DaniBodor

DaniBodor · 2023-11-20T11:50:16Z

I unsubscribed from notifications for this PR for now. Please tag me again if needed and/or when you want me to re-review.

DaniBodor

Just leaving these as comments for now. Once you/we figure out why the build is failing, I will review that before approving.

DaniBodor · 2023-11-27T10:45:00Z

It looks to me like the problem with the 3.11 build is really a core change in PyTorch. I don't think it'll be easy for us (definitely not me) to figure out what the problem is. Maybe it's best to create an issue on PyTorch and see if they know how to solve it.

github-actions · 2023-12-13T03:22:40Z

This PR is stale because it has been open for 14 days with no activity.

gcroci2 · 2024-01-03T18:08:02Z

I am merging this PR. The issue with Python 3.11 will be solved in another PR.

gcroci2 added 2 commits October 19, 2023 14:59

add relevant attributes to the Trainer and improve their logic

a5ad04a

add tests for testing when no test is provided and when no mmodel is loaded/trained

a094e2f

gcroci2 self-assigned this Oct 19, 2023

gcroci2 linked an issue Oct 19, 2023 that may be closed by this pull request

Improve Trainer and DeeprankDataset for production testing #510

Closed

5 tasks

gcroci2 changed the title ~~Improve Trainer logic for production testing~~ refactor: improve Trainer and DeeprankDataset logic for production testing Oct 19, 2023

gcroci2 added 25 commits October 20, 2023 14:25

fix test_optim

a6209cb

change dataset_train to train_data and update docs, for later functionality refactor

d562513

change dataset_train to train_data in all relevant scripts

fa45c60

improve logic for handling both a pre-trained model and a dataset_train when train is False in dataset.py

fe18d3d

add logic for handling the pre-trained model as input in DeeprankDataset classes

f9b82de

add tests for catching uncorrect pre-trained models

c21fe2b

add folder for pretrained models in tests

fc5f6af

update data paths in test_dataset.py

0a068b5

implement inheritance in dataset from a pre-trained model

bcc5138

add tests for inheritance from pre-trained model

5280ece

add classes_to_index as inherited param and to the pre-trained model

7fcc033

add classes_to_index to the tests' models

cc3d79a

add classes_to_index's check to the tests

6b037ad

save features_transform's lambdas as strings and load them as functions (much more reliable)

8e2496c

update pre-trained models

11f826e

add trainer tests for testing without defining the dataset_train

0fcea3a

fix test_dataset.py for the newly defined features_transform in the state dict

f8e6c57

remove dill usage since we're not saving lambda functions anymore (dill not needed)

b8e2348

improve initialization order in the Trainer class

060e6bf

fix datasets for cases in which there is a target attribute but no target values are present in the hdf5 file/s

1d3c0f5

fix Trainer _eval method for cases in which there is a target attribute but no target values are present in the hdf5 file/s

4d588e1

add logic for checking the target settings in the init, and fix _filter_targets

147a16a

add tests for cases with no target and improve target's filter tests

08c90e0

fix tests according to the new target's checks

3b5d746

add hdf5 file with no target

a5fd524

gcroci2 added 10 commits November 21, 2023 14:32

improve testing new data part

79b00a2

add testing new data in the readme

ef57a25

uniform pretrained_model_path to pretrained_model

d07b8c3

make error msg about the dataset clearer

55c061c

use None instead of 'None' in the trainer _eval and _epoch methods

2584917

fix prospector errors

0906b2d

Merge branch 'dev' into 510_testing_pre_trained_gcroci2

ae7fe4d

move features checking after inheritance

836c5b2

fix prospector errors

b364eba

try to fix optimizer error in py3.11

780f2b9

gcroci2 requested a review from DaniBodor November 22, 2023 14:49

DaniBodor reviewed Nov 23, 2023

docs/getstarted.md (outdated, resolved)

deeprank2/dataset.py (outdated, resolved)

tests/test_dataset.py (resolved)

docs/getstarted.md (outdated, resolved)

DaniBodor reviewed Nov 23, 2023

Update docs/getstarted.md

2935414

Co-authored-by: Dani Bodor <d.bodor@esciencecenter.nl>

github-actions bot added the stale label (issue not touched from too much time) Dec 13, 2023

gcroci2 added 9 commits January 3, 2024 17:20

remove train parameter from dataset.py

2e4ec65

remove train refs from trainer.py

d1da845

update tests with the new train_data logic

0bb3023

update docs

0255dcf

update tutorials

f4ba712

change train_data to train_source

bff8a3d

add comment for clarifying tests

8c3c5d1

merge with dev

3d31c59

fix integration test

7f33a68

gcroci2 merged commit 226ff35 into dev Jan 3, 2024

gcroci2 deleted the 510_testing_pre_trained_gcroci2 branch January 3, 2024 18:08

gcroci2 merged 63 commits into dev from 510_testing_pre_trained_gcroci2
