Add save/load & input/output schema methods to T4Rec Model class #507
GitHub pull request #507 of commit ad37cb1742c48226ee48b10b556e9d3af7ab4448, no merge conflicts.
Running as SYSTEM
Setting status of ad37cb1742c48226ee48b10b556e9d3af7ab4448 to PENDING with url http://10.20.17.181:8080/job/transformers4rec_tests/230/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/507/*:refs/remotes/origin/pr/507/* # timeout=10
> git rev-parse ad37cb1742c48226ee48b10b556e9d3af7ab4448^{commit} # timeout=10
Checking out Revision ad37cb1742c48226ee48b10b556e9d3af7ab4448 (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f ad37cb1742c48226ee48b10b556e9d3af7ab4448 # timeout=10
Commit message: "add suport of list outputs"
> git rev-list --no-walk d532234b241f46d77366b98d3450b08f83133c20 # timeout=10
First time build. Skipping changelog.
[transformers4rec_tests] $ /bin/bash /tmp/jenkins16443494207033087218.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-4.0.0
collected 1 item
Documentation preview: https://nvidia-merlin.github.io/Transformers4Rec/review/pr-507
GitHub pull request #507 of commit 6dea3fe7ad046fa643b27e77439037ead84b51d3, no merge conflicts.
Running as SYSTEM
Setting status of 6dea3fe7ad046fa643b27e77439037ead84b51d3 to PENDING with url http://10.20.17.181:8080/job/transformers4rec_tests/231/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/507/*:refs/remotes/origin/pr/507/* # timeout=10
> git rev-parse 6dea3fe7ad046fa643b27e77439037ead84b51d3^{commit} # timeout=10
Checking out Revision 6dea3fe7ad046fa643b27e77439037ead84b51d3 (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f 6dea3fe7ad046fa643b27e77439037ead84b51d3 # timeout=10
Commit message: "add shape property and fix pr comment"
> git rev-list --no-walk ad37cb1742c48226ee48b10b556e9d3af7ab4448 # timeout=10
[transformers4rec_tests] $ /bin/bash /tmp/jenkins12034054313781789087.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-4.0.0
collected 1 item
GitHub pull request #507 of commit e6f6e58ba93a1f67557c686ce6aafd5ed9891c7c, no merge conflicts.
Running as SYSTEM
Setting status of e6f6e58ba93a1f67557c686ce6aafd5ed9891c7c to PENDING with url http://10.20.17.181:8080/job/transformers4rec_tests/233/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/507/*:refs/remotes/origin/pr/507/* # timeout=10
> git rev-parse e6f6e58ba93a1f67557c686ce6aafd5ed9891c7c^{commit} # timeout=10
Checking out Revision e6f6e58ba93a1f67557c686ce6aafd5ed9891c7c (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f e6f6e58ba93a1f67557c686ce6aafd5ed9891c7c # timeout=10
Commit message: "update shape property with the convention used in systems"
> git rev-list --no-walk 0732292df37d1cf427785608858f5590e0bcf6ab # timeout=10
First time build. Skipping changelog.
[transformers4rec_tests] $ /bin/bash /tmp/jenkins15048533315320923114.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-3.0.2, cov-4.0.0
collected 1 item
GitHub pull request #507 of commit 3fb91d2b5b4d584a37f574e7c679896e1885b12a, no merge conflicts.
Running as SYSTEM
Setting status of 3fb91d2b5b4d584a37f574e7c679896e1885b12a to PENDING with url http://10.20.17.181:8080/job/transformers4rec_tests/234/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/507/*:refs/remotes/origin/pr/507/* # timeout=10
> git rev-parse 3fb91d2b5b4d584a37f574e7c679896e1885b12a^{commit} # timeout=10
Checking out Revision 3fb91d2b5b4d584a37f574e7c679896e1885b12a (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f 3fb91d2b5b4d584a37f574e7c679896e1885b12a # timeout=10
Commit message: "remove max_sequence_length from in/out schema methods"
> git rev-list --no-walk e6f6e58ba93a1f67557c686ce6aafd5ed9891c7c # timeout=10
[transformers4rec_tests] $ /bin/bash /tmp/jenkins1055468907347027097.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-3.0.2, cov-4.0.0
collected 1 item
# At inference, we just need the predictions tensors.
# TODO: We are simplifying the logic around `hf_format` in the multi-gpu
# support work.
if not training and not self.hf_format:
Can we remove `not training` from here so that `hf_format` alone controls the output? That would make it work in Systems without a model adapter wrapper class.
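To make the suggestion concrete, here is a minimal sketch of the behaviour being asked for. It is not the Transformers4Rec code: the helper name `select_output` and the dictionary keys are hypothetical, and the only point illustrated is that the flag, not the training mode, decides the output format.

```python
def select_output(predictions, loss, hf_format):
    """Return the model output, with hf_format alone deciding its form.

    Hypothetical helper: the name and the dictionary keys are illustrative
    and are not taken from Transformers4Rec.
    """
    if not hf_format:
        # Plain predictions, whether the model is training or serving.
        return predictions
    # Dictionary-style output when hf_format is set.
    return {"loss": loss, "predictions": predictions}


print(select_output([0.2, 0.8], loss=0.5, hf_format=False))
print(select_output([0.2, 0.8], loss=0.5, hf_format=True))
```

With this shape, a serving system could presumably get bare predictions by constructing the model with the flag off, with no wrapper needed to switch modes.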
GitHub pull request #507 of commit c76b416a920916779dfcba953e80a3a02c5c3538, no merge conflicts.
Running as SYSTEM
Setting status of c76b416a920916779dfcba953e80a3a02c5c3538 to PENDING with url http://10.20.17.181:8080/job/transformers4rec_tests/235/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/507/*:refs/remotes/origin/pr/507/* # timeout=10
> git rev-parse c76b416a920916779dfcba953e80a3a02c5c3538^{commit} # timeout=10
Checking out Revision c76b416a920916779dfcba953e80a3a02c5c3538 (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f c76b416a920916779dfcba953e80a3a02c5c3538 # timeout=10
Commit message: "fix PR comments"
> git rev-list --no-walk 3fb91d2b5b4d584a37f574e7c679896e1885b12a # timeout=10
[transformers4rec_tests] $ /bin/bash /tmp/jenkins14522582708295971654.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-3.0.2, cov-4.0.0
collected 1 item
GitHub pull request #507 of commit a399724271a5c77c0fc25f9873afc1456e003f6e, no merge conflicts.
Running as SYSTEM
Setting status of a399724271a5c77c0fc25f9873afc1456e003f6e to PENDING with url http://merlin-infra1.nvidia.com:8080/job/transformers4rec_tests/240/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/507/*:refs/remotes/origin/pr/507/* # timeout=10
> git rev-parse a399724271a5c77c0fc25f9873afc1456e003f6e^{commit} # timeout=10
Checking out Revision a399724271a5c77c0fc25f9873afc1456e003f6e (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f a399724271a5c77c0fc25f9873afc1456e003f6e # timeout=10
Commit message: "Merge branch 'main' into save-schema-for-t4rec-model"
> git rev-list --no-walk ecae4337558075f1282ad3a5e40bbf6346b57243 # timeout=10
First time build. Skipping changelog.
[transformers4rec_tests] $ /bin/bash /tmp/jenkins17120315256631438005.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-3.0.2, cov-4.0.0
collected 1 item
GitHub pull request #507 of commit 9c513119b2f522c662a288dd6dade872b906af14, no merge conflicts.
Running as SYSTEM
Setting status of 9c513119b2f522c662a288dd6dade872b906af14 to PENDING with url http://merlin-infra1.nvidia.com:8080/job/transformers4rec_tests/244/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/507/*:refs/remotes/origin/pr/507/* # timeout=10
> git rev-parse 9c513119b2f522c662a288dd6dade872b906af14^{commit} # timeout=10
Checking out Revision 9c513119b2f522c662a288dd6dade872b906af14 (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f 9c513119b2f522c662a288dd6dade872b906af14 # timeout=10
Commit message: "Merge branch 'main' into save-schema-for-t4rec-model"
> git rev-list --no-walk 9e8632f3e5567381999a8da5a3edcfbe98529a9a # timeout=10
First time build. Skipping changelog.
[transformers4rec_tests] $ /bin/bash /tmp/jenkins1417152996509697193.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-3.0.2, cov-4.0.0
collected 1 item
9c51311 to 0a2f6bd
GitHub pull request #507 of commit 0a2f6bd2c47536c52ed305f116eed260d5ce2d9b, no merge conflicts.
Running as SYSTEM
Setting status of 0a2f6bd2c47536c52ed305f116eed260d5ce2d9b to PENDING with url http://merlin-infra1.nvidia.com:8080/job/transformers4rec_tests/245/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/507/*:refs/remotes/origin/pr/507/* # timeout=10
> git rev-parse 0a2f6bd2c47536c52ed305f116eed260d5ce2d9b^{commit} # timeout=10
Checking out Revision 0a2f6bd2c47536c52ed305f116eed260d5ce2d9b (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f 0a2f6bd2c47536c52ed305f116eed260d5ce2d9b # timeout=10
Commit message: "fix PR comments"
> git rev-list --no-walk 9c513119b2f522c662a288dd6dade872b906af14 # timeout=10
First time build. Skipping changelog.
[transformers4rec_tests] $ /bin/bash /tmp/jenkins14261443200611807130.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-3.0.2, cov-4.0.0
collected 1 item
Fixes #499

Goals ⚽
- Add save/load methods to T4Rec models using CloudPickle, following the protocol defined in Merlin Models.
- Add an `input_schema` property to the T4Rec base class `Model` that builds the model schema from the input modules of the heads and returns a Merlin schema object.
- Add an `output_schema` property to the T4Rec base class `Model` that builds the model schema from the prediction tasks specified in the heads and returns a Merlin schema object with one ColumnSchema per prediction task.

Implementation Details 🚧
- Add save/load methods to the T4Rec base class `Model` using `cloudpickle`, following the same protocol proposed in Merlin Models (here).
- Add a `shape` property to the input/output schema to provide the length/shape information of list features.
- I used the code of this unit test in Merlin Systems as a starting point to convert the input T4Rec schema to a Merlin schema object.
- The output schema is built from the prediction tasks provided to the model. The stored information is: name, int_domain, value_counts, is_list, shape, and is_ragged.

Constraints
- The format of the T4Rec model outputs is not standardized and varies a lot depending on the PredictionTask and on boolean flags such as `hf_format`. Work is under way to simplify the output API format ([Task] Standardize the model output format. #505). After that change, the model will return at inference a single prediction tensor in the single-task case, or otherwise a dictionary of tensors whose keys are the task names and whose values are the prediction tensors.
- The output dictionary needs to be converted to a `NamedTuple` for PyTorch serving.

Testing Details 🔍
- test_save_next_item_prediction_model: saving/loading the model trained with the next item prediction task (the model used in the inference example)
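For readers unfamiliar with this kind of save/load round trip, the sketch below shows the general idea on a toy object. It is an illustration under assumptions, not the code added by this PR: `ToyModel`, its attributes, and the file name `model.pkl` are hypothetical, and the standard-library `pickle` is used as a fallback only so the example runs when `cloudpickle` is not installed.

```python
import os
import tempfile

try:
    import cloudpickle as pickler  # the serializer named in the PR description
except ImportError:
    import pickle as pickler  # fallback so the sketch runs without cloudpickle


class ToyModel:
    """Hypothetical stand-in for a model; not the T4Rec Model class."""

    def __init__(self, weights, schema):
        self.weights = weights
        self.schema = schema

    def save(self, path):
        """Serialize the whole object into the directory `path`."""
        os.makedirs(path, exist_ok=True)
        # "model.pkl" is an illustrative file name, not the one used by T4Rec.
        with open(os.path.join(path, "model.pkl"), "wb") as f:
            pickler.dump(self, f)

    @classmethod
    def load(cls, path):
        """Rebuild the object previously written by `save`."""
        with open(os.path.join(path, "model.pkl"), "rb") as f:
            return pickler.load(f)


with tempfile.TemporaryDirectory() as tmp:
    model = ToyModel(weights=[0.1, 0.2], schema={"item_id": "int64"})
    model.save(tmp)
    restored = ToyModel.load(tmp)

print(restored.weights, restored.schema)
```

A test in the spirit of test_save_next_item_prediction_model would then compare the restored object against the original, for example by checking that both produce the same predictions on the same batch.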