Add multi-gpu training example for T4Rec PyTorch #521
GitHub pull request #521 of commit 93c5bb5707a18286b184ee9a47b8487b71bbaba8, no merge conflicts.
Running as SYSTEM
Setting status of 93c5bb5707a18286b184ee9a47b8487b71bbaba8 to PENDING with url http://merlin-infra1.nvidia.com:8080/job/transformers4rec_tests/248/ and message: 'Build started for merge commit.'
Using context: Jenkins Unit Test Run
Building on master in workspace /var/jenkins_home/workspace/transformers4rec_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git init /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
> git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
> git config remote.origin.url https://github.com/NVIDIA-Merlin/Transformers4Rec.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Transformers4Rec.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
> git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Transformers4Rec.git +refs/pull/521/*:refs/remotes/origin/pr/521/* # timeout=10
> git rev-parse 93c5bb5707a18286b184ee9a47b8487b71bbaba8^{commit} # timeout=10
Checking out Revision 93c5bb5707a18286b184ee9a47b8487b71bbaba8 (detached)
> git config core.sparsecheckout # timeout=10
> git checkout -f 93c5bb5707a18286b184ee9a47b8487b71bbaba8 # timeout=10
Commit message: "Add multi-gpu training example for T4Rec PyTorch"
> git rev-list --no-walk 2118ed166b624d8511c269add03cb0ef9e8260a1 # timeout=10
First time build. Skipping changelog.
[transformers4rec_tests] $ /bin/bash /tmp/jenkins1440288146966703595.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/transformers4rec_tests/transformers4rec
plugins: anyio-3.6.1, xdist-3.0.2, cov-4.0.0
collected 1 item
@@ -0,0 +1,318 @@
In the following sentence, it may be worth adding that, in the context of RecSys, most of the parameters sit in the embedding tables:
- Model Parallel: If the model is too large to fit on a single GPU, the parameters are distributed over multiple GPUs.
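To make the point about embedding tables concrete, here is a rough back-of-the-envelope sketch. The table cardinality, embedding dimension, and layer count are hypothetical values picked for illustration, and the transformer estimate is a common approximation (about 12·d² parameters per layer, ignoring biases and layer norms), not a count taken from any Transformers4Rec model.

```python
def embedding_params(cardinality: int, dim: int) -> int:
    """Parameters in one embedding table: one `dim`-sized vector per category."""
    return cardinality * dim


def transformer_block_params(d_model: int, n_layers: int) -> int:
    """Rough estimate: ~4*d^2 for the attention projections plus ~8*d^2 for a
    feed-forward layer with 4x expansion, ignoring biases and layer norms."""
    return 12 * d_model * d_model * n_layers


# Hypothetical sizes, chosen only for illustration.
item_table = embedding_params(cardinality=1_000_000, dim=64)
blocks = transformer_block_params(d_model=64, n_layers=2)

print(item_table)  # 64000000
print(blocks)      # 98304
```

Under these assumed sizes the single item-id table holds several hundred times more parameters than the transformer blocks, which is why model parallelism in RecSys usually means distributing the embedding tables.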
@@ -0,0 +1,318 @@
- specifying multiple GPUs:
I believe that when working with DistributedDataParallel, the CUDA_VISIBLE_DEVICES env variable is not really considered by torch.distributed.launch, as the launcher spawns one process per GPU (each process receives its GPU index via the --local_rank arg). The number of GPUs is defined by the --nproc_per_node argument.
You say "using different batches of the data in a model-parallel fashion", but that is in fact a data-parallel fashion.
- data repartitioning:
You could link here to this doc that explains how parquet data can be partitioned when saving
- I think we need to provide instructions on how to download the data required to run this example (maybe pointing to the example notebook that explains that)
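A minimal sketch of the launcher behaviour described in the first point, assuming the usual conventions: `torch.distributed.launch` starts `--nproc_per_node` processes and gives each one its local rank, either as a `--local_rank` command-line argument or, with `--use_env` or `torchrun`, through the `LOCAL_RANK` environment variable, and it exports `WORLD_SIZE`. The helper names below are hypothetical and are not part of the notebook under review.

```python
import argparse
import os


def resolve_local_rank(argv, environ=None):
    """Return this process's local rank (its GPU index on the node).

    Prefers an explicit --local_rank argument and falls back to the
    LOCAL_RANK environment variable, defaulting to 0 for single-process runs.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=None)
    # parse_known_args ignores any other arguments the training script accepts.
    args, _unknown = parser.parse_known_args(argv)
    if args.local_rank is not None:
        return args.local_rank
    return int(environ.get("LOCAL_RANK", 0))


def resolve_world_size(environ=None):
    """Total number of processes; 1 when not started by a launcher."""
    environ = os.environ if environ is None else environ
    return int(environ.get("WORLD_SIZE", 1))


# Simulated inputs for the second of two processes on one node.
print(resolve_local_rank(["--local_rank=1", "--epochs", "3"], {}))  # 1
print(resolve_local_rank([], {"LOCAL_RANK": "1"}))                  # 1
print(resolve_world_size({"WORLD_SIZE": "2"}))                      # 2
```

In a real script the returned rank would typically be used to pick the device for that process, so the number of GPUs in use follows from how many processes the launcher spawns.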
GitHub pull request #521 of commit b7d496a1633fafc5bfef91a307b4c9117230e328, no merge conflicts.
Commit message: "Updated notebook text."
GitHub pull request #521 of commit 987ddf0f833e1af111ec91e2ee4828006133476e, no merge conflicts.
Commit message: "Fixed notebook text."
Documentation preview: https://nvidia-merlin.github.io/Transformers4Rec/review/pr-521
@@ -0,0 +1,321 @@
I think we could replace the last sentence of this paragraph by saying that data parallel is useful when you want to speed up training/evaluation by leveraging multiple GPUs in parallel (the full dataset typically won't fit into GPU memory, which is why models are trained on batches):
- Data Parallel: Every GPU has a copy of all model parameters and runs the forward/backward pass for its batch. This is used when the model can fit in one GPU memory, but the dataset is too large and must be distributed over multiple GPUs.
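The data-parallel idea can be illustrated with a toy, CPU-only sketch: every replica holds the same weight, computes a gradient on its own shard of the batch, and the gradients are averaged before the update. This is plain Python standing in for what DistributedDataParallel does with an all-reduce; the one-parameter model, the data, and the function names are made up for illustration.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error for the model y ~ w * x on one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_step(w, batch, num_replicas, lr=0.01):
    # Split the global batch into equal shards, one per replica ("GPU").
    shard_size = len(batch) // num_replicas
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_replicas)]
    # Every replica holds the same w and computes a gradient on its own shard.
    local_grads = [grad_mse(w, shard) for shard in shards]
    # Stand-in for all-reduce: average so all replicas apply the same update.
    avg_grad = sum(local_grads) / num_replicas
    return w - lr * avg_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
single = data_parallel_step(0.0, batch, num_replicas=1)
double = data_parallel_step(0.0, batch, num_replicas=2)
print(single, double)
```

With equal-sized shards the averaged gradient matches the single-process gradient, so the update is the same; the gain is that each replica only processes part of the batch, which is the speed-up the comment refers to.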
GitHub pull request #521 of commit 7890438793c24c76a1e6901477ee7d7ebd18dbcf, no merge conflicts.
Commit message: "Merge branch 'main' into multi_gpu3"
GitHub pull request #521 of commit 095208557c6db3962eb2eb3b84445bb174ed248b, no merge conflicts.
Commit message: "Update nb wrt Gabriel's comments"
GitHub pull request #521 of commit 095208557c6db3962eb2eb3b84445bb174ed248b, has merge conflicts.
PR has already been merged, builds using the merged sha1 will fail!!!
Commit message: "Update nb wrt Gabriel's comments"
Fixes #508
This notebook example showcases DistributedDataParallel functionality with multi-GPU (2 GPUs) training and evaluation.