What needs doing
A while ago we added the ignore_masking arg to ensure that when we run prediction with T4Rec we do not apply masking; that is, we don't want to mask any input in the sequence, but rather use the whole sequence to predict the next item. We noticed that our model_pt.py does not pass ignore_masking=True in this line, and we need to add it, as in here.
Additional context
@karlhigley recommends that, if we want to avoid breaking other PyTorch models, the cleanest approach would be to save T4R models for inference with an ignore_masking property set on them. Then the model's call/forward function could check the property and do the right thing without us having to add additional logic in the serving code. Since the TorchScript-based serving runs in a Triton backend we don't control, where we can't add code, we should be looking for a solution that configures the model appropriately for serving before/while we save it.
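A minimal sketch of the recommended pattern: a flag set on the model before it is saved, which forward() consults to skip masking at serving time. All names here (SequenceModel, prepare_for_inference, the toy masking/prediction helpers) are illustrative, not the actual T4R API.

```python
class SequenceModel:
    """Toy stand-in for a T4R session-based model (illustrative only)."""

    MASK_TOKEN = 0

    def __init__(self):
        # Training default: masking is applied inside forward().
        self.ignore_masking = False

    def _mask(self, seq):
        # Training-style masking: hide the last item behind a mask token.
        return seq[:-1] + [self.MASK_TOKEN]

    def _predict_next(self, seq):
        # Placeholder "prediction": echo the last visible item.
        return seq[-1]

    def forward(self, seq):
        if self.ignore_masking:
            # Serving path: use the whole, unmasked sequence
            # to predict the next item.
            return self._predict_next(seq)
        # Training path: mask the input before predicting.
        return self._predict_next(self._mask(seq))


def prepare_for_inference(model):
    # Configure the model for serving before/while saving it, since the
    # Triton backend can't be modified to pass ignore_masking=True at
    # call time.
    model.ignore_masking = True
    return model
```

Because the flag lives on the model itself, the serving code needs no extra logic: a model saved after prepare_for_inference() does the right thing when Triton calls forward().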