[RMP] Provide PyTorch serving support for T4R models in TorchScript #255

@karlhigley

Description

Problem:

Users should be able to serve PyTorch models produced with Transformers4Rec, or by any other process, using a Systems ensemble. This works toward supporting session-based models and extends Systems' support to a new modeling framework.

Goal:

Systems should be able to serve all PyTorch models that are currently supported by Triton.
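For context, Triton's PyTorch backend expects a TorchScript model in a versioned model-repository layout. A minimal sketch of that layout and its `config.pbtxt` (the model name `t4r_model` and batch size are illustrative, not part of this issue):

```
model_repository/
└── t4r_model/
    ├── config.pbtxt
    └── 1/
        └── model.pt

# config.pbtxt
name: "t4r_model"
platform: "pytorch_libtorch"
max_batch_size: 8
```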

Definition of Done

Have an example that serves a PyTorch session-based model in conjunction with an NVTabular workflow, where the session-based model scores the whole catalog.

Open questions

Constraints:

Not all PyTorch models can be served via Triton's PyTorch backend, so Systems will need to be able to use multiple backends in order to serve all Triton-compatible PyTorch models.
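To illustrate the TorchScript path specifically: a model has to be traced (or scripted) before Triton's PyTorch backend can load it. A minimal sketch with a toy next-item model (`NextItemModel`, the catalog size, and the embedding dimension are illustrative assumptions, not the T4R API):

```python
import torch


class NextItemModel(torch.nn.Module):
    """Toy session-based model: embeds item IDs, mean-pools the session,
    and scores every item in the catalog (hypothetical stand-in for T4R)."""

    def __init__(self, num_items: int = 1000, dim: int = 64):
        super().__init__()
        self.embed = torch.nn.Embedding(num_items, dim)
        self.out = torch.nn.Linear(dim, num_items)

    def forward(self, item_ids: torch.Tensor) -> torch.Tensor:
        # item_ids: (batch, seq_len) -> scores: (batch, num_items)
        return self.out(self.embed(item_ids).mean(dim=1))


model = NextItemModel().eval()
example = torch.zeros(1, 20, dtype=torch.long)  # one padded session of length 20

# Trace to TorchScript; models with data-dependent control flow would need
# torch.jit.script instead, which is one reason multiple backends may be needed.
traced = torch.jit.trace(model, example)
traced.save("model.pt")  # Triton loads this from <model_repo>/<name>/<version>/model.pt
```

Tracing only records the operations seen for the example input, which is why fixed-shape (padded) inputs simplify serving.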

Starting Point:

Transformers4Rec

Systems

Integration Issues

Nice to have: (P1)

Documentation

Examples

Blockers:

  • [INF] Unresolved architectural decisions
  • Support for ragged tensors in T4R
  • Start with fixed padding (pad all sequences to the same length), then investigate whether it is worthwhile to also support padding to the maximum sequence length
  • Padding support in a dataloader that works with Systems, along with cross-framework support
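The fixed-padding starting point above can be sketched as follows (the session data and the fixed length `MAX_LEN` are hypothetical, for illustration only):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three ragged sessions of item IDs (hypothetical data)
sessions = [
    torch.tensor([1, 2, 3]),
    torch.tensor([4, 5]),
    torch.tensor([6]),
]

MAX_LEN = 5  # assumed fixed padding length

# Pad to the longest session in the batch, using 0 as the padding item ID
padded = pad_sequence(sessions, batch_first=True, padding_value=0)

# Extend to the fixed maximum length so every batch has the same shape,
# which keeps input shapes static for TorchScript tracing and Triton
fixed = torch.nn.functional.pad(padded, (0, MAX_LEN - padded.shape[1]), value=0)
```

Padding everything to one fixed length wastes some compute on short sessions, which is the trade-off the blocker proposes to investigate against per-batch (max-in-batch) padding.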
