Hi
I am using this repo as a reference to implement a custom back-end. During development, when I use Hugging Face models directly, I see a lot of graph breaks in the FX graph.
My understanding of the Inductor side is that each subgraph is compiled and run, its results are sent back to the CPU, and only then does the next subgraph start executing, which is time-consuming. Is that correct?
So my question is: have you seen such graph breaks, and if so, how does Hidet handle them?
This seems similar to the case discussed here:
https://discuss.pytorch.org/t/stitching-together-graph-breaks-for-large-compilation-units/194793/5