Now that we have a GRU implemented with scan (#8777), we can look for a way to patch or intercept the upstream PyTorch GRU module so that it is transparently replaced with our scan-based GRU logic. Users would not need to know about the swap or change their code, and the upstream GRU would just work.
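One possible shape for the interception is a plain attribute swap on the module that exposes the GRU class. The following is only a minimal sketch of that mechanism, not a proposed implementation: `swap_attr`, `UpstreamGRU`, `ScanGRU`, and `fake_nn` are hypothetical names, and stand-in classes are used so the snippet runs without PyTorch installed. In practice the namespace would presumably be `torch.nn` and the replacement would be the scan-based GRU from #8777.

```python
import contextlib
import types


@contextlib.contextmanager
def swap_attr(namespace, name, replacement):
    """Temporarily rebind namespace.<name> to replacement, restoring it on exit."""
    original = getattr(namespace, name)
    setattr(namespace, name, replacement)
    try:
        yield original
    finally:
        # Always restore the original, even if the body raised.
        setattr(namespace, name, original)


# Stand-ins (assumptions for illustration only): UpstreamGRU plays the role of
# the upstream GRU class, ScanGRU the role of the scan-based replacement.
class UpstreamGRU:
    def __init__(self, input_size, hidden_size):
        self.input_size = input_size
        self.hidden_size = hidden_size


class ScanGRU(UpstreamGRU):
    """Placeholder for the scan-based GRU; keeps the same constructor signature."""


# fake_nn stands in for the module that user code looks the GRU up on.
fake_nn = types.SimpleNamespace(GRU=UpstreamGRU)

with swap_attr(fake_nn, "GRU", ScanGRU):
    patched = fake_nn.GRU(4, 8)   # unchanged user code now constructs a ScanGRU

restored = fake_nn.GRU(4, 8)      # outside the block the original class is back
```

A swap like this only affects lookups that go through the namespace after the patch is applied. Code that ran `from torch.nn import GRU` beforehand keeps its reference to the original class, and already-constructed module instances are not touched, so a real solution would need to decide how to handle those cases.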