While converting the PyTorch weights to TensorFlow, I am getting this error: Code used:
amyeroberts
left a comment
Thanks for adding this! Overall the PR looks good, just some small nits here and there.
Regarding the hidden layer differences, the way to solve it is to find the line(s) of code in the TF model contributing to the difference. The best approach is to bisect through the layers and compare their output activations when the equivalent PT and TF models are fed the same input.
In the output of the conversion script, we can see that the difference between the PyTorch and TensorFlow hidden states for the first block is already ~0.3, which is large. As a large difference appears in the first stage, I would load a small model with just one stage, and start comparing the PT and TF models from there, e.g.:
from transformers import TFAutoModel, AutoModel
checkpoint = "facebook/convnextv2-tiny-1k-224"
pt_model = AutoModel.from_pretrained(checkpoint, num_stages=1)
tf_model = TFAutoModel.from_pretrained(checkpoint, from_pt=True, num_stages=1)
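Once both models are loaded, the layer-by-layer comparison described above could look roughly like the sketch below. It is only an illustration: the helper names `layerwise_max_abs_diff` and `first_divergent_layer` and the `1e-4` tolerance are made up for this example and are not part of transformers, and it assumes the activations from both models have already been converted to NumPy arrays with the same layout.

```python
import numpy as np


def layerwise_max_abs_diff(pt_activations, tf_activations):
    """Return the max absolute difference for each pair of activations.

    Both arguments are sequences of array-likes, one entry per layer,
    assumed to be in the same order and the same memory layout.
    """
    if len(pt_activations) != len(tf_activations):
        raise ValueError("Both models must provide the same number of activations")
    diffs = []
    for index, (pt_act, tf_act) in enumerate(zip(pt_activations, tf_activations)):
        pt_act = np.asarray(pt_act, dtype=np.float64)
        tf_act = np.asarray(tf_act, dtype=np.float64)
        if pt_act.shape != tf_act.shape:
            raise ValueError(
                f"Shape mismatch at layer {index}: {pt_act.shape} vs {tf_act.shape}"
            )
        diffs.append(float(np.max(np.abs(pt_act - tf_act))))
    return diffs


def first_divergent_layer(diffs, atol=1e-4):
    """Return the index of the first layer whose difference exceeds atol, or None."""
    for index, diff in enumerate(diffs):
        if diff > atol:
            return index
    return None


if __name__ == "__main__":
    # Synthetic stand-ins for real activations: layer 0 matches, layer 1 is off by 0.3.
    pt_acts = [np.zeros((1, 4, 8, 8)), np.ones((1, 4, 8, 8))]
    tf_acts = [np.zeros((1, 4, 8, 8)), np.ones((1, 4, 8, 8)) + 0.3]
    diffs = layerwise_max_abs_diff(pt_acts, tf_acts)
    print("per-layer max abs diff:", diffs)
    print("first divergent layer:", first_divergent_layer(diffs))
```

With the real models, the activations could come from calling both models on the same input with `output_hidden_states=True`. The first index reported as divergent is then the place to start reading the TF layer code.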
What does this PR do?
TF port of convnextv2
@amyeroberts