fix test logcumsumexp broken devectorize=0 #10880
Conversation
tinygrad/tensor.py (outdated diff)
  x_expand = x_reshaped.unsqueeze(1).expand(*x_reshaped.shape, last_dim_size)
  mask = Tensor.ones(last_dim_size, last_dim_size, requires_grad=False, device=self.device).tril().unsqueeze(0)
- ret = ((x_expand - x_cummax).exp() * mask).sum(-1).log() + x_cummax.squeeze(-1)
+ ret = mask.where(x_expand - x_cummax, mask_val).exp().sum(-1).log() + x_cummax.squeeze(-1)
i think dtype.min(self.dtype) is more principled than an arbitrary value
yes, changed. Thanks for that!
force-pushed from d9b69ec to fd4de66
x_cummax = x_reshaped.cummax(-1).unsqueeze(-1)
x_expand = x_reshaped.unsqueeze(1).expand(*x_reshaped.shape, last_dim_size)
mask = Tensor.ones(last_dim_size, last_dim_size, requires_grad=False, device=self.device).tril().unsqueeze(0)
ret = ((x_expand - x_cummax).exp() * mask).sum(-1).log() + x_cummax.squeeze(-1)
so the issue is exp returns nan and nan*0 is bad?
why does it only happen with DEVECTORIZE?
the issue is: exp returns inf, then inf * 0 returns nan, and this only happens with DEVECTORIZE=0.
to my understanding:
with DEVECTORIZE=0:
* vector ops like `exp * mask` are computed exactly as written
* no op reordering or optimization
with DEVECTORIZE=1:
* vector ops are devectorized into scalar operations
* the devectorizer can reorder or optimize the computation, handling inf * 0 in ways that produce 0 instead of nan
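The inf * 0 = nan step itself is standard IEEE 754 float semantics, reproducible in plain Python independent of tinygrad (a minimal sketch):

```python
import math

# exp overflows to inf for large inputs; IEEE 754 defines inf * 0 as nan,
# so multiplying an overflowed exp by the mask value 0 poisons the result
overflowed = math.inf          # stands in for exp(large positive diff)
masked = overflowed * 0.0      # "masking out" after the exp is too late
print(masked)                  # nan
print(math.isnan(masked))      # True
```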
There was a problem hiding this comment.
how can exp return inf if x_expand - x_cummax is non-positive though?
it can be positive in some positions because `(x_expand - x_cummax).exp() * mask` builds a matrix that compares each element against the cumulative max at every position, not just its own.
For input [0.0, 100.0], 100.0 gets compared against the cumulative max at position 0 (which is 0.0), producing a positive diff in a position that will later be masked out. But exp(100.0) overflows to inf before the mask is applied, and inf * 0 = nan, which surfaces with DEVECTORIZE=0.
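The failure mode can be sketched outside tinygrad with a NumPy version of the same formula (variable names mirror the tinygrad code; this is an illustration of the math, not the actual kernel):

```python
import numpy as np

x = np.array([0.0, 100.0], dtype=np.float32)  # float32, like tinygrad's default
n = x.shape[-1]

x_cummax = np.maximum.accumulate(x)[:, None]       # (2, 1): running max
x_expand = np.broadcast_to(x[None, :], (n, n))     # (2, 2): row i sees every element
mask = np.tril(np.ones((n, n), dtype=np.float32))  # lower-triangular keep-mask

diff = x_expand - x_cummax   # diff[0, 1] = 100 - 0 = 100, a masked-out position
with np.errstate(over="ignore", invalid="ignore"):
    summed = (np.exp(diff) * mask).sum(-1)  # exp(100) -> inf in float32, inf * 0 -> nan
print(summed)  # first entry is nan: the mask came too late to save it
```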
Not really convinced?
I tested it yesterday and DEVECTORIZE=0 was good. Now both 0 and 1 work, and the tests pass of course.
* fix test logcumsumexp numerical
* lint
* Use dtypes.min instead of -1e4
My attempt to fix the broken DEVECTORIZE=0, which failed because of nan values. I propose applying the mask before the exponential rather than after, with mask value -1e4.
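A NumPy sketch of the proposed fix, masking before the exp and using the dtype minimum as the fill value (mirroring the `dtypes.min` suggestion; an illustration, not the merged tinygrad code):

```python
import numpy as np

x = np.array([0.0, 100.0], dtype=np.float32)
n = x.shape[-1]

x_cummax = np.maximum.accumulate(x)[:, None]
x_expand = np.broadcast_to(x[None, :], (n, n))
mask = np.tril(np.ones((n, n), dtype=bool))

# apply the mask BEFORE exp: masked-out slots get the dtype minimum,
# so exp underflows to 0 there instead of overflowing to inf
mask_val = np.finfo(np.float32).min
diff = np.where(mask, x_expand - x_cummax, mask_val)
ret = np.log(np.exp(diff).sum(-1)) + x_cummax.squeeze(-1)
print(ret)  # [0.0, 100.0]: the correct logcumsumexp, no nan anywhere
```

Every kept position still has diff <= 0 (each element minus a cumulative max that includes it), so exp never overflows; the fill value only has to be negative enough that its exp underflows to 0.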