Description
Version of Awkward Array
2.8.1
Description and code to reproduce
The evaluation of ak.any() for arrays with the cuda backend seems to be incorrect in some cases.
For example, if you download these:
wget http://uaf-10.t2.ucsd.edu/~kmohrman/public_html_backup/files/parquet_files/100k_from_lindsey_file/test_pq_100k.parquet
wget http://uaf-10.t2.ucsd.edu/~kmohrman/public_html_backup/files/py_files/fromLindsey/ak_from_cudf.py
Then the issue can be reproduced with this code:
import pandas as pd
import awkward as ak
import cudf
from ak_from_cudf import cudf_to_awkward
filepath = "test_pq_100k.parquet"
# CPU
table_cpu = pd.read_parquet(filepath, columns=["Muon_pt"])
Muon_pt_cpu = ak.Array(table_cpu["Muon_pt"])
mupair_cpu = ak.combinations(Muon_pt_cpu, 2, fields=["mu1", "mu2"])
ptsum_cpu = mupair_cpu.mu1 + mupair_cpu.mu2
mask_cpu = ak.any(ptsum_cpu>30,axis=1)
# GPU
table_gpu = cudf.read_parquet(filepath, columns=["Muon_pt"])
Muon_pt_gpu = cudf_to_awkward(table_gpu["Muon_pt"])
mupair_gpu = ak.combinations(Muon_pt_gpu, 2, fields=["mu1", "mu2"])
ptsum_gpu = mupair_gpu.mu1 + mupair_gpu.mu2
mask_gpu = ak.any(ptsum_gpu>30,axis=1)
for i, x in enumerate(mupair_cpu):
    mask_agree = mask_cpu[i] == mask_gpu[i]
    if not mask_agree:
        print(f"\nEvent {i}: mask_cpu={mask_cpu[i]}, mask_gpu={mask_gpu[i]}, mask_agree={mask_agree}")
        print(f"\tptsum_cpu: {ptsum_cpu[i]}")
        print(f"\tptsum_gpu: {ptsum_gpu[i]}")
    if i > 54290:
        break
If the CPU and GPU results were the same, nothing would be printed. However, there are many events where the result of ak.any() differs between CPU and GPU (the break statement just keeps the output short), so the reproducer above prints the following:
Event 54272: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: []
ptsum_gpu: []
Event 54273: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: []
ptsum_gpu: []
Event 54274: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: []
ptsum_gpu: []
Event 54275: mask_cpu=True, mask_gpu=False, mask_agree=False
ptsum_cpu: [36.2, 36.1, 37.1, 6.95, 7.95, 7.85]
ptsum_gpu: [36.219772, 36.116978, 37.125095, 6.945659, 7.953775, 7.8509827]
Event 54277: mask_cpu=True, mask_gpu=False, mask_agree=False
ptsum_cpu: [37.1]
ptsum_gpu: [37.097]
Event 54280: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: []
ptsum_gpu: []
Event 54281: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: [26.9]
ptsum_gpu: [26.853153]
Event 54282: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: []
ptsum_gpu: []
Event 54284: mask_cpu=True, mask_gpu=False, mask_agree=False
ptsum_cpu: [34.2]
ptsum_gpu: [34.161404]
Event 54285: mask_cpu=True, mask_gpu=False, mask_agree=False
ptsum_cpu: [86.7, 41, 52.5]
ptsum_gpu: [86.69585, 41.000923, 52.51365]
Event 54288: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: []
ptsum_gpu: []
Event 54289: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: []
ptsum_gpu: []
Event 54290: mask_cpu=False, mask_gpu=True, mask_agree=False
ptsum_cpu: []
ptsum_gpu: []
In these cases where the GPU and CPU results differ, we see the CPU results are correct (based on the pt values of the pairs of muons in the events) and the GPU results are incorrect.
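As a sanity check on which side is right, the expected mask can be recomputed in plain Python from the printed ptsum values. This is only a sketch: expected_any is a hypothetical helper (not part of Awkward Array), and the sample values are copied from the GPU output above.

```python
def expected_any(ptsums, threshold=30.0):
    """Reference for ak.any(ptsum > threshold, axis=1) on a list of lists.

    Hypothetical helper, not an Awkward Array function. An empty event
    gives False, because any() over no elements is False.
    """
    return [any(value > threshold for value in event) for event in ptsums]


# A few events copied from the printed GPU output above.
sample_events = {
    54272: [],
    54275: [36.219772, 36.116978, 37.125095, 6.945659, 7.953775, 7.8509827],
    54277: [37.097],
    54281: [26.853153],
}

# Map each event number to the mask value it should have.
expected = dict(zip(sample_events, expected_any(sample_events.values())))
print(expected)
```

For these four events the recomputed values match mask_cpu and disagree with mask_gpu.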
In case it's useful, here are some interesting/odd things about this potential bug:
- The ptsum_gpu>30 part of the mask seems to agree (between CPU and GPU) for all events; it is only after the ak.any() that the discrepancy arises.
- As can be seen in some of the events printed above, the discrepancy arises even in events where the array of muon pt values is empty.
- A discrepancy does not arise until event 54272 in the sample (more than halfway through the sample). The mask values for all events before this agree perfectly, but for the events after this there are very frequent discrepancies (as seen in the print output above).
- Somehow, the bug seems to depend on the length of the array. If you load only a subset of the events around event 54272 (e.g. just grab a subset like [54260:54290] in the Muon_pt_gpu and Muon_pt_cpu lines), the ak.any() mask values for that event (and all the other ones loaded) are evaluated correctly.
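The event-by-event loop in the reproducer can also be written as a small helper that returns the indices where two masks disagree, which makes it easy to see where the first mismatch falls for different array lengths. A minimal sketch, assuming both masks have already been converted to plain Python lists; mismatch_indices is a hypothetical name, and the toy masks are made up for illustration.

```python
from itertools import islice


def mismatch_indices(mask_a, mask_b, limit=None):
    """Return the indices where two boolean masks differ.

    Hypothetical helper for illustration. `limit` caps how many indices
    are returned; None returns all of them.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    mismatches = (i for i, (a, b) in enumerate(zip(mask_a, mask_b)) if a != b)
    return list(islice(mismatches, limit))


# Toy masks (made up, not taken from the parquet file).
toy_cpu = [False, True, False, True, False]
toy_gpu = [False, True, True, False, False]

print(mismatch_indices(toy_cpu, toy_gpu))           # all mismatching indices
print(mismatch_indices(toy_cpu, toy_gpu, limit=1))  # only the first one
```

With the arrays from the reproducer, the inputs could presumably be obtained with something like ak.to_list(mask_cpu) and ak.to_list(ak.to_backend(mask_gpu, "cpu")), though that conversion has not been checked on the CUDA backend here. It may also be worth noting that 54272 = 53 × 1024; whether that is related to the cause is only a guess.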