Discrepancy in behavior of ak.any() for GPU backend #3503

@kmohrman

Version of Awkward Array

2.8.1

Description and code to reproduce

The evaluation of ak.any() for arrays with the cuda backend seems to be incorrect in some cases.

For example, download these two files:

wget http://uaf-10.t2.ucsd.edu/~kmohrman/public_html_backup/files/parquet_files/100k_from_lindsey_file/test_pq_100k.parquet
wget http://uaf-10.t2.ucsd.edu/~kmohrman/public_html_backup/files/py_files/fromLindsey/ak_from_cudf.py

Then the issue can be reproduced with this code:

import pandas as pd
import awkward as ak
import cudf
from ak_from_cudf import cudf_to_awkward

filepath = "test_pq_100k.parquet"

# CPU
table_cpu   = pd.read_parquet(filepath, columns=["Muon_pt"])
Muon_pt_cpu = ak.Array(table_cpu["Muon_pt"])
mupair_cpu  = ak.combinations(Muon_pt_cpu, 2, fields=["mu1", "mu2"])
ptsum_cpu   = mupair_cpu.mu1 + mupair_cpu.mu2
mask_cpu    = ak.any(ptsum_cpu>30,axis=1)

# GPU
table_gpu   = cudf.read_parquet(filepath, columns=["Muon_pt"])
Muon_pt_gpu = cudf_to_awkward(table_gpu["Muon_pt"])
mupair_gpu  = ak.combinations(Muon_pt_gpu, 2, fields=["mu1", "mu2"])
ptsum_gpu   = mupair_gpu.mu1 + mupair_gpu.mu2
mask_gpu    = ak.any(ptsum_gpu>30,axis=1)

for i,x in enumerate(mupair_cpu):
    mask_agree = mask_cpu[i] == mask_gpu[i]
    if not mask_agree:
        print(f"\nEvent {i}: mask_cpu={mask_cpu[i]}, mask_gpu={mask_gpu[i]}, mask_agree={mask_agree}")
        print(f"\tptsum_cpu: {ptsum_cpu[i]}")
        print(f"\tptsum_gpu: {ptsum_gpu[i]}")
    if i > 54290:
        break

If the CPU and GPU results were the same, nothing would be printed. Instead, the result of ak.any() differs between CPU and GPU for many events. The break statement is only there to keep the output readable by limiting how many events are printed. The reproducer above prints the following:

Event 54272: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: []
	ptsum_gpu: []

Event 54273: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: []
	ptsum_gpu: []

Event 54274: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: []
	ptsum_gpu: []

Event 54275: mask_cpu=True, mask_gpu=False, mask_agree=False
	ptsum_cpu: [36.2, 36.1, 37.1, 6.95, 7.95, 7.85]
	ptsum_gpu: [36.219772, 36.116978, 37.125095, 6.945659, 7.953775, 7.8509827]

Event 54277: mask_cpu=True, mask_gpu=False, mask_agree=False
	ptsum_cpu: [37.1]
	ptsum_gpu: [37.097]

Event 54280: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: []
	ptsum_gpu: []

Event 54281: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: [26.9]
	ptsum_gpu: [26.853153]

Event 54282: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: []
	ptsum_gpu: []

Event 54284: mask_cpu=True, mask_gpu=False, mask_agree=False
	ptsum_cpu: [34.2]
	ptsum_gpu: [34.161404]

Event 54285: mask_cpu=True, mask_gpu=False, mask_agree=False
	ptsum_cpu: [86.7, 41, 52.5]
	ptsum_gpu: [86.69585, 41.000923, 52.51365]

Event 54288: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: []
	ptsum_gpu: []

Event 54289: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: []
	ptsum_gpu: []

Event 54290: mask_cpu=False, mask_gpu=True, mask_agree=False
	ptsum_cpu: []
	ptsum_gpu: []
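
The per-event loop in the reproducer can also be condensed into a list of disagreeing indices, which makes it easier to see where the first mismatch occurs. The following is a minimal sketch, not part of the original report: `disagreeing_indices` is a hypothetical helper, and it works on plain Python sequences of booleans rather than on Awkward Arrays directly.

```python
from itertools import islice


def disagreeing_indices(mask_a, mask_b, limit=None):
    """Return the indices where two equal-length boolean sequences differ.

    `limit` caps how many indices are returned (None means no cap).
    This is a hypothetical helper for illustration only.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    diffs = (
        i
        for i, (a, b) in enumerate(zip(mask_a, mask_b))
        if bool(a) != bool(b)
    )
    return list(islice(diffs, limit))


# Toy masks standing in for mask_cpu and mask_gpu.
cpu = [False, True, False, True]
gpu = [True, True, False, False]
print(disagreeing_indices(cpu, gpu))  # [0, 3]
```

With the reproducer, the masks could presumably be converted first with `ak.to_list(mask_cpu)` and `ak.to_list(ak.to_backend(mask_gpu, "cpu"))`; whether that conversion is needed for the GPU mask is an assumption here.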

In these cases where the GPU and CPU results differ, we see the CPU results are correct (based on the pt values of the pairs of muons in the events) and the GPU results are incorrect.
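
To make the expected result explicit, here is a small pure-Python reference for what `ak.any(ptsum > 30, axis=1)` should return for each event. It is only a sketch: `reference_any` is a hypothetical name, and the input values are copied from the GPU output printed above (events 54272, 54277, 54281 and 54285). Note that an empty list must give False.

```python
def reference_any(nested, threshold):
    """For each inner list, report whether any value exceeds `threshold`.

    An empty inner list yields False, because any() of nothing is False.
    """
    return [any(value > threshold for value in inner) for inner in nested]


# Values taken from the printed ptsum_gpu lines in the report.
ptsum = [
    [],                              # event 54272
    [37.097],                        # event 54277
    [26.853153],                     # event 54281
    [86.69585, 41.000923, 52.51365], # event 54285
]
print(reference_any(ptsum, 30))  # [False, True, False, True]
```

These values match the mask_cpu column in the output above, and are the opposite of mask_gpu for the same four events.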

In case it's useful, here are some interesting/odd things about this potential bug:

  • The ptsum_gpu>30 part of the mask seems to agree (between CPU and GPU) for all events. The discrepancy only arises after the ak.any() call.
  • As can be seen in some of the events printed above, the discrepancy arises even in events where the array of muon pt values is empty.
  • No discrepancy arises before event 54272 (more than halfway through the sample). The mask values for all events before this one agree perfectly, but from this event on there are very frequent discrepancies (as seen in the print output above).
  • Somehow, the bug seems to depend on the length of the array. If you load only a subset of the events around event 54272 (e.g. just grab a slice like [54260:54290] in the Muon_pt_gpu and Muon_pt_cpu lines), the ak.any() mask values for that event and for all the other loaded events are evaluated correctly.
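
Since the bug appears to depend on the array length, one way to narrow it down would be to search for the smallest number of leading events at which CPU and GPU stop agreeing. The sketch below is an assumption-laden illustration, not something done in the original report: `smallest_failing_length` and `agrees_at` are hypothetical names, and the binary search is only valid if failures are monotonic in the length (once a length fails, every longer length fails too), which has not been verified for this bug.

```python
def smallest_failing_length(agrees_at, low, high):
    """Binary-search the smallest n in (low, high] where agrees_at(n) is False.

    Assumptions (all hypothetical, not confirmed by the report):
      * agrees_at(low) is True and agrees_at(high) is False,
      * agrees_at is monotonic: once it is False it stays False for larger n.
    """
    while high - low > 1:
        mid = (low + high) // 2
        if agrees_at(mid):
            low = mid   # still agreeing: the failure starts later
        else:
            high = mid  # already failing: the failure starts here or earlier
    return high


# Stand-in check for demonstration; a real check would rebuild both masks
# from the first n events and compare them.
def fake_check(n):
    return n < 1000


print(smallest_failing_length(fake_check, 1, 100_000))  # 1000
```

In practice `agrees_at(n)` would slice the first n events of Muon_pt_cpu and Muon_pt_gpu, recompute both masks, and return whether they are identical. If the monotonicity assumption does not hold, a linear scan over candidate lengths would be the safer choice.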

Labels

bug: The problem described is something that must be fixed
gpu: Concerns the GPU implementation (backend = "cuda")