Skip to content

GH-43186: [Go] Use auto-aligned atomic int64 for pqarrow pathbuilders#43206

Merged
joellubi merged 6 commits intoapache:mainfrom
joellubi:gh-43186
Jul 10, 2024
Merged

GH-43186: [Go] Use auto-aligned atomic int64 for pqarrow pathbuilders#43206
joellubi merged 6 commits intoapache:mainfrom
joellubi:gh-43186

Conversation

@joellubi
Copy link
Copy Markdown
Member

@joellubi joellubi commented Jul 10, 2024

Rationale for this change

Fixes: #43186

Improves 32-bit support for the pqarrow writer. We may want to push similar changes to other refcount implementations to more completely support running on 32-bit system.

What changes are included in this PR?

Update pathBuilder and multipathLevelBuilder refCounts to use atomic.Int64 which is automatically aligned on 32-bit systems.

Are these changes tested?

The issue reproduces with existing tests when building for a 32-bit architecture, so no new tests were added. This PR adds a "test" step to the existing workflow for 386 architecture builds, currently limited to run the tests fixed in this PR.

Are there any user-facing changes?

32-bit systems should no longer panic when writing parquet files.

@github-actions
Copy link
Copy Markdown

⚠️ GitHub issue #43186 has been automatically assigned in GitHub to PR creator.

@joellubi joellubi marked this pull request as ready for review July 10, 2024 14:18
Copy link
Copy Markdown
Member

@zeroshade zeroshade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this have any effect on performance at all?

Comment on lines +194 to +195
# No asm simd support for 386
# WIP refactor, only tests in the specified dirs have been fixed
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it actually fail without the noasm tag? it should be correctly detecting the architecture and defaulting to the pure go implementations even without the noasm tag. We should fix it otherwise

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was failing because we didn't have a 386 variant of the BMI functions. I just added one and now it works without the build tag. This issue likely still exists with the other arch-specific function implementations but I think it's reasonable to go and add those together with the corresponding updates to refCount across other parts of the repo.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bah, must have missed the BMI stuff when I added more the default impls for architectures I didn't cover -_-

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Jul 10, 2024
@zeroshade
Copy link
Copy Markdown
Member

@joellubi can we also file an issue for follow-up to track updating the rest of the codebase with this fix?

@github-actions github-actions bot added awaiting change review Awaiting change review awaiting changes Awaiting changes and removed awaiting changes Awaiting changes awaiting change review Awaiting change review labels Jul 10, 2024
@joellubi
Copy link
Copy Markdown
Member Author

Does this have any effect on performance at all?

I haven't run formal benchmarks but I would expect the impact to be minimal. The implementation of Int64 is actually pretty simple but interesting:

// An Int64 is an atomic int64. The zero value is zero.
type Int64 struct {
	_ noCopy
	_ align64
	v int64
}

// Add atomically adds delta to x and returns the new value.
func (x *Int64) Add(delta int64) (new int64) { return AddInt64(&x.v, delta) }

So it's actually using the same exact atomic.AddInt64 function we were using before, just with a simple container struct to guarantee alignment rather than having to make sure the int64 is aligned manually in all the structs we use it in.

@zeroshade
Copy link
Copy Markdown
Member

The difference in this case is that we're guaranteeing that the new atomic.Int64 is always on the heap as opposed to allowing the refcount to be on the stack if the structure itself was on the stack. But i agree that I doubt there'll be much of a performance difference for the allocation in question.

@joellubi
Copy link
Copy Markdown
Member Author

@joellubi can we also file an issue for follow-up to track updating the rest of the codebase with this fix?

Just added an issue.

Copy link
Copy Markdown
Member

@zeroshade zeroshade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@joellubi joellubi merged commit 788c8f2 into apache:main Jul 10, 2024
@kou
Copy link
Copy Markdown
Member

kou commented Jul 11, 2024

@joellubi Could you use dev/merge_arrow_pr.* when you merge a PR in this repository next time? It not only merges a PR but also does some extra tasks such as removing comments from PR description and use it as a commit message.
See also: https://github.com/apache/arrow/tree/main/dev#how-to-merge-a-pull-request

@joellubi
Copy link
Copy Markdown
Member Author

@joellubi Could you use dev/merge_arrow_pr.* when you merge a PR in this repository next time?

Hi @kou. Yes absolutely, thanks for letting me know.

@conbench-apache-arrow
Copy link
Copy Markdown

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 788c8f2.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 51 possible false positives for unstable benchmarks that are known to sometimes produce them.

@powersj
Copy link
Copy Markdown

powersj commented Jul 16, 2024

For my own education, did this not make the v17 release today? Is there a way to know when it would end up an actual release in either a bug fix v17 or even v18?

@assignUser
Copy link
Copy Markdown
Member

@powersj this was merged after the code freeze for 17.0.0 (1st of July) and as such not part of the release, there where some issues that required fixes to the RCs that delayed the actual release until today. It will be part of 18.0.0. You can usually check the milestone on the issue to see which release it will be part of (the milestone was missing here but I added it)

It is possible to manually add patches to the release branch after the freeze if they are critical/blockers but normally everything that happens on main after the code freeze is not part of the release.

Currently patch releases are only done for severe issues as out process is not setup for an easy release of a single component, though we are working on that. So a patch release is unlikely (but I am not a Go contributor so 🤷) due to the effort involved (even though it's just a tag for go itself). Sorry for the inconvenience.

@powersj
Copy link
Copy Markdown

powersj commented Jul 16, 2024

@assignUser thanks for detailed response and clearing up the release process.

So a patch release is unlikely (but I am not a Go contributor so 🤷) due to the effort involved (even though it's just a tag for go itself). Sorry for the inconvenience.

OK this helps me understand why we are always consuming a new major version of arrow. I had not dug into the release process, but it always made me a bit uneasy to be grabbing new major versions when updating the dependency.

@zeroshade
Copy link
Copy Markdown
Member

@powersj there are ongoing efforts to try to start decoupling the major versions of the implementations here in the monorepo so that we could potentially do more minor version releases instead. But we aren't there yet unfortunately.

If this is critical for you to get released sooner than the v18 release we can ask on the mailing list, but if it's not critical then it's unlikely we'll do a patch release of it (unless there are other requests for patch fixes which would get rolled together) as @assignUser said.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Go] parquet/pqarrow: panic: unaligned 64-bit atomic operation

5 participants