Skip to content

Add ASAN job in CI#3182

Closed
eregon wants to merge 1 commit into
ruby:mainfrom
eregon:asan-job
Closed

Add ASAN job in CI#3182
eregon wants to merge 1 commit into
ruby:mainfrom
eregon:asan-job

Conversation

@eregon

@eregon eregon commented Oct 14, 2024

Copy link
Copy Markdown
Member

@eregon

eregon commented Oct 14, 2024

Copy link
Copy Markdown
Member Author

So in https://github.com/eregon/yarp/actions/runs/11333924655/job/31519044817 there are reported leaks from:

regparse.c (just 1)
prism/prism.c (multiple)
prism/templates/src/node.c.erb (multiple)

@eregon

eregon commented Oct 14, 2024

Copy link
Copy Markdown
Member Author

I guess one possibility is RUBY_FREE_AT_EXIT might not clean some allocations for Prism.

@kddnewton

Copy link
Copy Markdown
Collaborator

Looks like it was all one leak that @peterzhu2118 fixed. It looks like this is detecting a leak that memcheck isn't. Peter do you know why that might be?

@peterzhu2118

Copy link
Copy Markdown
Member

It looks like this leak is happening inside of Ruby. The heuristics in ruby_memcheck are being used for prism, which filters out memory leaks from Ruby.

RubyMemcheck.config(use_only_ruby_free_at_exit: false)

@kddnewton

Copy link
Copy Markdown
Collaborator

So do I understand this that this is duplicative of the memcheck CI run?

@eregon

eregon commented Oct 17, 2024

Copy link
Copy Markdown
Member Author

I think ASAN detect_leaks=1 and valgrind are complementary and might each find leaks the other doesn't find, but @peterzhu2118 or @KJTsanaktsidis probably know better.

@eregon

eregon commented Oct 17, 2024

Copy link
Copy Markdown
Member Author

Looks like it was all one leak that @peterzhu2118 fixed.

Nice, I guess ruby/ruby@90aa6ae was the fix.


I think ideally we'd have ASAN detect_leaks=1 and/or valgrind in CRuby to find leaks in CRuby sooner and catch them directly in CI. Given it's slow probably with a subset of test-all, sounds fine to me and certainly better than nothing.

And then in Prism the same, ASAN detect_leaks=1 and/or valgrind, to avoid Prism changes to introduce leaks without noticing.

Not sure how much overlap between the two tools though.
One upside of ASAN is it doesn't need to build valgrind from source every time, OTOH it's more effort to repro locally (need to build CRuby with ASAN).

@peterzhu2118

Copy link
Copy Markdown
Member

I think ideally we'd have ASAN detect_leaks=1 and/or valgrind in CRuby to find leaks in CRuby sooner and catch them directly in CI.

This has been on my todo list but I haven't been able to find the time to push it past the finish line.

One upside of ASAN is it doesn't need to build valgrind from source every time

The issue is that the Valgrind in the apt repository was too old and had bugs. Looks like the valgrind in Ubuntu 24.04 is finally new enough (>= 3.20.0): https://packages.ubuntu.com/noble/valgrind

eregon added a commit that referenced this pull request Oct 17, 2024
* By using ubuntu-24.04 which has a recent enough valgrind: #3182 (comment)
@eregon eregon mentioned this pull request Oct 17, 2024
eregon added a commit that referenced this pull request Oct 17, 2024
* By using ubuntu-24.04 which has a recent enough valgrind: #3182 (comment)
eregon added a commit that referenced this pull request Oct 17, 2024
* By using ubuntu-24.04 which has a recent enough valgrind: #3182 (comment)
@kddnewton

Copy link
Copy Markdown
Collaborator

So what is the next step here — should we wait until this is a part of memcheck or try to add this separately? If we're going to add this separately I definitely want to add a suppression for this leak since I don't want it to be red when we merge.

@eregon

eregon commented Oct 17, 2024

Copy link
Copy Markdown
Member Author

I think waiting for adding valgrind and/or ASAN in CRuby CI would make most sense.
That way we don't add a(nother) job which would fail because of a leak in CRuby.

That's also what we need to make ruby-asan builds usable with detect_leaks=1: ruby/setup-ruby#653
And it would also test RUBY_FREE_AT_EXIT remains correct and complete.

@ivoanjo

ivoanjo commented Oct 18, 2024

Copy link
Copy Markdown

Another benefit in having this in CI is that it makes it easier for downstream gems (e.g. DataDog/dd-trace-rb#3864 ) to enable ASAN testing without needing to deal with too many exceptions and whatnot ;)

@eregon

eregon commented Oct 18, 2024

Copy link
Copy Markdown
Member Author

@ivoanjo Yep, that's what I meant by "That's also what we need to make ruby-asan builds usable with detect_leaks=1".
But it should be CRuby CI, not Prism, otherwise we'd only catch leaks in Prism (and failed builds from leaks in CRuby), which seems already well tested with ruby_memcheck. There are/were far more leaks in CRuby itself and BTW the ones we saw in the log above were actually in the Prism compiler, which is part of CRuby and not part of this repo.

@peterzhu2118

This has been on my todo list but I haven't been able to find the time to push it past the finish line.

One idea might be to add it in CI with some suppressions initially and then address these suppressions progressively (potentially by filing separate b.r-l.o bugs about them)? It would prevent new leaks. Otherwise it feels like there will likely be more leaks and maybe never a time where CRuby has no leaks (when running a significant part of the test suite with leak detection).


In general this is a bit the wrong place to discuss this, I think it'd make sense to create a https://bugs.ruby-lang.org/ ticket to add leak checking in CRuby CI. @ivoanjo maybe you could do that since you seem most interested in it?

@eregon

eregon commented Oct 18, 2024

Copy link
Copy Markdown
Member Author

I'll close this because I don't think it makes sense to merge until CRuby checks leaks in its CI.

@eregon eregon closed this Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants