Optimize reallocation of caml_frame_descriptors. by chambart · Pull Request #156 · ocaml/ocaml

chambart · 2015-03-23T17:18:58Z

It now only reallocate and reinitialise the whole table only when it
is too small. This avoids quadratic behavior when loading a lot of
module with dynlink.

This in problematic on frama-c when inlining increase the code size. The frame table
initialisation took ~0.5 second. This is quite noticeable on real examples where the
whole frama-c analysis is ~1.5s long.

milanst · 2015-03-23T18:20:54Z

+1 for fixing this issue

I've noticed the bad behavior when I was debugging a surprisingly long startup time of one of our programs at Jane Street (although this particular thing wasn't the main culprit).

The caml_frame_descriptors table can be really big when you use big libraries (like Core or Async, they have many functions). Rehashing this whole table every time a small module is dynlinked can be quite expensive, not to mention quadratic behavior when you load many modules.

bobot · 2015-03-23T22:26:15Z

That's impressive! We remarked in frama-c that the statup was slow but we never imagined that was because of this. Thanks a lot for finding it and for fixing it!

I tried to review the code, but I had trouble to understand the different states of the book keeping of the hashtable. So here an attempt to explicit the different states: bobot/ocaml@878819d, bobot/ocaml@e65f499 (branch optimizes_caml_init_frame_descriptor) . Feel free to use them (or not).

Except that, the code is neat and a lot more efficient!

xavierleroy · 2015-03-24T18:21:21Z

Yes, it's a good idea to grow the hash table only when needed.

I wonder, however, if the code couldn't be simplified further by having caml_register_frametable enter the frames directly in the hash table,, rather than maintaining a list of frame tables. The resizing and insertion logic would go into caml_register_frametable. In the resize case, the contents of the old hash table are reinserted in the new, bigger one.

Also, it would be nice to be able to unregister a frametable, removing its entries from the hash table. We can't use this (safely) for natdynlink (yet?), but the Coqonut project (a JIT for Coq) could make use of it.

damiendoligez · 2015-03-24T19:55:55Z

I wonder, however, if the code couldn't be simplified further by having caml_register_frametable enter the frames directly in the hash table

The difference is startup time for very-short-lived programs that don't allocate enough to start a major collection before they exit.

Not sure we should care about those.

xavierleroy · 2015-03-24T20:14:16Z

Even minor collections need the frame table to be properly set up. So,
you'd need really very short-lived programs for the current "lazy" strategy
to win against the simpler approach I outlined.

2015-03-24 20:55 GMT+01:00 Damien Doligez notifications@github.com:

I wonder, however, if the code couldn't be simplified further by having
caml_register_frametable enter the frames directly in the hash table

The difference is startup time for very-short-lived programs that don't
allocate enough to start a major collection before they exit.

Not sure we should care about those.

—
Reply to this email directly or view it on GitHub
#156 (comment).

chambart · 2015-03-24T22:39:44Z

@xavierleroy I'll try that simplification. Maybe one way to have both the lazy behavior for non-allocating programs and a simpler situation would be to continue to do the first initialisation lazily but directly update the table on dynlink. For code unloading, it's probably possible to lazily reconstruct the table when the used part of the table is below 1/8 (to avoid reallocating if the size oscillate around a power of 2). But I wouldn't want to try that if there is no code using it yet.

damiendoligez · 2015-03-24T22:50:00Z

I don't think you need to bother with shrinking the table when unloading code, but don't you need to rebuild the table whenever you remove some entries?

And I don't think we really care about lazy initialization.

It now only reallocate and reinitialise the whole table only when it is too small. This avoids quadratic behavior when loading a lot of module with dynlink.

If some code was unloaded, the frame table can be freed, so we cannot dereference the frame_descr pointer in the table. To avoid that problem, the return address is copied to the frame table.

chambart · 2015-03-25T15:33:21Z

I got rid of the lazy initialization and couldn't resist trying to allow unloading.

chambart · 2015-03-25T15:37:42Z

To allow unloading I needed to store the return addresses directly in the table and to add a check in the insertion. The check is probably almost free, but this increase the size of the table. Note that it also means that collecting the local gc roots may be a bit cheaper.

xavierleroy · 2015-03-25T15:54:35Z

No, we don't want to double the size of this hashtable. Think of the caches...

I'm not sure why you think you need to put the retaddr in the hash table. Deletion in a linearly-hashed table is painful but can be done without extra info, see e.g. TAOCP vol 3 section 6.4 algorithm R.

chambart · 2015-03-26T12:39:23Z

It's bigger, but that also means that we don't need to follow some pointers when we walk the stack. That's probably avoid trashing parts of the cache with some useless frame descriptors. Out of curiosity, I benchmark it a bit and it seems to be consistently slightly faster (less than 5%) on exception or gc heavy workload.

But I don't really propose that part for inclusion, only the first part.

damiendoligez · 2015-03-26T16:19:32Z

It's not just the caches: in the case that @milanst investigated, the table's size was 16M and most Jane Street programs are probably at this order of magnitude. You don't want to double that if you can avoid it.

chambart · 2015-03-27T15:13:16Z

Removal was not tested at all.

gasche · 2015-04-12T10:53:25Z

What is the status of this patch? My understanding is that @xavierleroy 's request for unloading frametables is implemented, and that the table is not doubled anymore. Would anyone be willing to review the resulting patch?

damiendoligez · 2015-04-14T20:54:57Z

asmrun/roots.c

gasche · 2015-07-26T19:04:33Z

Merged in trunk, thanks for the patch!

alainfrisch · 2015-11-03T21:54:19Z

One of our customers has experienced segfaults when dynlinking a lot of code, and my colleague Roven Gabriel has tracked this down to this PR as the probable cause for the segfault.

Looking at the commits, two things seem fishy to me:

The global frametables variable is never updated (so it is always equal to NULL). I guess it should be set to new_frametables at the end of init_frame_descriptors.
In init_frame_descriptors, in the case of resizing, frametables is explicitly reset to NULL, but then tail->next is again updated to the value of frametables (now NULL again). I guess tail->next should not be updated in case of resizing.

I've submitted #279 as an attempt to fix these issues.

chambart · 2015-11-04T10:48:26Z

@alainfrisch I will try to take some time reread it seriously and verify your patch soon.

mshinwell · 2015-11-04T12:57:37Z

I confirm this patch is the cause of some segfaults; we've seen this on Coq with flambda at high inlining levels.

* Fix leaked file descriptor when querying package documentation * Enable info logs in production

chambart added 3 commits March 25, 2015 16:30

Optimize reallocation of caml_frame_descriptors.

6d2616f

It now only reallocate and reinitialise the whole table only when it is too small. This avoids quadratic behavior when loading a lot of module with dynlink.

Getting rid of lazy initialization of frametables

3fdaf02

Allow unloading frametables.

138bd93

If some code was unloaded, the frame table can be freed, so we cannot dereference the frame_descr pointer in the table. To avoid that problem, the return address is copied to the frame table.

chambart force-pushed the optimizes_caml_init_frame_descriptor branch from 394f90d to 138bd93 Compare March 25, 2015 15:30

Don't double frame table size

f304ccb

damiendoligez reviewed Apr 14, 2015
View reviewed changes

asmrun/roots.c Outdated

Copy link
Copy Markdown

Member

damiendoligez Apr 14, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Fix typo

0db5a97

chambart mentioned this pull request Jun 24, 2015

Oldify dynamically linked modules only once #177

Closed

gasche closed this Jul 26, 2015

alainfrisch reopened this Nov 3, 2015

alainfrisch mentioned this pull request Nov 3, 2015

Attempt to fix a probable bug introduced by GPR#156. #279

Merged

alainfrisch closed this Nov 13, 2015

mshinwell added a commit to mshinwell/ocaml that referenced this pull request Jun 11, 2020

Check for non-variant tags on Lswitch (ocaml#156)

a50390a

lthls pushed a commit to lthls/ocaml that referenced this pull request Sep 23, 2020

Check for non-variant tags on Lswitch (ocaml#156)

d84bbf8

lthls pushed a commit to lthls/ocaml that referenced this pull request Sep 23, 2020

Check for non-variant tags on Lswitch (ocaml#156)

b13ccae

lthls pushed a commit to lthls/ocaml that referenced this pull request Sep 24, 2020

Check for non-variant tags on Lswitch (ocaml#156)

7b64c09

chambart pushed a commit to chambart/ocaml-1 that referenced this pull request Sep 9, 2021

CFG optimization: eliminate useless edges (ocaml#156)

2f7777b

EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024

Don't commit generated JS files (ocaml#156)

d800537

EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024

Fix leaking file descriptor (ocaml#156)

8682ce6

* Fix leaked file descriptor when querying package documentation * Enable info logs in production

Conversation

chambart commented Mar 23, 2015

Uh oh!

milanst commented Mar 23, 2015

Uh oh!

bobot commented Mar 23, 2015

Uh oh!

xavierleroy commented Mar 24, 2015

Uh oh!

damiendoligez commented Mar 24, 2015

Uh oh!

xavierleroy commented Mar 24, 2015

Uh oh!

chambart commented Mar 24, 2015

Uh oh!

damiendoligez commented Mar 24, 2015

Uh oh!

chambart commented Mar 25, 2015

Uh oh!

chambart commented Mar 25, 2015

Uh oh!

xavierleroy commented Mar 25, 2015

Uh oh!

chambart commented Mar 26, 2015

Uh oh!

damiendoligez commented Mar 26, 2015

Uh oh!

chambart commented Mar 27, 2015

Uh oh!

gasche commented Apr 12, 2015

Uh oh!

damiendoligez Apr 14, 2015

Choose a reason for hiding this comment

Uh oh!

gasche commented Jul 26, 2015

Uh oh!

alainfrisch commented Nov 3, 2015

Uh oh!

chambart commented Nov 4, 2015

Uh oh!

mshinwell commented Nov 4, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants