Optimizing TSD and thread cache layout #701

@interwq

Fast-path tcache performance is critical, since the tcache is expected to be accessed very frequently. We aim to improve the fast path significantly with the following two changes:

  1. Embed tcache into TSD:
    Currently each tsd stores a pointer to the automatic tcache:
    https://github.com/jemalloc/jemalloc/blob/dev/include/jemalloc/internal/tsd_structs.h#L19
    This adds a memory dereference on the fast path, and potentially a cache miss. We can avoid this by embedding the tcache struct into tsd directly. Since we do not allow switching the automatic tcache, this should work fine w/o breaking APIs.

  2. Optimize tcache and tsd for a more compact layout. We'd like to have a higher density of tcache bins to improve cache locality. Currently tbins are 32 bytes, which will be cut down to 24. TSD will also be packed more tightly.
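
The two changes above can be illustrated with a rough sketch. This is not jemalloc's actual code: every type, field, and function name below is hypothetical, the field choices are only one plausible way to get from 32 to 24 bytes, and the sizes assume a typical 64-bit (LP64) target with 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBINS_SKETCH 8 /* hypothetical bin count, for illustration only */

/* Hypothetical "before" bin: a 4-byte padding hole after ncached
 * (needed to align avail) brings the struct to 32 bytes on LP64. */
typedef struct {
    uint64_t nrequests;   /* stats counter */
    int32_t  low_water;   /* minimum cached since last GC pass */
    uint32_t lg_fill_div; /* fill-count divisor */
    uint32_t ncached;     /* objects currently cached */
    void   **avail;       /* stack of cached objects */
} bin_wide_t;

/* Hypothetical "after" bin: the rarely used lg_fill_div is moved out
 * and the remaining fields are ordered so no padding is needed. */
typedef struct {
    uint64_t nrequests;
    void   **avail;
    int32_t  low_water;
    uint32_t ncached;
} bin_compact_t;

/* Hypothetical tcache: hot bins packed together, cold per-bin data
 * kept in a separate small array. */
typedef struct {
    bin_compact_t bins[NBINS_SKETCH];
    uint8_t       lg_fill_div[NBINS_SKETCH];
} tcache_sketch_t;

/* Layout with a pointer in TSD: reaching a bin loads the pointer first. */
typedef struct {
    uint64_t         other_state; /* stand-in for other TSD fields */
    tcache_sketch_t *tcache;
} tsd_ptr_t;

/* Layout with the tcache embedded: a bin sits at a fixed offset from TSD. */
typedef struct {
    uint64_t        other_state;
    tcache_sketch_t tcache;
} tsd_embedded_t;

/* Two dependent steps: load tsd->tcache, then index into bins. */
static inline bin_compact_t *bin_via_pointer(tsd_ptr_t *tsd, size_t i) {
    return &tsd->tcache->bins[i];
}

/* Pure address arithmetic on the TSD pointer, no extra load. */
static inline bin_compact_t *bin_embedded(tsd_embedded_t *tsd, size_t i) {
    return &tsd->tcache.bins[i];
}
```

Under these assumptions, `sizeof(bin_wide_t)` is 32 and `sizeof(bin_compact_t)` is 24, so a 64-byte cache line holds parts of fewer distinct bins in the wide layout, and `bin_embedded` computes the bin address without first reading a pointer out of TSD.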
