
Set the lowest bit in object tags #2801

Merged
mergify[bot] merged 3 commits into master from
osa1/update_object_tags
Sep 22, 2021

Contributor

osa1 commented Sep 22, 2021

In the compacting GC we need to distinguish a heap location (an object or
field address) from an object header. Currently this is done by checking
whether the value is smaller than or equal to the largest tag. Because the
first 64 KiB of the heap is reserved for the Rust stack, as long as the largest
tag is smaller than 65,536, we can assume that values smaller than 65,536 are
headers.
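
As a rough illustration of the current check, here is a minimal Rust sketch. `MAX_TAG` and its value are placeholders, not the real RTS constants; only the 64 KiB stack size comes from the description above.

```rust
/// Size of the Rust stack at the start of the heap (64 KiB, per the description).
const RUST_STACK_SIZE: u32 = 64 * 1024;

/// Assumed largest tag value; the real value lives in the RTS.
const MAX_TAG: u32 = 32;

// The range check is only sound while every tag is below the first heap address.
const _: () = assert!(MAX_TAG < RUST_STACK_SIZE);

/// Range-based check: a word is treated as a header if it is no larger
/// than the largest tag, and as a heap address otherwise.
fn is_header_by_range(word: u32) -> bool {
    word <= MAX_TAG
}
```

Under these assumptions, any word at or above `RUST_STACK_SIZE` is classified as an address, which is why the check depends on the stack size.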

This way of checking whether a value is a header or an address causes problems
when we want to use the rest of the header bits to store more information.
Examples:

  • In #2706 (WIP: Implement page allocation) we will use one bit in the header
    to mark large objects. At least initially, we won't be compacting large
    objects, so the mark-compact GC won't see them and won't have to care about
    large header values. But we may later want to compact large objects, or
    store other information (maybe mark bits, or generation numbers).

  • We may want to store the number of untagged (scalar) and tagged fields in
    object headers and merge some of the different object types. For example,
    instead of having 3 tags for Variant, Some, and MutBox, we could have one
    tag, and use the rest of the header to indicate that variants have one
    scalar and one tagged field, mutable objects have just one tagged field,
    etc.

  • We could have SmallBlob and SmallArray types for blobs and arrays with
    lengths smaller than 65,536 (16-bit length field). This would save us one
    word for small blobs and arrays.

  • We no longer have to rely on the Rust stack being large enough that the
    largest tag stays smaller than any valid heap address.
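
To make the SmallBlob idea above concrete, here is a minimal sketch of packing a 16-bit length next to a tag in one 32-bit header word. The layout (tag in the low 16 bits, length in the high 16 bits), the name `TAG_SMALL_BLOB`, and its value are all assumptions for illustration; the PR does not define them.

```rust
/// Assumed layout: low 16 bits hold the tag, high 16 bits hold the length.
const TAG_BITS: u32 = 16;
const TAG_MASK: u32 = (1 << TAG_BITS) - 1;

/// Hypothetical tag value for a small blob (odd, so the lowest bit is set).
const TAG_SMALL_BLOB: u32 = 0x21;

/// Pack a 16-bit length and the tag into a single header word.
fn pack_small_blob_header(len: u16) -> u32 {
    (u32::from(len) << TAG_BITS) | TAG_SMALL_BLOB
}

/// Extract the tag from a header word.
fn header_tag(header: u32) -> u32 {
    header & TAG_MASK
}

/// Extract the length from a small-blob header word.
fn small_blob_len(header: u32) -> u16 {
    (header >> TAG_BITS) as u16
}
```

With this layout, a blob of length 300 gets a header value well above 64 * 1024, so a range-based header check would mistake it for a heap address.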

In this PR we update tags so that they always have the lowest bit set. Since
objects and fields are all word aligned (so their lowest 2 bits are unset; this
invariant was established in #2764), checking the lowest bit is enough to
distinguish an address from a header. With this we can freely use the rest of
the bits in headers.
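
A minimal sketch of the new scheme, assuming tags are built by shifting an index left by one and setting bit 0. The helper `make_tag` and the `TAG_*` names are hypothetical; the actual tag values are whatever the RTS defines.

```rust
/// Hypothetical tag constructor: shift the index left and set the lowest bit.
const fn make_tag(index: u32) -> u32 {
    (index << 1) | 1
}

// Illustrative tags only; real names and values are defined in the RTS.
const TAG_FIRST: u32 = make_tag(0); // 1
const TAG_SECOND: u32 = make_tag(1); // 3
const TAG_THIRD: u32 = make_tag(2); // 5

/// A word with the lowest bit set is a header.
fn is_header(word: u32) -> bool {
    word & 1 == 1
}

/// Word-aligned addresses have their lowest 2 bits unset.
fn is_word_aligned(addr: u32) -> bool {
    addr & 0b11 == 0
}
```

Because every aligned address has bit 0 clear and every tag has it set, `is_header` gives the same answer no matter what is stored in the higher header bits.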

While this PR currently does not unblock any PRs, it's nice to have this
flexibility for future changes, and these changes do not have any significant
downsides. (mo-rts.wasm grows by 58 bytes, 0.03%.)

This is in preparation for #2790 and #2706. With #2790 we will start
using the rest of the header bits for GC metadata (setting some of the
high bits), which will break the compacting GC as we won't be able to
distinguish a heap address from an object header by checking if the value
is larger than the maximum tag value. This check assumes a heap address
cannot be smaller than the maximum tag value, which holds because we have
at least 64 KiB of Rust stack, and then static data for the canister.

With the high bits of headers set, it's possible that some of the
headers will have a value larger than 64 * 1024. So the current check no
longer works.
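
The failure can be sketched as follows. `MAX_TAG`, `MARK_BIT`, and both function names are hypothetical and chosen only for illustration; #2790 may use different bits.

```rust
/// Assumed largest tag value (must be below 64 * 1024 for the old check).
const MAX_TAG: u32 = 32;

/// Hypothetical GC metadata bit stored in the high end of the header.
const MARK_BIT: u32 = 1 << 31;

/// Old check: headers are values no larger than the largest tag.
fn looks_like_header_old(word: u32) -> bool {
    word <= MAX_TAG
}

/// New check: headers are values with the lowest bit set.
fn looks_like_header_new(word: u32) -> bool {
    word & 1 == 1
}
```

For a header `5 | MARK_BIT`, the old check returns `false` (the value exceeds `MAX_TAG`, so it is taken for an address), while the new check still returns `true`.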

To allow distinguishing heap locations from headers, this PR refactors
object tags so that they all have the least significant bit set.
Since objects and fields are all word aligned (so their lowest 2 bits are
unset; this invariant was established in #2764), we can now check the
lowest bit to distinguish an address from a header.
@osa1 osa1 requested review from crusso, ggreif and nomeata September 22, 2021 07:26
@osa1 osa1 marked this pull request as draft September 22, 2021 07:34
Contributor Author

osa1 commented Sep 22, 2021

I think this may not be necessary so converted this into draft for now.

Contributor

github-actions bot commented Sep 22, 2021

Comparing from 604c89d to d3231cb:
In terms of gas, no changes are observed in 3 tests.
In terms of size, 3 tests regressed and the mean change is +0.0%.

@osa1 osa1 marked this pull request as ready for review September 22, 2021 09:45
Contributor Author

osa1 commented Sep 22, 2021

OK, updated the PR description. @crusso @ggreif @nomeata feedback welcome!

Contributor

@nomeata nomeata left a comment

LGTM.

Contributor

@crusso crusso left a comment

LGTM but are you sure we never broke the tag abstraction anywhere in compile.ml? I.e. by hardcoding the tag constant?

Contributor Author

osa1 commented Sep 22, 2021

Hard-coding a tag would be a terrible practice. I can't guarantee that we don't have hard-coded tags in compile.ml, but given that the tests pass, I think we don't.

Contributor Author

osa1 commented Sep 22, 2021

I tried searching for hard-coded tags but we have hundreds of occurrences of 1l, 2l, ... so it will be impossible to check them all.

@osa1 osa1 added the automerge-squash When ready, merge (using squash) label Sep 22, 2021
Comment thread src/codegen/compile.ml Outdated
Comment thread src/codegen/compile.ml Outdated
@osa1 osa1 removed the automerge-squash When ready, merge (using squash) label Sep 22, 2021
@osa1 osa1 added the automerge-squash When ready, merge (using squash) label Sep 22, 2021
@mergify mergify bot merged commit 031dddb into master Sep 22, 2021
@mergify mergify bot deleted the osa1/update_object_tags branch September 22, 2021 14:05
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Sep 22, 2021