Set the lowest bit in object tags #2801
Merged
mergify[bot] merged 3 commits into master on Sep 22, 2021
This is in preparation for #2790 and #2706. With #2790 we will start using the rest of each header for GC metadata by setting some of the high bits. That breaks the compacting GC, which currently distinguishes a heap address from an object header by checking whether the value is larger than the maximum tag value. The check assumes that a heap address cannot be smaller than the maximum tag value, which holds because the Rust stack takes at least 64 KiB, followed by the canister's static data. Once high bits of headers are set, some headers can have values larger than 64 * 1024, so the current check no longer works. To keep heap locations distinguishable from headers, this PR refactors object tags so that they all have the least significant bit set. Objects and fields are all word aligned, so their addresses have the lowest 2 bits unset (this invariant was established in #2764); checking the lowest bit is therefore enough to tell an address from a header.
I think this may not be necessary, so I converted this into a draft for now.
crusso
approved these changes
Sep 22, 2021
Hard-coding a tag would be a terrible practice. I can't guarantee that we don't have hard-coded tags in compile.ml, but given that the tests pass, I think we don't.
I tried searching for hard-coded tags, but we have hundreds of occurrences of 1l, 2l, ..., so it is impossible to check them all.
ggreif
reviewed
Sep 22, 2021
ggreif
reviewed
Sep 22, 2021
ggreif
approved these changes
Sep 22, 2021
In the compacting GC we need to distinguish a heap location (an object or field
address) from an object header. Currently this is done by checking whether the value
is smaller than or equal to the largest tag. Because the first 64 KiB of the heap
is reserved for the Rust stack, as long as the largest tag is smaller than 65,536 we
can assume that values smaller than 65,536 are headers.
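
The range-based check described above can be sketched roughly as follows. This is only an illustration: `MAX_TAG`, `MIN_HEAP_ADDRESS`, and `is_header_by_range` are hypothetical names and values chosen for the example, not the RTS's actual definitions.

```rust
/// Hypothetical largest tag value (an assumption for this sketch).
const MAX_TAG: u32 = 32;

/// Lower bound on heap addresses implied by the 64 KiB Rust stack.
const MIN_HEAP_ADDRESS: u32 = 64 * 1024;

/// Range-based check: a word is treated as a header if it is
/// no larger than the largest tag, and as an address otherwise.
fn is_header_by_range(word: u32) -> bool {
    word <= MAX_TAG
}
```

The check is only sound while every header value stays below `MIN_HEAP_ADDRESS`, which is why the largest tag has to be smaller than 65,536.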
This way of checking whether a value is a header or an address causes problems when
we want to use the rest of the object header to store more information. Examples:

- In #2706 (WIP: Implement page allocation) we will use one bit in the header to mark large objects. At least
  initially we won't be compacting large objects, so the mark-compact GC won't see
  them and won't have to care about large header values. But we may later
  want to compact large objects, or store other information (maybe
  mark bits, or generation numbers).
- We may want to store the number of untagged (scalar) and tagged fields in object
  headers and merge some of the object types. For example, instead of
  having 3 tags for `Variant`, `Some`, and `MutBox`, we could have one tag and
  use the rest of the header to indicate that variants have one scalar and one
  tagged field, mutable objects have just one tagged field, etc.
- We could have `SmallBlob` and `SmallArray` types for blobs and arrays with
  lengths that fit in a 16-bit length field (up to 65,535). This would save us one
  word for small blobs and arrays.
- We no longer have to rely on the Rust stack being large enough that the largest
  tag stays below every valid heap address.
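
To make the problem concrete, here is a small sketch of how setting a high header bit defeats the range-based check. The bit position and all names (`LARGE_OBJECT_BIT`, `mark_large`, `MAX_TAG`) are assumptions made up for this example; the actual layout planned in #2706 is not described here.

```rust
/// Hypothetical largest tag value (an assumption for this sketch).
const MAX_TAG: u32 = 32;

/// Hypothetical high bit used to mark large objects.
const LARGE_OBJECT_BIT: u32 = 1 << 31;

/// The range-based check: headers are values no larger than the largest tag.
fn is_header_by_range(word: u32) -> bool {
    word <= MAX_TAG
}

/// Store extra information in the header by setting a high bit.
fn mark_large(header: u32) -> u32 {
    header | LARGE_OBJECT_BIT
}
```

With these definitions, `mark_large(3)` is far larger than `MAX_TAG`, so `is_header_by_range` would misclassify that header as a heap address.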
In this PR we update tags so that they always have the lowest bit set. Objects
and fields are all word aligned, so their addresses have the lowest 2 bits unset
(this invariant was established in #2764). Checking the lowest bit is therefore
enough to distinguish an address from a header, and we can freely use the rest of
the bits in headers.
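
A minimal sketch of the lowest-bit check, assuming 4-byte words and a tag encoding in which tags are simply the odd numbers 1, 3, 5, and so on. The function names and the encoding in `make_tag` are hypothetical; the source only states that every tag has its least significant bit set.

```rust
/// Word size in bytes; objects and fields are aligned to this.
const WORD_SIZE: u32 = 4;

/// Hypothetical tag encoding: the n-th tag is the n-th odd number.
fn make_tag(n: u32) -> u32 {
    (n << 1) | 1
}

/// A word is a header if its lowest bit is set.
fn is_header(word: u32) -> bool {
    (word & 1) == 1
}

/// A word-aligned address has its lowest 2 bits unset.
fn is_aligned_address(word: u32) -> bool {
    word % WORD_SIZE == 0
}
```

Because the check looks only at bit 0, a header stays recognizable even when higher bits carry extra metadata: `is_header(make_tag(7) | (1 << 31))` is still true, while any aligned address such as `0x0001_0000` fails the check.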
While this PR currently does not unblock any PRs, it's nice to have this
flexibility for future changes, and these changes do not have any
downsides. (mo-rts.wasm grows 0.03%, 58 bytes)