Implement fcvt_to_uint_sat (f32x4 -> i32x4) for x86 #1990
Merged
abrown merged 3 commits into bytecodealliance:main on Jul 8, 2020
5196d47 to 00a7db1
julian-seward1 (Contributor) approved these changes on Jul 8, 2020
Ok to land, but please remove the redundant mention of AVX512 in the commit message:
"This converts an f32x4 into an i32x4 (unsigned) with some rounding either by using an AVX512VL/F instruction--VCVTPS2UDQ--or a long sequence of SSE4.1 compatible instructions."
Thanks for your patience with this!
This converts an `f32x4` into an `i32x4` (unsigned) with rounding by using a long sequence of SSE4.1 compatible instructions.
00a7db1 to ec04966
This replaces #1822; it consists of the same functionality but removes the AVX512 instruction lowering for the time being. There are two reasons for this:
- `i32x4.trunc_sat_f32x4_u`. We can then use embedded rounding control but lose the ability to specify the vector length, so the instruction would operate on 512 bits, which we should discuss (@sunfishcode has reported issues with 512-bit vectors in SpiderMonkey)
- `VCVTPS2UDQ` for negative lanes is `0xFFFFFFFF` (I had thought it would be `0x00000000`); this can be resolved with the following sequence: `v0 = pxor ...; v2 = fcmp gte v1, v0; v3 = vcvtps2udq v1; v4 = band v2, v3` (the `gte` comparison ensures the lanes are ordered). However, I would like to look at this a little bit more before submitting a separate PR for it (this is the reason for keeping the legalization in `enc_tables.rs` and under `narrow_avx`, BTW).
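For illustration, the masking idea in the description above can be sketched in scalar Rust. This is not the Cranelift lowering; it is a minimal model under stated assumptions. The function names (`trunc_sat_u_reference`, `model_vcvtps2udq`, `masked_convert`) are hypothetical, and `model_vcvtps2udq` is a simplified stand-in that assumes truncating rounding and returns `0xFFFFFFFF` for any lane that cannot be represented as an unsigned 32-bit integer. The reference relies on Rust's `as` cast from `f32` to `u32`, which saturates and maps NaN to zero.

```rust
/// Reference semantics, per lane: truncate toward zero, saturate to
/// [0, u32::MAX], and map NaN to 0. Rust's float-to-int `as` cast
/// already behaves this way.
fn trunc_sat_u_reference(lanes: [f32; 4]) -> [u32; 4] {
    lanes.map(|x| x as u32)
}

/// ASSUMPTION: simplified model of the hardware conversion with
/// truncation. Any lane that is NaN or out of range yields 0xFFFFFFFF.
fn model_vcvtps2udq(lanes: [f32; 4]) -> [u32; 4] {
    lanes.map(|x| {
        if x.is_nan() || x <= -1.0 || x >= 4_294_967_296.0 {
            0xFFFF_FFFF
        } else {
            x.trunc() as u32
        }
    })
}

/// Mirrors the sequence from the description:
///   v0 = zero; v2 = fcmp gte v1, v0; v3 = convert v1; v4 = band v2, v3
fn masked_convert(lanes: [f32; 4]) -> [u32; 4] {
    let converted = model_vcvtps2udq(lanes);
    let mut out = [0u32; 4];
    for i in 0..4 {
        // An ordered >= comparison is false for NaN and for negative
        // lanes, so those lanes get an all-zero mask.
        let mask: u32 = if lanes[i] >= 0.0 { 0xFFFF_FFFF } else { 0 };
        out[i] = mask & converted[i];
    }
    out
}
```

Under this model, NaN and negative lanes are zeroed by the mask, while lanes that are too large keep the all-ones result, which coincides with the saturated value `u32::MAX`.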