Skip to content

Conversation

@zherczeg
Copy link
Contributor

Parse (ref index) form, not just (ref $name) form
Resolve references in types and locals
Display location when a named reference is not found

@zherczeg
Copy link
Contributor Author

This is another bugfix patch extracted from #2562. The list of changes are above, these constructs should work even in the main branch.


if (type.is_index()) {
out_type_list->push_back(Type(type.index()));
// TODO: Temporary fix.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full fix is in #2562. It is unlikely that close to UINT32_MAX number of types are defined, so this is good as a temporary change.

func->GetNumParams(),
decl->sig.param_type_names, errors);

func->local_types.Set(func->local_type_list);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a delayed processing of locals. The reasion is that otherwise we cannot tell that (ref $t) and (ref 0) refers to the same or different locals, so they can be merged or not.

src/ir.cc Outdated
Index count = 1;
for (Index i = 1; i < types.size(); ++i) {
if (types[i] != type) {
if (!types[i].IsSame(type)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name might not be the best. Anyway, the purpose is that == only compares the enum, while the index is also important for references.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.

Mostly LGTM, just trying to page in how this stuff all works.

// Some types can have names, for example (ref $foo) has type $foo.
// So to use this type we need to translate its name into
// a proper index from the module type section.
// This is the mapping from parameter/result index to its name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment needs updating maybe?

}

// The == comparison only compares the enum_ member.
bool IsSame(const Type& rhs) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we instead update operator==?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I get errors such as error: ambiguous overload for ‘operator==’ (operand types are ‘const wabt::Type’ and ‘wabt::Type::Enum’)

The reason is the Type has a constexpr operator Enum() const { return enum_; }, and this confuses the compiler.

I am open to any ideas.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how much would break if you made that explicit...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to work for me locally:

   constexpr operator Enum() const { return enum_; }
 
+  // TODO(sbc) Use =default here once we have C++20
+  bool operator==(const Type& other) const {
+    return enum_ == other.enum_ && type_index_ == other.type_index_;
+  }
+
+  bool operator==(const Enum& other) const { return enum_ == other; }

I tried the explicit route too and that seems to break a bunch of places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I change it. Thank you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw it might fix another bug. Currently if we have two types which are the same except the ref part:

(type $t0 (func (param (ref $t0))))
(type $t1 (func (param (ref $t1))))

(func $f (param (ref $t1)) )

$f gets type 0, instead of type 1. The index is simply ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems it does not work :(

if (types[i] != type) { still uses the enum auto conversion.
if (!types[i].operator==(type)) { works but does not look nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What shall I do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess removing the implicit cast is the way to go.. although as you say it might require a bunch of downstream changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to move the operator== before the Enum casting, and the auto casting still takes precedence. I will never understand c++.

Removing the casting might have many side effects which is too big risk for this patch, so I move back to the original code for now, and we can try it later.

Result ParseValueTypeList(
TypeVector* out_type_list,
std::unordered_map<uint32_t, std::string>* type_names);
std::unordered_map<uint32_t, Var>* type_names,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe should be renamed? type_vars?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All ...type_names are renamed to ...type_vars.

out/test/typecheck/missing-ref.txt:5:24: error: Unknown reference: $first_type
(func $f (param (ref $first_type))
^^^^^^^^^^^
out/test/typecheck/missing-ref.txt:6:25: error: Unknown reference: $second_type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the errors that come from resolve-names.cc It looks like this should be undefined reference type variable \"%s\"".. or something like that?

Copy link
Contributor Author

@zherczeg zherczeg Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to undefined reference type name %s


void ResolveBlockDeclaration(const Location& loc, BlockDeclaration* decl) {
ResolveTypeNames(*module_, decl);
ResolveTypeNames(*module_, decl->sig, errors_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this different to the code in src/wast-parser.cc that does name resolving?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand this question. ResolveTypeNames is reworked to support error throwing.

@zherczeg zherczeg force-pushed the ref_fix branch 2 times, most recently from 1207345 to a4e230c Compare March 26, 2025 18:43
out/test/typecheck/missing-ref.txt:5:24: error: undefined reference type name $first_type
(func $f (param (ref $first_type))
^^^^^^^^^^^
out/test/typecheck/missing-ref.txt:6:25: error: undefined reference type name $second_type
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note: the error is thrown twice. The reason is that the references are resolved as func type and as a function signature. I don't think it is a problem, but we could fix it (could be later) by removing the items of param_type_vars / result_type_vars, if somebody does not like it.

@zherczeg zherczeg force-pushed the ref_fix branch 3 times, most recently from 946e527 to 74078be Compare March 27, 2025 05:12
@zherczeg
Copy link
Contributor Author

Did some code improvements. The unordered_map is replaced by a vector, because it is a simpler structure. Furthermore, when a reference is resolved, its index is set to kInvalidIndex to avoid duplicated resolves.

@zherczeg
Copy link
Contributor Author

zherczeg commented Apr 3, 2025

@SoniEx2 could you check this patch? It contains various reference related bugfixes.

ReferenceVar(uint32_t index, Var var)
: index(index), var(var) {}

uint32_t index;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the meaning of this index?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Location of the reference.

Example: (param i32 (ref $a) i64 (ref 0) f32 (ref $b)), then two ReferenceVar is created: (1, $a) and (5, $b). The index is 1 and 5.

Shall I add a comment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this whole class is only needed/used during parsing? Should we add a TODO to move this into the parser itself in the future?

@zherczeg
Copy link
Contributor Author

zherczeg commented Apr 4, 2025

Thank you for the review. Is there any other changes this patch needs?

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I assume that once other PRs in this chain land we can start to run some of the upstream tests instead of constructing our own.

Would it make sense to then remove the local tests that (I assume) duplicate upstream tests?

typedef std::vector<ReferenceVar> ReferenceVars;

ReferenceVars param_type_vars;
ReferenceVars result_type_vars;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess at some point it might make sense to move this parser-specific metadata out of the IR representation? But it seems like this change just adds to the existing pattern?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think yes. Yes, the param_type_names/result_type_names was there, I just changed them to another structure.

There are many text parser related stuff in the code, since declaration order is not fixed in text parsing, sometimes things are declared after use. Maybe they could be hidden behind smart pointers, so they have less memory overhead when not used anymore.

Or use class inheritance, and reallocate the signature without parser data after parsing is completed.

Ok I will think about it.

// non-compressed local vector until the parsing is completed.
// After the locals are resolved, local_types are constructed
// from this vector, and then this vector is freed.
TypeVector local_type_list;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "when parsing is complete" refer to parsing the whole module, or just a given function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whole module unfortunately. I will update the comment.

(module
  (func $f (local (ref 0)) (local (ref $t)))
  (type $t (func))
)

The two locals are the same, and should be merged:

0000016: 01                                        ; local decl count
0000017: 02                                        ; local type count
0000018: 64                                        ; (ref 0)
0000019: 00                                        ; (ref 0)

But we don't know this until the type is parsed, which can be at the end of the module.

The new test/roundtrip/typed-func-refs.txt checks this merging.

@zherczeg
Copy link
Contributor Author

zherczeg commented Apr 8, 2025

Yes, the follow up patches use spec tests for testing the features.

However, we cannot do it now, because "Type::Reference" is not a function reference anymore.

In v8 engine, the const is assigned to kStructRefCode, and ref/refnull is defined as 0x64/0x63
https://chromium.googlesource.com/v8/v8/+/refs/heads/main/src/wasm/wasm-constants.h#46

https://github.com/WebAssembly/function-references/blob/main/proposals/function-references/Overview.md#types-1

Parse (ref index) form, not just (ref $name) form
Resolve references in types and locals
Display location when a named reference is not found
@zherczeg
Copy link
Contributor Author

zherczeg commented Apr 8, 2025

Thank you for the review. I have updated the comments. I will submit a follow-up patch for extracting parser structures out from IR, but it will be a non-trivial change.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @SoniEx2 WDYT?

@zherczeg
Copy link
Contributor Author

zherczeg commented Apr 8, 2025

Wait a bit please. #2581 may replace this patch.

@zherczeg zherczeg closed this Apr 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants