Skip to content

Split get typed array methods in loader to safe and unsafe (view)#1047

Merged
dcodeIO merged 5 commits into
AssemblyScript:masterfrom
MaxGraey:improve-loader-typedarray-methods
Jan 10, 2020
Merged

Split get typed array methods in loader to safe and unsafe (view)#1047
dcodeIO merged 5 commits into
AssemblyScript:masterfrom
MaxGraey:improve-loader-typedarray-methods

Conversation

@MaxGraey

@MaxGraey MaxGraey commented Jan 5, 2020

Copy link
Copy Markdown
Member

Using __getUint8Array and other method lead to confusion because it require imeddiate usage of results before other allocs or deallocs. Overwise memory buffer will outdated and point to non-relevant data. Like:

const arrPtr = wasm.__allocArray(...);

const outPtr = wasm.callSomeMethod(arrPtr);
wasm.__release(outPtr);
return wasm.__getUint8Array(outPtr); // but this already outdated after `__release`

// or

const outPtr = wasm.callSomeMethod(arrPtr);
const ret = wasm.__getUint8Array(outPtr);
wasm.__release(outPtr);
return ret; // still outdated after `__release`

This PR make __getUint8Array and other methods safe by default by cloning and add new methods like __getUint8ArrayView which preserve previous dangers behaviour (without cloning)

@MaxGraey MaxGraey requested a review from dcodeIO January 5, 2020 00:30
@dcodeIO

dcodeIO commented Jan 5, 2020

Copy link
Copy Markdown
Member

The intended use here is something like

const arrPtr = wasm.__retain(wasm.__allocArray(...));
const outPtr = wasm.callSomeMethod(arrPtr);
const ret = wasm.__getUint8Array(outPtr);
wasm.__release(arrPtr);
return ret;
// if `outPtr` somehow references `arrPtr`, its RC remains >0 until `outPtr` is released

with outPtr expected to be released on the JS side once not needed anymore (after the return), so it can keep its references (potentially including arrPtr) alive until then.

Also iirc, the __getXXArray are merely special cases of __getArrayView which are all intended for minimal overhead instead of safety?

@MaxGraey

MaxGraey commented Jan 5, 2020

Copy link
Copy Markdown
Member Author
const arrPtr = wasm.__retain(wasm.__allocArray(...));
const outPtr = wasm.callSomeMethod(arrPtr);
const ret = wasm.__getUint8Array(outPtr);
wasm.__release(arrPtr);
return ret;

This produce the same result) __getUint8Array just create slice (typed view) to buffer in memory which mutate after wasm.__release

@dcodeIO

dcodeIO commented Jan 5, 2020

Copy link
Copy Markdown
Member

In the case that outPtr directly or indirectly retains a reference to arrPtr, this is what I'd expect:

const arrPtr = wasm.__retain(wasm.__allocArray(...)); // RC(arrPtr)=1
const outPtr = wasm.callSomeMethod(arrPtr); // RC(outPtr)=1 due to return, RC(arrPtr)=2
const ret = wasm.__getUint8Array(outPtr);
wasm.__release(arrPtr); // RC(arrPtr=1)
return ret;

// later

__release(outPtr); // RC(outPtr)=0, releasing its refs, then RC(arrPtr)=0

This is not so much about what method is called with outPtr, but outPtr being a return value of a function that is pre-retained for the caller to release, here JS. Unless outPtr is manually released, it should remain alive, incl. all the stuff it references. Or am I missing something?

@MaxGraey

MaxGraey commented Jan 5, 2020

Copy link
Copy Markdown
Member Author

This not works and still required slice (cloning). See this for example:
https://github.com/ChainSafe/as-sha256/blob/master/src/index.js#L53

Also @torch2424 came across to similar problem as I know

@dcodeIO

dcodeIO commented Jan 5, 2020

Copy link
Copy Markdown
Member

The example you linked doesn't work (in the modified form below) because

  const digestPointer = wasm.digest();
  const digest = wasm.__getUint8Array(digestPointer)/* .slice() */;
  wasm.__release(digestPointer);
  return digest;

prematurely releases the reference while digestPointer (which digest is a view onto) is still being used. This looks more like invalid use.

So far I was under the impression that not having to slice and instead keeping stuff alive is what we want there to achieve maximum perf.

@MaxGraey

MaxGraey commented Jan 5, 2020

Copy link
Copy Markdown
Member Author

ChainSafe/as-sha256#23 (comment)

In general it cause to problems so I prefer use safe but with little extra overhead instead by default instead potentially danger method which works right only partially in specific cases

@dcodeIO

dcodeIO commented Jan 5, 2020

Copy link
Copy Markdown
Member

Perhaps it would make sense to have two different API methods there

  • __getXXArrayView with the current behavior, matching how __getArrayView works, and
  • __getXXArray which slices, matching how __getArray works

Maybe it's just confusing naming the way it is?

@MaxGraey

MaxGraey commented Jan 5, 2020

Copy link
Copy Markdown
Member Author

Yeah, I agree with naming. Will rename __getXXArrayUnsafe to __getXXArrayView

@dcodeIO

dcodeIO commented Jan 5, 2020

Copy link
Copy Markdown
Member

Also note that if such a function copies by default, we lose mutability in both ways. Without copying, one can see changes and make changes, while with copying it's rather a readonly snapshot.

@dcodeIO

dcodeIO commented Jan 5, 2020

Copy link
Copy Markdown
Member

Speaking of naming, I also like the term unsafe in there to make this more clear. Or perhaps we can do something like getArrayCopy vs getArrayView. Hmm... instead of View, what's a good pre- or postfix for something that's observable, live, mutable (and will most likely explode when used)?

@MaxGraey

MaxGraey commented Jan 5, 2020

Copy link
Copy Markdown
Member Author

I prefer leave __getXXArray as is, similar to __getArray, __getString which also use copies. Also renaming __getXXArray to __getXXArrayCopy will break backward compatibility and makes people do refactoring

@MaxGraey MaxGraey changed the title Split get typed array methods in loader to safe and unsafe Split get typed array methods in loader to safe and unsafe (view) Jan 5, 2020
@dcodeIO

dcodeIO commented Jan 5, 2020

Copy link
Copy Markdown
Member

Would say we are breaking backwards compatibility anyway for everyone using the existing API to modify data in either way, since __getUint8Array etc. now copies but didn't before. Just suggesting that if we would rename something, then we should do it now :)

@MaxGraey

MaxGraey commented Jan 5, 2020

Copy link
Copy Markdown
Member Author

I see. But still think it better leave "as is". Btw comment notation emphasise that __getXXArray actually copy the data and I guess it's enough.

@MaxGraey

MaxGraey commented Jan 6, 2020

Copy link
Copy Markdown
Member Author

latest commit change cloning approach which up to 2x faster on Google Chrome but no performance difference on FF.

cloning

@dcodeIO dcodeIO merged commit b6d2771 into AssemblyScript:master Jan 10, 2020
@MaxGraey MaxGraey deleted the improve-loader-typedarray-methods branch January 10, 2020 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants