spvBinaryParse API deficient: client needs to handle endianness; not easily composable #1
Description
The spvBinaryParse API isn't quite right:
- It requires the client to translate endianness when the module being parsed is of a foreign endianness (e.g. I'm on a little-endian machine, but the module is big-endian). Frankly, most clients won't bother to write that code, since the world seems dominated by little-endian machines. Also, there is some subtlety to the translation: most values need to be endian-translated, but literal strings do not. This is complexity we don't need.
- Composability: The API is space-efficient: it does not copy the data in the underlying module. To get values from instructions, the client must have retained the base pointer to the module, and then index into it via the instruction offset and the operand offset. That's fine as far as it goes, but it isn't very composable. For example, suppose you want to chain multiple transforms together. The consumer at the end of the pipeline needs a base pointer somewhere to read values out of the instructions. But that means the last transform has to form a pointer relative to some array of words that was meaningful only at the beginning of the pipeline.
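To make the composability problem concrete, here is a minimal sketch of the offset-based access pattern described above (struct and field names are assumed for illustration, not the exact SPIRV-Tools definitions): every read of an operand needs the module's base pointer, which must be threaded through every stage of a pipeline.

```c
#include <stdint.h>

/* Hypothetical shapes mirroring the offset-based API described above:
   the parsed instruction records only word offsets, so reading an
   operand requires the module's original base pointer. */
typedef struct {
  uint32_t offset; /* operand's word offset within the instruction */
} parsed_operand_t;

typedef struct {
  uint32_t offset; /* instruction's word offset within the module */
  parsed_operand_t operand;
} parsed_instruction_t;

/* The client must keep module_base alive and pass it to every stage
   just to dereference an operand at the end of the chain. */
uint32_t read_operand(const uint32_t* module_base,
                      const parsed_instruction_t* inst) {
  return module_base[inst->offset + inst->operand.offset];
}
```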
I think we can fix both problems in a reasonable way, as follows:
The spv_parsed_instruction_t should gain a "words" member of type "const uint32_t*". It points to a native-endian sequence of SPIR-V words for the entire instruction. (So, for example, words[0] contains the combined word count and opcode.) As before, the contract is that the words array is only valid while the callback is executing; afterward it may be discarded or overwritten.
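A rough sketch of the proposed shape (struct and helper names are assumptions for illustration): since words[0] packs the word count in the high 16 bits and the opcode in the low 16 bits, per the SPIR-V physical layout, a client can decode an instruction from the words pointer alone.

```c
#include <stdint.h>

/* Sketch of the proposed change: the parsed instruction carries a
   pointer to its own native-endian words. Names are illustrative. */
typedef struct {
  const uint32_t* words; /* proposed member: entire instruction, native endian */
  uint16_t num_words;
} proposed_parsed_instruction_t;

/* words[0] low 16 bits: opcode (per the SPIR-V word layout). */
uint16_t instruction_opcode(const proposed_parsed_instruction_t* inst) {
  return (uint16_t)(inst->words[0] & 0xFFFFu);
}

/* words[0] high 16 bits: word count, including words[0] itself. */
uint16_t instruction_word_count(const proposed_parsed_instruction_t* inst) {
  return (uint16_t)(inst->words[0] >> 16);
}
```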
In the common case where the original module is in the machine's native endianness, no copying is required: the words pointer is just the original module's base pointer plus the instruction's offset. In the non-native-endian case there is some copying, but it's done by the parser, not the client. And since the lifetime of the data is restricted to the duration of the callback, we can reuse that space for the next instruction.
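The parser-side conversion for a foreign-endian module is just a per-word byte swap, sketched below (helper name assumed). As noted earlier, literal strings are the exception: their bytes are already in memory order and would be copied as-is rather than swapped.

```c
#include <stdint.h>

/* Reverse the four bytes of one SPIR-V word. The parser would apply
   this to every non-string word when the module's endianness differs
   from the host's, writing the result into a reusable buffer whose
   contents are valid only for the duration of the callback. */
static uint32_t swap_word(uint32_t w) {
  return (w >> 24) | ((w >> 8) & 0x0000FF00u) |
         ((w << 8) & 0x00FF0000u) | (w << 24);
}
```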
With the new design it becomes easy to build pipelines of analysis or transform steps in a space-efficient way. It should also be generally time-efficient, since the chain of callbacks will usually be working on the same chunk of instruction storage, which should stay hot in the cache.
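A toy sketch of such a pipeline (all names are illustrative, not the real API): each stage receives a self-contained, native-endian instruction and can hand it to the next stage directly, with no base pointer carried over from the start of the chain.

```c
#include <stdint.h>

/* Illustrative callback signature: a stage sees only the instruction's
   own native-endian words, never the module base pointer. */
typedef void (*stage_fn)(const uint32_t* words, uint16_t num_words);

static uint32_t last_seen_word0; /* what the final analysis stage observed */

/* Final consumer: reads values straight out of the instruction words. */
static void analyze(const uint32_t* words, uint16_t num_words) {
  (void)num_words;
  last_seen_word0 = words[0];
}

/* Middle stage: a real transform could rewrite words in place here,
   then forward the same chunk of storage down the chain. */
static void transform_then_analyze(const uint32_t* words, uint16_t num_words) {
  analyze(words, num_words);
}
```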