Skip to content

OP_REVERSEBYTES#358

Merged
Mengerian merged 12 commits intobitcoincashorg:masterfrom
EyeOfPython:master
Feb 18, 2020
Merged

OP_REVERSEBYTES#358
Mengerian merged 12 commits intobitcoincashorg:masterfrom
EyeOfPython:master

Conversation

@EyeOfPython
Copy link
Copy Markdown
Contributor

@EyeOfPython EyeOfPython commented May 29, 2019

A proposal to add OP_REVERSEBYTES to Script

@markblundeberg
Copy link
Copy Markdown
Collaborator

This is a great idea! Though of course, whether it comes in November will depend a lot if there is the dev bandwidth to actually make this consensus change. (Though the change itself is almost trivial, it requires activation logic + lots of testing.)

@dskloet
Copy link
Copy Markdown

dskloet commented May 29, 2019

Why doesn't SLP use little endian instead?


* Covenants and looping scripts usually take the script code of the preimage [9] as input, which means every operation counts twice: Once for the stack item containing the script code, and once for the P2SH script stack item [10]. In the example above, this would save 32 bytes per conversion, and if there's, say, three of those conversions in a script, it would already amount to 96 bytes - a non-trivial amount of bytes for a transaction.
* The cognitive load of developing scripts using the larger snippet above is increased unnecessarily. Developing scripts, by hand or by using tools such as macros or Spedn, already puts a lot of cognitive load on developers, and errors can be devastating to the community. A prominent example of such a failure is the contentious hard-fork on the Ethereum blockchain that was caused by a bug in The DAO smart contract.
* The first version assumes that Script uses 32-byte numbers, however, once 64-bit numbers are implemented, the script fails when numbers that do not fit in 32-bits are used [11], and has to be updated. The second version will work with up to 64-bit numbers.
Copy link
Copy Markdown
Contributor

@BigBlockIfTrue BigBlockIfTrue May 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* The first version assumes that Script uses 32-byte numbers, however, once 64-bit numbers are implemented, the script fails when numbers that do not fit in 32-bits are used [11], and has to be updated. The second version will work with up to 64-bit numbers.
* The first version assumes that Script uses 32-byte numbers, however, once 64-bit numbers are implemented, the script fails when numbers that do not fit in 32-bits are used [11], and has to be updated. The second version will work with up to 64-bit numbers. OP_REVERSE itself has no limit on the length of the byte vector it reverses.

Clarification.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rewritten the rationale to be less SLP-centered, do you think we could add this clarification somewhere else?

* The cognitive load of developing scripts using the larger snippet above is increased unnecessarily. Developing scripts, by hand or by using tools such as macros or Spedn, already puts a lot of cognitive load on developers, and errors can be devastating to the community. A prominent example of such a failure is the contentious hard-fork on the Ethereum blockchain that was caused by a bug in The DAO smart contract.
* The first version assumes that Script uses 32-byte numbers, however, once 64-bit numbers are implemented, the script fails when numbers that do not fit in 32-bits are used [11], and has to be updated. The second version will work with up to 64-bit numbers.

Further, there's likely many additional use cases beside encoding SLP values, as many protocols outside of Bitcoin use big endian numbers, and all of them would benefit from this opcode. Also, it can be used to check palindromes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Further, there's likely many additional use cases beside encoding SLP values, as many protocols outside of Bitcoin use big endian numbers, and all of them would benefit from this opcode. Also, it can be used to check palindromes.
Further, there's likely many additional use cases beside encoding SLP values, as many protocols outside of Bitcoin use big endian numbers, and all of them would benefit from this opcode. It can also be used to check palindromes. Besides number encoding, OP_REVERSE may be useful for manipulation of non-numeric data.

Copy link
Copy Markdown
Contributor

@BigBlockIfTrue BigBlockIfTrue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. I included some suggestions for minor changes.

@markblundeberg
Copy link
Copy Markdown
Collaborator

I posted off-github:

Can you emphasize more on this point — "Further, there's likely many additional use cases beside encoding SLP values, as many protocols outside of Bitcoin use big endian numbers, and all of them would benefit from this opcode.".
Complex SLP covenants on their own don't seem like sufficient justification and the way you write the document seems very single-purpose to that one goal."

@dskloet
Copy link
Copy Markdown

dskloet commented May 29, 2019

Specifically: why wouldn't the big-endian to little-endian conversion when interacting with "protocols outside of Bitcoin" happen off-chain?

@EyeOfPython
Copy link
Copy Markdown
Contributor Author

I posted off-github:

Can you emphasize more on this point — "Further, there's likely many additional use cases beside encoding SLP values, as many protocols outside of Bitcoin use big endian numbers, and all of them would benefit from this opcode.".
Complex SLP covenants on their own don't seem like sufficient justification and the way you write the document seems very single-purpose to that one goal."

I've rewritten the rationale to be non-SLP specific. It now focuses mostly on why encoding numbers in big-endian is desireable.

* Adds more basic unit tests
* Adds unit test checking whether reversing twice returns the original value.
* Covers both even and odd lengths of the byte vector.

Co-Authored-By: BigBlockIfTrue <32314686+BigBlockIfTrue@users.noreply.github.com>
@EyeOfPython
Copy link
Copy Markdown
Contributor Author

@markblundeberg Thoughts on the updated version? c26607c

@markblundeberg
Copy link
Copy Markdown
Collaborator

Hmm interesting, though it still feels like most protocols can just adapt and provide little endian numbers. Do you see any use case besides endianness conversions?

@EyeOfPython
Copy link
Copy Markdown
Contributor Author

EyeOfPython commented Jun 1, 2019

Do you see any use case besides endianness conversions?

There aren't many; but given the crazy things some C developers came up with for optimizing their code (like fast inverse square root), there's probably some real-life function that corresponds to reversing the bytes of an integer.

Otherwise, on the top of my head, I came up with these additional use cases, which are just versions of already possible things, but requiring less fewer bytes:

Simple proof-of-work check:

Without OP_REVERSE:

<proof-of-work>
<target bytes>
OP_SPLIT
0
<target bytes>
OP_NUM2BIN
OP_EQUALVERIFY

6 bytes, checks number of leading zeros.

With OP_REVERSE:

<proof-of-work>
<target bytes>
OP_SPLIT
OP_DUP
OP_REVERSE
OP_EQUALVERIFY

5 bytes, checks whether the N leading bytes are a palindrome (<target bytes> must be twice as large as in the previous script for the same difficulty).

Pushing a large number (only relevant once we've got 64-bit or higher numbers):

Without OP_REVERSE:

0x00 00 00 00  00 00 00 01 (=72057594037927936)

8 bytes

With OP_REVERSE:

1
8
OP_NUM2BIN
OP_REVERSE

4 bytes

With 256-bit numbers, the first one would take 32 bytes whereas the second one would take 5 bytes.

I know, the benefit is either niche or not really significant, but it is a possible use case.

@EyeOfPython
Copy link
Copy Markdown
Contributor Author

most protocols can just adapt and provide little endian numbers

That's true, but not all protocols can just adapt, which is not possible for SLP tokens.

One more point on endianness:

Most services (explorer.bitcoin.com/rest.bitcoin.com, Bitdb, etc.) and protocols (most notably, SLP tokens) encode hashes (e.g. transaction hashes) in big endian. However, OP_SHA256 and OP_HASH256 encode hashes in little endian. If a script would calculate the hash of some external transaction and compare it to an input, or match the input against the outpoint, that input would have to be encoded in a little endian hash, making interaction with e.g. bitcoin.com's REST API and Bitdb very cumbersome.

Of course, it's possible to add a switch/function to those services, but that will make interaction unnecessarily confusing and increase the congnitive load needlessly.

At the end, it's mostly a question of goals: Should things be easy to do or is it already sufficient if they're just possible? For example, should we re-enable OP_MUL? As far as I'm able to tell, developers can just rearrange the terms of the formula and push products as inputs instead of factors.

Similar with the already re-enabled OP_NUM2BIN and OP_BIN2NUM: developers could just push numbers in the required format (minimal encoding/N-byte encoding), and if conversion is necessary, they could just use OP_SPLIT and OP_CAT.

However, it seems many would eventually like to see OP_MUL re-enabled, and OP_NUM2BIN and OP_BIN2NUM already has been, not because it offers new functionality, but because it makes interaction with Bitcoin Cash and Script easier, more approachable and, especially, safer. The same holds for OP_REVERSE: While technically many protocols can adapt (even the SLP protocol could be changed to support both big-endian and little-endian), it would require, at best (e.g. for a price oracle), just switching out a function call, worse (e.g. for interacting with Bitdb/rest.bitcoin.com) complicated switches/functions, or at worst (like for the SLP token) an immense effort to upgrade all participants to a new protocol.

@EyeOfPython
Copy link
Copy Markdown
Contributor Author

I think there's better ways to do something like OP_REVERSE, so I'm closing the PR.

@EyeOfPython EyeOfPython closed this Jun 8, 2019
@EyeOfPython EyeOfPython reopened this Dec 14, 2019
@Mengerian
Copy link
Copy Markdown
Contributor

Amaury (@deadalnix ) has suggested changing the name to "BSWAP", to match the similar instruction in other languages:

https://docs.rs/bswap/1.0.0/bswap/

https://www.npmjs.com/package/bswap

@EyeOfPython EyeOfPython dismissed a stale review via dc71460 January 8, 2020 00:05
@fpelliccioni
Copy link
Copy Markdown

I agree with the renaming of OP_REVERSE -> OP_BSWAP

BSWAP is a x86 instruction so is a common name.
Another name could be OP_ENDIAN_REVERSE, as in Boost: https://www.boost.org/doc/libs/1_63_0/libs/endian/doc/conversion.html#endian_reverse

@BigBlockIfTrue
Copy link
Copy Markdown
Contributor

Sorry to crash the renaming party, but I'd like to propose OP_REVERSEENDIAN, based on current naming conventions.

  1. No existing opcode name contains any underscore, dash, or space after the initial OP_, only letters and digits, not even when the name is composed from multiple words (e.g. OP_GREATERTHANOREQUAL), not even when the opcode is a contracted shorthand of two opcodes (e.g. OP_EQUALVERIFY, OP_1NEGATE).
  2. Existing opcodes place the verb before the noun (e.g. it's OP_CHECKSIG, OP_PUSHDATAn). While the full noun is 'endianness', abbreviating this to 'endian' seems acceptable, as existing opcode names also contain many abbreviations (e.g. 'sig' for 'signature', 'num' for 'number').

@joshmg
Copy link
Copy Markdown
Contributor

joshmg commented Jan 9, 2020

I agree with @BigBlockIfTrue . BSWAP's meaning is domain-knowledge dependent (if you're unfamiliar with x86 assembly then its name means nothing to you). There is little question about what OP_REVERSEENDIAN does. Furthermore, its naming convention is consistent to the other OP codes.

(That being said, the name of the opcode is fairly unimportant in the grand scheme of things, so hopefully not too much time will be invested in deciding its name.)

@EyeOfPython
Copy link
Copy Markdown
Contributor Author

oh true

@Mengerian
Copy link
Copy Markdown
Contributor

Sorry to prolong this bikeshedding, but I'll repeat what I said in the review of the ABC Diff for the implementation of this (https://reviews.bitcoinabc.org/D4871).

I'm not a fan of using the term "Endian" to me seems hardware specific, and usually implies some power-of-2 number of bytes. This opcode will be able to reverse bytes in strings of any length. "Endian" is a technical jargon term, and to me seems confusing.

I also think it's possible people may use this opcode for things other than endian conversion, since it can reverse the byte order for any arbitrary-length string.

I would suggest "OP_REVERSEBYTES". It fits with the criteria @BigBlockIfTrue mentioned, and it seems like just a straight description of what it does, without any jargon.

@EyeOfPython
Copy link
Copy Markdown
Contributor Author

Thanks everyone for the valuable feedback. As someone who writes Script by hand, having nice opcode names is not unimportant.

To put an end to the bikeshed, I‘ll use the opcode name OP_REVERSEBYTES now and if anyone wants to have a different name they‘re free to fork the code.

Copy link
Copy Markdown
Contributor

@ftrader ftrader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specification looks good but Unit Tests as enumerated seem insufficient.

At very least the size ranges of possible stack input data - from 1 byte to as many as are possible in data item that the operator should reverse - should be subjected to test.

Copy link
Copy Markdown
Contributor

@Mengerian Mengerian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you change the file name to match the other recent ones on the spec folder? it would be 2020-05-15-reversebytes.md or something like that.

Also maybe change the title of the PR to "OP_REVERSEBYTES"

@EyeOfPython EyeOfPython changed the title OP_REVERSE OP_REVERSEBYTES Feb 18, 2020
Copy link
Copy Markdown
Contributor

@Mengerian Mengerian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks Tobias.

@Mengerian Mengerian merged commit ecda76b into bitcoincashorg:master Feb 18, 2020
defi-scripter470po added a commit to defi-scripter470po/bitcoincash.org that referenced this pull request Sep 27, 2025
* Create op_reverse.md

* Update op_reverse.md

* Change opcode to 192 (0xc0)

see bitcoincashorg/bitcoincash.org#358 (comment) for a rationale

* Rewritten rationale

* Added and clarified unittests

* Adds more basic unit tests
* Adds unit test checking whether reversing twice returns the original value.
* Covers both even and odd lengths of the byte vector.

Co-Authored-By: BigBlockIfTrue <32314686+BigBlockIfTrue@users.noreply.github.com>

* Renamed to OP_BSWAP, changed opcode number, added activation date

* Fixed activation date in unit tests

* Renamed OP_BSWAP to OP_ENDIAN_REVERSE

* Rename OP_ENDIAN_REVERSE to OP_REVERSEENDIAN

* Changed OP_REVERSEENDIAN to OP_REVERSEBYTES

* Updated unittests and other minor changes

* Rename op_reversebytes.md to 2020-05-15-op_reversebytes.md

Co-authored-by: BigBlockIfTrue <32314686+BigBlockIfTrue@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants