Conversation
This library was started by Joe Hendrix and Wojciech Nawrocki. It is crucial for software and hardware verification projects. We need it in `Std`. Moreover `lean-auto` project needs it.
Std/Data/BitVec/Basic.lean
Outdated
| Notation for bit vector literals. `i#n` is a shorthand for `BitVec.ofNat n i`. | ||
| TODO: is this a good notation? Do we need it? We can also write `(i : BitVec n)` | ||
| -/ | ||
| syntax:max num noWs "#" noWs num : term |
There was a problem hiding this comment.
| syntax:max num noWs "#" noWs num : term | |
| scoped syntax:max num noWs "#" noWs num : term |
There was a problem hiding this comment.
It seems narrowly useful if it only works for literals. I'd say we don't want it.
There was a problem hiding this comment.
It makes proof goals way more readable. Specially the ones coming from software/hardware verification.
There was a problem hiding this comment.
I agree with Scott, the only thing this does is add a type ascription so it seems more suited as either #n as a notation for Bitvec n (so (5 : #3) works), or letting downstream projects define the notation themselves as a local notation (it's a one-liner). I don't find the particular syntax very descriptive and would prefer the definition be moved closer to the use for readability.
There was a problem hiding this comment.
Happy to change the notation, but it would be great to have something very compact and working "out of the box" for people working on software/hardware verification.
There was a problem hiding this comment.
My gut feeling is that writing the bitwidth all the time will get old quick, especially in cases where the bitwidth is mostly fixed (e.g. everything is on 32 bit words). For sure frequently varied bitwidth definitely comes up too, especially in hardware verification, but I don't think it is the majority case, and inventing a new domain specific notation for type ascription seems like it would lead to more confusion and not less, unless this is a very standard notation in the trade (and my impression is that it is not). It might be good to do a survey of what VHDLs and such do for this situation.
Also, I don't see any reason this needs to be restricted to numeric literals, it could just be term:max instead of num.
There was a problem hiding this comment.
I think I am slightly in favor of the notation. This is probably orthogonal, but it'd be neat if there was a way to have access to the number of digits in a hex or binary literal (e.g., if we knew 0b01010 was a BitVec, then it would default to having 5 bits). Similar with hex digits having a multiple of the input 0x0010 is a BitVec 16. Cryptol does this, but also is predominantly about bitvector programming.
|
It might be worth mentioning that there is an open Mathlib PR which I believe was based on the same |
|
It might be better to get this into mathlib first, rather than having to copy things around if it is already under review there. Is there a representative project that is currently using this API? I would feel more comfortable merging a new API like this if I knew it was already meeting the needs of another project, and getting it in mathlib is one way to do that. |
|
@alexkeizer and @digama0 Bit vectors must be in the Std. They are essential for software and hardware verification, and support for bit vectors in |
Co-authored-by: Scott Morrison <scott.morrison@gmail.com>
To be clear, I don't contest this and would like us to have a bitvectors API in Std. My concern is about whether the API here is the right one, or whether we need to expose things in a different way or with different definitions (e.g. little endian vs big endian). My main tool for evaluating this is to look at usage of the API in a downstream project. It is relatively easier to merge PRs migrating from mathlib because they have already undergone some API battle testing in order to incorporate them in the library. How exactly is lean-auto blocked? (Not contesting the claim, asking for details.) Should it not be possible to simply have this module in lean-auto since dependencies of it will be able to use this? This is what aesop and mathlib have done in the past when it comes to merging things upstream: they keep a copy of the module and use that instead pending the merge. This practice also helps to get more data and refine the API. |
|
Lean-auto is not necessarily "blocked", but importing the current implementation of BitVec from Mathlib makes compilation and initialization unacceptably slow. @digama0 It seemed to me that there would be no problem to have Bitvec in Std immediately. But since it contradicts the method used to evaluate APIs, I'd be happy to just have this module in Lean-auto. |
|
I'm not trying to block this PR, I'm soliciting information on how it is used. Maybe there could be a test file or something, to demonstrate intended usage?
This is actually very valuable information, it indicates that the Mathlib BitVec API has issues (which I have also long suspected). It would be great to have a summary of how this API differs from it, what the problems were and why this implementation works better. |
|
Lean-auto just imports those Bitvec definitions, and matches their name against constant names in the input so that it can recognize them as Bitvec operations. The actual implementation of Bitvec has no impact on the performance of Lean-auto. What really matters to Lean-auto is the name of the Bitvec operations. @digama0 I think the slow compilation might not be caused by the specific Bitvec implementation. Generally, whenever I import files from mathlib, the compilation of lean-auto becomes significantly slower, so I've been avoiding importing anything from mathlib, and I have several modules copied from mathlib in Lean-auto (I didn't do the same to Bitvec because I thought it makes sense to have it in std). |
|
@PratherConid Could you provide a list of all definitions you need in the @digama0 I am going to be using the |
|
@leodemoura Yes, I only need the functions available on smt-lib 2.6 https://smtlib.cs.uiowa.edu/theories-FixedSizeBitVectors.shtml |
Std/Data/BitVec/Basic.lean
Outdated
|
|
||
| /-- | ||
| Extraction of bits `i` down to `j` from a bit vector of size `n` to yield a | ||
| new bitvector of size `i - j + 1` |
There was a problem hiding this comment.
In mathlib4 we were leaning towards an API that satisfied
| new bitvector of size `i - j + 1` | |
| new bitvector of size `i - j` |
As this saves some branching in shrink
There was a problem hiding this comment.
I imagine this does not align with SMTlib (and am aware it does not align with the convention in VHDL and Verilog), but I imagine it will be much more pleasant to prove things avout, and it aligns with Std.Range.
There was a problem hiding this comment.
@eric-wieser Thanks for the feedback. Yes, the goal is to align with SMTlib which is a standard in software and hardware verification. SMT solvers are widely used in industry, and they all use the SMTlib standard.
There was a problem hiding this comment.
How about defining extract in terms of extract' using Eric's convention?
There was a problem hiding this comment.
extract' is a good idea. Please feel free to add a commit with its implementation.
|
We are also looking to use bitvectors in our project. We started using the current bitvector API, but when the discussed refactor was proposed we stopped until it became clear which representation mathlib would use. I did not necessarily mean to suggest postponing this PR, I would be happy to have bitvectors in STD ASAP, I merely wanted to point out the existing efforts. The refactor also got discussed on Zulip (https://leanprover.zulipchat.com/#narrow/stream/287929-mathlib4/topic/Proposed.20refactor.20of.20Bitvec). |
|
@alexkeizer Thanks for sharing. The good news is that things will start moving really fast here in
We're lucky to have Joe Hendrix (@joehendrix) on board with us at Lean FRO. He's a pro when it comes to software verification and development. Joe's got our back in hitting our goals and will help us get those PRs merged super fast. |
The perf improvement will be in core. See: leanprover/lean4@de719c9
We now have the missing proofs `Nat.mod_le` and `Nat.div_le_self` in core. See: leanprover-community/batteries#286 (comment)
|
I pushed some changes. Summary:
I did not make any changes to the things under discussion (big endian, the |
Std/Data/BitVec/Basic.lean
Outdated
| mk :: | ||
| /-- Interpret a bitvector as a number less than `2^w`. | ||
| O(1), because we use `Fin` as the internal representation of a bitvector. -/ | ||
| val : Fin (2^w) |
There was a problem hiding this comment.
Given the multitude of possible representations, would calling mk/val as ofFin / toFin instead be reasonable? That would make them feel on equal footing with ofBits / toBits or similar.
|
@digama0 I personally think the comment changes and cleanups improve readability so thanks. One interesting thing is that most-significant being on the left is part of our Hindu Arabic numeral system -- even traditionally right-to-left languages have most-significant digits on the left. I don't want the bit order discussion to block this PR from being merged, so if you haven't been persuaded I am ok with avoiding committing to a bit order in this PR. |
|
As things stand currently, the only place where big endian has to be explicitly acknowledged is in the argument order for In general, dependent typing hell will definitely arise when working with this API when the bitwidths are not numeric literals, so we probably need to do something to help with that (although as usual it is difficult to predict in advance what that might be). One thing is that we probably want a |
|
I've pushed
|
Actually I was thinking you might want binary rendering. I can't imagine you would ever want decimal, that seems exceedingly unhelpful for understanding a bitvector value. But binary is a bit too verbose and is probably not suitable for more than 4-8 bits. You could have a delaborator option to control this. |
The signature is right, but I think these kind of functions are usually called |
* Introduce signed operations. * Rename unsigned operations to indicate unsigned nature. * Introduce shiftRight and complement integer bitwise operations. * Rename byte-order dependent operations to clarify endianess. * Add examples to documentation * Add basic test cases.
Std/Data/BitVec/Basic.lean
Outdated
| if t.msb then | ||
| if s.msb then | ||
| .udiv (.neg s) (.neg t) | ||
| else | ||
| .neg (.udiv s (.neg t)) | ||
| else | ||
| if s.msb then | ||
| .neg (.udiv (.neg s) t) | ||
| else | ||
| .udiv s t |
There was a problem hiding this comment.
| if t.msb then | |
| if s.msb then | |
| .udiv (.neg s) (.neg t) | |
| else | |
| .neg (.udiv s (.neg t)) | |
| else | |
| if s.msb then | |
| .neg (.udiv (.neg s) t) | |
| else | |
| .udiv s t | |
| match s.msb, t.msb with | |
| | true, true => .udiv (.neg s) (.neg t) | |
| | false, true => .neg (.udiv s (.neg t)) | |
| | true, false => .neg (.udiv (.neg s) t) | |
| | false, false => .udiv s t |
ditto for smt_sdiv, srem, smod
Co-authored-by: Mario Carneiro <di.gama@gmail.com>
|
Thanks @alexkeizer , @leodemoura , @joehendrix and everyone else for working on this. I think it looks really good now, and although there is still more to do ( |
We now have the missing proofs `Nat.mod_le` and `Nat.div_le_self` in core. See: leanprover-community/batteries#286 (comment)
This library was started by Joe Hendrix and Wojciech Nawrocki. It is crucial for software and hardware verification projects. We need it in `Std`. Moreover `lean-auto` project needs it. --------- Co-authored-by: Scott Morrison <scott.morrison@gmail.com> Co-authored-by: Mario Carneiro <di.gama@gmail.com> Co-authored-by: Joe Hendrix <joe@lean-fro.org>
This library was started by Joe Hendrix and Wojciech Nawrocki. It is crucial for software and hardware verification projects. We need it in
Std. Moreoverlean-autoproject needs it.