Skip to content

Conversation

@recursion-ninja
Copy link
Contributor

This PR contains a proof of concept for Unbox instances of "boxed" data-types by way of the newly added AsBoxedStrictly and AsBoxedLazily newtypes. This PR is intended to address issue #503.

The AsBoxedStrictly newtype imposes new strictness semantics on the underlying type, ensuring the wrapped value is always reduced to normal form. This invariant requires an NFData instance for the underlying type. Furthermore, it requires the use of a "smart constructor" which evaluates the wrapped value to normal form, since a new-type cannot have strictness annotations. Hence there are makeBoxedStrictly and grabBoxedStrictly exposed from the Data.Vector.Unboxed module to construct and deconstruct AsBoxedStrictly values, respectively.

The AsBoxedLazily new-type is much simpler since it preserves the strictness semantics of the underlying data-type unaltered.

Here are two example usages of the newtypes as a proof of concept:

Strict usage

data Foo a = Foo Int a
  deriving (Eq, Ord, Show)

instance NFData a => VU.IsoUnbox (Foo a) (Int, AsBoxedStrictly a) where
  toURepr (Foo i a) = (i, makeBoxedStrictly a)
  fromURepr (i, a) = Foo i $ grabBoxedStrictly a
  {-# INLINE toURepr #-}
  {-# INLINE fromURepr #-}
  
newtype instance VU.MVector s (Foo a) = MV_Foo (VU.MVector s (Int, AsBoxedStrictly a))

newtype instance VU.Vector    (Foo a) = V_Foo  (VU.Vector    (Int, AsBoxedStrictly a))

deriving via (Foo a `VU.As` (Int, AsBoxedStrictly a)) instance NFData a => VGM.MVector VUM.MVector (Foo a)

deriving via (Foo a `VU.As` (Int, AsBoxedStrictly a)) instance NFData a => VG.Vector   VU.Vector   (Foo a)

instance NFData a => VU.Unbox (Foo a)

In GHCi

import qualified Data.Vector.Unboxed as VU
VU.fromListN 3 [ Foo 4 "Hello", Foo 8 "there", Foo 16 "sailor" ]
[Foo 4 "Hello",Foo 8 "there",Foo 16 "sailor"]

Lazy usage

data Bar a = Bar Int a
  deriving (Eq, Ord, Show)

instance VU.IsoUnbox (Bar a) (Int, AsBoxedLazily a) where
  toURepr (Bar i a) = (i, AsBoxedLazily a)
  fromURepr (i, AsBoxedLazily a) = Bar i a
  {-# INLINE toURepr #-}
  {-# INLINE fromURepr #-}
  
newtype instance VU.MVector s (Bar a) = MV_Bar (VU.MVector s (Int, AsBoxedLazily a))

newtype instance VU.Vector    (Bar a) = V_Bar  (VU.Vector    (Int, AsBoxedLazily a))

deriving via (Bar a `VU.As` (Int, AsBoxedLazily a)) instance VGM.MVector VUM.MVector (Bar a)

deriving via (Bar a `VU.As` (Int, AsBoxedLazily a)) instance VG.Vector   VU.Vector   (Bar a)

instance VU.Unbox (Bar a)

In GHCi

import qualified Data.Vector.Unboxed as VU
VU.fromListN 3 [ Bar 3 "Bye", Bar 2 "for", Bar 1 "now" ]
[Bar 3 "Bye",Bar 2 "for",Bar 1 "now"]

Nota Bene

Since this is a proof of concept, all types and functions names are subject to change after the appropriate level of "bike-shedding" has occurred. Commentary is welcomed and encouraged.

@Shimuuar
Copy link
Contributor

I haven't looked at the code yet. CI is broken at the moment fix should be in #507

Copy link
Contributor

@Shimuuar Shimuuar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase on top current mater. CI is fixed there.

For strict and lazy vectors you simply need to pick Data.Vector.Strict/Data.Vector as a representation and coerce their methods same way as UnboxViaPrim do,

Variant which reduces values to NF is slightly more complicated. You can pick either strict or lazy vector as a representation. Former seems more logical. We evaluate value whenever it's stored in a vector. So following methods need custom code: basicUnsafeWrite, basicUnsafeReplicate, elemseq

Default implementation of baiscSet for boxed vectors basicSet is fine, it's defined in terms of basicUnsafeWrite

@recursion-ninja recursion-ninja force-pushed the unboxed-boxes branch 8 times, most recently from 54cf581 to e259872 Compare October 21, 2024 17:56
@Shimuuar
Copy link
Contributor

@recursion-ninja If you're trying to fix doctest it's easier to do so locally. Run in vector subdir.

$ cabal build all --write-ghc-environment-files=always
$ cabal run -- vector-doctest

@recursion-ninja
Copy link
Contributor Author

@recursion-ninja If you're trying to fix doctest it's easier to do so locally. Run in vector subdir.

$ cabal build all --write-ghc-environment-files=always
$ cabal run -- vector-doctest

Yes, I was. This is quite helpful.

@recursion-ninja recursion-ninja force-pushed the unboxed-boxes branch 5 times, most recently from f07e3d1 to 7de765b Compare October 21, 2024 23:14
@recursion-ninja
Copy link
Contributor Author

The test now pass, but I still need to do a pass over the documentation.

Comment on lines 1002 to 1006
basicUnsafeReplicate = coerce $ M.basicUnsafeReplicate @S.MVector @a
basicUnsafeRead = coerce $ M.basicUnsafeRead @S.MVector @a
basicUnsafeWrite = coerce $ M.basicUnsafeWrite @S.MVector @a
basicClear = coerce $ M.basicClear @S.MVector @a
basicSet = coerce $ M.basicSet @S.MVector @a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to ensure that value is reduced to NF before writing it to vector. There are 3 method that could write to vector: basicUnsafeReplicate, basicUnsafeWrite, basicSet. They need custom definition which reduces element to NF before passing it to corresponding method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this API insight. I have updated the PR forceing the values to normal form before being inserted into the vector.

@recursion-ninja
Copy link
Contributor Author

The test now pass, but I still need to do a pass over the documentation.

I have updated the documentation.

@Shimuuar I think the PR is ready for review, and and potentially merged if the package maintainers agree.

@Shimuuar
Copy link
Contributor

Other than comment above PR is good to go.

I'm not sure why CI is failing. It seems to have stuck for some reason and then job got cancelled.

@recursion-ninja
Copy link
Contributor Author

recursion-ninja commented Oct 28, 2024

Other than comment above PR is good to go.

@Shimuuar I addressed the comment above. Should be good to merge now.

@Shimuuar Shimuuar merged commit 89d7584 into haskell:master Oct 29, 2024
@Shimuuar
Copy link
Contributor

Merged. @recursion-ninja Thank you!

@recursion-ninja recursion-ninja deleted the unboxed-boxes branch November 2, 2024 12:19
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jan 29, 2025
# Changes in version 0.13.2.0

 * Strict boxed vector `Data.Vector.Strict` and `Data.Vector.Strict.Mutable` is
   added (#488). it ensures that all values in the vector are evaluated to WHNF.
 * `DoNotUnboxStrict`, `DoNotUnboxLazy`, and `DoNotUnboxNormalForm` wrapper are
   added for defining unbox instances for types that contain not unboxable fields.
   [#503](haskell/vector#506),
   [#508](haskell/vector#508)
 * `spanR` and `breakR` were added [#476](haskell/vector#476).
   They allow parsing vector from the right.
 * We had some improvements on `*.Mutable.{next,prev}Permutation{,By}`
   [#498](haskell/vector#498):
   * Add `*.Mutable.prevPermutation{,By}` and `*.Mutable.nextPermutationBy`
   * Improve time performance. We may now expect good specialization supported by inlining.
     The implementation has also been algorithmically updated: in the previous implementation
     the full enumeration of all the permutations of `[1..n]` took Omega(n*n!), but it now takes O(n!).
   * Add tests for `{next,prev}Permutation`
   * Add benchmarks for `{next,prev}Permutation`
 * Cabal >= 3.0 is now required for building package (#481).
 * `vector:benchmarks-O2` public sublibrary containing benchmarks is added (#481).
 * Type family `Mutable` provides instances for arrays from `primitive`.
 * Various documentation improvements.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants