Dupable wars, episode IV: a new plan

## Background

I've been insisting, in the past few years, that the `Dupable` type class ought to:
- Support instance being defined in a `Data.Applicative` style
- Support generating n-tuples for any n.

To this effect, we've been using `dupV :: a %1 -> V n a` as our main type class method.

We realised long ago that the `Data.Applicative` instance for `dupV` is not super efficient, as it allocates intermediate arrays for each use of `(<*>)`. Which I have proposed in the past should be solved by using pull arrays instead (#208) and deducing `dupV` from the pull-array based duplication function.

Another proposal which has been floated is to specialised to `V 2`, which would be sufficient (https://github.com/tweag/linear-base/issues/151#issuecomment-690130887). I find my explanation, in this comment, that `V 2` is insufficient to be lacking: I can defined `dupV` in term of `dup2` (equivalently `dupV2`). I didn't explain why it is not a great idea.

## Movable -> Dupable

A typical instance of `Dupable` will deep-copy the data type. When I defined `dupV @n` in terms of `dup2`, I need to do `n-1` of these copies. This is often ok, but if I have a `Movable` instance, I can do a single copy:

```haskell
dupV x = case move x of
  Ur x' -> pure x')
```

It would be quite a pity to forbid ourselves from using this when we can.

## A streaming hot approach

I was thinking about the choice between non-mutable and mutable array for the implementation of V (#312, #341). And was starting to be a bit dismayed by the seemingly untamable zoology of twisted little data structures all alike. How can we keep the complexity low? Do we need a special, scoped, dup function for the mutable case (https://github.com/tweag/linear-base/pull/341#issuecomment-905344920).

Then I realised that I had made a pretty incorrect assumption there: that the only applicable multi-ary `Data.Applicative` was `V`. This is very not true: there are also (pull-polarity) streams. This would look something like this:

```haskell
data Stream a where
  Stream { state :: x, next :: x %1 -> (a, x), final :: x %1 -> a}

instance Data.Applicative Stream where
  pure a = Stream
    { state = Ur a
    , next = (\(Ur x) -> (x, Ur x))
    , final = unur }
  (fs <*> xs) = Stream
    { state = (state fs, state xs)
    , next = (\(sf, sx) -> case (next sf, next sx) of 
                           ((f, sf'), (x, sx')) -> (f x, (sf', sx')))
    , final = (\(sf, sx) -> final sf (final sx))}
```

(`Stream` is a variant of `AffineStream (Of a) Identity a` from the streaming sublibrary with a single occurrence of `a` (to be able to define instances), `Of` is linear in `a`, and which (crucially) cannot decide to end on its own)

We could define `dupS :: a %1 -> Stream a` as our main method for the `Dupable` type class. Note how `pure` gives us the `move`-based implementation.

We could then derive functions like `dupV` (or the mutable equivalent) as a derived concept. More generally, we can define

```haskell
elim :: KnownNat n => Fun n a b -> Stream a %1 -> b
```

(though the question of doing this efficiently is probably as tricky as in the case of `V.elim`, but it does give us elimination into `V`, and into `tuples` for instance).

This gives us a much more comprehensible `Dupable` type class too! And it becomes the responsibility of the various data types that we provide to provide a good interface to `Dupable`. Much more manageable.

## A moving touch

There is still something unfortunate though (which also holds the current `dupV`-based abstraction): if I use `<*>` on `pure` streams, then I lose the only-one-deep-copy property. Let's look at the instance for lists:

```haskell
instance Dupable a => Dupable [a] where
  dupS [] = pure []
  dupS (a:as) = (:) <&> dupS a <*> dupS as
```

If `a` is movable, then `[a]` is movable too. However, if each of the `a`-s is indeed deep-copied only once, the spine of the list is copied `n` times 🙁 .

I think that the right solution is to do some dynamic dispatch:

```haskell
data BetterStream a where
  Moved :: a -> BetterStream a
  Sequential :: Stream a %1 -> BetterStream a

instance Applicative BetterStream a where
  pure = Moved
  
  (Moved f) <*> (Moved x) = Moved (f x)
  fs <*> xs = (toStream fs) <*> (toStream xs)
  
toStream :: BetterStream a %1 -> Stream a
toStream (Moved x) = pure x
toStream (Sequential xs) = xs
```

We can similary defined `elim` for this type.

## Conclusion (kind of)

I don't feel super great to introduce two extra stream types on top of the existing two. I suppose `Stream` may want to stay internal while `BetterStream` is exposed (with a less silly name). A type of stream defined solely to be the output of `dupS` still does feel a tad unsatisfactory. Still, it's probably the best thing to do. I do wonder though if we don't want `BetterStream` to expose its constructors, so that `V` (and its mutable variant) can take advantage of it for a slightly faster build in the `Moved` case. Or maybe we have a clever `elim'` function with a type like

```haskell
elim' :: (a -> b) -> (Fun n a b) -> BetterStream a %1 -> b
```

This would avoid displaying one additional stream types. I don't quite know yet.

Anyway, this does look like the right way forward, doesn't it?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dupable wars, episode IV: a new plan #343

Background

Movable -> Dupable

A streaming hot approach

A moving touch

Conclusion (kind of)

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Dupable wars, episode IV: a new plan #343

Description

Background

Movable -> Dupable

A streaming hot approach

A moving touch

Conclusion (kind of)

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions