Seq.zip, Seq.map2 etc behave different from List.zip, List.map2, Array.zip etc w.r.t. raising for different lengths #14121

@abelbraaksma

Description

When you apply a pairwise operation such as Seq.zip to two collections, the behavior in FSharp.Core is defined by the implementation of Seq.map2, List.map2 and the like. That behavior differs between collection types, however.

  • Array.zip and List.zip throw an ArgumentException whenever the collections have differing lengths.
  • Seq.zip, on the other hand, doesn't: it silently stops at the end of the shorter sequence.

I doubt this behavior can be changed (backwards compatibility), but I do think it is a bug/oversight or whatchamacallit. I'm currently working on implementing and extending TaskSeq, based on @dsyme's original code from this repo and raised this as a question to myself: fsprojects/FSharp.Control.TaskSeq#32. Then I figured, let's broaden the discussion scope ;).

Repro steps

// this is fine
[1;2;3] |> Seq.zip ["one"; "two" ] |> Seq.toList

// this raises
[1;2;3] |> List.zip ["one"; "two" ]

// this raises too
[|1;2;3|] |> Array.zip [|"one"; "two"|]

Also, this is quite weird:

// returns true??
[1;2;3] |> Seq.forall2 (=) Seq.empty  // true
[1;2;3] |> Seq.forall2 (=) [1;2;3;4]  // true

[1;2;3] |> List.forall2 (=) []  // exception
[1;2;3] |> List.forall2 (=) [1;2;3;4]  // exception

Expected behavior

The same behavior for all collection types.

Actual behavior

Functions like Seq.map2, Seq.map3, Seq.mapi2 and Seq.zip do not raise an ArgumentException when the sizes are different. However, the implementations do advance both enumerators on every step: even when MoveNext on the first sequence returns false, the next item of the paired sequence is still read (see MapEnumerator code here). In other words, the information whether one or both sequences are exhausted is available.
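
To observe this, a small sketch can record which items are pulled from the second sequence while the first one is empty. It is not taken from FSharp.Core, and the names `pulled` and `second` are made up for illustration. What ends up in `pulled` depends on the FSharp.Core implementation in use, so no output is claimed here:

```fsharp
// Records every item that is actually pulled from `second`.
let pulled = ResizeArray<int>()

let second =
    seq {
        for x in [10; 20; 30] do
            pulled.Add x
            yield x
    }

// The first sequence is empty, so the zipped result is empty as well.
let zipped = Seq.zip Seq.empty<string> second |> Seq.toList

printfn "zipped: %A" zipped
printfn "pulled from second: %A" (List.ofSeq pulled)
```

If `pulled` is non-empty after the run, the second enumerator was advanced even though the first had already reported its end.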

Known workarounds

For lazily evaluated sequences, the only workaround is to "roll your own". Easy enough, but still. Alternatively, you could, of course, materialize the sequences into an eager collection like List or Array first.
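
A "roll your own" version could look roughly like the sketch below. `zipStrict` is a hypothetical helper, not part of FSharp.Core, and the choice of ArgumentException simply mirrors what List.zip and Array.zip do. Like `Seq.take`, it can only raise while the sequence is being iterated, not when it is created:

```fsharp
/// Hypothetical helper (not in FSharp.Core): like Seq.zip, but raises an
/// ArgumentException during iteration when the inputs differ in length.
let zipStrict (source1: seq<'T1>) (source2: seq<'T2>) : seq<'T1 * 'T2> =
    seq {
        use e1 = source1.GetEnumerator()
        use e2 = source2.GetEnumerator()

        let rec loop () =
            seq {
                // Advance both enumerators before deciding what to do.
                match e1.MoveNext(), e2.MoveNext() with
                | true, true ->
                    yield (e1.Current, e2.Current)
                    yield! loop ()
                | false, false -> () // both exhausted at the same time: done
                | _ -> invalidArg "source2" "The sequences have different lengths."
            }

        yield! loop ()
    }

// Equal lengths: behaves like Seq.zip.
let sameLength = zipStrict [1; 2] ["one"; "two"] |> Seq.toList

// Different lengths: raises once iteration reaches the mismatch.
let mismatchRaised =
    try
        zipStrict [1; 2; 3] ["one"; "two"] |> Seq.toList |> ignore
        false
    with :? System.ArgumentException -> true
```

Note that the pairs before the mismatch are still yielded to a lazy consumer. Code that must not see partial results would have to materialize the output first.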

Related information

I did try to find a motivation for this behavior in the source code and online, but failed to do so. There's certainly an argument to be made for not raising an exception, but then one would expect that to be the case for all collection types.

Perhaps there's something about lazy sequences that suggests not raising exceptions in general. But something like [1..3] |> Seq.take 4 raises (not immediately, but when iterating over the sequence). In other words, it does not seem to be a taboo.
