`sequence/c` consumes generator elements
Credit to sorawee, on Discord
sequence/c, being an impersonator contract, seems to eagerly consume elements produced by generators. Note that program A skips 2. I assume this behavior reproduces across VMs and versions.
Program A
#lang racket
(require racket/generator)
(define x
(in-generator
(yield 1)
(yield 2)
(yield 3)))
(define/contract y (sequence/c integer?) x)
(sequence-ref y 0)
(sequence-ref y 0)
Output
1
3
Program B
#lang racket
(require racket/generator)
(define x
(in-generator
(yield 1)
(yield 2)
(yield 3)))
(define/contract y any/c x)
(sequence-ref y 0)
(sequence-ref y 0)
Output
1
2
This is what the contract system is doing, which is there the 1 and 3 come from. Is this wrong?
#lang racket
(require racket/generator)
(define s1
(in-generator
(yield 1)
(yield 2)
(yield 3)))
(define-values (more?.1 next.1) (sequence-generate s1))
(next.1)
(void (more?.1))
(define-values (more?.2 next.2) (sequence-generate s1))
(next.2)
It looks like the call to second more?.1 is somehow consuming the 2 even thought next.1 doesn't get called.
FWIW, in-producer works correctly so this may be a problem with generators
(define x
(in-producer
(let ([x 0])
(lambda ()
(set! x (add1 x))
x))))
(define/contract y (sequence/c integer?) x)
(sequence-ref x 0)
(sequence-ref x 0)
Hmm. If I use @rfindler's code with in-producer, it outputs incorrectly too. That is, sequence-generate seems to be incorrect for both generator and producer.
But sequence-ref on a contracted sequence seems to trigger the bug only for generator, but not producer.
The docs don't seem especially clear on whether or not multiple calls to sequence-generate is allowed and what invariants should be preserved across such calls. It may be that the contract system is wrong here, I am not sure.
Robby
Hmm. If I use @rfindler's code with
in-producer, it outputs incorrectly too. That is,sequence-generateseems to be incorrect for both generator and producer.But
sequence-refon a contracted sequence seems to trigger the bug only for generator, but not producer.
This is probably because of how in-producer is implemented and whether or not it is passing along the calls to that more? function. The contract system calls more? in response to a callback it gets. (So this is another place that might be declared the bug.)
Well, it looks like the difference I put above might be a red herring. Here's, stepping back, what the contract system is doing to get different answers out:
(define (get s)
(for/last ([v (in-values-sequence s)]
[i (in-range 1)])
v))
(define p1 (open-input-string "abcd"))
(get p1)
(get p1)
(define p2 (open-input-string "abcd"))
(define s2
(make-do-sequence
(λ ()
(define-values (more? next) (sequence-generate p2))
(values
(λ (idx) (next))
add1
0
(lambda (idx) (more?))
(lambda elems #t)
(lambda (idx . elems) #t)))))
(get s2)
(get s2)
It may be the case that these procedures inside the call to make-do-sequence are wrong; they are trying to be the same as doing nothing.
Also, I replaced sequence-ref with a call to get as in-values-sequence seems to be the real primitive and its code may be the place where these two sequences are getting treated differently.
Ah, I understand now what's wrong in the above examples with sequence-generate: more? should be called before get, and never after. If done in the opposite order, more? will become the next state's more?, which is not what we want.
(define (get-before-more s)
(define-values (more? get) (sequence-generate s))
(println (get))
(println (more?))
(define-values (more2? get2) (sequence-generate s))
(println (get2))
(println (more2?)))
(define (more-before-get s)
(define-values (more? get) (sequence-generate s))
(println (more?))
(println (get))
(define-values (more2? get2) (sequence-generate s))
(println (more2?))
(println (get2)))
(get-before-more (open-input-bytes (bytes 1 2 3)))
;; 1 -- get of 1
;; #t -- more? of 2
;; 3 -- get of 3
;; #f -- more? of the end
(more-before-get (open-input-bytes (bytes 1 2 3)))
;; #t -- more? of 1
;; 1 -- get of 1
;; #t -- more? of 2
;; 2 -- get of 2
So, we can probably fix @rfindler's latest code with the following?
(define (get s)
(for/last ([v (in-values-sequence s)]
[i (in-range 1)])
v))
(define p1 (open-input-string "abcd"))
(get p1)
(get p1)
(define p2 (open-input-string "abcd"))
(define s2
(make-do-sequence
(λ ()
(define-values (more? next) (sequence-generate p2))
(define more?-result (more?))
(values
(λ (idx) (next))
add1
0
(lambda (idx) more?-result)
(lambda elems #t)
(lambda (idx . elems) #t)))))
(get s2)
(get s2)
It's a bit concerning though that make-do-sequence calls (λ (idx) (next)) and (lambda (idx) more?-result) in a weird order. That looks like a bug to me.
Re make-do-sequence:
(define (get s)
(for/last ([v (in-values-sequence s)]
[i (in-range 1)])
v))
(define p2 (open-input-string "abcd"))
(define s2
(make-do-sequence
(λ ()
(define-values (more? next) (sequence-generate p2))
(values
(λ (idx)
(println 1111)
(next))
add1
0
(lambda (idx)
(println 2222)
(more?))
(lambda elems #t)
(lambda (idx . elems) #t)))))
(get s2)
(get s2)
produces the following trace:
2222
1111
2222
'(97)
2222
1111
2222
'(99)
I had an offline discussion with @mflatt and one of the things that was clarified to me is that more? is allowed to consume elements. This leads me to think that the fundamental problem is that the underlying sequence support knows how certain, stateful sequences have run out (eg if the result is eof, then a port has run out) but that information isn't available via the sequence-generate result. Thus, there is no way to use make-do-sequence wrap an existing sequence in a way that preserves the behavior properly (without making a bunch of special cases along the lines of the code here).
It might work to generalize sequence-generate (eg introduce sequence-generate*) so that next returns not just the value, but also a boolean indicating if that was the last value, or perhaps some other variant on that, and then it would be possible to call make-do-sequence in a way that doesn't require those extra calls to more?. (This is only a guess on how to generalize sequence-generate -- I didn't try it out!)
Stepping back, the thing that's really missing here is sequence-chaperone, as there are other problems that we cannot fix with the current strategy, namely that predicates (eg, input-port?) aren't going to be preserved (not to mention chaperone-of?). Implementing that, tho, seems like a lot of work and maybe we'd need the piece in the previous paragraph to fully implement it anyway, so maybe doing that for now is the right thing?
[ edit to add last line of first paragraph, roughly ]
PS: I'll try to push some changes to the docs that try to explicate what sequence initialization entails and what's legal and what's not legal. (These aren't, technically, new information, but it is hard to wrest that information out of the current text.)