racket icon indicating copy to clipboard operation
racket copied to clipboard

`sequence/c` consumes generator elements

Open zyrolasting opened this issue 4 years ago • 11 comments

Credit to sorawee, on Discord

sequence/c, being an impersonator contract, seems to eagerly consume elements produced by generators. Note that program A skips 2. I assume this behavior reproduces across VMs and versions.

Program A

#lang racket

(require racket/generator)

(define x
  (in-generator 
   (yield 1)
   (yield 2)
   (yield 3)))

(define/contract y (sequence/c integer?) x)
(sequence-ref y 0)
(sequence-ref y 0)

Output

1
3

Program B

#lang racket

(require racket/generator)

(define x
  (in-generator 
   (yield 1)
   (yield 2)
   (yield 3)))

(define/contract y any/c x)
(sequence-ref y 0)
(sequence-ref y 0)

Output

1
2

zyrolasting avatar Jan 30 '22 17:01 zyrolasting

This is what the contract system is doing, which is there the 1 and 3 come from. Is this wrong?

#lang racket
(require racket/generator)

(define s1
  (in-generator 
   (yield 1)
   (yield 2)
   (yield 3)))

(define-values (more?.1 next.1) (sequence-generate s1))
(next.1)
(void (more?.1))

(define-values (more?.2 next.2) (sequence-generate s1))
(next.2)

It looks like the call to second more?.1 is somehow consuming the 2 even thought next.1 doesn't get called.

rfindler avatar Jan 30 '22 21:01 rfindler

FWIW, in-producer works correctly so this may be a problem with generators

(define x
  (in-producer
   (let ([x 0])
     (lambda ()
       (set! x (add1 x))
       x))))

(define/contract y (sequence/c integer?) x)
(sequence-ref x 0)
(sequence-ref x 0)

samdphillips avatar Jan 31 '22 01:01 samdphillips

Hmm. If I use @rfindler's code with in-producer, it outputs incorrectly too. That is, sequence-generate seems to be incorrect for both generator and producer.

But sequence-ref on a contracted sequence seems to trigger the bug only for generator, but not producer.

sorawee avatar Feb 02 '22 11:02 sorawee

The docs don't seem especially clear on whether or not multiple calls to sequence-generate is allowed and what invariants should be preserved across such calls. It may be that the contract system is wrong here, I am not sure.

Robby

rfindler avatar Feb 02 '22 14:02 rfindler

Hmm. If I use @rfindler's code with in-producer, it outputs incorrectly too. That is, sequence-generate seems to be incorrect for both generator and producer.

But sequence-ref on a contracted sequence seems to trigger the bug only for generator, but not producer.

This is probably because of how in-producer is implemented and whether or not it is passing along the calls to that more? function. The contract system calls more? in response to a callback it gets. (So this is another place that might be declared the bug.)

rfindler avatar Feb 04 '22 14:02 rfindler

Well, it looks like the difference I put above might be a red herring. Here's, stepping back, what the contract system is doing to get different answers out:

(define (get s)
  (for/last ([v (in-values-sequence s)]
             [i (in-range 1)])
    v))

(define p1 (open-input-string "abcd"))
(get p1)
(get p1)

(define p2 (open-input-string "abcd"))
(define s2
  (make-do-sequence
   (λ ()
     (define-values (more? next) (sequence-generate p2))
     (values
      (λ (idx) (next))
      add1
      0
      (lambda (idx) (more?))
      (lambda elems #t)
      (lambda (idx . elems) #t)))))
(get s2)
(get s2)

It may be the case that these procedures inside the call to make-do-sequence are wrong; they are trying to be the same as doing nothing.

Also, I replaced sequence-ref with a call to get as in-values-sequence seems to be the real primitive and its code may be the place where these two sequences are getting treated differently.

rfindler avatar Feb 04 '22 23:02 rfindler

Ah, I understand now what's wrong in the above examples with sequence-generate: more? should be called before get, and never after. If done in the opposite order, more? will become the next state's more?, which is not what we want.

(define (get-before-more s)
  (define-values (more? get) (sequence-generate s))
  (println (get))
  (println (more?))

  (define-values (more2? get2) (sequence-generate s))
  (println (get2))
  (println (more2?)))


(define (more-before-get s)
  (define-values (more? get) (sequence-generate s))
  (println (more?))
  (println (get))

  (define-values (more2? get2) (sequence-generate s))
  (println (more2?))
  (println (get2)))

(get-before-more (open-input-bytes (bytes 1 2 3)))
;; 1 -- get of 1
;; #t -- more? of 2
;; 3  -- get of 3
;; #f -- more? of the end
(more-before-get (open-input-bytes (bytes 1 2 3)))
;; #t -- more? of 1
;; 1 -- get of 1
;; #t  -- more? of 2
;; 2 -- get of 2

sorawee avatar Feb 05 '22 01:02 sorawee

So, we can probably fix @rfindler's latest code with the following?

(define (get s)
  (for/last ([v (in-values-sequence s)]
             [i (in-range 1)])
    v))

(define p1 (open-input-string "abcd"))
(get p1)
(get p1)

(define p2 (open-input-string "abcd"))
(define s2
  (make-do-sequence
   (λ ()
     (define-values (more? next) (sequence-generate p2))
     (define more?-result (more?))
     (values
      (λ (idx) (next))
      add1
      0
      (lambda (idx) more?-result)
      (lambda elems #t)
      (lambda (idx . elems) #t)))))
(get s2)
(get s2)

It's a bit concerning though that make-do-sequence calls (λ (idx) (next)) and (lambda (idx) more?-result) in a weird order. That looks like a bug to me.

sorawee avatar Feb 05 '22 01:02 sorawee

Re make-do-sequence:

(define (get s)
  (for/last ([v (in-values-sequence s)]
             [i (in-range 1)])
    v))

(define p2 (open-input-string "abcd"))
(define s2
  (make-do-sequence
   (λ ()
     (define-values (more? next) (sequence-generate p2))
     (values
      (λ (idx)
        (println 1111)
        (next))
      add1
      0
      (lambda (idx)
        (println 2222)
        (more?))
      (lambda elems #t)
      (lambda (idx . elems) #t)))))
(get s2)
(get s2)

produces the following trace:

2222
1111
2222
'(97)
2222
1111
2222
'(99)

sorawee avatar Feb 05 '22 01:02 sorawee

I had an offline discussion with @mflatt and one of the things that was clarified to me is that more? is allowed to consume elements. This leads me to think that the fundamental problem is that the underlying sequence support knows how certain, stateful sequences have run out (eg if the result is eof, then a port has run out) but that information isn't available via the sequence-generate result. Thus, there is no way to use make-do-sequence wrap an existing sequence in a way that preserves the behavior properly (without making a bunch of special cases along the lines of the code here).

It might work to generalize sequence-generate (eg introduce sequence-generate*) so that next returns not just the value, but also a boolean indicating if that was the last value, or perhaps some other variant on that, and then it would be possible to call make-do-sequence in a way that doesn't require those extra calls to more?. (This is only a guess on how to generalize sequence-generate -- I didn't try it out!)

Stepping back, the thing that's really missing here is sequence-chaperone, as there are other problems that we cannot fix with the current strategy, namely that predicates (eg, input-port?) aren't going to be preserved (not to mention chaperone-of?). Implementing that, tho, seems like a lot of work and maybe we'd need the piece in the previous paragraph to fully implement it anyway, so maybe doing that for now is the right thing?

[ edit to add last line of first paragraph, roughly ]

rfindler avatar Feb 05 '22 19:02 rfindler

PS: I'll try to push some changes to the docs that try to explicate what sequence initialization entails and what's legal and what's not legal. (These aren't, technically, new information, but it is hard to wrest that information out of the current text.)

rfindler avatar Feb 05 '22 19:02 rfindler