Skip to content

SI-9691 BufferedIterator should expose an headOption#5277

Closed
ChristopherDavenport wants to merge 4 commits intoscala:2.12.xfrom
ChristopherDavenport:ticket/9691
Closed

SI-9691 BufferedIterator should expose an headOption#5277
ChristopherDavenport wants to merge 4 commits intoscala:2.12.xfrom
ChristopherDavenport:ticket/9691

Conversation

@ChristopherDavenport
Copy link
Contributor

While I am uncertain if this is the appropriate context. With a simple if block this can be achieved and breaks no tests that I can see. The hasNext feature allows us to be aware whether or not we need to return the value wrapped or whether to return a None object which should allow this trait to proliferate. I was worried about where this coincided with already exists. However it appears that it was using override already where the mix occurs at. As it is implemented in the trait it should not affect current code.

I am not sure if this is even an appropriate feature however I thought I could implement it so I wanted to try. Please let me know if anything needs to be done.

While I am uncertain if this is the appropriate context. With a simple if block this can be achieved and breaks no tests that I can see. The hasNext feature allows us to be aware whether or not we need to return the value wrapped or whether to return a None object which should allow this trait to proliferate. I was worried about where this coincided with already exists. However it appears that it was using override already where the mix occurs at. As it is implemented in the trait it should not affect current code.
@ChristopherDavenport
Copy link
Contributor Author

Also I'd like to say I'm sorry if I failed to follow any standards or procedures either here or on the issues site. This is my first time attempting to contribute so I'm rather unsure of myself.

@adriaanm adriaanm added the welcome hello new contributor! label Jul 13, 2016
@ChristopherDavenport
Copy link
Contributor Author

Well my first attempt was naive, and I was missing an obvious test for whether or not this would perform at the end of an Iterator. I am working on a more complete approach currently.

The hasNext method actually checks to see if the next value is nonEmpty, so rather than calling ahead in the sequence through a buffered itertaing we check if this exists, otherwise we generate the None. Test case now includes last value of an Iterator, the first, and an element past the end. The nonEmpty guarantees that head will no be called on a value unless it exists, buffering the call.
@ChristopherDavenport
Copy link
Contributor Author

Well this turns out to do the exact same validation check as before however it appropriately works in the test cases as I didn't realize I was advancing the iterator past where I was evaluating. Hopefully this is seen as the best of both worlds as the implementation is left up to the inheritor.

@som-snytt
Copy link
Contributor

Hi, iterators are deceptively tricky. You're in good hands with @szeiger .

You probably didn't intend this null handling:

scala> val b = List("hi", null, "world").iterator.buffered
b: scala.collection.BufferedIterator[String] = non-empty iterator

scala> if (b.nonEmpty) Option(b.head) else None
res0: Option[String] = Some(hi)

scala> b.next
res2: String = hi

scala> if (b.nonEmpty) Option(b.head) else None
res3: Option[String] = None                                      // Some(null)

scala> b.next
res4: String = null

scala> if (b.nonEmpty) Option(b.head) else None
res5: Option[String] = Some(world)

I don't know whether they'll want to add convenience API (which can be coded as extension methods). But I notice there's no way to ask a BufferedIterator whether the buffer is nonEmpty. I might want to process the next element, but only if it's buffered (perhaps because next incurs I/O). Not sure what you'd call it. Maybe def flush(): Option[A] which would also consume the element. Or nextBuffered. That would be a question for gitter.

if (this.nonEmpty) Option(head)
else None
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, it looks better as a one-liner using hasNext. And Some(head).

https://github.com/scala/scala/blob/v2.12.0-M5/src/library/scala/collection/TraversableLike.scala#L412

@szeiger
Copy link
Contributor

szeiger commented Jul 14, 2016

I don't understand the argument for calling nonEmpty instead of hasNext. nonEmpty is !isEmpty and isEmpty is !hasNext, so this only adds two layers of indirection with a chance to hit an inappropriately overwritten method. I think it's a safe assumption that all BufferedIterator implementations have an efficient hasNext. I can't imagine how you would implement it differently or why you would want to have an inefficient hasNext and then override isEmpty to be more efficient.

Agreed on the null handling. An Iterator may legitimately contain null, so you need to use Some(head) instead of Option(head). This would make a good test case.

@som-snytt I don't see how you could retrofit a method that checks whether the buffer is non-empty into BufferedIterator. It would require a new method that has to be implemented by all subclasses. I'm not convinced how useful it would be, either. You never really care about an abstract concept such as "has a buffered element". You care about "performs I/O" or "holds on to a large chunk of memory" but those are more specific effects that you wouldn't want to deal with in the collection library. This could be done in a subtype of BufferedIterator in an I/O library.

@ChristopherDavenport
Copy link
Contributor Author

I agree with you @szeiger . The options available don't really seem to make much sense. It belongs in GenTraversableLike, as it is. I tried hasNext realized a shortcoming at the end value of some Iterable implementations decided to use nonEmpty and did not realize that it was !hasNext until after I had posted the commit.

The problem is as posted the iterator makes no guarantees as to the buffer, nor is it ever truly safe to do so. The buffered iterator is free to advance past the end of the stream. As it advances the Iterator and stores the value, however if no value exists we receive the error. We see this in the buffered function. It is doubly unnecessary as it can become a Transversable via the toStream method which would offer this functionality.

Overall I am glad to have taken a shot at this, and learned a bit in the process but I am fairly certain that this not really the place. The buffer is unstable in both of the locations it is implemented. And traversable seems to make more sense, and it comforts me that my chosen implementation seems to mirrored it. Although I seem to have a lot more to learn about null in scala as my daily life in scala is so devoid of it.

Thank you for your time and I will continue to try to help out. Was my overall approach acceptable? Anything I should do differently on another issue?

@szeiger
Copy link
Contributor

szeiger commented Jul 14, 2016

What problems does hasNext have? I can't see anything wrong with if(hasNext) Some(head) else None. Do you have a test where it behaves incorrectly?

This commit applies allows nulls to be inside the Some as appropriate for an Iterator and switches from the isEmpty method which is !hasNext to hasNext.
@ChristopherDavenport
Copy link
Contributor Author

Well I believed incorrectly that I could implement a version where it would advance past the end of the list. Which is obviously untrue and a sign of my inexperience in this realm. I have added the test for null handling, and the code so that it uses hasNext.

@szeiger
Copy link
Contributor

szeiger commented Jul 14, 2016

I think this implementation should be fine. The new method still needs a scaladoc comment.

Testing Iterators is always tricky because of the (possibly intricate) mutable state. All your tests use a fresh Iterator.buffered implementation in its initial state. You may want to test that the behavior is also correct after hasNext and after next, and that calls to head and headOption are idempotent. For example, for a non-empty BufferedIterator, the following should hold:

val v1 = it.head
val v2 = it.headOption
val v3 = it.head
val v4 = it.headOption
v1 == v3 && v2 == v4 && Some(v1) == v2

@szeiger
Copy link
Contributor

szeiger commented Jul 14, 2016

Another consideration is binary compatibility. Unlike most projects in the Scala ecosystem which only guarantee backward compatibility between minor releases, Scala itself also guarantees forward compatibility. This means we cannot add new non-private methods in 2.12.x, so you need to target the 2.12.0 branch.

The new tests demonstrate it works exactly on the bounds described and that the state is not mutated by the advancement, as suggested by buffered. Head and Head Option are interchangeable on a value that exists when head is wrapped with Some.
@ChristopherDavenport
Copy link
Contributor Author

I have made the changes requested. I am uncertain what you mean about targeting for 2.12.0 vs 2.12.x, although I do understand the reason. Do I need to do something differently?

def head: A

/** Returns an option of the next element of an iterator without advancing beyond it.
* @return the next element of this iterator if it is has a next element
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an "is" too many here. Same in the next line.

@szeiger
Copy link
Contributor

szeiger commented Jul 15, 2016

You can see the target branch at the top of this PR. You selected it when opening the PR (or not, because 2.12.x is the current default). There's no way to change it after the fact so you need to close this PR and open a new one against 2.12.0. While you're at it, please squash all changes into a single commit.

@szeiger
Copy link
Contributor

szeiger commented Jul 15, 2016

Oh, sorry, I just noticed that we haven't branched yet, so 2.12.x is indeed the correct branch.

@ChristopherDavenport
Copy link
Contributor Author

Closing for New Squashed Commit

@szeiger
Copy link
Contributor

szeiger commented Jul 18, 2016

Just FYI, you don't need to create a new PR when you squash. Just do it in the original branch and git push -f it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

welcome hello new contributor!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants