Rethinking Range

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 | Next >

Rethinking Range

by Martin Odersky :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

[Please follow up to scala-debate only]

The recent debate about Range gives food for thought. Razvan's
experiences are indeed troubling, and he's not the first to fall in
this trap. Range was part of the previous projection design, which I
never really liked. In the new 2.8 collections, every switch to
by-name evaluation is made explicit by a call to view. Except for
Range, that is. I felt I had to leave it as it is because it was used
a lot. But I feel increasingly worse about leaving that trap in the
Scala API.

So, what can we do? I believe the primary motivation to make Range
by-name was that filters would not evaluate eagerly. I.e. in

  for (x <- 1 to 10000000 if !isPrime(x)) ...

we do not want to construct a vector of all non-prime numbers. A
secondary motivation was that
in

  (for (x <- 1 to 100; y <- 1000) yield f(x, y)).toArray

We do not want to construct 100 vectors of 1000 elements each which
then get concatenated into the result array.

The primary motivation can be addressed by making filter on ranges
lazy, but have map and flatMap be strict. In new collection terms,
Range would be a immutable.VectorLike[Int, Range]. That means
operations like take, drop, filter will yield ranges again. But Range
would not have a builderFactory for itself. That means map, flatMap
and friends would yield immutable.Vectors, not VectorViews.

There's some hope to get a new immutable.Vector implementation which
would make concatenation relatively cheap. So, the secondary concern
(flatMapping nested ranges) might be less of a problem in the future
than it is now.

Technically, I believe this design is a winner. The question is
whether it is acceptable to silently change some behavior in user
code. The behavior that would change is

 - if you do side effects in a filter or yield of a for-expression
over a range. These side effects will now take place at a different
time (arguably a more natural time, possibly this fixes even some
bugs)

 - if you do combinatorial space exploration over ranges. There you
might see different complexity.

In both cases one could get back the old behavior by turning a range
into a view explicitly, like this:

   for (x <- 1 to 10000000 view) yield ....

It should be noted that we already accepted some of these breaking
changes. For instance slice used to produce a projection but now
(i.e.in 2.8) produces a regular collection. It should also be noted
that for simple for-loops (without a yield or filter), the behavior
would stay the same in any case.

Cheers

 -- Martin

Re: Rethinking Range

by Seth Tisue :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>>>>> "martin" == martin odersky <martin.odersky@...> writes:

 martin> Technically, I believe this design is a winner. The question is
 martin> whether it is acceptable to silently change some behavior in
 martin> user code.

Now's the time!  There are already so many sweeping changes to
collections in 2.8, it seems to me that one more will hardly hurt.

--
Seth Tisue @ Northwestern University / http://tisue.net
lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/

Re: Rethinking Range

by Ismael Juma :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Seth Tisue <seth <at> tisue.net> writes:
>  martin> Technically, I believe this design is a winner. The question is
>  martin> whether it is acceptable to silently change some behavior in
>  martin> user code.
>
> Now's the time!  There are already so many sweeping changes to
> collections in 2.8, it seems to me that one more will hardly hurt.

Indeed. This design seems to choose the right compromise for the to/until case.
People who have more specific requirements can use a different collection.

Best,
Ismael  





Re: Rethinking Range

by Eric J. Christeson-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 10/15/09 04:50, martin odersky wrote:
> Technically, I believe this design is a winner. The question is
> whether it is acceptable to silently change some behavior in user
> code. The behavior that would change is
>    
+1

--
Eric J. Christeson
Enterprise Application Development
North Dakota State University
Fargo, North Dakota, USA


Re: Rethinking Range

by Razvan Cojocaru :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Martin Odersky wrote:
...
 - if you do side effects in a filter or yield of a for-expression
over a range. These side effects will now take place at a different
time (arguably a more natural time, possibly this fixes even some
bugs)
...
   for (x <- 1 to 10000000 view) yield ....
...
It should also be noted
that for simple for-loops (without a yield or filter), the behavior
would stay the same in any case.
There may be other "dragons" to be slayed by newbies (like that Iterator) - that's why I was thinking that having a different version of "yield" which makes the behaviour in all situations determinate, would be a more generic and less impacting solution.

   for (x <- 1 to 10000000) yieldnow ....

which would most likely trigger an OutOfMemoryException() :)

...there, I broke my word and mentioned it again...oh, well.

 Cheers,
Razvan

Re: Rethinking Range

by Razvan Cojocaru :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Sorry - missed this option, which is very close to Java: have a <- which is like you have it now AND add a : or << or something that would fix the result and make it deterministic

   for (x : 1 to 10000000 ) yield ....
   for (x << 1 to 10000000 ) yield ....


Re: Rethinking Range

by Meredith Gregory :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Dear Razvan,

Actually, these suggestions have merit. Strict/lazy is usefully considered orthogonal to the rest of the for-sugar.

Best wishes,

--greg

On Thu, Oct 15, 2009 at 10:17 AM, Razvan Cojocaru <razvanc99@...> wrote:

Sorry - missed this option, which is very close to Java: have a <- which is
like you have it now AND add a : or << or something that would fix the
result and make it deterministic

  for (x : 1 to 10000000 ) yield ....
  for (x << 1 to 10000000 ) yield ....


--
View this message in context: http://www.nabble.com/Rethinking-Range-tp25905818p25912589.html
Sent from the Scala - Debate mailing list archive at Nabble.com.




--
L.G. Meredith
Managing Partner
Biosimilarity LLC
1219 NW 83rd St
Seattle, WA 98117

+1 206.650.3740

http://biosimilarity.blogspot.com

Re: Rethinking Range

by Daniel Sobral :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

It will continue to shoot people in the foot if left as is. Is there any question of that?
 
I vote to make it strict, and add a non-strict "view" to it.
 
Also, be reminded that Range is already different from 2.7. We had at least one person come here and complain that Range.Inclusive is gone, and it is quite possible that others will feel that bite as well. Let's make it a single bite.
Having said that, making "filter" alone non-strict sounds interesting. We can rest assured that it wouldn't cause problems with for comprehensions, as they always apply map/flatMap anyway. Is it worth the risk, though?
On Thu, Oct 15, 2009 at 6:50 AM, martin odersky <martin.odersky@...> wrote:
[Please follow up to scala-debate only]

The recent debate about Range gives food for thought. Razvan's
experiences are indeed troubling, and he's not the first to fall in
this trap. Range was part of the previous projection design, which I
never really liked. In the new 2.8 collections, every switch to
by-name evaluation is made explicit by a call to view. Except for
Range, that is. I felt I had to leave it as it is because it was used
a lot. But I feel increasingly worse about leaving that trap in the
Scala API.

So, what can we do? I believe the primary motivation to make Range
by-name was that filters would not evaluate eagerly. I.e. in

 for (x <- 1 to 10000000 if !isPrime(x)) ...

we do not want to construct a vector of all non-prime numbers. A
secondary motivation was that
in

 (for (x <- 1 to 100; y <- 1000) yield f(x, y)).toArray

We do not want to construct 100 vectors of 1000 elements each which
then get concatenated into the result array.

The primary motivation can be addressed by making filter on ranges
lazy, but have map and flatMap be strict. In new collection terms,
Range would be a immutable.VectorLike[Int, Range]. That means
operations like take, drop, filter will yield ranges again. But Range
would not have a builderFactory for itself. That means map, flatMap
and friends would yield immutable.Vectors, not VectorViews.

There's some hope to get a new immutable.Vector implementation which
would make concatenation relatively cheap. So, the secondary concern
(flatMapping nested ranges) might be less of a problem in the future
than it is now.

Technically, I believe this design is a winner. The question is
whether it is acceptable to silently change some behavior in user
code. The behavior that would change is

 - if you do side effects in a filter or yield of a for-expression
over a range. These side effects will now take place at a different
time (arguably a more natural time, possibly this fixes even some
bugs)

 - if you do combinatorial space exploration over ranges. There you
might see different complexity.

In both cases one could get back the old behavior by turning a range
into a view explicitly, like this:

  for (x <- 1 to 10000000 view) yield ....

It should be noted that we already accepted some of these breaking
changes. For instance slice used to produce a projection but now
(i.e.in 2.8) produces a regular collection. It should also be noted
that for simple for-loops (without a yield or filter), the behavior
would stay the same in any case.

Cheers

 -- Martin



--
Daniel C. Sobral

Something I learned in academia: there are three kinds of academic reviews: review by name, review by reference and review by value.

Re: [scala-user] Re: Rethinking Range

by Seth Tisue :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>>>>> "Daniel" == Daniel Sobral <dcsobral@...> writes:

 Daniel> Having said that, making "filter" alone non-strict sounds
 Daniel> interesting. We can rest assured that it wouldn't cause
 Daniel> problems with for comprehensions, as they always apply
 Daniel> map/flatMap anyway. Is it worth the risk, though?  

What's the risk?

--
Seth Tisue @ Northwestern University / http://tisue.net
lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/


Re: Rethinking Range

by Miles Sabin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, Oct 15, 2009 at 10:50 AM, martin odersky <martin.odersky@...> wrote:
> The primary motivation can be addressed by making filter on ranges
> lazy, but have map and flatMap be strict.

To be honest I think this non-uniformity will cause as much grief as
it saves ... either all should be lazy or all should be strict.

I really like the idea (was it Razvan's?) of making Range strict like
the rest of the new collections and adding a non-strict arrow,

  for(i <~ 1 to 100 ; ...) ...

as sugar for,

  for(i <- 1 to 100 view ; ...) ...

or something like that. If nothing else it has the merit of being
applicable to all collections.

Cheers,


Miles

--
Miles Sabin
tel: +44 (0)7813 944 528
skype:  milessabin
http://www.chuusai.com/
http://twitter.com/milessabin

Re: [scala-user] Re: Rethinking Range

by Daniel Sobral :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

It violates the Principle Of Least Astonishment. It's unexpected, and, therefore, will produce code that silently does not what the programmer expects it to.

On Thu, Oct 15, 2009 at 4:14 PM, Seth Tisue <seth@...> wrote:
>>>>> "Daniel" == Daniel Sobral <dcsobral@...> writes:

 Daniel> Having said that, making "filter" alone non-strict sounds
 Daniel> interesting. We can rest assured that it wouldn't cause
 Daniel> problems with for comprehensions, as they always apply
 Daniel> map/flatMap anyway. Is it worth the risk, though?

What's the risk?

--
Seth Tisue @ Northwestern University / http://tisue.net
lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/




--
Daniel C. Sobral

Something I learned in academia: there are three kinds of academic reviews: review by name, review by reference and review by value.

Re: Rethinking Range

by Daniel Sobral :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I *love* the idea of having syntactic sugar for non-strictness (view) on for comprehensions. Of course, there's the small problem of coming up with a syntax for it. Choosing "<~" would break the parser, for example.
 
[scala-user removed as originally requested]
On Thu, Oct 15, 2009 at 4:17 PM, Miles Sabin <miles@...> wrote:
On Thu, Oct 15, 2009 at 10:50 AM, martin odersky <martin.odersky@...> wrote:
> The primary motivation can be addressed by making filter on ranges
> lazy, but have map and flatMap be strict.

To be honest I think this non-uniformity will cause as much grief as
it saves ... either all should be lazy or all should be strict.

I really like the idea (was it Razvan's?) of making Range strict like
the rest of the new collections and adding a non-strict arrow,

 for(i <~ 1 to 100 ; ...) ...

as sugar for,

 for(i <- 1 to 100 view ; ...) ...

or something like that. If nothing else it has the merit of being
applicable to all collections.

Cheers,


Miles

--
Miles Sabin
tel: +44 (0)7813 944 528
skype:  milessabin
http://www.chuusai.com/
http://twitter.com/milessabin



--
Daniel C. Sobral

Something I learned in academia: there are three kinds of academic reviews: review by name, review by reference and review by value.

Re: Rethinking Range

by phkoester :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> I vote to make it strict, and add a non-strict "view" to it.

The word ``strict'' isn't listed in the index of  ``Programming in
Scala." At the risk of posing a dumb question: What do you folks mean
when you say ``strict?"

---Ph.

Re: [scala-user] Re: Rethinking Range

by Seth Tisue :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>>>>> "Daniel" == Daniel Sobral <dcsobral@...> writes:

 Daniel> It violates the Principle Of Least Astonishment. It's
 Daniel> unexpected, and, therefore, will produce code that silently
 Daniel> does not what the programmer expects it to.

Under Martin's proposal, where Range.filter is lazy but Range.map and
Range.flatMap are strict, what is an example of code that will astonish
the programmer?

--
Seth Tisue @ Northwestern University / http://tisue.net
lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/

Re: [scala-user] Re: Rethinking Range

by Seth Tisue :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>>>>> "Miles" == Miles Sabin <miles@...> writes:

 >> The primary motivation can be addressed by making filter on ranges
 >> lazy, but have map and flatMap be strict.

 Miles> To be honest I think this non-uniformity will cause as much
 Miles> grief as it saves

How?

(the same question I'm asking Daniel...)

--
Seth Tisue @ Northwestern University / http://tisue.net
lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/

Re: [scala-user] Re: Rethinking Range

by Miles Sabin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, Oct 15, 2009 at 8:47 PM, Seth Tisue <seth@...> wrote:
>  Miles> To be honest I think this non-uniformity will cause as much
>  Miles> grief as it saves
>
> How?
>
> (the same question I'm asking Daniel...)

If map, flatMap and filter are differently strict then there's a
possibility of confusion which wasn't there when they all behaved the
same way.

Saying "all collections are strict, but you can obtain a lazy variant
by applying view" is nice and clear and unequivocal.

Cheers,


Miles

--
Miles Sabin
tel: +44 (0)7813 944 528
skype:  milessabin
http://www.chuusai.com/
http://twitter.com/milessabin

Re: [scala-user] Re: Rethinking Range

by Seth Tisue :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>>>>> "Miles" == Miles Sabin <miles@...> writes:

 Miles> If map, flatMap and filter are differently strict then there's a
 Miles> possibility of confusion which wasn't there when they all
 Miles> behaved the same way.

 Miles> Saying "all collections are strict, but you can obtain a lazy
 Miles> variant by applying view" is nice and clear and unequivocal.

Range seems sufficiently different from ordinary collections that I
suspect the possibility of confusion is merely theoretical.  That's why
I'm looking for examples of how any confusion would arise in practice
under Martin's proposal.

--
Seth Tisue @ Northwestern University / http://tisue.net
lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/

Re: [scala-user] Re: Rethinking Range

by Meredith Gregory :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Dear Miles,

i think that's why having separate sugar for strict/lazy that intermixes with the for-sugar is a really good way to go. The programmer declares her intention about the evaluation policy. She then can't claim surprise later.

Best wishes,

--greg

On Thu, Oct 15, 2009 at 12:54 PM, Miles Sabin <miles@...> wrote:
On Thu, Oct 15, 2009 at 8:47 PM, Seth Tisue <seth@...> wrote:
>  Miles> To be honest I think this non-uniformity will cause as much
>  Miles> grief as it saves
>
> How?
>
> (the same question I'm asking Daniel...)

If map, flatMap and filter are differently strict then there's a
possibility of confusion which wasn't there when they all behaved the
same way.

Saying "all collections are strict, but you can obtain a lazy variant
by applying view" is nice and clear and unequivocal.

Cheers,


Miles

--
Miles Sabin
tel: +44 (0)7813 944 528
skype:  milessabin
http://www.chuusai.com/
http://twitter.com/milessabin



--
L.G. Meredith
Managing Partner
Biosimilarity LLC
1219 NW 83rd St
Seattle, WA 98117

+1 206.650.3740

http://biosimilarity.blogspot.com

Re: Rethinking Range

by Jesper Nordenberg :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Philip Köster wrote:
>> I vote to make it strict, and add a non-strict "view" to it.
>
> The word ``strict'' isn't listed in the index of  ``Programming in
> Scala." At the risk of posing a dumb question: What do you folks mean
> when you say ``strict?"

Strict means that the result is generated as soon as possible. Lazy
means that it's generated as late as possible, possibly never.

A small example in Scala 2.7.5 (the first is a lazy collection, the
second strict):

scala> 1 to 3 filter (x => {println("Filter: " + x); x < 3}) map (x =>
{println("Map: " + x); x * 2})
Filter: 1
Map: 1
Filter: 2
Map: 2
Filter: 3
res1: Seq.Projection[Int] = RangeFM(2, 4)

scala> (1 to 3 toList) filter (x => {println("Filter: " + x); x < 3})
map (x => {println("Map: " + x); x * 2})
Filter: 1
Filter: 2
Filter: 3
Map: 1
Map: 2
res3: List[Int] = List(2, 4)

/Jesper Nordenberg


Re: [scala-user] Re: Rethinking Range

by Roland Kuhn-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Oct 15, 2009, at 21:54 , Miles Sabin wrote:

> On Thu, Oct 15, 2009 at 8:47 PM, Seth Tisue <seth@...> wrote:
>>  Miles> To be honest I think this non-uniformity will cause as much
>>  Miles> grief as it saves
>>
>> How?
>>
>> (the same question I'm asking Daniel...)
>
> If map, flatMap and filter are differently strict then there's a
> possibility of confusion which wasn't there when they all behaved the
> same way.
>
> Saying "all collections are strict, but you can obtain a lazy variant
> by applying view" is nice and clear and unequivocal.
>
Sure, read like this it's true. However, I submit that the average  
Scala programmer expects a guard in a for-expression to be evaluated  
like it were included in the body, which means lazy. But this point  
may well be moot, given Martin's other proposal from yesterday (http://www.nabble.com/Rethinking-filter-p25906315.html 
). That, in combination might be the real solution.

Regards,

Roland
< Prev | 1 - 2 | Next >