|
View:
New views
13 Messages
—
Rating Filter:
Alert me
|
|
|
[scala] StreamsI was under the impression that if you access, say, the 10000-th
element of a Stream, then the subsequent access to the 10000-th element should be considerably faster. This is because the first time, the stream is actually building the list of elements on demand. The second time, it is merely a lookup in the already-built list. But, when I test this out, it appears to be taking equally _long_ for both accesses. Is there a "timing" function in Scala that lets me easily test the time spent executing a function? Speaking of Streams, I noticed that: a) The Stream.from method is not documented in the class documentation. Why is that? Isn't the class documentation automatically generated from the source? b) The Stream.from method only takes ints, not BigInts, floats, etc. This seems kind of silly, because it's perfectly reasonable to want to generate a stream of all even BigInts, or all the floats counting by .5. The from method should support any type that has a + method. Is there already such a "type class" built into Scala, or does one need to write his own, and build implicit conversion functions for all the numeric types, in order to get something simple like this to work? Thanks, Mark |
|
|
[scala] Re: Streams> a) The Stream.from method is not documented in the class
> documentation. Why is that? Isn't the class documentation > automatically generated from the source? I see now that it is in the Stream object documentation, rather than the class documentation. I keep missing that classes are effectively documented in two places, in the "class" and in the "object". --Mark |
|
|
Re: [scala] Re: StreamsHi Mark,
Mark Engelberg wrote: > I see now that it is in the Stream object documentation, rather than > the class documentation. I keep missing that classes are effectively > documented in two places, in the "class" and in the "object". I think this only seems unnatural in light of Java's approach. If you consider that dereferencing is always done in the form <object name>.<member name> and never <class name>.<member name>, Scala is consistent. The only slight confusion is that we have an object and a class with different names (albeit in different namespaces). Cheers, Jon -- Jon Pretty | Sygneca Ltd. |
|
|
Re: [scala] Re: StreamsHi.
> I see now that it is in the Stream object documentation, rather than > the class documentation. I keep missing that classes are effectively > documented in two places, in the "class" and in the "object". Here's a search facility that may be of help later: http://scripts.mit.edu/~y_z/sds/ (I haven't made it) - Henrik |
|
|
Re: [scala] Re: StreamsJon Pretty wrote:
> I think this only seems unnatural in light of Java's approach. If you > consider that dereferencing is always done in the form <object > name>.<member name> and never <class name>.<member name>, Scala is > consistent. The only slight confusion is that we have an object and a > class with different names (albeit in different namespaces). I clearly didn't get enough sleep last night. Here's what I meant to say: The only slight confusion is that we have an object and a class with the *same name* (albeit in different namespaces). Jon -- Jon Pretty | Sygneca Ltd. |
|
|
Re: [scala] StreamsMark Engelberg wrote:
> I was under the impression that if you access, say, the 10000-th > element of a Stream, then the subsequent access to the 10000-th > element should be considerably faster. This is because the first > time, the stream is actually building the list of elements on demand. > The second time, it is merely a lookup in the already-built list. I wouldn't have expected this. This is often called "memois(z)ation" and is an irreversible space/computation trade in favour of computational advantage at the expense of space. It becomes irreversible at this point, therefore, the current behaviour is the correct one - unless you wish to suggest that in some areas Scala is eager by default and so it is inconsistent here. In any case, if the behaviour that you describe was built-in to the Stream class, then this would mean yet *another* type that I would have to rewrite and I'm getting tired of this :) You should build a memo table outside of Stream yourself - perhaps create a type called MemoStream on top of Stream that does what you want. Note that it is impossible to write a "UnmemoStream" (lazy) on top of "MemoStream" (strict), which is why your described behaviour would require a rewrite. Here is a good start (an excerpt of my code for doing what you describe): import scala.collection.mutable.HashMap import scala.collection.immutable.TreeMap // permits destructive updates without violating referential transparency (except within the local context) trait Memo[k, v] { def get(f: k => v)(k: => k): v } object Memo { def arrayMemo[k, v](n: Int) = new Memo[Int, v] { val a: Array[v] = new Array(n) def get(f: Int => v)(i: => Int) = if(i < 0 || i >= a.length) error("out of range: " + i) else if(a(i) == null) { val x = f(i) a(i) = x x } else a(i) } def nilMemo[k, v] = new Memo[k, v] { def get(f: k => v)(k: => k) = f(k) } def hashMemo[k, v] = new Memo[k, v] { val m = new HashMap[k, v] def get(f: k => v)(i: => k) = { val o = m.get(i) o match { case Some(v) => v case None => { val x = f(i) m.update(i, x) x } } } } def treeMemo[k, v](implicit o: k => Ordered[k]) = new Memo[k, v] { var m = new TreeMap[k, v] def get(f: k => v)(i: => k) = { val o = m.get(i) o match { case Some(v) => v case None => { val x = f(i) m = m.update(i, x) x } } } } } > > But, when I test this out, it appears to be taking equally _long_ for > both accesses. > > Is there a "timing" function in Scala that lets me easily test the > time spent executing a function? > > Speaking of Streams, I noticed that: > a) The Stream.from method is not documented in the class > documentation. Why is that? Isn't the class documentation > automatically generated from the source? > b) The Stream.from method only takes ints, not BigInts, floats, etc. > This seems kind of silly, because it's perfectly reasonable to want to > generate a stream of all even BigInts, or all the floats counting by > .5. The from method should support any type that has a + method. Is > there already such a "type class" built into Scala, or does one need > to write his own, and build implicit conversion functions for all the > numeric types, in order to get something simple like this to work? > > Thanks, > > Mark > > > Tony Morris http://tmorris.net/ |
|
|
Re: [scala] StreamsOn 6/21/07, Tony Morris <tmorris@...> wrote:
> Mark Engelberg wrote: > > I was under the impression that if you access, say, the 10000-th > > element of a Stream, then the subsequent access to the 10000-th > > element should be considerably faster. This is because the first > > time, the stream is actually building the list of elements on demand. > > The second time, it is merely a lookup in the already-built list. > > I wouldn't have expected this. This is often called "memois(z)ation" and > is an irreversible space/computation trade in favour of computational > advantage at the expense of space. It becomes irreversible at this > point, therefore, the current behaviour is the correct one - unless you > wish to suggest that in some areas Scala is eager by default and so it > is inconsistent here. In Haskell, lazy lists work as I describe. In F#, lazy lists work as I describe. (F# also has an uncached lazy list called Seq, sort of like a functional "iterator"). In general, languages need both constructs. I was under the impression that in Scala, the Iterator class fills the need for an uncached construct, and the Stream class fills the need for a cached lazy list that is the norm in functional languages. If Stream doesn't cache its values, why bother having both Stream and Iterator? --Mark |
|
|
Re: [scala] StreamsMark Engelberg wrote:
> On 6/21/07, Tony Morris <tmorris@...> wrote: >> Mark Engelberg wrote: >> > I was under the impression that if you access, say, the 10000-th >> > element of a Stream, then the subsequent access to the 10000-th >> > element should be considerably faster. [snip] > In Haskell, lazy lists work as I describe. Absolutely not (by the way, some Haskell purists would consider this accusation blasphemy) xs !! n will always run in O(n), always > In F#, lazy lists work as > I describe. I haven't used F# enough to know, but I'd hope note. > (F# also has an uncached lazy list called Seq, sort of > like a functional "iterator"). > > In general, languages need both constructs. They need a "completely" lazy list at the very least. All other behaviour can be written to exist on top as generalisation is lost (including the one that you initially described and the behaviour of the F# list that you describe). > I was under the > impression that in Scala, the Iterator class fills the need for an > uncached construct, and the Stream class fills the need for a cached > lazy list that is the norm in functional languages. It is already the norm, it never memoises. > > If Stream doesn't cache its values, why bother having both Stream and > Iterator? One is pure, the other is imperative (Iterator). You might notice that Iterator.next is side-effecting. Imagine changing its next method to return a tuple containing an Iterator instead. i.e. instead of returning a T (for Iterator[T]), it returns a (T, Iterator[T]). What do we have now? A stream! next._1 is head and next._2 is tail. A better question might be "why bother having Iterators (or imperative constructs)?". I suspect the answer has something to do with Java and targetting the JVM i.e. a flawed premise. > > --Mark > > > Tony Morris http://tmorris.net/ |
|
|
[scala] Re: StreamsTony Morris wrote:
> Mark Engelberg wrote: >> I was under the impression that if you access, say, the 10000-th >> element of a Stream, then the subsequent access to the 10000-th >> element should be considerably faster. This is because the first >> time, the stream is actually building the list of elements on demand. >> The second time, it is merely a lookup in the already-built list. > > I wouldn't have expected this. This is often called "memois(z)ation" and > is an irreversible space/computation trade in favour of computational > advantage at the expense of space. The stream is constructed the first time. A linked list is traversed the second time (unless you memoize - then it's just a look-up). The second time will already be considerably faster in cases where it takes a long time to determine each stream node. Here's a simple test with a timing method. def reportTime[T](description: String, computation: => T): T = { val startTime = System.nanoTime // or use System.currentTimeMillis val result: T = computation Console.println(description+" took "+(System.nanoTime - startTime)+" ns") result } // like Stream.from, but using BigInt and having a sleep before each cons def slowFrom(start: BigInt, step: BigInt): Stream[BigInt] = { Thread.sleep(1) Stream.cons(start, slowFrom(start+step, step)) } val numberStream = slowFrom(0, 1) Console.println(reportTime("first access", numberStream(10000))) Console.println(reportTime("second access", numberStream(10000))) first access took 19610635366 ns 10000 second access took 650975 ns 10000 Without the Thread.sleep(1): first access took 25804097 ns 10000 second access took 707413 ns 10000 (Note that the precision is misleading; ideally we'd be averaging over many runs and reporting in seconds or milliseconds.) >> b) The Stream.from method only takes ints, not BigInts, floats, etc. >> This seems kind of silly, because it's perfectly reasonable to want to >> generate a stream of all even BigInts, or all the floats counting by >> .5. The from method should support any type that has a + method. Is >> there already such a "type class" built into Scala, or does one need >> to write his own, and build implicit conversion functions for all the >> numeric types, in order to get something simple like this to work? I'm not aware of one. Perhaps the design could be explored in a different thread. |
|
|
Re: [scala] StreamsOn 6/21/07, Tony Morris <tmorris@...> wrote:
> Mark Engelberg wrote: > > On 6/21/07, Tony Morris <tmorris@...> wrote: > >> Mark Engelberg wrote: > >> > I was under the impression that if you access, say, the 10000-th > >> > element of a Stream, then the subsequent access to the 10000-th > >> > element should be considerably faster. > [snip] > > In Haskell, lazy lists work as I describe. > > Absolutely not (by the way, some Haskell purists would consider this > accusation blasphemy) > > xs !! n will always run in O(n), always > Yes, it is always O(n), because it still needs to traverse the list. However, it does not recompute the elements of the list. So if the definition of how the list is constructed is computationally intensive, you notice a dramatic speedup in Haskell when you do xs!!n the second time. --Mark |
|
|
Re: [scala] Re: StreamsThanks, Eric. It's nice to know that Stream should work the way I
expect. I'll use your timing code on some of my own tests, and try to figure out why I didn't seem to be getting a speedup on subsequent accesses. --Mark |
|
|
Re: [scala] StreamsMark Engelberg wrote:
> On 6/21/07, Tony Morris <tmorris@...> wrote: >> Mark Engelberg wrote: >> > On 6/21/07, Tony Morris <tmorris@...> wrote: >> >> Mark Engelberg wrote: >> >> > I was under the impression that if you access, say, the 10000-th >> >> > element of a Stream, then the subsequent access to the 10000-th >> >> > element should be considerably faster. >> [snip] >> > In Haskell, lazy lists work as I describe. >> >> Absolutely not (by the way, some Haskell purists would consider this >> accusation blasphemy) >> >> xs !! n will always run in O(n), always >> > > Yes, it is always O(n), because it still needs to traverse the list. > However, it does not recompute the elements of the list. So if the > definition of how the list is constructed is computationally > intensive, you notice a dramatic speedup in Haskell when you do xs!!n > the second time. > > --Mark > > > describing exactly how Streams work. Notice that the first argument to cons.apply[1] is declared without a laziness annotation (=>). [1] http://scalasvn.epfl.ch/cgi-bin/viewvc.cgi/scala/trunk/src/library/scala/Stream.scala?view=markup This means that the following code behaves the way it does. Hope this helps. object F { def main(args: Array[String]) = { import Stream._ val x = cons(1, cons(2, cons(3, empty))) def y = x.map(x => { Console.println("effect: " + x); x + 2 }) val z = y for(val s <- y) Console.println(s) Console.println("------------------") for(val s <- y) Console.println(s) Console.println("------------------") for(val s <- z) Console.println(s) Console.println("------------------") for(val s <- z) Console.println(s) } } $ scala F effect: 1 effect: 1 3 effect: 2 4 effect: 3 5 ------------------ effect: 1 3 effect: 2 4 effect: 3 5 ------------------ 3 effect: 2 4 effect: 3 5 ------------------ 3 4 5 Tony Morris http://tmorris.net/ |
|
|
Re: [scala] Re: StreamsI found my problem.
I was defining my stream for testing as: def myStream = slowFrom(0,1) I didn't realize this was making myStream a function that creates a new stream each time it is invoked. I needed to write: val myStream = slowFrom(0,1) Now I see the speedup I expected from subsequent accesses. Thanks everyone! On 6/21/07, Mark Engelberg <mark.engelberg@...> wrote: > Thanks, Eric. It's nice to know that Stream should work the way I > expect. I'll use your timing code on some of my own tests, and try to > figure out why I didn't seem to be getting a speedup on subsequent > accesses. > > --Mark > |
| Free embeddable forum powered by Nabble | Forum Help |