Tail calls via trampolining and an explicit instruction

View: New views
12 Messages — Rating Filter:   Alert me  

Tail calls via trampolining and an explicit instruction

by James Iry-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Re: Tail calls via trampolining and an explicit instruction

by csar :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Impressive:

user=> (time (trampoline #(foo 1000)) )
"Elapsed time: 0.547556 msecs"

user=> (time (foo 1000))
"Elapsed time: 0.375466 msecs"

Looks like the penalty is not that high, but after the JIT kicked in the trampolined version takes twice as long as the recusive version - hard to call this a TCO when the tail-call takes actually longer. Probably not a solution for a more performance focussed language like Scala:

user=> (dotimes [_ 100] (time (foo 5000 )))
"Elapsed time: 0.653994 msecs"
"Elapsed time: 1.023594 msecs"
"Elapsed time: 0.553143 msecs"
"Elapsed time: 0.638629 msecs"
"Elapsed time: 0.808762 msecs"
"Elapsed time: 0.602032 msecs"
"Elapsed time: 1.754692 msecs"
"Elapsed time: 3.244546 msecs"
"Elapsed time: 0.810438 msecs"
"Elapsed time: 0.635276 msecs"
"Elapsed time: 0.810159 msecs"
"Elapsed time: 0.958501 msecs"
"Elapsed time: 0.768813 msecs"
"Elapsed time: 0.940343 msecs"
"Elapsed time: 1.066057 msecs"
"Elapsed time: 0.871619 msecs"
"Elapsed time: 1.461359 msecs"
"Elapsed time: 1.232838 msecs"
"Elapsed time: 3.270248 msecs"
"Elapsed time: 0.597841 msecs"
"Elapsed time: 1.21915 msecs"
"Elapsed time: 0.569904 msecs"
"Elapsed time: 0.816305 msecs"
"Elapsed time: 1.208813 msecs"
"Elapsed time: 0.563759 msecs"
"Elapsed time: 0.564597 msecs"
"Elapsed time: 1.501866 msecs"
"Elapsed time: 0.555657 msecs"
"Elapsed time: 0.931683 msecs"
"Elapsed time: 4.408661 msecs"
"Elapsed time: 1.952762 msecs"
"Elapsed time: 1.630374 msecs"
"Elapsed time: 0.764064 msecs"
"Elapsed time: 0.561523 msecs"
"Elapsed time: 0.815746 msecs"
"Elapsed time: 0.550908 msecs"
"Elapsed time: 1.218311 msecs"
"Elapsed time: 0.967721 msecs"
"Elapsed time: 1.265803 msecs"
"Elapsed time: 0.979454 msecs"
"Elapsed time: 0.885587 msecs"
"Elapsed time: 3.627556 msecs"
"Elapsed time: 0.836699 msecs"
"Elapsed time: 0.814629 msecs"
"Elapsed time: 1.529245 msecs"
"Elapsed time: 0.564038 msecs"
"Elapsed time: 0.540851 msecs"
"Elapsed time: 2.006679 msecs"
"Elapsed time: 0.754286 msecs"
"Elapsed time: 1.038679 msecs"
"Elapsed time: 0.735289 msecs"
"Elapsed time: 2.170108 msecs"
"Elapsed time: 0.824965 msecs"
"Elapsed time: 1.758603 msecs"
"Elapsed time: 0.986158 msecs"
"Elapsed time: 1.199594 msecs"
"Elapsed time: 0.56348 msecs"
"Elapsed time: 1.221105 msecs"
"Elapsed time: 0.60287 msecs"
"Elapsed time: 0.798426 msecs"
"Elapsed time: 0.560965 msecs"
"Elapsed time: 0.846755 msecs"
"Elapsed time: 0.553981 msecs"
"Elapsed time: 1.272787 msecs"
"Elapsed time: 1.913093 msecs"
"Elapsed time: 4.565105 msecs"
"Elapsed time: 0.740876 msecs"
"Elapsed time: 0.936431 msecs"
"Elapsed time: 0.555658 msecs"
"Elapsed time: 0.546438 msecs"
"Elapsed time: 1.569474 msecs"
"Elapsed time: 0.630527 msecs"
"Elapsed time: 0.54113 msecs"
"Elapsed time: 0.870781 msecs"
"Elapsed time: 0.543644 msecs"
"Elapsed time: 3.667506 msecs"
"Elapsed time: 1.010743 msecs"
"Elapsed time: 1.566121 msecs"
"Elapsed time: 0.993422 msecs"
"Elapsed time: 0.903746 msecs"
"Elapsed time: 0.555378 msecs"
"Elapsed time: 0.55007 msecs"
"Elapsed time: 0.557892 msecs"
"Elapsed time: 1.256025 msecs"
"Elapsed time: 0.725512 msecs"
"Elapsed time: 1.02499 msecs"
"Elapsed time: 0.781384 msecs"
"Elapsed time: 2.289397 msecs"
"Elapsed time: 0.550629 msecs"
"Elapsed time: 1.059352 msecs"
"Elapsed time: 3.169677 msecs"
"Elapsed time: 0.832508 msecs"
"Elapsed time: 1.014375 msecs"
"Elapsed time: 1.175568 msecs"
"Elapsed time: 0.692546 msecs"
"Elapsed time: 1.797715 msecs"
"Elapsed time: 0.663492 msecs"
"Elapsed time: 1.708039 msecs"
"Elapsed time: 2.835277 msecs"
"Elapsed time: 0.8448 msecs"

user=> (dotimes [_ 100] (time (trampoline #(foo 5000 ))))
"Elapsed time: 7.796522 msecs"
"Elapsed time: 1.891581 msecs"
"Elapsed time: 1.577017 msecs"
"Elapsed time: 2.259785 msecs"
"Elapsed time: 1.782909 msecs"
"Elapsed time: 1.713626 msecs"
"Elapsed time: 3.076089 msecs"
"Elapsed time: 4.407264 msecs"
"Elapsed time: 2.485232 msecs"
"Elapsed time: 2.570439 msecs"
"Elapsed time: 1.990756 msecs"
"Elapsed time: 2.692241 msecs"
"Elapsed time: 5.366604 msecs"
"Elapsed time: 2.504788 msecs"
"Elapsed time: 2.344712 msecs"
"Elapsed time: 2.395556 msecs"
"Elapsed time: 3.39708 msecs"
"Elapsed time: 4.998121 msecs"
"Elapsed time: 2.778007 msecs"
"Elapsed time: 2.577702 msecs"
"Elapsed time: 2.498083 msecs"
"Elapsed time: 2.049422 msecs"
"Elapsed time: 2.771302 msecs"
"Elapsed time: 3.368026 msecs"
"Elapsed time: 2.136026 msecs"
"Elapsed time: 2.048025 msecs"
"Elapsed time: 4.829105 msecs"
"Elapsed time: 2.302248 msecs"
"Elapsed time: 2.315378 msecs"
"Elapsed time: 3.378921 msecs"
"Elapsed time: 2.387455 msecs"
"Elapsed time: 2.093562 msecs"
"Elapsed time: 2.907353 msecs"
"Elapsed time: 1.943543 msecs"
"Elapsed time: 2.022045 msecs"
"Elapsed time: 5.502096 msecs"
"Elapsed time: 1.927619 msecs"
"Elapsed time: 1.823696 msecs"
"Elapsed time: 2.601727 msecs"
"Elapsed time: 2.067022 msecs"
"Elapsed time: 3.73567 msecs"
"Elapsed time: 2.536077 msecs"
"Elapsed time: 2.476292 msecs"
"Elapsed time: 4.919899 msecs"
"Elapsed time: 2.737498 msecs"
"Elapsed time: 2.006121 msecs"
"Elapsed time: 2.946463 msecs"
"Elapsed time: 2.147759 msecs"
"Elapsed time: 3.583416 msecs"
"Elapsed time: 2.334375 msecs"
"Elapsed time: 2.472102 msecs"
"Elapsed time: 1.99327 msecs"
"Elapsed time: 3.815848 msecs"
"Elapsed time: 2.738337 msecs"
"Elapsed time: 2.386896 msecs"
"Elapsed time: 5.070477 msecs"
"Elapsed time: 2.417905 msecs"
"Elapsed time: 2.936407 msecs"
"Elapsed time: 3.320813 msecs"
"Elapsed time: 2.537194 msecs"
"Elapsed time: 2.01087 msecs"
"Elapsed time: 2.029308 msecs"
"Elapsed time: 2.771302 msecs"
"Elapsed time: 2.966858 msecs"
"Elapsed time: 2.334654 msecs"
"Elapsed time: 4.746692 msecs"
"Elapsed time: 2.362312 msecs"
"Elapsed time: 2.59642 msecs"
"Elapsed time: 2.217042 msecs"
"Elapsed time: 3.168279 msecs"
"Elapsed time: 2.558147 msecs"
"Elapsed time: 2.596978 msecs"
"Elapsed time: 2.750629 msecs"
"Elapsed time: 2.259785 msecs"
"Elapsed time: 2.545854 msecs"
"Elapsed time: 5.08221 msecs"
"Elapsed time: 2.454223 msecs"
"Elapsed time: 1.681219 msecs"
"Elapsed time: 2.257829 msecs"
"Elapsed time: 2.400585 msecs"
"Elapsed time: 2.859022 msecs"
"Elapsed time: 2.075682 msecs"
"Elapsed time: 2.739455 msecs"
"Elapsed time: 2.42433 msecs"
"Elapsed time: 2.017855 msecs"
"Elapsed time: 5.994616 msecs"
"Elapsed time: 2.90428 msecs"
"Elapsed time: 1.769219 msecs"
"Elapsed time: 2.405333 msecs"
"Elapsed time: 2.227658 msecs"
"Elapsed time: 2.327949 msecs"
"Elapsed time: 2.189384 msecs"
"Elapsed time: 2.968534 msecs"
"Elapsed time: 2.795328 msecs"
"Elapsed time: 1.956953 msecs"
"Elapsed time: 2.138261 msecs"
"Elapsed time: 1.801905 msecs"
"Elapsed time: 2.024559 msecs"
"Elapsed time: 2.949816 msecs"
"Elapsed time: 2.733867 msecs"
/Carsten

On Wed, Nov 26, 2008 at 4:08 PM, James Iry <jamesiry@...> wrote:
http://groups.google.com/group/clojure/msg/3addf875319c5c10


Re: Tail calls via trampolining and an explicit instruction

by Randall Schulz :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 26 November 2008 08:48, Carsten Saager wrote:

> Impressive:
>
> user=> (time (trampoline #(foo 1000)) )
> "Elapsed time: 0.547556 msecs"
>
> user=> (time (foo 1000))
> "Elapsed time: 0.375466 msecs"
>
> Looks like the penalty is not that high, but after the JIT kicked in
> the trampolined version takes twice as long as the recusive version -
> hard to call this a TCO when the tail-call takes actually longer.
> ...

The point is not to optimize CPU time, but to keep stack consumption
independent of the input size in mutually recursive sets of function
and to do so using the stock JVM's ordinary call stack.


> /Carsten
>
> On Wed, Nov 26, 2008 at 4:08 PM, James Iry <jamesiry@...> wrote:
> > http://groups.google.com/group/clojure/msg/3addf875319c5c10


Randall Schulz

Re: Tail calls via trampolining and an explicit instruction

by csar :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I agree, slower is better than stack-overflow.

Compare
(defn recount [n]
  (if (zero? n) nil
    (recount (dec n))))

with
(defn recount [n]
  (if (zero? n) nil
    (recur (dec n))))

The later is as well faster, thus I am asking if trampolines are a workaround rather than a solution.

/C

On Wed, Nov 26, 2008 at 6:00 PM, Randall R Schulz <rschulz@...> wrote:
On Wednesday 26 November 2008 08:48, Carsten Saager wrote:
> Impressive:
>
> user=> (time (trampoline #(foo 1000)) )
> "Elapsed time: 0.547556 msecs"
>
> user=> (time (foo 1000))
> "Elapsed time: 0.375466 msecs"
>
> Looks like the penalty is not that high, but after the JIT kicked in
> the trampolined version takes twice as long as the recusive version -
> hard to call this a TCO when the tail-call takes actually longer.
> ...

The point is not to optimize CPU time, but to keep stack consumption
independent of the input size in mutually recursive sets of function
and to do so using the stock JVM's ordinary call stack.


> /Carsten
>
> On Wed, Nov 26, 2008 at 4:08 PM, James Iry <jamesiry@...> wrote:
> > http://groups.google.com/group/clojure/msg/3addf875319c5c10


Randall Schulz


Re: Tail calls via trampolining and an explicit instruction

by James Iry-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

The phrase "tail call optimization" often confuses because it sounds like proponents primarily want things to be "faster" - after all speed is the most common target of optimizations.   But while faster is certainly nicer, what we really want from TCO is for things to not explode in the face of large or even infinite sequences of tail calls.   It's a space optimization rather than a time optimization.  In the right circumstances is optimizes O(n) memory stack usage to O(1).  That's often far more important than a mere constant factor loss in time performance.

Hickey made an interesting choice.   He makes the user do the work to indicate when trampolining should be used.  That means the user gets to make the choice about the trade-offs involved.

A variation in Scala

  object Sometime {
    def tailCall[X](x : => Sometime[X]) = new Later({x})
    implicit def xToSometime[X](x : => X) = new Now({x})
    def trampoline[X](x : Sometime[X]) = x.eval
  }
  sealed abstract class Sometime[X] {
    def eval : X = {
      def eval[X](x : Sometime[X]) : X = x match {
        case x : Now[X] => x.get
        case x : Later[X] => eval(x.get)
      }
      eval(this)
    }
  }
  class Now[X](x : => X) extends Sometime[X] {
    protected def get = x
  }

  class Later[X](x : => Sometime[X]) extends Sometime[X] {
    protected def get = x
  }

  import Sometime._
  def foo(n : Long) : Sometime[Unit] = {
    if ( n > 0)
      tailCall(bar(n - 1));
  }
 
  def bar(n : Long) : Sometime[Unit] = {
    if ( n > 0)
      tailCall(foo(n - 1));
  }

trampoline(foo(1000)), or, equivalently foo(1000).eval

On Wed, Nov 26, 2008 at 8:48 AM, Carsten Saager <csaager@...> wrote:
Impressive:

user=> (time (trampoline #(foo 1000)) )
"Elapsed time: 0.547556 msecs"

user=> (time (foo 1000))
"Elapsed time: 0.375466 msecs"

Looks like the penalty is not that high, but after the JIT kicked in the trampolined version takes twice as long as the recusive version - hard to call this a TCO when the tail-call takes actually longer. Probably not a solution for a more performance focussed language like Scala:



Re: Tail calls via trampolining and an explicit instruction

by Martin Odersky :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

It's a neat technique, and one we can easily adopt for Scala. Here's
the necessary code to do that:

package scala.util.control

abstract class TailRec[+A]

object TailRec {

  case class Call[A](rest: () => TailRec[A]) extends TailRec[A]
  case class Done[A](result: A) extends TailRec[A]

  def tailcall[A](rest: => TailRec[A]) = new Call(() => rest)

  def done[A](result: A) = new Done(result)

  def trampoline[A](body: TailRec[A]): A = {
    def loop(body: TailRec[A]): A = body match {
      case Call(rest) => loop(rest())
      case Done(result) => result
    }
    loop(body)
  }
}

I am toying with the idea of putting this into Scala 2.8. Here's a
test that uses mutual recursion to find out whether a list has even
length:

  import scala.util.control.TailRec._

  def isEven(xs: List[Int]): TailRec[Boolean] =
    if (xs.isEmpty) done(true) else tailcall(isOdd(xs.tail))

  def isOdd(xs: List[Int]): TailRec[Boolean] =
    if (xs.isEmpty) done(false) else tailcall(isEven(xs.tail))

  def testEven(xs: List[Int]) = trampoline(isEven(xs))

Everything works as expected. But it works quite slowly. The numbers
below show 5 runs each of a benchmark with direct calls to
isEven/isOdd on lists as shown above and a benchmark with trampolining
calls. The trampolining call benchmark is about 4 times slower.

util.control.RecListBench$ 886 671 558 573 554
util.control.TrampListBench$ 2700 2263 2302 2350 2302

Compared to Clojure, the Scala solution needs an extra wrapper around
the closure in order to make the sum. It might be that this explains
part of the relatively worse performance of trampolines in Scala. Or
it might be that the base case (direct call) is faster in Scala than
in Clojure. It would be interesting to find out more.

Clojure seems to have lots of other interesting ideas as well, for
instance its transactional memory and immutable datastructures.
It's definitively a welcome addition to the JVM languages zoo!
Ironically, Clojure's maps are based on work which Phil Bagwell did
as an associate of my group at EPFL. We never put that into the Scala
collection libraries (Phil being a C++ person), but now we have new
motivation to do so.

Cheers

 -- Martin

Re: Tail calls via trampolining and an explicit instruction

by Martin Odersky :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Nov 26, 2008 at 7:01 PM, James Iry <jamesiry@...> wrote:

> The phrase "tail call optimization" often confuses because it sounds like
> proponents primarily want things to be "faster" - after all speed is the
> most common target of optimizations.   But while faster is certainly nicer,
> what we really want from TCO is for things to not explode in the face of
> large or even infinite sequences of tail calls.   It's a space optimization
> rather than a time optimization.  In the right circumstances is optimizes
> O(n) memory stack usage to O(1).  That's often far more important than a
> mere constant factor loss in time performance.
>
> Hickey made an interesting choice.   He makes the user do the work to
> indicate when trampolining should be used.  That means the user gets to make
> the choice about the trade-offs involved.
>
> A variation in Scala
>
I only found out you had sent that now, James. It looks like your
solution and mine are pretty much equivalent.

Cheers

 -- Martin

Re: Tail calls via trampolining and an explicit instruction

by James Iry-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Yeah, but yours is nicer.  The only contribution mine might make over yours is an implicit so that non-tail calls can be expressed directly.

implicit def done[X](x: => X) = Done(x)

def foo : TailRec[Int]{
  if (condition)
     tailCall(bar)
  else
    42
}

Whether that's useful or causes confusion I don't know.

Also, my base case used call-by-name in its construction.  I did that to ensure side effects always occur in the expected order, but the more I think about it the more I think that strict evaluation achieves that too.


On Wed, Nov 26, 2008 at 10:34 AM, martin odersky <martin.odersky@...> wrote:
On Wed, Nov 26, 2008 at 7:01 PM, James Iry <jamesiry@...> wrote:
> The phrase "tail call optimization" often confuses because it sounds like
> proponents primarily want things to be "faster" - after all speed is the
> most common target of optimizations.   But while faster is certainly nicer,
> what we really want from TCO is for things to not explode in the face of
> large or even infinite sequences of tail calls.   It's a space optimization
> rather than a time optimization.  In the right circumstances is optimizes
> O(n) memory stack usage to O(1).  That's often far more important than a
> mere constant factor loss in time performance.
>
> Hickey made an interesting choice.   He makes the user do the work to
> indicate when trampolining should be used.  That means the user gets to make
> the choice about the trade-offs involved.
>
> A variation in Scala
>
I only found out you had sent that now, James. It looks like your
solution and mine are pretty much equivalent.

Cheers

 -- Martin


Re: Tail calls via trampolining and an explicit instruction

by Martin Odersky :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Nov 26, 2008 at 7:45 PM, James Iry <jamesiry@...> wrote:
> Yeah, but yours is nicer.  The only contribution mine might make over yours
> is an implicit so that non-tail calls can be expressed directly.
>
> implicit def done[X](x: => X) = Done(x)
>
Ah, yes, that's neat! I don't think there would be confusion since the
method return type is already a TailRec.

> Also, my base case used call-by-name in its construction.  I did that to
> ensure side effects always occur in the expected order, but the more I think
> about it the more I think that strict evaluation achieves that too.
>
I'd think so, yes.

Cheers

 -- Martin

Re: Tail calls via trampolining and an explicit instruction

by James Iry-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Okay, some thought gave me a case where it makes a difference

def foo(x : Boolean) : TailRec[Unit] = if(x) tailCall(foo(false)) else done(println("hello"))

val x = foo(true)

With either strict or c-b-n parameters on done "hello" isn't printed until you trampoline(x).  But...

val y = foo(false)

With c-b-n, you get the same behavior : it's waiting for trampoline(y) to evaluate.  But with strict arguments, "hello" is printed immediately without a call to trampoline(y).

On Wed, Nov 26, 2008 at 10:58 AM, martin odersky <martin.odersky@...> wrote:
> Also, my base case used call-by-name in its construction.  I did that to
> ensure side effects always occur in the expected order, but the more I think
> about it the more I think that strict evaluation achieves that too.
>
I'd think so, yes.

Cheers

 -- Martin


Re: Tail calls via trampolining and an explicit instruction

by Martin Odersky :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Nov 26, 2008 at 8:20 PM, James Iry <jamesiry@...> wrote:

> Okay, some thought gave me a case where it makes a difference
>
> def foo(x : Boolean) : TailRec[Unit] = if(x) tailCall(foo(false)) else
> done(println("hello"))
>
> val x = foo(true)
>
> With either strict or c-b-n parameters on done "hello" isn't printed until
> you trampoline(x).  But...
>
> val y = foo(false)
>
Good catch! Do done must indeed be c-b-n.

Cheers

 -- Martin

Re: Tail calls via trampolining and an explicit instruction

by Christos KK Loverdos :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

This is one of the most enjoyable threads in the scala lists.

Martin, James,

Just paying my respect.

Christos

On Wed, Nov 26, 2008 at 9:35 PM, martin odersky <martin.odersky@...> wrote:
On Wed, Nov 26, 2008 at 8:20 PM, James Iry <jamesiry@...> wrote:
> Okay, some thought gave me a case where it makes a difference
>
> def foo(x : Boolean) : TailRec[Unit] = if(x) tailCall(foo(false)) else
> done(println("hello"))
>
> val x = foo(true)
>
> With either strict or c-b-n parameters on done "hello" isn't printed until
> you trampoline(x).  But...
>
> val y = foo(false)
>
Good catch! Do done must indeed be c-b-n.

Cheers

 -- Martin



--
 __~O
-\ <,       Christos KK Loverdos
(*)/ (*)      http://ckkloverdos.com