improved parallel combinator

View: New views
7 Messages — Rating Filter:   Alert me  

improved parallel combinator

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I actually really like this one. No more weird pseudo-vector stuff.

Here are the semantics:

            S (| F |)  ==  S F
        S (| F , G |)  ==  S F { S G } peek
    S (| F , G , H |)  ==  S F { S G } peek { S H } peek
                       ..

Here are some examples:

                5 (| sq , neg |)  ==  25 -5
                 4 3 (| + , * |)  ==  7 12
            1 2 (| nip , drop |)  ==  2 1        ! same as 'swap'
    { 2 3 4 } (| rest , first |)  ==  { 3 4 } 2  ! same as 'unclip'

Here is the implementation:

    : infra-seq ( seq seq -- seq )
        [ [ with-datastack peek ] curry keep ] each drop ;
    : datastack-tail ( x -- x seq )
        [ datastack ] dip swap ;
    : parallel ( seq -- )
        datastack-tail [ unclip dip ] dip swap infra-seq ;

    ! "banana syntax" for 'parallel':
    ! (| F , G |) == { [ F ] [ G ] } parallel
    : |) ;
    : (| \ |)
        [ \ , 1array split [ >quotation ] map ] parse-literal
        \ parallel suffix ; parsing

For those not familiar with Factor:

             { x0 .. xN } peek  ==  xN
                { x S } unclip  ==  { S } x
                  [ F ] x slip  ==  F x
    { S } [ F ] with-datastack  ==  { S F }

The trick was realizing that the first quotation should be applied  
directly to the main stack. Lacking this, you'd need to explicitly  
indicate how many values to pass to each function (as with '2cleave',  
'3cleave', etc).

The only downside of this form compared to the previous one is that  
you need to explicitly group values if you want to return more than  
one from one of the functions (except for the first). I don't see this  
as a big problem.

- John

Re: improved parallel combinator

by William Tanksley, Jr :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

John Nowak <john@...> wrote:
> I actually really like this one. No more weird pseudo-vector stuff.
> Here are the semantics:
>            S (| F |)  ==  S F
>        S (| F , G |)  ==  S F { S G } peek
>    S (| F , G , H |)  ==  S F { S G } peek { S H } peek

Very nice; I like it too, more than spread and its associated
functions. I presume that it's easier to typecheck?

To be complete, it would make sense to define a (- -) pair that serves
the place of "cleave". The definition's probably pretty obvious, but
I'll let you work it out because you're the typechecking expert...

But it also occurs to me that a general-purpose word is possible: let
me call it "multistack". The idea is that both (- -) and (| |) split
data and control flow into multiple stacks, run arbitrary code on
those stacks, and then combine the results onto the main stack; the
difference between the two is how they gather the data, and how they
combine it into the main stack.

It therefore seems that the multistack combinator would take three
arguments: one gatherer quotation, one combiner quotation, and one
sequence of quotations to be run on the split stack. If there are /n/
quotations in the sequence, the gather quotation will be run /n/
times, each time being passed an empty list; it must load that list
with data from the main stack. Each of those lists will be used as the
infra stack for the quotation within the sequence, and the resulting
stack will be placed into a list. Finally, the combiner quotation will
be run on each result list in the order its producing program appears
in the sequence.

This seems both general and statically determinable.

Is it?

It also seems too complex for regular use; really, we'd want to define
simple variants. I *think* it would be easy to define both of your
syntaxes using this combinator.

> - John

-Wm

Re: improved parallel combinator

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Jan 7, 2009, at 3:51 PM, William Tanksley, Jr wrote:

> John Nowak <john@...> wrote:
>
>> I actually really like this one. No more weird pseudo-vector stuff.
>> Here are the semantics:
>>
>
> Very nice; I like it too, more than spread and its associated
> functions. I presume that it's easier to typecheck?

Spread, cleave, and this are all easy to type check. The combinator I  
proposed before this isn't; it might be doable, but I don't think it's  
worth doing.

> \But it also occurs to me that a general-purpose word is possible: let
> me call it "multistack". The idea is that both (- -) and (| |) split
> data and control flow into multiple stacks, run arbitrary code on
> those stacks, and then combine the results onto the main stack;

Cleave doesn't quite do that:

    (- F , G -)  ==  [ F ] [ G ] bi  ==  [ F ] keep G

In other words, 'G' has access to the result of '[ F ] keep'. This  
might seem like something you could prevent with types, but as I  
posted earlier, it unfortunately isn't unless you restrict 'G' to  
returning a single element or some other exact number (or extend the  
type system just for this purpose). As a result of this, you cannot  
implement 'cleave' with your multistack combinator.

> It therefore seems that the multistack combinator would take three
> arguments: one gatherer quotation, one combiner quotation, and one
> sequence of quotations to be run on the split stack. This seems both
> general and statically determinable.
> ...
> Is it?

Not generally given higher order functions. The type system would need  
the addition of concatenation ('++').  Here are the types involved for  
'multistack' with two quotations in the list:

    Gatherer: R -> (R++S)  -- ensures it returns 0 or more values
    Quote1:   S -> T
    Quote2:   S -> U
    Combiner: (T -> V) /\ (U -> W)
    Final:    R -> (V++W)

It may be possible to add concatenation. My guess is that the  
combination of concatenation and intersection is probably undecidable  
though. Concatenation alone may be undecidable. In either case, I've  
no idea how to do it. I'd have to go back and look at some of the  
things done along these lines with record systems that support record  
concatenation.

Now, if we rule out higher order functions, it's all trivial provided  
the functions are known. There is one exception to this: You couldn't  
support Joy's "unstack" ('R {S} -> S') or any other function that does  
not have the same row variable on both sides. Doing so would again  
require concatenation. You'd also need concatenation if you allowed  
the definition of second order combinators where a function passed in  
is used in multistack.

- John

Re: improved parallel combinator

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Jan 7, 2009, at 4:58 PM, John Nowak wrote:

>    Gatherer: R -> (R++S)  -- ensures it returns 0 or more values
>    Quote1:   S -> T
>    Quote2:   S -> U
>    Combiner: (T -> V) /\ (U -> W)
>    Final:    R -> (V++W)

Sorry, the last one should be 'R -> (R++V++W)'.

- John

Re: improved parallel combinator

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Jan 7, 2009, at 4:58 PM, John Nowak wrote:

> On Jan 7, 2009, at 3:51 PM, William Tanksley, Jr wrote:
>
>> But it also occurs to me that a general-purpose word is possible: let
>> me call it "multistack". The idea is that both (- -) and (| |) split
>> data and control flow into multiple stacks, run arbitrary code on
>> those stacks, and then combine the results onto the main stack;
>
> As a result of this, you cannot
> implement 'cleave' with your multistack combinator.

Actually, I don't think you can implement 'parallel' either because  
the first quotation is applied directly to the main stack. 'parallel'  
can actually leave less elements on the stack than it started with as  
in '(| 3drop , id |)'. This doesn't seem possible with 'multistack'.

I should note that your 'multistack' is trivial in a more FP-style  
context. Here, the gatherer's result is used directly for all elements  
in the list. For example, if you just wanted the top two elements for  
each quotation, you'd some function 'R x y -> x y' instead of 'R x y -
 > R x y x y'. In other words, your gatherer would return *exactly*  
two elements, not two *more* elements. The combiner's results would be  
grouped into a list rather than flattened into a stack. As a result,  
the types involved for 'multistack' with two quotations in the list  
would be as such:

    Gatherer: a -> b
    Quote1:   b -> c
    Quote2:   b -> d
    Combiner: (c -> e) /\ (d -> f)
    Final:    a -> (d, f)

Assuming the combiner is directly supplied, you can remove the need  
for the intersection even if you have higher order functions. (This  
goes for the stack-based version as well.) In either case, you don't  
need row variables or concatenation.

This is probably a good example of why I'm interested in dropping the  
stack-based approach.

- John

Re: improved parallel combinator

by William Tanksley, Jr :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

John Nowak <john@...> wrote:
> William Tanksley, Jr wrote:
>>> I actually really like this one. No more weird pseudo-vector stuff.
>>> Here are the semantics:

>> Very nice; I like it too, more than spread and its associated
>> functions. I presume that it's easier to typecheck?

> Spread, cleave, and this are all easy to type check. The combinator I
> proposed before this isn't; it might be doable, but I don't think it's
> worth doing.

Are you saying that (| |) is hard to typecheck? And I thought that
spread and cleave were hard.

>> \But it also occurs to me that a general-purpose word is possible: let
>> me call it "multistack". The idea is that both (- -) and (| |) split
>> data and control flow into multiple stacks, run arbitrary code on
>> those stacks, and then combine the results onto the main stack;

> Cleave doesn't quite do that:
>    (- F , G -)  ==  [ F ] [ G ] bi  ==  [ F ] keep G
> In other words, 'G' has access to the result of '[ F ] keep'. This
> might seem like something you could prevent with types, but as I
> posted earlier, it unfortunately isn't unless you restrict 'G' to
> returning a single element or some other exact number (or extend the
> type system just for this purpose). As a result of this, you cannot
> implement 'cleave' with your multistack combinator.

Oh, I know -- cleave doesn't split into multiple stacks, which to me
makes it harder to think about. I admit that what cleave does with
multiple stacks IS useful, but I don't have a conceptual model yet for
it.

>> It therefore seems that the multistack combinator would take three
>> arguments: one gatherer quotation, one combiner quotation, and one
>> sequence of quotations to be run on the split stack. This seems both
>> general and statically determinable.
>> ...
>> Is it?

> Not generally given higher order functions. The type system would need

>    Gatherer: R -> (R++S)  -- ensures it returns 0 or more values

The gatherer was intended to be a lot simpler than that. The type should be:

  Sn-1 -> Sn qn

...where q1 is a typed staticly-sized list or tuple, and the index /n/
varies according to which quotation accepts it. And S0 = R (the
starting stack).

The idea is that the gatherer uses the main stack to build the
starting stack for its quotation, and it returns that starting stack
in the form of a list. It's run once for each quotation.

The quotation, when it runs, will see ONLY the items the gatherer put
onto that list.

Let 'Q' be the type of the stack after the last gatherer has run (in
this case Q = Q2).

>    Quote1:   S -> T
>    Quote2:   S -> U

Hmm.

QuoteN:  Qn -> Tn

...where 'Qn' is the type of the contents of the gatherer's list,
viewed as a stack. (I'm not a typing expert; probably a list is the
wrong structure; if you type tuples, use those instead. The size must
be static.)

>    Combiner: (T -> V) /\ (U -> W)

Yes, this one's a problem. I can hide the problem by keeping the
contents of T and U in tuples when passing them to the combiner, but
I'm not sure if that really solves the problem. It seems consistent...
So here's that:

Combiner:

  Un tn -> Wn

The last combiner, when run, produces W as its result stack (in this
case, W=W2).

>    Final:    R -> (V++W)

Final: R -> W

I think I've eliminated all type concatenation, at the cost of
requiring statically typed n-tuples. (Actually, the sequence of
quotations would be a typed n-tuple -- it certainly couldn't be
dynamically computed.)

> Now, if we rule out higher order functions, it's all trivial provided
> the functions are known. There is one exception to this: You couldn't
> support Joy's "unstack" ('R {S} -> S') or any other function that does
> not have the same row variable on both sides. Doing so would again
> require concatenation. You'd also need concatenation if you allowed
> the definition of second order combinators where a function passed in
> is used in multistack.

If tuples were statically internally typed "unstack" would be possible... Right?

Higher orders ... Whew. I need to think about that, but even with what
I said I'm sure you're right.

Keep in mind that I'm very new to static type inference...

> - John

-Wm

Re: improved parallel combinator

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Jan 7, 2009, at 6:41 PM, William Tanksley, Jr wrote:

> John Nowak <john@...> wrote:
>
>> Spread, cleave, and this are all easy to type check. The combinator I
>> proposed before this isn't; it might be doable, but I don't think  
>> it's
>> worth doing.
>
> Are you saying that (| |) is hard to typecheck? And I thought that
> spread and cleave were hard.

Spread, cleave, and (| |) are easy. That [( )] thing with the vector-
like semantics I proposed before is not; it would require some sort of  
type coercion.

Here are some examples of cleave and spread combinators:

    bi   :: R x [R x -> S] [S x -> T] -> T
    2bi  :: R x y [R x y -> S] [S x y -> T] -> T
    3bi  :: R x y z [R x y z -> S] [S x y z -> T] -> T
    bi@  :: R x [R x -> S /\ S x -> T] -> T
    2bi@ :: R x y [R x y -> S /\ S x y -> T] -> T
    3bi@ :: R x y z [R x y z -> S /\ S x y z -> T] -> T
    bi*  :: R a x [R a -> S] [S x -> T] -> T
    2bi* :: R a b x y [R a b -> S] [S x y -> T] -> T

The "tri" versions are essentially the same, just with three  
quotations instead of two.

Here are the (| |), aka 'parallel' types:

    parallel1 :: R [R -> S] -> S
    parallel2 :: R [R -> S] [R -> T x] -> S x
    parallel3 :: R [R -> S] [R -> T x] [R -> U y] -> S x y
    parallel4 :: R [R -> S] [R -> T x] [R -> U y] [R -> V z] -> S x y z

All of these can be inferred in my current system; only the '@'  
versions are tricky because they require intersection types. Cat can  
handle the types for all except the '@' versions, although you can't  
actually *implement* the 'parallel' combinators in Cat because it  
doesn't support parameterized types (although they could be offered as  
primitives).

> Oh, I know -- cleave doesn't split into multiple stacks, which to me
> makes it harder to think about. I admit that what cleave does with
> multiple stacks IS useful, but I don't have a conceptual model yet for
> it.

I'm not sure how useful it is. The reason 'cleave' was invented was to  
convey intent. You *could* just write '[ F ] keep G', but you write  
'[ F ] [ G ] bi' if they're both operating on the same value to make  
the code more clear. The fact that 'cleave' allows G to access the  
results of F is, in my opinion, undesirable. Slava has shown me a  
couple of cases where it is intentionally used in such a way (I think  
with tuple selectors or something), but I don't think those use cases  
are particularly valuable.

The good thing about 'cleave' compared to 'parallel' is that the  
quotations can return multiple values directly on the stack, not just  
one. It would be nice to extend 'parallel' to do this, but again,  
you'd need those ugly concatenation types. The fact that we can type  
so much without them is actually somewhat surprising.

> The gatherer was intended to be a lot simpler than that. The type  
> should be:
>
>  Sn-1 -> Sn qn
>
> ...where q1 is a typed staticly-sized list or tuple, and the index /n/
> varies according to which quotation accepts it. And S0 = R (the
> starting stack).

Ah, I see. I think you're saying that the gatherer, given some stack  
R, returns some stack S with some first-class stack 'T' on top:

    R -> S {T}

That is easy to handle. If that's not exactly what you were saying,  
this should be roughly equivalent in functionality at least.

> (I'm not a typing expert; probably a list is the wrong structure;

> if you type tuples, use those instead. The size must be static.)


Yes, it's no trouble handling first-class stacks. For example:

    new   :: R -> R {0}              -- 0 is the empty row
    push  :: R {S} x -> R {S x}
    pop   :: R {S x} -> R x
    infra :: R {S} [S -> T] -> R {T}

> The quotation, when it runs, will see ONLY the items the gatherer put
> onto that list.

Right. Makes sense.

> Combiner: Un tn -> Wn
> The last combiner, when run, produces W as its result stack (in this
> case, W=W2).
> Final: R -> W
>
> I think I've eliminated all type concatenation

Yes, you have. What you've suggested is already typeable. Below is the  
type for your combinator; the first quotation is the gatherer, the  
second is the combiner, and then the following two are the quotations  
in the middle:

    R [R -> S {T}] [S {U} {V} -> W] [T -> U] [T -> V] -> W

I had to modify your design slightly. As odd as it might seem, the  
combiner must be allowed to access the result of the gatherer below  
the top element (which is the first class stack).

Alternatively, you could drop the combinator and just return the  
stacks directly (which is essentially just as useful as you could  
always compose your own combiner at the end):

    R [R -> S {T}] [T -> U] [U -> V] -> R {S} {T}

This combinator can be implemented as such:

    [i] 2dip [[infra] papply] bi@ bi

Or, with the combiner:

    rot [i] 2dip [[infra] papply] bi@ bi i

I came up with something very similar to your combinator a couple of  
months ago. It's what motivated me to try dealing with alternative  
data structures: It all works great, but pulling the values out of the  
stacks at the end is a pain in the ass. The [( )] combinator I  
proposed is essentially the same as yours except that it automatically  
converts the top element to a stack for you if necessary and  
automatically converts single element stacks in the output to just the  
elements they contain. The auto-coercion makes things less painful,  
but unfortunately I don't know how to type it. The semantics are  
awkward anyway.

> If tuples were statically internally typed "unstack" would be  
> possible... Right?

Joy's 'stack' and 'unstack' can be given types:

    stack   :: R -> R {R}
    unstack :: R {S} -> S

The problem had to do with the fact that you can't tell from unstack's  
type how many elements it would add or remove relative to the input  
stack. This is something I had to know to type the combinator I  
mistakenly thought you were proposing. Given that you're not proposing  
them though, forget I said anything! The 'stack' and 'unstack'  
primitives work fine.

- John