|
View:
New views
11 Messages
—
Rating Filter:
Alert me
|
|
|
Concatenative Hardware Redux (Another 16 Instruction Set)Hi all,
Here is my most recent 16 instruction set for concatenative hardware. Any comments would be greatly appreciated. /* * Released into the public domain by Christopher Diggins * Contact me if you have special licensing requirements * http://www.cdiggins.com */ enum { /* Push a literal value from the instruction stream */ LIT = 0, /* -> a */ /* Add top two values on the stack */ ADD = 1, /* a b -> (a + b) */ /* Shift second value on the stack left or right */ SHIFT = 2, /* a b -> b < 32 ? (a >> b) : (a << (b - 32)) */ /* Perform exclusive or of top two values */ XOR = 3, /* a b -> (a ^ b) */ /* Perfrom bitwise and of top two values */ AND = 4, /* a b -> (a & b) */ /* Dynamic function/list construction */ /* Note: lists and functions are identical from a programmer's perspective, a list is the result of a function that consumes nothing. Think of it as a new stack yielded by executing the function, except that it is lazy (evaluated at last minute) */ PAPPLY = 5, /* a [X] -> [a X] */ COMPOSE = 6, /* [X] [Y] -> [X Y] */ /* Branching and calling */ APPLYIF = 7, /* bool [X] -> X or NOOP */ /* Reference creation and manipulation, used for playing with heap */ MAKEREF = 8, /* a -> &a */ GETREF = 9, /* &a -> a */ SETREF = 10, /* &a b -> &(*a = b) */ /* Stack manipulation operations */ FLIP = 11, /* a X b n:int -> b X a ; sizeof(X) == n */ GETN = 12, /* a X n:int -> (a X, a) ; sizeof(X) == n */ SETN = 13, /* (a X n:int, b) -> b X ; sizeof(X) == n */ /* Applies a function to a list */ APPPLYTO = 14, /* [->X] [X->Y] => [Y] */ /* Make a suggestion ( I was thinking of COUNT, but I am not so sure ) */ UNUSED = 15 } opcodes; /* Some examples of derived opcodes: SWAP = 0 FLIPN DUP = 0 GETN POPD = 0 SETN POP = SWAP POPD QUOTE = [] CONS APPLY = TRUE SWAP APPLYIF CONS = SWAP QUOTE PAPPLY UNCONS = APPLY SWAP DIP = SWAP QUOTE COMPOSE APPLY */ Christopher Diggins http://www.cat-language.com |
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)Don't you think basic arithmetical operators would be useful, if this
might eventually be actual and not theoretical hardware? If you only have and, shift and xor, maybe it's *possible* to implement arithmetic (I personally don't know how) but it wouldn't be as efficient. There's no real point in restricting yourself to 16 operations, especially when some of these operations will encode a literal and some of them won't. Dan On Jan 10, 2008 2:12 PM, Christopher Diggins <cdiggins@...> wrote: > > > > > > > Hi all, > > Here is my most recent 16 instruction set for concatenative hardware. > Any comments would be greatly appreciated. > > /* > * Released into the public domain by Christopher Diggins > * Contact me if you have special licensing requirements > * http://www.cdiggins.com > */ > > enum > { > /* Push a literal value from the instruction stream */ > LIT = 0, /* -> a */ > > /* Add top two values on the stack */ > ADD = 1, /* a b -> (a + b) */ > > /* Shift second value on the stack left or right */ > SHIFT = 2, /* a b -> b < 32 ? (a >> b) : (a << (b - 32)) */ > > /* Perform exclusive or of top two values */ > XOR = 3, /* a b -> (a ^ b) */ > > /* Perfrom bitwise and of top two values */ > AND = 4, /* a b -> (a & b) */ > > /* Dynamic function/list construction */ > /* Note: lists and functions are identical from a programmer's perspective, > a list is the result of a function that consumes nothing. Think of it > as a new stack yielded by executing the function, except that it is > lazy (evaluated at last minute) */ > PAPPLY = 5, /* a [X] -> [a X] */ > COMPOSE = 6, /* [X] [Y] -> [X Y] */ > > /* Branching and calling */ > APPLYIF = 7, /* bool [X] -> X or NOOP */ > > /* Reference creation and manipulation, used for playing with heap */ > MAKEREF = 8, /* a -> &a */ > GETREF = 9, /* &a -> a */ > SETREF = 10, /* &a b -> &(*a = b) */ > > /* Stack manipulation operations */ > FLIP = 11, /* a X b n:int -> b X a ; sizeof(X) == n */ > GETN = 12, /* a X n:int -> (a X, a) ; sizeof(X) == n */ > SETN = 13, /* (a X n:int, b) -> b X ; sizeof(X) == n */ > > /* Applies a function to a list */ > APPPLYTO = 14, /* [->X] [X->Y] => [Y] */ > > /* Make a suggestion ( I was thinking of COUNT, but I > am not so sure ) */ > UNUSED = 15 > } > opcodes; > > /* > Some examples of derived opcodes: > > SWAP = 0 FLIPN > DUP = 0 GETN > POPD = 0 SETN > POP = SWAP POPD > QUOTE = [] CONS > APPLY = TRUE SWAP APPLYIF > CONS = SWAP QUOTE PAPPLY > UNCONS = APPLY SWAP > DIP = SWAP QUOTE COMPOSE APPLY > */ > > Christopher Diggins > http://www.cat-language.com > |
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)You missed the ADD instruction.
Seriously, though, I do suspect that Chuck Moore has it right on this one... Use instruction slots in a larger word size, and allow some of the slots to encode alternate instructions. In the SeaForth, you have an 18-bit word size with 3 5-bit instructions plus a trailing 3-bit instruction... The trailing instruction is very useful for the uLOOP instruction, which can only go there and which performs a 'decrement and loop if not zero' without a destination (which means it'll repeat the preceding words in the instruction).You might want a trailing _and_ a preceding slot; the preceding slot could be for literals and targetted jumps (so that the instruction and the literal fit in the same instruction slot). And so on. -Wm On Jan 11, 2008 12:45 AM, Daniel Ehrenberg <microdan@...> wrote: > Don't you think basic arithmetical operators would be useful, if this > might eventually be actual and not theoretical hardware? If you only > have and, shift and xor, maybe it's *possible* to implement arithmetic > (I personally don't know how) but it wouldn't be as efficient. There's > no real point in restricting yourself to 16 operations, especially > when some of these operations will encode a literal and some of them > won't. > > Dan > > > On Jan 10, 2008 2:12 PM, Christopher Diggins <cdiggins@...> wrote: > > > > > > > > > > > > > > Hi all, > > > > Here is my most recent 16 instruction set for concatenative hardware. > > Any comments would be greatly appreciated. > > > > /* > > * Released into the public domain by Christopher Diggins > > * Contact me if you have special licensing requirements > > * http://www.cdiggins.com > > */ > > > > enum > > { > > /* Push a literal value from the instruction stream */ > > LIT = 0, /* -> a */ > > > > /* Add top two values on the stack */ > > ADD = 1, /* a b -> (a + b) */ > > > > /* Shift second value on the stack left or right */ > > SHIFT = 2, /* a b -> b < 32 ? (a >> b) : (a << (b - 32)) */ > > > > /* Perform exclusive or of top two values */ > > XOR = 3, /* a b -> (a ^ b) */ > > > > /* Perfrom bitwise and of top two values */ > > AND = 4, /* a b -> (a & b) */ > > > > /* Dynamic function/list construction */ > > /* Note: lists and functions are identical from a programmer's perspective, > > a list is the result of a function that consumes nothing. Think of it > > as a new stack yielded by executing the function, except that it is > > lazy (evaluated at last minute) */ > > PAPPLY = 5, /* a [X] -> [a X] */ > > COMPOSE = 6, /* [X] [Y] -> [X Y] */ > > > > /* Branching and calling */ > > APPLYIF = 7, /* bool [X] -> X or NOOP */ > > > > /* Reference creation and manipulation, used for playing with heap */ > > MAKEREF = 8, /* a -> &a */ > > GETREF = 9, /* &a -> a */ > > SETREF = 10, /* &a b -> &(*a = b) */ > > > > /* Stack manipulation operations */ > > FLIP = 11, /* a X b n:int -> b X a ; sizeof(X) == n */ > > GETN = 12, /* a X n:int -> (a X, a) ; sizeof(X) == n */ > > SETN = 13, /* (a X n:int, b) -> b X ; sizeof(X) == n */ > > > > /* Applies a function to a list */ > > APPPLYTO = 14, /* [->X] [X->Y] => [Y] */ > > > > /* Make a suggestion ( I was thinking of COUNT, but I > > am not so sure ) */ > > UNUSED = 15 > > } > > opcodes; > > > > /* > > Some examples of derived opcodes: > > > > SWAP = 0 FLIPN > > DUP = 0 GETN > > POPD = 0 SETN > > POP = SWAP POPD > > QUOTE = [] CONS > > APPLY = TRUE SWAP APPLYIF > > CONS = SWAP QUOTE PAPPLY > > UNCONS = APPLY SWAP > > DIP = SWAP QUOTE COMPOSE APPLY > > */ > > > > Christopher Diggins > > http://www.cat-language.com > > > > > > Yahoo! Groups Links > > > > |
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)On Jan 11, 2008 9:57 AM, William Tanksley, Jr <wtanksleyjr@...> wrote:
> > Seriously, though, I do suspect that Chuck Moore has it right on this > one... Why? > Use instruction slots in a larger word size, and allow some of > the slots to encode alternate instructions. In the SeaForth, you have > an 18-bit word size with 3 5-bit instructions plus a trailing 3-bit > instruction... The trailing instruction is very useful for the uLOOP > instruction, which can only go there and which performs a 'decrement > and loop if not zero' without a destination (which means it'll repeat > the preceding words in the instruction). Sure but that is just one instruction. Overall, having 16 instructions, means a much smaller chip with very very small power overhead. I'm not an expert in the domain of course, but I would like to know more about the trade-offs. > You might want a trailing > _and_ a preceding slot; the preceding slot could be for literals and > targetted jumps (so that the instruction and the literal fit in the > same instruction slot). I don't understand what this means, and why. > And so on. Sorry that I don't fully understand. - Christopher |
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)On Jan 11, 2008 12:45 AM, Daniel Ehrenberg <microdan@...> wrote:
> > Don't you think basic arithmetical operators would be useful, if this > might eventually be actual and not theoretical hardware? If you only > have and, shift and xor, maybe it's *possible* to implement arithmetic > (I personally don't know how) but it wouldn't be as efficient. You overlooked add. With that we can perform all arithmetical operations without too much problems. Of course, this is not appropriate for a PC CPU, but rather a very small embedded device. > There's > no real point in restricting yourself to 16 operations, especially > when some of these operations will encode a literal and some of them > won't. I am looking at here the possibilities of implementing minimal hardware that is cheap, small, and has very little energy consumption. However, to be completely honest, the more I experiment with the instruction set, the more I think that I would prefer to work with a 255-instruction set (ala JVML). - Christopher |
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)Christopher Diggins <cdiggins@...> wrote:
> William Tanksley, Jr <wtanksleyjr@...> wrote: > > Seriously, though, I do suspect that Chuck Moore has it right on this > > one... > Why? > > Use instruction slots in a larger word size, and allow some of > > the slots to encode alternate instructions. This is what Chuck does. The reason I think he's got it right is that 16 instructions just isn't enough for a reasonable programming platform, especially if you include specialized instructions (such as your list manipulators). 32 instructions is PLENTY -- you'll wind up with spares. And the specialized slots allow you to include instructions whose only purpose is making decoding easy. Right now you've got 4-bit instructions; I assume a 32-bit word size. That means you load 8 instructions with every fetch. That's a _lot_ of instructions that can run without any need for memory access... If you change to 5-bit instructions you can fit 6 instructions, plus 2 bits for something else. 6 instructions is still an awful lot, and now you can include more useful instructions which will make your operations much more powerful. Use those two leftover bits to allow 4 different ways of interpreting the loaded instruction word: 00=literal (just push it on the stack and load the next); 01=literal jump; 10=microloop; 11=literal call. (Or whatever encoding and instructions you want.) Microloop is amazingly powerful, and changes the way your instruction set works. With it, you should consider adding instructions that use iteration... For example, Chuck didn't have room on his chip for a multiplier, but he did include a 'multiply step' instruction, which with the uLOOP capability makes multiplication very simple. I don't know what your chip might need... Perhaps a MergeSortStep instruction? > > In the SeaForth, you have > > an 18-bit word size with 3 5-bit instructions plus a trailing 3-bit > > instruction... The trailing instruction is very useful for the uLOOP > > instruction, which can only go there and which performs a 'decrement > > and loop if not zero' without a destination (which means it'll repeat > > the preceding words in the instruction). > Sure but that is just one instruction. Overall, having 16 > instructions, means a much smaller chip with very very small power > overhead. I'm not an expert in the domain of course, but I would like > to know more about the trade-offs. 4 bits versus 5 bits is just what it looks like -- a 20% space savings; that adds up FAST. The problem is that if the 16 instructions you choose exclude commonly used ones, your programmers will not only be annoyed, their programs will wind up taking MORE space, since they have to use 2 or 3 opcodes to implement what could have been done with 1 5-bit opcode. You're in an experimental design space, attempting to produce a high-level microarchitecture (i.e. with instructions for a high-level task, list manipulation). I think you'd be better served by underconstraining your problem space: loosen up the constraint on the "smallest possible instruction set" in order to give yourself elbow room for "highest level instruction set". Don't give up on "smallest possible"; just produce a working draft that is a bit big, and then see if you can make it smaller later. > - Christopher \-Wm |
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)Christopher Diggins <cdiggins@...> wrote:
> Daniel Ehrenberg <microdan@...> wrote: > > There's > > no real point in restricting yourself to 16 operations, especially > > when some of these operations will encode a literal and some of them > > won't. > I am looking at here the possibilities of implementing minimal > hardware that is cheap, small, and has very little energy consumption. A fascinating goal, and one which to my knowledge has not been attempted together with a high-level instruction set. > However, to be completely honest, the more I experiment with the > instruction set, the more I think that I would prefer to work with a > 255-instruction set (ala JVML). Completely incompatible with the goal of minimal hardware -- and do you actually have anything close to 256 opcodes? Can you imagine building all the hardware to support each one and muxing/demuxing them together? Wouldn't retreating to 5-bit instructions make more sense initially? > - Christopher -Wm |
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)On Jan 11, 2008 11:35 AM, William Tanksley, Jr <wtanksleyjr@...> wrote:
> A fascinating goal, and one which to my knowledge has not been > attempted together with a high-level instruction set. I think the distinction between "high-level" and "low-level" is artificial, and rooted to firmly on the naivete of our predecessors. If we ignore it, we can come up with something new and interesting. > > However, to be completely honest, the more I experiment with the > > instruction set, the more I think that I would prefer to work with a > > 255-instruction set (ala JVML). > > Completely incompatible with the goal of minimal hardware Not at all, it depends on what you do with these instructions. I should clarify that I really mean "1-byte" operators. The core instructions would be few, and many of the bits would be flags. Consider: dup pop swap ... and then with a bit flag to represent "dip" dupd == [dup] dip popd == [dup] dip swap == [swap] dip ... and another bit flag to represent "dup" ddup = dup dup dpop = dup pop dswap = dup swap ... and another bit flag to represent "swap" swdup = swap dup swpop = swap pop ... Hopefully this illustrates what I have in mind. > -- and do > you actually have anything close to 256 opcodes? It's quite easy to come up with a whole bunch of opcodes. > Can you imagine > building all the hardware to support each one and muxing/demuxing them > together? Well yes. I don't think it neccessarily has to be complex. > Wouldn't retreating to 5-bit instructions make more sense > initially? The simplicity of the design has less to do with the number of instructions, than it does with the organization and kind of instructions. 4-bit instructions is an interesting challenge, because it is so darn small. 5-bit instructions has already been done and I believe would be hard to port to other architectures efficiently. 8-bit instructions is easy to port and emulate. I consider emulation very important, because a practical virtual machine would be run on many different kinds of chips and software. - Christopher |
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)> The problem is that if the 16 instructions
> you choose exclude commonly used ones, your programmers will not only > be annoyed, I'm not really designing the thing to be implemented by hand, I would expect the programmer to use a compiler. > their programs will wind up taking MORE space, since they > have to use 2 or 3 opcodes to implement what could have been done with > 1 5-bit opcode. That line of reasoning isn't accurate. Just because you have fewer opcodes doesn't mean you will neccessarily take more space. It really has to do with how well balanced the instruction set is for the problem domain. One isn't a clear win over the other. Of course, what I propose is probably naive enough (because I haven't performed any practical experimentation) so as not to be very space efficient at this point. - Christopher |
|
|
|
|
|
Re: Concatenative Hardware Redux (Another 16 Instruction Set)Christopher Diggins <cdiggins@...> wrote:
> William Tanksley, Jr <wtanksleyjr@...> wrote: > > A fascinating goal, and one which to my knowledge has not been > > attempted together with a high-level instruction set. > I think the distinction between "high-level" and "low-level" is > artificial, and rooted to firmly on the naivete of our predecessors. > If we ignore it, we can come up with something new and interesting. Well then, "special purpose" versus "general purpose". Either way, IMO interesting. > > > However, to be completely honest, the more I experiment with the > > > instruction set, the more I think that I would prefer to work with a > > > 255-instruction set (ala JVML). > > Completely incompatible with the goal of minimal hardware > Not at all, it depends on what you do with these instructions. I > should clarify that I really mean "1-byte" operators. The core > instructions would be few, and many of the bits would be flags. Ah, this is how I handled the 8-bit instructions I was required to use during my processor design class (UCSD). I encoded only a few fields in the instruction word, and the rest I ran as raw input directly to the needed components of the processor. Saved a few transistors, at the cost of vastly larger code size (but I was competing against students designing accumulator machines, so they had the same code size problems). > Consider: > and then with a bit flag to represent "dip" > and another bit flag to represent "dup" > and another bit flag to represent "swap" > Hopefully this illustrates what I have in mind. I think it does. I'm sure you haven't settled on any meanings for bits yet, and those are just the first examples that pop to mind. (Note that deep stack effects are a _very_ bad idea, so being able to easily add dups will make your instructions VERY slow. In fact, these are _very_ bad choices, because it's not clear how those appendage instructions will be ordered... If I turn on both the dip AND swap bit, which one happens first?) But I'm sure you'll put the effort needed into designing this when and if you decide to. > > -- and do > > you actually have anything close to 256 opcodes? > It's quite easy to come up with a whole bunch of opcodes. No doubt. Useful and coherent ones? That's another matter. > > Wouldn't retreating to 5-bit instructions make more sense > > initially? > The simplicity of the design has less to do with the number of > instructions, than it does with the organization and kind of > instructions. 4-bit instructions is an interesting challenge, because > it is so darn small. 5-bit instructions has already been done and I Again, I don't recommend taking on 2 "interesting challenges" in the same problem. Small instruction set is cool; high-level instructions is cool. Both together? Too many constraints. Solve one, THEN solve the other. > believe would be hard to port to other architectures efficiently. I don't understand the "hard to port" objection at all. Clarify? How can a microarchitecture need to be ported to another architecture? > 8-bit instructions is easy to port and emulate. I consider emulation > very important, because a practical virtual machine would be run on > many different kinds of chips and software. Well, that's certainly an important design constraint. Software can handle muxing a LOT better than hardware can (of course, software has a MUCH harder time of handling predecoded instructions). 8-bit VM instructions make a lot more sense than 8-bit hardware instructions. > - Christopher -Wm |
| Free embeddable forum powered by Nabble | Forum Help |