SPASM - a MPASM behave alike

SPASM - a MPASM behave alike

by Holger Rapp

Hi,

I spent some time hacking on a PIC assembler implementation in pure
Python. You can find my efforts here:
https://code.launchpad.net/spasm

I'd really like opinions from you guys. You did a terrific job with
gpasm, which helped me a lot. But I needed complete macro support and
#defines with arguments.

The feature list and installation instructions for Python newbies follow.

Feature list:
- Support for all chips that gpasm supports
- Support for #defines with arguments
- Full support of Macro definitions
- Full support of #v(val) substitutions

Not implemented:
- Support for EEPROM device programming. EEPROM8s might work, EEPROM16s
   almost certainly not
- Support for LIST file generation
- Support for relocatable output
- 18* support is very sparse, since there weren't many test cases in
   Microchip's test suite. Things like config A=1 are not supported but
   could easily be added.

How to install:
- have Python installed
- get the PLY library (under Mac OS/Linux try $ easy_install ply;
   python-ply package in Debian/Ubuntu). You need a recent version (>3.0)
- get bzr (a versioning tool like svn; also try $ easy_install bzr;
   package bzr in Debian/Ubuntu)
- get spasm: $ bzr get lp:spasm
- $ cd spasm
- $ python setup.py build (this might take a while)
- $ ./spasm.py <your assembler file>


I won't have time to continue development in the near future, so I am
releasing it as is in the hope that someone will continue it. I will of
course provide any help I can.

Cheers,
Holger



Re: SPASM - a MPASM behave alike

by David Barnett-3

Holger,

Haven't run the program itself, but from the description it sounds nice.
I've been trying for a while to split out gpasm's preprocessor from the
assembler (to support several of the features you've implemented), but I
kept running into quirks of lex and yacc and getting stuck (e.g. there's
some kind of magic to having multiple parsers in one project).

A lot of people want the portability of pure C, so it's good that gputils
exists. However, I think with MCU development you want to do everything you
can on the host and get a small, optimized binary, and it seems like there's
only so far you can go down that path implementing the assembler in C before
it gets completely unmaintainable. I'm a Python guy myself, and I'm hoping
that using Python will help lift some of those burdens that have held
gputils back.

I'm excited to see some progress, at any rate. Thanks for pitching in!

David

On Sat, Jun 6, 2009 at 8:51 AM, Holger Rapp <HolgerRapp@...> wrote:

> [...]

Re: SPASM - a MPASM behave alike

by Holger Rapp

Hello David,

> Haven't run the program itself, but from the description it sounds nice.
> I've been trying for a while to split out gpasm's preprocessor from the
> assembler (to support several of the features you've implemented), but I
> kept running into quirks of lex and yacc and getting stuck (e.g. there's
> some kind of magic to having multiple parsers in one project).
Oohh, it's the same with PLY (the Python lex/yacc library I was using). It is
quite troublesome to keep partial parsers around (such as parsers for
expressions), so I used a number of workarounds and hacks. Also, the MPASM
language is not very regular; it is (nearly?) impossible to build an
unambiguous parser.

Nevertheless, I'm glad you like my efforts. As I mentioned, this can only be
seen as a first step. The assembler is not usable in its current state, but
all the missing features are non-crucial from an implementation perspective;
that is, all language features are implemented in the parser/lexer. Adding new
features should be quite an enjoyable task right now. But features need to be
added before you can write code for 18* devices; I am quite confident that 16*
devices work pretty well, since virtually all test cases in the Microchip test
set are for those devices.

The whole development was test-driven (which was easy with so many tests).
I think this was the only reason I managed to rework the parser/lexer several
times whenever I learned of another MPASM language quirk/feature (my favorite
by far is string literals: data "A" and data "A"*1 yield different byte code.
Pretty! ;) ). Whoever picks up the development should be willing to also write
test cases to keep the code healthy.

SPASM also does not run over the code twice as MPASM and gpasm do; instead I
implemented backtracking for opcodes with undefined jump labels. I am not sure
that this works in all cases, but if it does not, implementing a second pass
over the code is doable, though not too easy, I guess.
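
To illustrate the single-pass idea described above, here is a minimal Python
sketch of patching forward jump targets after the fact. It is not SPASM's
actual code: the two-instruction toy language, the `assemble` function and the
`GOTO`/`NOP` encodings are assumptions made up for this example.

```python
GOTO = 0x2800  # assumed base encoding of a toy "goto"; target goes in the low bits
NOP = 0x0000   # assumed encoding of a toy "nop"


def assemble(lines):
    """Assemble in a single pass, patching forward references at the end."""
    words = []    # emitted machine words, index == address
    labels = {}   # label name -> address
    fixups = []   # (address of the word to patch, label name)

    for line in lines:
        text = line.split(";")[0].rstrip()      # strip comments
        if not text.strip():
            continue
        if not text[0].isspace():               # column 1: a label definition
            labels[text.split()[0]] = len(words)
            continue
        opcode, *operands = text.split()
        if opcode == "nop":
            words.append(NOP)
        elif opcode == "goto":
            target = operands[0]
            if target in labels:                # backward jump: address known
                words.append(GOTO | labels[target])
            else:                               # forward jump: patch later
                fixups.append((len(words), target))
                words.append(GOTO)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

    for address, name in fixups:                # resolve what was left open
        if name not in labels:
            raise ValueError(f"undefined label {name!r}")
        words[address] |= labels[name]
    return words


program = ["start", "    goto done", "    nop", "    goto start", "done", "    nop"]
print([f"{word:04X}" for word in assemble(program)])
# prints ['2803', '0000', '2800', '0000']
```

This only works as long as the size of every instruction is known when it is
emitted; anything whose size depends on a not-yet-defined symbol would need a
real second pass.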

> A lot of people want the portability of pure C, so it's good that gputils
> exists. However, I think with MCU development you want to do everything you
> can on the host and get a small, optimized binary, and it seems like there's
> only so far you can go down that path implementing the assembler in C before
> it gets completely unmaintainable. I'm a Python guy myself, and I'm hoping
> that using Python will help lift some of those burdens that have held
> gputils back.
Is there a FOSS compiler solution for PICs with gplink as the linker? I wasn't
aware of one.

Cheers,
Holger



---------------------------------------------------------------------
To unsubscribe, e-mail: gnupic-unsubscribe@...
For additional commands, e-mail: gnupic-help@...


Re: SPASM - a MPASM behave alike

by Peter Keller

On Sat, Jun 06, 2009 at 05:03:57PM -0400, David Barnett wrote:
> Haven't run the program itself, but from the description it sounds nice.
> I've been trying for a while to split out gpasm's preprocessor from the
> assembler (to support several of the features you've implemented), but I
> kept running into quirks of lex and yacc and getting stuck (e.g. there's
> some kind of magic to having multiple parsers in one project).

Is there a formal lexical & grammar specification for the assembly
dialect? And, in a sense, why bother having a preprocessor phase
at all? Just lex and parse the entirety of the language, macros and
all into an AST, and then transform the AST into another AST with the
"preprocessing" steps applied. With modern computers, so what if
two, ten, or a hundred passes are done on the in-memory AST?
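
A minimal Python sketch of what one such AST-to-AST pass could look like,
assuming macro calls have already been parsed into nodes of their own. The
`Instruction` and `MacroCall` classes and `expand_macros` are hypothetical
names for illustration; they are not part of gpasm or SPASM.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    opcode: str
    operands: list


@dataclass
class MacroCall:
    name: str
    arguments: list


def expand_macros(nodes, macros):
    """One pass: replace each MacroCall node with the instructions of its body."""
    result = []
    for node in nodes:
        if isinstance(node, MacroCall):
            parameters, body = macros[node.name]
            binding = dict(zip(parameters, node.arguments))
            for template in body:
                # Substitute macro parameters, leave other operands untouched.
                operands = [binding.get(op, op) for op in template.operands]
                result.append(Instruction(template.opcode, operands))
        else:
            result.append(node)
    return result


macros = {
    "load": (["reg", "value"],
             [Instruction("movlw", ["value"]), Instruction("movwf", ["reg"])]),
}
program = [MacroCall("load", ["PORTB", "0x55"]), Instruction("nop", [])]
expanded = expand_macros(program, macros)
```

The sketch assumes macro bodies contain only plain instructions; nested macro
calls would need the pass to be repeated until no `MacroCall` nodes remain.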

Thank you.

-pete



Re: SPASM - a MPASM behave alike

by David Barnett-3

On Sat, Jun 6, 2009 at 7:36 PM, Holger Rapp <HolgerRapp@...> wrote:

> SPASM is also not running over the code twice as MPASM and gpasm do;

Does MPASM work that way, too? I thought it was just a quirk of gputils.
Anyway, I wouldn't worry too much about that point unless you actually run
into problems.


> Is there a FOSS compiler solution for PICs with gplink as linker? I wasn't
> aware
> of this.

Yeah, SDCC uses gpasm and gplink. But I was talking about C as the language
gputils is written in, not the PIC projects themselves.

David

Re: SPASM - a MPASM behave alike

by David Barnett-3

On Sat, Jun 6, 2009 at 7:58 PM, Peter Keller <psilord@...> wrote:

> Is there a formal lexical & grammar specification for the assembly
> dialect? And, in a sense, why bother having a preprocessor phase
> at all? Just lex and parse the entirety of the language, macros and
> all into an AST, and then transform the AST into another AST with the
> "preprocessing" steps applied.

Macros and directives are tricky because they work in terms of text
substitution. So while you could go directly to an AST, you'd probably need
to dip back into a textual representation for several things, and in most
cases that would defeat the purpose of going straight to an AST. For
instance, in PIC assembler you can do #v substitution in the middle of
symbol names, or you can #define multiple arguments at once (e.g. "PORTB,
2"), so before the preprocessor stage the "syntax" is very loose and doesn't
make a good AST, IMO.
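
A small Python sketch of the "PORTB, 2" case, to show why the shape of a line
is only known after textual substitution. The helper names `expand_defines`
and `operands_of` and the `LED` define are assumptions for illustration, and
the word-matching rule is a simplification of what a real assembler does.

```python
import re


def expand_defines(line, defines):
    """Replace each whole word that names a #define with its replacement text."""
    return re.sub(r"[A-Za-z_]\w*",
                  lambda match: defines.get(match.group(0), match.group(0)),
                  line)


def operands_of(line):
    """Split an instruction line into its comma-separated operands."""
    parts = line.split(None, 1)
    if len(parts) < 2:
        return []
    return [operand.strip() for operand in parts[1].split(",")]


defines = {"LED": "PORTB, 2"}               # as if from: #define LED PORTB, 2
source = "    bsf LED"
expanded = expand_defines(source, defines)

print(operands_of(source))     # prints ['LED']
print(operands_of(expanded))   # prints ['PORTB', '2']
```

Before substitution the instruction appears to have one operand, afterwards it
has two, so a tree built from the unexpanded text would have the wrong arity.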

The problem with gpasm's "two pass" system is that it's sloppy about how it
does the substitutions, and it does too much in the lexer. That makes it
nearly impossible to do some things we need. For instance, if the
indentation is wrong in the assembler syntax, gpasm gets completely tripped
up and gives very strange errors because the lexer has to assume that
anything in column 1 is a directive or label, without checking whether it
corresponds to an opcode instead.
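
As a rough illustration of the check described above, the following Python
sketch looks a column-1 word up in an opcode table before deciding that it is
a label. The tables are tiny illustrative subsets and `classify_first_word` is
a hypothetical helper, not gpasm's lexer; it expects a non-blank line without
comments.

```python
OPCODES = {"movlw", "movwf", "goto", "nop"}      # illustrative subset
DIRECTIVES = {"org", "equ", "end"}               # illustrative subset


def classify_first_word(line):
    """Return (kind, word, warning) for the first word on a source line."""
    word = line.split()[0]
    in_column_one = not line[0].isspace()
    name = word.lower()
    if name in OPCODES:
        # An opcode stays an opcode even when the indentation is wrong.
        warning = "opcode found in column 1" if in_column_one else None
        return ("opcode", word, warning)
    if name in DIRECTIVES:
        return ("directive", word, None)
    if in_column_one:
        return ("label", word, None)
    return ("unknown", word, f"{word!r} is not a known opcode or directive")


print(classify_first_word("movlw 0x10"))
# prints ('opcode', 'movlw', 'opcode found in column 1')
print(classify_first_word("loop"))
# prints ('label', 'loop', None)
```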

David

Re: SPASM - a MPASM behave alike

by Peter Keller

On Sat, Jun 06, 2009 at 08:20:57PM -0400, David Barnett wrote:
> The problem with gpasm's "two pass" system is that it's sloppy about how it
> does the substitutions, and it does too much in the lexer. That makes it
> nearly impossible to do some things we need. For instance, if the
> indentation is wrong in the assembler syntax, gpasm gets completely tripped
> up and gives very strange errors because the lexer has to assume that
> anything in column 1 is a directive or label, without checking whether it
> corresponds to an opcode instead.

What would the challenges be in writing a completely standalone
preprocessor which just emits the processed assembly to be then fed
into gpasm?

-pete



Re: SPASM - a MPASM behave alike

by Xiaofan Chen

On Sun, Jun 7, 2009 at 2:57 PM, Peter Keller<psilord@...> wrote:

> On Sat, Jun 06, 2009 at 08:20:57PM -0400, David Barnett wrote:
>> The problem with gpasm's "two pass" system is that it's sloppy about how it
>> does the substitutions, and it does too much in the lexer. That makes it
>> nearly impossible to do some things we need. For instance, if the
>> indentation is wrong in the assembler syntax, gpasm gets completely tripped
>> up and gives very strange errors because the lexer has to assume that
>> anything in column 1 is a directive or label, without checking whether it
>> corresponds to an opcode instead.
>
> What would the challenges be in writing a completely standalone
> preprocessor which just emits the processed assembly to be then fed
> into gpasm?
>

One example is here. It is a very comprehensive assembler
environment in its own right, with a large number of macros.
http://www.embedinc.com/pic/

It has aspic_fix, the formatter.
http://www.embedinc.com/pic/aspic_fix.txt.htm

It also has the preprocessor for the macros.
http://www.embedinc.com/pic/prepic.txt.htm

All the source code (in Pascal) is here:
http://www.embedinc.com/pic/dload.htm
http://www.embedinc.com/pic/install_public_source.exe

--
Xiaofan http://mcuee.blogspot.com



Re: SPASM - a MPASM behave alike

by Tamas Rudnai

On Sun, Jun 7, 2009 at 7:57 AM, Peter Keller <psilord@...> wrote:

> What would the challenges be in writing a completely standalone
> preprocessor which just emits the processed assembly to be then fed
> into gpasm?
>

Why write the #define / #include style preprocessor at all? This
should be done by existing tools like cpp instead of reinventing the
wheel. The "other preprocessor" used by MPASM, which deals with the
equ/macro style, could be written as a separate preprocessor, which would
make the remaining compiler fairly trivial in my opinion.

Is there a linker implemented in this SPASM with the MPASM style linker
scripts?

Tamas
--
http://www.mcuhobby.com

Re: SPASM - a MPASM behave alike

by Holger Rapp

Hello,

On 07.06.2009 at 12:05, Tamas Rudnai wrote:

> On Sun, Jun 7, 2009 at 7:57 AM, Peter Keller <psilord@...> wrote:
>
>> What would the challenges be in writing a completely standalone
>> preprocessor which just emits the processed assembly to be then fed
>> into gpasm?
>>
>
> Why write the #define / #include style preprocessor at all? This
> should be done by existing tools like cpp instead of reinventing the
> wheel. The "other preprocessor" used by MPASM, which deals with the
> equ/macro style, could be written as a separate preprocessor, which would
> make the remaining compiler fairly trivial in my opinion.
I do not want to sound rude, but I think many people are not really aware of
what they are talking about when it comes to MPASM (the language, not the
program). #include, #define and so on are not defined as preprocessor
directives; they are part of the language definition. Using a preprocessor
like CPP is out of the question, for example for this reason:

        #define B 10

        A = 3 * B       ; this is not a preprocessor variable; it must evaluate
                        ; expressions and know about the current radix (for example)

        while A > 0
                #if A > 2
                        data A
                #endif
                A--
        endw

Label#v(A*10+1) ; this will become Label1

The "preprocessor" must know a lot about the current assembly state, for
example the current radix or the minimal and maximal values. It is therefore
not a preprocessor in the C sense; it is more an integrated component of the
assembler.
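
To make the dependence on assembler state concrete, here is a minimal Python
sketch of #v(expr) substitution driven by a symbol table. It is an assumption
of how such a step could look, not SPASM's implementation: `substitute_v` and
`evaluate` are hypothetical names, only + - * are supported, nested
parentheses are not handled, and numbers are read as decimal instead of
honouring the current radix.

```python
import ast
import operator
import re

V_CALL = re.compile(r"#v\(([^()]*)\)")   # no nested parentheses in this sketch
OPERATORS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def evaluate(expression, symbols):
    """Evaluate integers, symbol names and + - * using the symbol table."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return symbols[node.id]      # current value of an assembler variable
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            return OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression {expression!r}")
    return walk(ast.parse(expression.strip(), mode="eval").body)


def substitute_v(line, symbols):
    """Replace every #v(expr) in a line with the decimal value of expr."""
    return V_CALL.sub(lambda m: str(evaluate(m.group(1), symbols)), line)


# After the while loop in the example above, A has counted down to 0.
print(substitute_v("Label#v(A*10+1)", {"A": 0}))   # prints Label1
```

The result depends entirely on the value `A` holds at that point of the
assembly, which is why a standalone text preprocessor cannot compute it.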

SPASM has to do a lot of error checking in the preprocessor too (for example,
it is handed a list of opcode and pseudo-opcode names so that it can check for
opcodes in the first column, and so on). It also has to check because some
tokens must be interpreted from context. This can't be avoided with MPASM.
For example:

data abcdh      ; will write 0xabcd; the h suffix marks hex values
abcdh equ 0x1   ; abcdh is also a valid identifier
data abcdh      ; will write 0x1

Now what? The lexer can't decide whether it should return an identifier or a
hex-constant token. It therefore returns an "either-this-or-that" token, and
the preprocessor decides from context before handing things to the parser.
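
A minimal Python sketch of such an "either-this-or-that" token, under the
assumption that a known symbol takes precedence over the hex reading. The
token kinds and the functions `lex_word`, `resolve` and `value_of` are made up
for this example and are not SPASM's real interfaces.

```python
import re

HEX_SUFFIX = re.compile(r"[0-9a-fA-F]+[hH]")


def lex_word(word):
    """Return a (kind, text) token; words like 'abcdh' stay undecided."""
    if HEX_SUFFIX.fullmatch(word):
        # Assumption: identifiers cannot start with a digit, so 0ffh is a number.
        kind = "NUMBER" if word[0].isdigit() else "IDENT_OR_HEX"
        return (kind, word)
    return ("IDENT", word)


def resolve(token, symbols):
    """Decide an undecided token from context: a defined symbol wins."""
    kind, text = token
    if kind != "IDENT_OR_HEX":
        return token
    return ("IDENT", text) if text in symbols else ("NUMBER", text)


def value_of(token, symbols):
    """Value of a resolved token: symbol lookup or hex literal without the h."""
    kind, text = token
    return symbols[text] if kind == "IDENT" else int(text[:-1], 16)


symbols = {}
print(hex(value_of(resolve(lex_word("abcdh"), symbols), symbols)))  # prints 0xabcd
symbols["abcdh"] = 0x1                                               # abcdh equ 0x1
print(hex(value_of(resolve(lex_word("abcdh"), symbols), symbols)))  # prints 0x1
```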

<rant>
Long story short: MPASM is a very undesigned language; it is not trivial to
write a "correct" assembler for it. People who consider it "fairly trivial"
should please make sure they know what they are talking about. Most
programmers are quite lazy; we would not reinvent the wheel if there were
tools that could do our job in our environment.
</rant>

> Is there a linker implemented in this SPASM with the MPASM style linker
> scripts?
No, as mentioned, SPASM does not do relocatable code. I have never worked with
relocatable code on PICs and I doubt it is very useful to have a macro
assembler (as SPASM is) for that. That is, the only scenario I see is a C
compiler creating objects that need to be linked. You do not need MACRO, #v()
or #ifdefs for that. You would prefer a clever linker, which is a different
matter.


Cheers,
Holger




Re: SPASM - a MPASM behave alike

by David Barnett-3

On Sun, Jun 7, 2009 at 7:25 AM, Holger Rapp <HolgerRapp@...> wrote:

> I have never worked with relocatable code on PICs and I doubt it is very
> useful to have a macro assembler (as SPASM is) for that.

Yes and no. I can certainly say that I used plenty of macros and #defines in
my projects, but relocatable code does blur some lines between the linker
and preprocessor.

Still, many MPASM linkers are missing features such as dead code removal
(for unused functions) so that macros and such still have their place. And
besides, text substitutions are their own animal, and sometimes you can do
tricks with them you couldn't do in assembler logic.

I actually think relocatable code support would be the best feature to add
to SPASM next to make it widely adopted.

David

Relocatable support (Was: Re: [gnupic] SPASM - a MPASM behave alike)

by Holger Rapp

Hi,

> I actually think relocatable code support would be the best feature to add
> to SPASM next to make it widely adopted.
Well, I cannot promise anything, and help would be greatly appreciated; but I
am kind of interested in the mechanics of relocatable code. Could you provide
some insight into new-style COFF (-C option to gpasm) and old-style COFF? Is
it necessary to support both?
I found this document on the homepage:
http://gputils.sourceforge.net/51288a.pdf

But the objects gpasm creates start with 0x1240 instead of the documented
0x1234. The test files in the testsuite also start with 0x1234.

Cheers,
Holger




Re: Relocatable support (Was: Re: [gnupic] SPASM - a MPASM behave alike)

by David Barnett-3

On Sun, Jun 7, 2009 at 4:58 PM, Holger Rapp <HolgerRapp@...> wrote:

> [...] I am kind of interested in the mechanics of relocatable code. Could
> you provide some insight into new style coff (-C option to gpasm) and old
> style coff?

I'm not sure myself what *all* of the difference is, but I think there are a
couple of extra bytes in some of the fields in the new style (and 0x1240 is
the magic number for the new style, I think). The biggest reason I've seen
for supporting new-style coff is that it's what recent versions of MPASM
generate, so for the linker and gpvo it's good to support both. However, for
the assembler just supporting one or the other should be fine for now, and I
think it should be extremely easy to add the other later.
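
For what it's worth, telling the two styles apart could look like the
following Python sketch. It takes the two magic numbers mentioned in this
thread and the tentative reading that 0x1240 marks the new style; the
little-endian byte order and the `coff_style` helper are assumptions that
should be checked against the gputils structs.

```python
import struct

# Magic numbers as discussed in this thread; the old/new labels are tentative.
MAGIC_NAMES = {0x1234: "old-style COFF", 0x1240: "new-style COFF"}


def coff_style(header):
    """Classify an object file from the first two bytes of its header."""
    if len(header) < 2:
        raise ValueError("need at least two bytes")
    (magic,) = struct.unpack_from("<H", header)   # assumed little-endian
    return MAGIC_NAMES.get(magic, f"unknown magic 0x{magic:04x}")


print(coff_style(bytes([0x40, 0x12])))   # prints new-style COFF
print(coff_style(bytes([0x34, 0x12])))   # prints old-style COFF
```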

There are some structs in the gputils code that might clear some of those
binary formats up for you if you want to go looking for them.

BTW, have you seen the intelhex module for Python (
http://bialix.com/intelhex/)? The HEX format isn't too complicated, so it
might not be worth the extra dependency, but I just wondered if you'd seen
it yet...
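
To give an idea of how little is needed just to write records out, here is a
minimal Python sketch of a single Intel HEX record: byte count, 16-bit
address, record type, data and a two's-complement checksum. `hex_record` is a
hypothetical helper, not part of SPASM or the intelhex module, and it ignores
extended addressing and any PIC-specific word ordering.

```python
def hex_record(address, data, record_type=0x00):
    """Format one Intel HEX record (type 0x00 is data, 0x01 is end of file)."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if len(data) > 0xFF:
        raise ValueError("too many data bytes for one record")
    body = bytes([len(data), address >> 8, address & 0xFF, record_type]) + bytes(data)
    checksum = (-sum(body)) & 0xFF      # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


END_OF_FILE = hex_record(0, b"", record_type=0x01)
print(END_OF_FILE)   # prints :00000001FF
```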

David

Re: Relocatable support (Was: Re: [gnupic] SPASM - a MPASM behave alike)

by Holger Rapp

Hi,

On 08.06.2009 at 03:45, David Barnett wrote:

> On Sun, Jun 7, 2009 at 4:58 PM, Holger Rapp <HolgerRapp@...> wrote:
>
>> [...] I am kind of interested in the mechanics of relocatable code. Could
>> you provide some insight into new style coff (-C option to gpasm) and old
>> style coff?
>
> I'm not sure myself what *all* of the difference is, but I think there are a
> couple of extra bytes in some of the fields in the new style (and 0x1240 is
> the magic number for the new style, I think). The biggest reason I've seen
> for supporting new-style coff is that it's what recent versions of MPASM
> generate, so for the linker and gpvo it's good to support both. However, for
> the assembler just supporting one or the other should be fine for now, and I
> think it should be extremely easy to add the other later.
Thanks, that helps tremendously.

> There are some structs in the gputils code that might clear some of those
> binary formats up for you if you want to go looking for them.
I might have done that anyway ;)

> BTW, have you seen the intelhex module in python (
> http://bialix.com/intelhex/). The HEX format isn't too complicated, so it
> might not be worth the extra dependency, but I just wondered if you'd seen
> it yet...
I wasn't aware of this lib. At first glance it seems to be pretty well
written, but it seems to lack some features (mainly big-endian support) that
might be needed. However, if I had to support more hex-file functionality than
SPASM currently does (which is basically just writing them out), I would
surely consider switching to this lib.

Cheers,
Holger

