Adventures with Building Applications

by David Goehrig

Hello All,

I have a project that has been in development for several years now,
that supports a number of web applications.  The project's customer base
ranges from multinationals to startups, and has pretty much paid my
bills for the past 5 years.  The code base has been ported from C++, to
OCaml, to C.  It has used Perl, Python, Ruby, Lua, and Javascript as
embedded interpreters.  The latest version of the project consists of 7k
lines of C, and a cut down version of Mozilla's SpiderMonkey (which
itself consists of 86k lines of C).

This web application server includes support for HTTP/1.1 (client &
server), SSL, SMTP (client & server), SMPP (client), an IRC-style
protocol, and SQL through PostgreSQL.  Unlike most web servers, it has a
large number of resident "bots" that process requests, chat to users,
and do general housekeeping.  All of the functionality is currently
accessible through Javascript and doesn't require any special training
beyond what a general Flash programmer knows.

But time doesn't stand still, and a recent project for a large
multinational put serious strain on the system when the number of users
was an entire order of magnitude greater than the original project
proposal!  The biggest stumbling block has been the 86k lines of C code
that run the Javascript engine, the same one found in Firefox.  After
reading through Webkit, and the new Tamarin engine from Adobe (with its
250k lines of C++ code & Javascript JIT), I pretty much gave up on using
any of the existing engines.  So I have begun the 4th major rewrite of
the system.  Over the past few months, I've been evaluating a wide
range of technologies.  I've played with customizing several Smalltalk
VMs, written PEG based language translators in several different
languages, prototyped versions in the mainline scripting languages, and
created 4 custom VMs in C and Intel assembler.  But ultimately, I went
back and read through cmForth.blk, and the source listings for
colorforth and got some inspiration.

The latest VM has a simple opcode set consisting of the characters:

    !"#$%&'()*+,-./0123456789:;<=>?@[\]^_`abcdef{|}~

And for inspiration, this VM ran opcodes that were very similar to
Chuck's colorforth VM with a few minor modifications.  Registers were
allocated as follows:

; %edx - top of stack  (doubles as the B register)
; %eax - next on stack
; %ecx - counter/utility
; %ebx - instruction pointer (now a utility register)
; %esi - data stack pointer
; %edi - memory register ( the A register )
; %esp - return stack pointer
; %ebp - free space pointer (aka here)

And the VM opcodes did slightly different things than they do in Forth,
but there always were equivalents.  For example, while ! is xor, $ does a
!a+.  Also, some words were repurposed:

[ pushes ebp, and starts compiling
] stops compiling
( immediately switches to interpreter context, and compiles
) evaluates the interpreter context
{ pushes ebp
} compiles a 0 cell
: takes the next word in the input buffer, and binds it to the value on
the top of the stack.

Also, any word not found in the dictionary just leaves the address of
the token on the stack.  These changes made a Forth-like language for
the VM very simple, and the implementation of the VM and bytecode
interpreter was under 500 lines of assembler.  The trick was that each
opcode was aligned on a 32 byte boundary, and simply vectored to the
byte [ebx + 8000] address.  My favorite opcodes are 0-9a-f, which
multiply the TOS by [base] and add the values they represent.
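
The digit opcodes can be modelled in a few lines of JavaScript.  This is
only a sketch of the multiply-by-base-and-add rule described above, not
the VM's assembler: the names makeVm, digitOp and parseLiteral are
hypothetical, and the convention that a literal starts from a 0 already
on top of the stack is an assumption made for the example.

```javascript
// Hypothetical model of the digit opcodes 0-9 and a-f: each one replaces
// the top of stack (TOS) with TOS * base + the digit's value.
function makeVm(base) {
  return { base: base, stack: [] };
}

// Value represented by a digit opcode character, or -1 if it is not one.
function digitValue(ch) {
  return '0123456789abcdef'.indexOf(ch);
}

// Execute one digit opcode against the VM's data stack.
function digitOp(vm, ch) {
  var tos = vm.stack.pop();
  vm.stack.push(tos * vm.base + digitValue(ch));
}

// Assumption for this sketch: a literal starts from a 0 pushed on the stack,
// and each character of the literal is then run as an opcode.
function parseLiteral(vm, text) {
  vm.stack.push(0);
  for (var i = 0; i < text.length; i++) {
    digitOp(vm, text[i]);
  }
  return vm.stack[vm.stack.length - 1];
}
```

With base set to 16, running the two opcodes f f leaves 255 on the
stack; with base 10, the opcodes 1 2 3 4 leave 1234.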

One of the biggest changes from Forth is :.  Unlike traditional colon
definitions, a definition in the VM's Forth-like language looks like:

[ _ * ] : square

Where [ started compiling, pushing the address of the code on the stack,
_ compiles a dup, * compiles a multiplication, and ] turns off
compilation; : then bound the value pushed on the stack by [ to the word
square.  And everything else works pretty much like you'd expect.  The
other thing that makes life really kinda neat was:

{ 0 : foo , "narf" : bar , }

With a minor tweak to : so that it means , when between { }, this
produced data structures that look a lot like JSON, only with value : key
notation rather than key : value.  At this point, I thought that I was
almost done with the rewrite: I'd simply build a parser that reordered the
Javascript into a postfix notation, and run it through the VM, and
presto I'd be able to support most of the code the developers I've
worked with have written over the years.
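
To make the "reorder into postfix" step concrete, here is a minimal
sketch for simple arithmetic expressions.  The post does not say how the
parser works; this example swaps in the classic shunting-yard algorithm
as a stand-in, handles only + - * / and parentheses, and the names
tokenize and toPostfix are hypothetical.

```javascript
// Operator precedence for the small expression subset handled here.
var PRECEDENCE = { '+': 1, '-': 1, '*': 2, '/': 2 };

function isOperator(tok) {
  return Object.prototype.hasOwnProperty.call(PRECEDENCE, tok);
}

// Split an expression into identifiers, numbers, operators and parentheses.
function tokenize(src) {
  return src.match(/[A-Za-z_]\w*|\d+|[-+*\/()]/g) || [];
}

// Reorder infix tokens into postfix with the shunting-yard algorithm.
// All operators in this subset are left-associative.
function toPostfix(src) {
  var output = [];
  var ops = [];
  tokenize(src).forEach(function (tok) {
    if (isOperator(tok)) {
      // Flush operators that bind at least as tightly as this one.
      while (ops.length > 0 && isOperator(ops[ops.length - 1]) &&
             PRECEDENCE[ops[ops.length - 1]] >= PRECEDENCE[tok]) {
        output.push(ops.pop());
      }
      ops.push(tok);
    } else if (tok === '(') {
      ops.push(tok);
    } else if (tok === ')') {
      while (ops.length > 0 && ops[ops.length - 1] !== '(') {
        output.push(ops.pop());
      }
      ops.pop(); // discard the matching '('
    } else {
      output.push(tok); // number or identifier goes straight to the output
    }
  });
  while (ops.length > 0) {
    output.push(ops.pop());
  }
  return output.join(' ');
}
```

For example, toPostfix('a + b * c') gives 'a b c * +', which is the
order a stack machine wants.  Real Javascript needs far more than this
(statements, calls, property access), so treat it as an illustration of
the idea only.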

But why stop there! I've got a VM that looks a lot like Chuck's VM, and
I know that he's got an optimizing native compiler for his code, so why
not remove the whole redundant bytecode step?  So I kept the byte codes,
changed the definition of next to just be ret, added a quick lookup
table of the lengths of each VM instruction, and rewrote the _compile
block to just copy VM instructions inline, compile literals as a dup; mov
edx, 1234 and all dictionary calls as call instructions.  With that, I had
switched my VM from a bytecode interpreter to a native compiler!  A few more
tweaks, and in 630 lines of assembler, I now have a native compiler that
optimizes tail calls, optimizes out some stack juggling combinations,
and can still switch between interpret and compile modes (literally,
compile & run vs. just compile).
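
The compile step described above can be sketched as a toy JavaScript
function that emits a listing instead of machine code.  Everything here
is illustrative: the PRIMITIVES table, the placeholder instruction names
(dup, mul, add) and the function compileWord are assumptions, not the
actual 630 lines of assembler.  It shows the three cases (primitives
copied inline, literals as dup; mov edx, N, dictionary words as calls)
plus the tail-call rewrite.

```javascript
// Hypothetical primitive table: opcode character -> instructions to inline.
// The instruction names are placeholders, not real encodings.
var PRIMITIVES = {
  '_': ['dup'],
  '*': ['mul'],
  '+': ['add']
};

// Compile a list of tokens into a flat instruction listing.
function compileWord(tokens, dictionary) {
  var code = [];
  tokens.forEach(function (tok) {
    if (PRIMITIVES.hasOwnProperty(tok)) {
      // Primitive: copy its instructions inline.
      Array.prototype.push.apply(code, PRIMITIVES[tok]);
    } else if (/^\d+$/.test(tok)) {
      // Literal: save the old top of stack, then load the value into edx.
      code.push('dup', 'mov edx, ' + tok);
    } else if (dictionary.hasOwnProperty(tok)) {
      // Dictionary word: emit a call.
      code.push('call ' + tok);
    } else {
      // Simplification: the VM in the post instead leaves the address of
      // the unknown token on the stack.
      throw new Error('unknown word: ' + tok);
    }
  });
  // Tail-call optimization: a final call followed by ret becomes a jump.
  var last = code[code.length - 1];
  if (last !== undefined && last.indexOf('call ') === 0) {
    code[code.length - 1] = 'jmp ' + last.slice(5);
  } else {
    code.push('ret');
  }
  return code;
}
```

Compiling the body of square, compileWord(['_', '*'], dict), gives
dup, mul, ret.  Once square is in the dictionary, compiling 3 square
gives dup, mov edx, 3, jmp square: the trailing call has turned into
a jump.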

And that's where it stands today.  I am currently rewriting my
javascript -> forth parser, in javascript, and playing around with a
colorforth -> javascript compiler in javascript as well.  Colorforth in
your browser?  Sure why not?  Your browser in colorforth?  Sure why
not?  Chuck was right when he said a browser was simple, it just needs
to be done :)  By implementing javascript in forth, and forth in
javascript, either can run anywhere.

Going forward, I fully expect to have a Forth/Javascript engine
powering version 4 of this web application server project.  I am
currently planning on rewriting the 7K lines of C in a combination of
Forth and Javascript (which compiles to Forth so ultimately it is Forth,
but my web guys don't need to know that), and expect to have the entire
code base reduced to under 4k lines of code (down from 94k!).

In the future, since I've got the code to boot the VM from disk, I'm
also planning on writing an Ethernet device driver for the couple of cards
I use, and a TCP/IP stack, so I can run this server on bare hardware.
With the availability of virtualization software like VMWare,
VirtualBox, QEMU, etc, I can easily see a migration path from
FreeBSD/Linux/MacOS X to plain bare hardware.

Looking back at this development process, I could have just used the
Colorforth 2.0a to implement the VM, but without the equivalent of Jay
Melvin's shadow block listings of cmforth for the first 18 blocks of CF,
I wasn't comfortable implementing it on top of so much black magic.  
Building a native javascript compiler is hard enough without knowing how
the internals of the compiler work :)

Anyways, I hope to do a postmortem for this project in a few months.  
The source code and executables will most likely be available by the end
of the year under some form of open source license.  I hope this
inspires some of you to dust off your CF images, and build some apps.

Dave Goehrig

--

David J. Goehrig


Email: dave@...


---------------------------------------------------------------------
To unsubscribe, e-mail: colorforth-unsubscribe@...
For additional commands, e-mail: colorforth-help@...
Main web page - http://www.colorforth.com


Re: Adventures with Building Applications

by vaded

Off topic: Ray, thanks for all your posts to the wiki/blog.

Mr. Goehrig -- your project sounds very interesting.  I hope it succeeds
excellently.  What I am curious about is, seeing how you realize the
power of colorForth, have you ever looked at alternative ways to do
what you want?  What I really mean is as long as there is this
dependency on Javascript then that defeats the purpose of something like
colorForth.  Javascript becomes the weakest link and once you have to
parameter things down to the level of Javascript you lost all the
benefits of Forth.

Your thoughts?


On Tue, 19 Aug 2008 13:21:28 -0400, "David J. Goehrig"
<dave@...> said:

> [...]


Re: Adventures with Building Applications

by David Goehrig

vaded@... wrote:
> What I really mean is as long as there is this dependency on
> Javascript then that defeats the purpose of something like colorForth.
> Javascript becomes the weakest link and once you have to parameter
> things down to the level of Javascript you lost all the benefits of
> Forth.

The word dependency is kinda misleading.  There are two directions:
server side JS => Forth, and client side Forth => JS.  On the server
side, the old code is written in JS and so is dependent on JS in that it
is written in it.  Hence translating JS => Forth!  On the client side,
well the client has a JS environment, so Forth => JS is the easiest way
to get Forth on the client's system, so it is dependent on what the
client has available.

As for looking for alternatives, like getting a Forth based browser on
everyone's PC and phone, that's essentially impossible.  A lot of our
applications have cellphone clients, and I do a lot of contract
engineering for companies to port existing apps to things like
Qualcomm's BREW.  You'd never get your colorforth app through NSTL,
simple as that.  Apple won't put it on the iPhone either.  On the other
hand, those phones all have web browsers with some Javascript support.

Javascript is everywhere, on the desktop, on the phone, and on the web.  
To view a viable client platform that can literally reach billions of
people as "the weakest link" kinda misses the point.  It is also the one
programming language for which everyone with a browser can write
software, and that nearly everyone can run.  On the client side, it is
simply the best target language.  It has become the universal assembly
language of the web.

On the server side, I can do whatever I want.  That's where I can
leverage the benefits of Forth.  I can produce tight code that does what
it needs, and export that functionality to the JS users.  The vast
majority of the people who program web UIs know Javascript, but don't
know Forth.  Enabling them to write backend processing in JS and then
behind the scenes implementing it in Forth means they remain happy and
productive, and most importantly lets them invest in perfecting 1 skill,
JS programming, rather than trying to perfect 2 different skills, JS +
Forth.  I reap the benefit of having more time to work on the Forth side
of things, and my clients get their projects delivered on time.  But
that said, we still have a large base of legacy JS code that we have to
support, so there's real incentive to port all of that to Forth.

When you really get down to it, Javascript is just a bunch of C syntax,
tacked on top of Self, which is itself just yet another LISP variant
with a prototypal idiom for using assoc lists as objects. There's about
five words you have to implement in Forth to handle the entire object
model for Javascript, (new, get, set, apply, delegate).  Nothing hard to
mimic in Forth. While the programmer writing JS code can't take full
advantage of the Forth engine, he at least can be as productive as he
can be with the tools that he knows.
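
As a rough illustration of how small that object model is, here are the
five words written as plain JavaScript functions over association lists.
The word names come from the list above, but the signatures and the
slots/proto layout are assumptions for this sketch, and new is spelled
newObject because new is a reserved word in Javascript.

```javascript
// An object is an association list (a Map here) plus a link to the
// object it delegates to.
function newObject(proto) {
  return { slots: new Map(), proto: proto || null };
}

// Store a value under a key on the object itself.
function set(obj, key, value) {
  obj.slots.set(key, value);
  return obj;
}

// Look a key up on the object, then along its delegation chain.
function get(obj, key) {
  for (var o = obj; o !== null; o = o.proto) {
    if (o.slots.has(key)) {
      return o.slots.get(key);
    }
  }
  return undefined;
}

// Change which object failed lookups fall back to.
function delegate(obj, proto) {
  obj.proto = proto;
  return obj;
}

// Call the function stored under `key` with `obj` as the receiver.
function apply(obj, key, args) {
  var fn = get(obj, key);
  if (typeof fn !== 'function') {
    throw new TypeError(key + ' is not a function');
  }
  return fn.apply(obj, args || []);
}

// Usage: p has its own y and finds x and norm1 through its prototype.
var point = newObject(null);
set(point, 'x', 3);
set(point, 'norm1', function () {
  return Math.abs(get(this, 'x')) + Math.abs(get(this, 'y'));
});
var p = set(newObject(point), 'y', -4);
var result = apply(p, 'norm1'); // 3 + 4
```

In a Forth implementation each of these would be one word operating on
the stack; the point is only that prototype lookup is a short loop over
linked association lists.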

From my point of view, the benefits of Forth are:

1.) I can quickly build a system that emulates the functionality of the
old, and be able to run the old apps
2.) The resulting system will be substantially faster than, support more
users than,  and be more maintainable than the old
3.) Gives me a migration path to bare hardware in the event I need still
more power

The benefits of supporting Javascript are:

1.) I can continue to hire people who are Javascript experts, who can
contribute their best effort
2.) I don't have to rewrite all the applications for existing clients
(at great expense and no additional gain)
3.) It is everywhere, and as a target language, forth that generates
javascript gives me the best of both worlds.

Hope that makes sense.

Dave
---

David J. Goehrig


Email: dave@...



Re: Adventures with Building Applications

by vaded

Thank you for the thoughtful and detailed post.  Your explanations made
sense and given the context you're working within I see how JS is
necessary.

I have the luxury of not having to meet the needs of clients or having
to fit my projects within the available software infrastructures.  Thus my
thinking becomes absolutist, but I appreciate and understand others are
working within different parameters.

I'm glad you find Forth fitting enough to what you're doing that it
makes sense to implement it and add it to the mix.  I'm also grateful
you shared your experience on this list; our numbers here are small so
anytime someone shares some of their experience I find it very
interesting.

 



On Tue, 19 Aug 2008 16:37:20 -0400, "David J. Goehrig"
<dave@...> said:

> [...]


Re: Adventures with Building Applications

by David Goehrig

vaded@... wrote:
> Thank you for the thoughtful and detailed post.

You're welcome.

> I'm also grateful
> you shared your experience on this list; our numbers here are small so
> anytime someone shares some of their experience I find it very
> interesting.

I am hoping that by implementing a version of CF in javascript as well,
I can help pull in some converts by making the language more
accessible.  I just whipped up a simple colored editor this morning, and
am now converting it to use the canvas widget so that it has bitmapped
graphics support just like the native one.  Forth faces many of the same
acceptance problems as Smalltalk.  Both are older languages, with a rich
history of successes and failures, and buck the mainstream when it comes
to development style.  Smalltalk on the other hand is having a
resurgence due to projects like Squeak, Seaside, and gaining some
visibility due to projects like Sun's Lively Kernel:

http://research.sun.com/projects/lively/

And if you look at things like VPRI's research, and goal to develop a
full software stack in 20k lines of code,

http://vpri.org/

You have a bunch of very smart people who are trying to basically
reinvent what Forth has long done, only coming at it from the world of
late bound languages with extensible grammars.  Most of their time is
spent trying to come up with new ways to out-clever the problem, by late
binding everything, effectively pushing edit and compile time into
runtime.  A smart aleck would simply hand them one of Chuck's forths on
Chuck's hardware and say "Here, you're done".  But that would just ruin
their fun :)

Dave

--

David J. Goehrig


Email: dave@...



Re: Adventures with Building Applications

by Ray St. Marie

On Tue, Aug 19, 2008 at 1:20 PM,  <vaded@...> wrote:
> Off topic: Ray, thanks for all your posts to the wiki/blog.
>

You are very velcome Vaded. I'm enjoying doing it, and the more I come
up with things to discuss, or post, the more ideas follow.


Your colorForth content provider and friend. :-)
Raystm2 in #c4th irc.Freenode.net
Ray

--
Raymond St. Marie ii,
public E-mail Ray.stmarie@...
a quickstart guide http://colorforthray.info
Community Blog http://colorForth.net
Community Wiki http://ForthWorks.com/c4th
