exportFn and evalStream questions


by Joe Wells

Dear SML/NJ gurus,

Would it be possible for someone to please answer 1 or more of the
following questions about exportFn and evalStream?

1. Where does exportFn discard the contents of the top-level
   environment?  I've been looking through the source code and if I
   have seen it I certainly have not recognized it.  It is clear that
   this is happening both from the documentation and also from the
   fact that subsequent uses of evalStream or useStream seem to
   complain that all identifiers are undefined.

2. Is it possible to get the same space-saving effect as exportFn
   without having to write a heap image to disk and then exec a fresh
   sml process to load the heap image?  (Another problem this would
   avoid is the fact that exportFn closes all files and clears the
   signal handler table.)  Is it enough to clear the top-level
   environment by hand and then let the garbage collector do its work?
   And does anyone have any example SML code that clears the top-level
   environment correctly?  Can doing this release memory back to the
   system?  If not, can doing this at least reduce the amount of work
   the garbage collector needs to do regularly?

3. Is it possible to preserve some portion of the top-level
   environment across an exportFn?  It would be enough to preserve the
   constructors and type names of some datatypes and built-in types,
   so that evalStream and useStream could still work for parsing, type
   checking, and pretty printing datatype values.

4. Can someone provide an example of constructing a custom environment
   for use with evalStream?  Any example would do.  I already know how
   to do something that seems equivalent to the top-level environment.
   It would be particularly nice if the example showed how to include
   just datatype constructors and type names.

Thanks for your time in considering my questions.  And extra thanks
if anyone answers even just one of my questions!

--
Joe


--
Heriot-Watt University is a Scottish charity
registered under charity number SC000278.


------------------------------------------------------------------------------
Come build with us! The BlackBerry® Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9-12, 2009. Register now!
http://p.sf.net/sfu/devconf
_______________________________________________
Smlnj-list mailing list
Smlnj-list@...
https://lists.sourceforge.net/lists/listinfo/smlnj-list

Re: exportFn and evalStream questions

by John Reppy


On Oct 5, 2009, at 7:43 AM, Joe Wells wrote:

> Dear SML/NJ gurus,
>
> Would it be possible for someone to please answer 1 or more of the
> following questions about exportFn and evalStream?

I can answer the first two.  Dave or Matthias should be able to address
the others.

>
> 1. Where does exportFn discard the contents of the top-level
>   environment?  I've been looking through the source code and if I
>   have seen it I certainly have not recognized it.  It is clear that
>   this is happening both from the documentation and also from the
>   fact that subsequent uses of evalStream or useStream seem to
>   complain that all identifiers are undefined.

The discarding of the environment is effectively done by the GC.  We
isolate the return continuation of the exported function so that it
does not refer to the top-level-loop, and, thus, the environment becomes
garbage.
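
[Editorial sketch, not part of the original message.]  The idea of
cutting a computation loose from its caller can be illustrated with
SMLofNJ.Cont.isolate, which builds a continuation that holds no
reference to the calling context.  This is only a sketch of the
general technique and not the code exportFn itself runs; the name
runIsolated is hypothetical.

```sml
(* Hypothetical helper: run `f x` in a continuation that does not
   refer back to the caller.  Once control is thrown to it, data that
   is reachable only from the caller (for example the top-level loop
   and its environment) is no longer live and can be collected.
   The throw never returns, hence the free result type 'b. *)
fun runIsolated (f : 'a -> unit) (x : 'a) : 'b =
    SMLofNJ.Cont.throw (SMLofNJ.Cont.isolate f) x
```

Because the isolated continuation has nowhere to return to, the
process is expected to end when f finishes, which matches the remark
below about not returning to the top-level loop.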

>
> 2. Is it possible to get the same space-saving effect as exportFn
>   without having to write a heap image to disk and then exec a fresh
>   sml process to load the heap image?  (Another problem this would
>   avoid is the fact that exportFn closes all files and clears the
>   signal handler table.)  Is it enough to clear the top-level
>   environment by hand and then let the garbage collector do its work?
>   And does anyone have any example SML code that clears the top-level
>   environment correctly?  Can doing this release memory back to the
>   system?  If not, can doing this at least reduce the amount of work
>   the garbage collector needs to do regularly?

Yes, as long as you don't want to return to the top-level loop.  See
the code in

        base/system/Basis/Implementation/NJ/export.sml

>
> 3. Is it possible to preserve some portion of the top-level
>   environment across an exportFn?  It would be enough to preserve the
>   constructors and type names of some datatypes and built-in types,
>   so that evalStream and useStream could still work for parsing, type
>   checking, and pretty printing datatype values.

>
> 4. Can someone provide an example of constructing a custom environment
>   for use with evalStream?  Any example would do.  I already know how
>   to do something that seems equivalent to the top-level environment.
>   It would be particularly nice if the example showed how to include
>   just datatype constructors and type names.
>
> Thanks for your time in considering my questions.  And extra thanks
> if anyone answers even just one of my questions!
>
> --  
> Joe

Re: exportFn and evalStream questions

by Joe Wells

John Reppy <jhr@...> writes:

> On Oct 5, 2009, at 7:43 AM, Joe Wells wrote:
>
>> Dear SML/NJ gurus,
>>
>> Would it be possible for someone to please answer 1 or more of the
>> following questions about exportFn and evalStream?
>
> I can answer the first two.

Thanks!

> Dave or Matthias should be able to address the others.

That would be lovely, but I want you to know that I am very grateful
for even the answers you have already given me.

>> 1. Where does exportFn discard the contents of the top-level
>>   environment?  I've been looking through the source code and if I
>>   have seen it I certainly have not recognized it.  It is clear that
>>   this is happening both from the documentation and also from the
>>   fact that subsequent uses of evalStream or useStream seem to
>>   complain that all identifiers are undefined.
>
> The discarding of the environment is effectively done by the GC.  We
> isolate the return continuation of the exported function so that it
> does not refer to the top-level-loop, and, thus, the environment becomes
> garbage.

Thanks very much for this answer.

I now realize that I was confused about something.  My best guess as
to the cause of my confusion is that I must have left out a vital
semicolon between two top-level declarations when I was testing
things, and somehow thought the missing definitions during a call to
evalStream were caused by something deliberately removing them from
the top-level environment during exportFn, because the problem went
away when I switched to exportML.

The answer you gave above made me think things through again and redo
my tests.

>> 2. Is it possible to get the same space-saving effect as exportFn
>>   without having to write a heap image to disk and then exec a fresh
>>   sml process to load the heap image?  (Another problem this would
>>   avoid is the fact that exportFn closes all files and clears the
>>   signal handler table.)  Is it enough to clear the top-level
>>   environment by hand and then let the garbage collector do its work?
>>   And does anyone have any example SML code that clears the top-level
>>   environment correctly?  Can doing this release memory back to the
>>   system?  If not, can doing this at least reduce the amount of work
>>   the garbage collector needs to do regularly?
>
> Yes, as long as you don't want to return to the top-level loop.

Okay, so this is the key.  So I've now tried this.

Let me give a bit of the background.  I've been working on a system
for letting people write SML/NJ scripts.  Being able to write SML
scripts makes it reasonable to use SML in a lot more situations,
because the time overhead of setting up and debugging the scheme for
building your program is a lot smaller (basically this overhead cost
goes to zero).  This is particularly important when giving problems to
students, because they can focus on the programming and not on how
to compile and link their files and dump the heap and so on.

With my system, the “hello world” program can look like this:

  #!/directory/path/goes/here/smlnj-script
  print "Hello, world!\n"

Actually, my “hello world” program looks like this:

  #!/usr/bin/env smlnj-script
  ;(*-*-SML-*-*)
  silenceCompiler ();
  print "Hello, world!\n";

The extra things are as follows.  (1) For now, I use “env” to search
for “smlnj-script” in the directories listed in the user's PATH
environment variable, so it can be installed in various places.  (2)
The second line begins with a semicolon so that the first line is at
least syntactically valid SML and various program tools (like the
Emacs SML mode) will be less confused.  (3) The comment on the second
line tells Emacs to use SML mode, because Emacs hasn't yet been told
about which mode to use with the interpreter specified on the first
line.  (4) The invocation of silenceCompiler shuts up all uses of
Control.Print.say to suppress messages for successful declarations and
autoloading.  I don't do this by default because that also stops type
error messages.  So the idea is once the programmer is mostly happy
they will insert a call to silenceCompiler.
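
[Editorial sketch, not part of the original message.]  The definition
of silenceCompiler is not shown in this thread.  One way such a
function might be written, assuming the compiler's output goes
through the Control.Print.out ref cell, is sketched below; the names
savedOut and unsilenceCompiler are hypothetical.

```sml
(* Hypothetical sketch; not the definition used by smlnj-script.
   Control.Print.out holds the {say, flush} pair that the compiler
   uses for its messages. *)
val savedOut = ! Control.Print.out

(* Replace the printer with one that discards everything.  As noted
   above, this also hides type error messages. *)
fun silenceCompiler () =
    Control.Print.out := {say = fn _ => (), flush = fn () => ()}

(* Restore the printer that was installed when this code was loaded. *)
fun unsilenceCompiler () = Control.Print.out := savedOut
```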

Okay, so now you know the background.

The problem is that even the simplest programs use roughly 30 M bytes
of virtual memory, and have a resident set size (RSS) of somewhere
between 10 and 30 M bytes.

Before I followed your advice, the programs were using roughly 40 M
bytes of virtual memory.  So I have now saved about 10 M of virtual
memory.  In some cases, the RSS went down by half, while in other cases
it stayed the same.

As you can see though, I still have a problem, because 30 M bytes of
memory for “hello world” is way too much.

Here is a simple script that needs 33 M bytes of virtual memory and
28 M bytes of real memory:

  #!/usr/bin/env smlnj-script
  ;(*-*-SML-*-*)
  silenceCompiler ();
  print "see how much memory this is using ...\n";
  val pid = Word32.toInt (Posix.Process.pidToWord (Posix.ProcEnv.getpid ()));
  val psCmd = "ps l " ^ Int.toString pid;
  fun doPs () = OS.Process.system psCmd;
  doPs();
  print "now exiting evalLoop, see if memory has been freed ...\n";
  fun finish () = (while true do (doPs(); OS.Process.sleep (Time.fromSeconds 2)));
  finishFunction := finish;

The only non-obvious thing about this script is
that whatever function is stored by the user in the ref cell
finishFunction will be invoked by smlnj-script *after* the evalLoop
that processed the script file returns.  So the user can put the main
long-running loop of their program in finishFunction and this will
allow the garbage collector to reclaim space used by the compiler.
Here is the output I get from this:

  see how much memory this is using ...
  F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
  0  1000  5926  8535  20   0  40804 36564 -      S+   pts/7      0:00 smlnj-scrip
  now exiting evalLoop, see if memory has been freed ...
  F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
  0  1000  5926  8535  20   0  32868 28284 -      S+   pts/7      0:00 smlnj-scrip
  F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
  0  1000  5926  8535  20   0  32868 28284 -      S+   pts/7      0:00 smlnj-scrip
  F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
  0  1000  5926  8535  20   0  32868 28284 -      R+   pts/7      0:00 smlnj-scrip
  ...

As you can see, the GC was able to reclaim about 8 M bytes of memory
when the evalLoop was exited.  But the program still uses around 30 M
bytes of memory just to do this:

  while true do (doPs(); OS.Process.sleep (Time.fromSeconds 2))

Needing 30 M bytes of memory to keep around the implementation of
OS.Process.sleep, Time.fromSeconds, and OS.Process.system seems a bit
much to me.

Is there anything I can do to improve on this?  Ideally without
writing a heap file to disk, which is a bit excessive for just
implementing scripting.

For reference, I've included the relevant bits of smlnj-script below.

> See the code in
>
> base/system/Basis/Implementation/NJ/export.sml

Just curious, but what should I be seeing in this file?  I can't see
what I should be learning from it.

By the way, I am still interested in answers to my other questions
(quoted just below) because they are addressing other quite distinct
problems.

>> 3. Is it possible to preserve some portion of the top-level
>>   environment across an exportFn?  It would be enough to preserve the
>>   constructors and type names of some datatypes and built-in types,
>>   so that evalStream and useStream could still work for parsing, type
>>   checking, and pretty printing datatype values.
>
>> 4. Can someone provide an example of constructing a custom environment
>>   for use with evalStream?  Any example would do.  I already know how
>>   to do something that seems equivalent to the top-level environment.
>>   It would be particularly nice if the example showed how to include
>>   just datatype constructors and type names.
>>
>> Thanks for your time in considering my questions.  And extra thanks
>> if anyone answers even just one of my questions!
>>
>> -- Joe

To anyone who takes the time to even add a short answer, thanks very
much!

--
Joe

======================================================================
the relevant bits of the Makefile that builds smlnj-script:
======================================================================
...

smlnj-script: smlnj-script.sml
        USE_EXPORT_FN=yes sml smlnj-script.sml
        #sml smlnj-script.sml
        heap2exec smlnj-script.x86-linux smlnj-script
        # *** FIX: make this Makefile usable by others!  don't hard-wire os/arch!

...
======================================================================
the relevant bits of smlnj-script.sml:
======================================================================
...

(* The user can redefine finishFunction to get something to be
   evaluated after useStream is done.  The point is that this allows
   the compiler to be garbage collected so long-running scripts do not
   have to tie down huge chunks of memory. *)
val finishFunction = ref (fn () => ());

fun finish () =
    ((* invoking the GC with less than the maximum generation seems to
        have no effect *)
     SMLofNJ.Internals.GC.doGC (valOf Int.maxInt);
     (* invoking the GC with maximum generation just once blows up
        memory usage enormously until the maximum generation is
        collected again, so we do it twice.  arrgh! *)
     SMLofNJ.Internals.GC.doGC (valOf Int.maxInt);
     (* Provided we dumped with exportFn, at this point on my machine
        the script will be using about 10 M bytes less virtual memory.
        The resident set size will sometimes have decreased. *)
     ! finishFunction ();
     OS.Process.exit OS.Process.success)

(* no semicolon here deliberately to avoid adding to top-level environment *)

(* We define here a separate function suitable for use with exportFn
   so that we can easily switch the implementation between exportML and
   exportFn. *)
fun main (_, _ (* ignore this which should be a copy of SMLofNJ.getArgs () *)) =
    let (* *** give nice error message if there is not at least 1 argument. *)
        (* CommandLine.arguments does the same as SMLofNJ.getArgs. *)
        val f = TextIO.openIn (hd (SMLofNJ.getArgs ()))
    in SMLofNJ.shiftArgs ();
       (* skip over #! line *)
       (* *** check that #! line is of correct form *)
       TextIO.inputLine f;
       (* *** find way to fix error messages to mention correct file name and line number! *)
       Backend.Interact.useStream f;
       TextIO.closeIn f;
       (* tail call, hopefully freeing the closure for this function (main)! *)
       finish ()
    end

(* no semicolon here deliberately to avoid adding to top-level environment *)

val scriptInterpreterFile = "smlnj-script"

(* no semicolon here deliberately to avoid adding to top-level environment *)

(* This allows the Makefile to decide how to dump the heap. *)
val useExportML = not (isSome (OS.Process.getEnv "USE_EXPORT_FN"))

(* no semicolon here deliberately to avoid adding to top-level environment *)

val () = if useExportML
         then if (print "dumping with exportML\n";
                  SMLofNJ.exportML scriptInterpreterFile)
              then ignore (main (SMLofNJ.getCmdName (), SMLofNJ.getArgs ()))
              else OS.Process.exit OS.Process.success
         else (print "dumping with exportFn\n";
               SMLofNJ.exportFn (scriptInterpreterFile, main))
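
[Editorial sketch, not part of the original message.]  The comments
in main marked "***" mention giving a nice error message when no
script file is supplied.  One possible shape for that check is
sketched below; the helper name openScript and the wording of the
messages are hypothetical.

```sml
(* Hypothetical helper: open the script named by the first argument,
   or print a message on stderr and exit with a failure status. *)
fun openScript () : TextIO.instream =
    case SMLofNJ.getArgs () of
        [] =>
        (TextIO.output (TextIO.stdErr,
                        "usage: smlnj-script FILE [ARGS...]\n");
         OS.Process.exit OS.Process.failure)
      | path :: _ =>
        (TextIO.openIn path
         handle IO.Io {name, ...} =>
                (TextIO.output (TextIO.stdErr,
                                "smlnj-script: cannot open " ^ name ^ "\n");
                 OS.Process.exit OS.Process.failure))
```

With this in place, the binding of f in main could become
"val f = openScript ()", leaving the rest of main unchanged.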

