BASE_PURESIZE

BASE_PURESIZE

by Eli Zaretskii

Isn't the current definition of BASE_PURESIZE too large?

  #define BASE_PURESIZE (1430000 + SYSTEM_PURESIZE_EXTRA + SITELOAD_PURESIZE_EXTRA)

I looked at the values of pure_size vs pure_bytes_used in several
builds on several platforms, and I see that we are wasting at least
130KB:

    MS-Windows:
    (gdb) p pure_size
    $1 = 1480000
    (gdb) p pure_bytes_used
    $2 = 1357888

    64-bit GNU/Linux (--without-x):
    (gdb) p pure_size
    $1 = 2383333
    (gdb) p pure_bytes_used
    $2 = 2015813

    64-bit GNU/Linux (with X):
    (gdb) p pure_size
    $1 = 2383333
    (gdb) p pure_bytes_used
    $2 = 2193049

    MS-DOS:
    (gdb) p pure_size
    $1 = 1440000
    (gdb) p pure_bytes_used
    $2 = 1275442

GNU/Linux without-X is the extreme example: it wastes 370KB.

How about reducing the 1430000 number above?
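
A minimal sketch of the arithmetic behind these figures, assuming the unused space is simply pure_size minus pure_bytes_used as printed by gdb. The helper names pure_slack and report_slack are hypothetical and are not part of the Emacs sources. By this calculation the MS-Windows build leaves 122112 bytes unused and the 64-bit --without-x build 367520 bytes.

```c
#include <stdio.h>

/* Unused pure space in bytes: what was reserved minus what the dump used.
   Hypothetical helper, not an Emacs function.  */
long pure_slack(long pure_size, long pure_bytes_used)
{
    return pure_size - pure_bytes_used;
}

/* Print the slack for one build, using figures read off with gdb.  */
void report_slack(const char *build, long pure_size, long pure_bytes_used)
{
    printf("%-32s %ld bytes unused\n", build,
           pure_slack(pure_size, pure_bytes_used));
}
```

For example, report_slack("MS-DOS", 1440000, 1275442) would report the slack for the MS-DOS build measured above.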



Re: BASE_PURESIZE

by Andreas Schwab-2

Eli Zaretskii <eliz@...> writes:

> Isn't the current definition of BASE_PURESIZE too large?

Fits quite well here (pure_size - pure_bytes_used == 79770).

Andreas.

--
Andreas Schwab, schwab@...
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: BASE_PURESIZE

by Dan Nicolaescu

Eli Zaretskii <eliz@...> writes:

  > Isn't the current definition of BASE_PURESIZE too large?
  >
  >   #define BASE_PURESIZE (1430000 + SYSTEM_PURESIZE_EXTRA + SITELOAD_PURESIZE_EXTRA)
  >
  > I looked at the values of pure_size vs pure_bytes_used in several
  > builds on several platforms, and I see that we are wasting at least
  > 130KB:
  >
  >     MS-Windows:
  >     (gdb) p pure_size
  >     $1 = 1480000
  >     (gdb) p pure_bytes_used
  >     $2 = 1357888
  >
  >     64-bit GNU/Linux (--without-x):
  >     (gdb) p pure_size
  >     $1 = 2383333
  >     (gdb) p pure_bytes_used
  >     $2 = 2015813
  >
  >     64-bit GNU/Linux (with X):
  >     (gdb) p pure_size
  >     $1 = 2383333
  >     (gdb) p pure_bytes_used
  >     $2 = 2193049
  >
  >     MS-DOS:
  >     (gdb) p pure_size
  >     $1 = 1440000
  >     (gdb) p pure_bytes_used
  >     $2 = 1275442
  >
  > GNU/Linux without-X is the extreme example: it wastes 370KB.
  >
  > How about reducing the 1430000 number above?

I have a few pending changes that will make the sizes needed just under
that, so there is no need to fiddle with it all the time.



Re: BASE_PURESIZE

by Eli Zaretskii

> From: Andreas Schwab <schwab@...>
> Date: Fri, 23 Oct 2009 13:39:37 +0200
> Cc: emacs-devel@...
>
> Eli Zaretskii <eliz@...> writes:
>
> > Isn't the current definition of BASE_PURESIZE too large?
>
> Fits quite well here (pure_size - pure_bytes_used == 79770).

What configuration is that?

Anyway, the numerical constant is not supposed to be tuned to the
largest user of pure[], that's what SYSTEM_PURESIZE_EXTRA and friends
are for.

But since Dan says he has changes in the pipe to use that up, I guess
that's okay.



Re: BASE_PURESIZE

by Juanma Barranquero

On Fri, Oct 23, 2009 at 13:00, Eli Zaretskii <eliz@...> wrote:

> I looked at the values of pure_size vs pure_bytes_used in several
> builds on several platforms, and I see that we are wasting at least
> 130KB:
>
>    MS-Windows:
>    (gdb) p pure_size
>    $1 = 1480000
>    (gdb) p pure_bytes_used
>    $2 = 1357888

With this system-configuration-options

  --with-gcc (4.4) --cflags -DENABLE_CHECKING=1 -DXASSERTS=1
  -IC:/emacs/build/include -fno-crossjumping

on Windows I get

  (gdb) p pure_size
  $1 = 1776000
  (gdb) p pure_bytes_used
  $2 = 1518217

or about 252 KiB wasted.

    Juanma



Re: BASE_PURESIZE

by Andreas Schwab-2

Eli Zaretskii <eliz@...> writes:

>> Eli Zaretskii <eliz@...> writes:
>>
>> > Isn't the current definition of BASE_PURESIZE too large?
>>
>> Fits quite well here (pure_size - pure_bytes_used == 79770).
>
> What configuration is that?

powerpc-suse-linux

> Anyway, the numerical constant is not supposed to be tuned to the
> largest user of pure[], that's what SYSTEM_PURESIZE_EXTRA and friends
> are for.

There are no system specific additions.

Andreas.

--
Andreas Schwab, schwab@...
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: BASE_PURESIZE

by Eli Zaretskii

> From: Andreas Schwab <schwab@...>
> Cc: emacs-devel@...
> Date: Fri, 23 Oct 2009 16:24:26 +0200
>
> Eli Zaretskii <eliz@...> writes:
>
> >> Eli Zaretskii <eliz@...> writes:
> >>
> >> > Isn't the current definition of BASE_PURESIZE too large?
> >>
> >> Fits quite well here (pure_size - pure_bytes_used == 79770).
> >
> > What configuration is that?
>
> powerpc-suse-linux

That's strange.  Is that a 64-bit system?  If so, do you have any
ideas why two different GNU/Linux systems, one on x86_64, the other
yours, have such significantly different pure space usage sizes?



Re: BASE_PURESIZE

by Andreas Schwab-2

Eli Zaretskii <eliz@...> writes:

>> Eli Zaretskii <eliz@...> writes:
>>
>> >> Eli Zaretskii <eliz@...> writes:
>> >>
>> >> > Isn't the current definition of BASE_PURESIZE too large?
>> >>
>> >> Fits quite well here (pure_size - pure_bytes_used == 79770).
>> >
>> > What configuration is that?
>>
>> powerpc-suse-linux
>
> That's strange.  Is that a 64-bit system?

No.

Andreas.

--
Andreas Schwab, schwab@...
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: BASE_PURESIZE

by Stephen J. Turnbull

Juanma Barranquero writes:
 > On Fri, Oct 23, 2009 at 13:00, Eli Zaretskii <eliz@...> wrote:
 >
 > > I looked at the values of pure_size vs pure_bytes_used in several
 > > builds on several platforms, and I see that we are wasting at least
 > > 130KB:
[...]
 > on Windows I get
[...]
 > or about 252 KiB wasted.

Three comments:

1.  XEmacs abandoned pure space years ago on the assumption that (bugs
    aside) copy-on-write means that dumped text will be shared anyway.
    Is that incorrect?

2.  252 KiB is not negligible, I suppose, but these days the systems
    Emacs runs on typically sport >1GB of memory, and since that's
    pure space even with multiple instances of Emacs running that is
    all that will be wasted ever.

3.  To save the space, dump twice, the second time using the precise
    number you can measure from the first try.  Unlike the Lisp
    compilation stage, this takes less than an extra minute IIRC.  If
    you still care about the extra time, make the second dump part of
    the install target.




Re: BASE_PURESIZE

by Dan Nicolaescu

"Stephen J. Turnbull" <stephen@...> writes:

  > Juanma Barranquero writes:
  >  > On Fri, Oct 23, 2009 at 13:00, Eli Zaretskii <eliz@...> wrote:
  >  >
  >  > > I looked at the values of pure_size vs pure_bytes_used in several
  >  > > builds on several platforms, and I see that we are wasting at least
  >  > > 130KB:
  > [...]
  >  > on Windows I get
  > [...]
  >  > or about 252 KiB wasted.
  >
  > Three comments:
  >
  > 1.  XEmacs abandoned pure space years ago on the assumption that (bugs
  >     aside) copy-on-write means that dumped text will be shared anyway.
  >     Is that incorrect?

Do you have generational GC? If not, all the data in the dumped image
will be GCed every time, and that means lots of pages will get written
to, so they won't be shareable.



Re: BASE_PURESIZE

by Stephen J. Turnbull

Dan Nicolaescu writes:

 >   > 1.  XEmacs abandoned pure space years ago on the assumption that (bugs
 >   >     aside) copy-on-write means that dumped text will be shared anyway.
 >   >     Is that incorrect?
 >
 > Do you have generational GC? If not, all the data in the dumped image
 > will be GCed every time, and that means lots of pages will get written
 > to, so they won't be shareable.

I don't think that's necessarily true.  I'm not sure what the
mechanism for avoiding writes to dumped objects is in XEmacs; it might
be something like setting a "permanently marked object" bit and
putting them all in the set of GC roots at dumptime.  (It's been a
while and I wasn't involved in that work at all.)  And doesn't Emacs
keep its mark bits separately, being able to do that because objects
are allocated in arrays of same-sized blocks?




Re: BASE_PURESIZE

by Eli Zaretskii

> Date: Fri, 23 Oct 2009 16:10:47 +0200
> From: Eli Zaretskii <eliz@...>
> Cc: emacs-devel@...
>
> > From: Andreas Schwab <schwab@...>
> > Date: Fri, 23 Oct 2009 13:39:37 +0200
> > Cc: emacs-devel@...
> >
> > Eli Zaretskii <eliz@...> writes:
> >
> > > Isn't the current definition of BASE_PURESIZE too large?
> >
> > Fits quite well here (pure_size - pure_bytes_used == 79770).
>
> What configuration is that?
>
> Anyway, the numerical constant is not supposed to be tuned to the
> largest user of pure[], that's what SYSTEM_PURESIZE_EXTRA and friends
> are for.
>
> But since Dan says he has changes in the pipe to use that up, I guess
> that's okay.

After Dan committed his changes that use more purecopy, and after this
change:

  2009-10-23  Andreas Schwab  <schwab@...>

          * puresize.h (PURESIZE_RATIO): Decrease to 11/7.

pure space overflows on a 64-bit GNU/Linux host, and I need to enlarge
the 1430000 constant to at least 1460000, i.e. by 30KB, to fix that.
On a 32-bit Windows, the old constant of 1430000 still works (there's
70KB of spare pure space in the dumped Emacs).  So I'm not sure if the
problem is with the ratio or with something else.

For the record, the extra use of purecopy caused the pure_bytes_used
value to go up by 52KB on 32-bit Windows, and by 92KB on 64-bit
GNU/Linux.  So it looks like the ratio is actually closer to 9/5 than
to either the old 10/6 or the new 11/7.  Or maybe I'm missing
something.
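
A minimal sketch of this ratio arithmetic, under one assumption: the ratio is applied to the base constant by truncating integer multiplication and division. That is an inference from the numbers in this thread (1430000 * 10/6 truncates to the 2383333 reported for 64-bit GNU/Linux), not a quotation of puresize.h. The function names scaled_puresize and compare_ratios are hypothetical.

```c
/* Scale a base pure size by the ratio NUM/DEN using truncating
   integer arithmetic, as an unparenthesized macro such as
   BASE * 11/7 would.  Hypothetical helper, not from puresize.h.  */
long scaled_puresize(long base, long num, long den)
{
    return base * num / den;
}

/* Return -1, 0, or 1 as a/b is less than, equal to, or greater
   than c/d.  Cross-multiplication avoids floating point; all four
   arguments must be positive.  */
int compare_ratios(long a, long b, long c, long d)
{
    long lhs = a * d;
    long rhs = c * b;
    return (lhs > rhs) - (lhs < rhs);
}
```

With these helpers, the observed growth ratio 92/52 compares as greater than both 10/6 and 11/7 and slightly less than 9/5. Under the same assumption, a base of 1460000 scaled by 11/7 gives 2294285 bytes.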



Re: BASE_PURESIZE

by Eli Zaretskii

> From: "Stephen J. Turnbull" <stephen@...>
> Date: Sat, 24 Oct 2009 13:41:50 +0900
>
> 2.  252 KiB is not negligible, I suppose, but these days the systems
>     Emacs runs on typically sport >1GB of memory, and since that's
>     pure space even with multiple instances of Emacs running that is
>     all that will be wasted ever.

First, we had changes committed lately that save much less than that
(FWIW, I support them).  The memory size is large nowadays, but it's
all taken by programs that have even larger memory footprint.

More importantly, I think the issue here is that the numbers I
presented indicate that there's something wrong with the way we
compute PURESIZE, because on some hosts it seems to be exactly right,
while on others it wastes a lot of memory.

> 3.  To save the space, dump twice, the second time using the precise
>     number you can measure from the first try.

PURESIZE is a compile-time C constant, it cannot be changed at dump
time.  You need to recompile.



Re: BASE_PURESIZE

by Andreas Schwab-2

Eli Zaretskii <eliz@...> writes:

> For the record, the extra use of purecopy caused the pure_bytes_used
> value to go up by 52KB on 32-bit Windows, and by 92KB on 64-bit
> GNU/Linux.  So it looks like the ratio is actually closer to 9/5 than
> to either the old 10/6 or the new 11/7.  Or maybe I'm missing
> something.

It all depends on the ratio of string data vs. lisp object pure storage.

Andreas.

--
Andreas Schwab, schwab@...
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Re: BASE_PURESIZE

by Stephen J. Turnbull

Eli Zaretskii writes:

 > > 3.  To save the space, dump twice, the second time using the precise
 > >     number you can measure from the first try.

 > PURESIZE is a compile-time C constant, it cannot be changed at dump
 > time.  You need to recompile.

OK, so recompile the relevant module and relink.  You may actually be
pushing 60 extra seconds on a very slow machine at this point.

If you're more worried about errors in the computation of PURESIZE or
of the amount actually used when dumping, that's another issue, of
course, and this doesn't help with that.



Re: BASE_PURESIZE

by Eli Zaretskii

> From: Andreas Schwab <schwab@...>
> Cc: emacs-devel@...
> Date: Sat, 24 Oct 2009 12:37:19 +0200
>
> Eli Zaretskii <eliz@...> writes:
>
> > For the record, the extra use of purecopy caused the pure_bytes_used
> > value to go up by 52KB on 32-bit Windows, and by 92KB on 64-bit
> > GNU/Linux.  So it looks like the ratio is actually closer to 9/5 than
> > to either the old 10/6 or the new 11/7.  Or maybe I'm missing
> > something.
>
> It all depends on the ratio of string data vs. lisp object pure storage.

I made some measurements.  The ratio of 11/7 seems to work pretty
well, but there are two additional problems:

 . The default value of SYSTEM_PURESIZE_EXTRA is zero, and is not
   increased for GUI builds.  This causes a --without-x build to waste
   some 100KB.  If we want to handle this, the basic constant in
   BASE_PURESIZE can be as low as 1290000 and SYSTEM_PURESIZE_EXTRA
   should have its default at 140000 for GUI builds, zero otherwise.

 . The amount of pure storage used by load-history depends on the
   length of the filename of the directory where Emacs is dumped.  In
   my case, I have 32 characters before the "emacs/lisp/" part, so I'm
   guessing that's the main reason the value of 1430000 was too small
   for me.

We could decide that we don't care too much about the --without-x
case, but what about the second problem?  If we want to handle it
without wasting storage on systems with shorter file names, we would
need some code in src/Makefile.in that would measure the length of the
directory name and enlarge PURESIZE accordingly.
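
A rough sketch of how such an allowance might be estimated, assuming each file name recorded in load-history stores the full directory prefix once. The function load_history_extra and its n_files parameter are hypothetical; the actual number of preloaded files and any per-string alignment overhead are not given in this thread, so the result is at best a lower bound.

```c
#include <string.h>

/* Estimate the extra pure bytes needed when every file name in
   load-history carries the directory prefix DIR.  N_FILES is a
   hypothetical count of preloaded files; alignment padding and
   string headers are ignored.  */
long load_history_extra(const char *dir, long n_files)
{
    return (long) strlen(dir) * n_files;
}
```

A build step could pass the dump directory and a file count to a helper like this and add the result to the base constant. The thread does not say how that would be wired into src/Makefile.in.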



Re: BASE_PURESIZE

by Dan Nicolaescu

Eli Zaretskii <eliz@...> writes:

  > > From: Andreas Schwab <schwab@...>
  > > Cc: emacs-devel@...
  > > Date: Sat, 24 Oct 2009 12:37:19 +0200
  > >
  > > Eli Zaretskii <eliz@...> writes:
  > >
  > > > For the record, the extra use of purecopy caused the pure_bytes_used
  > > > value to go up by 52KB on 32-bit Windows, and by 92KB on 64-bit
  > > > GNU/Linux.  So it looks like the ratio is actually closer to 9/5 than
  > > > to either the old 10/6 or the new 11/7.  Or maybe I'm missing
  > > > something.
  > >
  > > It all depends on the ratio of string data vs. lisp object pure storage.
  >
  > I made some measurements.  The ratio of 11/7 seems to work pretty
  > well, but there are two additional problems:
  >
  >  . The default value of SYSTEM_PURESIZE_EXTRA is zero, and is not
  >    increased for GUI builds.  This causes a --without-x build to waste
  >    some 100KB.  If we want to handle this, the basic constant in
  >    BASE_PURESIZE can be as low as 1290000 and SYSTEM_PURESIZE_EXTRA
  >    should have its default at 140000 for GUI builds, zero otherwise.
  >
  >  . The amount of pure storage used by load-history depends on the
  >    length of the filename of the directory where Emacs is dumped.  In
  >    my case, I have 32 characters before the "emacs/lisp/" part, so I'm
  >    guessing that's the main reason the value of 1430000 was too small
  >    for me.

We have 2 more problems with load-history: although it is purecopied in
loadup.el, something still seems to maintain references to the file
name strings.  They are still present as non-pure strings in the dumped
image, both as absolute file names and as the arguments passed to load
(see the simple patch I posted yesterday to dump strings).  So we are
still wasting memory on those.

It would be great if load-history would be constructed in pure memory
from the beginning when dumping (instead of purecopying later).
Maybe someone that understands that code could do that...

  > We could decide that we don't care too much about the --without-x

IMHO --without-x is completely unimportant.



Re: BASE_PURESIZE

by Stefan Monnier

>> . The default value of SYSTEM_PURESIZE_EXTRA is zero, and is not
>> increased for GUI builds.  This causes a --without-x build to waste
>> some 100KB.  If we want to handle this, the basic constant in
>> BASE_PURESIZE can be as low as 1290000 and SYSTEM_PURESIZE_EXTRA
>> should have its default at 140000 for GUI builds, zero otherwise.

I'm not terribly concerned about wasting 100KB in pure space (or even
more than that): it's space we don't ever touch or look at, so it's
never brought into RAM.  IOW it's very cheap.
OTOH checking what really goes into purespace and why something doesn't
is good, because moving data into purespace is good.


        Stefan


PS: Regarding the recent purecopy of autoload's file names: why not just
purecopy the whole autoload list?


--- src/eval.c 2009-10-24 12:32:03 +0000
+++ src/eval.c 2009-10-24 18:32:23 +0000
@@ -2112,10 +2148,7 @@
      (function, file, docstring, interactive, type)
      Lisp_Object function, file, docstring, interactive, type;
 {
-  Lisp_Object args[4];
-
   CHECK_SYMBOL (function);
-  CHECK_STRING (file);
 
   /* If function is defined and not as an autoload, don't override */
   if (!EQ (XSYMBOL (function)->function, Qunbound)
@@ -2128,15 +2161,13 @@
        not useful and else we get loads of them from the loaddefs.el.  */
     LOADHIST_ATTACH (Fcons (Qautoload, function));
 
-  if (NILP (Vpurify_flag))
-    args[0] = file;
-  else
-    args[0] = Fpurecopy (file);
-  args[1] = docstring;
-  args[2] = interactive;
-  args[3] = type;
-
-  return Ffset (function, Fcons (Qautoload, Flist (4, &args[0])));
+  if (!NILP (Vpurify_flag))
+    /* We don't want the docstring in purespace (instead,
+       Snarf-documentation should (hopefully) overwrite it).  */
+    docstring = make_number (0);
+  return Ffset (function,
+                Fpurecopy (list5 (Qautoload, file, docstring,
+                                  interactive, type)));
 }
 
 Lisp_Object




Re: BASE_PURESIZE

by Chong Yidong

Dan Nicolaescu <dann@...> writes:

>   > We could decide that we don't care too much about the --without-x
>
> IMHO --without-x is completely unimportant.

Not sure what you mean by this statement.  If you mean that wasting a
few extra kB of memory on --without-x builds isn't important, I can
agree with that.  But I don't even care about a few extra kB in X
builds, personally.



Re: BASE_PURESIZE

by Dan Nicolaescu

Chong Yidong <cyd@...> writes:

  > Dan Nicolaescu <dann@...> writes:
  >
  > >   > We could decide that we don't care too much about the --without-x
  > >
  > > IMHO --without-x is completely unimportant.
  >
  > Not sure what you mean by this statement.  If you mean that wasting a
  > few extra kB of memory on --without-x builds isn't important,

That's exactly what I mean.

