Mnesia memory, size and effects of table copy types


by Matthew Sackman-2

I'm doing some manual memory management and when memory gets tight, I'm
converting some mnesia tables from disc_copies to disc_only_copies. But
I have a few questions because what I'm seeing reported through table_info
seems odd.

ets and mnesia claim to report memory size in words. But dets reports in
bytes. When a table is put in disc_only_copies, can someone confirm it's
still words I'm getting back, and not bytes? Because the following looks
very, very fishy:

> mnesia:create_table(mytable, [{disc_copies, [node()]}]).
{atomic,ok}
> mnesia:table_info(mytable, size).
0
> mnesia:table_info(mytable, memory).
299

OK, so presumably that's the number of words in RAM. Maybe. The docs don't
actually say: it could very well be the sum of bytes on disk and words
in RAM, but let's assume it's not double counting.

> mnesia:change_table_copy_type(mytable, node(), disc_only_copies).
{atomic,ok}
> mnesia:table_info(mytable, size).                                
0
> mnesia:table_info(mytable, memory).                              
5752

Um. Ok, so maybe that's just some RAM based overhead or something. But
it's gone up quite a long way... far enough that I'm not convinced
that's words and not bytes.

> mnesia:change_table_copy_type(mytable, node(), ram_copies).      
{atomic,ok}
> mnesia:table_info(mytable, size).                          
0
> mnesia:table_info(mytable, memory).                        
299

Oh well, at least we're back. Now, let's fill the table up:

> [ mnesia:dirty_write(mytable, {mytable, N, <<0:8192>>}) || N <- lists:seq(1, 1000000) ].
[ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,
 ok,ok,ok,ok,ok,ok,ok,ok,ok,ok|...]
> mnesia:table_info(mytable, size).                                                      
1000000
> mnesia:table_info(mytable, memory).                                                    
17144943

> mnesia:change_table_copy_type(mytable, node(), disc_copies).                            
{atomic,ok}
> mnesia:table_info(mytable, size).                                                      
1000000
> mnesia:table_info(mytable, memory).                        
17144943

Still good... but then:

> mnesia:change_table_copy_type(mytable, node(), disc_only_copies).
{atomic,ok}
> mnesia:table_info(mytable, size).                                
1000000
> mnesia:table_info(mytable, memory).                              
1929666056

Oh goodie. 1929666056 / 17144943 = 112.6. So going to disk has made it
MUCH bigger. I am less than convinced by these numbers.

> size(term_to_binary({mytable, 1000000, <<0:8192>>})) * 1000000.
1047000000

So, given that's at least of the same order of magnitude, I really suspect
that the number returned by table_info when in disc_only_copies is in
*bytes*, not words.

> round ((size(term_to_binary({mytable, 1000000, <<0:8192>>})) * 1000000) / erlang:system_info(wordsize)).
130875000

The minimum number of words is some 7 times bigger than the amount of
memory reported when using ets. However, I can well believe this
assuming we don't really have 1e6 copies of that binary.
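
Dividing the reported figures by the number of records makes the mismatch easier to see. This is only a shell sketch of the arithmetic, assuming the 8-byte word size implied by the two results above (1047000000 / 130875000 = 8):

```erlang
%% Erlang shell expressions; results are shown in the trailing comments.
%% Assumption: 8-byte words, as implied by 1047000000 / 130875000.

%% disc_only_copies figure per record, read as bytes.
1929666056 div 1000000.        %% 1929

%% ets-backed figure per record, read as words and converted to bytes.
(17144943 * 8) div 1000000.    %% 137

%% External-format size of a single record, in bytes.
size(term_to_binary({mytable, 1000000, <<0:8192>>})).   %% 1047
```

Roughly 137 bytes per record is well below the 1024-byte payload of each record, whereas roughly 1930 bytes per record is above the 1047-byte external form, so the two figures cannot both be measuring the same thing in the same unit.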

So, from this, I deduce that calling table_info(Tab, memory) reports:
a) if ets is being used, the number of words of RAM used by the table
b) if ets isn't being used, the number of bytes of the table on disk

Is this right?
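
If that deduction holds, a caller who wants one unit has to branch on the storage type. A minimal sketch follows; the module name tab_mem and the function bytes/1 are hypothetical, and the unit rule it encodes is the assumption stated above, not something taken from the documentation:

```erlang
-module(tab_mem).
-export([bytes/1]).

%% Hypothetical helper: approximate size of Tab in bytes.
%% Assumption: table_info(Tab, memory) is in words for ets-backed tables
%% (ram_copies, disc_copies) and in bytes for disc_only_copies tables.
bytes(Tab) ->
    Mem = mnesia:table_info(Tab, memory),
    case mnesia:table_info(Tab, storage_type) of
        disc_only_copies ->
            %% Already a byte count (size of the dets file).
            Mem;
        _RamOrDiscCopies ->
            %% Word count; convert using the emulator's word size.
            Mem * erlang:system_info(wordsize)
    end.
```

Note that the two branches still measure different things: RAM used by the ets table in one case, and the size of the file on disk in the other.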

Cheers,

Matthew

________________________________________________________________
erlang-questions mailing list. See http://www.erlang.org/faq.html
erlang-questions (at) erlang.org


Re: Mnesia memory, size and effects of table copy types

by wde

By reading the mnesia source code, I found this call chain:

table_info/2 ->
raw_table_info/2 ->
if storage type == disc_only_copies ->
dets:info/2 ->
dets_utils:position/3 ->  
file:position/2 ->

This returns a file offset, counted in bytes.
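
One way to cross-check this from the shell is to compare the reported value with the size of the table's file on disk. This is a sketch only; it assumes the table is currently disc_only_copies on this node and that its dets file is named mytable.DAT inside the mnesia directory, which is an assumption about the file layout rather than something stated in this thread:

```erlang
%% Shell sketch; mytable must be disc_only_copies on the local node.
%% Assumption: the dets file lives at <mnesia directory>/mytable.DAT.
Dir = mnesia:system_info(directory).
File = filename:join(Dir, atom_to_list(mytable) ++ ".DAT").

%% Size of the file on disk, in bytes (0 if the file does not exist).
OnDisk = filelib:file_size(File).

%% Value reported by mnesia for the same table.
Reported = mnesia:table_info(mytable, memory).

{OnDisk, Reported}.
```

If the two numbers agree, or nearly agree, that supports the reading that the value is a byte offset into the file.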




 

= = = = = = = = = ========= = = = = = = = = = =
                       
wde
wde@...
02/07/2009


Re: Mnesia memory, size and effects of table copy types

by Matthew Sackman-2

On Thu, Jul 02, 2009 at 02:15:39PM +0200, wde wrote:

> by reading the mnesia source code i found this :
>
> table_info/2 ->
> raw_table_info/2 ->
> if storage type == disc_only_copies ->
> dets:info/2 ->
> dets_utils:position/3 ->  
> file:position/2 ->
>
> return offset counted in bytes  

Yeah, that's pretty much what I found from some quick grepping through.
This surely is a bug, then: units should not change depending on table
type. Also, the documentation should probably reinforce that this is the
number of words (if we must, but why on earth were words chosen and not
bytes?) used by the table content, regardless of whether that content
is on disk or not. I.e. it's quite sensible to think that when using
disc_only_copies, memory should return the amount of RAM used and thus
be small and reasonably constant. Now in my case, I definitely want the
size of the dets table, but that's not the only interpretation, and the
documentation should be clearer.

Matthew



Re: Mnesia memory, size and effects of table copy types

by Paulo Sérgio Almeida

Hi,

Matthew Sackman wrote:

>> [ mnesia:dirty_write(mytable, {mytable, N, <<0:8192>>}) || N <- lists:seq(1, 1000000) ].
> [ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,
>  ok,ok,ok,ok,ok,ok,ok,ok,ok,ok|...]
>> mnesia:table_info(mytable, size).                                                      
> 1000000
>> mnesia:table_info(mytable, memory).                                                    
> 17144943

You are inserting large binaries, which are not accounted for here; only
the references to them are counted.

> The minimum number of words is some 7 times bigger than the amount of
> memory reported when using ets. However, I can well believe this
> assuming we don't really have 1e6 copies of that binary.

I tested, and we really do have 1e6 copies of that binary, i.e. doing

   [ets:insert(aaa, {N, <<0:8192>>}) || N <- lists:seq(1,1000000)]

consumes much more memory than doing:

   B = <<0:8192>>,
   [ets:insert(aaa, {N, B}) || N <- lists:seq(1,1000000)].

This means that binary constants have a different behaviour than other
constants. (Now there is a global pool for constants to avoid recreating
them e.g. once each function invocation.) Is this accidental or some
design decision, considering that the above is not common and that one
wants to frequently append things to binaries?
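
The comparison can be sketched from the shell as follows. Measure is a hypothetical helper fun, the table name bin_test is arbitrary, and the row count is reduced to 100000 to keep the run short. erlang:memory(binary) is a VM-wide figure, so the growth it shows is only indicative:

```erlang
%% Shell sketch. Fill is a fun that inserts rows into the given ets table.
Measure = fun(Fill) ->
    T = ets:new(bin_test, []),
    Before = erlang:memory(binary),
    Fill(T),
    After = erlang:memory(binary),
    %% ets reports words; convert to bytes for comparison.
    EtsBytes = ets:info(T, memory) * erlang:system_info(wordsize),
    ets:delete(T),
    {binary_growth, After - Before, ets_reported, EtsBytes}
end.

%% Case 1: the binary expression is evaluated again for every row.
Measure(fun(T) ->
    lists:foreach(fun(N) -> ets:insert(T, {N, <<0:8192>>}) end,
                  lists:seq(1, 100000))
end).

%% Case 2: the binary is built once and every row refers to it.
B = <<0:8192>>.
Measure(fun(T) ->
    lists:foreach(fun(N) -> ets:insert(T, {N, B}) end,
                  lists:seq(1, 100000))
end).
```

If the observation above is right, the binary growth should be much larger in the first case, while the figure reported by ets stays about the same in both, since it counts only the references.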

Regards,
Paulo
