file:read with read_ahead and binaries broken

by Matthew Sackman-2

dd if=/dev/urandom of=/tmp/file.rnd bs=1M count=20

test(Hdl) ->
    test(Hdl, []).

test(Hdl, Acc) ->
    case file:read(Hdl, 1) of
        {ok, <<Num:1/binary>>} -> {ok, _Pos} = file:position(Hdl, {cur, 1}),
                                  test(Hdl, [Num|Acc]);
        eof -> Acc
    end.

1> f(), {ok, Hdl} = file:open("/tmp/file.rnd", [read, read_ahead, binary, raw]),
  X = test:test(Hdl), ok = file:close(Hdl).

Erlang will die. Badly. erlang:memory() shows that of the 4GB erlang
has claimed before I kill it, 3.9GB of that is binary data.

Ways to stop this going nuts:
1) Don't use read_ahead
2) Remove the position call - instead, read 2 bytes and skip the second
3) Add any random term, say 'foo' to the Acc, rather than Num.
4) Have Num as an int, not a binary.
5) Do the following:
        {ok, <<Num:8>>} -> {ok, _Pos} = file:position(Hdl, {cur, 1}),
                           <<Num2:1/binary>> = <<Num:8>>,
                           test(Hdl, [Num2|Acc]);

My guess is that what's happening is that the read is reading in a whole
disk page (as it should), Num is a pointer into the start of that page,
but the rest of the page beyond the first byte, isn't reclaimed. Then the
position seemingly invalidates the entire page. This is confirmed by the
fact that strace -f -c -p $PID shows the same number of calls to read in
both the read_ahead and non read_ahead versions. Interestingly though,
there are twice as many calls to lseek in the read_ahead version.

From inspecting the size of the file itself, both the read_ahead and non
versions are really issuing a read for every single byte read, and the
read_ahead version also has the advantage of issuing twice as many
seeks.

A quick test shows this happens at least as far back as R12B5, and still
happens in R13B02.

Oh and if you follow suggestion (5), you'll find the read_ahead version
is about 8 times slower than the non read_ahead version.

Matthew

________________________________________________________________
erlang-bugs mailing list. See http://www.erlang.org/faq.html
erlang-bugs (at) erlang.org


Re: file:read with read_ahead and binaries broken

by Björn-Egil Dahlberg

Hi Matthew,

We are aware of this issue and a more aggressive gc-strategy is being
developed. This will be in place in the next release unless something
unforeseen happens.

The new strategy involves virtual heaps for binaries that will also
trigger gc:s when binary heap boundaries are reached instead of only
procbins and binary overhead counting triggers.

The new strategy will also take care of past old heap binary problems.

Regards,
Björn-Egil
Erlang/OTP



Re: file:read with read_ahead and binaries broken

by Matthew Sackman-2

Hi Björn-Egil,

Thanks for the reply, and good to know a solution is in the pipeline.
However, your solution addresses only one issue. The other issue is:
why is a read issued when the position call does not move the file
handle outside of the region currently cached by the read ahead buffer?
In truth, both the seek and read libc calls can be avoided, or at the
least, the position can be delayed until some other non-(position or
read) call, e.g. truncate or write.

Matthew



Re: file:read with read_ahead and binaries broken

by Björn-Egil Dahlberg

Yes, I did hit send a bit prematurely.

The solution I was talking about does not solve this particular problem.

What's happening here is that the driver is keeping a read_ahead buffer
which is a binary of size 64 kB (if I remember the default cache size
correctly).

Each read will generate a subbinary of the read_ahead buffer which is
kept reachable in the process by pushing the subbinary to a list in the
read-loop.

Each file:position will flush the read_ahead cache and a new binary will
be made to take its place. *repeat until eof*

Each subbinary references its parent binary and forces the gc to keep
those binaries, since they are all live data. In this example the total
memory consumption would be roughly 20M x 64K bytes / 2 ~ 640 GB (one
64 kB buffer retained for every second byte of the 20 MB file), which I
guess is not what the programmer intended. =)
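
The estimate above can be reproduced with a few lines of Erlang. This is
only a sketch of the arithmetic: the module and function names are made
up for illustration, and the 64 kB buffer size is the default as
recalled above, not a verified figure.

```erlang
-module(retained_estimate).
-export([retained_bytes/2]).

%% Rough upper bound on binary memory kept alive by the test loop.
%% FileSize:  size of the input file in bytes.
%% CacheSize: assumed size of one read_ahead buffer in bytes.
retained_bytes(FileSize, CacheSize) ->
    %% Each iteration reads one byte and skips one byte, so the loop
    %% runs once for every two bytes of the file.
    Iterations = FileSize div 2,
    %% Every iteration keeps one whole buffer reachable via its subbinary.
    Iterations * CacheSize.
```

For the 20 MB file and a 64 kB buffer,
retained_estimate:retained_bytes(20 * 1024 * 1024, 64 * 1024) gives
687194767360 bytes, i.e. 640 GB when counted in units of 2^30 bytes.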

The main problem here is that each subbinary is kept. It is aggravated
by producing a new binary cache for each read. This is of course easily
remedied by matching numbers instead of binaries. In this case using
<<N:8>> instead of <<N:1/binary>>. Also, instead of seeking, one could
read 2 bytes instead of one. Or, as you said, skip read_ahead since it
won't give any boost because of the seeks. I realize that this is not
the intent of the test though.
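
A minimal sketch combining two of those remedies, matching an integer
and reading two bytes per call instead of seeking, might look as
follows. The module name read_pairs and the function every_other_byte/1
are hypothetical; only file:open/2, file:read/2 and file:close/1 come
from the original test. Like the original, it returns the accumulator
in reverse order.

```erlang
-module(read_pairs).
-export([every_other_byte/1]).

%% Collect every other byte of the file at Path as integers.
every_other_byte(Path) ->
    {ok, Hdl} = file:open(Path, [read, read_ahead, binary, raw]),
    try
        loop(Hdl, [])
    after
        file:close(Hdl)
    end.

loop(Hdl, Acc) ->
    case file:read(Hdl, 2) of
        %% Matching N as an integer keeps no reference to the buffer;
        %% the second byte is simply dropped, so no position call is needed.
        {ok, <<N:8, _Skipped:8>>} -> loop(Hdl, [N | Acc]);
        %% Odd-sized file: the last byte has no partner to skip.
        {ok, <<N:8>>} -> [N | Acc];
        eof -> Acc
    end.
```

Because no file:position call is made, the read_ahead buffer is not
flushed between reads, and because only integers are accumulated, no
buffer is kept alive by the list.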

Is this a bug in the handling of binaries?
No, but perhaps a limitation and not the "least astonishing result".
Users must be aware of the fact that a subbinary keeps alive the whole
binary it references, and that keeping the subbinaries reachable keeps
those binaries from being gc:ed. In this case the user must also be
aware of the fact that he is receiving subbinaries from the reads. I
think that this could be clearer in the documentation.

One could argue that seeks should not always flush the cache. I fully
agree with you that this should be avoided. This is something we will
review.

One could also argue that subbinaries should be compacted. This is not
wise for the most common cases. It would kill performance and actually
bloat memory. A user can do this by himself by forcing a copy of the
subbinary. This will generate a new separate smaller binary.
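
As an illustration of forcing such a copy: the sketch below assumes an
OTP release that ships the binary module, which as far as I know is
later than the R12B/R13B releases discussed in this thread. On those
releases, rebuilding the binary as in suggestion (5) of the original
report serves the same purpose. The module name copy_subbinary and the
function demo/0 are made up for the example.

```erlang
-module(copy_subbinary).
-export([demo/0]).

demo() ->
    %% Stand-in for a 64 kB read_ahead buffer.
    Big = binary:copy(<<0>>, 65536),
    %% Sub is matched out of Big and may reference it.
    <<Sub:1/binary, _/binary>> = Big,
    %% Copy is a fresh 1-byte binary with no link to Big.
    Copy = binary:copy(Sub),
    %% Return how much data each one references, and whether the
    %% contents are still equal.
    {binary:referenced_byte_size(Sub),
     binary:referenced_byte_size(Copy),
     Sub =:= Copy}.
```

The second element of the result is 1: the copy references only its own
byte. The first element depends on how the runtime represents small
subbinaries, so it is best treated as something to inspect rather than
rely on.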

Some sort of smart automatic compacting of binaries could be done in the
gc but it is not easily implemented for a number of reasons. Several
strategies for compacting are on the table but none will be realized
before R14 at the earliest.

I hope you find this information helpful.

*hitting send*

Regards,
Björn-Egil
Erlang/OTP

