following up on space leak

View: New views
9 Messages — Rating Filter:   Alert me  

following up on space leak

by Uwe Hollerbach-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Good evening, all, following up on my question regarding space leaks,
I seem to have stumbled across something very promising. I said I was
using this tiny function "lastOrNil" to get the last value in a list,
or the empty (scheme) list if the haskell list was empty. The uses of
it were all of the form

    lastOrNil (mapM <something> <some list>)

so I wrote a different function mapML to do this directly:

> mapML fn lst = mapMLA (List []) fn lst
>   where mapMLA r _ [] = return r
>         mapMLA ro fn (x:xs) =
>            do rn <- fn x
>               mapMLA rn fn xs

This isn't an accumulator, it's a replacer (or, if you like, the
"accumulation" is "drop the old one on the floor"), it starts out with
the scheme empty list that I want as the default, and it never even
builds the list which it'll just dump an instant later. Shazam! Memory
usage dropped by roughly an order of magnitude in my little Collatz
benchmark, and incidentally runtime improved by 25% or so as well. The
horror! :-)

Having tasted blood, I will of course be continuing to benchmark...
but not tonight.

Uwe
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: following up on space leak

by Marcin Kosiba :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Saturday 04 July 2009, Uwe Hollerbach wrote:

> Good evening, all, following up on my question regarding space leaks,
> I seem to have stumbled across something very promising. I said I was
> using this tiny function "lastOrNil" to get the last value in a list,
> or the empty (scheme) list if the haskell list was empty. The uses of
> it were all of the form
>
>     lastOrNil (mapM <something> <some list>)
>
> so I wrote a different function mapML to do this directly:
> > mapML fn lst = mapMLA (List []) fn lst
> >   where mapMLA r _ [] = return r
> >         mapMLA ro fn (x:xs) =
> >            do rn <- fn x
> >               mapMLA rn fn xs
>
> This isn't an accumulator, it's a replacer (or, if you like, the
> "accumulation" is "drop the old one on the floor"), it starts out with
> the scheme empty list that I want as the default, and it never even
> builds the list which it'll just dump an instant later. Shazam! Memory
> usage dropped by roughly an order of magnitude in my little Collatz
> benchmark, and incidentally runtime improved by 25% or so as well. The
> horror! :-)
Hi,
        IMHO expressing mapML using StateT would be a bit cleaner ;)

mapML :: (Monad m) => (a -> m List) -> [a] -> m List
mapML fn lst = execStateT mapMLAs (List [])
  where
    mapMLAs  = sequence_ $ map mapMLA lst
    mapMLA x = (lift $ fn x) >>= put

--
Marcin Kosiba


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

signature.asc (204 bytes) Download Attachment

Re: following up on space leak

by Uwe Hollerbach-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 7/4/09, Marcin Kosiba <marcin.kosiba@...> wrote:

> On Saturday 04 July 2009, Uwe Hollerbach wrote:
>> Good evening, all, following up on my question regarding space leaks,
>> I seem to have stumbled across something very promising. I said I was
>> using this tiny function "lastOrNil" to get the last value in a list,
>> or the empty (scheme) list if the haskell list was empty. The uses of
>> it were all of the form
>>
>>     lastOrNil (mapM <something> <some list>)
>>
>> so I wrote a different function mapML to do this directly:
>> > mapML fn lst = mapMLA (List []) fn lst
>> >   where mapMLA r _ [] = return r
>> >         mapMLA ro fn (x:xs) =
>> >            do rn <- fn x
>> >               mapMLA rn fn xs
>>
>> This isn't an accumulator, it's a replacer (or, if you like, the
>> "accumulation" is "drop the old one on the floor"), it starts out with
>> the scheme empty list that I want as the default, and it never even
>> builds the list which it'll just dump an instant later. Shazam! Memory
>> usage dropped by roughly an order of magnitude in my little Collatz
>> benchmark, and incidentally runtime improved by 25% or so as well. The
>> horror! :-)
>
> Hi,
> IMHO expressing mapML using StateT would be a bit cleaner ;)
>
> mapML :: (Monad m) => (a -> m List) -> [a] -> m List
> mapML fn lst = execStateT mapMLAs (List [])
>   where
>     mapMLAs  = sequence_ $ map mapMLA lst
>     mapMLA x = (lift $ fn x) >>= put
>
> --
> Marcin Kosiba

Yeah, I'm sure there are more-elegant ways to write this, I'm still
very much a beginner in haskell. I'm just very thrilled by the
reduction in memory usage!

Uwe
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: following up on space leak

by Paul L-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Previously you had lastOrNil taking m [a] as input, presumably
generated by mapM. So mapM is actually building an entire list before
it returns the argument for you to call lastOrNil. This is where you
had unexpected memory behavior.

Now you are "fusing" lastOrNil and mapM together, and instead of
building a list, you traverse it and perform monadic action along the
way. This can happen in a constant memory if the original pure list is
generated lazily.

I think the real problem you had was a mis-understanding of mapM, and
there was nothing wrong with your previous lastOrNil function. mapM
will only return a list after all monadic actions are performed, and
in doing so, it inevitably has to build the entire list along the way.

--
Regards,
Paul Liu

Yale Haskell Group
http://www.haskell.org/yale

On 7/4/09, Uwe Hollerbach <uhollerbach@...> wrote:

> Good evening, all, following up on my question regarding space leaks,
> I seem to have stumbled across something very promising. I said I was
> using this tiny function "lastOrNil" to get the last value in a list,
> or the empty (scheme) list if the haskell list was empty. The uses of
> it were all of the form
>
>     lastOrNil (mapM <something> <some list>)
>
> so I wrote a different function mapML to do this directly:
>
>> mapML fn lst = mapMLA (List []) fn lst
>>   where mapMLA r _ [] = return r
>>         mapMLA ro fn (x:xs) =
>>            do rn <- fn x
>>               mapMLA rn fn xs
>
> This isn't an accumulator, it's a replacer (or, if you like, the
> "accumulation" is "drop the old one on the floor"), it starts out with
> the scheme empty list that I want as the default, and it never even
> builds the list which it'll just dump an instant later. Shazam! Memory
> usage dropped by roughly an order of magnitude in my little Collatz
> benchmark, and incidentally runtime improved by 25% or so as well. The
> horror! :-)
>
> Having tasted blood, I will of course be continuing to benchmark...
> but not tonight.
>
> Uwe
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@...
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: following up on space leak

by Uwe Hollerbach-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 7/5/09, Paul L <ninegua@...> wrote:

> Previously you had lastOrNil taking m [a] as input, presumably
> generated by mapM. So mapM is actually building an entire list before
> it returns the argument for you to call lastOrNil. This is where you
> had unexpected memory behavior.
>
> Now you are "fusing" lastOrNil and mapM together, and instead of
> building a list, you traverse it and perform monadic action along the
> way. This can happen in a constant memory if the original pure list is
> generated lazily.
>
> I think the real problem you had was a mis-understanding of mapM, and
> there was nothing wrong with your previous lastOrNil function. mapM
> will only return a list after all monadic actions are performed, and
> in doing so, it inevitably has to build the entire list along the way.
>
> --
> Regards,
> Paul Liu
>
> Yale Haskell Group
> http://www.haskell.org/yale

Hi, Paul, thanks for the comments. You're quite right that I am fusing
the two functions together, but I think I wasn't mis-understanding
mapM... I knew I was generating the entire list, and aside from the
slight inefficiency of generating it only to tear it down an instant
later, that would have been no problem. But I was expecting all of the
memory associated with the list to be reclaimed after I had processed
it, and that was what was not happening as far as I could tell. (This
isn't one monolithic list, by the way; it's the small bodies of a
couple of small scheme functions that get evaluated over and over. So
the setup and teardown happens a lot.) I don't have very good
intuition yet about what should get garbage-collected and what should
get kept in such situations, and in fact I'm kind of in the same boat
again: the test case now runs much better, but it still leaks memory,
and I am again stumped as to why. Could I see something useful by
examining ghc core? I haven't looked at that yet, no idea what to look
for...

Uwe
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: following up on space leak

by Alexander Dunlap :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sun, Jul 5, 2009 at 7:46 PM, Uwe Hollerbach<uhollerbach@...> wrote:

> On 7/5/09, Paul L <ninegua@...> wrote:
>> Previously you had lastOrNil taking m [a] as input, presumably
>> generated by mapM. So mapM is actually building an entire list before
>> it returns the argument for you to call lastOrNil. This is where you
>> had unexpected memory behavior.
>>
>> Now you are "fusing" lastOrNil and mapM together, and instead of
>> building a list, you traverse it and perform monadic action along the
>> way. This can happen in a constant memory if the original pure list is
>> generated lazily.
>>
>> I think the real problem you had was a mis-understanding of mapM, and
>> there was nothing wrong with your previous lastOrNil function. mapM
>> will only return a list after all monadic actions are performed, and
>> in doing so, it inevitably has to build the entire list along the way.
>>
>> --
>> Regards,
>> Paul Liu
>>
>> Yale Haskell Group
>> http://www.haskell.org/yale
>
> Hi, Paul, thanks for the comments. You're quite right that I am fusing
> the two functions together, but I think I wasn't mis-understanding
> mapM... I knew I was generating the entire list, and aside from the
> slight inefficiency of generating it only to tear it down an instant
> later, that would have been no problem. But I was expecting all of the
> memory associated with the list to be reclaimed after I had processed
> it, and that was what was not happening as far as I could tell. (This
> isn't one monolithic list, by the way; it's the small bodies of a
> couple of small scheme functions that get evaluated over and over. So
> the setup and teardown happens a lot.) I don't have very good
> intuition yet about what should get garbage-collected and what should
> get kept in such situations, and in fact I'm kind of in the same boat
> again: the test case now runs much better, but it still leaks memory,
> and I am again stumped as to why. Could I see something useful by
> examining ghc core? I haven't looked at that yet, no idea what to look
> for...
>
> Uwe
> _______________________________________________

mapM_ might be useful to you. I know there are cases where mapM leaks
memory but mapM_ doesn't, basically because mapM_ throws away all of
the intermediate results immediately. You might want to condition on
nullness of the list and then mapM_ your function over the init of the
list and then just return the function on the last element of the
list.

Alex
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: following up on space leak

by Uwe Hollerbach-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 7/5/09, Alexander Dunlap <alexander.dunlap@...> wrote:

> On Sun, Jul 5, 2009 at 7:46 PM, Uwe Hollerbach<uhollerbach@...>
> wrote:
>> On 7/5/09, Paul L <ninegua@...> wrote:
>>> Previously you had lastOrNil taking m [a] as input, presumably
>>> generated by mapM. So mapM is actually building an entire list before
>>> it returns the argument for you to call lastOrNil. This is where you
>>> had unexpected memory behavior.
>>>
>>> Now you are "fusing" lastOrNil and mapM together, and instead of
>>> building a list, you traverse it and perform monadic action along the
>>> way. This can happen in a constant memory if the original pure list is
>>> generated lazily.
>>>
>>> I think the real problem you had was a mis-understanding of mapM, and
>>> there was nothing wrong with your previous lastOrNil function. mapM
>>> will only return a list after all monadic actions are performed, and
>>> in doing so, it inevitably has to build the entire list along the way.
>>>
>>> --
>>> Regards,
>>> Paul Liu
>>>
>>> Yale Haskell Group
>>> http://www.haskell.org/yale
>>
>> Hi, Paul, thanks for the comments. You're quite right that I am fusing
>> the two functions together, but I think I wasn't mis-understanding
>> mapM... I knew I was generating the entire list, and aside from the
>> slight inefficiency of generating it only to tear it down an instant
>> later, that would have been no problem. But I was expecting all of the
>> memory associated with the list to be reclaimed after I had processed
>> it, and that was what was not happening as far as I could tell. (This
>> isn't one monolithic list, by the way; it's the small bodies of a
>> couple of small scheme functions that get evaluated over and over. So
>> the setup and teardown happens a lot.) I don't have very good
>> intuition yet about what should get garbage-collected and what should
>> get kept in such situations, and in fact I'm kind of in the same boat
>> again: the test case now runs much better, but it still leaks memory,
>> and I am again stumped as to why. Could I see something useful by
>> examining ghc core? I haven't looked at that yet, no idea what to look
>> for...
>>
>> Uwe
>> _______________________________________________
>
> mapM_ might be useful to you. I know there are cases where mapM leaks
> memory but mapM_ doesn't, basically because mapM_ throws away all of
> the intermediate results immediately. You might want to condition on
> nullness of the list and then mapM_ your function over the init of the
> list and then just return the function on the last element of the
> list.
>
> Alex


Oh, sorry, I was not clear in my original note in this thread: the
lastOrNil issue seems to be solved. That part of the code is, as far
as I can tell, not leaking memory at all anymore. I think I can claim
that because now the constant memory allocation is showing up visibly
in the profiling output; before, it was lost in the noise. So, if
there is a leak there, it's tiny compared with the constant stuff at
least for this benchmark. There are still two or perhaps three leaks,
and these show up as large but not huge compared to the constant
stuff. I've got a plot of this up on the haskeem website:
http://www.korgwal.com/haskeem/run_new.png.

The bits where I am stumped now are two-fold: one is (I think)
analogous to the lastOrNil issue, except that instead of feeding the
result to lastOrNil, I am doing a more general fold. So there I do
need all the results. I tried the same fusion as with lastOrNil/mapML,
and as far as I can tell I'm not building any lists; but this time it
didn't change the behavior at all, other than causing the names of
some profiling cost centers to change. This is the #2 entry on the
plot above.

The other issue seems to have something to do with IORefs, I'm
dynamically building environments for my scheme functions, and somehow
there seems to be something going wrong with reclaiming that memory
after it's done. This is the #1 and I think #3 entry on the plot. I
don't know enough details there yet to be able to say any more.

Uwe
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: following up on space leak

by porges-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I believe there might be an elegant solution for this using the `Last` monoid
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: following up on space leak

by Uwe Hollerbach-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi, George, thanks for the pointer, it led me to some interesting
reading. Alas, the problem which it solves was already solved, and the
unsolved problem didn't yield any further...

At this point, I've concluded that my interpreter just simply isn't
tail-recursive enough: in the Collatz test case I had originally
looked at and mentioned, it seems that no matter what I do the memory
usage stays the same. Initially, a significant portion of the usage
showed up as one particular function in the interpreter which applies
binary numerical operators to a list of numbers. It's a moderately
complex function, as it deals with any number of operands, and it
takes care of type conversions as well: if I add two integers, I want
the result to be an integer; if I add in a float, the result will be a
float, etc.

In my particular usage in this test case, it was only getting used to
increment an integer; so I simplified that, I added an "incr" function
to my interpreter and called that instead... now exactly the same
amount of memory usage shows up in the cost center labeled "incr" as
was previously being used in the more-complex numeric-binary-operator
function. I've cut down the interpreter to about a quarter of its
original size, now I've got a version that really is only useful for
running this Collatz test case, and... it uses exactly the same amount
of memory as before.

The last thing I tried before giving up was to try and make a
more-strict bind operator, I think I wrote that as

    (!>=) !m !k = m >>= k

with appropriate -XBangPatterns added to the compiler options. It
passed all the self-tests for the interpreter, so I'm pretty sure I
didn't do anything wrong, but it made no difference to the memory
usage. So for now I've shelved that problem, I'm looking instead at
adding proper continuations to the interpreter.

Uwe

On 7/7/09, George Pollard <porges@...> wrote:
> I believe there might be an elegant solution for this using the `Last`
> monoid
>
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe