export toDescList from Data.Map

View: New views
20 Messages — Rating Filter:   Alert me  

export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

So, I mentioned this a long time ago but didn't get any responses and
then I got distracted.  So this time I added a ticket and patch and
everything:  2580

Here's the text:
It's even implemented, but not exported. Without this, there's
apparently no way to iterate over a map from high to low, since foldl
is also not exported.


Also, I implemented a toDescList for IntMap, and could do the same for
Set, and could add those patches, if people agree it's a good idea.

In addition, is there any problem with exporting foldl aka descending
for the three modules?  I implemented a foldl for IntMap and it seems
to work, but I'd want someone smarter to look for problems with it.
In addition, Set and Map are inconsistent with folds:  Set exports
'fold' and says "no order is guaranteed", but it's just defined to
foldr, which is in ascending order.  Set.foldr itself is not exported.
 Meanwhile, Map also exports fold, but heavily implies from the doc
that it's guaranteed to be ascending.

Would it restrict implementation changes too much to export a
descending fold for all three?  Other languages that have tree
oriented general purpose maps all let you start anywhere and iterate
in either direction.

Anyway, is 2-3 weeks ok for discussion period?  The library submission
wiki doesn't say how long.
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Benedikt Huber :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Evan Laforge schrieb:
> So, I mentioned this a long time ago but didn't get any responses and
> then I got distracted.  So this time I added a ticket and patch and
> everything:  2580
>
> Here's the text:
> It's even implemented, but not exported. Without this, there's
> apparently no way to iterate over a map from high to low, since foldl
> is also not exported.

Hi,
foldl is available via the Foldable instance for Set,Map,IntMap.
And if I'm not mistaken, a 'left fold' corresponds to 'iterate from low
to high' (try foldlM failing on the first element).

Anyway, while there is foldWithKey

 > toAscList = foldWithKey (\k v m -> (k,v) : m) []

_foldrWithKey_ seems to be missing for Map/IntMap.
I don't know if there is a performance penalty using 'reverse .
toAscList' (e.g. in monadic traversals which stop after a few elements),
but I suppose you were talking about the API anyway.

Finally, I'd like to see a view on the association list, something like

 > viewAssocs :: Map k v -> MapView (k,v)
 > instance Foldable MapView

which would provide 'fold[lr]WithKey[M]' in a standard way.

best regards,
benedikt
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Sep 10, 2008 at 11:06 AM, Benedikt Huber <benjovi@...> wrote:

> Evan Laforge schrieb:
>>
>> So, I mentioned this a long time ago but didn't get any responses and
>> then I got distracted.  So this time I added a ticket and patch and
>> everything:  2580
>>
>> Here's the text:
>> It's even implemented, but not exported. Without this, there's
>> apparently no way to iterate over a map from high to low, since foldl
>> is also not exported.
>
> Hi,
> foldl is available via the Foldable instance for Set,Map,IntMap.
> And if I'm not mistaken, a 'left fold' corresponds to 'iterate from low to
> high' (try foldlM failing on the first element).

Oh right, I'm getting them backwards, because lists are build from
back to front, so you apply the (:) from high to low, to generate a
list from low to high efficiently.  Then you iterate over that list
from low to high, so high to low *application* via fold is actually
low to high iteration via map...  if I'm not even more confused now
than before.

But anyway, you're totally right about Foldable: 'take 10 $
Foldable.foldl (flip (:)) [] bigmap' seems to be quite fast and
independent of the size of bigmap.

> I don't know if there is a performance penalty using 'reverse . toAscList'
> (e.g. in monadic traversals which stop after a few elements), but I suppose
> you were talking about the API anyway.

reverse has a large performance penalty since it has to generate an
entire forward list and traverse the entire thing.  Here's the cpu
time output from first a head . toAscList, and then a head . reverse .
toAscList:

3164
-> (0,0)
3164
-> (10000000,10000000)
5434


So since there's Foldable.foldl, I guess this isn't so pressing.  I
still argue that it's more straightforward to simply export toDescList
(why implement it and then not export it?) rather than make people
roll their own with Foldable.
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> So since there's Foldable.foldl, I guess this isn't so pressing.  I
> still argue that it's more straightforward to simply export toDescList
> (why implement it and then not export it?) rather than make people
> roll their own with Foldable.

Oh, except the key part, of course.  So I guess we need toDescList
after all.  'foldr' and 'foldl' are also defined in the module so I
took the liberty of renaming them 'WithKey' and exporting foldlWithKey
(foldrWithKey is already exported as plain foldWithKey).  There's also
a foldlStrict, but I don't want to get too tangled up with all these
details when all I really want is toDescList.
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> Anyway, is 2-3 weeks ok for discussion period?  The library submission
> wiki doesn't say how long.

So it's been a little while and there hasn't been any objection.
What's the next step?  Is there any problem with applying the patch as
it stands and will it be in 6.10?  To re-recap, it exports toDescList,
and foldlWithKey.  I'm assuming its current level of strictness is ok
since no one has said anything.


As an aside, it's curious to me that so few people care about
Data.Map.  In my code, it's the #1 data structure, except for lists,
which almost all generated and consumed incrementally and thus take up
little space and have simple access patterns.  Data.Map does all the
heavy lifting.  Other people don't use it?
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Ian Lynagh :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Hi Evan,

On Mon, Sep 22, 2008 at 11:47:49AM -0700, Evan Laforge wrote:
> > Anyway, is 2-3 weeks ok for discussion period?  The library submission
> > wiki doesn't say how long.
>
> So it's been a little while and there hasn't been any objection.
> What's the next step?

Please see:
http://www.haskell.org/haskellwiki/Library_submissions#At_the_end_of_the_discussion_period
(and if the concensus was to apply the patch, then you can assign the
ticket to igloo).


Thanks
Ian

_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>> So it's been a little while and there hasn't been any objection.
>> What's the next step?
>
> Please see:
> http://www.haskell.org/haskellwiki/Library_submissions#At_the_end_of_the_discussion_period
> (and if the concensus was to apply the patch, then you can assign the
> ticket to igloo).

Ok.  I'm not sure what the consensus was since no one said "looks
good", but I'm going to assume it was to apply, since no one said
"looks bad" either.  I'll add Benedikt's comments to the ticket in
case someone later wants to implement his suggestion (viewAssocs).

Here's what I added to the ticket when assigning to igloo:

So on further thought, it seemed asymmetrical to export foldWithKey
and foldlWithKey, so I added foldrWithKey to the export list, keeping
foldWithKey to not break existing code.  I added a note to the haddock
for foldWithKey encouraging the use of foldrWithKey.

So we export:
toDescList - new, but implementable with foldlWithKey
foldlWithKey - new
foldrWithKey - the same as foldWithKey, but explicitly the mirror of
foldlWithKey

There were no other comments.  I'm specifically not certain about the
laziness of the folds (i.e. will they lead to stack problems like
List.foldl), but no one said there was a problem, so maybe it's fine.
I'm also not sure if there was some important reason behind the reason
the folds were exported, but no one said anything, so maybe it was
just arbitrary.

---

As another aside, it seems like it would be nice to fix up Map, Set,
IntMap, and IntSet, to hopefully give them a consistent API and maybe
reduce some of the rampant copy and paste reuse going on in there.
Maybe this GSoC thing for tries is doing that...
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Christian Maeder-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Evan,

sorry for replying so late. I believe, many people are interested in a
good Map data structure but do not care much if your proposed patch is
applied or not. (The patch does not harm, toDescList is trivial to
obtain, and the order of folding does not seem to matter.)

(I'm not sure if foldrWithKey should be preferred over foldWithKey.)

I've looked into the source and the (giant 338K) patch. The essence is
attached below (You've renamed the internal foldr and foldl to
foldlWithKey resp. foldrWithKey, which looks ok to me)

Actually, Data.Map, Data.IntMap, Data.Set and Data.IntSet do not have an
active maintainer and maybe everybody waits for "GSoC thing for tries".

Indeed, as you noted yourself "it would be nice to fix up Map, Set,
IntMap, and IntSet". I think it is almost mandatory. So would you do
this, too? "toDescList" makes sense for sets, too. On the other hand, I
was against toAscList as it is identical to toList. That means I always
assumed a bias towards "ascending operations".

>From a maintainer point of view the module is messy. (Many warnings when
compiling.) For example I do not understand why the first pragma is needed:
{-# OPTIONS_GHC -fno-bang-patterns #-}

Furthermore "foldi" is also defined but not used (but I do not propose
it to be foldWithKey). (Other functions are also not exported.)

Data.Map contains foldStrict:
foldlStrict f z xs
  = case xs of
      []     -> z
      (x:xx) -> let z' = f z x in seq z' (foldlStrict f z' xx)

Is there a difference to Data.List.foldl'?
#ifdef __GLASGOW_HASKELL__
foldl' f z xs = lgo z xs
    where lgo z []     = z
          lgo z (x:xs) = let z' = f z x in z' `seq` lgo z' xs

In order to refactor these modules a big test-suite is needed in order
to ensure correctness and no performance loss. (So maybe it is wise to
never change a running system.)

Maybe this explains the little feedback to this ticket, but I would like
to encourge you, to improve things, anyway.

Cheers Christian

Evan Laforge wrote:

> Ok.  I'm not sure what the consensus was since no one said "looks
> good", but I'm going to assume it was to apply, since no one said
> "looks bad" either.  I'll add Benedikt's comments to the ticket in
> case someone later wants to implement his suggestion (viewAssocs).
>
> Here's what I added to the ticket when assigning to igloo:
>
> So on further thought, it seemed asymmetrical to export foldWithKey
> and foldlWithKey, so I added foldrWithKey to the export list, keeping
> foldWithKey to not break existing code.  I added a note to the haddock
> for foldWithKey encouraging the use of foldrWithKey.
>
> So we export:
> toDescList - new, but implementable with foldlWithKey
> foldlWithKey - new
> foldrWithKey - the same as foldWithKey, but explicitly the mirror of
> foldlWithKey
>
> There were no other comments.  I'm specifically not certain about the
> laziness of the folds (i.e. will they lead to stack problems like
> List.foldl), but no one said there was a problem, so maybe it's fine.
> I'm also not sure if there was some important reason behind the reason
> the folds were exported, but no one said anything, so maybe it was
> just arbitrary.
>
> ---
>
> As another aside, it seems like it would be nice to fix up Map, Set,
> IntMap, and IntSet, to hopefully give them a consistent API and maybe
> reduce some of the rampant copy and paste reuse going on in there.
> Maybe this GSoC thing for tries is doing that...

hunk ./Data/Map.hs 106
+            , foldrWithKey
+            , foldlWithKey
hunk ./Data/Map.hs 123
+            , toDescList
hunk ./Data/Map.hs 176
-import Prelude hiding (lookup,map,filter,foldr,foldl,null)
+import Prelude hiding (lookup,map,filter,null)
hunk ./Data/Map.hs 1431
+--
+-- This is identical to 'foldrWithKey', and you should use that one inste
ad of
+-- this one.  This name is kept for backward compatibility.
hunk ./Data/Map.hs 1437
-  = foldr f z t
+  = foldrWithKey f z t
hunk ./Data/Map.hs 1448
--- | /O(n)/. Post-order fold.
-foldr :: (k -> a -> b -> b) -> b -> Map k a -> b
-foldr _ z Tip              = z
-foldr f z (Bin _ kx x l r) = foldr f (f kx x (foldr f z r)) l
+-- | /O(n)/. Post-order fold.  The function will be applied from the lowe
st
+-- value to the highest.
+foldrWithKey :: (k -> a -> b -> b) -> b -> Map k a -> b
+foldrWithKey _ z Tip              = z
+foldrWithKey f z (Bin _ kx x l r) =
+    foldrWithKey f (f kx x (foldrWithKey f z r)) l
hunk ./Data/Map.hs 1455
-{-
-XXX unused code
hunk ./Data/Map.hs 1456
--- | /O(n)/. Pre-order fold.
-foldl :: (b -> k -> a -> b) -> b -> Map k a -> b
-foldl _ z Tip              = z
-foldl f z (Bin _ kx x l r) = foldl f (f (foldl f z l) kx x) r
--}
+-- | /O(n)/. Pre-order fold.  The function will be applied from the highe
st
+-- value to the lowest.
+foldlWithKey :: (b -> k -> a -> b) -> b -> Map k a -> b
+foldlWithKey _ z Tip              = z
+foldlWithKey f z (Bin _ kx x l r) =
+    foldlWithKey f (f (foldlWithKey f z l) kx x) r
hunk ./Data/Map.hs 1554
-toAscList t   = foldr (\k x xs -> (k,x):xs) [] t
+toAscList t   = foldrWithKey (\k x xs -> (k,x):xs) [] t
hunk ./Data/Map.hs 1556
-{-
-XXX unused code
-
--- | /O(n)/.
+-- | /O(n)/. Convert to a descending list.
hunk ./Data/Map.hs 1558
-toDescList t  = foldl (\xs k x -> (k,x):xs) [] t
--}
+toDescList t  = foldlWithKey (\xs k x -> (k,x):xs) [] t

_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> sorry for replying so late. I believe, many people are interested in a
> good Map data structure but do not care much if your proposed patch is
> applied or not. (The patch does not harm, toDescList is trivial to
> obtain, and the order of folding does not seem to matter.)

That's one of the things I was asking... how do you obtain toDescList
without foldlWithKey?  And for me, since since the order of the keys
is important (say points in time), the order of the fold matters a
great deal.

> (I'm not sure if foldrWithKey should be preferred over foldWithKey.)

Me either.  On one hand, it's more api clutter.  Once you add it, it
never goes away.  On the other hand, foldWithKey vs. foldlWithKey is
just confusing.  But if there was some sort of consensus that default
is low to high, then I wouldn't mind so much.  The unfortunate
presence of both toList and toAscList seems to indicate a lack of
consensus though (I think including both is silly too).

> Actually, Data.Map, Data.IntMap, Data.Set and Data.IntSet do not have an
> active maintainer and maybe everybody waits for "GSoC thing for tries".
>
> Indeed, as you noted yourself "it would be nice to fix up Map, Set,
> IntMap, and IntSet". I think it is almost mandatory. So would you do
> this, too? "toDescList" makes sense for sets, too. On the other hand, I
> was against toAscList as it is identical to toList. That means I always
> assumed a bias towards "ascending operations".

I wouldn't mind doing some cleanup, but I'm assuming any
backward-incompatible api changes are out of the question.  It seems
like there should be the maplike equivalent of the IArray interface,
or something clever like that.  I know there has been a lot of
discussion on this topic, and possibly even implementation in Edison.
And then some discussion for the trie thing that sounded like a
generic typeclass interface.  So I feel like I should stick to
conservative changes, i.e. just add new functions.

It's also galling that IntMap is hardcoded to 32bit ints and writing a
64 bit version would seem to require yet another copy and paste
session.  And it seems like theoretically patricia should work on
doubles too.  But I'm not sure how to address all this code sharing.

I agree the changes should be made to the other maplike modules, and I
even wrote a foldl and toDescList for IntMap, but I was trying to the
patch scope small, since I only need Map.toDescList.

> In order to refactor these modules a big test-suite is needed in order
> to ensure correctness and no performance loss. (So maybe it is wise to
> never change a running system.)

Yes, no tests (that I could see) is another reason to be conservative
making changes.  I suppose I could write up some tests, but I remember
there was some controversy about whether tests should be included in
the ghc libs.  And I'd be sort of surprised if someone else hasn't
already written some?

It would also be nice to have some time, garbage, and sharing (i.e. if
you update a value how much of the structure is shared?) graphs
generated for each of the operations.  This would help evaluate new
map implementations as well as check for the ramifications to various
changes.

> Maybe this explains the little feedback to this ticket, but I would like
> to encourge you, to improve things, anyway.

Well thanks.  It's nice to hear *something* :)
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Parent Message unknown Re: export toDescList from Data.Map

by Christian Maeder-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hallo Evan,

If it is really such a performance difference the functions should be
exported by all means. (Another work-around might by to inverse the
order of the keys, but that'll be unnatural at least for Int keys.)

Obviously, I rarely have 1 million elements in my maps.

Evan Laforge wrote:

>>> That's one of the things I was asking... how do you obtain toDescList
>>> without foldlWithKey?  And for me, since since the order of the keys
>>> is important (say points in time), the order of the fold matters a
>>> great deal.
>> I thought of "reverse . toList". Maybe the folding functions are not
>> needed at all (or only short cuts):
>
> Ah, well if I really didn't care about performance, then I wouldn't
> use Map at all, just an unsorted [(k, v)].  But it's sort of awkward
> for the GUI to freeze for a second (yes, it really does take that
> long) while haskell reverses a 1 million element list just to get the
> first 3 elements (and does so many times)...

findMax (deleteFinMax) for the highest 3 elements would be faster then.

>> foldlWithKey f z = foldl (\ b (k, v) -> f b k v) z . toList
>
> This is equivalent to reverse, isn't it?  It also takes a second or so
> (well, once I add a prime of course, without the prime it sends the
> whole system into OOM molasses).

Yes, but folding is linear anyway (if the whole result is needed).

Cheers Christian
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by ChrisK-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


> Me either.  On one hand, it's more api clutter.  Once you add it, it
> never goes away.  On the other hand, foldWithKey vs. foldlWithKey is
> just confusing.  But if there was some sort of consensus that default
> is low to high, then I wouldn't mind so much.  The unfortunate
> presence of both toList and toAscList seems to indicate a lack of
> consensus though (I think including both is silly too).

I conjecture that the logic, if any exists, behind having the two names is so
that the Data.Map API can be applied to several implementations.  And in another
implementation it may be most efficient or most lazy to return elements in
unsorted order.  So there is a "toList" which asks for the easiest conversion to
a list, and a "toAscList" which asks for the sorted version.

As for this proposal, I vote to have toDescList and foldlWithKey.  This would
likely simplify one of my uses of Data.Map.

--
Chris
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Christian Maeder-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Chris Kuklewicz wrote:
> I conjecture that the logic, if any exists, behind having the two names
> is so that the Data.Map API can be applied to several implementations.
> And in another implementation it may be most efficient or most lazy to
> return elements in unsorted order.  So there is a "toList" which asks
> for the easiest conversion to a list, and a "toAscList" which asks for
> the sorted version.

if "toList" returned "elements in unsorted order" it would break the
abstraction: equal maps could return different list. So toList must
either be toAscList or toDescList.

>
> As for this proposal, I vote to have toDescList and foldlWithKey.  This
> would likely simplify one of my uses of Data.Map.

+1 and foldrWithKey.

Christian

_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Parent Message unknown Re: export toDescList from Data.Map

by scodil :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Date: Tue, 23 Sep 2008 09:09:44 -0700
From: "Evan Laforge" <qdunkan@...>
Subject: Re: export toDescList from Data.Map


> Actually, Data.Map, Data.IntMap, Data.Set and Data.IntSet do not have an
> active maintainer and maybe everybody waits for "GSoC thing for tries".
>
> Indeed, as you noted yourself "it would be nice to fix up Map, Set,
> IntMap, and IntSet". I think it is almost mandatory. So would you do
> this, too? "toDescList" makes sense for sets, too. On the other hand, I
> was against toAscList as it is identical to toList. That means I always
> assumed a bias towards "ascending operations".

I wouldn't mind doing some cleanup...

I use Map, Set, IntMap and IntSet very much, and I'm very interested in seeing to it that these important libraries are maintained and improved. However, short of a major API change, like what's going on with generalized tries and what's already been done with Edison, I don't think there is much work left to be done with these libraries at the API level. Just minor changes like the one you've proposed. There are some things that could be done though...

 - I'm not sure what Benedikt Huber meant by 'view', but I think he means exposing the tree structure, in a read-only way, to users. I think its a shame that the Map/Set libraries do not expose this. The simplest solution would be to have toTree functions that convert it to a Data.Tree, but I don't think anybody actually likes that data type. A specialized binary tree data type would be more elegant, but there is an issue of how data is stored in the tree. Map/Set store keys and values in internal nodes and use null leaves. IntMap/IntSet store all keys/values in non-null leaves. So maybe this TreeView type would have to be specific to either Map or IntMap. The idea here is that the algorithms and data structures used for these trees are well-known. The papers are linked from the documentation. So the library should expose this to users, in a safe way.

 - Since someone raised the issue of test suites for performance and correctness, I think it would also be interesting to investigate what effect the strictness annotations in the tree constructors have on performance. Everyone takes for granted that bangs=faster, but I've noticed, as have others, that removing these strictness annotations actually make things run faster. Instead, the tree construction _functions_ should be made strict using seq. The Map construction functions are already strict enough on account of the balancing. Try it for yourself, remove the bangs from the sub-tree fields in the Bin constructor, and run a little benchmark. This would open up the possibility for lazy versions of functions like mapWithKey and mapKeysMonotonic. I had mentioned this previously, but few seemed to care, so I just made the change locally.  http://www.haskell.org/pipermail/libraries/2008-August/010371.html . If we we're going to start a little Map/Set/IntMap/IntSet working group, then I'd like to throw that idea back into the mix.



It's also galling that IntMap is hardcoded to 32bit ints and writing a
64 bit version would seem to require yet another copy and paste
session.  And it seems like theoretically patricia should work on
doubles too.  But I'm not sure how to address all this code sharing.

IntMap is hard coded to Int, which is 32 bits on a 32 bit architecture, and 64 bits on a 64 bit architecture. I think the main reason for this is so that the key can be
unpacked/specialized for extra performance, since that is the primary purpose of the library. If you just need sublinear insert/lookup/delete then use a Map.

Are you galled that the key type is not hardcoded to Int64, or that the key is not allowed to be any instance of the Bits class? In the former case, you'd inflate the size of the tree and take a small speed hit on 32bit archs, and in the latter I think you'd have even worse performance. I think the choice of Int is wise because it is the fastest, and that is the point of the library, and if you think about it you're not going to be dealing with more that 2^32 Ints on a 32 bit computer. (What would they be referencing? Not array positions or file offsets.)  If you're trying to optimize, say, "Map (Int32,Int32) a" into "IntMap64 a", there are other ways to accomplish that:

import qualified Data.IntMap as IM
import Data.IntMap (IntMap)

newtype IntTrie = IntTrie (IntMap IntTrie)

empty :: IntTrie
empty = IntTrie IM.empty

insert :: [Int] -> IntTrie -> IntTrie
insert []     t           = t
insert (x:xs) (IntTrie t) = IntTrie t'
  where
  (y,t') = IM.insertLookupWithKey (const union) x y' t
  y' = case y of Just y  -> insert xs y; Nothing -> insert xs empty

delete :: [Int] -> IntTrie -> IntTrie
delete []     t           = empty
delete (x:xs) (IntTrie t) = IntTrie (IM.update f x t)
  where
  f y = case IM.delete x (case delete xs y of IntTrie t -> t) of
          t' | IM.null t' -> Nothing
             | otherwise  -> Just (IntTrie t')

And so forth. This is an IntSet for arbitrarily long integers represented as [Int]. An IntMap is probably not much more work.


> Maybe this explains the little feedback to this ticket, but I would like
> to encourge you, to improve things, anyway.

Well thanks.  It's nice to hear *something* :)


For my $0.02 your proposal is good. toDescList is very important to have in many situations. I've had it un-hidden locally for a while now. It's silly that it wasn't exported in the first place.
 
Scott


_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Benedikt Huber :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,

Scott Dillard schrieb:

>
>  - I'm not sure what Benedikt Huber meant by 'view', but I think he
> means exposing the tree structure, in a read-only way, to users. I think
> its a shame that the Map/Set libraries do not expose this. The simplest
> solution would be to have toTree functions that convert it to a
> Data.Tree, but I don't think anybody actually likes that data type. A
> specialized binary tree data type would be more elegant, but there is an
> issue of how data is stored in the tree. Map/Set store keys and values
> in internal nodes and use null leaves. IntMap/IntSet store all
> keys/values in non-null leaves. So maybe this TreeView type would have
> to be specific to either Map or IntMap. The idea here is that the
> algorithms and data structures used for these trees are well-known. The
> papers are linked from the documentation. So the library should expose
> this to users, in a safe way.

I think that having a function

 > treeView :: Map k v -> T (k,v)

s.t. T is an instance of Foldable, supports efficient left and right
folds, and additional operations like subrange queries (all pairs (k,v)
s.t. l <= k <= u) would be useful.
I'd like to have all functions from Data.Foldable available for folds
with key, e.g fold[lr]MWithKey.

Supporting toTree (e.g. using a binary tree as you suggested) would be
great as well, but I think T (k,v) does not need to be build an
intermediate representation for supporting queries/folds, so a newtype
should do as well.

>  - Since someone raised the issue of test suites for performance and
> correctness, I think it would also be interesting to investigate what
> effect the strictness annotations in the tree constructors have on
> performance.
 > ...
 > http://www.haskell.org/pipermail/libraries/2008-August/010371.html

Did you also measure the effect of removing strictness annotations  on
space performance ?

> For my $0.02 your proposal is good. toDescList is very important to have
> in many situations. I've had it un-hidden locally for a while now. It's
> silly that it wasn't exported in the first place.
+1 for toDescList
foldrWithKey is certainly useful as well, but I think one should also
think about monadic folds (and the other stuff from Data.Foldable), as
explained above.

benedikt
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>  - Since someone raised the issue of test suites for performance and
> correctness, I think it would also be interesting to investigate what effect
> the strictness annotations in the tree constructors have on performance.
> Everyone takes for granted that bangs=faster, but I've noticed, as have
> others, that removing these strictness annotations actually make things run
> faster. Instead, the tree construction _functions_ should be made strict
> using seq. The Map construction functions are already strict enough on
> account of the balancing. Try it for yourself, remove the bangs from the
> sub-tree fields in the Bin constructor, and run a little benchmark. This
> would open up the possibility for lazy versions of functions like mapWithKey
> and mapKeysMonotonic. I had mentioned this previously, but few seemed to
> care, so I just made the change locally.
> http://www.haskell.org/pipermail/libraries/2008-August/010371.html . If we
> we're going to start a little Map/Set/IntMap/IntSet working group, then I'd
> like to throw that idea back into the mix.

Of course, anything that boosts performance is a win.  Does anyone
know of any Map benchmarks?  I suppose I could write some, but I'd be
sort of surprised if someone else hasn't already done that.  Then we
could throw in all the various avl / redblack / whatever map variants
in there and see how they stack up.

> IntMap is hard coded to Int, which is 32 bits on a 32 bit architecture, and
> 64 bits on a 64 bit architecture. I think the main reason for this is so
> that the key can be
> unpacked/specialized for extra performance, since that is the primary
> purpose of the library. If you just need sublinear insert/lookup/delete then
> use a Map.

To be honest, I had no particular rational reason for wanting IntMap
(and in fact I'm not using it).  I just heard some mumbling on the
list about "Data.Map so slow, IntMap so much faster especially for
unions".  Though to be honest, probably what most matters to me is how
much garbage an insert generates (and similarly how much the resulting
structures will share).  I didn't run any tests or anything, so I
wasn't that upset to not be able to use IntMap :)

> Are you galled that the key type is not hardcoded to Int64, or that the key
> is not allowed to be any instance of the Bits class? In the former case,
> you'd inflate the size of the tree and take a small speed hit on 32bit
> archs, and in the latter I think you'd have even worse performance. I think
> the choice of Int is wise because it is the fastest, and that is the point
> of the library, and if you think about it you're not going to be dealing
> with more that 2^32 Ints on a 32 bit computer. (What would they be
> referencing? Not array positions or file offsets.)  If you're trying to

Points in time.  I was using Int64 at the time, but I'm actually using
Doubles now.  Plain Data.Map is working fine in practice.

> optimize, say, "Map (Int32,Int32) a" into "IntMap64 a", there are other ways
> to accomplish that:

Ah yes, splitting the numbers would work for 64 bit words.
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Ross Paterson :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Fri, Sep 26, 2008 at 04:12:17PM +0200, Benedikt Huber wrote:
> I think that having a function
>
> > treeView :: Map k v -> T (k,v)
>
> s.t. T is an instance of Foldable, supports efficient left and right  
> folds, and additional operations like subrange queries (all pairs (k,v)  
> s.t. l <= k <= u) would be useful.
> I'd like to have all functions from Data.Foldable available for folds  
> with key, e.g fold[lr]MWithKey.

Map has split and splitLookup for subrange queries, and you could get
the folds with

        mapWithKey (,) :: Map k v -> Map k (k,v)
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Benedikt Huber :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Ross Paterson schrieb:

> On Fri, Sep 26, 2008 at 04:12:17PM +0200, Benedikt Huber wrote:
>> I think that having a function
>>
>>> treeView :: Map k v -> T (k,v)
>> s.t. T is an instance of Foldable, supports efficient left and right  
>> folds, and additional operations like subrange queries (all pairs (k,v)  
>> s.t. l <= k <= u) would be useful.
>> I'd like to have all functions from Data.Foldable available for folds  
>> with key, e.g fold[lr]MWithKey.
>
> Map has split and splitLookup for subrange queries, and you could get
> the folds with
>
> mapWithKey (,) :: Map k v -> Map k (k,v)

Thanks for the pointing this out.

While browsing Data.Map I noted that the Foldable instance only defines
foldMap - it is notably slower to use Data.Map - Foldable.foldr
compared to Map.fold. Adding

 > instance Foldable (Map k) where
 >  foldMap = ...
 >  foldr f = foldr (const f)
 >  foldl f = foldl (\b _ a -> f b a)

really helps.
There is also a notable performance penalty using mapWithKey as you
suggested, but otherwise it does the job. I still think that toTree or
treeView would be interesting, but others might disagree of course.

For range queries, split should indeed be sufficient, though a
specialized function might be faster.

Finally I've attached a (very simple) microbenchmark to demonstrate the
need for adding foldr,foldl to the Foldable instance of Data.Map.

best regards,
benedikt


-- Timings {without additional Foldable definitions}/{with foldl,foldr
definitions for foldable}

-- fold asc-list
-- {1.35}
testFold_1 = foldr (uncurry f) 0 . M.toAscList

-- foldWithKey (foldWithKey)
-- {1.45}
testFold_2 = M.foldWithKey f 0

-- fold M.mapWithKey (,)
-- {9.0 / 3.25}
testFold_3 = foldr (uncurry f) 0 . M.mapWithKey (,)

-- fold without key (Foldable.foldr)
-- {5.06 / 1.35}
testFold_4 = foldr (f 0) 0

-- fold (elems.mapWithKey)
-- {3.75}
testFold_6 = foldr (uncurry f) 0 . M.elems . M.mapWithKey (,)

f k v b = b*7+k+2*v
main = go testFold_1 0 (M.empty) where
   go f 5000 = putStrLn ""
   go f k m  = putStr (show (f m) ++ " ") >>
               go f (succ k) (M.insert k (2*k) m)


_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, Sep 8, 2008 at 3:46 PM, Evan Laforge <qdunkan@...> wrote:
> So, I mentioned this a long time ago but didn't get any responses and
> then I got distracted.  So this time I added a ticket and patch and
> everything:  2580

Sorry to dredge this one up again, but this patch was apparently
applied more than a year ago, and yet when I look at the latest
version 0.2.0.1 of containers on hackage, uploaded on April 2009, the
changes aren't in the haddock (searching for toDescList reveals
nothing).  I was hoping to be able to finally remove my local hack!

Was the patch applied?  What's going on?  Will 6.12 include it?
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Simon Marlow-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 12/10/2009 04:38, Evan Laforge wrote:

> On Mon, Sep 8, 2008 at 3:46 PM, Evan Laforge<qdunkan@...>  wrote:
>> So, I mentioned this a long time ago but didn't get any responses and
>> then I got distracted.  So this time I added a ticket and patch and
>> everything:  2580
>
> Sorry to dredge this one up again, but this patch was apparently
> applied more than a year ago, and yet when I look at the latest
> version 0.2.0.1 of containers on hackage, uploaded on April 2009, the
> changes aren't in the haddock (searching for toDescList reveals
> nothing).  I was hoping to be able to finally remove my local hack!
>
> Was the patch applied?  What's going on?  Will 6.12 include it?

Yes, 6.12.1 includes that patch.

Cheers,
        Simon
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: export toDescList from Data.Map

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> Yes, 6.12.1 includes that patch.

Excellent.  It's small but it makes me happy :)
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries