Data.Map/Data.IntMap additional features

View: New views
5 Messages — Rating Filter:   Alert me  

Data.Map/Data.IntMap additional features

by Arkleseizure :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hey all,

Amid the discussion of inconsistencies between Data.IntMap and Data.Map, I thought I'd throw in some more ideas.  I've been working on a rather thorough generalized trie map implementation, which has done nothing if not force me to make really very generalized signatures for every method offered by Data.Map, etc.  Some of these generalizations are, I think, independently useful, and I wanted to throw them out for discussion before creating a ticket.

A function that I call extract/extractWith/extractWithKey:

extractWithKey :: Alternative f => (k -> a -> f (z, Maybe a)) -> Map k a -> f (z, Map k a)

which is actually defined as follows:

> extractWithKey f (Bin n kx x l r) = fmap (\ (z, l') -> (z, Bin n kx x l' r)) (extractWithKey f l) <|>
>      fmap (\ (z, x') -> (z, maybe (glue l r) (\ xx -> bin kx xx l r) x')) (f kx x) <|> fmap (\ (z, r') -> (z, Bin n kx x l r')) (extractWithKey f r)
> extractWithKey f Nil = empty

If m is fromList [(k1, x1), (k2, x2), ..., (kn, xn)], then

> fst <$> extractWithKey (\ k a -> pure (f k a, <whatever>)) m == f k1 x1 <|> f k2 x2 <|> ... <|> f kn xn

and then the second component is obtained by modifying a single element accordingly and taking the alternative over every element to choose from.

A few observations:  with an appropriate Alternative instance for Data.Monoid.First and Data.Monoid.Last,
> extractWithKey (\ k a -> return ((k, a), Nothing)) m == First (minViewWithKey m)

so this generalizes minViewWithKey and maxViewWithKey according to the First Alternative instance.  (Note that the extractWithKey implementation is also O(log n), because the natural First Alternative instance will ignore the irrelevant pathways lazily,

> extractWithKey (\ k a -> if p k a then pure ((k, a), Nothing) else empty)

in the First Alternative instance, will extract the first element of the map that satisfies p, or will return Nothing if there is no such element, and will take O(i) time to do so if the first satisfying element is at index i,

> mapPermK :: Int -> Map k a -> [[(k, a)]]
> mapPermK 0 m = [[]]
> mapPermK (n+1) m = do    ((k1, a1), m') <- extractWithKey (\ k a -> return ((k, a), Nothing)) m
>                                         liftM ((k1, a1) :) (mapPermK n m)

returns a list of all permutations of k distinct associations from the map m, and does it exactly as lazily as one might hope.  (Generating combinations efficiently is somewhat harder, and I'm not sure this approach grants much in the way of additional efficiency beyond a couple other approaches I can think of.)

In any event, you get the idea: this is a preposterously general method, albeit difficult to describe.  It is moderately easier to describe the individual projections, which I name arbitrarily,

about :: Alternative f => (k -> a -> f z) -> Map k a -> f z
modify :: Alternative f => (k -> a -> f (Maybe a)) -> Map k a -> f (Map k a)

Finally, I'll define neighborhood.  Internally, neighborhood has the signature

neighborhood :: Ord k => k -> Map k a -> (Last (k, a), Maybe a, First (k, a))

where neighborhood k m = (sup [(k', a) | k' < k], lookup k m, inf [(k', a) | k' > k]) == case splitLookup k m of (mL, v, mR) -> (fmap fst (maxViewWithKey mL), v, fmap fst (minViewWithKey mR))

is defined as follows:

neighborhood k Tip = (empty, empty, empty)
neighborhood k (Bin _ kx x l r) = case compare k kx of
LT -> case neighborhood k l of
(lb, v, ub) -> (lb, v, ub <|> pure (kx, x))
EQ -> (Last (fmap fst (maxViewWithKey l)), Just x, First (fmap fst (minViewWithKey r)))
-- we can also use the *about* generalization here, since we only need the minimum and maximum associations
GT -> case neighborhood k r of
(lb, v, ub) -> (pure (kx, x) <|> lb, v, ub)

Using MonadPlus or Alternative instances for First and Last are rather elegant here, and I strongly support adding their Applicative, Alternative, Monad, and MonadPlus instances to base.

Louis Wasserman
wasserman.louis@...
http://profiles.google.com/wasserman.louis

_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: Data.Map/Data.IntMap additional features

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Nov 3, 2009 at 12:39 PM, Louis Wasserman
<wasserman.louis@...> wrote:
> Hey all,
>
> Amid the discussion of inconsistencies between Data.IntMap and Data.Map, I
> thought I'd throw in some more ideas.  I've been working on a rather
> thorough generalized trie map implementation, which has done nothing if not
> force me to make really very generalized signatures for every method offered
> by Data.Map, etc.  Some of these generalizations are, I think, independently
> useful, and I wanted to throw them out for discussion before creating a
> ticket.

It looks really complicated and I don't really understand any of it.
So maybe as a default method for a typeclass that the datatype
designer can implement, but if it were a choice between findMax and
extractWithKey with some Applicative and Monoid and whatever else
magic, I'd use findMax every time :)  I'd be wary of putting anything
difficult to describe in a default interface for a basic data type.

The main functions I use with maps are forward and reverse iteration
(toAscList, toDescList) and splitting, I have split2 defined which is
more useful to me than Map.splitLookup:

-- | Like 'Data.Map.split', except include a matched key in the above map.
split2 :: (Ord k) => k -> Map.Map k a -> (Map.Map k a, Map.Map k a)

And of course creating, inserting, updating, and deleting.  Rarely key
and value mapping.  Any interface that provides those 9 functions will
satisfy all needs I can think of.  If they can all be implemented with
extractWithKey, then great, put that in the class interface and define
the rest with defaulting methods.
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: Data.Map/Data.IntMap additional features

by Sterling Clover :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Nov 3, 2009 at 3:39 PM, Louis Wasserman
<wasserman.louis@...> wrote:
> Using MonadPlus or Alternative instances for First and Last are rather
> elegant here, and I strongly support adding their Applicative, Alternative,
> Monad, and MonadPlus instances to base.

+1 for this. I've had to do it myself more than once. Also the functor instance.

--S.
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Parent Message unknown Re: Data.Map/Data.IntMap additional features

by Arkleseizure :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

but if it were a choice between findMax and
extractWithKey with some Applicative and Monoid and whatever else
magic, I'd use findMax every time :) I'd be wary of putting anything
difficult to describe in a default interface for a basic data type.

Okay, some restatement required here: I'm suggesting this method only for Data.Map and Data.IntMap -- not for anything more common -- and I'm not by any means suggesting that I'd use this in every case over findMin, but merely using findMin as an example application.  I do think it's okay for a commonly used module to have some advanced methods for advanced users.
 
And of course creating, inserting, updating, and deleting.  Rarely key
and value mapping.  Any interface that provides those 9 functions will
satisfy all needs I can think of.  If they can all be implemented with
extractWithKey, then great, put that in the class interface and define
the rest with defaulting methods.

Quite honestly, I use key and value mapping all the time, in most applications I need maps for, and extractWithKey generalizes (and makes more efficient) several applications use regularly.

Here's one more example of an application that can't be done nicely or efficiently with application of the currently existing methods:

extractMinPair :: Map k1 (Map k2 v) -> Maybe (((k1, k2), v), Map k1 (Map k2 v))
extractMinPair m = do
   ((k1, m1), m') <- minViewWithKey m
   ((k2, v), m1') <- minViewWithKey m1
   return (((k1, k2), v), if null m1' then m' else insert k1 m1' m)

(In particular, this is the extractMin implementation from the generalized trie on keys of type (k1, k2), which motivated the method in the first place.)

Note that in most cases, we'll need to insert m1' back into m, which is really ugly, and definitely destroys the nicer recursive properties that I'm looking for.  Rather, if I can extract information and modify a value from the map with a single use of extractWithKey,

extractMinPair m = getFirst $ extractWithKey (\ k1 m1 -> fmap (\ ((k2, v), m1') -> (((k1, k2), v), guardNull m1')) (minViewWithKey m1)) m
   where guardNull m = if null m then Nothing else Just m

In the list of methods you named as particularly critical, you talked about creating, inserting, updating, and deleting; you actually didn't mention looking up values, which I assume you meant to -- since what's the point of the map if you can't look up any values?  extractWithKey is useful when I want to look up a value, and edit it, at the same time, without an exra O(log n) pass, and it specifically and elegantly takes care of the case where we want to simultaneously look up and edit the minimum or maximum value of a map.

Louis Wasserman
wasserman.louis@...
http://profiles.google.com/wasserman.louis

_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: Data.Map/Data.IntMap additional features

by Evan Laforge :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Nov 3, 2009 at 6:00 PM, Louis Wasserman
<wasserman.louis@...> wrote:

>> but if it were a choice between findMax and
>> extractWithKey with some Applicative and Monoid and whatever else
>> magic, I'd use findMax every time :) I'd be wary of putting anything
>> difficult to describe in a default interface for a basic data type.
>
> Okay, some restatement required here: I'm suggesting this method only for
> Data.Map and Data.IntMap -- not for anything more common -- and I'm not by
> any means suggesting that I'd use this in every case over findMin, but
> merely using findMin as an example application.  I do think it's okay for a
> commonly used module to have some advanced methods for advanced users.

Sure, that's totally reasonable.

> In the list of methods you named as particularly critical, you talked about
> creating, inserting, updating, and deleting; you actually didn't mention
> looking up values, which I assume you meant to -- since what's the point of
> the map if you can't look up any values?  extractWithKey is useful when I
> want to look up a value, and edit it, at the same time, without an exra
> O(log n) pass, and it specifically and elegantly takes care of the case
> where we want to simultaneously look up and edit the minimum or maximum
> value of a map.

Well ok I'm sold then, sign me up :)

One other thing I haven't seen much reference too that I think could
be a big win, and that's unboxed maps.  A map in the vein of
storablevector than can pack together a lot of elements in chunks, but
keep persistence and log lookup with some shallow tree structure could
be quite handy.  I've heard clojure's arrays and maps are constructed
along these lines.
_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries