|
View:
New views
20 Messages
—
Rating Filter:
Alert me
|
| < Prev | 1 - 2 - 3 - 4 - 5 - 6 | Next > |
|
|
(flawed?) benchmark : sortHi
I was playing with various versions of sorting algorithms. I know it's very easy to create flawed benchmark and I don't claim those are good ones. However, it really seems strange to me, that sort - library function - is actually the worse measured function. I can hardly belive it, and I'd rather say I have made a mistake preparing it. The overall winner seems to be qsort_iv - which is nothing less but old sort replaced by mergesort now. Any clues? Regards Christopher Skrzętnicki --- cut here --- [Tener@laptener haskell]$ ghc -O2 --make qsort.hs && ./qsort +RTS -sstderr -RTS > /dev/null [1 of 1] Compiling Main ( qsort.hs, qsort.o ) Linking qsort ... ./qsort +RTS -sstderr (1.0,"iv") (1.1896770400256864,"v") (1.3091609772011856,"treeSort") (1.592515715933153,"vii") (1.5953543402198838,"vi") (1.5961286512637272,"iii") (1.8175480563244177,"i") (1.8771604568641642,"ii") (2.453160634439497,"mergeSort") (2.6627090768870216,"sort") 26,094,674,624 bytes allocated in the heap 12,716,656,224 bytes copied during GC (scavenged) 2,021,104,592 bytes copied during GC (not scavenged) 107,225,088 bytes maximum residency (140 sample(s)) 49773 collections in generation 0 ( 21.76s) 140 collections in generation 1 ( 23.61s) 305 Mb total memory in use INIT time 0.00s ( 0.00s elapsed) MUT time 20.39s ( 20.74s elapsed) GC time 45.37s ( 46.22s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 65.76s ( 66.96s elapsed) %GC time 69.0% (69.0% elapsed) Alloc rate 1,279,723,644 bytes per MUT second Productivity 31.0% of total user, 30.5% of total elapsed --- cut here --- {-# OPTIONS_GHC -O2 #-} module Main where import System.CPUTime import System.IO import System.Environment import System.Random import Data.List( partition, sort ) data Tree a = Node (Tree a) a (Tree a) | Leaf qsort_i [] = [] qsort_i (x:xs) = qsort_i (filter (< x) xs) ++ [x] ++ qsort_i (filter (>= x) xs) qsort_ii [] = [] qsort_ii (x:xs) = let (ls,gt) = partition (< x) xs in qsort_ii ls ++ [x] ++ qsort_ii gt qsort_iii xs = qsort_iii' [] xs qsort_iii' acc [] = acc qsort_iii' acc (x:xs) = let (ls,gt) = partition (< x) xs in let acc' = (x:(qsort_iii' acc gt)) in qsort_iii' acc' ls qsort_v [] = [] qsort_v (x:xs) = let (xlt, xgt ) = foldl (\ (lt,gt) el -> case compare x el of GT -> (el:lt, gt) _ -> (lt, el:gt) ) ([],[]) xs in qsort_v xlt ++ [x] ++ qsort_v xgt -- zmodyfikowany i qsort_vi [] = [] qsort_vi (x:xs) = qsort_vi (filter (\y-> compare x y == GT) xs) ++ [x] ++ qsort_vi (filter (\y-> compare x y /= GT) xs) -- zmodyfikowany iii qsort_vii xs = qsort_vii' [] xs qsort_vii' acc [] = acc qsort_vii' acc (x:xs) = let (ls,gt) = partition (\y-> compare x y == GT) xs in let acc' = (x:(qsort_vii' acc gt)) in qsort_vii' acc' ls -- qsort is stable and does not concatenate. qsort_iv xs = qsort_iv' (compare) xs [] qsort_iv' _ [] r = r qsort_iv' _ [x] r = x:r qsort_iv' cmp (x:xs) r = qpart cmp x xs [] [] r -- qpart partitions and sorts the sublists qpart cmp x [] rlt rge r = -- rlt and rge are in reverse order and must be sorted with an -- anti-stable sorting rqsort_iv' cmp rlt (x:rqsort_iv' cmp rge r) qpart cmp x (y:ys) rlt rge r = case cmp x y of GT -> qpart cmp x ys (y:rlt) rge r _ -> qpart cmp x ys rlt (y:rge) r -- rqsort is as qsort but anti-stable, i.e. reverses equal elements rqsort_iv' _ [] r = r rqsort_iv' _ [x] r = x:r rqsort_iv' cmp (x:xs) r = rqpart cmp x xs [] [] r rqpart cmp x [] rle rgt r = qsort_iv' cmp rle (x:qsort_iv' cmp rgt r) rqpart cmp x (y:ys) rle rgt r = case cmp y x of GT -> rqpart cmp x ys rle (y:rgt) r _ -> rqpart cmp x ys (y:rle) rgt r -- code by Orcus -- Zadanie 9 - merge sort mergeSort :: Ord a => [a] -> [a] mergeSort [] = [] mergeSort [x] = [x] mergeSort xs = let(l, r) = splitAt (length xs `quot` 2) xs in mergeSortP (mergeSort l) (mergeSort r) -- funkcja pomocnicza scalajÄ…ca dwie listy uporzÄ…dkowane w jednÄ… mergeSortP :: Ord a => [a] -> [a] -> [a] mergeSortP xs [] = xs mergeSortP [] ys = ys mergeSortP (x:xs) (y:ys) | x <= y = x:(mergeSortP xs (y:ys)) | otherwise = y:(mergeSortP (x:xs) ys) -- Zadanie 10 - tree sort treeSort :: Ord a => [a] -> [a] -- pointless po raz drugi treeSort = (treeSortInorder . treeSortToTree) treeSortToTree :: Ord a => [a] -> Tree a treeSortToTree [] = Leaf treeSortToTree (x:xs) = let (xlt, xgt) = foldl (\ (lt,gt) el -> case compare x el of GT -> (el:lt, gt) _ -> (lt, el:gt) ) ([],[]) xs in Node (treeSortToTree xlt) x (treeSortToTree xgt) treeSortInorder :: Ord a => Tree a -> [a] treeSortInorder Leaf = [] treeSortInorder (Node l x r) = (treeSortInorder l) ++ [x] ++ (treeSortInorder r) -- end code by Orcus -- big_number = 1000000 :: Int main = do gen <- getStdGen let xs' = randomRs (1::Int, big_number) gen xs <- return (take big_number xs') t1 <- getCPUTime print (qsort_i xs) -- i t2 <- getCPUTime print (qsort_ii xs) -- ii t3 <- getCPUTime print (qsort_iii xs) -- iii t4 <- getCPUTime print (qsort_iv xs) -- iv t5 <- getCPUTime print (qsort_v xs) -- v t6 <- getCPUTime print (qsort_vi xs) -- vi t7 <- getCPUTime print (qsort_vii xs) -- vii t8 <- getCPUTime print (sort xs) -- sort t9 <- getCPUTime print (mergeSort xs) -- mergeSort t10 <- getCPUTime print (treeSort xs) -- treeSort t11 <- getCPUTime let getTimes xs = zipWith (-) (tail xs) xs let timers = [t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11] let times = getTimes timers let table = zip times ["i","ii","iii","iv", "v", "vi", "vii", "sort","mergeSort","treeSort"] let sorted = sort table let scaled = map (\(x,n) -> (((fromIntegral x / (fromIntegral $ fst (head sorted)))::Double),n)) sorted let toShow = concatMap (\x-> show x ++ "\n") scaled hPutStr stderr toShow main_small = do gen <- getStdGen let xs' = randomRs (1::Int, 100000) gen xs <- return (take big_number xs') t1 <- getCPUTime print (qsort_iv xs) -- iv t2 <- getCPUTime print (sort xs) -- sort t3 <- getCPUTime print (mergeSort xs) -- mergeSort t4 <- getCPUTime print (treeSort xs) -- treeSort t5 <- getCPUTime let getTimes xs = zipWith (-) (tail xs) xs let timers = [t1,t2,t3,t4,t5] let times = getTimes timers let table = zip times ["iv", "sort","mergeSort","treeSort"] let sorted = sort table let scaled = map (\(x,n) -> (((fromIntegral x / (fromIntegral $ fst (head sorted)))::Double),n)) sorted let toShow = concatMap (\x-> show x ++ "\n") scaled hPutStr stderr toShow hPrint stderr times --- cut here --- _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: (flawed?) benchmark : sort2008/3/4, Krzysztof Skrzętnicki <gtener@...>:
> Hi > > I was playing with various versions of sorting algorithms. I know it's very > easy to create flawed benchmark and I don't claim those are good ones. > However, it really seems strange to me, that sort - library function - is > actually the worse measured function. I can hardly belive it, and I'd rather > say I have made a mistake preparing it. > > The overall winner seems to be qsort_iv - which is nothing less but old sort > replaced by mergesort now. > > Any clues? -- cut here -- module Main where import Control.Parallel.Strategies import Control.Arrow import System.CPUTime import System.IO import System.Environment import System.Random import Data.List( partition, sort ) data Tree a = Node (Tree a) a (Tree a) | Leaf qsort_i [] = [] qsort_i (x:xs) = qsort_i (filter (< x) xs) ++ [x] ++ qsort_i (filter (>= x) xs) qsort_ii [] = [] qsort_ii (x:xs) = let (ls,gt) = partition (< x) xs in qsort_ii ls ++ [x] ++ qsort_ii gt qsort_iii xs = qsort_iii' [] xs qsort_iii' acc [] = acc qsort_iii' acc (x:xs) = let (ls,gt) = partition (< x) xs in let acc' = (x:(qsort_iii' acc gt)) in qsort_iii' acc' ls qsort_v [] = [] qsort_v (x:xs) = let (xlt, xgt ) = foldl (\ (lt,gt) el -> case compare x el of GT -> (el:lt, gt) _ -> (lt, el:gt) ) ([],[]) xs in qsort_v xlt ++ [x] ++ qsort_v xgt -- zmodyfikowany i qsort_vi [] = [] qsort_vi (x:xs) = qsort_vi (filter (\y-> compare x y == GT) xs) ++ [x] ++ qsort_vi (filter (\y-> compare x y /= GT) xs) -- zmodyfikowany iii qsort_vii xs = qsort_vii' [] xs qsort_vii' acc [] = acc qsort_vii' acc (x:xs) = let (ls,gt) = partition (\y-> compare x y == GT) xs in let acc' = (x:(qsort_vii' acc gt)) in qsort_vii' acc' ls -- qsort is stable and does not concatenate. qsort_iv xs = qsort_iv' (compare) xs [] qsort_iv' _ [] r = r qsort_iv' _ [x] r = x:r qsort_iv' cmp (x:xs) r = qpart cmp x xs [] [] r -- qpart partitions and sorts the sublists qpart cmp x [] rlt rge r = -- rlt and rge are in reverse order and must be sorted with an -- anti-stable sorting rqsort_iv' cmp rlt (x:rqsort_iv' cmp rge r) qpart cmp x (y:ys) rlt rge r = case cmp x y of GT -> qpart cmp x ys (y:rlt) rge r _ -> qpart cmp x ys rlt (y:rge) r -- rqsort is as qsort but anti-stable, i.e. reverses equal elements rqsort_iv' _ [] r = r rqsort_iv' _ [x] r = x:r rqsort_iv' cmp (x:xs) r = rqpart cmp x xs [] [] r rqpart cmp x [] rle rgt r = qsort_iv' cmp rle (x:qsort_iv' cmp rgt r) rqpart cmp x (y:ys) rle rgt r = case cmp y x of GT -> rqpart cmp x ys rle (y:rgt) r _ -> rqpart cmp x ys (y:rle) rgt r -- code by Orcus -- Zadanie 9 - merge sort mergeSort :: Ord a => [a] -> [a] mergeSort [] = [] mergeSort [x] = [x] mergeSort xs = let(l, r) = splitAt (length xs `quot` 2) xs in mergeSortP (mergeSort l) (mergeSort r) -- funkcja pomocnicza scalajÄ…ca dwie listy uporzÄ…dkowane w jednÄ… mergeSortP :: Ord a => [a] -> [a] -> [a] mergeSortP xs [] = xs mergeSortP [] ys = ys mergeSortP (x:xs) (y:ys) | x <= y = x:(mergeSortP xs (y:ys)) | otherwise = y:(mergeSortP (x:xs) ys) -- Zadanie 10 - tree sort treeSort :: Ord a => [a] -> [a] -- pointless po raz drugi treeSort = (treeSortInorder . treeSortToTree) treeSortToTree :: Ord a => [a] -> Tree a treeSortToTree [] = Leaf treeSortToTree (x:xs) = let (xlt, xgt) = foldl (\ (lt,gt) el -> case compare x el of GT -> (el:lt, gt) _ -> (lt, el:gt) ) ([],[]) xs in Node (treeSortToTree xlt) x (treeSortToTree xgt) treeSortInorder :: Ord a => Tree a -> [a] treeSortInorder Leaf = [] treeSortInorder (Node l x r) = (treeSortInorder l) ++ [x] ++ (treeSortInorder r) -- end code by Orcus -- begin benchmark making code makeBenchs benchs xs = do let (funcNames, funcs) = unzip benchs tBegin <- getCPUTime timers <- mapM (\f-> print (f xs) >> getCPUTime) funcs let times = zipWith (-) timers (tBegin:timers) sortedResults = sort $ zip times funcNames minT = fromIntegral $ fst $ head sortedResults scaled = map (((/minT) . fromIntegral) *** id) sortedResults hPutStr stderr $ unlines $ map show scaled onRandom eltCnt = do gen <- getStdGen let xs = take eltCnt (randomRs (1::Int, bigNum) gen) `using` rnf xs `seq` return xs onSorted eltCnt = do gen <- getStdGen let xs = take eltCnt (randomRs (1::Int, bigNum) gen) `using` rnf sxs = sort xs `using` rnf xs `seq` sxs `seq` return sxs bigNum = 1000000 :: Int -- end of benchmark making code main = makeBenchs [("i",qsort_i), ("ii",qsort_ii), ("iii",qsort_iii), ("iv",qsort_iv), ("v",qsort_v), ("vi",qsort_vi), ("vii",qsort_vii), ("sort",sort), ("mergeSort",mergeSort), ("treeSort",treeSort)] =<< onSorted . read . head =<< getArgs -- cut here -- It could probably be improved (with classics solution (better selection of the pivot...)), but the mergesort is only 3 times slower in worse case, and much more regular, if someone needs a faster sort in a specific case, it isn't hard to code. -- Jedaï _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: (flawed?) benchmark : sortHi
> -- Zadanie 9 - merge sort > mergeSort :: Ord a => [a] -> [a] > mergeSort [] = [] > mergeSort [x] = [x] > mergeSort xs = let(l, r) = splitAt (length xs `quot` 2) xs > in mergeSortP (mergeSort l) (mergeSort r) splitAt is not a particularly good way to split a list, since you recurse over the list twice. Try instead: split (x:xs) = (x:b,a) where (a,b) = split xs split [] = ([], []) Perhaps adding some strictness annotations and turning the where into a case. Also remember that a standard benchmark for sorting is an ordered/reverse ordered list, as that causes quicksort to go to O(n^2) given a bad pivot choice. If the sort in the standard libraries can be improved on, it should be replaced. Thanks Neil _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: (flawed?) benchmark : sortThanks for improved code. My point was to measure which programming patterns are faster than the others so I can learn which ones I should use. However, the thing that is really bad is the fact, that even oneliner qsort_i is faster than library sort. Which is very different from what I've expected. My intuition is only best and fastest code goes to library, to the point that people can learn from it. It seems I was mislead.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: (flawed?) benchmark : sortHi
> My intuition is only best and fastest code goes to library, to the point > that people can learn from it. It seems I was mislead. The compilers change over time - meaning that the fastest code may change over time. There is also the chance that the original code was not the best - for example the words function in the standard library performs two additional isSpace tests per word. The original code specifies an interface, its thanks to people trying to beat the performance that things improve. Thanks Neil _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: (flawed?) benchmark : sort2008/3/4, Krzysztof Skrzętnicki <gtener@...>:
> Thanks for improved code. My point was to measure which programming patterns > are faster than the others so I can learn which ones I should use. However, > the thing that is really bad is the fact, that even oneliner qsort_i is > faster than library sort. Which is very different from what I've expected. > My intuition is only best and fastest code goes to library, to the point > that people can learn from it. It seems I was mislead. I think you did not correctly got the point of my and Neil Mitchell's message : you benchmarked those function on a completely random sequences so qsort was at his best, but in the real world, most sequences would have bias, and it is not infrequent at all to sort a partially sorted (or reverse sorted) list... In this case the performance of all your qsort are abysmal... Which is the reason the old sort was replaced by the actual mergesort in the library. Try my code (with 10000 elements for example), you'll see that sort is the best on a sorted list, and that qsort is at least 60 times slower (on 10000, in fact it is degenerating in O(n^2)). In the real world, the library maintainers decided it was ok to pay a slight overhead in the case where the list to sort is really randomly distributed since mergesort won so hugely over qsort in the pretty frequent case (in programs) of lists which present regularities. There is no sort which is ideal in all situations, but we can try to get a sort that works well in all situations, and don't trash in situations not so infrequent. (On the other hand, don't expect libraries functions to always be the best to use in your particular situation, they tend to be all-purpose as we just saw and the maintainers prefer simple generic implementations rather than complicated ones who could be slightly (or even significantly) faster in some case) -- Jedaï _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: (flawed?) benchmark : sortI get it now, thanks. Also, I guess it is possible to find a better algorithm for standard library sort.
Christopher Skrzętnicki On Wed, Mar 5, 2008 at 12:04 AM, Chaddaï Fouché <chaddai.fouche@...> wrote:
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: (flawed?) benchmark : sortOk, I did some search and found Data.Map, which can be used to implement pretty fast sorting:
import qualified Data.Map as Map treeSort :: Ord a => [a] -> [a] treeSort = map (\(x,_) -> x ) . Map.toAscList . Map.fromList . map (\x->(x,())) In fact It is likely to behave like sort, with the exception that it is 23% faster. I did not hovever check the memory consumption. It works well on random, sorted and reverse-sorted inputs, and the speed difference is always about the same. I belive I could take Data.Map and get datatype isomorphic to specialized Data.Map a () of it, so that treeSort will became Map.toAscList . Map.fromList. This may also bring some speedup. What do you think about this particular function? On Tue, Mar 4, 2008 at 1:45 AM, Krzysztof Skrzętnicki <gtener@...> wrote: Hi _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortOn Sunday 09 March 2008, Krzysztof Skrzętnicki wrote:
> Ok, I did some search and found Data.Map, which can be used to implement > pretty fast sorting: > > import qualified Data.Map as Map > > treeSort :: Ord a => [a] -> [a] > treeSort = map (\(x,_) -> x ) . Map.toAscList . Map.fromList . map > (\x->(x,())) > > In fact It is likely to behave like sort, with the exception that it is 23% > faster. I did not hovever check the memory consumption. It works well on > random, sorted and reverse-sorted inputs, and the speed difference is > always about the same. I belive I could take Data.Map and get datatype > isomorphic to specialized *Data.Map a ()* of it, so that treeSort will > became Map.toAscList . Map.fromList. This may also bring some speedup. > > What do you think about this particular function? Some thoughts: 1) To get your function specifically, you could just use Data.Set.Set a instead of Map a (). 2) What does it do with duplicate elements in the list? I expect it deletes them. To avoid this, you'd need to use something like fromListWith, keeping track of how many duplicates there are, and expanding at the end. 3) I imagine the time taken to get any output is always O(n*log n). Various lazy sorts can be written (and I'd guess the standard library sort is written this way, although I don't know for sure) such that 'head (sort l)' is O(n), or O(n + k*log n) for getting the first k elements. However, Map, being a balanced binary tree, doesn't (I think) have the right characteristics for this. At the very least, you'll probably want to test with a function that doesn't delete duplicate elements. Something like this: treeSort = concatMap (\(x,n) -> replicate n x) . Map.toAscList . Map.fromListWith (+) . map (\x -> (x,1)) -- Dan _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortdan.doel:
> On Sunday 09 March 2008, Krzysztof Skrzętnicki wrote: > > Ok, I did some search and found Data.Map, which can be used to implement > > pretty fast sorting: > > > > import qualified Data.Map as Map > > > > treeSort :: Ord a => [a] -> [a] > > treeSort = map (\(x,_) -> x ) . Map.toAscList . Map.fromList . map > > (\x->(x,())) > > > > In fact It is likely to behave like sort, with the exception that it is 23% > > faster. I did not hovever check the memory consumption. It works well on > > random, sorted and reverse-sorted inputs, and the speed difference is > > always about the same. I belive I could take Data.Map and get datatype > > isomorphic to specialized *Data.Map a ()* of it, so that treeSort will > > became Map.toAscList . Map.fromList. This may also bring some speedup. > > > > What do you think about this particular function? > > Some thoughts: > > 1) To get your function specifically, you could just use Data.Set.Set a > instead of Map a (). > > 2) What does it do with duplicate elements in the list? I expect it deletes > them. To avoid this, you'd need to use something like fromListWith, keeping > track of how many duplicates there are, and expanding at the end. And a little QuickCheck to help things along: import qualified Data.Map as Map import Data.List import Test.QuickCheck treeSort :: Ord a => [a] -> [a] treeSort = map (\(x,_) -> x ) . Map.toAscList . Map.fromList . map (\x->(x,())) main = quickCheck prop_sort prop_sort xs = sort xs == treeSort xs where _ = xs :: [Int] Running: $ runhaskell A.hs Falsifiable, after 11 tests: [-2,-2,5] _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortOn Sun, 2008-03-09 at 23:04 -0400, Dan Doel wrote: > On Sunday 09 March 2008, Krzysztof Skrzętnicki wrote: > > What do you think about this particular function? > > Some thoughts: > > 1) To get your function specifically, you could just use Data.Set.Set a > instead of Map a (). > > 2) What does it do with duplicate elements in the list? I expect it deletes > them. To avoid this, you'd need to use something like fromListWith, keeping > track of how many duplicates there are, and expanding at the end. > > 3) I imagine the time taken to get any output is always O(n*log n). Various > lazy sorts can be written (and I'd guess the standard library sort is written > this way, although I don't know for sure) such that 'head (sort l)' is O(n), > or O(n + k*log n) for getting the first k elements. However, Map, being a > balanced binary tree, doesn't (I think) have the right characteristics for > this. Sounds to me like we should try a heap sort. As a data structure it should have similar constant factors to Data.Map (or .Set) but a heap is less ordered than a search tree and gives the O(n + k*log n) property. Duncan _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortHi
> 2) What does it do with duplicate elements in the list? I expect it deletes > them. To avoid this, you'd need to use something like fromListWith, keeping > track of how many duplicates there are, and expanding at the end. That would be wrong. Consider: data Foo = Foo Int Int instance Ord Foo where compare (Foo a _) (Foo b _) = compare a b sort [Foo 1 2, Foo 1 -2] must return the original list back, in that order. You cannot delete things and duplicate them later. To guarantee the ordering you must also do other stuff. Thanks Neil _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortHi
Can whoever picks this up please try the sort code from Yhc in their comparisons. In my benchmarks it ran up to twice as fast as the GHC code. http://darcs.haskell.org/yhc/src/packages/yhc-base-1.0/Data/List.hs I think what we really need is first quickCheck and timing framework for measuring sorts. After we have decided what makes one sort faster/better than another, then is the time to start deciding what sort is the best one. Ian did some initial work on this: http://www.haskell.org/pipermail/glasgow-haskell-users/2002-May/003376.html Until the sort-check package is uploaded to hackage I think most people will find it hard to be convinced of one sort over another. Thanks Neil On Mon, Mar 10, 2008 at 8:27 AM, Neil Mitchell <ndmitchell@...> wrote: > Hi > > > > 2) What does it do with duplicate elements in the list? I expect it deletes > > them. To avoid this, you'd need to use something like fromListWith, keeping > > track of how many duplicates there are, and expanding at the end. > > That would be wrong. Consider: > > data Foo = Foo Int Int > > instance Ord Foo where > compare (Foo a _) (Foo b _) = compare a b > > sort [Foo 1 2, Foo 1 -2] must return the original list back, in that > order. You cannot delete things and duplicate them later. To guarantee > the ordering you must also do other stuff. > > Thanks > > Neil > Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortOn 10 Mar 2008, at 08:36, Neil Mitchell wrote: > Can whoever picks this up please try the sort code from Yhc in their > comparisons. In my benchmarks it ran up to twice as fast as the GHC > code. http://darcs.haskell.org/yhc/src/packages/yhc-base-1.0/Data/List.hs I believe the Yhc sort implementation is faster because Lennart did some extensive performance tuning of sorting with hbc, about ten years ago, and contributed the resulting winner to nhc98 way back then. Regards, Malcolm _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortNeil Mitchell wrote:
>> 2) What does it do with duplicate elements in the list? I expect it deletes >> them. To avoid this, you'd need to use something like fromListWith, keeping >> track of how many duplicates there are, and expanding at the end. > > That would be wrong. Consider: > > data Foo = Foo Int Int > > instance Ord Foo where > compare (Foo a _) (Foo b _) = compare a b I would consider such an Ord instance to be hopelessly broken, and I don't think it's the responsibility of authors of functions with Ord constraints in their sigs (such as sort) to consider such possibilities or specify and control the behaviour of their behaviour for such instances. Trying to do this is what leads to horrors such as the "left biasing" of Data.Map (for example). Unfortunately the Haskell standards don't currently specify sane laws for Eq and Ord class instances, but they should. Otherwise knowing a type is an instance of Ord tells me nothing that I can rely on. Regards -- Adrian Hey _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortAdrian Hey wrote:
> or specify and control the behaviour of their behaviour for such > instances. Urk, sorry for the gibberish. I guess I should get into the habit of reading what I write before posting :-) Regards -- Adrian Hey _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortOn Mon, Mar 10, 2008 at 11:00 AM, Adrian Hey <ahey@...> wrote:
> Neil Mitchell wrote: > >> 2) What does it do with duplicate elements in the list? I expect it deletes > >> them. To avoid this, you'd need to use something like fromListWith, keeping > >> track of how many duplicates there are, and expanding at the end. > > > > That would be wrong. Consider: > > > > data Foo = Foo Int Int > > > > instance Ord Foo where > > compare (Foo a _) (Foo b _) = compare a b > > I would consider such an Ord instance to be hopelessly broken, and I > don't think it's the responsibility of authors of functions with Ord > constraints in their sigs (such as sort) to consider such possibilities > or specify and control the behaviour of their behaviour for such > instances. Trying to do this is what leads to horrors such as the "left > biasing" of Data.Map (for example). It could be. I actually don't know what Haskell specifies here. If a type has an Eq instance and x == y, must x and y be observationally equivalent? Or is it reasonable to allow some wiggle room? I'd say (==) definitely has to be an equivalence relation, but beyond that, it's open to the implementor, since if an algorithm only depends on (Eq a), it can't tell the difference between observational equality and any other equivalence relation. But that's just one argument ("by example", in a way). That is, an argument that this is hopelessly broken isn't trivial, it needs to be defended. There is nonetheless a need to handle duplicates gracefully, that is keeping a count won't cut it, because of sortBy. Luke _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortHi
> > instance Ord Foo where > > compare (Foo a _) (Foo b _) = compare a b > > I would consider such an Ord instance to be hopelessly broken, and I > don't think it's the responsibility of authors of functions with Ord > constraints in their sigs (such as sort) to consider such possibilities > or specify and control the behaviour of their behaviour for such > instances. Trying to do this is what leads to horrors such as the "left > biasing" of Data.Map (for example). The sort in Haskell is specified to be "stable". What that means is that the above ord relation is fine. The above ordering observes all the necessary mathematical definitions of ordering, assuming an Eq of: instance Eq Foo where (==) (Foo a _) (Foo b _) = (==) a b > Unfortunately the Haskell standards don't currently specify sane laws > for Eq and Ord class instances, but they should. Otherwise knowing a > type is an instance of Ord tells me nothing that I can rely on. Please give the sane law that this ordering violates. I can't see any! What if I had made the definition of Foo: data Foo = Foo Int (Int -> Int) Now, is the only possible answer that any Ord instance for Foo is wrong? Thanks Neil _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re[2]: Re: (flawed?) benchmark : sortHello Adrian,
Monday, March 10, 2008, 2:00:18 PM, you wrote: >> instance Ord Foo where >> compare (Foo a _) (Foo b _) = compare a b > I would consider such an Ord instance to be hopelessly broken, and I hmmmm. for example, imagine files in file manager sorted by size or date -- Best regards, Bulat mailto:Bulat.Ziganshin@... _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: (flawed?) benchmark : sortNeil Mitchell wrote:
> Hi > >> > instance Ord Foo where >> > compare (Foo a _) (Foo b _) = compare a b >> >> I would consider such an Ord instance to be hopelessly broken, and I >> don't think it's the responsibility of authors of functions with Ord >> constraints in their sigs (such as sort) to consider such possibilities >> or specify and control the behaviour of their behaviour for such >> instances. Trying to do this is what leads to horrors such as the "left >> biasing" of Data.Map (for example). > > The sort in Haskell is specified to be "stable". What that means is > that the above ord relation is fine. The above ordering observes all > the necessary mathematical definitions of ordering, assuming an Eq of: > > instance Eq Foo where > (==) (Foo a _) (Foo b _) = (==) a b > >> Unfortunately the Haskell standards don't currently specify sane laws >> for Eq and Ord class instances, but they should. Otherwise knowing a >> type is an instance of Ord tells me nothing that I can rely on. > > Please give the sane law that this ordering violates. I can't see any! The Eq instance you've given violates the law that (x == y) = True implies x = y. Of course the Haskell standard doesn't specify this law, but it should. The Haskell standard doen't even specify that compare x y = EQ implies (x == y) = True, but again it should (what's the purpose of the Eq constraint on Ord class otherwise). > What if I had made the definition of Foo: > > data Foo = Foo Int (Int -> Int) > > Now, is the only possible answer that any Ord instance for Foo is wrong? Yes, if the Foo constuctor is exported. If it's scope confined to one module then you could maintain the invariant that the same function is always associated with a given Int. However, if this is the case then the issue you raise wrt sort behaviour is irrelevant. Regards -- Adrian Hey _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
| < Prev | 1 - 2 - 3 - 4 - 5 - 6 | Next > |
| Free embeddable forum powered by Nabble | Forum Help |