|
View:
New views
19 Messages
—
Rating Filter:
Alert me
|
|
|
Simple FAST lazy functional primesFirst, here it is: primes = 2: 3: sieve 0 primes' 5 primes' = tail primes sieve k (p:ps) x = [x | x<-[x,x+2..p*p-2], and [(x`rem`p)/=0 | p<-take k primes']] ++ sieve (k+1) ps (p*p+2) (thanks to Leon P.Smith for his brilliant idea of directly generating the spans of odds instead of calling 'span' on the infinite odds list). This code is faster than PriorityQueue-based sieve (granted, using my own ad-hoc implementation, so YMMV) in producing the first million primes (testing was done running the ghc -O3 compiled code inside GHCi). The relative performance vs the PQ-version is: 100,000 300,000 1,000,000 primes produced 0.6 0.75 0.97 I've recently came to the Melissa O'Neill's article (I know, I know) and she makes the incredible claim there that the square-of-prime optimization doesn't count if we want to improve the old and "lazy" uprimes = 2: sieve [3,5..] where sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0] Her article gave me the strong impression that she claims that the only way to fix this code is by using Priority Queue based optimization, and then goes on to present astronomical gains in speed by implementing it. Well, I find this claim incredible. First of all, the "naive" code fires up its nested filters much too early, when in fact each must be started only when the prime's square is reached (not only have its work started there - be started there itself!), so that filters are delayed: dprimes = 2: 3: sieve (tail dprimes) [5,7..] where sieve (p:ps) xs = h ++ sieve ps (filter ((/=0).(`rem`p)) (tail t)) where (h,t)=span (< p*p) xs This code right there is on par with the PQ-base code, only x3-4 slower at generating the first million primes. Second, the implicit control structure of linearly nested filters, piling up in front of the supply stream, each with its prime-number-it-filters-by hidden inside it, needs to be explicated into a data structure of primes-to-filter-by (which is in fact the prefix of the primes list we're building itself), so the filtering could be done by one function call instead of many nested calls: xprimes = 2: 3: sieve 0 primes' [5,7..] where primes' = tail xprimes sieve k (p:ps) xs = noDivs k h ++ sieve (k+1) ps (tail t) where (h,t)=span (< p*p) xs noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes')) >From here, using the brilliant idea of Leon P. Smith's of fusing the span and the infinite odds supply, we arrive at the final version, kprimes = 2: 3: sieve 0 primes' 5 where primes' = tail kprimes sieve k (p:ps) x = noDivs k h ++ sieve (k+1) ps (t+2) where (h,t)=([x,x+2..t-2], p*p) noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes')) Using the list comprehension syntax, it turns out, when compiled with ghc -O3, gives it another 5-10% speedup itself. So I take it to disprove the central premise of the article, and to show that simple lazy functional FAST primes code does in fact exist, and that the PQ optimization - which value of course no-one can dispute - is a far-end optimization. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesHello Will,
Monday, November 2, 2009, 10:41:15 AM, you wrote: > (testing was done running the ghc -O3 compiled code inside GHCi). afaik, -O2 should be used -- Best regards, Bulat mailto:Bulat.Ziganshin@... _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesYou can remove the "take k" step by passing along the list of primes
smaller than p instead of k: primes = 2 : 3 : 5 : 7 : sieve [3, 2] (drop 2 primes) sieve qs@(q:_) (p:ps) = [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]] ++ sieve (p:qs) ps I also removed the "x" parameter by generating it from the previous prime. But this also means you have to start at p=5 and q=3, because you want q*q+2 to be odd. Sjoerd On Nov 2, 2009, at 8:41 AM, Will Ness wrote: > > First, here it is: > > primes = 2: 3: sieve 0 primes' 5 > primes' = tail primes > sieve k (p:ps) x > = [x | x<-[x,x+2..p*p-2], and [(x`rem`p)/=0 | p<-take k > primes']] > ++ sieve (k+1) ps (p*p+2) > > (thanks to Leon P.Smith for his brilliant idea of directly generating > the spans of odds instead of calling 'span' on the infinite odds > list). > > This code is faster than PriorityQueue-based sieve (granted, using > my own > ad-hoc implementation, so YMMV) in producing the first million primes > (testing was done running the ghc -O3 compiled code inside GHCi). > The relative performance vs the PQ-version is: > > 100,000 300,000 1,000,000 primes produced > 0.6 0.75 0.97 > > I've recently came to the Melissa O'Neill's article (I know, I know) > and > she makes the incredible claim there that the square-of-prime > optimization > doesn't count if we want to improve the old and "lazy" > > uprimes = 2: sieve [3,5..] where > sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0] > > Her article gave me the strong impression that she claims that the > only way > to fix this code is by using Priority Queue based optimization, and > then > goes on to present astronomical gains in speed by implementing it. > > Well, I find this claim incredible. First of all, the "naive" code > fires up > its nested filters much too early, when in fact each must be started > only > when the prime's square is reached (not only have its work started > there - > be started there itself!), so that filters are delayed: > > dprimes = 2: 3: sieve (tail dprimes) [5,7..] where > sieve (p:ps) xs > = h ++ sieve ps (filter ((/=0).(`rem`p)) (tail t)) > where (h,t)=span (< p*p) xs > > This code right there is on par with the PQ-base code, only x3-4 > slower at > generating the first million primes. > > Second, the implicit control structure of linearly nested filters, > piling up > in front of the supply stream, each with its prime-number-it-filters- > by > hidden inside it, needs to be explicated into a data structure of > primes-to-filter-by (which is in fact the prefix of the primes list > we're building itself), so the filtering could be done by one > function call > instead of many nested calls: > > xprimes = 2: 3: sieve 0 primes' [5,7..] where > primes' = tail xprimes > sieve k (p:ps) xs > = noDivs k h ++ sieve (k+1) ps (tail t) > where (h,t)=span (< p*p) xs > noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes')) > >> From here, using the brilliant idea of Leon P. Smith's of fusing >> the span > and the infinite odds supply, we arrive at the final version, > > kprimes = 2: 3: sieve 0 primes' 5 where > primes' = tail kprimes > sieve k (p:ps) x > = noDivs k h ++ sieve (k+1) ps (t+2) > where (h,t)=([x,x+2..t-2], p*p) > noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes')) > > Using the list comprehension syntax, it turns out, when compiled with > ghc -O3, gives it another 5-10% speedup itself. > > So I take it to disprove the central premise of the article, and to > show > that simple lazy functional FAST primes code does in fact exist, and > that the PQ optimization - which value of course no-one can dispute > - is > a far-end optimization. > > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@... > http://www.haskell.org/mailman/listinfo/haskell-cafe -- Sjoerd Visscher sjoerd@... _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesExcuse me, 2 doesn't have to be in the list of smaller primes, as
we're only generating odd numbers: primes = 2 : 3 : 5 : 7 : sieve [3] (drop 2 primes) sieve qs@(q:_) (p:ps) = [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]] ++ sieve (p:qs) ps Sjoerd On Nov 2, 2009, at 2:07 PM, Sjoerd Visscher wrote: > You can remove the "take k" step by passing along the list of primes > smaller than p instead of k: > > primes = 2 : 3 : 5 : 7 : sieve [3, 2] (drop 2 primes) > sieve qs@(q:_) (p:ps) > = [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]] > ++ sieve (p:qs) ps > > I also removed the "x" parameter by generating it from the previous > prime. But this also means you have to start at p=5 and q=3, because > you want q*q+2 to be odd. > > Sjoerd > > On Nov 2, 2009, at 8:41 AM, Will Ness wrote: > >> >> First, here it is: >> >> primes = 2: 3: sieve 0 primes' 5 >> primes' = tail primes >> sieve k (p:ps) x >> = [x | x<-[x,x+2..p*p-2], and [(x`rem`p)/=0 | p<-take k >> primes']] >> ++ sieve (k+1) ps (p*p+2) >> >> (thanks to Leon P.Smith for his brilliant idea of directly generating >> the spans of odds instead of calling 'span' on the infinite odds >> list). >> >> This code is faster than PriorityQueue-based sieve (granted, using >> my own >> ad-hoc implementation, so YMMV) in producing the first million primes >> (testing was done running the ghc -O3 compiled code inside GHCi). >> The relative performance vs the PQ-version is: >> >> 100,000 300,000 1,000,000 primes produced >> 0.6 0.75 0.97 >> >> I've recently came to the Melissa O'Neill's article (I know, I >> know) and >> she makes the incredible claim there that the square-of-prime >> optimization >> doesn't count if we want to improve the old and "lazy" >> >> uprimes = 2: sieve [3,5..] where >> sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0] >> >> Her article gave me the strong impression that she claims that the >> only way >> to fix this code is by using Priority Queue based optimization, and >> then >> goes on to present astronomical gains in speed by implementing it. >> >> Well, I find this claim incredible. First of all, the "naive" code >> fires up >> its nested filters much too early, when in fact each must be >> started only >> when the prime's square is reached (not only have its work started >> there - >> be started there itself!), so that filters are delayed: >> >> dprimes = 2: 3: sieve (tail dprimes) [5,7..] where >> sieve (p:ps) xs >> = h ++ sieve ps (filter ((/=0).(`rem`p)) (tail t)) >> where (h,t)=span (< p*p) xs >> >> This code right there is on par with the PQ-base code, only x3-4 >> slower at >> generating the first million primes. >> >> Second, the implicit control structure of linearly nested filters, >> piling up >> in front of the supply stream, each with its prime-number-it- >> filters-by >> hidden inside it, needs to be explicated into a data structure of >> primes-to-filter-by (which is in fact the prefix of the primes list >> we're building itself), so the filtering could be done by one >> function call >> instead of many nested calls: >> >> xprimes = 2: 3: sieve 0 primes' [5,7..] where >> primes' = tail xprimes >> sieve k (p:ps) xs >> = noDivs k h ++ sieve (k+1) ps (tail t) >> where (h,t)=span (< p*p) xs >> noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes')) >> >>> From here, using the brilliant idea of Leon P. Smith's of fusing >>> the span >> and the infinite odds supply, we arrive at the final version, >> >> kprimes = 2: 3: sieve 0 primes' 5 where >> primes' = tail kprimes >> sieve k (p:ps) x >> = noDivs k h ++ sieve (k+1) ps (t+2) >> where (h,t)=([x,x+2..t-2], p*p) >> noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes')) >> >> Using the list comprehension syntax, it turns out, when compiled with >> ghc -O3, gives it another 5-10% speedup itself. >> >> So I take it to disprove the central premise of the article, and to >> show >> that simple lazy functional FAST primes code does in fact exist, and >> that the PQ optimization - which value of course no-one can dispute >> - is >> a far-end optimization. >> >> >> _______________________________________________ >> Haskell-Cafe mailing list >> Haskell-Cafe@... >> http://www.haskell.org/mailman/listinfo/haskell-cafe > > -- > Sjoerd Visscher > sjoerd@... > > > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@... > http://www.haskell.org/mailman/listinfo/haskell-cafe -- Sjoerd Visscher sjoerd@... _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesSjoerd Visscher <sjoerd <at> w3future.com> writes:
> [...] 2 doesn't have to be in the list of smaller primes, as > we're only generating odd numbers: > > primes = 2 : 3 : 5 : 7 : sieve [3] (drop 2 primes) > sieve qs@(q:_) (p:ps) > = [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]] > ++ sieve (p:qs) ps > > Sjoerd Thanks! I haven't tested your code's speed yet, but have few points: i wouldn't expect eliminating a parameter to have any effect on performance in compiled code (I haven't a clue whether -O2 or -O3 should be used BTW) if it doesn't eliminate any superfluous computations. second, you prepend bigger primes in front of a list (building the prefix in reverse order); this will cause them to be tested against first. I think (1) we're supposed to test composites against smaller factors first, for the tests to stop sooner. IMO it'll slow the code down. I think (2) you can safely append them at the end instead, for a compiler should be able to just maintain a tail pointer there. But then you loose immediate access to the last found prime; so will need to return the 'x' parameter back. Then you end up with my code exactly :) (only replicating the prefix): primes = 2 : 3 : 5 : 7 : sieve [3] (drop 2 primes) 9 sieve pfx (p:ps) x = [x | x<-[x+2,x+4..p*p-2], and [(x`rem`p)/=0 | p<-pfx]] ++ sieve (pfx++[p]) ps (p*p) it'll be interesting to test my hypothesis (two of them :) ) and see if this has in fact better time than your code. thirdly, (take k) could be compiled out completely into a tail pointer too, maintained and advanced after each step. I would hope a smart compiler to do just that. Again, some more testing is in order. Although, I tested the two approaches on some previous incarnation of this code, and the (take k) vs (pfx++[p]) had exactly same run time (or was better). What I'm after mostly here, is for the code to be clear and simple, and reasonably efficient. Right now it corresponds very nicely to the description of the sieve as filtering all the odds testable by a known prefix of primes, and then going on to proceed with the next batch of them with a prefix that was grown by one more prime. And so on. But more importantly I want it to be known that there's a lot that can be done here, in a natural functional lazy kind of way, before resorting to priority queues and mutable arrays. We could always just use C too. ;) Thanks for the feedback! :) _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesWill Ness <will_n48 <at> yahoo.com> writes:
> But more importantly I want it to be known that there's a lot that can be done > here, in a natural functional lazy kind of way, before resorting to priority > queues and mutable arrays. We could always just use C too. ;) I mean it as an introductory code that's nevertheless good for producing the first million primes or so - not the best sieve ever, but the best (is it?) simple clear functional introductory sieve, instead. That's another reason to using (take k primes') - it practically says it in English, "the first k odd primes". :) _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesSjoerd Visscher <sjoerd <at> w3future.com> writes:
> > Excuse me, 2 doesn't have to be in the list of smaller primes, as > we're only generating odd numbers: > > primes = 2 : 3 : 5 : 7 : sieve [3] (drop 2 primes) > sieve qs@(q:_) (p:ps) > = [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]] > ++ sieve (p:qs) ps > > Sjoerd > Hi, I've run it now. In producing 100,000 primes, your above code takes x3.5 more time than the one I posted. The code modified as I suggested with (qs++[p]) takes exactly the same time as mine. :) Cheers, _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: Simple FAST lazy functional primesOn Nov 2, 2009, at 5:11 PM, Will Ness wrote: > Sjoerd Visscher <sjoerd <at> w3future.com> writes: > >> >> Excuse me, 2 doesn't have to be in the list of smaller primes, as >> we're only generating odd numbers: >> >> primes = 2 : 3 : 5 : 7 : sieve [3] (drop 2 primes) >> sieve qs@(q:_) (p:ps) >> = [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]] >> ++ sieve (p:qs) ps >> >> Sjoerd >> > > Hi, > > I've run it now. In producing 100,000 primes, your above code takes > x3.5 more > time than the one I posted. Too bad. The order of the primes when filtering matters quite a lot! > The code modified as I suggested with (qs++[p]) > takes exactly the same time as mine. :) Yes, append and take are both O(n). Here's another variation which I rather like: primes = 2 : 3 : sieve (tail primes) (notDivides 3) 5 sieve (p:ps) test from = [x | x <- [from, from + 2 .. p * p - 2], test x] ++ sieve ps (\x -> test x && notDivides p x) (p * p + 2) notDivides d x = (x `rem` d) /= 0 I'm curious if GHC can compile this to the same speed as your code. -- Sjoerd Visscher sjoerd@... _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesSjoerd Visscher <sjoerd <at> w3future.com> writes:
> On Nov 2, 2009, at 5:11 PM, Will Ness wrote: > > > Hi, > > > > I've run it now. In producing 100,000 primes, your above code takes > > x3.5 more > > time than the one I posted. > > Too bad. The order of the primes when filtering matters quite a lot! Yep. Most of the composites will have mostly small prime factors, most of the time. :) An English paraphrase of the analysis from the paper, when she proves that whatever you do with composites, doesn't matter (in the _long_ run!) if you test primes for divisibility - because for primes we end up testing against _all_ of the primes in the prefix. This is exactly where her PQ code achieves its coup - it gets its primes for free when it checks its composites and sees the gap there. So the reason not to have these nested functions is to have all the logic represented in the open, in the data structure - here the list, still. PQ is a natural next step. Having a lot of linearly nested functions, there isn't a lot we can do with them. :) > > The code modified as I suggested with (qs++[p]) > > takes exactly the same time as mine. :) > > Yes, append and take are both O(n). I would hope 'append' to be O(1), and 'take' just translated into an upper limit of the testing loop in 'and'. > > Here's another variation which I rather like: > > primes = 2 : 3 : sieve (tail primes) (notDivides 3) 5 > sieve (p:ps) test from > = [x | x <- [from, from + 2 .. p * p - 2], test x] > ++ sieve ps (\x -> test x && notDivides p x) (p * p + 2) > notDivides d x = (x `rem` d) /= 0 > > I'm curious if GHC can compile this to the same speed as your code. It depends on whether it is as good at traversing function call frames at run time, as elements of a list. Or whether it uses the call frames at all, or is able to compile away all of it. :) _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesWill Ness <will_n48 <at> yahoo.com> writes:
> primes = 2: 3: sieve 0 primes' 5 > primes' = tail primes > sieve k (p:ps) x > = [x | x <- [x,x+2..p*p-2], > and [(x`rem`p)/=0 | p <- take k primes']] > ++ sieve (k+1) ps (p*p+2) > > (thanks to Leon P.Smith for his brilliant idea of directly generating > the spans of odds instead of calling 'span' on the infinite odds list). > > The relative performance vs the PQ-version is: > > 100,000 300,000 1,000,000 primes produced > 0.6 0.75 0.97 > One _crucial_ tidbit I've left out: _type_signature_. Adding (:: [Int]) speeds this code up more than TWICE! :) :) 'sieve' can also be used in e.g. primesFrom m = sieve (length h) ps m where (h,(_:ps)) = span (<= (floor.sqrt.fromIntegral) m) primes to get few big primes even faster. :) _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: Simple FAST lazy functional primesOn Mon, Nov 2, 2009 at 1:59 PM, Will Ness <will_n48@...> wrote:
If you are okay with Int, then maybe you're also happy with Int32 or Word32. If so, why don't you use template haskell to build the list at compile time? If you do that, then getting the kth prime at run-time is O(k). Take that AKS! :)
Jason _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesJason Dagit <dagit <at> codersbase.com> writes:
> On Mon, Nov 2, 2009 at 1:59 PM, Will Ness <will_n48 <at> yahoo.com> wrote: > Will Ness <will_n48 <at> yahoo.com> writes: > > One _crucial_ tidbit I've left out: _type_signature_. > Adding (:: [Int]) speeds this code up more than TWICE! > :) :) > > > If you are okay with Int, then maybe you're also happy with Int32 or Word32. If so, why don't you use template haskell to build the list at compile time? If you do that, then getting the kth prime at run-time is O(k). Take that AKS! :) > O(k), I've removed it since the post actually. Wasn't thinking clearly for a moment, having seen the double speedup! I've found Matthew Brecknell's fast code in old Melissa O'Neill's article here from 2007-02-19 18:14:23 GMT. Without the type signature, it's twice slower than my code. I think it is a fairly faithful translation now of what the sieve is all about, except that it tests its candidate numbers whereas sieve counts over them (and thus is able to skip over primes without testing them). The usual functional approach has it working with each number in isolation, so tests it (to recreate counting state in effect), thus overworks much on primes. Next logical step is to start counting! :) > Jason > > > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe <at> haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: Simple FAST lazy functional primesOn Mon, Nov 02, 2009 at 02:10:12PM -0700, Jason Dagit wrote:
> If you are okay with Int, then maybe you're also happy with Int32 or Word32. > If so, why don't you use template haskell to build the list at compile > time? If you do that, then getting the kth prime at run-time is O(k). Take > that AKS! :) Silly you! If he's into TH than he may just create an -- god forbid -- array and get the kth prime in O(1). :) -- Felipe. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Re: Simple FAST lazy functional primesOn Mon, Nov 2, 2009 at 2:40 PM, Will Ness <will_n48@...> wrote:
Jason Dagit <dagit <at> codersbase.com> writes: By the way, do you understand where the speedup with Int is coming from? As I understand it, there are two main places. One is that the type class dictionary passing can be removed (GHC might figure this out already, I'd need to check the core to be sure). The other is that GHC is likely unboxing to it's primitive Int# type.
If none of that made sense, or you'd like to know how you can double check what I'm talking about, then Real-World Haskell has a chapter with an example:
Good luck! Jason _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
|
|
|
Re: Simple FAST lazy functional primesJason Dagit <dagit <at> codersbase.com> writes:
> By the way, do you understand where the speedup with Int is coming from? As I understand it, there are two main places. One is that the type class dictionary passing can be removed (GHC might figure this out already, I'd need to check the core to be sure). The other is that GHC is likely unboxing to it's primitive Int# type. > [...] > Good luck! > Jason Thanks! Writing the super-fast sieve wasn't my objective here though. It rather was writing the fastest possible simple functional lazy code true to the sieve's definition, and understanding it better that way (that's the added bonus). As it stands now, this code seems a rather faithful description of what _is_ sieve, except that it tests each number in isolation instead of counting over a bunch of them at once (skipping over primes, getting them for free). THAT's the crucial difference, which the article seems trying to explain but never quite gets it in such simple terms. All the extra activity is kept to absolute minimum here, and _now_ the main thing can be dealt with further, if so desired - like turning to using the PQ thing, etc. Then if we were to compare them, it wouldn't be like comparing apples with orange juice. :) That's what it felt like, seeing the PQ code compared with the classic "naive" version in that article. I'm reasonably sure that PQ-augmented, this code will be even faster, not slower, even for the first million primes. This whole experience proves it that the clearest code can also be the fastest (and may be necessarily so). Seeing it described in that article as if clarity must be paid for with efficiency (and vice versa), didn't seem right to me. Cheers, > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe <at> haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
Re: Simple FAST lazy functional primesHi Steve, Steve <stevech1097 <at> yahoo.com.au> writes: > > Hi Will, > > I had previously tested the Melissa O'Neil prime function (using the > primes 0.1.1 package) and found it to be fast (for a functional > approach), but with high memory use. > > To fully test a prime function, I think you should: > 1. Test for more than the first 10^6 primes. > Generating all primes to 10^6 is quite easy for modern computers. Most > prime generating functions will run in less than 1 sec and look fast. > Try generating all primes to 10^7 and 10^8 then you will see how 'fast' > these lazy functional methods really are. > 2. Measure memory use. > As you move above 10^6 and test primes up to 10^7, 10^8, and 10^9, > memory use becomes very important. A prime function with excessive > memory use will soon consume all of the system's memory. > > Here's another primes function which performs quite well. > > primes :: [Int] > primes = 2 : filter (isPrime primes) [3,5..] where > isPrime (p:ps) n > | mod n p == 0 = False > | p*p > n = True > | otherwise = isPrime ps n > isPrime [] _ = False -- never used, but avoids compiler warning > > Here's some results from my PC, for generating primes to 10^8. > > 10**6 10**7 10**8 > secs MiB secs MiB secs MiB > ------------------------------- > 1 0.01 0 0.1 2 2 14 > 2 0.56 7 11.1 43 270 306 > 3 0.61 7 11.8 44 260 342 > 4 0.43 36 5.4 345 900 not finished > > 1 using a Haskell ST Array > 2 your primes function > 3 my primes function, listed above > 4 Melissa O'Neil method from primes 0.1.1 package > > To summarise the results from the tests I've run: > - if you want fast functional primes, forget it, its not possible. > - if you just want fast primes, then use C, or a Haskell array. > - if performance does not matter and slow primes are acceptable, then > use the purely functional approach. > > Regards, > Steve > you just have a fast PC that's all, :) so a million is not enough for you. My old laptop is 50 times slower. :) Seriously though, your conclusions seem entirely plausible to me. My goal here was to have a Haskell code for primes, that is decent. Nothing more. The reason your code is slightly slower is of course that in effect it recalculates the needed primes prefix for each new candidate. If you could somehow thread its length through, it might have sped it up some more. I've just tried it and it was twice slower than mine. (?) I didn't use the [Int] signature in both. Maybe it's not real issues we're dealing with here, but compiler/OS/CPU issues? (or have you've forgotten to put the [Int] signature into my code too, when tested? It runs twice as fast with it). Although your code has an advantage that it is very easy to add the wheel optimization to it. BTW I don't know about the code in the package, but the one in the article makes terrible faux-pas of adding each prime into the queue as soon as it is produced; this could easily account for a memory blow-up you're seeing. What's really needed, is to plug a decent PQ implementation into my framework, which does absolute minimum of repeated calculations it seems. What I have now, is this: qprimes = 2: 3: sieve emptyQ primes' 5 where primes' = tail qprimes sieve q (p:ps) x = h ++ sieve (addQ q' (2*p) (p*p+2*p)) ps (p*p+2) where (h,q') = noComps q [x,x+2..p*p-2] ...... The main deficiency of list-based sieves, as I've now came to realize and formulate in simple enough terms for myself to understand, is that they work with each number in isolation, and thus must test primes as well as composites. Testing composites on average is cheap, because most of them will have small factors; testing primes is costly. Imperative sieves avoid that by working over spans of numbers at once, so they get their primes for free, when they "see" gaps in produced/marked composites (I repeat myself here, but am not sure if you've read this my explanation in other posts). What counts here, is the direct access to random memory - or the numbers written out on a blackboard. Melissa's PQ approach tries to emulate that. Not crossing them off, but seeing the gaps in between. One is tempted to treat Haskell as high-level executable definition, hoping for a compiler to turn it into the fastest possible low-level code. I know I can translate it into C by hand fairly well; why wouldn't the compiler? It seems Haskell compilers have much room for improvement. :) _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
|
|
|
|
|
Re: Simple FAST lazy functional primesSteve <stevech1097 <at> yahoo.com.au> writes:
> > On Tue, 2009-11-03 at 10:52 -0500, Will wrote: > > I've just tried it and it was twice slower than mine. (?) I didn't use > > the [Int] > > signature in both. [...] It runs twice as fast with it). > > Although your > > code has an advantage that it is very easy to add the wheel > > optimization to it. > > I used [Int] for both. If you didn't use [Int] then it defaults to > [Integer] and you are probably testing the speed of GMP integer > operations (rather than the speed of Haskell Int operations) which could > give differing conclusions. When both are declared [Int], the ratios are [1.50, 1.40, 1.35, 1.26, 1.26] for speed of your version vs mine in producing 100,000, 300,000, 1, 2 and 3 million primes, on my laptop, with memory consistently at 120% (as reported by GHCi with :s +s switch). Haven't tried any above that, as it's old and slow. But clearly the two versions are on par with each other. The reason I prefer my version is that it's in a form easy to be tweaked with further. :) > I had looked into using a wheel before. Its nice in theory, but not so > useful in practice. At least that's my experience where using a wheel > made the primes function slower. interesting. > Adding your code and conclusions to "Prime numbers" on the Haskell wiki, > could be useful. > http://www.haskell.org/haskellwiki/Prime_numbers Thank you for pointing me to that page. I've just done that. :) The short description that I've arrived at in the end, is that my version explicitly uses successive prefixes of the primes list to test batches of odds between successive squares of primes. Now it's short and clear. Thanks! > I've just updated the page to split it into "Finding Primes" and > "Testing Primality". > > Steve > _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@... http://www.haskell.org/mailman/listinfo/haskell-cafe |
| Free embeddable forum powered by Nabble | Forum Help |