|
|
New libc malloc patch

There is a patch that contains a new libc malloc implementation at:

http://www.canonware.com/~jasone/jemalloc/jemalloc_20051127a.diff

This implementation is very different from the current libc malloc.
Probably the most important difference is that this one is designed
with threads and SMP in mind.

The patch has been tested for stability quite a bit already, thanks
mainly to Kris Kennaway. However, any help with performance testing
would be greatly appreciated. Specifically, I'd like to know how
well this malloc holds up to threaded workloads on SMP systems. If
you have an application that relies on threads, please let me know
how performance is affected.

Naturally, if you notice horrible performance or ridiculous resident
memory usage, that's a bad thing and I'd like to hear about it.

Thanks,
Jason

=== Important notes:

* You need to do a full buildworld/installworld in order for the
  patch to work correctly, due to various integration issues with the
  threads libraries and rtld.

* The virtual memory size of processes, as reported in the SIZE field
  by top, will appear astronomical for almost all processes (32+ MB).
  This is expected; it is merely an artifact of using large mmap()ed
  regions rather than sbrk().

* In keeping with the default option settings for CURRENT, the A and
  J flags are enabled by default. When conducting performance tests,
  specify MALLOC_OPTIONS="aj".

_______________________________________________
freebsd-current@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@..."
|
|
Re: New libc malloc patch

Jason,

I see that you have included an implementation of red-black tree CPP
macros, but wouldn't it be better if you were to use the ones in
<sys/tree.h>? I have only had a cursory look, but I would have
thought that would be the way to go. Just a suggestion.

Best regards,
--
Hiten Pandya
hmp at freebsd.org

On 29/11/05, Jason Evans <jasone@...> wrote:
> There is a patch that contains a new libc malloc implementation at:
>
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051127a.diff
>
> This implementation is very different from the current libc malloc.
> Probably the most important difference is that this one is designed
> with threads and SMP in mind.
>
> The patch has been tested for stability quite a bit already, thanks
> mainly to Kris Kennaway. However, any help with performance testing
> would be greatly appreciated. Specifically, I'd like to know how
> well this malloc holds up to threaded workloads on SMP systems. If
> you have an application that relies on threads, please let me know
> how performance is affected.
>
> Naturally, if you notice horrible performance or ridiculous resident
> memory usage, that's a bad thing and I'd like to hear about it.
>
> Thanks,
> Jason
>
> === Important notes:
>
> * You need to do a full buildworld/installworld in order for the
> patch to work correctly, due to various integration issues with the
> threads libraries and rtld.
>
> * The virtual memory size of processes, as reported in the SIZE field
> by top, will appear astronomical for almost all processes (32+ MB).
> This is expected; it is merely an artifact of using large mmap()ed
> regions rather than sbrk().
>
> * In keeping with the default option settings for CURRENT, the A and
> J flags are enabled by default. When conducting performance tests,
> specify MALLOC_OPTIONS="aj".
|
|
Re: New libc malloc patch

Just curious: what is the grand plan for this work? I wonder if it
would make sense to have two mallocs in the system, so that the user
can select the one that better suits his needs.

-Maxim

Jason Evans wrote:
> There is a patch that contains a new libc malloc implementation at:
>
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051127a.diff
>
> This implementation is very different from the current libc malloc.
> Probably the most important difference is that this one is designed with
> threads and SMP in mind.
>
> The patch has been tested for stability quite a bit already, thanks
> mainly to Kris Kennaway. However, any help with performance testing
> would be greatly appreciated. Specifically, I'd like to know how well
> this malloc holds up to threaded workloads on SMP systems. If you have
> an application that relies on threads, please let me know how
> performance is affected.
>
> Naturally, if you notice horrible performance or ridiculous resident
> memory usage, that's a bad thing and I'd like to hear about it.
>
> Thanks,
> Jason
>
> === Important notes:
>
> * You need to do a full buildworld/installworld in order for the patch
> to work correctly, due to various integration issues with the threads
> libraries and rtld.
>
> * The virtual memory size of processes, as reported in the SIZE field by
> top, will appear astronomical for almost all processes (32+ MB). This
> is expected; it is merely an artifact of using large mmap()ed regions
> rather than sbrk().
>
> * In keeping with the default option settings for CURRENT, the A and J
> flags are enabled by default. When conducting performance tests,
> specify MALLOC_OPTIONS="aj".
|
|
Re: New libc malloc patch

On Nov 29, 2005, at 2:52 AM, Hiten Pandya wrote:
> I see that you have included an implementation of red-black tree CPP
> macros, but wouldn't it be better if you were to use the ones in
> <sys/tree.h>? I have only had a cursory look, but I would have
> thought that would be the way to go.

There is a feature missing from sys/tree.h that I need (rb_nsearch()
in the patch), but you are right that it would probably be best to
use sys/tree.h. I am going to work on adding RB_NFIND(), and will
then try switching to sys/tree.h.

Thanks,
Jason
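Note for readers unfamiliar with the operation under discussion: an "nsearch" lookup, as RB_NFIND() provides in <sys/tree.h>, returns the node whose key matches or, failing that, the node with the next greater key. The sketch below only illustrates those semantics on a plain, unbalanced binary search tree. It is not the code from the patch or from sys/tree.h, and the names `node`, `bst_insert` and `bst_nsearch` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; a real red-black tree also stores a color. */
struct node {
    int key;
    struct node *left, *right;
};

/* Insert n into the tree rooted at *root (no rebalancing, for brevity). */
void bst_insert(struct node **root, struct node *n)
{
    assert(root != NULL && n != NULL);
    n->left = n->right = NULL;
    while (*root != NULL)
        root = (n->key < (*root)->key) ? &(*root)->left : &(*root)->right;
    *root = n;
}

/* Return the node with the smallest key >= key, or NULL if there is none. */
struct node *bst_nsearch(struct node *root, int key)
{
    struct node *best = NULL;

    while (root != NULL) {
        if (key == root->key)
            return root;            /* exact match */
        if (key < root->key) {
            best = root;            /* candidate; a closer one may be on the left */
            root = root->left;
        } else {
            root = root->right;     /* everything here is too small */
        }
    }
    return best;
}
```

An allocator can presumably use this kind of lookup for best-fit searches ("smallest free region at least this large"), which a plain exact-match find cannot answer.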
|
|
Re: New libc malloc patch

On Nov 29, 2005, at 3:37 AM, Maxim Sobolev wrote:
> Just curious what is the grand plan for this work? I wonder if it
> will make sense to have two malloc's in the system, so that user
> can select one which better suits his needs.

The plan for this work is to replace the current malloc, rather than
augmenting it. There is a long history in Unix of using shared
library tricks to override the system malloc, and the patch does not
change the ability to do so. However, in my opinion, explicitly
providing multiple implementations of malloc in the base OS misses
the point of providing a general purpose memory allocator. The goal
is to have a single implementation that works well for the vast
majority of extant programs, and to allow applications to provide
their own implementations when the general purpose allocator fails to
perform adequately.

phkmalloc did an excellent job in this capacity for quite some time,
but now that we need to commonly support threaded programs on SMP
systems, phkmalloc is being strained rather badly. This isn't an
indication that we need multiple malloc implementations in the base
OS; rather it indicates that the system malloc implementation needs
to take into account constraints that did not exist when phkmalloc
was designed.

Jason
|
|
Re: New libc malloc patch

In message <1A7D4B98-9474-42B6-8A21-4C9AB8582EC1@...>, Jason Evans writes:
>[...] phkmalloc did an excellent job in this capacity
>for quite some time, but now that we need to commonly support
>threaded programs on SMP systems, phkmalloc is being strained rather
>badly. This isn't an indication that we need multiple malloc
>implementations in the base OS; rather it indicates that the system
>malloc implementation needs to take into account constraints that did
>not exist when phkmalloc was designed.

The malloc that phkmalloc replaced was written at some point in the
1980s on a VAX, and more or less assumed that the VAX was effectively
a single-user machine without effective paging algorithms.

Phkmalloc was written in 1994/5, when I had 4MB of RAM in my "Gateway
Handbook 486", and very strongly assumed that with the RAM prices of
the day, I could not afford an upgrade.

I gave a talk about phkmalloc at USENIX ATC 1998 in New Orleans. One
of the central points in the talk was that infrastructure code should
have regular service overhauls, to check that the assumptions in the
design are still valid.

In addition to assumptions phkmalloc makes which are no longer
relevant, there are many assumptions which should be made today which
phkmalloc is not aware of, multi-threading being but one of them:
cache line effects, pipeline prefetching, multi-CPU systems, different
VM system algorithms, larger address spaces, etc etc etc.

Once Jason is done, I have no doubts that "jemalloc" will beat
phkmalloc in all relevant benchmarking, thereby neatly rendering any
discussion about having multiple mallocs in the tree pointless.

A big thank you from the author of phkmalloc to Jason for following
the service manual to the letter :-)

Poul-Henning

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@...                 | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
|
|
Re: New libc malloc patch

I have a rather strong objection to make to this proposal (read: if
this change goes in I'm going to have to go through the effort of
ripping it out locally...):

There exists a problem right now--localized to i386 and any other arch
based on 32-bit pointers: address space is simply too scarce.

Your decision to switch to using mmap as the exclusive source of malloc
buckets is admirable for its modernity but it simply cannot stand unless
someone steps up to change the way mmap and brk interact within the
kernel.

The trouble arises from the need to set MAXDSIZ and the resulting effect
it has in determining the start of the mmap region--which I might add is
the location where the shared library loader is placed. This effectively
(and explicitly) sets the limit for how large of a contiguous region can
be allocated with brk.

What you've done by switching the system malloc to exclusively using
mmap is induced a lot of motivation on the part of the sysadmin to push
that brk/mmap boundary down.

This wouldn't be a problem except that you've effectively shot in the
foot dozens of alternative C malloc implementations, not to mention the
memory allocator routines used in obscure languages such as Modula-3 and
Haskell that rely on brk derived buckets.

This isn't playing very nicely!

I looked into the issues and limitations with phkmalloc several months
ago and concluded that simply adopting ptmalloc2 (the Linux malloc) was
the better approach--notably it is willing to draw from both brk and
mmap, and it also implements per-thread arenas.

There is also cause for concern about your "cache-line" business. Simply
on the face of it there is the problem that the scheduler does not do a
good job of pinning threads to individual CPUs. The threads are already
bouncing from CPU to CPU and thrashing (really thrashing) each CPU cache
along the way.

Second, you've forgotten that there is a layer of indirection between
your address space and the cache: the mapping of logical pages (what you
can see in userspace) to physical pages (the addresses of which actually
matter for the purposes of the cache). I don't recall off-hand whether or
not the L1 cache on i386 is based on tags of the virtual addresses, but I
am certain that the L2 and L3 caches tag the physical addresses, not the
virtual addresses.

This means that your careful address selection based on cache-lines will
only work out if it is done in the vm codepath: remember that the mapping
of physical addresses to the virtual addresses that come back from mmap
can be delayed arbitrarily long into the future, depending on when the
program actually goes to touch that memory.

Furthermore, the answer may vary depending on the architecture or even
the processor version.

-Jon

On Mon, 28 Nov 2005, Jason Evans wrote:
> There is a patch that contains a new libc malloc implementation at:
>
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051127a.diff
>
> This implementation is very different from the current libc malloc.
> Probably the most important difference is that this one is designed
> with threads and SMP in mind.
>
> The patch has been tested for stability quite a bit already, thanks
> mainly to Kris Kennaway. However, any help with performance testing
> would be greatly appreciated. Specifically, I'd like to know how
> well this malloc holds up to threaded workloads on SMP systems. If
> you have an application that relies on threads, please let me know
> how performance is affected.
>
> Naturally, if you notice horrible performance or ridiculous resident
> memory usage, that's a bad thing and I'd like to hear about it.
>
> Thanks,
> Jason
>
> === Important notes:
>
> * You need to do a full buildworld/installworld in order for the
> patch to work correctly, due to various integration issues with the
> threads libraries and rtld.
>
> * The virtual memory size of processes, as reported in the SIZE field
> by top, will appear astronomical for almost all processes (32+ MB).
> This is expected; it is merely an artifact of using large mmap()ed
> regions rather than sbrk().
>
> * In keeping with the default option settings for CURRENT, the A and
> J flags are enabled by default. When conducting performance tests,
> specify MALLOC_OPTIONS="aj".
|
|
Re: New libc malloc patch

Jon,

Thanks for your comments. Fortunately, I don't think things are
quite as hopeless as you think, though you may be right that some
adjustments are necessary. Specific replies follow.

On Nov 29, 2005, at 12:06 PM, Jon Dama wrote:
> There exists a problem right now--localized to i386 and any other arch
> based on 32-bit pointers: address space is simply too scarce.
>
> Your decision to switch to using mmap as the exclusive source of malloc
> buckets is admirable for its modernity but it simply cannot stand unless
> someone steps up to change the way mmap and brk interact within the
> kernel.
>
> The trouble arises from the need to set MAXDSIZ and the resulting effect
> it has in determining the start of the mmap region--which I might add is
> the location that the shared library loader is placed. This effectively
> (and explicitly) sets the limit for how large of a contiguous region can
> be allocated with brk.
>
> What you've done by switching the system malloc to exclusively using
> mmap is induced a lot of motivation on the part of the sysadmin to push
> that brk/mmap boundary down.
>
> This wouldn't be a problem except that you've effectively shot in the
> foot dozens of alternative c malloc implementations, not to mention the
> memory allocator routines used in obscure languages such as Modula-3 and
> Haskell that rely on brk derived buckets.
>
> This isn't playing very nicely!

Where should MAXDSIZ be? Given scarce address space, the best we can
hope for is setting it to the "least bad" default, as measured by
what the programs we care about do. No matter what we do, some
programs lose.

That said, it turns out that adding the ability to allocate via brk
isn't hard. The code already contains the logic to recycle address
ranges, in order to reduce mmap system call overhead. All of the
locking infrastructure is also already in place. The only necessary
modifications are:

1) explicit use of brk until all data segment space is consumed,
2) special case code for addresses in the brk range, so that
   madvise() is used instead of munmap(), and
3) preferential reuse of the brk address space over mmap'ed memory.

Do you agree that there is no need for using brk on 64-bit systems?

> I looked into the issues and limitations with phkmalloc several months
> ago and concluded that simply adopting ptmalloc2 (the linux malloc) was
> the better approach--notably it is willing to draw from both brk and
> mmap, and it also implements per-thread arenas.

ptmalloc takes a different approach to per-thread arenas that has
been shown in multiple papers to not scale as well as the approach I
took. The difference isn't significant until you get to 8+ CPUs, but
we already have systems running with enough CPUs that this is an issue.

> There is also cause for concern about your "cache-line" business. Simply
> on the face of it there is the problem that the scheduler does not do a
> good job of pinning threads to individual CPUs. The threads are already
> bouncing from cpu to cpu and thrashing (really thrashing) each CPU cache
> along the way.
>
> Second, you've forgotten that there is a layer of indirection between
> your address space and the cache: the mapping of logical pages (what you
> can see in userspace) to physical pages (the addresses of which actually
> matter for the purposes of the cache). I don't recall off-hand whether or
> not the L1 cache on i386 is based on tags of the virtual addresses, but I
> am certain that the L2 and L3 caches tag the physical addresses not the
> virtual addresses.
>
> This means that your careful address selection based on cache-lines will
> only work out if it is done in the vm codepath: remember the mapping of
> physical addresses to the virtual addresses that come back from mmap can
> be delayed arbitrarily long into the future depending on when the program
> actually goes to touch that memory.
>
> Furthermore, the answer may vary depending on the architecture or even
> the processor version.

I don't think you understand the intent for the "cache-line
business". There is only one intention: avoid storing data
structures in the same cache line if they are likely to be accessed
simultaneously by multiple threads. For example, if two independent
structures are stored right next to each other, although they do not
require synchronization protection from each other, the hardware will
send a cache line invalidation message to other CPUs every time
anything in the cache line is modified. This means horrible cache
performance if the data are modified often.

As a particular example, there are per-arena data structures that are
modified quite often (arena_t). If two arenas were right next to
each other, then they could share a cache line, and performance would
potentially be severely impacted.

|---arena_t---|---arena_t---|
|   |   |   |   |   |   |   |
             ^^^
             BAD!

Thanks,
Jason
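Note: the layout problem described in this message is commonly avoided by aligning each frequently modified structure to a cache-line boundary, so that no two of them can share a line. The following is a minimal C11 sketch of that general technique, not the patch's actual code. The 64-byte line size and the `arena_stats` structure are assumptions made for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64    /* assumed line size; the real value varies by CPU */

/* Hypothetical per-arena data that one thread modifies frequently. */
struct arena_stats {
    alignas(CACHELINE) unsigned long nmalloc;
    unsigned long nfree;
};

/*
 * alignas on the first member raises the alignment of the whole struct to
 * CACHELINE, and sizeof is rounded up to a multiple of that alignment, so
 * consecutive array elements never share a line.
 */
static_assert(sizeof(struct arena_stats) % CACHELINE == 0,
              "each element occupies whole cache lines");

struct arena_stats arenas[4];

/* Return nonzero if a and b fall in the same (assumed) cache line. */
int same_cache_line(const void *a, const void *b)
{
    return (uintptr_t)a / CACHELINE == (uintptr_t)b / CACHELINE;
}
```

The cost is wasted space: here two counters occupy a full 64 bytes, which is acceptable for a handful of internal structures but not for every allocation.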
|
|
Re: New libc malloc patch

Jason,

Actually I didn't mean to imply it was hopeless at all. :-)

Obviously there are solutions to the address space issues, and I agree
it is possible to change your code to improve the situation quite a bit.
I think the best one might be to permit mmap to actually consume space
below maxdsiz, either when hinted to do so or when the space above is
consumed; but before I recommend that too strongly, I must say that I
haven't looked into what that would entail at all.

Let me take a closer look at what you are doing with regards to
cache-lines. You seem to be implying that you are only taking care in
regards to how you malloc within a given page?

I have a suspicion that it might just be better to dump the problem on to
the application in the sense that no malloc should ever be less than the
size of one cache line. Perhaps this is what you are doing?

-Jon

On Tue, 29 Nov 2005, Jason Evans wrote:
> Jon,
>
> Thanks for your comments. Fortunately, I don't think things are
> quite as hopeless as you think, though you may be right that some
> adjustments are necessary. Specific replies follow.
>
> On Nov 29, 2005, at 12:06 PM, Jon Dama wrote:
> > There exists a problem right now--localized to i386 and any other arch
> > based on 32-bit pointers: address space is simply too scarce.
> >
> > Your decision to switch to using mmap as the exclusive source of malloc
> > buckets is admirable for its modernity but it simply cannot stand unless
> > someone steps up to change the way mmap and brk interact within the
> > kernel.
> >
> > The trouble arises from the need to set MAXDSIZ and the resulting effect
> > it has in determining the start of the mmap region--which I might add is
> > the location that the shared library loader is placed. This effectively
> > (and explicitly) sets the limit for how large of a contiguous region can
> > be allocated with brk.
> > > > What you've done by switching the system malloc to exclusively using > > mmap is induced a lot of motivation on the part of the sysadmin to > > push > > that brk/mmap boundary down. > > > > This wouldn't be a problem except that you've effectively shot in > > the foot > > dozens of alternative c malloc implementations, not to mention the > > memory > > allocator routines used in obscure languages such as Modula-3 and > > Haskell > > that rely on brk derived buckets. > > > > This isn't playing very nicely! > > Where should MAXDSIZ be? Given scarce address space, the best we can > hope for is setting it to the "least bad" default, as measured by > what programs we care about do. No matter what we do, some programs > lose. > > That said, it turns out that adding the ability to allocate via brk > isn't hard. The code already contains the logic to recycle address > ranges, in order to reduce mmap system call overhead. All of the > locking infrastructure is also already in place. The only necessary > modifications are 1) explicit use of brk until all data segment space > is consumed, 2) special case code for addresses in the brk range, so > that madvise() is used instead of munmap(), and 3) preferential re- > use of the brk address space over mmap'ed memory. > > Do you agree that there is no need for using brk on 64-bit systems? > > > I looked into the issues and limitations with phkmalloc several > > months ago > > and concluded that simply adopting ptmalloc2 (the linux malloc) was > > the > > better approach--notably it is willing to draw from both brk and > > mmap, and > > it also implements per-thread arenas. > > ptmalloc takes a different approach to per-thread arenas that has > been shown in multiple papers to not scale as well as the approach I > took. The difference isn't significant until you get to 8+ CPUs, but > we already have systems running with enough CPUs that this is an issue. 
> > > There is also cause for concern about your "cache-line" business. > > Simply > > on the face of it there is the problem that the scheduler does not > > do a > > good job of pinning threads to individual CPUs. The threads are > > already > > bounding from cpu to cpu and thrashing (really thrashing) each CPU > > cache > > along the way. > > > > Second, you've forgotten that there is a layer of indirection > > between your > > address space and the cache: the mapping of logical pages (what you > > can > > see in userspace) to physical pages (the addresses of which actually > > matter for the purposes of the cache). I don't recall off-hand > > whether or > > not the L1 cache on i386 is based on tags of the virtual addresses, > > but I > > am certain that the L2 and L3 caches tag the physical addresses not > > the > > virtual addresses. > > > > This means that your careful address selection based on cache-lines > > will > > only work out if it is done in the vm codepath: remember the > > mapping of > > physical addresses to the virtual addresses that come back from > > mmap can > > be delayed arbitrarily long into the future depending on when the > > program > > actually goes to touch that memory. > > > > Furthermore, the answer may vary depending on the architecture or > > even the > > processor version. > > I don't think you understand the intent for the "cache-line > business". There is only one intention: avoid storing data > structures in the same cache line if they are likely to be accessed > simultaneously by multiple threads. For example, if two independent > structures are stored right next to each other, although they do not > require synchronization protection from each other, the hardware will > send a cache line invalidation message to other CPUs every time > anything in the cache line is modified. This means horrible cache > performance if the data are modified often. 
> > As a particular example, there are per-arena data structures that are > modified quite often (arena_t). If two arenas were right next to > each other, then they could share a cache line, and performance would > potentially be severly impacted. > > |---arena_t---|---arena_t---| > | | | | | | | | > ^^^ > BAD! > > Thanks, > Jason > freebsd-current@... mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscribe@..." |
|
|
Re: New libc malloc patch

On Nov 29, 2005, at 2:21 PM, Jon Dama wrote:
> Let me take a closer look at what you are doing with regards to
> cache-lines. You seem to be implying that you are only taking care in
> regards to how you malloc within a given page?

You are correct that I am only taking care about allocations within a
given page.

> I have a suspicion that it might just be better to dump the problem on
> to the application in the sense that no malloc should ever be less
> than the size of one cache line. Perhaps this is what you are doing?

I am only worrying about cache line alignment for malloc's internal
data structures. It's up to the application to do this for its
allocations, if necessary (doing so for all allocations would induce
unacceptable internal fragmentation). This implementation provides
posix_memalign(3), which makes it much less painful for the
application to do so.

Jason
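Note: an application that does want a particular allocation to start on a cache-line boundary can call posix_memalign(3) roughly as sketched below. The 64-byte line size and the helper name `alloc_cacheline_aligned` are assumptions for illustration; only posix_memalign() itself is a standard interface.

```c
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign() under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed line size; must be a power of two and a multiple of sizeof(void *). */
#define CACHELINE 64

/*
 * Allocate size bytes starting on a cache-line boundary.  The size is
 * rounded up to whole lines so the block does not share its last line with
 * a neighboring allocation.  Returns NULL on failure; release with free().
 */
void *alloc_cacheline_aligned(size_t size)
{
    void *p = NULL;
    size_t rounded = (size + CACHELINE - 1) & ~(size_t)(CACHELINE - 1);

    if (rounded < size)             /* rounding overflowed */
        return NULL;
    if (posix_memalign(&p, CACHELINE, rounded) != 0)
        return NULL;
    assert((uintptr_t)p % CACHELINE == 0);
    return p;
}
```

Rounding a 100-byte request up to 128 bytes shows the internal fragmentation mentioned in the message: applied to every small allocation, the waste would add up quickly.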
|
|
Re: New libc malloc patch

Jon Dama wrote:
>I looked into the issues and limitations with phkmalloc several months ago
>and concluded that simply adopting ptmalloc2 (the linux malloc) was the
>better approach--notably it is willing to draw from both brk and mmap, and
>it also implements per-thread arenas.

Hi Jon,

Is there any chance you could test jemalloc against ptmalloc2? For the
next ten years, I would like to see us using the best-performing memory
allocator. :-)

David Xu
|
|
Re: New libc malloc patch

> There is a patch that contains a new libc malloc implementation at:
>
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051127a.diff
>
> This implementation is very different from the current libc malloc.
> Probably the most important difference is that this one is designed
> with threads and SMP in mind.

Do you need CURRENT for this? I patched and tried buildworld on 6.0
stable but no go.

regards
Claus
|
|
Re: New libc malloc patch

Jason Evans wrote:
> * The virtual memory size of processes, as reported in the SIZE field by
> top, will appear astronomical for almost all processes (32+ MB). This is
> expected; it is merely an artifact of using large mmap()ed regions rather
> than sbrk().

Hi,

I just read that mmap() part and have to wonder: Is it possible to
introduce something like the guard pages that OpenBSD has implemented?
I'd love to try this out and see the dozens of applications that fail
due to off-by-one bugs.

If the security features of OpenBSD's new malloc() could be implemented
as new MALLOC_OPTIONS directives, that would be fantastic!

Ulrich Spoerlein
--
PGP Key ID: F0DB9F44                            Encrypted mail welcome!
Fingerprint: F1CE D062 0CA9 ADE3 349B 2FE8 980A C6B5 F0DB 9F44
Ok, which part of "Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn."
didn't you understand?
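Note: the guard-page idea can be sketched with mmap(2) and mprotect(2) by placing an allocation so that it ends right before an inaccessible page; a write one byte past the end then faults immediately. This only illustrates the general technique and is not OpenBSD's implementation. The function names are hypothetical, allocations are limited to one page, and alignment of the returned pointer is ignored for brevity.

```c
#define _DEFAULT_SOURCE           /* glibc: expose MAP_ANON under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Allocate size bytes (at most one page) ending exactly at a guard page. */
void *guarded_alloc(size_t size)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    char *base;

    if (size == 0 || size > pagesz)
        return NULL;

    /* Two pages: one usable, one guard. */
    base = mmap(NULL, 2 * pagesz, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANON, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Any access to the second page now raises SIGSEGV. */
    if (mprotect(base + pagesz, pagesz, PROT_NONE) != 0) {
        munmap(base, 2 * pagesz);
        return NULL;
    }
    return base + pagesz - size;
}

/* Release a block obtained from guarded_alloc(). */
void guarded_free(void *p)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    /* Round down to the start of the usable page. */
    uintptr_t base = (uintptr_t)p & ~(uintptr_t)(pagesz - 1);
    int rc = munmap((void *)base, 2 * pagesz);

    assert(rc == 0);
    (void)rc;
}
```

In this sketch every allocation, however small, costs two pages of address space and several system calls, which gives a rough feel for why such a scheme is expensive as a default.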
|
|
Re: New libc malloc patch

In message <20051130111017.GA67032@...>, Ulrich Spoerlein writes:
>I just read that mmap() part and have to wonder: Is it possible to
>introduce something like the guard pages that OpenBSD has implemented?
>I'd love to try this out and see the dozens of applications that fail
>due to off-by-one bugs.

Guard-pages are very expensive and that is why I have not adopted
OpenBSD's patch.

I would advocate that people use one of the dedicated debugging malloc
implementations (ElectricFence?) instead of putting too much overhead
into our default malloc.

For all practical purposes, the options J, A, X & Z are the most commonly
used.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@...                 | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
|
|
Re: New libc malloc patch

On Wed, 30 Nov 2005 21:48, Poul-Henning Kamp wrote:
> In message <20051130111017.GA67032@...>, Ulrich Spoerlein writes:
> >I just read that mmap() part and have to wonder: Is it possible to
> >introduce something like the guard pages that OpenBSD has implemented?
> >I'd love to try this out and see the dozens of applications that fail
> >due to off-by-one bugs.
>
> Guard-pages are very expensive and that is why I have not adopted
> OpenBSD's patch.
>
> I would advocate that people use one of the dedicated debugging malloc
> implementations (ElectricFence ?) instead of putting too much overhead
> into our default malloc.

Electric fence is right. Although it IS slow, an order of magnitude or
more usually. Also if you do use it you'll probably have to bump up the
vm.max_proc_mmap sysctl or it will fail to allocate memory.

Another good one is valgrind (and it detects more problems to boot :)

> For all practical purposes, the options J, A, X & Z are the most commonly
> used.

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
|
|
Re: New libc malloc patchDaniel O'Connor wrote:
> On Wed, 30 Nov 2005 21:48, Poul-Henning Kamp wrote: > > In message <20051130111017.GA67032@...>, Ulrich Spoerlein writes: > > >I just read that mmap() part and have to wonder: Is it possible to > > >introduce something like the guard pages that OpenBSD has implemented? > > >I'd love to try this out and see the dozens of applications that fail > > >due to off-by-one bugs. > > > > Guard-pages are very expensive and that is why I have not adopted > > OpenBSD's patch. Yes, of course it should be disabled by default, but if it could be implemented so you can switch at runtime or compile time (think INVARIANTS/WITNESS) *and* there is no penalty for the disabled case, that would be nice. > > I would advocate that people use one of the dedicated debugging malloc > > implementations (ElectricFence ?) instead of putting too much overhead > > into our default malloc. > > Electric fence is right. Although it IS slow, an order of magnitude or more > usually. Also if you do use it you'll probably have to bump up the > vm.max_proc_mmap sysctl or it will fail to allocate memory. > > Another good one is valgrind (and it detects more problems to boot :) Yes, I usually use dmalloc and valgrind. It's sad other developers don't use any of these tools ... Ulrich Spoerlein -- PGP Key ID: F0DB9F44 Encrypted mail welcome! Fingerprint: F1CE D062 0CA9 ADE3 349B 2FE8 980A C6B5 F0DB 9F44 Ok, which part of "Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn." didn't you understand? |
|
|
Re: New libc malloc patchOn Nov 30, 2005, at 4:30 AM, Ulrich Spoerlein wrote:
> Daniel O'Connor wrote: >> On Wed, 30 Nov 2005 21:48, Poul-Henning Kamp wrote: >>> In message <20051130111017.GA67032@...>, Ulrich >>> Spoerlein writes: >>>> I just read that mmap() part and have to wonder: Is it possible to >>>> introduce something like the guard pages that OpenBSD has >>>> implemented? >>>> I'd love to try this out and see the dozens of applications that >>>> fail >>>> due to off-by-one bugs. >>> >>> Guard-pages are very expensive and that is why I have not adopted >>> OpenBSD's patch. > > Yes, of course it should be disabled as default, but if it could be > implemented so you can switch at runtime or compile time (think > INVARIANTS/WITNESS) *and* there is no penalty for the disabled case, > that be nice. In a previous version of the patch, I included compile-time support for redzones around allocations. Kris Kennaway did a full ports tree build with redzones enabled, and several ports caused redzone corruption, but in every case it was due to writing one byte past the end of an allocation. None of these were serious, since word alignment required that the "corrupted" byte be unused. I suspect that we would catch very few serious errors, even if redzones were enabled by default. Due to some unrelated performance issues, I later did a significant rework of the internal data structures, and decided to drop redzone support since the new data structures weren't as conducive to redzones. Ultimately, I don't think we would have wanted to leave this feature enabled, even for CURRENT, because it required that all allocations be larger, thus bloating memory usage for all applications. As a runtime-switchable feature, I think we still wouldn't want to leave it compiled in for production systems. I spent a lot of time looking at valgrind (cachegrind tool) profiles, and found that even innocuous additional features such as the tracking of total allocated memory had significant negative impacts on performance. 
The feature that I really didn't want to remove, that is also important to redzone support, was byte-exact tracking of allocation size. The extra branches that would be required for runtime support of redzones probably wouldn't be worth the cost. Jason |
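For readers unfamiliar with redzones, the following is a minimal sketch of the idea layered on top of the system malloc. It is not the code from the earlier version of the patch; the names rz_malloc, rz_check and rz_free, the 16-byte zone size and the 0xa5 fill byte are all assumptions made for illustration. It shows why byte-exact size tracking matters (the rear zone has to start right after the last requested byte) and why every allocation grows.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative constants, not taken from the patch. */
enum { RZ_HEADER = 16, RZ_SIZE = 16, RZ_FILL = 0xa5 };

/* Layout: [header: exact size][front zone][user bytes][rear zone] */
void *rz_malloc(size_t size)
{
    if (size > SIZE_MAX - (RZ_HEADER + 2 * RZ_SIZE))
        return NULL;

    unsigned char *raw = malloc(RZ_HEADER + 2 * RZ_SIZE + size);
    if (raw == NULL)
        return NULL;

    /* Record the byte-exact request so the rear zone can be located. */
    memcpy(raw, &size, sizeof size);
    memset(raw + RZ_HEADER, RZ_FILL, RZ_SIZE);
    memset(raw + RZ_HEADER + RZ_SIZE + size, RZ_FILL, RZ_SIZE);
    return raw + RZ_HEADER + RZ_SIZE;
}

/* Returns 1 if both zones are intact, 0 if either was overwritten. */
int rz_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw = user - RZ_SIZE - RZ_HEADER;
    size_t size;

    memcpy(&size, raw, sizeof size);
    for (size_t i = 0; i < RZ_SIZE; i++) {
        if (raw[RZ_HEADER + i] != RZ_FILL || user[size + i] != RZ_FILL)
            return 0;
    }
    return 1;
}

void rz_free(void *ptr)
{
    if (ptr != NULL)
        free((unsigned char *)ptr - RZ_SIZE - RZ_HEADER);
}
```

In this sketch every allocation carries 48 extra bytes, and a check has to walk both zones, which illustrates the memory bloat and the extra work described above. An allocator that verifies zones on free would call something like rz_check from its free path.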
|
|
Re: New libc malloc patchOn Wed, Nov 30, 2005 at 06:32:54AM -0800, Jason Evans wrote:
> In a previous version of the patch, I included compile-time support > for redzones around allocations. Kris Kennaway did a full ports tree > build with redzones enabled, and several ports caused redzone > corruption, but in every case it was due to writing one byte past the > end of an allocation. None of these were serious, since word > alignment required that the "corrupted" byte be unused. I suspect > that we would catch very few serious errors, even if redzones were > enabled by default. You could offer a word-aligned red zone variant in addition to the byte-aligned one, both as malloc options, of course. -- http://ache.pp.ru/ |
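The difference between the two variants can be sketched as follows. This is only an illustration of the suggestion, not anything from the patch; the enum and function names are hypothetical, and the word size is assumed to be sizeof(void *).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical option values for the two red zone placements. */
enum rz_mode { RZ_BYTE_EXACT, RZ_WORD_ALIGNED };

/* Offset from the start of the user region at which the rear zone begins. */
size_t rz_rear_offset(size_t size, enum rz_mode mode)
{
    size_t word = sizeof(void *);

    if (mode == RZ_BYTE_EXACT)
        return size;
    /* Round up to the next word boundary; padding is not guarded. */
    return (size + word - 1) / word * word;
}

/* Returns 1 if a write at `index` would land in the rear zone. */
int rz_would_flag(size_t size, size_t index, enum rz_mode mode)
{
    return index >= rz_rear_offset(size, mode);
}
```

For a 5-byte request, a write at index 5 is flagged in byte-exact mode but lands in alignment padding in word-aligned mode, so only overruns that reach past the padding would be reported there.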
|
|
Re: New libc malloc patchJason Evans wrote:
> [Why no redzones in jemalloc] Thanks for the elaborate explanation. Greatly appreciated. Ulrich Spoerlein -- PGP Key ID: F0DB9F44 Encrypted mail welcome! Fingerprint: F1CE D062 0CA9 ADE3 349B 2FE8 980A C6B5 F0DB 9F44 Ok, which part of "Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn." didn't you understand? |
|
|
Re: New libc malloc patchOn Nov 30, 2005, at 1:02 AM, Claus Guttesen wrote:
>> There is a patch that contains a new libc malloc implementation at: >> >> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051127a.diff > > Do you need current for this? I patched and tried buildworld on 6.0 > stable but no go. I started work on this before 6.0 branched, and am unaware of any changes that would impact the patch. However, I've only used current for the development. Jason |