Re: New libc malloc patch

On Nov 29, 2005, at 2:52 AM, Hiten Pandya wrote:
> I see that you have included an implementation of red-black tree CPP
> macros, but wouldn't it be better if you were to use the ones in
> <sys/tree.h> ? I have only had a precursory look, but I would have
> thought that would be the way to go.

There's an updated patch available:

http://www.canonware.com/~jasone/jemalloc/jemalloc_20051201a.diff

This patch includes the following changes:

*) Use sys/tree.h rather than a separate red-black tree implementation.

*) Use the __isthreaded symbol to avoid locking for single-threaded
   programs, and to simplify malloc initialization. The extra branches
   that are required to check __isthreaded should be more than offset by
   the removal of an atomic compare/swap operation.

*) Fix an obscure bug (very difficult to trigger without changing some
   compile-time constants).

Jason
_______________________________________________
freebsd-current@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@..."
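The __isthreaded change described in the patch notes can be sketched as follows. This is an illustration only, not code from the patch: `demo_isthreaded`, `demo_lock` and `demo_guarded_increment` are hypothetical names, and a plain pthread mutex stands in for whatever lock the allocator really uses. The point is that a single-threaded process pays for a branch rather than for an atomic operation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for libc's __isthreaded flag. */
static bool demo_isthreaded = false;

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_lock_acquisitions = 0;	/* counts real lock operations */

static void
demo_lock_acquire(void)
{
	/* Single-threaded programs skip the lock entirely. */
	if (demo_isthreaded) {
		pthread_mutex_lock(&demo_lock);
		demo_lock_acquisitions++;
	}
}

static void
demo_lock_release(void)
{
	if (demo_isthreaded)
		pthread_mutex_unlock(&demo_lock);
}

/* Example critical section: bump a counter under the (possibly skipped) lock. */
static int
demo_guarded_increment(int *counter)
{
	int value;

	demo_lock_acquire();
	value = ++*counter;
	demo_lock_release();
	return (value);
}

/* Usage: the lock is only touched once the flag says threads exist. */
static void
demo_isthreaded_usage(void)
{
	int counter = 0;

	demo_guarded_increment(&counter);
	assert(demo_lock_acquisitions == 0);

	demo_isthreaded = true;
	demo_guarded_increment(&counter);
	assert(counter == 2 && demo_lock_acquisitions == 1);
}
```

In this sketch the flag must be set before a second thread can run, otherwise the first thread could be inside the critical section without holding the lock.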
|
|
Re: New libc malloc patch

Thanks Jason!

Kind Regards,

--
Hiten Pandya
hiten.pandya at gmail.com

On 01/12/05, Jason Evans <jasone@...> wrote:
> On Nov 29, 2005, at 2:52 AM, Hiten Pandya wrote:
> > I see that you have included an implementation of red-black tree CPP
> > macros, but wouldn't it be better if you were to use the ones in
> > <sys/tree.h> ? I have only had a precursory look, but I would have
> > thought that would be the way to go.
>
> There's an updated patch available:
>
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051201a.diff
>
> This patch includes the following changes:
>
> *) Use sys/tree.h rather than a separate red-black tree implementation.
>
> *) Use the __isthreaded symbol to avoid locking for single-threaded
> programs, and to simplify malloc initialization. The extra branches
> that are required to check __isthreaded should be more than offset by
> the removal of an atomic compare/swap operation.
>
> *) Fix an obscure bug (very difficult to trigger without changing
> some compile-time constants).
>
> Jason
|
|
Re: New libc malloc patch

On Nov 29, 2005, at 12:06 PM, Jon Dama wrote:
> There exists a problem right now--localized to i386 and any other arch
> based on 32-bit pointers: address space is simply too scarce.
>
> Your decision to switch to using mmap as the exclusive source of malloc
> buckets is admirable for its modernity but it simply cannot stand unless
> someone steps up to change the way mmap and brk interact within the
> kernel.

There's a new version of the patch available at:

http://www.canonware.com/~jasone/jemalloc/jemalloc_20051202b.diff

This version of the patch adds the following:

* Prefer to use sbrk() rather than mmap() for the 32-bit platforms.

* Lazily create arenas, so that single-threaded applications don't
  dedicate space to arenas they never use.

* Add the '*' and '/' MALLOC_OPTIONS flags, which allow control over the
  number of arenas.

As of this patch, all of the issues that were brought to my attention
have been addressed. This is a good time for additional review and
serious benchmarking.

Thanks,
Jason
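The lazy arena creation mentioned in the list above can be sketched roughly like this. None of these names come from the patch: `demo_arena_t`, `demo_arena_get` and `DEMO_NARENAS` are hypothetical, and a static array stands in for memory that a real allocator would obtain from sbrk() or mmap(). The sketch also ignores locking, which a threaded program would need around arena creation.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NARENAS 4	/* hypothetical arena count */

typedef struct {
	size_t nallocs;	/* illustrative per-arena statistic */
} demo_arena_t;

/* Backing storage; a real allocator would map this memory on demand. */
static demo_arena_t demo_arena_storage[DEMO_NARENAS];

/* Slots stay NULL until an arena is first requested. */
static demo_arena_t *demo_arenas[DEMO_NARENAS];
static unsigned demo_arenas_created = 0;

/* Return the arena for a slot, creating it on first use only. */
static demo_arena_t *
demo_arena_get(unsigned idx)
{
	idx %= DEMO_NARENAS;
	if (demo_arenas[idx] == NULL) {
		demo_arena_storage[idx].nallocs = 0;
		demo_arenas[idx] = &demo_arena_storage[idx];
		demo_arenas_created++;
	}
	return (demo_arenas[idx]);
}

/* Usage: a program that only ever touches slot 0 creates one arena. */
static void
demo_arena_usage(void)
{
	assert(demo_arenas_created == 0);
	demo_arena_get(0)->nallocs++;
	demo_arena_get(0)->nallocs++;
	assert(demo_arenas_created == 1);
	assert(demo_arena_get(0)->nallocs == 2);
}
```

A single-threaded application that always maps to the same slot therefore never pays for the other arenas.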
|
|
Re: New libc malloc patch

Jason Evans wrote:
> On Nov 29, 2005, at 12:06 PM, Jon Dama wrote:
>
>> There exists a problem right now--localized to i386 and any other arch
>> based on 32-bit pointers: address space is simply too scarce.
>>
>> Your decision to switch to using mmap as the exclusive source of malloc
>> buckets is admirable for its modernity but it simply cannot stand unless
>> someone steps up to change the way mmap and brk interact within the
>> kernel.
>
>
> There's a new version of the patch available at:
>
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051202b.diff
>
> This version of the patch adds the following:
>
> * Prefer to use sbrk() rather than mmap() for the 32-bit platforms.
>
> * Lazily create arenas, so that single-threaded applications don't
> dedicate space to arenas they never use.
>
> * Add the '*' and '/' MALLOC_OPTIONS flags, which allow control over
> the number of arenas.
>
> As of this patch, all of the issues that were brought to my attention
> have been addressed. This is a good time for additional review and
> serious benchmarking.
>
> Thanks,
> Jason

I have a question about the mutex used in the patch: you are using
a spin loop, isn't it suboptimal? Also, for a thread library like
libpthread that supports static priority scheduling, this mutex does
not work; it will cause a deadlock. If a lower priority thread locks
the mutex and is preempted by a higher priority thread, and the higher
priority thread also calls malloc, it will spin there waiting for the
lower priority thread to complete, but that will never happen.

David Xu
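To make the concern above concrete, here is a minimal sketch of a hand-rolled spin lock. The patch's actual lock code is not shown in this thread, so this is an assumption about its general shape, written with C11 atomics; `naive_spin_t` and the `naive_spin_*` functions are hypothetical names. The busy-wait in `naive_spin_lock()` is the problem: under strict static-priority scheduling, a high-priority waiter can occupy the CPU forever while the low-priority holder never runs to release the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
	atomic_flag held;
} naive_spin_t;

#define NAIVE_SPIN_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the lock was free and is now held by the caller. */
static bool
naive_spin_trylock(naive_spin_t *l)
{
	return (!atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire));
}

static void
naive_spin_lock(naive_spin_t *l)
{
	/*
	 * Busy-wait: the scheduler is never told that this thread is
	 * waiting on another thread, so it cannot boost the holder.
	 */
	while (!naive_spin_trylock(l))
		;
}

static void
naive_spin_unlock(naive_spin_t *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Usage: a held lock rejects a second acquisition until it is released. */
static void
naive_spin_usage(void)
{
	naive_spin_t lock = NAIVE_SPIN_INIT;

	naive_spin_lock(&lock);
	assert(!naive_spin_trylock(&lock));
	naive_spin_unlock(&lock);
	assert(naive_spin_trylock(&lock));
	naive_spin_unlock(&lock);
}
```

A lock that blocks in the kernel or is managed by the threads library avoids this, because the waiter gives up the CPU instead of spinning.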
|
|
Re: New libc malloc patch

> There's a new version of the patch available at:
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051202b.diff

When I do a make buildworld I get:

cc -fpic -DPIC -O2 -fno-strict-aliasing -pipe -march=athlon64 -I/usr/src/lib/libc/include -I/usr/src/lib/libc/../../include -I/usr/src/lib/libc/amd64 -D__DBINTERFACE_PRIVATE -I/usr/src/lib/libc/../../contrib/gdtoa -DINET6 -I/usr/obj/usr/src/lib/libc -DPOSIX_MISTAKE -I/usr/src/lib/libc/locale -DBROKEN_DES -DPORTMAP -DDES_BUILTIN -I/usr/src/lib/libc/rpc -DYP -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -c /usr/src/lib/libc/stdlib/malloc.c -o malloc.So
cc -O2 -fno-strict-aliasing -pipe -march=athlon64 -I/usr/src/lib/libc/include -I/usr/src/lib/libc/../../include -I/usr/src/lib/libc/amd64 -D__DBINTERFACE_PRIVATE -I/usr/src/lib/libc/../../contrib/gdtoa -DINET6 -I/usr/obj/usr/src/lib/libc -DPOSIX_MISTAKE -I/usr/src/lib/libc/locale -DBROKEN_DES -DPORTMAP -DDES_BUILTIN -I/usr/src/lib/libc/rpc -DYP -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -c /usr/src/lib/libc/stdlib/merge.c
/usr/src/lib/libc/stdlib/malloc.c: In function `malloc_mutex_lock':
/usr/src/lib/libc/stdlib/malloc.c:846: warning: cast from pointer to integer of different size
/usr/src/lib/libc/stdlib/malloc.c:846: warning: cast from pointer to integer of different size
/usr/src/lib/libc/stdlib/malloc.c:853: warning: cast from pointer to integer of different size
/usr/src/lib/libc/stdlib/malloc.c: In function `malloc_mutex_unlock':
/usr/src/lib/libc/stdlib/malloc.c:894: warning: cast from pointer to integer of different size
*** Error code 1
/usr/src/lib/libc/stdlib/malloc.c: In function `malloc_mutex_lock':
/usr/src/lib/libc/stdlib/malloc.c:846: warning: cast from pointer to integer of different size
/usr/src/lib/libc/stdlib/malloc.c:846: warning: cast from pointer to integer of different size
/usr/src/lib/libc/stdlib/malloc.c:853: warning: cast from pointer to integer of different size
/usr/src/lib/libc/stdlib/malloc.c: In function `malloc_mutex_unlock':
/usr/src/lib/libc/stdlib/malloc.c:894: warning: cast from pointer to integer of different size
*** Error code 1
2 errors
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
make -j 3 buildworld 763,54s user 150,98s system 173% cpu 8:48,62 total

twin/usr/src#>uname -a
FreeBSD twin.gnome.no 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Thu Dec 1 21:38:11 CET 2005 root@...:/usr/obj/usr/src/sys/TWIN amd64

regards
Claus
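The warnings above are what GCC reports when a pointer is cast to an integer type of a different width, and with -Werror they stop the build. On amd64 this usually means a 64-bit pointer being cast to a 32-bit integer. The code at malloc.c:846 is not shown in this thread, so the following is only a general sketch of the usual remedy, using uintptr_t; the `demo_*` helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* uintptr_t is wide enough to hold a pointer value on any platform that has it. */
static uintptr_t
demo_ptr_to_word(const void *p)
{
	return ((uintptr_t)p);
}

static void *
demo_word_to_ptr(uintptr_t w)
{
	return ((void *)w);
}

/* Usage: round-tripping through uintptr_t preserves the pointer. */
static void
demo_cast_usage(void)
{
	int object = 0;
	uintptr_t word = demo_ptr_to_word(&object);

	assert(demo_word_to_ptr(word) == (void *)&object);
	/*
	 * By contrast, (unsigned int)&object triggers the warning and
	 * loses the upper bits wherever pointers are 64 bits wide.
	 */
}
```

The same source then compiles without this warning on both i386 and amd64, since the integer type follows the pointer width.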
|
|
Re: New libc malloc patchOn Dec 3, 2005, at 12:45 PM, Claus Guttesen wrote:
>> There's a new version of the patch available at: >> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051202b.diff > > When I do a make buildworld I get: > > cc -fpic -DPIC -O2 -fno-strict-aliasing -pipe -march=athlon64 > -I/usr/src/lib/libc/include -I/usr/src/lib/libc/../../include > -I/usr/src/lib/libc/amd64 -D__DBINTERFACE_PRIVATE > -I/usr/src/lib/libc/../../contrib/gdtoa -DINET6 > -I/usr/obj/usr/src/lib/libc -DPOSIX_MISTAKE -I/usr/src/lib/libc/locale > -DBROKEN_DES -DPORTMAP -DDES_BUILTIN -I/usr/src/lib/libc/rpc -DYP > -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -c > /usr/src/lib/libc/stdlib/malloc.c -o malloc.So > cc -O2 -fno-strict-aliasing -pipe -march=athlon64 > -I/usr/src/lib/libc/include -I/usr/src/lib/libc/../../include > -I/usr/src/lib/libc/amd64 -D__DBINTERFACE_PRIVATE > -I/usr/src/lib/libc/../../contrib/gdtoa -DINET6 > -I/usr/obj/usr/src/lib/libc -DPOSIX_MISTAKE -I/usr/src/lib/libc/locale > -DBROKEN_DES -DPORTMAP -DDES_BUILTIN -I/usr/src/lib/libc/rpc -DYP > -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -c > /usr/src/lib/libc/stdlib/merge.c > /usr/src/lib/libc/stdlib/malloc.c: In function `malloc_mutex_lock': > /usr/src/lib/libc/stdlib/malloc.c:846: warning: cast from pointer to > integer of different size > /usr/src/lib/libc/stdlib/malloc.c:846: warning: cast from pointer to > integer of different size > /usr/src/lib/libc/stdlib/malloc.c:853: warning: cast from pointer to > integer of different size > /usr/src/lib/libc/stdlib/malloc.c: In function `malloc_mutex_unlock': > /usr/src/lib/libc/stdlib/malloc.c:894: warning: cast from pointer to > integer of different size > *** Error code 1 > /usr/src/lib/libc/stdlib/malloc.c: In function `malloc_mutex_lock': > /usr/src/lib/libc/stdlib/malloc.c:846: warning: cast from pointer to > integer of different size > /usr/src/lib/libc/stdlib/malloc.c:846: warning: cast from pointer to > integer of different size > /usr/src/lib/libc/stdlib/malloc.c:853: warning: cast 
from pointer to > integer of different size > /usr/src/lib/libc/stdlib/malloc.c: In function `malloc_mutex_unlock': > /usr/src/lib/libc/stdlib/malloc.c:894: warning: cast from pointer to > integer of different size > *** Error code 1 > 2 errors > *** Error code 2 > 1 error > *** Error code 2 > 1 error > *** Error code 2 > 1 error > *** Error code 2 > 1 error > *** Error code 2 > 1 error > make -j 3 buildworld 763,54s user 150,98s system 173% cpu 8:48,62 > total > > twin/usr/src#>uname -a > FreeBSD twin.gnome.no 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Thu Dec 1 > 21:38:11 CET 2005 root@...:/usr/obj/usr/src/sys/TWIN > amd64 Did you use the 20051202b patch? I thought I had fixed the problem, but I don't have an amd64 system to test on. In any case, I'll be uploading up a new patch in a few minutes that removes the offending code entirely. Thanks, Jason _______________________________________________ freebsd-current@... mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscribe@..." |
|
|
Re: New libc malloc patch

On Dec 3, 2005, at 12:26 AM, David Xu wrote:
> I have a question about the mutex used in the patch: you are using
> a spin loop, isn't it suboptimal? Also, for a thread library like
> libpthread that supports static priority scheduling, this mutex does
> not work; it will cause a deadlock. If a lower priority thread locks
> the mutex and is preempted by a higher priority thread, and the higher
> priority thread also calls malloc, it will spin there waiting for the
> lower priority thread to complete, but that will never happen.

David,

You are correct that this is a problem. Thank you for pointing it
out. There's a new patch that uses the spinlocks that are provided
by the threads libraries. Please let me know if this looks okay.

Also, this patch removes/modifies the code that was causing build
failures on amd64, so it's worth giving another try. Hopefully, it
will compile now...

http://www.canonware.com/~jasone/jemalloc/jemalloc_20051203a.diff

Thanks,
Jason
|
|
Re: New libc malloc patch

> Did you use the 20051202b patch? I thought I had fixed the problem,
> but I don't have an amd64 system to test on. In any case, I'll be
> uploading up a new patch in a few minutes that removes the offending
> code entirely.

Yes. I'll do the test that you want me to do :-) Thank you!

regards
Claus
|
|
Re: New libc malloc patch

Jason Evans wrote:
> On Dec 3, 2005, at 12:26 AM, David Xu wrote:
>
>> I have a question about the mutex used in the patch: you are using
>> a spin loop, isn't it suboptimal? Also, for a thread library like
>> libpthread that supports static priority scheduling, this mutex does
>> not work; it will cause a deadlock. If a lower priority thread locks
>> the mutex and is preempted by a higher priority thread, and the higher
>> priority thread also calls malloc, it will spin there waiting for the
>> lower priority thread to complete, but that will never happen.
>
>
> David,
>
> You are correct that this is a problem. Thank you for pointing it
> out. There's a new patch that uses the spinlocks that are provided
> by the threads libraries. Please let me know if this looks okay.
>
> Also, this patch removes/modifies the code that was causing build
> failures on amd64, so it's worth giving another try. Hopefully, it
> will compile now...
>
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051203a.diff
>
> Thanks,
> Jason
>

The libc spinlocks are deprecated. In fact, thread libraries try to
keep track of all spinlocks in libc and reset them in the child
process; they will complain if there are too many spinlocks. This is
not very correct, but it would resolve deadlock in real world
applications (weird applications).

Because I see you have put _malloc_prefork() and _malloc_postfork()
hooks in thread libraries, I guess you want to manage all malloc
locks, so you might not need to use the spinlocks. You can implement
these locks by using umtx provided by the kernel: you can use
UMTX_OP_WAIT and UMTX_OP_WAKE to implement these locks. UMTX_OP_LOCK
and UMTX_OP_UNLOCK can also be used to implement locks, but I reserve
these two functions since I plan to implement a reliable POSIX
process-shared mutex. You can find that code in libthr to study how
to use umtx.

Last, I don't know if umtx will work with libc_r, but libc_r has
already been disconnected from world for some days; it will rot away.

Regards,
David Xu
|
|
Re: New libc malloc patch

Here is sample code to implement a mutex by using umtx syscalls:

#include <sys/types.h>
#include <sys/ucontext.h>
#include <sys/umtx.h>
#include <machine/atomic.h>
#include <err.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define LCK_UNLOCKED    0
#define LCK_LOCKED      1
#define LCK_CONTENDED   2

void
lock_mtx(struct umtx *mtx)
{
    volatile uintptr_t *m = (volatile uintptr_t *)mtx;

    for (;;) {
        /* try to lock it. */
        if (atomic_cmpset_acq_ptr(m, LCK_UNLOCKED, LCK_LOCKED))
            return;
        if (atomic_load_acq_ptr(m) == LCK_LOCKED) {
            /*
             * if it was locked by a single thread, try to
             * set it to the contended state.
             */
            if (!atomic_cmpset_acq_ptr(m, LCK_LOCKED, LCK_CONTENDED))
                continue;
        }
        /* if in the contended state, wait for it to be unlocked. */
        if (atomic_load_acq_ptr(m) == LCK_CONTENDED)
            _umtx_op((struct umtx *)m, UMTX_OP_WAIT, LCK_CONTENDED, 0, NULL);
    }
}

void
unlock_mtx(struct umtx *mtx)
{
    volatile uintptr_t *m = (volatile uintptr_t *)mtx;

    for (;;) {
        if (atomic_load_acq_ptr(m) == LCK_UNLOCKED)
            err(1, "unlock a unlocked mutex\n");
        if (atomic_load_acq_ptr(m) == LCK_LOCKED) {
            if (atomic_cmpset_acq_ptr(m, LCK_LOCKED, LCK_UNLOCKED))
                return;
        }
        if (atomic_load_acq_ptr(m) == LCK_CONTENDED) {
            /* release the lock, then wake a waiter. */
            atomic_store_rel_ptr(m, LCK_UNLOCKED);
            _umtx_op((struct umtx *)m, UMTX_OP_WAKE, 1, NULL, NULL);
            break;
        }
    }
}

struct umtx m;

void *
lock_test(void *arg)
{
    int i = 0;

    for (i = 0; i < 10000; ++i) {
        lock_mtx(&m);
        pthread_yield();
        unlock_mtx(&m);
    }
    return (0);
}

int
main(void)
{
    pthread_t td1, td2;

    pthread_create(&td1, NULL, lock_test, NULL);
    pthread_create(&td2, NULL, lock_test, NULL);
    pthread_join(td1, NULL);
    pthread_join(td2, NULL);
    return (0);
}
|
|
Re: New libc malloc patch

David Xu wrote:
> Here is sample code to implement a mutex by using umtx syscalls:
> ...
> void
> unlock_mtx(struct umtx *mtx)
> {
>     volatile uintptr_t *m = (volatile uintptr_t *)mtx;
>
>     for (;;) {
>         if (atomic_load_acq_ptr(m) == LCK_UNLOCKED)
>             err(1, "unlock a unlocked mutex\n");
>         if (atomic_load_acq_ptr(m) == LCK_LOCKED) {
>             if (atomic_cmpset_acq_ptr(m, LCK_LOCKED, LCK_UNLOCKED))
>                 return;
>         }
>         if (atomic_load_acq_ptr(m) == LCK_CONTENDED) {
>             atomic_store_rel_ptr(m, LCK_UNLOCKED);
>             _umtx_op((struct umtx *)m, UMTX_OP_WAKE, 1, NULL, NULL);

              _umtx_op((struct umtx *)m, UMTX_OP_WAKE, INT_MAX, NULL, NULL);

This line is not very optimal if there are lots of threads waiting
there. :-)
There is an optimal version using a transaction id:

http://www.dragonflybsd.org/cvsweb/src/lib/libthread_xu/thread/thr_umtx.c?rev=1.2&content-type=text/x-cvsweb-markup

Though, libthr in freebsd does not use these semantics, instead they
are implemented in the kernel.

David Xu
|
|
Re: New libc malloc patch

On Dec 3, 2005, at 5:40 PM, David Xu wrote:
> The libc spinlocks are deprecated. In fact, thread libraries try to
> keep track of all spinlocks in libc and reset them in the child
> process; they will complain if there are too many spinlocks. This is
> not very correct, but it would resolve deadlock in real world
> applications (weird applications).
>
> Because I see you have put _malloc_prefork() and _malloc_postfork()
> hooks in thread libraries, I guess you want to manage all malloc
> locks, so you might not need to use the spinlocks. You can implement
> these locks by using umtx provided by the kernel: you can use
> UMTX_OP_WAIT and UMTX_OP_WAKE to implement these locks. UMTX_OP_LOCK
> and UMTX_OP_UNLOCK can also be used to implement locks, but I reserve
> these two functions since I plan to implement a reliable POSIX
> process-shared mutex. You can find that code in libthr to study how
> to use umtx.
>
> Last, I don't know if umtx will work with libc_r, but libc_r has
> already been disconnected from world for some days; it will rot away.

I just need simple (low overhead) mutexes that don't cause malloc to
be called during their initialization. I would have used
pthread_mutex_* directly, but cannot due to infinite recursion
problems during initialization.

As you pointed out, it's important to get priority inheritance right
in order to avoid priority inversion deadlock, so my hand-rolled
spinlocks weren't adequate. I need mutexes that are managed by the
threads library. The libc spinlocks appear to fit the bill perfectly
in that capacity. It seems to me that using umtx would actually be
the wrong thing to do, because I'd be circumventing libpthread's
userland scheduler, and it would be the wrong thing for libc_r, as you
pointed out. This approach would work for libthr, but perhaps nothing
else?

I'd like to keep things as simple and general as possible. Is the
current implementation that uses libc spinlocks acceptable?

Thanks,
Jason

P.S. Why are libc spinlocks deprecated?
|
|
Re: New libc malloc patch

Jason Evans wrote:
> I just need simple (low overhead) mutexes that don't cause malloc to be
> called during their initialization.

umtx is lightweight and fast and does not need malloc.

> I would have used pthread_mutex_*
> directly, but cannot due to infinite recursion problems during
> initialization.
>

Yes, I know current pthread_mutex implementations use malloc,
I don't think it will be changed to avoid using malloc very soon.

> As you pointed out, it's important to get priority inheritance right in
> order to avoid priority inversion deadlock, so my hand-rolled spinlocks
> weren't adequate. I need mutexes that are managed by the threads
> library. The libc spinlocks appear to fit the bill perfectly in that
> capacity. It seems to me that using umtx would actually be the wrong
> thing to do, because I'd be circumventing libpthread's userland
> scheduler, and it would be the wrong thing for libc_r, as you pointed
> out. This approach would work for libthr, but perhaps nothing else?
>

umtx will work with libpthread. I cannot find any reason why using
umtx would cause deadlock: the userland scheduler cannot propagate its
priority decisions across the kernel, and umtx is a blockable syscall.

> I'd like to keep things as simple and general as possible. Is the
> current implementation that uses libc spinlocks acceptable?
>
> Thanks,
> Jason
>
> P.S. Why are libc spinlocks deprecated?
>

Because we want other libraries to use pthread mutexes. If they cannot
be used widely and we have to use spinlocks, that is really in bad
taste. I think only malloc has the recursion problem. In fact,
libpthread needs malloc to initialize a spinlock, so you cannot create
spinlocks dynamically in your malloc code; only libthr does not have
this problem. libc_r also has a priority inversion problem with your
current mutex code.

Regards,
David Xu
|
|
Re: New libc malloc patch

> Did you use the 20051202b patch? I thought I had fixed the problem,
> but I don't have an amd64 system to test on. In any case, I'll be
> uploading up a new patch in a few minutes that removes the offending
> code entirely.

I was able to do a buildworld on current with this patch, but I had
problems getting X to run and kldxref took all my space on the
root-partition doing a installkernel.

So I downgraded to 6.0 stable and get this error:

===> libexec/atrun (all)
cc -O2 -fno-strict-aliasing -pipe -march=athlon64 -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -c /usr/src/libexec/atrun/atrun.c
cc -O2 -fno-strict-aliasing -pipe -march=athlon64 -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -c /usr/src/libexec/atrun/gloadavg.c
cc -O2 -fno-strict-aliasing -pipe -march=athlon64 -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -o atrun atrun.o gloadavg.o
/usr/obj/usr/src/tmp/usr/lib/libc.so: undefined reference to `calloc'
/usr/obj/usr/src/tmp/usr/lib/libc.so: undefined reference to `posix_memalign'
*** Error code 1

Stop in /usr/src/libexec/atrun.
*** Error code 1

Stop in /usr/src/libexec.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
make buildworld 1122,93s user 217,28s system 84% cpu 26:18,72 total

twin/usr/src#>uname -a
FreeBSD twin.gnome.no 6.0-STABLE FreeBSD 6.0-STABLE #0: Sun Dec 4 01:18:58 CET 2005 root@...:/usr/obj/usr/src/sys/TWIN amd64

regards
Claus
|
|
Re: New libc malloc patch

On Sun, 4 Dec 2005, David Xu wrote:
> Jason Evans wrote:
> > I just need simple (low overhead) mutexes that don't cause malloc to be
> > called during their initialization.
>
> umtx is lightweight and fast and does not need malloc.
>
> > I would have used pthread_mutex_*
> > directly, but cannot due to infinite recursion problems during
> > initialization.
> >
> Yes, I know current pthread_mutex implementations use malloc,
> I don't think it will be changed to avoid using malloc very soon.

It's on my list of things to do.

> > As you pointed out, it's important to get priority inheritance right in
> > order to avoid priority inversion deadlock, so my hand-rolled spinlocks
> > weren't adequate. I need mutexes that are managed by the threads
> > library. The libc spinlocks appear to fit the bill perfectly in that
> > capacity. It seems to me that using umtx would actually be the wrong
> > thing to do, because I'd be circumventing libpthread's userland
> > scheduler, and it would be the wrong thing for libc_r, as you pointed
> > out. This approach would work for libthr, but perhaps nothing else?
> >
> umtx will work with libpthread. I cannot find any reason why using
> umtx would cause deadlock: the userland scheduler cannot propagate its
> priority decisions across the kernel, and umtx is a blockable syscall.

The problem is userland code can exit, circumvent the unlock by
exception handling, take a signal and longjmp, etc., which may
leave locks (not known by libpthread) held. At least with
spinlocks or mutex, the thread libraries can know that the
application is in a critical region and can behave accordingly.
Libpthread will defer switching threads when they are in
critical regions (unless they are blocked).

I think that libc or other libraries that want to be thread-safe
shouldn't try to roll their own locks. The reason to do so
is that lock overhead may be deemed too great. If that is the
case, then we should fix the problem at its source ;-) Of
course, the other reason is that mutexes currently have to be
allocated.

--
DE
|
|
Re: New libc malloc patch

On Dec 4, 2005, at 4:51 AM, Claus Guttesen wrote:
> I was able to do a buildworld on current with this patch, but I had
> problems getting X to run and kldxref took all my space on the
> root-partition doing a installkernel.

I've fixed the offending bug in kldxref in the latest patch:

http://www.canonware.com/~jasone/jemalloc/jemalloc_20051211b.diff

I spent several hours poking at X, but was never able to determine why
it goes into an infinite loop. The infinite loop happens rather early,
during the load of the libbitmap module. My best guess is that it is
stuck trying to acquire the Xlib lock, but cannot be sure, since I
don't know how to get debug symbols for the loaded X module. In any
case, malloc is nowhere in the backtrace. I do not have the time to
acquire the X expertise that is likely needed to track down this
problem. Hopefully someone else will be willing to do so.

No new problems in the malloc code have been found for some time now.
It has been tested on i386, sparc64, arm, and amd64. In my opinion,
the malloc patch is ready to be committed. I am now working on the
assumption that new problems are more likely application bugs than
malloc bugs. This seems like a good time to start sharing the
debugging load with the community. =)

So, how about it? Can this patch go in now?

Thanks,
Jason
|
|
Re: New libc malloc patch

Jason Evans wrote:
> On Dec 4, 2005, at 4:51 AM, Claus Guttesen wrote:
>
>> I was able to do a buildworld on current with this patch, but I had
>> problems getting X to run and kldxref took all my space on the
>> root-partition doing a installkernel.
>
>
> I've fixed the offending bug in kldxref in the latest patch:
>
> http://www.canonware.com/~jasone/jemalloc/jemalloc_20051211b.diff
>
> I spent several hours poking at X, but was never able to determine
> why it goes into an infinite loop. The infinite loop happens rather
> early, during the load of the libbitmap module. My best guess is
> that it is stuck trying to acquire the Xlib lock, but cannot be sure,
> since I don't know how to get debug symbols for the loaded X module.
> In any case, malloc is nowhere in the backtrace. I do not have the
> time to acquire the X expertise that is likely needed to track down
> this problem. Hopefully someone else will be willing to do so.
>
> No new problems in the malloc code have been found for some time
> now. It has been tested on i386, sparc64, arm, and amd64. In my
> opinion, the malloc patch is ready to be committed. I am now working
> on the assumption that new problems are more likely application bugs
> than malloc bugs. This seems like a good time to start sharing the
> debugging load with the community. =)
>
> So, how about it? Can this patch go in now?

I may have missed it but some benchmark numbers could be good.

Is there no way to make it an option for a while?
that would get good testing AND a fallback for people.

>
> Thanks,
> Jason
|
|
Re: New libc malloc patch

Julian Elischer wrote:
>>
>> No new problems in the malloc code have been found for some time
>> now. It has been tested on i386, sparc64, arm, and amd64. In my
>> opinion, the malloc patch is ready to be committed. I am now
>> working on the assumption that new problems are more likely
>> application bugs than malloc bugs. This seems like a good time to
>> start sharing the debugging load with the community. =)
>>
>> So, how about it? Can this patch go in now?
>
>
>
> I may have missed it but some benchmark numbers could be good.
>
> Is there no way to make it an option for a while?
> that would get good testing AND a fallback for people.
>

I also would like to see any benchmark number, in fact, I had plan
to import ptmalloc in the past, the malloc problem had been discussed
several times in thread@ list. Also, it would be nice if a fallback
can be provided :-)

David Xu
|
|
Re: New libc malloc patch
On Mon, Dec 12, 2005 at 08:50:01AM +0800, David Xu wrote:
> Julian Elischer wrote:
>
> >>No new problems in the malloc code have been found for some time
> >>now. It has been tested on i386, sparc64, arm, and amd64. In my
> >>opinion, the malloc patch is ready to be committed. I am now
> >>working on the assumption that new problems are more likely
> >>application bugs than malloc bugs. This seems like a good time to
> >>start sharing the debugging load with the community. =)
> >>
> >>So, how about it? Can this patch go in now?
> >
> >I may have missed it but some benchmark numbers could be good.
> >
> >Is there no way to make it an option for a while?
> >that would get good testing AND a fallback for people.
>
> I also would like to see any benchmark number, in fact, I had plan
> to import ptmalloc in the past, the malloc problem had been discussed
> several times in thread@ list.

[...] multiple threads on a 14-CPU sparc64 machine. This is a poor test
because sparc64 doesn't have TLS support, which is needed for jemalloc
to perform well. It still shows it kicking the pants off of phkmalloc
for both single-threaded and multi-threaded malloc.

phkmalloc:

# ./malloc-test 1024 1000000 1
Starting test with 1 thread...
Thread 2114048 adjusted timing: 27.124817 seconds for 1000000 requests of 1024 bytes.
Starting test with 2 threads...
Thread 2114560 adjusted timing: 67.535854 seconds for 1000000 requests of 1024 bytes.
Thread 2114048 adjusted timing: 70.330298 seconds for 1000000 requests of 1024 bytes.
# ./malloc-test 1024 1000000 3
Starting test with 3 threads...
Thread 2114048 adjusted timing: 74.154855 seconds for 1000000 requests of 1024 bytes.
Thread 2115072 adjusted timing: 74.356363 seconds for 1000000 requests of 1024 bytes.
Thread 2114560 adjusted timing: 77.038550 seconds for 1000000 requests of 1024 bytes.
# ./malloc-test 1024 1000000 4
Starting test with 4 threads...
Thread 2115072 adjusted timing: 217.741657 seconds for 1000000 requests of 1024 bytes.
Thread 2115584 adjusted timing: 228.434310 seconds for 1000000 requests of 1024 bytes.
Thread 2114048 adjusted timing: 228.941544 seconds for 1000000 requests of 1024 bytes.
Thread 2114560 adjusted timing: 229.286134 seconds for 1000000 requests of 1024 bytes.
# ./malloc-test 1024 1000000 5
Starting test with 5 threads...
Thread 2114048 adjusted timing: 770.255000 seconds for 1000000 requests of 1024 bytes.
Thread 2115072 adjusted timing: 770.749431 seconds for 1000000 requests of 1024 bytes.
Thread 2116096 adjusted timing: 771.307654 seconds for 1000000 requests of 1024 bytes.
Thread 2114560 adjusted timing: 772.293253 seconds for 1000000 requests of 1024 bytes.
Thread 2115584 adjusted timing: 772.550847 seconds for 1000000 requests of 1024 bytes.

jemalloc:

# ./malloc-test 1024 1000000 1
Starting test with 1 thread...
Thread -1610612656 adjusted timing: 5.428918 seconds for 1000000 requests of 1024 bytes.
# ./malloc-test 1024 1000000 2
Starting test with 2 threads...
Thread -1610612656 adjusted timing: 4.840497 seconds for 1000000 requests of 1024 bytes.
Thread -1610612176 adjusted timing: 4.948382 seconds for 1000000 requests of 1024 bytes.
# ./malloc-test 1024 1000000 3
Starting test with 3 threads...
Thread -1610611696 adjusted timing: 25.065195 seconds for 1000000 requests of 1024 bytes.
Thread -1610612656 adjusted timing: 25.218103 seconds for 1000000 requests of 1024 bytes.
Thread -1610612176 adjusted timing: 25.286181 seconds for 1000000 requests of 1024 bytes.
# ./malloc-test 1024 1000000 4
Starting test with 4 threads...
Thread -1610612656 adjusted timing: 38.176479 seconds for 1000000 requests of 1024 bytes.
Thread -1610611216 adjusted timing: 38.221169 seconds for 1000000 requests of 1024 bytes.
Thread -1610611696 adjusted timing: 38.294425 seconds for 1000000 requests of 1024 bytes.
Thread -1610612176 adjusted timing: 38.320669 seconds for 1000000 requests of 1024 bytes.
# ./malloc-test 1024 1000000 5
Starting test with 5 threads...
Thread -1610611216 adjusted timing: 50.376766 seconds for 1000000 requests of 1024 bytes.
Thread -1610612656 adjusted timing: 50.435407 seconds for 1000000 requests of 1024 bytes.
Thread -1610611696 adjusted timing: 50.885393 seconds for 1000000 requests of 1024 bytes.
Thread -1610610736 adjusted timing: 50.943412 seconds for 1000000 requests of 1024 bytes.
Thread -1610612176 adjusted timing: 50.953694 seconds for 1000000 requests of 1024 bytes.

i.e. jemalloc is a factor of 5 times faster for single-threaded
malloc, and about 15 times faster than phkmalloc for 5 threads. You
see the effect of the missing TLS on sparc64 in the scaling (i.e.
performance should be even better with multiple threads), and with
some large performance variation with larger numbers of threads
(probably due to hash collisions):

# ./malloc-test 1024 1000000 20
Starting test with 20 threads...
Thread -1610604016 adjusted timing: 48.297304 seconds for 1000000 requests of 1024 bytes.
Thread -1610604496 adjusted timing: 104.249693 seconds for 1000000 requests of 1024 bytes.
Thread -1610602496 adjusted timing: 109.578616 seconds for 1000000 requests of 1024 bytes.
Thread -1610607856 adjusted timing: 252.337973 seconds for 1000000 requests of 1024 bytes.
Thread -1610610736 adjusted timing: 254.338225 seconds for 1000000 requests of 1024 bytes.
Thread -1610606896 adjusted timing: 255.015353 seconds for 1000000 requests of 1024 bytes.
Thread -1610607376 adjusted timing: 257.463410 seconds for 1000000 requests of 1024 bytes.
Thread -1610609776 adjusted timing: 257.848283 seconds for 1000000 requests of 1024 bytes.
Thread -1610605936 adjusted timing: 257.955005 seconds for 1000000 requests of 1024 bytes.
Thread -1610604976 adjusted timing: 259.303220 seconds for 1000000 requests of 1024 bytes.
Thread -1610611216 adjusted timing: 259.610871 seconds for 1000000 requests of 1024 bytes.
Thread -1610606416 adjusted timing: 260.622687 seconds for 1000000 requests of 1024 bytes.
Thread -1610611696 adjusted timing: 260.857706 seconds for 1000000 requests of 1024 bytes.
Thread -1610610256 adjusted timing: 261.056716 seconds for 1000000 requests of 1024 bytes.
Thread -1610608816 adjusted timing: 261.764455 seconds for 1000000 requests of 1024 bytes.
Thread -1610609296 adjusted timing: 261.800319 seconds for 1000000 requests of 1024 bytes.
Thread -1610605456 adjusted timing: 261.748707 seconds for 1000000 requests of 1024 bytes.
Thread -1610612176 adjusted timing: 262.108598 seconds for 1000000 requests of 1024 bytes.
Thread -1610608336 adjusted timing: 262.119440 seconds for 1000000 requests of 1024 bytes.
Thread -1610612656 adjusted timing: 262.315112 seconds for 1000000 requests of 1024 bytes.

I'll try to test this on a 4 CPU amd64 machine next.

Kris
|
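As a rough check of the "factor of 5" and "about 15 times" figures in the message above, the ratios can be recomputed from the posted timings. This is only a sketch: taking the slowest thread in each run as that run's time is an assumption made here, not something stated in the thread, and the names `phkmalloc`, `jemalloc` and `speedup` are illustrative.

```python
# Slowest per-thread "adjusted timing" (seconds) for each run, copied from
# the benchmark output above and keyed by thread count. Using the slowest
# thread as the run time is an assumption of this sketch.
phkmalloc = {1: 27.124817, 2: 70.330298, 3: 77.038550, 4: 229.286134, 5: 772.550847}
jemalloc = {1: 5.428918, 2: 4.948382, 3: 25.286181, 4: 38.320669, 5: 50.953694}


def speedup(threads):
    """Return how many times faster jemalloc was for the given thread count."""
    return phkmalloc[threads] / jemalloc[threads]


for n in sorted(phkmalloc):
    print(f"{n} thread(s): phkmalloc/jemalloc time ratio = {speedup(n):.1f}")
```

The ratios for 2 and 3 threads move around quite a bit, so these numbers are better read as an order-of-magnitude comparison on this one sparc64 machine than as a general result.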
|
|
Re: New libc malloc patch
On Dec 11, 2005, at 4:35 PM, Julian Elischer wrote:
> I may have missed it but some benchmark numbers could be good.

I haven't posted any benchmark numbers, but that is a reasonable
request. Here's a summary of what I've seen so far.

For single-threaded apps, phkmalloc and jemalloc exhibit very similar
performance for all of the benchmarks I've run. Neither is a clear
winner over the other from what I've seen.

Kris Kennaway already posted some multi-threaded microbenchmark
results. My tests have yielded similar results to his.

It would be very informative to run benchmarks with real world
multithreaded apps. bind9 (built with threading support) would be a
great candidate, but thus far I haven't gotten a chance to use the
machines that Robert Watson uses for such tests.

> Is there no way to make it an option for a while?
> that would get good testing AND a fallback for people.

Unfortunately, there are some low level issues that make the two
malloc implementations incompatible, and they both need access to
libc internals in order to work correctly in a multi-threaded
program. The way I have been comparing the two implementations is via
chroot installations. It might be possible to make the two compatible
(would require extra coding), but since both of them need to be part
of libc, we would need a way of building separate libc libraries for
the two mallocs. This all seems uglier than it's worth to me. Maybe
there's another way...

Jason
|