CPU affinity with ULE scheduler

To Whom It May Concern:

Can someone explain or share how the ULE scheduler (latest version 2, if I'm not mistaken) deals with CPU affinity? Are there any existing benchmarks of this on FreeBSD? I am currently using the 4BSD scheduler, and what I have observed, especially when processing high network traffic load on multiple CPU cores, is that only one CPU is stressed by network interrupts while the rest are mostly idle. This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom network interface cards (bce0 and bce1). Below is a snapshot of the case.

  PID USERNAME THR  PRI NICE   SIZE    RES STATE  C    TIME    WCPU COMMAND
   17 root       1  171   52     0K    16K RUN    0   96:04  97.71% idle: cpu0
   15 root       1  171   52     0K    16K RUN    2   98:41  97.07% idle: cpu2
   14 root       1  171   52     0K    16K RUN    3  103:56  95.90% idle: cpu3
   13 root       1  171   52     0K    16K RUN    4  104:17  88.23% idle: cpu4
   12 root       1  171   52     0K    16K RUN    5   97:59  86.57% idle: cpu5
   10 root       1  171   52     0K    16K RUN    7   81:51  82.08% idle: cpu7
   11 root       1  171   52     0K    16K RUN    6   95:28  81.35% idle: cpu6
   16 root       1  171   52     0K    16K RUN    1  102:15  77.78% idle: cpu1
   36 root       1  -68 -187     0K    16K WAIT   7   19:37   4.59% irq23: bce0 bce1
   18 root       1  -32 -151     0K    16K CPU0   0    2:13   0.00% swi4: clock sio
 4488 root       1   96    0 30728K  4292K select 3    1:51   0.00% sshd
   43 root       1  171   52     0K    16K pgzero 3    1:08   0.00% pagezero
  218 root       1   96    0  3852K  1380K select 3    0:38   0.00% syslogd
   20 root       1  -44 -163     0K    16K WAIT   7    0:32   0.00% swi1: net

Thanks,
Archimedes
_______________________________________________
freebsd-smp@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-smp
To unsubscribe, send any mail to "freebsd-smp-unsubscribe@..."
|
|
Re: CPU affinity with ULE scheduler

Archimedes Gaviola wrote:
> To Whom It May Concerned:
>
> Can someone explain or share about ULE scheduler (latest version 2 if
> I'm not mistaken) dealing with CPU affinity? Is there any existing
> benchmarks on this with FreeBSD? Because I am currently using 4BSD

Yes, but not for network loads. See for example the benchmarks in
http://people.freebsd.org/~kris/scaling/7.0%20and%20beyond.pdf

> scheduler and as what I have observed especially on processing high
> network load traffic on multiple CPU cores, only one CPU were being
> stressed with network interrupt while the rests are mostly in idle
> state. This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom
> network interface cards (bce0 and bce1). Below is the snapshot of the
> case.

This is unfortunately so and cannot be changed for now - you are not the first with this particular performance problem.

BUT, looking at the data in the snapshot you gave, it's not clear that there is a performance problem in your case - the bce interrupt thread is taking nowhere near enough CPU time to be a bottleneck. What exactly do you think is wrong in your case?
|
|
Re: CPU affinity with ULE scheduler

On Monday 10 November 2008 03:33:23 am Archimedes Gaviola wrote:
> To Whom It May Concerned:
>
> Can someone explain or share about ULE scheduler (latest version 2 if
> I'm not mistaken) dealing with CPU affinity? Is there any existing
> benchmarks on this with FreeBSD? Because I am currently using 4BSD
> scheduler and as what I have observed especially on processing high
> network load traffic on multiple CPU cores, only one CPU were being
> stressed with network interrupt while the rests are mostly in idle
> state. This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom
> network interface cards (bce0 and bce1). Below is the snapshot of the
> case.

Interrupts are routed to a single CPU. Since bce0 and bce1 are both on the same interrupt (irq 23), the CPU that interrupt is routed to is going to end up handling all the interrupts for bce0 and bce1. This is not something ULE or 4BSD have any control over.

--
John Baldwin
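One way to spot the situation John describes is to look at the interrupt thread names in the `top` output: a name such as `irq23: bce0 bce1` lists every device attached to that IRQ. The Python sketch below groups devices by IRQ and reports the shared ones. The function name `shared_irqs` is hypothetical, and the `irqN: dev ...` naming pattern is an assumption taken only from the snapshots in this thread.

```python
import re

# Matches interrupt thread names as they appear in the top output in this
# thread, e.g. "irq23: bce0 bce1". The pattern is an assumption based on
# those samples only.
IRQ_NAME = re.compile(r"^irq(\d+):\s+(.+)$")


def shared_irqs(thread_names):
    """Return {irq_number: [devices]} for IRQs with more than one device."""
    shared = {}
    for name in thread_names:
        match = IRQ_NAME.match(name.strip())
        if match is None:
            continue  # not an interrupt thread (e.g. "swi1: net")
        irq = int(match.group(1))
        devices = match.group(2).split()
        if len(devices) > 1:
            shared[irq] = devices
    return shared


if __name__ == "__main__":
    names = ["irq23: bce0 bce1", "swi4: clock sio", "swi1: net"]
    for irq, devices in shared_irqs(names).items():
        print(f"irq{irq} is shared by: {', '.join(devices)}")
```

Note that `top` truncates long command names (for example `swi4: clock s`), so a device list cut short in the same way could hide sharing; treat the result as a hint rather than a definitive answer.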
|
|
Re: CPU affinity with ULE scheduler

On Tue, Nov 11, 2008 at 6:33 AM, John Baldwin <jhb@...> wrote:
> Interrupts are routed to a single CPU. Since bce0 and bce1 are both on the
> same interrupt (irq 23), the CPU that interrupt is routed to is going to end
> up handling all the interrupts for bce0 and bce1. This not something ULE or
> 4BSD have any control over.
>
> --
> John Baldwin

Hi John,

I'm sorry for the wrong snapshot. Here's the right one, showing my concern.

  PID USERNAME THR  PRI NICE   SIZE    RES STATE  C    TIME    WCPU COMMAND
   17 root       1  171   52     0K    16K CPU0   0   54:28  95.17% idle: cpu0
   15 root       1  171   52     0K    16K CPU2   2   55:55  93.65% idle: cpu2
   14 root       1  171   52     0K    16K CPU3   3   58:53  93.55% idle: cpu3
   13 root       1  171   52     0K    16K RUN    4   59:14  82.47% idle: cpu4
   12 root       1  171   52     0K    16K RUN    5   55:42  82.23% idle: cpu5
   16 root       1  171   52     0K    16K CPU1   1   58:13  77.78% idle: cpu1
   11 root       1  171   52     0K    16K CPU6   6   54:08  76.17% idle: cpu6
   36 root       1  -68 -187     0K    16K WAIT   7    8:50  65.53% irq23: bce0 bce1
   10 root       1  171   52     0K    16K CPU7   7   48:19  29.79% idle: cpu7
   43 root       1  171   52     0K    16K pgzero 2    0:35   1.51% pagezero
 1372 root      10   20    0 16716K  5764K kserel 6   58:42   0.00% kmd
 4488 root       1   96    0 30676K  4236K select 2    1:51   0.00% sshd
   18 root       1  -32 -151     0K    16K WAIT   0    1:14   0.00% swi4: clock s
   20 root       1  -44 -163     0K    16K WAIT   0    0:30   0.00% swi1: net
  218 root       1   96    0  3852K  1376K select 0    0:23   0.00% syslogd
 2171 root       1   96    0 30676K  4224K select 6    0:19   0.00% sshd

Actually, I was doing network performance testing on this system with FreeBSD 6.2-RELEASE using its default scheduler, 4BSD. I used a tool to generate a large amount of traffic, around 600-700 Mbps, traversing the FreeBSD system in both directions, meaning both network interfaces are receiving traffic. What happened was that the CPU handling irq 23 for both interfaces (cpu7) spent a large share of its time on it, around 65.53%, which affects other running applications and services such as sshd and httpd. They are no longer accessible while the traffic is being sent.

With only one CPU being stressed on my FreeBSD system, I was thinking of moving to FreeBSD 7.0-RELEASE with the ULE scheduler, because I thought my concern had something to do with how the scheduler distributes load across multiple CPU cores, especially when processing network load. So, if it is more a matter of interrupt handling than of the scheduler, is there a way we can optimize it? If the interrupt is still routed to only one CPU, then for me it's still inefficient. What handles the binding of interrupts to CPUs, and can it prevent a shared IRQ? Are there any improvements in FreeBSD 7.0 with regard to interrupt handling?

Thanks,
Archimedes
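The imbalance in the snapshot above can be read off the idle threads: each `idle: cpuN` line reports how idle that CPU is, so 100 minus its WCPU approximates how busy it is. A minimal Python sketch, assuming the `top` column layout shown in this thread (WCPU is the last field before the command). The helper name `busy_per_cpu` is hypothetical, and WCPU is a weighted average, so the figures are only approximate.

```python
import re

# Matches the tail of an idle-thread line from the top output in this
# thread, e.g. "... 48:19 29.79% idle: cpu7". Layout assumed from the samples.
IDLE_LINE = re.compile(r"(\d+(?:\.\d+)?)%\s+idle:\s+cpu(\d+)\s*$")


def busy_per_cpu(top_lines):
    """Map CPU number -> approximate busy percentage (100 - idle WCPU)."""
    busy = {}
    for line in top_lines:
        match = IDLE_LINE.search(line)
        if match:
            idle_pct = float(match.group(1))
            busy[int(match.group(2))] = round(100.0 - idle_pct, 2)
    return busy


if __name__ == "__main__":
    snapshot = [
        "17 root 1 171 52 0K 16K CPU0 0 54:28 95.17% idle: cpu0",
        "36 root 1 -68 -187 0K 16K WAIT 7 8:50 65.53% irq23: bce0 bce1",
        "10 root 1 171 52 0K 16K CPU7 7 48:19 29.79% idle: cpu7",
    ]
    for cpu, pct in sorted(busy_per_cpu(snapshot).items()):
        print(f"cpu{cpu}: about {pct}% busy")
```

For the snapshot in this message, cpu7 comes out at roughly 70% busy while cpu0 is under 5%, which matches the picture of a single CPU absorbing the irq23 load.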
|
|
Re: CPU affinity with ULE scheduler

On Tue, Nov 11, 2008 at 12:32 PM, Archimedes Gaviola <archimedes.gaviola@...> wrote:
> So, if it is more of interrupt
> handling and not on the scheduler, is there a way we can optimize it?
> Because if it still routed only to one CPU then for me it's still
> inefficient. Who handles interrupt scheduling for bounding CPU in
> order to prevent shared IRQ? Is there any improvements with
> FreeBSD-7.0 with regards to interrupt handling?
>
> Thanks,
> Archimedes

Hi Ivan,

Archimedes Gaviola wrote:
> To Whom It May Concerned:
>
> Can someone explain or share about ULE scheduler (latest version 2 if
> I'm not mistaken) dealing with CPU affinity? Is there any existing
> benchmarks on this with FreeBSD? Because I am currently using 4BSD

Yes but not for network loads. See for example benchmarks in
http://people.freebsd.org/~kris/scaling/7.0%20and%20beyond.pdf

[Archimedes] Ah okay, so based on my understanding, the ULE scheduler in FreeBSD 7.0 only scales well for scheduling userland applications such as databases and DNS?

> scheduler and as what I have observed especially on processing high
> network load traffic on multiple CPU cores, only one CPU were being
> stressed with network interrupt while the rests are mostly in idle
> state. This is an AMD-64 (4x) dual-core IBM system with GigE Broadcom
> network interface cards (bce0 and bce1). Below is the snapshot of the
> case.

This is unfortunately so and cannot be changed for now - you are not the first with this particular performance problem.

[Archimedes] Meaning, the ULE scheduler still has to be improved for processing network load? I have read some papers and articles saying that FreeBSD is implementing a parallelized network stack. What is the status of this development? Can it address the processing of high network load?

BUT, looking at the data in the snapshot you gave, it's not clear that there is a performance problem in your case - bce is not nearly taking as much CPU time to be bottlenecking. What exactly do you think is wrong in your case?

[Archimedes] Oh, I'm sorry, that was not the right one. Here it is below:

  PID USERNAME THR  PRI NICE   SIZE    RES STATE  C    TIME    WCPU COMMAND
   17 root       1  171   52     0K    16K CPU0   0   54:28  95.17% idle: cpu0
   15 root       1  171   52     0K    16K CPU2   2   55:55  93.65% idle: cpu2
   14 root       1  171   52     0K    16K CPU3   3   58:53  93.55% idle: cpu3
   13 root       1  171   52     0K    16K RUN    4   59:14  82.47% idle: cpu4
   12 root       1  171   52     0K    16K RUN    5   55:42  82.23% idle: cpu5
   16 root       1  171   52     0K    16K CPU1   1   58:13  77.78% idle: cpu1
   11 root       1  171   52     0K    16K CPU6   6   54:08  76.17% idle: cpu6
   36 root       1  -68 -187     0K    16K WAIT   7    8:50  65.53% irq23: bce0 bce1
   10 root       1  171   52     0K    16K CPU7   7   48:19  29.79% idle: cpu7
   43 root       1  171   52     0K    16K pgzero 2    0:35   1.51% pagezero
 1372 root      10   20    0 16716K  5764K kserel 6   58:42   0.00% kmd
 4488 root       1   96    0 30676K  4236K select 2    1:51   0.00% sshd
   18 root       1  -32 -151     0K    16K WAIT   0    1:14   0.00% swi4: clock s
   20 root       1  -44 -163     0K    16K WAIT   0    0:30   0.00% swi1: net
  218 root       1   96    0  3852K  1376K select 0    0:23   0.00% syslogd
 2171 root       1   96    0 30676K  4224K select 6    0:19   0.00% sshd

I was doing network performance testing with a traffic generator tool sending 600-700 Mbps through my FreeBSD system in both directions. As you can see, irq23, which is shared by both network interfaces bce0 and bce1, is bound to cpu7. It takes up 65.53% CPU there, which affects some of the applications running on the system, such as sshd and httpd: these services are no longer accessible under that amount of traffic. Since most of the other CPUs are still idle, my concern is CPU load distribution, so that not just one CPU is stressed.

Thanks,
Archimedes
|
|
Re: CPU affinity with ULE scheduler

Archimedes Gaviola wrote:
> Hi Ivan,
>
> Archimedes Gaviola wrote:
>> To Whom It May Concerned:
>>
>> Can someone explain or share about ULE scheduler (latest version 2 if
>> I'm not mistaken) dealing with CPU affinity? Is there any existing
>> benchmarks on this with FreeBSD? Because I am currently using 4BSD
>
> Yes but not for network loads. See for example benchmarks in
> http://people.freebsd.org/~kris/scaling/7.0%20and%20beyond.pdf
>
> [Archimedes] Ah okay, so based on my understanding with ULE scheduler
> in FreeBSD-7.0, it only scale well with userland applications
> scheduling such as database and DNS?

The problem you are seeing is probably not solvable by a better scheduler. There are other parts of the system that cause performance bottlenecks. I'd recommend you try 7-STABLE; it might help you, but it probably won't.
|
|
Re: CPU affinity with ULE scheduler

On Monday 10 November 2008 11:32:55 pm Archimedes Gaviola wrote:
> So, if it is more of interrupt
> handling and not on the scheduler, is there a way we can optimize it?
> Because if it still routed only to one CPU then for me it's still
> inefficient. Who handles interrupt scheduling for bounding CPU in
> order to prevent shared IRQ? Is there any improvements with
> FreeBSD-7.0 with regards to interrupt handling?

It depends. In all likelihood, the interrupts from bce0 and bce1 are both hardwired to the same interrupt pin, and so they will always share the same ithread when using the legacy INTx interrupts. However, bce(4) parts do support MSI, and if you try a newer OS snap (6.3 or later) these devices should use MSI, in which case each NIC would be assigned to a separate CPU. I would suggest trying 7.0 or a 7.1 release candidate and seeing if it does better.

--
John Baldwin
|
|
Re: CPU affinity with ULE scheduler

On Wed, Nov 12, 2008 at 1:16 AM, John Baldwin <jhb@...> wrote:
> It depends. In all likelihood, the interrupts from bce0 and bce1 are both
> hardwired to the same interrupt pin and so they will always share the same
> ithread when using the legacy INTx interrupts. However, bce(4) parts do
> support MSI, and if you try a newer OS snap (6.3 or later) these devices
> should use MSI in which case each NIC would be assigned to a separate CPU. I
> would suggest trying 7.0 or a 7.1 release candidate and see if it does
> better.
>
> --
> John Baldwin

Hi John,

I tried 7.0-RELEASE, and each network interface is now allocated to a separate CPU. MSI is already working here.

  PID USERNAME THR  PRI NICE   SIZE    RES STATE  C    TIME    WCPU COMMAND
   12 root       1  171 ki31     0K    16K CPU6   6  123:55 100.00% idle: cpu6
   15 root       1  171 ki31     0K    16K CPU3   3  123:54 100.00% idle: cpu3
   14 root       1  171 ki31     0K    16K CPU4   4  123:26 100.00% idle: cpu4
   16 root       1  171 ki31     0K    16K CPU2   2  123:15 100.00% idle: cpu2
   17 root       1  171 ki31     0K    16K CPU1   1  123:15 100.00% idle: cpu1
   37 root       1  -68    -     0K    16K CPU7   7    9:09 100.00% irq256: bce0
   13 root       1  171 ki31     0K    16K CPU5   5  123:49  99.07% idle: cpu5
   40 root       1  -68    -     0K    16K WAIT   0    4:40  51.17% irq257: bce1
   18 root       1  171 ki31     0K    16K RUN    0  117:48  49.37% idle: cpu0
   11 root       1  171 ki31     0K    16K RUN    7  115:25   0.00% idle: cpu7
   19 root       1  -32    -     0K    16K WAIT   0    0:39   0.00% swi4: clock s
14367 root       1   44    0  5176K  3104K select 2    0:01   0.00% dhcpd
   22 root       1  -16    -     0K    16K -      3    0:01   0.00% yarrow
   25 root       1  -24    -     0K    16K WAIT   0    0:00   0.00% swi6: Giant t
11658 root       1   44    0 32936K  4540K select 1    0:00   0.00% sshd
14224 root       1   44    0 32936K  4540K select 5    0:00   0.00% sshd
   41 root       1  -60    -     0K    16K WAIT   0    0:00   0.00% irq1: atkbd0
    4 root       1   -8    -     0K    16K -      2    0:00   0.00% g_down

The bce0 interrupt thread (irq256) is saturated: it already takes 100% of CPU7, while the bce1 interrupt (irq257) on CPU0 is at around 51.17%. Any more recommendations? Is there anything more we can do to optimize with MSI?

Thanks,
Archimedes
|
|
Re: CPU affinity with ULE scheduler

On Thursday 13 November 2008 06:55:01 am Archimedes Gaviola wrote:
> The bce0 interface interrupt (irq256) gets stressed out which already
> have 100% of CPU7 while CPU0 is around 51.17%. Any more
> recommendations? Is there anything we can do about optimization with
> MSI?

Well, on 7.x you can try turning net.isr.direct off (sysctl). However, it seems you are hammering your bce0 interface. You might want to try using polling on bce0 and seeing if it keeps up with the traffic better.

--
John Baldwin
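John's two suggestions translate into a sysctl change and an interface flag. The Python sketch below only builds and prints the commands rather than running them, so they can be reviewed first. `net.isr.direct` comes from the message above; `ifconfig <interface> polling` is the usual way to enable polling(4), and it assumes a kernel built with `options DEVICE_POLLING` and a driver that supports polling, which should be checked for bce(4) on the release in use. The function name `tuning_commands` is hypothetical.

```python
import shlex


def tuning_commands(interface, isr_direct=False, polling=True):
    """Build (not run) the commands for the two suggestions above."""
    commands = [
        # 0 turns direct dispatch off, so inbound packets are queued for
        # the netisr thread instead of being processed in the ithread.
        ["sysctl", f"net.isr.direct={1 if isr_direct else 0}"],
    ]
    if polling:
        # Assumes a kernel with DEVICE_POLLING and driver support.
        commands.append(["ifconfig", interface, "polling"])
    return commands


if __name__ == "__main__":
    for argv in tuning_commands("bce0"):
        print(shlex.join(argv))
```

Keeping the commands as argument lists makes it easy to try one change at a time and compare `top` snapshots between runs, which is closer to how the thread approaches the problem than applying both at once.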
|
|
Re: CPU affinity with ULE scheduler
On Fri, Nov 14, 2008 at 12:28 AM, John Baldwin <jhb@...> wrote:
> On Thursday 13 November 2008 06:55:01 am Archimedes Gaviola wrote:
>> The bce0 interface interrupt (irq256) gets stressed out which already
>> have 100% of CPU7 while CPU0 is around 51.17%. Any more
>> recommendations? Is there anything we can do about optimization with
>> MSI?
>
> Well, on 7.x you can try turning net.isr.direct off (sysctl). However, it
> seems you are hammering your bce0 interface. You might want to try using
> polling on bce0 and seeing if it keeps up with the traffic better.
>
> --
> John Baldwin
>

With net.isr.direct=0, the CPU utilization of the interrupt threads for
bce0 and bce1 on my IBM system goes down, but the utilization of
swi1: net goes up. Can you explain what is happening here? How does
net.isr.direct reduce the CPU utilization of the interface interrupt
threads? I would really like to know what happens internally as packets
are received by the interfaces and passed from the device interrupt up
to the software interrupt level, because I am confused by the effect of
enabling and disabling net.isr.direct in sysctl. Is there a tool we can
use to trace this path, so that we can see which part of the kernel is
the bottleneck, especially when net.isr.direct=1? By the way, with
device polling enabled the system experienced packet errors and the
interface throughput was worse, so I am avoiding it.

  PID USERNAME THR PRI NICE   SIZE   RES STATE  C   TIME    WCPU COMMAND
   16 root       1 171 ki31     0K   16K CPU10  a  86:06  89.06% idle: cpu10
   27 root       1 -44    -     0K   16K CPU1   1  34:37  82.67% swi1: net
   52 root       1 -68    -     0K   16K WAIT   b  51:59  59.77% irq32: bce1
   15 root       1 171 ki31     0K   16K RUN    b  69:28  43.16% idle: cpu11
   25 root       1 171 ki31     0K   16K RUN    1 115:35  24.27% idle: cpu1
   51 root       1 -68    -     0K   16K CPU10  a  35:21  13.48% irq31: bce0

Regards,
Archimedes
|
|
Re: CPU affinity with ULE scheduler
On Mon, Nov 17, 2008 at 7:11 PM, Archimedes Gaviola
<archimedes.gaviola@...> wrote:
> With net.isr.direct=0, the CPU utilization of the interrupt threads for
> bce0 and bce1 on my IBM system goes down, but the utilization of
> swi1: net goes up.
>
>   PID USERNAME THR PRI NICE   SIZE   RES STATE  C   TIME    WCPU COMMAND
>    16 root       1 171 ki31     0K   16K CPU10  a  86:06  89.06% idle: cpu10
>    27 root       1 -44    -     0K   16K CPU1   1  34:37  82.67% swi1: net
>    52 root       1 -68    -     0K   16K WAIT   b  51:59  59.77% irq32: bce1
>    15 root       1 171 ki31     0K   16K RUN    b  69:28  43.16% idle: cpu11
>    25 root       1 171 ki31     0K   16K RUN    1 115:35  24.27% idle: cpu1
>    51 root       1 -68    -     0K   16K CPU10  a  35:21  13.48% irq31: bce0
>
> Regards,
> Archimedes
>

One more thing: I observed that with net.isr.direct=1, bce0 uses irq256
and bce1 uses irq257, while with net.isr.direct=0, bce0 uses irq31 and
bce1 uses irq32. What makes the difference?
|
|
Re: CPU affinity with ULE scheduler
Archimedes Gaviola wrote:

> With net.isr.direct=0, the CPU utilization of the interrupt threads for
> bce0 and bce1 on my IBM system goes down, but the utilization of
> swi1: net goes up. Can you explain what is happening here? How does
> net.isr.direct reduce the CPU utilization of the interface interrupt
> threads?

The system has a choice between processing the packets in the interrupt
handler (the "irq:bce" process) or in a dedicated network process (the
"swi:net" process). This is about protocol handling, not simply receiving
packets. With net.isr.direct you're toggling between those two options.
If "direct" is 1, the packets are processed in the interrupt handler; if
it's 0, the processing is delegated to swi. It's set to 1 by default
because this setting should yield the best latency.

In both cases the code path a packet must go through is very similar: it
has to be received, then processed through firewalls and network stack
code, then delivered to application(s), so it's a serial process. There
are things that could be better parallelized in the stack and people are
working on them, but they will not be finished any time soon.
|
|
Re: CPU affinity with ULE scheduler
On Monday 17 November 2008 06:11:00 am Archimedes Gaviola wrote:
> With net.isr.direct=0, the CPU utilization of the interrupt threads for
> bce0 and bce1 on my IBM system goes down, but the utilization of
> swi1: net goes up. Can you explain what is happening here? How does
> net.isr.direct reduce the CPU utilization of the interface interrupt
> threads? I would really like to know what happens internally as packets
> are received by the interfaces and passed from the device interrupt up
> to the software interrupt level, because I am confused by the effect of
> enabling and disabling net.isr.direct in sysctl. Is there a tool we can
> use to trace this path, so that we can see which part of the kernel is
> the bottleneck, especially when net.isr.direct=1? By the way, with
> device polling enabled the system experienced packet errors and the
> interface throughput was worse, so I am avoiding it.
>
>   PID USERNAME THR PRI NICE   SIZE   RES STATE  C   TIME    WCPU COMMAND
>    16 root       1 171 ki31     0K   16K CPU10  a  86:06  89.06% idle: cpu10
>    27 root       1 -44    -     0K   16K CPU1   1  34:37  82.67% swi1: net
>    52 root       1 -68    -     0K   16K WAIT   b  51:59  59.77% irq32: bce1
>    15 root       1 171 ki31     0K   16K RUN    b  69:28  43.16% idle: cpu11
>    25 root       1 171 ki31     0K   16K RUN    1 115:35  24.27% idle: cpu1
>    51 root       1 -68    -     0K   16K CPU10  a  35:21  13.48% irq31: bce0

With net.isr.direct=1, the ithread tries to pass the received packets up
to IP/UDP/TCP/socket directly. With net.isr.direct=0, the ithread places
received packets on a queue and sends a signal to 'swi1: net'. The swi
thread wakes up, pulls the packets off of the queue and sends them to
IP/UDP/TCP/socket.

--
John Baldwin
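The two dispatch modes described above can be pictured with a toy model. This is only a sketch in Python, not FreeBSD kernel code: the class `NetisrModel`, the function `protocol_input` and the per-thread counters are hypothetical names invented for illustration. It shows only where the protocol work is accounted (ithread or swi thread), which is why the WCPU figures in top shift between "irqNN: bceN" and "swi1: net" when the sysctl is toggled.

```python
from collections import deque


def protocol_input(packet, delivered):
    """Hypothetical stand-in for the IP/UDP/TCP/socket processing of one packet."""
    delivered.append(packet)


class NetisrModel:
    """Toy model of direct versus queued dispatch (an illustration, not kernel code)."""

    def __init__(self, direct):
        self.direct = direct      # models net.isr.direct: True for 1, False for 0
        self.queue = deque()      # packets waiting for the swi thread
        self.delivered = []       # packets that finished protocol processing
        self.work = {"ithread": 0, "swi": 0}  # packets processed by each thread

    def ithread_receive(self, packet):
        """Runs in the interrupt thread for every received packet."""
        if self.direct:
            # Direct dispatch: protocol processing happens in the ithread itself.
            protocol_input(packet, self.delivered)
            self.work["ithread"] += 1
        else:
            # Queued dispatch: only enqueue here; the swi thread does the rest.
            self.queue.append(packet)

    def swi_run(self):
        """Runs when the software interrupt thread is woken up."""
        while self.queue:
            protocol_input(self.queue.popleft(), self.delivered)
            self.work["swi"] += 1


def run(direct, packets):
    """Feed all packets through the model and return it for inspection."""
    model = NetisrModel(direct)
    for packet in packets:
        model.ithread_receive(packet)
    model.swi_run()
    return model
```

In both modes every packet is still processed once and in order; only the thread that is charged for the work changes. The model deliberately ignores CPU placement, locking and queue limits.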
|
|
Re: CPU affinity with ULE scheduler
On Monday 17 November 2008 06:36:40 am Archimedes Gaviola wrote:
> One more thing: I observed that with net.isr.direct=1, bce0 uses
> irq256 and bce1 uses irq257, while with net.isr.direct=0, bce0 uses
> irq31 and bce1 uses irq32. What makes the difference?

That is not from net.isr.direct. irq256/257 is when the bce devices are
using MSI. irq31/32 is when the bce devices are using INTx.

--
John Baldwin
Re: CPU affinity with ULE scheduler

> In both cases the code path a packet must go through is very similar: it
> has to be received, then processed through firewalls and network stack
> code, then delivered to application(s), so it's a serial process. There
> are things that could be better parallelized in the stack and people are
> working on them, but they will not be finished any time soon.

Ah, okay, so the project is moving towards network stack parallelism. What is the benefit of a parallelized network stack compared to the current serialized one? Are there any known issues with the serialized network stack on multiple CPUs? If so, in which aspects, components, or subsystems of the operating system? What changes does the operating system need for network stack parallelism? How should network processing be optimized with a parallelized network stack?

I have gone through a technical paper on the Internet about the evaluation of network stack parallelism strategies for modern operating systems, http://www.cs.rice.edu/CS/Architecture/docs/willmann-usenix06.pdf, which describes approaches to implementing a parallelized network stack and uses FreeBSD as the prototype for the different approaches. From this I want to know which approach FreeBSD is implementing: message-based parallelism or connection-based parallelism?

Thanks,
Archimedes
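To make the two terms in the question concrete: as the cited paper uses them, message-based parallelism lets any thread process any packet (so packets of one connection can be in flight on several CPUs and the connection state needs locking), while connection-based parallelism pins each connection to one thread or lock group. The following is only a toy sketch in Python, not FreeBSD code; `connection_worker`, `message_worker`, `N_WORKERS`, and the example flows are hypothetical names and values chosen for illustration.

```python
import zlib
from itertools import count

N_WORKERS = 4  # hypothetical number of network stack worker threads


def connection_worker(flow, n_workers=N_WORKERS):
    """Connection-based: hash the flow 4-tuple, so every packet of one
    connection is handled by the same worker."""
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % n_workers


_next_packet = count()


def message_worker(n_workers=N_WORKERS):
    """Message-based: give each packet to the next worker in turn,
    regardless of which connection it belongs to."""
    return next(_next_packet) % n_workers


# Two made-up flows: (src addr, src port, dst addr, dst port).
flow_a = ("10.0.0.1", 40000, "10.0.0.2", 80)
flow_b = ("10.0.0.3", 40001, "10.0.0.2", 80)
packets = [flow_a, flow_b, flow_a, flow_a, flow_b, flow_a]

conn_assign = [(p, connection_worker(p)) for p in packets]
msg_assign = [(p, message_worker()) for p in packets]

for label, assign in (("connection-based", conn_assign),
                      ("message-based", msg_assign)):
    workers_for_a = sorted({w for p, w in assign if p == flow_a})
    print(f"{label}: flow_a handled by worker(s) {workers_for_a}")
```

In the connection-based case a single busy connection can never use more than one worker, which is the trade-off against the locking that the message-based case needs around shared connection state.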
Re: CPU affinity with ULE scheduler

> Is there a tool that can we used to trace
> this process just to be able to know which part of the kernel internal
> is doing the bottleneck especially when net.isr.direct=1? By the way
> with device polling enabled, the system experienced packet errors and
> the interface throughput is worst, so I avoid using it though.
>

I was looking for a tool that shows how packets are processed from the interface up through the network stack to the applications, but I haven't found one. What I did find is the LOCK_PROFILING tool. I'm fairly sure it does not really answer my question, but I tried it because I need to learn something about the locks FreeBSD uses. Some people say that many factors and variables affect network performance in FreeBSD, so I gave this tool a try. I also got valuable info from this link:
http://markmail.org/message/3uqxi4pipvvoy6jx#query:lock%20profiling%20freebsd+page:1+mid:ymqgrxqf4min54zd+state:results

Instead of the IBM machine with Broadcom NICs, I used another machine with 4 x quad-core AMD64 CPUs, also with Broadcom NICs, running FreeBSD-7.1 BETA2. I collected results with and without traffic. For the traffic I used both TCP and UDP: UDP for upload and TCP for download, in a back-to-back setup.

What I found is a high wait_total on some of the following locks when there is traffic:

  max    total wait_total   count avg wait_avg cnt_hold cnt_lock name
  517 24761291    6165864 4460995   5        1   552124  1558183 net/route.c:293 (sleep mutex:radix node head)
  277  1427082     140797  354220   4        0    14476    20674 amd64/amd64/io_apic.c:212 (spin mutex:icu)
   33    25275      20744    5401   4        3        0     5400 amd64/amd64/mp_machdep.c:974 (spin mutex:sched lock 4)
17283  3346679     104214  107262  31        0     4545     4072 kern/kern_sysctl.c:1334 (sleep mutex:Giant)
  257    28599        386    1302  21        0       35       30 vm/vm_fault.c:667 (sleep mutex:vm object)
  282  2821743       2673  977635   2        0      926      552 net/if_ethersubr.c:405 (sleep mutex:bce1)
   22   743637     157239  256274   2        0     5304    48357 dev/random/randomdev_soft.c:308 (spin mutex:entropy harvest mutex)
  301 16301894     881827 1255534  12        0   241491    45973 dev/bce/if_bce.c:5016 (sleep mutex:bce0)
  273  1228787      55458  103863  11        0     3733     4736 kern/subr_sleepqueue.c:232 (spin mutex:sleepq chain)
  624  4682305    1339783 1251253   3        1    32664   254211 dev/bce/if_bce.c:4320 (sleep mutex:bce1)

With lock profiling, how do we know that a certain kernel structure or function is causing contention? I have only a little knowledge about mutexes; can someone elaborate on them, especially sleep and spin mutexes? Unfortunately the full log is too big for the mailing list, so I have attached the complete log in compressed format.

Thanks,
Archimedes
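One rough way to start reading a LOCK_PROFILING dump like the one in this message is to sort the rows by wait_total and compare cnt_lock against count. Below is a minimal sketch in Python, assuming the columns mean what the header suggests (wait_total as the total time spent waiting to acquire the lock, cnt_lock as the number of acquisitions that found the lock already held). The helper names `parse_row` and `rank_by_wait` are hypothetical, and the rows are copied from the message.

```python
# Rows copied from the message: eight numeric columns, then the lock name.
ROWS = """\
517 24761291 6165864 4460995 5 1 552124 1558183 net/route.c:293 (sleep mutex:radix node head)
277 1427082 140797 354220 4 0 14476 20674 amd64/amd64/io_apic.c:212 (spin mutex:icu)
33 25275 20744 5401 4 3 0 5400 amd64/amd64/mp_machdep.c:974 (spin mutex:sched lock 4)
17283 3346679 104214 107262 31 0 4545 4072 kern/kern_sysctl.c:1334 (sleep mutex:Giant)
257 28599 386 1302 21 0 35 30 vm/vm_fault.c:667 (sleep mutex:vm object)
282 2821743 2673 977635 2 0 926 552 net/if_ethersubr.c:405 (sleep mutex:bce1)
22 743637 157239 256274 2 0 5304 48357 dev/random/randomdev_soft.c:308 (spin mutex:entropy harvest mutex)
301 16301894 881827 1255534 12 0 241491 45973 dev/bce/if_bce.c:5016 (sleep mutex:bce0)
273 1228787 55458 103863 11 0 3733 4736 kern/subr_sleepqueue.c:232 (spin mutex:sleepq chain)
624 4682305 1339783 1251253 3 1 32664 254211 dev/bce/if_bce.c:4320 (sleep mutex:bce1)
"""

FIELDS = ("max", "total", "wait_total", "count",
          "avg", "wait_avg", "cnt_hold", "cnt_lock")


def parse_row(line):
    """Split one row into its eight integer columns plus the lock name."""
    parts = line.strip().split(None, len(FIELDS))
    row = dict(zip(FIELDS, map(int, parts[:len(FIELDS)])))
    row["name"] = parts[len(FIELDS)]
    return row


def rank_by_wait(rows):
    """Order locks by total time spent waiting to acquire them, worst first."""
    return sorted(rows, key=lambda r: r["wait_total"], reverse=True)


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
ranked = rank_by_wait(rows)

for r in ranked[:3]:
    # Share of acquisitions that found the lock already held.
    held_share = r["cnt_lock"] / r["count"]
    print(f'{r["wait_total"]:>9} {held_share:6.1%} {r["name"]}')
```

On this data the radix node head mutex at net/route.c:293 comes out on top, followed by the two dev/bce/if_bce.c locks. A high wait_total together with a large cnt_lock/count share is a hint of contention worth looking at, not proof of a bottleneck.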