FreeBSD ARM network speed

Hi All,
I am continuing my work on CNX11XX/STR91XX (more info about the work: http://tinyhack.com/2009/09/28/cnx11xxstr91xx-freebsd-progress/). Two important things are left: the Flash/CFI driver and the network problem. The Flash/CFI driver should be easy in theory, but I will read more about it to make sure that I do not mess up the boot loader part. And now about the network.

The network speed is now around half of Linux on the same hardware. FTP-ing a 30 MB file from the device to my computer, the speed is about 1.6-2 megabytes/second (the higher speed is on the second run, when the data is already cached). On Linux, I can upload the same file at about 3-4 megabytes/second.

Some info about the device: RAM: 64 MB, CPU: FA526 (ARMv4, no Thumb instructions), speed: 200 MHz. The MAC is part of the SoC; the PHY is an ICPLUS IP101A.

I have two questions:

1. Is the network speed in FreeBSD ARM currently slower than in Linux ARM? If it is slower, then how much slower is it? I cannot find a comparison of network speed on FreeBSD ARM and Linux ARM. I am interested if anyone can provide me with a comparison between Linux and FreeBSD on the NSLU2 or some other device. Just for information, changing some kernel options in the Linux version (such as the scheduler used) makes the network speed vary greatly (I think the variation is more than 30%, so in certain configurations it can be as slow as the current FreeBSD version). The network in the Linux 2.4 kernel is faster than in Linux 2.6.

2. What should I do to make the network faster (especially sending from the device, to make it usable as a media server)?

Here are the things that I have done:

- using the scatter/gather feature of the hardware (this improves the speed a little bit)
- using the checksum offloading feature of the hardware (this improves the speed a little bit)
- using task_queue for sending (this improves the speed a lot)
- I have disabled spinlock debugging, and other debugging except for DDB
- I have used the -O2 optimization flag
- I have checked that there are no errors/retransmissions (using Wireshark), so all the packets are sent and received correctly
- I have disabled IPv6 (here is my current configuration: http://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/str91xx/src/sys/arm/conf/CNS11XXNAS&REV=3)

The specification for the STR9104 SoC is available on the Cavium website for those who are interested, but it is not very clear, so in developing the network driver I followed the logic used by the Linux driver (the initialization sequence, etc.). The current code is at http://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/str91xx/src/sys/arm/econa/if_ece.c&REV=4

Here is how the sending part works on STR9104:

- In the initialization part, I allocate a ring; the size of the ring is 256 entries (same as the Linux version).
- When asked to send a packet, I do the following:
  - stop the network TX DMA
  - put the address of each segment of the packet into the ring, and set a flag so that the entry in the ring will be sent by the hardware
  - start the network TX DMA

Obviously there is a clean-up part (freeing the mbuf) that has to be done. The network hardware can generate an interrupt when a packet has been sent (but can't tell which entry was sent). In the Linux version this interrupt is not used; the clean-up is done just after starting the TX DMA, at the end of the sending function, and I do the same in the FreeBSD driver. Usually only one entry needs to be removed, so it is quite fast.

Is there something obvious (or not so obvious) that I've missed?

--
Regards
Yohanes
http://yohan.es/
_______________________________________________
freebsd-arm@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arm
To unsubscribe, send any mail to "freebsd-arm-unsubscribe@..."
Re: FreeBSD ARM network speed

On 2 Oct 2009, at 06:58, Yohanes Nugroho wrote:
> Hi All,
>
> I am continuing my work on CNX11XX/STR91XX (more info about the work:
> http://tinyhack.com/2009/09/28/cnx11xxstr91xx-freebsd-progress/), two
> important things left are the Flash/CFI driver, and network problem.
> The Flash/CFI in theory should be easy, but I will read more about it
> to make sure that I will not mess the boot loader part. And now about
> the network.
>
> The network speed is now around half of Linux on the same hardware.
> FTP-ing from the device to my computer (uploading 30 mb file), the
> speed is about 1.6-2 megabyte/second (the high speed is on the second
> time when the data is already cached). On Linux, I can upload the same
> file with the speed of about 3-4 megabyte/second.
>
> Some info about the device: RAM: 64 Mb, CPU FA526 (ARM4, no thumb
> instruction), Speed 200Mhz. MAC is part of SoC, PHY is ICPLUS IP101A
>
> I have two question:
> 1. Is the network speed in Freebsd ARM currently slower than Linux
> ARM?

I see no problems on my ARM boards running FreeBSD.

--
Rui Paulo
Re: FreeBSD ARM network speed

On Fri, 2 Oct 2009 12:58:38 +0700
Yohanes Nugroho <yohanes@...> mentioned:

> I have two question:
> 1. Is the network speed in Freebsd ARM currently slower than Linux ARM?
>

I don't think so. Our network stack is arch-independent and should perform equally well on all platforms. I've been able to achieve speeds of up to 70 Mbps on my 180 MHz AT91-based board, which uses a very plain and dumb Ethernet controller (although DMA is supported).

> Here is how the sending part works on STR9104:
>
> - In the initialization part, I allocate a ring, the size of the ring
> is 256 entries (same as Linux version).
> - When being asked to send a packet, I will do the following thing:
> - stop the network TX DMA
> - put the address of each segment of the packet to the ring, and set
> a flag so that the entry in the ring will be sent by hardware
> - start the network TX DMA
>

This looks weird. Why do you stop the TX engine to add more packets to the ring? This can definitely kill network performance, as the controller is unable to transmit anything while you are filling the ring. You should also generally not transmit only one packet at a time: in that case your driver does a lot of extra work and, considering that you are stopping the TX engine while filling the ring, prevents the adapter from doing any useful work.

The main strategy of the driver should be to keep the ring filled, waking up when a reasonable amount of space in the ring becomes available, and sleeping the rest of the time while the adapter is working. I'm not sure why Linux doesn't use the interrupt, but this looks really wrong.

I'd suggest you analyze the performance of the network driver either with the available profiling tools (kgmon, hardware counters (if any)) and/or with system monitoring tools (top, etc.). Top, in particular, will allow you to see where all the CPU time went.

--
Stanislav Sedov
ST4096-RIPE
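The strategy described in this reply (never stop the TX engine, keep the ring filled, reclaim finished entries when the completion interrupt fires) can be sketched as a small user-space simulation. This is only an illustration under stated assumptions: it is not the if_ece.c code and not the STR9104 descriptor layout. The struct fields and the function names tx_enqueue, hw_transmit and tx_reclaim are hypothetical, and hw_transmit merely stands in for the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TX_RING_SIZE 256u   /* ring size mentioned in the thread */

/* Hypothetical descriptor; the real hardware layout is different. */
struct tx_desc {
    const void *buf;        /* address of one packet segment */
    uint16_t    len;        /* segment length in bytes */
    bool        hw_owned;   /* true: hardware still has to send it */
};

struct tx_ring {
    struct tx_desc desc[TX_RING_SIZE];
    unsigned prod;          /* next slot the driver fills */
    unsigned cons;          /* oldest slot not yet reclaimed */
    unsigned hw;            /* next slot the simulated hardware sends */
    unsigned used;          /* slots filled but not yet reclaimed */
};

static unsigned tx_ring_free(const struct tx_ring *r)
{
    return TX_RING_SIZE - r->used;
}

/* Queue one segment while the engine keeps running; false if the ring is full. */
static bool tx_enqueue(struct tx_ring *r, const void *buf, uint16_t len)
{
    if (r->used == TX_RING_SIZE)
        return false;
    struct tx_desc *d = &r->desc[r->prod];
    d->buf = buf;
    d->len = len;
    d->hw_owned = true;     /* hand over last, once the entry is complete */
    r->prod = (r->prod + 1) % TX_RING_SIZE;
    r->used++;
    return true;
}

/* Stand-in for the hardware: send up to max pending entries, in order. */
static unsigned hw_transmit(struct tx_ring *r, unsigned max)
{
    unsigned sent = 0;
    while (sent < max && r->desc[r->hw].hw_owned) {
        r->desc[r->hw].hw_owned = false;
        r->hw = (r->hw + 1) % TX_RING_SIZE;
        sent++;
    }
    return sent;
}

/* What a TX-completion handler would do: release every finished entry. */
static unsigned tx_reclaim(struct tx_ring *r)
{
    unsigned freed = 0;
    assert(r->used <= TX_RING_SIZE);
    while (r->used > 0 && !r->desc[r->cons].hw_owned) {
        r->desc[r->cons].buf = NULL;   /* a driver would free the mbuf here */
        r->cons = (r->cons + 1) % TX_RING_SIZE;
        r->used--;
        freed++;
    }
    return freed;
}
```

The interrupt does not need to say which entry was sent: tx_reclaim simply walks from the oldest outstanding slot until it reaches one the hardware still owns. In a real FreeBSD driver the ownership flag lives in DMA memory, so reading it would additionally need the appropriate bus_dmamap_sync(9) calls, which this simulation leaves out.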
Re: FreeBSD ARM network speed

On Fri, Oct 2, 2009 at 4:35 PM, Stanislav Sedov <stas@...> wrote:
> On Fri, 2 Oct 2009 12:58:38 +0700
> Yohanes Nugroho <yohanes@...> mentioned:
>
>> I have two question:
>> 1. Is the network speed in Freebsd ARM currently slower than Linux ARM?
>>
>
> I don't think so. Our network stack is arch-independent and should perform
> equally well on all platforms. I've been able to acchieve speeds up to
> 70 Mbps on my 180Mhz AT91 based board which uses very plain and dumb
> ethernet controller (although, DMA is supported).

OK, glad to hear that :) because the first time I asked about a problem in the USB, it turned out that there was a problem in busdma in the latest source code, and I had spent several days thinking it was my bug.

>
> This looks weird. Why do you stop the TX engine to add more packets
> in the ring? This thing definitely can kill the network performace

Yes, you are right, that is weird. I will have a look at it again.

> The main strategy of the driver should be to keep the ring filled,
> waking up when some reasonable amount of space in the ring become
> available, and sleeping all other time when the adapter is working.

Thank you for your enlightenment.

> I'm not sure why Linux doesn't use interrupt, but this looks really
> wrong.

It is because the driver comes from a vendor (very messy code) and is not in the mainline kernel yet.

The background story:

- I have a cheap Chinese NAS device (Agestar NCB3AST; it cost about $50, now you can get it for about $40). It comes with Linux kernel 2.4, and no source code was given. The SoC used is the Starsemi 9104.
- I found out that there is Linux source code for this SoC, but for a different device (with different hardware around the SoC).
- Based on that source code, I ported it to work on Linux kernel 2.6. I didn't bother to try to clean up the source code.
- When I thought about trying to add my code to the mainline kernel, I realized that I didn't understand most of the source code.
- Bruce Simpson offered a device with the same SoC (NSD-100), and I tried to port FreeBSD to it, thinking that I would understand the SoC better by rewriting the code.
- Starsemi was bought by Cavium, the SoC was renamed to Econa CNS1102, and the datasheet was released. The datasheet is not very clear, so I am still basing some of my code on the Linux code (just the logic, not copy-pasting; I understand the license implications).

> I'd suggest you to ananlyze the performance of network driver
> either by using the profiling tools available (kgmon, hardware
> counters (if any)) or/and via system monitoring tools (top, etc).
> Top, in particular, will allow you to see where all the CPU time
> went.

I am testing in single-user mode. The last time I tested using kgmon, it didn't show any particular area that might cause the slowdown.

Once again, thank you. I now have some ideas on what to do this weekend.

--
Regards
Yohanes
http://yohan.es/
Re: FreeBSD ARM network speed

On Fri, Oct 02, 2009 at 12:58:38PM +0700, Yohanes Nugroho wrote:
> Hi All,
>

Hi,

[...]

> The specification for the STR9104 SoC is available on Cavium website
> for those who are interested, but it is not very clear, so in
> developing the network driver, I followed the logic used by the Linux
> driver (the initialization sequence, etc). The current code is at
> http://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/str91xx/src/sys/arm/econa/if_ece.c&REV=4
>
> Here is how the sending part works on STR9104:
>
> - In the initialization part, I allocate a ring, the size of the ring
> is 256 entries (same as Linux version).

If the Ethernet controller does not support 1000baseT (I think it is Fast Ethernet, because the ICPlus IP101A is a 10/100 PHY), allocating 256 descriptors is a waste of resources, especially on a 64 MB system, I think.

> - When being asked to send a packet, I will do the following thing:
> - stop the network TX DMA
> - put the address of each segment of the packet to the ring, and set
> a flag so that the entry in the ring will be sent by hardware
> - start the network TX DMA
>
> obviously there is a cleaning up part (freeing mbuf) that should be
> done. The network driver can generate interrupt when a packet has been
> sent (but can't tell which entry was sent). In the Linux version, this
> interrupt is not used, the clean up is done just after starting the TX
> DMA, at the send of the sending function, and I do the same in the
> FreeBSD driver . Usually only one entry that needs to be removed, so
> it is quite fast.
>
> Is there something obvious (or not so obvius) that I've missed?
>

I briefly looked over the driver code and I can see missing bus_dmamap_sync(9) calls in several places, as well as incorrect use of bus_dma(9). This may also affect performance, because checking the OWN bit won't be correct from the CPU's point of view without bus_dmamap_sync(9).

Another cause of poor performance might be m_devget(9). I don't know whether the controller really needs this type of copying (sorry, I have no time to read the data sheet), but m_devget(9) is a really slow and time-consuming operation because it has to copy the entire frame to a new mbuf. If you had to use m_devget(9) to align buffers on an ETHER_ALIGN boundary, I guess you can pass the alignment restriction to bus_dma(9) instead. Of course, this requires that the controller is able to receive frames on an even address boundary, or has no Rx buffer alignment limitation.

I believe you should not stop DMA before sending another frame, as you did in the Rx handler. Basically you should keep the controller as busy as you can to get maximum performance, and you should reclaim transmitted buffers as soon as you notice them. Stopping DMA may take time, since the controller may have to drain active DMA cycles. If the controller does not generate a Tx completion interrupt after sending a frame, which is not likely, you may have to implement a kind of polling in a separate thread, or use polling(9).

Good luck!
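The ETHER_ALIGN remark in this reply comes down to simple arithmetic. An Ethernet header is 14 bytes, so a frame received at a 4-byte-aligned address leaves the IP header at offset 14, which is not 4-byte aligned. Starting the frame 2 bytes (ETHER_ALIGN) into the buffer moves the IP header to offset 16. The following is a minimal user-space sketch of that calculation. The helper names are hypothetical, and the two constants are redefined here with the values FreeBSD uses so that the snippet builds outside the kernel. Whether the STR9104 can actually DMA to such an offset is not established in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14u   /* 6 dst + 6 src + 2 ethertype */
#define ETHER_ALIGN    2u   /* padding placed in front of the frame */

/* Offset of the IP header when the frame starts pad bytes into the buffer. */
static size_t ip_header_offset(size_t pad)
{
    return pad + ETHER_HDR_LEN;
}

/* Nonzero if the IP header lands on a 4-byte boundary. */
static int ip_header_is_aligned(size_t pad)
{
    return ip_header_offset(pad) % 4 == 0;
}

/* Where to point the receive DMA inside a 4-byte-aligned buffer (hypothetical helper). */
static uint8_t *rx_frame_start(uint8_t *buf)
{
    assert(((uintptr_t)buf & 3u) == 0);   /* the buffer itself must be aligned */
    return buf + ETHER_ALIGN;
}
```

If the hardware accepts the address returned by rx_frame_start, frames arrive with an aligned IP header and no copy is needed. If it only accepts 4-byte-aligned addresses, the copy (as m_devget(9) does) or some other fix-up remains necessary on CPUs that cannot do unaligned loads.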