glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)

["Jared D. McNeill" wrote some time ago:]
>
> Ok, thanks to a bunch of helpful hints on and off list, here we go:
>
> swcrypto:
> aes-128-cbc       3688.28k     4064.06k     4185.64k     4216.48k     4221.59k
>
> hwcrypto:
> aes-128-cbc        372.70k     1422.76k     5098.58k    13612.23k    26804.31k

I've got NetBSD-4 running here on a PC Engines ALIX.2d3 board.

My dmesg shows:

    cpu0: AMD Geode LX (586-class), 498.08 MHz, id 0x5a2
    cpu0: features 88a93d<FPU,DE,PSE,TSC,MSR,CX8,SEP>
    cpu0: features 88a93d<PGE,CMOV,MPC,MMX>
    cpu0: "Geode(TM) Integrated Processor by AMD PCS"
    cpu0: I-cache 64 KB 32B/line 16-way, D-cache 64 KB 32B/line 16-way
    cpu0: L2 cache 128 KB 32B/line 4-way
    cpu0: ITLB 16 4 KB entries fully associative
    cpu0: DTLB 16 4 KB entries fully associative
    cpu0: 8 page colors
    [[....]]
    glxsb0 at pci0 dev 1 function 2: revision 0: RNG AES

OpenSSL seems to say what I'm told I should expect it to say:

    # openssl version
    OpenSSL 0.9.8e 23 Feb 2007

    # openssl engine -c
    (cryptodev) BSD cryptodev engine
     [RSA, DSA, DH, AES-128-CBC]
    (padlock) VIA PadLock (no-RNG, no-ACE)
    (dynamic) Dynamic engine loading support
    (4758cca) IBM 4758 CCA hardware engine support
     [RSA, RAND]
    (aep) Aep hardware engine support
     [RSA, DSA, DH]
    (atalla) Atalla hardware engine support
     [RSA, DSA, DH]
    (cswift) CryptoSwift hardware engine support
     [RSA, DSA, DH, RAND]
    (chil) CHIL hardware engine support
     [RSA, DH, RAND]
    (nuron) Nuron hardware engine support
     [RSA, DSA, DH]
    (sureware) SureWare hardware engine support
     [RSA, DSA, DH, RAND]
    (ubsec) UBSEC hardware engine support
     [RSA, DSA, DH]

However, unlike Jared's report above, when I run "openssl speed
aes-128-cbc" in any of various ways I never see any difference in
performance between when the crypto(4) device is enabled and when it is
disabled, and I certainly don't see the accelerated speeds Jared
reported.

My best numbers from the average of 10 runs of the following command on
an idle system:

    openssl speed -multi 10 aes-128-cbc -elapsed

are:

    # sysctl -w kern.usercrypto=1
    aes-128 cbc       5303.27k     5654.65k     5722.45k     5753.15k     8364.36k

    # sysctl -w kern.usercrypto=0
    aes-128 cbc       5200.41k     5698.54k     5746.44k     5764.66k     8201.28k

FreeBSD-7 with an identical version of OpenSSL seems slightly slower
(again this is an average of 10 runs on an idle system):

    aes-128 cbc       4567.62k     5015.47k     5151.20k     5239.94k     6543.29k

(and it's supposedly got the same driver for the AMD Geode LX block
enabled too!)

What the heck am I doing wrong? Or is something busted?

How do I figure out what's going on with the hardware device short of
adding printfs to it?

Where are the kern.*crypt* sysctl settings documented?

--
Greg A. Woods
Planix, Inc.
<woods@...>     +1 416 218 0099        http://www.planix.com/

Re: glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)

On Thu, Oct 29, 2009 at 09:20:16PM -0400, Greg A. Woods wrote:
> ["Jared D. McNeill" wrote some time ago:]
> >
> > Ok, thanks to a bunch of helpful hints on and off list, here we go:
> >
> > swcrypto:
> > aes-128-cbc       3688.28k     4064.06k     4185.64k     4216.48k     4221.59k
> >
> > hwcrypto:
> > aes-128-cbc        372.70k     1422.76k     5098.58k    13612.23k    26804.31k
>
> I've got NetBSD-4 running here on a PC Engines ALIX.2d3 board.
>
> My dmesg shows:
>
> cpu0: AMD Geode LX (586-class), 498.08 MHz, id 0x5a2
> cpu0: features 88a93d<FPU,DE,PSE,TSC,MSR,CX8,SEP>
> cpu0: features 88a93d<PGE,CMOV,MPC,MMX>
> cpu0: "Geode(TM) Integrated Processor by AMD PCS"
> cpu0: I-cache 64 KB 32B/line 16-way, D-cache 64 KB 32B/line 16-way
> cpu0: L2 cache 128 KB 32B/line 4-way
> cpu0: ITLB 16 4 KB entries fully associative
> cpu0: DTLB 16 4 KB entries fully associative
> cpu0: 8 page colors
> [[....]]
> glxsb0 at pci0 dev 1 function 2: revision 0: RNG AES
>
> OpenSSL seems to say what I'm told I should expect it to say:
>
> # openssl version
> OpenSSL 0.9.8e 23 Feb 2007
>
> # openssl engine -c
> (cryptodev) BSD cryptodev engine

You may need to explicitly specify -engine cryptodev, and note that you
will not get *any* acceleration from openssl speed for any cipher
unless you specify it as an "evp" instead of by the shortcut name:

    openssl speed -engine cryptodev -elapsed -evp aes-128-cbc

FWIW, glxsb is not very efficient and the syscall overhead will just
kill you for all but very large requests. You may see better results
with -multi 32 to get some parallelism going to hide the latency.

Thor

Re: glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)

On Thu, 29 Oct 2009 22:30:23 -0400,
Thor Lancelot Simon <tls@...> wrote:

Hello,

> openssl speed -engine cryptodev -elapsed -evp aes-128-cbc

I always prefer to measure the throughput with dd and openssl enc:

    dd if=/dev/zero bs=1k count=100000 | openssl enc -e -aes-128-cbc -k abcd -out /dev/null [-engine cryptodev]

Here (FreeBSD 8), without cryptodev:

    102400000 bytes transferred in 19.881321 secs (5150563 bytes/sec)

=> 39 Mbit/s

With cryptodev => 120 Mbit/s

> FWIW, glxsb is not very efficient and the syscall overhead will just
> kill you for all but very large requests. You may see better results
> with -multi 32 to get some parallelism going to hide the latency.

Yes, but it's not so bad IMHO. A throughput of 40 Mbit/s (i.e. the same
as openssl without glxsb) is already reached with requests larger than
256 bytes.

http://user.lamaiziere.net/patrick/glxsb-171108/glxsb-perf.pdf

While I'm here, there is a small mistake in glxsb.c in NetBSD (and
OpenBSD), but it does no harm:

    #define SB_AI_AES_A_COMPLETE    0x0100
    #define SB_AI_AES_B_COMPLETE    0x0200
    #define SB_AI_EEPROM_COMPLETE   0x0400

Should be:

    #define SB_AI_AES_A_COMPLETE    0x10000
    #define SB_AI_AES_B_COMPLETE    0x20000
    #define SB_AI_EEPROM_COMPLETE   0x40000

Source: http://support.amd.com/us/Embedded_TechDocs/33234H_LX_databook.pdf
6.12.3.3 SB AES Interrupt (SB_AES_INT) (page 522)

(I've sent a bug report to OpenBSD.)

Regards.

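As a quick check on the units, the bytes/sec figure that dd reports can be converted to megabits per second. This is only a minimal sketch: it assumes binary megabits (1024 * 1024 bits), and `to_mbit` is a hypothetical helper name, not part of dd or openssl.

```shell
# Convert a dd-style bytes-per-second figure to megabits per second.
# Assumption: 1 Mbit = 1024 * 1024 bits. to_mbit is a hypothetical helper.
to_mbit() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v bps="$1" 'BEGIN { printf "%.1f\n", bps * 8 / (1024 * 1024) }'
}

# Rate reported by dd for the run without cryptodev.
to_mbit 5150563
```

For 5150563 bytes/sec this gives roughly 39, which suggests the dd-based figures in this message are megabits rather than megabytes per second.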
Re: glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)

At Thu, 29 Oct 2009 22:30:23 -0400, Thor Lancelot Simon <tls@...> wrote:
Subject: Re: glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)
>
> You may need to explicitly specify -engine cryptodev, and note that you
> will not get *any* acceleration from openssl speed for any cipher
> unless you specify it as an "evp" instead of by the shortcut name:
>
> openssl speed -engine cryptodev -elapsed -evp aes-128-cbc

I'm not sure I understand. None of the examples I saw on the NetBSD
lists show this (and it's not explained at all in the manual page).

It looks like the algorithm can also be given on the command line:

    openssl speed -engine cryptodev -elapsed -evp aes-128-cbc aes-128-cbc

and then the program seems to run the test twice, once in a way that
will make use of /dev/crypto.

"-engine cryptodev" does now indeed make the huge difference I was
expecting, and I see the same kinds of stats others have posted.

I've since found similar examples using "-evp aes-128-cbc" on the
FreeBSD lists (regarding the same driver and device), as well as other
tests that make use of the device such as:

    # dd if=/dev/zero bs=4k count=100000 | \
        openssl enc -aes-128-cbc -e -out /dev/null -nosalt -k abcdefhij -engine cryptodev
    10000+0 records in
    10000+0 records out
    81920000 bytes transferred in 5.465 secs (14989935 bytes/sec)

I can also confirm that on NetBSD-4 with the native OpenSSL 0.9.8e the
"cryptodev" engine must be specified in order to make use of the device.

    # for i in 1 2 3 4 5 6 7 8 9 0 ; do
          openssl speed -multi 10 -evp aes-128-cbc -elapsed 2>/dev/null | tail -1
      done | awk '
          {n1=$1; t1+=$2; t2+=$3; t3+=$4; t4+=$5; t5+=$6;}
          END{printf("%-13s %11.2fk %11.2fk %11.2fk %11.2fk %11.2fk (%d runs)\n", n1, t1/NR, t2/NR, t3/NR, t4/NR, t5/NR, NR)}'
    evp                310.16k     1229.65k     4354.50k    10540.60k    62369.80k (10 runs)

    # sysctl -w kern.usercrypto=0
    evp               4917.08k     5519.23k     5746.64k     5808.70k     8549.20k (10 runs)

For comparison my Dell PE2650 2*2.4GHz HTT server gets:

    evp              22753.38k    26595.67k    31588.73k    31056.11k    35666.74k (10 runs)

> FWIW, glxsb is not very efficient and the syscall overhead will just
> kill you for all but very large requests. You may see better results
> with -multi 32 to get some parallelism going to hide the latency.

Indeed.

Thank you very much!

--
Greg A. Woods
Planix, Inc.
<woods@...>     +1 416 218 0099        http://www.planix.com/

Re: glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)

On Fri, Oct 30, 2009 at 12:18:00PM -0400, Greg A. Woods wrote:
> At Thu, 29 Oct 2009 22:30:23 -0400, Thor Lancelot Simon <tls@...> wrote:
> Subject: Re: glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)
> >
> > You may need to explicitly specify -engine cryptodev, and note that you
> > will not get *any* acceleration from openssl speed for any cipher
> > unless you specify it as an "evp" instead of by the shortcut name:
> >
> > openssl speed -engine cryptodev -elapsed -evp aes-128-cbc
>
> I'm not sure I understand. None of the examples I saw on the NetBSD
> lists show this (and it's not explained at all in the manual page).

I can't say why people would post wrong examples to the NetBSD lists.
I do often wish that if people didn't know what they were talking about,
they'd pipe down already with the "helpful" advice on the lists...

I can say why the manual page is wrong: OpenSSL manual pages in general
just plain suck.

Here is what is going on: the OpenSSL "engine" interface is jammed in
at their abstract-algorithm layer (fsvo "layer") which lies between
their SSL-record-handling layer and the raw encryption routines. This
layer is called "EVP".

The openssl 'speed' utility calls the raw encryption routines when you
tell it to do a speed test for a cipher. So the cryptodev engine never
sees the requests. However, it calls the EVP routines when you tell it
to do a speed test for any other kind of algorithm, such as a hash
function like MD5 or SHA! This can be extremely confusing.

The workaround is to trick it into thinking it's testing some other
kind of block-oriented algorithm by telling it to look up the cipher
*by its EVP* which forces it to use the EVP layer, so the engine layer
sees the requests. This is what the -evp switch on the command line
accomplishes.

Thor

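The two code paths described above correspond to two different command lines. A minimal sketch that only prints them, using the flags quoted in this thread; `speed_cmd` is a hypothetical helper name, and nothing here is run against real hardware:

```shell
# Print the "openssl speed" command line for a cipher, with or without
# an engine. speed_cmd is a hypothetical helper; it only prints text.
speed_cmd() {
    cipher=$1          # e.g. aes-128-cbc
    engine=${2:-}      # e.g. cryptodev; empty selects the software path
    if [ -n "$engine" ]; then
        # -evp sends the test through the EVP layer, where engines hook in.
        printf 'openssl speed -engine %s -elapsed -evp %s\n' "$engine" "$cipher"
    else
        # A bare cipher name uses the raw routines, so no engine sees it.
        printf 'openssl speed -elapsed %s\n' "$cipher"
    fi
}

speed_cmd aes-128-cbc
speed_cmd aes-128-cbc cryptodev
```

Comparing the output of the two printed commands on the same idle machine is one way to see whether the engine is actually being exercised.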
Re: glxsb(4) doesn't appear to be working for me (was: AMD Geode LX Security Block)

On Fri, Oct 30, 2009 at 04:12:26PM -0400, Thor Lancelot Simon wrote:
>
> The workaround is to trick it into thinking it's testing some other
> kind of block-oriented algorithm by telling it to look up the cipher
> *by its EVP* which forces it to use the EVP layer, so the engine layer
> sees the requests. This is what the -evp switch on the command line
> accomplishes.

The other thing is, using the "cryptodev" engine causes most of the
actual work to be done in the kernel. So you need -elapsed on the
openssl speed command line or you'll get false, insanely high results
because it will track only the amount of time spent in the userspace
openssl process.

When you use -multi N I think it also forces the use of -elapsed.

--
Thor Lancelot Simon                                   tls@...
"Even experienced UNIX users occasionally enter rm *.* at the UNIX
 prompt only to realize too late that they have removed the wrong
 segment of the directory structure." - Microsoft WSS whitepaper
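The effect of timing the wrong clock can be sketched with simple arithmetic: the reported rate is bytes divided by seconds, so dividing by a small userspace CPU time instead of wall-clock time inflates the result. In the sketch below, `rate_k` is a hypothetical helper, the byte count and elapsed time are taken from the dd run quoted earlier in the thread, and the 0.5 second user time is an invented value for illustration only.

```shell
# rate_k BYTES SECONDS: throughput in thousands of bytes per second,
# the unit "openssl speed" prints. rate_k is a hypothetical helper.
rate_k() {
    LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.2fk\n", b / s / 1000 }'
}

bytes=81920000            # bytes processed in the quoted dd run
rate_k "$bytes" 5.465     # wall-clock seconds reported by dd
rate_k "$bytes" 0.5       # hypothetical user-CPU seconds (illustration only)
```

The second figure is more than ten times the first even though the same work was done, which is the kind of false result -elapsed avoids.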