About the measurement of L2 Cache on Opteron

Hi all,

I'm evaluating program performance on an AMD Opteron 270 with OProfile. I refer to "Basic Performance Measurements for AMD Athlon™ 64, AMD Opteron™ and AMD Phenom™ Processors" by Paul J. Drongowski.

Paul's article introduces two methods for measuring the L2 cache: a direct method and an indirect method.

Direct method:

    L2 request rate = (L2_requests + L2_fill_write) / Ret_instructions
    L2 miss ratio   = L2_misses / (L2_requests + L2_fill_write)

Indirect method:

    IC_misses       = IC_refills_L2 + IC_refills_sys
    DC_misses       = DC_refills_L2 + DC_refills_sys
    L2_requests     = IC_misses + DC_misses + L2_requests_TLB
    L2 request rate = L2_requests / Ret_instructions
    L2_misses       = IC_refills_sys + DC_refills_sys + L2_misses_TLB
    L2 miss ratio   = L2_misses / L2_requests
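
For reference, the two sets of formulas above can be sketched in Python as follows. This is only an illustration: the function names and the returned dictionary layout are made up here, the inputs are assumed to be event counts (or sample counts taken with the same sampling period, so that they can be combined directly), and the numbers in the example calls are hypothetical, not measured.

```python
def direct_l2(l2_requests, l2_fill_write, l2_misses, ret_instructions):
    """Direct method: derive the L2 rates from the L2 events themselves."""
    total_requests = l2_requests + l2_fill_write
    return {
        "request_rate": total_requests / ret_instructions,
        "miss_ratio": l2_misses / total_requests,
    }


def indirect_l2(ic_refills_l2, ic_refills_sys, dc_refills_l2, dc_refills_sys,
                l2_requests_tlb, l2_misses_tlb, ret_instructions):
    """Indirect method: rebuild L2 traffic from L1 refill and TLB events."""
    ic_misses = ic_refills_l2 + ic_refills_sys
    dc_misses = dc_refills_l2 + dc_refills_sys
    l2_requests = ic_misses + dc_misses + l2_requests_tlb
    l2_misses = ic_refills_sys + dc_refills_sys + l2_misses_tlb
    return {
        "request_rate": l2_requests / ret_instructions,
        "miss_ratio": l2_misses / l2_requests,
    }


# Hypothetical counts, chosen only to exercise the formulas (not measured data).
print(direct_l2(l2_requests=800, l2_fill_write=200, l2_misses=100,
                ret_instructions=10_000))
print(indirect_l2(ic_refills_l2=30, ic_refills_sys=10,
                  dc_refills_l2=50, dc_refills_sys=10,
                  l2_requests_tlb=100, l2_misses_tlb=20,
                  ret_instructions=1_000))
```

If both methods measured the same thing, the two functions should return similar values when fed counts from the same run.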

I have some questions about L2 cache measurement with OProfile.

1. How should L2_requests_TLB in the indirect method be computed?

My understanding is that L2_requests_TLB is equal to the sum of L1_ITLB_MISS_AND_L2_ITLB_MISS and L1_DTLB_AND_L2_DTLB_MISS. However, the event REQUESTS_TO_L2 also has a mask bit (0x4) for TLB. I measured both mcf and vortex from SPEC2000.

    opcontrol --event=REQUESTS_TO_L2:50003:0x4 \
              --event=L1_ITLB_MISS_AND_L2_ITLB_MISS:50003 \
              --event=L1_DTLB_AND_L2_DTLB_MISS:50003 \
              --image=mcf.exe,vortex.exe

    L1_DTLB_AND_L2_DTLB_MISS|REQUESTS_TO_L2:0x4|L1_ITLB_MISS_AND_L2_ITLB_MISS:50003|
      samples|      %|  samples|      %|  samples|      %|
    -------------------------------------------------------------------------
         1377 100.000     1664 100.000        0 100.000 mcf.exe
         1192 100.000       10 100.000     1816 100.000 vortex.exe

There is a big discrepancy between REQUESTS_TO_L2:0x4 and (L1_ITLB_MISS_AND_L2_ITLB_MISS + L1_DTLB_AND_L2_DTLB_MISS). Which is appropriate?
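
To make the comparison in question 1 concrete, here is a small Python sketch that puts the two candidates side by side. The sample counts are copied from the output above, reading the columns in the order of the header as posted; the dictionary keys and the helper name are made up for this illustration. Multiplying samples by the count value (50003) only approximates the number of events.

```python
SAMPLE_PERIOD = 50003  # count value passed to every --event above

# Sample counts copied from the report output above, columns in header order.
samples = {
    "mcf.exe":    {"dtlb_miss": 1377, "requests_to_l2_tlb": 1664, "itlb_miss": 0},
    "vortex.exe": {"dtlb_miss": 1192, "requests_to_l2_tlb": 10, "itlb_miss": 1816},
}


def tlb_comparison(row):
    """Return (ITLB + DTLB miss samples, REQUESTS_TO_L2:0x4 samples)."""
    return row["itlb_miss"] + row["dtlb_miss"], row["requests_to_l2_tlb"]


for image, row in samples.items():
    tlb_sum, l2_tlb = tlb_comparison(row)
    # samples * period approximates the underlying event count
    print(f"{image}: ITLB+DTLB miss events ~{tlb_sum * SAMPLE_PERIOD}, "
          f"REQUESTS_TO_L2:0x4 events ~{l2_tlb * SAMPLE_PERIOD}")
```

Since all three events use the same count value, the raw sample counts can be compared directly; the scaling only changes the units.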

2. How should the total number of L2 requests (L2_request) be computed?

    Direct method:   L2_requests + L2_fill_write
    Indirect method: IC_misses (IC_refills_L2 + IC_refills_sys)
                     + DC_misses (DC_refills_L2 + DC_refills_sys)
                     + L2_requests_TLB

1) Direct method

    opcontrol --event=L2_CACHE_FILL_WRITEBACK:50003 \
              --event=REQUESTS_TO_L2:50003:0x7 \
              --image=mcf.exe,vortex.exe

    L2_CACHE_FILL_WRITEBACK|REQUESTS_TO_L2:0x7|
      samples|      %|  samples|      %|
    ------------------------------------
        16402 100.000    15920 100.000 mcf.exe
        11610 100.000    13761 100.000 vortex.exe

    L2_request_mcf_direct    = 16402 + 15920 = 32322
    L2_request_vortex_direct = 11610 + 13761 = 25371

2) Indirect method

    opcontrol --event=DATA_CACHE_REFILLS_FROM_L2_OR_SYSTEM:50003 \
              --event=INSTRUCTION_CACHE_REFILLS_FROM_L2:50003 \
              --event=INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM:50003 \
              --event=REQUESTS_TO_L2:50003:0x4 \
              --image=mcf.exe,vortex.exe

    INSTRUCTION_CACHE_REFILLS_FROM_L2|INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM|REQUESTS_TO_L2:0x4|DATA_CACHE_REFILLS_FROM_L2_OR_SYSTEM|
      samples|      %|  samples|      %|  samples|      %|  samples|      %|
    ------------------------------------------------------------------------
         2251 100.000       15 100.000     2160 100.000     7587 100.000 vortex.exe
            1 100.000        0 100.000     1660 100.000    10491 100.000 mcf.exe

    L2_request_vortex_indirect = 2251 + 15 + 2160 + 7587 = 12013
    L2_request_mcf_indirect    = 1 + 0 + 1660 + 10491 = 12152

There is a VERY BIG discrepancy between the L2_request values computed with the direct and indirect methods. Why?
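
The arithmetic for question 2 can be reproduced with the short Python sketch below. The sample counts are copied from the two tables above, with the program labels taken from the table rows; the dictionary keys and the helper name are made up for this illustration. Going by the formulas quoted earlier, the indirect sum is defined as L2_requests only, so the sketch also prints the REQUESTS_TO_L2:0x7 samples on their own; whether that is the fairer comparison is an assumption, not something the article or OProfile confirms here.

```python
# Sample counts copied from the two tables above (labels taken from the rows).
direct = {
    "mcf.exe":    {"l2_fill_writeback": 16402, "requests_to_l2": 15920},
    "vortex.exe": {"l2_fill_writeback": 11610, "requests_to_l2": 13761},
}
indirect = {
    "mcf.exe":    {"ic_refills_l2": 1, "ic_refills_sys": 0,
                   "requests_to_l2_tlb": 1660, "dc_refills_l2_or_sys": 10491},
    "vortex.exe": {"ic_refills_l2": 2251, "ic_refills_sys": 15,
                   "requests_to_l2_tlb": 2160, "dc_refills_l2_or_sys": 7587},
}


def total(row):
    """Sum all sample counts in one table row."""
    return sum(row.values())


for image in direct:
    d = total(direct[image])
    i = total(indirect[image])
    req_only = direct[image]["requests_to_l2"]
    print(f"{image}: direct={d} indirect={i} direct/indirect={d / i:.2f} "
          f"REQUESTS_TO_L2:0x7 alone={req_only}")
```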

3. Are the following statements right?

    1) INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM is equal to L2_CACHE_MISS:0x1.
    2) DATA_CACHE_REFILLS_FROM_SYSTEM is equal to L2_CACHE_MISS:0x2.

Any suggestions are welcome! Appropriate measurement parameters are necessary and important, and we should have a unified version.

--
Regards,
Paul Yuan (袁鹏)