Functions may read the L2 cache size via CPUID at unit initialization time and use it, except in the RTL replacement category.
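As a rough illustration of this rule, the sketch below queries the L2 cache size once at start-up and stores it for later use. It is written in C rather than Delphi and is not taken from any challenge entry. It assumes a GCC or Clang toolchain (`<cpuid.h>`, `__attribute__((constructor))`) and a CPU that reports extended CPUID leaf 0x80000006, where ECX bits 31..16 give the L2 size in KB. Many older Intel CPUs do not implement that leaf and would need the leaf 2 descriptor table instead. The names `g_l2_cache_kb`, `l2_kb_from_ecx`, `query_l2_cache_kb` and `copy_exceeds_l2` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header; an assumption of this sketch */
#endif

/* Hypothetical global, filled in once at start-up (the analogue of a
   Delphi unit's initialization section). 0 means "unknown". */
static uint32_t g_l2_cache_kb = 0;

/* Decode ECX of CPUID leaf 0x80000006: bits 31..16 hold the L2 size in KB. */
static uint32_t l2_kb_from_ecx(uint32_t ecx)
{
    return ecx >> 16;
}

/* Returns the L2 size in KB, or 0 if the leaf is unavailable
   (non-x86 target, or a CPU that does not report leaf 0x80000006). */
static uint32_t query_l2_cache_kb(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (__get_cpuid(0x80000006u, &eax, &ebx, &ecx, &edx))
        return l2_kb_from_ecx(ecx);
#endif
    return 0;
}

/* Runs before main(), so the move routine never has to execute CPUID. */
__attribute__((constructor))
static void init_cache_info(void)
{
    g_l2_cache_kb = query_l2_cache_kb();
}

/* Hypothetical helper: does a copy of n bytes exceed the detected L2 size?
   Returns 0 when the size is unknown. */
int copy_exceeds_l2(size_t n)
{
    if (g_l2_cache_kb == 0)
        return 0;
    return n > (size_t)g_l2_cache_kb * 1024u;
}
```

A move routine could call `copy_exceeds_l2(count)` to choose between a cached and a non-temporal copy path. That threshold is only an illustrative choice, not one prescribed by the rules.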
Validation and Benchmark Tool:
Tables of results
Documentation for Validation and Benchmark Tool
Target | Function | Author | Speedup over RTL |
P4 Prescott | MoveJOH_SSE3_9 | John O'Harrow | 2.12 |
P4 Northwood | MoveJOH_SSE3_5 | John O'Harrow | 1.93 |
Pentium M Dothan | MoveDKCSSE_1 | Dennis Christensen | 2.25 |
Pentium M Banias | MoveJOH_SSE_8 | John O'Harrow | 2.24 |
AMD64 | MoveJOH_SSE2_9 | John O'Harrow | 2.09 |
Athlon XP | MoveDKCSSE_1 | Dennis Christensen | 1.93 |
Blended | MoveJOH_MMX_1 | John O'Harrow | 1.81 |
RTL Replacement | MoveJOH_RTL_1 | John O'Harrow | 1.66 |
Pascal | MoveJOH_PAS_9 | John O'Harrow | 1.60 |