Move Challenge:

The objective is to build the fastest replacement for RTL Move.

Rules

Functions may read the L2 cache size via CPUID at unit initialization time and use it afterwards. This is not allowed in the RTL replacement category.
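The CPUID rule can be illustrated with a minimal sketch. It is written in C rather than as a Delphi unit, and it assumes a GCC or Clang toolchain that provides `<cpuid.h>`. It also assumes the CPU reports its L2 size through extended leaf 0x80000006, where ECX bits 31..16 hold the size in KB. Some older Intel processors expose cache information only through other leaves, which this sketch does not cover. The names `g_l2_cache_kb`, `init_l2_cache_size`, `l2_size_kb_from_ecx`, `l2_line_bytes_from_ecx` and `decode_example` are hypothetical and not part of any challenge entry.

```c
#include <assert.h>
#include <stdint.h>

/* <cpuid.h> is a GCC/Clang header and only meaningful on x86 targets. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#endif

/* Hypothetical global, filled once at startup; 0 means "unknown". */
uint32_t g_l2_cache_kb = 0;

/* ECX of CPUID leaf 0x80000006: bits 31..16 = L2 cache size in KB. */
uint32_t l2_size_kb_from_ecx(uint32_t ecx)
{
    return ecx >> 16;
}

/* ECX of CPUID leaf 0x80000006: bits 7..0 = L2 line size in bytes. */
uint32_t l2_line_bytes_from_ecx(uint32_t ecx)
{
    return ecx & 0xFFu;
}

/* Call once at startup (the counterpart of a unit initialization section).
   Leaves g_l2_cache_kb at 0 when CPUID or the leaf is unavailable. */
void init_l2_cache_size(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(0x80000006u, &eax, &ebx, &ecx, &edx)) {
        g_l2_cache_kb = l2_size_kb_from_ecx(ecx);
    }
#endif
}

/* Worked decode of a sample register value (not read from real hardware):
   0x02006140 -> size field 0x0200 = 512 KB, line size 0x40 = 64 bytes. */
void decode_example(void)
{
    assert(l2_size_kb_from_ecx(0x02006140u) == 512u);
    assert(l2_line_bytes_from_ecx(0x02006140u) == 64u);
}
```

A Move implementation could then compare the byte count against `g_l2_cache_kb * 1024` to choose between a cached and a non-temporal copy path, falling back to a fixed threshold when the value is 0.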

Validation and Benchmark Tool:
Tables of results
Documentation for Validation and Benchmark Tool

 
Target             Function        Author              Speedup over RTL
P4 Prescott        MoveJOH_SSE3_9  John O'Harrow       2.12
P4 Northwood       MoveJOH_SSE3_5  John O'Harrow       1.93
Pentium M Dothan   MoveDKCSSE_1    Dennis Christensen  2.25
Pentium M Banias   MoveJOH_SSE_8   John O'Harrow       2.24
AMD64              MoveJOH_SSE2_9  John O'Harrow       2.09
Athlon XP          MoveDKCSSE_1    Dennis Christensen  1.93
Blended            MoveJOH_MMX_1   John O'Harrow       1.81
RTL Replacement    MoveJOH_RTL_1   John O'Harrow       1.66
Pascal             MoveJOH_PAS_9   John O'Harrow       1.60