StrCopy Challenge

The objective is to build the fastest replacement for RTL StrCopy.

Validation and Benchmark Tool (
Tables of results (StrCopyBenchmark380.xls)

StrCopy Benchmark and Validation Tool Documentation

The benchmark calls the function under test a number of times with the same source and destination PChar pointers. It then shortens both strings by one character and repeats the calls. The strings are shortened by writing a zero terminator at the position of the new string end.
The number of times the StrCopy function is called on the same strings is controlled by the constants NOOFRUNS1 and NOOFRUNS2.
The benchmark is duplicated in SubBenchmark1 and SubBenchmark2. SubBenchmark1 runs on strings of 0-30 characters, and SubBenchmark2 runs on strings of 30-100 characters.
The maximum string length is set by the local MAXSTRLEN constant. It is currently 100.
The string length where SubBenchmark1 stops and SubBenchmark2 starts is controlled by the constant CROSSOVERLENGTH = 30.
The CPU clock tick counter is used for measuring the runtime.
The function under test is called twice in the inner loop, with source and destination swapped on the second call, to vary the alignment of source versus destination and to reduce the relative loop overhead.
The strings are global to ensure the same alignment on every benchmark run.
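The shortening step described above can be illustrated with a small stand-alone sketch. It is not taken from the benchmark tool: the program name, the DEMOLEN constant and the buffer names are hypothetical, and the RTL StrCopy is used only as an example of a function under test. It assumes Delphi or Free Pascal in Delphi mode.

```pascal
program ShortenDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  DEMOLEN = 5; // hypothetical length, chosen only to keep the output short

var
  // Global buffers with one extra element for the zero terminator.
  SourceBuf, DestBuf: array[0..DEMOLEN] of AnsiChar;
  Source, Dest: PAnsiChar;
  I: Integer;

begin
  Source := @SourceBuf[0];
  Dest := @DestBuf[0];

  // Start with DEMOLEN non-zero characters followed by a terminator.
  FillChar(SourceBuf, DEMOLEN, 'A');
  SourceBuf[DEMOLEN] := #0;
  DestBuf[0] := #0;

  for I := DEMOLEN downto 0 do
  begin
    // Writing #0 at index I makes the string exactly I characters long.
    Source[I] := #0;
    StrCopy(Dest, Source);
    Writeln('terminator at ', I, ', StrLen(Dest) = ', StrLen(Dest));
  end;
end.
```

Each pass should report a destination length equal to the index at which the terminator was written, which is the property the benchmark relies on when it walks from the longest string down to the shortest.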

The important code looks like this:

TicksStart := GetCPUTick;
for I1 := MaxStrLen downto 1 do
begin
  // Shorten both strings to I1 characters.
  Dest[I1] := #0;
  Source[I1] := #0;
  for I2 := 1 to NOOFRUNS1 do
  begin
    StrCopyFunction(Dest, Source);
    StrCopyFunction(Source, Dest);
  end;
end;
TicksEnd := GetCPUTick;
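For readers who want to experiment with the loop, the following is a minimal compilable sketch of the same structure, split into the two sub-benchmarks at CROSSOVERLENGTH. Several things in it are assumptions and not part of the tool: the values of NOOFRUNS1 and NOOFRUNS2, the exact loop bounds of each sub-benchmark, the helper names (StrCopyRef, FillBuffers, RunSubBenchmark, ElapsedMs), and above all the timer. The real tool reads the CPU tick counter through GetCPUTick; here a coarse wall-clock measurement based on Now stands in for it so that the sketch stays portable between Delphi and Free Pascal.

```pascal
program StrCopyBenchSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  MAXSTRLEN = 100;
  CROSSOVERLENGTH = 30;
  NOOFRUNS1 = 100000; // assumed value, not taken from the tool
  NOOFRUNS2 = 100000; // assumed value, not taken from the tool

type
  TStrCopyFunction = function(Dest, Source: PAnsiChar): PAnsiChar;

var
  // Global buffers, so their alignment is the same on every run.
  SourceBuf, DestBuf: array[0..MAXSTRLEN] of AnsiChar;

// Hypothetical stand-in for a challenge entry: a plain byte-by-byte copy.
function StrCopyRef(Dest, Source: PAnsiChar): PAnsiChar;
var
  D, S: PAnsiChar;
begin
  Result := Dest;
  D := Dest;
  S := Source;
  while S^ <> #0 do
  begin
    D^ := S^;
    Inc(D);
    Inc(S);
  end;
  D^ := #0;
end;

// Refill both buffers with MAXSTRLEN non-zero characters plus a terminator.
procedure FillBuffers;
begin
  FillChar(SourceBuf, MAXSTRLEN, 'A');
  FillChar(DestBuf, MAXSTRLEN, 'B');
  SourceBuf[MAXSTRLEN] := #0;
  DestBuf[MAXSTRLEN] := #0;
end;

// Copies back and forth while shortening the strings from HighLen to LowLen.
procedure RunSubBenchmark(Func: TStrCopyFunction;
  HighLen, LowLen, NoOfRuns: Integer);
var
  Source, Dest: PAnsiChar;
  I1, I2: Integer;
begin
  Source := @SourceBuf[0];
  Dest := @DestBuf[0];
  for I1 := HighLen downto LowLen do
  begin
    Dest[I1] := #0;
    Source[I1] := #0;
    for I2 := 1 to NoOfRuns do
    begin
      Func(Dest, Source);
      Func(Source, Dest);
    end;
  end;
end;

// Wall-clock stand-in for the CPU tick counter; far coarser than GetCPUTick.
function ElapsedMs(Start: TDateTime): Double;
begin
  Result := (Now - Start) * 24 * 60 * 60 * 1000;
end;

var
  T: TDateTime;

begin
  FillBuffers;
  T := Now;
  RunSubBenchmark(StrCopyRef, CROSSOVERLENGTH, 0, NOOFRUNS1);
  Writeln('SubBenchmark1: ', ElapsedMs(T):0:1, ' ms');

  FillBuffers;
  T := Now;
  RunSubBenchmark(StrCopyRef, MAXSTRLEN, CROSSOVERLENGTH, NOOFRUNS2);
  Writeln('SubBenchmark2: ', ElapsedMs(T):0:1, ' ms');
end.
```

The buffers are refilled before each sub-benchmark because the previous run leaves zero terminators behind. Passing the function as a parameter is a choice made for this sketch; it makes it easy to swap in another candidate but adds an indirect call to every iteration.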

The following possible weaknesses should be noted.