Note that the UNROLL option makes the 'inner' des loop unroll all 16 rounds
instead of the default 4.
RISC1 and RISC2 are 2 alternatives for the inner loop and
PTR means to use pointer arithmetic instead of array indexing.

Configuration                                                   des/s      k/s
IRIX 6.2 - R10000 195MHz - cc (-O3 -n32) - UNROLL RISC2 PTR   496,000  3968k/s
solaris 2.5.1 usparc 167MHz?? - SC4.0 - UNROLL RISC1 PTR [1]  475,400  3804k/s
solaris 2.5.1 usparc 167MHz?? - gcc 2.7.2 - UNROLL RISC1 PTR  306,000  2448k/s
linux - pentium 100MHz - gcc 2.7.0 - assembler                281,000  2250k/s
NT 4.0 - pentium 100MHz - VC 4.2 - assembler                  281,000  2250k/s
IRIX 5.3 - R4400 200MHz - gcc 2.6.3 - UNROLL RISC2 PTR        235,300  1882k/s
IRIX 5.3 - R4400 200MHz - cc - UNROLL RISC2 PTR               233,700  1869k/s
NT 4.0 - pentium 100MHz - VC 4.2 - UNROLL RISC1 PTR           191,000  1528k/s
DEC Alpha 165MHz?? - cc - RISC2 PTR [2]                       181,000  1448k/s
linux - pentium 100MHz - gcc 2.7.0 - UNROLL RISC1 PTR         158,500  1268k/s
HPUX 10 - 9000/887 - cc - UNROLL [3]                          148,000  1190k/s
solaris 2.5.1 - sparc 10 50MHz - gcc 2.7.2 - UNROLL           123,600   989k/s
IRIX 5.3 - R4000 100MHz - cc - UNROLL RISC2 PTR               101,000   808k/s
DGUX - 88100 50MHz(?) - gcc 2.6.3 - UNROLL                     81,000   648k/s
solaris 2.4 486 50MHz - gcc 2.6.3 - assembler                  65,000   522k/s
HPUX 10 - 9000/887 - k&r cc (default compiler) - UNROLL PTR    76,000   608k/s
solaris 2.4 486 50MHz - gcc 2.6.3 - UNROLL RISC2               43,500   344k/s
AIX - old slow one :-) - cc -                                  39,000   312k/s

Notes.
[1] For the ultra sparc, SunC 4.0 cc -fast -Xa -xO5, running 'des_opts'
    gives a speed of 475,000 des/s while 'speed' gives 417,000 des/s.
    I believe the difference comes from optimisations that the compiler
    is able to perform when the code is 'inlined'. For 'speed', the DES
    routines are being linked from a library. I'll record the higher
    speed since if performance is everything, you can always inline
    'des_enc.c'.
[2] Similar to the ultra sparc ([1]), 181,000 for 'des_opts' vs 175,000
    for 'speed'.
[3] I was unable to get access to this machine when it was not heavily loaded.
    As such, my timing program was never able to get more than 30% of the CPU.
    This would cause the program to give much lower speed numbers because
    it would be 'fighting' to stay in the cache with the other CPU-burning
    processes.
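
The two numeric columns in the table appear to be related by the DES block
size: each DES encryption processes one 8-byte block, so des/s times 8 gives
bytes per second. The sketch below shows that conversion. The helper name
des_rate_to_kbytes is hypothetical (it is not part of the DES library), and
treating "k" as 1000 bytes is an assumption inferred from rows such as
496,000 -> 3968k/s; a few rows in the table differ by a few k/s, presumably
because the des/s figure was rounded for display.

```c
#include <assert.h>

/* DES works on 64-bit blocks, i.e. 8 bytes per encryption. */
#define DES_BLOCK_BYTES 8UL

/* Assumption: "k" in the table means 1000 bytes, not 1024. */
#define BYTES_PER_K 1000UL

/*
 * Hypothetical helper, not part of the DES library: convert a rate in
 * DES encryptions per second into thousands of bytes per second,
 * truncating any fractional part.
 */
unsigned long des_rate_to_kbytes(unsigned long des_per_sec)
{
    /* Guard against overflow in the multiplication below. */
    assert(des_per_sec <= (unsigned long)-1 / DES_BLOCK_BYTES);

    return des_per_sec * DES_BLOCK_BYTES / BYTES_PER_K;
}
```

For example, des_rate_to_kbytes(496000) returns 3968 and
des_rate_to_kbytes(306000) returns 2448, matching the first and third rows.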