This is a strcmp() benchmark covering several real-world-like cases, intended for use while micro-optimizing strcmp() on Itanium and other architectures.
bench.c contains the full benchmark; bench2.c is a simplified version written for an article series on epic-linux.org.
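
For orientation, here is a minimal sketch of how a figure in iter/msec could be measured for one case. It is not taken from bench.c or bench2.c: the names bench_case, place_string, bench_iter_per_msec and bench_report are hypothetical, and it assumes that the "_align" and "_noalign" cases differ only in the byte offset of the string inside its buffer. Timing uses the standard C clock() function, which may differ from what the real benchmark does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical test case: a label plus the two strings to compare. */
struct bench_case {
    const char *name;
    const char *a;
    const char *b;
};

/* Results are stored here so the compiler cannot discard the calls. */
static volatile int bench_sink;

/*
 * Copy s into buf starting at the given byte offset and return a pointer
 * to the copy. An offset of 0 keeps the buffer's own alignment; an offset
 * of 1 gives a misaligned string (assumed meaning of "_noalign").
 */
static const char *place_string(char *buf, size_t bufsize, size_t offset,
                                const char *s)
{
    size_t len = strlen(s) + 1;          /* include the terminating NUL */
    assert(offset + len <= bufsize);
    memcpy(buf + offset, s, len);
    return buf + offset;
}

/* Call strcmp() iters times and return the rate in iterations per msec. */
static double bench_iter_per_msec(const struct bench_case *c, long iters)
{
    /* A volatile function pointer keeps the call from being inlined
       or hoisted out of the loop. */
    int (*volatile cmp)(const char *, const char *) = strcmp;
    clock_t t0, t1;
    double msec;
    long i;

    assert(iters > 0);
    t0 = clock();
    for (i = 0; i < iters; i++)
        bench_sink = cmp(c->a, c->b);
    t1 = clock();

    msec = (double)(t1 - t0) * 1000.0 / CLOCKS_PER_SEC;
    if (msec <= 0.0)
        msec = 1e-6;                     /* avoid dividing by zero on coarse clocks */
    return (double)iters / msec;
}

/* Print one result line in the "name: value iter/msec" layout. */
static void bench_report(const struct bench_case *c, double rate)
{
    printf("%s: %f iter/msec\n", c->name, rate);
}
```

A caller would fill a bench_case, using place_string() with offset 0 or 1 to build the aligned and unaligned variants, pass it to bench_iter_per_msec() with a large iteration count, and hand the result to bench_report(). Higher values mean more strcmp() calls completed per millisecond.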
Old implementation:
long_equal_align: 6469.349090 iter/msec
long_equal_noalign: 6443.890686 iter/msec
long_nonequal_align: 6634.703380 iter/msec
long_nonequal_noalign: 6607.477001 iter/msec
s16b_equal_align: 68656.575542 iter/msec
s16b_equal_noalign: 65802.817835 iter/msec
s16b_nonequal_align: 71774.449811 iter/msec
s16b_nonequal_noalign: 68661.470034 iter/msec
one_char_eq_align: 143535.752778 iter/msec
one_char_eq_noalign: 143552.079654 iter/msec
one_char_neq_align: 175441.342379 iter/msec
one_char_neq_noalign: 175452.605535 iter/msec
two_char_neq_align: 143556.397559 iter/msec
two_char_neq_noalign: 143551.781520 iter/msec
three_char_neq_align: 131585.084254 iter/msec
three_char_neq_noalign: 131591.071711 iter/msec
four_char_neq_align: 112800.083493 iter/msec
four_char_neq_noalign: 105267.924980 iter/msec
New, strictly better implementation:
Versatile libc strcmp() benchmark for 64-bit machines
By: EPIC Linux project, http://www.epic-linux.org
long_equal_align: 2402.730049 iter/msec
long_equal_noalign: 2400.047425 iter/msec
long_nonequal_align: 2414.800975 iter/msec
long_nonequal_noalign: 2411.093062 iter/msec
s16b_equal_align: 37601.686675 iter/msec
s16b_equal_noalign: 36727.871082 iter/msec
s16b_nonequal_align: 40494.429571 iter/msec
s16b_nonequal_noalign: 39482.712309 iter/msec
one_char_eq_align: 75204.973133 iter/msec
one_char_eq_noalign: 71785.018296 iter/msec
one_char_neq_align: 157919.059369 iter/msec
one_char_neq_noalign: 157914.034311 iter/msec
two_char_neq_align: 75204.798299 iter/msec
two_char_neq_noalign: 71785.702118 iter/msec
three_char_neq_align: 65802.101692 iter/msec
three_char_neq_noalign: 63171.621402 iter/msec
four_char_neq_align: 58492.229262 iter/msec
four_char_neq_noalign: 56402.894159 iter/msec