HR (and TI guys)-
sprabf2 doesn't appear to have any examples that aren't multiply-specific.
One way to narrow my question would be to ask:
-using 16-bit operands (i.e. following the dictum
of using the smallest applicable data type in order
to enable SIMD instructions)
-assuming no loops
what is the maximum number of sets of:
add, shift, and xor
that can be performed in one clock cycle? As one example, if we are using C code:
uint32 x[16];
x[i1] ^= (x[i2] + x[i3]) << s1;
x[i4] ^= (x[i5] + x[i6]) << s2;
:
:
where iN are pre-determined indexes, sN are shift constants, and the left-shift is a rotating shift (no bits lost), would the optimizing compiler still be able to use SIMD instructions?
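To make the intent concrete, here is a minimal sketch of one such set in portable C. Since C's << is not a rotate, the rotation is spelled out explicitly. The names rotl32, mix_step and mix_selfcheck are made up for illustration, uint32_t from stdint.h stands in for uint32, and the values in the self-check are arbitrary. Whether the compiler maps this onto a rotate or SIMD instruction is exactly the open question, so treat this only as a statement of the operation, not as tuned code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value left by s bits (no bits lost).
   Masking keeps both shift counts in 0..31, so s == 0 is well defined. */
static inline uint32_t rotl32(uint32_t v, unsigned s)
{
    s &= 31u;
    return (v << s) | (v >> ((32u - s) & 31u));
}

/* Hypothetical helper: one add / rotate / xor set,
   i.e. x[i1] ^= rotl(x[i2] + x[i3], s). */
static inline void mix_step(uint32_t x[16], unsigned i1, unsigned i2,
                            unsigned i3, unsigned s)
{
    x[i1] ^= rotl32(x[i2] + x[i3], s);
}

/* Small self-check with made-up values. */
static void mix_selfcheck(void)
{
    uint32_t x[16] = {0};
    x[1] = 0x80000000u;
    x[2] = 0x00000001u;
    mix_step(x, 0, 1, 2, 1);   /* sum = 0x80000001, rotated left by 1 = 3 */
    assert(x[0] == 0x00000003u);
}
```

Since the indexes and shift counts are known at compile time, each line of the original example would become one mix_step call with constant arguments.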
If there are any app notes or benchmarks for this kind of crypto-type code, please let us know. Thanks.
-Jeff