Hi,
DSPLIB ships example code that runs DSP_fir_gen under \dsplib_c66x_3_1_0_0\packages\ti\dsplib\src\DSP_fir_gen\c66. I modified its parameters to run the filter with NH=32 and NR=4096 and measured 30276 cycles, which is close to what the formula indicates.
Could you start your experiment from that example code? That should make it easier to identify whether anything is wrong with your own code.
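If it helps to check your output data as well as the cycle count, a plain C reference of the same computation can be compared against what your build produces. The sketch below is not the DSPLIB source: the name fir_gen_ref, the Q15 scaling (right shift by 15), the 64-bit accumulator, and the coefficient ordering are all assumptions for illustration, so please check them against the DSPLIB documentation before relying on a bit-exact match.

```c
#include <assert.h>
#include <stdint.h>

/* Portable reference FIR (hypothetical helper, not the DSPLIB source).
 * x : nr + nh - 1 input samples
 * h : nh coefficients (ordering here is an assumption)
 * r : nr output samples
 * The >> 15 scaling assumes Q15 data; no saturation is applied. */
void fir_gen_ref(const int16_t *x, const int16_t *h, int16_t *r,
                 int nh, int nr)
{
    assert(nh > 0 && nr > 0);

    for (int j = 0; j < nr; j++) {
        /* 64-bit accumulator so the sum of nh products cannot overflow. */
        int64_t sum = 0;
        for (int i = 0; i < nh; i++) {
            sum += (int32_t)x[i + j] * h[i];
        }
        /* Right shift of a negative value is arithmetic on common
         * compilers, but the C standard leaves it implementation-defined. */
        r[j] = (int16_t)(sum >> 15);
    }
}
```

For the case above you would call it with nh = 32 and nr = 4096 on the same input and coefficient arrays, then compare r element by element with the output of your own code.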
Xiaohui