Technology: Dynamic Range
The FFT architecture includes a unique combination of block floating point (BFP) and floating point (FP) features that provide a much higher dynamic range than other fixed-point FFT circuits with the same input word length. Typically, circuits show ~6 dB/bit of dynamic range, where dynamic range is a measure of the difference between the smallest and largest detectable signals. For example, Fig. 1 below shows the outputs of two 256-point FFT circuits along with exact Matlab reference values. Both circuits receive the same 16-bit "single-tone," real input block and produce 16-bit outputs plus an exponent. As can be seen from Fig. 1, the Centar version provides ~96 dB of dynamic range between the two large coefficient outputs and the other outputs, whereas the Altera version produces round-off noise that obscures the correct smaller outputs and therefore provides ~24 dB less dynamic range than its Centar equivalent.
Fig. 1. Outputs of Centar (left) and Altera (right) 256-point fixed-size FFTs with 16-bit fixed-point "single-tone," real input data. Matlab computed reference output is also provided.
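The ~6 dB/bit rule of thumb quoted above follows from the fact that each additional bit doubles the ratio between the largest and smallest representable amplitudes. As a rough check (w is a hypothetical symbol for the word length in bits, and the sign bit and rounding details are ignored), the arithmetic works out as:

```latex
\mathrm{DR}(w) \approx 20\log_{10}\!\left(2^{w}\right)
             = w \cdot 20\log_{10} 2
             \approx 6.02\,w \ \text{dB},
\qquad
\mathrm{DR}(16) \approx 96.3\ \text{dB},
\qquad
\mathrm{DR}(4) \approx 24.1\ \text{dB}
```

Under this simplification, a 16-bit word corresponds to roughly 96 dB and a 4-bit difference to roughly 24 dB, which matches the figures cited for Fig. 1.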
Most FFT circuits use fixed-point processing with either automatic scaling during the course of the computation or no scaling. In the former case dynamic range is sacrificed because scaling is applied even when it is not necessary. In the latter case the word size can grow by as much as a factor of two, which increases usage of logic and memory resources, reduces the maximum clock rate, and burdens all post-FFT processing with unnecessarily increased word lengths. A pure BFP approach is better because scaling is performed only when necessary, but it requires more logic and memory resources than fixed-point with automatic scaling and is limited in effectiveness by the restriction of a single exponent per FFT block. By contrast, our automatic FP/BFP approach typically provides 4 bits (~24 dB) of extra dynamic range compared with other BFP-based schemes, and more compared with traditional fixed scaling per stage.
Because all the important processing occurs along horizontal PE rows, it is possible to add local circuitry to each row so that during column DFT processing (Step 1 of the two-step processing) all intermediate results are normalized to the same exponent using shifter circuitry in the multiplier PEs. Therefore, each of the Nr rows of the N = Nr x Nc DFT matrix has its own BFP region. During the final row DFT computation (Step 2), all results are computed using floating point without any further normalization, so that each output sample has its own exponent. For applications that require fixed-point inputs, the output samples can be converted on-the-fly to pure BFP (a single exponent for an entire FFT output data block).

Fig. 2. Illustration of multiple BFP regions in an array (b=4, N=1024, Nr=32, Nc=32).
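To make the benefit of one BFP region per row more concrete, the following is a minimal software sketch, not a model of the Centar circuit. It assumes real-valued data, a 16-bit signed mantissa, and round-to-nearest; the helper names bfp_quantize and bfp_restore and the sample values are hypothetical. It compares a single shared exponent for the whole array with a separate exponent for each row.

```python
import numpy as np

def bfp_quantize(block, mantissa_bits=16):
    """Quantize a real block to integer mantissas that share one exponent."""
    block = np.asarray(block, dtype=float)
    peak = np.max(np.abs(block))
    if peak == 0.0:
        return np.zeros(block.shape, dtype=np.int64), 0
    limit = 2 ** (mantissa_bits - 1) - 1
    # Smallest exponent for which the largest value still fits the mantissa.
    exponent = int(np.ceil(np.log2(peak / limit)))
    mantissas = np.clip(np.round(block / 2.0 ** exponent), -limit, limit)
    return mantissas.astype(np.int64), exponent

def bfp_restore(mantissas, exponent):
    """Convert mantissas and their shared exponent back to real values."""
    return mantissas * 2.0 ** exponent

# Made-up 2 x 2 array: one row of large values, one row of small values.
data = np.array([[30000.0, 100.0],
                 [0.5, 0.25]])

# One BFP region for the whole array: the large row sets the exponent.
block_m, block_e = bfp_quantize(data)
single_region = bfp_restore(block_m, block_e)

# One BFP region per row: each row picks its own exponent.
row_results = [bfp_quantize(row) for row in data]
row_exponents = [e for _, e in row_results]
per_row = np.array([bfp_restore(m, e) for m, e in row_results])

print("single exponent:", block_e)
print("single-region reconstruction:\n", single_region)
print("per-row exponents:", row_exponents)
print("per-row reconstruction:\n", per_row)
```

In this toy case the shared exponent is dictated by the large row, so the small row rounds away entirely, while per-row exponents preserve it. This is the kind of loss that multiple BFP regions are intended to limit.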
As a measure of dynamic range capabilities, the DFT was computed using “single tone” input data (full-range real sinusoids with random frequency and phase). This generates a complex conjugate output pair along with very small residual values (Fig. 1). The difference in magnitude between the main output and the residual values is a measure of the circuit's potential dynamic range. The dynamic range is computed here in two ways to better characterize the circuit. The first divides the signal power of the single complex conjugate output by the sum of the roundoff noise output powers:
$$\mathrm{DR1} = 10 \log_{10}\left( \frac{z_s^2}{\sum_n \left( z(n) - z_{\mathrm{ref}}(n) \right)^2} \right)$$
where zs is the complex conjugate output power, and z(n) and zref(n) are the other N circuit and reference values. Here DR1 measures the signal output with respect to the total roundoff noise. The second measure of dynamic range is the ratio of the power associated with the complex conjugate outputs to the power of the maximum noise value, or
$$\mathrm{DR2} = 10 \log_{10}\left( \frac{z_s^2}{\max_n \left( z_{\mathrm{ref}}(n) \right)^2} \right)$$
This measure is useful because it can distinguish large "spikes" in the N-2 roundoff noise values, which could obscure a similarly sized small "real" signal. The results presented below in Table 1 are based on 16-bit fixed-point input data.
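The two measures can be sketched in a few lines of NumPy. This is an illustrative reading of the definitions above rather than the code used to produce Table 1. It assumes that zs squared is the summed power of the two conjugate outputs, that the noise sum in DR1 runs over the remaining bins, and that DR2 uses the largest reference power outside the signal bins, as the DR2 expression is printed. The function name dynamic_range_db and the toy numbers are hypothetical.

```python
import numpy as np

def dynamic_range_db(z, z_ref, signal_bins):
    """Return (DR1, DR2) in dB for one FFT output block.

    z           : circuit FFT outputs, length N
    z_ref       : reference FFT outputs, length N
    signal_bins : indices of the complex conjugate pair holding the tone
    """
    z = np.asarray(z, dtype=complex)
    z_ref = np.asarray(z_ref, dtype=complex)

    is_signal = np.zeros(z.size, dtype=bool)
    is_signal[list(signal_bins)] = True

    # Assumption: z_s^2 is the summed power of the two conjugate outputs.
    signal_power = np.sum(np.abs(z[is_signal]) ** 2)

    # DR1: signal power over the total round-off error power in the other bins.
    error_power = np.abs(z[~is_signal] - z_ref[~is_signal]) ** 2
    dr1 = 10.0 * np.log10(signal_power / np.sum(error_power))

    # DR2: signal power over the largest reference power outside the
    # signal bins, following the DR2 expression as printed.
    peak_power = np.max(np.abs(z_ref[~is_signal]) ** 2)
    dr2 = 10.0 * np.log10(signal_power / peak_power)
    return dr1, dr2

# Toy 8-point block (made-up numbers, not measured data).
z_ref = np.zeros(8, dtype=complex)
z_ref[[1, 7]] = 100.0          # the complex conjugate pair
z_ref[2] = 0.1                 # a small residual in the reference
z = z_ref.copy()
z[[0, 2, 3, 4, 5, 6]] += 0.01  # pretend round-off error in the other bins

dr1, dr2 = dynamic_range_db(z, z_ref, signal_bins=(1, 7))
print(f"DR1 = {dr1:.1f} dB, DR2 = {dr2:.1f} dB")
```

The sketch does not guard against a zero denominator, so an error-free block or an all-zero reference residual would need separate handling.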
| Transform Size | DR1 mean (dB) | DR1 max (dB) | DR1 min (dB) | DR1 std dev (dB) | DR2 mean (dB) | DR2 max (dB) | DR2 min (dB) | DR2 std dev (dB) |
|---|---|---|---|---|---|---|---|---|
| 128 | 97.0 | 111.9 | 92.6 | 2.04 | 102.7 | 117.4 | 93.1 | 2.71 |
| 256 | 96.8 | 129.9 | 91.8 | 2.62 | 104.5 | 111.4 | 94.5 | 2.27 |
| 512 | 96.3 | 102.3 | 89.7 | 2.48 | 103.4 | 114.6 | 95.8 | 4.24 |
| 1024 | 94.1 | 118.3 | 89.8 | 2.11 | 104.4 | 111.4 | 96.3 | 2.20 |
| 2048 | 95.0 | 105.7 | 89.4 | 1.63 | 105.1 | 111.4 | 98.7 | 2.00 |
Table 1. Dynamic range DR1 and DR2 comparisons for different FFT sizes with 16-bit input data. (Results obtained as averages over >1000 different random frequency and phase input data sets.)
Generally, the DR1 results show that the total round-off noise power is about 6 dB/bit below the maximum signal outputs, and DR2 shows that the round-off noise "floor" is typically >6 dB/bit (>100 dB for 16-bit fixed-point input data).
