
Fast Fourier Transform

3GPP LTE SC-FDMA

The LTE protocol proposes an uplink that departs significantly from the LTE downlink (and the WiMax uplink/downlink) in that it is single carrier (3GPP TS 36.211 v8.1.0, 2007-11). In this case a DFT precedes the IFFT; its size is determined by the number of resource blocks (6 ≤ #RB ≤ 110) and is restricted to values whose only prime factors are 2, 3, and 5.
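The size restriction can be illustrated with a short Python sketch that enumerates the candidate DFT sizes. It assumes 12 subcarriers per resource block and an allocation of 1 to 100 resource blocks; the upper limit is an assumption chosen so that the largest size is 1200 points, as in the cycle-count table on this page. The function names are hypothetical.

```python
SUBCARRIERS_PER_RB = 12   # width of one LTE resource block, in subcarriers
MAX_ALLOCATED_RB = 100    # assumption: gives a largest DFT size of 1200 points


def is_235_smooth(n):
    """Return True if n has no prime factors other than 2, 3, and 5."""
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def sc_fdma_dft_sizes(max_rb=MAX_ALLOCATED_RB):
    """List the DFT sizes 12 * RB for every allowed resource-block count."""
    return [SUBCARRIERS_PER_RB * rb
            for rb in range(1, max_rb + 1)
            if is_235_smooth(rb)]


sizes = sc_fdma_dft_sizes()
print(len(sizes), sizes[:5], sizes[-1])
```

Under these assumptions the sketch yields 34 sizes, from 12 to 1200 points, which is the same set of sizes listed in the cycle-count table.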

To implement the single carrier frequency division multiple access (SC-FDMA) DFTs, the same architecture as for the FFTs can be used because it is programmable. For example, to do the "scalable" FFTs you just write a different program for each FFT size, rather than taking the usual approach of picking an FFT size off a particular hardware pipeline stage. The SC-FDMA DFTs can be programmed in much the same way.

The programs for the SC-FDMA DFTs would all use the well-known "row/column factorization" method for computing the transform. Here, the transform size is N = Nr Nc (Nr rows by Nc columns), so Nc column DFTs of Nr points each are done, then an N-point twiddle multiplication, and finally Nr row DFTs of Nc points each. The column and row DFTs make use of the "base-2, -3, -4, -5, or -6" mathematical form of the DFT (that is, they use radix-2 through radix-6 butterflies). The base-b approach requires the column or row DFT sizes to be multiples of b².
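The three steps of the row/column factorization can be sketched in plain Python as follows. This is only a numerical illustration, not the architecture's program: the index mapping (input index n = Nc*r + c, output index k = kr + Nr*kc) is one common convention and is an assumption here, the short column and row DFTs are computed directly rather than with base-b butterflies, and the function names are hypothetical.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT, standing in for the short column/row transforms."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def dft_row_column(x, nr, nc):
    """N-point DFT (N = nr * nc) by row/column factorization."""
    n = nr * nc
    assert len(x) == n
    # Step 1: nc column DFTs of nr points each.
    # Column c holds x[c], x[c + nc], x[c + 2*nc], ...
    cols = [dft(x[c::nc]) for c in range(nc)]          # cols[c][kr]
    # Step 2: N-point twiddle multiplication by exp(-2*pi*i * c * kr / N).
    for c in range(nc):
        for kr in range(nr):
            cols[c][kr] *= cmath.exp(-2j * cmath.pi * c * kr / n)
    # Step 3: nr row DFTs of nc points each; row kr gives outputs kr + nr*kc.
    out = [0j] * n
    for kr in range(nr):
        row = dft([cols[c][kr] for c in range(nc)])
        for kc in range(nc):
            out[kr + nr * kc] = row[kc]
    return out


# Compare against the direct DFT for a 60-point transform split as 12 x 5.
x = [complex(i % 7, -(i % 5)) for i in range(60)]
max_error = max(abs(a - b) for a, b in zip(dft_row_column(x, 12, 5), dft(x)))
print(max_error)
```

The printed maximum error should be at the level of floating-point round-off, which confirms that the column DFTs, twiddle multiplication, and row DFTs together reproduce the full N-point transform.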

The only programming difference between the FFTs for OFDM and the SC-FDMA DFTs is that the FFTs use base-4 processing for both column and row DFTs, whereas the SC-FDMA DFTs use the base-2 through base-6 forms.

Throughput cycle counts (cycles per DFT) for the SC-FDMA DFTs are shown below for a circuit consisting of approximately 2000 4-input LUT/FF pairs, 13 complex multipliers, and 13 memory blocks. Speeds of well over 400 MHz have already been demonstrated. (The cycle counts for the SC-FDMA DFTs come from a high-level simulation, so they are estimates.) The same circuit can do all the FFTs as well.

 

DFT Size N  Cycle Count | DFT Size N  Cycle Count | DFT Size N  Cycle Count
1200        4805        |                         | 192         469
1152        3461        | 540         1265        | 180         365
1080        2345        | 480         1045        | 144         373
972         3893        | 432         1381        | 120         270
960         3045        | 384         853         | 108         221
900         1805        | 360         905         | 96          197
864         1913        | 324         833         | 72          221
768         2469        | 300         730         | 60          145
720         3125        | 288         661         | 48          101
648         1481        | 240         565         | 36          36
600         1330        | 216         482         | 24          29
576         1157        |                         | 12          17

 

Note that the signal-to-quantization-noise ratio (SQNR) of our architecture is much higher for a given word length than that of other fixed-point and block-floating-point designs. Our 85-90 dB SQNR for 16 bits (see "Dynamic Range" tab) is higher than what LTE needs, so a smaller word length might be used, in which case all the resource/power numbers above would scale down and the clock speed would go up.

From the table it can be seen that the worst-case DFT (1200 points) is 3730 cycles (the largest FFT, 2048 points, is 8357 cycles). At 426 MHz, these correspond to approximately 8.7 and 19.6 µs respectively, or 28.3 µs in total.
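The conversion from cycle counts to time is simply cycles divided by clock frequency. A minimal Python sketch, using the cycle counts and the 426 MHz clock quoted in the paragraph above (the function and variable names are hypothetical):

```python
def cycles_to_microseconds(cycles, clock_hz):
    """Convert a cycle count to microseconds at the given clock frequency."""
    return cycles / clock_hz * 1e6


CLOCK_HZ = 426e6  # clock frequency quoted in the text

dft_us = cycles_to_microseconds(3730, CLOCK_HZ)  # 1200-point DFT figure from the text
fft_us = cycles_to_microseconds(8357, CLOCK_HZ)  # 2048-point FFT figure from the text
total_us = dft_us + fft_us

print(f"DFT {dft_us:.2f} us, FFT {fft_us:.2f} us, total {total_us:.2f} us")
```

The unrounded results are slightly above the one-decimal figures given in the text, which appear to be truncated rather than rounded.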