Middle-East Journal of Scientific Research 24 (S2): 92-97, 2016

ISSN 1990-9233

© IDOSI Publications, 2016

DOI: 10.5829/idosi.mejsr.2016.24.S2.123

# An Efficient VLSI Architecture for Pulse Shaping FIR Interpolation Filter of Multistandard DUC

M. Santhi and R. Vaishanavi

Department of ECE, Saranathan College of Engineering, Trichy, India

**Abstract:** This work proposes a two-step optimization technique for implementing the root raised cosine (RRC) pulse shaping FIR interpolation filter of a multistandard digital up converter (DUC). In the first step, the multiplications per input sample and additions per input sample are reduced. In the second step, the VHBCSE algorithm is applied to design an efficient constant multiplier, which is the basic element of an FIR filter. This algorithm optimizes the multiplier adder tree (MAT). Carry skip addition is proposed in the accumulation unit to improve speed. The result is a compact design with less area and lower power consumption.

**Key words:** Finite Impulse Response (FIR) interpolation filter • Digital Up converter (DUC) • Vertical Horizontal Binary Common Sub-expression Elimination Algorithm (VHBCSE) • Root Raised Cosine (RRC) filter

#### INTRODUCTION

**Square Root Raised Cosine for Pulse Shaping:** Quantization of a signal converts an analog signal to a digital signal. Due to this sudden change, high frequency spectral components appear in the frequency domain of the signal. The signal is then transmitted in a limited bandwidth. This results in interference between adjacent symbols, known as Intersymbol Interference (ISI). A Root Raised Cosine (RRC) filter minimizes this ISI and can be used as both the transmit and the receive filter. In a raised cosine filter, the nonzero portion of the frequency spectrum is a cosine function that is raised above the frequency axis. The combined response of the transmit and receive filters is known as the raised cosine filter. ISI is minimized when the combined response of the transmit filter, the channel and the receive filter satisfies the Nyquist ISI criterion, which the raised cosine filter does. Half of the filtering is done on the transmit side and half on the receive side. On the transmitter side alone, the square root of the raised cosine transfer function is therefore used.
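For reference, the relationship between the raised cosine and root raised cosine responses can be written out using the standard textbook form. The symbols $T$ (symbol period) and $\beta$ (roll-off factor) are notation assumed here and are not defined in the paper.

$$
H_{RC}(f) =
\begin{cases}
T, & |f| \le \dfrac{1-\beta}{2T} \\[2ex]
\dfrac{T}{2}\left[1 + \cos\!\left(\dfrac{\pi T}{\beta}\left(|f| - \dfrac{1-\beta}{2T}\right)\right)\right], & \dfrac{1-\beta}{2T} < |f| \le \dfrac{1+\beta}{2T} \\[2ex]
0, & \text{otherwise}
\end{cases}
\qquad
H_{RRC}(f) = \sqrt{H_{RC}(f)}
$$

With one RRC filter at the transmitter and one at the receiver, the cascade is $H_{RRC}(f)\,H_{RRC}(f) = H_{RC}(f)$, which is the response that satisfies the Nyquist ISI criterion.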

**Interpolation:** Interpolation, also known as upsampling, is needed to change from one sampling rate to a higher one. Zero-valued samples are inserted between the original samples to increase the sampling rate, and the distortions that arise from upsampling, which are centered on multiples of the original sampling rate, are then removed by filtering. The primary reason for interpolation is to increase the sampling rate at the output of one system so that another system can operate on the signal at a higher sampling rate. For example, in speech processing systems, speech parameters are computed at lower sampling rates for low bit-rate storage or processing. To construct a synthetic speech signal, the speech parameters must be processed at a higher sampling rate, and digital interpolation is used for this purpose. The interpolation factor, denoted by L, is defined as the ratio of the output rate to the input rate.
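The two stages of interpolation described above, zero insertion followed by filtering, can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, and the four-tap hold filter stands in for the RRC filter, whose coefficients the paper does not list.

```python
def upsample(samples, factor):
    """Insert factor-1 zero-valued samples after every input sample."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0] * (factor - 1))
    return out


def fir_filter(samples, taps):
    """Direct-form FIR convolution, truncated to len(samples) outputs."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def interpolate(samples, factor, taps):
    """Zero-stuff by `factor`, then filter away the spectral images."""
    return fir_filter(upsample(samples, factor), taps)


L = 4                      # interpolation factor (output rate / input rate)
x = [1, 2, 3]
stuffed = upsample(x, L)   # [1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0]
y = interpolate(x, L, [1] * L)  # placeholder hold filter, not an RRC filter
```

The output has L times as many samples as the input, which is the rate change the interpolation factor describes.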

**Finite Impulse Response (FIR) Filters:** An FIR filter is a digital filter whose impulse response is of finite duration: its response to any finite-length input settles to zero in finite time. An FIR filter requires no feedback and hence is also known as a non-recursive filter. From [1], it can be seen that computations, and hence area and power, can be reduced by exploiting the symmetry of the FIR filter coefficients. A CSE algorithm using the binary representation of coefficients was proposed in [2] for higher order filters.

The number of unpaired bits is smaller for the binary representation of coefficients than for the CSD representation. In [3], two types of reconfigurable architectures are proposed for multistandard wireless communication, targeting reconfigurability and low complexity.
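The saving from coefficient symmetry mentioned above can be illustrated with a small sketch. It assumes a linear-phase filter whose taps satisfy h[k] = h[N-1-k]; the function names and the example taps are hypothetical and chosen only for illustration.

```python
def fir_direct(window, taps):
    """One output sample; window[k] holds x[n-k]. Uses len(taps) multiplications."""
    return sum(h * x for h, x in zip(taps, window))


def fir_folded(window, taps):
    """Same output for symmetric taps, using about half the multiplications.

    Samples that share a coefficient are added first, then multiplied once.
    """
    n = len(taps)
    acc = 0
    for k in range(n // 2):
        acc += taps[k] * (window[k] + window[n - 1 - k])
    if n % 2:  # odd length: the centre tap has no mirror partner
        acc += taps[n // 2] * window[n // 2]
    return acc


taps = [1, -2, 5, -2, 1]          # symmetric placeholder taps
window = [3, 1, 4, 1, 5]
assert fir_direct(window, taps) == fir_folded(window, taps)
```

For an N-tap symmetric filter the folded form needs roughly N/2 multiplications per output instead of N, at the cost of one pre-adder per coefficient pair.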

**Issues in Designing a Reconfigurable Pulse Shaping FIR Filter for Multistandard DUC:** Three standards are considered for the multistandard DUC: Universal Mobile Telecommunications System (UMTS), wideband code division multiple access (WCDMA) and digital video broadcasting (DVB).

The challenges to be considered are:

- Consider an N-tap RRC filter with interpolation factor L. Implementing this filter requires N/L multipliers and adders to perform the multiplications and additions of the convolution. The number of multipliers and adders increases linearly with the number of filters and the number of parameters considered in the filter design.
- The Binary Common Subexpression Elimination (BCSE) algorithm is used to design an efficient constant multiplier. A coefficient of n-bit word length can form 2<sup>n</sup>-(n+1) binary common subexpressions (BCSs). Efficient hardware utilization depends on choosing a proper BCS length.
- Logical Depth (LD) is the number of addition operations in the critical path. The BCS length determines the LD, and a decrease in LD increases the maximum operating frequency of the filter.
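The BCS count quoted in the list above can be checked by enumeration. The sketch below assumes that a BCS is any n-bit pattern with at least two nonzero bits, which is the reading that reproduces the 2<sup>n</sup>-(n+1) figure; the function name is hypothetical.

```python
def count_bcs(n):
    """Count n-bit patterns with two or more nonzero bits.

    Assumption: these are the patterns that can act as binary common
    subexpressions. The all-zero pattern and the n single-bit patterns
    are excluded because they need no adder.
    """
    return sum(1 for p in range(2 ** n) if bin(p).count("1") >= 2)


for n in range(2, 9):
    assert count_bcs(n) == 2 ** n - (n + 1)
```

The count grows exponentially with n, which is why the choice of BCS length matters: longer BCSs offer more sharing opportunities but require more hardware to generate and select them.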

**Proposed Reconfigurable RRC Filter Architecture:** Clock signals are used to sample the input and the output. The master clock (CLK), which operates at the higher rate, samples the output (RRCOUT). Three other clock signals, CLK divided by four (CLK4), by six (CLK6) and by eight (CLK8), sample the input (RRCIN) for the corresponding interpolation factors. The proposed architecture consists of a data generator (DG) block, a coefficient generator (CG) block, a coefficient selector (CS) block and a final accumulation unit (FA). These blocks are discussed as follows:

**DG Block:** The input data (RRCIN) is sampled according to the selected value of the interpolation factor selection parameter (INTP\_SEL). The input data of a particular standard is sampled using the corresponding clock signal.

**CG Block:** The sampled inputs from the data generator are multiplied with the coefficients of the RRC filter in this block. Two coding passes are introduced to select the reconfigurable FIR filter coefficients required for the chosen standard. The selection lines used are FLT\_SEL and INTP\_SEL. These two coding passes help reduce hardware usage. The partial product generator (PPG) unit, the multiplexer unit and the final addition unit are implemented using the VHBCSE algorithm. The steps involved in the CG block are shown in Fig. 1. The layered addition is due to the binary adder tree structure.

In [3], the Constant Shifts Method (CSM) is proposed for fixed coefficients and the Programmable Shifts Method (PSM) for reconfigurable coefficients. CSM is used for application specific filters whereas PSM is used for reconfigurable filters. According to [4], a reconfigurable digital root raised cosine (RRC) filter for a UMTS terrestrial receiver is designed by taking advantage of the symmetry of the linear phase FIR filter. An appropriate filter length is calculated based on the in-band and out-of-band received signal powers, and this length must satisfy the bit energy to interference ratio requirement. A reconfigurable distributed-arithmetic (DA) finite impulse response (FIR) filter targeted for a single-chain multi-mode radio transceiver is designed in [5]. Memory optimization in the LUT is obtained by partitioning and offset binary coding (OBC).

**First Coding Pass (FCP):** Two coefficient sets of RRC filters of the same length, differing in roll-off factor, are fed to a 2:1 multiplexer. The selection line (FLT\_SEL) selects the desired filter according to the roll-off factor. Inside one FCP block, three coding passes are implemented in parallel for the three interpolation factors. Vertical BCSE is applied across all bits of the two coefficients of the same filter length.

**Second Coding Pass (SCP):** The coded coefficients from the FCP are fed to another set of multiplexers. The selection line (INTP\_SEL) selects the desired filter according to the interpolation factor. The combination of FCP and SCP reduces the number of multiplications per input sample (MPIS) and additions per input sample (APIS). The vertical BCSs are obtained between the coefficient sets.
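The two-level selection performed by the coding passes can be sketched as nested multiplexers. This models only the selection by FLT_SEL and INTP_SEL, not the vertical BCSE sharing between coefficient sets. The coefficient words, the two-tap set length and the encoding of INTP_SEL are all placeholders assumed for illustration; the paper gives none of them.

```python
# Hypothetical placeholder coefficient words; NOT the paper's filter taps.
# COEFF_BANK[interpolation factor] = (taps for roll-off A, taps for roll-off B)
COEFF_BANK = {
    4: ((0x0123, 0x0456), (0x0120, 0x0460)),
    6: ((0x0234, 0x0567), (0x0230, 0x0570)),
    8: ((0x0345, 0x0678), (0x0340, 0x0680)),
}

# Assumed encoding of INTP_SEL; the paper does not give the bit assignment.
INTP_FACTORS = (4, 6, 8)


def first_coding_pass(flt_sel):
    """One 2:1 mux per interpolation factor, evaluated for all three factors."""
    return {factor: sets[flt_sel] for factor, sets in COEFF_BANK.items()}


def second_coding_pass(fcp_out, intp_sel):
    """Pick the coefficient set for the requested interpolation factor."""
    return fcp_out[INTP_FACTORS[intp_sel]]


def select_coefficients(flt_sel, intp_sel):
    """FCP followed by SCP: roll-off choice first, interpolation factor second."""
    return second_coding_pass(first_coding_pass(flt_sel), intp_sel)
```

Calling `select_coefficients(1, 2)` in this sketch returns the roll-off B set for interpolation factor 8.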

**Sign Conversion Block:** The sign conversion block supports the signed decimal data representation for both the input and the coefficients. The architecture of the sign conversion block is shown in Fig. 2.



Fig. 1: Flow diagram of CG block.

The operation of the blocks in the CG block is as follows:



Fig. 2: Hardware architecture of the Sign Conversion Block.

A 1's complementer circuit generates the inverted version of the 16-bit coefficient (excluding the MSB). A 16-bit 2:1 multiplexer then selects between the original and the inverted coefficient according to the most significant bit (MSB) of the coefficient. For a negative coefficient the multiplexed coefficient is the inverted form; otherwise it is passed through unchanged.
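The complementer and multiplexer just described can be modelled as below. The sketch assumes a 17-bit coefficient word made of one sign bit (the MSB) followed by 16 data bits; that word layout and the function name are assumptions, since the paper only states that 16 bits remain once the MSB is excluded.

```python
def sign_convert(coeff):
    """Return (sign, Hm) for a 17-bit word: 1 sign bit + 16 data bits (assumed layout)."""
    sign = (coeff >> 16) & 1          # MSB drives the 2:1 multiplexer
    low = coeff & 0xFFFF              # the 16 bits excluding the MSB
    inverted = ~low & 0xFFFF          # output of the 1's complementer
    hm = inverted if sign else low    # 16-bit 2:1 mux
    return sign, hm
```

A positive word passes through unchanged, while a word with the MSB set comes out bit-inverted, for example `0x11234` gives `(1, 0xEDCB)`.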

**Partial Product Generator (PPG) Unit:** A shift and add technique is used to generate the partial products, which are then summed by the subsequent addition layers. The number of partial products depends on the chosen size of the binary common subexpression (BCS). In layer-1, 2-bit BCSs are considered; an extra adder is needed only for the pattern '11', whereas the rest are generated by hardwired shifting. For a 16-bit coefficient, eight partial products (P8-P1) are obtained by right shifting the first partial product P8. This reduces the size of the multiplexer that follows, which selects the proper partial product depending on the coefficient's binary value. The architecture of the block is shown in Fig. 3.
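The 2-bit BCS idea can be illustrated with an integer model. Note the assumptions: the sketch uses left shifts on unsigned integers so that the result is exact, whereas the paper describes right shifts of a fixed-point product, and the function name is hypothetical.

```python
def partial_products(x, coeff, width=16):
    """Split coeff into 2-bit groups and pick one partial product per group.

    Only the pattern '11' needs an adder (3x = 2x + x); '01' and '10'
    are plain shifts and '00' is zero.
    """
    x2 = x << 1                      # pattern '10': hardwired shift
    x3 = x2 + x                      # pattern '11': the single extra adder
    candidates = (0, x, x2, x3)      # indexed by the 2-bit pattern (4:1 selection)
    pps = []
    for shift in range(width - 2, -1, -2):   # most significant group first
        pattern = (coeff >> shift) & 0b11
        pps.append(candidates[pattern] << shift)
    return pps


x, h = 7, 0xB3C5
assert len(partial_products(x, h)) == 8      # eight PPs for a 16-bit coefficient
assert sum(partial_products(x, h)) == x * h  # summing them gives the product
```

Because every group chooses from the same four candidates, the one adder that forms 3x is shared by all eight groups.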

**Control Logic Generator (CL):** The CL block takes the multiplexed coefficient (Hm[15:0]) as input and divides it into four groups of 4 bits each (Hm[15:12], Hm[11:8], Hm[7:4] and Hm[3:0]) and two groups of 8 bits each (Hm[15:8], Hm[7:0]). The CL generator produces seven control signals, one for each of seven equality checks. The architecture of the block is shown in Fig. 4.



Fig. 3: Block diagram of the Partial Product Generator Unit



Fig. 5: Architectural details of the controlled addition at layer-2 block

The control signal for 8-bit equality check is produced using the control signals generated from the 4-bit equality check.
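A possible reading of the seven equality checks is sketched below. The mapping of C1-C6 to specific pairs of 4-bit groups is an assumption (four groups give exactly six pairs), as is the derivation of C7 from two of those signals; the paper states only that the 8-bit check reuses the 4-bit results.

```python
def control_signals(hm):
    """Return [C1, ..., C7] for a 16-bit multiplexed coefficient Hm.

    Assumed mapping: C1..C6 compare every pair of 4-bit groups,
    C7 is the 8-bit check Hm[15:8] == Hm[7:0].
    """
    # g[0]=Hm[15:12], g[1]=Hm[11:8], g[2]=Hm[7:4], g[3]=Hm[3:0]
    g = [(hm >> s) & 0xF for s in (12, 8, 4, 0)]
    pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    c = [int(g[i] == g[j]) for i, j in pairs]
    # Upper byte equals lower byte exactly when g0 == g2 and g1 == g3,
    # so C7 needs no 8-bit comparator of its own.
    c7 = c[1] & c[4]
    return c + [c7]
```

For example, `control_signals(0xABAB)` yields `[0, 1, 0, 0, 1, 0, 1]` under this mapping: the first and third groups match, the second and fourth match, and therefore the two bytes match.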

**Multiplexer Unit at Layer-1:** The multiplexer unit selects the proper partial product from the PPG unit based on the coefficient's binary value. Eight 4:1 multiplexers are required to produce the partial products in layer-1. The width of each multiplexer depends on the width of the partial products. This reduces the hardware and the power consumption. The mux architecture is shown in Fig. 8.

**Controlled Addition at Layer-2:** The partial products (PPs) generated from the eight groups of 2-bit BCSs are summed in three layers to obtain the final multiplication result. Layer-2 requires four addition operations to add the eight PPs. Instead of adding these PPs directly, VHBCSE's controlled addition operations are performed at layer-2. They are governed by the control signals (C1-C6) generated by the CL block based on 4-bit BCSE. The architecture of this block is shown in Fig. 5.

**Controlled Addition at Layer-3:** In layer-3, the four multiplexed sums (AS1, AS2, AS3 and AS4) generated in layer-2 are summed. The controlled addition A6 is governed by the control signal (C7), which the CL generator block produces based on 8-bit BCSE. The architecture of the block is shown in Fig. 6.

**Final Addition on Layer-4:** The sums (AS5-AS6) from layer-3 are added to produce the final multiplication result between the input and the coefficient. The overall architecture is shown in Fig. 9.

Fig. 4: Block diagram of the control logic generator unit.
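The effect of controlled addition across layers 2 to 4 can be illustrated with a behavioural model that counts how many additions are actually performed. This is a sketch under several assumptions: unsigned integer arithmetic with left shifts, reuse of a sum whenever two 4-bit groups (layer-2) or the two bytes (layer-3) are equal, and a hypothetical function name. It is not a description of the paper's exact datapath.

```python
def vhbcse_multiply(x, coeff):
    """Multiply x by a 16-bit unsigned coeff, reusing sums of equal groups.

    Returns (product, additions_performed). The adder that forms 3x in
    the partial product generator is not counted.
    """
    cand = (0, x, x << 1, (x << 1) + x)          # layer-1 candidates
    groups = [(coeff >> s) & 0xF for s in (12, 8, 4, 0)]
    adds = 0

    # Layer-2: one addition per distinct 4-bit group, reused on a match.
    seen = {}
    as_ = []
    for g in groups:
        if g not in seen:
            seen[g] = (cand[g >> 2] << 2) + cand[g & 0b11]
            adds += 1
        as_.append(seen[g])

    # Layer-3: combine group sums into byte sums; reuse if both bytes match.
    upper = (as_[0] << 4) + as_[1]
    adds += 1
    if groups[0:2] == groups[2:4]:
        lower = upper
    else:
        lower = (as_[2] << 4) + as_[3]
        adds += 1

    # Layer-4: final addition of the two byte sums.
    return (upper << 8) + lower, adds + 1
```

In this model a coefficient with no repeated groups, such as 0x1234, needs seven additions, while 0xABAB needs four because both the 4-bit and the 8-bit patterns repeat.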



Fig. 6: Hardware architecture of the controlled addition at layer-3



Fig. 7: Hardware architecture of CS block



Fig. 8: VHBCSE constant multiplier architecture

**Coefficient Selector (CS):** The CS block supplies the proper data to the final accumulation block on the basis of the interpolation factor parameter. The hardware architecture of the CS block is shown in Fig. 7.

**Final Data Accumulation Unit (FA):** The FA unit provides the final filter response. Its inputs are obtained from the coefficient selector. This unit performs carry skip addition, which improves speed.
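Carry skip addition can be illustrated with a bit-level model. The sketch below assumes a 16-bit adder split into 4-bit blocks; the block size and the function name are assumptions, as the paper does not give the adder's internal organization. `width` must be a multiple of `block`.

```python
def carry_skip_add(a, b, width=16, block=4):
    """Bit-level model of a carry-skip adder; returns (sum, carry_out)."""
    mask = (1 << block) - 1
    carry = 0
    total = 0
    for lo in range(0, width, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        propagate = a_blk ^ b_blk            # per-bit propagate signals
        s = a_blk + b_blk + carry            # ripple-carry inside the block
        total |= (s & mask) << lo
        if propagate == mask:
            # Every bit propagates: the carry-in bypasses the block
            # instead of rippling through it.
            carry_out = carry
        else:
            carry_out = s >> block
        carry = carry_out
    return total, carry


for a, b in [(0xFFFF, 1), (0x0F0F, 0xF0F0), (0xABCD, 0x4321)]:
    assert carry_skip_add(a, b) == ((a + b) & 0xFFFF, (a + b) >> 16)
```

The result is identical to ordinary addition; the speed benefit in hardware comes from the bypass path, which shortens the worst-case carry chain when a block's bits all propagate.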

#### RESULTS AND DISCUSSION

The proposed design has been implemented on the xc3sd1800a-4cs484 field-programmable gate array (FPGA) device using Xilinx ISE 9.2i. The data generator generates seven input samples according to the interpolation factor. The input and output are each 16 bits long. Fig. 9 shows the simulated output of the FCP, where the selection is based on the roll-off factor. Fig. 10 shows the simulated output of the SCP, where the selection is based on the interpolation factor. Fig. 11 shows the simulated output of the coefficient generator block. Fig. 12 shows the simulated output of the RRC filter.



Fig. 9: Simulated output of First Coding Pass block.



Fig. 10: Simulated output of Second Coding Pass block.



Fig. 11: Simulated output of Coefficient Generator Block.



Fig. 12: Simulated output of RRC filter.

#### **CONCLUSION**

An efficient architecture has been designed for the FIR interpolation filter of a multistandard DUC. The two-step optimization technique results in less area and lower power consumption. The VHBCSE multiplier architecture, which incorporates controlled addition using fixed-bit horizontal BCSE, optimizes the adder structure used to add the partial products. The proposed architecture supports not only the signed magnitude number system but also the signed decimal number system.

#### **REFERENCES**

1. Mahesh, R. and A.P. Vinod, 2008. "A new common subexpression elimination algorithm for realizing low-complexity higher order digital filters," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., 27(2): 217-229.
2. Mahesh, R. and A.P. Vinod, 2010. "New reconfigurable architectures for implementing FIR filters with low complexity," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., 29(2): 275-288.
3. Acharya, T. and G.J. Miao, 2004. "Square-root raised cosine symmetric filter for mobile communication," U.S. Patent 20 040 172 3, Sep. 2, 2004.
4. Chandran, J., R. Kaluri, J. Singh, V. Owall and R. Velijanovski, 2004. "Xilinx Virtex II Pro implementation of a reconfigurable UMTS digital channel filter," in Proc. IEEE Workshop Electron. Des. Test and Appl., pp: 77-82.
5. Sheikh, F., M. Miller, B. Richards, D. Markovic and B. Nikolic, 2010. "A 1–190 MSample/s 8–64 tap energy-efficient reconfigurable FIR filter for multimode wireless communication," in Proc. IEEE Symp. VLSI Circuits, pp: 207-208.