# SIE40AM

# **VLSI Design for Digital Signal Processing Applications**

Lars Wanhammar

Departments of Telecommunication and Physical Electronics

The Norwegian University of Science and Technology

Trondheim, Norway

and

Department of Electrical Engineering

Linköping University

Linköping, Sweden

+46 13 281344 larsw@isy.liu.se

Department of Electrical Engineering Linköping University

DSP Integrated Circuits Lars Wanhammar larsw@isy.liu.se http://www.es.isy.liu.se/




### **Textbook**

Lars Wanhammar: *DSP Integrated Circuits*, Academic Press, 1999.

See http://www.es.isy.liu.se/publications/books/DSP\_Integrated\_Circuits/ for corrections and solutions to selected problems.








# Wideband OFDM Radio Systems





# Heterogeneous Software/Hardware Architectures

- µProcessors
- Control Processors
- DSP Processors
- Algorithm-Specific Processors
- Fixed-Function Units
- Analog Circuits
- Mixed-Signal Circuits






# **Type of DSP Systems**

- Real-Time – Nonreal-Time
- Resource Adequate – Resource Limited
- Complexity – Throughput
- Recursive Algorithms – Nonrecursive Algorithms
- Data Dependent – Data Independent
- Static Scheduling – Dynamic Scheduling of Operations
- Control Dominated – Data Dominated
- Irregular – Regular/Modular

# **Design and Implementation Constraints**

- Fixed/Variable Throughput
- Resource Adequate/Limited
- Fixed/Multi Functional
- Energy/Power Limited
- Size
- Volume (number of units)
- Flexibility (Design modifications)
- Technology Independence
- Design Resources
- Skilled Manpower
- Previous Design Experience
- Product Family
- Design Reuse
- Platform Based
- CAD Tools
- ...




#### ■ Design Efficiency

- Correctness
- Design time
- Flexibility
- Reuse

#### ■ Constraints

- Resource adequate vs. resource limited
- Power consumption – Energy limited

#### ■ Cost

- Standard digital CMOS technology
- Chip area

#### ■ Energy Efficiency


# Some Observations

- More and more sophisticated and complex systems
- Research intensive: small step from research to application
- More efficient design methods needed: short design time
- Broad competence in the application domain, signal processing, algorithms, arithmetic, and electronics is required
- Necessary to work in small teams at all levels of a design (a good team of 11 players)
- New cost measures: design and energy efficiency
- Necessary to optimize energy efficiency/power consumption at all levels of the design
- Data dependencies are becoming more important

. . .


# Standard and Application/Algorithm-Specific **Digital Signal Processors**



# **Algorithm-Specific Digital Signal Processors**

- Programmable (at design time) processor cores
- Specialized DSP cores

#### Direct Mapping Techniques

The basic idea is to optimize the amount and usage of resources with respect to the requirements and thereby minimize some cost function.



# Standard DSPs

- + Standardized hardware structure
- + Emphasis on short design time
- + Flexible
- + Easy to modify/correct errors
- + Low cost due to the wide applicability of the hardware
- − Low performance/throughput
- − High power consumption
- − The flexibility is not needed in many applications/overhead
- − Not always cost-effective

# Application-Specific DSPs

- + Some IP protection
- + Some CAD tools available
- − Inflexible
- − Very difficult
- + Lower unit cost, but higher initial cost
- + Higher performance/throughput
- + Lower power consumption
- + Optimized

. . .

# **Direct Mapping Techniques**

- Partition the system into parts that are implemented with suitable software/hardware structures
- Subsystems are connected according to the signal-flow graph of the whole system
- Asynchronous communication is used between the subsystems (no global clock)
- Synchronous clocking of the subsystems (well-known design problem)
- Globally Asynchronous and Locally Synchronous (GALS) approach
- Schedule the processing elements (PEs) to meet the requirements and minimize a cost function

Design flow (figure): DSP Algorithm → Partitioning into co-operating processes → Scheduling → Resource Allocation → Resource Assignment → Architecture Design → Logic Design → VLSI Design
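The scheduling step can be illustrated in code. What follows is a minimal sketch of greedy list scheduling under a resource limit; it is one common heuristic, not necessarily the method used in the course. The function name `list_schedule`, the four-operation graph, and the latencies are hypothetical and chosen only for illustration.

```python
def list_schedule(ops, deps, latency, num_pes):
    """Greedy list scheduling of a signal-flow graph on identical PEs.

    ops:     operation names (hypothetical example data, see below)
    deps:    dict mapping an operation to the operations it depends on
    latency: dict mapping an operation to its execution time
    num_pes: number of available processing elements
    Returns (schedule, length), where schedule maps op -> (start, pe).
    """
    finish, schedule = {}, {}
    pe_free = [0] * num_pes          # time at which each PE becomes idle
    remaining = list(ops)

    def data_ready(o):
        # Earliest time at which all inputs of operation o are available.
        return max((finish[p] for p in deps.get(o, [])), default=0)

    while remaining:
        # An operation is ready when all its predecessors are scheduled.
        ready = [o for o in remaining
                 if all(p in finish for p in deps.get(o, []))]
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        op = min(ready, key=data_ready)                      # earliest data first
        pe = min(range(num_pes), key=lambda k: pe_free[k])   # earliest idle PE
        start = max(data_ready(op), pe_free[pe])
        schedule[op] = (start, pe)
        finish[op] = start + latency[op]
        pe_free[pe] = finish[op]
        remaining.remove(op)
    return schedule, max(finish.values())


# Hypothetical graph: two multiplications feed an addition, followed by a
# third multiplication. Latencies are in arbitrary time units.
OPS = ["m1", "m2", "add", "m3"]
DEPS = {"add": ["m1", "m2"], "m3": ["add"]}
LATENCY = {"m1": 2, "m2": 2, "add": 1, "m3": 2}

for pes in (1, 2):
    _, length = list_schedule(OPS, DEPS, LATENCY, pes)
    print(f"{pes} PE(s): schedule length {length}")
```

In this sketch the cost trade-off is the number of PEs against the schedule length: adding a PE shortens the schedule only as far as the data dependencies allow.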




# **Describing and Modeling DSP Systems**



### **Facets**

#### Assumption:

A human can only handle a handful of items/concepts at a time

Hence, we need to use many different facets, or views to describe a complex system

[Figure: Physical view (Host Processor, I/O Process, Outputs) and Onionskin view (User Interface, DSP System, DSP Algorithms, Memory, Buses, Wires, Cells, Gates)]

Onionskin view: The idea is to reduce the design complexity of the system by using a hierarchy of views which usually are referred to as *virtual machines*.

Each virtual machine provides the basic functions that are needed to realize the virtual machine in the next higher layer.


Hence, the onionskin view represents a *hierarchy of virtual machines*.



Definition: A *behavioral description* is an input–output description that defines the required action of a system in response to prescribed inputs.

The description of the behavior may not include directions about the means of implementation or performance measures such as speed of operation, size, and power dissipation unless they directly affect the application.

Definition: A *functional description* defines the manner in which the system is operated to perform its function.



Definition: An *architectural description* describes how a number of objects (components) are interconnected.

An architectural description is sometimes referred to as a *structural description*.










# System Design Methodology

"The starting point for the system design phase is the system specification."

Much better to start with a problem understanding phase!

Definition: A *design methodology* is the overall strategy to organize and solve the design tasks at the different steps of the design process.

Because of the high complexity of the design problem, it is necessary to follow a *structured design* approach that reduces the complexity. Structured design methods are primarily used to

- Guarantee that the performance goals are met and
- Attain a short and predictable design time

Definition: A *specification* has two main parts:

- A behavioral description that specifies *what* is to be designed, and
- A verification or validation part that describes *how* the design should be verified (validated).

Definition: *Verification* involves a formal process of proving the equivalence of two different types of representations under all specified conditions.

Definition: Validation is an informal and less rigorous correctness check.

Validation is usually done by simulating the circuit with a finite set of input stimuli to assert that the circuit operates correctly.
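Validation by simulation can be sketched as follows. This is only an illustration under stated assumptions: the first-order recursive filter, the coefficient 0.5, the 12 fractional bits, the three stimuli, and the tolerance are all hypothetical choices, and the function names are not taken from the course material. A floating-point reference model is compared with a fixed-point model over a finite set of input sequences.

```python
import math


def golden_filter(x, a=0.5):
    """Reference model: y[n] = a*y[n-1] + x[n], in floating point."""
    y, state = [], 0.0
    for sample in x:
        state = a * state + sample
        y.append(state)
    return y


def fixed_point_filter(x, a=0.5, frac_bits=12):
    """Same recursion with integer arithmetic and frac_bits fractional bits."""
    scale = 1 << frac_bits
    a_q = round(a * scale)            # quantized coefficient
    y, state = [], 0
    for sample in x:
        x_q = round(sample * scale)   # quantized input sample
        # Product rounded back to frac_bits fractional bits (round half up).
        product = (a_q * state + (scale >> 1)) >> frac_bits
        state = product + x_q
        y.append(state / scale)
    return y


def validate(stimuli, tol):
    """Return (passed, worst_error) over a finite set of input sequences."""
    worst = 0.0
    for x in stimuli:
        for g, f in zip(golden_filter(x), fixed_point_filter(x)):
            worst = max(worst, abs(g - f))
    return worst <= tol, worst


# Hypothetical stimuli: impulse, step, and a sinusoid, 64 samples each.
STIMULI = [
    [1.0] + [0.0] * 63,
    [0.25] * 64,
    [0.4 * math.sin(0.3 * n) for n in range(64)],
]

passed, worst = validate(STIMULI, tol=1e-3)
print("validation passed:", passed, "worst-case error:", worst)
```

A passing run shows agreement only for the stimuli that were tried, which is why validation is less rigorous than verification.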






#### **Sequence of Models Approach**

Design Process: Develop a sequence of (executable) models, preferably with a hierarchy, incorporating more and more functionality and details.

The emphasis is on problem understanding and design space exploration, which supports innovation.

#### Example of design of a digital filter

Phase 1: Golden model

- Step 1: Determine the transfer function, poles, and zeros, and verify/validate its function.
- Step 2: Select an algorithm, implement it using, e.g., MATLAB, and validate it against Step 1.
- Step 3: Scale the signal levels and determine the roundoff noise and dynamic range. Validate. ...

Phase 2: Develop a corresponding Simulink model

- Step 1: Signal-flow model for one sample interval
- Step 2: Signal-flow model for maximally fast realization when the latencies of the operators are given. ...

Phase 3: Logic level

- Step 1: Logic description in Simulink/VHDL.
- Step 2: Introduce shimming delays.
- Step 3: Introduce quantization, timing circuitry.
- Step 4: Determine appropriate control signals. ...

Phase 4: Bit-serial model

- Step 1: Simplify the multipliers. ...

Documentation: Obtained results and experience



# **Design Transformations**

- Synthesis
- Abstraction
- Optimization
- Analysis
- Verification
- Validation
- ...



# **Complexity Issues**

The *complexity* of a system can be measured in terms of the number of interactions between its parts. More formally we have

$$\langle O, F, R \rangle$$

where *O* is a set of objects with their functional description, *F*, and their interrelations, *R*.

The reduction in complexity achieved by grouping several parts into a larger object (module) that can be described by a simpler representation, describing only external communication and without explicit references to any internal interactions, is called *abstraction*.

Hierarchical abstraction is the iterative replacement of groups of modules.

Note that using hierarchy alone does not reduce the complexity.

A design that can be completely described by using modules (abstractions) is *modular* and will have low complexity.



**Algorithm Complexity** 

#### ■ Comparing algorithms

It is often important to know how rapidly the execution time grows with problem size.

Let g(n) be a function describing the execution time of an algorithm.

A function g(n) is a member of O(f(n)) if  $\lim_{n \to \infty} \frac{g(n)}{f(n)} = \text{const} < \infty$ .

That is, the function g(n) grows at most as fast as f(n).

Growth rate, e.g.  $O(\log(n)) < O(n) < O(n\log(n)) < O(n^2) < O(2^n) < O(n!)$ .
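As a worked example of the limit definition, take the hypothetical execution-time function $g(n) = 3n^2 + 5n + 7$ (chosen only for illustration) and test it against three candidate functions:

$$\lim_{n \to \infty} \frac{3n^2 + 5n + 7}{n^2} = 3 < \infty \quad\Rightarrow\quad g(n) \in O(n^2)$$

$$\lim_{n \to \infty} \frac{3n^2 + 5n + 7}{n^3} = 0 < \infty \quad\Rightarrow\quad g(n) \in O(n^3)$$

$$\lim_{n \to \infty} \frac{3n^2 + 5n + 7}{n} = \infty \quad\Rightarrow\quad g(n) \notin O(n)$$

The definition therefore bounds the growth from above: the tightest of the three candidates that still contains $g(n)$ is $O(n^2)$.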

#### ■ Average vs. worst-case

Many algorithms have a much worse worst-case execution time than their average execution time.


If the design has only a few types of modules, the complexity is even lower. Such designs have a high degree of *regularity*.

A regularity factor can be defined as the ratio of the total number of modules to the number of different modules.
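The regularity factor is easy to compute from a module list. The sketch below assumes a hypothetical design with 16 adders, 16 multipliers, and one control unit; the function name `regularity_factor` and the module names are illustrative only.

```python
def regularity_factor(modules):
    """Total number of module instances divided by the number of distinct types.

    modules: list with one entry (the module type name) per instance.
    """
    if not modules:
        raise ValueError("the design must contain at least one module")
    return len(modules) / len(set(modules))


# Hypothetical design: 16 adders, 16 multipliers, and 1 control unit.
DESIGN = ["adder"] * 16 + ["multiplier"] * 16 + ["control"]

print("regularity factor:", regularity_factor(DESIGN))
```

A design built from many instances of few module types gives a large factor; a design in which every module is unique gives a factor of 1.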

Standardization that restricts the design domain can be applied at all levels of the design to simplify modules and increase the regularity, thereby reducing the complexity.




#### ■ Actual execution time is interesting as well!

Consider two algorithms, one with exponential execution time  $1.001^n$ and one with polynomial execution time  $10^4 n^6$ . The exponential execution time algorithm will be faster for n < 76738.
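The crossover point can be checked numerically. The following is a rough sketch, assuming that the two cost expressions are compared directly with no further constant factors; the function names are hypothetical. Logarithms are compared instead of the values themselves to keep the numbers small, and bisection finds the first problem size at which the exponential algorithm stops being faster.

```python
import math


def log_exponential(n):
    """Natural logarithm of the exponential cost 1.001**n."""
    return n * math.log(1.001)


def log_polynomial(n):
    """Natural logarithm of the polynomial cost 1e4 * n**6."""
    return math.log(1e4) + 6 * math.log(n)


def crossover(lo=1, hi=10**6):
    """Smallest integer n in (lo, hi] where the exponential cost is not smaller.

    Assumes the exponential is cheaper at lo and more expensive at hi,
    with a single crossing in between.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if log_exponential(mid) < log_polynomial(mid):
            lo = mid      # exponential algorithm still faster at mid
        else:
            hi = mid      # polynomial algorithm is at least as fast at mid
    return hi


print("exponential algorithm is faster for n <", crossover())
```

The printed bound should land close to the value quoted above; the exact integer depends on floating-point rounding near the crossing.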

# The Divide-And-Conquer Approach

    function Solve(P);
    begin
      if size(P) ≤ MinSize then
        Solve := Direct_Solution(P)
      else
        begin
          Decompose(P, P_1, P_2, ..., P_b);
          for i := 1 to b do
            S_i := Solve(P_i);
          end;
          Solve := Combine(S_1, S_2, ..., S_b)
        end;
      end if;
    end;
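One concrete instance of this template is merge sort, shown here as a minimal Python sketch. Merge sort is chosen only as a familiar illustration and does not appear in the slides; the names `solve` and `combine` mirror the pseudocode, with b = 2 subproblems of half the size and a linear-time combination step.

```python
def combine(left, right):
    """Merge two sorted lists into one sorted list (linear time)."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def solve(p, min_size=1):
    """Sort the list p by divide and conquer."""
    if len(p) <= min_size:
        return list(p)                    # direct solution
    mid = len(p) // 2
    parts = (p[:mid], p[mid:])            # decompose into b = 2 subproblems
    solutions = [solve(q, min_size) for q in parts]
    return combine(*solutions)            # combine the partial solutions


print(solve([5, 2, 9, 1, 5, 6]))
```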




The amount of time required at each step is

$$T(n) = \begin{cases} a & \text{for } n \le \mathit{MinSize} \\ b\,T\left(\frac{n}{c}\right) + d \cdot n & \text{for } n > \mathit{MinSize} \end{cases}$$

where *n* is the size of the problem,

a is the time required to solve the minimum-size problem,

*b* is the number of subproblems in each stage,

n/c is the size of the subproblems, and

*d*·*n* is the linear amount of time required for decomposition and combination of the problems.



# **Integrated Circuit Design**

*The turnaround time* for design changes ranges from several weeks to many months.

Long design times may lead to lost opportunities of marketing the chip ahead of the competition and recouping the investment.

The *correctness of the design* is of paramount importance for a successful project.

# **Technical Feasibility**

#### **SYSTEM-RELATED**

- Partitioning into cabinets, boards, and circuits
- Mixed digital and analog circuits on the same chip
- Clock frequencies
- Power dissipation and cooling
- Circuit area and packaging
- I/O interface


It can be shown that divide-and-conquer algorithms have the time-complexity:

$$T(n) \in \begin{cases} O(n) & \text{for } b < c \\ O(n \log_c(n)) & \text{for } b = c \\ O\left(n^{\log_c(b)}\right) & \text{for } b > c \end{cases}$$

Thus, recursively dividing a problem, using a linear amount of time, into two problems (b = 2) of size n/2, (c = 2), results in an algorithm with time–complexity of  $O(n \log_2(n))$ .

The *fast Fourier transform (FFT)* is an example of this type of algorithm.
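A recursive radix-2 decimation-in-time FFT shows the divide-and-conquer structure directly. This is a textbook-style sketch for power-of-two lengths, not an implementation taken from the course; the function name `fft` is an assumption. Each call splits the problem into b = 2 subproblems of size n/2 and combines them in linear time.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]            # minimum-size problem
    even = fft(x[0::2])                   # subproblem 1: even-indexed samples
    odd = fft(x[1::2])                    # subproblem 2: odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):               # linear-time combination step
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


print(fft([1, 0, 0, 0]))
```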

If the number of subproblems were b = 3, 4, or 8, then the required execution time would be  $O(n^{\log_2(3)})$ ,  $O(n^2)$ , or  $O(n^3)$ , respectively.
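These growth rates can be checked by evaluating the recurrence directly. The sketch below assumes a = d = 1 and MinSize = 1, which are illustrative values only, and evaluates T(n) for problem sizes that are powers of c; the function name `t` is hypothetical.

```python
def t(n, b, c, a=1, d=1, min_size=1):
    """Evaluate T(n) = b*T(n/c) + d*n with T(n) = a for n <= min_size."""
    if n <= min_size:
        return a
    return b * t(n // c, b, c, a, d, min_size) + d * n


# For n = 2**k the ratios below settle towards constants, as predicted:
# b < c -> O(n), b = c -> O(n log n), b > c -> O(n**log_c(b)).
for k in (4, 8, 12, 16):
    n = 2 ** k
    print(n,
          t(n, 1, 2) / n,            # b = 1, c = 2: compare with n
          t(n, 2, 2) / (n * k),      # b = 2, c = 2: compare with n*log2(n)
          t(n, 4, 2) / n ** 2)       # b = 4, c = 2: compare with n**2
```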


#### CIRCUIT-RELATED

#### External

- Interchip propagation delay
- Data transfer frequencies
- Input protection
- Loads that have to be driven, including expected PCB runs
- Available input drivers
- Drive capacity for output buffers
- Restrictions on pin-outs

#### Internal

- Clock frequencies
- Data transfer frequencies and distances
- Critical timing paths
- Non critical timing paths
- Power dissipation and cooling



- Drive capacity for internal buffers
- Circuit area, yield, and packaging
- Temperature and voltage effects
- Maximum and minimum temperatures, voltages, etc.
- Process technology

#### DESIGN-EFFORT-RELATED

- CAD tools
- Layout style
- Regularity and modularity of the circuits
- Module generators
- Cell library


