什么是DSP?

wikiterm

A?digital signal processor?(DSP?or?DSP micro) is a specialized?microprocessor?designed specifically for?digital signal processing, generally in?real-time computing.[1 ]Typical characteristics

Digital signal processing?algorithms?typically require a large number of mathematical operations to be performed quickly on a set of data. Signals are converted from analog to digital, manipulated digitally, and then converted again to analog form, as diagrammed below. Many DSP applications have constraints on?latency; that is, for the system to work, the DSP operation must be completed within some time constraint.

[size=94%][url=http://en.wikipedia.org/wiki/File

SP_block_diagram.svg][/url]

[size=94%][url=http://en.wikipedia.org/wiki/File

SP_block_diagram.svg][/url]
A simple digital processing system

Most general-purpose microprocessors and operating systems can execute DSP algorithms successfully. But these microprocessors are not suitable for application of mobile telephone and pocket PDA systems etc. because of power supply and space limit. A specialized digital signal processor, however, will tend to provide a lower-cost solution, with better performance and lower latency.
The architecture of a digital signal processor is optimized specifically for digital signal processing work. Some useful features for optimizing DSP algorithms are outlined below.

Architecture

Hardware modulo addressing, allowing?circular buffers?to be implemented without having to constantly test for wrapping.
A memory architecture designed for streaming data, using?DMA?extensively.
Separate program and data memories (Harvard architecture)
Special?SIMD?(single instruction, multiple data) operations
Special arithmetic operations, such as fast?multiply-accumulates?(MACs). Many fundamental DSP algorithms, such as?FIR filters?or the?Fast Fourier transform?(FFT) depend heavily on multiply-accumulate performance.
Bit-reversed addressing, a special?addressing mode?useful only for calculating FFTs
Deliberate exclusion of a?memory management unit. DSPs frequently use multi-tasking operating systems, but have no support for?virtual memory?or memory protection. Operating systems that use virtual memory require more time for?context switching?among?processes, which increases latency.

Program flow

Floating-point?unit integrated directly into the?datapath
Pipelined?architecture
Highly parallel?multiplier–accumulators?(MAC units)
Hardware-controlled?looping, to reduce or eliminate the overhead required for looping operations

Memory architecture

DSPs often use special memory architectures that are able to fetch multiple data and/or instructions at the same time:
- Harvard architecture
- Modified?von Neumann architecture
Use of?direct memory access
Memory-address calculation unit

Data operations

Saturation arithmetic, in which operations that produce overflows will accumulate at the maximum (or minimum) values that the register can hold rather than wrapping around (maximum+1 doesn&#39;t overflow to minimum as in many general-purpose CPUs, instead it stays at maximum). Sometimes various?sticky bits?operation modes are available.
Fixed-point?arithmetic is often used to speed up arithmetic processing
Single-cycle operations to increase the benefits of?pipelining

Instruction sets

Multiply-accumulate?(MAC, aka fused multiply-add, FMA) operations, which are used extensively in all kinds of?matrix?operations, such as?convolution?for filtering,?dot product, or even polynomial evaluation (seeHorner scheme)
Instructions to increase parallelism:?SIMD,?VLIW,?superscalar architecture
Specialized instructions for?modulo?addressing in?ring buffers?and bit-reversed addressing mode for?FFT?cross-referencing
Digital signal processors sometimes use?time-stationary encoding?to simplify hardware and increase coding efficiency.

History

Prior to the advent of stand-alone DSP chips discussed below, most DSP applications were implemented using bit slice processors. The AMD2901 bit slice chip with its family of components was a very popular choice. There were reference designs from AMD, but very often the specifics of a particular design were application specific. These bit slice architecture would sometimes include a peripheral multiplier chip. Examples of these multipliers were a series from TRW including the TRW1008 and TRW1010, some of which included an accumulator, providing the requisite multiply-accumulate (MAC) function.
In 1978, Intel released the 2920 as an "analog signal processor". It had an on-chip ADC/DAC with an internal signal processor, but it didn&#39;t have a hardware multiplier and was not successful in the market. In 1979, AMI released the?S2811. It was designed as a microprocessor peripheral, and it had to be initialized by the host. The S2811 was likewise not successful in the market.
In 1980 the first stand-alone, complete DSPs – the?NEC??PD7720?and?AT&T?DSP1?– were presented at the?IEEE?International?Solid-State Circuits?Conference &#39;80. Both processors were inspired by the research inPSTN?telecommunications.
The Altamira DX-1 was another early DSP, utilizing quad integer pipelines with delayed branches and branch prediction.
The first DSP produced by?Texas Instruments?(TI), the?TMS32010?presented in 1983, proved to be an even bigger success. It was based on the Harvard architecture, and so had separate instruction and data memory. It already had a special instruction set, with instructions like load-and-accumulate or multiply-and-accumulate. It could work on 16-bit numbers and needed 390ns for a multiply-add operation. TI is now the market leader in general-purpose DSPs. Another successful design was the?Motorola?56000.
About five years later, the second generation of DSPs began to spread. They had 3 memories for storing two operands simultaneously and included hardware to accelerate?tight loops, they also had an addressing unit capable of?loop-addressing. Some of them operated on 24-bit variables and a typical model only required about 21ns for a MAC (multiply-accumulate). Members of this generation were for example the AT&T DSP16A or the Motorola DSP56001.
The main improvement in the third generation was the appearance of application-specific units and instructions in the data path, or sometimes as coprocessors. These units allowed direct hardware acceleration of very specific but complex mathematical problems, like the Fourier-transform or matrix operations. Some chips, like the Motorola MC68356, even included more than one processor core to work in parallel. Other DSPs from 1995 are the TI TMS320C541 or the TMS 320C80.
The fourth generation is best characterized by the changes in the instruction set and the instruction encoding/decoding. SIMD and MMX extensions were added, VLIW and the superscalar architecture appeared. As always, the clock-speeds have increased, a 3ns MAC now became possible.

Modern DSPs

Modern signal processors yield greater performance. This is due in part to both technological and architectural advancements like lower design rules, fast-access two-level cache, (E)DMA?circuit and a wider bus system. Of course, not all DSPs provide the same speed and many kinds of signal processors exist, each one of them being better suited for a specific task, ranging in price from about US$1.50 to US$300. A Texas Instruments?C6000?series DSP clocks at 1.2 GHz and implements separate instruction and data caches as well as an 8 MiB 2nd level cache, and its I/O speed is rapid thanks to its 64 EDMA channels. The top models are capable of as many as 8000 MIPS (million?instructions per second), use VLIW (very long instruction word) encoding, perform eight operations per clock-cycle and are compatible with a broad range of external peripherals and various buses (PCI/serial/etc).
Another player at the high-end signal processor manufacturer today is?Freescale. The company provides a multi-core DSPs family MSC81xx. The MSC81xx is based on StarCore Architecture processors. The latest MSC8144 DSP combines four programmable SC3400 StarCore DSP cores. Each SC3400 StarCore DSP core runs at 1 GHz. The SC3400 performed higher than any other programmable DSP at 1 GHz on BDTIsimMark2000 results published by Berkeley Design Technology, Inc. (BDTI).
Another major signal processor manufacturer today is?Analog Devices. The company provides a broad range of DSPs, but its main portfolio is multimedia processors, such as codecs, filters and digital-analog converters. Its?SHARC-based processors range in performance from 66 MHz/198?MFLOPS?(million floating-point operations per second) to 400 MHz/2400MFLOPS. Some models even support multiple?multipliers?andALUs,?SIMD?instructions and audio processing-specific components and peripherals. Another product of the company is the?Blackfin?family of embedded digital signal processors, with models like the ADSP-BF531 to ADSP-BF536. These processors combine the features of a DSP with those of a general use processor. As a result, these processors can run simple?operating systems?like?μCLinux,?velOSity?and?Nucleus RTOSwhile operating relatively efficiently on real-time data.
Most DSPs use?fixed-point arithmetic, because in real world signal processing the additional range provided by floating point is not needed, and there is a large speed benefit and cost benefit due to reduced hardware complexity. Floating point DSPs may be invaluable in applications where a wide dynamic range is required. Product developers might also use floating point DSPs to reduce the cost and complexity of software development in exchange for more expensive hardware, since it is generally easier to implement algorithms in floating point.
Generally, DSPs are dedicated integrated circuits, however DSP functionality can also be realized using?Field Programmable Gate Array?chips.
Embedded general-purpose RISC processors are becoming increasingly DSP like in functionality. For example, ARM Cortex-A8 has a 128-bit wide SIMD unit that can have impressive 16- and 8-bit performance for industry standard benchmarks.

See also

Digital Signal Controller

References

^?A. John Anderson (1994).?Foundations of Computer Technology. CRC Press.

External links

什么是DSP?

wikiterm LV1

站长推荐 /3