Introduction
Over the last 25 years, microprocessors have enjoyed a con-
tinuous increase in performance and attendant reduction in
price/performance. Current best of breed microprocessors
operate at frequencies in excess of 300 MHz and offer super-
scalar instruction dispatch, sophisticated branch prediction
techniques and support for high performance memory sys-
tems including external second level cache controllers.
As general purpose microprocessors have become more
powerful, they have been asked to perform increasingly
complex tasks. Even so, the trend of doubling system
performance every 1.5 to 2 years has not kept pace with the
requirements of the networking and telecommunications
infrastructure industry. Several developments drive those
requirements: the explosive growth of the Internet; new
digital communications technologies, including digital
cellular phones employing CDMA, TDMA and PCS technologies;
IP-based telephony, fax and multimedia; and wireless
messaging. A general trend in the industry is to use
programmable processors to implement adaptive filters,
modulators/demodulators, and other functions once only
possible in hardware.
These trends and applications have created tremendous
opportunities for high-performance, high-bandwidth
processors. These demanding new applications, along with the
continually increasing needs of the computing market, call
for a new approach to maximizing performance, one that
provides customers with the order of magnitude increase in
key application performance they demand.
To meet these needs, a new class of microprocessor product is
called for: one that offers, in a single chip solution, the
highest level of processing performance while expanding the
processor’s capabilities to concurrently address
high-bandwidth data processing and the algorithmic intensive
computations that today are typically handled off-chip by
other devices, such as dedicated hardware, DSP farms or custom
ASICs. Motorola is introducing a new technology that provides
for this convergence in capabilities: AltiVec technology.
AltiVec technology is Motorola’s high-performance vector
parallel processing expansion to the PowerPC™ RISC
processor architecture. Motorola microprocessors offering
AltiVec technology will represent a new class of product. In
addition to providing 100% compatibility with the industry-
standard PowerPC Architecture, AltiVec technology will
also provide product designers and customers with a new
“one part—one code base” approach to product design
which simplifies design and support while simultaneously
providing a tremendous jump in performance.
Motorola’s AltiVec Technology
Sam Fuller
System Architecture & Product Planning Manager,
Networking & Computing Core Technologies
Motorola Inc.
Semiconductor Product Sector
6501 William Cannon Drive West, Austin, Texas 78735
Figure 1. High-level structural overview for PowerPC with
AltiVec technology (block diagram: a Branch Unit, an Integer
Unit with GPRs, a Floating-Point Unit with FPRs and a Vector
Unit with VRs, with instruction, address and data paths to
memory)
ALTIVECWP/D
AltiVec Technology
Motorola's AltiVec technology expands the current
PowerPC architecture through the addition of a 128-bit
vector execution unit, which operates concurrently with the
existing integer and floating point units. This new engine
provides for highly parallel operations, allowing for the
simultaneous execution of up to 16 operations in a single
clock cycle.
AltiVec technology is a short vector parallel architecture.
Depending on data size, vectors are 4, 8 or 16 elements long.
This can be contrasted with the long vector architectures of
supercomputers that were popular in the 1980s. Vector sizes
for those machines ranged to hundreds of elements. The long
vector approach of supercomputers, while useful for scien-
tific calculations, is not optimal for the communications,
multimedia and other performance-driven applications tar-
geted by Motorola with AltiVec technology.
AltiVec technology operations are performed on multiple
data elements by a single instruction. This is often referred
to as SIMD (single instruction, multiple data) parallel
processing. AltiVec technology offers support for:
16-way parallelism for 8-bit signed and unsigned integers and characters,
8-way parallelism for 16-bit signed and unsigned integers,
4-way parallelism for 32-bit signed and unsigned integers and IEEE floating-point numbers.
AltiVec technology also includes a separate register file
containing 32 entries, each 128 bits wide. These 128-bit
wide registers hold the data sources for the AltiVec
technology execution units. The registers are loaded and
stored through vector load and vector store instructions,
each of which transfers the contents of a single 128-bit
register from or to memory.
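
The element widths and the register file described above can be pictured with a small portable C model. This is only a sketch: the names vreg_t, vreg_file_t, vr_load and vr_store are hypothetical, and plain memcpy stands in for the vector load and store instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One 128-bit vector register, viewed at each supported element width. */
typedef union {
    uint8_t  b[16];  /* 16 x 8-bit elements                 */
    uint16_t h[8];   /*  8 x 16-bit elements                */
    uint32_t w[4];   /*  4 x 32-bit integer elements        */
    float    f[4];   /*  4 x 32-bit floating-point elements */
} vreg_t;

/* Hypothetical model of the register file: 32 entries of 128 bits. */
typedef struct {
    vreg_t vr[32];
} vreg_file_t;

/* Model of a vector load: 16 bytes of memory go into register n. */
static void vr_load(vreg_file_t *rf, unsigned n, const void *mem)
{
    assert(rf && mem && n < 32);
    memcpy(&rf->vr[n], mem, sizeof(vreg_t));
}

/* Model of a vector store: register n is written to 16 bytes of memory. */
static void vr_store(const vreg_file_t *rf, unsigned n, void *mem)
{
    assert(rf && mem && n < 32);
    memcpy(mem, &rf->vr[n], sizeof(vreg_t));
}
```

Each load or store moves one whole register, so a single transfer carries 16 bytes, 8 halfwords or 4 words depending on how the program chooses to view the data.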
AltiVec technology can be most accurately thought of as a
set of registers and execution units added to the PowerPC
architecture in an analogous manner to the addition of float-
ing point units. Floating point units were added to most
mainstream microprocessor architectures several years ago
to provide better support for high-precision scientific calcu-
lations. AltiVec technology is being added to the PowerPC
architecture to dramatically accelerate the next level of per-
formance-driven, high-bandwidth communications and
computing applications.
Each AltiVec instruction specifies up to three source
operands and a single destination operand. All operands are
vector registers, with the exception of the load and store
instructions and a few instruction types that take
operands from immediate fields within the instruction.
AltiVec technology defines 162 new instructions, which
fall into the following major classes.
1. Intra-Element Arithmetic Operations
Intra-element arithmetic operations perform independent
parallel computations on the elements contained in the
source vector registers and place the results in the corre-
sponding fields of the destination vector register. Both signed
and unsigned integers and floating-point data types are sup-
ported by the intra-element operations. The operations sup-
port both saturation and modulo arithmetic. A variety of
powerful intra-element operations are defined in the AltiVec
technology: addition, subtraction, multiply, and
multiply-add. Additional instructions perform min, max and
average, as well as conversion between floating-point and
32-bit integer numerical formats.
Figure 2. Generic presentation of a four operand, 16-element,
intra-element operation (source registers vA, vB and vC;
destination register vT; one op per element)
Figure 3. Sum Across, an inter-element arithmetic operation
(source registers vA and vB; destination register vT with its
remaining fields shown as zeros)
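
The difference between modulo and saturation arithmetic is easiest to see on a concrete case. The following portable C sketch models the behavior one element at a time; the function names are hypothetical, and the loops stand in for what a single vector instruction would do across all elements at once.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES_U8 = 16 };  /* 128 bits / 8 bits per element */

/* Modulo arithmetic: each 8-bit element wraps around independently. */
static void add_u8_modulo(const uint8_t *a, const uint8_t *b, uint8_t *t)
{
    assert(a && b && t);
    for (int i = 0; i < LANES_U8; i++)
        t[i] = (uint8_t)(a[i] + b[i]);
}

/* Saturation arithmetic: each element clamps at the largest 8-bit value. */
static void add_u8_saturate(const uint8_t *a, const uint8_t *b, uint8_t *t)
{
    assert(a && b && t);
    for (int i = 0; i < LANES_U8; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        t[i] = (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
    }
}

/* Multiply-add on four float elements: t = a * b + c, element by element. */
static void madd_f32(const float *a, const float *b, const float *c, float *t)
{
    assert(a && b && c && t);
    for (int i = 0; i < 4; i++)
        t[i] = a[i] * b[i] + c[i];
}
```

For example, adding 250 and 6 in an unsigned 8-bit element gives 0 under modulo arithmetic and 255 under saturation. Saturation is usually the better fit for pixel and audio data, where wrapping from bright to dark or from loud to silent would be an audible or visible error.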
2. Intra-Element Non-Arithmetic Operations
Intra-element non-arithmetic operations include various
forms of compare, shift, and rotate. The following logical
operations are also supported: AND, OR, NOT, XOR, and
AND-NOT. A select instruction is also provided, which
chooses source data from one of two source registers and
transfers that data to the result register. The combination
of compare and select provides a powerful way to mask and
replace data elements across the entire 16-byte field of the
vector registers with very few instructions.
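
As an illustration of the compare and select combination, the sketch below limits every byte of a vector to a ceiling value without any per-element branch. It is a portable C model with hypothetical function names, and it assumes that a compare writes an all-ones or all-zeros mask per element and that select takes each bit from the second source where the mask bit is set.

```c
#include <assert.h>
#include <stdint.h>

/* Compare greater-than: mask element is 0xFF where a > b, else 0x00. */
static void cmpgt_u8(const uint8_t *a, const uint8_t *b, uint8_t *mask)
{
    assert(a && b && mask);
    for (int i = 0; i < 16; i++)
        mask[i] = (a[i] > b[i]) ? 0xFF : 0x00;
}

/* Select: take bits from b where the mask bit is 1, from a where it is 0. */
static void select_u8(const uint8_t *a, const uint8_t *b,
                      const uint8_t *mask, uint8_t *t)
{
    assert(a && b && mask && t);
    for (int i = 0; i < 16; i++)
        t[i] = (uint8_t)((a[i] & ~mask[i]) | (b[i] & mask[i]));
}

/* Replace every byte of x that exceeds limit with limit itself. */
static void clamp_to_limit_u8(const uint8_t *x, uint8_t limit, uint8_t *t)
{
    uint8_t lim[16], mask[16];

    assert(x && t);
    for (int i = 0; i < 16; i++)
        lim[i] = limit;              /* limit replicated into every element */
    cmpgt_u8(x, lim, mask);          /* which elements are too large?       */
    select_u8(x, lim, mask, t);      /* keep x, or substitute the limit     */
}
```

The whole 16-byte field is handled by one compare and one select, which is the "very few instructions" pattern the text describes.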
3. Inter-Element Arithmetic Operations
A few special inter-element arithmetic operations are
provided in the AltiVec technology: sum of products and sum
across. These operations allow the elements within a single
vector register to be summed in combination with a separate
accumulation register. They are valuable for generating dot
products, which are the most common vector operation.
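
A dot product built from these two operations might look like the following portable C sketch. The function names and the exact element arrangement (two adjacent 16-bit products feeding each 32-bit accumulator element) are assumptions made for illustration, and no saturation is modelled, so the sums are assumed to fit in 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of products: each of the four 32-bit accumulator elements gains
 * the products of two adjacent pairs of 16-bit elements. t may alias c. */
static void sum_of_products_s16(const int16_t *a, const int16_t *b,
                                const int32_t *c, int32_t *t)
{
    assert(a && b && c && t);
    for (int i = 0; i < 4; i++)
        t[i] = c[i] + (int32_t)a[2 * i] * b[2 * i]
                    + (int32_t)a[2 * i + 1] * b[2 * i + 1];
}

/* Sum across: fold the four 32-bit elements into one accumulated value. */
static int32_t sum_across_s32(const int32_t *v, int32_t acc)
{
    assert(v);
    for (int i = 0; i < 4; i++)
        acc += v[i];
    return acc;
}

/* Dot product of n 16-bit elements; n must be a multiple of 8. */
static int32_t dot_s16(const int16_t *x, const int16_t *y, size_t n)
{
    int32_t acc[4] = {0, 0, 0, 0};

    assert(x && y && n % 8 == 0);
    for (size_t k = 0; k < n; k += 8)      /* one vector of 8 elements per step */
        sum_of_products_s16(x + k, y + k, acc, acc);
    return sum_across_s32(acc, 0);         /* single reduction at the end */
}
```

The inner loop keeps four partial sums in the accumulation register and only reduces them once, after the last vector, which is why the two operations are useful as a pair.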
4. Inter-Element Non-Arithmetic Operations
In addition to the powerful intra-element and inter-element
arithmetic operations, AltiVec technology also defines a
group of very powerful inter-element non-arithmetic
operations. These include wide field shift operations and
pack and unpack operations, among them a special operation
to handle the 1/5/5/5 pixel format common for 16-bit color
pixels. Merge operations are also provided that can
interleave data at the byte, halfword and word level.
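
A byte-level merge can be modelled as below. The sketch is portable C with hypothetical function names, and it assumes a merge interleaves one half of each source register; the destination must not overlap either source.

```c
#include <assert.h>
#include <stdint.h>

/* Interleave the first 8 bytes of a and b: a0 b0 a1 b1 ... a7 b7. */
static void merge_first_half_u8(const uint8_t *a, const uint8_t *b, uint8_t *t)
{
    assert(a && b && t && t != a && t != b);
    for (int i = 0; i < 8; i++) {
        t[2 * i]     = a[i];
        t[2 * i + 1] = b[i];
    }
}

/* Interleave the last 8 bytes of a and b: a8 b8 a9 b9 ... a15 b15. */
static void merge_second_half_u8(const uint8_t *a, const uint8_t *b, uint8_t *t)
{
    assert(a && b && t && t != a && t != b);
    for (int i = 0; i < 8; i++) {
        t[2 * i]     = a[8 + i];
        t[2 * i + 1] = b[8 + i];
    }
}
```

One common use is widening: merging a register of zeros with a register of 8-bit samples yields byte pairs of the form (0, sample), which on a big-endian machine read as 16-bit values ready for halfword arithmetic.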
Perhaps the most powerful inter-element operation offered
in the AltiVec technology is the permute operation. The per-
mute operation is capable of arbitrarily selecting data with
the granularity of a byte from two 16-byte source registers
into a single 16-byte destination register.
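
The permute operation can be modelled in a few lines of portable C. In this sketch the function name is hypothetical, and it is assumed that only the low five bits of each control byte are used: values 0x00 to 0x0F pick a byte of the first source and values 0x10 to 0x1F pick a byte of the second.

```c
#include <assert.h>
#include <stdint.h>

/* Permute: byte i of t is chosen from the 32 bytes of a and b by the
 * control byte c[i]. t must not overlap a or b. */
static void permute_u8(const uint8_t *a, const uint8_t *b,
                       const uint8_t *c, uint8_t *t)
{
    assert(a && b && c && t && t != a && t != b);
    for (int i = 0; i < 16; i++) {
        unsigned sel = c[i] & 0x1Fu;              /* index into a:b, 0..31 */
        t[i] = (sel < 16) ? a[sel] : b[sel - 16];
    }
}
```

With the control bytes as they appear to read in Figure 4 (01 14 13 18 08 10 16 15 19 1A 1C 1C 1C 1D 1B 0E), the destination receives byte 1 of vA, then bytes 4, 3 and 8 of vB, and so on. The same operation with control bytes 0F down to 00 reverses the byte order of vA, and a control vector may repeat an index to replicate one byte into several positions.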
For operations where 8- and 16-bit data items must be
reorganized in memory before or after computations,
permute can save significant time. In many instances a single
permute operation can operate on 16 bytes of data and
replace the 4 or 5 operations per byte needed with
traditional RISC or DSP instructions.
The powerful inter-element operations of AltiVec technology
define a microprocessor capable not just of operating on 8-,
16- and 32-bit data elements in parallel but of operating on
data 128 bits (16 bytes) at a time.
Applications of AltiVec Technology
The initial target applications for AltiVec technology
include: IP telephony gateways, multi-channel modems,
speech processing systems, echo cancelers, image and video
processing systems, scientific array processing systems, as
well as network infrastructure such as Internet routers and
virtual private network servers.
In addition to accelerating next-generation applications,
AltiVec technology can, through its wide datapaths and wide
field operations, also accelerate many time-consuming tradi-
tional computing and embedded processing operations such
as memory copies, string compares and page clears.
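
To make the benefit of the wide datapath concrete, the sketch below clears and copies memory in 16-byte units and counts the transfers. It is a portable C model only: the function names are hypothetical, each memcpy of 16 bytes stands in for one vector load or store, and the 4096-byte page size used in the note that follows is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };  /* one 128-bit transfer */

/* Clear a buffer 16 bytes at a time; returns the number of 16-byte stores. */
static size_t clear_blocks16(void *dst, size_t nbytes)
{
    static const uint8_t zero[VEC_BYTES] = {0};
    uint8_t *p = dst;
    size_t stores = 0;

    assert(dst && nbytes % VEC_BYTES == 0);
    for (size_t off = 0; off < nbytes; off += VEC_BYTES) {
        memcpy(p + off, zero, VEC_BYTES);      /* models one vector store */
        stores++;
    }
    return stores;
}

/* Copy a buffer 16 bytes at a time; returns the number of load/store pairs. */
static size_t copy_blocks16(void *dst, const void *src, size_t nbytes)
{
    uint8_t *d = dst;
    const uint8_t *s = src;
    size_t pairs = 0;

    assert(dst && src && nbytes % VEC_BYTES == 0);
    for (size_t off = 0; off < nbytes; off += VEC_BYTES) {
        uint8_t reg[VEC_BYTES];
        memcpy(reg, s + off, VEC_BYTES);       /* models one vector load  */
        memcpy(d + off, reg, VEC_BYTES);       /* models one vector store */
        pairs++;
    }
    return pairs;
}
```

Under these assumptions a 4096-byte page is cleared with 256 stores, where a 32-bit datapath would need 1024.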
Figure 4. The inter-element Permute operation (control vector
vC = 01 14 13 18 08 10 16 15 19 1A 1C 1C 1C 1D 1B 0E selects
bytes from source vA, bytes 00 to 0F, and source vB, bytes 10
to 1F, into destination vT)
Figure 5. Typical controller plus DSP system (block diagram
with interface circuits, a controller, memory, and an array of
DSPs on a DSP bus, grouped under Communication, Control and
Computation)
Unlike fixed function solutions, which are most often
implemented as application specific integrated circuits,
AltiVec technology will offer a programmable solution that
can easily migrate via software upgrades to follow changing
standards and customer requirements. The preferred
programming environment is the C and C++ languages favored by
embedded systems developers. To more easily express the
parallelism presented by AltiVec technology, Motorola has
developed a standardized set of C/C++ language extensions.
These language extensions allow software developers to use
their preferred C/C++ development environment and language
syntax while explicitly taking advantage of the parallel
functional units and other facilities offered by the AltiVec
technology. Motorola is working with leading tools
providers to develop simulators, assemblers, linkers and
compilers to assure full support for the AltiVec technology.
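
As a rough illustration of what such language extensions look like in practice, the sketch below adds two float arrays four elements at a time. This paper does not list the extension names, so the vector float type and the vec_ld, vec_add and vec_st functions are an assumption, taken from compilers that ship an altivec.h header. The function name add_f32 is hypothetical, and a plain C loop is used when AltiVec support is not enabled.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/* out[i] = a[i] + b[i] for n floats.
 * n must be a multiple of 4 and all arrays must be 16-byte aligned. */
static void add_f32(const float *a, const float *b, float *out, size_t n)
{
    assert(a && b && out && n % 4 == 0);
#if defined(__ALTIVEC__)
    for (size_t i = 0; i < n; i += 4) {
        vector float va = vec_ld(0, a + i);    /* load 4 floats (128 bits) */
        vector float vb = vec_ld(0, b + i);
        vec_st(vec_add(va, vb), 0, out + i);   /* 4 additions, then store  */
    }
#else
    for (size_t i = 0; i < n; i++)             /* scalar fallback */
        out[i] = a[i] + b[i];
#endif
}
```

The vector path reads like ordinary C: the developer declares vector variables and calls functions, and the compiler maps each call onto the corresponding vector instruction.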
While the initial PowerPC microprocessor utilizing AltiVec
technology will target very high-performance applications in
networking and computing, subsequent Motorola proces-
sors with AltiVec technology could address markets and
applications in which performance must be balanced with
power, price and peripheral integration.
A New Design Model
The introduction of processors containing AltiVec technolo-
gy creates a new model of system design for high-perfor-
mance embedded systems. Historically, many high-perfor-
mance embedded applications have contained a combination
of a single RISC processor performing the system control
function and one or more DSPs or ASICs performing spe-
cialized computations.
The single RISC processor plus multiple DSP system has a
number of disadvantages, including two different
architectures, code bases, hardware types, and debug
environments. Additionally, DSPs have not been on the same
performance growth curve as general purpose processors; for
example, they often require users to switch to newer
non-compatible architectures from generation to generation.
As a result, even minor upgrades in a customer’s product
performance have often required major hardware redesigns,
often including a change of DSP or controller architecture,
with the attendant cost and time to market impact.
AltiVec technology-based systems can provide more capable
single architecture systems, often at lower cost, power
budget, and physical area than controller plus DSP solutions.
The use of a single high-performance device for controller
and signal processing functions results in quicker time to
market and lower overall engineering cost. A single
architecture solution also simplifies the development task
for both the hardware and the software engineer.
Summary
With the introduction of AltiVec technology, Motorola is
demonstrating its commitment to the PowerPC architecture
and to meeting the requirements of next generation net-
working, communications and computing applications.
AltiVec technology will expand the PowerPC microprocessor’s
capability by providing leading edge general purpose
processing performance while concurrently addressing
high-bandwidth data processing and algorithmic intensive
computations in a single chip solution. This new class of
processor will provide an aggressive performance growth
path for embedded and computing systems designers, while
lowering the development barriers inherent in multiple
architecture designs, thereby reducing time to market and
total system development expense.
Figure 6. System using multiple PowerPC processors with
AltiVec technology, sharing a common bus bridged to
shared memory (block diagram: two interface circuits* for
communication, two PowerPC processors with AltiVec technology
for control and computation, a bus, and memory)
* Such as Motorola MPC860 PowerQUICC™ controller
Motorola’s AltiVec Technology —White Paper
©1998 Motorola, Inc. All rights reserved. Printed in the U.S.A. Motorola and the are registered trademarks and AltiVec and the AltiVec logo are trademarks of Motorola, Inc. PowerPC, the PowerPC logo and PowerPC Architecture are trademarks of
International Business Machines Corporation and used by Motorola, Inc. under license therefrom. This document contains information on a new product under development. Specifications and information herein are subject to change without notice.