Multimedia Acceleration eXtensions

The Multimedia Acceleration eXtensions or MAX are instruction set extensions to the Hewlett-Packard PA-RISC instruction set architecture (ISA). MAX was developed to improve the performance of multimedia applications that were becoming more prevalent during the 1990s.

MAX instructions operate on 32- or 64-bit SIMD data types consisting of multiple 16-bit integers packed in general-purpose registers. The available functionality includes parallel additions, subtractions and shifts.
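The packed data layout can be illustrated with a minimal Python sketch. The helper names `pack_halfwords` and `unpack_halfwords` are hypothetical, and placing the first 16-bit integer in the most significant position is an assumption made for illustration only.

```python
LANE_BITS = 16
LANE_MASK = 0xFFFF


def pack_halfwords(lanes):
    """Pack 16-bit values into one integer (assumption: first lane is most significant)."""
    word = 0
    for value in lanes:
        # Masking keeps only the low 16 bits, so negative values wrap to two's complement.
        word = (word << LANE_BITS) | (value & LANE_MASK)
    return word


def unpack_halfwords(word, count):
    """Split an integer into `count` 16-bit lanes, most significant lane first."""
    return [(word >> (LANE_BITS * (count - 1 - i))) & LANE_MASK for i in range(count)]


# Four 16-bit integers in a 64-bit register image, two in a 32-bit one.
word64 = pack_halfwords([0x0001, 0x0002, 0x0003, 0x0004])
word32 = pack_halfwords([-1, 0x0010])
```

Here `word64` stands in for a 64-bit register holding four subwords and `word32` for a 32-bit register holding two.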

The first version, MAX-1, was for the 32-bit PA-RISC 1.1 ISA. The second version, MAX-2, was for the 64-bit PA-RISC 2.0 ISA.

Notability

The approach is notable because the set of instructions is much smaller than in other multimedia CPUs, and also more general-purpose. The small size and simplicity of the instructions reduce the recurring cost of the electronics, as well as the cost and difficulty of the design, while their general-purpose nature increases their overall value. The instructions require only small changes to a CPU's arithmetic logic unit. A similar design approach promises to be a successful model for the multimedia instructions of other CPU designs.[1][2][3] The set can also be small because the CPU already included powerful shift and bit-manipulation instructions: "shift pair", which shifts a pair of registers; "extract" and "deposit" of bit fields; and all the common bitwise logical operations (and, or, exclusive-or, etc.).[2]

This set of multimedia instructions has also proven itself in practice. In 1996, the 64-bit MAX-2 instructions enabled real-time performance of MPEG-1 and MPEG-2 video while increasing the area of a RISC CPU by only 0.2%.[1]

Implementations

MAX-1 was first implemented in the PA-7100LC in 1994. It is usually credited as the first SIMD extension to an ISA. MAX-2, designed for the 64-bit PA-RISC 2.0 ISA, was first implemented in the PA-8000 microprocessor, released in 1996.[1]

The basic approach to the arithmetic in MAX-2 is to "interrupt the carries" between the 16-bit subwords, and to choose among modular arithmetic, signed saturation and unsigned saturation. This requires only small changes to the arithmetic logic unit.[2]
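The effect of interrupting the carries can be emulated in software. The following Python sketch is not the hardware design; it imitates the result on a 64-bit value holding four 16-bit subwords. The function names are hypothetical, and the unsigned-saturating variant treats both operands as unsigned, a simplification that may not match the architectural definition of the `,us` completer.

```python
HIGH = 0x8000800080008000  # top bit of each 16-bit lane
LOW = 0x7FFF7FFF7FFF7FFF   # low 15 bits of each 16-bit lane


def hadd_modular(a, b):
    """Lane-wise add modulo 2**16, with no carry crossing a lane boundary."""
    # Adding only the low 15 bits of each lane cannot overflow a 16-bit lane.
    low_sum = (a & LOW) + (b & LOW)
    # The top bit of each lane is a_top XOR b_top XOR carry-in; the carry-in
    # already sits in bit 15 of low_sum, and any carry out is discarded.
    return low_sum ^ ((a ^ b) & HIGH)


def to_signed16(x):
    """Interpret the low 16 bits of x as a two's complement integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def hadd_saturating(a, b, signed):
    """Lane-wise add that clamps each result instead of wrapping."""
    result = 0
    for shift in (48, 32, 16, 0):
        x, y = (a >> shift) & 0xFFFF, (b >> shift) & 0xFFFF
        if signed:
            total = max(-0x8000, min(0x7FFF, to_signed16(x) + to_signed16(y)))
        else:
            # Assumption: both operands are treated as unsigned here.
            total = min(0xFFFF, x + y)
        result |= (total & 0xFFFF) << shift
    return result


a = 0xFFFF00017FFF8000
b = 0x0001000100018000
modular = hadd_modular(a, b)
signed_sat = hadd_saturating(a, b, signed=True)
unsigned_sat = hadd_saturating(a, b, signed=False)
plain = (a + b) & 0xFFFFFFFFFFFFFFFF  # ordinary 64-bit add, carries cross lanes
```

Comparing `modular` with `plain` shows the difference: in the ordinary add, the carry out of the lowest subword leaks into its neighbour, while the subword add keeps each 16-bit lane independent.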

MAX-1

Instruction  Description
HADD         Parallel add with modulo arithmetic
HADD,ss      Parallel add with signed saturation
HADD,us      Parallel add with unsigned saturation
HSUB         Parallel subtract with modulo arithmetic
HSUB,ss      Parallel subtract with signed saturation
HSUB,us      Parallel subtract with unsigned saturation
HAVE         Parallel average
HSHLADD      Parallel shift left and add with signed saturation
HSHRADD      Parallel shift right and add with signed saturation
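As an illustration of the shift-and-add entries in the table, the following Python sketch emulates them on a 32-bit value holding two 16-bit subwords. The function names and argument order are hypothetical, the sketch accepts any non-negative shift amount even though the real instructions are limited to a small range, and computing the intermediate result at full precision before clamping is an assumption rather than a statement of the architectural behaviour.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's complement integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def clamp_signed16(v):
    """Saturate v to the signed 16-bit range."""
    return max(-0x8000, min(0x7FFF, v))


def hshladd(a, shift, b, lanes=2):
    """Per lane: (a_lane << shift) + b_lane, with signed saturation."""
    result = 0
    for i in range(lanes):
        pos = 16 * i
        total = clamp_signed16((to_signed16(a >> pos) << shift) + to_signed16(b >> pos))
        result |= (total & 0xFFFF) << pos
    return result


def hshradd(a, shift, b, lanes=2):
    """Per lane: (a_lane >> shift, arithmetic) + b_lane, with signed saturation."""
    result = 0
    for i in range(lanes):
        pos = 16 * i
        # Python's >> on a negative int is an arithmetic (sign-preserving) shift.
        total = clamp_signed16((to_signed16(a >> pos) >> shift) + to_signed16(b >> pos))
        result |= (total & 0xFFFF) << pos
    return result


# Multiply both lanes (3 and -7) by 5, computed as (x << 2) + x.
x = 0x0003FFF9
times_five = hshladd(x, 2, x)

# The same operation saturates when a lane would leave the 16-bit range.
y = 0x4000C000  # lanes 16384 and -16384
saturated = hshladd(y, 2, y)

# Halve lanes (10, -10) and add 1 to each.
halved_plus_one = hshradd(0x000AFFF6, 1, 0x00010001)
```

A shift-and-add of this kind lets a multiplication by a small constant be built from one or two instructions without a dedicated subword multiplier.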

MAX-2

MAX-2 instructions are register-to-register instructions that operate on multiple integers packed in 64-bit quantities. All have a one-cycle latency in the PA-8000 microprocessor and its derivatives. Memory accesses are via the standard 64-bit loads and stores.

The "MIX" and "PERMH" instructions are a notable innovation because they permute subwords held in registers without accessing memory. This can substantially speed up many operations.[2]

Instruction  Description
HADD         Parallel add with modulo arithmetic
HADD,ss      Parallel add with signed saturation
HADD,us      Parallel add with unsigned saturation
HSUB         Parallel subtract with modulo arithmetic
HSUB,ss      Parallel subtract with signed saturation
HSUB,us      Parallel subtract with unsigned saturation
HSHLADD      Parallel shift left and add with signed saturation
HSHRADD      Parallel shift right and add with signed saturation
HAVG         Parallel average
HSHR         Parallel shift right signed
HSHR,u       Parallel shift right unsigned
HSHL         Parallel shift left
MIX          Mix 16-bit subwords in a 64-bit word; e.g. MIX Left Ra,Rb,Rc gives Rc := a1,b1,a3,b3; MIX Right gives Rc := a2,b2,a4,b4[2]
MIXW         Mix 32-bit subwords in a 64-bit word; e.g. MIXW Left Ra,Rb,Rc gives Rc := a1,a2,b1,b2; MIXW Right gives Rc := a3,a4,b3,b4[2]
PERMH        Permute the 16-bit subwords of the source into any possible arrangement in the destination register, including repetitions[2]
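The MIX, MIXW and PERMH definitions in the table can be emulated with a short Python sketch. The function names are hypothetical, and numbering the subwords a1 to a4 from the most significant end is an assumption. The 4x4 transposition at the end is an illustrative use of register-only permutation, not a sequence taken from the cited sources.

```python
def lanes(word):
    """Return the four 16-bit subwords of a 64-bit value (a1 assumed most significant)."""
    return [(word >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]


def pack(halfwords):
    """Pack four 16-bit subwords back into a 64-bit value."""
    word = 0
    for value in halfwords:
        word = (word << 16) | (value & 0xFFFF)
    return word


def mix_left(a, b):
    a1, _, a3, _ = lanes(a)
    b1, _, b3, _ = lanes(b)
    return pack([a1, b1, a3, b3])


def mix_right(a, b):
    _, a2, _, a4 = lanes(a)
    _, b2, _, b4 = lanes(b)
    return pack([a2, b2, a4, b4])


def mixw_left(a, b):
    a1, a2, _, _ = lanes(a)
    b1, b2, _, _ = lanes(b)
    return pack([a1, a2, b1, b2])


def mixw_right(a, b):
    _, _, a3, a4 = lanes(a)
    _, _, b3, b4 = lanes(b)
    return pack([a3, a4, b3, b4])


def permh(a, selectors):
    """Pick subwords by 0-based index (0 selects a1); repeats are allowed."""
    source = lanes(a)
    return pack([source[i] for i in selectors])


def transpose4x4(rows):
    """Transpose a 4x4 matrix of 16-bit values held in four 64-bit words."""
    r0, r1, r2, r3 = rows
    t0, t1 = mix_left(r0, r1), mix_left(r2, r3)
    t2, t3 = mix_right(r0, r1), mix_right(r2, r3)
    return [mixw_left(t0, t1), mixw_left(t2, t3),
            mixw_right(t0, t1), mixw_right(t2, t3)]


matrix = [pack([1, 2, 3, 4]), pack([5, 6, 7, 8]),
          pack([9, 10, 11, 12]), pack([13, 14, 15, 16])]
transposed = transpose4x4(matrix)
reversed_row = permh(matrix[0], [3, 2, 1, 0])
broadcast = permh(matrix[0], [0, 0, 0, 0])
```

In this sketch the transposition uses eight mix operations and never leaves the four "registers", which is the kind of saving the permutation instructions are meant to provide.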

References

  1. Lee, Ruby B. (August 1996). "Subword Parallelism with MAX-2". IEEE Micro. 16 (4): 51–59. doi:10.1109/40.526925. Retrieved 21 September 2014.
  2. Lee, Ruby; Huck, Jerry (February 25, 1996). "64-bit and multimedia extensions in the PA-RISC 2.0 architecture". COMPCON '96. Technologies for the Information Superhighway Digest of Papers. pp. 152–160. doi:10.1109/CMPCON.1996.501762. ISBN 0-8186-7414-8. S2CID 13081443.
  3. Lee, Ruby B. (April 1995). "Accelerating Multimedia with Enhanced Microprocessors". IEEE Micro. 15 (2): 22–32. doi:10.1109/40.372347. Retrieved 21 September 2014.