The Western Design Center (WDC) 65C02 microprocessor is an enhanced CMOS version of the popular NMOS-based 8-bit MOS Technology 6502. The 65C02 fixed several problems in the original 6502 and added a small number of new instructions. However, its main feature was greatly lowered power usage, on the order of 10 to 20 times less than the 6502 running at the same speed. This made it useful in portable computer roles and microcontroller systems in industrial settings. It has been used in some home computers, as well as in embedded applications, including medical-grade implanted devices.
Development began in 1981[a] when it was known as the 65802. The first sample versions were released in early 1983.[b] WDC licensed the design to Synertek, NCR, GTE, and Rockwell Semiconductor. Rockwell's primary interest was in the embedded market, and it asked for several new instructions to be added to aid in this role. These were later copied back into the baseline version, at which point WDC added two new instructions of its own to create the W65C02. Sanyo later licensed the design as well, and Seiko Epson produced a further modified version as the HuC6280.
Early versions used 40-pin DIP packaging, and were available in 1, 2 and 4 MHz versions. Later versions were produced in PLCC and QFP packages and ran at higher speeds. The latest version from WDC, the W65C02S-14, runs at speeds up to 14 MHz.
Introduction and features
The 65C02 is a low cost, general-purpose 8-bit microprocessor (8-bit registers and data bus) with a 16-bit program counter and address bus. The register set is small, with a single 8-bit accumulator (A), two 8-bit index registers (X and Y), an 8-bit status register (P), and a 16-bit program counter (PC). In addition to the single accumulator, the first 256 bytes of RAM, the "zero page" ($0000 to $00FF), allow faster access through a dedicated addressing mode that requires a single byte of address instead of two. The stack lies in the next 256 bytes, page one ($0100 to $01FF), and cannot be moved or extended. The stack grows downwards with the stack pointer (S) starting at $01FF and decrementing as the stack grows. It has a variable-length instruction set, varying between one and three bytes per instruction.
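The fixed page-one stack described above can be sketched as a toy model. This is only an illustration: the class name `Stack6502` is hypothetical, and starting S at $FF is an assumption that mirrors the text (real software normally sets S itself).

```python
class Stack6502:
    """Toy model of the page-one hardware stack (illustrative only)."""

    def __init__(self):
        self.mem = bytearray(0x10000)  # 64 kB address space
        self.s = 0xFF                  # assumed start value; full address is $0100 + S

    def push(self, value):
        # Store at $0100 + S, then move the pointer down.
        self.mem[0x0100 + self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF   # S is 8 bits, so it wraps inside page one

    def pull(self):
        # Move the pointer up, then read the byte it now points to.
        self.s = (self.s + 1) & 0xFF
        return self.mem[0x0100 + self.s]


stack = Stack6502()
stack.push(0x12)   # lands at $01FF
stack.push(0x34)   # lands at $01FE
```

Because S is only 8 bits wide, the stack can never leave page one, which is why it cannot be moved or extended.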
The basic architecture of the 65C02 is identical to that of the original 6502, and the chip can be considered a low-power implementation of that design. At 1 MHz, the most popular speed for the original 6502, the 65C02 requires only 20 mW, while the original uses 450 mW, a reduction of over twenty times. The manually optimized core and low power use are intended to make the 65C02 well suited for low power system-on-chip (SoC) designs.
A Verilog hardware description model is available for designing the W65C02S core into an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA). As is common in the semiconductor industry, WDC offers a development system, which includes a developer board, an in-circuit emulator (ICE) and a software development system.
The W65C02S–14 is the production version as of 2019, and is available in PDIP, PLCC and QFP packages. The maximum officially supported ϕ2 (primary) clock speed is 14 MHz, indicated by the –14 part number suffix. The "S" designation indicates that the part has a fully static core, a feature that allows ϕ2 to be slowed down or fully stopped in either the high or low state with no loss of data. Typical microprocessors not implemented in CMOS have dynamic cores and will lose their internal register contents (and thus crash) if they are not continuously clocked at a rate between some minimum and maximum specified values.
General logic features
- 8-bit data bus
- 16-bit address bus (providing an address space of 64 kB)
- 8-bit arithmetic logic unit (ALU)
- 8-bit processor registers: accumulator (A), index registers (X and Y), stack pointer (S) and status register (P)
- 16-bit program counter
- 69 instructions, implemented by 212 operation codes
- 16 addressing modes, including zero page addressing
- Vector pull (VPB) output indicates when interrupt vectors are being addressed
- Memory lock (MLB) output indicates to other bus masters when a read-modify-write instruction is being processed
- WAit-for-Interrupt (WAI) and SToP (STP) instructions reduce power consumption, decrease interrupt latency and enable synchronization with external events
- Supply voltage specified at 1.71 V to 5.25 V
- Current consumption (core) of 0.15 and 1.5 mA per MHz at 1.89 V and 5.25 V respectively
- Variable length instruction set, enabling code size optimization over fixed length instruction set processors, results in power savings
- Fully static circuitry allows stopping the clock to conserve power
The W65C02S may be operated at any convenient supply voltage (VDD) between 1.8 and 5 volts (±5%). The data sheet AC characteristics table lists operational characteristics at 5 V at 14 MHz, 3.3 V or 3 V at 8 MHz, 2.5 V at 4 MHz, and 1.8 V at 2 MHz. This information may be an artifact of an earlier data sheet, as a graph indicates that typical devices are capable of operation at higher speeds than suggested by the AC characteristics table, and that reliable operation at 20 MHz should be readily attainable with VDD at 5 volts, assuming the supporting hardware will allow it.
The W65C02S may also be operated at non-integral clock rates such as 13.5 MHz (digital SDTV luma sampling rate), 14.31818 MHz (NTSC colour carrier frequency × 4), 14.75 MHz (PAL square pixels), 14.7456 MHz (baud rate crystal), etc., as long as VDD is sufficient to support the frequency. Designer Bill Mensch has pointed out that FMAX is affected by off-chip factors, such as the capacitive load on the microprocessor's pins. Minimizing load by using short signal tracks and the fewest possible devices helps raise FMAX. The PLCC and QFP packages have less pin-to-pin capacitance than the PDIP package, and are more economical in the use of printed circuit board space.
WDC has reported that FPGA realizations of the W65C02S have been successfully operated at 200 MHz.
Comparison with the NMOS 6502
Although the 65C02 can mostly be thought of as a low-power 6502, it also fixes several bugs found in the original and adds new opcodes that can help increase code density. It is estimated that the average 6502 assembly program can be made 10 to 15% smaller on the 65C02 and see a similar improvement in performance through avoided memory accesses.
The original 6502 had 56 instructions, which, when combined with different addressing modes, produced a total of 151 opcodes of the possible 256 8-bit patterns. The remaining 105 opcodes were undefined. Codes whose low-order 4 bits were 3, 7, B or F were left entirely unused, and those ending in 2 had only a single defined opcode.
The 6502 was famous for the way that some of these leftover codes actually performed actions. Due to the way the 6502's instruction decoder worked, simply setting certain bits in the opcode would cause parts of the instruction processing to take place. Some of these opcodes would immediately crash the processor, while others performed useful functions and were even given unofficial assembler mnemonics by users.
The 65C02 added a number of new opcodes that used up some of these previously "undocumented instruction" slots; for instance, $FF was now used for the new BBS instruction (see below). Those that remained truly unused were set to perform NOPs. Programs that took advantage of the undocumented codes will not work on the 65C02, but these codes were always documented as non-operational and should not have been used.
The original 6502 had several bugs when initially launched. Among the most notorious was that the ROR (rotate right) instruction was broken due to a problem in the chip. MOS addressed this by not documenting the instruction. This bug was fixed early in the production run and was generally not an issue for the vast majority of machines using the processor.
In contrast, another bug that remained in the design for its lifetime involved the commonly used jump instruction, JMP, when using indirect addressing. In this mode, the target address of the JMP was looked up in another memory location. For instance, JMP ($1234) would fetch the value in bytes $1234 and $1235 and use those 16 bits as the actual memory location to jump to. However, if the initial address ended in $FF, the boundary of a memory page, the JMP took the most significant byte of the 16-bit address from offset $00 of the original page rather than offset $00 of the next page. So for instance, JMP ($12FF) would get the first byte at $12FF and the second, incorrectly, from $1200 rather than $1300. This was fixed in the 65C02.
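The page-boundary behaviour described above can be sketched as follows. This is an illustrative model, not an emulator: the function names are hypothetical, and memory is modelled as a plain 64 kB byte array.

```python
def jmp_indirect_nmos(mem, ptr):
    """Target of JMP (ptr) on the original 6502, including the page-wrap bug."""
    lo = mem[ptr]
    # Only the low byte of the pointer is incremented, so the high byte
    # of the target is fetched from the same page as the low byte.
    hi_addr = (ptr & 0xFF00) | ((ptr + 1) & 0x00FF)
    return lo | (mem[hi_addr] << 8)


def jmp_indirect_65c02(mem, ptr):
    """Target of JMP (ptr) on the 65C02: the pointer increments across pages."""
    lo = mem[ptr]
    hi = mem[(ptr + 1) & 0xFFFF]
    return lo | (hi << 8)


mem = bytearray(0x10000)
mem[0x12FF] = 0x34   # low byte of the intended target
mem[0x1300] = 0x78   # high byte the programmer intended
mem[0x1200] = 0x56   # high byte the original 6502 actually reads

nmos_target = jmp_indirect_nmos(mem, 0x12FF)    # 0x5634
cmos_target = jmp_indirect_65c02(mem, 0x12FF)   # 0x7834
```

For any pointer that does not end in $FF, both functions return the same target.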
Another bug in the original 6502 concerned the behaviour of the (D)ecimal flag in the status register, which was left undefined after a reset or interrupt. This meant that programmers were forced to set the flag to a known value, and one finds a CLD instruction (CLear Decimal) following the initial SEI[c] in almost all interrupt handlers. The 65C02 automatically clears this flag on reset and, for interrupts, after pushing the status register onto the stack.
A related problem occurred while operating in decimal mode, where the (N)egative, o(V)erflow and (Z)ero flags were not updated properly. There were ways to address this in code, but only at the cost of additional instructions. The 65C02 addresses this problem and sets these flags correctly, at the cost of a single clock cycle.
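As a rough illustration of the flag problem, the sketch below adds two valid BCD bytes and reports the Zero flag both ways. It assumes the commonly documented NMOS behaviour that Z reflects the plain binary sum rather than the decimal result; it handles valid BCD operands only, it ignores the N and V flags, and the function name is hypothetical.

```python
def bcd_adc(a, m, carry_in=0):
    """Decimal-mode add of two valid BCD bytes (simplified sketch)."""
    lo = (a & 0x0F) + (m & 0x0F) + carry_in
    hi = (a >> 4) + (m >> 4)
    if lo > 9:            # decimal carry out of the low digit
        lo -= 10
        hi += 1
    carry_out = 1 if hi > 9 else 0
    if carry_out:         # decimal carry out of the high digit
        hi -= 10
    result = (hi << 4) | lo
    binary_sum = (a + m + carry_in) & 0xFF
    return {
        "result": result,
        "carry": carry_out,
        "z_nmos": binary_sum == 0,   # assumption: Z taken from the binary sum
        "z_65c02": result == 0,      # Z taken from the decimal result
    }


flags = bcd_adc(0x99, 0x01)   # decimal 99 + 1 = 00 with carry out
```

In this example the decimal result is $00, so a 65C02-style Z flag is set, while the binary sum $9A leaves an NMOS-style Z flag clear.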
New addressing modes
The 6502's indirect addressing modes for data were all based on the X and Y index registers. When an instruction used one of these modes, the processor took an initial zero page address, combined it with the X or Y register, and then returned the value at the resulting location. Unfortunately, these were the only ways to load data through a pointer held in the zero page, so if one wanted to access a single byte through such a pointer with no offset, the code had to first LDY #$00 before LDA ($12),Y. The 65C02 added a new mode, LDA ($12), removing the need for the LDY.
The 6502 also included a mode known as "preindexing", which offset a zero-page address by the value in the X register, and then fetched the data at the address pointed to by the 16-bit value stored there. For instance, if the X register holds the value $4, then LDA ($12,X) would add the value in X to the zero-page address $12, then pick up the two bytes at $16 and $17. If the value in those two bytes was $1234, it then fetches the 8-bit value at memory location $1234. The basic idea is to allow the programmer to set up a series of pointers to unrelated locations in memory and then access them by changing only the X register.
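The effective-address calculations for the zero-page indirect forms can be sketched as below. The function names are hypothetical, and the model assumes that pointer fetches wrap within the zero page.

```python
def read_zp_pointer(mem, zp):
    """Read a 16-bit little-endian pointer from the zero page."""
    lo = mem[zp & 0xFF]
    hi = mem[(zp + 1) & 0xFF]   # assumed to wrap within the zero page
    return lo | (hi << 8)


def ea_indirect(mem, zp):
    """LDA ($12): 65C02 zero-page indirect, no index register involved."""
    return read_zp_pointer(mem, zp)


def ea_preindexed(mem, zp, x):
    """LDA ($12,X): add X to the zero-page address, then fetch the pointer."""
    return read_zp_pointer(mem, (zp + x) & 0xFF)


def ea_postindexed(mem, zp, y):
    """LDA ($12),Y: fetch the pointer, then add Y to it."""
    return (read_zp_pointer(mem, zp) + y) & 0xFFFF


mem = bytearray(0x10000)
mem[0x16], mem[0x17] = 0x34, 0x12   # pointer $1234 stored at $16/$17
mem[0x12], mem[0x13] = 0x30, 0x12   # pointer $1230 stored at $12/$13

pre = ea_preindexed(mem, 0x12, 0x04)   # pointer read from $16/$17
```

With Y set to zero, `ea_postindexed` gives the same address as `ea_indirect`, which is the `LDY #$00` workaround the new 65C02 mode makes unnecessary.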
While useful, preindexing could only be used with zero page data for the intermediate address. The same was true of the "postindexed" mode of the original 6502. Available for the Y register only, that mode fetched an address from the zero page as before, but then added the value of Y after retrieving the 16 bits from the zero page. For instance, with Y holding $4, LDA ($12),Y would first look up the address at $12/$13, retrieve (for instance) $1230, and then add Y to it, loading the accumulator with the value from $1234. In the 65C02, indexed indirection was extended beyond the zero page with a new mode for the JMP instruction, officially known as "absolute indexed indirect". Available for the X register only, it adds X to a 16-bit absolute address and reads the jump target from the resulting location. For instance, with X holding $4, JMP ($1234,X) reads its target address from $1238/$1239.
This new mode is extremely useful for implementing jump tables into a set of subroutines. The table of 16-bit routine addresses can be placed anywhere in main memory, and no zero page locations are needed. A program can LDX #$14;JMP ($1234,X) to jump to entry ten of a table at $1234, since each entry occupies two bytes. This sort of access pattern is extremely common, notably in the widely used BASIC programming language which was found on many 6502 systems. The original preindexed mode can read such a table only if the entire table is in the zero page, a very limited resource, and the original 6502 then still needs further instructions to jump to the fetched address. The new mode is particularly useful when the routines are in ROM.
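A jump-table lookup of this kind can be sketched generically: a table of 16-bit little-endian routine addresses sits somewhere in memory, and an index register selects the entry. The function names, the table location $8000 and the routine addresses are all hypothetical values chosen for illustration.

```python
def entry_offset(routine_number):
    """Each table entry is a 2-byte address, so the offset is number * 2."""
    return routine_number * 2


def table_target(mem, table_base, index_reg):
    """Read the 16-bit routine address at table_base + index_reg."""
    addr = (table_base + index_reg) & 0xFFFF
    return mem[addr] | (mem[(addr + 1) & 0xFFFF] << 8)


mem = bytearray(0x10000)
TABLE = 0x8000                                 # hypothetical table location
routines = [0x9000, 0x9040, 0x90A0]            # hypothetical routine addresses
for n, target in enumerate(routines):
    mem[TABLE + 2 * n] = target & 0xFF         # low byte first
    mem[TABLE + 2 * n + 1] = target >> 8       # then high byte

target = table_target(mem, TABLE, entry_offset(2))   # selects routine number 2
```

Note that the index register holds a byte offset, not a routine number, so an 8-bit index reaches at most 128 two-byte entries.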
New and modified instructions
In addition to the new addressing modes, the "base model" 65C02 also added a set of new instructions.
- STZ addr, STore Zero in addr. Replaces the need to LDA #0;STA addr and doesn't change the value in the accumulator. As this task is common in most programs, using STZ can reduce code size, both by eliminating the LDA as well as any code needed to save the value of the accumulator, typically a PHA and PLA pair.
- PHX, PLX, PHY and PLY, push and pull the X and Y registers to and from the stack. Formerly only the accumulator and status register could be pushed and pulled. Using these saves the need to use a TxA (TXA or TYA) to move the X or Y to the accumulator before a PHA, and also leaves the accumulator unchanged, which can be very helpful.
- INC and DEC with no parameters now increment or decrement the accumulator. This was an odd oversight in the original instruction set, which only included INX/DEX, INY/DEY, and INC addr/DEC addr. Some assemblers use the alternate forms INC A and DEC A, but oddly none appear to use the INCA and DECA format already in use.
- BRA, branch always. Operates like a JMP but uses relative addressing like other branches. This makes it one byte smaller, as it uses a 1-byte relative address instead of a 2-byte absolute address. This is useful for saving memory and improving speed by one memory cycle. As the address is relative, it is also useful when writing relocatable code, a common task in the era before memory management units.
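The relative addressing used by BRA can be sketched as follows, assuming the usual 6502 branch convention that the signed 8-bit offset is counted from the address after the 2-byte instruction. The function names are hypothetical.

```python
def bra_target(pc, offset_byte):
    """Address reached by a BRA at address pc with the given offset byte."""
    # Interpret the byte as a signed two's-complement value (-128..127).
    signed = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    return (pc + 2 + signed) & 0xFFFF


def bra_offset(pc, target):
    """Offset byte an assembler would emit, or an error if out of range."""
    delta = target - (pc + 2)
    if not -128 <= delta <= 127:
        raise ValueError("target out of range for a relative branch")
    return delta & 0xFF


forward = bra_target(0x1000, 0x10)    # 0x1012
self_loop = bra_target(0x1000, 0xFE)  # 0x1000, a branch to itself
```

Because only the distance is encoded, the same bytes work wherever the code is loaded, which is what makes BRA useful for relocatable code. The limited range is the trade-off compared with JMP.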
Bit manipulation instructions
The initial design for the 65C02 was modified by Rockwell, who was interested in the 6502 as the basis for embedded processors. In these roles, it is common for device drivers to communicate with the CPU by encoding status as bits in a single byte, in a fashion similar to the CPU's own status register. This makes the various bit manipulation instructions very common in embedded applications.
Normally, one tests bits by ANDing the desired pattern with the memory location holding the data, and then branching based on the status register's (Z)ero flag. So for instance, if address $1234 was the status register for a device, and bit 3 held the "ready" status, then one could implement a "continue if ready" with LDA $1234;AND #$08;BNE $2345, as hex $08 has only bit 3 set. If the bit is set, the result is non-zero, the Z flag is cleared, and BNE branches to the routine at $2345.
As this sort of test is common, the original 6502 included a special-purpose BIT addr instruction for automating some of this. BIT did not change the accumulator (unlike ANDing) and at the same time copied bits 7 and 6 of the memory byte into the (N)egative and o(V)erflow flags. As long as the device drivers used bits 6 and 7 for their most commonly tested flags, using BIT could reduce the number of tests needed. The 65C02 further improved the BIT instruction by adding new addressing modes, including the ability to test a constant against the accumulator instead of the pattern having to be initially stored in memory.
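The difference between an AND-based test and BIT with a memory operand can be sketched like this. The function names are hypothetical, and only the flags discussed above are modelled.

```python
def and_test(a, mask):
    """AND: the accumulator is overwritten and Z reflects the result."""
    result = a & mask
    return result, result == 0          # (new accumulator, Z flag)


def bit_test(a, m):
    """BIT addr: the accumulator is untouched; flags come from memory byte m."""
    return {
        "Z": (a & m) == 0,              # same test as AND, result discarded
        "N": bool(m & 0x80),            # bit 7 of the memory byte
        "V": bool(m & 0x40),            # bit 6 of the memory byte
    }


status = 0xC8                           # device status: bits 7, 6 and 3 set
new_a, z = and_test(status, 0x08)       # LDA status; AND #$08
flags = bit_test(0x08, status)          # LDA #$08; BIT status
```

After the AND the accumulator holds only the masked bit, whereas after BIT it still holds the mask and two extra status bits are available in N and V.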
However, Rockwell's changes went far beyond changes to BIT, adding a host of instructions for directly setting and testing any bit, and combining the test, clear and branch into a single opcode. The new instructions were available from the start in Rockwell's R65C00 family, but were not part of the original 65C02 specification and were not found in versions made by WDC or its other licensees. They were later copied back into the baseline design, and were available in later WDC versions, denoted with a leading "W", the W65C02s.
The new instructions include:
- SMBbit# addr and RMBbit# addr. Set or Reset bit number bit# in zero page byte addr. The bit# is often written as part of the instruction, like SMB1 $12, which sets bit 1 in zero-page address $12. Some assemblers have bit# written as part of the parameters, like SMB 1,$12, which has the advantage of allowing it to be replaced by a variable name or calculated number. Previously the action of setting a bit to 0 was known as "clear", and this would have been called RMC had it been part of the original 6502, but the term changed from clear to reset for the 65C02.
- TSB addr and TRB addr, Test and Set Bits and Test and Reset Bits. A bitmask is first stored in the accumulator with LDA and then TSB/TRB is called. The mask is ANDed with the memory byte to set the processor's Z(ero) flag, and then the masked bits in memory are Set or Reset. This operates similarly to the BIT instruction, but BIT reports only bits 6 and 7 in their own flags, while these test whichever bits are in the mask and then change the value in memory without a separate STA being needed. This pattern is very common when a status bit needs to be tested and then reset to indicate the condition has been handled.
- BBR bit#,offset,addr and BBS bit#,offset,addr, Branch on Bit Reset/Set. Same zero-page addressing as SMB/RMB, but now branches to addr if that bit is reset (BBR) or set (BBS). This combines the three instructions of an LDA/AND/branch sequence into a single instruction.
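The effect of these instructions on memory and the Z flag can be sketched as follows. The function names simply mirror the mnemonics and are not any assembler's API; the branch instructions are modelled only as a "branch taken?" result.

```python
def smb(mem, bit, zp):
    """SMBn zp: set bit n of a zero-page byte."""
    mem[zp] |= 1 << bit


def rmb(mem, bit, zp):
    """RMBn zp: reset (clear) bit n of a zero-page byte."""
    mem[zp] &= ~(1 << bit) & 0xFF


def tsb(mem, a, addr):
    """TSB addr: Z from A AND memory, then set the masked bits in memory."""
    z = (a & mem[addr]) == 0
    mem[addr] |= a
    return z


def trb(mem, a, addr):
    """TRB addr: Z from A AND memory, then reset the masked bits in memory."""
    z = (a & mem[addr]) == 0
    mem[addr] &= ~a & 0xFF
    return z


def bbs_taken(mem, bit, zp):
    """BBSn: the branch is taken when bit n is set."""
    return bool(mem[zp] & (1 << bit))


def bbr_taken(mem, bit, zp):
    """BBRn: the branch is taken when bit n is reset."""
    return not bbs_taken(mem, bit, zp)


mem = bytearray(0x10000)
smb(mem, 1, 0x12)                 # like SMB1 $12: byte becomes $02
was_clear = trb(mem, 0x02, 0x12)  # test bit 1, then clear it in one step
```

The `trb` call shows the "test, then acknowledge" pattern: Z reports whether the bit was set beforehand, and the bit is cleared without a separate store.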
In addition to the new instructions above, WDC also added the STP and WAI instructions for supporting low-power modes. STP, STop the Processor, halted all processing until a hardware reset was issued. This could be used to put a system to "sleep" and then rapidly wake it with a reset. However, this required some external system to maintain memory, and it was not widely used. WAI, WAit for Interrupt, had a similar effect, entering low-power mode, but this time woke up again on the reception of any interrupt. For embedded uses this meant the device driver did not have to poll for changes; instead it simply issued a WAI and then jumped into the interrupt processing when it started again. This had the added advantage that the processor was already idle when the interrupt arrived, so the handler always started in one cycle, instead of having to wait for any currently processing instruction to complete first.
65SC02
The 65SC02 is a variant of the WDC 65C02 without the bit manipulation instructions.
Notable uses of the 65C02
- Apple IIc portable by Apple Computer (NCR 1.023 MHz)
- Enhanced Apple IIe by Apple Computer (1.023 MHz)
- BBC Master home/educational computer, by Acorn Computers Ltd (2 MHz 65SC12 plus optional 4 MHz 65C102 second processor)
- Replica 1 by Briel Computers, a replica of the Apple I hobbyist computer (1 MHz)
- Laser 128 series clones of Apple II
- KIM-1 Modern Replica of the MOS/CBM KIM-1 by Briel Computers
Video game consoles
- Atari Lynx handheld (65SC02 @ ~4 MHz)
- NEC PC Engine aka TurboGrafx-16 (HuC6280 @ 7.16 MHz)
- GameKing handhelds (6 MHz) by Timetop
- Watara Supervision handhelds (65SC02 @ 4 MHz)
Notes
- a. Some sources, including prior versions of this article, claim 1978. This was the date that Bill Mensch, the primary designer, formed WDC. Mensch specifically states 1981 when talking about the design in 1984.
- b. Wagner's June 1983 article mentions it being available for "several months". Given typical publication delays at that point this may date it to late 1982.
- c. SEI prevents interrupts, allowing the handler to complete without itself being interrupted.
- Wagner 1983, p. 204.
- McGeever, Christine (5 November 1984). "16-bit Apple II chips due". InfoWorld. pp. 21–22.
- Koehn, Philipp (2 March 2018). "6502 Stack" (PDF).
- Taylor & Watford 1984, p. 174.
- "6502 CPU Projects in HDL (for FPGA)".
- "W65C02DB Developer Board".
- Parker, Neil. "The 6502/65C02/65C816 Instruction Set Decoded". Neil Parker's Apple II page.
- Vardy, Adam (22 August 1995). "Extra Instructions Of The 65XX Series CPU".
- Steil, Michael (2010-09-28). "Measuring the ROR Bug in the Early MOS 6502".
- "Differences between NMOS 6502 and CMOS 65c02". Retrieved 27 February 2018. "N, V, and Z flags were incorrect after decimal operation (but C was ok)."
- Wagner 1983, p. 199.
- Clark, Bruce. "65C02 Opcodes".
- Wagner 1983, p. 200.
- Wagner 1983, p. 203.
- Wagner 1983, pp. 200-201.
- Zaks, Rodnay. Programming the 6502. p. 348.
- Wagner, Robert (June 1983). "Assembly Lines". Softtalk. pp. 199–204.
- Taylor, Simon; Watford, Bob (July 1984). "6502 revival". Personal Computer World. pp. 174–175.