32 Bit address support #1

Open
opened 2026-04-25 22:11:06 -07:00 by bslathi19 · 14 comments
Owner

By changing the addressing modes, it is possible to convert the 6502 from using 16 bit addresses to 32 bit addresses. Any instruction which deals with 2-byte addresses is modified to work with 4-byte addresses instead.

The 6502 has 15 addressing modes. Here is how each of them will change:

| Addressing Mode | Change | Detail |
| --- | --- | --- |
| Implicit | - | No change |
| Accumulator | - | No change |
| Immediate | - | Registers are still 8 bits, so no change |
| Zero Page | - | No change |
| (Zero Page) | Read 4 bytes from zero page | Pointer is now 4 bytes instead of 2, zero page address remains 1 byte |
| Zero Page,X | - | No change |
| Zero Page,Y | - | No change |
| Relative | - | No change |
| Absolute | Instruction is 5 bytes instead of 3 | Specify the entire 32 bit address in the instruction instead of just 16. |
| Absolute,X | Instruction is 5 bytes instead of 3 | Specify the entire 32 bit address in the instruction instead of just 16. |
| Absolute,X Indirect | Instruction is 5 bytes instead of 3 | Specify the entire 32 bit address of the pointer in the instruction instead of just 16. 4 bytes will be loaded from the pointer into the PC instead of just 2. This is only used with JMP. |
| Absolute,Y | Instruction is 5 bytes instead of 3 | Specify the entire 32 bit address in the instruction instead of just 16. |
| Indirect | Instruction is 5 bytes instead of 3 | Instruction encodes a 32 bit pointer to a 32 bit jump target. |
| Indexed Indirect | Read 4 bytes from zero page | Pointer is now 4 bytes instead of 2, zero page address remains 1 byte |
| Indirect Indexed | Read 4 bytes from zero page | Pointer is now 4 bytes instead of 2, zero page address remains 1 byte |

Overall, the changes fall into just three kinds:

  1. Reading 4 bytes from zero page instead of 2
  2. Encoding 4 bytes in the instruction instead of 2
  3. Reading 4 bytes from non-zero-page memory instead of 2
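
As a rough illustration of the table, the total instruction length per addressing mode could be captured as below. The module name `insn_length` and the 4 bit `mode` encoding are made up for this sketch and do not exist in cpu_65c02.v; the lengths of the unchanged modes are the standard 6502 ones (1 byte for implicit and accumulator, 2 bytes for everything else that is not absolute).

```verilog
// Hypothetical helper, not part of cpu_65c02.v.
// Returns the instruction length in bytes (opcode included) after the
// 32 bit change. The mode numbers follow the order of the table.
module insn_length (
    input  wire [3:0] mode,
    output reg  [2:0] length
);
    localparam M_IMPLICIT  = 4'd0,
               M_ACC       = 4'd1,
               M_ABS       = 4'd8,
               M_ABS_X     = 4'd9,
               M_ABS_X_IND = 4'd10,
               M_ABS_Y     = 4'd11,
               M_IND       = 4'd12;

    always @* begin
        case (mode)
            // opcode only
            M_IMPLICIT, M_ACC:
                length = 3'd1;
            // opcode + 4 address bytes (was opcode + 2)
            M_ABS, M_ABS_X, M_ABS_X_IND, M_ABS_Y, M_IND:
                length = 3'd5;
            // immediate, relative and all zero page modes: opcode + 1 byte
            default:
                length = 3'd2;
        endcase
    end
endmodule
```

Only the five absolute/indirect modes grow; everything that takes a single operand byte keeps its encoding.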

The registers all remain 8 bits, except for the program counter which obviously must be 32 bits.

The vector addresses also change since they are 32 bit now instead of 16 bit.

Backwards compatibility is NOT a requirement. There is NO need to have a 16 bit mode, or be able to run existing 6502 code in any way.

Author
Owner

Starting from the top, the first state to look at is IND0.

All IND0 does is go to INDX1, bypassing the INDX0 state where the X register is added to the zp address. This state therefore requires no modifications, since the change will be handled in the INDXn states.

IND0 : state <= INDX1;

We will revisit this one when we handle the other zero page indirect modes.

Author
Owner

The second set of states that needs to change is ABSn.

Currently there are 2 states, one for each of the 2 address bytes:

ABS0 = 6'd0, // ABS - fetch LSB
ABS1 = 6'd1, // ABS - fetch MSB

ABS0 increments the program counter:

ABS0,
JMPIX0,
JMPIX2,
ABSX0,
FETCH,
BRA0,
BRA2,
BRK3,
JMPI1,
JMP1,
RTI4,
RTS3: PC_inc = 1;

The ALU is set to add by default, AI is set to 0 by default, BI is set to DIMUX by default.

ABS1 sets the next address to the combination of DIMUX and the ALU output

ABS1: AB = { DIMUX, ADD };

So, in order to support 32 bit addresses we need 16 more bits of temporary storage. Right now the address is built from the ALU output register and the incoming data byte (DIMUX). We can add an ALU shift register which simply stores the last 2 results of the ALU. In the final state, when we jump to the new address, it will do

AB = {DIMUX, ALU_SR[1], ALU_SR[0], ADD}

This will be the result of ABS3. ABS0, ABS1 and ABS2 will all be the same, just incrementing the PC.
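
A minimal sketch of the shift register idea, under assumptions that are not from the actual commit: the module name `abs_addr_sr` is invented, `add` stands in for the registered ALU output with AI = 0 (so it simply captures DIMUX), and the shift register captures `add` on every clock. With that timing the oldest byte ends up deepest in the shift register, so the concatenation order here differs from the expression above; the right order depends on when each register actually captures in the real design.

```verilog
// Hypothetical sketch, not the code from the repository.
// Feed one address byte per clock on dimux, LSB first. While the 4th
// (most significant) byte is on dimux, ab holds the full 32 bit address.
module abs_addr_sr (
    input  wire        clk,
    input  wire [7:0]  dimux,
    output wire [31:0] ab
);
    reg [7:0] add;           // models the ALU output register (0 + DIMUX)
    reg [7:0] alu_sr [0:1];  // the two ALU results older than add

    always @(posedge clk) begin
        alu_sr[1] <= alu_sr[0];  // oldest result
        alu_sr[0] <= add;        // previous result
        add       <= dimux;      // newest result
    end

    // byte 3 is still on the bus, byte 2 in add, bytes 1 and 0 in the SR
    assign ab = { dimux, add, alu_sr[0], alu_sr[1] };
endmodule
```

Clocking in 0x78, 0x56 and 0x34 and then presenting 0x12 on `dimux` gives `ab` = 0x12345678 in this model.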

Author
Owner

Actually, before we do that, we need to do the vectors so that we can even reset the chip.
We start in state BRK0:

if( reset )
state <= BRK0;

This sets the address to the current stack location, which is still only 16 bits wide, so we can hardcode the upper 16 bits to 0.
JSR0,
BRK0: DO = PCH;
JSR1,
BRK1: DO = PCL;

In BRK0 and BRK1, as well as JSR0 and JSR1, we push the current PC to the stack. We need to add 2 more states so that we can write all 32 bits instead of just 16.

BRK2: DO = (IRQ | NMI_edge) ? (P & 8'b1110_1111) : P;

In BRK2 we write the processor status register, so we can just move that write back by the two added cycles.
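
The resulting push sequence could look roughly like the sketch below. Everything named here is hypothetical: the module `brk_push_mux`, the `step` counter (the real core uses named states instead), and the choice to push the most significant PC byte first, which simply extends the existing PCH-then-PCL order. The stack page value 8'h01 is the standard 6502 stack page.

```verilog
// Hypothetical sketch of the 32 bit BRK/JSR push sequence.
module brk_push_mux (
    input  wire [31:0] pc,
    input  wire [7:0]  p,     // status byte, already masked as in BRK2
    input  wire [7:0]  add,   // ALU output holding the stack pointer
    input  wire [2:0]  step,  // 0..3 = PC bytes, 4 = status (BRK only)
    output reg  [7:0]  dout,  // plays the role of DO
    output wire [31:0] ab     // plays the role of AB
);
    localparam STACKPAGE = 8'h01;

    always @* begin
        case (step)
            3'd0:    dout = pc[31:24];  // new state
            3'd1:    dout = pc[23:16];  // new state
            3'd2:    dout = pc[15:8];   // PCH in the original
            3'd3:    dout = pc[7:0];    // PCL in the original
            3'd4:    dout = p;          // status, two cycles later than before
            default: dout = 8'h00;
        endcase
    end

    // the stack stays in a 16 bit range, upper 16 address bits are zero
    assign ab = { 16'h0000, STACKPAGE, add };
endmodule
```

JSR would use only steps 0 to 3, since it does not push the status register.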

BRK1,
JSR1,
PULL1,
RTS1,
RTS2,
RTI1,
RTI2,
RTI3,
BRK2: AB = { STACKPAGE, ADD };

BRK1 and BRK2 advance the stack address; our new states will do the same.

BRK2: PC_temp = res ? 16'hfffc :
NMI_edge ? 16'hfffa : 16'hfffe;

Here is where the vectors are hardcoded. We will change these vectors to be at 0xFFFFFFF4, 0xFFFFFFF8, and 0xFFFFFFFC.
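
A sketch of the widened vector select, written as a standalone module for illustration. Which of the three new addresses belongs to which vector is not stated in this thread; the sketch assumes they keep the original order (NMI lowest, then reset, then IRQ/BRK), and the module name `vector_select` is invented.

```verilog
// Hypothetical sketch of the BRK2 vector select with 32 bit addresses.
// Assumed mapping, mirroring fffa/fffc/fffe:
//   NMI     -> 0xFFFFFFF4
//   reset   -> 0xFFFFFFF8
//   IRQ/BRK -> 0xFFFFFFFC
module vector_select (
    input  wire        res,       // reset in progress
    input  wire        nmi_edge,  // pending NMI
    output wire [31:0] pc_temp
);
    assign pc_temp = res      ? 32'hFFFF_FFF8 :
                     nmi_edge ? 32'hFFFF_FFF4 :
                                32'hFFFF_FFFC;
endmodule
```

Spacing the vectors 4 bytes apart leaves room for a full 32 bit target at each one.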

Author
Owner

The BRK changes are added in 9476c6a0dd.

Now that we have those, we need to update the JMP state, since it only waits 1 cycle for an address.

[image.png]

JMP does not really do anything. If we add 2 more JMP0-like states as well as the ALU shift register, this should be trivial.

Author
Owner

The JMP changes are added in 019b84f41d.

This covers the absolute jump.

Author
Owner

Let's tackle absolute for normal instructions next.

747438a9b6

This was pretty simple: we just copy the ABS0 state two more times.

Author
Owner

abs,x next.

Looks like we can just copy this state

ABSX1 = 6'd3, // ABS, X - fetch MSB and send to ALU (+Carry)

2 more times
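
Copying ABSX1 twice amounts to letting the carry ripple through two more bytes. The byte-serial adder below is only a model of that carry chain, not the real state machine: the module name `absx_serial` and the `first` strobe are invented, and the real core selects the ALU inputs per state rather than with a single flag.

```verilog
// Hypothetical model of the abs,x carry chain across 4 fetch cycles.
// Cycle 1 (first = 1): low address byte + X.
// Cycles 2..4 (first = 0): next address byte + carry from the previous cycle.
module absx_serial (
    input  wire       clk,
    input  wire       first,
    input  wire [7:0] dimux,  // address byte fetched this cycle
    input  wire [7:0] x,      // X register
    output reg  [7:0] add,    // registered sum byte
    output reg        co      // registered carry out
);
    // 9 bit result so that the carry out is kept
    wire [8:0] sum = first ? dimux + x : dimux + co;

    always @(posedge clk)
        {co, add} <= sum;
endmodule
```

For the address 0x1200FFF5 with X = 0x10, the four registered sum bytes come out as 0x05, 0x00, 0x01, 0x12, which is 0x12010005 read from the last byte to the first.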

Author
Owner

Added abs,x here
cb6cac1245

Author
Owner

abs,y should also be handled by the abs,x states. Add those to the test as well.

Author
Owner

Let's tackle absolute,x indirect.

These are the JMPIXn states.

Like abs,x, we can probably just copy this state twice:

JMPIX1 = 6'd52, // JMP (,X)- fetch MSB and send to ALU (+Carry)

Author
Owner

OK, that is done in dc339cb725.

Now for regular indirect. We can just copy JMPI0 twice.

Author
Owner

Added in b31d7490b2

Author
Owner

Let's do Indirect Indexed, since it is apparently the most common indirection mode.
According to the state listing, these are the steps it goes through:

INDY0 = 6'd18, // (ZP),Y - fetch ZP address, and send ZP to ALU (+1)
INDY1 = 6'd19, // (ZP),Y - fetch at ZP+1, and send LSB to ALU (+Y)
INDY2 = 6'd20, // (ZP),Y - fetch data, and send MSB to ALU (+Carry)
INDY3 = 6'd21, // (ZP),Y) - fetch data (if page boundary crossed)

How should we make this work with 32 bit addresses?

The first step loads the LSB and sends the ZP index to the ALU.
The second step reads the MSB and sends the LSB to the ALU to add Y.
The third step reads the data and sends the MSB to the ALU to add the carry.

So we need to do a combination of steps 2 and 3. Instead of loading data from the calculated address, we need to read the 3rd and 4th bytes from zero page and add the carry. Only then can we read from the computed address.
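
To make the target arithmetic concrete, here is a purely combinational sketch of what the extended (ZP),Y sequence has to compute. It deliberately ignores how the single 8 bit ALU would be scheduled across cycles. The module name `indy_effective_addr` is invented, and two things are assumptions rather than decisions made in this thread: that the pointer fetch wraps inside zero page, and that zero page sits at address 0x00000000.

```verilog
// Hypothetical sketch of the 32 bit (ZP),Y address arithmetic.
module indy_effective_addr (
    input  wire [7:0]  zp,              // zero page operand from the instruction
    input  wire [1:0]  idx,             // pointer byte to fetch, 0 = LSB
    input  wire [7:0]  b0, b1, b2, b3,  // pointer bytes read from zero page
    input  wire [7:0]  y,               // Y register
    output wire [31:0] fetch_ab,        // where pointer byte idx is read from
    output wire [31:0] ea               // effective address: pointer + Y
);
    // 8 bit add, so zp + idx wraps around inside zero page (assumption)
    wire [7:0] zp_off = zp + idx;

    assign fetch_ab = { 24'h000000, zp_off };

    // Y is zero extended; the carry ripples through all four pointer bytes
    assign ea = { b3, b2, b1, b0 } + y;
endmodule
```

For example, a pointer of 0x0001FFF0 with Y = 0x20 gives 0x00020010: the carry out of the low byte has to reach the third byte before the data read can start.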

Author
Owner

Hmm, that plan would not work: we need the ALU to be adding the offset, but this instruction also uses the ALU to generate the address.


Reference: bslathi19/verilog6502#1