embedded vs application processor #3
The way we have it right now is like a microcontroller system: it has embedded memory that it executes from and stores local data in, and it can access other memory with added latency. It's not meant to be the core of an SoC, for example.
One day I would like to have a high-performance, fully 32-bit core with caches and such, but that should be separate from what we have right now.
However, I want to make a core that can be used at the heart of an SoC right now, still using the 8-bit core. I can make a different wrapper that has caches, an MMU, and so on, and I can hopefully reuse those modules, or at least get practice writing them, when I create the high-performance core.
We need to create a simple MMU which has cache settings for each page. Page 0 contains the zero page, so it should probably always be cached. However, if we do multitasking later we may want to relocate the stack and zero page, so we don't want that hardcoded. Let's just leave it alone then. The highest page controls the core, so that should definitely always be non-cached.
Thinking about the future, when we want multiple cores, how will we handle needing several sets of these config registers? They will need to go in separate address spaces. That can be figured out when I get there, though.
So for now we just need to be able to read the one set of config registers, so just do that. By default everything can be non-cached, and we can enable caching once the processor is running, I guess.
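To make the per-page cache settings concrete, here is a rough behavioral sketch in Python. It is not the actual hardware design: the 16-bit address space, 256-byte pages, and the `PageCacheConfig` name are assumptions made for illustration. It only models the rules stated above: everything starts non-cached, software enables caching per page later, and the highest page can never be cached.

```python
PAGE_SIZE = 256               # assumption: 6502-style 256-byte pages
NUM_PAGES = 256               # assumption: 16-bit address space
CONTROL_PAGE = NUM_PAGES - 1  # highest page holds the core control registers


class PageCacheConfig:
    """Hypothetical per-page cacheability table (one bit per page)."""

    def __init__(self):
        # Everything starts non-cached; software turns caching on later.
        self.cacheable = [False] * NUM_PAGES

    def set_cacheable(self, page, enabled):
        # The control page must always bypass the cache.
        if page == CONTROL_PAGE and enabled:
            raise ValueError("control page must stay non-cached")
        self.cacheable[page] = enabled

    def is_cacheable(self, addr):
        page = (addr // PAGE_SIZE) % NUM_PAGES
        return self.cacheable[page]


cfg = PageCacheConfig()
cfg.set_cacheable(0x00, True)    # zero page, enabled at run time, not hardcoded
print(cfg.is_cacheable(0x0042))  # True
print(cfg.is_cacheable(0xFF00))  # False
```

Keeping the zero page setting as an ordinary table entry, rather than a special case, is what leaves room to relocate it later.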
It is very important that the lowest-level cache has 0 cycles of added latency (i.e. 1-cycle access). The 6502 relies on memory access being extremely fast because of its lack of registers.
For cache hierarchy, we don't need to have separate icache and dcache, since the pipeline does not differentiate between them.
For the L1 cache, 4 kB seems like a reasonable size, with a 64-byte (512-bit) cache line.
4096 bytes with a 64-byte cache line is 64 entries. For associativity, let's start with 4-way set associative, but it may need to go down to direct mapped. The 6502 accesses memory on every cycle, so the cache cannot stall the CPU at all.
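The arithmetic behind those numbers, as a quick Python check. The `geometry` helper is just an illustrative name; it takes the 4096-byte size and 64-byte line from above and shows how the set count and index width change between 4-way and direct mapped.

```python
CACHE_BYTES = 4096
LINE_BYTES = 64


def geometry(ways):
    """Return (lines, sets, offset_bits, index_bits) for a given associativity."""
    lines = CACHE_BYTES // LINE_BYTES          # total cache lines
    sets = lines // ways                       # lines are grouped into sets
    offset_bits = LINE_BYTES.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the set count
    return lines, sets, offset_bits, index_bits


print(geometry(4))  # (64, 16, 6, 4)  4-way set associative
print(geometry(1))  # (64, 64, 6, 6)  direct mapped
```

Either way there are 64 lines and a 6-bit byte offset; 4-way gives 16 sets (4 index bits) and direct mapped gives 64 sets (6 index bits).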
Actually, let's just make the L1 direct mapped. That's the easiest way to ensure the CPU can run fast enough, even if there is some thrashing.
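For reference, a small Python model of what direct mapped means for lookups and where the thrashing comes from. This is a behavioral sketch only; `DirectMappedCache` is a made-up name, it tracks tags without storing data, and it assumes the 4 kB / 64-byte-line geometry from above.

```python
LINE_BYTES = 64
NUM_LINES = 64  # 4096 bytes / 64-byte lines


class DirectMappedCache:
    """Tag-only model: each address maps to exactly one line."""

    def __init__(self):
        self.tags = [None] * NUM_LINES
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        index = (addr // LINE_BYTES) % NUM_LINES  # which line the address maps to
        tag = addr // (LINE_BYTES * NUM_LINES)    # remaining upper address bits
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag                    # evict whatever was there
        self.misses += 1
        return False


cache = DirectMappedCache()
# 0x0000 and 0x1000 are 4096 bytes apart, so both map to line 0
# with different tags and keep evicting each other.
for _ in range(4):
    cache.access(0x0000)
    cache.access(0x1000)
print(cache.hits, cache.misses)  # 0 8
```

The upside is that a lookup is a single index plus one tag compare, with no way selection, which is what makes the 1-cycle access easier to hit.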