huge register file, revisited
Author |
Message |
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1806
|
This is fresh from the half-bakery.
I was thinking about how a register file might conveniently be implemented, and looked up typical SRAMs to find that they cost only a few pounds.
So, you could buy a 128k byte SRAM and use it as a very small register file: good for chip count, at least. Just tie off a lot of address lines.
Why not use more of the chip? Usually, because the register index fits into an instruction, and so is kept very small.
We see now why some CPUs put the registers into main memory: set aside a few addresses at the bottom of memory, provide the zeros for the high bits of the address, and read/write main memory for registers.
Of course, this costs memory bandwidth, and so hurts performance.
Perhaps we can improve memory bandwidth by using multiple SRAMs with the same contents? When we write, we must write to all of them, but when we read, we can read them independently. We can, in effect, build a two-port RAM, or a three- or four-port RAM. Instruction reads, operand reads, register reads, memory reads.
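A minimal behavioural sketch of that replication idea, assuming nothing about timing or bus contention: every write goes to all copies, and each read port looks only at its own copy. The class name `ReplicatedRam` and its parameters are made up for illustration.

```python
class ReplicatedRam:
    """Model N identical single-port SRAMs acting as one write port
    plus N independent read ports (behavioural only, no timing)."""

    def __init__(self, size, read_ports):
        # One full copy of the contents per read port.
        self.copies = [[0] * size for _ in range(read_ports)]

    def write(self, addr, value):
        # A write must go to every copy so that they stay identical.
        for copy in self.copies:
            copy[addr] = value

    def read(self, port, addr):
        # Each read port only ever looks at its own copy, so different
        # ports can read different addresses in the same cycle.
        return self.copies[port][addr]


ram = ReplicatedRam(size=128 * 1024, read_ports=3)
ram.write(0x0010, 0xAB)
ram.write(0x1F00, 0xCD)
# Three ports fetch (possibly different) addresses "in parallel".
values = (ram.read(0, 0x0010), ram.read(1, 0x1F00), ram.read(2, 0x0010))
```

The cost is visible in the constructor: storage grows linearly with the number of read ports, which is exactly the "buy many chips" trade.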
If we interleave twice as many RAMs, we can even access odd and even addresses in parallel.
If we tile our memory space with RAMs, we can access high memory and low memory in parallel.
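As a rough sketch of the interleaving case, assuming two chips and low-order interleaving: the low address bit picks the chip and the remaining bits address within it. The helper names are hypothetical.

```python
def select_chip(addr, chips=2):
    """Low-order interleaving: the low address bits pick the chip,
    the remaining bits are the address within that chip."""
    return addr % chips, addr // chips


def can_access_in_parallel(addr_a, addr_b, chips=2):
    # Two accesses can proceed in the same cycle only if they
    # land on different chips.
    return select_chip(addr_a, chips)[0] != select_chip(addr_b, chips)[0]


even = select_chip(0x1000)  # even address goes to chip 0
odd = select_chip(0x1001)   # the next address goes to chip 1
```

Tiling high and low memory is the same calculation with the chip chosen by the high-order bits instead of the low-order ones.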
All we need to do is be comfortable in buying many SRAM chips, and leaving most of them unused.
|
Mon Dec 21, 2020 1:49 pm |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2215 Location: Canada
|
One issue with using SRAM is that it usually has a combined read/write port, meaning it takes an extra clock cycle to read the RAM and latch the output.
I like the 74LS612 memory mapper for a register file. It has 16 registers with a 12-bit read/write port and 12-bit read port. Already double ported.
With a large SRAM available, one could always add a latch for the high-order address lines and have banked registers: a separate register set for interrupts, for instance.
_________________Robert Finch http://www.finitron.ca
|
Tue Dec 22, 2020 9:32 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1806
|
Looks like a handy chip, but only available now as new-old-stock, I think.
Indeed, you're quite right, higher address bits of a large SRAM could be used to make multiple register banks, or possibly even register windows.
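The banking scheme comes down to address formation, which can be sketched as follows. This is only an illustration: the 4-bit register field, the bank count, and the names `sram_address` and `BankedRegisterFile` are all assumptions, not anything from a real design.

```python
REG_INDEX_BITS = 4  # assumed: 16 registers visible per bank


def sram_address(bank, reg):
    """Bank latch supplies the high address bits, the register index
    from the instruction supplies the low bits."""
    if not 0 <= reg < (1 << REG_INDEX_BITS):
        raise ValueError("register index does not fit the instruction field")
    return (bank << REG_INDEX_BITS) | reg


class BankedRegisterFile:
    def __init__(self, banks):
        self.sram = [0] * (banks << REG_INDEX_BITS)
        self.bank = 0  # models the latch on the high-order address lines

    def write(self, reg, value):
        self.sram[sram_address(self.bank, reg)] = value

    def read(self, reg):
        return self.sram[sram_address(self.bank, reg)]


regs = BankedRegisterFile(banks=8)
regs.write(3, 0x1234)        # normal code runs in bank 0
regs.bank = 1                # e.g. interrupt entry selects another set
regs.write(3, 0xBEEF)
in_interrupt = regs.read(3)
regs.bank = 0                # return from interrupt
after_return = regs.read(3)  # the bank 0 value was never disturbed
```

Register windows would be a variation where the bank value is added to, rather than concatenated with, the register index, so that neighbouring windows overlap.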
|
Tue Dec 22, 2020 9:36 am |
|
|
oldben
Joined: Mon Oct 07, 2019 2:41 am Posts: 675
|
Stacks for FORTH also come to mind with a BIG RAM.
Ben.
PS: A lot of twos this Tuesday, for the date.
|
Tue Dec 22, 2020 2:04 pm |
|
|
robinsonb5
Joined: Wed Nov 20, 2019 12:56 pm Posts: 92
|
This topic makes me think of the wastefulness in FPGA designs of devoting an entire block RAM (sometimes two, to get enough byte-enables) to a small register file.
oldben wrote:
> Stacks for FORTH also come to mind with a BIG ram. Ben.
The ZPU is an interesting CPU design that uses a stack instead of a register file.
Quote:
> PS: A lot of two's this tuesday, for the date.
Too bad you didn't post two minutes sooner.
|
Tue Dec 22, 2020 4:07 pm |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1806
|
Yes, good point about FPGAs: I don't like to see Block RAMs 'wasted' for microcode, but actually it's not waste, it's just use. (Although, on those FPGAs with just 64k on chip, it does feel a little awkward in an 8 bit design to be left with 62k...)
I think there's some point in here about feeling comfortable with under-using a plentiful resource. If your design needs just 8k of ROM, it's no real problem to use a 32k ROM - or even more - there's no need to cook up a fancy way to make all of it accessible. Likewise, if you want 56k of RAM, you don't need to make it out of small pieces, just buy a convenient size and make use of as much or as little as makes sense.
|
Tue Dec 22, 2020 4:15 pm |
|
|
robinsonb5
Joined: Wed Nov 20, 2019 12:56 pm Posts: 92
|
BigEd wrote:
> I think there's some point in here about feeling comfortable with under-using a plentiful resource. If your design needs just 8k of ROM, it's no real problem to use a 32k ROM - or even more - there's no need to cook up a fancy way to make all of it accessible. Likewise, if you want 56k of RAM, you don't need to make it out of small pieces, just buy a convenient size and make use of as much or as little as makes sense.
Indeed - it sometimes feels inelegant and taking-the-easy-way-out to do that, even if it's pragmatically the correct path.
I guess a lot depends on what the design goal is. If it's to get the job done (i.e. you're being paid to solve a problem in the most time-effective manner possible) then yes, throw into the mix whatever over-provisioned resource makes the job easier.
If it's a personal project where elegance matters and you're trying to squeeze every last bit of performance out of deliberately limited resources in some kind of technological artistry, a sort of homage to the genius designs of the 70s and 80s, then leaving resources unused would seem somehow vulgar and counter to the spirit of the project.
|
Tue Dec 22, 2020 6:14 pm |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2215 Location: Canada
|
There is less wastage of memory if a physically unified register file is used for GP / FP / posit / decimal floating-point and system registers. The register files can still be logically separate while sharing just one memory. This may not be the best for performance.
Another way to use all that block RAM for a register file is to offer vector registers in the design.
Xilinx/AMD FPGAs allow smaller register files to be built using LUT RAM. Up to 64 elements can be had from a single LUT. The issue that arises then is the trade-off between using an even larger LUT RAM (e.g. 256 entries) and just using block RAM, which would allow thousands of entries.
_________________Robert Finch http://www.finitron.ca
|
Wed Dec 23, 2020 5:19 am |
|
|
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1806
|
> less wastage
Ah, I think you are falling into the very trap I'm trying to avoid. We no longer have to account for each bit of storage as needing a number of components to implement. The unit of purchase is the chip, and when we buy a chip, we are free to use as much or as little as we like.
(Of course, there are always elements of aesthetics: efficiency, elegance. But I think this might be an example of premature optimisation.)
|
Wed Dec 23, 2020 9:16 am |
|
|
drogon
Joined: Sun Oct 14, 2018 5:05 pm Posts: 62
|
robinsonb5 wrote:
> oldben wrote:
> > Stacks for FORTH also come to mind with a BIG ram. Ben.
> The ZPU is an interesting CPU design that uses a stack instead of a register file.
The BCPL Cintcode VM has a 2-deep stack register set plus a 3rd register usable via swap A or B into C instructions. For a real CPU, the Inmos Transputer is very similar with a 3-deep stack.
-Gordon
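For readers who have not met a register stack, here is a simplified model of a 3-deep evaluation stack with registers A, B and C. It is a sketch of the general idea only, not a faithful model of Cintcode or of the Transputer; the class and method names are invented.

```python
class ThreeDeepStack:
    """Simplified 3-register evaluation stack (registers A, B, C)."""

    def __init__(self):
        self.a = self.b = self.c = 0

    def push(self, value):
        # Values shift down; whatever was in C falls off the end.
        self.a, self.b, self.c = value, self.a, self.b

    def pop(self):
        value = self.a
        # Values shift up; C keeps its old contents in this model.
        self.a, self.b = self.b, self.c
        return value

    def add(self):
        # A binary operation consumes A and B and leaves the result in A.
        rhs = self.pop()
        lhs = self.pop()
        self.push(lhs + rhs)


s = ThreeDeepStack()
s.push(2)
s.push(3)
s.add()
result = s.pop()
```

With only three slots, a fourth push silently loses the oldest value, so a compiler for such a machine has to spill deeper expressions to memory.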
|
Wed Dec 23, 2020 8:22 pm |
|
|
rj45
Joined: Sat Nov 28, 2020 4:18 pm Posts: 123
|
For rj16 I plan to use dual-port SRAM for the register file. Here's a 3.3V dual-port memory, but there are 5V ones in DIP packages still available as well. There's even an 18-bit dual-port SRAM with 25ns access time.
You can't write and read the same address at the same time, but if you operate the memory on the half cycle by inverting the clock, you can write in one half and read in the other half. You also need (at least) two of them if you want two read ports (just tie the write ports together so they have the same contents), so it gets pricey. But it saves you having to buy and wire up a ton of individual chips for a register file. And you can get multiple register banks for free by tying the high-order address lines to a register.
|
Sun Dec 27, 2020 8:57 pm |
|