


 [ 204 posts ]  Go to page Previous  1 ... 8, 9, 10, 11, 12, 13, 14  Next
 Qupls (Q+) 
Author Message

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Some more work on the QMSI interrupt controller. Added an internal vector table. Each entry in the table is 128 bits, in almost the same format as MSI-X, although the hardware itself implements only about 80 of the bits. The vector table contains 2048 entries, with 512 entries available for each operating mode. However, as there is a pointer to the vector table for each operating mode, all the modes could point to the same base address, which would make all 2048 vectors available to every mode.
The target CPU core for an interrupt is stored in its vector. The interrupt controller supplies the vector on a dedicated bus to the CPU cores. One interrupt controller can provide service for multiple CPU cores.
The idea of supplying an instruction, à la 8080 interrupt handling, is being toyed with. It would consume a bit in the vector to indicate whether the vector holds an address or an instruction.
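
A minimal sketch of the per-mode table lookup described above, written in Python for illustration only. The sizes (2048 entries, 512 per mode, 128-bit entries) come from the post; the function names, the mode numbering, and the choice to express each base pointer as an entry index are assumptions, not the QMSI design.

```python
# Sketch of per-mode vector table indexing. Names and mode numbering
# are hypothetical; only the sizes come from the post.
TABLE_ENTRIES = 2048      # total entries in the internal vector table
ENTRIES_PER_MODE = 512    # entries available to each operating mode
ENTRY_BYTES = 16          # each entry is 128 bits
NUM_MODES = TABLE_ENTRIES // ENTRIES_PER_MODE  # implies 4 operating modes


def entry_index(bases, mode, vector):
    """Return the table index of `vector` as seen from `mode`.

    `bases` holds one table pointer per operating mode, expressed here
    as an entry index (an assumption; it could equally be a byte address).
    """
    index = bases[mode] + vector
    if not 0 <= index < TABLE_ENTRIES:
        raise ValueError("vector falls outside the table")
    return index


def entry_byte_offset(bases, mode, vector):
    """Byte offset of the 128-bit entry within the table."""
    return entry_index(bases, mode, vector) * ENTRY_BYTES


# Partitioned layout: each mode gets its own 512-entry window.
partitioned = [m * ENTRIES_PER_MODE for m in range(NUM_MODES)]
# Shared layout: every mode points at entry 0, so all 2048 are reachable.
shared = [0] * NUM_MODES
```

With `partitioned`, mode 2 and vector 5 land on entry 1029; with `shared`, any mode can reach entries 0 through 2047.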

_________________
Robert Finch http://www.finitron.ca


Sat Mar 15, 2025 3:23 am WWW

Joined: Mon Oct 07, 2019 2:41 am
Posts: 768
robfinch wrote:
Some more work on the QMSI interrupt controller. Added an internal vector table. Each entry in the table is 128 bits, in almost the same format as MSI-X, although the hardware itself implements only about 80 of the bits. The vector table contains 2048 entries, with 512 entries available for each operating mode. However, as there is a pointer to the vector table for each operating mode, all the modes could point to the same base address, which would make all 2048 vectors available to every mode.
The target CPU core for an interrupt is stored in its vector. The interrupt controller supplies the vector on a dedicated bus to the CPU cores. One interrupt controller can provide service for multiple CPU cores.
The idea of supplying an instruction, à la 8080 interrupt handling, is being toyed with. It would consume a bit in the vector to indicate whether the vector holds an address or an instruction.


Have you considered a dual computing system like the CDC 6600?


Cray took another approach. In the 1960s, CPUs generally ran slower than the main memory to which they were attached. For instance, a processor might take 15 cycles to multiply two numbers, while each memory access took only one or two cycles. This meant there was a significant time where the main memory was idle. It was this idle time that the 6600 exploited.

The CDC 6600 used a simplified central processor (CP) that was designed to run mathematical and logic operations as rapidly as possible, which demanded it be built as small as possible to reduce the length of wiring and the associated signalling delays. This led to the machine's (typically) cross-shaped main chassis with the circuit boards for the CPU arranged close to the center, and resulted in a much smaller CPU. Combined with the faster switching speeds of the silicon transistors, the new CPU ran at 10 MHz (100 ns cycle time), about ten times faster than other machines on the market. In addition to the clock being faster, the simple processor executed instructions in fewer clock cycles; for instance, the CPU could complete a multiplication in ten cycles.

Supporting the CPU were ten 12-bit 4 KiB peripheral processors (PPs), each with access to a common pool of 12 input/output (I/O) channels, that handled input and output, as well as controlling what data were sent into central memory for processing by the CP. The PPs were designed to access memory during the times when the CPU was busy performing operations. This allowed them to perform input/output essentially for free in terms of central processing time, keeping the CPU busy as much as possible.


The entire 6600 machine contained approximately 400,000 transistors.


Sat Mar 15, 2025 6:42 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Quote:
Have you considered a dual computing system like the CDC 6600.
...
The entire 6600 machine contained approximately 400,000 transistors.

Are you saying I should try tackling an emulation of the CDC6600?

I have read a little on the CDC6600. A great machine for its time. It is a bit oddball in using 60-bit arithmetic. I tried to figure out how the scoreboard worked, then I found a sample scoreboard in the Nyuzi machine. I did a quick web search to see if anyone has tried implementing the CDC6600 in an FPGA yet. A slight update would be to make it 64-bit. I wonder what the IPC was.


At 400,000 individual transistors, that's a lot of transistors, but today's machines have a lot more.

*****
Been doing some more work on interrupts.

_________________
Robert Finch http://www.finitron.ca


Sun Mar 16, 2025 3:16 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Updates
Updated the interval timer component. It was using an older version of the FTA bus config block, now upgraded to ddbb64. It was also using wired interrupts, which were changed to MSI interrupts.

Now updating the ddbbnn components to integrate the IRQ management into the ddbb. This should provide a uniform way of managing device IRQs.

Bug Fixes
In the CPU, if an interrupt occurred that was masked off, the correct path of execution was not being restored.

_________________
Robert Finch http://www.finitron.ca


Wed Mar 19, 2025 4:05 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Adjusted the pipeline for variable length instructions. This required updating the PC increment and the instruction extract logic.

Changing to variable length instructions increased the size of the core because the instructions have a maximum length of 96 bits instead of 64. This made the pipeline wider.

Coded single step mode. This had not been coded yet. In the process an additional form was added to the RTE instruction. RTE now by default disables single step mode, but it has an option to restore single step mode to the status it was before the interrupt. This form of the instruction (RTE_SSR) would be used in a debugger to continue single-stepping. This kind of thing is needed because it is difficult to manipulate the status register which is stored in an internal stack.
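
A toy Python model of the two RTE forms may make the difference concrete. It is a sketch under stated assumptions, not the Q+ RTL: the class and method names are hypothetical, and clearing single-step on interrupt entry is assumed rather than stated in the post.

```python
# Toy model of single-step handling across an interrupt.
# Class, field, and method names are hypothetical.
class Core:
    def __init__(self):
        self.single_step = False
        self.sr_stack = []  # stands in for the internal status stack

    def take_interrupt(self):
        # Save the current single-step state on the internal stack.
        # Assumption: the handler itself runs with single-step off.
        self.sr_stack.append(self.single_step)
        self.single_step = False

    def rte(self, restore_single_step=False):
        # Plain RTE leaves single-step disabled. The RTE_SSR form
        # (restore_single_step=True) puts it back to its state before
        # the interrupt, so a debugger can keep stepping.
        saved = self.sr_stack.pop()
        self.single_step = saved if restore_single_step else False
```

The point of the second form is that software never has to reach into the internal stack to edit the saved status; the instruction variant selects the behaviour instead.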

_________________
Robert Finch http://www.finitron.ca


Thu Mar 20, 2025 4:43 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Built the system out to implementation. According to the tools it meets 40 MHz timing. I was worried that moving to a variable length instruction would toast the timing of the fetch stage.

Updated the micro-code reset routine to account for new instruction formats.

Been thinking of moving the micro-code address into the upper 12 bits of the PC register. That would restrict the PC to 52 bits but make it easier to manage for interrupts etc. It currently has its own dedicated counter.
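
As a rough illustration of that packing, here is a Python sketch assuming a 64-bit PC register (12 + 52 bits). The helper names `pack` and `unpack` are hypothetical.

```python
# Sketch: micro-code address in the upper 12 bits of a 64-bit PC.
# The 64-bit register width is an assumption implied by 12 + 52.
PC_BITS = 52
UADDR_BITS = 12
PC_MASK = (1 << PC_BITS) - 1
UADDR_MASK = (1 << UADDR_BITS) - 1


def pack(uaddr, pc):
    """Combine a micro-code address and a 52-bit PC into one word."""
    return ((uaddr & UADDR_MASK) << PC_BITS) | (pc & PC_MASK)


def unpack(word):
    """Split a packed word back into (micro-code address, PC)."""
    return (word >> PC_BITS) & UADDR_MASK, word & PC_MASK
```

Saving or restoring the single packed word on an interrupt then carries both values at once, which is the convenience being weighed against the smaller PC.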

Spent a ton of time updating the assembler and compiler.

There’s a bug in the compiler: it spat out a PUSH instruction in the middle of a loop for some unknown reason.

Did some compiles and assembly to get an idea of code density. Unfortunately, my home-grown compiler does not do the Q+ core justice. Tested against an m68k compiler, the code for the m68k was about 2.5 times denser.

Started working on yet another architecture. This time going with a more streamlined core. 32-bit fixed length instructions. Small constants are embedded in the instruction, large constants are referenced from the cache line.

_________________
Robert Finch http://www.finitron.ca


Fri Mar 21, 2025 4:07 am WWW

Joined: Mon Oct 07, 2019 2:41 am
Posts: 768
robfinch wrote:
Built the system out to implementation. According to the tools it meets 40 MHz timing. I was worried that moving to a variable length instruction would toast the timing of the fetch stage.

Updated the micro-code reset routine to account for new instruction formats.

Been thinking of moving the micro-code address into the upper 12 bits of the PC register. That would restrict the PC to 52 bits but make it easier to manage for interrupts etc. It currently has its own dedicated counter.

Spent a ton of time updating the assembler and compiler.

There’s a bug in the compiler: it spat out a PUSH instruction in the middle of a loop for some unknown reason.

Did some compiles and assembly to get an idea of code density. Unfortunately, my home-grown compiler does not do the Q+ core justice. Tested against an m68k compiler, the code for the m68k was about 2.5 times denser.

Started working on yet another architecture. This time going with a more streamlined core. 32-bit fixed length instructions. Small constants are embedded in the instruction, large constants are referenced from the cache line.


How small is a small constant?
The 6809 had a lot of encoding modes for indexing.


Fri Mar 21, 2025 1:07 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Quote:
How small is a small constant?
The 6809 had a lot of encoding modes for indexing.

Yes, the 6809 was cool that way. They even left a couple of holes in the encoding.

For the new architecture (LB650) I am working on now, a small constant is 10 bits.
I spent today working on the LB650. Almost got a compiler modified to output code. A few glitches with it yet.
I coded part of the instruction decoder to make sure the constant encoding scheme could actually work in hardware. I think it will be okay.
Writing the assembler for the LB650 is going to be a challenge.
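
A small Python sketch of the decision an assembler would have to make for each constant: embed it in the instruction or place it elsewhere. The 10-bit width is from the post; whether the field is signed is not stated, so both cases are shown, and the function name is hypothetical.

```python
# Sketch: does a constant fit the 10-bit embedded field?
# Signedness of the field is an assumption; both cases are covered.
SMALL_BITS = 10


def fits_small(value, signed=True):
    """True if `value` can be encoded in the 10-bit immediate field."""
    if signed:
        half = 1 << (SMALL_BITS - 1)
        return -half <= value < half
    return 0 <= value < (1 << SMALL_BITS)
```

Under the signed assumption the embedded range is -512 to 511; anything outside it would be treated as a large constant.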

_________________
Robert Finch http://www.finitron.ca


Sat Mar 22, 2025 3:59 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Worked on the compiler / assembler today for LB650. Had a lot of luck and things are almost working.

_________________
Robert Finch http://www.finitron.ca


Sun Mar 23, 2025 2:44 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Went back and worked on the arpl compiler and the vasm assembler for Qupls. Got the assembler outputting a bit smaller code.

_________________
Robert Finch http://www.finitron.ca


Mon Mar 24, 2025 2:51 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Changing the branch architecture around to a separate compare then branch approach. Branches will be able to jump to the address contained in a code address register. This is more similar to the Thor approach. At the same time the branch may be linked to perform a subroutine call operation. So, there is now conditional subroutine call and return. A compare then branch approach allows for more bits in the branch instruction. There is also a larger branch displacement allowed.

Compiled / assembled a couple of programs. It looks like the code is much closer in density to the m68k. So, I think it is acceptable.

_________________
Robert Finch http://www.finitron.ca


Tue Mar 25, 2025 3:07 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Started a new project today code named StarkCPU. It is similar to the PowerPC. Just doing the documentation so far. I found I was rewriting too much of the Qupls core; I was basically writing a new core.

Fixed 32-bit instructions. Large constants are located in words after the instruction.
32 GPRs, 32 FPRs, eight 8-bit CRs, 8 branch registers (96 logical registers total), to be backed by 256 physical registers.

Going to a 32-bit CPU to begin with, maybe expand to 64-bits later.

I cut out a lot of the Qupls instructions, to be added later.

_________________
Robert Finch http://www.finitron.ca


Thu Mar 27, 2025 2:45 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
A lot of work on the StarkCPU. Updated the arpl compiler to generate code for it. It still needs more updates but I can get a sense of how it is working. Copied the PowerPC assembler and modified it to output code for the CPU. A lot of changes to it. At the moment development is stalled on an ‘illegal operand type’ error message appearing when assembling code. I have been going almost opcode by opcode, making changes and testing.

Changed the design for the CPU to make use of cache-line constants instead of constants inline with code. The difference is that it then allows a constant postfix instruction to be easily used because instructions are all a fixed size then. The postfix can be used to override registers.

_________________
Robert Finch http://www.finitron.ca


Fri Mar 28, 2025 9:05 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Designing an MMU for my small system.

The challenge is to use only about 64 BRAMs max for MMU support. I was going to use a hash page table, but now I am thinking just a large direct mapping table would be better / simpler, KISS. Going to eliminate the TLB middleman for this system. Single clock access through the mapping table should be okay. Designed a 27-bit PTE. 1 MB pages. 4096 PTEs for 32-bit addressing – 1GB DRAM to map. 27-bit PTEs map nicely to 9-bit wide BRAMs.
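
The arithmetic behind those numbers, and the single-lookup translation, can be sketched in Python as follows. The sizes come from the post; modelling a PTE as nothing more than a physical page number is a simplification made for this example, since the actual 27-bit field layout is not given.

```python
# Sketch of a direct mapping table with 1 MB pages.
# The PTE is modelled as just a physical page number (a simplification).
PAGE_SHIFT = 20                          # 1 MB pages
NUM_PTES = 1 << (32 - PAGE_SHIFT)        # entries for 32-bit addressing
DRAM_PAGES = (1 << 30) >> PAGE_SHIFT     # physical pages in 1 GB of DRAM
PTE_BITS = 27
BRAM_WIDTH = 9                           # 27-bit PTE = three 9-bit lanes


def translate(table, vaddr):
    """Translate a 32-bit virtual address with one table lookup, no TLB."""
    vpn = vaddr >> PAGE_SHIFT                 # virtual page number
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)  # offset within the page
    ppn = table[vpn]                          # single mapping-table access
    return (ppn << PAGE_SHIFT) | offset
```

Here `NUM_PTES` works out to 4096 and `DRAM_PAGES` to 1024, so a 10-bit page number covers the DRAM and the rest of the 27 bits are left for whatever attribute bits the design defines.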

One night many years ago, I had a dream that I was a technical guru designing a CPU for Stark Enterprises. I was Tony Stark’s cousin, working in an orbital research lab. Nuclear reactors and cryogenic chambers; strange dream. I did not know what to call this project, then thought of the dream, hence the name StarkCPU.

_________________
Robert Finch http://www.finitron.ca


Sat Mar 29, 2025 2:32 am WWW

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2307
Location: Canada
Reconsidered the startup of yet another CPU, and decided to call this version #3 of Qupls instead.
Did a lot of documentation today, placing Qupls3 docs in Github under the Qupls repository.

I changed the mapping table to have 49152 entries, using a page size of 16kB. The PTE had to expand to 32-bits because the page number was wider.
The first 768MB of the memory space are mapped. The remaining 256MB are unmapped and reserved for the kernel.
Bounds registers were also added to limit programs to specific memory ranges. There are four sets of bounds: for code, data, stack, and an extra data one.
Decided to go with upper/lower bounds rather than base and limit. The program loader will have to be able to relocate code and data.
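
A short Python sketch of the revised numbers and of an upper/lower bounds check. The table size, page size, and 768 MB figure come from the post; the function names and the choice of an inclusive lower bound with an exclusive upper bound are assumptions.

```python
# Sketch: mapped region size and an upper/lower bounds check.
# Function names and the inclusive/exclusive convention are assumptions.
PAGE_BYTES = 16 * 1024                    # 16 kB pages
NUM_ENTRIES = 49152                       # mapping table entries
MAPPED_BYTES = NUM_ENTRIES * PAGE_BYTES   # size of the mapped region


def is_mapped(addr):
    """True if `addr` goes through the mapping table."""
    return 0 <= addr < MAPPED_BYTES


def in_bounds(addr, lower, upper):
    """Upper/lower check: the address is compared directly, not offset."""
    return lower <= addr < upper
```

`MAPPED_BYTES` comes to exactly 768 MB. Because the bounds are compared against the address as issued, with no base added, a program's addresses have to match the range it was given, which is why the loader needs to relocate code and data.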

_________________
Robert Finch http://www.finitron.ca


Sun Mar 30, 2025 3:09 am WWW
