Author |
Message |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
The pipeline stages are now all separate modules.
Wrote another version of the response buffers for the FTA bus. The previous version time-multiplexed the inputs into a single fifo using a 5x clock. The newer version has a separate fifo for each input and selects from the fifo outputs in round-robin fashion, using only a 1x clock. The time-multiplexed version appeared to lose inputs sometimes, so there is a logic issue in it somewhere.
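The per-input fifo arrangement with round-robin selection can be sketched as a small behavioral model. This is only an illustration in Python, not the actual HDL; the class and method names are made up for the example.

```python
from collections import deque

class RoundRobinResponseBuffer:
    """Behavioral model: one fifo per input, round-robin select at the outputs."""

    def __init__(self, num_inputs):
        self.fifos = [deque() for _ in range(num_inputs)]
        self.next_port = 0  # port that gets first chance on the next select

    def push(self, port, response):
        # Each input writes only to its own fifo, so inputs never contend.
        self.fifos[port].append(response)

    def pop(self):
        """Return (port, response) from the next non-empty fifo, or None."""
        n = len(self.fifos)
        for offset in range(n):
            port = (self.next_port + offset) % n
            if self.fifos[port]:
                # Advance past the winner so it has lowest priority next time.
                self.next_port = (port + 1) % n
                return port, self.fifos[port].popleft()
        return None
```

Because each input owns its fifo, a busy input cannot crowd out the others; it only gets served again after the selector has gone around the other ports.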
For the rf6847, the video mode changed to WXGA 1366x768, which corresponds to the TV in use. New formats are:
Text: 40x16 or 40x32 chars
Graphics: 80x64, 160x64, 160x96, 160x192, 320x192
Or: 80x128, 160x128, 160x192, 160x384, 320x384
_________________Robert Finch http://www.finitron.ca
|
Sun Oct 20, 2024 4:14 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Having some luck getting things to work. Closer but still not perfect. The blinkin lights routine loops up to 52 times now before crapping out. It was looping 36 times then running out of rename registers, but I fixed that. Now it appears to have something to do with branches and register value propagation.
Slightly changed how the register map works. It used to use a RAM directly, referencing the RAM inputs and outputs. Now it goes through an intermediate current value register. A new checkpoint copies the current value register to the RAM, and a restore operation does the opposite, copying the RAM to the current value register. The advantage of doing this seems to be a slightly smaller circuit size and perhaps better timing. Most of the ports are on the current value register instead of being connected to the RAM.
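A rough behavioral sketch of that arrangement, in Python for illustration only. The names (RegisterMap, checkpoint, restore) and the sizes are assumptions for the example, not taken from the Qupls source.

```python
class RegisterMap:
    """Model of a rename map with a live 'current value' register and checkpoint RAM."""

    def __init__(self, num_arch_regs, num_checkpoints):
        # Current value register: the live architectural -> physical mapping.
        # All lookups and renames go here, not to the RAM.
        self.current = list(range(num_arch_regs))
        # Checkpoint RAM: one saved copy of the mapping per checkpoint slot.
        self.ram = [None] * num_checkpoints

    def lookup(self, arch_reg):
        return self.current[arch_reg]

    def rename(self, arch_reg, phys_reg):
        self.current[arch_reg] = phys_reg

    def checkpoint(self, slot):
        # New checkpoint: copy current value register -> RAM.
        self.ram[slot] = list(self.current)

    def restore(self, slot):
        # Restore: copy RAM -> current value register.
        self.current = list(self.ram[slot])
```

The RAM is only touched on checkpoint and restore, which matches the idea that most ports sit on the current value register.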
Just building the system now to see how the plethora of changes has impacted the size / timing.
_________________Robert Finch http://www.finitron.ca
|
Mon Oct 21, 2024 4:09 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Worked on the video frame buffer today. Broke it up into smaller modules. Hoping to be able to get data from the DRAM and display it. The frame buffer is one device accessing the DRAM. The DRAM controller was updated and whether it works or not is not known.
Also created a stack CPU as a second CPU in the system. Looking for something simpler than Qupls OoO so I can test the system without needing the huge CPU. The stack CPU was derived from the J1, widened to 64 bits, with several modifications to the instruction set. It takes about 1500 LUTs. No software for it yet.
_________________Robert Finch http://www.finitron.ca
|
Thu Oct 24, 2024 5:35 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Spent a chunk of time getting the memory controller to work. For testing, the response from the memory was wired to the text display memory. After several hours of no-go testing, decided that perhaps connecting the chip select line for the memory would help. I ported the system from a different FPGA board where the chip select was externally tied active, and did not notice the difference until I went through the pinouts again pin by pin.
I do not know what I was thinking when I wrote it, but the multi-port memory controller had a flaw. It multiplexed the inputs into a single fifo, then processed the output of the fifo. It should have had a separate input fifo for each channel, then multiplexed at the outputs of the fifos. This allows a channel to have a different clock frequency than the MPMC. The other way around takes less hardware, but is also possibly less reliable, as the channels need to be synchronized to the MPMC clock, possibly needing latching as well.
Getting an awesome amount of bandwidth out of the memory system. As shown below, data for an entire scan-line at 800x600 by 16bpp is fetched during the horizontal sync interval. The output is a logic analyzer trace from running in the FPGA. About 9.5% of available bandwidth is used. Might space those 50 accesses out to one every 9 cycles so that the rest of the system gets a chance at memory access without having to wait. More tweaking to do.
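The scan-line numbers can be checked with a little arithmetic. The sketch below assumes the 50 accesses exactly cover one 800-pixel line at 16bpp; the per-access size that falls out is implied by those figures, not something stated about the actual data path.

```python
def scanline_fetch(width_pixels, bits_per_pixel, accesses):
    """Return (bytes per scan-line, bytes each access must deliver)."""
    bytes_per_line = width_pixels * bits_per_pixel // 8
    bytes_per_access = bytes_per_line / accesses
    return bytes_per_line, bytes_per_access

line_bytes, access_bytes = scanline_fetch(800, 16, 50)
print(line_bytes, access_bytes)  # 1600 32.0

# Spacing the 50 accesses to one every 9 cycles stretches the fetch
# over this many controller clocks.
spread_cycles = 50 * 9
print(spread_cycles)  # 450
```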
_________________Robert Finch http://www.finitron.ca
|
Fri Oct 25, 2024 3:57 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
More trouble with the MIG controller. Sometimes the MIG controller seems to be locked up. The reset button needs to be pressed more than once to get things working. The ready signal goes low and stays low after a while. The interface to MIG is fairly simple. Issue a read command, then wait for the data to come back. Internally MIG uses a fifo to hold the data, but that is invisible to the app. I suspect the fifo is locking up because it is full or empty. But there is no access to the fifo controls to reset it. Looking around for another DDR3 controller to try.
Found out the DRAM / frame buffer did not work as well as expected. The DRAM is set up to automatically fill a burst request by returning the data in successive responses. The controller expects to see just a single burst request. But the frame buffer was sending a new request for each element of the burst, which greatly confused things. The frame buffer now issues a single request asking for 50 consecutive pieces of data, which is enough for an entire scan-line, and the controller responds with the data.
_________________Robert Finch http://www.finitron.ca
|
Sat Oct 26, 2024 6:04 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Trouble with reset. The MIG controller was not functioning properly because of a bad reset. Getting a good reset pulse is more challenging than one would think. Using the vendor’s reset component, the reset input was not long enough to trigger the reset outputs. The issue was that pressing the reset button stopped the clock generator, so there were no clocks to feed the reset timing logic. The clock generator comes out of reset about three clocks after the reset button goes inactive. This made an effective reset pulse of only three clock cycles. The default requirement of the reset core is a reset input four clock cycles wide. So the system reset core was not outputting any resets.
The proper width of the reset pulse cannot be seen in the logic analyzer. This confused me for a little bit. Reset stops the clock generator, which means the logic analyzer does not have a clock to record things by. Since there is no signal record for the reset period, the reset pulse appears much shorter in the logic analyzer display than it actually is. I spent some time trying to figure out why reset was so short when created from a pushbutton. I ended up using two reset buttons for debugging: one to reset the entire system, and a second one to reset the system without resetting the clock generators.
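The width problem can be modelled in a few lines. This is a simplified sketch, not the vendor core's actual logic: it assumes the reset core only counts clock edges it actually sees with the reset input active, and fires once it has seen min_width of them in a row.

```python
def reset_fires(sampled_reset, min_width=4):
    """sampled_reset: reset input value at each clock edge the core actually sees."""
    run = 0
    for active in sampled_reset:
        run = run + 1 if active else 0
        if run >= min_width:
            return True
    return False

# The button may be held for a long time in wall-clock terms, but with the
# clock generator stopped the core only sees three edges with reset active.
print(reset_fires([1, 1, 1, 0, 0, 0]))  # False
# One more clocked edge with reset active is enough.
print(reset_fires([1, 1, 1, 1, 0, 0]))  # True
```

The same reasoning explains the logic analyzer display: it samples on the same clock, so time spent with the clock stopped never shows up as samples.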
_________________Robert Finch http://www.finitron.ca
|
Tue Oct 29, 2024 8:24 am |
|
 |
oldben
Joined: Mon Oct 07, 2019 2:41 am Posts: 768
|
Time for that 555 timer! POWER RESET. I found that the LS19P is a nice switch debounce chip if you need hardware. https://logiswitch.com/shop Note that unlike most debounce methods, this does not invert the logic out.
|
Tue Oct 29, 2024 6:51 pm |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
555 timer's a great chip. Calling it a night over the following nonsense. The memory controller is returning extra or missing pulses. In the picture below it returns the last data after 200 clocks! Definitely will not work for a frame-buffer memory. I may have to try to roll some of my own hardware for the DDR3 interface.
_________________Robert Finch http://www.finitron.ca
|
Wed Oct 30, 2024 8:16 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
The memory controller is almost working now. Changing the memory part number in the MIG setup helped. The MIG controller is not locking up now. But the video display shows data on only ½ of the screen. This has been traced to the burst request signal, which is occurring only once per scan-line. It is supposed to happen twice per scan-line. Two 25 element bursts are used to fetch data for 800 pixels. Only the first burst is working though. Something is amiss in the frame buffer component. The multi-port memory controller (MPMC11) is working at a blistering 225 MHz, ¼ the memory speed. The frame buffer is using a 100 MHz bus rate though. Only a small amount of bandwidth is required to support the display, which is good, as the system should be able to support multiple masters accessing the memory. The burst takes about 80 clock cycles, but that is at 225 MHz. The CPU will likely run at <50 MHz. Attachment: fb_hsburst.jpg
_________________Robert Finch http://www.finitron.ca
|
Thu Oct 31, 2024 6:16 am |
|
 |
oldben
Joined: Mon Oct 07, 2019 2:41 am Posts: 768
|
Do you have any CPU instructions for bit mapped display?
|
Thu Oct 31, 2024 6:37 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Quote: Do you have any CPU instructions for bit mapped display?
No. Generic instructions would be used to manipulate graphics. There is a separate graphics accelerator that does line, curve and triangle drawing.
Milestone: blinkin LEDs count up over 100 in SIM.
I was not able to figure out why two bursts did not work, so I converted them into a single burst, which does work. I also added a state machine to fill the frame buffer memory with a constant value. Write cycles appear to work now, as do streaming reads. Gained some confidence that the DRAM system will work.
_________________Robert Finch http://www.finitron.ca
|
Fri Nov 01, 2024 1:25 pm |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Latest hack: Store operations were sometimes causing the CPU to hang waiting for the store data value. I am not sure why, but the value was not always being transferred to the store buffer. I noticed, however, that the value always seemed to be correctly present in part of the debug display. The debug code stores the value in the reorder buffer, which is not normally done for anything except debug. So I cheated and used the debug code with the reorder buffer value to get the store argument for store operations. It seems to work.
Milestones: Running in the FPGA now causes a single LED to light up and stay on. This seems like a crash, but it is good because it means the CPU must be writing to the LED port; otherwise, all the LEDs are flashing. So, things are close to working, at least for the blinkin lights.
ToDo: add a stall signal from the memory controller to the CPU. Currently the memory controller expects that the CPU will not access memory too often, which keeps the fifos from overflowing. It may take up to about 40 to 50 DDR clocks for memory to be written, which equates to about 4 or 5 CPU clocks. It is more of an issue with CPU writes than reads, as the CPU does not normally wait for a response from a write cycle.
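To see what the stall signal buys, here is a toy cycle model in Python. It is only a sketch under made-up assumptions: the CPU tries to write every cycle, the memory side retires one queued write every drain_period cycles, and the fifo depth and stall threshold are picked for illustration.

```python
def simulate_writes(cycles, depth, drain_period, use_stall):
    """Return (accepted, dropped) write counts for a CPU that tries to write every cycle."""
    occupancy = 0
    accepted = 0
    dropped = 0
    for cycle in range(cycles):
        # Memory side retires one queued write every drain_period cycles.
        if occupancy > 0 and cycle % drain_period == 0:
            occupancy -= 1
        # Stall is asserted while the fifo is full.
        stall = use_stall and occupancy >= depth
        if stall:
            continue  # CPU holds the write and retries on a later cycle
        if occupancy < depth:
            occupancy += 1
            accepted += 1
        else:
            dropped += 1  # fifo overflow: the write is lost
    return accepted, dropped

with_stall = simulate_writes(20, depth=4, drain_period=5, use_stall=True)
without_stall = simulate_writes(20, depth=4, drain_period=5, use_stall=False)
```

With the stall in place the dropped count stays at zero; without it, every write attempted while the fifo is full is silently lost, which is the case the memory controller currently relies on the CPU to avoid.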
Timing Issues: The core seems to meet timing for 20 MHz, with one or two exceptions which I can work on. The memory controller (running @225MHz) misses timing by about 500ps. I am trying a fix for that. The other thing missing timing is the pushbutton input, which is debounced with a 20 MHz clock. This is not really an issue; a constraint somewhere needs to be updated to indicate timing is irrelevant for a pushbutton.
After messing around with the RAT for a while, the scheduler is on the critical timing path now instead of the RAT. If I could put a set of FFs between the re-order buffer and the scheduling logic, that would help a lot. There are also a couple of signals tied to the scheduling logic that miss timing by about 500 ps. This is more challenging to remedy. The issue is that the scheduler uses a lot of comparators on the reorder buffer to determine the order of instructions. For instance, before a store can issue, the scheduler needs to know if there was a prior branch instruction. The “prior” part means searching the re-order buffer and comparing instruction sequences. There is some debug code in place that, if removed, might help the timing a little bit. I put serialization code in the scheduler that says the next instruction can be issued only if the previous one is complete.
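The "prior branch" check is roughly the following, written as a serial Python loop for illustration. In hardware the same test is done with parallel comparators across the reorder buffer, which is where the timing cost comes from. The entry fields and function name here are assumptions for the sketch, not the actual Qupls signals.

```python
from dataclasses import dataclass

@dataclass
class RobEntry:
    valid: bool = False
    is_branch: bool = False
    done: bool = False

def has_prior_unresolved_branch(rob, head, index):
    """True if an older, valid, not-yet-done branch sits between head and index.

    The ROB is treated as a circular buffer; 'head' is the oldest entry and
    everything from head up to (but not including) index is older than index.
    """
    n = len(rob)
    pos = head
    while pos != index:
        entry = rob[pos]
        if entry.valid and entry.is_branch and not entry.done:
            return True
        pos = (pos + 1) % n
    return False

# 16-entry ROB with the head near the end, so the search wraps around.
rob = [RobEntry() for _ in range(16)]
rob[14] = RobEntry(valid=True)                  # older ALU op
rob[15] = RobEntry(valid=True, is_branch=True)  # older branch, not resolved yet
rob[0] = RobEntry(valid=True)                   # the store wanting to issue
store_blocked = has_prior_unresolved_branch(rob, head=14, index=0)
```

Here store_blocked is true until the branch at entry 15 is marked done.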
_________________Robert Finch http://www.finitron.ca
|
Sat Nov 02, 2024 5:21 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Slowed the memory system down to 200 MHz from 225 MHz. There were some signals missing timing by about 200ps. This is likely due to the size of the design. The fix for the memory controller worked. A set of FFs was inserted in a read path, which was easy to do because of the pipelining.
Milestones: text screen clear working in SIM.
Anti-milestones: Fibonacci still not working correctly.
Bug fixes:
Load and store instructions had the Rb register always set to zero, leading to a lack of indexing. The assembler was also not encoding Rb, leaving the field set at zero. The index scale was encoded to the wrong position in the opcode. The above showed up as writes to display memory always writing the same base location.
The available registers list was not being passed from the name supplier to the RAT. This led to registers being used that were not available. Fixing this fixed issues with the propagation of data values in registers.
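The symptom of the Rb bug is easy to reproduce with a generic scaled-index address calculation. This is a sketch only: the formula base + index * scale + displacement is the usual form for indexed addressing and is assumed here, and TEXT_BASE and the scale of 8 are hypothetical values, not the real display address or encoding.

```python
def effective_address(base, index, scale, disp=0):
    """Generic scaled-index effective address (assumed form, not the actual encoding)."""
    return base + index * scale + disp

TEXT_BASE = 0x1000  # hypothetical display memory base

# Working case: Rb steps through 0..3, so each write lands on a new cell.
addrs_ok = [effective_address(TEXT_BASE, i, 8) for i in range(4)]

# Buggy case: Rb is stuck at zero, so every write hits the base location.
addrs_bug = [effective_address(TEXT_BASE, 0, 8) for _ in range(4)]

print([hex(a) for a in addrs_ok])   # ['0x1000', '0x1008', '0x1010', '0x1018']
print([hex(a) for a in addrs_bug])  # ['0x1000', '0x1000', '0x1000', '0x1000']
```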
Still scratching my head over how to improve the scheduler timing. One thought is to try and pre-compute as much as possible.
_________________Robert Finch http://www.finitron.ca
|
Mon Nov 04, 2024 2:34 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Bug fixes:
A typo in the bypass logic led to the incorrect register being marked invalid. This caused the CPU to hang waiting for the register to be valid.
In the RAT, the currently mapped register was being placed in the to-free buffer; it should have been the register being written to by the update logic. This issue led to registers being reused that should not be.
An improper backout led to registers being restored to a state too early, which caused infinite loops to appear, as the loop counter was getting backed out to a previous value.
Milestones: Got Fibonacci almost working now. It counts in progression 0,1,1,2,3,5,8,13, but when it gets to 8 it starts writing low memory addresses instead of displaying the count on the LEDs. This is due to a register that gets trashed during a branch backout operation.
_________________Robert Finch http://www.finitron.ca
|
Tue Nov 05, 2024 12:11 pm |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2307 Location: Canada
|
Bug fixes: Data cache tag memory needed to be dual ported to allow snooping and normal access. It only had a port for snoop.
Milestone: Got Fibonacci to work by disabling the backout and restore handling for branches. Instead, instructions following the branch that should not be executed are turned into copy-targets that copy the target register to the new target register. Since the ROB is only 16 entries, this can be done in about the same length of time as a backout and restore. Having got Fibonacci working, I decided to chain the blinkin lights delay routine onto it. It does not work quite right. There is no display, but the routine runs. There is something amiss with the datapath to the LEDs.
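The copy-target idea can be illustrated with a tiny rename model in Python. This is a sketch under assumptions, not the actual implementation: each renamed instruction is taken to carry both its newly allocated physical target and the physical register previously mapped to the same architectural register, and the tuple layout and names are invented for the example.

```python
def run(instructions, prf, squashed_from=None):
    """Execute renamed instructions in order against a physical register file.

    Each instruction is (new_preg, old_preg, fn, src_pregs), where old_preg is
    the physical register previously mapped to the same architectural target.
    Instructions at or after squashed_from are on the wrong path.
    """
    for i, (new_preg, old_preg, fn, srcs) in enumerate(instructions):
        if squashed_from is not None and i >= squashed_from:
            # Wrong-path instruction becomes a copy-target: the old value is
            # carried into the new physical register, so the rename map does
            # not need to be backed out.
            prf[new_preg] = prf[old_preg]
        else:
            prf[new_preg] = fn(*(prf[s] for s in srcs))
    return prf

# Architectural r1 starts in p1 with value 5.
# inst 0 (correct path): r1 = r1 + 1    -> new target p8, old target p1
# inst 1 (wrong path):   r1 = r1 * 100  -> new target p9, old target p8
instructions = [
    (8, 1, lambda a: a + 1, (1,)),
    (9, 8, lambda a: a * 100, (8,)),
]
prf = {1: 5, 8: 0, 9: 0}

squashed = run(instructions, dict(prf), squashed_from=1)
print(squashed[9])  # 6
```

After the squash, r1 is still mapped to p9, but p9 holds the value from before the wrong-path instruction, as if that instruction had never run.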
_________________Robert Finch http://www.finitron.ca
|
Wed Nov 06, 2024 4:07 am |
|