robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2308 Location: Canada
|
Whew! That was a day/long night. I managed to figure out the cursor tracking problem; it was due to a register bypassing problem. I had set up the entire pipeline to stall on a cache miss or multi-cycle operation, with the idea that results would remain in place in the pipeline for bypassing. I was after simple pipeline controls. Well, it didn't work correctly. I ended up making the pipeline remember the last two register updates for bypassing. The last two updates are remembered because there could be two instructions in the pipeline that fall off the end when the pipeline stalls. (A cache miss doesn't stall the entire pipeline now.) It works much better, but the software now hangs in another place. I'm guessing more pipelining problems. I found two software bugs along the way. On to the next bug. Cached and uncached instruction access now produce the same results with the software. Using the cache is about 400% faster than running uncached. Keyboard input seems to be working, which is amazing given the complexity of the keyboard code. The code was ported with minor changes from another project. The project's gotten to the point where it's almost reasonable to write and make use of debug routines.
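For readers following along, here is a rough Python model of the "remember the last two register updates" idea. It is only a sketch of the concept, not the actual Verilog: the class name `BypassBuffer`, its methods, and the 32-entry register file are all made up for illustration.

```python
from collections import deque


class BypassBuffer:
    """Remembers the most recent register writes so later reads can be forwarded."""

    def __init__(self, depth=2):
        # depth=2 mirrors the "last two register updates" mentioned in the post
        self.recent = deque(maxlen=depth)

    def record(self, reg, value):
        """Note a register update that has fallen off the end of the pipeline."""
        self.recent.append((reg, value))

    def read(self, reg, regfile):
        """Return the newest remembered value for reg, else the register file value."""
        for r, v in reversed(self.recent):
            if r == reg:
                return v
        return regfile[reg]


regfile = [0] * 32                   # assumed size; holds stale values in this demo
bypass = BypassBuffer()
bypass.record(1, 0x1234)             # older update, to r1
bypass.record(2, 0x5678)             # newer update, to r2
r1_value = bypass.read(1, regfile)   # forwarded from the buffer
r2_value = bypass.read(2, regfile)   # forwarded from the buffer
r3_value = bypass.read(3, regfile)   # never updated: comes from the register file
bypass.record(1, 0x9ABC)             # a third update evicts the oldest entry
r1_latest = bypass.read(1, regfile)
```

The bounded deque drops the oldest entry automatically, which is the software analogue of only keeping two updates around.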
_________________Robert Finch http://www.finitron.ca
|
Sat Mar 21, 2015 1:57 am |
|
 |
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1821
|
That's good progress, also a scary reminder of how this is anything but a beginner's project!
|
Sat Mar 21, 2015 6:24 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2308 Location: Canada
|
Trying to figure out this problem now. A register is loaded with the hex value "12345678", but when it's displayed using the following routine it displays as "12345600": the last byte displays as zero, not "78" like it should. There does not appear to be any software problem, as this program executes correctly in the simulator, but not in real life. I've tried almost everything I can think of to ferret out the problem.
Code:
;------------------------------------------------------------------------------
; Display the half-word in r1
;------------------------------------------------------------------------------
DisplayHalf:
    push    lr
    push    r1
    ror     r1,r1,#16
    bsr     DisplayCharHex
    rol     r1,r1,#16
    bsr     DisplayCharHex
    pop     r1
    pop     lr
    rtl
;------------------------------------------------------------------------------
; Display the char in r1
;------------------------------------------------------------------------------
DisplayCharHex:
    push    lr
    push    r1
    ror     r1,r1,#8
    bsr     DisplayByte
    rol     r1,r1,#8
    bsr     DisplayByte
    pop     r1
    rts
;------------------------------------------------------------------------------
; Display the byte in r1
;------------------------------------------------------------------------------
DisplayByte:
    push    lr
    push    r1
    ror     r1,r1,#4
    bsr     DisplayNybble
    rol     r1,r1,#4
    bsr     DisplayNybble
    pop     r1
    rts
;------------------------------------------------------------------------------
; Display nybble in r1
;------------------------------------------------------------------------------
DisplayNybble:
    push    lr
    push    r1
    push    r2
    and     r1,r1,#$0F
    addui   r1,r1,#'0'
    cmpu    r2,r1,#'9'+1
    blt     r2,.0001
    addui   r1,r1,#7
.0001:
    bsr     OutChar
    pop     r2
    pop     r1
    rts
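As a cross-check of what the routine above should print, here is a small Python model of the same rotate-and-recurse scheme. It is a sketch only: the 64-bit register width is an assumption, the function names are invented, and output goes to a list instead of OutChar.

```python
WIDTH = 64                      # assumed register width
MASK = (1 << WIDTH) - 1


def ror(x, n):
    """Rotate x right by n bits within WIDTH bits."""
    return ((x >> n) | (x << (WIDTH - n))) & MASK


def rol(x, n):
    """Rotate x left by n bits within WIDTH bits."""
    return ((x << n) | (x >> (WIDTH - n))) & MASK


def display_nybble(r1, out):
    c = (r1 & 0x0F) + ord('0')
    if c >= ord('9') + 1:       # 'A'..'F' sit 7 codes past '9' in ASCII
        c += 7
    out.append(chr(c))


def display_byte(r1, out):
    r1 = ror(r1, 4)             # bring the high nybble down
    display_nybble(r1, out)
    r1 = rol(r1, 4)             # restore, then show the low nybble
    display_nybble(r1, out)


def display_char_hex(r1, out):
    r1 = ror(r1, 8)
    display_byte(r1, out)
    r1 = rol(r1, 8)
    display_byte(r1, out)


def display_half(r1, out):
    r1 = ror(r1, 16)
    display_char_hex(r1, out)
    r1 = rol(r1, 16)
    display_char_hex(r1, out)


def half_to_hex(value):
    """Collect the characters the routine would send to the display."""
    out = []
    display_half(value, out)
    return "".join(out)
```

With this model, `half_to_hex(0x12345678)` gives "12345678", which matches the simulator behaviour described in the post and supports the view that the software itself is fine.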
_________________Robert Finch http://www.finitron.ca
|
Sat Mar 21, 2015 5:11 pm |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2308 Location: Canada
|
Well, I think I figured out the problem, but I'm not 100% sure, especially as to why it worked in simulation. Anyway, I noticed it started working after some experimentation, and I found out that the RTS instruction NOP'ped out the instruction at the return target. It was too ambitious at pipeline flushing. (Un)fortunately I used the RTL instruction almost everywhere and it didn't suffer from the same problem, so most code worked without a problem. RTL returns from a Leaf subroutine using the link register. RTS returns from a (non-leaf) subroutine by first popping the link register from the Stack, then loading the same value into the PC. Why have two instructions? RTS is one instruction shorter and faster than the sequence POP LR, RTL.
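To make the RTL/RTS relationship concrete, here is a toy Python model of the two return instructions. It only captures the architectural effect described above (no pipeline, no flushing); the `Cpu` class and the addresses are hypothetical.

```python
class Cpu:
    """Toy model with just a PC, a link register and a stack."""

    def __init__(self):
        self.pc = 0
        self.lr = 0
        self.stack = []

    def push_lr(self):
        self.stack.append(self.lr)

    def pop_lr(self):
        self.lr = self.stack.pop()

    def rtl(self):
        """Leaf return: jump to the address held in the link register."""
        self.pc = self.lr

    def rts(self):
        """Non-leaf return: pop the link register, then jump to it."""
        self.pop_lr()
        self.rtl()


def make_cpu():
    """Set up the state inside a non-leaf subroutine just before it returns."""
    cpu = Cpu()
    cpu.lr = 0x1000      # return address of the caller (made-up value)
    cpu.push_lr()        # prologue: save LR on the stack
    cpu.lr = 0x2000      # a nested bsr has since overwritten LR
    return cpu


a = make_cpu()
a.pop_lr()               # the two-instruction sequence POP LR, RTL
a.rtl()

b = make_cpu()
b.rts()                  # the single-instruction equivalent
```

Both CPUs end up in the same state, so in this model RTS is purely a shorter encoding of POP LR followed by RTL.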
With that problem fixed, values are displaying correctly now. The small monitor program seems to work flawlessly now. Memory can be dumped and filled, and the current date displayed from an RTC. So I wrote a program in 'C' to control sprites and copied the compiled code into the boot ROM. The result: it worked! It looks like I can now run compiled code for more ambitious software.
The next thing to do is get interrupts working.
_________________Robert Finch http://www.finitron.ca
|
Sun Mar 22, 2015 10:48 am |
|
 |
legacy
Joined: Fri Jan 10, 2014 9:46 pm Posts: 37
|
[removed due to user inquiry]
|
Sun Mar 22, 2015 4:17 pm |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2308 Location: Canada
|
Quote:
i am still not understanding your assembler compiler's sources, there is no makefile, and files seem written in a mixture of C and C++, i am a bit confused
I have put a Makefile.win, which is the makefile for the assembler, into the source directory. You are correct, the files are written in a mix of C and C++. The files need to be compiled as C++ files except for SEARCHEN.c. I've been using DEV c++, so I placed the project file A64.dev in the directory as well. There are headers and libraries used that are a part of DEV c++.
_________________Robert Finch http://www.finitron.ca
|
Mon Mar 23, 2015 12:13 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2308 Location: Canada
|
Interrupts appear to work now as there are timer interrupts occurring. It's possible to do a memory dump, for instance, and the timer interrupt isn't causing a problem. I have the processor saving the stack pointer register in addition to the PC when an interrupt occurs. A return from interrupt (RTI) restores the stack pointer to what it was before the interrupt. This allows the stack pointer to be reset to an interrupt stack for processing. Interrupt code looks like:
Code:
;------------------------------------------------------------------------------
; 1024Hz interrupt routine. This must be fast. Allows the system time to be
; gotten by right shifting by 10 bits.
;------------------------------------------------------------------------------
Tick1024Rout:
0103C4 FC 00 00 00      ldi     sp,#$8000       ; set stack pointer to interrupt processing stack
0103C8 0A E0 01 00
0103CC E7 E0 01 00      push    r1
0103D0 0A 10 04 00      ldi     r1,#2           ; reset the edge sense circuit
0103D4 7C DC FF 00      sh      r1,PIC_RSTE
0103D8 62 10 A8 1F
0103DC 64 10 20 00      inc     Milliseconds
0103E0 57 1F 10 00      pop     r1
0103E4 02 E0 3F 6E      rti                     ; restore stack pointer and return
I have the interrupt line to the cpu tied to an external LED which glows a dull blue when interrupts are being processed, and painfully bright blue when the interrupt line is stuck high. The interrupt line is also connected via a switch so that interrupts can be switched off manually. Interrupts are managed with a simple priority encoder circuit associated with the processor. This circuit detects an interrupt and encodes it into a nine-bit vector for the processor. The cpu just has a single interrupt input and a single status bit to enable / disable interrupt detection. Otherwise, enabling specific interrupts is done with the priority encoder circuit (PIC). One can see the PIC I/O device is accessed to acknowledge the interrupt in the above code. This type of thing is common in interrupt routines.
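Two ideas from this post can be sketched in a few lines of Python: getting seconds out of a 1024Hz tick count with a 10-bit right shift, and a priority encoder that turns pending interrupt lines into a nine-bit vector. The encoder details are guesses for illustration only: the base vector of 448, the rule that the lowest-numbered line wins, and the function names are all assumptions, not the real PIC design.

```python
TICK_SHIFT = 10                 # 1024 Hz tick: 2**10 ticks per second


def ticks_to_seconds(ticks):
    """Whole seconds elapsed for a count of 1024 Hz ticks."""
    return ticks >> TICK_SHIFT


def encode_interrupt(pending, enabled, vector_base=448):
    """Return a nine-bit vector for the winning interrupt line, or None.

    pending and enabled are bit masks with one bit per interrupt line.
    Assumption: the lowest-numbered enabled line has the highest priority.
    """
    active = pending & enabled
    if active == 0:
        return None                                 # nothing to signal to the cpu
    line = (active & -active).bit_length() - 1      # index of the lowest set bit
    return (vector_base + line) & 0x1FF             # keep the result to nine bits
```

For example, with lines 1 and 2 pending and everything enabled, `encode_interrupt(0b0110, 0b1111)` picks line 1 under the assumed priority rule; masking line 1 off makes line 2 win instead.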
_________________Robert Finch http://www.finitron.ca
|
Mon Mar 23, 2015 12:32 am |
|
 |
legacy
Joined: Fri Jan 10, 2014 9:46 pm Posts: 37
|
[removed due to user inquiry]
|
Mon Mar 23, 2015 9:04 am |
|
 |
Tor
Joined: Tue Jan 15, 2013 10:11 am Posts: 114 Location: Norway/Japan
|
legacy wrote:
wander why don't you use gcc on linux instead.
Rob would know best of course, but in general one could say that you should first define a problem, and only then define a solution - not the other way around. In short, if he has a workchain that works for him then there's no problem to fix. I've used Linux nearly as long as Torvalds himself, every day, but that doesn't mean that everyone should switch just to switch. Let's keep the discussion on the FISA64 project itself, I find it fascinating.
-Tor
|
Mon Mar 23, 2015 4:04 pm |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2308 Location: Canada
|
Serial communication is now possible! The following quick and dirty routine loads a 16kB file through the serial port to memory, making it possible to bootstrap from a remote host. Downloaded code can be run using the monitor's jump to code ('J') command. Next step is to come up with a small OS kernel. In the past I've gotten as far as trying to develop file system (FS) code.
Code:
LoadFromSerial:
    push    lr
    ldi     r3,#16384
    ldi     r2,#$20000          ; target store address
.0001:
    bsr     SerialGetCharDirect
    sb      r1,[r2]
    addui   r2,r2,#1
    subui   r3,r3,#1
    bne     r3,.0001
    rts
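The loader loop above is simple enough to mirror in Python, which can be handy for checking the host side of a download. This is only a sketch: `get_char` stands in for SerialGetCharDirect, and the bytearray memory and test payload are made up for the example.

```python
def load_from_serial(get_char, memory, target=0x20000, count=16384):
    """Copy `count` bytes from a blocking byte source into memory at `target`.

    Returns the address just past the last byte stored.
    """
    addr = target
    remaining = count
    while remaining != 0:
        memory[addr] = get_char() & 0xFF    # sb r1,[r2]
        addr += 1                           # addui r2,r2,#1
        remaining -= 1                      # subui r3,r3,#1
    return addr


# Fake serial port: an iterator over a 16kB test pattern
payload = bytes(i & 0xFF for i in range(16384))
source = iter(payload)
memory = bytearray(0x20000 + 16384)
end = load_from_serial(lambda: next(source), memory)
```

Like the assembly version, this has no timeout or checksum, so a short file would leave the real loader waiting on the serial port.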
A dual processor test system is pending. Another item on the to-do list.
_________________Robert Finch http://www.finitron.ca
|
Mon Mar 23, 2015 10:45 pm |
|
 |
legacy
Joined: Fri Jan 10, 2014 9:46 pm Posts: 37
|
[removed due to user inquiry]
|
Tue Mar 24, 2015 12:21 am |
|
 |
legacy
Joined: Fri Jan 10, 2014 9:46 pm Posts: 37
|
[removed due to user inquiry]
Last edited by legacy on Tue Mar 24, 2015 1:46 pm, edited 1 time in total.
|
Tue Mar 24, 2015 12:55 am |
|
 |
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2308 Location: Canada
|
The test system is running with dual cpu cores now! The first few lines of the startup code are shown below. The cpuid instruction is used to cause cpu#1 to enter an idle loop, while cpu#0 runs the boot ROM like normal. The implementation is a bit naïve and has the second cpu bound to compute tasks only, since it has no access to I/O or interrupts. I think I may hook up the 60Hz pulse interrupt to the second cpu as well as the first (for multi-tasking).
Code:
start:
    sei                         ; interrupts off
    cpuid   r1,r0,#0
    beq     r1,.0002
.0003:
    inc     $20000
    lw      r1,StartCPU1Flag
    cmp     r1,r1,#$12345678
    bne     r1,.0003
    jmp     (StartCPU1Addr)
.0002:
    ldi     sp,#32760           ; set stack pointer to top of 32k Area
    ldi     r5,#$0000
Now, what to do with two cpus?
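The hand-off in the startup code above (cpu#1 spins on a flag until it sees a magic value, then jumps through a stored address) can be modelled in Python as below. This is a single-threaded sketch of the protocol only: the dictionary keys, `core1_poll`, `release_core1`, and the entry address are invented names and values. Writing the address before the flag is a design choice of this sketch, not something taken from the post.

```python
MAGIC = 0x12345678      # value cpu#1 waits for in StartCPU1Flag


def core1_poll(memory):
    """One trip around cpu#1's idle loop; returns a jump target or None."""
    memory["idle_count"] += 1               # mirrors `inc $20000`
    if memory["start_flag"] == MAGIC:
        return memory["start_addr"]         # mirrors `jmp (StartCPU1Addr)`
    return None                             # keep spinning


def release_core1(memory, entry):
    """What cpu#0 might do to start cpu#1 (hypothetical helper)."""
    memory["start_addr"] = entry            # publish the address first...
    memory["start_flag"] = MAGIC            # ...then raise the flag


memory = {"idle_count": 0, "start_flag": 0, "start_addr": 0}
before = [core1_poll(memory) for _ in range(3)]     # cpu#1 idles
release_core1(memory, 0x30000)                      # made-up entry point
after = core1_poll(memory)                          # cpu#1 sees the flag
```

Publishing the address before the flag means cpu#1 can never observe the magic value together with a stale jump target.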
_________________Robert Finch http://www.finitron.ca
|
Tue Mar 24, 2015 3:51 am |
|
 |
BigEd
Joined: Wed Jan 09, 2013 6:54 pm Posts: 1821
|
Hi Rob, please could you sketch out the resource utilisation for this dual CPU setup? Pasting a bit of a Xilinx report would be fine - but also knowing which FPGA you're on, and which dev board, would be interesting.
Your repo: https://github.com/robfinch/Cores/tree/ ... SA64/trunk
The doc file: https://docs.google.com/viewer?url=http ... x?raw=true
|
Tue Mar 24, 2015 9:36 am |
|
 |
legacy
Joined: Fri Jan 10, 2014 9:46 pm Posts: 37
|
[removed due to user inquiry]
|
Tue Mar 24, 2015 3:06 pm |
|