Hookup emu-russia/dmgcpu #1

Rodrigodd · 2024-03-17T02:39:02Z

Hello! For some weeks now, I have been investigating if I could use the Verilog simulation in this repo to better understand the inner workings of the Game Boy and in some way improve the precision of my Game Boy emulator.

The main thing I wanted to investigate was the inner workings of the PPU and the precise read/write timing between the CPU and the PPU. But, as far as I know, you are using only an equivalent CPU implementation, not a reverse-engineered CPU from the Game Boy. This made me a little hesitant to learn its timing, knowing it may still be off.

But I discovered that there is a reverse-engineered HDL of the CPU core at https://github.com/emu-russia/dmgcpu, and I decided to try hooking it up with the simulation in this repo.

Connecting the two together was almost painless. But I could not get it to work; the CPU never starts running after the circuit RESET.

I tried investigating where things were going wrong, but I still could not figure it out (my investigation is summarized in notes.txt, copied in the section below), and I am not experienced in debugging Verilog. I could investigate a little more, but I presume there will be many more problems after fixing this one, so I became a little demotivated.

Regardless, I decided to publish my work here so that it does not all go to waste.

Investigation

The address bus of the CPU never increments when running in dmg-sim, whereas in the dmgcpu simulation it does.

The signal d of the Sequencer never changes from 0x00..04000 to 0x00402..00, as it does in the dmgcpu simulation.

The following inputs to the CPU differ from the simulation in dmgcpu:

The CLK phases differ slightly during the reset; nothing major.
OSC_STABLE is always low, instead of high before the reset.
NMI is unconnected, instead of always 0.
IPL_REQ is always high, instead of always 0.
SYNC_RESET is always high, instead of going low after the reset.

OSC_STABLE becomes high when UPOF (last bit of DIV, 16 Hz) is high and TUBO, a latch for the CLK_ENA signal from the CPU, is low. In dmgcpu, UPOF is hardcoded to be high.

OSC_STABLE = T1&~T2 | ~T1&T2 | UPOF&~TUBO
TUBO = nor_latch(.set(CLK_ENA), .reset(RESET | ~OSC_ENA))
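As a rough way to experiment with these two equations outside the slow Verilog simulation, they can be transcribed into a few lines of Python. This is only a sketch: the helper names are hypothetical, and the latch is assumed to be reset-dominant, which the notes do not state.

```python
def nor_latch(q_prev: bool, set_: bool, reset: bool) -> bool:
    """SR latch model; reset-dominant is an assumption, not taken from the netlist."""
    if reset:
        return False
    if set_:
        return True
    return q_prev


def tubo_next(tubo: bool, clk_ena: bool, reset: bool, osc_ena: bool) -> bool:
    # TUBO = nor_latch(.set(CLK_ENA), .reset(RESET | ~OSC_ENA))
    return nor_latch(tubo, set_=clk_ena, reset=reset or not osc_ena)


def osc_stable(t1: bool, t2: bool, upof: bool, tubo: bool) -> bool:
    # OSC_STABLE = T1&~T2 | ~T1&T2 | UPOF&~TUBO
    return (t1 and not t2) or (not t1 and t2) or (upof and not tubo)


# Case 1: CLK_ENA is already high once RESET is released, so TUBO gets set.
tubo_early = tubo_next(False, clk_ena=True, reset=False, osc_ena=True)
blocked = osc_stable(t1=False, t2=False, upof=True, tubo=tubo_early)

# Case 2: CLK_ENA stays low, so TUBO stays low and UPOF can raise OSC_STABLE.
tubo_late = tubo_next(False, clk_ena=False, reset=False, osc_ena=True)
released = osc_stable(t1=False, t2=False, upof=True, tubo=tubo_late)

print(blocked, released)  # False True
```

With T1 and T2 both low, a set TUBO masks UPOF, so in the first case OSC_STABLE stays low even though UPOF is high.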

OSC_STABLE feeds SYNC_RESET (CPU T12), which becomes locked low in the ASOL latch. This made me think that CLK_ENA should only go high after SYNC_RESET is low.

Looking at Seq.v (see logic.py), CLK_ENA is low when RESET is high, or if the CPU is halted:

| RESET | SYNC_RESET | CLK_ENA |
|-------|------------|---------|
| 0 | 0 | `~(d[100] \| IR[4] & d[101]) \| SeqControl_1` |
| 0 | 1 | `~(d[100] \| IR[4] & d[101])` |
| 1 | x | 0 |

From _GekkioNames.v:

d[100]: s1_op_halt_s0xx
d[101]: s1_op_nop_stop_s0xx

So CLK_ENA is almost unaffected by SYNC_RESET.
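The truth table can be written out as a small Python function to check that conclusion. This is a sketch only: the function and argument names are made up for illustration, and SeqControl_1 is treated as an opaque input because its meaning is not described here.

```python
def clk_ena(reset: bool, sync_reset: bool, d100: bool, d101: bool,
            ir4: bool, seq_control_1: bool) -> bool:
    """CLK_ENA as given by the truth table above (names are illustrative)."""
    if reset:
        return False
    # ~(d[100] | IR[4] & d[101]); '&' binds tighter than '|' in Verilog
    not_halt_or_stop = not (d100 or (ir4 and d101))
    if sync_reset:
        return not_halt_or_stop
    return not_halt_or_stop or seq_control_1


# Outside of HALT/STOP, CLK_ENA is high for either value of SYNC_RESET.
for sync_reset in (False, True):
    print(sync_reset, clk_ena(False, sync_reset, False, False, False, False))
```

Both iterations print True for CLK_ENA: with RESET low and no HALT or STOP decoded, SYNC_RESET does not hold CLK_ENA low.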

Another way to make OSC_STABLE go high is to reset only when UPOF is high. But UPOF does not run until RESET is low! And the CPU RESET is directly connected to the global simulation RESET!

Maybe UPOF should be reset high? Probably not: it may mess up the DIV timing, and the simulation is too slow for trial-and-error changes.

UPOF is reset by RESET_DIV:

~RESET_DIV = ~(RESET | ~OSC_STABLE | (FF04_FF07 & CPU_WR & ~A0 & ~A1)).
RESET_DIV = RESET | ~OSC_STABLE | <write to DIV>
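Taking this equation at face value, a toy Python model shows what it implies for UPOF. The 16-bit counter width and all helper names are assumptions made for illustration; the notes only say that UPOF is the last bit of DIV.

```python
DIV_WIDTH = 16  # assumed counter width, not taken from the netlist


def reset_div(reset: bool, osc_stable: bool, div_write: bool) -> bool:
    # RESET_DIV = RESET | ~OSC_STABLE | <write to DIV>
    return reset or (not osc_stable) or div_write


def step_div(div: int, reset: bool, osc_stable: bool, div_write: bool) -> int:
    """Advance the modelled DIV counter by one tick, or clear it."""
    if reset_div(reset, osc_stable, div_write):
        return 0
    return (div + 1) % (1 << DIV_WIDTH)


def upof(div: int) -> bool:
    """Top bit of the modelled counter."""
    return bool((div >> (DIV_WIDTH - 1)) & 1)


def run(ticks: int, osc_stable: bool) -> int:
    div = 0
    for _ in range(ticks):
        div = step_div(div, reset=False, osc_stable=osc_stable, div_write=False)
    return div


held = run(1 << 15, osc_stable=False)     # counter is held at zero
counting = run(1 << 15, osc_stable=True)  # counter reaches the top bit
print(upof(held), upof(counting))  # False True
```

In this model, as long as OSC_STABLE is low the counter never leaves zero, so UPOF cannot rise, even with RESET released.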

ASOL is reset by RESET.

msinger · 2024-03-17T09:01:32Z

Hi Rodrigo,

Thank you very much for sharing your work. I want to look into this, and I just realized that I had fixed two small bugs in my working directory that I hadn't committed yet. I just committed them now, so maybe you want to rebase your stuff onto the new changes on master. At first glance, I'm very happy that you found a stupid mistake I made with the SERY inverter feeding itself. I still had graphical glitches when simulating Zelda DX. I will run a new simulation with this fix; maybe it helps. If you wouldn't mind, could you make a separate pull request for that SERY fix? I will merge that immediately.

I will comment a bit more on this when I have more time and after I have run a few tests myself.

Thanks,
Michael

msinger · 2024-03-17T16:33:49Z

Okay, it looks to me like the CPU is misbehaving. It raises the cpu_clk_ena signal too early. I documented that signal a few years ago here: http://iceboy.a-singer.de/doc/dmg_cpu_connections.html
You can read it in the row T11 of the table on that page. The second paragraph of the description is the important one. It says that the CLK_ENA (T11) signal of the CPU must be initialized with 0 and raised by the CPU when TABA/OSC_STABLE (T15) goes high. If the CPU raises CLK_ENA earlier than this, then SYNC_RESET (T12) will never be released by the circuit behind AFER.
I documented this before we had detailed die shots of the CPU core. So I didn't have insight and details could be wrong. Maybe @ogamespec can help us, when he sees that his name got referenced here. :)

What we can see is that SYNC_RESET never gets released:

It should get released after around 32 ms.
When we zoom in, we see that cpu_clk_ena gets raised way too early:

Because of that, TABA never gets pulsed and SYNC_RESET never gets released.

I removed this port from the CPU instantiation:

//.CLK_ENA(cpu_clk_ena),   // out T11

Then I added these two lines somewhere above or below the CPU instantiation:

initial cpu_clk_ena = 0;
always_ff @(negedge cpu_in_t12) cpu_clk_ena <= 1;

In a hackish way, this ensures that cpu_clk_ena is raised at the correct time.
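The behaviour of those two lines can be mimicked in Python to see what the hack does to the signal. This is an illustrative model only: the class name is hypothetical, and unlike Verilog's negedge it ignores transitions from an unknown (x) value.

```python
class ClkEnaHack:
    """Models: initial cpu_clk_ena = 0; set to 1 on a falling edge of cpu_in_t12."""

    def __init__(self) -> None:
        self.cpu_clk_ena = False  # initial cpu_clk_ena = 0;
        self._prev_t12 = None     # no previous sample yet

    def sample(self, cpu_in_t12: bool) -> bool:
        # always_ff @(negedge cpu_in_t12) cpu_clk_ena <= 1;
        if self._prev_t12 is True and cpu_in_t12 is False:
            self.cpu_clk_ena = True
        self._prev_t12 = cpu_in_t12
        return self.cpu_clk_ena


hack = ClkEnaHack()
# SYNC_RESET (T12) is held high, released, then goes high again later.
t12_trace = [True, True, True, False, False, True]
ena_trace = [hack.sample(t12) for t12 in t12_trace]
print(ena_trace)  # [False, False, False, True, True, True]
```

As in the Verilog version, nothing ever lowers cpu_clk_ena again, so the model also makes visible that the hack ignores any later HALT or STOP.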
Now it looks like the CPU starts running and the instruction register, data and address lines are changing:

The simulation runs very slowly though; I haven't seen any changes to the PPU registers yet. I think I'm running the original bootrom, so it may still take a while until it gets there.

I noticed that you added the timescale directive everywhere. This isn't necessary in Icarus. I configured the timescale in the timescale.f file, which is passed to Icarus on the command line. But maybe you need this for Verilator, idk. I don't even know which one Icarus uses if both are given.

I hope this hack helps you to continue with your research. I don't know where exactly the issue is. I'm planning to do my own implementation of the CPU, but first I want to finish the CPU layout that I'm recreating in Electric VLSI. I've been working on that for a while now and it is very frustrating, because it seems that I can't get the proportions right while simultaneously obeying the spacing rules of the MOSIS fabrication technology that I've selected for the project. It's good to know that someone finds this stuff useful. I think @ogamespec will also be happy that you are using his CPU. If he doesn't react here, then maybe you could also open an issue on his project, just to let him know at least.

msinger · 2024-03-17T17:10:58Z

I just noticed, something else is not working correctly. The conditional jump instruction at address 0x000a at the beginning of the bootrom is failing somehow and the CPU restarts executing at address 0 over and over again.

	LD SP,$fffe		; $0000  Setup Stack

	XOR A			; $0003  Zero the memory from $8000-$9FFF (VRAM)
	LD HL,$9fff		; $0004
Addr_0007:
	LD (HL-),A		; $0007
	BIT 7,H		; $0008
	JR NZ, Addr_0007	; $000a     <-- This fails and it jumps to 0
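For comparison with a trace, the expected behaviour of this loop can be reproduced in Python. This is a sketch of the architectural behaviour only (a dictionary stands in for memory, and the function name is made up); it says nothing about cycle timing.

```python
def clear_vram():
    """Step through the boot ROM loop at $0007-$000a and record its effects."""
    memory = {}
    a = 0        # XOR A
    hl = 0x9FFF  # LD HL,$9fff
    iterations = 0
    while True:
        memory[hl] = a                # LD (HL-),A: store, then decrement HL
        hl = (hl - 1) & 0xFFFF
        iterations += 1
        zero_flag = (hl >> 8) & 0x80 == 0  # BIT 7,H: Z is set when bit 7 is 0
        if zero_flag:                 # JR NZ is not taken, the loop ends
            break
    return iterations, hl, memory


iterations, hl, memory = clear_vram()
print(iterations, hex(hl), hex(min(memory)), hex(max(memory)))
# 8192 0x7fff 0x8000 0x9fff
```

So the jump at $000a should be taken 8191 times and fall through once, when HL reaches $7FFF; on every taken branch execution should continue at $0007, never at $0000.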

Rodrigodd · 2024-03-17T17:25:47Z

Thank you! So it is exactly what I had first suspected. I tried looking at the CPU sequencer netlist trace to see if there was anything obviously wrong, but I had no confidence I would figure out something. I will start using your hack.

> I noticed that you added the timescale directive everywhere.

dmgcpu has timescale directives in every file, and when connecting the two projects together, Icarus was emitting a warning or error about having timescales in just some files. I didn't research the most sensible fix for that; this was just a quick hack for now.

> I think @ogamespec will also be happy that you are using his CPU. If he's not reacting here, then maybe you could also write him an issue on his project, just to let him know at least.

I will make an issue there, at least to make things more cross-referenced.

> I just noticed something else is not working correctly. The conditional jump instruction at address 0x000a at the beginning of the bootrom is failing somehow, and the CPU restarts executing at address 0 over and over again.

I will take a look at that. I had my emulator emitting VCD traces, so it is easier to spot where an error is first introduced.

Again, thank you for the help!

msinger · 2024-03-17T17:29:38Z

Glad I could help. Let me know if there is any progress.

ogamespec · 2024-03-18T06:27:39Z

Hi, I've read what's been written here, but for the most part it's all outside the SM83 Core, and that's not my "expertise" there (it's already been studied by others and I don't go there).
Regarding OSC_STABLE and CLK_ENA, I replied in issue emu-russia/dmgcpu#219.
As for timescale and CLK, I do it this way:

`timescale 1ns/1ns
always #25 CLK = ~CLK;

That is, one simulation step equals 1 ns, but I made each CLK phase longer (25 steps), so that the circuits have time to "settle".
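The resulting simulated clock can be worked out from those two lines. The helper below is just illustrative arithmetic with a made-up function name:

```python
def clk_period_ns(delay_steps: int, time_unit_ns: float) -> float:
    """Full period of 'always #delay CLK = ~CLK;' for a given timescale unit."""
    half_period_ns = delay_steps * time_unit_ns  # CLK toggles every #delay
    return 2 * half_period_ns


period_ns = clk_period_ns(delay_steps=25, time_unit_ns=1)
freq_mhz = 1000 / period_ns
print(period_ns, freq_mhz)  # 50 20.0
```

Each CLK phase therefore lasts 25 simulation steps, which is the settling time mentioned above.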
Also keep in mind that HDL in emu-russia/dmgcpu is in the "NOP Engine" stage :D That is, I made sure that something is moving there and somehow lost interest.
Good luck!

Rodrigodd · 2024-03-24T22:13:25Z

Updated the dmgcpu submodule to my fork.

With emu-russia/dmgcpu#219 fixed, the CPU now starts running instructions. I am testing the execution with quickboot.bin. It had a problem where all registers were inverted, which I fixed with a hack (see emu-russia/dmgcpu#240); now it is derailing on a RET after a CALL, probably because writing to memory doesn't work (see emu-russia/dmgcpu#239).

Rodrigodd added 4 commits March 17, 2024 09:39

Fix sery = !rama

6645b53

was set to `sery = !sery` before.

Add `timescale to all SystemVerilog files

d69fa80

Was needed when trying to hook up emu-russia/dmgcpu.

Replace CPU with emu-russia/dmgcpu

036bbac

Include notes

d424855

Rodrigodd force-pushed the hookup-dmgcpu branch from 1413192 to d424855 on March 17, 2024 12:42

Rodrigodd mentioned this pull request Mar 17, 2024

CLK_ENA should only go high after OSC_STABLE emu-russia/dmgcpu#219

Closed

Change dmgcpu submodule to my fork

4bc0e22

Update dmgcpu and fix CLK6 timing

b94e567

Fix check of conditional flags. See emu-russia/dmgcpu#266.

Rodrigodd mentioned this pull request Mar 30, 2024

Investigate branching emu-russia/dmgcpu#270

Merged

Update dmgcpu and remove CLK delays

c9d447f

The delays in CLK6 are no longer necessary. A transparent D latch was added to the conditional branch logic (as in the original hardware), which removes the need for the delays.
