Debugging an intermittent crash in a sea of code!


Post by Cosmium »

Wondered if anyone had a good approach to debugging rare, intermittent crashes in Z80 code.

My normal method of debugging is to print expected values on screen and delve deeper if they don't align with my expectations, or use the debugger to step through the recently added code, or add breakpoints (individually, or in a range) in the general area that seems to be the culprit.

Trouble is, at this point there are about 14,500 lines of source code and it's hard to know where to look! I've tried to understand the game conditions before the crash occurs, but there doesn't seem to be a pattern. I've considered using an RZX recording to get closer to the problem, but I'm not familiar with the process or whether it would help with debugging.

Short of stepping through the code and hoping for the crash to occur (which resets the Spectrum back to the (c) message), are there any tips for debugging this sort of rare crash? Any usual suspects that cause this type of thing?!

Post by WhatHoSnorkers »

Would looking at ERRSP help at all?
I have a little YouTube channel of nonsense
https://www.youtube.com/c/JamesOGradyWhatHoSnorkers

Post by sn3j »

If it's a bad jp (hl) or ret, you could put the bytes 207,26 (207 is RST 8, followed by an error code) just in front of the stack, and keep the RAM in front of that filled with zeroes, which execute as NOPs.
Then there's a ~50% chance that the faulty jump lands in that contiguous run of zeroes, slides through them and finally hits the 207.
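
To make the idea concrete, here is a rough Python model of that trap area (a simulation, not Z80 code, and not necessarily how the layout would look in a real game). It assumes the trap is a run of zero bytes ending in 207 and 26; the names `build_trap` and `lands_on_trap` are made up for the illustration.

```python
RST_08 = 207      # 0xCF, the Z80 opcode for RST 08h
ERROR_BYTE = 26   # the byte placed after the RST, as in the post above


def build_trap(size):
    """Return `size` bytes: zeroes (NOPs) followed by RST 08h and its error byte."""
    if size < 2:
        raise ValueError("trap needs at least two bytes")
    return bytes(size - 2) + bytes([RST_08, ERROR_BYTE])


def lands_on_trap(trap, entry):
    """Simulate a stray jump to offset `entry` inside the trap area.

    Execution walks forward over zero bytes (NOPs). Returns True if the
    first non-zero byte reached is the RST 08h opcode.
    """
    pc = entry
    while pc < len(trap) and trap[pc] == 0:
        pc += 1
    return pc < len(trap) and trap[pc] == RST_08
```

In this model a jump to any offset in the zero run, or directly onto the 207, ends at the RST; only a jump that lands on the trailing 26 misses it.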
Last edited by sn3j on Sun Apr 30, 2023 8:47 pm, edited 1 time in total.
POKE 23614,10: STOP      1..0 hold, SS/m/n colors, b/spc toggle

Post by Ralf »

Does your crash make the Spectrum reset and go back to BASIC, or does it just hang the program?

If your crash resets the Spectrum then I might have a tip. In some emulators, like Spin, you can set a breakpoint which fires when ROM code is executed (i.e. the currently executing instruction is at an address in the range 0-16383).

Then, when your breakpoint fires, try to step one instruction back and if you succeed, you will know where your program crashed.

Unfortunately in most emulators you cannot just undo your last instruction which is a shame because it would be really useful. But you may think of some workarounds.

Post by edjones »


Post by AndyC »

If you have a "main loop" type arrangement, try monitoring the value of SP on every pass. If it's constantly decreasing, you're probably leaking stack space and that almost inevitably leads to a crash at some point.
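
As a rough illustration of what to look for, here is a small Python sketch that analyses a list of SP values sampled at the same point on every pass of the main loop (for example, copied out of an emulator's debugger). The function names are hypothetical, and it assumes a healthy loop shows the same SP at that point every time.

```python
def sp_drift(sp_samples):
    """Difference between the last and first SP sample.

    Zero means the stack is balanced across passes; a negative value means
    SP has moved down, i.e. something was pushed and never popped.
    """
    return sp_samples[-1] - sp_samples[0]


def first_bad_pass(sp_samples):
    """Index of the first pass whose SP differs from the previous pass, or None."""
    for i in range(1, len(sp_samples)):
        if sp_samples[i] != sp_samples[i - 1]:
            return i
    return None


samples = [0xFF00, 0xFF00, 0xFEFE, 0xFEFE, 0xFEFC]
print(sp_drift(samples))        # negative: two words leaked in total
print(first_bad_pass(samples))  # the pass on which the first leak showed up
```

Knowing which pass the value first changed on narrows the search to whatever the game did during that pass.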

Post by bob_fossil »

It sounds like it's stack related. All my spectacular game development crashes back to the copyright prompt or other sections of the ROM were due to mismatched pushes and pops. As you're using an emulator, you could try adding write breakpoints to the addresses around the stack area to see if you're overflowing your stack with too many pushes, or underflowing into the memory above your starting SP with too many pops.

Post by Morkin »

Yep, seconded (not that we're voting :lol:), for me just about every intermittent crash that happened after a while of play testing was caused by a PUSH/POP mismatch.

I figured that as it wasn't happening a lot, it was probably related to an activity not happening every game loop (otherwise it'd crash pretty much straight away), so I narrowed it down that way. Still took ages.
My Speccy site: thirdharmoniser.com

Post by cmal »

SpecEmu has a useful tool in the debugger to look at the execution history. In the debugger click on View -> Execution History. It's the same as looking at the stack pointer memory but it's quicker and less confusing when SP gets modified in the code.

Post by deanysoft »

The trace method is likely to be the only way to track down a truly intermittent crash, but if you're not using a dev system that supports it, it could be quite a process to get your code onto one that does. Then you have the system trace everything until you get the crash. I've not seen the SpecEmu trace system (other than in that video), but usually you end up with a huge list file of executed instructions, and after the crash you simply wind back through it until you see your code go awry. It can help if you can hide the interrupts that are processed, as they will bulk out your trace file. You could turn interrupts off, but if that's the source of the crash...!
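
The "wind back" step can be sketched in Python, assuming the trace has already been reduced to a plain list of program counter values in execution order (real trace files will need parsing first, and their format depends on the emulator). It also assumes the game never calls the ROM on purpose, so the first PC below 16384 marks the crash. The function name is made up for the example.

```python
ROM_END = 0x4000  # the ROM occupies addresses 0x0000-0x3FFF


def last_ram_pc_before_rom(trace):
    """Return the last PC executed in RAM before the first PC in ROM.

    `trace` is a sequence of PC values in execution order. Returns None if
    execution never enters ROM, or if the very first entry is already in ROM.
    """
    previous = None
    for pc in trace:
        if pc < ROM_END:
            return previous
        previous = pc
    return None


trace = [0x8000, 0x8003, 0x9A10, 0x0000, 0x0001]
print(hex(last_ram_pc_before_rom(trace)))  # address of the instruction that jumped into ROM
```

The address it returns is where to start reading backwards; the actual bug may be many instructions earlier.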

Personally, I've never needed this trace facility on the ZX like I used to on 64180 or 68K MICEd up hardware (nothing crashes like an OCTART handler). My crashes on the Spectrum are usually, as others have stated...

stack based
dumping a word over neighbouring bytes (this is a favourite! sometimes I crap on another variable, sometimes some code - endless fun)
indexing off the end of a table to get out of bound values
self modding the code with an out of range value
drawing a sprite in the wrong place or at the wrong size
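
The "word over neighbouring bytes" case is easy to show with a small Python model of memory. The variable names and addresses here are invented for the example; the store is little-endian, as a 16-bit write such as LD (nn),HL would be.

```python
def write_word(mem, addr, value):
    """Store a 16-bit value little-endian: low byte at addr, high byte at addr + 1."""
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF


mem = bytearray(4)
LIVES = 0    # hypothetical one-byte variable
ENERGY = 1   # hypothetical one-byte variable stored right after it

mem[ENERGY] = 99
write_word(mem, LIVES, 3)  # meant to set LIVES to 3, but writes two bytes

print(mem[LIVES], mem[ENERGY])  # ENERGY has been overwritten by the high byte
```

The write looks correct if you only check the variable you meant to change, which is why this one can go unnoticed for a long time.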

It's probably well after the event now but if you don't already, try and keep some source control (git etc) and regularly use it. If you suddenly notice crashes occurring, you can at least compare your recent changes and scrutinise your additions.

You could also simply start commenting routines out bit by bit. Run your code until the crash occurs OR doesn't occur. Either way, you learn something useful.

Post by Ast A. Moore »

Yup, if it’s a runaway stack issue, RZX is your friend. It helped me out with a couple of bizarre and nasty (and intermittent) issues.

RZX is really pretty straightforward. You just record your game; then you simply play back the recording, mark down the approximate time the crash occurs, and break into the debugger slightly before that on the next playback.
Every man should plant a tree, build a house, and write a ZX Spectrum game.

Author of A Yankee in Iraq, a 50 fps shoot-’em-up—the first game to utilize the floating bus on the +2A/+3,
and zasm Z80 Assembler syntax highlighter.

Post by Cosmium »

Some fantastic suggestions, thanks!

And when I track this bug down I'll post back here what it was.

Post by Cosmium »

Aha!

Using some of the neat debugging techniques suggested here I was thankfully able to find and fix the intermittent crash I was experiencing.

It was to do with an interaction between my IM2 service routine and code in the main game loop. I'd assumed the interrupt code and the game code operated on their own code and data. I'd assumed wrong!

There was one particular shared subroutine that relies on self modifying code for speed. To provide the correct starting point for an unrolled LDI copy, it writes a calculated value into the JR offset enabling a jump into a block of LDIs terminated with a JP PE, which then loops back for the remaining blocks of bytes to copy.

The game code occasionally calls this LDI copy subroutine, but under very rare circumstances the mode 2 interrupt happens just at the precise moment the JR offset's been written, and just before the very next instruction (the JR) has executed.

If the interrupt routine happens to also execute the same LDI copy subroutine, it modifies the same JR offset, so that by the time the interrupt is over the game code's JR offset is invalid and an incorrect number of LDI copies occur, meaning the JP PE at the end of the loop fails and LDIs continue past the expected endpoint.
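
For anyone who wants to see the timing problem in isolation, here is a deliberately simplified Python model of it. It is not the actual routine: the patched JR offset is reduced to a single shared number, the interrupt is an ordinary function call placed between the "write" and the "jump", and all names are invented.

```python
class LdiCopy:
    """Stands in for the shared self-modifying copy routine."""

    def __init__(self):
        self.patched_count = 0  # stands in for the byte written into the JR offset

    def patch(self, count):
        """The self-modifying write: store the calculated entry point."""
        self.patched_count = count

    def run(self):
        """The JR executes: return how many LDIs would actually be performed."""
        return self.patched_count


def isr(routine):
    """Interrupt handler that uses the same routine for its own 5-byte copy."""
    routine.patch(5)
    routine.run()


def main_copy(routine, count, interrupt=None):
    """Game code: patch, then jump. An interrupt may fire in between."""
    routine.patch(count)
    if interrupt is not None:
        interrupt(routine)  # fires after the write, before the JR
    return routine.run()


print(main_copy(LdiCopy(), 12))                 # no interrupt in the window
print(main_copy(LdiCopy(), 12, interrupt=isr))  # interrupt overwrote the patch
```

In this model, either keeping interrupts out of the window between the write and the jump, or giving the interrupt its own copy of the routine, would remove the conflict. Which approach was taken in the real code isn't stated here.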

Anyway I've successfully modified the code to avoid this scenario going forward, and am very happy to have the expanded range of debugging techniques offered here "in the toolbox", ready for next time. Much appreciated :)

Post by WhatHoSnorkers »

Brilliant you fixed it, a clever speed technique and awesome that you've told us what it was!

Post by Morkin »

Fair play, I would never have figured that one out :lol: :lol:

Post by PROSM »

Great that you managed to fix it up! Bugs arising from concurrency are some of the trickiest ones to identify.
All software to-date
Working on something, as always.

Post by Bedazzle »

I'll add my 5 cents.
On the 128K there's another suspect: the wrong RAM page is paged in at the moment you jump to code that's supposed to be in it, so the CPU runs whatever happens to be there instead...
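
A small Python sketch of the check, assuming the standard 128K paging scheme where the low three bits of the last value written to port 0x7FFD select the RAM bank visible at 0xC000-0xFFFF. The function names are made up; the value to test would come from wherever your code keeps its copy of the last paging byte.

```python
def paged_bank(port_7ffd_value):
    """RAM bank (0-7) visible at 0xC000-0xFFFF for a value written to port 0x7FFD."""
    return port_7ffd_value & 0x07


def bank_is_expected(port_7ffd_value, expected_bank):
    """True if the bank currently paged in is the one the code is about to jump into."""
    return paged_bank(port_7ffd_value) == expected_bank


print(paged_bank(0x17))            # bank selected by the low three bits
print(bank_is_expected(0x13, 1))   # False here: bank 3 is paged in, not bank 1
```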