Keeping track of registers

djnzx48 · Post by **djnzx48** » Mon Feb 10, 2020 8:57 am

I started using that tool fairly recently as well, and find it quite helpful, although I also experience frequent crashes with ZEsarUX. One thing the description doesn't mention is that the version on the Marketplace is out of date: the latest version (0.11.0) is on the GitHub releases page.

Dr_Dave · Post by **Dr_Dave** » Mon Feb 10, 2020 9:04 am

Yeah, I do have the latest version, but still get crashes. I raised an issue about it on the GitHub page, and the author suggested trying a beta version of ZEsarUX, but that didn't help either.

To be honest, even with the crashes, it is still enormously useful.

Morkin · Post by **Morkin** » Mon Feb 10, 2020 2:30 pm

Ast A. Moore wrote: ↑Sat Feb 08, 2020 8:09 am
1024MAK wrote: ↑Sat Feb 08, 2020 1:57 am Do you have to take your socks off to do a sixteen bit shift?
Oh, now you’re just being silly. For that purpose precisely, I never wear socks.

...The double benefit is that you can trim your toenails while you're coding. Though trying to extract them from your computer keyboard can be a bit time consuming...

Back to the OT, I kinda like a bit of chaos when it comes to assembly programming. I guess some people here are modern devs IRL, where code structure and clarity are particularly important, whereas I'm not a developer so never really cared too much about comments and order in Speccy programming. Most of my debugging issues are caused by typos...

I'm definitely in with the pen and paper crew though, doodling sprites in little squares. I found it amazing how quick you can get at recognising byte patterns and mentally converting binary bytes to hex/decimal values.

(Great discussion BTW)

Joefish · Post by **Joefish** » Mon Feb 10, 2020 3:39 pm

I've done loads of sprite design recently on my DS, running a sprite editor in ZXDS with the buttons set to all the keyboard controls.

ZXDunny · Post by **ZXDunny** » Mon Feb 10, 2020 3:46 pm

Cosmium wrote: ↑Sun Feb 09, 2020 4:16 am Or even days or hours later..

My record was literally less than 5 minutes. I was under the influence of mind-altering substances, feeling pretty chill and suddenly struck by the urge to code. It flowed like water from my fingers and I was well underway when I realised I'd not written a small routine to handle breakpoints in my editor. Off I toddled to write the routine, then came to use it - it crashed. Went back to it and literally had no clue what the hell any of it did.

...though I'm not sure that comments would have helped

RMartins · Post by **RMartins** » Mon Feb 10, 2020 3:53 pm

The most important step in this process is to realize that you can only code useful functions using registers.

The second most important step is to realize that calling functions using C conventions (pushing params) is not very efficient, since there is a lot of overhead in PUSHing the params (and eventually POPping them and/or incrementing SP). And there is no really easy/efficient way to reference those params on the stack.

Third, realize that in most cases the parameters you need to call a function are already in registers (since we need registers to code).

Remember that Variables are a high level concept that associates some data with a memory location.

After this, it's just a matter of organizing your register use for efficiency, i.e. avoid doing any PUSH/POP of registers at the lower level of detail (the leaf functions and/or processing loops).

We can use PUSH and POP at the next highest level, and/or index registers to abstract data structures.

Using this hierarchical logic minimizes the number of PUSH/POPs and/or loads/saves of variables (from/into memory).

Also, avoid keeping data in the A register, since this is the most used register of them all.
You can do (EX AF, AF') if you really need to, but reserve this only for the lower level detail (leaf functions and/or loops).

Dr_Dave · Post by **Dr_Dave** » Mon Feb 10, 2020 3:56 pm

RMartins wrote: ↑Mon Feb 10, 2020 3:53 pm Also, avoid keeping data on the A register, since this is the most used register of them all.
You can do (EX AF, AF') if you really need to, but reserve this only for the lower level detail (leaf functions and/or loops).

In terms of performance, which would be preferred? push/pop af or ex af?

RMartins · Post by **RMartins** » Mon Feb 10, 2020 4:02 pm

Dr_Dave wrote: ↑Mon Feb 10, 2020 3:56 pm In terms of performance, which would be preferred? push/pop af or ex af?

PUSH takes 11T
POP takes 10T

EX AF, AF' takes 4T

But you can check all this on the Z80 Manual.
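To put those figures side by side: saving and restoring AF costs one PUSH plus one POP, or two EX AF, AF' instructions. A quick tally, with Python used purely as a calculator (the dictionary and variable names are made up for illustration):

```python
# T-state costs as quoted in this post.
T_STATES = {"PUSH AF": 11, "POP AF": 10, "EX AF,AF'": 4}

# Save and restore via the stack: one PUSH and one POP.
stack_round_trip = T_STATES["PUSH AF"] + T_STATES["POP AF"]

# Save and restore via the alternate pair: EX AF,AF' executed twice.
exchange_round_trip = 2 * T_STATES["EX AF,AF'"]

print(stack_round_trip, exchange_round_trip)  # 21 8
```

So the exchange route saves 13T per save/restore pair, ignoring any contention.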

You can see some examples of this style of coding in NOXROM Banks.asm. Leaf functions (no loops in this case).
Check "SetBankRead" or "SetBankWrite" functions.

NOTE: a Leaf function is one that does not call any other function, like "SendCMD".

Ast A. Moore · Post by **Ast A. Moore** » Mon Feb 10, 2020 6:38 pm

Dr_Dave wrote: ↑Mon Feb 10, 2020 3:56 pm In terms of performance, which would be preferred? push/pop af or ex af?

EX AF,AF′ is much faster, but keep in mind that it also swaps the flags register. Sometimes, it’s more useful to save one register in another one temporarily.
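A toy model of that caveat, as a sketch only: the `Regs` class and its field names are invented here, and nothing beyond the swap itself is modelled. It shows that EX AF,AF′ exchanges F along with A, so any flag you were about to test disappears from view until you swap back.

```python
from dataclasses import dataclass

@dataclass
class Regs:
    a: int = 0
    f: int = 0
    a_alt: int = 0  # A'
    f_alt: int = 0  # F'

    def ex_af_af(self):
        # EX AF,AF' swaps both halves of the pair, not just A.
        self.a, self.a_alt = self.a_alt, self.a
        self.f, self.f_alt = self.f_alt, self.f

CARRY = 0x01  # carry is bit 0 of F

regs = Regs(a=0x12, f=CARRY)
regs.ex_af_af()
carry_visible_after_swap = bool(regs.f & CARRY)  # False: the carry went to F'

regs.ex_af_af()
carry_visible_after_restore = bool(regs.f & CARRY)  # True again
```

Copying A into a spare register instead (LD B,A, for example) leaves F untouched, which is why it is sometimes the more useful option.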

Ast A. Moore · Post by **Ast A. Moore** » Mon Feb 10, 2020 6:39 pm

ZXDunny wrote: ↑Mon Feb 10, 2020 3:46 pm I was under the influence of mind-altering substances, feeling pretty chill and suddenly struck by the urge to code.

Far out, man.

ZXDunny · Post by **ZXDunny** » Mon Feb 10, 2020 8:44 pm

Ast A. Moore wrote: ↑Mon Feb 10, 2020 6:39 pm
ZXDunny wrote: ↑Mon Feb 10, 2020 3:46 pm I was under the influence of mind-altering substances, feeling pretty chill and suddenly struck by the urge to code.
Far out, man.

I cannot recommend it enough. Will you come back to the code the next day with some incredibly nuanced and optimised genius in code form? Or will it be a huge steaming pile of gibberish?

It's like Xmas day, it really is

Ast A. Moore · Post by **Ast A. Moore** » Mon Feb 10, 2020 9:11 pm

ZXDunny wrote: ↑Mon Feb 10, 2020 8:44 pm
Ast A. Moore wrote: ↑Mon Feb 10, 2020 6:39 pm Far out, man.
I cannot recommend it enough. Will you come back to the code the next day with some incredibly nuanced and optimised genius in code form? Or will it be a huge steaming pile of gibberish?

Every code is sacred,
Every code is great,
When some code gets wasted,
God gets quite irate.

On a more serious note, I do my best coding on my feet. It’s curious how incredibly stuck in a rut I get when I work on a piece of code for hours on end. It’s only when you walk away from it that you can come up with some brilliant optimization or even a completely different approach. Of course, sometimes you overlook something obvious when you’re away from your code, and when you get back to it with “this awesome new idea,” you realize it’s not going to work as intended, because your brilliant idea was only good for spherical cows in a vacuum . . .

Bedazzle · Post by **Bedazzle** » Wed Feb 12, 2020 5:04 am

Dr_Dave wrote: ↑Mon Feb 10, 2020 3:56 pm In terms of performance, which would be preferred? push/pop af or ex af?

http://clrhome.org/table/

Dr_Dave · Post by **Dr_Dave** » Wed Feb 12, 2020 6:32 am

Bedazzle wrote: ↑Wed Feb 12, 2020 5:04 am
Dr_Dave wrote: ↑Mon Feb 10, 2020 3:56 pm In terms of performance, which would be preferred? push/pop af or ex af?
http://clrhome.org/table/

Okay, beginner level question... what is a t-state?

Is it just a clock cycle? So for a 3.5MHz processor, there would be (counts fingers) about 70,000 t-states per frame at 50 fps? And checking if your code fits within one frame is a simple matter of summing the t-states of the op-codes executed within it?

AndyC · Post by **AndyC** » Wed Feb 12, 2020 7:24 am

Dr_Dave wrote: ↑Wed Feb 12, 2020 6:32 am
Bedazzle wrote: ↑Wed Feb 12, 2020 5:04 am http://clrhome.org/table/
Okay, beginner level question... what is a t-state?

Is it just a clock cycle? So for a 3.5MHz processor, there would be (counts fingers) about 70,000 t-states per frame at 50 fps? And checking if your code fits within one frame is a simple matter of summing the t-states of the op-codes executed within it?

In essence, yes. Checking how long your code takes gets more complicated in reality, because you have to start considering contention, which slows some instructions down sometimes. If you're just looking for a ballpark, then t-state counting is a good starting point.
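The ballpark sum can be sketched like this. The loop body below is a hypothetical byte-copy loop chosen only as an example; the per-instruction figures are the documented uncontended timings, and the count ignores contention and the shorter final DJNZ.

```python
CPU_HZ = 3_500_000
FRAMES_PER_SECOND = 50

# Roughly how many T-states fit in one frame.
frame_budget = CPU_HZ // FRAMES_PER_SECOND  # 70000

# Hypothetical loop body: (instruction, T-states).
loop_body = [
    ("LD A,(HL)", 7),
    ("LD (DE),A", 7),
    ("INC HL", 6),
    ("INC DE", 6),
    ("DJNZ loop", 13),  # 13T when the jump is taken
]

t_per_iteration = sum(t for _, t in loop_body)         # 39
iterations_per_frame = frame_budget // t_per_iteration

print(frame_budget, t_per_iteration, iterations_per_frame)  # 70000 39 1794
```

On a 48K Spectrum the usually quoted figure is 69,888 T-states per frame (312 lines of 224 T-states each), so 70,000 is a close estimate.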

Firefox · Post by **Firefox** » Wed Feb 12, 2020 7:36 am

I'm going to throw in a recommendation for Rodnay Zaks' classic tome, "Programming the Z80".

https://archive.org/details/Programming ... nd_Edition

There's an explanation of T-states (which build up into M-cycles) starting on page 69, with nice little shaded-in block diagrams of the Z80 internal furtlings showing what happens in each T-state.

Joefish · Post by **Joefish** » Wed Feb 12, 2020 7:05 pm

Dr_Dave wrote: ↑Wed Feb 12, 2020 6:32 am Okay, beginner level question... what is a t-state?
Is it just a clock cycle? So for a 3.5MHz processor, there would be (counts fingers) about 70,000 t-states per frame at 50 fps? And checking if your code fits within one frame is a simple matter of summing the t-states of the op-codes executed within it?

Exactly that. Some instructions take 4 T-states to execute, others take longer because they have more to do, particularly if they read or write memory. As a rough guide, it takes 3 T-states to read or write a byte of memory. So most of a quick 4-T-state instruction is actually fetching it from memory. Actual execution of the instruction is very quick. For example INC A takes only 4T, but INC BC is 6T, as it has to INC one register first, carry a bit, then INC the other.

PUSH BC takes 11T, as it has to do two memory writes and two decrements of the stack pointer. POP BC is slightly quicker at 10T, for reasons.

Spoiler

(the CPU can increment the stack pointer after a memory read quicker than if it has to decrement it before a write)

You can't really predict them exactly, so you look them up in a reference.
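As an illustration of what such a reference gives you, here are the four instructions mentioned above broken down into machine cycles, using the figures from the standard Z80 timing tables. The table name is made up; the first entry in each list is the opcode fetch, and for PUSH and POP the rest are the two memory accesses.

```python
# T-states per machine cycle for each instruction.
M_CYCLES = {
    "INC A":   [4],
    "INC BC":  [6],
    "PUSH BC": [5, 3, 3],  # longer opcode fetch, then two memory writes
    "POP BC":  [4, 3, 3],  # normal opcode fetch, then two memory reads
}

totals = {name: sum(cycles) for name, cycles in M_CYCLES.items()}
print(totals)
```

The one-T-state difference between PUSH and POP sits entirely in the first machine cycle.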

It's those memory reads and writes that can slow it down more though, if you try to do one while the ULA is fetching screen data from the lower 16K of memory. Your reads or writes have to wait for a gap in an 8-T-state cycle to get through, which can slow the instruction down. This is the 'contention' that everyone talks about. So best to keep your important stuff above 32767 and only write to the screen memory when you have to.

The Amstrad CPC avoids this by slowing every instruction down to a multiple of 4 clock cycles so everything works in synch.

Other CPUs like the 6502 might have lower clock speeds, but execute instructions in only two cycles, which can then be synched with a video generator and so share memory access on alternate ticks of the clock.

One curious instruction is HALT, which is supposed to wait for the next interrupt to occur. Technically what it does is keep doing a 4-T-state NOP instruction, and since interrupts can never break into the middle of an instruction, the interrupt can then be up to 3 T-states late in occurring, even though you were waiting for it. And if you're doing an instruction that takes even longer (like LDI) the interrupt can kick in even later. Though it draws the line at LDIR, and will break in between repeats of the loop.
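A simplified model of that lateness, assuming an interrupt is only accepted once the instruction in progress has finished. The function name is made up, and the exact T-state at which the real chip samples the interrupt line is not modelled.

```python
def interrupt_delay(arrival_t, instruction_len=4):
    """T-states left in the current instruction when the request arrives.

    arrival_t is counted from the start of a run of identical instructions,
    each instruction_len T-states long.
    """
    return (-arrival_t) % instruction_len

# HALT behaves like a string of 4T NOPs: the delay cycles through 0, 3, 2, 1.
halt_delays = [interrupt_delay(t) for t in range(8)]
worst_case_halt = max(halt_delays)

# Inside a 16T instruction such as LDI the wait can be much longer.
worst_case_ldi = max(interrupt_delay(t, 16) for t in range(16))

print(halt_delays, worst_case_halt, worst_case_ldi)
```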

Lethargeek · Post by **Lethargeek** » Wed Feb 12, 2020 9:33 pm

Joefish wrote: ↑Wed Feb 12, 2020 7:05 pm Exactly that. Some instructions take 4 T-states to execute, others take longer because they have more to do, particularly if they read or write memory. As a rough guide, it takes 3 T-states to read or write a byte of memory. So most of a quick 4-T-state instruction is actually fetching it from memory.

Fetching takes only 2 T-states and is the shortest memory access on the Z80 (the other 2 T-states are taken up by refresh).

http://www.piclist.com/techref/mem/dram/slide4.html

Joefish wrote: ↑Wed Feb 12, 2020 7:05 pm One curious instruction is HALT, which is supposed to wait for the next interrupt to occur. Technically what it does is keep doing a 4-T-State NOP instruction - and the interrupts can never break in to the middle of an instruction - so the interrupt can then be up to 3-T-States late in occurring, even though you were waiting for it. And if you're doing an instruction that takes even longer (like LDI) the interrupt can kick in even later. Though it draws the line at LDIR, and will break in between repeats of the loop.

And a long string of prefixes and/or EIs may cause a missed interrupt.

RMartins · Post by **RMartins** » Sat Feb 22, 2020 2:54 pm

Joefish wrote: ↑Wed Feb 12, 2020 7:05 pm One curious instruction is HALT, which is supposed to wait for the next interrupt to occur. Technically what it does is keep doing a 4-T-State NOP instruction - and the interrupts can never break in to the middle of an instruction - so the interrupt can then be up to 3-T-States late in occurring, even though you were waiting for it. And if you're doing an instruction that takes even longer (like LDI) the interrupt can kick in even later. Though it draws the line at LDIR, and will break in between repeats of the loop.

Actually LDIR is just an LDI with an implicit jump back to the start of the instruction if BC != 0.
It actually re-fetches, decodes and executes the instruction every time (again and again), for each step of the loop.
This makes it simple for the interrupt mechanism to work without any extra special handling, since this is just a string of repeated instructions, exactly like HALT or any other self-repeating Z80 instruction, which makes them look like any other short instruction from the perspective of the CPU state machine.

This has several advantages, like allowing the refresh address to keep ticking after each fetch, hence allowing the computer memory to be correctly refreshed, as usual.
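The "LDI plus implicit jump" view can be sketched as a small emulation. This is an illustrative model only: the function name and the `interrupt_at` parameter are invented, and the timings used are the documented 21T per repeated step and 16T for the final one.

```python
def ldir(memory, hl, de, bc, interrupt_at=None):
    """Emulate LDIR as repeated LDI steps; returns (hl, de, bc, t_states).

    If interrupt_at is given, execution stops after the first repeated step
    that ends at or after that T-state, as if an interrupt had been accepted
    between repeats. BC then still holds the remaining count, so the copy
    could be resumed later.
    """
    t = 0
    while True:
        memory[de] = memory[hl]   # the LDI part: copy one byte
        hl = (hl + 1) & 0xFFFF
        de = (de + 1) & 0xFFFF
        bc = (bc - 1) & 0xFFFF
        if bc == 0:
            t += 16               # last step: no jump back
            break
        t += 21                   # step plus the implicit jump back
        if interrupt_at is not None and t >= interrupt_at:
            break
    return hl, de, bc, t

ram = bytearray(16)
ram[0:4] = b"ABCD"
full_copy = ldir(ram, hl=0, de=8, bc=4)

ram2 = bytearray(16)
ram2[0:4] = b"ABCD"
interrupted = ldir(ram2, hl=0, de=8, bc=4, interrupt_at=30)
```

In the second call the request arrives during the second step, so two bytes have been copied and BC is left at 2 when the copy is interrupted.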

TIP: If you basically lock the computer out by disabling interrupts and issuing a HALT instruction, you will never get control back until you receive an NMI (this interrupt is not maskable, by definition).
But while the computer is HALTed (i.e. waiting for an interrupt), you can still see it constantly re-fetching the HALT instruction from memory if you take a look at the address and data buses.

This was especially relevant to look out for when I was developing my NOXROM cartridge, since these accesses (like HALT) will actually repeat the same address N times. Because it's a single-byte instruction that repeats, it fetches the same address 2 times in a row if given the chance (repeat condition is true), and if that matches the refresh address issued for each fetch, it would produce 3 "identical" addresses in sequence, which would force a NOXROM page swap.

So one of the conditions for software to work successfully with a NOXROM cartridge, is to never have a single byte repeating instruction (like HALT) in an address that ends in 0xFE.

Spectrum Computing
