CPI / CPIR instruction
Has anyone ever actually used these instructions?
I remember when I first read about them in 'Spectrum Machine Language for the Absolute Beginner', and it described them as being so powerful as to leap skyscrapers (or something). They really do sound like they have the potential to be as useful as LDI/LDIR, but every time I think "Ah! Finally - I can use CPI!*", there's a property of it that means it's not the best solution.
I think if it was a CP (HL) with (DE), increment, and DEC BC then I'd find a use for it.
*which was the case this week
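For what it's worth, the "CP (HL) with (DE)" behaviour wished for above can be approximated by pairing CPI with a manual load from (DE). This is only a sketch: the label CompareBlocks is hypothetical, and BC is assumed to be non-zero on entry.

```asm
; Compare BC bytes at DE against BC bytes at HL.
; In:  DE, HL = start of the two blocks, BC = length (must be non-zero)
; Out: Z set if every byte matched, NZ at the first mismatch
CompareBlocks:
        ld   a,(de)           ; fetch the byte from the first block
        inc  de
        cpi                   ; compare A with (HL), then HL+1 and BC-1
        ret  nz               ; mismatch: return with Z reset
        jp   pe,CompareBlocks ; P/V stays set while BC is non-zero
        ret                   ; BC hit zero: Z is still set from the last CPI
```

The extra LD and INC cost 13 T-states per byte on top of CPI's 16, which may be part of why CPI rarely wins in this role.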
- Ast A. Moore
- Rick Dangerous
- Posts: 2641
- Joined: Mon Nov 13, 2017 3:16 pm
Re: CPI / CPIR instruction
Yes, I use CPIR in my redefine keys routine. It’s pretty slow, so I’d reserve it for when code execution speed isn’t a concern.
I think its primary purpose is for use with data arrays, e.g. text. I use it to ignore the already defined keys (something many redefine keys routines neglect, much to the chagrin of players).
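A minimal sketch of that kind of check, not the actual routine from the game. KeyTable and NUM_KEYS are hypothetical names, and unassigned entries are assumed to hold a value that no real key code uses, so the whole table can always be scanned.

```asm
; In:  A = code of the key just pressed
; Out: Z set if the key is already assigned, NZ if it is free
CheckKeyUsed:
        ld   hl,KeyTable      ; hypothetical table of assigned key codes
        ld   bc,NUM_KEYS      ; hypothetical table length (must be non-zero)
        cpir                  ; stops on the first match or when BC reaches 0
        ret                   ; Z reflects the last comparison CPIR made
```

Test Z rather than P/V here: if the match is in the very last entry, BC reaches zero at the same time and P/V is reset even though the key was found.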
Every man should plant a tree, build a house, and write a ZX Spectrum game.
Author of A Yankee in Iraq, a 50 fps shoot-’em-up—the first game to utilize the floating bus on the +2A/+3,
and zasm Z80 Assembler syntax highlighter.
Re: CPI / CPIR instruction
I very rarely use it. But I used it once in one of my games. Imagine a table of N bytes. I needed to count how many bytes in this table are 0.
Eventually I did:
Code: Select all
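LD HL,TableStart
LD DE,0 ;count of 0s
LD BC,TableSize ;must be non-zero
XOR A ;A = 0, the value to look for
Loop1:
CPI ;does CP (HL), then INC HL and DEC BC
JP NZ,Loop2 ;not a 0, so skip the count
INC DE ;increase count of 0s (16-bit INC leaves the flags alone)
Loop2:
JP PE,Loop1 ;P/V stays set until BC reaches 0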
Re: CPI / CPIR instruction
No, I usually try to find things in a faster way than with CP*(R).
Seriously, aren't the instructions rather slow?
AFAIK native LZ packers use CPIR/CPDR as a fast way to find match candidates.
Proud owner of Didaktik M
Re: CPI / CPIR instruction
It’s very useful for writing expression parsers. Finding the CR end of line markers, finding the equals in Key=Value constructs, reading data out of modem AT command responses, that kind of thing.
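As a rough illustration of the end-of-line case, here is a sketch that counts CR-terminated lines in a buffer by restarting CPIR after each hit. The label names are hypothetical, and the buffer length is assumed to be non-zero.

```asm
; In:  HL = start of buffer, BC = buffer length (must be non-zero)
; Out: DE = number of CR (13) bytes found
CountLines:
        ld   de,0
        ld   a,13             ; CR; CPIR leaves A untouched
NextLine:
        cpir                  ; scan forward to the next CR
        ret  nz               ; buffer exhausted without another CR
        inc  de               ; one more line (16-bit INC leaves flags alone)
        jp   pe,NextLine      ; P/V set: bytes remain after this CR
        ret                   ; the CR was the last byte of the buffer
```

After each hit HL already points at the first character of the next line, which is convenient if the parser wants to record line starts instead of just counting.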
Robin Verhagen-Guest
SevenFFF / Threetwosevensixseven / colonel32
NXtel • NXTP • ESP Update • ESP Reset • CSpect Plugins
Re: CPI / CPIR instruction
Thanks guys. You've made me realise there is somewhere I could (and should) have used this. I had a map of 18 bytes and I needed to find the first available empty place, so a new tile could be inserted.
For the first time ever I will use the CPIR instruction!
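Something along these lines, as a sketch only. The label Map and the choice of 0 as the "empty" marker are assumptions, since the post doesn't say how an empty place is encoded.

```asm
; Out: Z set and HL = address of the first empty slot, or NZ if the map is full
FindEmptySlot:
        ld   hl,Map           ; hypothetical label for the 18-byte map
        ld   bc,18
        xor  a                ; A = 0, the assumed "empty" marker
        cpir
        ret  nz               ; no empty slot anywhere
        dec  hl               ; CPIR leaves HL one past the match
        ret                   ; 16-bit DEC does not disturb the Z flag
```

CPIR takes 21 T-states for each iteration that repeats and 16 for the one that stops, so a full scan of 18 bytes costs at most 17 × 21 + 16 = 373 T-states.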
The speed looks quite good to me, same as LDI/LDIR. CPI/R seem less likely to be used in fast game loops anyway.
- Ast A. Moore
Re: CPI / CPIR instruction
Yup. That is precisely where you’d use CPI/CPIR.
Re: CPI / CPIR instruction
Well, LDI decreases/increases 3 16bit registers, does one memory read, one memory write.
CPI decreases/increases 2 16bit registers, does one memory read and register comparison.
I was just thinking about that yesterday when I was trying to figure out why I don't use CPI etc.
Btw. Whenever I needed some buffer that would have empty slots, I kept a user stack with pointers to empty slots as a helper structure. That way an empty slot is always on top of the stack, so I just pick it up. When I invalidate the data in a slot, I return the slot pointer to the top of the stack.
Re: CPI / CPIR instruction
I could have used it in my compression routine, for looking to see if a data byte is already recorded in a dictionary of most common values. Except I want an 8-bit index into the dictionary, not an absolute address, so it's easier to just write my own compare routine than faff around with adjusting the answer from an absolute address to a relative value. And no, I'm not going to align everything to page boundaries!
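One possible workaround, offered as a sketch and not as a claim that it beats a hand-written loop: the index can be recovered from what is left in BC instead of from the address in HL, so no alignment is needed. Dict and DICT_SIZE are hypothetical names, and DICT_SIZE is assumed to be between 1 and 256.

```asm
; In:  A = byte to look up
; Out: carry clear and A = 8-bit index if found, carry set if not found
DictIndex:
        ld   hl,Dict          ; hypothetical dictionary label
        ld   bc,DICT_SIZE     ; hypothetical size, 1..256
        cpir
        jr   nz,NotFound
        ld   a,DICT_SIZE-1    ; a match at index i leaves BC = DICT_SIZE-1-i
        sub  c                ; A = i; never borrows, so carry is clear
        ret
NotFound:
        scf                   ; signal "not in dictionary"
        ret
```

For example, with DICT_SIZE = 4 a match on the first entry leaves BC = 3, giving index 0, and a match on the last entry leaves BC = 0, giving index 3.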
Re: CPI / CPIR instruction
catmeows wrote: ↑Fri Sep 27, 2019 3:35 pm
Well, LDI decreases/increases 3 16bit registers, does one memory read, one memory write.
CPI decreases/increases 2 16bit registers, does one memory read and register comparison.
I was just thinking about that yesterday when I was trying to figure out why I don't use CPI etc.
Ah sorry, I see what you mean. I rarely think in terms of T-states and had to resort to Rodnay Zaks.
To summarise:
Code: Select all
LDI 16t
equivalent to:
ld a,(hl) ;7t
ld (de),a ;7t
inc hl ;6t
inc de ;6t
dec bc ;6t
=32t
CPI 16t
equivalent to:
cp (hl) ;7t
inc hl ;6t
dec bc ;6t
=19t
(It might be easiest just to say "wiring" )
EDITED - removed stupid flowery prose
- Ast A. Moore
Re: CPI / CPIR instruction
Uh . . . The short answer is: it’s complicated.
The long answer is itself complicated.
You see, when analyzing these combo instructions, it’s best not to rewrite them in pseudocode like you did. Your pseudocode is correct, but only in breaking down the logic of the instruction. That is how the CPU arrives at the result, but that’s actually not what it’s doing.
A better way of breaking down any instruction is to think of it in machine cycles, not T states. Each machine cycle can take several T states, and each instruction takes at least one M cycle: the opcode fetch. The absolute minimum number of T states in a fetch M cycle is four. Some instructions take just that many T states (say, INC A). That’s how long it takes to place the PC register on the address bus and read the opcode. Extended instructions (prefixed by ED, CB, DD, and FD) take an additional 4 T states, because their opcodes are two bytes long. IX and IY bit instructions (prefixed by DDCB and FDCB) take even longer. Compare, for example, the regular LD HL,(**) instruction (opcode 2A; 16 T states) with its undocumented counterpart (opcode ED6B; 20 T states).
Now, each of the pseudocode instructions that you wrote out doesn’t need to be fetched and parsed individually; only one instruction fetch happens in either LDI or CPI. Since those are extended instructions (with the ED prefix), the fetch machine cycle for each takes 8 T states.
Subsequent machine cycles (if they exist at all) are for moving data between the CPU and the RAM/ROM or other devices (I/O). They can take anywhere from three to five T states. Some instructions don’t move any data (INC A) and thus take much less time. Incrementing an index register, however, will take longer, because, say, INC IXh is an extended instruction; it takes another 4 T states to fetch the second byte of its opcode. Yet something like EX (SP),IX can take as many as six machine cycles and 23 T states (!) (two fetches, two memory reads and two writes, one for each byte).
The internal workings of the Z80 are not as easily broken down timing-wise and they do depend on numerous factors, including, as you put it, “the wiring.” Suffice it to say that actually incrementing a register (or register pair) doesn’t take up 6 T states. Moreover, increments and decrements can be grouped together and impose little to no overhead when executed simultaneously; they’re not necessarily cumulative. The incrementer/decrementer circuitry in the Z80 is quite clever and can do various things. It can, too, pass a value without incrementing or decrementing it; thus, similar to the WZ register pair, it can be used for storing data temporarily.
The HL/DE register pairs can be very easily swapped in hardware. In fact, they are not strictly speaking physically separate registers at all. Instructions like EX DE,HL don’t actually exchange data between DE and HL, but it sure looks like it to the programmer.
Some internal operations in the Z80 can be pipelined and thus overlap, but not all. For example, you can’t directly copy a value from one register to another (yes, even the LD B,C mnemonic is a lie). The operation must be done through the ALU. But the ALU in the Z80 is 4-bit, and using it for transferring data between 16-bit registers would be too slow. It’s much faster to use the incrementer/decrementer circuitry for that. Now, the ALU (and register) operations can finish while the CPU is fetching another instruction, but since that requires the incrementer/decrementer latch, if an instruction requires its use, it must be completed first before the next instruction can be fetched. This explains why INC A is faster than INC HL, for example. Block transfers (LDIR, CPIR, etc.) sure use the incrementer latch a lot.
Like I said, it’s complicated. Hopefully, I’ve now confused you beyond reason, and you have no desire to investigate the matter any further.
Re: CPI / CPIR instruction
Those instruction sequences aren't exactly equivalent, as CPI/LDI also use the P/V flag to signal whether BC has reached zero.
If LDI and CPI take the same number of T-states, my guess is that CPI uses the same circuitry but doesn't output the value to memory, or maybe just writes it back to HL.
- 1024MAK
- Bugaboo
- Posts: 3123
- Joined: Wed Nov 15, 2017 2:52 pm
- Location: Sunny Somerset in the U.K. in Europe
Re: CPI / CPIR instruction
The Z80 MPU has a number of features (including clever ideas) that make it a bit unconventional.
Remember, the mnemonics are only there to help humans remember the effect of the instruction. They do not necessarily indicate accurately how the Z80 carries out the operation, as all sorts of hardware tricks take place. The exchange of alternate register sets is a good example. No copy/swap operation takes place; instead a single latch/flip-flop bit changes state to tell the Z80 which registers are the current in-use set.
The other thing to remember, is that MPU/CPU design is closely tied in with memory performance. At the time that the Z80 was designed, DRAM and ROM memory chips were painfully slow (in fact, DRAM memory is still painfully slow, we have just come up with many more tricks to make it look a bit faster). Hence where possible MPU/CPU designers avoided unnecessary memory accesses where they could. Memory was also very expensive. So again, instructions that did a lot of useful work for not many bytes of code were favoured, so that code could be compact.
One limiting factor with the Z80 design, is that because it was designed to run 8080 code, this rather limited the flexibility of the instruction set. Hence there are a lot of operations that take longer than is actually needed compared with if you started with a clean sheet approach.
Mark
Standby alert
“There are four lights!”
Step up to red alert. Sir, are you absolutely sure? It does mean changing the bulb
Looking forward to summer later in the year.
- Juan F. Ramirez
- Bugaboo
- Posts: 5137
- Joined: Tue Nov 14, 2017 6:55 am
- Location: Málaga, Spain
Re: CPI / CPIR instruction
I read the whole thread because of @R-Tape's meme, I don't usually wander around this kind of thread! No idea of coding!
Re: CPI / CPIR instruction
Juan F. Ramirez wrote: ↑Sat Sep 28, 2019 1:23 pm
I read the whole thread because of @R-Tape's meme, I don't usually wander around this kind of thread! No idea of coding!
Heh - I'm the same with the hardware threads, enjoy reading them but no idea what anyone's talking about...
My Speccy site: thirdharmoniser.com