sn3j wrote: ↑Thu Apr 25, 2024 1:12 pm
Here's another one:
Code: Select all
ld a, h
add a
jr c, LE_0
or l
jr z, LE_0
which is 31T for Zero, 20T for negative numbers (50% of all cases) and 26T for the positive numbers.
Average should be 23T, depending on how your cases are distributed.
nice one! sadly, the compiler cannot use "JRs". but i'll save this to my snippets collection to use in hand-crafted asm code, thank you! ;-)
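The 23T figure can be checked by counting cycles for every 16-bit value. The following is only a small Python cost model of the five-instruction JR sequence, not an emulator; the function name is made up for illustration, and it assumes the standard Z80 timings (4T for the 8-bit register ops, 12T/7T for a taken/not-taken JR) and a uniform distribution of HL values.

```python
# Cost model of: ld a,h / add a / jr c,LE_0 / or l / jr z,LE_0
# Assumed timings: ld a,h = 4T, add a = 4T, or l = 4T, jr cc = 12T taken / 7T not taken.

def cycles_jr_version(hl: int) -> int:
    h, l = hl >> 8, hl & 0xFF
    t = 4 + 4                          # ld a,h ; add a
    if h & 0x80:                       # add a moves the sign bit into carry
        return t + 12                  # jr c taken: negative number
    t += 7 + 4                         # jr c not taken ; or l
    if ((((h << 1) & 0xFF) | l) == 0):
        return t + 12                  # jr z taken: zero
    return t + 7                       # jr z not taken: positive number

total = sum(cycles_jr_version(hl) for hl in range(0x10000))
average = total / 0x10000
print(f"zero {cycles_jr_version(0)}T, negative {cycles_jr_version(0x8000)}T, "
      f"positive {cycles_jr_version(1)}T, average {average:.4f}T")
```

With all 65536 values equally likely the average lands a hair above 23T, because the single zero case is the only one that costs 31T.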
jump if signed 16-bit number is >0, <=0, fastest code
Re: jump if signed 16-bit number is >0, <=0, fastest code
Yet another one:
Code: Select all
ld a, h
add a
jp c, LE_0
jp nz, GT_0
; deal with cases 0..255 here
or l
jp z, LE_0
Average would be somewhere around 23.05 T.
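The 23.05T estimate follows from the three paths through the JP version. Here is a rough Python tally, assuming uniformly distributed HL values and the usual timings (4T per register op, 10T for a conditional JP whether taken or not); the dictionary labels are just for illustration.

```python
ALU = 4   # ld a,h / add a / or l
JP = 10   # jp cc costs 10T on the Z80, taken or not

# path description -> (number of HL values taking it, cycles on that path)
paths = {
    "negative, jp c taken":          (0x8000,    2 * ALU + 1 * JP),
    "H in 1..127, jp nz taken":      (127 * 256, 2 * ALU + 2 * JP),
    "H == 0, reaches or l / jp z":   (256,       3 * ALU + 3 * JP),
}

total_values = sum(count for count, _ in paths.values())
average = sum(count * cycles for count, cycles in paths.values()) / total_values

for name, (count, cycles) in paths.items():
    print(f"{name}: {cycles}T x {count}")
print(f"average: {average:.4f}T")
```

The H == 0 path costs the same whether the value is zero or 1..255, since a JP takes 10T either way.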
POKE 23614,10: STOP 1..0 hold, SS/m/n colors, b/spc toggle
Re: jump if signed 16-bit number is >0, <=0, fastest code
Antonio Luque wrote: ↑Thu Apr 25, 2024 12:36 pm
This takes 26ts, but preserves HL:
Code: Select all
xor a
sub l
sbc a,a
sub h
jp m,GreaterThan0
This could be fixed by:
Code: Select all
xor a
sub l
sbc a,a
cpl
or h
jp m,LessOrEqual0
Re: jump if signed 16-bit number is >0, <=0, fastest code
sn3j wrote: ↑Thu Apr 25, 2024 9:19 pm
This could be fixed by:
Code: Select all
xor a
sub l
sbc a,a
cpl
or h
jp m,LessOrEqual0
but then it is slower than the 24ts case: 4*5+10=30.
might be useful if HL cannot be destroyed, though. but then we have the 18/32 version, which is not much slower, and sometimes faster.
still, might be useful for stable timings.
Re: jump if signed 16-bit number is >0, <=0, fastest code
Yes, and maybe A is already 0 from a previous operation which would allow us to drop the XOR A.
So we have a 26T non-destructive solution with 7 bytes and stable timing.
Re: jump if signed 16-bit number is >0, <=0, fastest code
sn3j wrote: ↑Thu Apr 25, 2024 9:19 pm
This could be fixed by:
Code: Select all
xor a
sub l
sbc a,a
cpl
or h
jp m,LessOrEqual0
test HL = 0x3300
xor A --> A = 0x00
sub L --> 0x00 - 0x00 --> no carry
sbc A,A --> A = 0x00
cpl --> A = 0xFF
or H --> fail: A is already 0xFF, so the result does not depend on H and always comes out as less than or equal to 0
Do I understand correctly?
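The trace above can be confirmed for all inputs with a small model. This is only a Python sketch of the flag behaviour the sequence relies on (carry out of `sub l`, sign bit after `or h`), with made-up function names; it is not a full Z80 emulation.

```python
def jumps_to_le0(hl: int) -> bool:
    """Model of: xor a / sub l / sbc a,a / cpl / or h / jp m,LessOrEqual0."""
    h, l = hl >> 8, hl & 0xFF
    carry = l != 0                 # xor a ; sub l  -> 0 - L borrows unless L == 0
    a = 0xFF if carry else 0x00    # sbc a,a        -> 0xFF on carry, else 0x00
    a ^= 0xFF                      # cpl
    a |= h                         # or h           -> sign flag = bit 7 of A
    return bool(a & 0x80)          # jp m taken when the sign flag is set

def is_le0(hl: int) -> bool:
    """Reference: signed 16-bit value <= 0."""
    return hl == 0 or hl >= 0x8000

failures = [hl for hl in range(0x10000) if jumps_to_le0(hl) != is_le0(hl)]
print(len(failures), [hex(v) for v in failures[:3]], hex(0x3300), 0x3300 in failures)
```

In this model the wrong answers are exactly the positive values whose low byte is zero, which includes the 0x3300 test case.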
Z80 Forth compiler (ZX Spectrum 48kb): https://codeberg.org/DW0RKiN/M4_FORTH
Re: jump if signed 16-bit number is >0, <=0, fastest code
Code that preserves the top of stack
My usual (penalty) code rating is byte count * 4 + T-states, so lower is better. This way you can count apples and pears together.
In the file __macros.m4 there is a "variable" named __BYTE_PRICE
dnl # is used to calculate the difficulty of the code
dnl # prize = clocks + (__BYTE_PRICE * bytes)
define({__BYTE_PRICE},4){}dnl
If it is redefined to a different value, the output may differ: when there is more than one candidate solution, the change can affect which one wins.
Code: Select all
dworkin@dw-A15:~/Programovani/ZX/Forth/Pasmo_test$ ../check_word.sh 'DUP PUSH(0) GT IF'
;[7:30] dup 0 > if ( x -- x ) flag: x > 0
ld A, H ; 1:4 dup 0 > if save sign
dec HL ; 1:6 dup 0 > if zero to negative
or H ; 1:4 dup 0 > if
inc HL ; 1:6 dup 0 > if
jp m, else101 ; 3:10 dup 0 > if
; seconds: 0 ;[ 7:30]
dworkin@dw-A15:~/Programovani/ZX/Forth/Pasmo_test$ ../check_word.sh 'DUP PUSH(0) LE IF'
;[7:30] dup 0 <= if ( x -- ) flag: x <= 0
ld A, H ; 1:4 dup 0 <= if save sign
dec HL ; 1:6 dup 0 <= if zero to negative
or H ; 1:4 dup 0 <= if
inc HL ; 1:6 dup 0 <= if
jp p, else101 ; 3:10 dup 0 <= if
; seconds: 0 ;[ 7:30]
Re: jump if signed 16-bit number is >0, <=0, fastest code
So I should improve the code for the "DUP 0 LE IF" case. The old code is the 7-byte, 30T macro quoted after the new result.
New result:
Code: Select all
ld A, H ; 1:4
add A, A ; 1:4
jr c, $+6 ; 2:7/12
or L ; 1:4
jp nz, LAB_GT ; 3:10
; LAB_LE:
; ...
LAB_GT:
;[8:20,29/29] price 56.5
Code: Select all
dnl
dnl
dnl # ( x1 -- x1 )
dnl # dup 0<= if
define({DUP_0LE_IF},{dnl
__{}__ADD_TOKEN({__TOKEN_DUP_0LE_IF},{dup 0<= if},$@){}dnl
}){}dnl
dnl
define({__ASM_TOKEN_DUP_0LE_IF},{dnl
__{}define({__INFO},__COMPILE_INFO){}dnl
__{}define({IF_COUNT}, incr(IF_COUNT)){}dnl
__{}pushdef({ELSE_STACK}, IF_COUNT){}dnl
__{}pushdef({THEN_STACK}, IF_COUNT){}dnl
__{}ifelse(1,0,{
__{} ;[9:33/20] __INFO ( x -- x ) flag: x <= 0
__{} ld A, L ; 1:4 __INFO
__{} or H ; 1:4 __INFO
__{} jr z, $+7 ; 2:7/12 __INFO
__{} bit 7, H ; 2:8 __INFO
__{} jp z, format({%-11s},else{}IF_COUNT); 3:10 __INFO},
__{}{
__{} ;[7:30] __INFO ( x -- ) flag: x <= 0
__{} ld A, H ; 1:4 __INFO save sign
__{} dec HL ; 1:6 __INFO zero to negative
__{} or H ; 1:4 __INFO
__{} inc HL ; 1:6 __INFO
__{} jp p, format({%-11s},else{}IF_COUNT); 3:10 __INFO}){}dnl
}){}dnl
Code: Select all
dworkin@dw-A15:~/Programovani/ZX/Forth/Pasmo_test$ ../check_word.sh 'VERBOSE(1) define({__BYTE_PRICE},6) DUP PUSH(0) LE IF'
...new __TOKEN_DUP() "dup" --> token(1) __TOKEN_DUP "dup"
...new __TOKEN_PUSHS(0) "{0}" --> token(2) __TOKEN_PUSHS(0) "{{0}}"
...new __TOKEN_LE() "<=" --> token(2) __TOKEN_0LE "{{0}} <="
...new __TOKEN_IF() "if" --> token(1) __TOKEN_DUP_0LE_IF "dup {{0}} <= if"
...check all tokens
...second pass(1) __TOKEN_DUP_0LE_IF() "dup {0} <= if" --> __TOKEN_DUP_0LE_IF "dup {{0}} <= if"
...check all tokens2
...third pass(1) __TOKEN_DUP_0LE_IF() "dup {0} <= if" --> __TOKEN_DUP_0LE_IF "dup {{0}} <= if"
...create(1) __TOKEN_DUP_0LE_IF "dup {{0}} <= if"
;[7:30] dup 0 <= if ( x -- ) flag: x <= 0
ld A, H ; 1:4 dup 0 <= if save sign
dec HL ; 1:6 dup 0 <= if zero to negative
or H ; 1:4 dup 0 <= if
inc HL ; 1:6 dup 0 <= if
jp p, else101 ; 3:10 dup 0 <= if
; seconds: 0 ;[ 7:30]
dworkin@dw-A15:~/Programovani/ZX/Forth/Pasmo_test$ ../check_word.sh 'VERBOSE(1) DUP PUSH(0) LE IF'
...new __TOKEN_DUP() "dup" --> token(1) __TOKEN_DUP "dup"
...new __TOKEN_PUSHS(0) "{0}" --> token(2) __TOKEN_PUSHS(0) "{{0}}"
...new __TOKEN_LE() "<=" --> token(2) __TOKEN_0LE "{{0}} <="
...new __TOKEN_IF() "if" --> token(1) __TOKEN_DUP_0LE_IF "dup {{0}} <= if"
...check all tokens
...second pass(1) __TOKEN_DUP_0LE_IF() "dup {0} <= if" --> __TOKEN_DUP_0LE_IF "dup {{0}} <= if"
...check all tokens2
...third pass(1) __TOKEN_DUP_0LE_IF() "dup {0} <= if" --> __TOKEN_DUP_0LE_IF "dup {{0}} <= if"
...create(1) __TOKEN_DUP_0LE_IF "dup {{0}} <= if"
;[8:20/29] dup 0 <= if ( x -- x ) flag: x <= 0
ld A, H ; 1:4 dup 0 <= if
add A, A ; 1:4 dup 0 <= if
jr c, $+6 ; 2:7/12 dup 0 <= if
or L ; 1:4 dup 0 <= if
jp nz, else101 ; 3:10 dup 0 <= if
; seconds: 0 ;[ 8:29]
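Why the two runs pick different code can be worked out from the price formula. The Python sketch below is not the compiler's m4 logic; it assumes the two timings of the 8-byte variant are averaged, which is inferred from the quoted price of 56.5 (8 * 4 + 24.5), and the candidate labels are made up.

```python
def price(size_bytes: int, clocks: float, byte_price: int = 4) -> float:
    """price = clocks + (__BYTE_PRICE * bytes), as in the __macros.m4 comment."""
    return clocks + byte_price * size_bytes

# label -> (bytes, clocks); the 8-byte variant uses the mean of 20T and 29T (assumption)
candidates = {
    "7:30    dec hl / or h / inc hl": (7, 30),
    "8:20/29 add a,a / jr c":         (8, (20 + 29) / 2),
}

def winner(byte_price: int) -> str:
    return min(candidates, key=lambda name: price(*candidates[name], byte_price))

for bp in (4, 6):
    prices = {name: price(*spec, bp) for name, spec in candidates.items()}
    print(bp, prices, "->", winner(bp))
```

At the default byte price of 4 the 8-byte variant is cheaper (56.5 against 58); at 6 the extra byte costs more than the saved clocks (72.5 against 72), which matches the two outputs above.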
Re: jump if signed 16-bit number is >0, <=0, fastest code
Argh - right, this does not work for positive high bytes when L is 0... too bad.
Idea:
Code: Select all
xor a
cp l
sbc a, h ; 16-bit subtraction: 0 - x
sbc a, a
cpl
or h
jp m, LessOrEqual0
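A quick exhaustive check of this idea, again as a Python model rather than real Z80 code. It tracks only A and the carry flag through the sequence and compares the `jp m` decision with a signed `<= 0` test for every HL value; the function name is invented for the sketch.

```python
def jumps_to_le0(hl: int) -> bool:
    """Model of: xor a / cp l / sbc a,h / sbc a,a / cpl / or h / jp m,LessOrEqual0."""
    h, l = hl >> 8, hl & 0xFF
    a = 0                              # xor a
    carry = 1 if a < l else 0          # cp l: borrow when 0 < L
    diff = a - h - carry               # sbc a,h: high byte of 0 - HL
    carry = 1 if diff < 0 else 0       # borrow out is set whenever HL != 0
    a = diff & 0xFF
    a = (a - a - carry) & 0xFF         # sbc a,a: 0xFF if HL != 0, else 0x00
    a ^= 0xFF                          # cpl
    a |= h                             # or h: bit 7 set if H negative or HL == 0
    return bool(a & 0x80)              # jp m

mismatches = [hl for hl in range(0x10000)
              if jumps_to_le0(hl) != (hl == 0 or hl >= 0x8000)]
print("mismatches:", len(mismatches))
```

Under this model there are no mismatches, including the earlier 0x3300 case. The price is one more instruction: six 4T ops plus the 10T JP give 34T in 9 bytes, still with constant timing and HL preserved.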
cool
Last edited by sn3j on Fri Apr 26, 2024 7:49 am, edited 1 time in total.
Re: jump if signed 16-bit number is >0, <=0, fastest code
i tried similar tricks, trying to avoid branches… until i remembered that it's not x86. ;-)