Forth compiler for ZX

_dw · Post by **_dw** » Sat Dec 09, 2023 1:34 am

Hello,

I tried using macros to simulate loops in assembler and finally ended up with a compiler from a higher-level language into Z80 assembler. I chose Forth as the language because it is simpler than C and, in a sense, sits somewhere halfway between C and asm.

When I started I knew almost nothing about Forth, now I know a little more, but not much, because I still haven't used one and I'm looking at it more from the point of view of how to translate an interpreter into asm. Things like "DOES>" keep surprising me.

But the result is quite good. It's the clearest asm code I've ever seen: each word or group of words is clearly recognizable.

Forth code:

Code: Select all

: fib2 ( n1 -- n2 )
  0 1 rot 0 ?do 
    over + swap 
  loop 
  drop
;

: fib2_bench ( -- )
  1000 0 do 
    20 0 do
      I fib2 drop
    loop
  loop
;

fib2_bench
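For readers who do not read Forth, here is a rough Python model of what `fib2` does to the data stack, word by word. It is only an illustration: the list-based stack and the 16-bit masking are assumptions of this sketch, not output of M4 FORTH.

```python
def fib2(n):
    """Model of ': fib2 0 1 rot 0 ?do over + swap loop drop ;' on a list-based stack."""
    stack = [n]
    stack += [0, 1]                          # 0 1    ( n 0 1 )
    stack.append(stack.pop(-3))              # rot    ( 0 1 n )
    stack.append(0)                          # 0      ( 0 1 n 0 )
    start, limit = stack.pop(), stack.pop()  # ?do consumes start (top) and limit
    for _ in range(start, limit):            # ?do skips the body when limit == start
        stack.append(stack[-2])                              # over
        stack.append((stack.pop() + stack.pop()) & 0xFFFF)   # +  (16-bit cells assumed)
        stack[-1], stack[-2] = stack[-2], stack[-1]          # swap
    stack.pop()                              # drop
    return stack.pop()


print([fib2(i) for i in range(20)])
```

The benchmark word `fib2_bench` simply calls this for every `I` from 0 to 19 and repeats that 1000 times.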

M4 FORTH macro code (../fth2m4.sh fib2.fth > fib2.m4):

Code: Select all

include(`../M4/FIRST.M4')dnl
  ifdef __ORG
    org __ORG
  else
    org 32768
  endif
  INIT(60000)


CALL(_fib2_bench)
  STOP

COLON(_fib2,({{{{ n1 -- n2 }}}}))
  PUSH(0) PUSH(1) ROT PUSH(0) QUESTIONDO 
    OVER ADD SWAP 
  LOOP 
  DROP
SEMICOLON

COLON(_fib2_bench,({{{{ -- }}}}))
  PUSH(1000) PUSH(0) DO 
    PUSH(20) PUSH(0) DO
      I CALL(_fib2) DROP
    LOOP
  LOOP
SEMICOLON
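The word-to-macro mapping visible in the two listings above can be sketched in a few lines of Python. This is not how `fth2m4.sh` works internally; the table and the function `to_m4` are hypothetical and cover only the words used in this example.

```python
# Hypothetical mapping, read off the Forth and M4 listings above.
WORDS = {
    "rot": "ROT", "over": "OVER", "swap": "SWAP", "drop": "DROP",
    "+": "ADD", "do": "DO", "?do": "QUESTIONDO", "loop": "LOOP", "i": "I",
}


def to_m4(forth_source, user_words=()):
    """Translate a line of Forth words into M4 FORTH macro calls."""
    out = []
    for word in forth_source.split():
        if word.lstrip("-").isdigit():
            out.append(f"PUSH({word})")          # numeric literal
        elif word.lower() in WORDS:
            out.append(WORDS[word.lower()])      # built-in word
        elif word in user_words:
            out.append(f"CALL(_{word})")         # colon definition
        else:
            raise ValueError(f"unknown word: {word}")
    return " ".join(out)


print(to_m4("0 1 rot 0 ?do"))
print(to_m4("I fib2 drop", user_words={"fib2"}))
```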

Asm code (../compile.sh fib2 32768):

Code: Select all

  ifdef __ORG
    org __ORG
  else
    org 32768
  endif
  
;   ===  b e g i n  ===
    ld  [Stop+1], SP    ; 4:20      init   storing the original SP value when the "bye" word is used
    ld    L, 0x1A       ; 2:7       init   Upper screen
    call 0x1605         ; 3:17      init   Open channel
    ld   HL, 0xEA60     ; 3:10      init   Return address stack = 60000
    exx                 ; 1:4       init
    call _fib2_bench    ; 3:17      call ( -- )
Stop:                   ;           stop
    ld   SP, 0x0000     ; 3:10      stop   restoring the original SP value when the "bye" word is used
    ld   HL, 0x2758     ; 3:10      stop
    exx                 ; 1:4       stop
    ret                 ; 1:10      stop
;   =====  e n d  =====
;   ---  the beginning of a non-recursive function  ---
_fib2:                  ;           ( n1 -- n2 )
    pop  BC             ; 1:10      : ret
    ld  [_fib2_end+1],BC; 4:20      : ( ret -- )
                        ;[6:36]     0 1 rot   ( x -- 0 1 x )
    push DE             ; 1:11      0 1 rot
    ld   DE, 0x0000     ; 3:10      0 1 rot
    push DE             ; 1:11      0 1 rot
    inc   E             ; 1:4       0 1 rot
    ld    A, L          ; 1:4       0 ?do_101(m)   ( stop 0 -- )
    ld  [stp_lo101], A  ; 3:13      0 ?do_101(m)   lo stop
    ld    A, H          ; 1:4       0 ?do_101(m)
    ld  [stp_hi101], A  ; 3:13      0 ?do_101(m)   hi stop
    or    L             ; 1:4       0 ?do_101(m)
    ex   DE, HL         ; 1:4       0 ?do_101(m)
    pop  DE             ; 1:10      0 ?do_101(m)
    jp    z, exit101    ; 3:10      0 ?do_101(m)
    ld   BC, 0x0000     ; 3:10      0 ?do_101(m)
do101save:              ;           0 ?do_101(m)
    ld  [idx101], BC    ; 4:20      0 ?do_101(m)   save index
do101:                  ;           0 ?do_101(m)
    add  HL, DE         ; 1:11      over +
    ex   DE, HL         ; 1:4       swap   ( b a -- a b )
idx101 EQU $+1          ;[16:57/78] loop_101(m)   idx always points to a 16-bit index
    ld   BC, 0x0000     ; 3:10      loop_101(m)
    inc  BC             ; 1:6       loop_101(m)   index++
    ld    A, C          ; 1:4       loop_101(m)   lo new index
stp_lo101 EQU $+1       ;           loop_101(m)
    xor  0x00           ; 2:7       loop_101(m)   lo stop
    jp   nz, do101save  ; 3:10      loop_101(m)
    ld    A, B          ; 1:4       loop_101(m)   hi new index
stp_hi101 EQU $+1       ;           loop_101(m)
    xor  0x00           ; 2:7       loop_101(m)   hi stop
    jp   nz, do101save  ; 3:10      loop_101(m)
leave101:               ;           loop_101(m)
exit101:                ;           loop_101(m)
    ex   DE, HL         ; 1:4       drop
    pop  DE             ; 1:10      drop   ( a -- )
_fib2_end:
    jp   0x0000         ; 3:10      ;
;   ---------  end of non-recursive function  ---------
;   ---  the beginning of a non-recursive function  ---
_fib2_bench:            ;           ( -- )
    pop  BC             ; 1:10      : ret
    ld  [_fib2_bench_end+1],BC; 4:20      : ( ret -- )
    ld   BC, 0x0000     ; 3:10      1000 0 do_102(xm)
do102save:              ;           1000 0 do_102(xm)
    ld  [idx102],BC     ; 4:20      1000 0 do_102(xm)
    xor   A             ; 1:4       20 0 do_103 i_103(m)   8-bit loop   ( 20 0 -- i )
do103saveA:             ;           20 0 do_103 i_103(m)
    push DE             ; 1:11      20 0 do_103 i_103(m)
    ex   DE, HL         ; 1:4       20 0 do_103 i_103(m)
    ld  [idx103], A     ; 3:13      20 0 do_103 i_103(m)   save lo(index)
    ld    L, A          ; 1:4       20 0 do_103 i_103(m)
    ld    H, 0x00       ; 2:7       20 0 do_103 i_103(m)
    call _fib2          ; 3:17      call ( -- )
    ex   DE, HL         ; 1:4       drop
    pop  DE             ; 1:10      drop   ( a -- )
                        ;[9:32/32]  loop_103(m)   variant +1.ignore: 8-bit loop, run 20x
idx103 EQU $+1          ;           loop_103(m)   idx always points to a 16-bit index
    ld    A, 0          ; 2:7       loop_103(m)   0.. +1 ..(20), real_stop:0x0014
    db   0x00           ; 1:4       loop_103(m)   ignore opcode = hi(index) -> idx always points to a 16-bit index.
    inc   A             ; 1:4       loop_103(m)   index++
    cp   0x14           ; 2:7       loop_103(m)   lo(real_stop)
    jp   nz, do103saveA ; 3:10      loop_103(m)   index<>real_stop?
                        ;[16:57/58] loop_102(xm)   variant +1.default: step one, run 1000x
idx102 EQU $+1          ;           loop_102(xm)   idx always points to a 16-bit index
    ld   BC, 0x0000     ; 3:10      loop_102(xm)   0.. +1 ..(1000), real_stop:0x03E8
    inc  BC             ; 1:6       loop_102(xm)   index++
    ld    A, C          ; 1:4       loop_102(xm)
    xor  0xE8           ; 2:7       loop_102(xm)   lo(real_stop) first (232>3)
    jp   nz, do102save  ; 3:10      loop_102(xm)   3x false positive
    ld    A, B          ; 1:4       loop_102(xm)
    xor  0x03           ; 2:7       loop_102(xm)   hi(real_stop)
    jp   nz, do102save  ; 3:10      loop_102(xm)   232x false positive if he was first
leave102:               ;           loop_102(xm)
exit102:                ;           loop_102(xm)
_fib2_bench_end:
    jp   0x0000         ; 3:10      ;
;   ---------  end of non-recursive function  ---------
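The comments "3x false positive" and "232x false positive if he was first" in `loop_102` refer to the order in which the two bytes of the index are compared with the stop value. A small Python model of that exit test (the function name and return values are inventions of this sketch) reproduces both numbers:

```python
def count_checks(stop, low_first=True, start=0):
    """Step a 16-bit index from start to stop; return (iterations, false positives).

    A false positive is an iteration where the first compared byte matches
    the stop value but the second one does not.
    """
    index, iterations, false_positives = start, 0, 0
    while True:
        iterations += 1                        # loop body would run here
        index = (index + 1) & 0xFFFF           # inc BC
        lo_differs = (index ^ stop) & 0x00FF   # ld A,C / xor lo(stop)
        hi_differs = (index ^ stop) & 0xFF00   # ld A,B / xor hi(stop)
        if low_first:
            first, second = lo_differs, hi_differs
        else:
            first, second = hi_differs, lo_differs
        if first:
            continue                           # jp nz: first byte already differs
        if second:
            false_positives += 1               # first byte matched, second did not
            continue
        return iterations, false_positives


print(count_checks(1000, low_first=True))
print(count_checks(1000, low_first=False))
```

Testing the low byte first means the second comparison is reached only 4 times in 1000 iterations, which is why the generated code picks that order for a stop value of 0x03E8.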

The resulting 133-byte program is roughly 6 to 9 times faster than the C builds in the table below (5.65 s against 36.09 s to 49.19 s) and about 17 times smaller than the 2222-byte C binary. Over time, the C compiler has improved a bit (and also has the ability to write the correct parameter during compilation), but it is still far behind.

Code: Select all

|        Forth / C        |  Benchmark  | Time (sec/round) | Bytes
| :---------------------: | :---------: | :--------------- | :------
| M4_FORTH                | Fib2        | 0m5.65s          | 133
| M4_FORTH use data stack | Fib2s       | 0m5.03s          | 112
| M4_FORTH use assembler  | Fib2a       | 0m2.55s          | 96
| Boriel Basic zxbc 1.16.4| Fib2 a = a+c| 0m14.38s         |
| zcc z88dk v16209        | Fib2 a = a+c| 0m49.19s         |
| zcc z88dk v16209        | Fib2 a+= c  | 0m43.97s         |
| zcc z88dk v19766 -O2    | Fib2 a+= c  | 0m36.09s         | 2222
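The ratios implied by the table can be worked out directly. The snippet below only restates the timings from the table; the dictionary keys are shortened labels chosen for this sketch.

```python
# Seconds per round, copied from the table above.
times = {
    "M4_FORTH": 5.65,
    "M4_FORTH data stack": 5.03,
    "M4_FORTH assembler": 2.55,
    "Boriel Basic zxbc 1.16.4": 14.38,
    "z88dk v16209 a = a+c": 49.19,
    "z88dk v16209 a += c": 43.97,
    "z88dk v19766 -O2": 36.09,
}

baseline = times["M4_FORTH"]
ratios = {name: t / baseline for name, t in times.items()}

for name, ratio in ratios.items():
    print(f"{name:26s} {ratio:5.2f}x the M4_FORTH time")
```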

As a rough average, I estimate that a C compiler produces code 2x slower than this and handwritten assembler code 2x faster. Forth has a problem when it is forced to compile recursive code: it then becomes comparable to C, because the second stack has to be emulated.

But it still has the advantage of holding TOS (top of stack) in the register pair HL and NOS (next on stack) in DE. This in turn pushes the programmer towards algorithms that are more efficient because they work in registers rather than in memory.
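A minimal Python model of this register caching, based on the instruction sequences in the listing above (`push DE / ex DE,HL / ld HL,...` for a push, `ex DE,HL / pop DE` for `drop`, `ex DE,HL` for `swap`, `add HL,DE` for `over +`). The class and its `mem_accesses` counter are assumptions made for illustration only.

```python
class CachedStack:
    """Data stack with TOS cached in HL and NOS cached in DE."""

    def __init__(self):
        self.hl = 0            # top of stack
        self.de = 0            # next on stack
        self.mem = []          # the rest of the stack, in memory
        self.mem_accesses = 0  # how often the memory part was touched

    def push(self, value):     # push DE / ex DE,HL / ld HL,value
        self.mem.append(self.de)
        self.mem_accesses += 1
        self.de = self.hl
        self.hl = value & 0xFFFF

    def drop(self):            # ex DE,HL / pop DE
        self.hl = self.de
        self.de = self.mem.pop()
        self.mem_accesses += 1

    def swap(self):            # ex DE,HL
        self.hl, self.de = self.de, self.hl

    def over_add(self):        # add HL,DE
        self.hl = (self.hl + self.de) & 0xFFFF


s = CachedStack()
s.push(1)
s.push(2)
s.swap()       # registers only
s.over_add()   # registers only
s.drop()
print(s.hl, s.mem_accesses)
```

In this model only `push` and `drop` touch memory, which matches the way the inner loop of `_fib2` runs entirely in HL and DE.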

Thanks to the fact that words or combinations of words are just macros, it is easy to create a new word if necessary and thereby make the resulting program more efficient. If you can improve the translation, you are not dependent on what the C compiler can do.

ketmar · Post by **ketmar** » Sat Dec 09, 2023 3:54 am

i'd suggest to deviate from "standard Forth" a little. for example, get rid of "DO", and replace it with "FOR/ENDFOR":

Code: Select all

FOR  ( iteration-count -- )
10 FOR … ENDFOR  \ repeats 10 times, "I" returns current iteration counter

most of the time DO is used this way anyway, and you can simplify the code by counting iterations to 0 (and comparing with 0 instead of some arbitrary limit). still need to use 2 return stack entries, so "I" could calculate the proper value, though.
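One possible reading of this count-down scheme, modelled in Python. The layout of the two return-stack entries (original count and remaining count) and the 0-based value of "I" are assumptions of this sketch; the post does not specify them.

```python
def for_loop(count, body):
    """Run body(i) count times, counting the remaining iterations down to 0."""
    rstack = [count, count]            # [original count, remaining]
    while rstack[-1] != 0:             # exit test is a comparison with 0
        i = rstack[-2] - rstack[-1]    # "I" is derived from both entries
        body(i)
        rstack[-1] -= 1
    del rstack[-2:]                    # drop both loop entries


seen = []
for_loop(10, seen.append)
print(seen)
```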

also, in my Forth system i threw away all conditional loop constructs, and replaced them with generalised Parnas' iterator:

Code: Select all

<< cond ?^| action |?
   cond ?v| action |?
   … and so on, as much such checks as you need …
   else| action >>

"?^| … |?" executes the action, and then jumps back to "<<", and "?v| … |?" jumps to ">>".
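A rough Python model of such an iterator, with guards tried in order and each branch marked as "^" (repeat) or "v" (exit). The function name, the tuple layout, and the error raised when no guard matches are all assumptions of this sketch. The example computes a greatest common divisor by repeated subtraction.

```python
UP, DOWN = "^", "v"


def iterate(state, branches):
    """branches is a list of (guard, action, direction) tried in order."""
    while True:                                  # <<
        for guard, action, direction in branches:
            if guard(state):
                action(state)
                break
        else:
            raise RuntimeError("no guard matched")
        if direction == DOWN:                    # "v" branch: leave at >>
            return state
        # "^" branch: fall through and start again at <<


state = iterate(
    {"a": 48, "b": 18},
    [
        (lambda s: s["a"] > s["b"], lambda s: s.update(a=s["a"] - s["b"]), UP),
        (lambda s: s["b"] > s["a"], lambda s: s.update(b=s["b"] - s["a"]), UP),
        (lambda s: True, lambda s: None, DOWN),  # plays the role of else|
    ],
)
print(state["a"])
```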

actually, i found that "FOR" is not really really necessary: with words like "1+R!", "1-R!", "R@" and "R1@" you can have quite efficient loops (basically, writing "FOR" manually if you need to). doing manual loop index manipulations may hurt readability a little, but in exchange you have full control over loop step, and you can avoid keeping the original limit if you don't need to emulate "I".

also, if you're doing cross-compiling, it is possible to implement native code generator too. with some automatic inlining and SSA-based codegen it can get rid of most stack manipulations, constantly beating the crap out of most Z80 C compilers. i'm planning to release such system somewhere in the next year (it is in a PoC state now).

marenja · Post by **marenja** » Sat Dec 09, 2023 4:30 am

ketmar wrote: ↑Sat Dec 09, 2023 3:54 am actually, i found that "FOR" is not really really necessary: with words like "1+R!", "1-R!", "R@" and "R1@" you can have quite efficient loops (basically, writing "FOR" manually if you need to). doing manual loop index manipulations may hurt readability a little, but in exchange you have full control over loop step, and you can avoid keeping the original limit if you don't need to emulate "I".

Could you please give an example how we can write with those funny R operators

Code: Select all

FOR i = 17 TO 29 STEP 3
FOR i = 17 TO 1 STEP -1
FOR i = 0 TO 5

is Parnas iterator something like endless loop ?

Code: Select all

loop begin << 
switch {
    case cond#1:
        action#1
        continue; // next iteration
    case cond#2:
        action#2
        break;
    case cond#3:
        action#3
        break;
   … and so on, as much such checks as you need …
   else: action
        action
        break;

 >>  loop end

_dw · Post by **_dw** » Sat Dec 09, 2023 6:04 am

ketmar wrote: ↑Sat Dec 09, 2023 3:54 am i'd suggest to deviate from "standard Forth" a little. for example, get rid of "DO", and replace it with "FOR/ENDFOR":

The FOR NEXT loop is implemented.
But it's not that much different from 0 N DO -1 +LOOP, because the loops are quite optimized.
Forth's way of handling loops is a bit impractical compared to C: the index is only 16-bit, and the loop still has to be handled in such a way that the index can overflow and the loop then terminates. The only correct termination condition is that the STOP value lies between the current index and index+step.

Another disadvantage is that I do not test whether the loop actually uses its index: all loops are built so that the index could be used, so both the values and the direction must fit.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'PUSH(30) PUSH(17) DO PUSH(3) ADDLOOP'
    ld    A, 0x11       ; 2:7       30 17 do_101(xm)   8-bit loop
do101saveA:             ;           30 17 do_101(xm)
    ld  [idx101],A      ; 3:13      30 17 do_101(xm)
                       ;[10:48/35]  3 +loop_101(xm)   variant +X.B: 8-bit loop and hi(index)=0 and hi(begin)=hi(real_stop), run 5x
idx101 EQU $+1          ;           3 +loop_101(xm)   idx always points to a 16-bit index
    ld    A, 0          ; 2:7       3 +loop_101(xm)   17.. +3 ..(30), real_stop:0x0020
    nop                 ; 1:4       3 +loop_101(xm)   hi(index) = 0 = nop -> idx always points to a 16-bit index.
    add   A, 0x03       ; 2:7       3 +loop_101(xm)   index+lo(step)
    cp   0x20           ; 2:7       3 +loop_101(xm)   lo(real_stop)
    jp   nz, do101saveA ; 3:10      3 +loop_101(xm)   index<>real_stop?
leave101:               ;           3 +loop_101(xm)
exit101:                ;           3 +loop_101(xm)
; seconds: 0           ;[15:55]
dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'PUSH(1) PUSH(17) DO PUSH(-1) ADDLOOP'
    ld    A, 0x11       ; 2:7       1 17 do_101(xm)   8-bit loop
do101saveA:             ;           1 17 do_101(xm)
    ld  [idx101],A      ; 3:13      1 17 do_101(xm)
                        ;[7:38/25]  -1 +loop_101(xm)   variant -1.nop(0): 8-bit loop and hi(index)=0 and real_stop=0, run 17x
idx101 EQU $+1          ;           -1 +loop_101(xm)   idx always points to a 16-bit index
    ld    A, 0          ; 2:7       -1 +loop_101(xm)   17.. -1 ..1, real_stop:0x0000
    nop                 ; 1:4       -1 +loop_101(xm)   hi(index) = 0 = nop -> idx always points to a 16-bit index.
    dec   A             ; 1:4       -1 +loop_101(xm)   index--
    jp   nz, do101saveA ; 3:10      -1 +loop_101(xm)   index<>real_stop?
leave101:               ;           -1 +loop_101(xm)
exit101:                ;           -1 +loop_101(xm)
; seconds: 1           ;[12:45]
dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'PUSH(6) PUSH(0) DO LOOP'
    ld    A, 0x00       ; 2:7       6 0 do_101(xm)   8-bit loop
do101saveA:             ;           6 0 do_101(xm)
    ld  [idx101],A      ; 3:13      6 0 do_101(xm)
                        ;[9:45/32]  loop_101(xm)   variant +1.nop: 8-bit loop and hi(index)=0, run 6x
idx101 EQU $+1          ;           loop_101(xm)   idx always points to a 16-bit index
    ld    A, 0          ; 2:7       loop_101(xm)   0.. +1 ..(6), real_stop:0x0006
    nop                 ; 1:4       loop_101(xm)   hi(index) = 0 = nop -> idx always points to a 16-bit index.
    inc   A             ; 1:4       loop_101(xm)   index++
    cp   0x06           ; 2:7       loop_101(xm)   lo(real_stop)
    jp   nz, do101saveA ; 3:10      loop_101(xm)   index-real_stop
leave101:               ;           loop_101(xm)
exit101:                ;           loop_101(xm)
; seconds: 0           ;[14:52]
dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'PUSH(0) PUSH(5) DO PUSH(-1) ADDLOOP'
    ld    A, 0x05       ; 2:7       0 5 do_101(xm)   8-bit loop
do101saveA:             ;           0 5 do_101(xm)
    ld  [idx101],A      ; 3:13      0 5 do_101(xm)
                        ;[7:38/25]  -1 +loop_101(xm)   variant -1.nop(-1): 8-bit loop and hi(index)=0 and index<=128 and real_stop=-1, run 6x
idx101 EQU $+1          ;           -1 +loop_101(xm)   idx always points to a 16-bit index
    ld    A, 0          ; 2:7       -1 +loop_101(xm)   5.. -1 ..0, real_stop:0xFFFF
    nop                 ; 1:4       -1 +loop_101(xm)   hi(index) = 0 = nop -> idx always points to a 16-bit index.
    dec   A             ; 1:4       -1 +loop_101(xm)   index--
    jp    p, do101saveA ; 3:10      -1 +loop_101(xm)   index<>real_stop?
leave101:               ;           -1 +loop_101(xm)
exit101:                ;           -1 +loop_101(xm)
; seconds: 0           ;[12:45]
dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'PUSH(5) FOR NEXT'
    ld    A, 5          ; 2:7       5 for_101   ( -- )
for101:                 ;           5 for_101
    ld  [idx101],A      ; 3:13      5 for_101   save index
idx101 EQU $+1          ;           next_101
    ld    A, 0x00       ; 2:7       next_101   idx always points to a 16-bit index
    nop                 ; 1:4       next_101
    sub  0x01           ; 2:7       next_101   index--
    jp   nc, for101     ; 3:10      next_101
leave101:               ;           next_101
; seconds: 0           ;[13:48]
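The run counts and `real_stop` values printed in the comments above can be reproduced with a small Python model of the termination rule (the loop ends once the index steps across the boundary next to the stop value, with 16-bit wraparound). This is a sketch of the standard `+LOOP` rule as I understand it, not code taken from M4 FORTH; `plus_loop` is a name made up here.

```python
def plus_loop(limit, start, step):
    """Model 'limit start DO ... step +LOOP'; return (runs, real_stop)."""
    if step == 0:
        raise ValueError("step must be non-zero")
    limit &= 0xFFFF
    index = start & 0xFFFF
    runs = 0
    while True:
        runs += 1                      # loop body executes here
        crossed = False
        for k in range(abs(step)):     # walk the step one unit at a time
            if step > 0:
                # counting up: crossing happens when we step onto limit
                crossed |= ((index + k + 1) & 0xFFFF) == limit
            else:
                # counting down: crossing happens when we step off limit
                crossed |= ((index - k) & 0xFFFF) == limit
        index = (index + step) & 0xFFFF
        if crossed:
            return runs, index         # index now equals "real_stop"


for args in [(30, 17, 3), (1, 17, -1), (6, 0, 1), (0, 5, -1)]:
    runs, real_stop = plus_loop(*args)
    print(args, runs, hex(real_stop))
```

Because `real_stop` is known at compile time in these cases, the generated code can replace the range test with a single equality test against it.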

In reality there are many more loop variants of this type. Which one is used depends on what is known about the loop at compile time and on where the loop values are kept: in memory, directly inside the instructions as above; on the data stack; or, when there is no other choice, on the return address stack.

RAS, the return address stack (I don't recommend it unless absolutely necessary)
https://codeberg.org/DW0RKiN/M4_FORTH/s ... xloop_r.m4
https://codeberg.org/DW0RKiN/M4_FORTH/s ... loop_r.asm

memory (default method)
https://codeberg.org/DW0RKiN/M4_FORTH/s ... xloop_m.m4
https://codeberg.org/DW0RKiN/M4_FORTH/s ... loop_m.asm

stack (special)
https://codeberg.org/DW0RKiN/M4_FORTH/s ... xloop_s.m4
https://codeberg.org/DW0RKiN/M4_FORTH/s ... loop_s.asm

__ASM is used as an artificial word separator in these tests, because then the compiler does not apply any optimizations.
I recommend you try it yourself with the check_word.sh script if you have Linux.

_dw · Post by **_dw** » Sat Dec 09, 2023 6:15 am

I see that Google Translate messed up some things a bit.

_dw · Post by **_dw** » Sat Dec 09, 2023 6:30 am

Infinite loops can be solved via BEGIN AGAIN/UNTIL

It is important to understand that this is a compiler and not an interpreter, so it will have different characteristics, strengths and weaknesses compared to a conventional Forth. The code will be faster, but always bigger. The most efficient way of writing something in Forth may not be the best for compilation. And as I said, if you can avoid the RAS, that is the right way for M4 FORTH. Words like >R and R> will be slow.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'TO_R R_FROM'
                        ;[9:65]     >r   ( c b a -- c b ) ( R: -- a )
    ex  [SP],HL         ; 1:19      >r   a . b c
    ex   DE, HL         ; 1:4       >r   a . c b
    exx                 ; 1:4       >r
    pop  DE             ; 1:10      >r
    dec  HL             ; 1:6       >r
    ld  [HL],D          ; 1:7       >r
    dec   L             ; 1:4       >r
    ld  [HL],E          ; 1:7       >r
    exx                 ; 1:4       >r
                        ;[9:66]     r>   ( b a -- b a i ) ( R: i -- )
    exx                 ; 1:4       r>
    ld    E,[HL]        ; 1:7       r>
    inc   L             ; 1:4       r>
    ld    D,[HL]        ; 1:7       r>
    inc  HL             ; 1:6       r>
    push DE             ; 1:11      r>
    exx                 ; 1:4       r>   i . b a
    ex   DE, HL         ; 1:4       r>   i . a b
    ex  [SP],HL         ; 1:19      r>   b . a i

The RAS pointer is kept in the shadow HL. Everything is emulated, and it is still awkward to get data there from the non-shadow HL and DE. I didn't want to do it through IX or IY; that is even worse on the Z80.

The same code written differently.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'VARIABLE(z) PUSH(z) STORE PUSH(z) FETCH'
                        ;[5:30]     z !   ( x -- )  addr=z
    ld  [z], HL         ; 3:16      z !
    ex   DE, HL         ; 1:4       z !
    pop  DE             ; 1:10      z !
    push DE             ; 1:11      z @   ( -- [z] )
    ex   DE, HL         ; 1:4       z @
    ld   HL,[z]         ; 3:16      z @

VARIABLE_SECTION:

z:                      ;           variable z
    dw 0                ;           variable z

;# ============================================================================
  if ($<0x0100)
    .error Overflow 64k! over 0..255 bytes
  endif
  if ($<0x0200)
    .error Overflow 64k! over 256..511 bytes
  endif
  if ($<0x0400)
    .error Overflow 64k! over 512..1023 bytes
  endif
  if ($<0x0800)
    .error Overflow 64k! over 1024..2047 bytes
  endif
  if ($<0x1000)
    .error Overflow 64k! over 2048..4095 bytes
  endif
  if ($<0x2000)
    .error Overflow 64k! over 4096..8191 bytes
  endif
  if ($<0x3000)
    .error Overflow 64k! over 8192..12287 bytes
  endif
  if ($<0x4000)
    .error Overflow 64k! over 12288..16383 bytes
  endif
  if ($>0xFF00)
    .warning Data ends at 0xFF00+ address!
  endif
; seconds: 0           ;[10:61]

It is much faster (61 T-states against 65 + 66 = 131), even though at first glance it seems that a combined word PUSH_STORE_PUSH_FETCH would be useful, if only to remove the POP DE followed immediately by PUSH DE.
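The cycle totals behind this comparison can be checked by summing the per-instruction timings from the two listings above. The lists below are copied from the second number in each `bytes:cycles` comment; nothing else is assumed.

```python
# T-states per instruction, in listing order.
to_r = [19, 4, 4, 10, 6, 7, 4, 7, 4]        # >r
r_from = [4, 7, 4, 7, 6, 11, 4, 4, 19]      # r>
store_fetch = [16, 4, 10, 11, 4, 16]        # z ! z @

via_return_stack = sum(to_r) + sum(r_from)
via_variable = sum(store_fetch)

print(sum(to_r), sum(r_from), via_return_stack)
print(via_variable)
print(f"variable version needs {via_variable / via_return_stack:.0%} of the cycles")
```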

PS: I have fixed this:

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'PUSH(z) STORE PUSH(z) FETCH'
    ld  [z], HL         ; 3:16      z !
    ld   HL,[z]         ; 3:16      z @   ( x -- [z] )
; seconds: 0           ;[ 6:32]

_dw · Post by **_dw** » Sat Dec 09, 2023 7:11 am

It actually produces slightly different words/tokens than it shows, because I discovered that the base word in Forth may not be the best piece of code. Sometimes a compound word can be shorter than the basic word. It depends on the implementation.

If I turn on VERBOSE...

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/Testing$ ../check_word.sh 'VERBOSE(1) PUSH(z) STORE PUSH(z) FETCH'
   ...new __TOKEN_PUSHS(z) "{z}" --> token(1) __TOKEN_PUSHS(z) "{{z}}"
   ...new __TOKEN_2DUP_STORE_1ADD() "!" --> token(2) __TOKEN_2DUP_STORE_1ADD "!"
   ...new __TOKEN_2DROP() "__dtto" --> token(2) __TOKEN_DROP "__dtto"
   ...new __TOKEN_PUSHS(z) "{z}" --> token(3) __TOKEN_PUSHS(z) "{{z}}"
   ...new __TOKEN_FETCH() "@" --> token(3) __TOKEN_PUSH_FETCH(z) "{{z}} @"

   ...check all tokens
   ...second pass(1) __TOKEN_DUP_PUSH_STORE(z) "{z} !" --> __TOKEN_DUP_PUSH_STORE(z) "{{z}} !"
   ...second pass(2) __TOKEN_DROP() "__dtto" --> same
         ...token(2) --> __TOKEN_NOPE
   ...second pass(3) __TOKEN_PUSH_FETCH(z) "{z} @" --> __TOKEN_DROP_PUSH_FETCH(z) "{{z}} @"

   ...check all tokens2
   ...third pass(1) __TOKEN_DUP_PUSH_STORE(z) "{z} !" --> __TOKEN_DUP_PUSH_STORE(z) "{{z}} !"
   ...third pass(2) __TOKEN_NOPE() "" --> same
        ...token(2) --> __TOKEN_NOPE
   ...third pass(3) __TOKEN_DROP_PUSH_FETCH(z) "{z} @" --> __TOKEN_DROP_PUSH_FETCH(z) "{{z}} @"

   ...create(1) __TOKEN_DUP_PUSH_STORE(z) "{{z}} !"
   ...create(2) __TOKEN_NOPE ""
   ...create(3) __TOKEN_DROP_PUSH_FETCH(z) "{{z}} @"
    ld  [z], HL         ; 3:16      z !
    ld   HL,[z]         ; 3:16      z @   ( x -- [z] )
; seconds: 0           ;[ 6:32]

So you can see that the code consists of two words: DUP_PUSH_STORE(z) and DROP_PUSH_FETCH(z). I have just added DROP_PUSH_FETCH(z) to make it efficient, plus one token rule in the second pass, so that when it finds a DROP token followed by PUSH_FETCH it merges them.
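The kind of token rule described here can be modelled in Python as a peephole pass over adjacent tokens. The rule table, the `(name, argument)` tuple format, and the function `merge` are hypothetical; the real passes live in the M4 macros and handle many more cases.

```python
# One rule: DROP followed by PUSH_FETCH becomes DROP_PUSH_FETCH.
RULES = {("DROP", "PUSH_FETCH"): "DROP_PUSH_FETCH"}


def merge(tokens, rules=RULES):
    """Fuse adjacent (name, arg) tokens whenever a rule matches the pair."""
    out = []
    for name, arg in tokens:
        if out and (out[-1][0], name) in rules:
            prev_name, prev_arg = out.pop()
            fused_arg = arg if arg is not None else prev_arg
            out.append((rules[(prev_name, name)], fused_arg))
        else:
            out.append((name, arg))
    return out


tokens = [("DUP_PUSH_STORE", "z"), ("DROP", None), ("PUSH_FETCH", "z")]
print(merge(tokens))
```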

PS: I will upload it to the website after midnight (English time)

ketmar · Post by **ketmar** » Sat Dec 09, 2023 10:23 am

marenja wrote: ↑Sat Dec 09, 2023 4:30 am Could you please give an example how we can write with those funny R operators
Code: Select all
FOR i = 17 TO 29 STEP 3

Code: Select all

17 >R << R@ 29 <= ?^| …actions… 3 +R! |? ELSE| RDROP >>

i think the others are obvious now. ;-) it is slightly more complicated when loop limits are not known at compile time, and if you need to use "I" value, but the basic idea is the same.
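For comparison, the same loop modelled in Python, with a list standing in for the return stack. It assumes that "+R!" adds to the top return-stack entry and that "<=" is a plain signed comparison; both are readings of the one-liner above, not documented behaviour.

```python
def r_loop(start, limit, step):
    """Model of: start >R << R@ limit <= ?^| actions step +R! |? ELSE| RDROP >>"""
    rstack = [start]                    # 17 >R
    visited = []
    while True:                         # <<
        if rstack[-1] <= limit:         # R@ 29 <= ?^|
            visited.append(rstack[-1])  #   ...actions...
            rstack[-1] += step          #   3 +R!
            continue                    # |?  jump back to <<
        rstack.pop()                    # ELSE| RDROP
        break                           # >>
    return visited


print(r_loop(17, 29, 3))
```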

marenja wrote: ↑Sat Dec 09, 2023 4:30 am is Parnas iterator something like endless loop ?

it is like a generalized loop, with several repeat/exit conditions and actions. note that ">>" doesn't loop back, it simply marks the exit point for "v" branches. actually, in my implementation it bombs with runtime error, because if you want to simply exit the iterator, you should use explicit "ELSE|".

the idea is that all possible cases should be caught by guards. if you forgot some case, this is a bug. "ELSE|" is just a shortcut for "TRUE ?v| … |?".

you got "^" and "v" marks absolutely right, btw. ;-) of course, it cannot be emulated with C "switch", because guards (conditions) could be as complex as you want them to be, not just simple constants. but the idea is right, yeah.

this thing could also be used as a "generalized switch" too, if all branches are down ("v"). basically, you need NO other control flow operators at all, because "IF" is just a Parnas' iterator with "v" too. i kept "IF"s only to make simple conditionals shorter to write. but i eliminated all other loop operators from my system, because i found that i never actually used them after introducing Parnas' iterator.

in the original paper, David Parnas gives the way to make proofs on your code, so it doesn't contradict Dijkstra's structured programming. tbh, i was SO proud when i invented this thing… until i did my search. ;-)

btw, it very naturally maps to Forth, because in the original paper the actions are preceded with the respective guards too. so in Forth it looks almost like a copypasta of the paper ideas.

ketmar · Post by **ketmar** » Sat Dec 09, 2023 10:46 am

_dw wrote: ↑Sat Dec 09, 2023 7:11 am It actually produces slightly different words/tokens than it shows. Because I discovered that the base word in forth may not be the best piece of code. Sometimes a compound word can be shorter than the basic word. It depends on the implementation.

yep, this is so-called "superinstruction optimisation", some Forth systems are using that. the compiler tracks several last words compiled, and optimises them into one "superinstruction", if possible. my system does that too, albeit in very limited manner. it basically folds things like "LIT someop" to "(LIT-someop)" superinstructions. this is very simple thing to implement, so i did it. also, instead of calling a variable word, it directly compiles PFA of the variable as a literal (i see that you're doing something like this too), so "var @" is properly folded to "(#@) <address>" in threaded code. this way using variables is almost as cheap as using simple literals.

my research with real-world Forth applications (Z80 asm, for example, and some others) shows that there is no practical reason to explore this way any further. the real gains come from SSA-based native code generator with proper register allocation. Z80 doesn't have a lot of registers, tho, so my system isn't able to do much. yet it can aggressively inline code, reorder some evaluations, and such. still a win, because if you're writing your code "Forth way" (with many small words), inliner is able to eliminate a lot of word calling overhead.

but on x86 this codegen starts to shine, and even more on x86_64. there are a lot of registers to use, so it can eliminate most stack acrobatics. it also tracks comparisons, and avoids generating proper true/false values if it sees that a comparison is only used in a branch. it tries to move it closer to a branch, and use the right branch instruction instead.

i'm not sure if i'll really finish that thing for "big CPUs", though: my x86 Forth system is fast enough, and i prefer to keep things simple. yet it was fun thing to try.

p.s.: i am talking about cross-compiler here, btw. it is written in x86 Forth, and cross-compiles to Z80. on Z80, it is STC kind of threaded code. i'm still experimenting with it, though. the compiler performs tail call optimisation too, so most words end with the direct jump to the last compiled word instead of the usual "word EXIT" sequence.
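The tail call rewrite mentioned in the p.s. can be sketched in Python over a toy instruction list. The tuple encoding (`"CALL"`, `"EXIT"`, `"JP"`) is invented for this illustration and says nothing about how the actual cross-compiler represents code.

```python
def tail_call_optimise(code):
    """Replace a trailing 'CALL word; EXIT' pair with a single 'JP word'."""
    if len(code) >= 2 and code[-1] == ("EXIT",) and code[-2][0] == "CALL":
        return code[:-2] + [("JP", code[-2][1])]
    return code


word = [("CALL", "DUP"), ("CALL", "MULTIPLY"), ("EXIT",)]
print(tail_call_optimise(word))
```

The jump saves the return from the callee into this word plus this word's own return: the callee's return goes straight back to the original caller.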

Wall_Axe · Post by **Wall_Axe** » Sat Dec 09, 2023 12:36 pm

Looks interesting thanks.
I'll have a go at it.

Is it possible to create a string, then reference it in different places in the code? (Like a string pointer)

Can I dereference a pointer? (As in store a 16bit address at a certain address....then retrieve a byte or two from that stored address)

Wall_Axe · Post by **Wall_Axe** » Sat Dec 09, 2023 2:30 pm

the snake example is good, seems faster than if it were made in BASIC.

I'm trying to compile it using Cygwin, which is a bash shell on Windows

I tried doing sh compile.sh Game/snake
but it said it couldn't find FIRST.M4 because it was looking in ../FIRST.M4

So I moved the compile.sh into the same folder as snake.

it's doing something using z88dk M4.exe and running into errors of not being able to find the command printf

probably because I'm not using real Linux

_dw · Post by **_dw** » Sat Dec 09, 2023 2:55 pm

I've provided support for strings as written in standard Forth, but I haven't written any test programs to see how usable it is. By this I mean that if you expect some wild operations on chains (I don't understand why it is translated once as "chains", which is the literal translation of the Czech term, and the second time as "string"), where they change during the program and move around in memory, then I don't see much support for that.

If you look at what words Forth has for strings, this is about it:

>>> ." <<< where the space in front serves only as a separator and is sometimes not necessary.
The space after it is also a separator and is required. It is followed by printable_text and then >>>"<<< without a separator.
That ends the word, so there doesn't have to be a space after it.

I write it as PRINT(printable_text). This is a macro with a parameter that is printed. This method is the only one that will be translated by the fth2m4 script.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'PRINT({"Hello, Word!"})'
    push DE             ; 1:11      print     "Hello, Word!"
    ld   BC, size101    ; 3:10      print     Length of string101
    ld   DE, string101  ; 3:10      print     Address of string101
    call 0x203C         ; 3:17      print     Print our string with ZX 48K ROM
    pop  DE             ; 1:10      print

STRING_SECTION:
string101:
    db "Hello, Word!"
size101              EQU $ - string101
; seconds: 0           ;[11:58]

But if you write the code directly in M4, there is support for PRINT_Z, where a character with ASCII value 0 is appended to the end of the text. The disadvantage shows up if the text itself contains a 0, which can easily happen if you have something like "AT coordinates" or color settings.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'PRINT_Z({"Hello, Word!"})'
    ld   BC, string101  ; 3:10      print_z   Address of null-terminated string101
    call PRINT_STRING_Z ; 3:17      print_z
;#------------------------------------------------------------------------------
;# Print C-style stringZ
;# In: BC = addr
;# Out: BC = addr zero + 1
    rst   0x10          ; 1:11      print_string_z   putchar(reg A) with ZX 48K ROM
PRINT_STRING_Z:         ;           print_string_z
    ld    A,[BC]        ; 1:7       print_string_z
    inc  BC             ; 1:6       print_string_z
    or    A             ; 1:4       print_string_z
    jp   nz, $-4        ; 3:10      print_string_z
    ret                 ; 1:10      print_string_z

STRING_SECTION:
string101:
    db "Hello, Word!", 0x00
size101              EQU $ - string101
; seconds: 0           ;[14:75]

Then there is PRINT_I, where the value 128 is added to the last character. Here, in turn, you cannot print characters greater than 127.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'PRINT_I({"Hello, Word!"})'
    ld   BC, string101  ; 3:10      print_i   Address of string101 ending with inverted most significant bit
    call PRINT_STRING_I ; 3:17      print_i
;#------------------------------------------------------------------------------
;# Print string ending with inverted most significant bit
;# In: BC = addr string_imsb
;# Out: BC = addr last_char + 1
    rst   0x10          ; 1:11      print_string_i   putchar(reg A) with ZX 48K ROM
PRINT_STRING_I:         ;           print_string_i
    ld    A,[BC]        ; 1:7       print_string_i
    inc  BC             ; 1:6       print_string_i
    or    A             ; 1:4       print_string_i
    jp    p, $-4        ; 3:10      print_string_i
    and  0x7f           ; 2:7       print_string_i
    rst   0x10          ; 1:11      print_string_i   putchar(reg A) with ZX 48K ROM
    ret                 ; 1:10      print_string_i

STRING_SECTION:
string101:
    db "Hello, Word","!" + 0x80
size101              EQU $ - string101
; seconds: 0           ;[17:93]
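
The trade-off between the zero-terminated and the inverted-bit variants can be sketched in Python. This is only a model of the two loops above (PRINT_STRING_Z stops on a zero byte, PRINT_STRING_I stops on a byte with bit 7 set); the function names are made up for illustration. The example with byte 22 assumes the ZX Spectrum AT control code followed by row and column.

```python
def encode_z(text):
    """Zero-terminated form, as in: db "Hello, Word!", 0x00"""
    return text + b"\x00"


def encode_i(text):
    """Last character with bit 7 set, as in: db "Hello, Word","!" + 0x80"""
    if not text or any(byte > 127 for byte in text):
        raise ValueError("needs non-empty text with all characters below 128")
    return text[:-1] + bytes([text[-1] | 0x80])


def decode_z(memory, addr=0):
    """Collect bytes until a zero byte is read (the 'jp nz' loop)."""
    out = bytearray()
    while memory[addr] != 0:
        out.append(memory[addr])
        addr += 1
    return bytes(out)


def decode_i(memory, addr=0):
    """Collect bytes while bit 7 is clear (the 'jp p' loop), then the masked last one."""
    out = bytearray()
    while memory[addr] < 0x80:
        out.append(memory[addr])
        addr += 1
    out.append(memory[addr] & 0x7F)
    return bytes(out)


if __name__ == "__main__":
    print(decode_z(encode_z(b"Hello, Word!")))
    print(decode_i(encode_i(b"Hello, Word!")))
    # AT 0,5 contains a zero byte, so the zero-terminated form is cut short.
    print(decode_z(encode_z(b"\x16\x00\x05Hi")))
```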

As you can see in the example, the strings are stored after the code as constants (if they are modified, the behavior is undefined when the program is restarted), but before the variables (if there are any).

Strings are automatically numbered starting above 100 (100 itself is not used). I chose that initially for easy text alignment.

Text strings have the following problems. Commas: if the text contains them, the macro will think it has been given several parameters rather than one string. Therefore, it is important to quote the text. In M4 FORTH, quoting is set to {}. But that's not enough, because the strings still have to be stored in memory, so they must also be quoted for the translator from asm to bin. So if it's text, it should be between "".
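
The comma problem can be illustrated with a small Python model of macro argument splitting. This is a simplified sketch, not how M4 is implemented: it only splits on commas outside {} and strips the outermost quote level, and the function name is hypothetical.

```python
def split_macro_args(arg_text, open_quote="{", close_quote="}"):
    """Split on commas that are not inside quotes; strip the outermost quote level."""
    args, current, depth = [], [], 0
    for ch in arg_text:
        if ch == open_quote:
            depth += 1
            if depth == 1:
                continue          # drop the outermost opening quote
        elif ch == close_quote:
            depth -= 1
            if depth == 0:
                continue          # drop the outermost closing quote
        if ch == "," and depth == 0:
            args.append("".join(current))
            current = []
        else:
            current.append(ch)
    args.append("".join(current))
    return args


if __name__ == "__main__":
    print(split_macro_args('"Hello, Word!"'))    # the comma splits the text in two
    print(split_macro_args('{"Hello, Word!"}'))  # one argument, "" kept for the assembler
```

In this model the {} pair protects the comma from the macro processor, while the "" pair survives into the output and protects the text for the assembler.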

Forth can also store strings via the word STRING (STRING_Z or STRING_I), where the output is the address and the length of the string (for the _Z and _I variants, just the address). It is printed using TYPE (TYPE_Z or TYPE_I), if I'm not mistaken. I'm not very good at Forth.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'STRING({"Hello, Word!"}) TYPE'
    push DE             ; 1:11      string    ( -- addr size )
    push HL             ; 1:11      string    "Hello, Word!"
    ld   DE, string101  ; 3:10      string    Address of string101
    ld   HL, size101    ; 3:10      string    Length of string101
    call PRINT_TYPE     ; 3:17      type   ( addr n -- )
;#==============================================================================
;# ( addr n -- )
;# print n chars from addr
;#  Input: HL, DE
;# Output: Print decimal number in HL
;# Pollutes: AF, BC, DE
PRINT_TYPE:             ;[10:76]    print_string
    ld    B, H          ; 1:4       print_string
    ld    C, L          ; 1:4       print_string   BC = length of string to print
    call 0x203C         ; 3:17      print_string   Use ZX 48K ROM
    pop  AF             ; 1:10      print_string   load ret
    pop  HL             ; 1:10      print_string
    pop  DE             ; 1:10      print_string
    push AF             ; 1:11      print_string   save ret
    ret                 ; 1:10      print_string

STRING_SECTION:
string101:
    db "Hello, Word!"
size101              EQU $ - string101
; seconds: 0           ;[21:135]

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'STRING_Z({"Hello, Word!"}) TYPE_Z'
    push DE             ; 1:11      string_z   ( -- addr )
    ex   DE, HL         ; 1:4       string_z   "Hello, Word!"
    ld   HL, string101  ; 3:10      string_z   Address of null-terminated string101
    call PRINT_TYPE_Z   ; 3:17      type_z   ( addr -- )
    ex   DE, HL         ; 1:4       type_z
    pop  DE             ; 1:10      type_z
;#==============================================================================
;# Print C-style stringZ
;# In: HL = addr stringZ
;# Out: BC = addr zero
PRINT_TYPE_Z:           ;           print_type_z
    ld    B, H          ; 1:4       print_type_z
    ld    C, L          ; 1:4       print_type_z   BC = addr stringZ
    db   0x3E           ; 1:7       print_type_z   ld    A, 0xD7
    ; fall to PRINT_STRING_Z
;#------------------------------------------------------------------------------
;# Print C-style stringZ
;# In: BC = addr
;# Out: BC = addr zero + 1
    rst   0x10          ; 1:11      print_string_z   putchar(reg A) with ZX 48K ROM
PRINT_STRING_Z:         ;           print_string_z
    ld    A,[BC]        ; 1:7       print_string_z
    inc  BC             ; 1:6       print_string_z
    or    A             ; 1:4       print_string_z
    jp   nz, $-4        ; 3:10      print_string_z
    ret                 ; 1:10      print_string_z

STRING_SECTION:
string101:
    db "Hello, Word!", 0x00
size101              EQU $ - string101
; seconds: 0           ;[21:119]

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'STRING_I({"Hello, Word!"}) TYPE_I'
    push DE             ; 1:11      string_i   ( -- addr )
    ex   DE, HL         ; 1:4       string_i   "Hello, Word!"
    ld   HL, string101  ; 3:10      string_i   Address of string101 ending with inverted most significant bit
    call PRINT_TYPE_I   ; 3:17      type_i   ( addr -- addr )
    ex   DE, HL         ; 1:4       type_i
    pop  DE             ; 1:10      type_i   ( a -- )
;#==============================================================================
;# Print string ending with inverted most significant bit
;# In: HL = addr string_imsb
;# Out: BC = addr last_char + 1
PRINT_TYPE_I:           ;           print_type_i
    ld    B, H          ; 1:4       print_type_i
    ld    C, L          ; 1:4       print_type_i   BC = addr string_imsb
    db   0x3E           ; 1:7       print_type_i   ld    A, 0xD7
    ; fall to PRINT_STRING_Z
;#------------------------------------------------------------------------------
;# Print string ending with inverted most significant bit
;# In: BC = addr string_imsb
;# Out: BC = addr last_char + 1
    rst   0x10          ; 1:11      print_string_i   putchar(reg A) with ZX 48K ROM
PRINT_STRING_I:         ;           print_string_i
    ld    A,[BC]        ; 1:7       print_string_i
    inc  BC             ; 1:6       print_string_i
    or    A             ; 1:4       print_string_i
    jp    p, $-4        ; 3:10      print_string_i
    and  0x7f           ; 2:7       print_string_i
    rst   0x10          ; 1:11      print_string_i   putchar(reg A) with ZX 48K ROM
    ret                 ; 1:10      print_string_i

STRING_SECTION:
string101:
    db "Hello, Word","!" + 0x80
size101              EQU $ - string101
; seconds: 0           ;[24:137]

At the beginning of the program, a switch can select a font other than 8*8. I added support for a 5-pixel-wide one.

About those pointers.
Forth has the word HERE, which returns a pointer to the first free space in the variable section.
Here it is resolved "statically" at compile time, so it will mostly work as long as the allocation is not done on the fly or in a loop.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'PUSH(10) ALLOT CREATE(my_pointer) HERE PUSH(123) COMMA FETCH UDOT'
my_pointer         EQU __create_my_pointer     
                        ;           10 allot
                        ;           create my_pointer
    push DE             ; 1:11      here
    ex   DE, HL         ; 1:4       here
    ld   HL, __create_my_pointer; 3:10      here
    ld   BC, 0x007B     ; 3:10      123 ,
    ld  [__create_my_pointer],BC; 4:20      123 ,
    ld    A,[HL]        ; 1:7       @   ( addr -- x )
    inc  HL             ; 1:6       @
    ld    H,[HL]        ; 1:7       @
    ld    L, A          ; 1:4       @
    call PRT_U16        ; 3:17      u.   ( u -- )
;#------------------------------------------------------------------------------
;# Input: HL
;# Output: Print unsigned decimal number in HL
;# Pollutes: AF, BC, HL <- DE, DE <- [SP]
PRT_U16:                ;           prt_u16
    xor   A             ; 1:4       prt_u16   HL=103 & A=0 => 103, HL = 103 & A='0' => 00103
    ld   BC, -10000     ; 3:10      prt_u16
    call BIN16_DEC      ; 3:17      prt_u16
    ld   BC, -1000      ; 3:10      prt_u16
    call BIN16_DEC      ; 3:17      prt_u16
    ld   BC, -100       ; 3:10      prt_u16
    call BIN16_DEC      ; 3:17      prt_u16
    ld    C, -10        ; 2:7       prt_u16
    call BIN16_DEC      ; 3:17      prt_u16
    ld    A, L          ; 1:4       prt_u16
    pop  HL             ; 1:10      prt_u16   load ret
    ex  [SP],HL         ; 1:19      prt_u16
    ex   DE, HL         ; 1:4       prt_u16
    jr   BIN16_DEC_CHAR ; 2:12      prt_u16
;#------------------------------------------------------------------------------
;# Input: A = 0 or A = '0' = 0x30 = 48, HL, IX, BC, DE
;# Output: if ((HL/(-BC) > 0) || (A >= '0')) print number -HL/BC
;# Pollutes: AF, HL
    inc   A             ; 1:4       bin16_dec
BIN16_DEC:              ;           bin16_dec
    add  HL, BC         ; 1:11      bin16_dec
    jr    c, $-2        ; 2:7/12    bin16_dec
    sbc  HL, BC         ; 2:15      bin16_dec
    or    A             ; 1:4       bin16_dec
    ret   z             ; 1:5/11    bin16_dec   does not print leading zeros
BIN16_DEC_CHAR:         ;           bin16_dec
    or   '0'            ; 2:7       bin16_dec   1..9 --> '1'..'9', unchanged '0'..'9'
    rst   0x10          ; 1:11      bin16_dec   putchar(reg A) with ZX 48K ROM
    ld    A, '0'        ; 2:7       bin16_dec   reset A to '0'
    ret                 ; 1:10      bin16_dec

VARIABLE_SECTION:

    ds 10               ;           10 allot
__create_my_pointer:    ;           create my_pointer
    dw 123              ;           123 ,

;# ============================================================================
  if ($<0x0100)
    .error Overflow 64k! over 0..255 bytes
  endif
  if ($<0x0200)
    .error Overflow 64k! over 256..511 bytes
  endif
  if ($<0x0400)
    .error Overflow 64k! over 512..1023 bytes
  endif
  if ($<0x0800)
    .error Overflow 64k! over 1024..2047 bytes
  endif
  if ($<0x1000)
    .error Overflow 64k! over 2048..4095 bytes
  endif
  if ($<0x2000)
    .error Overflow 64k! over 4096..8191 bytes
  endif
  if ($<0x3000)
    .error Overflow 64k! over 8192..12287 bytes
  endif
  if ($<0x4000)
    .error Overflow 64k! over 12288..16383 bytes
  endif
  if ($>0xFF00)
    .warning Data ends at 0xFF00+ address!
  endif
; seconds: 0           ;[63:335]
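
The "static" behavior of HERE can be modeled in a few lines of Python. This is only a sketch under the assumption that the translator keeps a single allocation counter while it walks the source; the class and function names are invented for illustration.

```python
class VariableSection:
    """Compile-time bookkeeping for the variable section (illustrative only)."""

    def __init__(self):
        self.offset = 0    # next free byte, known only to the translator
        self.labels = {}   # name -> offset inside the section

    def allot(self, size):
        self.offset += size

    def create(self, name):
        self.labels[name] = self.offset

    def comma(self):
        self.offset += 2   # one 16-bit cell

    def here(self):
        return self.offset


def translate_example():
    """Mirror 'PUSH(10) ALLOT CREATE(my_pointer) HERE PUSH(123) COMMA'."""
    section = VariableSection()
    section.allot(10)
    section.create("my_pointer")
    here_constant = section.here()   # becomes a constant in the generated code
    section.comma()
    return here_constant, section.labels["my_pointer"], section.here()


def here_seen_by_loop(iterations):
    """Static model: the loop body is translated once, so HERE is one constant."""
    section = VariableSection()
    here_constant = section.here()
    section.allot(1)
    return [here_constant] * iterations


def here_in_dynamic_forth(iterations):
    """Interpreted model: the pointer really moves on every iteration."""
    pointer, seen = 0, []
    for _ in range(iterations):
        seen.append(pointer)
        pointer += 1
    return seen


if __name__ == "__main__":
    print(translate_example())
    print(here_seen_by_loop(3), here_in_dynamic_forth(3))
```

In the first function HERE and my_pointer resolve to the same offset, which matches the generated `ld HL, __create_my_pointer` above. The last two functions show where the static approach and a classic interpreter would disagree.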

Then, after using the word VARIABLE(name), "name" will return a pointer.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'VARIABLE(abc) PUSH(abc)'
    push DE             ; 1:11      abc
    ex   DE, HL         ; 1:4       abc
    ld   HL, abc        ; 3:10      abc

VARIABLE_SECTION:

abc:                    ;           variable abc
    dw 0                ;           variable abc

;# ============================================================================
  if ($<0x0100)
    .error Overflow 64k! over 0..255 bytes
  endif
  if ($<0x0200)
    .error Overflow 64k! over 256..511 bytes
  endif
  if ($<0x0400)
    .error Overflow 64k! over 512..1023 bytes
  endif
  if ($<0x0800)
    .error Overflow 64k! over 1024..2047 bytes
  endif
  if ($<0x1000)
    .error Overflow 64k! over 2048..4095 bytes
  endif
  if ($<0x2000)
    .error Overflow 64k! over 4096..8191 bytes
  endif
  if ($<0x3000)
    .error Overflow 64k! over 8192..12287 bytes
  endif
  if ($<0x4000)
    .error Overflow 64k! over 12288..16383 bytes
  endif
  if ($>0xFF00)
    .warning Data ends at 0xFF00+ address!
  endif
; seconds: 1           ;[ 5:25]

I don't know if I answered the question. In reality, it will be a bit more complicated.

_dw · Post by **_dw** » Sat Dec 09, 2023 3:11 pm

ketmar wrote: ↑Sat Dec 09, 2023 10:46 am yep, this is so-called "superinstruction optimisation", some Forth systems are using that. the compiler tracks several last words compiled, and optimises them into one "superinstruction", if possible. my system does that too, albeit in very limited manner. it basically folds things like "LIT someop" to "(LIT-someop)" superinstructions. this is very simple thing to implement, so i did it. also, instead of calling a variable word, it directly compiles PFA of the variable as a literal (i see that you're doing something like this too), so "var @" is properly folded to "(#@) <address>" in threaded code. this way using variables is almost as cheap as using simple literals.

my research with real-world Forth applications (Z80 asm, for example, and some others) shows that there is no practical reason to explore this way any further. the real gains come from SSA-based native code generator with proper register allocation. Z80 doesn't have a lot of registers, tho, so my system isn't able to do much. yet it can aggressively inline code, reorder some evaluations, and such. still a win, because if you're writing your code "Forth way" (with many small words), inliner is able to eliminate a lot of word calling overhead.

but on x86 this codegen starts to shine, and even more on x86_64. there are a lot of registers to use, so it can eliminate most stack acrobatics. it also tracks comparisons, and avoid generating proper true/false values if it sees that comparison only used in a branch. it tries to move it closer to a branch, and use right branch instruction instead.

i'm not sure if i'll really finish that thing for "big CPUs", though: my x86 Forth system is fast enough, and i prefer to keep things simple. yet it was fun thing to try.

p.s.: i am talking about cross-compiler here, btw. it is written in x86 Forth, and cross-compiles to Z80. on Z80, it is STC kind of threaded code. i'm still experimenting with it, though. the compiler performs tail call optimisation too, so most words end with the direct jump to the last compiled word instead of the usual "word EXIT" sequence.

Just to clarify what I meant. Hopefully the meaning will pass through the translator.

I distinguish two things.
One is combining multiple Forth words into one, because parts of them then cancel out.
The second is that I don't treat a Forth word as a foundation stone, a Lego brick, but as something that can still be broken into smaller pieces.
There is actually a third thing: a word that Forth doesn't have. When working with floating-point numbers with the help of the ZX ROM, I finally came to the realization that I need the word DOWN, which is similar to DUP except that TOS will hold an undefined value.

Don't take this as me saying that Forth is poorly designed. Only that when implementing it on a concrete Z80 processor, I can secretly break it into other parts and immediately glue them to others or return to the original if no optimization takes place.

But it's true that when I started, I thought that Forth was simply not good for the Z80, because it didn't use all the registers, etc., before I realized that it was just an implementation error.

Given that I keep both TOS and NOS out of the stack, in registers, pushing a value onto the stack or popping one from it requires the instruction "ex DE,HL". So, in the first generated code, I saw one "ex DE,HL" immediately followed by another almost everywhere. That was the first impulse to solve it by combining the words that produce them into one new word.
Then I realized that it would be even more useful to split such words, so that I would not have to create a new word for each combination. Instead, I changed the words ending or starting with "ex" into a WORD_SWAP or SWAP_WORD variant.

PS: To clarify, from a word WORD that contains the instruction "ex DE, HL" at the end, I get a WORD_SWAP token, which creates the same word without the "ex DE, HL", followed by a second token, SWAP. So token optimization for SWAP alone is enough: if WORD_SWAP SWAP SWAP occurs, the "SWAP SWAP" is canceled.
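
As a rough illustration of that token trick, here is a Python sketch. It is not the M4 implementation; FOO is a hypothetical word whose generated code ends with "ex DE,HL", and the three passes (split, cancel, rejoin) are my reading of the description above.

```python
ENDS_WITH_EX = {"FOO"}   # hypothetical words whose code ends with "ex DE,HL"
SUFFIX = "_SWAP"


def split_tokens(tokens):
    """FOO -> FOO_SWAP SWAP, exposing the trailing exchange as its own token."""
    out = []
    for token in tokens:
        if token in ENDS_WITH_EX:
            out += [token + SUFFIX, "SWAP"]
        else:
            out.append(token)
    return out


def cancel_swaps(tokens):
    """Remove adjacent SWAP SWAP pairs, which leave DE and HL unchanged."""
    out = []
    for token in tokens:
        if token == "SWAP" and out and out[-1] == "SWAP":
            out.pop()
        else:
            out.append(token)
    return out


def rejoin(tokens):
    """If nothing was optimized, FOO_SWAP SWAP goes back to the original FOO."""
    out = []
    for token in tokens:
        previous = out[-1] if out else ""
        base = previous[:-len(SUFFIX)]
        if token == "SWAP" and previous.endswith(SUFFIX) and base in ENDS_WITH_EX:
            out[-1] = base
        else:
            out.append(token)
    return out


def optimize(tokens):
    return rejoin(cancel_swaps(split_tokens(tokens)))


if __name__ == "__main__":
    print(optimize(["FOO", "SWAP", "DUP"]))
    print(optimize(["FOO", "DUP"]))
```

With this shape only one cancellation rule is needed, instead of one hand-written combined word per pair of neighbors.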

_dw · Post by **_dw** » Sat Dec 09, 2023 3:18 pm

Wall_Axe wrote: ↑Sat Dec 09, 2023 2:30 pm the snake example is good, seems faster than if it were made in BASIC.

Im trying to compile it using CYGWIN which is a bash shell in windows

I tried doing sh compile.sh Game/snake
but it said it couldnt find FIRST.M4 because it was looking in ../FIRST.M4

So I moved the compile.sh into the same folder as snake.

its doing something using z88dk M4.exe and running into errors of not being able to find the command printf

probably because im not using real linux

If you write it in Forth and use the "fth2m4" script, it will look for the FIRST.M4 file.
I think it looks in "./", "../M4/" and somewhere else, and sets the first line accordingly. FIRST.M4 does the same.
So it was probably enough to manually edit the line

include(`../M4/FIRST.M4')dnl

It expects the compilation to be started inside the directory with the source (the GAME directory), so it steps out of GAME and enters the neighboring M4 directory where FIRST.M4 should be located, which then includes the other files that are needed. FIRST.M4 also changes the quotes from ` to { and from ' to }.

PS: Regarding the printf, I'm not sure what you're running into. But in scripts I often use the printf command instead of "echo", which is more flexible, because it doesn't break your line every time. And it should work on various OSes, including BSD.

Unfortunately, I can't advise you with the Windows support.

PPS: Actually, it will be even worse in Windows, because M4 (I mean the macro language itself) cannot do everything on its own. For example, to determine the ASCII value of a character, I call the Linux shell from M4 and solve it there. So it can happen that something unexpectedly fails, even if it works without a problem at other times.
Actually, it can fail even in Linux, because I'm not afraid to change it continuously and not everything is tested. It's more of a bazaar than a cathedral. There is no other way, because I find problems on the fly. I have already made changes that deliberately broke more things than I was able to fix in a day, so the code was deliberately broken in some situations on the website. But with zero users I can afford it. .)
It's always useful for showing small parts of the code, how this or that is solved, etc. Something like Compiler Explorer.

ketmar · Post by **ketmar** » Sat Dec 09, 2023 3:49 pm

_dw wrote: ↑Sat Dec 09, 2023 3:11 pm Don't take this as me saying that Forth is poorly designed.

no problems. ;-) Forth is what you decided it is anyway. it is more like a foundation to build your own "view" on it — that's why i love it so much. my x86 system, for example, would prolly be barely recognized as Forth at all by anyone strictly adhering to "Forth standard". so it's great that you took it as something to build your ideas upon — this is, after all, exactly as it *should* be used.

Lethargeek · Post by **Lethargeek** » Sat Dec 09, 2023 5:07 pm

some possible optimizations

a) keeping track what stack values happened to be in registers when they are needed, thus avoiding some explicit redundant swap/drop/rot/etc

b) for simple functions taking and returning 1-2 or no values instead of "call...jp" use "pop:(pop):call...ret" (maybe even there will be no need for pops if the value(s) are in regs already)

_dw · Post by **_dw** » Sat Dec 09, 2023 5:41 pm

a) Keeping variables that track what is in HL, DE, BC, etc. occurred to me too late, and then I decided not to rework all the words. For some words I do it internally, because after a while I started using auxiliary (sub)macros to solve some constantly recurring problems. So I just call them, and the contents of the registers can also be among the parameters.
Then, the easier way is to divide the problematic words into smaller units and solve those as well.
Or, on the contrary, create a new word for problematic combinations of words.

b) Calling words/functions via the data stack is supported. It's the same as with loops (badly translated as "bows"). The default option stores the return address in memory directly inside the function (badly translated as "face"). It is possible to do it inefficiently via the RAS (the default for standard Forth), but there is also the option of always keeping the return address on the data stack. Considering that I hold NOS and TOS outside it in DE and HL, the address will then be in [SP], i.e. NNOS. But this is a special option where the programmer has to solve the problems himself.

CALL

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'CALL(Hi) COLON(Hi) PRINT({"Hello",13}) SEMICOLON'
    call Hi             ; 3:17      call ( -- )
;   ---  the beginning of a non-recursive function  ---
Hi:                     ;           
    pop  BC             ; 1:10      : ret
    ld  [Hi_end+1],BC   ; 4:20      : ( ret -- )
    push DE             ; 1:11      print     "Hello",13
    ld   BC, size101    ; 3:10      print     Length of string101
    ld   DE, string101  ; 3:10      print     Address of string101
    call 0x203C         ; 3:17      print     Print our string with ZX 48K ROM
    pop  DE             ; 1:10      print
Hi_end:
    jp   0x0000         ; 3:10      ;
;   ---------  end of non-recursive function  ---------

STRING_SECTION:
string101:
    db "Hello",13
size101              EQU $ - string101
; seconds: 0           ;[22:115]

RCALL

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'RCALL(Hi) RCOLON(Hi) PRINT({"Hello",13}) RSEMICOLON'
    call Hi             ; 3:17      rcall
    ex   DE, HL         ; 1:4       rcall
    exx                 ; 1:4       rcall ( R: ret -- )
;   ---  the beginning of a recursive function  ---
Hi:                     ;           
    exx                 ; 1:4       : rcolon
    pop  DE             ; 1:10      : rcolon ret
    dec  HL             ; 1:6       : rcolon
    ld  [HL],D          ; 1:7       : rcolon
    dec   L             ; 1:4       : rcolon
    ld  [HL],E          ; 1:7       : rcolon (HL') = ret
    exx                 ; 1:4       : rcolon ( R: -- ret )
    push DE             ; 1:11      print     "Hello",13
    ld   BC, size101    ; 3:10      print     Length of string101
    ld   DE, string101  ; 3:10      print     Address of string101
    call 0x203C         ; 3:17      print     Print our string with ZX 48K ROM
    pop  DE             ; 1:10      print
Hi_end:
    exx                 ; 1:4       ; rsemicilon
    ld    E,[HL]        ; 1:7       ; rsemicilon
    inc   L             ; 1:4       ; rsemicilon
    ld    D,[HL]        ; 1:7       ; rsemicilon DE = ret
    inc  HL             ; 1:6       ; rsemicilon
    ex   DE, HL         ; 1:4       ; rsemicilon
    jp  [HL]            ; 1:4       ; rsemicilon
;   ---------  end of recursive function  ---------

STRING_SECTION:
string101:
    db "Hello",13
size101              EQU $ - string101
; seconds: 1           ;[30:161]

SCALL

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'SCALL(Hi) SCOLON(Hi) PRINT({"Hello",13}) SSEMICOLON'
    call Hi             ; 3:17      scall
;   ---  the beginning of a data stack function  ---
Hi:                     ;           
    push DE             ; 1:11      print     "Hello",13
    ld   BC, size101    ; 3:10      print     Length of string101
    ld   DE, string101  ; 3:10      print     Address of string101
    call 0x203C         ; 3:17      print     Print our string with ZX 48K ROM
    pop  DE             ; 1:10      print
Hi_end:
    ret                 ; 1:10      s;
;   ---------  end of data stack function  ---------

STRING_SECTION:
string101:
    db "Hello",13
size101              EQU $ - string101
; seconds: 0           ;[15:85]
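
Why the first variant is labeled "non-recursive" can be shown with a small Python model. It is only a sketch: the single slot stands for the operand of the `jp 0x0000` at Hi_end, the list stands for a return stack (RAS or the machine stack), and all names and addresses are hypothetical.

```python
class SlotReturn:
    """COLON/SEMICOLON style: one return slot per word, patched on entry."""

    def __init__(self):
        self.slot = None

    def enter(self, return_address):
        self.slot = return_address   # like: ld [Hi_end+1],BC

    def leave(self):
        return self.slot             # like: jp <patched address>


class StackReturn:
    """RCOLON or SCOLON style: return addresses live on a stack."""

    def __init__(self):
        self.stack = []

    def enter(self, return_address):
        self.stack.append(return_address)

    def leave(self):
        return self.stack.pop()


def nested_returns(keeper, return_addresses):
    """Enter the same word once per address, then unwind all the calls."""
    for address in return_addresses:
        keeper.enter(address)
    return [keeper.leave() for _ in return_addresses]


if __name__ == "__main__":
    print(nested_returns(SlotReturn(), [0x8010]))
    print(nested_returns(SlotReturn(), [0x8010, 0x9020]))   # outer return is lost
    print(nested_returns(StackReturn(), [0x8010, 0x9020]))
```

A single call works in both models. Only when the word is entered again before it has returned does the single slot overwrite the outer return address.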

Lethargeek · Post by **Lethargeek** » Sat Dec 09, 2023 6:37 pm

@_dw, i mean, even your _fib2 here (being "n1 -- n2") could be SCALL

_dw · Post by **_dw** » Sat Dec 09, 2023 6:59 pm

Lethargeek wrote: ↑Sat Dec 09, 2023 6:37 pm @_dw, i mean, even your _fib2 here (being "n1 -- n2") could be SCALL

https://codeberg.org/DW0RKiN/M4_FORTH/s ... tent-fib-2

Both versions are there.

_dw · Post by **_dw** » Tue Dec 12, 2023 8:37 pm

Wall_Axe wrote: ↑Sat Dec 09, 2023 2:30 pm the snake example is good, seems faster than if it were made in BASIC.

Im trying to compile it using CYGWIN which is a bash shell in windows

I tried doing sh compile.sh Game/snake
but it said it couldnt find FIRST.M4 because it was looking in ../FIRST.M4

So I moved the compile.sh into the same folder as snake.

its doing something using z88dk M4.exe and running into errors of not being able to find the command printf

probably because im not using real linux

I just tried booting W10, installing all updates and then installing the Windows Subsystem for Linux (wsl --install).

After a while of searching for where the hell my data is, I found out that the disks are mounted in /mnt under their own names (the translator keeps writing "my" here, but it should be the third-person possessive pronoun).

I had to install some things. First I tried Midnight Commander, but it could not be installed, because the universe repository must be activated first.

Code: Select all

sudo add-apt-repository universe
sudo apt-get install mc
sudo apt-get install m4
sudo apt-get install pasmo
sudo apt-get install fuse-emulator-common
sudo apt-get install zmakebas

fuse had some problem with the graphics when it started: it displayed artifacts and was unusable. But I had installed fuse-emulator-gtk and had not updated the graphics driver, so it's not my problem. In addition, there are many more native emulators under Win.

the check_word.sh script ran without any problems with printf.

I checked where I have to leave M4 and get help from the bash shell:

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth$ grep -R esyscmd
M4/__macros.m4:define({__FILE_SIZE_VER1},{esyscmd({stat -c%s "$1" 2>/dev/null| tr -d '\n\t\r'})}){}dnl
M4/__macros.m4:define({__FILE_SIZE_VER2},{esyscmd({find "$1" -printf "%s" 2>/dev/null})}){}dnl
M4/__macros.m4:define({__FILE_SIZE_VER3},{esyscmd({wc -c $1 2>/dev/null | cut -f 1 -d " "| tr -d '\n\t\r'})}){}dnl
M4/__macros.m4:define({__FILE_SIZE_VER4},{esyscmd({wc -c $1 2>/dev/null | a=$(cut -f 1 -d " "); printf "$a"})}){}dnl
M4/__macros.m4:define({__FILE_SIZE_VER5},{esyscmd({a=$(wc -c $1 2>/dev/null); a=${a%% }; printf ""$a})}){}dnl
M4/__macros.m4:define({__FILE_SIZE_VER6},{esyscmd({a=$(wc -c $1 2>/dev/null); printf ""${a%% }})}){}dnl
M4/__macros.m4:__{}esyscmd({grep -o 'define({$2_size},[^)]*)' $1$2$3 2>/dev/null}){}dnl
M4/__macros.m4:esyscmd(printf "0x%02X" \'[{$1}])}])[{}]dnl

and tried to write
printf "0x%02X" \'A (quite a horror, because Windows uses a different keyboard layout than the one I'm used to in Linux), and it ran correctly in both WSL and Ubuntu.

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth$ printf "0x%02X" \'A 
0x41dworkin@dw-A15:~/Programovani/ZX/Forth$ printf "0x%02X" \'B
0x42dworkin@dw-A15:~/Programovani/ZX/Forth$ printf "0x%02X" \'C
0x43dworkin@dw-A15:~/Programovani/ZX/Forth$
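
For readers who don't know this printf feature: when the argument starts with a quote character, the shell's printf uses the numeric value of the character that follows it. A Python equivalent, purely for illustration (the function name is made up):

```python
def ascii_hex(character):
    """Format the code of one character like: printf "0x%02X" \\'A"""
    return "0x%02X" % ord(character)


if __name__ == "__main__":
    for character in "ABC":
        print(ascii_hex(character))
```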

The test of the compile.sh script, after installing pasmo and zmakebas, was almost successful. I ran into a problem in only one place, with this code at the beginning of the program

Code: Select all

  ifdef __ORG
    org __ORG
  else
    org 32768
  endif

and pasmo persistently rejects any word after ifdef, saying "expected macro name".

The problem is that the pasmo in the repository is version 0.5.3 and the latest version is 0.5.5. According to its web page, the ifdef construct was only added in that version.

Version 0.5.5
New directives IFDEF and IFNDEF. New emit options --sdrel for SDCC .rel files and --trs for TRS-80 cmd files. A lot of code cleaning and reformating. Some bug fixes. Test suite: use make check.

Gzipped tar file: pasmo-0.5.5.tar.gz

So I didn't notice a problem using M4 FORTH in windows.

...but even then I realized that I probably just misunderstood (or it was wrongly translated into Czech) what you wrote about printf.

Maybe you meant that the problem with printf is in the compilation from C using z88dk.
Because the C standard for printf formatting is quite complex to implement. And then I find out that an integer is not printed, but nothing warns me that I must use %d instead of %i. For floating-point numbers I want some special parameters to be supported, etc.

z88dk constantly gives me an inferiority complex, because it took me hours to find the right parameters for compilation, so that it finally compiled something at all...

_dw · Post by **_dw** » Wed Dec 20, 2023 11:14 pm

ketmar wrote: ↑Sat Dec 09, 2023 3:54 am actually, i found that "FOR" is not really really necessary: with words like "1+R!", "1-R!", "R@" and "R1@" you can have quite efficient loops (basically, writing "FOR" manually if you need to). doing manual loop index manipulations may hurt readability a little, but in exchange you have full control over loop step, and you can avoid keeping the original limit if you don't need to emulate "I".

also, if you're doing cross-compiling, it is possible to implement native code generator too. with some automatic inlining and SSA-based codegen it can get rid of most stack manipulations, constantly beating the crap out of most Z80 C compilers. i'm planning to release such system somewhere in the next year (it is in a PoC state now).

How exactly do you understand words like "1+R!", "1-R!", "R@" and "R1@".

"R@" is a standard word with the stack effect ( -- x ) ( R: x -- x )
Naming logic in Forth can sometimes be a little tricky.
If I compare it with "R>", which has the effect ( -- x ) ( R: x -- ), and with the opposite word ">R", or with "RDROP", it seems to me that the R does two things at once: it marks that the word refers to the return address stack (R.A.S.), and at the same time it stands for the "TOS" (Top Of Stack) of the R.A.S.

The normal "@" works with one parameter (so automatically the TOS), and that parameter is written in front of it, as in "32768 @". So "R@" looks like "fetch from the top of the return address stack".

Likewise "1+" does the same thing as "1 +", only it can save work when it is handled by the interpreter.

So "1+R!" looks like you increase the TOS by 1, then store it at the address held on the R.A.S., and then remove that address (the removal is not certain, because "R@" does not behave that way). But that doesn't make much sense: if it is supposed to be useful for writing loops, I'd expect the word to mean that you simply increase the value stored on the R.A.S. by 1. Similarly for "1-R!".

"R1@" is quite a mystery, because storing the RAS value at address 1 (somewhere in ROM) is obvious nonsense.
So is it supposed to store the value 1 on the RAS, like "1 >R"? Probably not, that would have a different name.
It could load the second item from the return address stack, something like NOS. The Forth specification tries as much as possible to avoid any work on the RAS; it is just a store for values. I understand that this makes implementation easier. It just doesn't work out on the Z80 if the RAS is chosen as the emulated stack.

I have so far avoided supporting words that work with the R.A.S., because it's always the wrong way, but in case someone uses them, I've added new words that are useful for emulating loops, like:

"R> 1+ >R" and its modification with 2+ 1- 2-
"R> number + >R" and modification with -.

Original

Spoiler

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'R_FROM __ASM _1ADD TO_R'
                        ;[9:66]     r>   ( b a -- b a i ) ( R: i -- )
    exx                 ; 1:4       r>
    ld    E,[HL]        ; 1:7       r>
    inc   L             ; 1:4       r>
    ld    D,[HL]        ; 1:7       r>
    inc  HL             ; 1:6       r>
    push DE             ; 1:11      r>
    exx                 ; 1:4       r>   i . b a
    ex   DE, HL         ; 1:4       r>   i . a b
    ex  [SP],HL         ; 1:19      r>   b . a i

    inc  HL             ; 1:6       1+
                        ;[9:65]     >r   ( c b a -- c b ) ( R: -- a )
    ex  [SP],HL         ; 1:19      >r   a . b c
    ex   DE, HL         ; 1:4       >r   a . c b
    exx                 ; 1:4       >r
    pop  DE             ; 1:10      >r
    dec  HL             ; 1:6       >r
    ld  [HL],D          ; 1:7       >r
    dec   L             ; 1:4       >r
    ld  [HL],E          ; 1:7       >r
    exx                 ; 1:4       >r
; seconds: 1           ;[19:137]

Now

Spoiler

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'R_FROM _1ADD TO_R'
    exx                 ; 1:4      r> 1+ >r   ( R: x -- x+ )
    inc [HL]            ; 1:11     r> 1+ >r   lo 
    jr   nz, $+5        ; 2:7/12   r> 1+ >r
    inc   L             ; 1:4      r> 1+ >r
    inc [HL]            ; 1:11     r> 1+ >r   hi 
    dec   L             ; 1:4      r> 1+ >r
    exx                 ; 1:4      r> 1+ >r
; seconds: 1           ;[ 8:45]

That is a big difference. And it's quite fun to do, find a part that could be used often and improve it. And it's not even complicated, just a small challenge or a trick.
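
A minimal Python model of what the shorter listing does, assuming the RAS cell is stored low byte first at the address held in HL'. The name `inc16_in_place` is made up for this sketch and is not part of M4 FORTH. It only shows why the high byte is touched solely when the low byte wraps to zero:

```python
def inc16_in_place(mem: bytearray, addr: int) -> None:
    """Increment the little-endian 16-bit cell at mem[addr], mem[addr + 1]."""
    mem[addr] = (mem[addr] + 1) & 0xFF              # inc [HL]    low byte
    if mem[addr] == 0:                              # jr nz, $+5  skip unless the low byte wrapped
        mem[addr + 1] = (mem[addr + 1] + 1) & 0xFF  # inc L / inc [HL] / dec L


ras = bytearray([0xFF, 0x00])   # cell holding 0x00FF
inc16_in_place(ras, 0)
print(ras.hex())                # 0001, i.e. 0x0100 stored low byte first
```

The value never travels through the data stack, which is where the saving over the separate "r>", "1+", ">r" sequence comes from.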

Then I started reworking the support for branching to decide if the loop should continue. This is quite laborious because there are different types of tests, so for now I completed only the unsigned ones, and only for IF branches. But it is built from auxiliary macros, so it can quickly be added for WHILE and UNTIL. Every variant jumps if the result is FALSE, only the label is named differently: IF jumps to "else", UNTIL jumps to "begin" and WHILE jumps to "exit".

Before

Spoiler

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'R_FETCH PUSH(100) ULT __ASM IF'
                        ;[9:64]     r@   ( -- i ) ( R: i -- i )
    exx                 ; 1:4       r@
    ld    E,[HL]        ; 1:7       r@
    inc   L             ; 1:4       r@
    ld    D,[HL]        ; 1:7       r@
    dec   L             ; 1:4       r@
    push DE             ; 1:11      r@
    exx                 ; 1:4       r@
    ex   DE, HL         ; 1:4       r@
    ex  [SP],HL         ; 1:19      r@
    push DE             ; 1:11      100
    ex   DE, HL         ; 1:4       100
    ld   HL, 100        ; 3:10      100
                        ;[7:41]     u<
    ld    A, E          ; 1:4       u<   DE<HL --> DE-HL<0 --> carry if true
    sub   L             ; 1:4       u<   DE<HL --> DE-HL<0 --> carry if true
    ld    A, D          ; 1:4       u<   DE<HL --> DE-HL<0 --> carry if true
    sbc   A, H          ; 1:4       u<   DE<HL --> DE-HL<0 --> carry if true
    sbc  HL, HL         ; 2:15      u<
    pop  DE             ; 1:10      u<

    ld    A, H          ; 1:4       if
    or    L             ; 1:4       if
    ex   DE, HL         ; 1:4       if
    pop  DE             ; 1:10      if
    jp    z, else101    ; 3:10      if
; seconds: 0           ;[28:162]

Now

Spoiler

Code: Select all

dworkin@dw-A15:~/Programovani/ZX/Forth/M4$ ../check_word.sh 'R_FETCH PUSH(100) ULT IF'
                       ;[13:54]     r@ 100 u< if   ( R: x -- x ) flag: x u< 100 ;#variant: default
    exx                 ; 1:4       r@ 100 u< if
    ld    A,[HL]        ; 1:7       r@ 100 u< if   [ras]<100 --> lo [ras]-0x64<0 --> false if not carry
    sub  0x64           ; 2:7       r@ 100 u< if   [ras]<100 --> lo [ras]-0x64<0 --> false if not carry
    inc   L             ; 1:4       r@ 100 u< if
    ld    A,[HL]        ; 1:7       r@ 100 u< if   [ras]<100 --> hi [ras]-0x00<0 --> false if not carry
    dec   L             ; 1:4       r@ 100 u< if
    sbc   A, 0x00       ; 2:7       r@ 100 u< if   [ras]<100 --> hi [ras]-0x00<0 --> false if not carry
    exx                 ; 1:4       r@ 100 u< if
    jp   nc, else101    ; 3:10      r@ 100 u< if   false if not carry
; seconds: 0           ;[13:54]
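
The fused "r@ 100 u< if" listing relies on a two-step subtraction whose final borrow is the answer. A small Python model of that idea, assuming a little-endian cell in memory; `ult_const` is a hypothetical name for this sketch only:

```python
def ult_const(mem: bytearray, addr: int, const: int) -> bool:
    """True when the 16-bit cell at mem[addr] is unsigned-less-than const."""
    lo = mem[addr] - (const & 0xFF)              # sub  lo(const)
    carry = 1 if lo < 0 else 0                   # borrow out of the low byte
    hi = mem[addr + 1] - (const >> 8) - carry    # sbc  A, hi(const)
    return hi < 0                                # carry set -> true, otherwise "jp nc, else"


cell = bytearray((99).to_bytes(2, "little"))
print(ult_const(cell, 0, 100))   # True
```

Only the flags matter, so no TRUE/FALSE value has to be built and no stack item has to be pushed or popped.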

ketmar · Post by **ketmar** » Thu Dec 21, 2023 12:22 pm

_dw wrote: ↑Wed Dec 20, 2023 11:14 pm How exactly do you understand words like "1+R!", "1-R!", "R@" and "R1@"?

the logic behind naming is like this: "R<smth>" indicates return stack operation. "R@" is "load the TOS of the return stack", so, logically, "R!" will be "store the argument to the TOS of the return stack". then, we usually have the "1+!" word, which does "dup @ 1+ swap !" (i.e. increments value at the given address). hence, "1+R!" does the same with the TOS of the return stack (as indicated by "R"). the same reasoning applies to "+R!" (modelled after "+!" word).

now, "R<n>@" is the same as "<n> RPICK". "R0@" is just "R@", "R1@" is "1 RPICK", and so on. there are also "R1!" and such.

note that "!" in Forth is not read as "store something at the given address", but just as "store". this is the common convention: "!" is read as "store", and "@" is read as "load". so "R!" is "RTOS store", and "R@" is "RTOS load".

so, you can build return stack operations using a set of simple steps, going backward (the usual Forth way ;-).
1. do i need to load the value from RTOS? ok, the last char is "@".
2. what index i want? if index is "0", it is optional (but can be specified nevertheless). prepend index.
3. i want to indicate that i work with the return stack: prepend "R".
4. do i want some additional operation there? prepend it too.

so, using the above algorithm: if i want to increment 1th element of the return stack (note that we start counting from 0):
1. i need to store something. "!"
2. index is 1. "1!"
3. it is a return stack operation, "R1!"
4. i want to increment it. "+R1!".
5. i want to increment it by 1. "1 +R1!". optional microoptimisation applied, because we have such word: "1+R1!"

it can be deciphered back with the same algo, again, starting from the last char.
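
The deciphering direction can be sketched as a tiny parser. This is only an illustration of the naming scheme described above, written in Python; the function `decode_rs_word` and its regular expression are assumptions of this sketch and do not come from any Forth system:

```python
import re

# Optional operation prefix, then "R", an optional index, then "@" (load) or "!" (store).
_RS_WORD = re.compile(r"^(?P<op>.*?)R(?P<index>\d*)(?P<access>[@!])$")


def decode_rs_word(name: str):
    """Split a return-stack word name into (operation, index, access)."""
    m = _RS_WORD.match(name)
    if m is None:
        raise ValueError(f"not a return-stack word: {name!r}")
    access = "load" if m["access"] == "@" else "store"
    index = int(m["index"] or 0)      # a missing index means element 0 (RTOS)
    return m["op"], index, access


print(decode_rs_word("1+R1!"))   # ('1+', 1, 'store')
print(decode_rs_word("R@"))      # ('', 0, 'load')
```

Read from the last character backwards, "1+R1!" comes out as: store, element 1, return stack, with the extra operation "1+".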

_dw wrote: ↑Wed Dec 20, 2023 11:14 pm I have so far avoided supporting words that work with the R.A.S., because it's always the wrong way

yes, most of the time using return stack for something else than keeping one or two temp values is a bad coding style. but sometimes you may need return stack manipulation instructions for some very low-level words. i mostly added them to emulate DO/FOR loops when porting code from some other Forth system.

as for the branches, for Z80 i usually have "compare and branch" superinstructions. so things like "< IF" are written as "<IF" (actually, my cross-compiler does such optimisation automatically). this way we can avoid extra code to perform a comparison and push a flag only for it to be consumed by the following branch. i also have special cases like "-IF" ("0< IF"), "+IF" ("0> IF"), "-0IF" ("0<= IF"), "+0IF" ("0>= IF"), "IFNOT" ("0= IF"). "WHILE" and "UNTIL" also have such variants.
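
A rough Python sketch of how such an automatic fusion pass could look on a token list. The word pairs and fused names are the ones listed above; the table layout and the function `fuse` are assumptions of this sketch and not the actual cross-compiler:

```python
# (comparison, branch) pairs and the superinstruction that replaces them.
FUSED = {
    ("<", "IF"): "<IF",
    ("0<", "IF"): "-IF",
    ("0>", "IF"): "+IF",
    ("0<=", "IF"): "-0IF",
    ("0>=", "IF"): "+0IF",
    ("0=", "IF"): "IFNOT",
}


def fuse(tokens):
    """Replace each known comparison + IF pair with a single token."""
    out, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in FUSED:
            out.append(FUSED[pair])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


print(fuse("DUP 0< IF NEGATE THEN".split()))   # ['DUP', '-IF', 'NEGATE', 'THEN']
```

The same table could carry "WHILE" and "UNTIL" pairs; they are left out here because their fused names are not given in the post.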

i am also doing tail call optimisation (i.e. the last word before ";" is not called as usual, we simply branch to it), and simple folding of operations like "addr @" to superinstruction "(#:@)", which is followed by the address. variables in my system compiled as simple literals (i.e. instead of calling a variable word, the compiler compiles LIT with variable PFA address), so "var @" is optimised to one fast "(#:@)" instruction.

sadly, my cross-compiler is not ported to UrForth/Beast yet, but i will eventually port and publish it.

_dw · Post by **_dw** » Tue Jan 09, 2024 2:37 am

You've inspired me to maybe create proper Forth names for extensions I couldn't figure out how to name.

It is sometimes useful to use BC as an auxiliary register, but this is completely outside the Forth standard.
So reading it is "BC@" (fetch does not delete the value it reads, which is the correct behavior here).
I had already thought about the notation ">BC", but if reading is "@" (fetch), then writing should be "!" (store).
"BC!" is therefore the write.

The same naming was also useful for direct access to the processor's flags from Forth.

zf@ reads zero_flag as a boolean value: 1 is converted to -1 (TRUE) and 0 stays 0 (FALSE).

cf@ loads carry_flag.

So I can have some internal words that simply end with ...zf! and follow them with "zf@ if" (which is one word and is converted into the single instruction "jp nz, else"), or follow them with "zf@ invert if", which is converted into "jp z, else". It is similar for "while" and "until".

If I had done it this way from the beginning, I could have saved myself a lot of programming work on

XXX_IF
XXX_WHILE
XXX_UNTIL

and could have created only

XXX_ZF_STORE

and in the token rules just connected it with

XXX_ZF_STORE + ZF_IF
XXX_ZF_STORE + ZF_WHILE
XXX_ZF_STORE + ZF_UNTIL

because that is written on one line.

But I will probably always leave them as hidden (undocumented) words.

_dw · Post by **_dw** » Tue Jan 09, 2024 3:10 am

But I have a problem with the support of 8 bit values.

Forth has a fixed cell size: 16 bits in my case, on an 8-bit Z80 CPU (I don't want to deal with the 4-bit adder and the 16-bit bus).

Given that the stack lives in memory that is only 8 bits wide, the width of the cell does not even make much sense.

I support words starting with the letter C as 8-bit words.

So if the TOS is in HL, I only manipulate L. For speed, H is ignored and not tested to see if it is zero.

I also introduced words starting with H, as in "hi"gh. They only manipulate the upper half of HL, i.e. the H register. Their STORE and FETCH variants should even add 1 to the address.

But I'm already running into problems here...

For example "c+" does "L += E" and H does not change. But some words internally do something like "EX DE,HL", so the upper byte ends up with the original value of D and not H.

And I haven't even mentioned that I would need easy-to-remember words for combining the upper and lower byte.

It can always be solved somehow

L += H --> "dup 8 rshift +" but that seems quite long.

For easy code generation, I like to convert internally

DUP --> 0 PICK
OVER --> 1 PICK
2OVER NIP --> 2 PICK
2OVER DROP --> 3 PICK
etc.

If only because the rules for joining tokens are then easier for me. It's the same function, just with a different parameter.
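
As an illustration, the normalization in the table above could be modelled like this in Python. The dictionary and the function `to_pick` are hypothetical names for this sketch and do not reflect how the M4 macros actually do it:

```python
# Word sequences from the table above and the PICK index that replaces them.
PICK_FORMS = {
    ("DUP",): 0,
    ("OVER",): 1,
    ("2OVER", "NIP"): 2,
    ("2OVER", "DROP"): 3,
}


def to_pick(tokens):
    """Rewrite the known stack words into the uniform "<n> PICK" form."""
    out, i = [], 0
    while i < len(tokens):
        for length in (2, 1):                  # try the two-word forms first
            key = tuple(tokens[i:i + length])
            if key in PICK_FORMS:
                out += [str(PICK_FORMS[key]), "PICK"]
                i += len(key)
                break
        else:                                  # no form matched: keep the token
            out.append(tokens[i])
            i += 1
    return out


print(to_pick("2OVER NIP OVER +".split()))     # ['2', 'PICK', '1', 'PICK', '+']
```

After this step a later rule only has to recognise "PICK" with a parameter instead of four unrelated words.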

So something like an 8-bit CPICK would be useful for me, where:

0 CPICK --> new tos = original L
1 CPICK --> new tos = original H
2 CPICK --> new tos = original E
3 CPICK --> new tos = original D
4 CPICK --> new tos = byte ptr [original SP]
5 CPICK --> new tos = byte ptr [original SP+1]
etc.

Then it would be:

L += H --> "dup 8 rshift +" --> "1 cpick c+"

The compiler's internal optimization then takes care that the stack does not move at all and the H register is simply added to the L register.
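
A small Python model of the proposed byte indexing, assuming TOS in HL, NOS in DE and the remaining cells in memory at SP, as in the list above. `cpick` here is only a model of the idea, not an existing M4 FORTH word:

```python
def cpick(n: int, hl: int, de: int, mem: bytes, sp: int) -> int:
    """Return byte n of the data stack: L, H, E, D, then the bytes at SP, SP+1, ..."""
    regs = (hl & 0xFF, hl >> 8, de & 0xFF, de >> 8)   # L, H, E, D
    if n < 4:
        return regs[n]
    return mem[sp + n - 4]


mem = bytes([0xAA, 0xBB])
print(hex(cpick(1, 0x1234, 0x5678, mem, 0)))   # 0x12, the original H
print(hex(cpick(4, 0x1234, 0x5678, mem, 0)))   # 0xaa, byte ptr [SP]
```

With this indexing "1 cpick c+" reads H and adds it to L, which matches "L += H".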

PS: One could probably write a Forth with a cell size of 8 bits if the TOS was at [SP] and the "inc/dec SP" instructions were used. But it would probably be quite problematic, because POP would load both TOS and NOS together every time. And if I were to return only one value, for example with "+":

Code: Select all

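pop  HL      ; L = TOS, H = NOS
ld   A,L
add  A,H     ; A = TOS + NOS
ld   H,A     ; the sum goes to H
push HL
inc  SP      ; skip the old TOS byte, the sum in H is the new TOS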

I would have to move TOS from L to H.
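
The byte shuffling is easier to follow in a small Python model of the six instructions, assuming byte-sized cells with TOS at mem[sp] and NOS at mem[sp + 1]. `add8` is a made-up name for this sketch:

```python
def add8(mem: bytearray, sp: int) -> int:
    """8-bit "+" on byte cells; returns the new stack pointer."""
    l, h = mem[sp], mem[sp + 1]    # pop HL   L = TOS, H = NOS
    sp += 2
    h = (l + h) & 0xFF             # ld A,L / add A,H / ld H,A
    sp -= 2                        # push HL
    mem[sp], mem[sp + 1] = l, h
    return sp + 1                  # inc SP   the sum in H is the new TOS


stack = bytearray([3, 4, 9])       # TOS = 3, NOS = 4, then 9
sp = add8(stack, 0)
print(sp, stack[sp], stack[sp + 1])   # 1 7 9
```

The result has to end up in H because only the upper byte of the pushed pair survives the final "inc SP".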

16-bit words would be resolved as "d+" etc.
But it would probably be easier to pretend to the outside that the cell is 16 bits. So always write "swap" and not "2swap".

PPS: I read somewhere a translation of an older version of Forth where they say that the return address stack and the data stack should each have at least 64 values...
Because then they both fit into 256 bytes.
Argh, the RAS could be handled differently, because it is possible that some programs will not use it at all anyway.

ketmar · Post by **ketmar** » Tue Jan 09, 2024 6:58 pm

actually, there was at least one system (for 6502, if i recall it correctly), which was using 8-bit cells. so yeah, it may be worth trying to create a version with 8-bit basic data type, and 16-bit as double/address.

as for stack sizes… actually, for most Forth code (except the complex data structures like balanced trees, which, i believe, is not something you'll do on Speccy anyway) even 16-slot stack (32 bytes) is enough. actually, the stacks shouldn't always be symmetrical: rule of thumb says that data stack size should be something along "#rstack*1.8" (i.e. data stack is almost twice as big as return stack; i pulled 1.8 out of my ass, though ;-).

and yep, you can deviate from standards. (tbh, to write good and usable system you should throw all standards out of the window ;-). let "swap" still work on 16-bit values, and "cswap" on 8-bit values, for example. the beauty of Forth is that you can throw all kinds of socks to the wall, and watch which will stick. ;-)

p.s.: about stack sizes again. even most complex data structures i am using (self-balancing binary trees with recursive balancing algorithms) require no more than ~96 slots of return stack, and ~120 slots of data stack. this is for 32-bit system, and my trees have 1e+6 nodes, which is totally unrealistic on Speccy, of course. ;-)
