A new data compressor called ZX0

The place for codemasters or beginners to talk about programming any language for the Spectrum.
Alone Coder
Manic Miner
Posts: 401
Joined: Fri Jan 03, 2020 10:00 am

Re: A new data compressor called ZX0

Post by Alone Coder »

evilpaul wrote: Thu Sep 02, 2021 8:56 am
rastersoft wrote: Thu Sep 02, 2021 7:50 am Mmm... Is anybody interested in my text compression/decompression routines from Escape from M.O.N.J.A.S.? The output is about 59% of the original size, each string is individually addressable, and it can be decompressed on the fly without needing any kind of buffer. If there is interest, I can extract it and write up some instructions...
I need to do something with text compression for my Zelda-lite game. I have a bunch of ideas, but I would be interested to hear your approach. Maybe open another thread?
One simple approach is 5 bits per letter and 3 bits for a space. I've used this in a 128-byte intro. However, punctuation requires separate treatment.
Huffman encoding gives better results for larger amounts of data (in ZX-Guide #2 and #3, some of the articles were Huffman encoded, each line individually).
In ZX-Time magazine, all the text was stored at 5 bits per letter with some filtering on the result. One possible filter is to switch letter case after . ! ?
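To make the idea concrete, here is a rough Python sketch of 5-bits-per-letter packing (the alphabet mapping, padding and names are assumptions for illustration, not the routine from the intro or from ZX-Time; the 3-bit space code, punctuation handling and the case-switch filter are left out):

Code: Select all

# Hypothetical mapping: 0 = space, 1-26 = A-Z (5 bits each, packed MSB-first).
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def pack_5bit(text):
    bits = []
    for ch in text.upper():
        code = ALPHABET.index(ch)                     # punctuation would raise here
        bits.extend((code >> i) & 1 for i in range(4, -1, -1))
    while len(bits) % 8:                              # pad the final byte with zeros
        bits.append(0)
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

print(len(pack_5bit("HELLO WORLD")))                  # 11 chars -> 55 bits -> 7 bytes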
Einar Saukas
Bugaboo
Posts: 3145
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Wed Sep 01, 2021 8:03 pm This time I have achieved a more significant speed optimization, around 2% 8-)
Awesome, thank you!!!
rastersoft
Microbot
Posts: 151
Joined: Mon Feb 22, 2021 3:55 pm

Re: A new data compressor called ZX0

Post by rastersoft »

evilpaul wrote: Thu Sep 02, 2021 8:56 am
rastersoft wrote: Thu Sep 02, 2021 7:50 am Mmm... Is anybody interested in my text compression/decompression routines from Escape from M.O.N.J.A.S.? The output is about 59% of the original size, each string is individually addressable, and it can be decompressed on the fly without needing any kind of buffer. If there is interest, I can extract it and write up some instructions...
I need to do something with text compression for my Zelda-lite game. I have a bunch of ideas, but I would be interested to hear your approach. Maybe open another thread?
My system is based on LZSS (that is, encoding a distance-length pair for each repeated block). Decompression is lightning fast, but compression is another story. In fact, I did some research and there doesn't seem to be an optimal way of doing it. I do it in an iterative fashion, compressing the sentences, reordering them and trying again, keeping the best compression achieved up to that moment. It takes advantage of the fact that ASCII is 7 bits: if a byte has bit 7 set, it and the next byte form a distance-length pair (12 bits for the distance from the beginning of the compressed block, and three bits for the length, from 3 to 10 bytes).

The input is a source code file with each sentence in a DB statement, with a label immediately before each sentence. The output is a source code file too, with the same labels, but the DBs contain the compressed sentences. Each sentence must end in 0.
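As a rough illustration of the decoder side of this scheme, here is a Python sketch (the exact bit layout of the pair and the assumption that matches only ever cover runs of literal bytes are mine, so treat it as the general shape rather than the actual routines):

Code: Select all

# Assumed layout: flag in bit 7 of the first byte, then 12 bits of distance
# (measured from the start of the compressed block) and 3 bits of length.
def decode_sentence(block, start):
    out = bytearray()
    i = start
    while block[i] != 0:                       # each sentence ends in 0
        b = block[i]
        if b & 0x80:                           # distance-length pair
            pair = ((b & 0x7F) << 8) | block[i + 1]
            distance = pair >> 3               # 12-bit offset into the block
            length = (pair & 0x07) + 3         # 3..10 bytes
            out.extend(block[distance:distance + length])
            i += 2
        else:                                  # plain 7-bit ASCII literal
            out.append(b)
            i += 1
    return out.decode("ascii")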
rastersoft
Microbot
Posts: 151
Joined: Mon Feb 22, 2021 3:55 pm

Re: A new data compressor called ZX0

Post by rastersoft »

Alone Coder wrote: Thu Sep 02, 2021 10:09 am
evilpaul wrote: Thu Sep 02, 2021 8:56 am I need to do something with text compression for my Zelda-lite game. I have a bunch of ideas, but I would be interested to hear your approach. Maybe open another thread?
One simple approach is 5 bits per letter and 3 bits for a space. I've used this in a 128-byte intro. However, punctuation requires separate treatment.
Huffman encoding gives better results for larger amounts of data (in ZX-Guide #2 and #3, some of the articles were Huffman encoded, each line individually).
In ZX-Time magazine, all the text was stored at 5 bits per letter with some filtering on the result. One possible filter is to switch letter case after . ! ?
I tried that, but compression was not good.
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

A crazy idea?

If we use a totally negative new offset (forwards), I think it would work, and pretty fast too, and the best thing is that we would stop using the alternate registers.

Code: Select all

; -----------------------------------------------------------------------------
; ZX0 decoder by Einar Saukas
; "Standard" version (69 bytes only)
; -----------------------------------------------------------------------------
; Parameters:
;   HL source address (compressed data)
;   DE destination address (decompressing)
; -----------------------------------------------------------------------------

	....
	....
	....
	
dzx0s_new_offset:
	inc	sp
	inc	sp
	ld	c, $ff
        call    dzx0s_elias_loop        ; obtain negative offset MSB
	inc	b
        ret     z                       ; check end marker
        ld      b, c
	ld      c, (hl)                 ; obtain negative offset LSB
        inc     hl
        rr      b                       ; last offset bit becomes first length bit
        rr      c
        push    bc                      ; preserve new offset
        ld      bc, 1                   ; obtain length
        call    nc, dzx0s_elias_backtrack
        inc     bc
        jr      dzx0s_copy
        
dzx0s_elias:
        inc	c	                ; interlaced Elias gamma coding
dzx0s_elias_loop:
        add     a, a
        jr      nz, dzx0s_elias_skip
        ld      a, (hl)                 ; load another group of 8 bits
        inc     hl
        rla
dzx0s_elias_skip:
        ret     c
dzx0s_elias_backtrack:
        add     a, a
        rl      c
        rl      b
        jr      dzx0s_elias_loop
Is it possible, or is there something wrong?
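As a side note for anyone following the assembly, the interlaced Elias gamma read loop (dzx0s_elias) boils down to a few lines of Python; this mirrors the logic only, not the register usage, and next_bit stands for whatever supplies the next bit from the compressed stream:

Code: Select all

def read_elias(next_bit):
    """Interlaced Elias gamma: start at 1 and interleave control and data bits."""
    value = 1
    while next_bit() == 0:                    # control bit: 0 = more data follows
        value = (value << 1) | next_bit()     # shift in the next data bit
    return value                              # a control bit of 1 ends the number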
catmeows
Manic Miner
Posts: 718
Joined: Tue May 28, 2019 12:02 pm
Location: Prague

Re: A new data compressor called ZX0

Post by catmeows »

rastersoft wrote: Thu Sep 02, 2021 3:50 pm
Alone Coder wrote: Thu Sep 02, 2021 10:09 am One simple approach is 5 bits per letter and 3 bits for a space. I've used this in a 128-byte intro. However, punctuation requires separate treatment.
Huffman encoding gives better results for larger amounts of data (in ZX-Guide #2 and #3, some of the articles were Huffman encoded, each line individually).
In ZX-Time magazine, all the text was stored at 5 bits per letter with some filtering on the result. One possible filter is to switch letter case after . ! ?
I tried that, but compression was not good.
The best simple text compression is Byte Pair Encoding (https://en.wikipedia.org/wiki/Byte_pair_encoding). It is dead simple, it allows decoding just a single message, and it still thinks globally (which is why it usually beats LZ77 variants that compress standalone messages).
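For reference, a compact Python sketch of the idea (illustrative only: the free-code budget, tie-breaking and dictionary format are assumptions, not the tool that produced the log below; the recursive expansion mirrors the Z80 decoder posted further down the thread):

Code: Select all

from collections import Counter

def bpe_compress(data: bytes, free_codes):
    """Repeatedly replace the most frequent byte pair with an unused code."""
    seq = list(data)
    table = {}                                    # code -> (left, right)
    for code in free_codes:
        pairs = Counter(zip(seq, seq[1:]))        # count adjacent byte pairs
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break                                 # nothing left worth replacing
        table[code] = (a, b)
        new_seq, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                new_seq.append(code)              # substitute the pair
                i += 2
            else:
                new_seq.append(seq[i])
                i += 1
        seq = new_seq
    return bytes(seq), table

def bpe_expand(code, table, out):
    """Recursively expand one code back into literal bytes."""
    if code in table:
        left, right = table[code]
        bpe_expand(left, table, out)
        bpe_expand(right, table, out)
    else:
        out.append(code)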

This is how about 9 KB of text (151 messages) from Pirates! packs when transcribed to upper case. In theory, messages with both cases would have a somewhat worse ratio.

Code: Select all

Message count 151
Messages are 9643 bytes long.
Registering literal W
Registering literal E
Registering literal L
Registering literal C
Registering literal O
Registering literal M
Registering literal  
Registering literal T
Registering literal "
Registering literal B
Registering literal A
Registering literal K
Registering literal F
Registering literal G
Registering literal n
Registering literal R
Registering literal U
Registering literal N
Registering literal I
Registering literal H
Registering literal S
Registering literal D
Registering literal P
Registering literal .
Registering literal Y
Registering literal ?
Registering literal ÿ
Registering literal ,
Registering literal :
Registering literal V
Registering literal (
Registering literal 1
Registering literal 5
Registering literal 6
Registering literal 0
Registering literal )
Registering literal 2
Registering literal 4
Registering literal '
Registering literal 8
Registering literal X
Registering literal $
Registering literal J
Registering literal !
Registering literal Q
Registering literal Z
Registering literal -
Registering literal @
Registering literal &
Registering literal i
best available pair with occurency 248 -> 69,32
Registering pair 69, 32. Space left: 205
Encoding
best available pair with occurency 171 -> 32,65
Registering pair 32, 65. Space left: 204
Encoding
best available pair with occurency 166 -> 79,85
Registering pair 79, 85. Space left: 203
And so on till....

Code: Select all

Final size: 4442(was 9643)
Proud owner of Didaktik M
Sol_HSA
Microbot
Posts: 162
Joined: Thu Feb 04, 2021 11:03 am
Location: .fi

Re: A new data compressor called ZX0

Post by Sol_HSA »

catmeows wrote: Mon Sep 06, 2021 5:47 pm The best simple text compression is Byte Pair Encoding (https://en.wikipedia.org/wiki/Byte_pair_encoding). It is dead simple, it allows decoding just a single message, and it still thinks globally (which is why it usually beats LZ77 variants that compress standalone messages).
"Alter Ego" on PC used a similar scheme, but instead of encoding byte pairs to a single byte the unused byte values referred to a string table (which was also stored). Since the amount of text in alter ego was significant, this worked out. So the string table might include things like "The a" or perhaps some frequently used longer words like "probably".

I can see how the byte pair thing is a killer on short strings, though.
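The decoder side of such a string-table scheme is tiny; in Python it might look like the sketch below (the table contents and the BASE value are made-up examples, not Alter Ego's actual data):

Code: Select all

TABLE = ["The a", " the ", "probably", "ing ", " you "]   # example entries only
BASE = 0x80                                                # unused byte values start here

def expand(data: bytes) -> str:
    out = []
    for b in data:
        if b >= BASE:
            out.append(TABLE[b - BASE])    # reference into the stored string table
        else:
            out.append(chr(b))             # plain ASCII literal
    return "".join(out)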
rastersoft
Microbot
Posts: 151
Joined: Mon Feb 22, 2021 3:55 pm

Re: A new data compressor called ZX0

Post by rastersoft »

catmeows wrote: Mon Sep 06, 2021 5:47 pm The best simple text compression is Byte Pair Encoding (https://en.wikipedia.org/wiki/Byte_pair_encoding). It is dead simple, it allows decoding just a single message, and it still thinks globally (which is why it usually beats LZ77 variants that compress standalone messages).

...
The problem with that algorithm is that it is "recursive": you must process the data over and over, replacing one occurrence each pass, and then undo each pass during decompression. Thus you can't really do decompression "on-the-fly".
Einar Saukas
Bugaboo
Posts: 3145
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Mon Sep 06, 2021 3:29 pm A crazy idea?

If we use a totally negative new offset (forwards), I think it would work, and pretty fast too, and the best thing is that we would stop using the alternate registers.

...
Is it possible, or is there something wrong?
This is a great idea!!!

Except the math was wrong, but I think this will work:

Code: Select all

dzx0s_new_offset:
        inc     sp
        inc     sp
        ld      c, $fe
        call    dzx0s_elias_loop        ; obtain negative offset MSB
        inc     c
        ret     z                       ; check end marker
        ld      b, c
Unfortunately it will require modifying the ZX0 format, but it's probably worth it.

Anyway, I will double-check everything and run detailed tests. And if this change is approved, I will also implement a "backwards compatibility mode" so the current format will be supported too.

Thanks again!
catmeows
Manic Miner
Posts: 718
Joined: Tue May 28, 2019 12:02 pm
Location: Prague

Re: A new data compressor called ZX0

Post by catmeows »

rastersoft wrote: Mon Sep 06, 2021 7:19 pm The problem with that algorithm is that it is "recursive": you must process the data over and over, replacing one occurrence each pass, and then undo each pass during decompression. Thus you can't really do decompression "on-the-fly".
That's not true.

Code: Select all

	.MODULE depack_text
	
	;test: 4800 chars packed to 2079 bytes + 512 bytes dictionary -> 54% i.e. 4.3 bits per character

depack_text
	;HL callback
	;BC message id

	ld (_callback + 1), hl	;set callback

	ld hl, messages
	inc bc			;add one 
_find1
	dec bc			;decrease one
	ld a, b
	or c
	jr z, _depack		;message found ?
_find2
	ld a, (hl)		;scan for next message end
	inc hl
	cp 255
	jr nz, _find2
	jr _find1		;when found decrease message counter
_depack	
	ld a, (hl)		;get code
	cp 255			;end of message ?
	ret z			;leave
	push hl
	call _decode
	pop hl
	inc hl
	jr _depack

_decode	
	ld c, a
	ld b, 0
	ld hl, dictionary			
	add hl, bc
	add hl, bc		;find code in dictionary
	ld a, (hl)		;get first
	cp 255			;is it leaf?
	jr nz, _decode1
	inc hl
	ld a, (hl)		;get literal
_callback	
	jp 0			;apply callback
_decode1
	push hl			;not a leaf
	call _decode		;decode first (recursion)
	pop hl			
	inc hl
	ld a, (hl)
	jp _decode		;decode second (recursion) and leave		

And this is tested code - see the screenshot of a working crude menu system: viewtopic.php?p=22356#p22356
The only problem is the depth of the stack: in a pathological case you could go down to level 255. In practice it will never happen. For example, 32 spaces will be packed into 16 double spaces, 16 double spaces into 8 quad spaces, and 8 quad spaces into 4 oct spaces. The actual tree is pretty shallow: digraphs are level two, trigraphs level three, short common words level four, etc. And if you are paranoid, you can always let the compressor check the tree depth (for example when working on a console like the A2600, where you really need to keep an eye on the stack depth).

Basically, when a message is long enough to cause problems with the stack or even with performance, it is usually also long enough to pass the threshold where LZ would provide a better ratio. But here we are speaking about perhaps 1 KB+ texts, not about three-liners.
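A compressor-side depth check like the one mentioned above could look like this Python sketch, assuming the same dictionary layout as the Z80 decoder (two bytes per code, with a first byte of 255 marking a leaf that holds a literal):

Code: Select all

def code_depth(dictionary: bytes, code: int) -> int:
    """Worst-case expansion nesting for a single code."""
    first = dictionary[2 * code]
    if first == 255:
        return 1                          # leaf: a single literal
    second = dictionary[2 * code + 1]
    return 1 + max(code_depth(dictionary, first),
                   code_depth(dictionary, second))

def max_depth(dictionary: bytes, used_codes) -> int:
    """Reject dictionaries that would nest (and recurse) too deeply."""
    return max(code_depth(dictionary, c) for c in used_codes)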
Proud owner of Didaktik M
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

You're right, brilliant!!

This means that B doesn't matter and we can optimize one more byte? :o

Code: Select all

dzx0s_new_offset:
	pop	bc
        ld      c, $fe
        call    dzx0s_elias_loop        ; obtain negative offset MSB
        inc     c
        ret     z                       ; check end marker
        ld      b, c
Einar Saukas
Bugaboo
Posts: 3145
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Awesome!
rastersoft
Microbot
Posts: 151
Joined: Mon Feb 22, 2021 3:55 pm

Re: A new data compressor called ZX0

Post by rastersoft »

catmeows wrote: Mon Sep 06, 2021 9:17 pm
rastersoft wrote: Mon Sep 06, 2021 7:19 pm The problem with that algorithm is that it is "recursive": you must process the data over and over, replacing one occurrence each pass, and then undo each pass during decompression. Thus you can't really do decompression "on-the-fly".
That's not true.
...
And this is tested code - see the screenshot of a working crude menu system: viewtopic.php?p=22356#p22356
The only problem is the depth of the stack: in a pathological case you could go down to level 255. In practice it will never happen. For example, 32 spaces will be packed into 16 double spaces, 16 double spaces into 8 quad spaces, and 8 quad spaces into 4 oct spaces.
Mmmm... I see what you mean... but precisely those pathological (and not so pathological) cases are the ones that worry me. In my case I had very little stack space because I used threads, each one with its own stack, so I had to be very careful about how much it grew.
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

This could be the definitive standard forwards version (if the modifications work in the compressor).
It still occupies the same number of bytes as before (69, magic number :mrgreen:), but takes advantage of the speed optimizations of the standard backwards version.

This way the forwards version is slightly faster than the backwards version, I think.

Code: Select all

; -----------------------------------------------------------------------------
; ZX0 decoder by Einar Saukas
; "Standard" version (69 bytes only)
; -----------------------------------------------------------------------------
; Parameters:
;   HL: source address (compressed data)
;   DE: destination address (decompressing)
; -----------------------------------------------------------------------------

dzx0_standard:
        ld      bc, $ffff               ; preserve default offset 1
        push    bc
        inc     bc
        ld      a, $80
dzx0s_literals:
	inc	c
        call    dzx0s_elias             ; obtain length
        ldir                            ; copy literals
        add     a, a                    ; copy from last offset or new offset?
        jr      c, dzx0s_new_offset
	inc	c
        call    dzx0s_elias             ; obtain length
dzx0s_copy:
        ex      (sp), hl                ; preserve source, restore offset
        push    hl                      ; preserve offset
        add     hl, de                  ; calculate destination - offset
        ldir                            ; copy from offset
        pop     hl                      ; restore offset
        ex      (sp), hl                ; preserve offset, restore source
        add     a, a                    ; copy from literals or new offset?
        jr      nc, dzx0s_literals
dzx0s_new_offset:
	pop	bc
	ld	c, $fe
        call    dzx0s_elias             ; obtain offset MSB
	inc	c
        ret     z                       ; check end marker
        ld      b, c
        ld      c, (hl)                 ; obtain offset LSB
        inc     hl
        rr      b                       ; last offset bit becomes first length bit
        rr      c
        push    bc                      ; preserve new offset
        ld      bc, 1                   ; obtain length
        call    nc, dzx0s_elias_backtrack
        inc     bc
        jr      dzx0s_copy

dzx0s_elias_backtrack:
        add     a, a
        rl      c
        rl      b
dzx0s_elias:
        add     a, a			; interlaced Elias gamma coding
        jr      nz, dzx0s_elias_skip
        ld      a, (hl)                 ; load another group of 8 bits
        inc     hl
        rla
dzx0s_elias_skip:
        jr	nc, dzx0s_elias_backtrack
	ret
; -----------------------------------------------------------------------------
evilpaul
Manic Miner
Posts: 244
Joined: Tue Jul 28, 2020 8:04 am
Location: Espoo, Finland

Re: A new data compressor called ZX0

Post by evilpaul »

Some "progression" on the Python port. Timings for various versions (from my laptop) compressing 48.rom:

My port running on Python3: 93 seconds
My port running on PyPy3: 4 seconds
Original compiled C version: 3 seconds

Massive speed-up achieved simply by using PyPy3 (an optimised Python3 interpreter) instead of the stock interpreter. I'm happy with these timings now. I'll tidy up the code at (waves hands indistinctly) some time in the future and share it.
Einar Saukas
Bugaboo
Posts: 3145
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

OK, it's done! The latest optimizations are now available in the official ZX0 repository:

https://github.com/einar-saukas/ZX0

IMPORTANT: ZX0 version 2 now uses a different compressed file format (technically, the value bits in Elias(MSB(offset)) are now inverted). This new format allows ZX0 decompressors in Z80 assembly to be slightly smaller and slightly faster. I believe this change will also benefit ZX0 decompressors on most other platforms. The old format from ZX0 version 1 is still supported by the compressor (using option "-c"), but it is now deprecated. If you are the author of a ZX0 decompressor port to another platform, please consider upgrading to ZX0 version 2!
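To illustrate what "value bits inverted" means here, a small Python sketch of an interlaced Elias gamma writer with an optional inversion flag (byte packing and the rest of the stream layout are omitted, and the MSB-first bit order is my reading of the decompressor, so treat this as an illustration rather than a format reference):

Code: Select all

def elias_bits(value: int, invert: bool = False):
    """Yield the interleaved control/data bits for value >= 1."""
    for d in bin(value)[3:]:                  # binary digits after the leading 1
        yield 0                               # control bit: more data follows
        bit = int(d)
        yield bit ^ 1 if invert else bit      # version 2 stores these data bits inverted
    yield 1                                   # control bit: end of number

print(list(elias_bits(5)))                    # [0, 0, 0, 1, 1]  (5 = binary 101)
print(list(elias_bits(5, invert=True)))       # [0, 1, 0, 0, 1]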

I have also updated the integrated ZX0+RCS decompressors to use the new format, and submitted the changes to the Boriel ZX BASIC library. I will post about it when the pull request gets approved.

If there are any questions, please let me know!
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

Einar Saukas wrote: Thu Sep 09, 2021 6:49 pm OK, it's done! The latest optimizations are now available in the official ZX0 repository:
Bravo!!!

I'm really glad it works :D

Congratulations.

EDIT: You didn't like my last forward standard optimization? In exchange for one byte you get 2% extra speed. I found it interesting enough 8-)
Einar Saukas
Bugaboo
Posts: 3145
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Thu Sep 09, 2021 8:36 pm Congratulations.
Thanks to your extremely creative suggestion!

You are properly mentioned in the credits :)

Urusergi wrote: Thu Sep 09, 2021 8:36 pm EDIT: You didn't like my last forward standard optimization? In exchange for one byte you get 2% extra speed. I found it interesting enough 8-)
Sorry, I forgot to post a reply to your last post.

Problem is, the "standard" decompressor is intended to be as small as possible. The idea of sacrificing size for better performance makes perfect sense, but it's exactly the reason we have other decompressor variants.
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

Einar Saukas wrote: Thu Sep 09, 2021 9:12 pm Thanks to your extremely creative suggestion!

You are properly mentioned in the credits :)
Thank you very much :D
Einar Saukas wrote: Thu Sep 09, 2021 9:12 pm Sorry, I forgot to post a reply to your last post.

Problem is, the "standard" decompressor is intended to be as small as possible. The idea of sacrificing size for better performance makes perfect sense, but it's exactly the reason we have other decompressor variants.
No problem, I understand you perfectly. I much prefer optimizing for size rather than for speed (I'm really terrible at optimizing for speed :? that's why I don't dare touch the "mega" .asm files).
Anyway, the new standard decompressor is about 1% faster than the previous one 8-)
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

Another little optimization. This time for the fast 2.0 routine:

Code: Select all

        ld      b, &ff                  ; the top byte of the offset is always $FF
At this point BC is always 0, so it can safely be changed to:

Code: Select all

	dec	b
On the other hand we have a lonely "jp":

Code: Select all

        jp      nz, CopyMatch1
I think it was a "jr" in the past, but the source code was increasing in size until reaching out the range and was replaced by a "jp"? because all my tests indicate a slight improvement in speed now that we can change it to "jr" again.

There is also a misspelled label: LongerOffets

Code: Select all

        jr      nc, LongerOffets
        jr      nz, ShorterOffsets      ; NZ after NC == "confirmed C"
        
        ld      a, (hl)                 ; reload bits
        inc     hl
        rla

        jr      c, ShorterOffsets

LongerOffets:
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

I can't find any more optimizations, so this afternoon I started thinking about a possible evolution of the format, and again I had a new crazy idea.

What if we don't discard the last offset and instead use the stack as a combined stack & buffer? I'm not sure whether this can improve compression, but I had nothing else to do :P so I created this forward standard decompressor mod:

Code: Select all

; -----------------------------------------------------------------------------
; ZXX decoder by Einar Saukas & Urusergi
; "Standard" version (93 bytes only)
; -----------------------------------------------------------------------------
; Parameters:
;   HL: source address (compressed data)
;   DE: destination address (decompressing)
; -----------------------------------------------------------------------------
dzxx_standard:
	pop	iy			; safeguard the return of the routine
        ld      bc, $ffff               ; preserve default offset 1
        push    bc
        inc     bc
        ld      a, $80
dzxxs_literals:
        call    dzxxs_elias             ; obtain length
        ldir                            ; copy literals
        add     a, a                    ; copy from old offset or new offset?
        jr      c, dzxxs_new_offset

        call    dzxxs_elias             ; obtain buffer index
	push	bc
	pop	ix
	add	ix, ix			; x2 being 16 bits
	add	ix, sp			; ix points to old offset LSB
	ld	c, (ix + 0)
	ld	b, (ix + 1)
	push	bc			; buffer updated
	ld	bc, 1
        call    dzxxs_elias             ; obtain length
	
dzxxs_copy:
        ex      (sp), hl                ; preserve source, restore offset
        push    hl                      ; preserve offset
        add     hl, de                  ; calculate destination - offset
        ldir                            ; copy from offset
        pop     hl                      ; restore offset
        ex      (sp), hl                ; preserve offset, restore source
        add     a, a                    ; copy from literals or new offset?
        jr      nc, dzxxs_literals
dzxxs_new_offset:
        ld      c, $fe                  ; prepare negative offset
        call    dzxxs_elias_loop        ; obtain offset MSB
        inc     c
        jr      z, dzxx_exit            ; check end marker
        ld      b, c
        ld      c, (hl)                 ; obtain offset LSB
        inc     hl
        rr      b                       ; last offset bit becomes first length bit
        rr      c
        push    bc                      ; preserve new offset
        ld      bc, 1                   ; obtain length
        call    nc, dzxxs_elias_backtrack
        inc     bc
        jr      dzxxs_copy
dzxxs_elias:
        inc     c                       ; interlaced Elias gamma coding
dzxxs_elias_loop:
        add     a, a
        jr      nz, dzxxs_elias_skip
        ld      a, (hl)                 ; load another group of 8 bits
        inc     hl
        rla
dzxxs_elias_skip:
        ret     c
dzxxs_elias_backtrack:
        add     a, a
        rl      c
        rl      b
        jr      dzxxs_elias_loop
dzxx_exit:
	push	iy
	ret
; -----------------------------------------------------------------------------
Naturally, the stack will grow quite a bit, so the hypothetical compressor should report the final size of the buffer so it can be kept under control.
 
Einar Saukas
Bugaboo
Posts: 3145
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Sat Sep 18, 2021 11:26 pm I can't find any more optimizations, so this afternoon I started thinking about a possible evolution of the format, and again I had a new crazy idea.
That's a very original idea, but probably too crazy. :)

I'm not convinced it would give good enough compression, and the stack growth could make it impractical.

But if you want to test it, go ahead! Except I suggest replacing POP IY ... PUSH IY/RET with LD (NN),SP ... LD SP,0/RET, otherwise the caller would get a stack full of garbage.
Einar Saukas
Bugaboo
Posts: 3145
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Tue Sep 14, 2021 9:33 pm Another little optimization. This time for the fast 2.0 routine:

Code: Select all

        ld      b, &ff                  ; the top byte of the offset is always $FF
At this point BC is always 0
There's an execution path with LDIR followed by LDI, so I think you could have BC=$ffff at this point... Since the "fast" decompressor is provided by introspec and uniabis, it's up to them to evaluate your suggestions.

Anyway thanks again!
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

Einar Saukas wrote: Sun Sep 19, 2021 5:51 am That's a very original idea, but probably too crazy. :)
:lol: Yeah, sometimes I get so bored that these things come to me. However, to test the idea I would need to learn C programming first :oops:
Einar Saukas wrote: Sun Sep 19, 2021 6:02 am There's an execution path with LDIR followed by LDI, so I think you could have BC=$ffff at this point... Since the "fast" decompressor is provided by introspec and uniabis, it's up to them to evaluate your suggestions.

Anyway thanks again!
Yes, but there is an <inc c> which ensures that the last LDI decrements BC to 0

Code: Select all

        ; because BC>=3-1, we can do 2 x LDI safely
        ldi
        ldir
        inc     c
        ldi
        pop     hl                      ; restore source

        ; after a match you can have either
        ; 0 + <elias length> = run of literals, or
        ; 1 + <elias offset msb> + [7-bits of offset lsb + 1-bit of length] + <elias length> = another match
AfterMatch3:
        add     a, a
        jr      c, UsualMatch
Einar Saukas
Bugaboo
Posts: 3145
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Sun Sep 19, 2021 8:31 am Yes, but there is an <inc c> which ensures that the last LDI decrements BC to 0
You are right! Sorry I missed it.