A new data compressor called ZX0

The place for codemasters or beginners to talk about programming any language for the Spectrum.
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Sat Sep 18, 2021 11:26 pmthis afternoon I started to think about a possible evolution of the format
When I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average). The problem is, an optimal compressor for this more complex format required so much time and memory that it wasn't able to compress about 1/3 of the files I tested. There was no point in using a non-optimal compressor either, since it typically lost more than .5% in compression ratio. It was simply not worth it.
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

Einar Saukas wrote: Tue Sep 21, 2021 11:45 pmWhen I was working on ZX0, I also created another more complex format that I used for comparison.
I'm so curious! could you please briefly explain that format? thank you.

It's true that my last code made the stack grow a lot but I've improved it. Now it only grows when a "new offset" occurs, using "move to front" inside "old offset" method. Anyway, this is nothing serious, just for my entertainment.
jimmy
Drutt
Posts: 31
Joined: Sun Nov 24, 2019 9:06 pm

Re: A new data compressor called ZX0

Post by jimmy »

Einar Saukas wrote: Tue Sep 21, 2021 11:45 pm When I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average).
Yes please do tell us more about this - I'm always on the lookout for better compression! I have some games which are just slightly too big for a 16K cartridge. I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!

The problem is that Shrinkler takes around 75 seconds to decompress before starting the game :-(
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

OK, give me a few days to put together a proper release.
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

Einar Saukas wrote: Thu Sep 23, 2021 2:28 pm OK, give me a few days to put together a proper release.
Bets are allowed! ;)

We are intrigued. Which format the compressor will use? Is it possible that the new compressor uses more than one "last offset"?
Yesterday I found out that lzma uses the last 4 offsets, so I wasn't really wrong with my latest idea 8-)
With a Z80 we can get up to 3 offsets easily, using the stack, ix & iy with ex (sp),ix or ex (sp),iy for the "move to front".
 
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

The "experimental" ZX5 compressor is now available here:

https://github.com/einar-saukas/ZX5

If you thought ZX0 compressor was too slow, wait until you try ZX5! :)

The good news is that ZX5 decompressor is almost as fast as ZX0.
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

Einar Saukas wrote: Tue Oct 05, 2021 12:54 am The "experimental" ZX5 compressor is now available
Hurrah!

I have won my own bet!! :lol:
Several things catch my attention, why did you move the first lenght bit of the new offset block? is the offset limit still 32Kb or does it improve?
and was there no way around the use of alternative registers? for example using index registers.
Did you try an intermediate version with only two last offsets?

I'm going to test the compression power 8-)

EDIT: It's very good! 339 bytes less in my test file :D
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Tue Oct 05, 2021 9:58 pm why did you move the first lenght bit of the new offset block?
When reading any Elias coded number, my decompressors only need to check "refill" (i.e if it's time to load another group of 8 bits) when loading control bits (1st, 3rd, 5th, etc), not value bits (2nd, 4th, etc). Storing first length bit separately is needed to enforce this rule.

Urusergi wrote: Tue Oct 05, 2021 9:58 pm is the offset limit still 32Kb or does it improve?
It's approximately 64Kb now, because LSB(offset) has 8 bits.

Urusergi wrote: Tue Oct 05, 2021 9:58 pm and was there no way around the use of alternative registers? for example using index registers.
Alternate registers allow me to use EX DE, HL :)
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Urusergi wrote: Tue Oct 05, 2021 9:58 pm Did you try an intermediate version with only two last offsets?
Yes. It didn't compress as much as ZX5.

There are lots of ways this data format could be changed. Perhaps a different number of bits in LSB(offset). Perhaps "copy from new offset" storing Elias(length) instead of Elias(length-1), thus accepting length=1 too. Perhaps support for reusing last 4 offsets, adding an extra bit to differentiate between 3rd and 4th last offset... After testing lots of combinations, I believe ZX5 format was the best tradeoff between compression ratio, decompressor size and decompression speed, when reusing multiple offsets.
introspec
Dizzy
Posts: 53
Joined: Wed Jan 02, 2019 2:26 pm
Location: London
Contact:

Re: A new data compressor called ZX0

Post by introspec »

Einar Saukas wrote: Tue Sep 21, 2021 11:45 pmWhen I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average). The problem is, an optimal compressor for this more complex format required so much time and memory that it wasn't able to compress about 1/3 of the files I tested. There was no point in using a non-optimal compressor either, since it typically lost more than .5% in compression ratio. It was simply not worth it.
I am not seeing the 0.5% benefit on the files I run it on thus far. Most files tend to be compressed better by ZX0. However, I won't be able to be more specific than this, because the compressor chokes on some files from my corpus (it reports out of memory error).
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

introspec wrote: Wed Oct 06, 2021 9:45 am I am not seeing the 0.5% benefit on the files I run it on thus far. Most files tend to be compressed better by ZX0. However, I won't be able to be more specific than this, because the compressor chokes on some files from my corpus (it reports out of memory error).
I'm not surprised.

You are getting memory errors because the ZX5 compressor needs to allocate too much memory. This is the main reason ZX5 documentation recommends using ZX0 instead :)

ZX5 compression works slightly better on certain kinds of files (such as SCR files) and ZX0 compression works slightly better on others (such as RCS files). The results heavily depend on the composition of your test corpus. In my own tests, ZX5 runs out of memory in 25% of files, and compresses better than ZX0 in 50% of cases. For instance ZX0 can compress this Gigascreen image from 13824 to 7760 bytes, and ZX5 compresses it to 7247 bytes, so that's more than 3% better.

However the difference between ZX0 and ZX5 is usually very small. ZX5 results are kinda hit-and-miss. Overall it's better to keep using ZX0, and only trying ZX5 when there's a very specific file where ZX0 compression is not enough. I wasn't even planning to release ZX5, this was the post that convinced me otherwise:
jimmy wrote: Wed Sep 22, 2021 3:53 pm I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!
User avatar
NEO SPECTRUMAN
Microbot
Posts: 110
Joined: Tue Jan 26, 2021 10:27 pm

Re: A new data compressor called ZX0

Post by NEO SPECTRUMAN »

Einar Saukas wrote: Wed Oct 06, 2021 2:46 pm However the difference between ZX0 and ZX5 is usually very small.
I'm did not RTFM completely
it's possible to use other not packed data as "dictionary" for packing\unpacking?
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

NEO SPECTRUMAN wrote: Wed Oct 06, 2021 5:54 pm I'm did not RTFM completely
it's possible to use other not packed data as "dictionary" for packing\unpacking?
Yes. Read the manual section called "COMPRESSING WITH PREFIX".

All my recent compressors support this feature.
jimmy
Drutt
Posts: 31
Joined: Sun Nov 24, 2019 9:06 pm

Re: A new data compressor called ZX0

Post by jimmy »

Einar Saukas wrote: Wed Oct 06, 2021 2:46 pm I wasn't even planning to release ZX5, this was the post that convinced me otherwise:
jimmy wrote: Wed Sep 22, 2021 3:53 pm I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!
Thank you!

Unfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...

Can the algorithm be made to use multiple threads?
Urusergi
Drutt
Posts: 16
Joined: Sun Aug 15, 2021 3:46 pm
Location: Madrid, Spain

Re: A new data compressor called ZX0

Post by Urusergi »

jimmy wrote: Tue Oct 12, 2021 9:56 amUnfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...
Uaaaaa! 37 hours!!! :shock: What game are you compressing?
 
User avatar
arkannoyed
Manic Miner
Posts: 436
Joined: Mon Feb 05, 2018 9:56 am
Location: Northamptonshire

Re: A new data compressor called ZX0

Post by arkannoyed »

Hey, 37 hours is nothing. Try 4 days for me 😂

I tried to compress the main code block from Xevious. Previous tests got it down from 40270 bytes to;

10051 with zx0
9937. (IIRC) with shrinkler.

I was hoping zx5 would achieve better, but still waiting!

It has compressed a few SCRs to smaller than zx0 and even with the larger decompressor, it’s still a small improvement, which is lovely.
User avatar
Alessandro
Dynamite Dan
Posts: 1910
Joined: Wed Nov 15, 2017 11:10 am
Location: Messina, Italy
Contact:

Re: A new data compressor called ZX0

Post by Alessandro »

I used ZX5 to compress the real scenario test file I employ to make comparisons between data compressors, i.e. the first level of Funky Fungus Reloaded. It took 8'12" to compress it from 25481 to 14296 bytes. With ZX0, the same file is compressed to 14382 bytes in just 13".
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

jimmy wrote: Tue Oct 12, 2021 9:56 am Unfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...
Ouch!

jimmy wrote: Tue Oct 12, 2021 9:56 am Can the algorithm be made to use multiple threads?
Sure. However memory management in current C implementation is based on reference counting, which would be inefficient across threads. This will require another language with GC and lightweight threads, preferably also resizeable hash tables. I will probably try Kotlin, unless anyone has a better idea?
User avatar
Einar Saukas
Bugaboo
Posts: 3097
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Alessandro wrote: Tue Oct 12, 2021 7:00 pmWith ZX0, the same file is compressed to 14382 bytes in just 13".
I confess I only released ZX5 so everyone would be happy about ZX0 compression speed :)
jimmy
Drutt
Posts: 31
Joined: Sun Nov 24, 2019 9:06 pm

Re: A new data compressor called ZX0

Post by jimmy »

Einar Saukas wrote: Tue Oct 12, 2021 10:25 pm Sure. However memory management in current C implementation is based on reference counting, which would be inefficient across threads. This will require another language with GC and lightweight threads, preferably also resizeable hash tables. I will probably try Kotlin, unless anyone has a better idea?
Any language will do as long as it results in a working Win32 executable!

As I'm still waiting for the ZX5 output (43 hours so far, the dots have made it across as far as the "n" in "Einar") I thought people might like to see a table of games that I investigated making into a ZX IF/2 Interface cartridge but then gave up due to size issues. You can see just how much better ZX0 is than ZX7. All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.

Code: Select all

		orig	zx7	zx0	shrinkler
Arkanoid 2	33024	17407	16384	15536
Bargain Basemt	32700		21105	20108
Bruce Lee	33239	18433	17369	16404
Chessmaster 2k	34814		14775	14368
Deathstar Int	41541		21405	20052
Everyones Wally*32809		20144	19044
Exploding Fist	38915		22890	21916
Finders Keepers	36860		18831	17972
Flunky		38650	25674	23706	22884
Fox fights bk	40224		20660	19688
Ghostbusters	32064	19717	18636	17976
Gilligan's Gold	39936	20719	18968	18088
Greg Loses Clk	32900	22948	21289	20588
Hampstead	33957		17787	16460
Herbert Dummy* 	32191		20327
Hobbit		40000		27347
Hydrofool	36466		22713	21404
Iball		41436		23090	21756
Iball 2		40530		23534	22000
Knightlore*	30720	19490	18228	17064
Lunar Jetman*	31745		19509	18940
Matchday 2	34565		23526	22808
Max desktop	32099	19850	18474	17680
Minder		39001		23301	22284
Mordon's Quest	41204		28565	26988
Mugsy		40960		28582	27536
Night Gunner	40960	25971	24302	23472
Pippo		34151	34880!	30839	30780
Raid over Mscw	40282	23928	21664	20764
Roland Rat Race	36890	17955	16501	15664
Sabre Wulf*	35936		23632	22168
Sigma 7		33024	24650	22977	21904
Starion		45024		21345	20848
Starstrike	41024		20931	19148
Tapper		39724		22813	22624
Tau Ceti	40520		22899	21756
The Eye of Bain	32340	23869	22285	21212
Through TrapD	32860		20824	20112
Thrust		40351	19371	17159	16252
Trapdoor	35600		22244	21412
Travel Trashman	32768		19791	18948
Turbo Esprit	38351		27680	26636
Virus		35134	20995	18734	17920
Wheelie		39037		18066	17196
Alone Coder
Manic Miner
Posts: 401
Joined: Fri Jan 03, 2020 10:00 am

Re: A new data compressor called ZX0

Post by Alone Coder »

jimmy wrote: Wed Oct 13, 2021 10:42 am All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.
Can you please upload these files?
introspec
Dizzy
Posts: 53
Joined: Wed Jan 02, 2019 2:26 pm
Location: London
Contact:

Re: A new data compressor called ZX0

Post by introspec »

jimmy wrote: Wed Oct 13, 2021 10:42 am I thought people might like to see a table of games that I investigated making into a ZX IF/2 Interface cartridge but then gave up due to size issues. You can see just how much better ZX0 is than ZX7.

Code: Select all

		orig	zx7	zx0	shrinkler
Arkanoid 2	33024	17407	16384	15536
Bargain Basemt	32700		21105	20108
Bruce Lee	33239	18433	17369	16404
Chessmaster 2k	34814		14775	14368
Deathstar Int	41541		21405	20052
Everyones Wally*32809		20144	19044
Exploding Fist	38915		22890	21916
Finders Keepers	36860		18831	17972
Flunky		38650	25674	23706	22884
Fox fights bk	40224		20660	19688
Ghostbusters	32064	19717	18636	17976
Gilligan's Gold	39936	20719	18968	18088
Greg Loses Clk	32900	22948	21289	20588
Hampstead	33957		17787	16460
Herbert Dummy* 	32191		20327
Hobbit		40000		27347
Hydrofool	36466		22713	21404
Iball		41436		23090	21756
Iball 2		40530		23534	22000
Knightlore*	30720	19490	18228	17064
Lunar Jetman*	31745		19509	18940
Matchday 2	34565		23526	22808
Max desktop	32099	19850	18474	17680
Minder		39001		23301	22284
Mordon's Quest	41204		28565	26988
Mugsy		40960		28582	27536
Night Gunner	40960	25971	24302	23472
Pippo		34151	34880!	30839	30780
Raid over Mscw	40282	23928	21664	20764
Roland Rat Race	36890	17955	16501	15664
Sabre Wulf*	35936		23632	22168
Sigma 7		33024	24650	22977	21904
Starion		45024		21345	20848
Starstrike	41024		20931	19148
Tapper		39724		22813	22624
Tau Ceti	40520		22899	21756
The Eye of Bain	32340	23869	22285	21212
Through TrapD	32860		20824	20112
Thrust		40351	19371	17159	16252
Trapdoor	35600		22244	21412
Travel Trashman	32768		19791	18948
Turbo Esprit	38351		27680	26636
Virus		35134	20995	18734	17920
Wheelie		39037		18066	17196
Why do you use Shrinkler as a point of reference? It is slower to decompress by 2-3 orders of magnitude, to the extent that it is about the same speed as loading uncompressed data from tape.
User avatar
Bedazzle
Manic Miner
Posts: 305
Joined: Sun Mar 24, 2019 9:03 am

Re: A new data compressor called ZX0

Post by Bedazzle »

introspec wrote: Wed Oct 13, 2021 11:09 am Why do you use Shrinkler as a point of reference?
Because of size?
jimmy
Drutt
Posts: 31
Joined: Sun Nov 24, 2019 9:06 pm

Re: A new data compressor called ZX0

Post by jimmy »

introspec wrote: Wed Oct 13, 2021 11:09 am Why do you use Shrinkler as a point of reference?
Because we need a theoretical minimum size - it's not going to be zero or one, and not even 1K in these cases. As Shrinkler seems to offer the best compression at present I thought it was a useful reference. I could've used .zip, but by sticking to Shrinkler I know they can be decompressed on a Spectrum with a small piece of code.

It is interesting that for complex data (eg:Pippo) there isn't much difference between ZX0 and Shrinkler, but for adventures such as Eye Of Bain, Hampstead and Hobbit the Shrinkler files are at least 1K smaller. There might be room for improvement here.

I can't update the original table, so I've posted an updated table below. Some more games should've been marked as decrypted. I've included Atic Atac, added the missing shrinkler values and included more zx7 values.

Code: Select all

		orig	zx7	zx0	shrinkler
Arkanoid 2	33024	17407	16384	15536
Atic Atac*	30209	20039	18400	17240
Bargain Basemt	32700	22495	21105	20108
Bruce Lee	33239	18433	17369	16404
Chessmaster 2k	34814	16069	14775	14368
Deathstar Int	41541	24060	21405	20052
Everyones Wally*32809	21446	20144	19044
Exploding Fist	38915	26036	22890	21916
Finders Keepers	36860	21777	18831	17972
Flunky*		38650	25674	23706	22884
Fox fights bk	40224	22269	20660	19688
Ghostbusters	32064	19717	18636	17976
Gilligan's Gold	39936	20719	18968	18088
Greg Loses Clk	32900	22948	21289	20588
Hampstead	33957	19653	17787	16460
Herbert Dummy* 	32191	21763	20327	19252
Hobbit		40000	28911	27347	26076
Hydrofool	36466		22713	21404
Iball		41436	24954	23090	21756
Iball 2		40530	25304	23534	22000
Knightlore*	30720	19490	18228	17064
Lunar Jetman*	31745	21082	19509	18940
Matchday 2	34565		23526	22808
Max desktop	32099	19850	18474	17680
Minder		39001	25111	23301	22284
Mordon's Quest	41204	29799	28565	26988
Mugsy		40960	30459	28582	27536
Night Gunner	40960	25971	24302	23472
Pippo		34151	34880!	30839	30780
Raid over Mscw	40282	23928	21664	20764
Roland Rat Race	36890	17955	16501	15664
Sabre Wulf*	35936	25318	23632	22168
Sigma 7		33024	24650	22977	21904
Starion		45024		21345	20848
Starstrike	41024		20931	19148
Tapper		39724		22813	22624
Tau Ceti	40520		22899	21756
The Eye of Bain	32340	23869	22285	21212
Through TrapD	32860		20824	20112
Thrust		40351	19371	17159	16252
Trapdoor*	35600	23836	22244	21412
Travel Trashman	32768		19791	18948
Turbo Esprit	38351	29361	27680	26636
Virus		35134	20995	18734	17920
Wheelie		39037	19074	18066	17196
jimmy
Drutt
Posts: 31
Joined: Sun Nov 24, 2019 9:06 pm

Re: A new data compressor called ZX0

Post by jimmy »

Alone Coder wrote: Wed Oct 13, 2021 10:50 am
jimmy wrote: Wed Oct 13, 2021 10:42 am All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.
Can you please upload these files?
Upload all those denenienced titles? :shock: Sorry I can't do that.
However you can recreate them just by loading the game in and stopping it just before it starts (or just after it decrypts). As an example I'll show you how I obtained a decrypted Flunky binary:
  • The game has a standard loader screen$ and a hyperloader for the main code.
  • Place a breakpoint at $FDE8 and start loading the game. It should stop just before it starts the hyperloader. At this point IX=$6D60 and DE=$8FF2 so we know where the game loads and how big it is.
  • Place another breakpoint at $FE70 - this should trigger after the game finishes loading. At this point the binary is still encoded.
  • If you look around $FEA9 you can see the decryptor routine (it modifies memory starting at $6D60 and is $8FF2 bytes long)
  • If you place a final breakpoint at $FEB8 this will trigger once decryption has finished.
  • The decrypted bytes can now be save as "FLUNKY" CODE 28000,36850
  • If you use zx0,zx7 or shrinkler 4.6 (without parity) you should see the same sizes I listed in the table.
Post Reply