Page 9 of 12

Re: A new data compressor called ZX0

Posted: Tue Sep 21, 2021 11:45 pm
by Einar Saukas
Urusergi wrote: Sat Sep 18, 2021 11:26 pmthis afternoon I started to think about a possible evolution of the format
When I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average). The problem is, an optimal compressor for this more complex format required so much time and memory that it wasn't able to compress about 1/3 of the files I tested. There was no point in using a non-optimal compressor either, since it typically lost more than .5% in compression ratio. It was simply not worth it.

Re: A new data compressor called ZX0

Posted: Wed Sep 22, 2021 2:41 pm
by Urusergi
Einar Saukas wrote: Tue Sep 21, 2021 11:45 pmWhen I was working on ZX0, I also created another more complex format that I used for comparison.
I'm so curious! could you please briefly explain that format? thank you.

It's true that my last code made the stack grow a lot but I've improved it. Now it only grows when a "new offset" occurs, using "move to front" inside "old offset" method. Anyway, this is nothing serious, just for my entertainment.

Re: A new data compressor called ZX0

Posted: Wed Sep 22, 2021 3:53 pm
by jimmy
Einar Saukas wrote: Tue Sep 21, 2021 11:45 pm When I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average).
Yes please do tell us more about this - I'm always on the lookout for better compression! I have some games which are just slightly too big for a 16K cartridge. I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!

The problem is that Shrinkler takes around 75 seconds to decompress before starting the game :-(

Re: A new data compressor called ZX0

Posted: Thu Sep 23, 2021 2:28 pm
by Einar Saukas
OK, give me a few days to put together a proper release.

Re: A new data compressor called ZX0

Posted: Sat Sep 25, 2021 6:08 pm
by Urusergi
Einar Saukas wrote: Thu Sep 23, 2021 2:28 pm OK, give me a few days to put together a proper release.
Bets are allowed! ;)

We are intrigued. Which format the compressor will use? Is it possible that the new compressor uses more than one "last offset"?
Yesterday I found out that lzma uses the last 4 offsets, so I wasn't really wrong with my latest idea 8-)
With a Z80 we can get up to 3 offsets easily, using the stack, ix & iy with ex (sp),ix or ex (sp),iy for the "move to front".
 

Re: A new data compressor called ZX0

Posted: Tue Oct 05, 2021 12:54 am
by Einar Saukas
The "experimental" ZX5 compressor is now available here:

https://github.com/einar-saukas/ZX5

If you thought ZX0 compressor was too slow, wait until you try ZX5! :)

The good news is that ZX5 decompressor is almost as fast as ZX0.

Re: A new data compressor called ZX0

Posted: Tue Oct 05, 2021 9:58 pm
by Urusergi
Einar Saukas wrote: Tue Oct 05, 2021 12:54 am The "experimental" ZX5 compressor is now available
Hurrah!

I have won my own bet!! :lol:
Several things catch my attention, why did you move the first lenght bit of the new offset block? is the offset limit still 32Kb or does it improve?
and was there no way around the use of alternative registers? for example using index registers.
Did you try an intermediate version with only two last offsets?

I'm going to test the compression power 8-)

EDIT: It's very good! 339 bytes less in my test file :D

Re: A new data compressor called ZX0

Posted: Tue Oct 05, 2021 10:56 pm
by Einar Saukas
Urusergi wrote: Tue Oct 05, 2021 9:58 pm why did you move the first lenght bit of the new offset block?
When reading any Elias coded number, my decompressors only need to check "refill" (i.e if it's time to load another group of 8 bits) when loading control bits (1st, 3rd, 5th, etc), not value bits (2nd, 4th, etc). Storing first length bit separately is needed to enforce this rule.

Urusergi wrote: Tue Oct 05, 2021 9:58 pm is the offset limit still 32Kb or does it improve?
It's approximately 64Kb now, because LSB(offset) has 8 bits.

Urusergi wrote: Tue Oct 05, 2021 9:58 pm and was there no way around the use of alternative registers? for example using index registers.
Alternate registers allow me to use EX DE, HL :)

Re: A new data compressor called ZX0

Posted: Wed Oct 06, 2021 4:42 am
by Einar Saukas
Urusergi wrote: Tue Oct 05, 2021 9:58 pm Did you try an intermediate version with only two last offsets?
Yes. It didn't compress as much as ZX5.

There are lots of ways this data format could be changed. Perhaps a different number of bits in LSB(offset). Perhaps "copy from new offset" storing Elias(length) instead of Elias(length-1), thus accepting length=1 too. Perhaps support for reusing last 4 offsets, adding an extra bit to differentiate between 3rd and 4th last offset... After testing lots of combinations, I believe ZX5 format was the best tradeoff between compression ratio, decompressor size and decompression speed, when reusing multiple offsets.

Re: A new data compressor called ZX0

Posted: Wed Oct 06, 2021 9:45 am
by introspec
Einar Saukas wrote: Tue Sep 21, 2021 11:45 pmWhen I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average). The problem is, an optimal compressor for this more complex format required so much time and memory that it wasn't able to compress about 1/3 of the files I tested. There was no point in using a non-optimal compressor either, since it typically lost more than .5% in compression ratio. It was simply not worth it.
I am not seeing the 0.5% benefit on the files I run it on thus far. Most files tend to be compressed better by ZX0. However, I won't be able to be more specific than this, because the compressor chokes on some files from my corpus (it reports out of memory error).

Re: A new data compressor called ZX0

Posted: Wed Oct 06, 2021 2:46 pm
by Einar Saukas
introspec wrote: Wed Oct 06, 2021 9:45 am I am not seeing the 0.5% benefit on the files I run it on thus far. Most files tend to be compressed better by ZX0. However, I won't be able to be more specific than this, because the compressor chokes on some files from my corpus (it reports out of memory error).
I'm not surprised.

You are getting memory errors because the ZX5 compressor needs to allocate too much memory. This is the main reason ZX5 documentation recommends using ZX0 instead :)

ZX5 compression works slightly better on certain kinds of files (such as SCR files) and ZX0 compression works slightly better on others (such as RCS files). The results heavily depend on the composition of your test corpus. In my own tests, ZX5 runs out of memory in 25% of files, and compresses better than ZX0 in 50% of cases. For instance ZX0 can compress this Gigascreen image from 13824 to 7760 bytes, and ZX5 compresses it to 7247 bytes, so that's more than 3% better.

However the difference between ZX0 and ZX5 is usually very small. ZX5 results are kinda hit-and-miss. Overall it's better to keep using ZX0, and only trying ZX5 when there's a very specific file where ZX0 compression is not enough. I wasn't even planning to release ZX5, this was the post that convinced me otherwise:
jimmy wrote: Wed Sep 22, 2021 3:53 pm I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!

Re: A new data compressor called ZX0

Posted: Wed Oct 06, 2021 5:54 pm
by NEO SPECTRUMAN
Einar Saukas wrote: Wed Oct 06, 2021 2:46 pm However the difference between ZX0 and ZX5 is usually very small.
I'm did not RTFM completely
it's possible to use other not packed data as "dictionary" for packing\unpacking?

Re: A new data compressor called ZX0

Posted: Wed Oct 06, 2021 6:07 pm
by Einar Saukas
NEO SPECTRUMAN wrote: Wed Oct 06, 2021 5:54 pm I'm did not RTFM completely
it's possible to use other not packed data as "dictionary" for packing\unpacking?
Yes. Read the manual section called "COMPRESSING WITH PREFIX".

All my recent compressors support this feature.

Re: A new data compressor called ZX0

Posted: Tue Oct 12, 2021 9:56 am
by jimmy
Einar Saukas wrote: Wed Oct 06, 2021 2:46 pm I wasn't even planning to release ZX5, this was the post that convinced me otherwise:
jimmy wrote: Wed Sep 22, 2021 3:53 pm I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!
Thank you!

Unfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...

Can the algorithm be made to use multiple threads?

Re: A new data compressor called ZX0

Posted: Tue Oct 12, 2021 11:27 am
by Urusergi
jimmy wrote: Tue Oct 12, 2021 9:56 amUnfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...
Uaaaaa! 37 hours!!! :shock: What game are you compressing?
 

Re: A new data compressor called ZX0

Posted: Tue Oct 12, 2021 1:39 pm
by arkannoyed
Hey, 37 hours is nothing. Try 4 days for me 😂

I tried to compress the main code block from Xevious. Previous tests got it down from 40270 bytes to;

10051 with zx0
9937. (IIRC) with shrinkler.

I was hoping zx5 would achieve better, but still waiting!

It has compressed a few SCRs to smaller than zx0 and even with the larger decompressor, it’s still a small improvement, which is lovely.

Re: A new data compressor called ZX0

Posted: Tue Oct 12, 2021 7:00 pm
by Alessandro
I used ZX5 to compress the real scenario test file I employ to make comparisons between data compressors, i.e. the first level of Funky Fungus Reloaded. It took 8'12" to compress it from 25481 to 14296 bytes. With ZX0, the same file is compressed to 14382 bytes in just 13".

Re: A new data compressor called ZX0

Posted: Tue Oct 12, 2021 10:25 pm
by Einar Saukas
jimmy wrote: Tue Oct 12, 2021 9:56 am Unfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...
Ouch!

jimmy wrote: Tue Oct 12, 2021 9:56 am Can the algorithm be made to use multiple threads?
Sure. However memory management in current C implementation is based on reference counting, which would be inefficient across threads. This will require another language with GC and lightweight threads, preferably also resizeable hash tables. I will probably try Kotlin, unless anyone has a better idea?

Re: A new data compressor called ZX0

Posted: Tue Oct 12, 2021 10:41 pm
by Einar Saukas
Alessandro wrote: Tue Oct 12, 2021 7:00 pmWith ZX0, the same file is compressed to 14382 bytes in just 13".
I confess I only released ZX5 so everyone would be happy about ZX0 compression speed :)

Re: A new data compressor called ZX0

Posted: Wed Oct 13, 2021 10:42 am
by jimmy
Einar Saukas wrote: Tue Oct 12, 2021 10:25 pm Sure. However memory management in current C implementation is based on reference counting, which would be inefficient across threads. This will require another language with GC and lightweight threads, preferably also resizeable hash tables. I will probably try Kotlin, unless anyone has a better idea?
Any language will do as long as it results in a working Win32 executable!

As I'm still waiting for the ZX5 output (43 hours so far, the dots have made it across as far as the "n" in "Einar") I thought people might like to see a table of games that I investigated making into a ZX IF/2 Interface cartridge but then gave up due to size issues. You can see just how much better ZX0 is than ZX7. All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.

Code: Select all

		orig	zx7	zx0	shrinkler
Arkanoid 2	33024	17407	16384	15536
Bargain Basemt	32700		21105	20108
Bruce Lee	33239	18433	17369	16404
Chessmaster 2k	34814		14775	14368
Deathstar Int	41541		21405	20052
Everyones Wally*32809		20144	19044
Exploding Fist	38915		22890	21916
Finders Keepers	36860		18831	17972
Flunky		38650	25674	23706	22884
Fox fights bk	40224		20660	19688
Ghostbusters	32064	19717	18636	17976
Gilligan's Gold	39936	20719	18968	18088
Greg Loses Clk	32900	22948	21289	20588
Hampstead	33957		17787	16460
Herbert Dummy* 	32191		20327
Hobbit		40000		27347
Hydrofool	36466		22713	21404
Iball		41436		23090	21756
Iball 2		40530		23534	22000
Knightlore*	30720	19490	18228	17064
Lunar Jetman*	31745		19509	18940
Matchday 2	34565		23526	22808
Max desktop	32099	19850	18474	17680
Minder		39001		23301	22284
Mordon's Quest	41204		28565	26988
Mugsy		40960		28582	27536
Night Gunner	40960	25971	24302	23472
Pippo		34151	34880!	30839	30780
Raid over Mscw	40282	23928	21664	20764
Roland Rat Race	36890	17955	16501	15664
Sabre Wulf*	35936		23632	22168
Sigma 7		33024	24650	22977	21904
Starion		45024		21345	20848
Starstrike	41024		20931	19148
Tapper		39724		22813	22624
Tau Ceti	40520		22899	21756
The Eye of Bain	32340	23869	22285	21212
Through TrapD	32860		20824	20112
Thrust		40351	19371	17159	16252
Trapdoor	35600		22244	21412
Travel Trashman	32768		19791	18948
Turbo Esprit	38351		27680	26636
Virus		35134	20995	18734	17920
Wheelie		39037		18066	17196

Re: A new data compressor called ZX0

Posted: Wed Oct 13, 2021 10:50 am
by Alone Coder
jimmy wrote: Wed Oct 13, 2021 10:42 am All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.
Can you please upload these files?

Re: A new data compressor called ZX0

Posted: Wed Oct 13, 2021 11:09 am
by introspec
jimmy wrote: Wed Oct 13, 2021 10:42 am I thought people might like to see a table of games that I investigated making into a ZX IF/2 Interface cartridge but then gave up due to size issues. You can see just how much better ZX0 is than ZX7.

Code: Select all

		orig	zx7	zx0	shrinkler
Arkanoid 2	33024	17407	16384	15536
Bargain Basemt	32700		21105	20108
Bruce Lee	33239	18433	17369	16404
Chessmaster 2k	34814		14775	14368
Deathstar Int	41541		21405	20052
Everyones Wally*32809		20144	19044
Exploding Fist	38915		22890	21916
Finders Keepers	36860		18831	17972
Flunky		38650	25674	23706	22884
Fox fights bk	40224		20660	19688
Ghostbusters	32064	19717	18636	17976
Gilligan's Gold	39936	20719	18968	18088
Greg Loses Clk	32900	22948	21289	20588
Hampstead	33957		17787	16460
Herbert Dummy* 	32191		20327
Hobbit		40000		27347
Hydrofool	36466		22713	21404
Iball		41436		23090	21756
Iball 2		40530		23534	22000
Knightlore*	30720	19490	18228	17064
Lunar Jetman*	31745		19509	18940
Matchday 2	34565		23526	22808
Max desktop	32099	19850	18474	17680
Minder		39001		23301	22284
Mordon's Quest	41204		28565	26988
Mugsy		40960		28582	27536
Night Gunner	40960	25971	24302	23472
Pippo		34151	34880!	30839	30780
Raid over Mscw	40282	23928	21664	20764
Roland Rat Race	36890	17955	16501	15664
Sabre Wulf*	35936		23632	22168
Sigma 7		33024	24650	22977	21904
Starion		45024		21345	20848
Starstrike	41024		20931	19148
Tapper		39724		22813	22624
Tau Ceti	40520		22899	21756
The Eye of Bain	32340	23869	22285	21212
Through TrapD	32860		20824	20112
Thrust		40351	19371	17159	16252
Trapdoor	35600		22244	21412
Travel Trashman	32768		19791	18948
Turbo Esprit	38351		27680	26636
Virus		35134	20995	18734	17920
Wheelie		39037		18066	17196
Why do you use Shrinkler as a point of reference? It is slower to decompress by 2-3 orders of magnitude, to the extent that it is about the same speed as loading uncompressed data from tape.

Re: A new data compressor called ZX0

Posted: Wed Oct 13, 2021 8:49 pm
by Bedazzle
introspec wrote: Wed Oct 13, 2021 11:09 am Why do you use Shrinkler as a point of reference?
Because of size?

Re: A new data compressor called ZX0

Posted: Thu Oct 14, 2021 9:16 am
by jimmy
introspec wrote: Wed Oct 13, 2021 11:09 am Why do you use Shrinkler as a point of reference?
Because we need a theoretical minimum size - it's not going to be zero or one, and not even 1K in these cases. As Shrinkler seems to offer the best compression at present I thought it was a useful reference. I could've used .zip, but by sticking to Shrinkler I know they can be decompressed on a Spectrum with a small piece of code.

It is interesting that for complex data (eg:Pippo) there isn't much difference between ZX0 and Shrinkler, but for adventures such as Eye Of Bain, Hampstead and Hobbit the Shrinkler files are at least 1K smaller. There might be room for improvement here.

I can't update the original table, so I've posted an updated table below. Some more games should've been marked as decrypted. I've included Atic Atac, added the missing shrinkler values and included more zx7 values.

Code: Select all

		orig	zx7	zx0	shrinkler
Arkanoid 2	33024	17407	16384	15536
Atic Atac*	30209	20039	18400	17240
Bargain Basemt	32700	22495	21105	20108
Bruce Lee	33239	18433	17369	16404
Chessmaster 2k	34814	16069	14775	14368
Deathstar Int	41541	24060	21405	20052
Everyones Wally*32809	21446	20144	19044
Exploding Fist	38915	26036	22890	21916
Finders Keepers	36860	21777	18831	17972
Flunky*		38650	25674	23706	22884
Fox fights bk	40224	22269	20660	19688
Ghostbusters	32064	19717	18636	17976
Gilligan's Gold	39936	20719	18968	18088
Greg Loses Clk	32900	22948	21289	20588
Hampstead	33957	19653	17787	16460
Herbert Dummy* 	32191	21763	20327	19252
Hobbit		40000	28911	27347	26076
Hydrofool	36466		22713	21404
Iball		41436	24954	23090	21756
Iball 2		40530	25304	23534	22000
Knightlore*	30720	19490	18228	17064
Lunar Jetman*	31745	21082	19509	18940
Matchday 2	34565		23526	22808
Max desktop	32099	19850	18474	17680
Minder		39001	25111	23301	22284
Mordon's Quest	41204	29799	28565	26988
Mugsy		40960	30459	28582	27536
Night Gunner	40960	25971	24302	23472
Pippo		34151	34880!	30839	30780
Raid over Mscw	40282	23928	21664	20764
Roland Rat Race	36890	17955	16501	15664
Sabre Wulf*	35936	25318	23632	22168
Sigma 7		33024	24650	22977	21904
Starion		45024		21345	20848
Starstrike	41024		20931	19148
Tapper		39724		22813	22624
Tau Ceti	40520		22899	21756
The Eye of Bain	32340	23869	22285	21212
Through TrapD	32860		20824	20112
Thrust		40351	19371	17159	16252
Trapdoor*	35600	23836	22244	21412
Travel Trashman	32768		19791	18948
Turbo Esprit	38351	29361	27680	26636
Virus		35134	20995	18734	17920
Wheelie		39037	19074	18066	17196

Re: A new data compressor called ZX0

Posted: Thu Oct 14, 2021 9:35 am
by jimmy
Alone Coder wrote: Wed Oct 13, 2021 10:50 am
jimmy wrote: Wed Oct 13, 2021 10:42 am All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.
Can you please upload these files?
Upload all those denenienced titles? :shock: Sorry I can't do that.
However you can recreate them just by loading the game in and stopping it just before it starts (or just after it decrypts). As an example I'll show you how I obtained a decrypted Flunky binary:
  • The game has a standard loader screen$ and a hyperloader for the main code.
  • Place a breakpoint at $FDE8 and start loading the game. It should stop just before it starts the hyperloader. At this point IX=$6D60 and DE=$8FF2 so we know where the game loads and how big it is.
  • Place another breakpoint at $FE70 - this should trigger after the game finishes loading. At this point the binary is still encoded.
  • If you look around $FEA9 you can see the decryptor routine (it modifies memory starting at $6D60 and is $8FF2 bytes long)
  • If you place a final breakpoint at $FEB8 this will trigger once decryption has finished.
  • The decrypted bytes can now be save as "FLUNKY" CODE 28000,36850
  • If you use zx0,zx7 or shrinkler 4.6 (without parity) you should see the same sizes I listed in the table.