Page 9 of 12
Re: A new data compressor called ZX0
Posted: Tue Sep 21, 2021 11:45 pm
by Einar Saukas
Urusergi wrote: ↑Sat Sep 18, 2021 11:26 pmthis afternoon I started to think about a possible evolution of the format
When I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average). The problem is, an optimal compressor for this more complex format required so much time and memory that it wasn't able to compress about 1/3 of the files I tested. There was no point in using a non-optimal compressor either, since it typically lost more than .5% in compression ratio. It was simply not worth it.
Re: A new data compressor called ZX0
Posted: Wed Sep 22, 2021 2:41 pm
by Urusergi
Einar Saukas wrote: ↑Tue Sep 21, 2021 11:45 pmWhen I was working on ZX0, I also created another more complex format that I used for comparison.
I'm so curious! could you please briefly explain that format? thank you.
It's true that my last code made the stack grow a lot but I've improved it. Now it only grows when a "new offset" occurs, using "move to front" inside "old offset" method. Anyway, this is nothing serious, just for my entertainment.
Re: A new data compressor called ZX0
Posted: Wed Sep 22, 2021 3:53 pm
by jimmy
Einar Saukas wrote: ↑Tue Sep 21, 2021 11:45 pm
When I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average).
Yes please do tell us more about this - I'm always on the lookout for better compression! I have some games which are just slightly too big for a 16K cartridge. I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!
The problem is that Shrinkler takes around 75 seconds to decompress before starting the game
Re: A new data compressor called ZX0
Posted: Thu Sep 23, 2021 2:28 pm
by Einar Saukas
OK, give me a few days to put together a proper release.
Re: A new data compressor called ZX0
Posted: Sat Sep 25, 2021 6:08 pm
by Urusergi
Einar Saukas wrote: ↑Thu Sep 23, 2021 2:28 pm
OK, give me a few days to put together a proper release.
Bets are allowed!
We are intrigued. Which format the compressor will use? Is it possible that the new compressor uses more than one "last offset"?
Yesterday I found out that lzma uses the last 4 offsets, so I wasn't really wrong with my latest idea
With a Z80 we can get up to 3 offsets easily, using the stack, ix & iy with ex (sp),ix or ex (sp),iy for the "move to front".
Re: A new data compressor called ZX0
Posted: Tue Oct 05, 2021 12:54 am
by Einar Saukas
The "experimental" ZX5 compressor is now available here:
https://github.com/einar-saukas/ZX5
If you thought ZX0 compressor was too slow, wait until you try ZX5!
The good news is that ZX5 decompressor is almost as fast as ZX0.
Re: A new data compressor called ZX0
Posted: Tue Oct 05, 2021 9:58 pm
by Urusergi
Einar Saukas wrote: ↑Tue Oct 05, 2021 12:54 am
The "experimental" ZX5 compressor is now available
Hurrah!
I have won my own bet!!
Several things catch my attention, why did you move the first lenght bit of the new offset block? is the offset limit still 32Kb or does it improve?
and was there no way around the use of alternative registers? for example using index registers.
Did you try an intermediate version with only two last offsets?
I'm going to test the compression power
EDIT: It's very good! 339 bytes less in my test file
Re: A new data compressor called ZX0
Posted: Tue Oct 05, 2021 10:56 pm
by Einar Saukas
Urusergi wrote: ↑Tue Oct 05, 2021 9:58 pm
why did you move the first lenght bit of the new offset block?
When reading any Elias coded number, my decompressors only need to check "refill" (i.e if it's time to load another group of 8 bits) when loading control bits (1st, 3rd, 5th, etc), not value bits (2nd, 4th, etc). Storing first length bit separately is needed to enforce this rule.
Urusergi wrote: ↑Tue Oct 05, 2021 9:58 pm
is the offset limit still 32Kb or does it improve?
It's approximately 64Kb now, because LSB(offset) has 8 bits.
Urusergi wrote: ↑Tue Oct 05, 2021 9:58 pm
and was there no way around the use of alternative registers? for example using index registers.
Alternate registers allow me to use
EX DE, HL
Re: A new data compressor called ZX0
Posted: Wed Oct 06, 2021 4:42 am
by Einar Saukas
Urusergi wrote: ↑Tue Oct 05, 2021 9:58 pm
Did you try an intermediate version with only two last offsets?
Yes. It didn't compress as much as ZX5.
There are lots of ways this data format could be changed. Perhaps a different number of bits in LSB(offset). Perhaps "copy from new offset" storing Elias(length) instead of Elias(length-1), thus accepting length=1 too. Perhaps support for reusing last 4 offsets, adding an extra bit to differentiate between 3rd and 4th last offset... After testing lots of combinations, I believe ZX5 format was the best tradeoff between compression ratio, decompressor size and decompression speed, when reusing multiple offsets.
Re: A new data compressor called ZX0
Posted: Wed Oct 06, 2021 9:45 am
by introspec
Einar Saukas wrote: ↑Tue Sep 21, 2021 11:45 pmWhen I was working on ZX0, I also created another more complex format that I used for comparison. It had a slightly better compression than ZX0 (around .5% on average). The problem is, an optimal compressor for this more complex format required so much time and memory that it wasn't able to compress about 1/3 of the files I tested. There was no point in using a non-optimal compressor either, since it typically lost more than .5% in compression ratio. It was simply not worth it.
I am not seeing the 0.5% benefit on the files I run it on thus far. Most files tend to be compressed better by ZX0. However, I won't be able to be more specific than this, because the compressor chokes on some files from my corpus (it reports out of memory error).
Re: A new data compressor called ZX0
Posted: Wed Oct 06, 2021 2:46 pm
by Einar Saukas
introspec wrote: ↑Wed Oct 06, 2021 9:45 am
I am not seeing the 0.5% benefit on the files I run it on thus far. Most files tend to be compressed better by ZX0. However, I won't be able to be more specific than this, because the compressor chokes on some files from my corpus (it reports out of memory error).
I'm not surprised.
You are getting memory errors because the ZX5 compressor needs to allocate too much memory. This is the main reason ZX5 documentation recommends using ZX0 instead
ZX5 compression works slightly better on certain kinds of files (such as SCR files) and ZX0 compression works slightly better on others (such as RCS files). The results heavily depend on the composition of your test corpus. In my own tests, ZX5 runs out of memory in 25% of files, and compresses better than ZX0 in 50% of cases. For instance ZX0 can compress
this Gigascreen image from 13824 to 7760 bytes, and ZX5 compresses it to 7247 bytes, so that's more than 3% better.
However the difference between ZX0 and ZX5 is usually very small. ZX5 results are kinda hit-and-miss. Overall it's better to keep using ZX0, and only trying ZX5 when there's a very specific file where ZX0 compression is not enough. I wasn't even planning to release ZX5, this was the post that convinced me otherwise:
jimmy wrote: ↑Wed Sep 22, 2021 3:53 pm
I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!
Re: A new data compressor called ZX0
Posted: Wed Oct 06, 2021 5:54 pm
by NEO SPECTRUMAN
Einar Saukas wrote: ↑Wed Oct 06, 2021 2:46 pm
However the difference between ZX0 and ZX5 is usually very small.
I'm did not RTFM completely
it's possible to use other not packed data as "dictionary" for packing\unpacking?
Re: A new data compressor called ZX0
Posted: Wed Oct 06, 2021 6:07 pm
by Einar Saukas
NEO SPECTRUMAN wrote: ↑Wed Oct 06, 2021 5:54 pm
I'm did not RTFM completely
it's possible to use other not packed data as "dictionary" for packing\unpacking?
Yes. Read the manual section called "COMPRESSING WITH PREFIX".
All my recent compressors support this feature.
Re: A new data compressor called ZX0
Posted: Tue Oct 12, 2021 9:56 am
by jimmy
Einar Saukas wrote: ↑Wed Oct 06, 2021 2:46 pm
I wasn't even planning to release ZX5, this was the post that convinced me otherwise:
jimmy wrote: ↑Wed Sep 22, 2021 3:53 pm
I have one game which fits with Shrinkler (another compressor), but not with ZX0 - I need another 100 bytes!
Thank you!
Unfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...
Can the algorithm be made to use multiple threads?
Re: A new data compressor called ZX0
Posted: Tue Oct 12, 2021 11:27 am
by Urusergi
jimmy wrote: ↑Tue Oct 12, 2021 9:56 amUnfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...
Uaaaaa! 37 hours!!!
What game are you compressing?
Re: A new data compressor called ZX0
Posted: Tue Oct 12, 2021 1:39 pm
by arkannoyed
Hey, 37 hours is nothing. Try 4 days for me
I tried to compress the main code block from Xevious. Previous tests got it down from 40270 bytes to;
10051 with zx0
9937. (IIRC) with shrinkler.
I was hoping zx5 would achieve better, but still waiting!
It has compressed a few SCRs to smaller than zx0 and even with the larger decompressor, it’s still a small improvement, which is lovely.
Re: A new data compressor called ZX0
Posted: Tue Oct 12, 2021 7:00 pm
by Alessandro
I used ZX5 to compress the real scenario test file I employ to make comparisons between data compressors, i.e. the first level of Funky Fungus Reloaded. It took 8'12" to compress it from 25481 to 14296 bytes. With ZX0, the same file is compressed to 14382 bytes in just 13".
Re: A new data compressor called ZX0
Posted: Tue Oct 12, 2021 10:25 pm
by Einar Saukas
jimmy wrote: ↑Tue Oct 12, 2021 9:56 am
Unfortunately I can't tell you how many bytes it now uses because after 37 hours of CPU time my laptop is still trying to compress 33887 bytes into ZX5 format. ZX5.exe is currently using 1.5Gb of RAM too...
Ouch!
jimmy wrote: ↑Tue Oct 12, 2021 9:56 am
Can the algorithm be made to use multiple threads?
Sure. However memory management in current C implementation is based on reference counting, which would be inefficient across threads. This will require another language with GC and lightweight threads, preferably also resizeable hash tables. I will probably try Kotlin, unless anyone has a better idea?
Re: A new data compressor called ZX0
Posted: Tue Oct 12, 2021 10:41 pm
by Einar Saukas
Alessandro wrote: ↑Tue Oct 12, 2021 7:00 pmWith ZX0, the same file is compressed to 14382 bytes in just 13".
I confess I only released ZX5 so everyone would be happy about ZX0 compression speed
Re: A new data compressor called ZX0
Posted: Wed Oct 13, 2021 10:42 am
by jimmy
Einar Saukas wrote: ↑Tue Oct 12, 2021 10:25 pm
Sure. However memory management in current C implementation is based on reference counting, which would be inefficient across threads. This will require another language with GC and lightweight threads, preferably also resizeable hash tables. I will probably try Kotlin, unless anyone has a better idea?
Any language will do as long as it results in a working Win32 executable!
As I'm still waiting for the ZX5 output (43 hours so far, the dots have made it across as far as the "n" in "Einar") I thought people might like to see a table of games that I investigated making into a ZX IF/2 Interface cartridge but then gave up due to size issues. You can see just how much better ZX0 is than ZX7. All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.
Code: Select all
orig zx7 zx0 shrinkler
Arkanoid 2 33024 17407 16384 15536
Bargain Basemt 32700 21105 20108
Bruce Lee 33239 18433 17369 16404
Chessmaster 2k 34814 14775 14368
Deathstar Int 41541 21405 20052
Everyones Wally*32809 20144 19044
Exploding Fist 38915 22890 21916
Finders Keepers 36860 18831 17972
Flunky 38650 25674 23706 22884
Fox fights bk 40224 20660 19688
Ghostbusters 32064 19717 18636 17976
Gilligan's Gold 39936 20719 18968 18088
Greg Loses Clk 32900 22948 21289 20588
Hampstead 33957 17787 16460
Herbert Dummy* 32191 20327
Hobbit 40000 27347
Hydrofool 36466 22713 21404
Iball 41436 23090 21756
Iball 2 40530 23534 22000
Knightlore* 30720 19490 18228 17064
Lunar Jetman* 31745 19509 18940
Matchday 2 34565 23526 22808
Max desktop 32099 19850 18474 17680
Minder 39001 23301 22284
Mordon's Quest 41204 28565 26988
Mugsy 40960 28582 27536
Night Gunner 40960 25971 24302 23472
Pippo 34151 34880! 30839 30780
Raid over Mscw 40282 23928 21664 20764
Roland Rat Race 36890 17955 16501 15664
Sabre Wulf* 35936 23632 22168
Sigma 7 33024 24650 22977 21904
Starion 45024 21345 20848
Starstrike 41024 20931 19148
Tapper 39724 22813 22624
Tau Ceti 40520 22899 21756
The Eye of Bain 32340 23869 22285 21212
Through TrapD 32860 20824 20112
Thrust 40351 19371 17159 16252
Trapdoor 35600 22244 21412
Travel Trashman 32768 19791 18948
Turbo Esprit 38351 27680 26636
Virus 35134 20995 18734 17920
Wheelie 39037 18066 17196
Re: A new data compressor called ZX0
Posted: Wed Oct 13, 2021 10:50 am
by Alone Coder
jimmy wrote: ↑Wed Oct 13, 2021 10:42 am
All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.
Can you please upload these files?
Re: A new data compressor called ZX0
Posted: Wed Oct 13, 2021 11:09 am
by introspec
jimmy wrote: ↑Wed Oct 13, 2021 10:42 am
I thought people might like to see a table of games that I investigated making into a ZX IF/2 Interface cartridge but then gave up due to size issues. You can see just how much better ZX0 is than ZX7.
Code: Select all
orig zx7 zx0 shrinkler
Arkanoid 2 33024 17407 16384 15536
Bargain Basemt 32700 21105 20108
Bruce Lee 33239 18433 17369 16404
Chessmaster 2k 34814 14775 14368
Deathstar Int 41541 21405 20052
Everyones Wally*32809 20144 19044
Exploding Fist 38915 22890 21916
Finders Keepers 36860 18831 17972
Flunky 38650 25674 23706 22884
Fox fights bk 40224 20660 19688
Ghostbusters 32064 19717 18636 17976
Gilligan's Gold 39936 20719 18968 18088
Greg Loses Clk 32900 22948 21289 20588
Hampstead 33957 17787 16460
Herbert Dummy* 32191 20327
Hobbit 40000 27347
Hydrofool 36466 22713 21404
Iball 41436 23090 21756
Iball 2 40530 23534 22000
Knightlore* 30720 19490 18228 17064
Lunar Jetman* 31745 19509 18940
Matchday 2 34565 23526 22808
Max desktop 32099 19850 18474 17680
Minder 39001 23301 22284
Mordon's Quest 41204 28565 26988
Mugsy 40960 28582 27536
Night Gunner 40960 25971 24302 23472
Pippo 34151 34880! 30839 30780
Raid over Mscw 40282 23928 21664 20764
Roland Rat Race 36890 17955 16501 15664
Sabre Wulf* 35936 23632 22168
Sigma 7 33024 24650 22977 21904
Starion 45024 21345 20848
Starstrike 41024 20931 19148
Tapper 39724 22813 22624
Tau Ceti 40520 22899 21756
The Eye of Bain 32340 23869 22285 21212
Through TrapD 32860 20824 20112
Thrust 40351 19371 17159 16252
Trapdoor 35600 22244 21412
Travel Trashman 32768 19791 18948
Turbo Esprit 38351 27680 26636
Virus 35134 20995 18734 17920
Wheelie 39037 18066 17196
Why do you use Shrinkler as a point of reference? It is slower to decompress by 2-3 orders of magnitude, to the extent that it is about the same speed as loading uncompressed data from tape.
Re: A new data compressor called ZX0
Posted: Wed Oct 13, 2021 8:49 pm
by Bedazzle
introspec wrote: ↑Wed Oct 13, 2021 11:09 am
Why do you use Shrinkler as a point of reference?
Because of size?
Re: A new data compressor called ZX0
Posted: Thu Oct 14, 2021 9:16 am
by jimmy
introspec wrote: ↑Wed Oct 13, 2021 11:09 am
Why do you use Shrinkler as a point of reference?
Because we need a theoretical minimum size - it's not going to be zero or one, and not even 1K in these cases. As Shrinkler seems to offer the best compression at present I thought it was a useful reference. I could've used .zip, but by sticking to Shrinkler I know they can be decompressed on a Spectrum with a small piece of code.
It is interesting that for complex data (eg:Pippo) there isn't much difference between ZX0 and Shrinkler, but for adventures such as Eye Of Bain, Hampstead and Hobbit the Shrinkler files are at least 1K smaller. There might be room for improvement here.
I can't update the original table, so I've posted an updated table below. Some more games should've been marked as decrypted. I've included Atic Atac, added the missing shrinkler values and included more zx7 values.
Code: Select all
orig zx7 zx0 shrinkler
Arkanoid 2 33024 17407 16384 15536
Atic Atac* 30209 20039 18400 17240
Bargain Basemt 32700 22495 21105 20108
Bruce Lee 33239 18433 17369 16404
Chessmaster 2k 34814 16069 14775 14368
Deathstar Int 41541 24060 21405 20052
Everyones Wally*32809 21446 20144 19044
Exploding Fist 38915 26036 22890 21916
Finders Keepers 36860 21777 18831 17972
Flunky* 38650 25674 23706 22884
Fox fights bk 40224 22269 20660 19688
Ghostbusters 32064 19717 18636 17976
Gilligan's Gold 39936 20719 18968 18088
Greg Loses Clk 32900 22948 21289 20588
Hampstead 33957 19653 17787 16460
Herbert Dummy* 32191 21763 20327 19252
Hobbit 40000 28911 27347 26076
Hydrofool 36466 22713 21404
Iball 41436 24954 23090 21756
Iball 2 40530 25304 23534 22000
Knightlore* 30720 19490 18228 17064
Lunar Jetman* 31745 21082 19509 18940
Matchday 2 34565 23526 22808
Max desktop 32099 19850 18474 17680
Minder 39001 25111 23301 22284
Mordon's Quest 41204 29799 28565 26988
Mugsy 40960 30459 28582 27536
Night Gunner 40960 25971 24302 23472
Pippo 34151 34880! 30839 30780
Raid over Mscw 40282 23928 21664 20764
Roland Rat Race 36890 17955 16501 15664
Sabre Wulf* 35936 25318 23632 22168
Sigma 7 33024 24650 22977 21904
Starion 45024 21345 20848
Starstrike 41024 20931 19148
Tapper 39724 22813 22624
Tau Ceti 40520 22899 21756
The Eye of Bain 32340 23869 22285 21212
Through TrapD 32860 20824 20112
Thrust 40351 19371 17159 16252
Trapdoor* 35600 23836 22244 21412
Travel Trashman 32768 19791 18948
Turbo Esprit 38351 29361 27680 26636
Virus 35134 20995 18734 17920
Wheelie 39037 19074 18066 17196
Re: A new data compressor called ZX0
Posted: Thu Oct 14, 2021 9:35 am
by jimmy
Alone Coder wrote: ↑Wed Oct 13, 2021 10:50 am
jimmy wrote: ↑Wed Oct 13, 2021 10:42 am
All sizes are in bytes, a * means that the game was paritally run to decrypt itself before saving the binary file.
Can you please upload these files?
Upload all those denenienced titles?
Sorry I can't do that.
However you can recreate them just by loading the game in and stopping it just before it starts (or just after it decrypts). As an example I'll show you how I obtained a decrypted Flunky binary:
- The game has a standard loader screen$ and a hyperloader for the main code.
- Place a breakpoint at $FDE8 and start loading the game. It should stop just before it starts the hyperloader. At this point IX=$6D60 and DE=$8FF2 so we know where the game loads and how big it is.
- Place another breakpoint at $FE70 - this should trigger after the game finishes loading. At this point the binary is still encoded.
- If you look around $FEA9 you can see the decryptor routine (it modifies memory starting at $6D60 and is $8FF2 bytes long)
- If you place a final breakpoint at $FEB8 this will trigger once decryption has finished.
- The decrypted bytes can now be save as "FLUNKY" CODE 28000,36850
- If you use zx0,zx7 or shrinkler 4.6 (without parity) you should see the same sizes I listed in the table.