A new data compressor called ZX0

The place for codemasters or beginners to talk about programming any language for the Spectrum.
Einar Saukas
Bugaboo
Posts: 3100
Joined: Wed Nov 15, 2017 2:48 pm

A new data compressor called ZX0

Post by Einar Saukas »

When you need to compress data in your Spectrum programs, you usually have 2 options:
  • Use a complex tool like Exomizer, which will compress your data as much as possible, but then you will need a slower, larger routine to decompress it.
  • Use a simpler tool like ZX7, which gives you a small routine to decompress your data much faster, but then your compressed data won't be as small.
Wouldn't it be nice to have a simpler compressor that works like ZX7, but that could compress data as much as Exomizer? Is this even possible?

Yes, it is! It's called ZX0 :)

The full source code and documentation is here:

https://github.com/einar-saukas/ZX0

The only noticeable difference from ZX7 is that the cross-platform ZX0 compressor can take several seconds to compress data. For this reason, the ZX0 compressor provides an extra option "-q" (quick mode) for faster, non-optimal compression. Thus when you need to repeatedly modify, compress and retest a certain file, you can use option "-q" to speed up this process, then later remove this option to compress even better. Apart from this, everything else works exactly the same in ZX0 and ZX7. In particular, the standard ZX0 decompressor is only 82 bytes, and it decompresses data at about the same speed as ZX7.


TEST RESULTS

To ensure that my test results are not biased, I will be posting here a comparison between different compressors using someone else's data. This way, I can be sure I'm not giving preference (even involuntarily) to the kind of data that would conveniently work better for ZX0.

I just found quite a few interesting articles on the web that compare Z80 data compressor results. However, it seems the dataset used in these comparisons is only available for download in one of them:

https://hype.retroscene.org/blog/dev/740.html

This is an excellent article written by introspec. It's in Russian but Google translate does a good job here, so I recommend reading it. I hope he won't mind that I will use his dataset to complement his comparison!

This is the original table presented in this article:

Code: Select all

             unpacked   ApLib  Exomizer   Hrum    Hrust1   Hrust2    LZ4     MegaLZ  Pletter5 PuCrunch   ZX7

Calgary       273526    98192    96248   111380   103148   102742   120843   109519   106650    99041   117658
Canterbury     81941    26609    26968    31767    28441    28791    34976    31338    30247    27792    32268
Graphics      289927   169879   164868   173026   169221   171249   195544   172089   171807   169767   172140
Music         151657    59819    59857    62977    60902    62678    77617    62568    63661    63977    66692
Misc          436944   252334   248220   262508   251890   255363   293542   261396   263432   256278   265121

TOTAL:       1233995   606833   596161   641658   613602   620823   722522   636910   635797   616855   653879

Notice that, in the original article, Exomizer provided the best results overall, even though it was an older version. To be fair, I have tested it again using the latest version, Exomizer 3.1.0, which produced even better results. I have also tested ZX0 in both full and "quick" modes, for comparison. These are the new results:

Code: Select all

              ZX7       ZX0-q   Exomizer3    ZX0

Calgary      117658    116685     95997     99993
Canterbury    32268     30725     26365     26337
Graphics     172140    164692    163915    163647
Music         66692     61923     58586     58388
Misc         265121    249638    245434    244252

TOTAL:       653879    623663    590297    592617

These results indicate that ZX0 didn't perform as well in the Calgary test (Exomizer, ApLib and PuCrunch all compressed better in this case). However, ZX0 performed better than every other compressor in all the other test cases. The TOTAL shows ZX0 surpassed only by Exomizer3, purely because of the Calgary test. Does this mean that Exomizer3 is a better compressor overall? Well, I don't think so...

The Calgary test is a subset of the "classic" Calgary corpus. Exactly 7 of the 8 files in this subset are pure text, with an average size of 36Kb each. Although this is an interesting test, it doesn't represent the typical content you need to compress on a Spectrum. What kind of Spectrum program needs 36Kb of compressed text? Perhaps a text-only adventure, but in that case you would either compress it separately as multiple smaller blocks, or use dictionary-based custom compression so you could access sentences individually.

The Canterbury test files have more diversity, but they are not typical Spectrum content either. The remaining test cases best represent the kind of data that needs to be compressed in Spectrum programs: Graphics (Spectrum screens, etc), Music (AY, etc) and Misc (tape and disk images). In these test cases, ZX0 performed noticeably better.

However I'm not saying that ZX0 will always compress better than Exomizer3 in practice. My other tests seem to indicate that Exomizer3 works better in a few practical cases and ZX0 in others. The difference is so close, that it's hard to choose a winner.

The real advantage of ZX0 is achieving about the same compression as Exomizer3, while using a much smaller, faster decompressor.

So give it a try! Have fun :)
azesmbog
Manic Miner
Posts: 307
Joined: Sat May 16, 2020 8:43 am

Re: A new data compressor called ZX0

Post by azesmbog »

I am a rather lazy user, so I have always used something simpler; usability is sometimes more important to me :)
Therefore, I used mlz.
I tried this compressor, alongside mlz for comparison; here is the result:

Code: Select all

File name        Size              
-------------    ---------------   
crugi.bin                 10,987   
crugi.bin.zx0              1,406   
crugi.bin.mlz              1,548
  
krugi.scr                  6,912   
krugi.scr.zx0                657   
krugi.scr.mlz                692
My data is "loose", so it compresses well :)
I will use it, thanks!
* zx0 = standard compression
R-Tape
Site Admin
Posts: 6402
Joined: Thu Nov 09, 2017 11:46 am

Re: A new data compressor called ZX0

Post by R-Tape »

Nice one Einar.

I'm a regular ZX7 user, and of the screens I've checked so far, ZX0 saves at least 200 bytes per screen, which will be very useful in things like a woot slideshow.

Cheers!
utz
Microbot
Posts: 116
Joined: Wed Nov 15, 2017 9:04 am

Re: A new data compressor called ZX0

Post by utz »

Fantastic work, thanks a lot Einar! Running a few tests shows that ZX0 works very well for my most common use case, compressing data for music players. Very excited to see something that can beat apack.
Einar Saukas
Bugaboo
Posts: 3100
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

Thank you!

In the meantime, I have implemented support for ZX0 in RCS:

https://github.com/einar-saukas/RCS

It works exactly like support for ZX7.
arkannoyed
Manic Miner
Posts: 436
Joined: Mon Feb 05, 2018 9:56 am
Location: Northamptonshire

Re: A new data compressor called ZX0

Post by arkannoyed »

Having read the stuff on GitHub, am I correct that ZX0 has the ability to use a ‘dictionary’ at a different location from where the code decompresses?

This really is wonderful. Any chance of an overview of the compression methods?
introspec
Dizzy
Posts: 53
Joined: Wed Jan 02, 2019 2:26 pm
Location: London

Re: A new data compressor called ZX0

Post by introspec »

Very impressive work Einar! I do not think this is better in terms of compression ratio than the latest Exomizer, but it is definitely in the same league, and with a much less demanding decompressor. On average, it beats older versions of Exomizer in my tests, and the decompression speeds I am seeing even with the standard decompressor are faster than the fastest Exomizer can manage at present. I think that if the decompression speed could get at least to the level of recent ApLib decompressors, you'd have a definite winner.

I do not have much time during the term time, but I'll definitely study your decompressors and will try to contribute to them, if/when I can.

P.S. I know you usually don't keep version numbers in your software, but could you please consider adding a version number to ZX0? It makes it so much easier to identify pieces of software evolving over the years.
Einar Saukas
Bugaboo
Posts: 3100
Joined: Wed Nov 15, 2017 2:48 pm

Re: A new data compressor called ZX0

Post by Einar Saukas »

    arkannoyed wrote: Wed Jan 20, 2021 12:07 pmHaving read the stuff on GitHub, am I correct that zx0 has the ability to utilise a ‘dictionary’ at a different location to where the code decompresses?
    Not exactly a dictionary. Compressing with a prefix (or suffix) means that, if you know there will be certain data in the area immediately before (or after) the place where you are decompressing, then you can use this information to improve compression.

    Although I agree this could be used like a dictionary. If you put together a block of common words and use it as a prefix when compressing other text blocks, it will help reduce the compressed size of those blocks. However, this is not exactly dictionary-based compression. If you need to compress lots of text, it's probably better to use dictionary-based custom compression instead.

    arkannoyed wrote: Wed Jan 20, 2021 12:07 pmThis really is wonderful. Any chance of an overview of the compression methods?
    The ZX0 format is very simple. There are only 3 kinds of blocks:
    • Literal (copy next N bytes from compressed input to output)

    Code: Select all

    0  <length>  <byte1>  <byte2>  ...  <byteN>
    • Copy from last offset (repeat N bytes from last offset to output)

    Code: Select all

    0  <length>
    • Copy from new offset (repeat N bytes from new offset to output)

    Code: Select all

    1  <MSB_offset>  <LSB_offset>  <length-1>
    We only need 1 bit to distinguish these blocks, because we cannot have 2 consecutive literal blocks, and reusing last offset only makes sense after a literal block.

    Both length and offset MSB are stored using Elias Gamma Coding. The offset LSB is stored using 7 bits instead of 8, because it produces better results in most practical cases.

    The compressed data always starts with a literal, so the first bit is omitted.
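For readers who want to experiment with the format, the Elias Gamma Coding used for lengths and offset MSBs can be sketched in a few lines of Python. This is an editorial illustration of the coding scheme only, not ZX0's actual C or Z80 code:

```python
def elias_gamma(n):
    """Classic Elias gamma code: (k-1) zeros followed by the k-bit binary of n."""
    assert n >= 1
    bits = bin(n)[2:]                    # e.g. 5 -> '101'
    return '0' * (len(bits) - 1) + bits  # e.g. 5 -> '00101'

def elias_gamma_decode(code):
    """Decode one gamma code given as a string of '0'/'1' characters."""
    it = iter(code)
    zeros = 0
    while next(it) == '0':               # count leading zeros up to the marker '1'
        zeros += 1
    n = 1                                # the marker '1' is the value's leading bit
    for _ in range(zeros):
        n = (n << 1) | (next(it) == '1')
    return n

# 1 -> '1', 2 -> '010', 3 -> '011' ... exactly the "1 -> 1, 01x -> 1x" scheme
assert elias_gamma(1) == '1'
assert elias_gamma_decode('00101') == 5
```

Note that every gamma code is an odd number of bits long; this property becomes relevant later in the thread.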
    introspec
    Dizzy
    Posts: 53
    Joined: Wed Jan 02, 2019 2:26 pm
    Location: London

    Re: A new data compressor called ZX0

    Post by introspec »

    Einar, in your decompressors you are using a new kind of optimization which I am struggling to get my head around. Specifically, you do something like this:

    Code: Select all

    copy_literals:
    	call read_elias_code	; for the length of literal run
    	ldir
    	add a,a : jr c,read_new_offset
    	call read_elias_code	; for match length
    	
    	; [skipped code for copying matches]
    	
    	add a,a : jr nc,copy_literals
    read_new_offset:
    	call read_elias_code_carry
    On both occasions with the command ADD A,A you get a new bit from the bit reservoir in A to see what needs to be done next. What I am struggling to understand is: why do you NOT need to check if A==0 after ADD A,A? I understand (and used in my decompressors before) a somewhat subtle idea that if after ADD A,A flag C is down, then we are certain that A does not need to be reloaded. However, I am failing to see, how the alternative works. If after ADD A,A flag C is up, how do you know that it is an actual 1 from the bitstream, and not the "signal bit" 1 that was pushed into A using LD A,(HL) : INC HL : RLA during bit stream parsing?

    UPDATE 1: OMG, I think I understand. Gamma codes always have an odd number of bits, so for your chosen format EVERY POSSIBLE COMMAND uses an exactly EVEN number of bits from the bitstream. Hence, the flag bit can never exhaust the bit reservoir. INSANE. I love it. And this also means that you cannot really use an 8-bit LSB with the same decompressor structure, as it will break literally EVERYTHING :)

    UPDATE 2: However, since this is the case, why did you not use interlaced gamma code? I.e. replace 1, 01x, 001xx, etc by 1, 0x1, 0x0x1, etc? Since your encoding can easily ensure which gamma bits occupy even and odd positions in the bitstream, this would be a marvelous speed optimization. I suspect that the size-optimized code can also be made smaller with this style of encoding...
    Last edited by introspec on Wed Jan 20, 2021 5:15 pm, edited 3 times in total.
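introspec's parity argument can be checked numerically. The sketch below is an editorial Python illustration; the bit accounting follows Einar's format description and the decompressor's `sra b : rr c` trick, where the first length bit of a new-offset match rides inside the offset LSB byte:

```python
def gamma_len(n):
    """Bit length of the Elias gamma code for n: always odd (2k+1)."""
    return 2 * (n.bit_length() - 1) + 1

def reservoir_bits(command, length=1, offset_msb=1):
    """Bits each ZX0 command consumes from the bit reservoir in A.
    Whole bytes are excluded: literal bytes and the offset LSB byte
    come straight from the byte stream, never through the reservoir."""
    if command == 'literal':             # flag bit + gamma(length)
        return 1 + gamma_len(length)
    if command == 'rep_match':           # flag bit + gamma(length)
        return 1 + gamma_len(length)
    if command == 'new_offset':          # flag + gamma(offset MSB) + gamma(length),
        # minus the first length bit, which rides inside the offset LSB byte
        return 1 + gamma_len(offset_msb) + gamma_len(length) - 1
    raise ValueError(command)

# every command consumes an even number of reservoir bits, so the '1' pushed
# in by LD A,(HL) : INC HL : RLA can never be mistaken for a data bit
for n in range(1, 300):
    for cmd in ('literal', 'rep_match', 'new_offset'):
        assert reservoir_bits(cmd, length=n, offset_msb=n) % 2 == 0
```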
    Joefish
    Rick Dangerous
    Posts: 2058
    Joined: Tue Nov 14, 2017 10:26 am

    Re: A new data compressor called ZX0

    Post by Joefish »

    That's what I used in my encoding! Actually I think I cached two or three recently used offsets in a stack, although your usage of one bit to pick from your three options is pretty ruddy clever.

    Although why do you say that using 'last offset' only makes sense after a literal block? If you cached the previous offset when you define a new one, you could still revert to it (or switch new and old) after copying from the new offset. There are going to be some savings to be found there too... maybe... with graphics, or anywhere there's lots of repetition with small variations from one copy to the next.
    Alone Coder
    Manic Miner
    Posts: 401
    Joined: Fri Jan 03, 2020 10:00 am

    Re: A new data compressor called ZX0

    Post by Alone Coder »

    Introspec has excluded native compressors RIP, mRIP, and ZXRar from his article. ZXRar can be easily tested in NedoOS:

    calgary - 98571 (lost to Exomizer3)
    canterbury - 27939
    graphics - 163173 (better than others)
    music - 63721
    misc - 254202

    RAR headers included.
    Decompressor size is 342 bytes (uses 1566 bytes for temporary buffer).

    TR-DOS version of ZXRar optionally supports compression with 2-byte blocks (helpful for code), with 386-byte decompressor. It has to be tested too.

    The other compressors use similar methods, but smaller decompressors (mRIP is 220 bytes), and no headers.

    These formats, if implemented on PC with optimal compression (like 7zip does with ZIP format), can be more effective with the same decompressors.
    introspec
    Dizzy
    Posts: 53
    Joined: Wed Jan 02, 2019 2:26 pm
    Location: London

    Re: A new data compressor called ZX0

    Post by introspec »

    Joefish wrote: Wed Jan 20, 2021 3:49 pmAlthough why do you say that using 'last offset' only makes sense after a literal block? If you cached the previous offset when you define a new one, you could still revert to it (or switch new and old) after copying from the new offset. There are going to be some savings to be found there too... maybe... with graphics, or anywhere there's lots of repetition with small variations from one copy to the next.
    If you do not have a literal block there, then it means you can encode the same data by simply using a longer match length. So it already represents a fairly substantial saving. This type of context-dependent encoding is popular in good modern compression algorithms, such as LZMA. I think the cleanest discussion of these ideas can be found in Charles Bloom's article about his compression format LZNIB: http://cbloomrants.blogspot.com/2019/01 ... rithm.html
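To make this point concrete: under LZ copy semantics, a rep-match that immediately follows another match at the same offset produces exactly the same output as one longer match, so a format gains nothing by allowing it. A small editorial Python sketch, not ZX0 code:

```python
def lz_copy(out, offset, length):
    """Append `length` bytes copied from `offset` positions back (overlap allowed)."""
    for _ in range(length):
        out.append(out[-offset])

a = bytearray(b'abcabc')
lz_copy(a, 3, 2)        # a match...
lz_copy(a, 3, 2)        # ...followed by a rep-match at the same offset

b = bytearray(b'abcabc')
lz_copy(b, 3, 4)        # one longer match instead

assert a == b           # identical output, so the longer match always suffices
```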
    Alone Coder
    Manic Miner
    Posts: 401
    Joined: Fri Jan 03, 2020 10:00 am

    Re: A new data compressor called ZX0

    Post by Alone Coder »

    Alone Coder wrote: Wed Jan 20, 2021 4:02 pm calgary - 98571 (lost to Exomizer3)
    canterbury - 27939
    graphics - 163173 (better than others)
    music - 63721
    misc - 254202

    RAR headers included.
    Decompressor size is 342 bytes (uses 1566 bytes for temporary buffer).

    TR-DOS version of ZXRar optionally supports compression with 2-byte blocks (helpful for code), with 386-byte decompressor. It has to be tested too.
    Temporarily added 2-byte blocks in NedoOS version:
    calgary - 98525
    canterbury - 27911
    graphics - 163943
    music - 62740
    misc - 250341

    So, some files pack better, and some others worse.
    I suppose a future ZXRar compressor should automatically switch 2-byte blocks on and off depending on the results for a given file.
    Einar Saukas
    Bugaboo
    Posts: 3100
    Joined: Wed Nov 15, 2017 2:48 pm

    Re: A new data compressor called ZX0

    Post by Einar Saukas »

    introspec wrote: Wed Jan 20, 2021 12:26 pmVery impressive work Einar!
    Thank you!

    introspec wrote: Wed Jan 20, 2021 12:26 pmI do not think this is better in terms of compression ratio than the latest Exomizer, but it is definitely in the same league, and with a lot less demanding decompressor.
    Agreed! This was exactly my conclusion.

    introspec wrote: Wed Jan 20, 2021 12:26 pmOn average, it beats older versions of Exomizer in my tests, and the decompression speeds I am seeing even with the standard decompressor are faster than the fastest Exomizer can do at present. I think that if the decompression speed could get at least to the speed of recent aplib decompressors, you'll have a definite winner.
    The "mega" ZX0 decompressor is faster than all ZX7 decompressors. And according to this link, ZX7 is faster than aplib. Unless there's a much faster aplib decompressor out there?

    introspec wrote: Wed Jan 20, 2021 12:26 pmI do not have much time during the term time, but I'll definitely study your decompressors and will try to contribute to them, if/when I can.
    Thanks!

    introspec wrote: Wed Jan 20, 2021 12:26 pmP.S. I know you usually don't keep version numbers in your software, but could you please consider adding a version number to ZX0? It makes it so much easier to identify pieces of software evolving over the years.
    ZX0 is already an optimal compressor, so I don't expect it to evolve any further. But if I make any significant changes, I will add a version number to distinguish it. OK?
    Joefish
    Rick Dangerous
    Posts: 2058
    Joined: Tue Nov 14, 2017 10:26 am

    Re: A new data compressor called ZX0

    Post by Joefish »

    introspec wrote: Wed Jan 20, 2021 4:03 pm If you do not have a literal block there, than it means you can encode the same data by simply using longer match length. So, it already represents a fairly substantial saving. This type of context dependent encodings are popular in good modern compression algorithms, such as LZMA. I think the cleanest discussion of this kind of ideas can be found in the Charles Bloom article about his compression format LZNIB: http://cbloomrants.blogspot.com/2019/01 ... rithm.html
    What's discussed there isn't relevant. When you're compressing blocks as small as 64K or less on a modern PC (which all Spectrum data is going to be), you can take the time to consider EVERY POSSIBLE compression route: whether to skip over a possible back-reference and stick to an unbroken string of literals, or to re-use an old reference for a partial copy, or to pick a new reference for a longer back-copy. You don't have to make those decisions on-the-fly during encoding based on statistical probabilities; you can simply brute-force it, follow every possible choice through to its end, and know for sure which is the best option.

    The choices come down to how many bits you give over in your encoding scheme to encode each type of data. I used the first few values in the smallest back-reference encoding to indicate a possibly cached, older back-reference. Then any higher number was a new back-reference. But it meant using the same number of bits as a new, small back-reference, to be able to re-use a cached one. Which made many potential re-uses irrelevant. By hitting on the idea of just using a single bit to signal a new reference or re-use the last (cached) reference, Einar's got a better encoding scheme. But he's not using it at every opportunity, I feel. Having said that, his decoder is way, way more efficiently written than anything I've ever done!

    I like his take on Elias Gamma Coding, particularly for speed, as it starts with a whole byte (where 7 bits are data and the last is an extension bit) rather than having to stream an arbitrary number of bits. A question though. Can it always guarantee it can copy enough literals? What if it can't represent a copy-count big enough and is forced to swap to a back-reference, and then can't find one?
    Einar Saukas
    Bugaboo
    Posts: 3100
    Joined: Wed Nov 15, 2017 2:48 pm

    Re: A new data compressor called ZX0

    Post by Einar Saukas »

    introspec wrote: Wed Jan 20, 2021 3:47 pmAnd this also means that you cannot really use 8 bit LSB with the same decompressor structure, as it will break literally EVERYTHING :)
    I actually implemented another alternative version with an 8-bit LSB, called ZXX. Using a full byte for the LSB simplifies other parts of the code, which somewhat compensates for not being able to use the optimization you mentioned. Because of this, the ZXX decoder was even shorter. It also compressed better in a few benchmarks (especially the Calgary test), but worse in most practical cases. In the end, I decided that ZXX was not worth it.

    You will find ZXX in this old commit if you are interested.

    introspec wrote: Wed Jan 20, 2021 3:47 pmUPDATE 2: However, since this is the case, why did you not use interlaced gamma code?
    Because using interlaced gamma code would always make each compressed file exactly 1 byte larger. I didn't think it was worth it.

    It's the same reason I didn't use interlaced gamma code in ZX7. This idea was originally discussed during ZX7 development here:

    https://worldofspectrum.org/forums/disc ... ent_673185

    I have considered using interlaced gamma code in ZX1, which is going to be a simplified version of ZX0 with a much smaller decompressor. But I still need to run more tests before I can make a final decision.
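For reference, the two codings being compared cost the same number of bits per value; only the ordering of the continuation bits and the value bits differs, which means the fixed 1-byte difference Einar mentions is a property of the stream framing rather than of the per-value cost. A hedged Python sketch of both encodings, following introspec's notation (1, 01x, 001xx versus 1, 0x1, 0x0x1):

```python
def gamma_standard(n):
    """1 -> '1', '1x' -> '01x', '1xx' -> '001xx', ..."""
    bits = bin(n)[2:]
    return '0' * (len(bits) - 1) + bits

def gamma_interlaced(n):
    """1 -> '1', '1x' -> '0x1', '1xx' -> '0x0x1', ...
    Each '0' means "another value bit follows"; the final '1' means stop."""
    bits = bin(n)[2:]
    out = ''
    for b in bits[1:]:                   # value bits after the implicit leading 1
        out += '0' + b
    return out + '1'

assert gamma_standard(5) == '00101'
assert gamma_interlaced(5) == '00011'
# identical cost for every value; only the bit ordering differs
assert all(len(gamma_standard(n)) == len(gamma_interlaced(n)) for n in range(1, 1000))
```

The speed advantage of the interlaced form is that the decoder sees one continuation bit, then one value bit, in strict alternation, so it never needs a separate length-counting pass.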
    Einar Saukas
    Bugaboo
    Posts: 3100
    Joined: Wed Nov 15, 2017 2:48 pm

    Re: A new data compressor called ZX0

    Post by Einar Saukas »

    Joefish wrote: Wed Jan 20, 2021 3:49 pm Although why do you say that using 'last offset' only makes sense after a literal block? If you cached the previous offset when you define a new one, you could still revert to it (or switch new and old) after copying from the new offset. There are going to be some savings to be found there too... maybe... with graphics, or anywhere there's lots of repetition with small variations from one copy to the next.
    Alternating between the last 2 offsets, or allowing reuse of the last N offsets, are all good ideas. However, providing more choices also requires more bits to distinguish them. Is it worth it? I have no idea :)

    Besides good compression, my other goals for ZX0 were a small fast decompressor, no additional storage, no self-modifying code, and using only the main registers. These goals wouldn't be achievable if it had to remember multiple offsets.
    Einar Saukas
    Bugaboo
    Posts: 3100
    Joined: Wed Nov 15, 2017 2:48 pm

    Re: A new data compressor called ZX0

    Post by Einar Saukas »

    Joefish wrote: Wed Jan 20, 2021 7:19 pmWhen you're compressing blocks as small as 64K or less on a modern PC (which all Spectrum data is going to be) then you can take the time to consider EVERY POSSIBLE compression combination route, whether that be skipping over a possible back-reference and just stick to an unbroken string of literals, or whether to re-use an old reference for a partial copy or pick a new reference for a longer back-copy. You don't have to make those decisions on-the-fly during encoding based on statistical probabilities; you can simply brute-force it and follow every possible choice through to its end and know for sure which is the best option.
    Not really :)

    The number of possibilities increases exponentially with file size. Because of this, my first brute-force implementation took a few hours to compress 64Kb! Unfortunately, most optimizations I originally created for the ZX7 compressor didn't apply to the ZX0 format, so I had to figure out new ones to make it viable.
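The parsing problem Einar describes is usually tamed with dynamic programming rather than raw enumeration. Below is a deliberately simplified editorial sketch in Python: the flat bit costs are hypothetical and there is no last-offset state, so this is NOT ZX0's actual algorithm, only the general idea of computing an optimal parse in polynomial time:

```python
# Hypothetical flat costs, for illustration only: a literal byte costs 9 bits
# (flag + byte), any match costs a flat 17 bits. ZX0's real cost model and its
# last-offset state make the problem considerably harder.
def optimal_parse_cost(data, cost_literal=9, cost_match=17, min_match=2):
    """Minimum encoding cost of `data` via dynamic programming,
    instead of enumerating every possible parse (which is exponential)."""
    n = len(data)
    best = [0] + [float('inf')] * n      # best[i] = cheapest way to encode data[:i]
    for i in range(n):
        best[i + 1] = min(best[i + 1], best[i] + cost_literal)   # emit a literal
        for j in range(i):               # brute-force match finder, for clarity
            length = 0
            while i + length < n and data[j + length] == data[i + length]:
                length += 1              # overlapping matches are fine in LZ
            for l in range(min_match, length + 1):
                best[i + l] = min(best[i + l], best[i] + cost_match)
    return best[n]

assert optimal_parse_cost(b'abcd') == 4 * 9            # no matches: all literals
assert optimal_parse_cost(b'abcabcabc') == 3 * 9 + 17  # 'abc' + one 6-byte match
```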

    Joefish wrote: Wed Jan 20, 2021 7:19 pmHaving said that, his decoder is way, way more efficiently written than anything I've ever done!
    Thank you!

    Joefish wrote: Wed Jan 20, 2021 7:19 pmCan it always guarantee it can copy enough literals? What if it can't represent a copy-count big enough and is forced to swap to a back-reference, and then can't find one?
    The copy-count limit in the Z80 version of the decompressors is 65535 bytes. It's enough to fill the entire memory :)

    The copy-count limit in the C version (both compressor and decompressor) is MAX_INT, which is usually over 2 billion. The only reason it's not MAX_LONG, is because I was trying to squeeze as much performance as possible from the compressor.
    Einar Saukas
    Bugaboo
    Posts: 3100
    Joined: Wed Nov 15, 2017 2:48 pm

    Re: A new data compressor called ZX0

    Post by Einar Saukas »

    Alone Coder wrote: Wed Jan 20, 2021 4:02 pm Introspec has excluded native compressors RIP, mRIP, and ZXRar from his article. ZXRar can be easily tested in NedoOS:
    That's impressive!
    introspec
    Dizzy
    Posts: 53
    Joined: Wed Jan 02, 2019 2:26 pm
    Location: London

    Re: A new data compressor called ZX0

    Post by introspec »

    Einar Saukas wrote: Wed Jan 20, 2021 7:08 pm
    introspec wrote: Wed Jan 20, 2021 12:26 pmI think that if the decompression speed could get at least to the speed of recent aplib decompressors, you'll have a definite winner.
    The "mega" ZX0 decompressor is faster than all ZX7 decompressors. And according to this link, ZX7 is faster than aplib. Unless there's a much faster aplib decompressor out there?
    Well, the fastest ZX7 decompressor on that diagram is actually unpublished, and it is definitely quite a bit faster than dzx7_mega.asm. However, I did make a mistake and you are indeed a bit faster than the fastest ApLib decompressor. Which is great, because I am sure dzx7_mega can be improved further! When my decompression time tests finish running, I'll give you more precise numbers.
    Einar Saukas wrote: Wed Jan 20, 2021 7:08 pm
    introspec wrote: Wed Jan 20, 2021 12:26 pmcould you please consider adding a version number to ZX0? It makes it so much easier to identify pieces of software evolving over the years.
    ZX0 is already an optimal compressor, so I don't expect it to evolve any further. But if I make any significant changes, I will add a version number to distinguish it. OK?
    I am secretly hoping for a much faster compressor :) Apultra proves it can be done, almost.
    introspec
    Dizzy
    Posts: 53
    Joined: Wed Jan 02, 2019 2:26 pm
    Location: London

    Re: A new data compressor called ZX0

    Post by introspec »

    Einar Saukas wrote: Wed Jan 20, 2021 7:08 pmThe "mega" ZX0 decompressor is faster than all ZX7 decompressors. And according to this link, ZX7 is faster than aplib. Unless there's a much faster aplib decompressor out there?
    [image: diagram of decompression speed vs. compression ratio for Z80 decompressors]
    Pretty amazing! This further supports my idea that the Pareto frontier should not have any kinks or dips - your new compressor is already top notch, and every improvement to the decompressors will move you further to the right!
    catmeows
    Manic Miner
    Posts: 716
    Joined: Tue May 28, 2019 12:02 pm
    Location: Prague

    Re: A new data compressor called ZX0

    Post by catmeows »

    Einar Saukas wrote: Wed Jan 20, 2021 9:32 pm

    Alternating between the last 2 offsets, or allowing to reuse the last N offsets, are all good ideas. However providing more choices also require more bits to distinguish them. Is it worth it? I have no idea :)
    According to Bloom, entries in the old-offset table have diminishing returns. The second entry gives less gain than the first, the third gives less gain than the second, and so on. ApLib keeps one entry, LZMA keeps 4. Exactly when the cost of additional entries (flags) starts to hurt compression depends on the data. It is generally a good idea to use some kind of adaptive coding to cancel out pathological cases when you play with many stored offsets.
    Anyway, I always thought ZX7 was quite old-fashioned and that someone should create something like ZX0. I think you really found the sweet spot.
    Proud owner of Didaktik M
    introspec
    Dizzy
    Posts: 53
    Joined: Wed Jan 02, 2019 2:26 pm
    Location: London

    Re: A new data compressor called ZX0

    Post by introspec »

    The current "mega" decompressor is very wasteful. Here is a 158/163 byte decompressor that is 10%/15% faster:

    Code: Select all

    ;	DEFINE	AllowSelfmodifyingCode
    
    		MACRO	GET_BIT
    			add a : call z,ReloadByte
    		ENDM
    
    		MACRO	FASTER_GET_BIT
    			add a : jp nz,1f : ld a,(hl) : inc hl : rla
    1
    		ENDM
    
    
    @DecompressZX0:		ld bc,#FFFF
    		IFNDEF	AllowSelfmodifyingCode
    			push bc						; default offset is -1
    		ELSE
    			ld (PrevOffset),bc
    		ENDIF
    			inc bc : inc c
    			ld a,#80 : jr RunOfLiterals
    
    UsualMatch:
    		IFNDEF	AllowSelfmodifyingCode
    			inc sp : inc sp					; discard last offset
    		ENDIF
    			FASTER_GET_BIT : call nc,ReadEliasGamma
    			dec b : ret z					; EoD
    
    			exa : xor a : sub c : ld b,a : exa
    			ld c,(hl) : inc hl
    			sra b : rr c					; last offset bit becomes first length bit
    
    		IFNDEF	AllowSelfmodifyingCode
            		push bc						; preserve new offset
    		ELSE
    			ld (PrevOffset),bc
    		ENDIF
    			ld bc,1 : call nc,ReadEliasGamma
    			inc bc
    
    		IFNDEF	AllowSelfmodifyingCode
    CopyMatch		ex (sp),hl					; preserve source, restore offset
            		push hl						; preserve offset
    			add hl,de
    			ldir : inc c
    			pop hl						; restore offset
    			ex (sp),hl					; ex (sp),hl - preserve offset, restore source
    		ELSE
    CopyMatch		push hl						; preserve source
    PrevOffset		EQU $+1 : ld hl,#FFFF				; restore offset (default offset is -1)
    			add hl,de
    			ldir : inc c
    			pop hl						; restore source
    		ENDIF
    
    			; after a match you can have either
    			; 0 + <elias length> = run of literals, or
    			; 1 + = another match
    			add a : jr c,UsualMatch
    
    RunOfLiterals:		FASTER_GET_BIT : call nc,ReadEliasGamma
    			ldir : inc c
    
    			; after a literal run you can have either
    			; 0 + <elias length> = match using a repeated offset, or
    			; 1 + = another match
    			add a : jr c,UsualMatch
    
    RepMatch:		FASTER_GET_BIT : call nc,ReadEliasGamma
    			jp CopyMatch
    
    
    ;
    ;  elias gamma reader
    ;  1 -> 1
    ;  01x -> 1x
    ;  001xx -> 1xx etc
    
    ReadEliasGamma:		add a : jr c,EliasCode01
    			FASTER_GET_BIT : jr c,EliasCode001
    			add a : jr c,EliasCode0001
    			GET_BIT : jr c,EliasCode00001
    
    			ld c,5-1
    .CodeLengthLoop		inc c : GET_BIT
    			jr nc,.CodeLengthLoop
    
    			ld ixl,c : ld c,1
    .CodeValueLoop		GET_BIT : rl c : rl b
    			dec ixl : jr nz,.CodeValueLoop
    			ret
    
    EliasCode00001		add a : rl c
    EliasCode0001		GET_BIT : rl c
    EliasCode001		add a : rl c
    EliasCode01		add a : jr z,ReloadByte1
    			rl c : ret
    ReloadByte1		ld a,(hl) : inc hl : rla
    			rl c : ret
    
    ReloadByte:		ld a,(hl) : inc hl : rla : ret
    I know you mentioned that you tried to avoid self-modifying code, but 5% on top of everything else is not small. The rest of the performance improvements come from processing the first bit of the Elias gamma inline, and also from aggressive inlining of the key code paths through it (esp. taking into account which bits are even and which are odd). The inner (general) code inside the Elias gamma reader is currently just lazy; it needs to be optimized for size (as it does not affect the performance anymore).
    introspec
    Dizzy
    Posts: 53
    Joined: Wed Jan 02, 2019 2:26 pm
    Location: London

    Re: A new data compressor called ZX0

    Post by introspec »

    This is an updated diagram, showing the improved performance of the new decompressor.

    [image: updated diagram of decompression speed vs. compression ratio]

    Einar, now you can definitely claim that ZX0 is faster than ZX7 :) at least on average.
    Einar Saukas
    Bugaboo
    Posts: 3100
    Joined: Wed Nov 15, 2017 2:48 pm

    Re: A new data compressor called ZX0

    Post by Einar Saukas »

    introspec wrote: Wed Jan 20, 2021 10:40 pm Well, the fastest ZX7 decompressor on that diagram is actually unpublished and it is definitely quite a bit faster than dzx7_mega.asm.
    I just noticed a faster ZX7 decompressor at the bottom of this page. It didn't seem to be faster than dzx0_mega, although I didn't have time to test it extensively.

    Is this the unpublished ZX7 decompressor that you mentioned, or is there another even faster?