Stability of IDs in ZXDB

Broken link? Feature request? Anything related to the Spectrum Computing website here.
Post Reply
User avatar
pak21
Drutt
Posts: 11
Joined: Sun Nov 12, 2017 9:14 pm

Stability of IDs in ZXDB

Post by pak21 »

[ Not sure if this is the right place for this topic. If it's not, please move it somewhere more appropriate ]

As well as the well known game IDs, ZXDB contains IDs for things like releases, labels and the like. How stable are these IDs across ZXDB releases? For example, if I'm looking at the release (2259, 4) which is currently the budget re-release of Head over Heels, will this always be the budget re-release of Head over Heels, or could that move to (2259, 3) or (2259, 5)?

I've tried to solve this problem multiple times in the past when dealing with changes from WoS database dumps and it's hard. On the other hand if ZXDB has now hard-forked from WoS, then it's easier...
User avatar
Mike Davies
Microbot
Posts: 137
Joined: Mon Nov 13, 2017 10:11 am

Re: Stability of IDs in ZXDB

Post by Mike Davies »

From memory, the releases table there isn't an id field (except entry_id as a foreign key). It has a release_seq, which I think is a sequence, that's only an arbitrary sort order column. From the docs I've seen the only safe assumption to make about this *_seq fields is that 0 is the primary release.

How Einar arrived at the current sequence allocations, I don't know. There doesn't seem to be a granular date order to work from. So my gut feeling is that it can't be safely used as a unique identifier.

I take it you have a use case that needs to reference a particular release of a title?
User avatar
pak21
Drutt
Posts: 11
Joined: Sun Nov 12, 2017 9:14 pm

Re: Stability of IDs in ZXDB

Post by pak21 »

Mike Davies wrote: Mon Nov 13, 2017 8:53 pm From memory, the releases table there isn't an id field

Code: Select all

MariaDB [zxdb]> show index from releases;
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| releases |          0 | PRIMARY  |            1 | entry_id    | A         |       34696 |     NULL | NULL   |      | BTREE      |         |               |
| releases |          0 | PRIMARY  |            2 | release_seq | A         |       34696 |     NULL | NULL   |      | BTREE      |         |               |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
:-) You can argue a primary key isn't an index, but if it's there...
I take it you have a use case that needs to reference a particular release of a title?
A number of them, but probably the most immediately public is Fuse's loader database where I'm much rather reference a release of a game as ID 12345 rather than "Game ID 3563, Original release, Tape 1, Side 1".
User avatar
Einar Saukas
Bugaboo
Posts: 3070
Joined: Wed Nov 15, 2017 2:48 pm

Re: Stability of IDs in ZXDB

Post by Einar Saukas »

pak21 wrote: Mon Nov 13, 2017 8:41 pm[ Not sure if this is the right place for this topic. If it's not, please move it somewhere more appropriate ]
Sorry for the late reply! I subscribed to this forum 2 days after you posted this question, and I didn't notice it until now... I will be paying more attention to questions related to my projects from now on!

pak21 wrote: Mon Nov 13, 2017 8:41 pmAs well as the well known game IDs, ZXDB contains IDs for things like releases, labels and the like. How stable are these IDs across ZXDB releases? For example, if I'm looking at the release (2259, 4) which is currently the budget re-release of Head over Heels, will this always be the budget re-release of Head over Heels, or could that move to (2259, 3) or (2259, 5)?
The short answer is, these IDs are extremely stable. They will never be changed unless there's a very very good reason to do so.

The long answer is a little more complicated...

In your example, (2259,4) refers to the 4th re-release of InfoseekID number 0002259. Let's see:


ABOUT INFOSEEKID:

In original WoS archive, sometimes people realized that two different numbers were actually different releases of the same program, in these cases Martijn moved all information to one of the entries and eliminated the other. It's probably the reason there are so many gaps in program numbers at WoS.

This may also happen sometimes in ZXDB, but only in cases I'm fairly confident it won't cause trouble for others. In this case, I will mention it here, and I won't reuse a deleted InfoseekID for anything else so there will be no false positives.

A counter-example is program "Enigma" with InfoseekIDs 0009708 and 0009709. One of them should have been deleted but, since original WoS archive still have 2 different entries, I didn't want to break compatibility so I just added a comment about it instead. This way, we won't have lost links, or entries from original WoS archive without correspondence in ZXDB.


ABOUT RELEASE SEQUENCE:

This number is exactly the order that multiple releases were stored in original WoS archive for each program, thus the order they appear in a webpage.

This order was supposed to be chronological, but this information isn't always easy to find out and verify. In practice, I will only modify the sequence numbers of a particular game, if we can be sure that the release order for a certain game is wrong. If it happens, I will also mention it here.

Notice I'm already adding new entries in the proper order, so we won't have this problem in new cases.

pak21 wrote: Mon Nov 13, 2017 8:41 pmI've tried to solve this problem multiple times in the past when dealing with changes from WoS database dumps and it's hard. On the other hand if ZXDB has now hard-forked from WoS, then it's easier...
In practice, you can assume these numbers are fixed. Any changes will be very rare, and I will mention them in this forum.

Can you tell me more about your usage of this information, so I can try to minimize impacts for you?
User avatar
pak21
Drutt
Posts: 11
Joined: Sun Nov 12, 2017 9:14 pm

Re: Stability of IDs in ZXDB

Post by pak21 »

Einar Saukas wrote: Fri Nov 17, 2017 1:25 pm Can you tell me more about your usage of this information, so I can try to minimize impacts for you?
At the moment, the use case is fairly simple: as part of the regression test suite for Fuse, I have a set of 164 games which are automatically checked to see if they load correctly. For this test suite to be useful to anyone else, they have to get exactly the same files. I clearly can't distribute the files themselves, so I'm doing the best I can otherwise: telling people as much information as I can about the file, including its SHA256. What I'm storing at the moment is something like "zxdb_id = 3369, release = 'Original, Small Case', sha256 = 9d36...", which hopefully means people can download the right file from... errr... somewhere else ;-)

It would obviously be nicer to just be able to reference a release in a "canonical" set of releases, rather than a nasty mash up of a canonical program ID and a text string - but unless those releases have a fixed identifier of some form, it's going to be actively harmful as people will start looking at the wrong release for the file to download.

Does that make sense?

(A bit of background and links here if you're interested in the regression suite itself)
Post Reply