ZXDB and russian web resources

This is the place for general discussion and updates about the ZXDB Database. This forum is not specific to Spectrum Computing.

Moderator: druellan

Einar Saukas
Bugaboo
Posts: 3070
Joined: Wed Nov 15, 2017 2:48 pm

Re: ZXDB and russian web resources

Post by Einar Saukas »

Ralf wrote: Tue Apr 10, 2018 8:18 am Einar, does it mean that you don't want individual MIA Russian games to be submitted to Spectrum Computing and ZXDB like here viewtopic.php?f=30&t=591 because you are going to get it all in some big import?
Any integration will take considerable time to be done properly. In the meantime, we shouldn't stop adding titles to the database.

Please continue providing more titles but, whenever you have the corresponding page links at those Russian sites, please provide them too. We will also store these links in ZXDB, so it will be much easier to integrate with these sites later.
Einar Saukas
Bugaboo
Posts: 3070
Joined: Wed Nov 15, 2017 2:48 pm

Re: ZXDB and russian web resources

Post by Einar Saukas »

moroz1999 wrote: Tue Apr 10, 2018 9:34 pm I've sent you contacts in private message. Please ask me if you need to contact anybody, I can help in some cases.
Thanks a lot! Since you also have similar plans, I will certainly involve you in these discussions too, so we can find the solution that works best for everyone.

moroz1999 wrote: Tue Apr 10, 2018 9:34 pm
Einar Saukas wrote: Tue Apr 10, 2018 3:19 am Something else we need to decide is, if it makes more sense for you to integrate with all sites separately, or integrate them with ZXDB first so ZX-Art can simply obtain this information from a single source afterwards. Again, let's talk about it in further detail to find out the best solution for everybody involved.
That's the most complicated question really. Ralf has really got the point: new software is constantly being added to different databases as we speak, so even if you import one of the archives, the week after you'll have to repeat it and somehow deal with the fact that the same software gets added to different unrelated databases.
ZXDB already deals with integrations that need to be updated periodically, such as RZX Archive and ZXSR. It's not too much of a problem, we just need to be careful on establishing a good process for each case.

moroz1999 wrote: Tue Apr 10, 2018 9:34 pm I'm going to resolve it this way:
1. I'll hold the unique guids for each author, alias, group, production and release from each database. This will allow me to run the import procedure more than once, so only the added information would have been imported.
What does "production" mean?

I agree about the others.
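The repeatable import described in point 1 can be sketched roughly as follows. This is only a minimal illustration, assuming each external record carries a stable GUID within its source; the `ImportState` class, the `"zxdb"` source label and the `entry-N` GUID strings are hypothetical names, not anything taken from ZXDB or ZX-Art.

```python
from dataclasses import dataclass, field


@dataclass
class ImportState:
    # Maps (source name, GUID within that source) -> local record id.
    seen: dict[tuple[str, str], int] = field(default_factory=dict)
    next_id: int = 1

    def import_record(self, source: str, guid: str) -> tuple[int, bool]:
        """Return (local_id, created). Re-importing a known GUID creates nothing."""
        key = (source, guid)
        if key in self.seen:
            return self.seen[key], False
        local_id = self.next_id
        self.next_id += 1
        self.seen[key] = local_id
        return local_id, True


# Hypothetical usage: the second run only creates the record that was added since.
state = ImportState()
first = [state.import_record("zxdb", g) for g in ("entry-1", "entry-2")]
second = [state.import_record("zxdb", g) for g in ("entry-1", "entry-2", "entry-3")]
```

Because the lookup key includes the source name, the same GUID string coming from two different databases would not collide.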

moroz1999 wrote: Tue Apr 10, 2018 9:34 pm 2. For every new file I'll check the file's MD5, and if it already exists in database, I'll just save an additional guid, not make a duplicate.
Also if a file disappears from ZXDB, it means the file was considered obsolete and replaced with a better version, so it may be better for you to replace it too.

Therefore it may work better for you if you simply drop all files from ZXDB, then import them all again. If any additional information you have (like comments) is associated with releases instead of individual files, then reimporting all files won't affect anything on your side.
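For reference, the MD5 check from point 2 might look something like the sketch below. It assumes file contents are available as bytes and that every source supplies its own GUID; `FileIndex`, `md5_of` and the source labels are hypothetical names used only for illustration.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Hex MD5 digest of the file contents."""
    return hashlib.md5(data).hexdigest()


class FileIndex:
    def __init__(self) -> None:
        # MD5 digest -> set of (source, guid) pairs that refer to the same bytes.
        self.by_md5: dict[str, set[tuple[str, str]]] = {}

    def add(self, source: str, guid: str, data: bytes) -> bool:
        """Register a file; return True only if its content was not seen before."""
        digest = md5_of(data)
        is_new = digest not in self.by_md5
        # A known digest just gains one more GUID instead of a duplicate file.
        self.by_md5.setdefault(digest, set()).add((source, guid))
        return is_new


# Hypothetical usage: the same bytes arriving from two sources are stored once.
index = FileIndex()
added_first = index.add("zxdb", "guid-a", b"tape image bytes")
added_again = index.add("zxart", "guid-b", b"tape image bytes")
```

Note that this only detects byte-identical files; a file replaced by a better version gets a new digest, which is the case the drop-and-reimport suggestion above is meant to handle.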

moroz1999 wrote: Tue Apr 10, 2018 9:34 pm 3. For every new file non-existing in database, I'll try to find the existing author and author's existing software (by name+year) and add a new release to the existing software.
This won't be necessary for new files reimported from ZXDB, because each one of them will already have ENTRY_ID and RELEASE_SEQ.

It won't be necessary for TOSEC files either, because you can obtain the ENTRY_ID for each of them from ZX Pokemaster.

It may not be necessary for Russian sites because, if ZXDB integrates with them, you should be able to obtain this information from ZXDB too. That's something we will need to decide.

moroz1999 wrote: Tue Apr 10, 2018 9:34 pm This means that I'll gather all the cracks, versions, mods and rereleases. Also, every sync procedure would be run periodically, I will need to manually fix the sync errors and improve the algorithms.
ZXDB aims to have everything (versions, mods and re-releases), except cracks. And TOSEC aims to have everything including cracks, associated with ZXDB through ZX Pokemaster. Also ZXDB aims to keep information about corresponding information in every other site. Therefore importing data from ZXDB would give you all information you need for integrating with everybody else with minimum effort.

There's still a lot of work to be done before we get there, but we are moving faster than ever :)
moroz1999
Manic Miner
Posts: 329
Joined: Fri Mar 30, 2018 9:22 pm

Re: ZXDB and russian web resources

Post by moroz1999 »

Einar Saukas wrote: Wed Apr 11, 2018 6:29 pm ZXDB already deals with integrations that need to be updated periodically, such as RZX Archive and ZXSR. It's not too much of a problem, we just need to be careful on establishing a good process for each case.
The difference is that new RZXs are being added only to RZX Archive, so there is no problem determining which ones you have and which ones you don't.
Imagine there was an online submission form for RZX, reviews, software or releases on spectrumcomputing.co.uk, which would instantly modify the database. Then the integration task becomes complicated :)

Einar Saukas wrote: Wed Apr 11, 2018 6:29 pm What does "production" mean?
I agree about the others.
I just call an entry a "production"; otherwise it's mostly the same.
Einar Saukas wrote: Wed Apr 11, 2018 6:29 pm Also if a file disappears from ZXDB, it means the file was considered obsolete and replaced with a better version, so it may be better for you to replace it too.
Yes, I agree. This is why I don't want to use entry_id+crc32 as GUID, because this would mean that after a file update I wouldn't be able to find which release it previously belonged to.
Einar Saukas wrote: Wed Apr 11, 2018 6:29 pm Therefore it may work better for you if you simply drop all files from ZXDB, then import them all again. If any additional information you have (like comments) is associated with releases instead of individual files, then reimporting all files won't affect anything on your side.
I'm afraid that's not so easy. The import procedure already takes hours, and if I re-download each file during each import, the problem gets worse. Also, this would put a load on the external database, which I would like to avoid at all costs.
Einar Saukas wrote: Wed Apr 11, 2018 6:29 pm This won't be necessary for new files reimported from ZXDB, because each one of them will already have ENTRY_ID and RELEASE_SEQ.
Thanks! ENTRY_ID + RELEASE_SEQ seems like the best choice for me at the moment.
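To illustrate why ENTRY_ID + RELEASE_SEQ is a more stable key than entry_id+crc32, here is a small sketch. It assumes a local store keyed by release, with the checksum kept as a plain attribute; `ReleaseKey`, `releases` and `upsert_file` are hypothetical names, and the checksum values are made up.

```python
from typing import NamedTuple


class ReleaseKey(NamedTuple):
    entry_id: int
    release_seq: int


# Hypothetical local store: release key -> CRC32 of the file currently attached.
releases: dict[ReleaseKey, int] = {}


def upsert_file(entry_id: int, release_seq: int, crc32: int) -> str:
    """Attach a file to a release, replacing an older file if the checksum changed."""
    key = ReleaseKey(entry_id, release_seq)
    if key not in releases:
        releases[key] = crc32
        return "added"
    if releases[key] != crc32:
        # Same release, different bytes: the upstream file was replaced.
        releases[key] = crc32
        return "replaced"
    return "unchanged"


# Hypothetical usage: a replaced file still maps to the same release.
r1 = upsert_file(100, 0, 0xAAAA)
r2 = upsert_file(100, 0, 0xBBBB)
r3 = upsert_file(100, 0, 0xBBBB)
```

With the checksum inside the key, the second call would have created a new, unrelated record instead of updating the existing release.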
Einar Saukas wrote: Wed Apr 11, 2018 6:29 pm It may not be necessary for Russian sites because, if ZXDB integrates with them, you should be able to obtain this information from ZXDB too. That's something we will need to decide.
I can guarantee that somebody will upload a release to zxn.ru, somebody will upload it to zxart, and somebody will submit it to WOS. They will all be identical, but they will all have different IDs :)
We are all together basically building a distributed ZX Spectrum archive. It's complicated, but it's better protected from extinction than a single-point-of-failure centralized solution.
Einar Saukas wrote: Wed Apr 11, 2018 6:29 pm ZXDB aims to have everything (versions, mods and re-releases), except cracks. And TOSEC aims to have everything including cracks, associated with ZXDB through ZX Pokemaster. Also ZXDB aims to keep information about corresponding information in every other site. Therefore importing data from ZXDB would give you all information you need for integrating with everybody else with minimum effort.

There's still a lot of work to be done before we get there, but we are moving faster than ever :)
Great! Let's see what we end up with.
Einar Saukas
Bugaboo
Posts: 3070
Joined: Wed Nov 15, 2017 2:48 pm

Re: ZXDB and russian web resources

Post by Einar Saukas »

Cool :)