Multiple sources have confirmed the same words from Lee Fogarty, posted in his own private Facebook group, which I cannot access. I wasn't planning to quote him, but since this is now in the open, there's no reason for privacy concerns anymore. He apparently forgot to post his claims in his own forum, so let me help him:
Lee Fogarty: "I think it needs clearing up exactly what "created by" means. And another reason the data has to go is some of the changes are dubious or un-needed. This is why WoS always wanted to credit people submitting changes, and list them on the whats new page.
From what I can make out, the claim is that we are using the ZXDB database. Totally untrue - the WoS db was created a long time before - using the original data files.
Any group of people creating a database from an existing dataset will invariably create similar tables and structure. Things such as the machine types used - create a list for machine types.. both parties will likely create the same table with the same data. There is a WoS admins group on FB that Einar was in, and posts still there where I am sending structure/data to him.
That seems to be changing now to we are using "their" data. Again - untrue. There are some left over bits from a very old import test that are being removed.
WoS currently has over 300,000 indexed pages. Not just software. The software is a very small part of the database, and with the bits we are removing, comes to a minuscule amount.
This is all something that could have been sorted with a PM."
So that's the main point. Is new WoS simply using the same data from old WoS that was imported into ZXDB? Or is it using such an early version of ZXDB (from July/August 2016) that it only contained old WoS data, so there's no need to credit ZXDB (despite literally about 50,000 fixes I made when importing this data)?
Unfortunately the answer is no. To understand the difference, let's take a look at ZXDB chronology. The summary below has plenty of links to prove everything, although I suggest ignoring the links for now and just reading from start to finish:
So that's the point. It comes back to something I wrote in my original post:
Einar Saukas wrote: "I imported old WoS content with their help from July 2016 to August 2016. If they had used one of the early versions of ZXDB that they participated, without crediting ZXDB, I would leave it alone. However they chose to use a version of ZXDB from September 2018, in order to take advantage of over 2 years of other people's work, without crediting anybody. That's a ZXDB version released 2 years after WoS stopped supporting ZXDB and started attacking my work. About 1 year after ZXDB and SpectrumComputing were censored at WoS thus forcing ZXDB to move to another forum. Months after I was personally censored at WoS without ever receiving any explanation."
Everybody is probably asking now, how do we know that new WoS is using ZXDB 1.0.8? Is it really much different from the original data from old WoS?
I'm glad you asked!
I will have to get technical now, but I will explain it so everyone can understand. We will compare new WoS content against old WoS and a few ZXDB versions. Let's see what happens!
To reproduce this experiment at home, you need MySQL (or, even better, MariaDB) and any SQL client (HeidiSQL, MySQL Workbench, DBeaver, etc). They are open source and free. Also download a few versions of ZXDB from GitHub (click on "commits" to find and download older versions) and load one of them into your database.
I already mentioned you could visit the new WoS software page and click on "EXPORT CSV (ALL)" to download some of the data from all titles stored in new WoS. I know lots of people did it (you may still have an old copy of this file yourself, perhaps in your Recycle Bin?). Let's start with "software-20200616.csv" from a day after new WoS was launched (we will talk about files from different days later).
Here's a database script to import this file into a database. Even if you don't know SQL, you should be able to see it's quite straightforward:
Code: Select all
create table x_newwos (
rows0 varchar(100),
id int(11) not null primary key,
title varchar(500),
slug varchar(500),
no_players varchar(100),
turn_type varchar(100),
entry_type varchar(100),
availability varchar(100),
comments varchar(5000),
is_x_rated varchar(100),
is_crap varchar(100),
clone_of varchar(100),
old_id int(11),
title_publisher varchar(500),
publishers varchar(500),
all_publishers varchar(500),
entry_groups varchar(500),
distribution_status_type varchar(500),
display_image varchar(500)
);
load data local infile 'software-20200616.csv'
into table x_newwos character set utf8
fields terminated by ',' optionally enclosed by '"'
lines terminated by '\n' ignore 1 lines;
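Before going further, it may be worth checking that the import actually worked. This is only a minimal sketch, assuming the table and column names from the script above; the assumption that rows without an old WoS id end up with old_id empty or 0 is mine, not something confirmed by the CSV itself.

```sql
-- Run this right after "load data": it lists problems (truncated values,
-- bad numbers, etc.) reported by the most recent statement.
show warnings limit 20;

-- Total number of rows loaded from the CSV file.
select count(*) as total_rows from x_newwos;

-- Rows that carry an old WoS id, i.e. the ones a join on old_id can match.
-- Assumption: titles without an old id were loaded as NULL or 0.
select count(*) as rows_with_old_id
from x_newwos
where old_id > 0;
```

If the server refuses "load data local infile" altogether, local file loading is probably disabled; in MySQL and MariaDB it has to be enabled on both client and server (the local_infile setting).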
Now download Martijn's original WoS internal file "maindb.dat". Fortunately, Lee Fogarty declared it "open source", so I don't need to worry anymore about sharing it. If you don't believe this file is authentic, choose any game at random and compare the corresponding line in this file against the old WoS pages. Let us know if you spot any difference!
Here's a simple database script to import this file:
Code: Select all
create table x_entries (
titlekey varchar(500),
pubkey varchar(500),
title varchar(500),
release_year varchar(10),
orig_publisher varchar(500),
re_publishers varchar(500),
memory varchar(500),
players varchar(500),
joysticks varchar(500),
genre varchar(500),
category varchar(500),
language varchar(500),
distrib_status varchar(500),
schemetype varchar(500),
downloads varchar(500),
flags varchar(500),
authors varchar(500),
aliases varchar(500),
id int(11) primary key not null,
spot_num varchar(500),
spot_genre varchar(500),
spot_full_price varchar(500),
spot_budget_price varchar(500),
spot_disk_price varchar(500),
spot_comments varchar(500),
spot_publisher varchar(500),
license varchar(500),
groupname varchar(500),
comments varchar(5000),
series varchar(500),
orig_price varchar(500),
c64_ref varchar(500),
spanish_price varchar(500),
wikipedia varchar(500),
typein_ref varchar(500),
authoring varchar(500)
);
load data local infile 'maindb.dat'
into table x_entries character set utf8
fields terminated by '\t'
lines terminated by '\n';
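To make the "choose any game at random" check easier, something like the following could be used once the file is loaded. It is just a sketch, assuming the x_entries table defined above; the columns selected are simply the ones that are easiest to compare by eye against an old WoS page.

```sql
-- Pick one random title from the imported maindb.dat data.
select id, title, release_year, orig_publisher, authors, comments
from x_entries
order by rand()
limit 1;
```

Run it a few times and look up each id on the old WoS pages.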
The first CSV file from new WoS didn't have much useful content besides title, original publisher, and comments. Comparing title and original publisher from old titles won't help, since old WoS rarely got this information wrong, so it almost never changed. However, comparing comments is very useful, since they are continuously improved in ZXDB with fixes, further details, etc.
Here's a simple SQL query to compare comments (except backslashes) between 2 tables. Notice it only compares titles that existed in old WoS (i.e. 24369 titles with ID up to 28187) to give new WoS a better chance:
Code: Select all
select e.id,e.comments,x.comments from entries e
inner join x_newwos x on e.id = x.old_id
where replace(coalesce(e.comments,''),'\\ ',' ') <> replace(coalesce(x.comments,''),'\\ ',' ')
and e.id <= 28187;
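The query above reads from the "entries" table of whichever ZXDB version is loaded. For the old WoS line of the comparison, a reasonable variant (a sketch under the assumption that x_entries from the maindb.dat import simply takes the place of "entries") counts the differences instead of listing them:

```sql
-- Same comparison, but against the imported maindb.dat table (x_entries)
-- and returning only the number of titles whose comments differ.
select count(*) as differences
from x_entries e
inner join x_newwos x on e.id = x.old_id
where replace(coalesce(e.comments,''),'\\ ',' ') <> replace(coalesce(x.comments,''),'\\ ',' ')
and e.id <= 28187;
```

Replacing "x_entries" with "entries" gives the same count for the ZXDB version currently loaded.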
From this comparison, you will get the following results:
Code: Select all
new WoS (software-20200616.csv) vs. old WoS (maindb.dat) - 2583 differences
new WoS (software-20200616.csv) vs. ZXDB 1.0.0 (April 2018) - 5 differences
new WoS (software-20200616.csv) vs. ZXDB 1.0.8 (September 2018) - 0 (zero) differences
new WoS (software-20200616.csv) vs. ZXDB 1.0.9 (October 2018) - 1 difference
new WoS (software-20200616.csv) vs. ZXDB 1.0.69 (latest) - 766 differences
As you can see, there's a lot more similarity between new WoS and current ZXDB than between new WoS and old WoS.
What if you want to repeat this test yourself to believe it, but you only have a newer CSV file from a different day? No problem. Although the CSV format at new WoS has changed over time, any CSV file downloaded earlier than 2 days ago (when all comments changed into a bloody mess) will do. You just need to add or remove a couple of columns from the import script, based on the column names you can see at the top of your CSV file. For instance, here's the same script adapted to the CSV columns from 2 days ago:
Code: Select all
create table x_newwos (
rows0 varchar(100),
id int(11) not null primary key,
title varchar(500),
slug varchar(500),
no_players varchar(100),
turn_type varchar(100),
entry_type varchar(100),
availability varchar(100),
comments varchar(5000),
is_x_rated varchar(100),
is_crap varchar(100),
clone_of varchar(100),
old_id int(11),
title_publisher varchar(500),
release_year varchar(10),
search_title varchar(500),
known_errors text(30000),
has_inlay varchar(4),
has_loading_screen varchar(4),
machine_type varchar(500),
publishers varchar(500),
control_types varchar(500),
theme varchar(500),
all_publishers varchar(500),
machine_types varchar(500),
entry_groups varchar(500),
az varchar(500),
distribution_status_type varchar(500),
display_image varchar(500),
index x_id(old_id)
);
load data local infile 'software-20200702.csv'
into table x_newwos character set utf8
fields terminated by ',' optionally enclosed by '"'
lines terminated by '\n' ignore 1 lines;
The same comparison using a more recent CSV file will show nearly identical results, except for 2 titles: Reckless Rufus (new comments added on June 26th) and Werner's Quest (new comments added on June 17th but later lost).
It's absolutely clear that new WoS is really using content taken from ZXDB, not from old WoS. Instead of just pointing out a few examples, we have now executed a comparison involving all titles. Even better, I provided instructions so anyone can replicate this experiment at home and see for themselves. And this comparison demonstrated that new WoS content is very much different from old WoS, not so much different from current ZXDB, and absolutely identical to ZXDB from September 2018.
As promised, this is my final post providing evidence. There's no need to prove anything else.
So what now? Well, THAT QUESTION will require one more post. But it's late, so let's talk about that tomorrow.
NOTE: On June 18th, I did a similar test at SpectrumComputing and found 2572 differences (instead of 2583) between new WoS and old WoS. That's because I compared new WoS against old WoS data that had already been converted to ZXDB in 2016. The new comparison here, directly between new WoS and old WoS, is even more accurate and indicates even more differences.
NOTE: Reproduced from my post at the WoS forum