Predictive Search

Broken link? Feature request? Anything related to the Spectrum Computing website here.
User avatar
Einar Saukas
Manic Miner
Posts: 953
Joined: Wed Nov 15, 2017 2:48 pm

Re: Predictive Search

Post by Einar Saukas » Thu Jul 18, 2019 7:48 pm

Sure, I understand your ideas. They are quite similar to my considerations when I created these tables about a year ago. But I still don't see how they could be improved.

I don't really need to store text in these tables as lowercase only, since all comparisons use "ci" collation. Everything is currently converted to lowercase just to reduce unnecessary duplicates. Also collation is currently defined in these tables just for convenience, so site owners don't need to remember to specify collation in their queries. Changing these details wouldn't improve searches or help autocomplete.

Moreover, I still think it's a good idea to remove spaces and punctuation from these tables, otherwise users searching for either "Offroad Racing" or "Off Road Racing" wouldn't find anything.

Admittedly I'm not an expert in this area so I could be wrong. If you have any changes to suggest, please go ahead!
0 x

User avatar
Einar Saukas
Manic Miner
Posts: 953
Joined: Wed Nov 15, 2017 2:48 pm

Re: Predictive Search

Post by Einar Saukas » Thu Jul 18, 2019 8:07 pm

RMartins wrote:
Thu Jul 18, 2019 2:18 pm
Also remember, that quick search, includes: Titles, Aliases, Authors and Publishers.
You probably don't need to distinguish between authors and publishers.

In field "author" you can simply autocomplete against:

Code: Select all

SELECT name FROM labels WHERE id IN
(SELECT label_id FROM search_by_authors)
Analogously for "publisher":

Code: Select all

SELECT name FROM labels WHERE id IN
(SELECT label_id FROM search_by_publishers)
In field "title" you can autocomplete against:

Code: Select all

SELECT title FROM entries 
UNION ALL SELECT title FROM aliases
And in a generic field "text" in basic search, you can autocomplete against:

Code: Select all

SELECT title AS text FROM entries 
UNION ALL SELECT title AS text FROM aliases 
UNION ALL SELECT name AS text FROM labels
For simplicity I have omitted COLLATION, GROUP BY, and ORDER BY. They are left as exercise for the reader :)

This way, it's also easy to impose a limit of 40 results overall, instead of partial limits of 10 each.
0 x

User avatar
RMartins
Manic Miner
Posts: 397
Joined: Thu Nov 16, 2017 3:26 pm

Re: Predictive Search

Post by RMartins » Fri Jul 19, 2019 1:42 am

Einar Saukas wrote:
Thu Jul 18, 2019 7:48 pm
Sure, I understand your ideas. They are quite similar to my considerations when I created these tables about a year ago. But I still don't see how they could be improved.

I don't really need to store text in these tables as lowercase only, since all comparisons use "ci" collation. Everything is currently converted to lowercase just to reduce unnecessary duplicates. Also collation is currently defined in these tables just for convenience, so site owners don't need to remember to specify collation in their queries. Changing these details wouldn't improve searches or help autocomplete.
I agree, except with the last sentence :)

NOTE:
If table data does not contain Upper case characters, we can never match them as Upper case characters.
However, if table data contains Upper case characters, then we can match or ignore them, depending on the collation being used.

So, there is an improvement to be gained, by not forcing the content to be all lower case, which is to allow matching case if required.

Another improvement, is that instead of using MYSQL collation utf8_unicode_ci , the MB4 version should be used (utf8mb4_unicode_ci), according to note on the following documentation page https://dev.mysql.com/doc/refman/8.0/en ... icode.html (this is mentioned in several places in the documentaiton, since the 3 byte version, is a MySQL fluke and non standard Umicode.

Einar Saukas wrote:
Thu Jul 18, 2019 7:48 pm
Moreover, I still think it's a good idea to remove spaces and punctuation from these tables, otherwise users searching for either "Offroad Racing" or "Off Road Racing" wouldn't find anything.
Punctuation in data is not required to be removed, for similar reasons as case (being possible to match it, if needed).

NOTE:
If we remove accents, we can no longer match them explicitly, since they are not there.
But if we have accents available in table data, we can ignore it (accent insensitive), just by using a different collation, or use it for mathching (accent sensitive).

NOTE:
Current collation in use utf8_unicode_ci, is equivalent to utf8_unicode_ai_ci (3 byte Unicode accent insensitive and case insensitive).
This is why searching for '%aña%' returns results that contain 'ana'.

Here is some documentation reference on this:
https://dev.mysql.com/doc/refman/8.0/en ... names.html

Regarding special cases, ( like "Off Road" versus "Offroad" ) the standard solution is to have a dictionary for those (where synonyms can be defined) or special match/translationrules (when using a search engine), but probably the best we can do here is to provide aliases (AKA synonym) for those known cases, including the entire title for simplification, for example in the "aliases" table or similar.
So there should be 2 entries like "Off Road Racing" versus "Offroad Racing".

This "aliases" (dictionary) can then be used, when generating the search indexes (AKA search_by_* tables in our case).
Einar Saukas wrote:
Thu Jul 18, 2019 7:48 pm
Admittedly I'm not an expert in this area so I could be wrong. If you have any changes to suggest, please go ahead!
Neither am I, because search is a huge subject, with many considerations and p´ros and cons, like any engineering solution, sometimes compromises need to be done.

But I can give you practical examples of this stuff I'm talking about.

For example, in every place in the world, there are streets or areas that are known by a different name, then their real name, usually for cultural reasons. And hence this is the name that most people use to search stuff in that area.

For example, in Lisbon, there was a big international event in 1998, The World International Exposition, also known as Expo 98.

The surrounding area of that event, got culturally known as "EXPO", which in fact, maps to 3 different parishes.
So when people search for stuff in EXPO, we know that we should limit the search to that geographic location.

The same happens, to a river side road that is culturally called "Marginal", since it's on the margin/bank of the river, but it has 3 or 4 names, since it crosses several parishes and 2 counties.

The same probably happens with some games, that might have some culturally distinct name.

No good example comes to mind right now, but a tentative example is "Chaos - The Battle of Wizards", that is known by many as just "Chaos". Not a very good example, because the cultural name is a substring of the real name.

Another example is "Live and Let Die" which is also known as "Aquablast". Although these different names are from different releases, but the idea behind is the same.
0 x

User avatar
RMartins
Manic Miner
Posts: 397
Joined: Thu Nov 16, 2017 3:26 pm

Re: Predictive Search

Post by RMartins » Fri Jul 19, 2019 2:15 am

Einar Saukas wrote:
Thu Jul 18, 2019 8:07 pm
RMartins wrote:
Thu Jul 18, 2019 2:18 pm
Also remember, that quick search, includes: Titles, Aliases, Authors and Publishers.
You probably don't need to distinguish between authors and publishers.

...
Currently the Quick search is using this query:

Code: Select all

SELECT N.name
FROM(
(
	SELECT	E.title AS name
	FROM	entries E	INNER JOIN search_by_titles SBT ON (SBT.entry_id = E.id)
	WHERE	SBT.entry_title LIKE ?
	ORDER	BY E.title
	LIMIT 10
)
UNION
(
	SELECT	A.title AS name
	FROM	aliases A
	WHERE	A.title LIKE ?
	ORDER	BY A.title
	LIMIT 10
)
UNION
(
	SELECT	L.name AS name
	FROM	labels L	INNER JOIN search_by_authors SBA ON (SBA.label_id = L.id)
	WHERE	L.name LIKE ?
	ORDER	BY L.name
	LIMIT 10
)
UNION
(
	SELECT	L.name AS name
	FROM	labels L	INNER JOIN search_by_publishers SBP ON (SBP.label_id = L.id)
	WHERE	L.name LIKE ?
	ORDER	BY L.name
	LIMIT 10
)) N
ORDER BY N.name
LIMIT 10
Einar Saukas wrote:
Thu Jul 18, 2019 8:07 pm
This way, it's also easy to impose a limit of 40 results overall, instead of partial limits of 10 each.
Not sure I understood what you meant.

It does not matter, if we have 40 results of each, if we are only going to present 10 in total.
At most they could be 10 of the same type, so we will never show the other 30 of each type.
If you look at the above query, it should be self explanatory.

I would suggest we take the more technical detail somewhere else, to avoid loosing most of the current readers :lol:

However, I can give a simple example, on how to query for each intent.
Here is a simple explanation of how COLLATIONs work, using weights.
https://dev.mysql.com/doc/refman/8.0/en ... ation.html

Lets focus just on titles for example.

Code: Select all

SELECT	E.title AS name
FROM	entries E	INNER JOIN search_by_titles SBT ON (SBT.entry_id = E.id)
WHERE	SBT.entry_title LIKE ?
ORDER	BY E.title
LIMIT 10
Current COLLATION is utf8_unicode_ci, meaning the search is accent insensitive and case insensitive.

Now suppose the user, introduced a search string like 'aña':
- this means the user is interested in matching the special char 'ñ'.
- the user did not use any Upper case character ( so is not interested in a specific case match ).

From the observation of the search input string, we can infer that we should do an accent sensitive and case insensitive search.

Code: Select all

SELECT	E.title AS name
FROM	entries E	INNER JOIN search_by_titles SBT ON (SBT.entry_id = E.id)
WHERE	SBT.entry_title LIKE ? COLLATE utf8mb4_unicode_as_ci
ORDER	BY E.title
LIMIT 10
Notice the extra "COLLATE utf8mb4_unicode_as_ci"!

NOTE:
Remember that to reap the benefits of the several distinct collations, the table data needs to contain mixed case and special chars, so that the comparisons will work as expected.

So, for all the possible combinations, we would need the following 4 COLLATIONS:
- utf8_unicode_ai_ci <=> utf8_unicode_ci
- utf8_unicode_as_cs <=> utf8_unicode_cs
- utf8_unicode_as_ci
- utf8_unicode_ai_cs

And this is were most MySQL installs will give us trouble, since these COLLATIONS are not ALL available by default!

But this can be fixed, by adding them to the server.
https://dev.mysql.com/doc/refman/8.0/en ... e-uca.html
It might not be as easy as we would like (might require a recompile), for the mixed ones at least.

But at first try, we could live with just the first two, that are usually available or more easily installable.

NOTE: to simplify the example, I'm not using the utf8mb4, which we should.

To check for locally available/installed collations, run this command:

Code: Select all

SHOW COLLATION WHERE Charset = 'utf8mb4';
UPDATE:
Using specific COLLATIONS on a query, might trip MySQL into not using indexes.
Or might require extra indexes on the same field, but for a different collation.
Something to keep an eye on.
0 x

User avatar
RMartins
Manic Miner
Posts: 397
Joined: Thu Nov 16, 2017 3:26 pm

Re: Predictive Search

Post by RMartins » Fri Jul 19, 2019 10:12 am

RMartins wrote:
Fri Jul 19, 2019 2:15 am
...

UPDATE:
Using specific COLLATIONS on a query, might trip MySQL into not using indexes.
...
But using SQL LIKE usually prevents the use of any index, so it might not be a big concern, since the database is small.
0 x

User avatar
Einar Saukas
Manic Miner
Posts: 953
Joined: Wed Nov 15, 2017 2:48 pm

Re: Predictive Search

Post by Einar Saukas » Fri Jul 19, 2019 4:44 pm

RMartins wrote:
Fri Jul 19, 2019 1:42 am
If table data does not contain Upper case characters, we can never match them as Upper case characters.
However, if table data contains Upper case characters, then we can match or ignore them, depending on the collation being used.
I understood your point. I just happen to disagree with you :)

This was your proposal for uppercase:
So, if user uses Upper case in the search string, then it means they are looking for that specific case, for example to distinguish the start of a Word or a Name. But if the user uses all lower case, then it means that any match is acceptable including in the middle of a word or name.
According to your proposal, searching for "Earth Shock" won't find anything, because the exact spelling is "Earthshock".

The point is, I believe searches should be as permissive as possible. Right now, users don't need to know the exact spelling of a title. Searching for "Earth Shock", "earthshock" and "earth-shock" will find the intended game. It goes even further: right now, searching for "night rally" will find both "Night Rally" and "Nightmare Rally". And searching for "Ocean Software" will also find titles published by Ocean brands and subsidiaries, not just Ocean directly. That's how searches in Martijn's WoS work and I always liked it, thus the reason I provided something similar with these auxiliary tables. It's a very different requirement from autocomplete, that only makes sense for exact matches.

RMartins wrote:
Fri Jul 19, 2019 1:42 am
Another improvement, is that instead of using MYSQL collation utf8_unicode_ci , the MB4 version should be used (utf8mb4_unicode_ci), according to note on the following documentation page https://dev.mysql.com/doc/refman/8.0/en ... icode.html (this is mentioned in several places in the documentaiton, since the 3 byte version, is a MySQL fluke and non standard Umicode.
Good point, I will fix it. Thanks!

RMartins wrote:
Fri Jul 19, 2019 1:42 am
Punctuation in data is not required to be removed, for similar reasons as case (being possible to match it, if needed).
I disagree...

RMartins wrote:
Fri Jul 19, 2019 1:42 am
If we remove accents, we can no longer match them explicitly, since they are not there.
But if we have accents available in table data, we can ignore it (accent insensitive), just by using a different collation, or use it for mathching (accent sensitive).
And I still think searches here should be always accent insensitive. Otherwise searching for "iniciación" won't find any results.

RMartins wrote:
Fri Jul 19, 2019 1:42 am
Regarding special cases, ( like "Off Road" versus "Offroad" ) the standard solution is to have a dictionary for those (where synonyms can be defined) or special match/translationrules (when using a search engine), but probably the best we can do here is to provide aliases (AKA synonym) for those known cases, including the entire title for simplification, for example in the "aliases" table or similar.
So there should be 2 entries like "Off Road Racing" versus "Offroad Racing".
Even so, your suggestion wouldn't be enough to locate the game called "4x4 Off-Road Racing". Unless you are suggesting to store lots of possible variants for each game title in the database? If so, that would interfere with autocomplete. Imagine someone typing "knight" and autocomplete showing 10 variants of the same title ("Knight Time", "Knight Tyme", "Knight-time", "Knight-tyme", "Knighttime", "Knighttyme", "Knightime", "Knightyme", etc) and none for Knight Lore, for instance.

Besides, it feels wrong to store aliases in the database for games that were never really known by those titles, just to help searches that are already working without this.
0 x

User avatar
RMartins
Manic Miner
Posts: 397
Joined: Thu Nov 16, 2017 3:26 pm

Re: Predictive Search

Post by RMartins » Fri Jul 19, 2019 6:35 pm

Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm
RMartins wrote:
Fri Jul 19, 2019 1:42 am
If table data does not contain Upper case characters, we can never match them as Upper case characters.
However, if table data contains Upper case characters, then we can match or ignore them, depending on the collation being used.
I understood your point. I just happen to disagree with you :)
That's perfectly fine. 8-)

If we would all think alike, it would be boring :)

But I believe you are disagreeing, because I might not have explained it well enough.

If you take into account, that you are using a collation that is accent insensitive and case insensitive, it doesn't matter, what accentuated characters or which case they are in the database, because an A = a = Â = à = À = á = ... etc... using the *_unicode_ai_ci collation.

So having the original data accents and case on the search_by_titles table, doesn't affect how you currently do your searches.
If you don't believe me, change one and try :D

Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm
This was your proposal for uppercase:
So, if user uses Upper case in the search string, then it means they are looking for that specific case, for example to distinguish the start of a Word or a Name. But if the user uses all lower case, then it means that any match is acceptable including in the middle of a word or name.
According to your proposal, searching for "Earth Shock" won't find anything, because the exact spelling is "Earthshock".
Nope.
NOTE: his is why I believe, I might not have explained myself well.

In search engines, things known to have different spellings, like your example above, are perfectly fine.
They just need several entries into the search index (not database index), like if they were alternative names (aliases).

Clipping/removing the spaces, dashes, funny characters, etc..., to try to retain only the parts that are common to facilitate a match is not the way to go about it, because it will give a lot of false positives in other unrelated searches.

So in this case, you would have 3 records in search_by_titles, namely:
- "Earth Shock"
- "Earthshock"
- "Earth-Shock"
that map to the same exact game id (entry_id).

If you think about it, you already do this with the "aliases" table.

The problem with clipping the uncommon stuff, besides giving false positives, is that it will not work for names (aliases) that are no exact matches once you strip the funny(uncommon) characters between the several spellings.

A good example is the "Live and Let Die" versus "Aquablast".
Same game, different "spellings" or names for the same thing.

I hope that clears any doubt about it.
You get the good but you also get the bad, by clipping data.
If instead you just keep the several spellings, you get the good part, but without the bad ones.

Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm
The point is, I believe searches should be as permissive as possible. Right now, users don't need to know the exact spelling of a title. Searching for "Earth Shock", "earthshock" and "earth-shock" will find the intended game.
Completely agree.
Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm
It goes even further: right now, searching for "night rally" will find both "
Night Rally" and "Nightmare Rally".
I hope you agree with me, that those results are not the outcome, of clipping the funny/uncommon chars.
Because "night rally" when stripped of the funny chars, would become "nightrally", and would never match "nightmarerally".

I haven't checked, but that outcome is most probably the result of the "aliases" table or similar.
Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm
And searching for "Ocean Software" will also find titles published by Ocean brands and subsidiaries, not just Ocean directly. That's how searches in Martijn's WoS work and I always liked it, thus the reason I provided something similar with these auxiliary tables. It's a very different requirement from autocomplete, that only makes sense for exact matches.
Again, that is not the result of clipping the funny/uncommon chars.
That is the result of pre-processing some table joins to include extra keys in the search_by_* tables, which is exactly what I'm proposing. 8-)

We can still have all those features, without having the false positives.

Autocomplete is just a search, like any other. I see no difference there.
Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm
RMartins wrote:
Fri Jul 19, 2019 1:42 am
Punctuation in data is not required to be removed, for similar reasons as case (being possible to match it, if needed).
I disagree...
Can you please try a search, after changing a few entries to have the correct case and accents, and confirm that you will still find them with the current collation in use ?

NOTE: If this isn't true, MySQL has a bug in that collation.
Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm
RMartins wrote:
Fri Jul 19, 2019 1:42 am
If we remove accents, we can no longer match them explicitly, since they are not there.
But if we have accents available in table data, we can ignore it (accent insensitive), just by using a different collation, or use it for mathching (accent sensitive).
And I still think searches here should be always accent insensitive. Otherwise searching for "iniciación" won't find any results.
OK, here is a matter of opinion, that I won't debate, because my proposition is based on the supposition of user intent, since we do not have a real search engine at hand, which would allow definition of rules to match cases like these, or misspells like char reversals and other combinations.
"Iniciation" versus "iniciacion" versus "inicaison" versus "iniciasno", etc...

Note however, if user knows and uses NON accentuated chars and gets the results with the correct accents, they will know what to do.
This is what I referred has user intent, in the search.
Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm
RMartins wrote:
Fri Jul 19, 2019 1:42 am
Regarding special cases, ( like "Off Road" versus "Offroad" ) the standard solution is to have a dictionary for those (where synonyms can be defined) or special match/translationrules (when using a search engine), but probably the best we can do here is to provide aliases (AKA synonym) for those known cases, including the entire title for simplification, for example in the "aliases" table or similar.
So there should be 2 entries like "Off Road Racing" versus "Offroad Racing".
Even so, your suggestion wouldn't be enough to locate the game called "4x4 Off-Road Racing". Unless you are suggesting to store lots of possible variants for each game title in the database?
That's exactly right!
That's the whole idea of a search Index (completely different from a database Index), and they usually are huge!
Specially if we do not have a search engine, that can generate the combinations by itself, given some (many) rules.

Since we are using the poor man SQL solution, we need to build the search index tables, with all valid combinations and then search that, using SQL.
Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm

If so, that would interfere with autocomplete. Imagine someone typing "knight" and autocomplete showing 10 variants of the same title ("Knight Time", "Knight Tyme", "Knight-time", "Knight-tyme", "Knighttime", "Knighttyme", "Knightime", "Knightyme", etc) and none for Knight Lore, for instance.
Actually that will work fine, because we only show the real title, extracted from the "entries.title", while searching using the search_by_titles.entry_title, and after performing an UNION operation, that removes the duplicates.

Code: Select all

SELECT	E.title AS name
FROM	entries E	INNER JOIN search_by_titles SBT ON (SBT.entry_id = E.id)
WHERE	SBT.entry_title LIKE ?
ORDER	BY E.title
LIMIT 10
Notice that the result comes from the entries table => E.title
Einar Saukas wrote:
Fri Jul 19, 2019 4:44 pm

Besides, it feels wrong to store aliases in the database for games that were never really known by those titles, just to help searches that are already working without this.
Now, that I can relate! :lol:

It's that feeling that it seems a waste due to apparent duplication of data.
But that is the domain of the problem unfortunately.

But let me clarify that when i mentioned "aliases", it was to make it simpler to relate.
I know that "aliases" table has a more in depth context, which is more akin to different language spellings or country releases.

So when I said "aliases" I meant real aliases as in possible replacement names, which are usually handled in specific collections (AKA tables) for search indexes.

Thank you for your feedback.
I really appreciate it.

I hope I was able to clarify any doubt, I may have left previously.
0 x

User avatar
Einar Saukas
Manic Miner
Posts: 953
Joined: Wed Nov 15, 2017 2:48 pm

Re: Predictive Search

Post by Einar Saukas » Fri Jul 19, 2019 11:05 pm

RMartins wrote:
Fri Jul 19, 2019 2:15 am
Einar Saukas wrote:
Thu Jul 18, 2019 8:07 pm
You probably don't need to distinguish between authors and publishers.
Currently the Quick search is using this query:
There's no need. All labels are valid in quick search.

For instance, try searching for "columbia pictures" in quick search to see what happens. It's neither author or publisher, but you will still get a result.

RMartins wrote:
Fri Jul 19, 2019 2:15 am
Einar Saukas wrote:
Thu Jul 18, 2019 8:07 pm
This way, it's also easy to impose a limit of 40 results overall, instead of partial limits of 10 each.
Not sure I understood what you meant.
Sorry, I understood you had an overall limit of 40 results. Now I see in your query that your overall limit is 10. Never mind!

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
But I believe you are disagreeing, because I might not have explained it well enough.
No. I understood exactly what you meant. There's no need to discuss it technically, because the problem is not technical. We just happen to have different ideas for the intended behavior.

You assume that users searching for "cadáv" are only interested in words containing this accent. I assume they want "Cadàveriön" but didn't memorize the exact spelling.

You assume that users searching for "off-road" are only interested in words containing this hyphen. I assume they want all off road games regardless of spelling.

You assume that users searching for "night rally" don't want to see "Nightmare Rally" in the list of results. I assume they do.

Basically you assume most users will bother to learn the nuances of using capital letters or not, using punctuation or not, etc. I assume they just want to find a game with minimum effort.

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
If you take into account, that you are using a collation that is accent insensitive and case insensitive, it doesn't matter, what accentuated characters or which case they are in the database, because an A = a = Â = à = À = á = ... etc... using the *_unicode_ai_ci collation.

So having the original data accents and case on the search_by_titles table, doesn't affect how you currently do your searches.
If you don't believe me, change one and try :D
Of course I believe you. That's exactly what I wrote in a previous post. In this post I wrote "I don't really need to store text in these tables as lowercase only, since all comparisons use "ci" collation. Everything is currently converted to lowercase just to reduce unnecessary duplicates."

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
They just need several entries into the search index (not database index), like if they were alternative names (aliases).
And that's exactly what I wanted to avoid, because maintaining a list of alternative names would be a PITA.

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
If you think about it, you already do this with the "aliases" table.
No, I don't. The "aliases" table contains alternative official names for the same game. It doesn't make sense to add other titles to this table that a game never had, just to help searches.

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
You get the good but you also get the bad, by clipping data.
If instead you just keep the several spellings, you get the good part, but without the bad ones.
I know. False positives is the price to pay for easier, broader searches. I think that's a good tradeoff.

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
We can still have all those features, without having the false positives.
I disagree, because my understanding of desired features is different than yours (as I described at the beginning of this post).

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
Autocomplete is just a search, like any other. I see no difference there.
I do!

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
Can you please try a search, after changing a few entries to have the correct case and accents, and confirm that you will still find them with the current collation in use ?
RMartins wrote:
Fri Jul 19, 2019 6:35 pm
Since we are using the poor man SQL solution, we need to build the search index tables, with all valid combinations and then search that, using SQL.
Again, that's what I'm trying to avoid.

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
Actually that will work fine, because we only show the real title, extracted from the "entries.title", while searching using the search_by_titles.entry_title, and after performing an UNION operation, that removes the duplicates.
Your proposal wouldn't be able to find "Electra 9000", for instance. It's the title of a game re-released by Alternative, probably better known than the original title from the first K'Soft release.

RMartins wrote:
Fri Jul 19, 2019 6:35 pm
Thank you for your feedback.
I really appreciate it.

I hope I was able to clarify any doubt, I may have left previously.
Absolutely!

And I hope my answers didn't sound as negative criticism, it wasn't my intent. You originally asked about the motivations for the current implementation, so I tried to explain them all and point out the reasons I didn't do it differently. However I have no right to impose my own preferences on any websites. You are currently working on autocomplete and searches at SC, so the site functionality should be your decision, not mine. Please feel free to implement your ideas, and if there's any way I can help just let me know. I'm confident you can do a good job, and hopefully the issues I pointed out can help you make even better choices. I will gladly leave the ball in your court now! :)

BTW tables "search_by" are included in ZXDB distribution right now, just for convenience. However they are not really part of the database content. I actually like the idea of distributing them separately, to emphasize they are optional and motivate each website to come up with its own creative solutions. Good luck!
0 x

User avatar
RMartins
Manic Miner
Posts: 397
Joined: Thu Nov 16, 2017 3:26 pm

Re: Predictive Search

Post by RMartins » Fri Jul 19, 2019 11:48 pm

Once again thank you for your feedback @Einar Saukas.

This I s just a debate so I hope I'm not comming across as imposing anything.That's not the intent.

Having said that, it would be nice to know what other users think about this subject, in order to get more feedback and ideas.

One of the reasons why I did not want to go too technical, so not to scare the rest of the crowd 😂

Please come back and give us your opinion or insight on how you use or want to use search.

Thank you in advance for any feedback.
0 x

User avatar
Einar Saukas
Manic Miner
Posts: 953
Joined: Wed Nov 15, 2017 2:48 pm

Re: Predictive Search

Post by Einar Saukas » Sat Jul 20, 2019 10:36 pm

RMartins wrote:
Fri Jul 19, 2019 11:48 pm
This I s just a debate so I hope I'm not comming across as imposing anything.That's not the intent.
Sure, don't worry about it!

RMartins wrote:
Fri Jul 19, 2019 11:48 pm
Having said that, it would be nice to know what other users think about this subject, in order to get more feedback and ideas.
I suspect this discussion has been too abstract for most users. It's hard for anyone to evaluate what's the best interface without actually trying it.

I suggest you implement your idea in your alternate page (or even replace the main page), afterwards write a post describing the changes and asking for feedback.
0 x

Post Reply