*Solved* IFDB Search anomaly

IFDB search game for “adventure” used to list games with adventure in title including Crowther & Woods’s Advent. But now it goes to this game directly without listing any others:

5 Likes

I’ve noticed that when I search “Never Gives” it says there are 0 games, even though I have a game with that name. Searching just “never” or “gives” does the same thing. I wonder if the recent restructure of search (that made things faster) causes these issues.

4 Likes

I had the same issue when trying to add NGUHD to the “longest games ever” thread. I success at the end.

1 Like

This has now been reported on github:

Searches for words in a game’s title don’t always include that game in results · Issue #438 · iftechfoundation/ifdb-suggestion-tracker (github.com)

1 Like

Thanks for raising these issues. They’re actually due to two different problems.

The “Adventure” issue is caused by our changes to support implicit search by IFID. Christmas Adventure is configured to have a bunch of non-reasonable IFIDs.

image

Since its IFID is “ADVENTURE,” we’re matching that one directly.

“Never Gives” is due to our our change to use a fulltext search index. Out of the box, our database uses a large list of “stopwords,” words that will be ignored in all searches, but the default list is more than 500 stopwords, words that absolutely shouldn’t be stopwords.

4 Likes

The IFIDs seem to match the tags for the game, so maybe these were entered as IFIDs by mistake.

3 Likes

I marked Dan’s post as the solution. I will edit the title of the topic to indicate as solved when both issues are fixed in IFDB.

It has been in my mind when I created the topic to tag it ifdb, but it totally slipped out of my mind to actually do it, thanks Hanon for doing it for me.

3 Likes

I’ve filed PRs to fix both of these issues. I’ll post an update here when they’ve merged in.

2 Likes

I’m glad these were small, discrete changes rather than major systemic issues!

3 Likes

Should they be removed in that case? Not sure if there’s a rule that IFIDs have to be UUIDs, but I think basically all of them are. Since it’s the easiest way to avoid collisions?

2 Likes

The intent is that they’re all UUIDs except for old games that predate the IFID spec, which get special-case IFIDs computed from the game file contents. As with all indexing schemes, this has not worked perfectly.

But in this case, it seems like a clear metadata error.

4 Likes

In the PR for this issue, I just removed the feature to implicitly search by IFID. You can still search by ifid:adventure and Christmas Adventure would turn right up.

If somebody wants to clean up bogus IFIDs, I think that’d make sense. Here’s how I found them.

Initially, I queried for IFIDs that didn’t contain a hyphen, like this:

select ifid from ifids where ifid not like '%-%';

That turned up 320 IFIDs, most of which looked like this:

ifid
004D9E39231A1797D3A8A58ADA7C02F4
00CC0D46A7A54FD08D89FB4652DA3CDC
0182F9777087D22739413F6B7C1896FA
01A08F7D37CB02940A1D6EAA7E57E5F9
04FC82FF2020CED57E315493435C59C1

Do those need to be “fixed up”? I’m honestly unsure. They appear to be 32-character UUIDs without their hyphens.

So then I searched for all IFIDs where the length was not exactly 32 or 36 characters long, and where the IFID does not start with any of the known, declared prefixes from the Treaty of Babel.

select ifid
from ifids
where 
    ifid not like 'ADRIFT-%'
    and ifid not like 'ADVSYS-%'
    and ifid not like 'AGT-%'
    and ifid not like 'ALAN-%'
    and ifid not like 'GLULX-%'
    and ifid not like 'HUGO-%'
    and ifid not like 'LEVEL9-%'
    and ifid not like 'MAGNETIC-%'
    and IFID not like 'MZ-%'
    and ifid not like 'TADS2-%'
    and ifid not like 'TADS3-%'
    and ifid not like 'ZCODE-%'
    and length(ifid) not in (32, 36);
Here's the list of 147 plausibly invalid IFIDs
ifid
02479785
1
110422
120115
120324
120528
121216
1212162134
123423232
130419
1313133
13911
140826
150525
160314
180104
181115
1989
210814
22083007
223423423
227788
230409
252204787
2A4C8FF-0CB9-451A-85AB-4271D14CAC29
2HWV41LDJGHB3D4L
3138023
33113311
359F3A66ED77FC3D4C1FA2DCB640A447B5E6CB344404F2FBC1DEF8A7A51C6400
3F866853-BC6C-4927-8F0C-BFFFDF6C3E7
407E5657-D5E9-4218-BE3E-3DECA658485F.
4406415
616263
6623
69950
6XBYTLWLC1C1ZT0G
74VDSUAU3KRW9048
7777333999555
78173
7A7865B2-DB4C-9574-48910D475317
823-8978-8888-8888
8888880119
8B0747EF-B9DB-4CA9-94B8-84960E73373
8CA5115F5-B05A-4070-A79B-0C65E8758AE5
943EBD54-BBF3-4457-A31B-66FB83991020.
9782384
9A3CFF0B-4CDF-453B-9CA3-453A28447B5A.
A6C2C73A-0358-481B-BFBF-C461BEE04515.
ABXKHCI33ZSWEB2L
ADVENTURE
ANASTASIYA
BADGERCIAR
BASIC
BEKANTDOOM
CAMELOTPARADOX
CCU403250023
CDF1
CHRISTMAS
COM.CESTRIAN.DEEPSAGA.ANDROID
CRAWL
DATING
DUNGEON
EUROPOP-VAMPIRE
F2E89303-A810-4012B581-BAFB00A61AE7
FANTASY
FIDO
G@BEN
GNOMONROSE
HAUNTED
HEXFU
HITLER
HOLIDAYS
HOMINATEH
HOMINATEHH
HOMINATEHHH
HORROR
HOUSE
HPECD
HTTPS://INTERACTIVEADVENTURES.ITCH.IO/WHAT-DWELLS-WITHIN
IDK
IFRIT10169
INSOMNIA
INTERACTIVE
INTEVENCIONIDENTIDAD
JACL-002
JACL-003
JACL-004
JASPSTIC
JOURNEY1979SOFTAPE
JOURNEYTOLUNARNINE!
KG5UM5UZVJZ1VJSL
KLUEANDDOO
KUMA74262ZH466DL8K
LAPDOG
LIBRARYSERIALNUMBER991113
MARKCMARINOCOM
MATCHMADEINSTEELAARTHUR9
MICRO
MIDELLEBLESSINGS
NA
NARRATIVE
NIGHTDRIVE474
OR-1
OR-2
OR-3
OR-4
OR-4E
OR-5
OR-6
PARANORMAL
PB3SNH6NK4VNS8PD
PLZ_NO_PLAY33
REDOCK001
RISEOFTHEMEMELORDS
RQ:BRR+
RSGSK1JGLWYNDJH4
SAFESIMFIREV1
SAMFOXALL
SANTA
SCRIPT-57E69A4362FB89E2E4237D65C8280C67
STORYSMASH
SURVIVETHEHOLOCENE
TAPLO-1
TAPLO-2
TFLIZZLE
TLAOTLCYOA
TLHWBB
TRANSGRESION
TSB4JDMN4FHDPEBH
TURRAKS-CLASSICS-THE-PRISON
TWINYJAM
UHYES
UNABLETODETERMINE
UNKNOWN
UNKNOWNALPHA
VWMDIAC
WILBURC
WIP
WK3
WORMO
WUT
XM9LATSQMA9F6ENS
XXXXXXX
Y8ZH5-VC1KMSGQ4BXAGEFA
YG9X8NZC81JQJTHQ
{STXXDI3PYVQ39HN1}

It’s plausible that someone should just search for all of these (by searching IFDB for ifid:02479785 etc.) and just delete all of those bogus IFIDs.

Finally, as you can see, there are currently no validation rules for IFDB’s IFIDs. Reading over the Treaty, I think someone could cook up a validation algorithm. I’d accept a PR for this, but I’m probably not going to do it myself.

6 Likes

I think I’ll go through and clean up some of these ifids but I’ll start from the bottom of the list in case someone else already started from the top.

3 Likes

If you do try to fix them up, you’ll likely run into this bug that @JTN just told me about:

Edit a game, blank out its IFID field, and save it. You’ll see this error:

An error occurred updating the game record in the database. You might try the request again in a little while, or contact us if the problem persists. If you need to contact us, please tell us exactly what you were trying to do (if you can take a snapshot of the screen showing all of the updates you made, that would help), and mention this error code, which might tell us more about what’s going on internally to the database: IIF070F.0 [1062 - Duplicate entry ‘’ for key ‘PRIMARY’].

I’ve fixed it in a PR, not yet merged. Despite the error message, the change will go through correctly if all you’re doing is blanking out the IFID field and submitting.

That’d be me then… looks like we met around T. (I’ve been cherry-picking the works with obvious bogus word-like IFIDs, leaving the strings-of-letters-and-number ones; and bombing through pretty fast.)

2 Likes
Now these 38 weird IFIDs remain.
ifid
1212162134
123423232
130419
1313133
13911
150525
160314
180104
181115
210814
22083007
223423423
227788
230409
252204787
2HWV41LDJGHB3D4L
3138023
33113311
4406415
616263
6623
69950
6XBYTLWLC1C1ZT0G
74VDSUAU3KRW9048
7777333999555
78173
823-8978-8888-8888
8888880119
8B0747EF-B9DB-4CA9-94B8-84960E73373
8CA5115F5-B05A-4070-A79B-0C65E8758AE5
9782384
ABXKHCI33ZSWEB2L
CCU403250023
CDF1
KG5UM5UZVJZ1VJSL
KUMA74262ZH466DL8K
PB3SNH6NK4VNS8PD
1 Like

I’ve had another pass through the numbers-and-letters ones, and got rid of some of them.

Most of what I’ve left was works where it’s impossible to tell what the real IFID is if any, because there are no links at all from the IFDB record. (With some vague notion that the bogus IFID might be A Clue, to someone who cared more.)

1 Like

I think this regex allows all valid IFIDs, while not allowing a lot of invalid ones. It will allow a few legacy IDs which actually don’t exist, like LEVEL9-025. It assumes that the language codes for Level 9s Champion of the Raj are two-letter codes in uppercase although this isn’t specified in the treaty.

^([0-9A-F]{8}-([0-9A-F]{4}-){3}[0-9A-F]{12}|ZCODE-\d{1,5}-[A-Za-z0-9-]{6}(-[0-9A-F]{4})?|GLULX-([0-9A-F]{8}|\d{1,5}-[A-Za-z0-9-]{6})-[0-9A-F]{8}|((TADS[23]|HUGO|MAGNETIC|LEVEL9|ADVSYS|ALAN|HTML|MZ|ELF|JAVA|AMIGA|SCRIPT|MACHO|MAC)-)?[0-9A-F]{32}|MAGNETIC-[1-7]|AGT-\d{5}-[0-9A-F]{8}|LEVEL9-0[0-2][0-9](-(\d|[A-Z]{2}))?)$

This should be matched as case sensitive (without the “i” flag).

The regex should be self-explanatory :slight_smile:

2 Likes

This one is in fact valid.

It’s missing the Adrift IFIDs, and the updated legacy Hugo ones. (The Babel spec hasn’t had a formal update since these were documented, so check the source on GitHub.)