Tagfest 2024: Suggested tag consolidations

I’m starting this thread to gather everyone’s observations about possible tag consolidations.

The best candidate for a tag consolidation is one in which all of the following are true:

  1. The two tags have a very specific and identical meaning.
  2. One tag has a much larger “population” of games to which it has been applied.
  3. There is at least some overlap between the populations of the two tags.

A case study would be the tags one room and one-room. These have a conventionalized meaning which (as far as I can see) is the same in both cases. There doesn’t seem to be a compelling reason for both tags to exist.

In this case, the two populations were about the same size, but there is also a third tag, single room, which has a much larger population than either of them and some overlap with each. It seems very safe to get rid of one-room and one room in favor of single room, which carries the same conventionalized and commonly-understood meaning.
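For anyone who wants to check criteria 2 and 3 with numbers rather than by eye, here is a minimal sketch. It assumes you already have, from somewhere, a mapping of each tag to the set of games carrying it; the `tagged` structure, the function name, and the game IDs are all made up for illustration and are not real IFDB data.

```python
def consolidation_stats(tagged, tag_a, tag_b):
    """Return the population sizes and the overlap for two tags.

    `tagged` is a hypothetical mapping of tag -> set of game IDs.
    """
    a = tagged.get(tag_a, set())
    b = tagged.get(tag_b, set())
    return {"size_a": len(a), "size_b": len(b), "overlap": len(a & b)}


# Made-up populations, for illustration only.
tagged = {
    "single room": {"g1", "g2", "g3", "g4", "g5"},
    "one-room": {"g2", "g6"},
    "one room": {"g5", "g7"},
}

stats = consolidation_stats(tagged, "single room", "one-room")
print(stats)
```

A large difference between the two sizes together with a non-zero overlap would match the pattern described above. Whether the meanings are actually identical still needs a human look.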

As a proactive measure, I went and tagged all of the games with either one room or one-room as single room. (It wasn’t fun, but it took under 10 minutes, and a quick glance at title, blurb, competition was enough to confirm that the intended meaning was the same.) Any future hypothetical “tag wranglers” for IFDB will now have an easier time confidently deleting the redundant tags.

Hopefully, other people will notice similar correlations and redundancies. If you do, please bring them up here, for future reference. I would not suggest that anyone take on the task of trying to tag every game with the “better” tag, unless it is true both that the numbers involved are comfortably small and that the two tags being compared have the same very conventionalized meaning. (If in doubt, raise the question here.)

Also, if anyone can think of good reasons to keep one room or one-room in addition to single room, please lay out your argument here for future reference. I can undo the new single room tags, if necessary.

1 Like

There are 10 games tagged “parser-choice hybrid” and one tagged “parser choice hybrid,” but no overlap in the populations. I’m not sure of the reason for the third item in your list:

Why would overlapping populations make these tags better candidates for consolidation?

Some suggestions for consolidation:

adventuron source available / adventuron source code available

BASIC source available / Basic Source Code

I7 source available / I7 Source Code

literary adaptation / literary adaption

long-form / long form

parser-choice hybrid / parser choice hybrid

point and click / point-and-click / point-n-click

real time / real-time

science fiction / science-fiction / sciencefiction / scifi / sci fi / sci-fi

second person / second-person

slice of life / slice-of-life

source / Source Code / source available / source code available

Twine source available / Twine source code available

very short / very brief

Delete all the quotation marks from these tags (and don’t put multiple genres in the same tag):
“adventure” “adventure” “historical” “middle ages” “historical” “historical” “europe” “political” “Europe” “1930’s”

6 Likes

Should we just make a generic tag for “source code available” as well? Some games that aren’t I7 have public source code.
EDIT: Also, should “source code” tags be exempt from the Magic 8-Ball, if that ever goes into place at IFDB? Now that I think of it, though, that’s gonna be a lot of work just to make that tag not important to the search. Ignore me.

There are a whole lot of variations on the “source code available” idea.

variations on source code theme
+------------------------------------+
| tag                                |
+------------------------------------+
| I6 source available                |
| I7 source available                |
| BASIC source available             |
| Source Code                        | ***
| source available                   | ***
| TADS2 source available             |
| AGT source available               |
| PunyInform source available        |
| C source available                 |
| TADS3 source available             |
| Twine source code available        | ---
| Hugo source available              |
| source code available              | ***
| ZIL source available               |
| T/SAL source available             |
| Alan source available              |
| JavaScript source available        |
| INSTEAD source code available      |
| Dendry source available            |
| Adventuron source available        |
| GAGS source available              |
| Pascal source available            |
| Quest source available             |
| I5 source available                |
| Java source available              |
| adventuron source code available   | ---
| Twine source available             |
| Assembly source available          |
| source                             | ***
| Vorple source available            |
| Basic Source Code                  | ---
| Squiffy source available           |
| ADRIFT source available            |
| AdvSys source available            |
| Visual SINTAC source available     |
| JACL source available              |
| FORTRAN source available           |
| TIGSource                          | ???
| game source NOT actually available | !!!
| PHP source available               |
| Gruescript source available        |
| Typescript source available        |
| Undum source available             |
| I7 Source Code                     | ---
| Inform source available            |
| C++ source available               |
| ChoiceScript source available      |
| Clojure source available           |
| Emacs Lisp source available        |
| Magx source available              |
| A-code source available            |
| Forth source available             |
| MAPPER source available            |
+------------------------------------+

They’re sorted in order of descending usage. The tags I marked with asterisks seem to be variations on the general category.

I would think that some people would find value in the language/system-specific versions, but I see some candidates for consolidation. By far the most common phrase in use for the concept seems to be “source available”, so I would recommend that as the generic tag.

1 Like

The overlap between populations is evidence that they may mean the same thing to the community at large.

Ah, I started compiling a list of these last night. Here are the ones I think are reasonably clear-cut and where multiple versions of the tag are reasonably commonly used:

Summary
  • AI, artificial intelligence
  • animation, animations
  • Apocalypse, apocalyptic, end of the world
  • autobiographical, autobiography
  • based on book, book adaptation
  • breaking up, breakup
  • child main character, child protagonist
  • choice based, choice-based
  • Choose Your Own Adventure, Choose-your-own-adventure, CYOA
  • constructed language, conlang
  • Cosy, cozy
  • covid, COVID-19
  • dreaming, dreams
  • Dungeons & Dragons, dungeons and dragons, dnd, d&d
  • dystopia, dystopian
  • escape room, escape the room, escape-the-room, room escape
  • faery, faerie
  • fae, fey (possibly merge these both into “faery/ie” also?)
  • fairy tale, fairytale
  • family friendly, family-friendly
  • fan fiction, fanfic, fanfiction
  • fan game, fangame
  • female protagonist, woman protagonist
  • folk story, folk tale, folktale
  • gender neutral protagonist, gender-neutral protagonist
  • humor, humorous, humour
  • illustrated, illustrations
  • math, mathematics, maths
  • mobile friendly, mobile-friendly
  • neo twiny jam, Neo-Twiny Jam
  • non-fiction, nonfiction
  • office, office setting
  • one move, one-move, one turn
  • one room, one room game, one-room, single room, single-room
  • procedural generation, procgen
  • real time, real-time, realtime
  • short, short game, short games, short if, short length
  • sub-q, Sub-Q Magazine, sub-q-Sub-Q Magazine, subq

I have a bunch more as well (including a lot of one-off tags that are a more common tag with a typo, missing/extra spaces, or missing/extra punctuation), along with a list of tags that have nothing to merge with but contain typos.

6 Likes

Some female protagonists are children, so I’m not sure which of these tags you’d get rid of.

I would merge “woman protagonist” into “female protagonist”, which far more people use anyway.

There’s a theoretical world in which people use “woman protagonist” along with “girl protagonist”, “man protagonist”, and “boy protagonist” to indicate age+gender groupings, but the fact that the latter three tags don’t exist suggests to me that people are just using “woman protagonist” as an alternative to “female protagonist” and there’s not a huge group of people who want to be able to search by age+gender. In practice, people mostly tag “female protagonist” or “male protagonist” (or “gender-neutral protagonist” or “nonbinary protagonist”, for that matter) and then “child protagonist” or “teenage protagonist” additionally if applicable.

5 Likes

This might have been suggested before, but would it be useful for IFDB searches to treat hyphens and spaces in tags as interchangeable? Does it already do this?
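To make the question concrete, here is a rough sketch of what “interchangeable” could mean: compare tags by a normalized key instead of by their exact text. I don't know how IFDB's search actually compares tags, so `tag_key` is a hypothetical helper, not something from the IFDB code.

```python
import re


def tag_key(tag):
    """Hypothetical comparison key: hyphens and runs of whitespace
    collapse to a single space, and case is ignored."""
    return re.sub(r"[-\s]+", " ", tag).strip().lower()


print(tag_key("point-and-click") == tag_key("point and click"))
print(tag_key("science-fiction") == tag_key("sciencefiction"))
```

This only catches the hyphen/space variants. Run-together or abbreviated forms such as “sciencefiction” or “scifi” would still need something else to link them.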

Also, I’m not sure exactly what “consolidation” entails. It sounds like people are going to remove tags and insert new “master” tags. If this is true, eventually history will repeat itself and we’ll find ourselves in the same situation again.

I suggest creating a database table of consolidated “synonym/related” tags that groups existing tags into a single search criterion. For example, “point and click / point-and-click / point-n-click” would be separate tags belonging to a master tag of “point-and-click”. People could still enter any of these tags, but all of them would be searched together. It also provides an easier way to stay on top of new tag candidates: if someone noticed “point-n’-click” as a tag, it could be added to the “point-and-click” master tag. Done.

Nobody has to go and delete tags (potentially dangerous), just add new tags and note tag “synonyms” to be accounted for in the new relational table. Plus that magic 8 ball thingy could really take advantage of the synonyms as well.

When adding tags, if a user starts adding a synonym tag, it could suggest the “master tag” name instead. This is completely optional though.
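Here is a rough sketch of the idea, using an in-memory SQLite database so it can be run anywhere. The table names (`gametags`, `tag_synonyms`), their columns, and the sample rows are all assumptions made up for illustration; the real IFDB schema may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the existing tag data.
    CREATE TABLE gametags (gameid TEXT, tag TEXT);
    -- Hypothetical new table: each synonym points at its master tag.
    CREATE TABLE tag_synonyms (tag TEXT PRIMARY KEY, master TEXT NOT NULL);
""")
conn.executemany("INSERT INTO gametags VALUES (?, ?)", [
    ("g1", "point and click"),
    ("g2", "point-and-click"),
    ("g3", "point-n-click"),
    ("g4", "parser"),
])
conn.executemany("INSERT INTO tag_synonyms VALUES (?, ?)", [
    ("point and click", "point-and-click"),
    ("point-n-click", "point-and-click"),
])


def master_of(tag):
    """Return the master tag to suggest for `tag`, or `tag` itself if none is recorded."""
    row = conn.execute(
        "SELECT master FROM tag_synonyms WHERE tag = ?", (tag,)
    ).fetchone()
    return row[0] if row else tag


def games_with(tag):
    """Find games carrying `tag` or any tag that shares its master tag."""
    rows = conn.execute("""
        SELECT DISTINCT g.gameid
        FROM gametags g
        LEFT JOIN tag_synonyms s ON s.tag = g.tag
        WHERE COALESCE(s.master, g.tag) = ?
        ORDER BY g.gameid
    """, (master_of(tag),)).fetchall()
    return [r[0] for r in rows]


print(games_with("point-n-click"))
```

The original `gametags` rows are never modified: handling a newly noticed variant is one extra row in `tag_synonyms`.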

Thoughts?

Edit:
Also, finding tags to consolidate wouldn’t involve manually scouring the games. A database query for all unique tags could be run, and someone could just look at them and sort them into categories. Consolidation done. The consolidated table could be reviewed as a spreadsheet by those who are interested. Once approved, put that sucker in the database, safe in the knowledge that all the original game data is untouched.
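As a sketch of that workflow: pull every distinct tag with its usage count, then dump the list as CSV with an empty column for a reviewer to fill in. The `gametags` table, its columns, the `proposed_master` column name, and the sample rows are hypothetical; I haven't checked them against the real IFDB database.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the real tag data.
conn.execute("CREATE TABLE gametags (gameid TEXT, tag TEXT)")
conn.executemany("INSERT INTO gametags VALUES (?, ?)", [
    ("g1", "real-time"),
    ("g2", "real-time"),
    ("g3", "real time"),
    ("g1", "parser"),
])

# Every unique tag with the number of games using it, most used first.
rows = conn.execute("""
    SELECT tag, COUNT(DISTINCT gameid) AS n
    FROM gametags
    GROUP BY tag
    ORDER BY n DESC, tag
""").fetchall()

# Write a review sheet; the last column is left blank for a human to fill in.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["tag", "games", "proposed_master"])
for tag, n in rows:
    writer.writerow([tag, n, ""])
sheet = buf.getvalue()
print(sheet)
```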

(Apologies if this is already a thing in some way or has been discussed in the past.)

1 Like

I think we have all been assuming that tags would be deleted/find-and-replaced because that is a thing that can be done currently, whereas the thing you are suggesting is a whole new functionality that someone would have to write the code for. It has indeed come up before in tagging discussions, and it seems to be something a lot of people would like, but until someone says “I have the technical know-how to set this up and I’m going to sit down and do that soon,” we’re kind of at an impasse.

2 Likes

Even that could be automated with a script that accesses the database…
→ find all occurrences of “point-n-click”, replace with “point-and-click”.
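Something along these lines, as a sketch only. It assumes a hypothetical `gametags` table with one row per game/tag pair, which may not match the real IFDB schema. The one wrinkle beyond a plain find-and-replace is a game that already carries both tags, where a blind replace would leave a duplicate.

```python
import sqlite3


def replace_tag(conn, old, new):
    """Replace `old` with `new` everywhere, without creating duplicate tags."""
    with conn:  # commit both statements together, or roll back on error
        # Where a game already has the new tag, just drop the old one.
        conn.execute("""
            DELETE FROM gametags
            WHERE tag = ?
              AND gameid IN (SELECT gameid FROM gametags WHERE tag = ?)
        """, (old, new))
        # Everywhere else, rename the old tag in place.
        conn.execute("UPDATE gametags SET tag = ? WHERE tag = ?", (new, old))


conn = sqlite3.connect(":memory:")
# Hypothetical table and made-up rows, for illustration only.
conn.execute("CREATE TABLE gametags (gameid TEXT, tag TEXT)")
conn.executemany("INSERT INTO gametags VALUES (?, ?)", [
    ("g1", "point-n-click"),
    ("g2", "point-n-click"),
    ("g2", "point-and-click"),
    ("g3", "parser"),
])

replace_tag(conn, "point-n-click", "point-and-click")
result = conn.execute(
    "SELECT gameid, tag FROM gametags ORDER BY gameid, tag"
).fetchall()
print(result)
```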

Is there no one maintaining the IFDB site with database experience?

Yeah, and I think from a user perspective having fewer tags will increase usability – taggers who don’t necessarily know the backstory here might think “oh, I should tag this game as point and click AND point-and-click so nobody misses anything”, and then as a user you wind up looking at a whole bunch of tags that all seem to say the same thing. (This also potentially impacts the recommendation ideas – my sense from seeing @otistdog’s engine in action is that it rarely hops the choice/parser divide, I’m guessing because it’s seeing a bunch of tags like “parser”, “Inform 7”, “Inform 7 source available”, etc. that collectively outweigh one-off but potentially more relevant tags.)

1 Like

I’m not aware that there’s a committed plan to do anything with the information in this thread. (With IFDB as it stands, any such tag manipulation requires admin access.)

My understanding is that there’s only one IFDB admin who has the technical knowledge, and he’s not very interested in making big changes/additions himself but is happy to incorporate other people’s suggested code if it’s good enough.

This is true, although we did establish somewhere in one of the other tag discussion threads that the admins could find-and-replace tags and it wouldn’t be very difficult as long as they didn’t have to do too many.

2 Likes

Who would I contact?

If you want to contribute code, the repo is on Github:

iftechfoundation/ifdb

If you want to talk to someone about contributing code, you can talk to @dfabulich .

2 Likes

This is a good plan, and shouldn’t be difficult to implement. As you say, it would allow for automated conversion to “standard” tags. I would suggest coordination with @Cerfeuil, who is already looking at open requests for tag-related functionality.

The main drawback to the plan is that regular human attention is required to maintain the table that you mentioned. My approach of manually re-tagging small groups of games showing outlier tags with “standard” equivalents is just the best that I could think of in the absence of database access and widespread cooperation.

If there were a commitment to regular human attention, this responsibility would naturally be part of the purview of a “tag wrangler” committee. (I would personally nominate @EJoyce and @bg to be the seed group there.) If such a committee were to be set up, I would also expect them to be charged with putting together a list of standardized definitions for standardized tags – this, too, could be integrated with site functionality with hopefully little effort, allowing things like a “glossary” page for tags and/or pop-up definitions of terms in the tag selection window.

EDIT: I do think that superseded tags should be deleted from games, after replacement with the standardized equivalent – in those limited cases where there is a high certainty that the replaced tags mean 100% the same thing.

2 Likes

Yes, I’d be happy to be part of that! (Though I don’t know if that even needed to be said given that I went and compiled a huge list of every redundant or misspelled tag before you even started this thread.)

4 Likes

More ideas for consolidation (first tag is my suggestion for the master/kept tag name):
  • 1920s / 1920 / 20’s
  • 1930s / 30’s
  • 1960s / 60s
  • 1970s / 70s
  • 1980s / 1980’s / 1984 [1] / 80s / 80s nostalgia
  • 1990s / 90s
  • 3D / 3d graphics / 3d movement
  • 8-bit / 8bit / 8bit graphic
  • A-code / A-code source available
  • abstract / Abstract Concepts
  • absurd / absurdistic
  • active NPCs / active npc / activeNPCs
  • adaptation / adaption / game adaptation
  • ADRIFT / Adfrift / Adrift game / ADRIFT game / [various versions] [2]
  • adult / adult themes
  • Adventuron source available / adventuron source code available
  • alien / alien culture / alien intelligence / alien planet / alien sidekick / alien world / aliens
  • alternate reality / alternate history / alternate universe / alternative history
  • ancient history / ancient world
  • animation / animated art / animated background / animated text / animations
  • character creation / character customization / customize
  • child protagonist / child main character / children’s perspective
  • choice-based / choice based / choice / choice consequence / choices / choices matter
  • CYOA / choose your own adventure / choose-your-own-adventure / choose your story / ChooseYourStory / multiple choice / multiple paths / multiple-choices
  • clicker / click games / clicker games / clickers / clickers games / clickventure
  • ClubFloyd transcript / Club Floyd transcript / Clubfloyt transcript / multiple clubfloyd transcripts
  • comics / comic / comic book character / comic strip
  • COVID-19 / covid
  • CYOA with location-based world model / CYOA with IF-like world model [3]
  • dating / date / dating sim
  • alcohol / drinking / [the alcoholism-related words, maybe?]
  • dystopian / dystopia
  • dungeon crawler / dungeon crawl / dungeon-crawl
  • escape / escape game / escape room
  • gender choice / gender choicee / gender freedom / gender-choice
  • [the various LGBTQ+ abbreviations]
  • multiple endings / multiple ending / multiple-endings / multipleendings
  • multiple protagonists / multiple playable characters / multiple protaganists / multiple protagonists / multiple pov
  • single room / one-room / one room / one room game / single-room
  • one-turn / one-move
  • sexual content / unsurprisingly pornographic / [all the other more specific ones can go in the blurb]
  • sound / sound effects / sound fx / comes with theme song / soundtrack / soundtrack music / music
  • delayed text / typing effect
  • violence / violent / [more specific can go in blurb]
  • etc. etc.

Competition/language tags would do well with the “master tag” system, e.g.:

  • Inform (Inform 6, 7, etc.)
  • XYZZY Awards (2019, 2020, etc.)
  • the various different-language tags (education, educación, etc.)

Also maybe make tags case-insensitive?


  1. the two games with this tag are related to the year, not the Orwell novel ↩︎

  2. maybe separate by major update? don’t know ADRIFT well enough but can’t be a huge difference between 3.8 and 3.9 ↩︎

  3. or separate these into different tags (new location-based world model tag) ↩︎

4 Likes

They are already—if you’re seeing tags where the only difference seems to be capitalization, my guess would be that the actual issue is something like extra spaces between words (which isn’t very obvious in the tag cloud).
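If someone has a raw list of tags, a quick way to surface that kind of lookalike is to group tags by a whitespace-collapsed key and print the members with `repr`, which makes stray spaces visible. This is only a sketch: `hidden_variants` is a hypothetical helper and the sample tags are made up.

```python
from collections import defaultdict


def hidden_variants(tags):
    """Group tags that differ only in case or in the amount of whitespace."""
    groups = defaultdict(list)
    for tag in tags:
        # split() with no arguments also drops leading and trailing whitespace.
        key = " ".join(tag.split()).lower()
        groups[key].append(tag)
    # Keep only the keys that more than one distinct tag maps to.
    return {key: members for key, members in groups.items() if len(members) > 1}


# Made-up examples: the second tag has two spaces between the words.
tags = ["multiple endings", "multiple  endings", "parser"]
variants = hidden_variants(tags)
for key, members in variants.items():
    print(key, [repr(m) for m in members])
```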

1 Like