(I was inspired to post this by a Twitter thread – unsure what the etiquette is around whether I should link it.)
Lately there’s been some discussion about whether IFComp is too big, or whether it’s even possible for IFComp to be too big. The IFComp 2020 announcement blog post mentioned that 41% of survey respondents thought “the more games the merrier”, and 72% thought there weren’t too many games. Personally, though, I am a little concerned about it.
One worry I have is that, the more entries there are, the harder it’ll be to fairly compare their scores.
Let’s imagine two hypothetical IFComps: one with ten entries, and one with a thousand. In IFComp Ten, it’s safe to assume that most of the games are being played by most of the judges, so if you average everything together, you’ll get a roughly consistent rating scale. In IFComp Thousand, though, no one will be able to play all the games! So instead, people will scroll through the list of games until they find some whose titles/covers/blurbs look interesting, and play and rate those. (A few people might use the random shuffle and rate whatever comes up, but why would you bother playing a game you don’t think you’ll enjoy?)
This creates a problem because different blurbs appeal to different people. Like, if one game’s blurb is “Join this teenager’s journey of self-discovery in a fifteen-minute work of interactive poetry”, and another game’s blurb is “Can you solve all two hours of parser-based math puzzles inspired by the Intel 80286 Programmer’s Reference Manual?”, these games will get rated by two almost entirely separate groups of people with very different ideas of how to score games. So the average score any given game gets will dramatically differ depending on which subgroup of judges its blurb/cover/etc manages to attract. You could imagine platform-based effects too, like maybe Windows users give higher numeric scores on average than Mac users.
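To make the worry concrete, here’s a toy simulation (every number in it is invented for illustration, not taken from any real IFComp data). Two hypothetical games have identical “true” quality, but each is rated only by the judge subgroup its blurb attracts, and one subgroup is assumed to score about a point more generously than the other:

```python
import random

random.seed(0)

def simulate_scores(n_judges, true_quality, subgroup_bias):
    """Each judge's score = true quality + their subgroup's average
    generosity + personal noise, rounded and clamped to the 1-10 scale."""
    scores = []
    for _ in range(n_judges):
        raw = true_quality + subgroup_bias + random.gauss(0, 1.5)
        scores.append(min(10, max(1, round(raw))))
    return scores

def avg(xs):
    return sum(xs) / len(xs)

# Both games have the same underlying quality (7/10), but are rated by
# disjoint, self-selected audiences with different scoring habits.
poetry_scores = simulate_scores(60, true_quality=7, subgroup_bias=+0.5)
puzzle_scores = simulate_scores(60, true_quality=7, subgroup_bias=-0.5)

print(f"interactive-poetry game average: {avg(poetry_scores):.2f}")
print(f"parser-puzzle game average:      {avg(puzzle_scores):.2f}")
```

Under these made-up assumptions the “poetry” game reliably averages about a point higher than the “puzzle” game despite being exactly as good, and no amount of extra judges fixes it, since the gap comes from who shows up rather than from sampling noise.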
It would be unfortunate, I think, for the comp to depend so heavily on “metagaming” like this. Does anyone know if this is currently the case, or if it’s likely to become the case if the comp gets even bigger? Heck, should I package up my next hypertext game as a Windows-only executable to increase its average review score?