IFComp 2023 Predictions

I can absolutely see the nature of the joke at the heart of Dick McButts wearing thin for some. Of course, if it does, I'd imagine their experience mirrors your own very closely.

I found that the repetition itself enhanced the joke: knowing it was coming was enough to raise a smile from me, with the progression a pleasant ancillary surprise.

Reading your comments really drove home for me how important the reader/player is to the experience. Thank you for sharing.

5 Likes

For what it’s worth, you got the “good route”. Maybe you’d have liked it more if you’d got the bad one :joy:

I agree that there’s a seemingly deliberate attempt to claim the banana. Whether it wins that or not, I still think that it will score higher than this sheet predicts.

6 Likes

My prediction is that McButts got the Golden Banana :wink:

Best regards from Italy,
dott. Piergiorgio.

5 Likes

Now that the voting period is over, I’ve updated the prediction spreadsheet to account for all the reviews and ratings. There were some mistakes in the spreadsheet, which should be fixed now.

5 Likes

What are the vibes about? How do they affect it?

Predictions! Okay, I haven’t played all games (slightly more than half), but I’ve played most of the ones that seem contenders for top places. The highest ones in the spreadsheet that I have not played are Prince Quisborne, Last Valentine’s Day, LUNIUM, Paintball Wizard and Bali B&B.

One thing that I noticed this year is the large amount of very polished, very fun, fundamentally light-hearted parser puzzlers. I rated all of the following with an 8:

  • To Sea in a Sieve
  • Honk!
  • Assembly
  • Eat the Eldritch
  • Little Match Girl 4
  • Dr Ludwig and the Devil

and from what I gather, Prince Quisborne would fit this list as well, as does the middle part of Hand Me Down. I expect all of these games to do quite well! They’re good, they’re easy to like, and they’re hard to dislike (also important when the average score is the determinant). In fact, I very much expect one of these games to be the winner of the competition, and my best guess is:

Dr Ludwig and the Devil

because it is very funny, very well implemented, and probably hits the sweet spot in terms of puzzle difficulty and length. If I had to choose my own favourite among these, it would be Little Match Girl 4, because I liked the hints of darkness that are its background and the brilliantly weird world design. But any of these would be a worthy winner.

What about LAKE Adventure? It was one of my favourite games, but it's bleak, really bleak, and I doubt that it will win. I'd be happy if it does, but I expect it to land in the top 5, not as the winner. My other favourite games were Gestures Towards Divinity and Citizen Makane, but I don't expect those to finish in the top 10, since it's just too easy for me to imagine people disliking them. I wouldn't mind being wrong, of course! (These are the three games that I gave a 9.)

I expect there to be very few choice games in the top 10, since the parser crop was so good this year. It would be very cursed to make any kind of prediction about my own game; ignoring that, I expect the highest ranking choice game to be Dysfluent, followed by Put Your Hand Inside the Puppet Head and My Pseudo-Dementia Exhibition. (But note that I have not played some of the more prominent choice games.) I'd love Tricks of light in the forest to do well, but I think it will fall behind these ones. (I gave some of these games an 8 too, but to be honest, I don't fully remember all my scores and now they are forever lost in the database of IFComp.)

18 Likes

I only played 30/74 games this year. Dr Ludwig and the Devil was my absolute favourite, so I’m happy to see its prediction. But I wouldn’t be surprised if the following do better than the spreadsheet is predicting: Honk!, One King to Loot them All, Bali B&B, Eat the Eldritch and Fix Your Mother’s Printer. Thanks @cchennnn for the predictions!

8 Likes

I think McButts deserves the banana for using a fun strategy (and being at least moderately clever despite the immaturity). If anyone else uses the same strategy, though, in future years, they should not win it.

I think they added considerably to discussion and enjoyment.

7 Likes

Reflections

After going through all 75 entries, here are my thoughts before the results are announced:

  • The distribution is roughly 2/3 choice-based, 1/3 parser-based. In recent years, perhaps since 2016/17, choice-based entries have overtaken the traditional parser.

  • The most popular medium in the choice category is Twine, due to its accessibility, and the most popular medium in the parser category is Inform, due to its rich libraries.

  • It's my first time evaluating, so I had a considerable bias towards Choicescript entries, as I came over from the CoG forums, and Metroidvania parsers, since I love a good dose of those and Mega Man. This explains why Milliways, Little Match Girl, Bali B&B and One Does Not Simply Fry had grades in the 80s range. Because of this, I did not participate in the voting at all, since I would have flooded the ballot box with my preferences.

  • By the time I finished all my reviews, there were, as @VictorGijsbers said, a good number of entries in the parser category that I felt were strong in humor, puzzles, and story; all three aspects made them popular. In fact, most of the entries on this list ended up in the 70s range, some even in the 80s. The results were so close, and some entries so similar in theme, that I had a hard time assigning grades (I admit). Moreover, my unfamiliarity with the parser category, ironically, only pushed these further up.

  • These include: All Hands Abandon Ship, Assembly, Citizen Makane (to some extent), Stormrider, Dr Ludwig, Eat the Eldritch, Hand Me Down, Honk, Prince Quisborne, Lake Adventure (to some extent), Little Match Girl, Magor, One King to Loot, Bat Lady Plunder Quest, To Sea in a Sieve. Note that Little Match Girl was my top rated entry.

  • For some reason, the old-school parsers like Hawkstone, Have Orb, and The Witch don't resonate very well with me, so they went into the 60s range.

  • Last year somehow had a lot more popular choice-based entries; this year, ironically, the tables have turned in favor of the parser-based.

  • Among the choice-based, The Ship, Puppet Head, Paintball Wizard, some Choicescript ones (listed above), Fix Your Mother's Printer, Xanthippe and Socrates, Whale's Keeper, Finders Commission, Osiris, and Dysfluent are on my list.

  • Next year, I hope to be more objective, now that I have more experience in evaluation. Fingers crossed that I can actually submit something, but if I do, it’s going to be a lot more complicated.

Thank you!

12 Likes

While it’s of course perfectly fine to not vote if you don’t want to, you should know that if you had, your vote would be just as valid as anyone else’s. Everyone comes to the comp with their own biases, and that’s part of the point–it’s a shared experience that reflects the entirety of the IF community, not just one branch of it. Being fair but still allowing yourself to like or dislike things because of your biases is a balancing act everyone who votes goes through; you just do your best.

14 Likes

Although it's true that choice has been popular recently, @RockmanX, it's fair to say that this year, I think, there's been a resurgence of good parser games. There are definitely some good choice games in the comp (tGoWYNM), but I think the parser side is a little stronger.

PQ. Thoroughly well designed and implemented. An almost perfect two-hour intro that players can feel good about completing. (Although I'd much prefer this to take first place, I wonder if others loved it as much as I did, so maybe 2nd place.)

Dr Ludwig and the Devil. Very nice responses to all verbs. Some cool puzzles. Funny, like PQ, if not a little more. (I’d say 1st place, but you know, something shocking might shift it up.)

Although I loved Beat Witch more than DLaD, and nearly as much as PQ, I can’t help but think others didn’t like the linearity of it, so I’d say around 17th place.

To Sea in a Sieve. Some comic moments. Nice puzzles, though I think many of us can agree the hint system wasn’t very helpful. But overall a decent 6th place.

Milliways. 26th place is my guess. Which isn’t bad, considering it’s apparently quite controversial to many!

LAKE Adventure. I haven’t played it, but I’d say from reviews about 8th place?

I started PYHItPH, but never got a review in before the deadline, so I'll say here that a decent 13th place is fair.

There’s more, but… I can’t get them all in.

6 Likes

Yeah, this is very much my approach - there's nothing wrong with people who work to make their votes as objective as possible, but personally I aim to be subjective, though hopefully in a transparent and fair way. Mostly I try to lean into this more when it provides an opportunity to think better of a game; like, I rated Kaboom a 9 because it really clicked with me in a highly specific way. If I'm honest, it might not have clicked so strongly if I'd played it a week earlier or a week later, much less if I were a different person, so I doubt a 9 would be justifiable on objective criteria. But as I played it, I experienced it as a 9, so that's what I rated it.

Anyway not to say that this is the best way to go or anything - just that if objectivity seems a constraining goal, I think there are other models available.

10 Likes

I also struggled a bit with my own rating system, as a first-time judge!
I went through a period of very generous scoring, followed by adjustments which ended up seeming too harsh, but in the end I think I managed to re-balance everything in a reasonably fair way. I had some low-ish scores but no really low ones – I didn’t see any of the entries I played as being “bad” and I felt that they all brought something interesting to the table.

In most areas of my life I’m very dedicated to objectivity. But as I tried (and mostly failed) to apply objective judging standards to IFComp, I noticed my perspective slowly shifting. Both as a judge and as an author, I feel it’s been very healthy to acknowledge the significant role that subjectivity can play in evaluating games, and try to free myself of any negative feelings about the scores I give or receive.

To be honest I feel very uncomfortable about ranking art, and assigning numerical scores feels even worse; I usually prefer to consider and discuss each piece as its own valuable thing, and compare it only to the best possible version of itself. But I knew what I signed up for when joining IFComp and I let myself play along!

Seeing Dysfluent predicted to be in the top 20 is such a nice surprise :blush: whether or not that ends up happening, I’m so glad that people enjoyed it overall! I had no idea what to expect when submitting what felt like a “weird” game to such a big event, and the lovely/helpful feedback alone has made it all worthwhile.

I don’t dare make any predictions of my own, but I will say that my highest-scored games are spread out all across the sheet and not just at the top. The anticipation is killing me and I’m very eager to see the final results!

16 Likes

I always shoot myself in the foot this way! Well, "shoot myself in the foot"… of course, if you want to maximise the impact of your votes on the final rankings, you should grade from 1 to 10. But none of the games I played was really bad, so I ended up using only the grades from 5 to 9.

(When I started judging IFComp around 2004, there would always be some terrible games: ultra-short, random deaths, misspellings everywhere, terrible homebrew parsers; or just plain troll entries. That's my standard for the lowest grades. But those kinds of entries don't exist anymore!)

9 Likes

Not knowing what to expect when I started, I took the "use all numbers" guideline seriously; doctrinally seriously, even. Even then, in two years I've only used 8 of the 10 possible numbers, and 70% of games end up in the 4-7 range. At this point, I'm afraid my approach has ossified.

Well put. Notwithstanding my objectivity sleight of hand, this seems inevitable in judging art.

6 Likes

This is the correct path, in that its questions leave you restless until the clock strikes midnight and your eyes flash wide as you start hovering several feet off the ground with your voice echoseeping from the walls, “I A M T H E O B J E C T I V E”

12 Likes

I went from 3 to 9. Partially, that’s due to the overall high quality. Also, the more I play, the harder it is to assign 1’s and 2’s, especially when I saw what the author meant to do and realized a slip in one line of code might’ve caused this-or-that bug, and it’s something I did before, and who knows but it might be a regression introduced fixing another bug in the 3 days leading up to IFComp. And so forth.

I still remember in 2014 vaguely wondering which entry would come in last place, as there was no clear candidate. I was shocked by the one that did. But then I looked at the ones above it and genuinely felt, "These don't deserve last place, either." This isn't an "everyone gets participation trophies" thing, either. Having judged everything from 2021, I know a lot of games that would have been upper-half in the early 2010s didn't make that cut.

There's no problem with an approach ossifying - it makes sense that scores will fall under a bell curve of sorts, both when each of us judges a lot of entries and when a game gets a lot of voters. (That's the Central Limit Theorem, unless I'm being clueless.) It's up to us how curvy said bell curve is.
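The bell-curve point can be sketched with a quick simulation: even when individual judges' scores are all over the place, per-game averages cluster into a narrow, roughly normal bump, which is the Central Limit Theorem at work. The judge model here (uniform 1-10 scores, 50 judges, 74 games) is entirely made up for illustration, not real IFComp data.

```python
import random
import statistics

random.seed(0)

# Simulate 74 games, each rated by ~50 judges drawing scores 1-10.
# Per-judge scores are uniform here (a simplifying assumption), but the
# per-game *averages* concentrate into a bell shape around the mean.
def game_average(num_judges=50):
    return statistics.mean(random.randint(1, 10) for _ in range(num_judges))

averages = [game_average() for _ in range(74)]

# Uniform(1..10) has mean 5.5 and stdev ~2.87; averages of 50 draws
# should sit near 5.5 with a much smaller spread (~2.87/sqrt(50) ≈ 0.41).
print(round(statistics.mean(averages), 2))
print(round(statistics.stdev(averages), 2))
```

The spread of the averages shrinks with the square root of the number of judges, which is why a comp with many voters per game tends to produce tightly packed final scores.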

4 Likes

I say ‘Dick McButts for president’!
I might be the one person this game was written for…I laughed the whole way through.
Not sure what that says about me though…please don’t hold it against me.

7 Likes

I’ve updated the prediction spreadsheet with the results!

Overall, the predictions were pretty similar to the actual results, especially at the top end. Correlations between the predicted and actual ratings were around 0.9. The most similar rankings to the actual results (in terms of rank correlation) were actually the IFDB rankings.

See the 2023_results sheet for the actual final results and some fun graphs and stats.
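For anyone curious how a rank-correlation comparison like the one above works, here is a stdlib-only sketch of Spearman's rank correlation (Pearson correlation computed on ranks). The `predicted` and `actual` score lists are invented for illustration; they are not the real spreadsheet data.

```python
# Spearman rank correlation between a predicted ordering and an actual one.

def ranks(values):
    # Rank from highest to lowest; ties receive the average of their ranks.
    order = sorted(range(len(values)), key=lambda i: -values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    # Pearson correlation applied to the rank vectors.
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical average scores for seven games, predicted vs. actual.
predicted = [8.1, 7.9, 7.4, 6.8, 6.5, 5.9, 5.2]
actual    = [8.0, 7.2, 7.6, 6.9, 6.1, 6.0, 5.0]
print(round(spearman(predicted, actual), 3))
```

A value near 1.0 means the predicted ordering matches the final standings almost exactly, even if the absolute scores differ; in the example, only two adjacent games swap places, so the correlation is very high.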

Some fun stats

Parser games received fewer ratings than choice-based games, but had higher average scores: parsers averaged 45 ratings vs 53 for choice, while the average parser score was 6.21 vs 5.78 for choice.

There was an average of 49 ratings per game, for a total of 3693 ratings. This is fewer than last year, which saw an average of 59 ratings per game and a total of 4127 ratings.

You could probably analyze these results for some more interesting trends (this doc has IFComp results back to 2018).

11 Likes

I think if people take the "use all numbers" rule literally, some dispersion in scoring is bound to happen (an aside: that can alter the calculations for the "Golden Banana" prize).

But I personally think that the most problematic rule is the "no comments that could influence judges" rule, which is ill-defined and can too easily lead to a chilling effect. That can, and will, hamper feedback and negatively influence the judging, because entrants cannot clarify things in response to feedback.

Best regards from Italy,
dott. Piergiorgio.