The Future of the XYZZY Awards

The 2011 XYZZYs are complete; congratulations to the winners, and many thanks to the presenters and voters who helped put this thing together.

Now we get to sort out the long-term issue: how do we run the XYZZYs in future?

a) encourage higher rates of voting from IF-literate voters,
b) discourage tactical, crowdsourced or random voting,
c) maintain the values of the XYZZYs:

  • as a reflection of informed critical opinion,
  • as more than just a recap of Comp results. There are games that get mediocre results in ratings-based comps but do very well in the XYZZYs, and vice versa. I think this is awesome; it gives a more well-rounded picture of the State of IF. It’s also important that the XYZZYs are able (and likely) to reward games released out-of-comp.
  • as a flexible, living institution that can evolve with the IF community.

d) keep the system straightforward to implement and easy to understand,
e) as far as possible, be welcoming to newcomers.

The following list is intended to cover every crazy and not-so-crazy idea that someone or other has suggested in my hearing in the past weeks; inclusion on this list doesn’t mean that I think the ideas are sound. Some are directly intended to deal with the outsider-voting problem; some aim more generally to boost in-community voting; and some are unrelated to that whole issue and are just intended to improve the awards. Your thoughts are welcomed.

1) Get rid of open voting and replace it with a restricted pool of judges.
a) A named panel of judges, rechosen each year. (By whom, and how?)
b) A narrow qualification. Might include past nominations or victories in the XYZZYs, placing in the top N of the IF Comp, etc.
c) A sponsorship system, with judges added to the pool by invitation from other judges (possibly, as with the Oscars, a XYZZY nomination gets you an automatic invitation.) Allows the judge pool to grow in an organic fashion, and to whatever extent the community feels makes sense. Might require a limit on individual invites (otherwise one person could flood the judge pool). The judge pool is inevitably christened the Ancient Qabalistic Order of Old Ones Solemnly Assembled for the Destruction of All that is Good and Pure in IF.
d) A broad qualification. Might include N posts on intfiction, N Frequent Fiction points on IFDB, an ifMUD account at least N years old, authorship of any game listed on IFDB, having a tester credit on any game listed on IFDB, or having been declared evil by Poster. This would be preferable to 1b), but much more of an administrative hassle.
Generally speaking, I dislike all the options here. When I first came to the IF world, the ability to vote in IF Comp and the XYZZYs was a really big part of feeling invested in the community. If we had to pick one, though, I think c) is probably the least troublesome.

2) Weight votes to benefit more desirable voters.
a) Ask voters which games they’ve played, and give more active game-players more weight in vote tallies.
Problems: creates disincentives for the honour system (or, if you prefer, ‘rewards liars’). Unclear (have you ‘played’ a game if you quit after three moves? If you just read a transcript?) Even if everybody’s equally honest, I think that someone who’s played a dozen speedIFs probably isn’t as qualified a judge as someone who’s played the three biggest games of the year.
b) Weight voters more highly when they vote for a more diverse set of games (but that would encourage random voting in categories you don’t care about, which we really want to avoid.)
c) Weight voters according to various community qualifications as in 1). (Most of the same problems, only more fiddly. And just because you have community cred doesn’t mean you’ve been paying attention this particular year.)
d) Weight voters according to their participation in the events of the previous year. This could be more easily automated, and would be more difficult to game. Still, I’m uneasy about this because a big part of the value of the XYZZYs is to be distinct from the comps; non-comp games are at enough of a disadvantage already.
e) As with d), but with points accumulating indefinitely. (If you voted in Comp '12 and Comp '13 and Comp '14, you get extra weighting for all of them.) Not loving this one.
f) Weight voters according to how much money they give us. (In the first year we collect $0.02 and a counterfeit zorkmid.)

In general, I dislike this kind of solution because whichever way you slice it, it boils down to someone superimposing an artificial hierarchy on the pleasantly organic field of community membership -- inevitably we'd screw some people and over-represent others. It'd suck if we did it openly (so that everyone knew what their weighting was) and it'd suck if we did it secretly.
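For anyone who wants to see what the weighting options in 2) would mean mechanically, here is a minimal Python sketch of a weighted first-past-the-post tally. The function name, the ballot layout and the example weights are all hypothetical; this is an illustration under those assumptions, not a description of how XYZZY votes are actually counted.

```python
from collections import defaultdict

def weighted_tally(ballots, weights, default_weight=1.0):
    """Sum votes per game, scaling each ballot by its voter's weight.

    ballots: list of (voter, game) pairs for a single category.
    weights: dict mapping voter -> weight (hypothetical; it could come
             from any of the schemes 2a-2f). Voters not listed get
             default_weight.
    """
    totals = defaultdict(float)
    for voter, game in ballots:
        totals[game] += weights.get(voter, default_weight)
    return dict(totals)

ballots = [("ann", "Game A"), ("bob", "Game B"), ("cy", "Game B")]
weights = {"ann": 3.0}  # assumed: some scheme gives ann triple weight

unweighted = weighted_tally(ballots, {})
weighted = weighted_tally(ballots, weights)
```

With equal weights Game B leads two votes to one; with ann weighted at 3.0, Game A leads 3.0 to 2.0. The same ballots produce a different winner depending only on who was assigned which weight, which is the hierarchy concern in code form.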

3) Allow voters to leave comments with their votes; in effect, ‘prove that you’re human and actually like IF.’ (Comments could then be given out anonymously to authors.) This could help separate out stuffed ballots (or at least force ballot-stuffers to be more clever), while giving authors the sweet feedback they all lust after. (Possibly weight votes with comments more heavily? More voter burden.)

4) Do the nomination and final-voting rounds differently. (For instance, have the nomination round be restricted or weighted voting and the final round open-voting.)
This could allow one system to offset the excesses of the other, though it would be even more fiddly. Don’t put too much weight on this, though; if people can’t vote in both rounds they may not want to vote in either.

5) Make the expectations of XYZZY voting more explicit, put them somewhere more obvious, and set clearer standards for the circumstances under which votes will be discounted. We’re definitely doing this, in one form or another; it’s the exact manner of doing so that’s the issue.
a) Forbid authors from campaigning (and maybe testers and other interested parties, too.)
b) Forbid anyone from campaigning.
c) Allow discussion but not campaigning, so ‘Here are the games I think should win’ is uncool, but ‘Here’s some discussion of the Best Setting games and why you might be interested in them’ is okay. (But distinguishing between the two would be wildly impractical.)
d) Ban campaigning on non-IF-related social networks. Sensible, but difficult to distinguish: my Planet-IF presence is just a selection of stuff from my personal Livejournal. You might use your Twitter account mostly for talking about game design, but that doesn’t mean your non-IF-playing relatives don’t all follow it.
e) Allow people to self-police on campaigning/discussion, but set a clear standard for when the organisers can throw out votes. (Probably helpful only against naive or unintentional ballot-stuffing.)
f) As with e), but allow for the throwing-out of games, not just votes. (Probably unfair to authors, who may not be able to predict whether a casual mention will unleash the floodgates.)
g) Ask second-round voters not to vote in categories where they haven’t played N games, for some value of N that is probably 2.

6) Use some kind of preferential voting system for the second round, rather than first-past-the-post.
This would protect the Awards against relatively small-scale vote-flooding, such as happened this year. Problems: more complicated, and mathematics geeks will bemoan whichever system we choose (probably at length and with formulae). Also, this is likely to lead to even more skew towards heavily-played games, particularly comp games. The results would begin to look more like a ratings system.
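As a rough illustration of the vote-flooding point, here is a small Python sketch of instant-runoff counting, which is only one of many preferential systems and is picked here purely as an example. The function name, the alphabetical tie-break and the sample ballots are assumptions made for the sketch.

```python
from collections import Counter

def instant_runoff(ballots):
    """Return the instant-runoff winner, or None if there are no votes.

    ballots: list of lists, each ranking games from most to least preferred.
    Ties for last place are broken alphabetically, which is an arbitrary
    assumption; a real system would need a stated tie rule.
    """
    candidates = {game for ballot in ballots for game in ballot}
    while candidates:
        # Count each ballot for its highest-ranked surviving game.
        counts = Counter({c: 0 for c in candidates})
        for ballot in ballots:
            for game in ballot:
                if game in candidates:
                    counts[game] += 1
                    break
        active = sum(counts.values())
        leader, top = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
        # A strict majority of non-exhausted ballots wins outright.
        if top * 2 > active or len(candidates) == 1:
            return leader
        # Otherwise eliminate the game with the fewest votes and recount.
        loser = min(counts.items(), key=lambda kv: (kv[1], kv[0]))[0]
        candidates.discard(loser)
    return None

# Four "flood" ballots that rank only one game, six ordinary ballots.
ballots = (
    [["Flooded Game"]] * 4
    + [["Game A", "Game B"]] * 3
    + [["Game B", "Game A"]] * 3
)
plurality_winner = Counter(b[0] for b in ballots).most_common(1)[0][0]
runoff_winner = instant_runoff(ballots)
```

Under first-past-the-post the flooded game wins with four of ten votes. Under instant-runoff, Game A is eliminated, its ballots transfer to Game B, and Game B wins six to four. A larger flood would still carry the day, so this only helps at small scale.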

7) Allow voters to nominate as many things as they want in the first round.
a) Go to three rounds: in the first, you can nominate as many games as you want, and 1 vote is the minimum requirement to reach the next round. This could whittle the list down to a somewhat less scary size, and give voters more opportunity to play more of the nominated games. But it’d be extra-fiddly; do people really want to vote three times?
b) Replacing the first round, producing a nominee list of the same size (i.e., four or so). This becomes more like instant-runoff voting; it would strongly favour well-known, broadly acceptable games (i.e. Comp winners). This, I think, would do little to safeguard against crowdsourced votes, and I’m unsure about how much help it would be with voter apathy.

8) Get rid of the first-round list and make all votes text-entry.
Problem: would probably lead to more votes wasted on ineligible games, could lead to games being overlooked, etc.

9) Get rid of list compilation; to get on the list, authors have to propose their own games (and puzzles, NPCs, technological developments, whatever) for consideration. (They could do this throughout the year, so that if you wanted to you could release your game and submit it for XYZZY consideration at the same time.)
Benefits: makes for a rather shorter, less intimidating list. Deals with the ambiguities of the text-entry categories.
Problems: rewards authors who pay close attention to this kind of thing. Runs afoul of the Culture of Authorial Modesty. (SpeedIFs have won XYZZYs before, and that’s great; but it looks kind of arrogant to suggest that your speedIF might deserve a XYZZY.) If anything, it could make community-invasion tactics easier (for your consideration: Minecraft.) If everybody always submits every game they publish, we’ve gained nothing. If enough of the year’s significant authors (or tech developers, etc.) forget to submit their games, the XYZZYs aren’t really a reflection of the year’s best games any more. Finally, it forces authors to guess about what the best thing in their game is (since it doesn’t tactically make much sense to nominate yourself for three different NPCs or whatever).

10) Compile a first-round list for Technological Development and Supplemental Materials.
Problem: we don’t quite know how to do this. Just as it’s not clear what counts as a single, discrete puzzle, these are rather fuzzy categories. Does every bullet-point of a new I7 release make it onto the list? Every piece of cover art? At the moment we can leave this kind of question in the voter’s court, but that leads to relatively low voting in those categories; and some things may be overlooked because people don’t associate them with a particular year, etc.

11) Ban CYOA from consideration for most of the awards.
a) …and create a category for Best CYOA.
b) …and duplicate all the categories, e.g. as Best NPC in a CYOA, Best Puzzle in a Parser Game.
Both of these would mean that we’d have to compile an entirely separate CYOA list, which I don’t know would even be possible. We’d have to distinguish between IF and CYOA, between CYOA and not-CYOA-webgames (both of which are difficult now, and not getting any easier), and then we’d have to dredge up a year’s worth of CYOA from a thousand little corners of the Internet. Even if we could do this, we’d be taking ownership over CYOA to a degree that I’m not sure is merited.
c) …and just declare the XYZZYs parser-only.

The issue of whether CYOA can legitimately be judged on the same scale as IF is fairly contentious, but the average position seems to be that it's okay to do so, even if the standards are a bit different. Of course, this wouldn't address the risk of vote-flooding for parser games, which there's plenty of potential for.

12) Extend final-round voting, find other ways to motivate people to play more of the nominees.
The rationale here is that many potential voters, particularly the more conscientious people who we really want to vote, end up not voting because they haven’t played enough of the nominees.

The second round usually lasts about three weeks. As a thought-experiment, let's imagine that you want to play all the nominees, either to refresh your memory or because you haven't played them before. For the 2011 Awards, that would have meant playing twenty-two games, even if you count the four Hat Mystery games as a single unit. Compare this to the task of playing all the Comp games. Usually that's 30-something games. Comp voters have twice as long -- a month and a half -- to play them all. And for a number of obvious reasons, the average XYZZY nominee is going to take longer to play than the average Comp entry. Of course, XYZZY voters will usually have played a good number of the games already; but it's usually still a tall order to finish up the list.
Email a reminder to voters a few days before voting deadlines.
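The comparison above can be put as a quick back-of-the-envelope calculation. In this Python sketch, 35 is an assumed stand-in for "30-something" Comp games and six weeks approximates "a month and a half"; the other figures come from the paragraph itself.

```python
xyzzy_games, xyzzy_days = 22, 3 * 7  # 2011 nominees, three-week second round
comp_games, comp_days = 35, 6 * 7    # assumed: 35 games over about six weeks

xyzzy_rate = xyzzy_games / xyzzy_days  # games per day to finish every nominee
comp_rate = comp_games / comp_days     # games per day to finish every Comp entry

print(f"XYZZY: {xyzzy_rate:.2f} games/day, Comp: {comp_rate:.2f} games/day")
```

On those assumptions a completist XYZZY voter needs a little over one game a day against roughly 0.8 for the Comp, and that is before accounting for nominees taking longer to play than the average Comp entry.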

13) Do something to stop motivating authors to tactically avoid voting.
Right now, authors are allowed to vote on any game but their own. This means that, if you think you have a decent chance of being nominated in a category, your best bet is to abstain from voting. This creates a perverse incentive: competent, active authors are precisely the kind of conscientious, informed voters we’d like to see voting, but we’re getting them to abstain.
a) Let authors vote for their own damn game. Every other democracy allows it.
b) Don’t let authors vote for their own game, but instate a mercy rule: in the event that your vote for another game would hurt your own game’s chances, that vote is void. (Fiddly to manage, but not impossible.)
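To show that the mercy rule in b) is fiddly but manageable, here is one possible Python sketch. It reads "hurt your own game's chances" as "more games end up strictly ahead of yours once the vote is counted", which is an assumed interpretation; the function name and data layout are hypothetical as well.

```python
from collections import Counter

def apply_mercy_rule(ballots, authors):
    """Drop votes that would lower the standing of the voter's own game.

    ballots: list of (voter, game) pairs for one category.
    authors: dict mapping voter -> their own nominated game (hypothetical
             bookkeeping; voters with no nominated game are absent).
    A game's standing is the number of games with strictly more votes.
    Each vote is checked against the original totals on its own, so
    interactions between several voided votes are ignored here.
    """
    totals = Counter(game for _, game in ballots)

    def standing(game, counts):
        own_votes = counts.get(game, 0)
        return sum(1 for other, n in counts.items()
                   if other != game and n > own_votes)

    kept = []
    for voter, game in ballots:
        own = authors.get(voter)
        if own is not None and own != game:
            without = totals.copy()
            without[game] -= 1
            if standing(own, totals) > standing(own, without):
                continue  # this vote hurts the voter's own game: void it
        kept.append((voter, game))
    return kept

ballots = [("v1", "Game A"), ("v2", "Game A"),
           ("v3", "Game B"), ("v4", "Game B"),
           ("author_a", "Game B")]
authors = {"author_a": "Game A"}
kept = apply_mercy_rule(ballots, authors)
```

Here the author of Game A voted for Game B, which would have pushed Game B ahead three votes to two. The rule voids that one ballot and the two games end level at two votes each. A vote for a game with no chance of overtaking the author's own would be kept.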

14) Rename some of the awards to be more transparent.
a) NPC and PC are terms derived from role-playing games and are fairly common in discussion of computer games; but outside the Geek Arts they’re effectively unknown. Changing them to something less opaque might help convey the idea that people who don’t have that background are welcome too.
b) The awards don’t really account for games with multiple or highly-variable PCs. Slightly change the sense of ‘Best NPCs’ to ‘Best Characters’, to include any and all PCs.
c) Rename, restructure and define some or all of the categories to make them more IF-specific. (But getting that right would be pretty difficult.)

15) Change nothing. The kerfuffle this year was a one-off incident; such incidents should be dealt with on a case-by-case basis, and we shouldn’t fundamentally change the awards just because we’re nervous.

16) Split the awards into Judged and Open-Voting.
a) Duplicate all the awards. This would effectively create two awards that happened to run in parallel; it would be easier to split off one of these into a different set of awards (and let someone else run it). Also, there’s a risk of dilution: the more awards we hand out, the more they start to look like third-world dictator medals.
b) Have some awards judged and some open. Some awards, like Use of Innovation, Implementation and Technological Development, are more obvious candidates for restricted judging, because they require more background knowledge. For most of them, though, the choice would feel pretty arbitrary.

17) Only include games that are listed on IFDB.
That way, if anybody feels that a game really should be included in the list – anybody to include the author – they can go and add it themselves with a few minutes of effort. This would make it easier for the XYZZYs to compile entries and to link to all the games.

18) Add categories for Best CYOA and Best Parser Game, leaving the other categories open to both. (Again, if we have Best CYOA we run into a representation problem; it’s silly to have a CYOA award just for the relatively small number of CYOA works released annually in the IF sphere, but it’d be very difficult to compile a full list of all the CYOA published anywhere in a year.)

  1. CYOA was never IF, never will be. IF is a parser-based text game with a simulated world behind it. CYOA may be more popular because it doesn’t involve typing at all, but popularity doesn’t make it IF. Just ban the damn thing and let them wait for the last IF fans to die before they can grab the title for themselves. Edit: I actually enjoyed current “CYOA” systems, such as Undum. Much more dynamic than old CYOA.

  2. is the IF community really that small that there are more authors than readers/players? If all of them vote for themselves, all is lost… I keep dreaming of one day making my own game and getting all the fame and chicks but fear I would make this fragile balance of egos break.

Only four goals are listed at the top, but it seems clear to me that a fifth goal should be added to the list:

e) Invite others to join the IF community.

Suggestion #1 (“Get rid of open voting”) doesn’t really run afoul of maga’s four stated goals, but clearly goes against this fifth goal E. If that’s the goal, we should state it up front.

Having said that, I think E and B’s goal to prevent “crowdsourced” voting are at least somewhat at odds. When someone posts to their blog, “Hey, I’m nominated for an XYZZY award!” it lets people know that XYZZY exists, and that they might be interested in the results.

So, here’s another couple of ideas to add to the list:

  16. Split the competition into “judged” winners and “open voting” winners, accepting that the open voting winners are more susceptible to crowdsourcing. You see this a lot in game blogs/magazines, where the journalists identify their Game of the Year (GOTY) nominees, allowing readers to vote for a winner, and announcing both their “Staff Pick GOTY” and the “Reader Pick GOTY.”

(This suggestion raises a host of technical questions: should the XYZZY awards split? Or should there be a new (“PLUGH?”) competition, judged by a pre-selected panel? Or should XYZZY become a closed-judging competition, while the PLUGH awards are open voting? Should the “two” competitions even run at the same time?)

  17. Forbid at least some people from blogging/tweeting/etc. about XYZZY awards.

Variations on this include:
17a) Forbid authors from campaigning for votes.
17b) Forbid anyone from campaigning for votes.
17c) Forbid authors/anyone from campaigning for votes in the second round of voting, but allow it in the first round of nominations.
17d) Forbid authors/nominated authors/anyone from blogging/tweeting/etc. about XYZZY at all

Finally, a quick remark: I don’t see how proposal #3 to merely “allow comments” would help with the crowdsourcing problem, unless it were actually a proposal to require comments, or at least weight commenting voters more heavily.

Would you ban Jon Ingold’s The Colder Light? It’s a point-and-click text adventure game; it has puzzles and a simulated world, but no parser. Most people who’ve played it agree that it “feels like IF.”

I think I was a bit too premature and harsh about it, sorry.

Here’s a more honest opinion on the subject:

Summing up, seems CYOA of today is not static at all. I think since the parser has not really been updated in ages and audiences of today are much too picky, we’ll be seeing much more “dynamic CYOA” than traditional IF in years to come…

Yes, I change opinions fast. As fast as I realize I’m talking BS.

(1) I like the idea of (1c) and (1d) and really dislike the idea of (1a) and (1b).
(2) Ugh. Just, ugh.
(3) Sure. Apart from the weighting.
(4) No strong opinion.
(5) Definitely. How can we help?
(6) Ugh.
(7) I dislike the sound of (7a) in general and dislike (7b) on the grounds that it goes against the grain of goal (c).
(8) I think this is at least worth looking into.
(9) I’d be fine with this, though I agree that some of the cited problems would be hurdles to be examined before implementing it.
(10) I don’t understand why this would be necessary or beneficial (I don’t dislike it; I just literally don’t grok).
(11) I consider CYOA an important form of IF and dislike every flavor of (11), but (11c) in particular would be tragic, since efforts like A Dark and Stormy Entry would be disqualified. While I recognize that many disagree (and try to adjust my phrasings for clarity): I’m firmly in the “CYOA simply is IF, not a subform, not a cousin, not a form that needs to be separated or judged differently” camp and wish wedgies upon those who are not. I think the parser/multiple-choice distinction is, under the hood, much less meaningful than it looks on the surface, and a good many parser games are, in terms of actual game design and in terms of the player experience, functionally multiple-choice exercises where you’re allowed to putz around with the environment or puzzles between meaningful choices (Sand-Dancer, for example) while others are functionally zero-choice exercises where putzing around with the environment is the only illusion of choice offered, making games like Rameses (a deserving and impressive work, to be sure) functionally less interactive, and less dynamic, than even the most simplistic CYOA. While the surface distinction (parser or not?) is simple and objective, I think focusing at that surface level does a disservice to what lies within each work in terms of real design.
(12) I think it’s one part of the solution, but I feel it needs a buddy on the other side of the problem (convincing people that their vote is still legit even if they’ve only played a portion of the nominees, assuming that’s still the idea).
(13) I think (13b) is just a no-brainer. I can’t imagine what objections there could be to it, and if anyone wants to help my imagination along, I dislike them in advance for doing so.
(14) I like the sound of (14b) a lot; I’m not a fan of (14a) for the reasons cited (though I prefer “protagonist” to “PC” on different, aesthetic grounds). I think NPC is an excellent term and apart from perhaps spelling it out as Non-Player Character (which is pretty plain English) I see no reason to remove it.

It only makes you a better person.

I was thinking that was covered by ‘flexible, living institution’, but yeah, we can put that in there. Though, hmm.

I don’t really think that the XYZZYs have historically worked as a recruiting/awareness-raising tool to the degree of, e.g., the Comp, but I do think they’re pretty great for building engagement in people who are already interested. So I’m not necessarily interested in shaping the XYZZYs to be a better outreach tool: it’s not really about that. But stopping it from becoming obstructive to outreach is important, I think.

All praise to you for this. It’s sadly too rare (and I don’t except myself).

Substantively, I think that a “Best CYOA” category might be a nice idea and even a helpful outreach tool. CYOAs would still be eligible for Best Game, but there’d be a special category for them as well. This might make it harder for a CYOA to win best game, just as pitchers rarely get the MVP because they can win the Cy Young (non-baseball fans: please ignore), but I think in years when a CYOA clearly is the best game it would win.

As for S. John’s substantive objection to the CYOA/parser distinction… ahhhh, that needs another damn thread.

Well, hm. This is a complicated question that we can’t get a definitive answer to.

In the visible, dedicated community, most players are prospective authors (or like to think of themselves that way). This is a huge strength, I think; in media where the demographics of the audience are significantly different from the authors, there’s a tendency to get works that are patronising or ill-considered. When authors are effectively writing for an audience of their peers, and can write the things that they want to write…

(One problem is that a fairly high proportion of this group don’t vote for the XYZZYs, for one reason or another.)

There’s also an invisible community, and we don’t really have any idea how big it is: people who play IF but never try to write a game and rarely become involved in the community. We don’t know an awful lot about them, but plenty of datapoints crop up. There are people who play a handful of games a year; people who don’t seek out IF, but know what it is and will happily play it if it comes to their attention; people who play and vote in the Comp but do nothing else; people who play a lot of games, but aren’t interested in authorship and see the community as basically an author’s community. There are even people whose sum involvement with IF is that they religiously read the ClubFloyd transcripts: so they don’t actually play any IF at all, but they’ve still got a pretty good idea about the state of the art. And we don’t really know how typical any of these people are.

The trick is that the invisible community are, for obvious reasons, much less likely to vote in the XYZZYs (or even to be aware that they’ve started).

I don’t think this is the right way to think about XYZZY vs. IFComp. If the idea is to make IFComp a good recruiting event, and make XYZZY an event for engaging the existing community, then the rules are totally mixed up.

IFComp has a strict “no campaigning” rule, and a rule requiring that all entries must be previously unreleased at the opening of voting. These rules help to prevent IFComp from turning into a popularity contest, where games with the most buzz win the competition; they’re designed to preserve the integrity of the competition at the expense of some publicity.

XYZZY, on the other hand, has no rules about campaigning or prior releases. As such, it’s a better tool for recruiting and publicity, which comes at the expense of the competition’s integrity.

It makes sense for the community to have an open-voting competition and a more restricted competition. As long as everybody’s clear that the open-voting competition can be swayed by the winds of popularity, the machinery works as designed.

The only thing I’d add is something like proposal #5, to make it clear to the voters that they should try the other games.

In that scenario, if we at Choice of Games blogged about XYZZY in 2013, we’d make it crystal clear that it’s inappropriate to vote for ChoiceScript games in the second round of voting without playing the other nominees. (It’s not clear what we’d do for the first round, but I guess we’d muddle through.)

I agree with this, but we really can’t call it “Best CYOA”. “Choose Your Own Adventure” is a trademark of ChooseCo, an active company still publishing gamebooks. IMO, their mark has lost distinctiveness and could be successfully challenged in court, but I don’t think anybody’s planning to do that any time soon.

The trouble is that it’s hard to decide what else to call it. “Best Multiple-Choice Game?” “Best Parserless Game?” “Best Point-and-Click?”

Right, on both counts.

I sort of feel like following on the baseball theme and calling it the “Cy Oates” award, but that wouldn’t really help clarify what it was for.

Well, I hope not this one (unless the idea would be to exclude CYOAs which don’t require electricity, and I hope that wouldn’t be the idea …)

S. John has said that his comment about CYOA in the context of IF was meant only to oppose the idea that we can’t meaningfully judge CYOA and non-CYOA works together. To clarify, my idea for a best CYOA award is not based on the idea that we can’t judge them together, but that we can (mostly) meaningfully distinguish CYOA and non-CYOA. So an award for “best CYOA” or whatever we want to call it makes sense if we want one, even though CYOA is also a subset of IF.

To approximately the same extent, we can (mostly) meaningfully distinguish games with procedurally-generated content from games with more deliberately-scripted content, for example. There are quite a few styles of IF which could be broken out and recognized as subsets. Is there particular value (to the enthusiast community or to the game’s authors, or to the integrity of the XYZZYs) to lumping most of them together (judging Kerkerkruip against Cryptozookeeper, for example) while separating CYOA? (and obviously, there aren’t enough Kerkerkruips in the world - yet - to form a category of their own, but the basic question applies to many things)

Well, let’s be clear: neither event has recruiting/publicity as its primary purpose. IF Comp is, in practice, the event that’s been better at it, and has made some decisions (allowing public discussion during the voting period) that have served that purpose.

It seems as if you’re conflating open-voting with open-campaigning. I don’t really think that we have a need or a desire for a major event that’s dominated by open-campaigning.

It depends on the details of the case. If there were a lot of IFs with procedurally generated content, to the extent that they constituted something of a subtradition, then (the world would be a better place and) it might make sense to give them a separate category as well as their own category. Since there ain’t, it don’t.

As for value to the community, it depends on whether the community values it, especially those people in the community who write CYOA games. My hope isn’t that this would reinforce segregation between CYOA and non-CYOA games, but that it might give a boost to a kind of game that could be meaningfully broken out and might be in danger of getting swamped by another kind of game in the voting. It’s not fundamentally different to me from having a “Best short game” or “Best Speed-IF” category, if people wanted those.

I think it would be interesting to run a no-holds-barred, popular-vote event. You mention upthread that there’s an invisible community of players whose opinions go largely unrecognized and unsolicited. I’ve said elsewhere (based on Gargoyle download statistics and such) that this community is vastly larger than the pessimists would credit; at the very least it numbers in the tens of thousands.

Adam Thornton and S. John Ross both provided examples of their own games which could serve as a gateway for unaffiliated but passionate communities to interact with the core IF scene. Whether or not those voters would stick around to become IF lifers is in some sense irrelevant; it would be significant to show that IF can produce titles that attract that kind of attention.

It seems to me that mass appeal is something that many authors manage, but without a way to capture those success stories, it hasn’t taken root as a cultural perception. I don’t love the idea of turning the XYZZYs into this feedback mechanism, but it would be great if there were “People’s Choice” awards that authors could promote in clear conscience. Whether it turns into a respected institution or simply works as an escape valve for enthusiasm is less important than sending the message that we love players.