Can we just ban AI content on IFComp?

Yeah, I would love to see a simple “Works submitted to IFComp must be written by a human author” rule (and “assets used must be created by humans”). I don’t think that’s too much to ask/expect! For those interested in playing AI-generated works, surely there are plenty of other places to go.

17 Likes

Ironically, it was the organizers’ refusal to outright ban AI content that led to me refusing to participate by playing, judging, or reviewing any games submitted. And I will continue to refuse going forward. My interest in this pocket of the online IF community is specifically the creative, close-knit human factor. I’m not wasting my finite time judging something a person didn’t even write, draw, or come up with themselves… like, lol. Let’s be real: If I’m that desperate to read the patterns of words a large language model strings together I’d just open ChatGPT myself and skip the middle man.

30 Likes

I’m not going to participate at all in IFComp as long as AI is allowed, other than filling out the survey to say that AI needs to be banned from it.

27 Likes

This is specifically something IFComp has chosen not to do for a long time now—it explicitly allows live-service games of various sorts (with the intent being more “multiplayer coordinated through an external server” than “calling ChatGPT’s API”). I don’t like how it prevents archiving, but I see the argument to allow it; if we want people to experiment with things like multiplayer IF, they need venues to do it in.

Much like copyright infringement, you could also just have a rule against it that’s enforced if and when it becomes a problem. Less “we’ll review every game beforehand” and more “if we find out later, and kick you out for it, this is the rule we’ll point to”.

I mean, IFComp already flat-out bans a lot of things. That’s what the rules are. Monetized games, previously released games, games infringing on copyright…

The “experimental” comp was historically Spring Thing, which now explicitly bans all AI.

19 Likes

A rule banning AI will be much, much more difficult to enforce than I think most people recognize.

As many of y’all know, I run Choice of Games, a publishing house for interactive novels, as well as its sister company, Hosted Games, a separate label where anyone can submit a game for us to publish on Apple’s App Store, the Google Play Store, and on Steam.

Two years ago, Hosted Games implemented a rule:

We can’t accept art, code, or prose that was generated via AI, due to ongoing legal uncertainty around the copyright of AI-generated content. Your submission, including all artwork, must be created by a person or group of people who have the legal rights to publish it, and we have to provide them credit in the Credits.

This rule has been hell on our staff.

How do you know whether a body of text has been generated by AI or not? By feel. Does this text “feel” like AI slop? Does it seem like this text uses the word “delves” too often? Too many em dashes? Oh, well, the new version of ChatGPT doesn’t have those stylistic indicators; it has these new stylistic indicators. Our staff is forced to keep up with the very latest in AI-detection techniques, to read every submission under a magnifying glass.

Our staff can do this because we’re getting paid to do it. Volunteer competition organizers would just have to do it out of the goodness of their hearts.

There are “AI detector” tools, but they are deeply flawed. (They are, themselves, AI.) We’ve tried using multiple such tools, and they frequently disagree with each other, and they’ve produced numerous false positives on material that we confidently know was hand-written.

Enforcing a rule against AI commits you to “aivestigate” submissions. You’re pretty much guaranteed to make mistakes.

The rule also terrifies authors, who want constant clarification about whether a given use of AI is “OK.”

  • “I used the Grammarly grammar checker to check my grammar. It fixed a bunch of grammar errors in my game, but I didn’t realize it was using AI to do that. Am I allowed to publish my game?”
  • “I used Google Translate to translate a few passages in this game. Am I allowed to publish my game?”
  • “I chatted with ChatGPT about the plot of my game. Am I allowed to publish my game?”
  • “I used Claude to help me write the code of my game, but I wrote all of the prose by hand. Am I allowed to publish my game?”

These authors want assurance, but we can’t give them assurance without reviewing their game, which we can’t possibly afford to do until the game is totally done and ready for submission, by which point it’s too late.

(And, the author asking for reassurance might be lying.)

Banning AI also means we have to argue about it. Constantly. Users want to pipe up to tell us that they think an author has snuck one past us. “I’m a very good judge of AI slop, and this text is certainly AI. I can’t believe you guys let it through.”

When we get reports like that, we have to check on them. We have to reread the text, now with an eye towards the issues the user raised. Sometimes we do miss stuff, and then we have to go back and confront the author.

Sometimes, the author admits that their game was LLM-generated. We have even pulled a game post-publication over this, forcing us to discard all of the work of packaging the game for publication on all platforms, submitting it for app-store review, and responding to app-store review.

But frequently, the author doubles down, claiming that they wrote the game themselves. Now what? A customer who thinks they’re very good at detecting AI slop thinks the author is lying. Now we have to judge whether the author is lying or not. And that either means arguing with the author or arguing with our own customers.

Sometimes authors have claimed that they just used Grammarly in a few places, but it seems pretty clear to us that they had an LLM generate the whole thing.

Then they tell other authors, publicly: “You won’t believe the nerve of those guys! They refused to publish my game even though all I did was use a grammar checker!!”

What do you do then? Do you correct their lies in public? (Are you sure they’re really lying? Will the public be able to be sure?)

AI policing requires an attitude of mistrust and suspicion. Assume they’re trying to pull the wool over your eyes, and then find the evidence. Enforcing an AI rule requires volunteer competition organizers to assume an antagonistic relationship with every author.

Sometimes authors admit to using AI, and then send us another submission, in which they claim they “rewrote the AI parts by hand.” Did they, really? They’ve admitted to lying once. Do we spend our time and money re-evaluating their game?

If not, are they effectively just banned for life? Because their text “feels” like AI slop?!?

IMO, anyone proposing an AI ban has to also propose their preferred enforcement mechanism. Here are a few options. (You can pick more than one of these.)

  1. The IFComp volunteer organizer has to read and evaluate every game before it goes live.
  2. Users can flag games that they suspect to be AI. The volunteer organizer then has to review those games, discuss the matter with the author and the public, and then make a final judgment.
    • We might decide to have those conversations privately, out of respect for the author.
    • Then again, we might have the conversation publicly, to ensure we get the community’s buy-in that a given game is or isn’t slop. (If we reject a game, the conversation will surely become public at that point.)
  3. The author has to generate a transcript and the volunteer organizer has to submit the transcript to one or more AI detectors (which ones?), and decide what to do when they flag a game. (AI detectors are frequently wrong, both ways.)
  4. We could also try to trick authors into outing themselves. Leave a little checkbox on the submission form, or a box where the author explains how exactly they used LLMs in their work. If they admit to it, reject their submission.
  5. No enforcement mechanism at all. Just add a rule to https://ifcomp.org/rules/, and force authors to check a box promising that they didn’t use AI. When someone submits a game with “obviously” AI-generated art and the game is all AI slop, even though that’s blatantly against the rules, we’d simply let judges rate those games poorly. (And when users complain on the forum of “obvious” cheating, we’d say “yeah, it’s a rule, but we never enforce it, and here’s why…”)

As I hope you can see, all of these options suck. #5 would be easiest on the organizers, but it’s not clear to me that it’s actually better than the rule about disclosing AI that we already have. (At least with the AI disclosure rule we have, authors aren’t directly incentivized to lie about their use of AI.)

So, to answer the question at the top of this thread: No, we cannot “just” ban AI content. AI bans require lots of hard work, lots of arguing about whether you did a good job or not, and create an antagonistic relationship between the organizers and the authors. It’s miserable, unrewarding work.

Speaking for myself here, if I were organizing IFComp as a volunteer, and the community voted for the maximum-enforcement options #1 and #2, I would simply resign. Life is too short to waste it on forensic aivestigations for no reward.

42 Likes

Thank you for mentioning Spring Thing explicitly bans AI. That is great to know, and I’ll turn my focus toward that competition going forward. I’m looking to start entering competitions in the next year or two so I am paying very close attention to how organizers choose to handle the issue of AI, especially after reading about ParserComp.

8 Likes

I would vote for option #5. If AI is explicitly forbidden, then I would just rate a game low if I felt that it was AI, or used AI.

6 Likes

Do any of the other IFComp rules have such thorough enforcement mechanisms, though? I agree that the current disclosure rule is better than an outright ban, since it doesn’t incentivize people to try to game the system; but also, IFComp generally assumes that most people want to follow the rules, and people trying to cheat will be a rare exception to be dealt with when it happens.

Right now, if you don’t want to interact with AI-written works (I don’t, for one), you can just filter them out.

12 Likes

I’m not really sure the concerns of a commercial publisher like Choice of Games map neatly onto the concerns of a hobbyist contest with small prizes anyway. There is, for one, no marketing budget or time spent dealing with marketplaces being sunk into a game that gets pulled. For another thing, in a competition that encourages shorter games and has limited rewards, the incentive to generate text and then lie about it is less. I think a lot of the people incorporating LLM stuff into their games think it’s cool and awesome and want us to validate that it’s cool and awesome, rather than trying to sneak it past us.

My feeling is that the “honor system” rules of Spring Thing and ECTOCOMP are not really good enough for a publisher, but are probably good enough for this kind of contest, although I could understand if the IFComp organizers wanted to wait a year or two to see what kinds of problems might arise with the other major comps.

29 Likes

The other rules for authors are overwhelmingly easier to enforce. https://ifcomp.org/rules/

  1. Entries may not infringe on other works’ copyrights.
  2. All entries must cost nothing for judges to play.
  3. All entries must be previously unreleased at the opening of judging.
  4. Authors may not encourage competition judges to violate the rules that pertain to them (as listed above).
  5. Authors may enter at most three games.

It can be a bit tricky to ensure that entries don’t infringe on any copyright anywhere, but at least in those cases, if someone writes in saying “game X plagiarizes work Y” you can pretty easily read it and tell whether that happened.

Hosted Games has had rules against copyright infringement from the beginning. It’s a hassle from time to time, when authors plagiarize, but it’s honestly quite rare, and it’s nowhere near as big a hassle as our rule against AI.

Forensic aivestigation is in a league of its own.

3 Likes

Yeah, there will definitely be bad faith folks, I suppose, but I suspect the experience wouldn’t be radically different from the current rules. Hopefully it’s not telling tales out of school, since I’ve been on the volunteer committee vetting entries for the last two years: there were several times when authors submitted cover art that ran afoul of copyright rules or didn’t disclose the source, and in all the cases I’m aware of, the author appreciated the issue being flagged before the Comp went live and fixed things.

I definitely appreciate the look at what full-press enforcement actually requires, don’t get me wrong - it’s very much worth considering, and there are some unique angles around genAI, including some of the cultural norms folks have flagged. But I do think the analogy with the other rules is reason to think that even the low-effort version of enforcement might get us pretty far - the question is how much getting from 80% to 90% or 95% compliance is worth.

12 Likes

I agree that enforcement is nontrivial, but I’d argue that the parameters of what ‘enforcement’ looks like for something like IFComp are significantly different from the work that you and your team do in moderating an open commercial platform.

First: the point of rules goes beyond their punitive enforcement; I think there’s value in explicitly communicating what the comp is for and what it’s not, even if it is relatively easy for someone to smuggle in LLM text.

I think it’s fair not to want to create some auto-da-fé-like enforcement mechanism, but I don’t think anyone is proposing that, and it’s reasonable to say “we’re going to take action only if we see compelling evidence,” with the caveat that most tools for detecting LLM text have low reliability.

What constitutes compelling evidence:

  • Author admits to using an LLM, whether directly to comp organizers or elsewhere
  • Obvious indications of LLM use left in game text (eg, text that is clearly part of a chatbot response left copy-pasted in)

(Certainly a rule against games that make use of a live LLM, whether it runs on-device or over the web, is very easily enforceable)

I think part of the point here is not to say “no one will ever submit llm-generated text and we can guarantee it” but to basically drive off, yeah, as @EJoyce says, people who think this stuff is cool and awesome and want everyone else to agree with them. The boosterism is itself part of the noxious effect, so I think it’s worth addressing even just on that level, honestly.

Yes, I think that there would be some amount of frustration and discourse around such a rule and the enforcement thereof, but there’s also evidently frustration and discourse around the nonexistence of such a rule anyway. I also realize that it represents more work and frustration for organizers. AI is taxing and (metaphorically) DDoSing all kinds of community efforts and artistic projects throughout the whole internet.

It’s unfortunate that this is an issue that needs to be addressed at all, but everyone kind of has to address things that are objectively happening out in the world, one way or another. Doing nothing is still a choice, and I think it’s a choice that leads to worse outcomes.

Ultimately this stuff is going to have the long term effect of degrading and devaluing a community space for IF.

30 Likes

Agreed. Several other IFComp rules also seem to work on the honor system. Authors aren’t allowed to encourage judges to break the rules, but can the IFComp admins really stop them from doing so in, say, a small private Discord server? Judges aren’t allowed to vote on the games they beta tested, but if someone’s name in the game credits isn’t publicly connected to their ifcomp.org account can anyone really do anything about it?

I think it can still be useful to disallow public behavior even if there are ways for people to get around it. Authors telling people to rate their game five stars is more of a problem if it happens on a larger forum than if it happens in a private Discord server. Similarly, I’m more concerned about AI bros trying to use the competition as a platform with all that that entails (as we saw in ParserComp) than I am about a few authors using ChatGPT and then lying about it.

19 Likes

Right, like, if I submitted to IFComp and I said to all my close personal friends “you’re going to rate me high and rate all these people I hate low right?”, are ifcomp mods able to meaningfully enforce against me? Does IFTF have the power to subpoena my chat logs? Obviously not, right, but this doesn’t mean it’s not worth having the rule.

I do understand though that it’s meaningfully different in that people will assert that they can tell that something is AI from looking at it, and often they’ll be right. I think with images it’s easy enough to point at stuff, say ‘AI’ and be right often enough that this is a useful enforcement mechanism; with LLM material it’s a bit more challenging especially as there’s a lot of superstition (eg, the whole thing about em dashes).

But for me this stuff is much more about working to define what the social norm is than about trying to catch people.

16 Likes

Yeah, I will say that the norm thing feels like a big deal to me. I read in your post that you saw the large number of AI covers and had a strong negative reaction to it, which makes complete sense to me - but the number of games that actually used genAI to make in-game text or graphics is relatively smaller. I do worry that that visual communicates that AI use is a norm, which could become a self-fulfilling prophecy; conversely, a rule that just makes sure that when you scroll through the game list no cover jumps out as being made by an LLM could have the opposite effect.

16 Likes

Yeah, as an author I do feel hurt by the prospect that someone would avoid checking out any games in the comp because of a small number of AI entries. It’s easy enough to avoid them… but it’s also a compelling argument for not having them there in the first place. I see reviewers trying to give them a fair shot, but I’ve yet to see anyone come away saying their inclusion was justified.

And unless I missed something, it also seems telling that no one in this thread is actually in favour of AI games. The dispute is only about whether a ban would be too difficult to implement.

16 Likes

I’d guess the people who are interested in submitting AI or using it to generate assets probably are wise enough to not participate in a thread full of people expressing very strong opinions against it!

Personally I would be in favor of banning AI games as I think they are contrary to the spirit of human creativity that IFComp values. I do appreciate the current system as well, as it lets me avoid these entries.

Accusing the IFComp team of a lack of leadership feels like an odd take to me. If only 10% of respondents to the official feedback mechanism for gathering community input even mentioned AI, and this is how they’ve always made IFComp decisions, why would they have changed the policy? (Again, I would like to see the policy change. But I think the next step is for everybody to use the feedback mechanism to ask them to change it.)

8 Likes

I feel as though Penny Nichols, Troubleshooter does something interesting, even if it doesn’t seem to work very well and I’ve got the usual reservations about even touching ChatGPT etc. in the first place.

Everyone knows about AI Dungeon, right? It’s been around for over 5 years now. If that’s the sort of thing you want to play, there are endless examples there, and I don’t think adding one more to the pile is something of much value to IFComp.

In fact, if IFComp is going to allow that sort of entry, I can foresee people plagiarizing from AI Dungeon prompts and entering the result in IFComp.

7 Likes

Is there a filter for this? I couldn’t find one, but if it were easy to filter out the AI, it probably wouldn’t be necessary to ban it. Uninterested judges could just send it off to sit at the “slop table” and get on with it.

Side note: the ship has sailed on referring to LLM chatbots as “AI”, as if that’s the only sort of AI there is, but if we’re discussing rules for competitions we should put on our lawyer hats and be more specific. Because 99% of the concerns around AI submissions have to do with a (perceived) flood of (perceived) low-effort trash. (Which is their design goal, isn’t it? Effortlessly producing what you expected, and lots of it, is the selling point.)

You don’t have to ban “AI” to address the specific issue that comps are facing. Not that I’m particularly against comps banning AI. The AI hype machine has a little bit more influence than IFComp, so it’s not as if some brilliant effort is going to languish in obscurity because the IF grognards didn’t let it in the comp.

7 Likes

Thanks for these perspectives, everyone; I thought there was an intimidating number of replies, but I actually learned a few things. I really appreciate the time @dfabulich took to write about the issues they run into at Choice of Games; and it helps immensely to clarify, as some of the later replies did, that just having a rule that sets a norm could already be useful, even if there are few mechanisms to enforce it. I buy that.

I do think one would want to be a bit subtle about what to ban. There’s at least one author in the current competition who used an LLM to iron out language mistakes, since English was not their first language. This is actually something that LLMs are pretty good at; I’m tempted to say it is their main strength, because the one thing all their training enforces is using language the way it is actually used. Banning such a use of LLMs is arguably excluding well-meaning authors. Nor does it seem very sensible to me to ban an entry because the author got an LLM to solve a coding problem, or to do some brainstorming. This does not usually hurt the experience of playing the game, nor does it need to inhibit the creativity of the author.

(Maybe it does! I’m actually very interested in the idea that LLM chatbots change the way people think in dubious directions; but that’s not really something we need to decide here.)

So a rule should perhaps ban, I think in decreasing order of obviousness:

  • Reliance on running a live AI model.
  • Games where all or most of the text is AI generated.
  • AI cover art.

I’d personally be in favour of all three. Let IF Comp be a place where it is part of the rules that something I made myself with my non-existent MSPaint talents (I actually use Gimp, but hey) is seen as more worthwhile than something spit out by a GenAI. :slight_smile:

25 Likes