Can we just ban AI content on IFComp?

Right, the various Twine formats confuse things quite a lot, from my understanding. Not sure if things have gotten better on the Inform front; it’s been a while since I’ve seen any LLM Inform code but the last few examples that people posted here looking for help were pretty dire.

4 Likes

I can’t speak to other languages but as @chyrono linked, multiple communities have had to deal with an influx of people asking for help about very broken AI-generated Twine code. Some have banned it outright because it was a majority of all help requests and it was burning people out.

Maybe Claude can do this if you already know what you’re doing, but a lot of people clearly don’t and are using AI as a crutch instead of learning the language. IF languages are particularly ill-suited to this because of the very limited amount of data in the training set compared to mainstream languages, and for Twine specifically that data is split between several different formats that are all actively being updated. This is a very different situation than what you’re doing with Sharpee, and as far as I can tell it’s the more common situation.

17 Likes

Especially when these IF languages are under active development! We’ve had some LLM-generated pull requests for the Dialog project and they superficially fit the syntax of Dialog but are full of queries to predicates that don’t exist.

For an analogy, it’s as if an LLM knew the syntax of C (how to write a function call, for instance) but had no idea what functions existed in the standard libraries, so its output was full of calls to atan_yx instead of atan2, nprint instead of snprintf, and so on. This is significantly worse than useless: newcomers tend to understand the syntax pretty quickly, but you need significant experience to figure out what these malformed function names are supposed to be.
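To make the analogy concrete, here is a rough sketch of the same failure mode. It uses Python rather than C purely so it is easy to run, and the "generated" line is a made-up illustration, not real LLM output. The helper names `parses` and `runs` are hypothetical.

```python
import ast
import math

# Made-up example of the failure mode: valid syntax, but atan_yx does not
# exist in the math module (the real function is math.atan2).
generated = "angle = math.atan_yx(1.0, 1.0)"


def parses(source: str) -> bool:
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def runs(source: str) -> bool:
    """Return True if the source executes without hitting a missing name."""
    try:
        exec(source, {"math": math})
        return True
    except (AttributeError, NameError):
        return False


syntax_ok = parses(generated)   # the parser is perfectly happy
executes = runs(generated)      # the library lookup is where it falls over
fixed_ok = runs("angle = math.atan2(1.0, 1.0)")

print(syntax_ok, executes, fixed_ok)
```

The point of the sketch is that a syntax check alone tells you nothing here. Someone still has to know that `atan2` is the name that was meant, and that is exactly the part that needs experience.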

And with the amount of Dialog training data currently in existence, that’s not surprising! The standard library is actively being developed, and the set of predicates that exist changes with every new version. But it takes the project maintainers much longer to wade through the slop and figure out what’s wrong than it took Claude or whatever to generate it. The comparisons to a DoS attack are not at all wrong.

13 Likes

Yeah, I can see two paths. Any IF platform is a reasonable target for Claude, but you’d still need to understand how to add Twine libraries, you’d need to know how Twine works at the JavaScript level, and you’d need to be able to see when Claude is being obtuse.

The other path is as you describe. Authors with zero clue asking broad questions without knowing what they’re doing.

I’d be curious what happens when you start with all the base Twine code and its documentation, including the architecture.

Can you point to what you’re talking about here? Are we talking about the same thing? As I said previously, a lot of generative methods have been conflated under the rubric of ‘AI’ but what people are generally taking issue with and describing is LLMs and other large generative models such as Midjourney or ChatGPT’s image/video generators.

10 Likes

I relate to this angle in that I’ve only had time to review five games this year. That means the rest are things I want to play but haven’t had time, things I don’t pine to play, and AI prose games. The last category, as work to try in this competition, I am uninterested in. But what’s in my categories two and three is only known to me personally. Which I don’t view as a problem.

Are you sure @Kastel you feel you have no ideological stance on this? I ask because phrases such as ‘I don’t mind embracing it’ (a Luddite position) sound like a stance. I feel the primary stance for all of us in IFComp is that we have ideas about what should be in or out. If you (personally) haven’t decided whether you want various kinds of AI in or out, or you don’t care, then I wouldn’t think you have to do or say anything about it.

Me being me, except in years where I reviewed everything (I forget if I ever did that, but maybe I came close) it’s always been easy to skip playing things I don’t want to play. There are just so many things! The not-me things will get played by someone else for whom it’s yes-them. And of course, like going to a film festival, sometimes the big gains are when you go to some not-me thing and get into it. But the area of AI is qualitatively different from just what may be anyone’s not-me, as evinced by these threads.

I don’t know if this helps but I wanted to say I’d had similar thoughts at least in terms of ‘what am I choosing to play or review and how’ as things stand.

-Wade

6 Likes

This is what I want people to be aware of. There are forms of AI that don’t resemble all of the bad behavior we see in GenAI creation. We can’t lump everything together and say, “All AI is bad.”

1 Like

I’m just unsure of what you’re referencing by:

I see that as no different than the AI built at some game companies, which no one here has ever complained about because some of our own prominent members have been employed at said companies.

(And also your later post in this same thread)

5 Likes

I can’t help but note you still have not provided actual examples of these games, despite two people now asking you for them.

Which games are you talking about? How did the authors use AI for dialogue in them?

7 Likes

Using “AI” to mean LLMs is a marketing gimmick, but unfortunately one that we’re stuck with now. As far as I know, no successful IF has ever used LLMs to generate its text.

7 Likes

I know this is a tangent, but it really is worth stressing that renewables actually are dominant, and there is a committed information campaign to undermine this fact. A solar farm today has the lowest levelized cost of electricity of all scaled power generation and can start pushing electrons onto the grid in 18 months, whereas natural gas has generally cheap but volatile fuel prices and there’s such an incredible backlog of gas turbines that anyone wanting to build one today is going to spend the next six years tapping their foot in a queue. In the US, EIA data records that over 2019 through 2024 solar grew at an astonishing 24.9% 5-year CAGR as compared to natural gas, which grew at a 3.3% 5-year CAGR. Now obviously a major reason for the staggering pace of that was transferable tax credits subsidizing 30% plus bonuses of your capex, but that shows you how fast we can build when we want to build, and crucially Lazard has a report out showing that solar and onshore wind will still have the lowest LCOE even without subsidies. In large part, as the historical renewables LCOE chart on there shows, the subsidies worked exactly as intended to accelerate the economics, with the relative trend they produced even surviving the post-covid supply chain inflation shock. In some ways, it’s precisely because the LCOE delta has been plateauing that taking the subsidies off now may actually prove healthy for a maturing industry.
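For anyone who wants to see what those two CAGR figures imply over the full five years, here is a quick back-of-the-envelope sketch. It simply compounds the rates quoted above, which are taken from the post as-is and not independently checked. `growth_multiple` is a hypothetical helper name.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor implied by compounding `cagr` for `years` years."""
    return (1.0 + cagr) ** years


# Rates as quoted in the post (assumed, not verified against EIA data).
solar = growth_multiple(0.249, 5)  # 24.9% 5-year CAGR
gas = growth_multiple(0.033, 5)    # 3.3% 5-year CAGR

print(f"solar: {solar:.2f}x, gas: {gas:.2f}x")
```

Under those assumptions, solar output roughly triples over the period while natural gas grows by a bit under a fifth.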

That’s all to say that fundamentally the issue with renewables isn’t economic, it’s logistical and technical: you need baseload (ideally nuclear with a sprinkling of hydro and geothermal), you need flexible generation (ideally batteries), you need grid interconnects, you need tons and tons and tons of transmission lines, you need more advanced inverters that can actually contribute to grid inertia to prevent blackouts like the recent one in Spain, etc etc etc.

But those are hurdles that are only preventing us from going from a majority clean grid to a completely clean grid, not preventing renewables from “coming online” or whatever. If we sustained current trends, especially if we weren’t simultaneously panicking over the extraordinary growth in demand from AI data centers (maxscale newbuilds are projected to use more electricity in one data center than Houston), renewables would dominate. Which is why there is such an effort to kill that momentum. Energy Secretary Chris Wright wrote a recent op-ed with a line so obviously misleading that I’m almost sure it’s a joke, some kind of knowing wink to his colleagues: “At 8 p.m. on Inauguration Day, amid bitter cold across much of the Eastern seaboard, we reached peak demand for electricity in the mid-Atlantic region. At that point in time, PJM Interconnection, which supplies the Mid-Atlantic United States, got approximately 44% of its power from coal, 24% from natural gas, 25% from nuclear, 3% from oil, 3% from wind, 1% from hydro and 0% from solar.” First of all he chose the middle of the night to say 0% from solar as opposed to like 10%, second of all he gave the mix minus battery discharge, which at 8pm would be pushing its stored solar energy into peak demand, but the most obvious trick is he chose PJM, which has a renewables penetration lower than many ISOs/RTOs for so many reasons that but like yes yes I’m already ranting offtopic sorry but seriously open up the live fuel mix for CAISO, ERCOT, ISONE, markets that have for very different reasons grown their renewables and you’ll see that they carry a huge amount of the load, especially when you factor in the batteries, and that’s even with substantial curtailments of the existing renewables supply because capacity markets are required to favor natural gas.

Sorry, what was the conversation about? Oh yes, AI. Yes, we should find some rule that honors and respects the volunteers who run the comp as human beings that are trying to do something nice in a brutal world ceaselessly baying for blood.

32 Likes

Thank you for the detailed perspective but I’m asking for a first step, at least. As I already mentioned in the first post, organizers already have to play the games before the comp to determine whether each one is truly copyright-okay, freeware (as opposed to free-to-play-with paid saves or something), ad-free, unreleased-before and - since this year - legal for British minors (i.e. no non-negative depictions of suicides, self-injury, overeating, racial-religious-sexual-gay-trans-disabled content, violence, bullying, dangerous stunts or drugs).

Adding “no obvious GenAI tells” on top of that should not be a big strain on volunteers.

On another point, I don’t see coding LLM tools as much different from writing LLM tools - in addition to software copyright, they don’t understand the point of software patents, illegal code or export restrictions, which makes them actually dangerous to the author. While it’s unlikely for someone to reimplement a full Pokemon game in their engine…

davidc:

I’m absolutely certain I’ll be able to port mainframe Zork to my platform Sharpee in a very compressed timeframe.

Last time I checked, Zork (even mainframe Zork) is in a very very gray area of copyright and Zork I belongs to Microsoft.

4 Likes

To clarify, I try not to put those kinds of thoughts into my own evaluation/reviewing. Like everyone else, I have my own personal, ideological stances on matters like this. But I would prefer not to put that into my own judging. Especially for a competition that requires fairness, I think it’d be unfair of me to put my admittedly ignorant dismissal on the same level as my own evaluations of other games.

Put another way, I want to give every game I play the same level of energy and time, as equally as possible. If I play an LLM game, that critical aptitude just disappears and the reptilian part of my brain activates. I don’t think such a bad faith attitude would be great for judging anything. And it’s certainly misleading for people who read my reviews expecting a good, honest analysis.

As for me, I still haven’t decided on how much LLM use I’m okay with. I’m certainly in the skeptical camp, but it’s more a matter of me getting sick of the low-quality spam outside of IFComp. In the grand scheme of things, there is a bit more care and attention in the IFComp AI entries than the general spam out there – at least, from the reviews I’ve read. But I still have a low opinion of them anyway, which I think is enough for me to recuse myself from evaluating these games.


But you are right that this LLM stuff is far different from “this isn’t just for me”. We all have different stakes in this, so the tolerance for these works will differ from the kind of genre works that may not be palatable to everyone. I can watch a romance movie and get very little out of it, but at least I know what I don’t like about the genre. But when it comes to works that use LLMs, our focus is not on the genre or object but the technology that powers it and the feelings we have about it.

Every review I’ve read about an LLM game so far basically revolves around how it used the technology, whether it’s innovative, and the implications so far. It’s very different from, say, a review that is essentially a close reading of the code behind this Inform 7 game. We’ve moved away from talking about the game itself and toward speculating on what LLMs can and cannot be. In other words, we are recreating the same tired threads and posts on AI in our own reviews of the LLM games.

It can be a bit exhausting to read and write in that sense. Are we really examining the games we’re playing or are we speculating about the technology’s benefits and harms at this point? Are IFComp and similar venues the right place for these kinds of critical discussion?

I have no idea, and that’s probably the real reason I’m avoiding reviewing AI games. Not only am I biased against them but I’m just going to regurgitate the same talking points everyone has in these threads in my own reviews. I’d rather rehearse for something else.

These thoughts are probably unhelpful for IFComp organizers though. I just want to read and write interesting commentary, not the same stuff over and over again. Still, my impression of IFComp is not just about the new and cool games out there but the critical commentary that comes out of this competition. It’s always exciting to read a new review of anything from your favorite prolific reviewer of choice.

But I have to admit: it’s getting irritating to read people waffle about whether LLMs are good or not in a review.

17 Likes

In a previous AI thread on this forum, I asked if anyone could provide a link to a publicly-available LLM which fully documents its training set, which feels like it should be a minimal requirement for an “ethical” model. No-one responded. I did some searching of my own and couldn’t find anything. Could you point me to one of these ethical models?

[Edit: since I keep getting notifications of people liking this post almost a week later, I should point out that @DavidC has provided a link to such a model in a later post, albeit not a general-purpose one]

14 Likes

I can confirm that it’s irritating to have to produce the waffle, too.

13 Likes

I know that as somebody whose only involvement in this year’s IFComp so far has been beta-testing a few entries and who has only recently begun playing IFs this year, my opinions might not hold as much weight as those of other people in this thread but I am firmly of the opinion that AI slop should have no place on IFComp.
Sure, some people might argue ‘AI is an inevitable part of the future and we must all embrace it or perish’ or ‘AI reduces the barrier for me to put my ideas into a working IF’. OK. If you’re one of these people, then fine, go and create as much AI slop as you want but don’t expect me to hold your work to the same standard as I would if you had actually spent time planning out your story, actually putting it into words, finding an engine of your choice to create the IF, genuinely listening to the feedback gathered from your testers and, you know, bothering to implement such feedback into your IF.
Because, if you’re just gonna have AI do all of the heavy lifting while you sit back and do nothing but input prompt after prompt, even if you were using an ‘ethical’ model (if there ever is one), it’s not your work in the end. I mean, and I’ve already seen it being mentioned by someone else in this thread (though I can’t remember who), anybody can do such a thing. This is not about being creative anymore. It’s just like typing a bunch of words into some autocorrect-on-steroids and ignoring just about every fact as to where the data used for training was from. You know, people who actually poured their hearts out into bringing to life a part of themselves, to capture uniquely human experiences, not so that some genius on a keyboard can hack together a bunch of prompts to ‘create’ a piece that they’ll go on to submit to a competition under their own name.

OK. That’s probably enough of my rant but I am strongly in favour of excluding AI content from IFComp.

11 Likes

Wrong about what? What “truth” do you think I am “compromised” on so much that I can’t make my own art? What am I so wrong about, that the mere belief in it has warped my very ability to create?

15 Likes

I was drawn to an article mentioned in @alyshkalia’s weekly IF roundup and also linked above by @DamonWakes. The article is titled Slop comes for everything you love and it’s worth reading.

The essence of the article is that the author had been away from IF for some time and had returned to play the games in IFComp 2025, but was repelled by the use of generative AI in 10% of the games.

I might be reading too much into this, but it sounds like that person will never come back to IF. Is this what the IFTF wants?

The author of that article is not going to fill out any IFComp survey. I, for one, didn’t even know that there was a survey and I’m not going to go looking for it in some obscure area of the internet that I’ve never ventured into.

14 Likes

Many people in this thread have raised good points but at the risk of repeating arguments, I am replying to make my stance clear.

Why would that not be seen as a huge crossing of the line? Forgive me, but why on earth would I want a machine to generate writing that I could do myself? If it’s a matter of “freeing you up to do thinking you could be doing otherwise” - a big part of the process IS the initial thinking, the drafting, etc. Furthermore, as others have already mentioned in the thread, the base model being trained on “pretty public information” is problematic enough as it is! What possible benefit could feeding more of my own writing into it have, besides a macabre push for churning a larger /quantity/ of writing out?

Of course that crosses the line! As people say: get good. Put some damn effort into it.

12 Likes

FWIW “that person” is @Sequitur 🙂

10 Likes