AI rule for Spring Thing: How to make a rule that is enforceable and fair?

While it is true that I have not yet played an AI text adventure game that I particularly enjoyed, this does not preclude the possibility of such a thing happening in the future. Any kind of anti-AI rule risks being overly broad, ruling out entries that weren't the intended target of the rule. (For example, a game that is 99% human-written but just happens to include some open-ended AI dialogue sections might be disqualified under a no-AI-text rule.) And the rule wouldn't accomplish much anyway, because a game at the other extreme, one that is just a bunch of AI-generated trash, wouldn't gain any positive recognition from peers in any case. So I wouldn't support any kind of anti-AI rule.

4 Likes

A suggested modification:

Prohibit text previously generated by AI. Do not prohibit text generated during play by AI.

There’s a pretty strong community consensus that static AI text sucks, for reasons discussed elsewhere. But the use of AI in realtime (both to parse player input and generate game output) is still an area of wild experimentation, and Spring Thing would normally be the perfect venue for these types of tech experiments.

This seems like a sufficiently distinct category of AI to warrant different treatment:

  • “This is an AI-written story about Elara, knight of Larion” → instant skip.
  • “This is a social sim that uses an LLM to track and express characters’ emotions” → I am intrigued.
11 Likes

It’s more a matter of curation than “accepting the inevitable”.

While it is definitely true that "AI-generated trash" would be rated poorly, a flood of it would still make it difficult for people to find games that are worth playing. If a festival has fifteen entries and fourteen of them are your typical AI spam, chances are few people would check out the fifteenth entry that is actually properly crafted.

Rather than viewing this as a blanket ban on generative AI (Neo-Interactive jams are a good example of that), I think it's more about what it means to be part of Spring Thing. Spring Thing already disallows games with misogynistic and homophobic content in order to reduce offensive submissions. These symbolic rules aren't going to eradicate the errors of human society, and there are going to be game submissions that will offend sensibilities, but they're there to at least foster a culture that is inclusive and supportive. I like to think of the "anti-AI rule" (which to be honest is far more lenient than other places, so I wouldn't even call it that) as a way to help curate submissions and ensure some form of quality.

If Spring Thing allows “AI-generated trash” without censure, I think the festival will just get so much “trash” that no one’s gonna check it out. I believe this may be worse than getting false flags.

In the end, Spring Thing is a “benevolent dictatorship” as the website puts it:

The organizer reserves the final right to decide what games to exhibit and how to conduct festival business.

It’s up to MathBrush’s call on what’s considered a Spring Thing game and what isn’t. There are going to be choices that we will disagree on, but that’s what organizers tend to do. I’m just more concerned about the people who are already participating and judging in Spring Thing, not so much the possible outlier cases where a LLM game might be good and we’re missing out.

It’s why I don’t really care too much about a “possibly good game” with LLMs or generative AI or whatever. To me, they’re just diversions from the actual topic, which is curation. If my email is getting a lot of spam from one address, I’m not gonna be expecting an email from them that will change my life. I just want my inbox to be clean, so I’m gonna block that address and live on.

Perhaps, I might be missing out on a really great deal from a prince, but I just want emails that are pertinent to my life. And I just think it’s weird that people are concerned about potential outliers that may never happen.

15 Likes

I just feel that this effort is going down a rabbit hole that's going to get trickier every year. Who's to say that in a few years every tool used to edit documents won't have an AI assist that automatically corrects and even generates content based on your style and goals? Even simple image editors will have built-in AI that completes, fills in, and enhances, all based on a crude scribble of input. One will be able to go from 80% user-generated content to 80% AI-generated content and barely even think about it.

Then every year, someone will have to do the dubious job of deciding what percentage of AI content assistance will be tolerated and where to draw the line. That’s a slippery rabbit hole.

As a writer, I don't worry about AI-generated books taking over the market because, as MathBrush stated, the output is terribly soulless and uninspired. I welcome the challenge of competing against AI; I don't think I would feel threatened by it for a long, long time. The same holds true of interactive fiction.

Most AI generated entries will wind up being rated poorly, reviewed poorly, and that will take care of itself. And if one is worried about bandwidth, then that is really a different argument.

6 Likes

Since I have been brought up, I will add my notes.

I have already explained elsewhere that those images are a by-product of checking the text, an attempt to understand how a reader might imagine the things described.
Some of them seemed nice, so I used them to "lighten" the manual so that it was not just a long block of text.
Removing them would have no negative consequences for the game.

Creating the game is my fun: what fun would it be if the AI did it?

9 Likes

The Time Machine v2.0 (Spring Thing 2024) used AI-generated art created via Midjourney (all the text was handwritten by me).

That said, in the poll I voted for "I support every idea listed above without reservation" for the rule outlined below.

I’m just a sole developer with ancient drawing skills and no funds to hire a human artist (even if I found a volunteer artist, the required bandwidth is not something I want to incur at this time) so my next game will have AI-generated art (but will probably have a no-graphics option).

If Spring Thing ends up allowing AI-generated art and text but requires disclosure, then I still have the option of participating in the competition.

If Spring Thing ends up banning all AI-generated art and text, which I voted for, there are other competitions out there OR I have the option of also just releasing my game “in the wild” (something I’m strongly considering given all the AI Ludditism floating around this forum these days).

6 Likes

I get the feeling that this poll has nothing to do with AI per se, but is rather a smokescreen to ban AI out of ideological prejudice while pretending that it's all fair, just and above board.

In a few years, AI in various forms will be routinely used in IF creation. It will find its niche, in the same way as Photoshop and other tools are used. It isn’t and won’t be a “full solution” system, but rather a tool to help creative people do their work. When used with skill, it will be an asset.

We’re seeing that already with grammar and writing checkers/assistants. Which is why the discussion here is already tied up in knots over this one.

but many of these games are so long that it takes up the time a lot of smaller, more heartfelt games could take up.

Don’t play them. No one is forcing you (or anyone) to. These are just badly made games. You’re blaming AI rather than the root of the problem - sloppy and lazy work.

10 Likes

The point is that this is not an automatic process though - there is a strong norm in this community of folks trying to play at least a representative sample of the games in the big events and write thoughtful reviews about them. We have big spreadsheets tracking which games aren’t getting as much attention and try to flag them so they get a reasonable share of reviews, too. We celebrate the people who manage to play and review all the games. There’s even a Review-A-Thon where authors can nominate their own games to request more reviews if they feel like they were overlooked.

This focus on engagement, criticism, and review is a very distinctive part of the IF community - from what I’ve seen you don’t get anything like that in Itch jams that pull orders of magnitude more players. And I think it’s a big part of what makes folks want to stick around; there are a million places to release games on the internet, but comparatively few where you’ll get real engagement and conversation - I’ve heard many, many authors say stuff like this.

I definitely worry that if our events start being full of low-effort AI entries, those norms and practices will be at risk. Playing these games is torturous, and spending an hour or two writing a review of them is worse.

Sure, I could just play five minutes of them, write a one-sentence review, and move on, I guess - but speaking personally completionism is one of the things that motivates me to try to write reviews engaging with even weaker entries, and I think there’s real value in that. Setting out such a two-tier model of criticism will also dramatically raise the stakes for accusations of AI use, with all that implies. And the more we accept “1 star, it sucked” as all the critical response we expect, the more of that we’re going to get, the less we’ll see of the good stuff, and the fewer, I fear, authors deciding it’s a good use of their time engaging with this tiny community of weirdos who still like text games. That’s all especially the case for Spring Thing, which deliberately fosters a welcoming, less-rating-focused vibe where numerical scores and harsh criticism are less welcome.

I’m sympathetic to the runtime vs static argument - there’ve been a couple of pretty well-done chatbot games I’ve played - and sure, eventually use of these things as tools rather than content generators might get sufficiently advanced that these issues will disappear. So I doubt these will be Spring Thing’s rules on generative AI use in 2035. But I do want it to be around and thriving and able to have that debate in ten years, and damaging one of its major comparative advantages doesn’t seem like a good way of getting there.

29 Likes

Kind of late to the party but related to two points being made:

a.) I’ve actually tried to hunt down an older grammar/spellchecker to use but had no luck; every checker uses AI now. I’m not thrilled. Sometimes I’ve seen it flag a word as misspelled when it wasn’t.

b.) Just as a data point I still haven’t played the IFComp 2024 games mostly because the presence of AI art on some covers turned me off (no particular argument or judgment, just what has happened the couple times I booted up the page to try something)

4 Likes

I agree with the proposed rules. But the unstated assumption is that we’re talking about the current commercial generative AI companies. If in the future there were popular AI systems trained on ethical sources, then we should consider revising the rules. Likewise if someone trained their own LLM or diffusion model on ethical sources. In such a case it should be up to the organisers’ discretion to allow an entry, though it should still be labelled as using GenAI, and it would need to credit all the source material.

Collecting a tailored training set could be as much a labour of love as an entirely handmade game. But we need to know where the training data came from.

6 Likes

Which relies on human players and reviewers wading through the AI slop in order to participate in the competition. MathBrush and Mike Russo, two very well-known reviewers, have just said above that it makes them less interested in playing and reviewing for the comp.

If we don’t have reviewers, we don’t have a comp. Forbidding what amounts to a denial-of-service attack on the reviewers seems perfectly reasonable to me.

22 Likes

Thanks to Brian for being transparent about this and raising it early; I think advance clarity about expectations is really important.

Caveat that this is a tough topic and I’m by no means sure my thinking is correct. Nonetheless:

I am concerned about the proposed policy from a practical standpoint, even assuming full agreement with the goal.

The proposed rule has pretty severe consequences (removal from the event, and I take it public accusations / controversy), but we don’t have a reliable way to determine if someone actually used genAI. As far as I know, there’s not a reliable tool right now that will detect if text or images were created by genAI.

more about whether genAI text is detectable

I’m by no means an expert, but LLMs are trained on large corpora of text and they’re pretty good at creating text that’s plausibly what a human would create. Here’s a 2024 preprint analyzing a bunch of tools that claim to be able to identify genAI text, finding about a 60% accuracy rate on average, and lower still when “adversarial” techniques were used during generation: [2403.19148] GenAI Detection Tools, Adversarial Techniques and Implications for Inclusivity in Higher Education.

Colleges and universities are obviously hugely incentivized to figure out how to detect genAI text, but as far as I have heard they’re really struggling; see all the recent reporting claiming an epidemic of genAI use in violation of policies.

I think it’s interesting that in this thread, the main “signs” of genAI use people bring up are descriptions that are very long and rooms/items that don’t relate to a puzzle. Certainly you can get these by using genAI, but you can also get them from human authors who either aren’t familiar with parser conventions or are trying something experimental. And that framing overlooks a lot of other ways people might be using genAI. For example, if an author wrote a parser game (that conformed to convention and only called out plot-relevant items, etc.), the author could then use an LLM to, say, rewrite the descriptions in the style of a Sherlock Holmes pastiche, and I suspect the result would go unnoticed by most of the audience.

I think rules banning something that’s not really detectable have at least two bad effects:

(1) a tendency to select for the least ethical people. (I feel like there’s a term for this, similar to but distinct from “moral hazard,” but I can’t think of it.) Like, assume the existence of a group of people who want to use genAI to write IF games: under an unverifiable rule, the most ethical people in that group will self-enforce, not enter Spring Thing, and be sad, but the least ethical people are just going to yolo it despite the rule. So an unverifiable rule selects for willingness to lie.

(As some of the discussion here suggests, these rules also impose non-trivial costs on the segment of highly scrupulous people who are wracked with guilt about whether they adequately investigated how MS Word grammar check works...)

(2) corrosive effects on a community from accusations of misconduct that can’t be reliably resolved. If readers accuse a game of using genAI, but the author denies it, are the organizers going to feel comfortable arbitrating that? What methodology will they use?

Thinking about how these things usually go, it strikes me that accusations of using genAI are going to be aimed more frequently at authors in particular groups (e.g., authors without an established reputation in the community, authors writing outside their native language). Is it worth it if the policy ends up making those groups feel unwelcome or under suspicion?


All that said, I very much sympathize with the motivation here. Engaging with games thoughtfully and providing feedback does take a lot of work, and it feels bad to cast that effort into a void. Maybe people think that community norms are strong enough that an honor system would work without disputes.

I don’t know what an ideal solution is if people feel there are too many low-effort games right now. For my own part, I’ve been thinking “I should be more willing to bail out of a game I’m not enjoying,” which at least doesn’t require me to try to determine if something is genAI or not, but also isn’t going to result in very meaningful feedback.

7 Likes

I hereby vote to ban AI specifically to be unjust and unfair to AI devs. I think we should do this purely for the love of censorship. It would be really fun to violate rigorous principles of intellectual honesty in order to abridge AI creators’ free speech rights.

19 Likes

Whenever this comes up, we always drift toward a hypothetical edge case whodunnit, but my experience of this community is that we are generally an honest bunch that supports community norms. Establishing clear rules will almost certainly benefit everyone, be they organizers and reviewers, who do a lot of valued work for the community, or authors and artists trying to find a welcoming audience. Which might not be event X!

While it’s possible that someone will misrepresent their work (lie), we can’t live in fear of such people or overengineer our hobbies for their sakes.

We also need to accept that organizers, who again do a ton of free work for us, ought to be entrusted with some discretion, rather than trying to anticipate what they might do wrong or otherwise box them in.

21 Likes

I think your concerns are definitely valid. With regards to detectability, part of the reason I’m framing this rule around enjoyability rather than ethics is that it’s a bit more manageable. With enjoyability as the driving force, if someone is able to hide their AI use so well that the game is enjoyable, then it’s okay; there’s no way I’d be able to tell anyway. Some AI use is undetectable, so a rule banning all AI use on ethical grounds would be lopsided and ineffective in its enforcement.

I agree that schools have difficulty detecting AI use; I’ve seen people use those AI detectors and they’re completely inaccurate.

I do think there is room for ethical debate about AI, but I think ethics-based rules are currently unenforceable. For the Spring Thing rule specifically, I’d like to weed out only egregious examples. For instance, having 2-3 people preview games I think are likely AI-generated and only acting if all 3-4 of us are in agreement. Or having a rubric of 5-10 signs of heavy AI use and only acting if 4 or more of them are present.

That’s what I currently do for bad-faith voting in Spring Thing and IFDB. I don’t usually like talking about how we detect bad-faith voting, because I don’t want people getting around it, but some obvious signs are (and it’s really hard to believe people can be this stupid): a large chunk of votes coming from accounts that were created from the same IP address within 5 minutes of each other, all use email addresses that differ by only a couple of digits, only vote on the one game, never interact again, and give scores of all 1s and all 5s. That’s like five different criteria; sometimes I see suspicious behavior that matches only one of those (like a single account that was created, votes 5 stars on one game, and never returns) and I leave those alone.

I feel like it would be the same here. Yes, some bad authors have a ton of red herring items they don’t implement. Yes, some are repetitive. Yes, some are verbose in unhelpful ways, mentioning objects that don’t exist. Yes, some have really bad logic (like writing ‘I am filled with curiosity and wonder about what lies on the other side of the door’ when you just walked through from the other side). Some bad authors use really stereotypical ideas for puzzles that are optional or not connected to gameplay. Some bad authors write the exact same text 15 different times with slight variation (e.g., “It exudes a sense of wonder and joy, as if craftsmen of old have imbued it with love.”). So if someone does one or two of those things, it would likely not get banned. But if something shows every sign at once, I think that’s more telling.
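The moderation approach described here, acting only when several independent signs co-occur and never on a single one, amounts to a simple threshold rule. A minimal sketch, with hypothetical sign names and a hypothetical threshold of 4 (not Spring Thing's actual criteria):

```python
# Illustrative sketch of an "act only on multiple co-occurring signs" rubric.
# Sign names and the threshold are hypothetical examples, not actual policy.

SIGNS = [
    "unimplemented_red_herrings",   # many described items that aren't coded
    "repetitive_phrasing",          # near-identical text repeated with slight variation
    "verbose_nonexistent_objects",  # long descriptions mentioning absent objects
    "broken_continuity",            # logic errors like the door example above
    "stock_disconnected_puzzles",   # stereotypical puzzles unrelated to gameplay
]

THRESHOLD = 4  # act only if at least this many signs are present


def should_flag(observed: set) -> bool:
    """Return True only when enough independent signs co-occur."""
    matches = sum(1 for sign in SIGNS if sign in observed)
    return matches >= THRESHOLD


# One or two signs alone are ignored, matching the "leave those alone" policy.
print(should_flag({"repetitive_phrasing"}))  # False
print(should_flag(set(SIGNS)))               # True
```

The point of the design is that any single signal has too many innocent explanations (an inexperienced author, an experimental style), so only the conjunction of several is treated as evidence.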

21 Likes

If we had perfectly ethically-sourced LLMs which were still producing the same tedious overwritten prose that the actual LLMs of today do, it would still be an existential risk to the engagement-and-review culture of the big IF competitions to receive large numbers of entries with large amounts of low-effort LLM text. I’ll make no secret of the fact that I’m no fan of generative AI from an ethical standpoint either, but the majority of the arguments people are bringing up against it in this thread have nothing to do with that.

You could argue that there’s no reason to have a rule banning entries that are racist, misogynist, homophobic, etc. because by the same token “no one would force you to play them”. But since most people in this community don’t want to play those games, the competition as a whole is improved by screening them out.

19 Likes

Up until a few months ago, I would not have had an opinion one way or the other. But, as I am sure most people here would agree, AI is becoming more and more prevalent in everyday life, especially work life, and to that end companies are not just wanting but expecting their employees to ‘embrace’ this new technology, so as not to be left behind in the ‘rat race’.

The game I am currently writing uses AI (Co-pilot) both to assist with coding and to help with prose. I am not expecting it to ‘create the plot’ or to fully ‘author’ the prose, but to assist me with my own creative writing. AI has faults (a lot of them): it does not create perfect code (believe me, I am writing in ZIL and it is far from being an ‘implementer’; the code constantly fails to compile due to syntax or block errors) and the text it writes can veer between excessive and terse.

I think you have to view AI as a tool, like a spell checker, grammar checker or Google (I don’t remember those ever being banned when writing IF, and I am sure most have used these tools in some form). The only negative argument I can fully agree with is that AI should NOT replace any physical ‘person’ when used for ‘profitable gain’. (Is entering a comp in hope of winning a prize ‘profitable gain’?) Therefore, if selling a game, then prose and code written ‘only’ by AI, or graphics/art solely produced by AI, should be a no-no.

However, I am all for using AI to ‘assist’ with the creation of a work; it is good at mundane tasks, like creating the boilerplate code for 30 objects in your game world, something it can do much quicker than I ever could. People can judge for themselves whether the ‘end result’ deserves to be given credit for the content and story it portrays.

1 Like

I guess this is a great example of why I want to ban this type of AI use in Spring Thing. In this long thread where people are expressing heartfelt feelings, this is a large chunk of text that wasn’t written by you and responding to it is responding to no one because no human wrote it. There was no human contribution; you haven’t expounded on any parts of the argument. There’s no indication you checked to see if it aligned with your thoughts. This post contributed nothing to the thread. That’s exactly like the games I’d like to remove from future Spring Things.

33 Likes

Out of curiosity, are these objects important to the game or just for scenery/flair (e.g. a desk)? To what extent are you using AI for this? What is an example of a description you’re using that was AI-generated?

3 Likes

I’m sorry, because this is the first time I’m weighing in on this thread and I fear I will come off as rather hostile but if you cannot be bothered to write your own argument against the banning of AI, then why on earth should I care when any argument I have for the banning of AI has had to come from my own thoughts and opinions? Are we outsourcing emotion and reasoning and logic to the machines now?

My thoughts on this subject are that all AI, including AI assisted coding, should be banned. I would rather see a beginner’s clumsy art, grammar-error riddled sentences, code that’s redundant and half-broken and buggy – I would rather see something that is utterly human in its seeking to be than something that is perfect and utterly devoid of life.

“But what about a disparity of resources? Some people aren’t good artists, some people can’t write and still want to participate and…” learn. Try and fail and learn. Writing, art, code – all of these are skills that you must use to get better at them. You do not get better at them overnight or in some single stroke of genius. You will be no better for using AI to write your game, you will learn nothing, you will not get better, you will stagnate and do nothing but feed a machine your ideas forever.

20 Likes