Okay, fair enough.
Conventions/minimums:
Yes, it’s true that there are standard expectations in the parser IF community about minimum levels of parser behavior. I am not convinced they’re as prescriptive as you think, but the emphasis on internal consistency is definitely there.
I think scenery implementation is sometimes taken as a quality litmus test, in the sense that people feel that if the scenery in the first room is not implemented, then there’s a good chance the author hasn’t put a lot of time and testing resources into the game. Other types of games absolutely have this type of litmus test as well; even in the Twine space, many audience members react notably better to pieces that have customized CSS so that the first thing you see isn’t the same old standard template. Is this because you can’t have a good Twine game in the standard template? Not at all: howling dogs was initially released that way, and it was so well received as to more or less spark a revolution. But “has custom CSS” in that space has become a kind of shorthand for “author put some customization work into this”.
There are a few spaces for parser IF that explicitly isn’t trying to meet these standards – no one expects SpeedIF to be well-implemented. Communities around certain tools – ADRIFT and Quest, notably – have also tended to embrace alternative sets of expectations, with the result that those forums have a different and less critical flavor. So there are places one can go to escape from this, but I feel like part of the challenge is that, because of the tight mesh between fiction and mechanic, it’s really easy in parser games for “rough-hewn” to become “unplayable”, in contrast with many types of altgame in which (say) low-quality art doesn’t affect the mechanical experience much.
NPCs specifically
When it comes to NPCs, I think part of the reason for the “avoid complexity” feedback you see is that parser input plus the possibility of abstract content plus theory of mind generates a near-infinite state space that is very hard to test. X NOUN is hard enough, but with conversation (ASK BOB ABOUT TOPIC), “TOPIC” could theoretically be any concrete or abstract thing; moreover, Bob’s response may need to vary depending on what Bob knows, how Bob feels, and whether you’ve asked Bob about that topic previously in the story. In a game with an advancing plot, hidden evidence, and possibly secret NPC motivations, the possible state space gets even bigger.
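To make that concrete, here’s a back-of-envelope sketch of how those dimensions multiply. All of the numbers here are made up for illustration; the point is just that even modest per-dimension counts produce thousands of distinct responses to author or test.

```python
# Hypothetical, illustrative numbers only - not from any particular game.
topics = 40          # concrete and abstract things Bob might be asked about
knowledge = 3        # distinct states of what Bob knows at this point
moods = 3            # how Bob currently feels about the player
asked_before = 2     # first ask vs. repeat ask
plot_stages = 5      # advancing plot / hidden-evidence states

# Each combination is, in principle, a response the author must consider.
states = topics * knowledge * moods * asked_before * plot_stages
print(states)  # 3600
```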
There are several strategies for coping with this:
– rigorous implementation and testing based on a really well-understood state space and (ideally) a set of topics that are listed or heavily hinted to the player. If you want to be exhaustive and you are able to sit down and list all the relevant states your game can get into (per scene, etc.), then you can build a testing harness that sets up each of those states in turn and iterates through ASK NPC ABOUT ALLOWED TOPIC.
– crowdsourced authoring/beta-testing, which is something we did in the creation of Alabaster: involve other people really extensively (more than is usual for betatesters) in identifying content that could be augmented. This will be less rigorous than the previous approach.
– cut complexity in some respect, often by going to a partially menu-based approach (which is what TC sort of does) or by limiting some aspect of the character (e.g., they can’t hear and can only be shown physical objects)
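As a sketch of the first strategy – in Python rather than anything Inform-specific, with all the states, topics, and responses invented for illustration – a harness of this kind just enumerates every (state, topic) pair and flags the ones that still fall through to the default reply:

```python
# Hypothetical sketch of the exhaustive-testing strategy: enumerate every
# (scene state, allowed topic) pair and report any that still produces the
# default "nothing to say" response. Names and content are made up.
from itertools import product

DEFAULT = "Bob has nothing to say about that."

SCENE_STATES = ["intro", "after_murder", "accused"]
ALLOWED_TOPICS = ["knife", "alibi", "garden"]

def respond(state, topic):
    """Stand-in for the game's ASK BOB ABOUT handler."""
    responses = {
        ("intro", "knife"): "Bob eyes the knife nervously.",
        ("after_murder", "knife"): "\"Never seen it before,\" Bob lies.",
        # ... further pairs filled in by the author ...
    }
    return responses.get((state, topic), DEFAULT)

def find_gaps():
    """Return every (state, topic) pair that still gives the default reply."""
    return [(s, t) for s, t in product(SCENE_STATES, ALLOWED_TOPICS)
            if respond(s, t) == DEFAULT]

for state, topic in find_gaps():
    print(f"unimplemented: ASK BOB ABOUT {topic.upper()} in state {state!r}")
```

The same loop structure works however the states are actually reached; the hard part, as noted above, is being able to list the states at all.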
And, again, the reason why I think parser players worry about this more than players in many other genres is not that they’re all grotesquely unreasonable, but rather this question of design communication. In a space where I could type anything and the topic possibilities aren’t even constrained to nouns in the room description, how do I know what I should type in order to make the game go forward? A really rigorously implemented NPC says “talking to NPCs is important, so you should spend a lot of time on it.” A minimally implemented one can either direct my attention towards the space that is explorable (“this mechanic robot only answers questions about trucks, but you should definitely ask them about every truck in the game!”) or not try to give me a parser-style exploration experience per se at all (“here is a menu of choices you can pick from”). But a conversational NPC that does none of those things, and has a couple of keywords that will advance the game surrounded by a lot of blank “Sally has nothing to say about that” responses, is going to stump and frustrate a lot of players.
Moving-Parts games in general
I totally hear you about the desire for tools, testing mechanisms, etc. that make the kinds of thing you want to write more possible. This is huge.
I’m not sure that it’s possible to make parser-game tools that guarantee high-quality results, but I do think it would be possible to make much, much better support for the lots-of-moving-parts type of game. This is loosely related to things people asked for in the Missing Tools discussion a while back.
Threaded Conversation is one stab at handling part of this problem, but it’s got a significant learning curve of its own and would need to have some aspects built into Inform in order to become easy to use; it’s also fundamentally fighting the problem that it’s sort of a menu-based conversation system trying to live in a parser world. And at best it only deals with a bit of the issue. (I’m happy to hear about your struggles with it if you want to share; I’m not the maintainer of the extension, but I am interested in looking into how these things can be built out to be kinder to authors.)
What I’m wondering is: might we be able to provide better tools for specifying a game that runs on a semi-flexible schedule with a lot of moving NPCs? What would be the natural way of describing the schedule, the rules for where NPCs should go, their response states at different times? I think probably there are things we could do in this area (and I haven’t recently looked at all of TADS 3’s scheduling features, so maybe some of this is covered in the TADS 3 model a bit more deeply). Versu handled some of this, but it did so by having a very light conventional world model and focusing mostly on social situations.
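To sketch what I mean by “a natural way of describing the schedule”, here’s one hypothetical declarative shape for it, in Python. None of this reflects how TADS 3 or Versu actually model scheduling; it’s just one way the idea could look: each entry gives a time window, a destination, and an optional world-state condition, checked top to bottom so earlier entries override later ones.

```python
# Hypothetical declarative NPC schedule - all names and rules invented.
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class ScheduleEntry:
    start: int               # in-game hour, inclusive
    end: int                 # in-game hour, exclusive
    location: str
    condition: Optional[Callable[[dict], bool]] = None  # world-state predicate

@dataclass
class NPC:
    name: str
    schedule: list = field(default_factory=list)

    def location_at(self, hour, world):
        """First matching entry wins; earlier entries override later ones."""
        for entry in self.schedule:
            if entry.start <= hour < entry.end and (
                    entry.condition is None or entry.condition(world)):
                return entry.location
        return "quarters"    # assumed default when nothing matches

bob = NPC("Bob", schedule=[
    ScheduleEntry(9, 17, "workshop", condition=lambda w: not w.get("fire")),
    ScheduleEntry(9, 17, "courtyard"),   # fallback if the workshop is on fire
    ScheduleEntry(17, 22, "tavern"),
])

print(bob.location_at(10, {}))              # workshop
print(bob.location_at(10, {"fire": True}))  # courtyard
print(bob.location_at(20, {}))              # tavern
```

The appeal of something declarative like this is that a tool could read it too: it could simulate the whole schedule, check for NPCs with nowhere to be at some hour, and so on – which is exactly the kind of authoring support I mean.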
I think there’s potential research space here, as well as in the automated tool area.
Criticism focused on advancement
On the issue of wanting to see parser IF that advances the discipline – guilty as charged, at least in my case. I know not everyone is coming to this in the same way or for the same reason, but my reason for being involved with IF is that I’m interested in advancing the art of interactive storytelling. So I want to look at games that do that in some way, and I want to talk about how they do it and how those developments are situated relative to other pieces in the history of interactive stories, and I want to draw other people’s attention to those games, and I want to be inspired by them. So the way I engage with stuff and the way I discuss it is pretty different from the way I engage with products where I’m more of a passive consumer. (Even within the game space, there are certainly genres of game, such as tablet puzzle games, where my feedback is much more on the order of “okay, I had fun with/was pleasurably frustrated by that” or else “that was not fun”, not “this ruleset was really derivative of PuzzleBlaster 2013”.)
One of the big things about Twine for me lately is that Twine is so new that innovations are still cropping up in it pretty much every two weeks. This is so much fun!
This is (I think) not the only kind of thing we talk about other than missing nouns/verbs – I’m thinking of Sam Ashwell on the theme of the monstrous in Krypteia, or Jenni Polodna on characterization in One Night Stand, or Liz England on the presentation features of Zest, or Victor Gijsbers on the philosophical disciplines underlying Metamorphoses, just off the top of my head.