So this is tricky, like Draconis said.
In a parser game, that effect comes off like this: the game interrupts the player as they’re typing a command, or thinking about typing a command. This is very intrusive. Infocom experimented with it in Border Zone, but again, it wasn’t popular.
In Twine, the effect would be similar: the player is looking at the screen, thinking about what to click, when the text changes abruptly: now it describes two people walking in and starting a conversation.
This is really hard to get right. You don’t know how fast the player reads. You could wind up with a presentation where the text-on-screen changes faster than they can keep up. Or they could go to the bathroom and come back to find that everything has changed. You might say “oh, that’s realistic”, but in fact it’s probably annoying.
The convention for single-player IF is that the clock advances when the player does something. So for parser games, an “event” is something that happens at the end of the current turn.
> EXAMINE BOOK
It’s a fine leather volume. You puzzle over the text, but you can’t make out what it says.
Bob walks into the room. “Oh hey, I was just looking for you…”
This is the familiar convention. Player command, command response, and then background events for the turn. Repeat every turn. (In I7, this is literally called an “every turn” rule.)
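If it helps to see that ordering spelled out, here's a rough sketch in Python (not Inform code, and not how any real parser system is implemented). The names `World`, `do_command`, `every_turn` and `bob_arrives` are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class World:
    turn: int = 0
    # Background rules, each run once after every player command.
    every_turn: List[Callable[["World"], List[str]]] = field(default_factory=list)

    def do_command(self, response: str) -> List[str]:
        """Return this turn's output: the command response, then any events."""
        output = [response]
        self.turn += 1  # the clock advances only because the player acted
        for rule in self.every_turn:
            output.extend(rule(self))
        return output


def bob_arrives(world: World) -> List[str]:
    # Hypothetical background event: Bob shows up at the end of turn 1.
    if world.turn == 1:
        return ["Bob walks into the room."]
    return []


world = World(every_turn=[bob_arrives])
first = world.do_command("It's a fine leather volume.")
second = world.do_command("You wait.")
```

Here `first` holds the book description followed by Bob's entrance, and `second` holds only the response to waiting. Nothing ever happens between commands, so the player can't be interrupted mid-thought.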
You could do something similar in Twine, but you’d have to think about setting it up, because there’s no convention of “turns”. You could invent one; certain links (representing actions) would cause a turn to pass, and append the appropriate event text.
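As a sketch of what that invented convention might look like, here it is modeled in plain Python rather than in any actual Twine story format. `Link`, `Story`, `passes_turn` and the event table are all hypothetical names; in a real Twine game you'd build the equivalent out of story variables and whatever macros your format provides.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Link:
    label: str
    text: str          # what the player sees after clicking
    passes_turn: bool  # True for actions, False for "free" links


class Story:
    def __init__(self, events: Dict[int, str]) -> None:
        self.turn = 0
        self.events = events  # turn number -> event text to append

    def click(self, link: Link) -> List[str]:
        shown = [link.text]
        if link.passes_turn:
            self.turn += 1
            event = self.events.get(self.turn)
            if event is not None:
                shown.append(event)  # event text follows the action's text
        return shown


story = Story({1: "Bob walks into the room."})
look = Link("look around", "A quiet study.", passes_turn=False)
read = Link("read the book", "You puzzle over the text.", passes_turn=True)

after_look = story.click(look)  # no turn passes, so no event
after_read = story.click(read)  # turn 1 passes, Bob's event is appended
```

The design choice is just deciding which links count as actions. Links that only reveal more description can stay free, so the player can read at their own pace without the clock moving.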