Unusual Behaviour With < Symbol in Harlowe 3.3.9

DamonWakes · November 13, 2024, 5:27pm

Twine Version: 2.9.2

I’ve come across some odd behaviour involving the < symbol in Harlowe 3.3.9, and I can’t tell if it’s a bug or just a feature I don’t understand. I’ve searched through the manual for anything obvious but so far no luck.

The issue is that if < is followed by (as far as I can tell) any text character, the rest of the line fails to appear. Here’s an example:

(text-style: "shudder")[JA<K OF ALL TRADES]

^This should read "JA<K OF ALL TRADES" but the < symbol appears to cause the rest of the line to disappear.

Using < with another character immediately following it seems to be an issue.

When run, this will read:

JA
^This should read "JA
Using < with another character immediately following it seems to be an issue.

The “shudder” text style works as expected, but only the “JA” from within that hook actually appears.

Does anybody know what’s going on here?

HAL9000 · November 13, 2024, 5:32pm

(text-style: "shudder")[JA`<`K OF ALL TRADES]

Those ticks act like a verbatim markup of sorts and nothing between them is computed. My guess is that it sees it as an opening bracket for HTML.

DamonWakes · November 13, 2024, 5:51pm

That’s certainly a workaround, but what I wonder is if this should even be happening in the first place or if it’s a bug that should be reported. I find it very strange that it only affects the rest of the line (not the whole rest of the passage), that Twine still treats the text-style hook as though it’s been closed as expected, and that syntax highlighting doesn’t offer any indication that the < symbol will have an effect.

HAL9000 · November 13, 2024, 6:10pm

You can also use HTML-safe codes to render < and > characters: &lt; and &gt;. Multiple spaces can be rendered in a row with &nbsp;.

I think Harlowe might be removing it as potentially “bad HTML”. <div></div> is good HTML, but <div</div> is bad. <div><span></div></span> is also bad HTML. Anyway, “angle brackets” are sacred in HTML. Harlowe is HTML. Harlowe has to treat all angle brackets with extra care, is my point.

It’s not strange, as each hook, if, for loop, and macro creates its own self-contained HTML tags. The content within each is “sanitized” of potentially “bad HTML”, leaving everything outside the hook, if, for loop, etc. unaffected.

A problem with Harlowe is that it’s bound to the rules of HTML, but doesn’t require that authors know HTML… so you get little hiccups, like this. Anyone that knows HTML will immediately be concerned with rogue < and > symbols.

Greyelf · November 13, 2024, 6:33pm

In addition to the guesses others have made, the < character also represents:

the starting delimiter of a (post) Named Hook: [some stuff]<name|
the starting character of the operator used by the target<-label variation of a Markup based Link
one of the direction markers used by Aligner markup

So there are many opportunities for Harlowe’s Passage Content Parser to guess wrong, and it is likely that (at least) one of the (RegEx) Parsing Rules is being “greedy”.

And as suggested by @HAL9000, a Web Developer would likely replace the < character with its HTML Entity / Escape code equivalent, because that character can have meaning depending where in the HTML document it is encountered.

(text-style: "shudder")[JA&lt;K OF ALL TRADES]
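
If a lot of display text needs this treatment, the replacement could be scripted outside Twine before pasting the text into a passage. What follows is only a minimal sketch in Python, not anything Harlowe provides: the function name escape_angle_brackets is made up for illustration, and it assumes the input is plain display text with no Harlowe markup that relies on < or > (named hooks, reversed links, aligners), since those would be escaped too.

```python
def escape_angle_brackets(text: str) -> str:
    """Replace < and > with their HTML entity equivalents.

    Hypothetical helper: assumes `text` is plain display text, not
    Harlowe markup that needs literal angle brackets.
    """
    return text.replace("<", "&lt;").replace(">", "&gt;")


if __name__ == "__main__":
    # Prints the escaped form of the hook contents from the example above.
    print(escape_angle_brackets("JA<K OF ALL TRADES"))
```

The result can then be pasted inside the hook in place of the original text.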

DamonWakes · November 13, 2024, 7:01pm

This makes sense for the < within the hook, but now I’m wondering about the one on the next line that’s outside the hook. Between your suggestions and @Greyelf’s I think I gather the sort of thing that’s probably going on here, though.