Punctuation Removal by Emily Short not removing apostrophes?

I have a problem where Punctuation Removal is not removing apostrophes when I need it to.
Here’s the code:

Include Punctuation Removal by Emily Short.

Lab is a room. Phil's Desk is in Lab. Understand "Phil s" as Phil's desk.

After reading a command:
	remove apostrophes;

and here’s the output:

>**trace 4**

[Parser tracing set to level 4.]

>**x phil’s desk**

[ "x" x / "phil?s" ? / "desk" desk ]

[Parsing for the verb 'x' (1 lines)]

...<SNIP>...

[ND made 0 matches]

[token resulted in failure with error type 5]

You can't see any such thing.

So it doesn’t seem to remove the apostrophe here. Any idea why not and is there a way I can work around it?

I think the problem is that this is a curly apostrophe (’). The trace shows the input being read as [ "x" x / "phil?s" ? / "desk" desk ], so the curly apostrophe has already become a question mark by the time the parser sees it, and there is no apostrophe left to remove. A straight apostrophe ('), by contrast, is read as [ "x" x / "phil's" phil's / "desk" desk ], and should be understood even without being stripped by Punctuation Removal.

Since the input with the curly apostrophe is read as “phil?s”, this gives us a workaround:

After reading a command:
	remove question marks;

But I’m not sure what the best and most robust solution is; I don’t know whether other interpreters might behave differently.
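
One narrower variant, as an untested sketch: instead of dropping every question mark, turn "?s" at the end of a word back into a straight-apostrophe "s" before parsing. This assumes the interpreter always substitutes a plain "?" for the curly apostrophe, as in the trace above, and that the straight-apostrophe form is then understood as described. I haven't checked it on other interpreters.

```inform7
After reading a command:
	let T be "[the player's command]";
	[Assumption: the curly apostrophe has already arrived as a question mark.]
	[The bracketed apostrophe in the replacement text produces a literal straight apostrophe.]
	replace the regular expression "\?s\b" in T with "[']s";
	change the text of the player's command to T.
```

A question mark elsewhere in the command is left alone, though a word that really does end in "?s" would still be rewritten.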


The problem was noted here, too:

If you’re on 9.3/6M62, Zarf’s Unicode Parser would let someone write a Unicode-aware Punctuation Removal. I just took a stab at updating it for 10.1, but I really need to diff the originals of all the I6 template layer code it replaces against the new versions in the kits to see whether anything there needs updating. I know Parser__parse, at least, has changed, which makes it too big a job for a Sunday on which I have a lot to do.


Yep, definitely. Thanks.

Not sure which way to go with this. For now I’ll go with the question mark workaround. I don’t think I’ll need any question marks!
