As a side note, regex-based parsers are a typical jumping-off point because they’re accessible, but they tend toward brittleness as increasingly complex expressions get added to tidy up loose ends.
Writing a proper parser doesn’t have to be difficult, and it pays off by making the code easier to reason about and modify. For my money, parser combinators are accessible and powerful. Since I wrote my parser combinator library I’ve found several applications that are just easier because I can quickly roll a parser to build an AST with error handling.
If it would be useful to talk over I’d be happy to.
Thank you, I would appreciate an introduction to that sometime. This is an iteration of the first parser I ever built, and I have increasingly noticed the brittleness now that I have introduced new syntax for ChronicleHub specifically. So, if you could explain the other method sometime, perhaps it will lead to refactoring the logic into something more stable.
Currently using TypeScript for ChronicleHub, though in terms of pseudocode, I tend to be a little quicker to catch on to C#, as that is my most comfortable language to write in (if that is relevant).
In many respects this is such low-level stuff (characters, data structure, and functions) that the language doesn’t matter. The type spec’ing might be tedious in TypeScript tho. But I guess we’d muddle through. Or use C#. It’s been a while but, again, muddle through. The nice thing about parser combinators is they are so simple … it seems like a trick and that they couldn’t be as powerful as they are.
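To make "characters, data structure, and functions" concrete, here is a minimal sketch in TypeScript. It is not taken from Ergo or any other library; every name in it (`Result`, `Parser`, `literal`, `takeWhile1`, `map`, `andThen`, `many`, `intList`) is made up for illustration. A parser is a function from an input string and a position to a result, and combinators are functions that build bigger parsers out of smaller ones.

```typescript
// A parse result: success carries a value and the next position,
// failure records what was expected and where. (Hypothetical names.)
type Result<T> =
  | { kind: "ok"; value: T; pos: number }
  | { kind: "fail"; expected: string; pos: number };

// A parser is just a function from (input, position) to a result.
type Parser<T> = (input: string, pos: number) => Result<T>;

// Primitive: match an exact string at the current position.
function literal(text: string): Parser<string> {
  return (input, pos) =>
    input.slice(pos, pos + text.length) === text
      ? { kind: "ok", value: text, pos: pos + text.length }
      : { kind: "fail", expected: JSON.stringify(text), pos };
}

// Primitive: match one or more characters that satisfy a predicate.
function takeWhile1(pred: (ch: string) => boolean, expected: string): Parser<string> {
  return (input, pos) => {
    let end = pos;
    while (end < input.length && pred(input.charAt(end))) end++;
    return end > pos
      ? { kind: "ok", value: input.slice(pos, end), pos: end }
      : { kind: "fail", expected, pos };
  };
}

// Combinator: transform the value of a successful parse.
function map<A, B>(p: Parser<A>, f: (a: A) => B): Parser<B> {
  return (input, pos) => {
    const r = p(input, pos);
    return r.kind === "ok" ? { kind: "ok", value: f(r.value), pos: r.pos } : r;
  };
}

// Combinator: run two parsers one after the other, keeping both values.
function andThen<A, B>(pa: Parser<A>, pb: Parser<B>): Parser<[A, B]> {
  return (input, pos) => {
    const ra = pa(input, pos);
    if (ra.kind === "fail") return ra;
    const rb = pb(input, ra.pos);
    if (rb.kind === "fail") return rb;
    return { kind: "ok", value: [ra.value, rb.value], pos: rb.pos };
  };
}

// Combinator: apply a parser zero or more times, collecting the values.
function many<A>(p: Parser<A>): Parser<A[]> {
  return (input, pos) => {
    const values: A[] = [];
    let cur = pos;
    for (;;) {
      const r = p(input, cur);
      // Stop on failure, or if p succeeded without consuming input.
      if (r.kind === "fail" || r.pos === cur) break;
      values.push(r.value);
      cur = r.pos;
    }
    return { kind: "ok", value: values, pos: cur };
  };
}

// Built entirely from the pieces above: "1,22,3" becomes [1, 22, 3].
const integer = map(takeWhile1(ch => ch >= "0" && ch <= "9", "digit"), Number);
const intList: Parser<number[]> = map(
  andThen(integer, many(map(andThen(literal(","), integer), pair => pair[1]))),
  ([first, rest]) => [first].concat(rest)
);

console.log(intList("1,22,3", 0)); // kind "ok", value [1, 22, 3], pos 6
console.log(intList("x", 0));      // kind "fail", expected "digit", pos 0
```

The point of the sketch is that nothing here is clever: each function is a few lines, and the failure case already carries a position and an expectation, which is where error reporting can grow from.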
Would you mind sending me a private message? If need be, I can share the place in the repo where parsing is currently handled.
Or keep it public for the edification of all!
Either way, the main parsing logic is handled here:
And the most frequent class interacting with it is here:
There were attempts at refactoring it into sub-classes, but I gave up when I got too frustrated with the parser breaking. The code is still around in the scribescript/ folder.
It should quickly become apparent that, while I’ve programmed for about 10 years casually, this was my first attempt at a DSL.
I love reading about parsers (and all programming language theory, to be honest) and I don’t know very much about parser combinators as an approach, so I’d be delighted if you decided to do this publicly!
Yeah, parsers are fun! For some values of fun. I played with Earley parsers a long time back because I was curious about the research literature (I found out through the forum that Draconis actually found my intro helpful which was cool!), and these days I lean toward an extended shunting-yard parser for anything it can handle, just for sheer simplicity. But using parser combinators for recursive descent or PEGs is certainly a great way to go too. Maybe a new thread?
I’m by no means an expert, although I have implemented a parser combinator library, Ergo (partly because I didn’t gel with the dominant libraries in Elixir, and partly as a learning exercise), which I have used to build a number of reasonably complex parsers, including the parsers for the Rez language, Mangle assembly language, and DemoGen.
I’m happy to share what I know. My opening gambit would be either an open Zoom call (I pay for Zoom so can host) or something on the IF Discord.
Open to alternative approaches/venues and happy to share the floor with anyone else who has a perspective they want to share. E.g. I would appreciate hearing about PEG parsing as it’s not something I’ve used for over a decade and I’m rusty.
I’m afraid I haven’t played with Parsing Expression Grammars in at least that long either… they use ordered choice (always choose the first option if it matches, unlike context-free grammars where either can be chosen), which eliminates ambiguity (multiple ways to parse an input). That can make writing grammars simpler, because you don’t have to work as hard to remove ambiguity, but otherwise my recollection is that they’re very similar to writing a recursive-descent parser for LL(k) grammars (that is, grammars that can be parsed top-down with k tokens of lookahead, which among other things rules out left-recursion)…
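Ordered choice is easy to show in a few lines. The sketch below is TypeScript with made-up names (`Result`, `Parser`, `literal`, `choice`), not the API of any particular PEG tool. It tries the alternatives left to right and commits to the first one that succeeds, so the order in which you list alternatives changes what gets matched.

```typescript
// Hypothetical minimal parser types, just enough to demonstrate ordered choice.
type Result<T> =
  | { kind: "ok"; value: T; pos: number }
  | { kind: "fail"; expected: string; pos: number };

type Parser<T> = (input: string, pos: number) => Result<T>;

// Match an exact string at the current position.
function literal(text: string): Parser<string> {
  return (input, pos) =>
    input.slice(pos, pos + text.length) === text
      ? { kind: "ok", value: text, pos: pos + text.length }
      : { kind: "fail", expected: JSON.stringify(text), pos };
}

// Ordered choice: try each alternative from the same position, in order,
// and return the first success. Later alternatives are never consulted
// once an earlier one matches.
function choice<T>(...alternatives: Parser<T>[]): Parser<T> {
  return (input, pos) => {
    let lastFailure: Result<T> = { kind: "fail", expected: "no alternatives", pos };
    for (let i = 0; i < alternatives.length; i++) {
      const r = alternatives[i](input, pos);
      if (r.kind === "ok") return r;
      lastFailure = r;
    }
    return lastFailure;
  };
}

// The same two alternatives in different orders give different parses.
const shortFirst = choice(literal("<"), literal("<="));
const longFirst = choice(literal("<="), literal("<"));

console.log(shortFirst("<= 3", 0)); // kind "ok", value "<", pos 1
console.log(longFirst("<= 3", 0));  // kind "ok", value "<=", pos 2
```

There is never more than one parse, which is the upside. The flip side is that putting a shorter alternative ahead of a longer one that shares its prefix silently hides the longer one, so the usual habit is to list the longest or most specific alternative first.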
Okay, with some caveats in place: I am not a TypeScript user, and Rez uses JavaScript only because it generates code for use in a browser and that’s the language we have.
I think you’d have a job of work ahead of you to write a “proper” parser based on this code. It’s complex and it appears to intertwine parsing & action. I can see pretty clearly how you’d do it, but it wouldn’t be trivial. I don’t want to sugar coat that.
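To illustrate what I mean by separating parsing from action, here is a toy sketch in TypeScript. It has nothing to do with the actual scribescript syntax; the language (integers joined by "+") and all the names (`Expr`, `parseNumber`, `parseSum`, `evaluate`, `show`) are invented for the example, and the parsing step is deliberately trivial. The only point is the shape: parsing returns a tree and does nothing else, and each action is a separate function over that tree.

```typescript
// Hypothetical AST for a toy language of integers joined by "+".
type Expr =
  | { kind: "num"; value: number }
  | { kind: "add"; left: Expr; right: Expr };

// Parse a single integer leaf, or throw with a description of the problem.
function parseNumber(text: string): Expr {
  const trimmed = text.trim();
  if (!/^\d+$/.test(trimmed)) {
    throw new Error("expected an integer but found " + JSON.stringify(trimmed));
  }
  return { kind: "num", value: Number(trimmed) };
}

// Step 1: parsing only builds the tree; it performs no evaluation.
function parseSum(source: string): Expr {
  const parts = source.split("+");
  let tree = parseNumber(parts[0]);
  for (let i = 1; i < parts.length; i++) {
    // Left-associative: ((1 + 2) + 3)
    tree = { kind: "add", left: tree, right: parseNumber(parts[i]) };
  }
  return tree;
}

// Step 2: actions are separate functions that walk the finished tree.
function evaluate(expr: Expr): number {
  return expr.kind === "num"
    ? expr.value
    : evaluate(expr.left) + evaluate(expr.right);
}

// A second action over the same tree, added without touching the parser.
function show(expr: Expr): string {
  return expr.kind === "num"
    ? String(expr.value)
    : "(" + show(expr.left) + " + " + show(expr.right) + ")";
}

const ast = parseSum("1 + 2 + 3");
console.log(evaluate(ast)); // 6
console.log(show(ast));     // ((1 + 2) + 3)
```

Once the tree exists as plain data, new behaviour (evaluation, pretty-printing, validation) can be added or tested without going back into the parsing code.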
So, if it were me, the questions uppermost in my mind would be “How long do I plan to own this code?” and “Is it done, or is there more to do?”
If the project is done and you don’t have significant features to add and it works more or less bug free then unless you have an itch to scratch, you might be better off leaving it as it is.
On the other hand, if you’re wrestling with adding new features, perhaps finding it slower going than you would like, you’re seeing bugs pop up, and you plan for this to be a long-term thing, then building a parser for scribescript is more likely to be a worthwhile investment.
Either way, I am still happy to run a session introducing parsing with parser combinators.
Right now, the code is in a pretty good place. I can at least keep up with any updates I have to make to it and it’s maintainable enough. But it’s definitely a long-term thing, and I am nothing if not eager to experiment with better ways of handling code. This is definitely in the itch-to-scratch category, though that does mean it will be lower down on the priority list.
I hang out in the IntFiction Discord as @sandbags, so if anyone wants to chat parsers & parser combinators, ping me there. I’m in the UK, so the best times for me to talk are usually going to be 19:00-22:00 GMT.