Some of you asked for input from someone about testing choice-based games.
By default, choice-based games are less likely to have transcript capability out of the box. Certainly, Ren’Py (the engine I use) does not have an automatic transcript. However, I have noticed some choice-based games come with a History function which, depending on how it is programmed, can be the same thing under a different name. When such a function exists, it generally doesn’t need to be turned on, and it automatically records all text from the moment “Play” is pressed (the player may not be able to see it until they get to an interactive part of the game, however). Player actions may be marked differently from displayed text (for example, in square brackets) and occasionally, relevant major variables are listed as well. The text can even be copied into a text editor for annotation and/or onward communication.
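For anyone wondering what such a History function boils down to, here is a minimal sketch in plain Python. It is not Ren’Py’s API or any engine’s actual implementation: the `Transcript` class, its method names, the curly-brace format for variables and the sample game text are all made up for illustration. It only shows the three behaviours described above: displayed text is logged as it appears, player actions go in square brackets, and major variables can be listed alongside.

```python
from dataclasses import dataclass, field


@dataclass
class Transcript:
    """Hypothetical history log; not part of Ren'Py or any other engine."""

    entries: list[str] = field(default_factory=list)

    def show(self, text: str) -> None:
        """Record a line of text displayed to the player."""
        self.entries.append(text)

    def choose(self, option: str) -> None:
        """Record a player action, marked with square brackets."""
        self.entries.append(f"[{option}]")

    def note_variables(self, **variables: object) -> None:
        """Record major variables on one line, sorted by name."""
        listing = ", ".join(f"{k}={v!r}" for k, v in sorted(variables.items()))
        self.entries.append("{" + listing + "}")

    def export(self) -> str:
        """Return the whole log as plain text, ready to paste elsewhere."""
        return "\n".join(self.entries)


# Invented sample session, purely for illustration.
log = Transcript()
log.show("You arrive at the airport.")
log.choose("Check the departure board")
log.note_variables(stress=2, has_ticket=True)
log.show("Your flight is delayed.")
print(log.export())
```

Because the log is plain text, a tester can paste it straight into a bug report and annotate it, which is the main thing a transcript is for.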
I haven’t programmed a History function into Budacanta yet, though I intend to do so before I start testing the first commercial part of the game. (It didn’t happen for the demo because it simply didn’t occur to me, and none of my testers had ever used a transcript either…) I was primarily interested in shipping something that worked and wasn’t going to be terrible, and advised my testers accordingly. Thankfully, the four people I picked had quite different ideas of what that meant, so I got feedback on everything from the tutorial to technical corrections concerning aircraft.
I didn’t bother asking for much detail about what had preceded any problems, just where players had got to when they encountered them. (Thankfully, every bug/typo/low-flying slow plane turned out to be reproducible from this information, so I got away with this approach.)
On the other hand, when I organised beta testing for my first translation, my poor translator had to put together a series of 38 screenshots to illustrate the (many) game-breaking bugs caused by my forgetting to define a character (19 screenshots with the cursor on the option that was selected, and 19 of the resulting errors). Please bear in mind all of these screenshots were in Indonesian - a language I have never studied. Let’s say I learned to dread seeing “Bengeculian telah terjadi” on my screen… (Indonesian for “A runtime error has occurred”).
Due to the relatively high word count in visual novels compared to parser-based adventures, typos are slightly more likely to be forgiven (although testers are just as likely to spot them, and rightly so). Theme and character tend to be easy to find testers for. Bugs that give errors or make the game unplayable will be mentioned regardless of the game format. Beta testers who will comment on the accuracy of technical details and offer pacing advice are more difficult to find in a choice context. It’s also surprisingly difficult to find testers who will mention blatant logic issues in a choice game - the “main character is able to eat 360 grilled cheese sandwiches in a row” bug was something I mentioned on one choice-based Discord, to be met with replies along the lines of, “How is that a bug? You didn’t need to remove that - it’s funny!” Something tells me that a parser-based tester, of a game that is clearly not meant to be a reality-bending comedy, would not have provided the feedback that way.
Sensitivity testers (specialists in spotting problems with either the representation of a particular viewpoint, or the way the game portrays its story to players of particular backgrounds) are still not used for the majority of choice games that could potentially use them, but are starting to do testing in that space.
A pure-choice game isn’t going to need its choice array “stretching” unless, for some reason, the author opted to roll their own (which is rare, given that several platforms are both relatively easy to pick up and extensible and expertise tends to get siloed into platform-specific locations).
I cannot comment on puzzle testing because my game didn’t have any formal puzzles last time it was tested, but some choice games do have puzzles and other authors will be more qualified to comment on this matter.
I am using version numbering on my game. Ren’Py does not automate the numbering system, although it automatically renames builds according to the version number provided. This means that testers are unlikely to confuse versions of the game unless the author has failed to communicate which version is the current one.
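As a rough illustration of why version-stamped builds keep testers and authors on the same page, here is a small Python sketch. The helper names and the `name-version` label format are my own assumptions for the example, not Ren’Py’s exact build naming, and it assumes purely numeric dotted versions such as `0.10.0`.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '0.10.0' into (0, 10, 0)."""
    return tuple(int(part) for part in version.split("."))


def build_label(game: str, version: str) -> str:
    """Hypothetical build label that embeds the version number."""
    return f"{game}-{version}"


def is_outdated(reported: str, current: str) -> bool:
    """True if a tester's reported version is older than the current one."""
    return parse_version(reported) < parse_version(current)


print(build_label("mygame", "0.10.0"))
print(is_outdated("0.9.0", "0.10.0"))
```

Comparing the parsed numbers rather than the raw strings matters here: as plain text, "0.9.0" sorts after "0.10.0", which would make the older build look newer.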
Choice-based authors are also far more inclined to post small snippets of code or text and ask, “Does this look OK to you?” This means a certain amount of ad hoc testing goes on outside of formal beta testing cycles.
If all else fails, it’s possible to send Ren’Py authors the entire save file. It’s only readable in the game, but it does allow the author to go through the rollback and reconstruct the most recent move(s) in the player’s sequence up to that point. Note that this is likely to work, to varying degrees, in any game with a save function and a substantial rollback or undo facility.
(Consider allowing unlimited rollback/undo on test versions of your game for this reason, even if you are not planning to permit this in the completed version. You can always use tester feedback to decide what level of undo is right for the completed game.)