Does firstObj...nextObj include orphans not GCed yet?

I suspect this is happening but I’d really rather not delve into the TADS VM source code to answer the question. I was hoping someone here may know if this is true:

When you (or adv3 code) execute a loop using firstObj() and nextObj() to iterate over all objects of a given type, what does it do with the objects that you have tried to “delete” by use of moveInto(nil) if the garbage collector hasn’t gotten around to really deleting them yet? Are firstObj() and nextObj() designed to skip over those orphaned objects since in a way they’re not really supposed to be there anymore? I suspect they don’t skip them, and instead they do include these objects in their iteration. It would explain some behavior I’m seeing that makes my program crash, but in nondeterministic ways. It may be that it’s only crashing when the garbage collector is delayed enough that some orphaned objects are being picked up by the object iterator loop. These are objects for which many of their properties now contain obsolete information that really messes things up when methods are run on them.

If this is the problem, I can put in checks to skip over any objects picked up by nextObj() whose location is nil, and that should fix it (I did, and it seems to have made it stop happening). Because the problem was originally only happening in maybe about one out of every ten runs on average, it’s hard to tell if I really fixed it just because it hasn’t happened in any recent runs, which is why I’m asking if I’m right about my hunch. The inconsistency of the crash is why my eyes are on the garbage collection as an explanation.
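The check described above could look something like the following. This is only a sketch: the class name Foo and the helper forEachPlacedFoo are made up for illustration, and it assumes the objects are adv3 Things, so that a nil location means "not in the game world". It would also skip any object that has been deliberately parked off-stage in nil.

```tads3
#include <adv3.h>
#include <en_us.h>

/* Hypothetical class standing in for the dynamically created objects. */
class Foo: Thing;

/*
 *   Hypothetical helper: call func on each Foo that still has a
 *   location.  func is any function pointer or anonymous function
 *   taking one argument.
 */
forEachPlacedFoo(func)
{
    for (local obj = firstObj(Foo); obj != nil; obj = nextObj(obj, Foo))
    {
        /* Skip anything moved into nil, including orphans awaiting GC. */
        if (obj.location == nil)
            continue;

        func(obj);
    }
}
```

Code passed in as func can then follow obj.location without first checking it for nil.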

(Also, it’s important because if I’m wrong, then maybe I’m not really orphaning the object like I thought I was, and I would therefore need to scan through my code very carefully looking to see if I left a reference around somewhere that’s keeping it alive.)

I wouldn’t expect moveInto(nil) to “delete” an object. All it does is set the object’s location to nil and remove it from the contents list of its former container. Any object defined in code (as opposed to being created dynamically) would, I imagine, persist throughout the game. You often want to move objects into nil to move them off-stage, but you might want to bring them on-stage again, so the garbage collector certainly shouldn’t delete them.

Or are you referring specifically to objects you create dynamically at run-time? In which case I suppose the garbage collector would delete them when there’s no longer a reference to them, and it’s possible that the last remaining reference to a dynamic object might be its presence in the contents list of its location, so that moving it into nil would remove the last remaining reference (although you’d presumably have needed some other reference to the object to call moveInto(nil) on it, so that reference would have to go out of scope too).

Or am I completely misunderstanding your question?

Eric is right. When a dynamically created object is not referenced anymore, it is placed in the garbage collector’s work queue. Whether its location is nil or not has nothing to do with it, as the GC does not care about the adv3 world model and has no concept of location. It only cares about whether an object is still reachable in your code, not in the game. Here’s an example:

// 'obj' is a property, not a local.
obj = new Thing;
obj = new Thing;
obj.moveInto(nil);

The Thing that was created first is lost forever. There’s no way to ever get to it in the code. The second Thing is still reachable (through the ‘obj’ property) and therefore isn’t GCed.

So the answer is, the list will include objects that are moved into nil.

I think you both missed my question because you picked on the idea of whether or not moveInto(nil) is an effective way to orphan an object rather than the actual question, which is if I have orphaned an object, can firstObj and nextObj hit on it because garbage collection isn’t instantaneous and might not have deleted it yet?

I.e. Can this happen in this order some of the time:
Step 1 - I have orphaned an object of type Foo.
Step 2 - firstObj/nextObj are used to iterate over all objects of type Foo.
Step 3 - The garbage collector gets around to finally deleting the object I orphaned.

But in this order other times depending on unrelated circumstances:
Step 1 - I have orphaned an object of type Foo.
Step 2 - The garbage collector gets around to finally deleting the object I orphaned.
Step 3 - firstObj/nextObj are used to iterate over all objects of type Foo.

And if that can happen, would the orphaned object appear in the firstObj/nextObj iteration when things happen the first way, but not appear in it when things happen the second way?

On the unrelated topic of whether or not moveInto(nil) actually orphans an object or not, I would expect that it can in the case where the contents list of the object’s previous container (prior to the moveInto() ) was the only place the object still had a reference pointing to it.

i.e. something like this:

{
  // Is there an object matching the criteria I'm looking for in the actor's inventory?
  // please note that obj is declared as local, which is relevant to garbage collection.
  // valWhich returns the matching element itself (indexWhich would only return its position).
  local obj = person.contents.valWhich( ... some condition here ... );
  // If so, get rid of it:
  if( obj ) {
    "\^<<obj.theName>> disappears from {your/his} hands in a puff of smoke.\n";
    obj.moveInto(nil);
  }

  // Note that the object was originally created with a 'new' operator.  Therefore unlike the static
  // objects people usually use in TADS, it doesn't always have a reference hanging around just
  // by virtue of the fact that it wasn't created anonymously.
}

From what I can tell by reading the code, once an object has been marked for GC, it’s removed from the interpreter stack. nextObj() is not going to find it, as it looks in the stack for objects.

That’s interesting that the code was put there to check that an object is marked for deletion. But are you really sure that will have happened yet during the window of time I’m talking about? (Which is when the user code has orphaned an object, but the GC hasn’t been run yet since then.) If the orphan-detecting algorithm is part of the GC as the documentation seems to claim, then how does the VM know yet that this is an object that will be deleted? How could it have marked it for deletion yet? I would strongly suspect that the object isn’t marked as such until the next time after it was orphaned that the GC is invoked.

I’ll see if I can create a minimal example game to try to prove it. I’ll make an object, orphan it, and then iterate over all of its type to see if it’s still there.

How long the window of time I’m talking about is will sort of depend on how often the GC runs, which I don’t know. I know the documentation here: tads.org/t3doc/doc/sysman/gc.htm says it’s not a separate thread but a routine run explicitly by the main thread, but I don’t know what criteria it uses to decide it’s time to do a GC run. If those criteria are something that could easily vary from one run to the next then it would explain what I was seeing if my guess is right.

Note that before doing the traversal, you can run the GC by calling t3RunGC(). If the crashes go away, then that might indeed be the problem.
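As a rough illustration of that suggestion, a traversal could force a collection first and only then walk the heap. The helper name liveInstances is made up for this sketch; t3RunGC(), firstObj(), nextObj() and Vector are the standard intrinsics pulled in by tads.h.

```tads3
#include <tads.h>

/*
 *   Hypothetical helper: return a Vector of the instances of cls
 *   that are still present after a forced GC pass.
 */
liveInstances(cls)
{
    local found = new Vector(16);

    /* Force a collection so unreachable instances are deleted first. */
    t3RunGC();

    for (local obj = firstObj(cls); obj != nil; obj = nextObj(obj, cls))
        found.append(obj);

    return found;
}
```

Note that the returned Vector itself holds references, so everything in it stays alive for as long as the Vector does. Anything that becomes unreachable after the t3RunGC() call can still turn up in a later traversal.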

Edit:
I’ve sent a mail to Mike Roberts about this, since it’s probably a good idea to document this on tads.org/t3doc/doc/sysman/gc.htm. He knows the VM code in and out (obviously).

(I did know about the t3RunGC() and in fact it does seem to make the problem go away, but the problem with a crash of this variety that was only ever intermittent in the first place is that just because it stopped happening that’s not enough to prove you found the cause. Since it’s intermittent it could just be that the more recent runs have been more “lucky”. Thus why I was seeking the help of an expert on the inner workings and why I tried making a test case to watch it happening. By the way thanks for all the help you’ve given. It’s been handy.)

Oh, and if it’s useful, I just got done creating a small demonstration game file that shows the situation very clearly. Because of the results of this test I am now very strongly convinced that this was in fact the problem. I’ll paste it below.

It’s even more prominent than I thought. The orphans stay in the firstObj/nextObj list for quite some time. The window of time during which this happens is very long. It was long enough for me to enter “z” over and over and pass about 10 turns in a row before all the temp objects were gone from firstObj/nextObj. I originally made my test game thinking I’d have to print out the object list many times per second with a RealTimeDaemon to “catch” the moment when the orphans went away, but it turns out it’s slow enough that a normal once-a-turn daemon is good enough to demonstrate it.

In the first box below is the sample game file, and in the second box is the example output captured with a ‘script’ command from one of my runs of it:

#include <adv3.h>
#include <en_us.h>


gameMain: GameMainDef
  initialPlayerChar = me

  showIntro()
  {
    "==============================================================\n";
    "A test of how firstObj/nextObj behaves with garbage collection\n";
    "==============================================================\n";
    "Verbs for this test:\n";
    "====================\n";
    " - generate <num> : Generate and immediately orphan a count of <num> new MyThings with random names.\n";
    " - reportlong : Switch to long reporting - showing all the names of all the MyThings.\n";
    " - reportshort : Switch to short reporting - showing only the total count of MyThings, not their names.\n";
    "(Note: reportlong is the default at first, but you should change it when you make a lot of things.)\n";
    "\n";
    "To perform a test: Generate some orphans and then keep waiting with the 'z' command to see how long they last.\n";
  }

;

versionInfo: GameID
  name = 'dummy for this test'
  byline = 'dummy for this test'
  authorEmail = 'dummy for this test'
  desc = 'So far this is a dummy placeholder string'
  version = '1' // dummy for this test
  IFID = 'ffffffff-ffff-ffff-ffff-aaaaaaaaaaaa' // dummy for this test
;

class MyThing: object
  name = '---'
  construct(n) { name = n; }
;

tester: PreinitObject
  lastingThing = nil
  repeatingDisplay = nil

  // detailedList:
  //    Set to true to make the daemon repeatedly print verbosely the full list of MyThings.
  //    Set to nil to make the daemon repeatedly print only a count of how many there are.
  detailedList = true

  execute()
  { 
    // Make one lasting reference to a MyThing for comparison.  This one won't be
    // orphaned and should always exist.  The other MyThings I make will be temporary
    // and orphaned:
    lastingThing = new MyThing('persistent thing'); 

    // Start repeatedly printing how many things there are in the list of MyThings according to firstObj/nextObj.
    repeatingDisplay = new Daemon( self, &displayAllThings, 1 );
  }

  displayAllThings()
  {
    local thingCount = 0;

    "\nReporting all MyThings in firstObj/nextObj:\n";
    for( local obj = firstObj(MyThing) ; obj != nil ; obj = nextObj(obj, MyThing) ) {
      if( detailedList ) {
        "\'<<obj.name>>\' ";
      }
      ++thingCount;
    }
    "\n(total of <<thingCount>> MyThings)\n";
  }
;

lab: Room 'lab' 'lab' "The tads testing lab."
;
+me: Actor 'Player' 'player' "You, the player"
;

DefineLiteralAction(Generate)
  execAction()
  {
    local howMany = toInteger(getLiteral());
    local obj = nil;

    // Just to make sure the objects all have unique names for printing purposes:
    local prefixName = '<<rand(100000)>>'; 

    "Making <<howMany>> MyThing orphans with names of the form: \'<<prefixName>>(NUM)\'\n";
    for( local i = 1 ; i <= howMany ; ++i ) {
      obj = new MyThing('<<prefixName>>(<<i>>)');
    }
     
    // Orphan the last instance I made.  Not strictly necessary since obj is 
    // local and thus about to go out of scope.
    obj = nil;

    if( obj ) { /* nothing */ } // Just to silence the 'obj set but not used' compiler warning
    
  }
;
VerbRule(Generate)
  'generate' singleLiteral : GenerateAction
  verbPhrase = 'generate/generating (what)'
;
DefineIAction(ReportShort)
  execAction()
  {
    "Switching to short report list.  Will only show total MyThing count.\n";
    tester.detailedList = nil;
  }
;
VerbRule(ReportShort)
  'reportshort' : ReportShortAction
  verbPhrase = 'reportshort/reportshorting'
;
DefineIAction(ReportLong)
  execAction()
  {
    "Switching to long report list.  Will show all MyThing names.\n";
    tester.detailedList = true;
  }
;
VerbRule(ReportLong)
  'reportlong' : ReportLongAction
  verbPhrase = 'reportlong/reportlonging'
;

firstObj() / nextObj() will indeed return otherwise unreferenced objects. They just run through the current object heap. When an object becomes unreferenced, that isn’t immediately detected; it’s only detected when the garbage collector runs. Between gc runs, the VM doesn’t know which objects are reachable and which aren’t, because making that determination requires tracing the network of references, which is time-consuming enough that the VM only does it at gc time. This makes it possible to “resurrect” an object that’s otherwise unreachable with firstObj/nextObj, as you’ve apparently encountered.

When you say your program crashes, are you talking about a run-time error, or an actual VM crash? The latter shouldn’t be happening even in this situation. Resurrecting an object with nextObj is harmless at the VM level - in particular, it won’t create a situation where an object that you’ve resurrected is later deleted and leaves you with an invalid reference. The fact that an object was momentarily unreachable doesn’t make it deletable, since there’s no “trigger” that occurs when the last reference is removed; in order to be deleted, the object has to be unreachable at the moment the gc actually runs. So if an object becomes unreachable, and then you resurrect it with nextObj(), and you keep hold of that reference until the next gc pass, the gc will never know about the interval when the object was unreachable; it will just see that you have a reference to the object and will keep the object alive.

If your code is sensitive to this, the suggestion to call t3RunGC() before your loop will probably help, although even that’s no guarantee if the code you’re running in the loop could discard other objects. If it’s important to your project to be able to explicitly remove an object from the game, I’d probably do something more explicit to manage the dynamic objects, like keeping a list or lookup table of the active ones, or setting a property that says whether the object is active or not. Relying on the garbage collector probably isn’t quite the ideal approach, in that it operates at a much lower level than the game world model (e.g., other programmatic references besides contents/location type references could unexpectedly keep an object alive) and because of its “lazy” cleanup strategy.
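A minimal sketch of the "property that says whether the object is active" idea might look like this. The names Tracked, isActive, retire and forEachActive are all invented for the example and are not part of the library; cls is assumed to be Tracked or a subclass of it.

```tads3
#include <tads.h>

/* Hypothetical base class for dynamically created game objects. */
class Tracked: object
    /* true while the object still counts as part of the game */
    isActive = true

    /* Call this explicitly instead of waiting for the GC to remove it. */
    retire() { isActive = nil; }
;

/*
 *   Hypothetical helper: call func on each active instance of cls.
 *   func is any function pointer or anonymous function taking one
 *   argument.
 */
forEachActive(cls, func)
{
    for (local obj = firstObj(cls); obj != nil; obj = nextObj(obj, cls))
    {
        /* Retired objects may linger in the heap until a later GC pass. */
        if (obj.isActive)
            func(obj);
    }
}
```

With this arrangement it no longer matters when (or whether) the collector gets around to a retired object, since the loop decides what is "in the game" from the flag rather than from what happens to be in the heap.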

First off, thank you very much for the reply (and for just TADS in general).

The VM engine wasn’t crashing. It was my code that was crashing from running the methods of my objects that were attempting to follow references that were set to nil as part of the process of de-linking them from the other objects in the world. Specifically, trying to do things like print self.location.name after a moveInto(nil) had orphaned the object and left self.location set to nil. (The contents of the parent container was the only reference, so moveInto(nil) was indeed orphaning it as I intended; it’s just that I hadn’t thought about how firstObj/nextObj could be a way to get access to an orphaned object.)

Really the major hurdle for me here was simply coming to the realization in the first place that it’s perfectly normal for firstObj/nextObj to sometimes return orphaned objects and sometimes not, based on criteria I don’t control. This began from me trying to diagnose a problem of the form “why do I not get the same behavior even when my code is identical, I haven’t recompiled, and I feed it the exact same input data?” (Since there were NPCs taking actions based on some heuristics, and a deck of cards being shuffled randomly in the game, I also forced a consistent seed into the random number generator just to get the NPCs to give the same actions each time, thinking that had something to do with it, but it was still inconsistent.) After delving into it, it now makes sense why the behavior is inconsistent based on how it’s dependent on the GC, but for a while it was really confusing.

In the end it probably is a good idea to document the fact that when using firstObj/nextObj you need to write code that is tolerant of the fact that sometimes it will be operating on objects you’ve orphaned and may have thought were out of sight, out of mind. (With a short mention of the GC and why that is, but not more than one or two sentences about the GC). I’m not sure where it makes the most sense to put that documentation.

I know the place I probably would have been most likely to have seen it first if it was there is not on this page:
tads.org/t3doc/doc/sysman/gc.htm

but on this one:
tads.org/t3doc/doc/sysman/tadsgen.htm (where it talks about firstObj and nextObj)

It might be handy for it to also be mentioned in the header file tadsgen.h so it will end up in the auto-generated library documentation, but really I think the above two places are where I’d have seen it first given the order in which I tried to read docs to diagnose the problem.

But that’s just me. I have no idea if my doc reading habits are typical. Take it as you will.

I’m glad to hear VM crashes weren’t involved! This is the kind of situation where I think I know what should happen, but I often find myself surprised…

Your ideas for places to mention this in the docs are all good - I’ll add some notes about it.