Faster route-finding

TL/DR: I’ve written drop-in replacements for “slow” and “fast” map route-finding that are significantly faster than the built-in implementations. If you have a game that uses map route-finding (especially if the map is large), I’d appreciate it if you could try the code in the rant box below out and let me know if it’s slower or has bugs. If it proves helpful (and non-buggy) I’ll try to get it officially approved or package it in an extension.

[rant]All you need to do is paste this code anywhere in your story file. It should just work, on both Z and Glulx - let me know if not.[code]Include (- Property vector2; -) after “Definitions.i6t”.
Include (- with vector2 0 -) when defining a room.
Include (- Replace SlowRouteTo; -) after “Definitions.i6t”. [ Quick hack; in an extension, use template replacement commands instead ]
Include (-
#Ifndef FAST_ROUTE_FINDING;
! This routine uses the room_index property for temporary storage, since it isn’t needed for slow route-finding anyway
[ SlowRouteTo from to filter use_doors obj dir head tail hrow in_direction sl found last_dir;
if (from == nothing) return nothing;
if (to == nothing) return nothing;
if (from == to) return nothing;
objectloop (obj has mark_as_room) { obj.vector = 0; obj.vector2 = 0; obj.room_index = 0; }
from.vector = 1; ! mark source as visited
head = from; ! initialize queue to contain only the source
tail = head;
found = false;
while ((~~found) && (head ~= 0)) { ! iterate until destination found or queue empty
hrow = head.IK1_Count * No_Directions;
objectloop (dir ofclass K3_direction) { ! find all unvisited child nodes
in_direction = Map_Storage–>(hrow + dir.IK3_Count);
if (in_direction == nothing) continue;
if (use_doors && (in_direction ofclass K4_door) && ((use_doors & 2) || (in_direction has open) || ((in_direction has openable) && (in_direction hasnt locked)))) {
sl = location; location = head;
in_direction = in_direction.door_to();
location = sl;
if (in_direction == nothing) continue;
}
if (in_direction == to) {
in_direction.vector = dir;
in_direction.vector2 = head;
found = true;
break;
}
if ((in_direction has mark_as_room) && (in_direction.vector == 0) && ((filter == 0) || filter(in_direction))) {
in_direction.vector = dir;
in_direction.vector2 = head;
tail.room_index = in_direction; ! add child to queue
tail = in_direction;
}
}
head = head.room_index; ! remove head from queue
}
if (~~found) return nothing;
! extract route
last_dir = to.vector;
obj = to.vector2;
while (obj ~= 0) {
dir = obj.vector;
obj.vector = last_dir;
last_dir = dir;
obj = obj.vector2;
}
return from.vector;
];
#Endif;
-).

Include (- Replace ComputeFWMatrix; -) after “Definitions.i6t”.
Include (-
#Ifdef FAST_ROUTE_FINDING;
[ ComputeFWMatrix filter use_doors oy ox row ;
Memclr(FWMatrix, WORDSIZENUM_ROOMSNUM_ROOMS);

objectloop (oy has mark_as_room) if (oy.room_index >= 0) {
	FindOptimalFirstStepsFrom(oy, filter, use_doors);
	row = oy.room_index * NUM_ROOMS;
	objectloop (ox has mark_as_room) if (ox.room_index >= 0)
		FWMatrix-->(row + ox.room_index) = ox.vector;
}

];

! Unlike the SlowRouteTo implementation above, we do need to use room_index; but we don’t need vector2 anymore since we only need the first step of each shortest path, so we use it to store the queue instead
[ FindOptimalFirstStepsFrom from filter use_doors obj row diri head tail in_direction sl;
objectloop (obj has mark_as_room) { obj.vector = 0; obj.vector2 = 0; }
from.vector = 1; ! mark source as visited
head = from; ! initialize queue to contain only the source
tail = head;
while (head ~= 0) { ! iterate until queue empty
row = head.IK1_Count * No_Directions;
for (diri = 0 : diri < No_Directions : ++diri) { ! find all unvisited child nodes
in_direction = Map_Storage–>(row + diri);
if (in_direction == nothing) continue;
if (use_doors && (in_direction ofclass K4_door) && ((use_doors & 2) || (DoorRoutingViable->(in_direction.IK4_Count)))) {
sl = location; location = head;
in_direction = in_direction.door_to();
location = sl;
if (in_direction == nothing) continue;
}
if ((in_direction has mark_as_room) && (in_direction.vector == 0) && (in_direction.room_index >= 0)) {
if (head == from) ! propagate first step of shortest path
in_direction.vector = No_Directions + diri;
else
in_direction.vector = No_Directions + head.vector;
tail.vector2 = in_direction; ! add child to queue
tail = in_direction;
}
}
head = head.vector2; ! remove head from queue
}
];

! Trivially modified from I7 implementation of Memcpy
[ Memclr addr size n;
#Ifdef TARGET_ZCODE;
for (n = size/WORDSIZE: (n-- ) > 0: ) addr–>n = 0;
for (n = size: ((n-- ) % WORDSIZE ~= 0): ) addr->n = 0;
#Ifnot; ! TARGET_GLULX
@mzero size addr;
#Endif; ! TARGET_
];
#Endif;
-).[/code][/rant]
A while back there were some threads about route-finding being slow (in either “fast” or “slow” mode), so today I took a look at the implementation. To my surprise, the “slow” algorithm isn’t breadth-first search as I had supposed, but something unusual and inefficient (and nameless, as far as I know). For example, the algorithm can waste a lot of time on parts of the map disconnected from both the starting point and the destination. The template documentation claims that

but this is not true. The runtime is in general only O(dnN) where N is the total number of rooms (which could of course be much larger than n, for example if a locked door blocked access to a big chunk of the map). It also can end up evaluating the condition specifiying whether routing through a particular room is allowed up to n times per room, when once would be enough.

Anyway, even if the claim above were correct the algorithm would still be suboptimal, since a totally standard breadth-first search has complexity O(dn) (assuming d > 0). I’ve implemented that, and at least on a couple of maps I tried it was several times faster than the built-in algorithm. It uses slightly more memory: one extra word per room.

I also took a look at the “fast” route-finding method. This uses the Floyd-Warshall algorithm to simultaneously calculate all shortest paths between any two rooms, caching them for later use. Floyd-Warshall is a good algorithm for dense graphs, but doesn’t make sense for maps in IF, which are typically very sparse (the number of exits in a single room is much lower than the total number of rooms). For sparse graphs running Dijkstra’s algorithm from each vertex is usually better, and in this case since we have an unweighted graph we can just run breadth-first search as above. I also implemented this, and on the few maps I tried it was many times faster than Floyd-Warshall.

Note that in my implementation both “slow” and “fast” route-finding are done by breadth-first search. The only difference is that in “fast” mode, all possible shortest paths are found and cached. So the choice of which to use is really only a function of whether you’d rather do computation ahead of time (e.g. if the map and any conditions you put on the rooms/doors that can be travelled through don’t change) or on the fly (e.g. if the map is very dynamic). But both are hopefully faster than they were before.

At any rate, I’ve only done a little testing, and it would be very helpful if anyone who has a few minutes could try out my code on their maps and see if it is correct (and ideally, faster). The implementation above is only for map route-finding, but it should be straightforward to extend it to relation route-finding, and I will attempt it later this week.

Here’s a drop-in replacement for “slow” relation route-finding (i.e. the route-finding used unless you write “with fast route-finding” when defining the relation) - again BFS seems much faster than the built-in algorithm. Any bug reports or other feedback are appreciated.

[rant][code]Include (- Property vector2; -) after “Definitions.i6t”.
Include (- with vector2 0 -) when defining an object.
Include (-
[ VtoVRelRouteTo relation from to count vtov_structure right_ix obj head tail found last_obj next i;
vtov_structure = RlnGetF(relation, RR_STORAGE);
if (vtov_structure–>VTOVS_CACHE)
return FastVtoVRelRouteTo(relation, from, to, count);
right_ix = vtov_structure–>VTOVS_RIGHT_INDEX_PROP;

objectloop (obj ofclass Object && obj provides vector) { obj.vector = 0; obj.vector2 = 0; }
from.vector = 1;	! mark source as visited
head = from;	! initialize queue to contain only the source
tail = head;
found = false;
while ((~~found) && (head ~= 0)) {		! iterate until destination found or queue empty
	objectloop (obj ofclass Object && obj provides right_ix && obj.vector == 0) {	! find all unvisited child nodes
		if (Relation_TestVtoV(head, relation, obj)) {
			obj.vector = head;
			if (obj == to) {
				found = true;
				break;
			}
			tail.vector2 = obj;
			tail = obj;
		}
	}
	head = head.vector2;	! remove head from queue
}
if (~~found) {
	if (count) return -1;
	return nothing;
}
! extract route
last_obj = to;
obj = to.vector;
i = 0;
from.vector = 0;
while (obj ~= 0) {
	next = obj.vector;
	obj.vector = last_obj;
	last_obj = obj;
	obj = next;
	++i;
}
if (count) return i;
return from.vector;

];
-) instead of “Slow Various To Various Route-Finding” in “Relations.i6t”.[/code][/rant]
I also implemented BFS for “fast” relation route-finding, but unlike the case of map route-finding it doesn’t seem to do better than the built-in implementation of Floyd-Warshall.