[I7] Table lookup speed

I’ve been playing around with a small project that gets its data from 256 rows in a table. Accessing that data is… not slow, but there’s a barely perceptible delay. I don’t want that; it’s bound to be worse when I pile on rules and daemons.

So I thought I’d subdivide the (sorted) giant table into a tree structure and do a binary search. The idea seemed solid. I built it to access by a number (called x), and got the game to load the giant table, chunk it into 32 equal-sized tables on the leaf nodes, and then recursively let the max-x values bubble up the tree in fixed-size tables. Basically, the top-node table has 2 subtables, each of which has 2 subtables, and so on, each looking something like this:

[code]Table of Top-Node
max-x table-name
102 Table of Sub-node a
230 Table of Sub-node b

Table of Sub-node a
max-x table-name
63 Table of Sub-node a a
102 Table of Sub-node a b

Table of Sub-node b
max-x table-name
160 Table of Sub-node b a
230 Table of Sub-node b b[/code]
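The descent through those max-x tables can be sketched like this (a Python illustration of the idea, not the actual I7 code; the node structure and example values are made up to mirror the tables above):

```python
# Sketch of the max-x tree lookup described above. Each internal node holds
# (max_x, child) pairs sorted by max_x; leaves hold the sorted rows themselves.

def lookup(node, x):
    """Descend the tree by picking the first branch whose max_x >= x."""
    if node["leaf"]:
        # Linear scan of a small chunk; this could itself be a binary search.
        for row_x, value in node["rows"]:
            if row_x == x:
                return value
        return None
    for max_x, child in node["children"]:
        if x <= max_x:
            return lookup(child, x)
    return None

# Tiny example in the shape of the tables in the post:
tree = {
    "leaf": False,
    "children": [
        (102, {"leaf": True, "rows": [(63, "a"), (102, "b")]}),
        (230, {"leaf": True, "rows": [(160, "c"), (230, "d")]}),
    ],
}
```

Each lookup then touches only one small leaf table instead of scanning all 256 rows.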

While the finished code does run, I still sensed a small but perceptible delay in the built-in I7 terp (which has always been slow as molasses on my end, but still). So I guess what I’m asking is, am I just reinventing something I7 does under the hood already, or could this method help reduce lag to a useful degree?

I7 does not do that under the hood. Table searches are simple linear loops.

However, a loop of 256 values isn’t that bad. Consider the possibility that some rule elsewhere in your code is doing something expensive.

If you can figure out how to run a profiling interpreter, that may be helpful. Unfortunately this is a pain in the ass, both to do (github.com/erkyrath/glulxe/blob … analyze.py) and to understand the output.

Ah. No, this is a fresh project just to test out the code. It does nothing but lookups and one numeric comparison per element in one When Play Begins rule. And yeah, there’s also about eight square root operations in there, but even without them it still lags noticeably.

Guess I’ll try super-large iterations and stick a timer in there, see what happens.

If the table is sorted, there are some shiny Glulx tricks you can use to have the interpreter do a more efficient search. I’m not sure if I7 uses any of them for table searches at present.

All simple linear loops, I promise.

Each column is stored in a separate array so it would be possible to use @linearsearch (no sorting required). Probably wouldn’t be too hard to make an extension to do that. If anyone wants to try, look for the TableRowCorr function and the others around it.

One change that would make a big performance difference would be to make Inform use Glulx’s malloc instead of its own heap implementation. Every call to BlkValueRead and BlkValueWrite is orders of magnitude slower than reading and writing a malloc’d block of memory.

Glulx’s malloc is pretty shoddily implemented, mind you. It’s shoddily implemented in C (or JS), rather than in I6 code, which is indeed an improvement, but I always wince when I think about it…

I still think that a perceptible slowdown when dealing with a 256-element table is not caused by searching the table. Sorting the table, maybe, if that’s a thing which is happening.

Exactly, and if it’s sorted anyway, you could use @binarysearch (though for 256 elements the difference should be imperceptible).
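For reference, the kind of search @binarysearch performs over a sorted array can be sketched as follows (a plain Python sketch of the algorithm, not the Glulx opcode itself):

```python
def binary_search(sorted_keys, x):
    """Return the index of x in sorted_keys, or -1 if absent.
    Roughly the search that Glulx's @binarysearch opcode performs
    over a sorted array: halve the candidate range each step."""
    lo, hi = 0, len(sorted_keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_keys[mid] == x:
            return mid
        if sorted_keys[mid] < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```

At 256 elements this is at most 8 comparisons versus up to 256 for a linear scan, which is why the difference is real but imperceptible at this scale.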

How much work would be involved in switching out BlkValue* for @malloc and @free? I.e. how deeply entrenched are they?

And zarf is of course correct, a linear search through 256 elements shouldn’t be causing a delay.

Zarf, that’s an implementation detail though, right? Just as there are multiple C malloc implementations, it would be possible to improve the ones in Glulxe, Git, and Quixe. But even without improvements, letting the terp handle it would be so much more performant than the BlkValue functions.

Draconis, it might be possible to do in an extension. But it would be close to a thousand replacements.

This is a shot in the dark, but would it be more efficient to use the table to define a kind of value and then look up things according to the kind of value? So instead of choosing a row with a foo value of bar and then getting the baz entry, you would define bar as one of many different foos and then look up the baz property that bar has. I don’t know if there are any circumstances in which that is quicker.

Also, instead of doing square-root operations, is it possible to work with the square of the value? When I was doing grid distance in Terminator, instead of determining the distance from (3, 3) to (4, 5) as sqrt((4 - 3)^2 + (5 - 3)^2) and then comparing that to 3, I would compute the squared distance (4 - 3)^2 + (5 - 3)^2 and compare that to 9. Just another shot in the dark: if you’re using square roots, you might be able to avoid floating-point arithmetic that way.
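The squared-distance trick above fits in a few lines (a Python sketch with made-up parameter names; the point is that no sqrt is ever taken):

```python
def within_range(ax, ay, bx, by, max_dist):
    """Compare squared distances so the check stays in integer math.
    Equivalent to sqrt(dx^2 + dy^2) <= max_dist, without the sqrt."""
    dx = bx - ax
    dy = by - ay
    return dx * dx + dy * dy <= max_dist * max_dist
```

With the example from the post: (3, 3) to (4, 5) gives a squared distance of 1 + 4 = 5, which is compared against 3^2 = 9.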

zarf’s hunch was right: it was something else. In this case, the algorithm relied on sorting. I’ve settled instead on a quick box pass: determine a box that would fit the radius, reject any x coordinate that doesn’t fit in it, then reject on the y coordinate, and finally do the expensive square root only for the remaining candidates. I thought I might have to do a Newton-Raphson variation if the idea didn’t pan out (I had all sorts of ideas on how to guesstimate the value, one being a table of precomputed square-root values), but as it turned out, none of that was needed.
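That box pass might look roughly like this (a Python sketch with hypothetical names; the real code is Inform 7):

```python
import math

def destinations_in_radius(points, cx, cy, r):
    """Cheap bounding-box rejection first; exact distance check only
    for the few candidates that survive the box pass."""
    results = []
    for x, y in points:
        # Box pass: reject anything outside the square enclosing the circle.
        if abs(x - cx) > r:
            continue
        if abs(y - cy) > r:
            continue
        # Only survivors pay for the expensive distance computation.
        if math.sqrt((x - cx) ** 2 + (y - cy) ** 2) <= r:
            results.append((x, y))
    return results
```

Most of the 256 destinations fall at the first or second comparison, so the sqrt runs only a handful of times per query.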

Anyway, much obliged to you guys for your insight. Some cool ideas in this thread.

Can you explain what you are actually trying to calculate? From the clues, it sounds like you are trying to determine whether something is contained within a circle? Or are you looking for the as the crow flies distance? Or something else?

There are two very old computer graphics algorithms that rely only on addition and subtraction to draw circles and line segments. I think they might be fairly easy to modify to estimate whether something is inside or outside a circle, or to estimate the distance between two points. They won’t be precise to the decimal point, as it is integer math only, but it might be good enough for the problem you want to solve. Doing the boxing first would still be a good idea.

See:

https://en.wikipedia.org/wiki/Midpoint_circle_algorithm

https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm
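To illustrate, Bresenham’s line algorithm really does get by on integer addition, subtraction, and comparison alone (a standard textbook sketch, not tied to the project in this thread):

```python
def bresenham(x0, y0, x1, y1):
    """Integer-only rasterization of the line from (x0, y0) to (x1, y1).
    The running error term replaces any multiplication or division."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return points
```

The midpoint circle algorithm works the same way, stepping around one octant with an integer error term and mirroring the result.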

Also, as an aside, I wonder whether the midpoint circle algorithm could be modified to provide trigonometry functions? Though I’m not sure anyone needs or wants that in their interactive fiction.

Can you explain what you are actually trying to calculate? From the clues, it sounds like you are trying to determine whether something is contained within a circle? Or are you looking for the as the crow flies distance? Or something else?

Basically, I have a 2D field with a bunch of coordinate points (“destinations”), and the player will be able to go a certain distance along that grid, always point to point. The function I was referring to earlier would return a list of all viable destinations. It made several unclever assumptions and managed to make the lookup of 256 destinations slow enough to be noticeable. My new one ditched the contortions and the overhead as well.

Those are cool links though. This wouldn’t be a good use of them, I think (we’re only concerned with endpoints here, as opposed to line or circumference shape), but I can think of a few projects that might use a fast line-draw algorithm or two. :slight_smile:

I tried to use trigonometry in IF once! I got a lot of pirate-related songs for Shufflecomp 2, and for some reason I decided both that this would be a good time to start on that Strange Adventures in Infinite Space-like I’ve always wanted to do, and that the first thing to tackle would be the sailing mechanics, including a bunch of stuff about the trigonometry of tacking. Needless to say, my Shufflecomp 2 entry never got made, but I did discover a bug in the trig functions (which makes me think no one else has ever actually used them).

Eleas, if the thing is that the player can go a certain distance point-to-point on the grid, won’t it be enough to check whether the square of the difference in x coordinates plus the square of the difference in the y coordinates is less than or equal to the square of the maximum allowable distance? Then you could do it all in cheap integer arithmetic.

Glad you have a solution that works for you. I haven’t thought about those algorithms since I used to dabble with 6502 assembly language a very long time ago. Thanks for triggering the nostalgia.

I don’t know Inform 7 real well, but it seems to me that maybe instead of using a table to store this data, this is a job for a data graph, a.k.a. relations. I’m not comfortable enough with Inform to figure out how to write that, but it seems like relations might be a good fit for linking teleportation pads together?

Been trying to improve my comp-sci skills lately and have been studying a (very) little graph theory. There is a concept where the line (edge, link, relation, whatever you want to call it) has a weight value associated with it, so that you can calculate the shortest distance between two nodes. For example, consider cities linked by highways. What is the shortest path between two cities? That’s different from the fewest hops between cities; you have to take into account how long the roads are. Question: does Inform 7 have the ability to weight the relations between things? What would that code look like?
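The shortest-path computation over such weighted edges is Dijkstra’s algorithm, sketched here in Python with a made-up highway network (the I7 side, storing weights in a table alongside a relation, is a separate question):

```python
import heapq

def shortest_path_cost(graph, start, goal):
    """Dijkstra's algorithm. graph maps node -> list of (neighbor, weight).
    Returns the minimum total weight from start to goal, or None."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for nbr, w in graph[node]:
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return None

# Hypothetical road network: fewest hops is not the shortest distance.
roads = {
    "A": [("B", 10), ("C", 50)],
    "B": [("A", 10), ("C", 15)],
    "C": [("A", 50), ("B", 15)],
}
```

Here the direct A-to-C road costs 50, but going via B costs only 10 + 15 = 25, which is exactly the fewest-hops-versus-shortest-path distinction from the post.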

I can see that. Also, charting a course with a sextant. It wouldn’t be a game I could play though. I can barely handle compass directions. :blush:


I’m sorry, I didn’t see your previous post. And yeah, I considered that (fuel expended when traveling would still force me to compute the distance traveled though). I considered a table-based Newton-Raphson, a log-based fast square root deal, some sort of octal representation to work it with bit shifts, partitioning the calculation over turns; lots of fun ideas.

Implementing them would probably have been fun as well, but pointless; the bottleneck wasn’t there. It’s a bad idea to optimize without measurement. Better to keep the routine simple, terse and expressive.

6502 always seemed like good clean fun. I’m doing 68k Asm through-and-through at the moment, and it’s surprisingly coherent.

It’s a thought. zarf may correct me here, but IIRC, Inform 7 implements the many-to-many relation via tables under the hood.

I’d love a link to that, because it sounds intriguing. As for I7 relations, they’re strictly binary states. The usual idiom that I’ve seen for scalar relations is to use indirection: you implement with tables and phrases, add the required number of computed relations, and you’ll have something that will act very much like what you want. The table itself will impose limitations on the design, however; you sacrifice the simplicity of just declaring a relation.

Oh yeah, if you want fuel expended or anything like that, you need the actual square roots.

I think many-to-many relations tend to be computationally expensive as far as I7 things go, because they have to store something for each pair of related things. Though as you said there’s no point in trying to optimize things if you haven’t profiled them.

This should get you started.

https://en.wikipedia.org/wiki/Directed_graph

https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm

Also take a look at the last example in the relationships chapter of the inform manual (244): http://inform7.com/learn/man/WI_13_16.html

Also can you clarify? Is your grid 16 x 16 (256 entries) or 256 x 256? If 16 x 16, have you thought about organizing a table this way with computed cost of the travel (time, distance, fuel, or whatever) as the values:

Table 2.1 - Travel Calculator 

Departure Alpha Beta Gamma Delta Epsilon
Alpha     0     10   20    30    40
Beta      10    0    10    20    30
Gamma     20    10   0     10    20
Delta     30    20   10    0     10
Epsilon   40    30   20    10    0

That should allow you to compute the cost of travel from the Beta quadrant to the Gamma quadrant, for example.

The numbers are just examples of course, but notice how they line up along diagonal lines and reflect across the central diagonal running from top left to bottom right. That wasn’t incidental, and you can check the accuracy of data entry that way. I suppose you could do a 256 x 256 table, but the data entry would be unpleasant.
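That symmetry check can be automated; here is a Python sketch of the lookup table above, with the quadrant names from the example, plus the data-entry sanity check the symmetry makes possible:

```python
# The example travel-cost table, indexed by quadrant name.
QUADRANTS = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
COST = [
    [0,  10, 20, 30, 40],
    [10, 0,  10, 20, 30],
    [20, 10, 0,  10, 20],
    [30, 20, 10, 0,  10],
    [40, 30, 20, 10, 0],
]

def travel_cost(src, dst):
    """Look up the precomputed cost of travel between two quadrants."""
    return COST[QUADRANTS.index(src)][QUADRANTS.index(dst)]

def is_symmetric(matrix):
    """Data-entry check: cost(a, b) must equal cost(b, a)."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(n))
```

Each lookup is then constant time, at the cost of precomputing (and hand-entering) all n^2 values.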

The first thing to note is that the I7 “real square root of…” function makes use of the native sqrt() function. It’s not slow.

(Integer square root does too, on Glulx.)

The implementation uses an I6 array. It’s memory-intensive, but I don’t think it’s CPU-intensive.