Game with defined Unicode translation table?

As the title says… I need an example game with a custom-defined Unicode translation table for testing a Z-machine disassembler.

1 Like

I think the z5 versions of Garry’s PunyInform games typically (always?) use a copyright symbol. I’m not sure if that means it has to be in the Unicode translation table (this is probably the part of the Z-machine that I’m the least familiar with).

1 Like

Does this count?

1 Like

My ZDevtools includes an assembler which will add a Unicode table whenever you make use of non-ASCII/ZSCII characters. For example:

start

print "Dvořák"
quit

This’ll create a 2-entry Unicode table. I’ve attached the output of this if a single example is enough.

unicode.z5 (728 Bytes)
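
To illustrate what an assembler has to do for a string like the one above, here is a minimal Python sketch. It is not the ZDevtools implementation: the function name `build_unicode_table` and the rule "each distinct character above ASCII 126 gets the next free ZSCII code, starting at 155, in order of first appearance" are assumptions made for illustration. The byte layout (one count byte, then one 16-bit big-endian word per entry) matches the table dumped later in this thread.

```python
def build_unicode_table(text):
    """Return (table_bytes, char_to_zscii) for the non-ASCII characters in text.

    Assumption (hypothetical policy): each distinct character above ASCII 126
    gets the next free ZSCII code starting at 155, in order of first appearance.
    """
    extras = []
    for ch in text:
        if ord(ch) > 126 and ch not in extras:
            extras.append(ch)
    if len(extras) > 97:  # ZSCII 155..251 leaves room for 97 entries
        raise ValueError("too many distinct non-ASCII characters")
    if any(ord(ch) > 0xFFFF for ch in extras):
        raise ValueError("table entries are 16-bit words")
    # Layout: one count byte, then one big-endian 16-bit word per entry.
    table = bytes([len(extras)]) + b"".join(
        ord(ch).to_bytes(2, "big") for ch in extras
    )
    mapping = {ch: 155 + i for i, ch in enumerate(extras)}
    return table, mapping


table, mapping = build_unicode_table("Dvořák")
print(table.hex())  # 02015900e1
print(mapping)      # {'ř': 155, 'á': 156}
```

Under these assumptions "Dvořák" yields two entries (ř and á), which is consistent with the 2-entry table mentioned above.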

2 Likes

That’s true, but I’ve been thinking of changing that.

The z5 version of The Mystery of Winchester High uses D@'un @'Eideann.

The z5 version of Search for the Lost Ark uses Andr@'e, resum@'e and sl@aegan.

I have no idea how or where these are stored. All I know is that it works. (The z3 versions use ASCII characters without diacritical marks.)

1 Like

These characters are in the default Unicode table (see §3.8.7 of The Z-Machine Standards Document), so they don’t require a game-specific Unicode table. The copyright symbol isn’t in it, though.
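
A quick way to check this for any string is sketched below in Python. The 69 characters are the default entries for ZSCII 155 to 223, in the order they appear in the table dump later in this thread; the helper names `default_zscii` and `needs_custom_table` are hypothetical.

```python
# The 69 default extra characters, in ZSCII order 155..223 (Standard §3.8.7).
DEFAULT_EXTRA = (
    "äöüÄÖÜß»«ëïÿËÏáéíóúýÁÉÍÓÚÝàèìòùÀÈÌÒÙ"
    "âêîôûÂÊÎÔÛåÅøØãñõÃÑÕæÆçÇþðÞÐ£œŒ¡¿"
)


def default_zscii(ch):
    """Return the default ZSCII code for ch, or None if it is not in the table."""
    i = DEFAULT_EXTRA.find(ch)
    return 155 + i if i >= 0 else None


def needs_custom_table(text):
    """Return the non-ASCII characters in text that the default table lacks."""
    missing = []
    for ch in text:
        if ord(ch) > 126 and ch not in DEFAULT_EXTRA and ch not in missing:
            missing.append(ch)
    return missing


print(default_zscii("é"))                           # 170
print(needs_custom_table("André, resumé, slægan"))  # []
print(needs_custom_table("Copyright © 2023"))       # ['©']
```

An empty result means the default table is enough; anything returned has to be supplied by a game-specific table.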

1 Like

Thanks. As I said, I didn’t know where they were stored. I suspected they weren’t anything special.

2 Likes

This is Search for the Lost Ark

C:\Users\heasm\Downloads>unz search-for-the-lost-ark.z5 -m -i -u

***** ANALYZING *****

Filename:                                  search-for-the-lost-ark.z5
Compiled With:                             Inform 6.41
Z-machine version:                         5
Calculated checksum:                       0x9EA3, checksum ok
IFID:                                      UUID://20c07a3d-6486-4e59-a560-a606edecf9e0//
Object count:                              151
Unique verbs count:                        105
Grammar table version:                     2
Verb action count:                         113
Dictionary word count:                     630
Scanning for routines from:                0x03EA8
Found first routine at address:            0x03F00
Lowest routine address (immediate call) :  0x03F64
Highest routine address (immediate call):  0x0B528
Lowest string address (immediate address): 0x0B59C
Strings start at address:                  0x0B59C
Highest used global in z-code:             240
Number of used globals in z-code:          107
Number of unique properties:               26

***** MEMORY MAP *****

00000-02191 DYNAMIC MEMORY
   00000-0003F Header table, 64 bytes.
   00040-001F9 Abbreviation strings, 442 bytes.
   001FA-002B9 Abbreviation table, 192 bytes.
   002BA-002C1 Header extension table, 8 bytes.
   002C2-0034F Unidentified data, 142 bytes.
   00350-003CD Object defaults table, 126 bytes.
   003CE-00C0F Object tree table, 2,114 bytes.
   00C10-0192A Object properties tables, 3,355 bytes.
   0192B-01B6E Unidentified data (Class, indiv. prop & symbol table), 580 bytes.
   01B6F-01D4E Global variables, 480 bytes.
   01D4F-01D7C IFID, 46 bytes.
   01D7D-02190 Unidentified data (Arrays), 1,044 bytes.
   02191-02191 Terminating characters table, 1 byte.

02192-03EFF STATIC MEMORY
   02192-02263 Syntax/Grammar table, 210 bytes.
   02264-02A0B Syntax/Grammar table data, 1,960 bytes.
   02A0B-02AEC Action table, 226 bytes.
   02AED-02AEE Preposition/Adjective table, 2 bytes.
   02AEF-03EA5 Vocabulary/Dictionary, 5,047 bytes.
   03EA6-03EFF Unidentified data (Static arrays), 90 bytes.

03F00-12D73 HIGH MEMORY
   03F00-0B59B Z-code, 30,364 bytes.
   0B59C-12D73 Static strings, 30,680 bytes.


***** HEADER (00000-0003F, 64 bytes) *****

00000 05                      VERSION Z-machine version:         5
00001 00                      MODE    Flags 1:                   0x00
00002 00 01                   ZORKID  Release number:            1
00004 3F 00                   ENDLOD  Base of high memory:       0x3F00
00006 3F 01                   START   Initial value of pc:       0x3F01
00008 2A EF                   VOCAB   Dictionary:                0x2AEF
0000A 03 50                   OBJECT  Object table:              0x0350
0000C 1B 6F                   GLOBALS Global variables table:    0x1B6F
0000E 21 92                   PURBOT  Base of static memory:     0x2192
00010 00 50                   FLAGS   Flags 2:                   0x0050
00012 32 33 30 36 32 39       SERIAL  Serial number:             230629
00018 01 FA                   FWORDS  Abbreviations table:       0x01FA
0001A 4B 5D                   PLENTH  Length of file:            0x12D74
0001C 9E A3                   PCHKSM  Checksum of file:          0x9EA3
0001E 00                      INTWRD  Interpreter number:        0
0001F 00                              Interpreter version:       0
00020 00                      SCRWRD  Screen height (lines):     0
00021 00                              Screen width (chars):      0
00022 00 00                   HWRD    Screen width in units:     0x0000
00024 00 00                   VWRD    Screen width in units:     0x0000
00026 00                      FWRD    Font width/height:         0
00027 00                              Font width/height:         0
00028 00 00                   FOFF    Routines offset:           0x0000
0002A 00 00                   SOFF    Static strings offset:     0x0000
0002C 00                      CLRWRD  Default backgr. color:     0
0002D 00                              Default foregr. color:     0
0002E 21 91                   TCHARS  Terminating chars table:   0x2191
00030 00 00                   TWID    Output loc for DIROUT:     0x0000
00032 00 00                           Standard revision number:  0x0000
00034 00 00                   CHRSET  Alphabet table address:    0x0000
00036 02 BA                   EXTAB   Header extension address:  0x02BA
00038 00 00 00 00 36 2E 34 31 USRNM   Username:                  ....6.41


***** HEADER EXTENSION TABLE (002BA-002C1, 8 bytes) *****

002BA 00 03                   Number of further words:                  3
002BC 00 00                   MSLOCX  X-coord of mouse after a click:   0x0000
002BE 00 00                   MSLOCX  Y-coord of mouse after a click:   0x0000
002C0 02 C2                           Unicode tranlation table address: 0x02C2


***** UNIDENTIFIED DATA (002C2-0034F, 142 bytes) *****

002C0       46 00 E4 00 F6 00 FC 00 C4 00 D6 00 DC 00    F.............
002D0 DF 00 BB 00 AB 00 EB 00 EF 00 FF 00 CB 00 CF 00  ................
002E0 E1 00 E9 00 ED 00 F3 00 FA 00 FD 00 C1 00 C9 00  ................
002F0 CD 00 D3 00 DA 00 DD 00 E0 00 E8 00 EC 00 F2 00  ................
00300 F9 00 C0 00 C8 00 CC 00 D2 00 D9 00 E2 00 EA 00  ................
00310 EE 00 F4 00 FB 00 C2 00 CA 00 CE 00 D4 00 DB 00  ................
00320 E5 00 C5 00 F8 00 D8 00 E3 00 F1 00 F5 00 C3 00  ................
00330 D1 00 D5 00 E6 00 C6 00 E7 00 C7 00 FE 00 F0 00  ................
00340 DE 00 D0 00 A3 01 53 01 52 00 A1 00 BF 00 A9 00  ......S.R.......

As you can see in the header extension table, there’s a Unicode translation table at 0x02C2 with 0x46 (70) word entries. The first 69 are the defaults defined in §3.8.7 (U+00E4 through U+00BF); the last entry, ‘©’ (U+00A9), is the one added for this game.
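
The lookup chain described here (header word at 0x36, then word 3 of the header extension table, then a count byte followed by 16-bit entries) can be sketched in Python as follows. The function name `read_unicode_table` is hypothetical, and the tiny story image is synthetic test data, not taken from a real game; the sketch also assumes a version 5 or later file and does no bounds checking.

```python
import struct


def read_unicode_table(story):
    """Return the game's Unicode table as a list of code points, or None."""
    # Header word at 0x36: address of the header extension table.
    (ext_addr,) = struct.unpack_from(">H", story, 0x36)
    if ext_addr == 0:
        return None
    # Word 0 of the extension table: number of further words.
    (n_words,) = struct.unpack_from(">H", story, ext_addr)
    if n_words < 3:
        return None
    # Word 3: address of the Unicode translation table (0 means none).
    (table_addr,) = struct.unpack_from(">H", story, ext_addr + 2 * 3)
    if table_addr == 0:
        return None
    count = story[table_addr]  # one count byte, then `count` big-endian words
    return list(struct.unpack_from(">%dH" % count, story, table_addr + 1))


# Synthetic image using the same addresses as the dump above.
story = bytearray(0x400)
struct.pack_into(">H", story, 0x36, 0x02BA)
struct.pack_into(">4H", story, 0x02BA, 3, 0, 0, 0x02C2)
story[0x02C2:0x02C7] = bytes([2, 0x01, 0x59, 0x00, 0xE1])

table = read_unicode_table(bytes(story))
for i, cp in enumerate(table):
    print("ZSCII #%d = U+%04X %r" % (155 + i, cp, chr(cp)))
```

Entry number i (counting from 0) corresponds to ZSCII code 155 + i.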

(There’s an odd extra byte between the Unicode table and the object defaults table that needs investigating…)

Excellent, thank you, all!

1 Like

The mystery of the extra byte is solved. I looked in the Inform 6 source code and found this comment…

    /* The object table must be word-aligned. The Z-machine spec does not
       require this, but the RA__Pr() veneer routine does.
    */
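
The arithmetic behind that padding byte can be checked with a few lines of Python. The addresses come from the memory map above; `align_word` is a hypothetical helper, and how the compiler actually performs the rounding is an assumption.

```python
def align_word(addr):
    """Round addr up to the next even (word-aligned) address."""
    return (addr + 1) & ~1


table_start = 0x02C2
table_size = 1 + 2 * 70               # count byte + 70 words = 141 bytes
next_free = table_start + table_size  # 0x034F, an odd address
object_table = align_word(next_free)  # 0x0350, where the object table starts
padding = object_table - next_free    # 1 byte of padding

print(hex(next_free), hex(object_table), padding)  # 0x34f 0x350 1
```
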
2 Likes

Now I have the table decoded. Here is an example from Search for the Lost Ark.

***** UNICODE TRANSLATION TABLE (002C2-0034E, 141 bytes) *****

002C2 46                       Number of entries: 70
002C3 00 E4                    ZSCII #155 = U+00E4 'ä'
002C5 00 F6                    ZSCII #156 = U+00F6 'ö'
002C7 00 FC                    ZSCII #157 = U+00FC 'ü'
002C9 00 C4                    ZSCII #158 = U+00C4 'Ä'
002CB 00 D6                    ZSCII #159 = U+00D6 'Ö'
002CD 00 DC                    ZSCII #160 = U+00DC 'Ü'
002CF 00 DF                    ZSCII #161 = U+00DF 'ß'
002D1 00 BB                    ZSCII #162 = U+00BB '»'
002D3 00 AB                    ZSCII #163 = U+00AB '«'
002D5 00 EB                    ZSCII #164 = U+00EB 'ë'
002D7 00 EF                    ZSCII #165 = U+00EF 'ï'
002D9 00 FF                    ZSCII #166 = U+00FF 'ÿ'
002DB 00 CB                    ZSCII #167 = U+00CB 'Ë'
002DD 00 CF                    ZSCII #168 = U+00CF 'Ï'
002DF 00 E1                    ZSCII #169 = U+00E1 'á'
002E1 00 E9                    ZSCII #170 = U+00E9 'é'
002E3 00 ED                    ZSCII #171 = U+00ED 'í'
002E5 00 F3                    ZSCII #172 = U+00F3 'ó'
002E7 00 FA                    ZSCII #173 = U+00FA 'ú'
002E9 00 FD                    ZSCII #174 = U+00FD 'ý'
002EB 00 C1                    ZSCII #175 = U+00C1 'Á'
002ED 00 C9                    ZSCII #176 = U+00C9 'É'
002EF 00 CD                    ZSCII #177 = U+00CD 'Í'
002F1 00 D3                    ZSCII #178 = U+00D3 'Ó'
002F3 00 DA                    ZSCII #179 = U+00DA 'Ú'
002F5 00 DD                    ZSCII #180 = U+00DD 'Ý'
002F7 00 E0                    ZSCII #181 = U+00E0 'à'
002F9 00 E8                    ZSCII #182 = U+00E8 'è'
002FB 00 EC                    ZSCII #183 = U+00EC 'ì'
002FD 00 F2                    ZSCII #184 = U+00F2 'ò'
002FF 00 F9                    ZSCII #185 = U+00F9 'ù'
00301 00 C0                    ZSCII #186 = U+00C0 'À'
00303 00 C8                    ZSCII #187 = U+00C8 'È'
00305 00 CC                    ZSCII #188 = U+00CC 'Ì'
00307 00 D2                    ZSCII #189 = U+00D2 'Ò'
00309 00 D9                    ZSCII #190 = U+00D9 'Ù'
0030B 00 E2                    ZSCII #191 = U+00E2 'â'
0030D 00 EA                    ZSCII #192 = U+00EA 'ê'
0030F 00 EE                    ZSCII #193 = U+00EE 'î'
00311 00 F4                    ZSCII #194 = U+00F4 'ô'
00313 00 FB                    ZSCII #195 = U+00FB 'û'
00315 00 C2                    ZSCII #196 = U+00C2 'Â'
00317 00 CA                    ZSCII #197 = U+00CA 'Ê'
00319 00 CE                    ZSCII #198 = U+00CE 'Î'
0031B 00 D4                    ZSCII #199 = U+00D4 'Ô'
0031D 00 DB                    ZSCII #200 = U+00DB 'Û'
0031F 00 E5                    ZSCII #201 = U+00E5 'å'
00321 00 C5                    ZSCII #202 = U+00C5 'Å'
00323 00 F8                    ZSCII #203 = U+00F8 'ø'
00325 00 D8                    ZSCII #204 = U+00D8 'Ø'
00327 00 E3                    ZSCII #205 = U+00E3 'ã'
00329 00 F1                    ZSCII #206 = U+00F1 'ñ'
0032B 00 F5                    ZSCII #207 = U+00F5 'õ'
0032D 00 C3                    ZSCII #208 = U+00C3 'Ã'
0032F 00 D1                    ZSCII #209 = U+00D1 'Ñ'
00331 00 D5                    ZSCII #210 = U+00D5 'Õ'
00333 00 E6                    ZSCII #211 = U+00E6 'æ'
00335 00 C6                    ZSCII #212 = U+00C6 'Æ'
00337 00 E7                    ZSCII #213 = U+00E7 'ç'
00339 00 C7                    ZSCII #214 = U+00C7 'Ç'
0033B 00 FE                    ZSCII #215 = U+00FE 'þ'
0033D 00 F0                    ZSCII #216 = U+00F0 'ð'
0033F 00 DE                    ZSCII #217 = U+00DE 'Þ'
00341 00 D0                    ZSCII #218 = U+00D0 'Ð'
00343 00 A3                    ZSCII #219 = U+00A3 '£'
00345 01 53                    ZSCII #220 = U+0153 'œ'
00347 01 52                    ZSCII #221 = U+0152 'Œ'
00349 00 A1                    ZSCII #222 = U+00A1 '¡'
0034B 00 BF                    ZSCII #223 = U+00BF '¿'
0034D 00 A9                    ZSCII #224 = U+00A9 '©'

And the strings print correctly:

***** STATIC STRINGS (0B59C-12D73, 30,680 bytes) *****

0B59C S0001 "S{ear}ch{ for}{ the }Lost Ark"
0B5AC S0002 "^Copyr{ight} © 2023 Garry Francis^Type ABOUT{ for} fur{the}r{ in}fo{ and }credits{.^^}"
0B5E4 S0003 "André"
0B5EC S0004 "resumé"
0B5F4 S0005 "slægan"
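
For anyone who wants to reproduce the character mapping, here is a small Python sketch that parses the raw bytes from the hex dump earlier in the thread and maps ZSCII codes to text. It only covers the last step: unpacking Z-strings into ZSCII codes (5-bit Z-characters, abbreviations) is not shown, so the code lists below were written out by hand, and the function name `zscii_to_text` is hypothetical.

```python
# Bytes 0x02C2..0x034F copied from the hex dump (the last byte is padding).
HEX_DUMP = """
46 00 E4 00 F6 00 FC 00 C4 00 D6 00 DC 00
DF 00 BB 00 AB 00 EB 00 EF 00 FF 00 CB 00 CF 00
E1 00 E9 00 ED 00 F3 00 FA 00 FD 00 C1 00 C9 00
CD 00 D3 00 DA 00 DD 00 E0 00 E8 00 EC 00 F2 00
F9 00 C0 00 C8 00 CC 00 D2 00 D9 00 E2 00 EA 00
EE 00 F4 00 FB 00 C2 00 CA 00 CE 00 D4 00 DB 00
E5 00 C5 00 F8 00 D8 00 E3 00 F1 00 F5 00 C3 00
D1 00 D5 00 E6 00 C6 00 E7 00 C7 00 FE 00 F0 00
DE 00 D0 00 A3 01 53 01 52 00 A1 00 BF 00 A9 00
"""
RAW = bytes.fromhex("".join(HEX_DUMP.split()))

count = RAW[0]  # 0x46 = 70 entries
TABLE = [int.from_bytes(RAW[1 + 2 * i:3 + 2 * i], "big") for i in range(count)]


def zscii_to_text(codes, table):
    """Map ZSCII codes to text; only printable ASCII and extra characters."""
    out = []
    for c in codes:
        if 32 <= c <= 126:
            out.append(chr(c))
        elif 155 <= c < 155 + len(table):
            out.append(chr(table[c - 155]))
        else:
            out.append("?")  # newline and other codes are not handled here
    return "".join(out)


print(zscii_to_text([65, 110, 100, 114, 170], TABLE))       # André
print(zscii_to_text([115, 108, 211, 103, 97, 110], TABLE))  # slægan
print(zscii_to_text([224], TABLE))                          # ©
```
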
2 Likes