And again a year is nearing its end. Like last year and the year before, I’d like to turn my gaze inwards.

A lot of things happened with xoreos this past year, albeit most of them hidden and “under the hood”:

  • I wrote about disassembling NWScript bytecode. The tasks I mentioned there are still open, too. If anybody wants to take them up, I’d be happy to explain them in more detail :).
  • We released xoreos 0.0.4, nicknamed “Chodo”. That was the only release of xoreos in 2016. xoreos 0.0.4 included some minor fixes and features for Neverwinter Nights, and the xoreos-tools package included the new NWScript disassembler.
  • In April, I reached a streak of a full year of daily xoreos commits. Due to some real life things, I had to take a break there, though. I’m now again at three months of daily commits, but there is a three-month “hole” between April and August.

GitHub contribution graph in April

GitHub contribution graph in November

  • Farmboy0 fleshed out the Jade Empire engine a bit, mostly in the scripts department.
  • Supermanu implemented a huge chunk of the character generator for Neverwinter Nights.
  • Farmboy0 fixed a glitch in the Neverwinter Nights animation system that has plagued xoreos for quite some time: the animation scaling in various creature models was off. This led to, for example, the head and arms of elves detaching from the body during the yawn animation.
  • I then implemented a few more animation script functions, too, which is especially noticeable in the intro animation for Hordes of the Underdark. I also fixed a mistake in the keyframe interpolation. This takes care of another glitch in Neverwinter Nights: model nodes rotating the wrong way around.
  • smbas added support for Lua scripts in The Witcher. A lot of the initialization code that sets up the classes and functions The Witcher expects to find is still missing, so nothing obvious is visible as of yet.
  • Farmboy0 moved the window handling from the GraphicsManager into a new WindowManager class, making the code more readable.
  • I fundamentally restructured our build system, or at least the autotools part of it (xoreos can be built using either autotools or CMake). Previously, we used a recursive autotools setup, where make recurses into each subdirectory. This is, unfortunately, pretty slow, among other drawbacks. I changed it to be non-recursive now, with the top-level Makefile instead being created using (recursive) includes.
  • I then introduced various smart pointer templates into the codebase, making it easier to read and easier to keep track of memory allocations.
  • berenm added AppVeyor integration. Like Travis CI (which we already use as well), AppVeyor is a continuous integration service. This means that every single commit to the public xoreos repository will now be built on Microsoft Windows, using Microsoft Visual Studio 2015, in addition to gcc and clang on GNU/Linux (via Travis CI). This ensures that any compilation breakage on these systems is immediately visible and can be fixed at once.
  • GitHub added a new feature, “Projects”, which provides Kanban-like boards of tasks. I took the time to fill the xoreos Projects page with boards for tasks from our TODO list.
  • There were of course also various clean-ups, minor fixes and expanded code documentation.

Animation with glitch | Animation without glitch | Animations in the HotU intro

Additionally, there are several tasks currently being worked on, among them:

  • Supermanu is looking into pathfinding.
  • mirv is still working on rewriting the OpenGL renderer.
  • I am currently writing unit tests for the xoreos codebase, using Google Test. I already found multiple issues, bugs, and corner cases while adding them.

From my side of things, my current plan is to make my unit tests branch public some time in December. I’ll write a small announcement here about it then. A new release of xoreos, 0.0.5, should follow early next year.

As always, this all wouldn’t have been possible without a lot of people. For them I am thankful.

  • Farmboy0, for various fixes, implementations and file format spelunking.
  • Supermanu, for his character generator work and pathfinding research.
  • mirv, for continuing to work on the OpenGL rewrite.
  • smbas, for his work on Lua and The Witcher.
  • berenm, for the AppVeyor integration and CMake knowledge.
  • TC01, for writing a Fedora specfile for the xoreos projects.
  • CromFr, for taking a stab at the walkmesh structure in NWN2’s TRN files.
  • clone2727, for invaluable ideas and corrections.
  • The folks at GamingOnLinux, who continue to be a great resource for all things related to Games on Linux.

I am also thankful for all the people who take the time to explain things to others, people who write interesting, useful or needed articles, and people who provide mentoring and help. Relatedly: a week ago, Stephanie Hurlburt published an article listing engineers who are willing to mentor or answer programming/engineering questions. I for one think that’s a really great idea. Please take a look at that article.

And now, let’s see what the next year has in store for us. If you, however, found all this terribly interesting and would like to help with our little project, then please, feel free to contact us! :)

A new year, a new release: we are proud to announce the release of version 0.0.4, nicknamed “Chodo”, of xoreos and xoreos-tools.

In this release, Neverwinter Nights now shows speech bubbles for conversation one-liners, as used for cutscenes, bark strings and short NPC dialogues. Additionally, the premium modules BioWare sold for Neverwinter Nights, including the three that come with the Diamond Edition, can now be properly loaded and started.

Speech bubbles

An oversight in the handling of the texture fonts used in Neverwinter Nights and the two Knights of the Old Republic games has been fixed. This oversight broke rendering of certain characters, most prominently of those used in eastern European languages and the “smart” single quotation mark that’s used instead of an apostrophe in some strings found in the French versions.

For xoreos-tools, there are two new tools: fixpremiumgff and ncsdis.

The first tool, fixpremiumgff, can restore the deliberately broken GFF files found in the BioWare premium modules for Neverwinter Nights. The resulting GFF files can then be edited as normal.

The second tool, ncsdis, is a disassembler for the stack-based bytecode of BioWare’s NWScript scripting language. It supports the scripts of all games targeted by xoreos and can disassemble them into a full assembly listing. It can also produce a control flow graph in the DOT description language, which can then be plotted into an image by using the dot tools from the GraphViz suite.

Moreover, this release includes a lot of user-invisible code documentation and quality fixes, in both xoreos and xoreos-tools.

Binaries for Windows, GNU/Linux and Mac OS X are attached to the GitHub release, here for xoreos and here for xoreos-tools. Additionally, packages for various GNU/Linux distributions can be found on the OpenSuSE Build Service (here for xoreos, here for xoreos-tools) and in Arch Linux’s AUR (here for xoreos, here for xoreos-tools).

Alternatively, the repository and the source tarballs contain PKGBUILD files in dists/arch/ and a debian build directory in dists/debian/, which can be used to build Arch Linux and Debian/Ubuntu packages, respectively.

And as always, we’re looking for more developers to join us in our efforts to reimplement those 3D BioWare RPGs. If you would like to help, please feel free to contact us. :)

As I already said in last year’s retrospective, I want to write a bit about NWScript and its bytecode.

First of all, what is NWScript? NWScript is the scripting language and system BioWare introduced with Neverwinter Nights and used, with improvements and changes, throughout the Aurora-based games. Specifically, you can find NWScript driving the high-level game logic of Neverwinter Nights, Neverwinter Nights 2, Knights of the Old Republic, Knights of the Old Republic II, Jade Empire, The Witcher (in combination with Lua), Dragon Age: Origins and Dragon Age II. This is nearly every single game xoreos targets. The only exception is the Nintendo DS game, Sonic Chronicles: The Dark Brotherhood, which doesn’t seem to use any scripts at all.

NWScript is written in a C-like language and saved with the .nss extension. A compiler then translates it into a stack-based bytecode with the .ncs extension, which is what the game executes. That is similar to how ActionScript in Flash videos works, and how Java, Lua and other scripting languages can operate.

Like C, NWScript is a strongly typed language: each variable has one definite type. Among the available types are “int” (32-bit signed integer), “float” (32-bit IEEE floating point), “string” (a textual ASCII string) and “object” (an object in the game world, like an NPC or a chest). Moreover, there are several engine types, like “event” and “effect”, though which of these are available depends on the game. There are also structs, but in the compiled bytecode, they vanish and are replaced by a collection of loose variables. Likewise, the “vector” type is treated as three single float variables. A special type is the “action” type, a script state (or functor) that’s stored separately.

Additionally, Dragon Age: Origins adds a “resource” type (which, in the bytecode, can be used interchangeably with the “string” type) and dynamic arrays. Dragon Age II in turn adds the concept of a reference to a variable to help performance in several use-cases. For these new features, these two games each add two new bytecode opcodes, something not done for any of the other post-Neverwinter Nights games.

To get and modify the state of the game world, like searching for objects and moving them, and for complex mathematical operations like trigonometry functions, the NWScript bytecode can call so-called engine functions. These are functions that are provided by the game itself; about 850 per game, with some overlap. They’re declared in the nwscript.nss file (nwscriptdefn.nss for The Witcher and script.ldf for the Dragon Age games) of each game.

The original Neverwinter Nights toolset came with a compiler, but a part of the modding community, the OpenKnights Consortium, created their own, free software compiler, nwnnsscomp. Unfortunately, it has a few disadvantages. For example, it always needs the nwscript.nss file and it also only handles Neverwinter Nights. And while there have been several variations that have been extended to handle newer games, many of these are only available as Windows binaries. As far as I’m aware, there has never been a variation that handles Dragon Age: Origins or Dragon Age II. Also, since the code hasn’t been touched for 10 years, it’s difficult to compile now, and it doesn’t work when compiled for 64-bit. For what it’s worth, I mirrored the old OpenKnights NWNTools, with a few changes, to GitHub here.

This nwnnsscomp also has a disassembly mode, which can convert the compiled bytecode into somewhat human-readable assembly. This is pretty useful! I wanted my own disassembler in xoreos-tools.

The steps to disassemble NWScript bytecode are the following:

1) Read instructions

Read the .ncs file, instruction for instruction. An instruction consists of an opcode (like ADD for addition), the argument types (which are taken from or pushed onto the stack) and any direct arguments. For example, an addition that operates on two integers would be known as ADDII. The instructions are stored in a list, one after the other.
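As a rough illustration of this step, here is a minimal Python sketch that decodes a few instructions. It is not how ncsdis is implemented. The opcode bytes, type bytes and big-endian argument layouts are only the ones that can be read off the listing excerpt further down in this post; the names `OPCODES`, `TYPE_SUFFIX` and `read_instructions` are made up for this example, and a real decoder needs the complete opcode table.

```python
import struct

# Opcode and type bytes as they appear in the listing excerpt in this post.
# This is an incomplete, illustrative table.
OPCODES = {0x01: "CPDOWNSP", 0x02: "RSADD", 0x03: "CPTOPSP",
           0x04: "CONST", 0x05: "ACTION", 0x1B: "MOVSP"}
TYPE_SUFFIX = {0x03: "I", 0x06: "O"}  # integer, object


def read_instructions(data, base=0):
    """Decode instructions one after the other into a list of
    (offset, mnemonic, direct arguments) tuples."""
    instructions = []
    pos = 0
    while pos < len(data):
        start = pos
        opcode, type_byte = data[pos], data[pos + 1]
        pos += 2
        name = OPCODES[opcode]
        if name in ("CPDOWNSP", "CPTOPSP"):
            # 32-bit signed stack offset, 16-bit size
            args = struct.unpack_from(">ih", data, pos)
            pos += 6
        elif name == "MOVSP":
            args = struct.unpack_from(">i", data, pos)
            pos += 4
        elif name == "CONST":
            # Only integer constants are handled in this sketch
            args = struct.unpack_from(">i", data, pos)
            pos += 4
            name += TYPE_SUFFIX[type_byte]
        elif name == "ACTION":
            # 16-bit engine function ID, 8-bit argument count
            args = struct.unpack_from(">HB", data, pos)
            pos += 3
        else:
            # RSADD: the type byte selects the variable type
            args = ()
            name += TYPE_SUFFIX[type_byte]
        instructions.append((base + start, name, args))
    return instructions


# The first four instructions of the listing excerpt, starting at 0x17
excerpt = bytes.fromhex("0206 050000EE00 0101FFFFFFF80004 1B00FFFFFFFC")
for offset, name, args in read_instructions(excerpt, base=0x17):
    print("%08X %s %s" % (offset, name, " ".join(str(a) for a in args)))
```

Note that the sketch prints the numeric engine function ID for ACTION. Resolving it to a name like GetPCSpeaker requires the function table of the specific game.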

2) Link instructions

Each instruction may have a follower, the instruction that follows naturally. For most instructions, this is the instruction next in the list. But certain branching instructions, such as jumps and subroutine calls, also have jump destinations that may be taken.

3) Create blocks

Group the instructions into blocks. A block is a sequence of instructions that follow each other, with two constraints: a jump into a block can only land at the beginning of a block and a jump out of a block can only happen at the end of the block.
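The two constraints can be sketched in Python as follows. This is a simplified illustration, not the ncsdis code: the tuple layout `(address, mnemonic, jump_target)` and the function name `create_blocks` are assumptions for this example, and every instruction with a jump destination (as well as RETN) is assumed to end its block.

```python
def create_blocks(instructions):
    """Group instructions into blocks.

    instructions: list of (address, mnemonic, jump_target or None) tuples
    in file order. Returns a list of blocks, each a list of addresses.
    """
    if not instructions:
        return []

    # An address that starts a block: the first instruction, ...
    leaders = {instructions[0][0]}
    for index, (address, mnemonic, target) in enumerate(instructions):
        if target is not None:
            # ... every jump destination (jumps only land at a block start) ...
            leaders.add(target)
        if target is not None or mnemonic == "RETN":
            # ... and whatever follows a branch (jumps only leave at a block end).
            if index + 1 < len(instructions):
                leaders.add(instructions[index + 1][0])

    blocks = []
    for address, mnemonic, target in instructions:
        if address in leaders:
            blocks.append([])
        blocks[-1].append(address)
    return blocks


# A made-up instruction list with one conditional and one unconditional jump
example = [
    (0, "CONSTI", None),
    (6, "JZ", 24),
    (12, "CONSTI", None),
    (18, "JMP", 30),
    (24, "CONSTI", None),
    (30, "RETN", None),
]
print(create_blocks(example))
```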

4) Create subroutines

Group the blocks into subroutines. A subroutine is a sequence of blocks that gets jumped to by a special opcode, JSR, and returns to the place from where it has been called with RETN. (In many programming languages, for example C, these are also called functions, but we’re calling them subroutines so that they’re not being confused with engine functions. Subroutine is also often the usual term in assembly dialects.)

5) Link subroutines

Record where a subroutine calls another and link the caller and callee, so that a call graph could be created easily. Likewise, the instructions that start and end the subroutine are also separately recorded.

6) Identify “special” subroutines

There are three special subroutines that we can identify:

  • _start(), the very first subroutine that starts execution of the script. It’s the subroutine that contains the very first instruction in the .ncs file.
  • _global(), which, if it exists, sets up the global variables. This is the subroutine that contains an instruction with a SAVEBP opcode.
  • main(), which is the main or StartingConditional function visible in the script source. If a _global() subroutine exists, this is the last subroutine called by _global(). Otherwise, it’s the last (and only) subroutine called by _start().
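These three rules translate almost directly into code. The following Python sketch assumes a hypothetical data model in which each subroutine is stored under its start address, with the list of opcodes it contains and the list of subroutines it calls, in call order. The addresses in the example are invented.

```python
def identify_special(subroutines):
    """Return a dict mapping "_start", "_global" and "main" to addresses.

    subroutines: dict mapping start address -> {"opcodes": [...], "calls": [...]}
    """
    special = {}

    # _start() contains the very first instruction in the file
    start = min(subroutines)
    special["_start"] = start

    # _global(), if it exists, is the subroutine containing a SAVEBP
    for address in sorted(subroutines):
        if "SAVEBP" in subroutines[address]["opcodes"]:
            special["_global"] = address
            break

    # main() is the last subroutine called by _global(), or by _start()
    # when there is no _global()
    caller = special.get("_global", start)
    calls = subroutines[caller]["calls"]
    if calls:
        special["main"] = calls[-1]
    return special


with_globals = {
    0x0D: {"opcodes": ["JSR", "RETN"], "calls": [0x15]},
    0x15: {"opcodes": ["RSADDI", "SAVEBP", "JSR", "RETN"], "calls": [0x40]},
    0x40: {"opcodes": ["RETN"], "calls": []},
}
print(identify_special(with_globals))
```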

7) Analyze the stack

This step goes through the whole script and evaluates each instruction for how it manipulates the stack. Since stack elements are explicitly typed, and instructions that create new stack elements know which type they create (either explicitly, or implicitly by copying an already typed element), both the size of the stack and the type of all elements can be figured out. At the end, each instruction will know how the stack looks before its execution. And for each subroutine, we then know its signature: how many parameters it takes, what their types are and what the subroutine returns.

However, this step only works if we know which game this script is from, because we need to know the signatures of the engine functions. And, unfortunately, this step fails when the script subroutines call each other recursively. The stack of a recursing script can’t be analyzed like this.

8) Analyze the control flow

So far, the script disassembly consists of blocks that jump to each other, with no further meaning attached. To extract this deeper meaning, we analyze the control flow for higher-level control structures: do-while loops, while loops, if and if-else blocks, together with break and continue statements and early subroutine returns. Each block gets a list of designators that show if and how it contributes to such a control structure.

9) Create output

Finally, we create an output. This can be one of three:

  • A full listing, including offset, raw bytes and disassembly. This is similar to the output created by the disassembly mode of the OpenKnights nwnnsscomp.
  • Only the disassembly. This output might be used to reassemble the script some time later, should someone want to write an assembler.
  • A DOT file. A DOT file is a textual description of a graph, which can be plotted into a graph with the Graphviz tool. The result is a clear representation of the control flow in graph form.
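To give an idea of what the DOT output amounts to, here is a small Python sketch that writes a graph description from a list of edges between blocks. The function name `blocks_to_dot` and the edge list are invented for this example; the files ncsdis produces contain far more detail, such as the instructions inside each block.

```python
def blocks_to_dot(edges, name="script"):
    """Return a DOT description of a directed graph.

    edges: list of (source block label, destination block label) pairs.
    """
    lines = ["digraph %s {" % name]
    for source, destination in edges:
        lines.append('  "%s" -> "%s";' % (source, destination))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edges between three blocks
edges = [
    ("loc_00000057", "loc_00000080"),
    ("loc_00000057", "loc_00000060"),
    ("loc_00000060", "loc_00000080"),
]
print(blocks_to_dot(edges))
```

Saved to a file, such a description can be plotted with the Graphviz dot tool, for example with `dot -Tpng script.dot -o script.png`.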

As an example, here’s a script from Neverwinter Nights 2: this is the original source, and this is the full listing output, the assembly-only output and the control flow graph.

During this work, I have found a few interesting little bugs in the original BioWare NWScript compiler. For example, this little script, hf_c_skill_value.nss in Neverwinter Nights 2 and its disassembly:

int StartingConditional(int nSkill, int nValue)
{
  object oPC = GetPCSpeaker();
  int nSkill = GetSkillRank(nSkill, oPC);
  return (nSkill >= nValue);
}

  00000017 02 06                      RSADDO
  00000019 05 00 00EE 00              ACTION GetPCSpeaker 0
  0000001E 01 01 FFFFFFF8 0004        CPDOWNSP -8 4
  00000026 1B 00 FFFFFFFC             MOVSP -4
  0000002C 02 03                      RSADDI
  0000002E 04 03 00000000             CONSTI 0
  00000034 03 01 FFFFFFF4 0004        CPTOPSP -12 4
  0000003C 03 01 FFFFFFF4 0004        CPTOPSP -12 4
  00000044 05 00 013B 03              ACTION GetSkillRank 3

Specifically, the int nSkill = GetSkillRank(nSkill, oPC); line is compiled wrong. First, an instruction with the opcode RSADDI is generated, which creates a new integer variable on the stack, for nSkill. Then, the arguments for GetSkillRank are pushed onto the stack, and GetSkillRank is called using the ACTION instruction.

Unfortunately, as soon as the compiler creates the stack space for the local variable nSkill, it associates nSkill with this local variable. So when it’s time to push the parameter nSkill for GetSkillRank on the stack, the parameter to the outer StartingConditional subroutine has already been overruled, and the CPTOPSP points to the new local variable.

This renders the nSkill parameter unused and useless, and GetSkillRank is potentially called with an uninitialized value.
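The effect can be traced by hand with a tiny stack simulation. The Python sketch below models the stack as a list of labelled 4-byte cells and assumes that CPTOPSP copies the given number of bytes, found at the given byte offset from the top of the stack, onto the top. The order of the two parameters on the stack is a guess and does not influence the result.

```python
def cptopsp(stack, offset, size):
    """Copy `size` bytes, found `offset` bytes from the top, onto the top."""
    first = len(stack) + offset // 4
    stack.extend(stack[first:first + size // 4])


# Stack after "object oPC = GetPCSpeaker();" has been executed
stack = ["nValue (parameter)", "nSkill (parameter)", "oPC"]

stack.append("nSkill (new local)")  # RSADDI
stack.append("0")                   # CONSTI 0
cptopsp(stack, -12, 4)              # CPTOPSP -12 4, copies oPC
cptopsp(stack, -12, 4)              # CPTOPSP -12 4, copies the new local

# ACTION GetSkillRank 3 consumes the three top-most elements
arguments = stack[-3:]
print(arguments)
```

The second CPTOPSP lands on the freshly reserved local variable; the original nSkill parameter is never among the arguments.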

For another example, have a look at this script from Neverwinter Nights:

int StartingConditional()
{
  int iResult;
  object oPC = GetPCSpeaker();

  iResult = GetClassByPosition(1,oPC) == CLASS_TYPE_DRUID ||
            GetClassByPosition(2,oPC) == CLASS_TYPE_DRUID ||
            GetClassByPosition(3,oPC) == CLASS_TYPE_DRUID;
  return iResult;
}

It’s meant to check whether the player character is a druid. Since you can multi-class in Neverwinter Nights, it checks whether the character is a druid for the first class, for the second class and then the third class. If any of these return true, iResult will be set to true. To achieve this, a boolean disjunction (“or” operation) is used. As is customary in C-like languages, the boolean disjunction in NWScript is supposed to support short-circuiting: if the first part is already true, the second (and third) checks aren’t even called.

Let’s see what the disassembly graph looks like:

BioWare Disassembly

The first EQII is the first comparison, and then the block in loc_00000057 is supposed to do the short-circuiting. It duplicates the top-most stack element with a CPTOPSP -4 4 before bypassing the second comparison and jumping to the LOGORII that does the boolean disjunction. Unfortunately, instead of directly jumping to loc_00000080 with a JMP, a JZ was generated instead. And since the top-most stack element was already duplicated and checked with the previous JZ, we know that the true-edge is never taken. It is a dead edge.

This has an interesting consequence. The short-circuiting for the boolean disjunction is broken: all parts are always evaluated before the results are or’ed together. In practice, this doesn’t matter much. It makes the code a bit slower, and any side effects will always happen.

Additionally, if the true edge were ever taken, the stack would be in a broken state. Unlike JMP, JZ consumes a stack element, and so the LOGORII would be missing one of its arguments. Because this is not possible, it doesn’t matter for execution, but my stack analysis dies there. To combat this problem, I added an extra disassembly step after the block generation, the detection of these dead edges. To keep it simple, I only do some simple pattern matching, which is enough for most scripts. There are a few cases where it fails, though.
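A pattern match of this kind could look roughly like the following Python sketch. It is only a guess at the general idea, not the matching ncsdis performs: it flags the jump edge of a block that consists solely of "CPTOPSP -4 4" and "JZ" when the block falling through into it ends with that same pair. The block layout (a dict with `instructions`, `fallthrough` and `jump` keys) and the block names other than loc_00000057 and loc_00000080 are invented.

```python
SHORT_CIRCUIT = ["CPTOPSP -4 4", "JZ"]


def find_dead_edges(blocks):
    """Return a list of (block, jump destination) edges that can never be taken.

    blocks: dict mapping block name -> {"instructions": [...],
            "fallthrough": name or None, "jump": name or None}
    """
    dead = []
    for block in blocks.values():
        # The preceding block must end with duplicate-and-test ...
        if block["instructions"][-2:] != SHORT_CIRCUIT:
            continue
        # ... and fall through into a block that only repeats the same test.
        follower = blocks.get(block["fallthrough"])
        if follower is not None and follower["instructions"] == SHORT_CIRCUIT:
            # The value was non-zero a moment ago, so this JZ never jumps.
            dead.append((block["fallthrough"], follower["jump"]))
    return dead


blocks = {
    "first_comparison": {
        "instructions": ["EQII", "CPTOPSP -4 4", "JZ"],
        "fallthrough": "loc_00000057", "jump": "second_comparison"},
    "loc_00000057": {
        "instructions": ["CPTOPSP -4 4", "JZ"],
        "fallthrough": "second_comparison", "jump": "loc_00000080"},
    "second_comparison": {
        "instructions": ["EQII"],
        "fallthrough": "loc_00000080", "jump": None},
    "loc_00000080": {
        "instructions": ["LOGORII"],
        "fallthrough": None, "jump": None},
}
print(find_dead_edges(blocks))
```

Once such an edge is known to be dead, a stack analysis can simply skip it instead of following it into an impossible stack state.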

This bug is present in the original scripts coming with Neverwinter Nights, Knights of the Old Republic and Knights of the Old Republic II, but has already been fixed by the release of Neverwinter Nights 2.

This bug is also not present in the OpenKnights compiler:

OpenKnights Disassembly

As you can see, the branch instruction in loc_00000057 is a JMP, as it should be.

So, to recap, xoreos-tools now has a tool that can disassemble NWScript bytecode, similar to the disassembly mode of the OpenKnights nwnnsscomp, with these added features:

  • Easier to compile.
  • Works compiled as 64-bit.
  • Out-of-the-box support for the NWScript found in all Aurora-based games.
  • No need for a separate nwscript.nss.
  • Support for new array and reference opcodes.
  • Deeper analysis of the stack, to figure out the subroutine signatures.
  • Deeper analysis of the control flow, to detect higher-level control structures.
  • Can create control flow graphs in DOT format.

If you’re interested, the source is available here. Binaries will come with the next release of xoreos and xoreos-tools.

There is, however, still a lot left to do there:

  • Create a decompiler: use the detected control structures as a base to generate C-like NWScript source code.
  • Detect chained instructions: something like “int a = b * c” compiles to a lot of instructions that create temporary stack variables.
  • Detect structs and vectors.

Unlike nwnnsscomp, xoreos-tools is still missing a compiler as well. This is something that would be very nice to have indeed. An assembler, which can take the disassembly output and create a working .ncs file out of it would probably be a useful first step in that direction.

If you would like to help and take up any of these tasks, or any other task from our TODO list, please contact us! :)

The end of the year is approaching fast, and just like last year, I want to use this time for some retrospection.

First of all, what happened in the last year?

  • berenm added support for building xoreos with CMake, by the way of parsing the automake files used for the autotools build system. This way, xoreos can now be built with either CMake or autotools. I was skeptical at first, especially since I harbour no love for CMake, but it is working reasonably well and I am quite happy with it. In hindsight, I was wrong to reject this pull request for so long.
  • I focused on supporting all the different model formats used in the Aurora games, and then I made all the games display their in-game areas with objects.
  • xoreos adopted the Contributor Covenant as its Code of Conduct, in the hopes that it helps foster a friendly and welcoming community.
  • The big one: our first official release, xoreos 0.0.2, nicknamed “Aribeth”.
  • I overhauled the script system, making it more generic. This way, I was able to apply it to all targeted games, except Sonic Chronicles: The Dark Brotherhood (which doesn’t seem to use any scripts at all). This included figuring out and implementing four new script bytecode opcodes: two for array access in Dragon Age: Origins, and two for reference creation in Dragon Age II.
  • I implemented reflective environment mapping for Neverwinter Nights and the two Knights of the Old Republic games.
  • I added a new tool to the xoreos-tools package: xml2tlk, which can recreate TLK talk table files out of XML files created by tlk2xml.
  • With these changes, I decided to push out xoreos 0.0.3, nicknamed “Bastila”.

This is all old news, more or less already discussed in previous blog posts. However, since then, I added yet another new tool to the xoreos-tools package: ncsdis. It’s a disassembler for NCS files, the stack-based compiled bytecode of the C-like NWScript, BioWare’s scripting language used throughout their Aurora-based games.

It basically replaces the disassembler within the old OpenKnights NWScript compiler, with various added benefits. I’ll write a bit more about this tool in the near future, so for now I’ll just leave you with an example assembly listing it can produce, as well as a control flow graph it can create (with the help of Graphviz). As you can see, it already groups the instructions by blocks and subroutines. It performs a static analysis of the stack (to figure out subroutine parameters and return types) and it also analyzes the control flow to detect assorted control structures (loops, if/else). I plan to grow it into a full-fledged NWScript decompiler.

Additionally, I also added support for BioWare’s Neverwinter Nights premium modules, like Kingmaker, to xoreos.

On the documentation side of things,

  • I added comments and documentation to various files in the xoreos sources, hopefully making them more understandable and useful for potential new contributors and otherwise interested people. Considering how awful my memory is, this is also a kind of future-proofing.
  • Farmboy0 added “research” subpages for various games on our wiki, filling them with information about their workings.
  • I extended our TODO list considerably.
  • I added an example configuration file, and extended the documentation on the wiki on how to compile and run xoreos.
  • I wrote man pages for each tool in xoreos and for xoreos itself. I also added the former to the wiki.

Phew! This is again a bigger list than I had anticipated. This wouldn’t have been possible without these people, for whom I am thankful:

  • I am thankful to berenm for providing the CMake bindings, despite my grumbling about it.
  • I am thankful to Supermanu, for continuing to chip away at the Neverwinter Nights character generator.
  • I am thankful to Farmboy0, for working on xoreos’ Jade Empire engine and researching game internals.
  • I am thankful to mirv, for continuing with the huge task of rewriting my naive OpenGL code.
  • I am thankful to Coraline Ada Ehmke for creating the Contributor Covenant.
  • I am thankful to all the people in the different BioWare modding communities, for having figured out many different things already. Skywing for example, who had emailed me a few years ago about certain NWScript issues, issues I recently stumbled over again.
  • I am thankful to fuzzie, for giving me pointers on the NCS disassembler/decompiler.
  • I am thankful to the GamingOnLinux people, who do a lot of work reporting on all sorts of Linux-related gaming news, and who so graciously mirror my xoreos blog posts.
  • I am thankful to kevL, for notifying me of issues with xoreos’ build system on configurations I hadn’t thought about.
  • I am thankful to clone2727, for putting up with rants and ravings.
  • I am thankful to all the people who told me when I was wrong, for example when I wrongheadedly silenced clang static analyzer warnings, without understanding what I was doing.
  • I am thankful to everybody else who gave me hints and tips, taught me tricks and procedure, showed me new things, old things, forgotten things, broken things.
  • I am thankful to all the people who are not angry with me for forgetting them, because they are aware that this is not meant as a personal slight ;).

Now that I have these mushy feelings out of my system, here’s hoping for another great year! :)

And like always, if you want to join our effort, please don’t hesitate to contact us!

To keep things moving following the previous 0.0.2 release, we’re proud to announce the release of version 0.0.3, nicknamed “Bastila”, of xoreos and xoreos-tools.

This release features a working script system for all targeted games, with game scripts being fired for the start of a campaign or module, when entering and leaving areas, and when clicking on in-game objects. The singular exception is the Nintendo DS game Sonic Chronicles: The Dark Brotherhood, which doesn’t seem to feature any scripts at all.

The vast majority of engine functions, the functions that are called by the scripts and that do the actual work of tracking and changing the game state, are still missing, though. Per game there are about 850 functions (with some overlap) that need to be implemented. We currently have about 90, per game, of these written and working within xoreos. Moreover, many of the functions still missing depend on features not yet implemented.

Apart from the script system changes, 0.0.3 also comes with support for reflective environment mapping in Neverwinter Nights and the two Knights of the Old Republic games. The “metallic” armor and area parts that were rendered transparent in xoreos are now properly reflective. This can be seen, for example, in the Sith troopers in Knights of the Old Republic, in various plate armor worn by NPCs in Neverwinter Nights, as well as the metallic floors on the planet of Taris and the icy wastes of Cania. For Neverwinter Nights, xoreos now also correctly smooths the vertex normals of (binary) models, so that the metallic effect is not broken by sharp polygon edges.

Semi-transparent mask | Plus reflectivity | Correctly rendered Sith trooper

Without environment map | Without normal smoothing | Correctly rendered plate armor

On the xoreos-tools side of things, there’s now a new xml2tlk tool that can convert XML files created by the tlk2xml tool back into a talk table TLK file. Please note that, at the moment, only non-GFF’d TLK files can be written, as used by the two Neverwinter Nights games, the two Knights of the Old Republic games, Jade Empire and The Witcher. TLK files as used by Sonic Chronicles: The Dark Brotherhood and the two Dragon Age games cannot be written yet (they can, however, be read with the tlk2xml tool).

Additionally, the convert2da tool gained the ability to write binary 2DA files, as used by the two Knights of the Old Republic games; and xoreostex2tga can now correctly read TPC cube maps.

Binaries for Windows, GNU/Linux and Mac OS X are attached to the GitHub release, here for xoreos and here for xoreos-tools. Additionally, packages for various GNU/Linux distributions can be found on the OpenSuSE Build Service (here for xoreos, here for xoreos-tools) and in Arch Linux’s AUR (here for xoreos, here for xoreos-tools).

Alternatively, the repository and the source tarballs contain PKGBUILD files in dists/arch/ and a debian build directory in dists/debian/, which can be used to build Arch Linux and Debian/Ubuntu packages, respectively.

And as always, we’re looking for more developers to join us in our efforts to reimplement those 3D BioWare RPGs. If you would like to help, please feel free to contact us. :)

We are proud to announce our very first release of xoreos and xoreos-tools, version 0.0.2, nicknamed “Aribeth”.

While xoreos is still far from being useful to end-users, all targeted games work insofar as they at least show basic in-game areas. You can start the game, xoreos loads the game resources, loads a campaign or module, and then shows an area of the game. This accurately demonstrates what the xoreos project wants to accomplish.

Neverwinter Nights | Knights of the Old Republic | Dragon Age: Origins

Within the in-game area, you can fly around in a “spectator” mode, using the common first-person WASD control scheme. Moving the mouse while holding down the middle mouse button rotates the camera. With Ctrl+D, a debug console drops down, allowing for general resource dumping and the loading of different areas, modules and/or campaigns.

A few games, specifically Neverwinter Nights and Knights of the Old Republic, also show a main menu, although the latter’s is not as extensive yet. The former also shows a few in-game menu elements.

Additionally, Neverwinter Nights also has a script system hooked up, and preliminary dialogue support. This means that clicking on an NPC opens up its conversation dialog, and some of the script commands will be executed. For example, the door in the first area of the original campaign’s prelude opens after speaking to Bim and telling him that no tutorial is necessary. However, triggering the tutorial leads to the scripts looping endlessly, because the necessary game functions are not implemented yet.

Further gameplay is still missing. At the moment, none of the other games have a script system.

The current graphics are very basic: only flat-shaded, textured meshes are shown. No lighting, shadows or shaders of any kind are currently available.

Please note that xoreos is still missing a GUI and needs to be started from the command line.

The accompanying xoreos-tools package includes command line tools that can be used to inspect the games’ resource files and, as such, are meant primarily for developers.

Binaries for Windows, GNU/Linux and Mac OS X are attached to the GitHub release, here for xoreos and here for xoreos-tools. Additionally, packages for various GNU/Linux distributions can be found on the OpenSuSE Build Service (here for xoreos, here for xoreos-tools) and in Arch Linux’s AUR (here for xoreos, here for xoreos-tools).

Alternatively, the repository and the source tarballs contain PKGBUILD files in dists/arch/ and a debian build directory in dists/debian/, which can be used to build Arch Linux and Debian/Ubuntu packages, respectively.

And as always, we’re looking for more developers to join us in our efforts to reimplement those 3D BioWare RPGs. If you would like to help, please feel free to contact us. :)

Yet further down the path of getting all targeted games to show areas, it seems like I reached the end with Dragon Age: Origins and Dragon Age II. Similar to my posts about my progress with Sonic Chronicles: The Dark Brotherhood (Part 1, Part 2, Part 3), The Witcher, Jade Empire and Neverwinter Nights 2, this will be a short description of what I did. This time: Dragon Age: Origins and Dragon Age II.

Models

Lucky for me, the Dragon Age model format is reasonably well documented in the Dragon Age toolset wiki. tazpn even created standalone model viewers for Dragon Age: Origins and Dragon Age II, and released them with sources under the terms of the 3-clause BSD license. :)

And since the model format is based on GFF4, missing pieces of information are relatively easy to decipher too. So I quickly had a loader capable of reading the skeleton whipped up for both Dragon Age: Origins and Dragon Age II (since they are nearly identical in format).

[Images: Rage demon skeleton, Dragon skeleton]

With a bit of fiddling, the meshes were there too. There’s two types of meshes within the models: static meshes, which hang directly off one specific bone, and dynamic meshes, which include weights for several bones for each vertex. Similar to models in Sonic Chronicles, the dynamic meshes deform according to those weights when the bones are animated. Unlike Sonic Chronicles, the default vertex positions of those meshes create a valid, unanimated pose. This means I could just completely ignore the bone weights for now, and load the meshes as if they were static. In the future, a vertex shader would combine those weights with the bone positions to create the fully animatable model meshes.
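Combining per-vertex weights with bone positions is usually done with some form of linear blend skinning. Here is a minimal sketch of that idea in Python (xoreos itself is C++). The function name and data layout are hypothetical, and the bones only carry translations to keep it short; a real implementation would use full transformation matrices, and this is not taken from xoreos or the original game.

```python
def skin_vertex(position, influences, bone_offsets):
    """Blend a vertex position by its bone weights (linear blend skinning).

    position:     (x, y, z) of the vertex in the unanimated pose
    influences:   list of (bone_index, weight) pairs; weights should sum to 1
    bone_offsets: per-bone (dx, dy, dz) translation relative to the rest pose
                  (hypothetical simplification of a full bone matrix)
    """
    x = y = z = 0.0
    for bone, weight in influences:
        dx, dy, dz = bone_offsets[bone]
        # Each bone moves the vertex on its own; the results are averaged by weight
        x += weight * (position[0] + dx)
        y += weight * (position[1] + dy)
        z += weight * (position[2] + dz)
    return (x, y, z)
```

With all bone offsets at zero, the function returns the original position, which matches the observation that the default vertex positions already form a valid pose.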

[Images: Rage demon mesh, Dragon mesh, Statue mesh]

The only thing missing now were the textures. For that, I needed to read the MAO (material object) files, which contain the material file (MAT), various textures (diffuse, lightmap, etc.) and a number of optional parameters. The material file in turn contains several different “semantics”, each of which is basically the name of a shader and how to map the MAO values onto the shader input. The original game takes all these, looks for the most fitting semantic in the material file (depending on number of parameters, graphics card capability and user settings), and then tells the graphics card which shader to use to render the mesh.

Now, since we don’t actually support any shaders yet (and we can’t use the game’s Direct3D shaders directly anyway), we simply read the MAO (which can be either in GFF4 or XML format), take the diffuse texture, and apply it to the mesh directly.

[Images: Textured rage demon, Textured dragon, Textured statue]

Campaigns

With the models done, I turned to reading the Dragon Age: Origins campaign files. A campaign is either the default single-player campaign (which is defined in a CIF file), or a DLC package (with both a CIF file and a manifest.xml) that doesn’t extend another campaign (those would be add-ons).

There’s several caveats involved here:

First of all, most of the DLC packages are encrypted. The original game queries a BioWare server for the decryption key, asking whether it’s a legitimate copy. While the encryption method is known (Blowfish in ECB mode), xoreos does not include any of the keys. So the only campaigns apart from the main one loadable right now are the unencrypted ones, namely Dragon Age: Awakening, and any custom ones you might have downloaded (including the PC Gamer promo DLC A Tale of Orzammar).

Then, we don’t load any add-ons. So no Shale or Feastday Gifts, even if they weren’t encrypted (which they are). It’s not like xoreos could do anything with them yet anyway.

Finally, we have no way to install .dazip packages yet, so those need to be installed using the original game for now, or manually extracted and put in the right places. In the future, something that installs them would be nice. Or maybe we could support loading of packed .dazip files, but that could be slow.

In either case, I implemented the loading of standalone campaign files.

Areas and rooms

Next up were areas (ARE) and environment layouts (ARL) with room definitions (RML). The ARE contains dynamic room information, like what music to play, and the placeables and creatures (more of those later). The ARL defines what rooms are in the area (as well as pathing information, weather, fog, etc.), each of them being a RML file with models. They are all, again, GFF4 files, making them nice and easy to understand.

[Images: Arena, Castle, Ostagar]

There was one problem, though. The orientations of the models were given in quaternions, and as I said in the blog post about my The Witcher progress, the combination of the automatic world rotation xoreos does and our Model class wanting Euler angles instead leads to them not being correctly evaluated for whole models.

I was getting sick of that not being correct. I bit the bullet and removed the world rotation (which meant I had to rejigger the placement code in all engines, as well as the camera system, which was especially painful in Sonic Chronicles). And then I changed the Model class to take axis-angle rotations instead; those can be more easily calculated from quaternions, and can still be directly fed into OpenGL.
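To illustrate why axis-angle is the convenient middle ground, here is a minimal sketch of the standard quaternion to axis-angle conversion, in Python rather than xoreos’ C++. The function name is made up for this example; the result is an angle in degrees plus an axis, which is the shape of arguments that OpenGL’s glRotatef() takes.

```python
import math

def quaternion_to_axis_angle(x, y, z, w):
    """Convert a quaternion (x, y, z, w) to (angle in degrees, (ax, ay, az))."""
    norm = math.sqrt(x * x + y * y + z * z + w * w)
    if norm == 0.0:
        return 0.0, (1.0, 0.0, 0.0)  # degenerate input: treat as no rotation

    # Normalize, so that w is the cosine of half the rotation angle
    x, y, z, w = x / norm, y / norm, z / norm, w / norm
    w = max(-1.0, min(1.0, w))

    angle = 2.0 * math.acos(w)
    s = math.sqrt(max(0.0, 1.0 - w * w))  # sine of half the rotation angle
    if s < 1e-9:
        return 0.0, (1.0, 0.0, 0.0)  # no rotation: the axis is arbitrary

    return math.degrees(angle), (x / s, y / s, z / s)
```

Going the other way, from a quaternion to Euler angles, needs several inverse trigonometric functions and special handling for gimbal lock, which is part of why axis-angle is the easier target.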

As a result, the area room models in Dragon Age: Origins were correctly oriented. And the placeable models in The Witcher as well.

[Images: Elven alienage, Ostagar]

You might notice that the ground mesh in outdoor areas looks very blurry and low-res. That’s because the original game doesn’t specify a single texture for those, but instead combines several textures together in a shader. We don’t support that yet, so instead we apply the replacement texture of the lowest LOD which is normally used for meshes that are far away.

Placeables

On to the placeables, the objects within areas. They are defined within a list in the ARE file (giving position, orientation, name, etc.), each with a template. The template is a UTP file, a GFF3, that contains common properties for all instances of this placeable. This includes an appearance, which is an index into a GDA (a GFF’d 2DA, a two-dimensional table), which specifies, among other things, the model to use.

So far, so usual for BioWare games.

One difference, though. In the Dragon Age games, the GDA files do not stand alone. Instead, each is a combination of potentially several GDA files with the same prefix (defined in m2da.gda). This is used for DLCs, which then can simply add rows to a GDA, instead of overwriting the whole file. Consequently, the appearance index is not a direct row number, but corresponds to a value in the “ID” column.
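A minimal sketch of that look-up, in Python instead of xoreos’ C++: several tables are merged into one dictionary keyed by the “ID” column. The row contents, the “ModelName” column and the rule that later tables win on duplicate IDs are assumptions made for this example, not something confirmed by the games’ data.

```python
def merge_gda(tables):
    """Combine several GDA tables into one look-up keyed by the "ID" column.

    tables: list of tables, each a list of row dicts with an "ID" key.
    Later tables replace rows with the same ID (an assumption of this sketch).
    """
    merged = {}
    for table in tables:
        for row in table:
            merged[row["ID"]] = row
    return merged

# Hypothetical data: a base table and a DLC table that only adds a row
base = [{"ID": 0, "ModelName": "chair"}, {"ID": 1, "ModelName": "table"}]
dlc = [{"ID": 500, "ModelName": "dlc_statue"}]

appearances = merge_gda([base, dlc])
model = appearances[500]["ModelName"]  # found by ID, not by row number
```

Note that ID 500 lives in the third row of the combined data, which is exactly why indexing by row number would go wrong.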

A bit fiddly, but still relatively easy to implement.

[Images: Placeable, Placeable]

Creatures

The creatures were more difficult. There’s several types of creatures: type S (simple) are just a single model; type H (head) are split into a body model and several models for the head (base, eyes, hair, beard); type W (welded) are similar to H, but already include weapons in the body model; and “P” (player-type) creatures are segmented into head (with base, eyes, hair, beard), chest, hands (gloves) and feet (boots).

[Images: Hurlock, Headless Duncan, Headless kitchen boy]

Moreover, creatures of type P also switch model parts depending on the equipped items. So armor changes the chest model, gloves and boots change the hands/feet models and a helmet replaces the hair. Which models to use depends on several factors, and includes look-ups in several different GDA files, as well as UTC (creature template) and UTI (items) files.

Another problem is the tinting. The original game uses a shader to tint hair, skin and armor parts with custom, user-selectable colors. To do that, their textures just contain intensity values in two color channels, while the two other channels are used as a bump map and something else (which I’m not sure about yet). If we just apply the texture to those body parts, they are suddenly mostly transparent. To work around that for now, we manually modify each of those textures to remove the transparency. That leaves the weird coloring, but you can at least see all the body parts then.

[Images: Duncan without hair, Kitchen boy, Cook]

[Images: Meeting of the heads, Bodies, Helmets]

Dragon Age II

I then applied all this to Dragon Age II. Just a few minor changes to the resource loading were necessary, and nearly everything worked out of the box.

[Images: Hawke Estate, Viscount’s Keep]

Only the P-type creatures needed a bit more work, since how the body part models are constructed changed.

[Image: Companions]

Similar to Sonic Chronicles, Dragon Age II is also missing many of the GDA headers; they’re only stored as CRC hashes. With a dictionary attack, I did manage to crack about half of them, but that still leaves about 450 unknown. Something to watch out for in the future.

Music

I also investigated how music works in the two games. Dragon Age: Origins uses FMOD, and Dragon Age II uses Wwise. Both work similarly: the area specifies an event group, and the scripts then tell the library to play a specific event list from that group at certain times. The library does the rest, evaluating the events in the event list (which range from “play sound X”, through “set volume to Y”, to “add Z% reverb”). And while I do have adequately licensed code to read the sounds from both libraries’ soundbanks, figuring out the events is a massive undertaking. And we don’t have a script system for the Dragon Age games in place anyway, so this is nothing that can be done right now.

What’s next

So… All games xoreos cares about now show areas. What’s next, then?

Well, first of all, I’d like to do some cleanup of the engines code. Sync them up, make them more similar to each other. Right now, many things are done slightly differently in each engine, because a game changed something around and the old concept suddenly didn’t fit anymore. If possible, I’d like to unify the concepts again.

There’s also a few potential portability issues I want to investigate. For example, I read that using fopen() on Windows with filenames containing non-ASCII characters won’t work at all. Instead, I’ll probably have to change xoreos’ File stream class to use Boost’s fstreams, and convert our UTF-8 strings to UTF-16 on file open. I hope that’s something I can test with Wine, otherwise I’ll have to bug somebody with access to a real Windows.

After those things have been cleared, I’d like to prepare for our very first release. I plan to include both xoreos and xoreos-tools, with sources (of course) and pre-compiled binaries for GNU/Linux, Mac OS X (>= 10.5) and Windows, each for both x86 and x86_64. I have cross-compilers for those, and they all should work. Yes, xoreos is still not really useful for end-users, but a release can’t hurt, and might give us some publicity and/or get people interested. Who knows.

I could use some testers for those binaries, though, to make sure I get the library dependencies right. And that the GNU/Linux binaries work on other systems than just mine.

I’m also open for other platforms. Would it make sense to have xoreos pre-compiled for Free/Net/OpenBSD? Other architectures than just x86/x86_64? Anybody with insights there, and capable of compiling those binaries (or pointers to cross-compilers), please, contact us. :)

As for how to continue the actual xoreos development, I think it would be useful to transfer the script system that’s currently hooked up to Neverwinter Nights onto the other engines. It would need to be rewritten, though. When I first wrote it, I wanted to have engine functions with signatures that mirrored the signatures of what the scripts call. I couldn’t get it to work, though, and settled on a context that contained an array of parameters. For some reason, I still used boost::bind for all the functions, which, at that point, was not necessary. boost::bind compiles really, really slowly, and so now the files containing the Neverwinter Nights engine functions take ages to compile. This needs to go.

There, that’s the current short-term roadmap for me: cleanup, release, script system.

(This is part 3 of 3 of my report about the progress on Sonic Chronicles. If you haven’t already, please also read part 1 and part 2.)

Now that I had (nearly) everything graphical together, it was time to weave it all together into something approaching fake Sonic Chronicles gameplay.

Window size

Being a Nintendo DS game, Sonic Chronicles runs on two screens of 256x192 each, arranged on top of each other. To make this easy on me, I decided to, for now, force xoreos’ window size to a static 256x384, and draw into it as if it was the two Nintendo DS screens. For the future, we’ll have to think about how to handle scaling.

There’s at least two ways to handle this:

  • Render into two textures, one for each screen, then scale these
  • Scale and position the pieces separately

The former is the easy way out, but the latter might provide higher quality. There might be a third, a middle way: draw all 2D elements combined with scaling, and render the 3D objects in higher resolution.

Intro panels

To make the game feel a bit more real, I rigged up a few static panels showing the various splash screens, and the “Start your adventure” GUI. Note that the button isn’t actually working: it’s really just an image that waits for a mouse click.

[Images: Touch to Start, Start your Adventure, Chapter 1]

Area background

After the intro panels, Sonic Chronicles in xoreos then dumps you right into the first area. Using a static panel again, it displays the mini map on the top screen, and the area background on the bottom screen. With the arrow keys, you can move the camera along the X and Y axes, and the area background panel follows the camera to draw the section.

Area placeables

That was easy enough. I then wasted the better part of a day trying to trace and guess at how the game loads the “placeable” objects, the usable 3D objects in the area. The area description within the ARE file lists all placeables, each with an integer with the “name” of 40023 that seems to be a running ID and an integer called 40018 that’s unique for each type. For example, collectable rings have a 40018 value of 0, the item chest 6, and the pile of wood in the first area has a value of 15. The model names are listed in appearances.gda. So far, so familiar. However, the model for the wood pile is on row 101, and I failed to find a consistent way to connect those two numbers, 15 and 101, either mathematically or with the help of other data files, that would have worked for other placeables as well.

With nowhere else to turn, I looked at the disassembly. And I wept. There is no clean way to connect those numbers, because the placeable instantiation is hardcoded: there’s a big old switch with all possible values (43 of them, 0-42) for the integer 40018, with object instantiation for each of them.

To keep it simple for now, I added a little array mapping that type ID onto a row in the appearances.gda. Not all cells are filled yet, but enough that the first area makes sense.
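Such a mapping array could look roughly like the following Python sketch (xoreos itself is C++, and the names here are made up). Only the wood pile pairing, type 15 to row 101, and the count of 43 types come from the description above; every other entry is deliberately left empty rather than guessed.

```python
# Index: placeable type (the integer field 40018)
# Value: row in appearances.gda, or None where this sketch has no data
PLACEABLE_TYPE_COUNT = 43  # the hardcoded switch covers the values 0-42

TYPE_TO_APPEARANCE_ROW = [None] * PLACEABLE_TYPE_COUNT
TYPE_TO_APPEARANCE_ROW[15] = 101  # pile of wood in the first area

def appearance_row(type_id):
    """Return the appearances.gda row for a placeable type, or None if unknown."""
    if not 0 <= type_id < PLACEABLE_TYPE_COUNT:
        return None
    return TYPE_TO_APPEARANCE_ROW[type_id]
```

Returning None for unfilled cells lets the area loader skip placeables it doesn’t recognize yet instead of loading a wrong model.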

But, to get the models to display correctly, there was still something missing. You see, xoreos sets up a proper perspective projection matrix, where objects in the distance are smaller and all this jazz. Sonic Chronicles, however, uses an orthogonal projection at an angle of 45°. So, I added a method in our GraphicsManager to let the game code select an orthogonal projection instead.

And after some other minor fixes related to this, like scaling the rate of camera movement to fit the 45° angle (so that 3D objects stay at the same point on the background image when you move), and adding the changed orthogonal viewport to our unproject() function (so that detecting that you’ve moused over an object works correctly), the placeables now display and behave correctly within the areas.
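For reference, this is what an orthogonal projection matrix looks like, as a small Python sketch that follows the matrix documented for OpenGL’s glOrtho(). The helper names are hypothetical and this is not the actual GraphicsManager code. The point it demonstrates is that the projected X and Y of a point do not depend on its distance from the camera.

```python
def ortho_matrix(left, right, bottom, top, near, far):
    """Build a row-major 4x4 orthogonal projection matrix (as in glOrtho)."""
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def project(matrix, point):
    """Multiply the matrix with the point (x, y, z, 1) and return (x, y, z)."""
    vec = (point[0], point[1], point[2], 1.0)
    out = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    return tuple(out[:3])

matrix = ortho_matrix(-2.0, 2.0, -2.0, 2.0, 1.0, 100.0)
near_point = project(matrix, (1.0, 1.0, -5.0))
far_point = project(matrix, (1.0, 1.0, -50.0))
# near_point and far_point share the same X and Y; only the depth differs
```

With a perspective projection, the far point would land closer to the center of the screen, which is the “objects in the distance are smaller” effect that Sonic Chronicles does not want.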

[Images: Green Hill Zone, Blue Ridge Zone, Voxai Beta Interior]

That all?

So, what’s missing? Quite a lot, actually:

Model animations. Those can be both geometry- and texture-based. In geometry-based animations, the bones move around and rotate, leading to different vertex positions. With texture-based animations, the textures move, rotate or scale, or even get replaced by different textures.

Most of the placeable types aren’t yet recognized. Nor do the placeables do anything yet. Nor do we create creatures, triggers and squads. Nor do we have a player character that can move around.

There are also no conversations of any kind implemented yet, and there’s no proper GUI and menu support.

We’re also lacking sound and music. There’s partial documentation for them, though, so it should be relatively easy to manage. Videos, which we still miss too, will be more difficult: we need to reverse-engineer the ActImagine VX video codec, since no one has done that yet.

This all is stuff for the TODO pile, though. Nothing I want to work on at the moment. Of course, we would welcome your contribution, so please, don’t hesitate to contact us if you want to tackle any of these features, or anything else for that matter!

What’s next, then?

If I want to continue on the path of getting all games to show areas, Dragon Age: Origins would be next. We’ll see how well I’ll do there, I guess! :)

(This is part 2 of 3 of my report about the progress on Sonic Chronicles. Part 1 can be found here. Part 3 can be found here.)

After having implemented readers for the common BioWare formats, I turned to the graphics formats. They’re, for the most part, stock Nintendo DS formats, which means I could build upon detective work from the Nintendo modding scene. I have to thank three people in particular: Martin Korth, of NO$GBA fame, whose GBATEK documentation is invaluable; lowlines, who figured out many of the gory details of Nintendo’s formats; and pleoNeX, whose GPLv3-licensed tool Tinke provided the base on which I implemented my code.

SMALL

When I looked over the files inside the Sonic Chronicles archives, I noticed a peculiar thing. There’s a lot of files with names ending in “.small”. I suspected a compression scheme, especially after examining the files in a hexeditor. And sure enough, there are several compression algorithms provided by the Nintendo DS firmware. The one used by Sonic Chronicles is an LZSS variant, which can be decompressed using barubary’s MIT-licensed dsdecmp tool (GitHub mirror). I pulled the decompressor into xoreos.
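To give an idea of what such a decompressor does, here is a minimal Python sketch of the LZSS scheme that GBATEK documents as compression type 0x10. Whether the .small files use exactly this layout (header included) is an assumption of this sketch; it is not a port of dsdecmp or of the xoreos code, and the function name is made up.

```python
def decompress_lz10(data):
    """Decompress an LZSS stream of type 0x10 (layout as described in GBATEK).

    Header: one type byte (0x10), then the decompressed size as 24-bit
    little-endian. After that, each flag byte describes the next 8 items,
    most significant bit first: 0 = literal byte, 1 = back-reference.
    """
    if data[0] != 0x10:
        raise ValueError("not an LZSS type 0x10 stream")

    size = int.from_bytes(data[1:4], "little")
    out = bytearray()
    pos = 4

    while len(out) < size:
        flags = data[pos]
        pos += 1
        for bit in range(7, -1, -1):
            if len(out) >= size:
                break
            if flags & (1 << bit):
                # Back-reference: 4 bits (length - 3), 12 bits (distance - 1)
                b1, b2 = data[pos], data[pos + 1]
                pos += 2
                length = (b1 >> 4) + 3
                distance = (((b1 & 0x0F) << 8) | b2) + 1
                for _ in range(length):
                    out.append(out[-distance])  # may overlap the bytes just written
            else:
                out.append(data[pos])
                pos += 1

    return bytes(out[:size])
```

The byte-by-byte copy matters: a back-reference may reach into data it is itself producing, which is how short repeating patterns get compressed.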

NSBTX

The first graphics format in Sonic Chronicles I inspected was NSBTX. NSBTX files are textures; or rather: archives of several textures used by a single model each. Implementing the reading of these was simple enough, and I added a small program to our tools collection that can “extract” them into TGAs.

[Images: Hill, Boar, Tails]

NFTR

Next up, I wanted to see the fonts, NFTR, used in the game. They’re bitmap fonts, with each glyph an image. The image can be 1-bit black and white, or it can be greyscale for anti-aliasing, shadowing or outlining purposes. Additionally, there’s mapping tables to translate a code point in a certain encoding (UTF-16, UTF-32, CP1252 or Shift-JIS) into the index of the glyph it represents.

There was a bit of trial and error involved, as the documentation and existing FLOSS projects to display the fonts weren’t quite correct in certain details (which might not even be their fault; Nintendo likes to subtly change formats between firmware versions). But, before long, I could print arbitrary strings in these fonts in xoreos.

[Image: Font drawing test]

NBFS/NBFP

Sonic Chronicles comes with a few NBFS files, which is a dead-simple raw format: 8-bit paletted image data, with the palette (in 16-bit RGB555 values) in an NBFP file of the same name. They’re mostly used for images spanning a whole Nintendo DS screen.
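Decoding such a pair boils down to a palette look-up. The Python sketch below assumes the Nintendo DS color layout described in GBATEK (red in bits 0-4, green in bits 5-9, blue in bits 10-14) and little-endian palette entries; the function names are made up for this example.

```python
def rgb555_to_rgb888(color):
    """Expand a 15-bit color to an (r, g, b) tuple with 8 bits per channel."""
    r = color & 0x1F
    g = (color >> 5) & 0x1F
    b = (color >> 10) & 0x1F

    def expand(c):
        # Replicate the top bits into the low bits, so that 31 becomes 255
        return (c << 3) | (c >> 2)

    return expand(r), expand(g), expand(b)

def decode_paletted(pixels, palette_raw):
    """Turn 8-bit palette indices (NBFS) into RGB tuples using a palette (NBFP).

    pixels:      bytes, one palette index per pixel
    palette_raw: bytes, one 16-bit little-endian color per palette entry
    """
    palette = [
        rgb555_to_rgb888(int.from_bytes(palette_raw[i:i + 2], "little"))
        for i in range(0, len(palette_raw), 2)
    ]
    return [palette[index] for index in pixels]
```

Since the files carry no dimensions, the caller has to know the image width; for images spanning a whole screen, that would be 256.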

[Images: Mini map, Conversation card]

NCGR/NCLR

The main image format used in Sonic Chronicles, however, was still missing: NCGR. As I went along implementing a reader, I learned these ugly facts:

  • The palettes are in separate NCLR files that are shared among NCGR
  • Most of the images are made up of several NCGR files, arranged on a grid
  • The NCGR image data itself is made up of 8x8 pixel tiles

Essentially, this image of Sonic is divided into these parts:

[Images: Sonic, NCGR cells, NCGR tiles]

This all makes assembling the final image a bit…ugly. But hey, I made it work in the end.
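The tile part of that puzzle can be sketched as follows, in Python. The sketch assumes one byte per pixel, tiles stored left to right and top to bottom, and each tile stored row by row; real NCGR data can also pack pixels differently, so take this as an illustration of the reordering only. The function name is hypothetical.

```python
def detile(tiled, width, height, tile=8):
    """Rearrange pixel data stored tile by tile into plain row-major order.

    tiled:  bytes, one byte per pixel, in tile order
    width:  image width in pixels, a multiple of tile
    height: image height in pixels, a multiple of tile
    """
    tiles_per_row = width // tile
    linear = bytearray(width * height)

    for i, value in enumerate(tiled):
        tile_index, within = divmod(i, tile * tile)        # which tile, offset in it
        tile_y, tile_x = divmod(tile_index, tiles_per_row)  # tile position on the grid
        y_in, x_in = divmod(within, tile)                   # pixel position in the tile

        x = tile_x * tile + x_in
        y = tile_y * tile + y_in
        linear[y * width + x] = value

    return bytes(linear)
```

Arranging several NCGR files on a grid is the same calculation one level up, with whole images taking the place of tiles.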

…except for one thing: a few of the NCGR files don’t specify their width and height. By fiddling with the values a bit, I managed to find these values manually, but the resulting image looks off: it’s as if the image is supposed to be rearranged afterwards, different pieces drawn at different places. Each of those files also has an NCER file with the same name. I assume that means information on how to draw these NCGR are within the NCER. A thing for the TODO pile.

NSBMD

I then went for the big one: the 3D model format NSBMD. This involved a lot of fiddling, guessing and trial-and-error, as the documentation of these formats is sparse, and oftentimes wrong.

Conceptually, an NSBMD file consists of these parts:

  • Bones
  • Bone commands
  • Materials
  • Polygons
  • Polygon commands

A bone consists of a name and a transformation that displaces it from its (at that point unknown) parent bone. They are stored as a flat list. The bone commands then specify how the bones connect together. And they give each bone a location in the Nintendo DS’s matrix stack (a list of transformation matrices containing the absolute transformation of each bone at the time of rendering).

[Images: Broken boar skeleton, Getting there…, Correct boar skeleton]

Each material contains a texture name (which references a texture inside the NSBTX with the same name as the NSBMD) and a few properties.

Each polygon can use a single material, and contains a list of polygon commands. These polygon commands produce the actual geometry. They set color, normal and texture coordinates, and generate vertices. They also manipulate the matrix stack, specifically replacing the working matrix with the matrix from the stack position of certain bones. In essence, this bases the vertices on the position of the bone.

[Images: No., Boar with holes, Correct boar]

While the Nintendo DS interprets the polygon commands on-the-fly while rendering, and while they can be nearly directly converted to OpenGL-1.2-era glBegin()/glEnd() blocks, this is not really what we want to do. So instead, we interpret the polygon commands into an intermediate structure while loading.

[Images: Sonic, Amy, Tails]

The result is a relatively massive loader for these files, and that doesn’t yet include support for animations.

One interesting anecdote: the Nintendo DS doesn’t use floating-point numbers to represent real numbers, but various formats of fixed-point numbers. Those are found extensively in the NSBMD files. But when I implemented the GFF4 format earlier (see part 1 of my Sonic Chronicles progress report), I found, in the GFF4 files used by Sonic Chronicles, a field type not described in the Dragon Age toolset wiki. Turns out, those are Nintendo DS fixed-point numbers!
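Converting such a value is straightforward once the split between integer and fractional bits is known. The Python sketch below defaults to 32-bit values with 12 fractional bits, a layout GBATEK describes for many Nintendo DS 3D values; that this is also the layout of the GFF4 field is an assumption here, and the function name is made up.

```python
def fixed_to_float(raw, total_bits=32, frac_bits=12):
    """Interpret an unsigned raw integer as signed fixed-point and return a float.

    raw:        the value as read from the file (0 <= raw < 2**total_bits)
    total_bits: width of the stored value
    frac_bits:  number of bits after the binary point
    """
    if raw & (1 << (total_bits - 1)):
        raw -= 1 << total_bits  # undo two's complement for negative values
    return raw / (1 << frac_bits)
```

For example, with 12 fractional bits the raw value 0x1000 stands for 1.0, and the 16-bit variant would be read with `fixed_to_float(raw, 16, 12)`.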

CBGT/PAL and CDPTH

With those pesky models out of the way, I was ready to show the areas, right? Wrong. There’s yet another graphics format in Sonic Chronicles: CBGT, used for the area background images.

However, CBGT isn’t a Nintendo format. No, it’s one of BioWare’s own creation. It does, though, take inspiration from the Nintendo DS formats. It consists of blocks of 64x64 pixels, each compressed using the LZSS algorithm found in SMALL files, and each block divided into 8x8 pixel tiles. PAL files of the same name carry palettes, with each CBGT able to use a different palette within the PAL.

Since I already knew how to puzzle together those cells and tiles from the NCGR format, getting the image itself was not a problem. But I was at a loss where to get the dimensions of the image from, and how to distribute the palettes onto the cells. I figured out an algorithm for the latter, that worked for nearly all images, but the outliers still annoyed me. Then it hit me: for each CBGT/PAL pair, there’s a third file: a 2DA. And that one contains the information which cell uses which palette, neatly organized in a 2D table exactly how the cells are arranged in the final image. This, of course, is enough to calculate the final image dimensions as well.
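In other words, the shape of the 2DA gives the image size and each of its entries names the palette for one cell. A minimal Python sketch of that calculation, with a made-up function name and a plain nested list standing in for the parsed 2DA:

```python
CELL_SIZE = 64  # CBGT blocks are 64x64 pixels

def cbgt_layout(palette_table):
    """Derive image size and per-cell palettes from the 2DA contents.

    palette_table: list of rows, each a list of palette indices, arranged
                   the same way the cells are arranged in the final image
    Returns (width, height, cells), where each cell is a dict with its
    pixel position and the palette index it uses.
    """
    rows = len(palette_table)
    cols = len(palette_table[0]) if rows else 0

    cells = [
        {"x": col * CELL_SIZE, "y": row * CELL_SIZE, "palette": palette_table[row][col]}
        for row in range(rows)
        for col in range(cols)
    ]
    return cols * CELL_SIZE, rows * CELL_SIZE, cells

# Hypothetical table: 3 cells wide, 2 cells high
width, height, cells = cbgt_layout([[0, 1, 1], [2, 0, 1]])
```

Here the table is made up; the real values would come from the 2DA that accompanies each CBGT/PAL pair.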

[Images: Wrong palette distribution, Correct palette distribution]

I also found a fourth file for nearly each CBGT/PAL/2DA tuple: a CDPTH. Arranged in a similar fashion to the CBGT, it contains 16-bit depth information for each area background. This is used to let certain background pieces draw over the 3D models in the game, when the models should appear behind something.

[Image: Depth values]

Now I was ready to implement actual Sonic Chronicles stuff. I’ll describe that in part 3.

And further down the path of getting all targeted games to show areas I go. Previously, I wrote about my progress with The Witcher, Jade Empire and Neverwinter Nights 2. For the next two months, I took a look at the odd one out: the Nintendo DS game Sonic Chronicles: The Dark Brotherhood.

Yes, a Nintendo DS game. I wasn’t so sure myself that the game is actually a “proper” target for xoreos. I’m still not 100% sure, but I know now that it at least does use several BioWare file formats, as well as Nintendo DS formats. I also saw that some of those BioWare formats are used in Dragon Age: Origins as well, so Sonic Chronicles actually did provide a natural station on my path.

I’ll divide my report into three parts. In this post, I’ll go a bit into the details of those common BioWare file formats. In the next post, I’ll cover the graphics (which are mostly Nintendo formats). And the third post will show how I tied it all together in xoreos.

So, onwards to the BioWare formats.

GFF4

GFF is BioWare’s “General File Format”, which is used as the basis for many things in BioWare games. It’s an old format, already found in the Infinity Engine, but not quite as complex yet. (Correction: It seems I misremembered there; GFF is not used in the Infinity Engine. I apologize for this mistake.) Conceptually, it is comparable to XML[1]: hierarchical data, organized in a tree-like fashion, able to hold basically everything. As such, it’s used to describe areas, characters, items, dialogues, … Unlike XML, however, GFF is a binary format, not directly human-readable.

Since GFF is such an important format, xoreos already implemented a reader (thanks to BioWare releasing specifications for the Neverwinter Nights toolset). And we provide a tool to convert them into XML for easier readability, too. It was only, however, for versions 3.2 (used by Neverwinter Nights, Neverwinter Nights 2, Knights of the Old Republic, Knights of the Old Republic 2 and Jade Empire) and 3.3 (used by The Witcher). But Sonic Chronicles, Dragon Age: Origins and Dragon Age 2 needed a reader for versions 4.0 and 4.1, and boy did they change the format.

You see, after converting the GFF3 to XML, the whole thing is really quite readable and understandable. Every tag has a full string as a name, making the uses and intentions clear. But from the game’s perspective, this has a huge drawback: it’s slow. Strings are unwieldy, slow to read and compare, and variable length items are generally a pain when you want to quickly jump to a specific field. To curb that, GFF4 removes those pesky strings. Instead, fields use a single 32-bit integer as their “name”, making comparisons easy as pie.

[Images: GFF3 as XML, GFF4 as XML]

Lucky for me, the new GFF4 format is already documented on the Dragon Age Toolset Wiki. The huge amount of example files provided by the two Dragon Age games and Sonic Chronicles gave me ample opportunities to test out corner cases as well. Easy. The gff2xml tool mentioned above now supports GFF4 as well.

[1] In fact, BioWare generates their GFF4 files out of XML, as can be seen from the Dragon Age: Origins toolset.

TLK

Next up, I saw a new TLK format used in Sonic. TLK is a “talktable”, a list of strings indexed by a numerical ID. The idea is that you have all text used in the game in one place, easy to use and easy to translate. Since the format was already used in Neverwinter Nights, xoreos has a reader for it already. It’s relatively simple, too.

However, the new format is quite different. In fact, it’s a GFF4! I did say that you can basically stick everything in a GFF, right? That’s what they did for Sonic Chronicles (and the two Dragon Age games). With the new GFF4 reader, adding GFF4’d TLK support was quick and painless.

GDA

Just like the GFF4’d TLK, GDA is an old friend in a GFF4 suit. This time, it’s 2DA: a two-dimensional array, a table if you will. If you’re still lost, think Excel spreadsheet, a simple collection of data organized on a grid.

2DAs are used to, for example, specify the models of different objects. The GIT file describing objects in an area would say “Here’s an object, we call it Chair, it has Appearance 179”. The game then looks into appearances.2da, at row 179 and column “ModelName”, grabs the filename there and loads it as the object’s model.

GDA is, essentially, just the same thing, stored as a GFF4: a list of columns giving their name and type, and a list of rows with the data for each column. However… While real 2DA have an actual column name (the “ModelName”, for example), making guessing the meaning easy, GDA don’t actually store a name. They store a hash of the name (specifically, the CRC32 of the UTF-16LE encoded string in all lowercase), a number that’s meaningless in and of itself.

There’s 845 unique hashes in the GDA files found in Sonic. There’s no real way to turn them back into readable strings, but there’s a certain trick I could apply: a dictionary attack. I got myself a huge list of words found in a dictionary, hashed them, and compared the hashes. Then I extracted all strings I could find in the game (from the GFFs, mostly), and did the same. Then I combined the words of these lists. Then I combined matches. Each time, I manually went through the list to kick out the many, many false positives: strings that hashed to a valid number, but that don’t make sense in the context of the game (“necklessnoflyzone”, “rareuniquemummifications”, “properlyunsmoked”).
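The core of such a dictionary attack fits in a few lines. The Python sketch below assumes that the CRC32 in question is the common one implemented by zlib, which is an assumption of this example; the function names and the word list are made up too.

```python
import zlib

def gda_hash(name):
    """Hash a column name: CRC32 of the lowercased, UTF-16LE encoded string."""
    return zlib.crc32(name.lower().encode("utf-16-le")) & 0xFFFFFFFF

def dictionary_attack(unknown_hashes, candidates):
    """Return {hash: [matching candidate strings]} for all hashes we can explain."""
    unknown = set(unknown_hashes)
    found = {}
    for word in candidates:
        value = gda_hash(word)
        if value in unknown:
            # Several strings can hash to the same value, so keep all of them
            found.setdefault(value, []).append(word.lower())
    return found

# Hypothetical usage: pretend we only know the hash of a column name
target = gda_hash("ModelName")
matches = dictionary_attack([target], ["Label", "ModelName", "Appearance"])
```

Keeping every match per hash is deliberate: with a 32-bit hash and millions of combined candidate words, false positives are unavoidable and have to be weeded out by hand.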

Phew, that was a lot of tedious work. Still, I managed to find the source strings for 534 of those 845 hashes, 63%. Sure, there’s still 311 missing, but that’ll have to wait for later.

And that’s it for the common BioWare file formats. Tune in next time when I go over the graphical formats.