Sunday, December 30, 2007
Recently, my son has been playing the original Xbox game Thrillville. The game is solid on our original Xbox, but crashes with disturbing frequency on the Xbox 360, particularly during loading. This leads me to suspect that the timing differences between the original Xbox and its emulation on the 360 exposed threading issues that were always present but never surfaced because of the timing characteristics of the original console. Console game developers generally test against a single configuration per target platform. The only way we find threading issues, it seems, is through testing. So if testing fails to reveal real issues, games ship with them. This is mitigated, of course, by developing games for multiple multithreaded platforms (i.e., Xbox 360 and PlayStation 3) from the same code base.
Emulating Concurrent Code
This is one kink in a larger problem. The problem is developing a body of 'literature,' whether games or other software, that will be usable on future machines. Take, for example, the classic game Ultima VII. It is a beautiful and vastly influential role-playing game rendered unplayable by modern machines because of its hardware voodoo. (We also may occasionally need to use software such as old versions of WordPerfect, VisiCalc or Lotus 1-2-3 - and perhaps from non-PC machines - to read crucial, but old, data stored in those formats.) For such situations we typically resort to emulators such as the excellent DOSBox or AppleWin for maximum compatibility.
It is very difficult, if not impossible, to maintain timing-level compatibility in emulators. Older games that relied heavily upon machine-specific timing are now pretty much broken in emulators. For that matter, they're generally broken when the next generation of hardware arrives. Multithreaded code that contains hidden bugs (which is pretty much all code, multithreaded or otherwise) is implicitly and heavily timing-dependent. It's one thing to emulate an Xbox game, with its typically very low level of multithreading, on an Xbox 360; it may be quite another to emulate Xbox 360 games on an Xbox 720.
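To make "implicitly timing-dependent" concrete, here is a minimal sketch of a hidden bug of this kind. It is not taken from any shipped game; the names `run_workers`, `lossy_increment` and `atomic_increment` are made up for illustration, and it assumes a compiler with C++11's `<thread>` and `<atomic>`. The lossy version reads and writes the counter in two separate steps, so whether an update gets lost depends entirely on how the threads happen to interleave on a given machine.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Runs `threads` workers that each apply `increment` to a shared counter
// `perThread` times, then returns the final counter value.
template <typename Increment>
long run_workers(int threads, int perThread, Increment increment) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, perThread, increment] {
            for (int i = 0; i < perThread; ++i) increment(counter);
        });
    }
    for (auto& worker : pool) worker.join();
    return counter.load();
}

// Timing-dependent: the load and the store are separate steps, so another
// thread can update the counter in between and that update is silently lost.
inline void lossy_increment(std::atomic<long>& counter) {
    long seen = counter.load();
    counter.store(seen + 1);
}

// Timing-independent: one indivisible read-modify-write.
inline void atomic_increment(std::atomic<long>& counter) {
    counter.fetch_add(1);
}

long lossy_total(int threads, int perThread) {
    return run_workers(threads, perThread, lossy_increment);
}

long atomic_total(int threads, int perThread) {
    return run_workers(threads, perThread, atomic_increment);
}
```

`atomic_total(4, 10000)` always returns 40000. `lossy_total(4, 10000)` can return anything from 1 to 40000, and on a machine where the threads rarely overlap it may well return 40000 on every test run, which is exactly how a bug like this survives testing on one console and shows up on the next.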
As a result, I suspect that far more Xbox 360 and PlayStation 3 games will be unplayable on future hardware than Xbox and PS2 games on the 360 and PS3.
Thursday, December 20, 2007
Why Do We Use C++?
Now before I hear the platitudes about how [insert favorite non-C++ language here] is better anyhow and would fix everything in one shot, let me ask you how feasible it would be to develop multi-million line software on multiple, often brand-new, platforms simultaneously with extreme performance expectations, leveraging mega/gigabytes of shared legacy code and without having to port, develop and/or maintain one's own compilers and toolsets with this superior language. If the game development industry turns to non-C++ languages to solve this, it will be slowly and painfully. And, most likely, it will be an industry-wide move and not just one studio or another, although some must lead the way. Honestly, I would never expect C++ to go away completely (just as assembly has never really gone away), and some future approaches may just be layered on top of C++ (Bigloo, Intel Ct, OpenMP), or use it when performance is critical (Java's JNI).
C++ is flexible and fast. With sufficient (possibly enormous) effort, it can do almost everything any language can do. It can function both as a high-level multiparadigm language and as a low-level portable assembly language. What makes C++ (and C) so widespread is its unrestrictive nature. This is widely seen as a negative, but in the real world, being able to abuse your language to get what you want from your machine is of crucial importance - especially when performance is a primary concern. That said, you may only want to abuse your language say 2-5% of the time. The rest (95-98%) of the time, you'd like some nice, type-safe, bounds checking, memory-managed, interactive, memoizing language, giving you a >100% increase in productivity. That we have settled on C++ indicates that the 2-5% is so crucial that we're willing to sacrifice the rest for it. In the game industry, I think that's a fair statement.
Still, writing solid C++ code even in the absence of multithreading requires a mastery of nearly the whole language, making it dangerous for inexperienced developers. Even merely adequate C++ coding is heavily reliant on learned idioms that are not a part of the language, and therefore unenforceable. For example, the facts that you are allowed to return a pointer or reference to a stack-allocated object from a function, and that you are allowed to overflow a string buffer, have cost the world untold man-hours and dollars. And yet for all its ills, C++ ranks near the top of the most useful (or at least used) languages ever explicitly designed (including Esperanto).
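The two pitfalls named above, and the idioms that usually replace them, might look something like this. It is only a sketch: `make_name`, `format_greeting` and the "Avatar" string are hypothetical, and the dangerous versions are left in comments because running them is undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Legal C++ (most compilers only warn), but the returned reference dangles,
// because `s` is destroyed when the function returns:
//   const std::string& bad_name() { std::string s = "Avatar"; return s; }
// Learned idiom: return by value, so the caller owns the result.
std::string make_name() {
    std::string s = "Avatar";
    return s;
}

// Legal C++, but writes past the end of `buf` whenever `name` is too long:
//   char buf[8]; std::strcpy(buf, name);
// Learned idiom: always pass the buffer size and let snprintf truncate.
// Returns true if the whole greeting fit, false if it was cut short.
bool format_greeting(char* buf, std::size_t size, const char* name) {
    int needed = std::snprintf(buf, size, "Hello, %s", name);
    return needed >= 0 && static_cast<std::size_t>(needed) < size;
}
```

Nothing in the language forces either idiom. With an 8-byte buffer, `format_greeting` truncates the output to "Hello, " and returns false, but only because the author remembered to pass the size along.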
(4/11/08 - adapted from one of my comments in reddit.)
I should stress again that one of the more crucial issues surrounding language choice is the set of tools provided to us by the console vendors. C++ wasn't adopted until late in the console world because C++ standards weren't well supported by console vendors. Tools have always been poor on consoles compared to the PC (this has changed somewhat with the 360), and good quality C++ compilers were rare in previous console generations, but C compilers were available (although they, too, came late - with the original PlayStation or Saturn, I believe?).
The PC gaming world hasn't seen this sort of lag. Game developers are eager to adopt new technologies; on consoles, other technologies have simply been unavailable.
This is an exploratory post (as they all are..) and is subject to change.
Tuesday, December 18, 2007
Effectively developing concurrent software for modern machines with multiple cores is perhaps the greatest technical challenge we'll encounter in some time. Our current approaches just aren't up to the task of creating robust multithreaded code.
Please see the linked post for some thoughts on this topic. Before we look directly at concurrency, let's look at what brought us to where we are.
It’s reasonably well established that categorization is a fundamental human strength. You might say that feature extraction and classification are hardware-accelerated in the brain. Even at levels far below the conscious (i.e., the visual cortex) information is categorized before it even becomes 'thought' to us. In fact, categorization is so fundamental to human thought that assuming categories themselves are real objects has been a universal illusion. In “How the Mind Works”, Pinker states that nearly all cultures initially adopt a ‘folk idealism’ as a result of this. Applying this to the boundary of man and machine communication, Object Orientation has evolved as a straightforward way to map categories (type, class, etc.) and instances of those categories onto machine architectures that deal primarily in a few primitive and largely undifferentiated numerical types.
Abstractions are complexes of categories and their interactions. Understanding complexity in terms of hierarchies of abstraction is something that people do really well. Modular Programming and, to a greater extent, Object Orientation attempt to give us tools to work at these levels of human competence - with, of course, some consequences in terms of final performance.
The predominant programming paradigms attempt to map the way people think onto the way machines operate. Let's turn our attention to concurrency and see if we can stretch this a bit further.
Concurrency as Time vs. Space
In software development, one of the things I’ve noticed is that people are much better at understanding space than time. This is why we map out time in timelines, MS Project files and a million other ways in spatial form. What this means to programming is that anywhere you can map out state in terms of space, the result is far easier to comprehend. For a clear example of this, see Google's MapReduce.
This is the core issue with concurrency. It’s very difficult to understand the possible states of concurrent systems because they happen in time. Many of the abstractions that can help with concurrency (such as Functional Programming, Message Passing, etc.) are useful because they essentially transform a state-heavy process into some equivalent but more understandable spatial map whose design appears much more static.
In some ways the game industry is at the vanguard of multicore development on consumer machines with the Xbox 360 and the PlayStation 3. The amount of time and effort that we spend finding and fixing multithreading bugs is terrifying. While the hardware companies move toward doubling the number of cores with each processor generation, software companies will be reeling.
Clearly, C++, as it currently stands, is not well suited to developing software on highly multicore machines. Ideally, we’d have a language, extension or paradigm that would allow us to map thread concurrency onto easily understood (that is, spatial) language constructs that discourage or prohibit the kind of multithreading errors that currently waste hours and hours of our time. Right now, we don’t have to worry how many registers there are on the processor when we write C++ code. Similarly, whatever language/paradigm we’d want to use would, at the compile stage, optimally generate code for the number of cores available to it on the target platform.
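As a rough approximation of what such a 'spatial' construct could look like in C++ itself, here is a sketch in the map-and-reduce style. It assumes library facilities (`std::async`, `std::thread::hardware_concurrency`) that arrived with C++11, the function name `sum_of_squares` is made up, and the core count is discovered at run time rather than at the compile stage as wished for above. The data is laid out as one contiguous chunk per core; each task reads only its own chunk and hands back a partial result, so there is no shared mutable state to get wrong.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Sums the squares of `values` using one contiguous chunk per worker.
long long sum_of_squares(const std::vector<int>& values) {
    // hardware_concurrency() may report 0 when the count is unknown,
    // so fall back to a single worker in that case.
    const std::size_t workers = std::max(1u, std::thread::hardware_concurrency());
    const std::size_t chunk = (values.size() + workers - 1) / workers;

    // "Map": launch one task per chunk; each returns its own partial sum.
    std::vector<std::future<long long>> partials;
    for (std::size_t begin = 0; begin < values.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, values.size());
        partials.push_back(std::async(std::launch::async, [&values, begin, end] {
            long long sum = 0;
            for (std::size_t i = begin; i < end; ++i)
                sum += static_cast<long long>(values[i]) * values[i];
            return sum;
        }));
    }

    // "Reduce": combine the partial sums on the calling thread.
    long long total = 0;
    for (auto& partial : partials) total += partial.get();
    return total;
}
```

The caller never mentions threads or core counts, and the result is the same whether the machine has one core or sixteen. The catch is that this discipline is still only a convention: nothing stops someone from capturing a shared variable by reference in the lambda and reintroducing the very bugs this layout avoids.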
So would some sort of functional language be best? C++ with functional extensions? Erlang? Haskell?
What about programming models based more on hardware description languages such as VHDL and Verilog, which are inherently spatial? Would they map more effectively to multicore machinery since hardware languages describe processes which are inherently asynchronous?
In any event, we software developers have an interesting road ahead.
Wednesday, December 12, 2007
The world -
The dungeons -