I always feel like I should blog more about engineering-related stuff, since that's what I do all day.
For now, my earlier plans for bytecode to C++ compilation are on hold as we're looking into more immediate gains from aggressively optimizing the interpretation. One of the downsides of the hybrid C++ compilation/bytecode interpretation approach that I have in mind is that it will require a workflow change. The C++ will eventually need to be compiled and then linked to the executable. There's a pipeline/workflow bubble there that's not appealing. Still, I think it would be fantastic to get that built for critical core libraries of script (utilities) that don't change much.
Our bytecode is that of a dynamic language, and a terribly slow one as a result of the structure of the instruction set. Unfortunately, we use an off-the-shelf compiler, so we can't infer much that was in the original source but fails to show up in the bytecode output.
Did some memoization, in this case caching of variable and function lookups, with excellent (awesome!) results.
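As a rough illustration of that kind of lookup caching (a sketch under assumptions, not the engine's actual code): the first lookup of a name walks the scope chain, and the cache remembers which scope held it so repeat lookups skip the walk. The names Value, Scope, and LookupCache are hypothetical, and a real interpreter would likely key the cache on something cheaper than a std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for script values and scopes (innermost scope last).
struct Value { int payload = 0; };
using Scope = std::unordered_map<std::string, Value>;

class LookupCache {
public:
    explicit LookupCache(std::vector<Scope>& scopes) : scopes_(scopes) {}

    // Returns the binding for `name`, or nullptr if no scope defines it.
    Value* find(const std::string& name) {
        auto hit = cache_.find(name);
        if (hit != cache_.end()) {
            // Fast path: go straight to the scope that held the name last time.
            ++hits_;
            auto it = scopes_[hit->second].find(name);
            assert(it != scopes_[hit->second].end() && "stale cache entry");
            return &it->second;
        }
        // Slow path: walk from the innermost scope outward, then remember the result.
        for (std::size_t i = scopes_.size(); i-- > 0;) {
            auto it = scopes_[i].find(name);
            if (it != scopes_[i].end()) {
                cache_[name] = i;
                return &it->second;
            }
        }
        return nullptr;
    }

    // Must be called whenever bindings are added, removed, or shadowed.
    void invalidate() { cache_.clear(); }

    std::size_t hits() const { return hits_; }

private:
    std::vector<Scope>& scopes_;
    std::unordered_map<std::string, std::size_t> cache_;  // name -> scope index
    std::size_t hits_ = 0;
};
```

The catch with any scheme like this is invalidation: in a dynamic language a name can be rebound or shadowed at runtime, so the cache has to be flushed (or versioned) whenever that happens.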
A clever universal hashing scheme for both special strings and user strings opened up a lot of possibilities.
Close attention paid to string handling yielded good results.
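One common way to make string handling cheap in an interpreter is interning: each distinct string is stored once and referred to by a small integer id, so equality checks and table lookups become integer operations. Whether that matches what was done here is an assumption on my part; the sketch below only shows the general technique, and StringTable is a hypothetical name.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical string table: maps each distinct string to a stable integer id.
class StringTable {
public:
    using Id = std::uint32_t;

    // Returns the existing id for `text`, or assigns the next free one.
    Id intern(const std::string& text) {
        auto it = ids_.find(text);
        if (it != ids_.end()) {
            return it->second;
        }
        Id id = static_cast<Id>(names_.size());
        names_.push_back(text);
        ids_.emplace(text, id);
        return id;
    }

    // Recovers the original text, e.g. for debugging or printing.
    const std::string& text(Id id) const { return names_.at(id); }

private:
    std::unordered_map<std::string, Id> ids_;  // text -> id
    std::vector<std::string> names_;           // id -> text
};
```

Strings the runtime cares about specially could be interned first at startup so they get fixed, known ids, while user strings pick up ids as they appear.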
Thread synchronization primitives were killing us. We usually try to keep our assembly language to a bare minimum for portability's sake. This translates to a handful of primitives to improve PS2 performance. But to rid ourselves of our most time-consuming locks, we applied asm to a couple of critical spots where we need to read in, modify, then write out whole cache lines as atomic operations on the PowerPC machines (PS3, 360). This avoids the expense of a more heavyweight lock such as a mutex in those places and gives us back a lot of performance for multithreaded operation.
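For a feel of the read-modify-write pattern without any asm, here is a minimal portable sketch using std::atomic. It is not the code described above: it works on a single 32-bit word rather than a whole cache line, and atomic_modify is a hypothetical helper name. On PowerPC, compilers typically lower a compare-exchange loop like this to a load-reserve/store-conditional (lwarx/stwcx.) pair.

```cpp
#include <atomic>
#include <cstdint>

// Atomically replaces the value in `word` with modify(old value) and returns
// the old value. No mutex is taken; if another thread changes `word` between
// our read and our write, the write is rejected and we retry.
template <typename Modify>
std::uint32_t atomic_modify(std::atomic<std::uint32_t>& word, Modify modify) {
    std::uint32_t observed = word.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak refreshes `observed` with the current
    // value, so each retry recomputes modify() from up-to-date data.
    while (!word.compare_exchange_weak(observed, modify(observed),
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed)) {
    }
    return observed;
}
```

The appeal over a mutex is that the uncontended case is a handful of instructions, and a thread that gets preempted mid-update never blocks anyone else; the cost is that modify() may run more than once, so it has to be free of side effects.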
Anyhow, we've significantly improved the bytecode execution performance, which is something the game teams have been asking (begging!) for for ages.
Of course, it's never fast enough.. ;)