Sunday, December 29, 2013

26.0, trouble in paradise

So over Christmas weekend I managed to get a debug build of TenFourFox 26 up and running. It basically works, though I figured it would (29, the first Australis release, is what worries me more). So that's the good news.

The bad news is that there are three major problems, one of which is worked around only incompletely, all of them having to do with the graphics stack. Recall that starting with Firefox 12, Mozilla introduced a scheme called Azure to reduce the overhead of graphics drawing by mapping graphics calls more directly into operating system primitives rather than running them through Cairo, the abstracted graphics system Firefox has used for almost everything since 3.0. We support Azure for HTML5 <canvas> elements, and in that employ it works very well, improving overall canvas performance for most shapes by about 60 percent.

Where our implementation falls flat is text and gradients; we need lots of hacks for 10.4 to make this work, and the overhead for these specific elements is significantly greater. (Text filled with gradients is even worse.) Unfortunately, a browser's primary job is to render lots and lots of text, and Mozilla now wants to use Azure to do the rendering for everything (relegating Cairo to a backup engine, and for printing). I have to disable "content Azure" in 26, or the browser renders things about three times slower overall in the debug build (and in an opt build we'd lose AltiVec-accelerated compositing through pixman too). I think we can get away with this for a while since Cairo is still an integral part of the layout stack for now, but if Mozilla finds another printing solution then Cairo becomes expendable. I'm going to try to do more research to figure out if we can speed this up, but remember that the vast majority of supported Macs running Firefox have hardware acceleration and we don't. It's entirely possible that the combination of 10.4 graphics thunks and their hardware-tuned rendering strategy is just too much for Macs limited to software rendering like us.

On top of that, our current Azure implementation exposes bugs in 10.4 CoreGraphics that, although not obviously severe, generate invalid context errors and other such annoyances that suggest we're just wallpapering over more significant problems. I haven't figured out where these errors come from yet (I suspect DrawTargetCG::FillRect), and because of their frequency I can't enable content Azure in TenFourFox until I'm forced to.

The third problem is the most serious, and the one I don't know how to fix definitively. One of 10.6's new system features is blocks, a construct for creating closures in C, C++ and Objective-C that allows applications to better exploit Grand Central Dispatch: you can generate little snippets of code, wrap them up as a block, and pass them around in such a way that they can remember their original state when finally run and even execute in parallel. Blocks require both runtime support from the operating system to handle and execute them, and compiler support to understand the new syntax and generate the closure code. Apple implemented this in Xcode using their fork of gcc, but blocks are really a better first-class citizen in clang, and regular gcc doesn't support them.

Blocks can be added to 10.5 using PLBlocks, which offers a modified gcc (presumably using Apple's patches) and a userland framework for runtime support. Tobias himself already uses this package for Leopard WebKit, so we have good evidence it should work fine for this purpose. Although this framework does not exist for 10.4, it looks like it doesn't require any 10.5 Objective-C features to function, so it could probably be ported. That's not the real problem. The real problem is the compiler: we don't use gcc 4.0.1 or 4.2 anymore, and Apple never ported blocks to a later version (we use 4.6), so we'd have to roll this support ourselves off Apple's patches. Even assuming it works (which is a big if), I really don't want to be maintaining a compiler and an entire tool chain on top of a browser, linker and debugger, especially since it's likely we'll have to force another compiler change in the not-too-distant future (fortunately MacPorts already offers 4.8 and it works fine on 10.4). David Fang is still industriously working away on a PowerPC OS X clang, but I don't know how far along he is with it or if its codegen for blocks will work with the PLBlocks runtime, which we'll still need.

Fortunately, Mozilla uses blocks in a very limited way and only within the OS X widget library as callbacks for graphics calls, since no compiler other than Apple's supports them. For the time being, these callbacks (which so far appear to only apply when hardware acceleration is active) can be partially emulated by spinning the closure out as a static function that can be passed as a function pointer. I say partially, because what this doesn't emulate is the, you know, closure part: being static functions they don't have access to class member properties, and even if they did (or we figure out some Rube Goldbergian bridge class) they have no memory of those properties' values at creation, so it's possible for them to see the wrong value when the callback is triggered. I hacked around this and the app seems to be fine, but that's no guarantee it will continue to be. As Mozilla tries to optimize Firefox more for multi-core systems and Off-Main Thread Compositing looms on the horizon, the use of blocks in Mac-specific code is likely to increase because it's what Apple wants developers to do and it's relatively straightforward for developers to use, but it's going to be a big problem for us if that code is an essential part of the application.
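To make the "closure part" concrete, here is a minimal sketch of the difference, not the actual TenFourFox hack. Every name in it (Widget, ScaleSnapshot, DrawCallback, RunLater and so on) is made up for illustration and none of it comes from the Mozilla widget code. It assumes the callback can be handed an opaque context pointer, which a real API may or may not allow. The first callback reads the live object when it finally runs; the second reads a copy taken when the callback was set up, which is roughly what a block would have captured for you.

```cpp
// Hypothetical stand-in for a widget object whose state changes over time.
struct Widget {
    int scale;
};

// C-style callback: a plain function pointer plus an opaque context pointer.
// (Assumption: the consuming API lets us pass a context along.)
typedef int (*DrawCallback)(void* context);

// Naive emulation: the static function reads the live object when it runs,
// so it sees whatever the value is at call time, not at creation time.
static int ReadLiveScale(void* context) {
    return static_cast<Widget*>(context)->scale;
}

// Manual "capture": copy the values we need when the callback is created.
struct ScaleSnapshot {
    int scale;
};

static int ReadSnapshotScale(void* context) {
    return static_cast<ScaleSnapshot*>(context)->scale;
}

// Stands in for the system invoking the callback at some later point.
static int RunLater(DrawCallback callback, void* context) {
    return callback(context);
}

struct DemoResult {
    int live;      // what the naive static function reports
    int snapshot;  // what the snapshotting version reports
};

static DemoResult RunDemo() {
    Widget widget = { 1 };
    ScaleSnapshot snapshot = { widget.scale };  // state at "block creation"
    widget.scale = 2;                           // state changes before the callback fires

    DemoResult result;
    result.live = RunLater(ReadLiveScale, &widget);
    result.snapshot = RunLater(ReadSnapshotScale, &snapshot);
    return result;
}
```

In this sketch the snapshot has to outlive the callback, which is exactly the bookkeeping the blocks runtime would otherwise do on our behalf.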

26 will be issued as changesets only and maybe a debug build. The first unstable release will either be 29-aurora, if Australis works, or a new 24 branch if it doesn't. Cross your fingers.

6 comments:

  1. Hi,
    I want to express my gratitude and support for you in your work to maintain support for those of us dedicated to the PowerPC...
    I wish I had the know-how to help, but I figure I can at least let you know your efforts are appreciated and put to good use.
    It only takes a handful of you wiz guys to knock the legs out from under the pressure to be always procuring the newest stuff.
    Your project is of particular value since internet compatibility is always the greatest of the pressures (It's what drove me from Panther to Tiger, for sure). Much Thanks!

  2. Until clang works on PPC OS X, for Leopard WebKit I replaced most (or even all?) C++11 lambdas with Apple Blocks because that's supported by GCC 4.2.
    As long as you don't need the lambda to be an Objective-C object you could replace the Blocks with lambdas - and it should actually be possible to write a Block-compatible Objective-C container class for C++11 lambdas.

    As for the graphics stack, that seems a real problem to me. You might consider doing something more drastic and decide to switch to 10.5, or stay on 10.4 but switch to X11...

    Replies
    1. You know, I didn't think of lambdas. That's a great suggestion. Right now they're just C++ classes using them, so that should work. I'll play with that.

      The invalid context errors worry me too. I need better visibility into what's causing CoreGraphics to barf. The app runs and the graphics work, but I'm really concerned about the number of problems content Azure is generating on 10.4.

  3. Over the winter holidays, there were some struct layout fixes that allowed powerpc-darwin8 clang to bootstrap in 3-stages at -O0 (for the first time ever), starting with gcc-4.0. Stage-3 alone took 3+ days, due to stage-2 being unoptimized. Nevertheless, this is cause for celebration. However, we are far from done. ABI patches are still rolling in, one unwind info patch just went in, -O1 has known issues, and libc++ (using stage-1 clang) still fails numerous tests. I have not yet attempted to use blocks support. We are (slowly) racing to get this port up and running before the llvm/clang code base starts requiring C++11.

    I hang out in #llvm-powerpc-darwin@irc.oftc.net, stop by and chat!

    Replies
    1. Good work. But why not bootstrap it with, say, MacPorts gcc instead of the Xcode gcc?

