There are two regressions I am tracking and both are big question marks. Chris mentioned a WebM performance regression; I can't reproduce this on the quad G5, but the G5 never had a problem with WebM even before the AltiVec patches went in, and my usual iMac G4 tester never ran WebM without a problem. I've turned the loop and bilinear filters down as low as possible, with fortunately minimal visual change on most YouTube videos, and made sure that video frames are being passed directly to the compositor. I hope that helps; I don't really have anything else to offer there. libvpx changed dramatically in 29 and it might be that it's just too heavy now for the G4s that used to run it acceptably.
I'm pretty sure, by the way, that OMTC (off-main-thread compositing), which was also enabled in 29, is not to blame for this. OMTC puts the compositor on a separate asynchronous thread so that compositing graphics layers together doesn't block the main business of the browser. Multithreading is not great on Power Macs running OS X and is worse in 10.4, so OMTC makes some sites scroll better and some sites scroll worse. However, OMTC is a prerequisite for future features like Async Pan-Zoom, which finally lets you move around on the page while the browser is working on it, and Mozilla is openly warning they won't support synchronous on-main-thread compositing for much longer. OMTC is the future and fortunately it works on Tiger, so we're moving forward with it.
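To make the idea concrete, here is a toy sketch of the hand-off OMTC relies on: the "main thread" queues up layer snapshots and immediately moves on, while a separate thread does the compositing. This is only an illustration in Python, not Gecko code; `run_compositor_demo` and the layer names are made up for the example.

```python
import queue
import threading


def run_compositor_demo(frame_count):
    """Hand layer snapshots to a compositor thread without blocking the producer."""
    pending = queue.Queue()   # layer snapshots waiting to be composited
    composited = []           # appended to only by the compositor thread

    def compositor():
        while True:
            layers = pending.get()
            if layers is None:        # sentinel: no more frames are coming
                break
            # "Compositing" here is just flattening the layers into one string.
            composited.append("+".join(layers))

    worker = threading.Thread(target=compositor)
    worker.start()

    for n in range(frame_count):
        # The main thread only enqueues and returns at once; it never waits
        # for the frame to be drawn.
        pending.put([f"background{n}", f"content{n}"])

    pending.put(None)   # tell the compositor to finish
    worker.join()
    return composited
```

The catch on our machines is that the hand-off and the extra thread are not free, which is why the benefit varies from site to site.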
The second regression is JavaScript. While generational garbage collection (GGC) improves browser memory usage even more, it comes at a price; the changes for tracking these new short-lived objects add further overhead to the JIT and even with some additional fiddling we still take a hit of about 10% relative to 24. The G5 takes it on the chin even more, about 15%. This is still substantially better than the naked interpreter, but the interpreter is now so deoptimized that this isn't saying much anymore.
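For the curious, the overhead comes from the bookkeeping a generational collector needs on every pointer store. The following is a deliberately tiny model of that idea, assuming a two-generation heap with a store buffer; it is not how SpiderMonkey or PPCBC actually implement it, and `Obj`, `ToyHeap` and their methods are hypothetical names for this sketch.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.tenured = False   # new objects start in the nursery
        self.fields = {}


class ToyHeap:
    def __init__(self):
        self.store_buffer = []   # (owner, field) edges from tenured to nursery objects
        self.barrier_checks = 0  # how many stores paid for the barrier test

    def write(self, owner, field, value):
        owner.fields[field] = value
        # Post-write barrier: every store pays for this test,
        # even when nothing ends up being recorded.
        self.barrier_checks += 1
        if owner.tenured and isinstance(value, Obj) and not value.tenured:
            self.store_buffer.append((owner, field))

    def minor_collect(self, roots):
        """Tenure nursery objects reachable from roots or from the store buffer."""
        worklist = [r for r in roots if not r.tenured]
        worklist += [owner.fields[field] for owner, field in self.store_buffer]
        survivors = []
        while worklist:
            obj = worklist.pop()
            if not isinstance(obj, Obj) or obj.tenured:
                continue   # plain values and already-tenured objects are skipped
            obj.tenured = True
            survivors.append(obj.name)
            worklist.extend(obj.fields.values())
        self.store_buffer.clear()
        return sorted(survivors)
```

The point of the store buffer is that a minor collection only has to look at the nursery plus the recorded edges rather than the whole heap, but in a JIT the check in `write` has to be emitted into the generated code for stores, which is where the extra cost shows up.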
The biggest decline came from a particular type of analysis that is built into Ion; since we don't run Ion, the analysis never ran, and PPCBC generated terrible code as a result. I hacked this back together and it now works mostly as it did before, but this makes the definitive solution clear: we need IonMonkey working on PowerPC or we'll end up incessantly running into these kinds of edge cases in future versions. Plus, more to the point, the Baseline Compiler was never intended as a definitive JIT solution despite the fact we're using it like one. My previous efforts with Ion ran aground in the bailout code, and I think the main problem is related to Ben's sophisticated but complex branch optimization we used in JaegerMonkey (the previous JIT). That needs to be decoupled from the low-level assembler, which I'm working on over the weekend, and then Ion needs to be dragged back up to a testable state. If I can make progress on that, I am considering delaying 31 until Ion passes tests (no asm.js; I continue to believe that it's irrelevant on big-endian because of all the little-endian code compiled for it) like we did with 24. If I can't, we'll just do the best we can.
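On the asm.js aside, the endianness concern can be shown in a few lines. asm.js code treats memory as one flat byte heap and reads it through views of different widths, so code compiled on the assumption of little-endian layout sees different bytes on a big-endian machine. The sketch below simulates both byte orders with Python's `struct` module; `low_byte_of_heap_word` is a made-up helper for illustration, not anything from asm.js itself.

```python
import struct


def low_byte_of_heap_word(value, byte_order):
    """Store a 32-bit integer in a byte heap, then read the byte at offset 0."""
    heap = bytearray(4)
    fmt = "<I" if byte_order == "little" else ">I"
    struct.pack_into(fmt, heap, 0, value)
    return heap[0]
```

Storing `0x11223344` and reading offset 0 gives the least significant byte in the little-endian case and the most significant byte in the big-endian case, so code that depends on one layout silently gets the wrong answer on the other.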
For now, the plan is a 31 beta shortly after 24.6.0; the beta will still use PPCBC unless I have a major breakthrough. After that we'll see. 24.6.0 is still on schedule for release the second week of June.
For PPC Linux users, the bug that made Firefox unable to open has now been fixed, thanks to the wizards at Mozilla: https://bugzilla.mozilla.org/show_bug.cgi?id=961488