Sunday, December 21, 2014

31.3.1pre test build available

Everyone needs a break, so I took a break from my Master's project to experiment with a few ideas I've been ruminating over (it's also a convenient excuse to goof off). We're always looking for more performance from our older machines and sometimes this requires rewriting or changing assumptions of the code where this can be done without a lot of work.

Multiprocessing/multithreading has been one of those areas. Since around Fx22 Mozilla has been using background finalization for JavaScript (so background finalization was in 24 and 31 but not in 17), which is to say that the deallocation of objects is not done on the main thread; finalization is a separate, though related, process to garbage collection, in which objects with no remaining references are finalized and reclaimed. This, at least at first blush, would seem like an unmitigated win, especially on machines with multiple CPUs -- you just do the work on another core while you're working on something else. But interestingly it has some very odd interactions on my test systems after long uptimes. Some profiling I did while TenFourFox just seemed to be sitting there waiting showed that it wasn't TenFourFox (per se) that was twiddling its thumbs; the OS X kernel was stuck in a wait state as well, and it seemed to have been kicked off by ... background finalization, waiting on threading management in the kernel, which had temporarily deadlocked. This may be the source of some unexplained seizing up that a few people have complained about and that we had no obvious way to reproduce.

We can't do anything about the kernel (this is another reason why I'm very concerned about how Electrolysis will perform on 10.4/10.5), but background finalization can be defeated with a few lines of easily ported code. This doesn't make the browser faster objectively, mind you; it just "spreads the badness around" so that finalization occurs predictably, and in a manner that makes it more likely to complete, which in turn makes garbage collection more likely to complete (which since the fix in 31.3 now can run on a different core and doesn't appear to trigger this specific issue), which in turn reduces the drag on performance. In effect, this rolls this aspect of JavaScript back to Fx17.
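
To make the distinction concrete, here is a toy model of the two modes, written in Python rather than the browser's C++. Everything in it (the `Finalizer` class, its methods, the "finalizer" thread name) is made up for illustration and is not the actual SpiderMonkey code; it only shows where the cleanup work runs in each case.

```python
import queue
import threading


class Finalizer:
    """Toy finalizer that runs cleanup inline or on a helper thread.

    Hypothetical illustration only; not how SpiderMonkey is structured.
    """

    def __init__(self, background):
        self.background = background
        self.log = []                      # (object name, thread that finalized it)
        self._pending = queue.Queue()
        self._worker = None
        if background:
            self._worker = threading.Thread(target=self._drain, name="finalizer")
            self._worker.start()

    def _record(self, name):
        self.log.append((name, threading.current_thread().name))

    def _drain(self):
        # Helper thread: finalize queued objects until told to stop.
        while True:
            name = self._pending.get()
            if name is None:
                return
            self._record(name)

    def finalize(self, name):
        if self.background:
            self._pending.put(name)        # hand off; the caller continues at once
        else:
            self._record(name)             # finished before the caller continues

    def shutdown(self):
        # Stop the helper thread (if any) after it has drained the queue.
        if self._worker is not None:
            self._pending.put(None)
            self._worker.join()
```

With `background=False`, every `finalize()` call has completed by the time it returns, which is the predictable behaviour described above. With `background=True`, the caller returns immediately and completion depends on the helper thread getting scheduled, which is the part that relies on the OS's thread management.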

The result is more consistent and smoother, even if it's not truly faster -- especially on the G5 where bouncing around in code generally hobbles performance -- and does not seem to affect uniprocessor systems adversely, but I'd like to get a test out where you can play with it and see. You should not expect much difference immediately when you start; in fact, startup time might even be slower, and memory usage will show little if any improvement. This is just to improve the browser's responsiveness in terms of staying reasonably quick after multiple compartments have been allocated and need to be scanned.

Speaking of, I also implemented the reduction in tab undos (to 4) and window undos (to 2), which also reduces the number of compartments that must be scanned. You can override this from about:config, but be aware that every tab and tab-undo-state you keep in memory remains active, and so the garbage collector must evaluate it.

I also threw in one other change that forces substantial additional buffering of video playback, just to see how this works out for you lot. Like the change in finalization, this "spreads the badness" by forcing the browser to build up a large backing store of fully decoded video frames before playback (up to 10 seconds' worth depending on available memory). Videos may appear to stall initially, but can then play more smoothly because decoded frames are now buffered more aggressively as well, not just downloaded video data. This quad G5 shows improvement only in that playback is more consistent, but on my 1GHz iMac G4 many standard-definition videos on YouTube are now noticeably less like a slideshow with audio. It's possible to overdo this setting, so I've settled on a conservative number that seems to work decently for the test machines here (it's baked into the C++ code, sorry -- you can't twiddle this from about:config). The minimum recommendation for video is still a 1.25GHz G4.
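
The buffering idea can be sketched roughly as follows. This is a minimal Python illustration, not the real media code: the `PrebufferedPlayer` class, its method names, and the frame rate and threshold used are all assumptions made up for the example, and the actual number baked into the build is not shown here.

```python
from collections import deque


class PrebufferedPlayer:
    """Toy player that holds back playback until enough frames are decoded.

    Hypothetical illustration; names and numbers are not from the browser.
    """

    def __init__(self, fps, prebuffer_seconds):
        # Number of decoded frames required before playback may begin.
        self.threshold = int(fps * prebuffer_seconds)
        self.frames = deque()
        self.playing = False

    def on_frame_decoded(self, frame):
        # The decoder pushes finished frames here, ahead of playback.
        self.frames.append(frame)
        if not self.playing and len(self.frames) >= self.threshold:
            self.playing = True

    def next_frame(self):
        # Returns None while still prebuffering (the initial "stall"),
        # or if the buffer has run dry during playback.
        if not self.playing or not self.frames:
            return None
        return self.frames.popleft()
```

The trade-off is visible in `threshold`: a larger value means a longer initial stall and more memory held in decoded frames, but more slack for the decoder to fall behind before playback stutters. This sketch does not rebuffer after an underrun, which a real player would need to handle.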

The tab undo and buffering changes will be carried into 38, but 38 introduces generational garbage collection (to us), and I have to do some testing to determine whether the finalization change will interact badly with GGC's assumptions. Please note that I only built this for G5 and 7450, mostly because I want testing on a good mix of single and multi-CPU systems (7400 users can use the 7450 build if you really want to, but sorry, G3 folks, you'll have to wait for the next scheduled release). On the bright side, I did the building on my new external solid state drive, which reduces build overhead by as much as 30%. Not bad! I'll post some stuff about the RAM disk and SSD build testing I've been playing with in a future entry.

Downloads available from the usual place.

12 comments:

  1. I am posting from the pre-test-build.

    I can't say that I notice any more zip to the casual eye. However, a couple of more complicated sites didn't seem a bit more responsive, but again, that's just an eyeball test. Really appreciate the hard work you do on this browser.

    My wondering as of late is if there are any legitimate adblock options that don't go the route of Adblock-Edge, eating up CPU. It seems like stuff like Adblock (and other add-ons) are going to continue to bog down the speed potential of TenFourFox in the future. Still, I can't stand annoying ads.

    Replies
    1. What type of Mac? Single processor systems may benefit less from this change. At least it doesn't regress you, so that's good to hear.

      Does video/YouTube performance change?

      I don't really see much CPU drag from Adblock Edge (I don't use AB Plus anymore), and on some sites they're absolutely intolerable without it.

    2. Dual Core 2.3 Power Mac G5

      And by the way, "didn't" in that third sentence should be "did". But that's hardly a scientific qualification. If there was a change, it's probably in a millisecond here or there, which to the naked eye is tough to see.

      Yeah, Adblock Edge is my go-to, as well. What I get concerned about is the reports that the add-on renders the whole page first, then cuts out the ads. On the other hand, some add-ons use filter lists to block them incoming, which may speed things up. I figure that adds some serious overhead, but maybe this calls for some testing without any add-ons loaded to see if it does feel faster.

    3. I use BluHell Firewall. Works and no real CPU hit.

      https://addons.mozilla.org/en-us/firefox/addon/bluhell-firewall/

  2. Finally have been able to download the test build; somehow SourceForge had an error 500 earlier.

    Anyhow, seeking/caching on e.g. HTML5 YouTube now works better, without completely stalling the OS/Firefox interface and bringing things to a halt. I usually pause videos and wait till they are loaded; still, the actual playback is not of much use, as I get only about 2-3 frames a second (though now, as you described and as I noted above, the system remains much more responsive and without further interruption).

    Sadly only Leopard Webkit plays videos almost stable at 30 fps, slight stutter here and there and with seek and caching but otherwise useable.

    This is on my iBook G4, 1.25 GHz, 768 MB RAM, OS X 10.5.8

  3. Well, guess I spoke too soon; it seems to depend heavily on which site you try. While it was working better on YouTube as described above, Valve's Steam store with its annoying autoplay of trailers left TenFourFox unresponsive and ultimately crashing (I had 2 more tabs open, though). On the second try I had just one tab, which was Steam's site, and after the first trailer was done I had a brief moment of control to switch to a picture and abort the second trailer.

    http://store.steampowered.com/app/274310/?l=german

    Replies
    1. I can't reproduce that here. My iBook doesn't play the videos well, but it does play them, and I don't get a freeze. The G5 plays everything fine.

    2. Well, it only crashed once on this page so far: http://store.steampowered.com/app/274310/?l=german but the site is practically unusable for video, as it immediately starts playing the trailer before I can even uncheck its autoloading or switch to a picture of a game or otherwise control anything displayed. Also, compared to YouTube I have no say whatsoever on the quality/size of those trailers, so that might be part of the problem there.

      I imagine that with two tabs open and the site doing its thing on a third tab with its videos, it was just too much, at least at one point.

    3. P.S. In case that wasn't clear from my post: despite the crash and performance issues, the videos do play on steampowered as well.

    4. I'm not sure if the tabs have anything to do with it -- both my iBook and (I just tried it) iMac G4 had a number of other tabs open and active. The iMac doesn't play the videos well either, but it does play them, and I don't have any hangs or crashes.

      Memory might be a factor, though. Both my iBook and iMac have substantially more RAM than your machine (the iMac has 1.25GB and the iBook has 1.5GB, and they both run Tiger).

  4. Mixed bag:

    I see no difference at all in video playback. If anything, 31.3.1pre is a little less performant. But this will probably depend on the videos you test with. I use these
    https://www.youtube.com/watch?v=czvsSqg7HwE (easy)
    https://www.youtube.com/watch?v=EOvPO8mW1Mw (hard)
    For comparison, the "hard" video plays almost flawlessly on 24.6.0 but with much stuttering on both 31.3.0 and 31.3.1pre. The "easy" one plays ok on 31, too.

    On the other hand, memory management, i.e. freeing of RAM after high usage and responsiveness during this process, is dramatically better. The browser is now able to free RAM much faster (and with clearly better success) than before. It's very obvious both when using the browser and when looking at Activity Monitor. I just kicked the browser really hard to use >1200 MB of RAM by opening about 50 Facebook tabs, and it went down to ~520 MB pretty much instantaneously after the tabs were closed. It stayed usable all the time.

    10.5, G4 1.33 GHz.

    Replies
    1. Good, that's encouraging. The fact that video playback is no(t much) worse is also noteworthy, though I'm surprised there was no major improvement. Maybe it only helps much on low-spec systems.

