Thursday, November 29, 2012

17.0.1 released, and jawing about the WiiU CPU

17.0.1 release candidate is available. Near as I can tell from releng, Mozilla plans to release this tomorrow. Please test it today. Changesets are available.

I am still stymied by the locking problems we have on some sites, where the browser spins its wheels in semaphore wait. It's only a minority of sites, thank goodness, but it even occurs on some pure HTML/CSS sites with no JavaScript at all, implying that the hangup is within layout. Once I get a debug build of Fx19 mounted, I want to review this problem in detail; it seems there should be a simple solution.

The fanboi boards and the usual blogosphere suspects are aflutter with "WiiU CPU sux" based on a couple of tweets made by self-proclaimed Nintendo WiiU hacker Hector Martin. The triple-core PowerPC CPU in the WiiU, codenamed "Espresso," is, according to Martin's measurements, clocked at only around 1.24GHz per core, compared with around 1.6GHz (when multithreaded; 3.2GHz max) for the Xbox 360's tri-core PowerPC-based Xenon. Given that apparently dismal clock speed, is this another case of the megahertz myth reborn?

Both the preceding Nintendo GameCube's Gekko and the Wii's Broadway were evolved forms of the G3, specifically modifications of the PowerPC 750CXe and 750CL. IBM customized Gekko's FPU with SIMD instructions (sort of an "AltiVec Lite") to facilitate media processing and built a fast path from Gekko to the GameCube's Flipper GPU, and then took that same basic design and essentially cranked up the clock to generate Broadway. The systems run at 485MHz and 729MHz respectively. Even to this day IBM continues to make custom versions of the venerable G3; it's cheap to produce, simple to modify and can be ground out in quantity from their secret underground base in Fishkill.

Despite this, IBM can make, and has made, other kinds of application-specific PowerPC designs. The Cell in the PlayStation 3 is the most obvious example with its PowerPC "PPE" core and satellite SPEs, and the Xenon in the Xbox 360 is three PPE cores stuck together; Xenon even has modified AltiVec instructions ("VMX128"). As if to confirm IBM was trying something new with Nintendo, during the WiiU's hype-y-moon Nintendo marketing billed the processor as "the same processor technology found in Watson." We assumed this to be a POWER7 derivative, given that Watson was IBM's POWER7-based know-it-all computer cluster that pretended to win Jeopardy.

However, Martin claims that the WiiU CPU is also, once again, another morph of our old friend the PowerPC 750 with some additional cache. Well, I'm dubious, and the reason is that the 750 and its ancestor, the 603e, were never designed for multicore environments (and Espresso has three). The only multi-CPU 603 I ever met was the BeBox, and the BeBox's design hobbled its two CPUs because it had to overcome the 603's incomplete cache coherency with glue logic (the 603 and the G3 do not support all five MERSI states necessary for multiprocessing; they are only MEI). I've never met a multiprocessor G3, and Apple certainly never made one. In fact, early 7400 G4s have the same limitation.
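To make the coherency point concrete, here is a toy sketch (my own simplification, not a model of the real 603/750 bus or of the BeBox's glue logic) of two CPUs taking turns reading the same cache line. It compares an MEI-like protocol, where only one cache may hold a line at a time, against a protocol with a Shared state. MESI stands in for MERSI here, and the function name and access pattern are purely illustrative.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # unused below: this toy only models reads
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def count_bus_reads(accesses, has_shared_state):
    """Count bus read transactions for one cache line read by several CPUs.

    accesses: sequence of CPU ids, each performing a read of the same line.
    has_shared_state: True models a MESI-like protocol, False an MEI-like one.
    """
    state = {}  # CPU id -> State of the line in that CPU's cache
    bus_reads = 0
    for cpu in accesses:
        if state.get(cpu, State.INVALID) is not State.INVALID:
            continue  # hit in the local cache, no bus traffic
        bus_reads += 1
        holders = [c for c, s in state.items() if s is not State.INVALID]
        if has_shared_state:
            # Existing holders drop to Shared and keep their copy.
            for c in holders:
                state[c] = State.SHARED
            state[cpu] = State.SHARED if holders else State.EXCLUSIVE
        else:
            # No Shared state: the snooped holder must give the line up.
            for c in holders:
                state[c] = State.INVALID
            state[cpu] = State.EXCLUSIVE
    return bus_reads


pattern = [0, 1] * 8  # two CPUs alternately reading the same line
print("MEI-like bus reads: ", count_bus_reads(pattern, has_shared_state=False))
print("MESI-like bus reads:", count_bus_reads(pattern, has_shared_state=True))
```

In this toy, every alternating read misses under the MEI-like rules because the line ping-pongs between caches, while with a Shared state only the first read by each CPU goes to the bus.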

Furthermore, Martin also admits that the CPU is out-of-order, which the 750 never was either. POWER7, interestingly enough, is. On the other hand, "big" POWER like the POWER7 has significantly different execution characteristics than "little" POWER. We know this personally from this project, because the G5 acts like a "big POWER" CPU and requires "big POWER" optimizations that are different from what a G3 or G4 would require. The G5 can run G3 and G4 code, but it certainly doesn't do so as well as its own, which was why early G5s weren't really all that much faster than the MDD G4. The fact that the WiiU cores implement the same instruction set as Broadway, including the modified SIMD FPU instructions -- because, other than emulation, how else could it still run Wii games acceptably? -- makes it pretty unlikely it really is a POWER7 derivative, and Espresso also lacks POWER7's hardware threads. However, to work in a multicore environment, whatever is in there cannot be a garden-variety 750 (unless the system board situation is as tricky as it was with the BeBox). And to have OOOE, the pipeline is certainly not a regular 750's.

In the final analysis what we're looking at is a new design, inspired by the 750, but not a 750 (just like the 7400 G4 was more than a G3 with an AltiVec unit bolted on); there are too many fundamental changes in its operation to call Espresso merely an evolutionary variant. Martin should know this and I'm a bit disappointed in his simplistic analysis, even though I salute his technical skills. It also means that, like the megahertz myth of the PowerPC vs x86 days, the clock of these cores is probably not at all comparable to the more deeply-pipelined PPE in Xenon and Cell. And that's why fanbois suck.
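For the arithmetic behind the megahertz myth, a back-of-the-envelope sketch: if per-core throughput is roughly clock times average instructions retired per clock (IPC), then the clock gap tells you how much more work per clock the slower core needs to break even. The two clock figures are the ones quoted above; the function names are mine, nobody in this post has measured IPC for either chip, and the model ignores memory, SIMD and threading entirely.

```python
def break_even_ipc_ratio(slow_clock_ghz, fast_clock_ghz):
    """How many times more work per clock the slower core must do to tie."""
    return fast_clock_ghz / slow_clock_ghz


def relative_throughput(clock_ghz, ipc):
    """Crude per-core throughput estimate: clock times average IPC."""
    return clock_ghz * ipc


ESPRESSO_GHZ = 1.24  # Martin's measurement, as quoted in this post
XENON_GHZ = 3.2      # Xenon's maximum clock, as quoted in this post

ratio = break_even_ipc_ratio(ESPRESSO_GHZ, XENON_GHZ)
print(f"Espresso needs about {ratio:.2f}x Xenon's average IPC to match it per core")
```

Whether a short-pipeline out-of-order core actually achieves that ratio against an in-order PPE depends on the workload, which is exactly why the raw clock by itself settles nothing.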

23 comments:

  1. He made it up and he did 2 telling things he made the cpu look like a old mac 1.24ghz g4 this is were he gets the idia the clock cannot be higher he simply took the g4 clock that macs were stuck with for years then ran with it secondly he took the gpu speed from ps3 he clearly made all this up he is an idiot the powerppc cannot do multicore its the powerpc 400 that supports mulicore SMP and its the 400 series thats mcm and system on chip ready its the 45nm version of cpu older 750 g3 customers now use he is exposed as nintendo hating its a broadwayfied version of powerpc 476 fp and its likely clocked at 1.6ghz the native clock of powerpc 400 series multicore

  2. Powerpc 750 and g3 supports single cpu only and 64 bit bus,powerpc 400\476fp the 32bit cire ibm offer at 45nm replaces the 750 for ibm customers is clocked lowest at 1.6ghz to 2.0ghz is a mcm and system on chip ready embedded design of the 750 32bit core the powerpc 32bit at 45nm is the. Most powerful 32bit core on earth a powerpc 476fp is over 2x the power of a ARM A9 at same clockspeed with ease and operates on a 128bit embedded ring bus and us 100% backward compat with powerpc 750s the wiiu has a powerpc SMP multicore mcm on a 128 bit ring bus using broadway plus spec cores and a custom 3mb high bandwidth edram l2 catch likely clock speeds gpu 800 mhz. Ring bus 800 mhz cpu 1600 mhz. Ram1600 ddr3 800 dualchannel gpu edram 32mb at 800mhz dedicated DSP hd sound chip at 120mhz gpu likely based on hd4670 and shrunk to 40nm and customized this is called commonsense based on factual ibm chips and process

  3. Broadway cpu 2 instructions per clock cell 2 instructions per clock. Xenon 2 instructions per clock average intel 2 instructions per clock gekko cpu 1 instruction per clock powerpc 476fp 45nm mcm ready 5 instructions per clock arm a9 2 instructions per clock. Powerpc 32 bit at 45 nm is 5 instructions per clock efficent and has 5 execution units per core and supports 2 to 8 cores per mcm or system on chip wiiu has 3 powerpc 750 are not avalable at 45nm do not support multicore and do not support ibm edram the hacker lied im telling the truth I said 2 years ago wiiu would use powerpc 45nm 32bit

  4. ClassicHasClass: are you going to allow WiiU fanboi/troll spam in your comments?

  5. Audie is certainly welcome to his assessment of the situation.

  6. power pc 750 was 100% out of order it was RISC OUT OF ORDER vs cisc out of order intel celeron and pentium if the power pc 750 isnt out of order please explain the 5 internal execution units !!!!!!

    gekko cpu of gamecube had 5 execution units wii cpu had 5 execution units wiiu has 15 execution units xenon in xbox 360 has 1 execution unit per core = 3 in total wiiu has over 10 x more execution units compared to xbox 360s cpu

    FACT

  7. cell cpu inline, Xenon cpu inline, gekko cpu out of order, broadway cpu out of order, and xbox 1 cpu out of order....=FACT

    the gc /wii / wiiu all have out of order cores each core supports 5 efficient execution units POWWERPC 750 HAS BEEN OUT OF ORDER FROM THE DAY IT WAS CONCIEVED YOUR CONFUSING IT WITH OLD 600S IRONICALLY THE CORE BASE OF CELL AND XENON BOTH POWER PE NOT POWER PC

    PE IS FOR STRIPPED OUT CRAPPY CORES ONLY NINTENDO USES TRUE POWER >PC< FULLY SPEC'ED LOADED POWER PC OUT OF ORDER CORES THATS WHY BROADWAY WAS SO CLOSE TO PS3 IN LINUX TESTS

  8. power pc 750 cannot support SMP simultainious multi core only power pc 400s and power pc 476fp support it so nintendo went from 750 to 400s base core and BROADWAY FIED IT and this family of cores has a base clock speed of 1.6ghz

    PLEASE GOOGLE WIIBOY101 AND POWERPC 476FP THE GUYS A GENIUS AND MADE 1/4 MILLION UK POUNDS PROFIT IN NINTENDO STOCK OH BY THE WAY I AM WIIBOY101 I PREDICTED A SYSTEM ON PACKAGE OR SYSTEM ON CHIP WITH POWERPC 400 BROADWAY CORES X 3 AT 1.6GHZ 2 FREAKING YEARS AGO

  9. my prediction of wiiu specs 2 years ago

    system on package or chip with a powerpc 476fp broadway -fied core x 3 @ 1.6ghz or close and IBM EDRAM and a 128 bit ring bus designed for mcm or system on chip with gpu edram also and main ram being gddr3 1600 or there about

    i predicted this years ago just as i predicted disruption and motion controls for nintendo in 2003 and put my money were my mouth was in 2005 and watched iwata talk of disruption in 2006 and said to sony fans and ms fans in forums I TOLD YOU SO then sat back and made my money £££££

    seam malstrom wasnt the only one you know i was doing what sean is in his blog in forums in 2003 and oh look everything i predicted came true including the near bankrupcy of sony

    BUT IM ONLY A COUGH TROLL LOL xx

  10. hacker would you please explain how your stated clock is suspiciously exacly like a old imacs cpu speed out of all the clock speeds in the world wiiu matches a very old mac I SEE THRU U HACKER YOUR LYING

  11. hacker fails to see that powerpc 750 single core no multi core support

    became powerpc 400s and 476fp at 45nm with 128 bit bus and fully supporting multi core and ibm edram existed before wiiu so its broadway fied powerpc 476fp not a over clocked powerpc 750

    to all informed sane people lol

  12. wiiu cpu 5 instruction per clock, 5 execution units per core out of order, backed up by a HD surround sound dsp (cpu not required for sound)and a ARM co cpu


    xbox 360 totally inline cpu with only 1 execution unit per core vs wiius 15 and no co cpu and no dsp whats so ever

    under powered LOL

  13. IBM DID NOT STATE IT WAS WHATSONS CPU THEY STATED THERE WAS SHARED TECH SHALL I EXPLAIN THIS NOT ROCKET SCIENCE TO YOU ALL!!!!!!

    WHATSON = 45NM SILICON ON INSULATOR COPPER WIRE PROCESS AND IBM EDRAM

    WIIU HAS 45NM AND COPPER WIRE AND SILICON ON INSULATOR AND IBM EDRAM

    PRESTO THERES YOUR WHATSON LINK THAT WASNT LIKE SENDING ROCKETS INTO SPACE WAS IT IT WAS JUST PLAIN OLD COMMONSENSE

  14. the real reason this so called hacker guy and all the other scum buckets are doing tjis is to dis credit the wiiu troll it like they have been doing from 1980s wen nintendo butt raped atari and the usa computer industry almost into dust

    wi did it again and now wiiu is free to rule only apple and ms can waste billions chasing them as sony has been left near dead and bankrupt

    this is why pro usa industry pro pc sites are making these reports up the hacker guy is most likely all made up or some guy pretending he knows when in fact he doesnt and the 550 mhz gpu is also crap it doesnt clock balance with the cpu at ether 2to1 or 3to1

    nintendo use balanced systems of 1to1 2to1 and 3to1 max theres no way in hell the cpu and gpu dont balance 550 x 2 = 1100 mhz not 1240mhz

    and half of 1240mhz (1.24ghz) = 620 mhz SO THE GUYS A LYING F%%K

  15. Audie, not only is that language not acceptable, but you're now spamming this blog. I certainly don't mind diverse points of view, even though I don't agree 100% with your statements, but nine posts overnight crosses the line. I will delete your posts if you do this again.

  16. Spotify launched their HTML5 webapp recently, and since they stream Vorbis this could be a potentially interesting development for Mozilla folk, not to mention that they bothered to do a UB. A Linux client is also available. Can someone on the Linux PPC side of the fence update us all on the status of Australis addon compatibility and GStreamer/WebGL support?

    Tobias, was I correct in reading that Carbon plugin restoration would be easy? Even if it is a stopgap until the era of NPAPI VLC, spoofing Flash 11 is very popular with users and if they have to switch browsers to do this I think it costs us at some point.

  17. sorry if iv offended anyone here its your blog so i appolagize the reason i do what i do is being the voice of reason in this mad console online talk is very refreshing we cannot leave sean malstrom to do it all on his lonesome commonsense must and should provail always im sck to the back teeth with uneduated moron fanbois talking out there back sides the day i listen to a guy who claims to be a core gamer yet plays fps with twin sticks and software auto aim and would rather his disc drive play movies instead of stream and load data at cartride like speeds is the day i buy a playstation and that aint ever going to happen

    Replies
    1. This isn't a gamer site. Its about using open source software on old PPC macs (specifically on OS X 10.4 and the Firefox fork, TTF). This might have interested me 7 years ago, but now, with the exception of mainly gaming and servers, the Power Architecture is pretty much dead in laptops and pc's. Speaking for myself, what you write here will probably get little attention among your fellow gamers (I'm not a gamer) since this site is about TTF on the Mac OS X 10.4 (which isn't Wii U).

      Too bad Power Architecture WiiU APU's [cpu+gpu] aren't done on 28 nm silicon. 45 nm SOI/copper as a fab process is getting pretty old. IBM now use 32 nm for their Power7+. Nintendo to keep its price low only gets to use the IBM's less expensive (and therefore inferior) fab technology. I only mention this because you seem to have such an interest in old PPC chips.

      Also, you don't even mention what compiler they use on WiiU software. Hardware is only part of the equation, software is the other part. Is is an autovectorizing and autoparallezing compiler?

      You should take ClasicHasClass' advice and learn to write a lot more clearly (complete sentence with proper spelling). I frankly thought your writing was a spambot because it made little sense and looked like gibberish. If you want people to read what you write: learn to practice writing succinctly.

    2. In case this helps other readers: "iv"=I've, "appolagize"=apologize, "provail"=prevail, "sck"=sick, "uneduated"=uneducated, "there" (in one case) =their, "cartride"=cartridge.

      Punctuation, including sentence breaks: sorry, you're on your own.

  18. powerpc 32 bit already exists at 45 nm mcm package with multicore SMP and is ready to relace all old school powerpc 750 and ppc 400 customers needs so a version with broadway cores and edram would naturally replace the broadway i do not see why a existing process at ibm would be ignored and then ppc750 worked on to ad edram add multi core etc would be done wen that has already been done by the ppc 400 series with mcm ready multicore plus dsp plus gpu on a 45nm lsi that already exists at ibm why ignore that and start from scratch with a all new design making ppc 750 broadway a multicore it makes no sense at all,,,the g4 4 pipe stage cpu did indeed max out at 1.24 nm but that was 180 nm and many many years ago we are now at 45 nm copperwire silicon on insulator mcm embedded designs i dont see 1.24ghz being the max with 4 stage pipe it would surly be much higher today the powerpc 32 bit is now a 9 stage pipe ppc400 s and ppc 476fp system on chip or mcm at 45nm now if nintehdo did go with this but maintained the more effectve 4 pipes instead of 9 then the new expresso is infact a hybrid of broadway and ppc400-s and the 1.24ghz simply doesnt matter but with the existence of 1.6 to 2 .0 ghz ppc 400s and jo effidence of 45 nm ppc750s existence and no effidence f ppc750 doing multi core i have no coice but to beleive that wiiu uses a custoized boadway 2 core bsed on ppc 400s,476fp and not ppc,750 the hackers intentions with this news is to make wii u look bad he also has no idia about gc wii wiiu simd its 2x 32bit simd plus custom instructions plus peak 4to1 data compression how is that weak it would blow a sandard 4 x 32bit simd from a g4 out of the water its fadtly more effective per clock has great compression and is per clock efficent at a way way higher level than old g4s and the bus is way way way faster than old g4s and then add the ultra high bandwidth custom 3mb edram l2 catch he is claiming the wiiu is like old g4s wen in fact its a brand new kodern cpu 
    using the latest processes at ibm

    Replies
    1. Translation notes: "wen"=when, "surly"=surely, "nintehdo"=Nintendo, "effectve"=effective, "jo"=no, "effidence"=evidence, "f"=of, "coice"=choice, "beleive"=believe, "custoized"=customized, "boadway"=Broadway, "idia"=idea, "sandard"=standard, "fadtly"=?vastly, "efficent"=efficient, "kodern"=?modern.

      The author's comma and period keys clearly work, but the entire comment looks like one long un-punctuated, unbroken run-on sentence.

  19. sorry i sent a post on a tiny phone screen,,, i didn't realize this was a blog about grammar policing!!!! and the following sentence is physiological fact!

    the discussion or argument changes to someones grammar when the other person feels beaten in the argument or said discussion, its always the looser or threatened person that changes the subject to the other persons grammar!! "THAT'S A PHYSIOLOGICAL FACT BY THE WAY" WINK WINK !!!!

    im more interested in the wiius edram bandwidth and configuration anf the 3mb edram catch and system bus and how powerful the imput output co cpu is

    so i can build a picture in my mind to how they did it is there a huge texture catch or shader catch or a texture shader shared catch how much edram goes to frame buffer etc these are the important issues in specs

    i was reading a thread and real programmers were saying pure inline cpus can only do game code at 1/3 to 1/10 of a out of order version of the same cpu give or take and it basically proves xbox 360 cpu was no better than 3x xbox 1 cores if even that funny many developers said that 6 years ago then ad that all io os and digital surround sound was also cpu processed it makes you realize that that cpu was a load of horse poo

    Replies
    1. This blog is neither about physiology nor about grammar/spelling. However, a writer's demonstrating his blatant misunderstanding of either does not serve to reassure readers of that writer's expertise in other fields such as computer sciences.

      "looser"=loser. "imput"=input. "a out"=an out. "ad"=add.
