Wednesday, May 2, 2018

A semi-review of the Raptor Talos II

After several days of messing with firmware and a number of false starts, the Talos II is now a functioning member of the Floodgap internal network. It's under my desk with my other main daily drivers (my Quad G5, MDD G4, SGI Fuel and DEC Alpha 164LX) and shares a KVM. Thanks to the diligent folks at Raptor, who incredibly responded to my late night messages at 2am Pacific, the fans are now much more manageable and I'm able to get proper video output from the Radeon WX 7100 (though more on that in a minute). As proof of its functionality, I'm typing this blogpost on the Talos instead of on the G5. Here it is in its new home:

I'll call this a "semi-review" because, well, the system is a work in progress and getting the most from it will take time. Relatively little is optimized for PowerPC these days, and less still for little-endian PowerPC or POWER9 in particular. If you want performance benchmarks, you can read Phoronix's performance tests, which are substantially more thorough than anything I could gin up. This is about my experiences with the unit now that I've been using it most of today, with my early firmware issues largely corrected.

For what's in it, see my unboxing photographs from a few days ago. This system is best described as a middle-road configuration now that the 22-core chips are becoming available. It contains two four-core Sforza POWER9 processors on a 14nm process at 3.2/3.8GHz with 512K of L2 and 10MB of L3 per core; there is a discrepancy with the wiki which says 3.1/3.7 but you can read Raptor's spec sheet. SMT is available. By having both processors installed, all of the PCIe slots in this machine are unlocked (this was a deliberate design decision for efficiency, not to make you buy hardware you didn't need). All of the options in this unit are factory-installed: 32GB of ECC DDR4 RAM (maximum 2TB), an AMD Radeon Pro WX 7100 workstation video card (roughly a hopped-up RX 480), a Samsung 960 EVO 500GB NVMe SSD on a PCIe card and a Microsemi PM8068 SAS 3.0 controller. The machine comes stock in a Supermicro CSE-747 EATX case with two redundant 1400W power supplies, a SATA controller, onboard VGA, onboard USB 3.0, onboard 2xGigE network ports (Broadcom BCM95719), onboard RS-232 serial and an LG Blu-ray drive. A recovery disc with the factory firmware and manual is included. Sticker price was approximately US$7200.

No operating system is installed except for Petitboot (more in a moment). This machine should eventually run anything that supports it (of course it will run NetBSD, at least someday), but your sole option right now is Linux, and bleeding edge Linux at that: Raptor's excellent tech support team tells me that kernel 4.13 is minimally required and 4.16 is strongly recommended. This greatly limits your choices out of the box especially if you don't already have another Linux system to support bringing this one up. I didn't, so I selected Fedora 28, which supports ppc64le and has kernel 4.16. As of this writing, the final release of 28 has just hit the streets, so that's very timely. Officially Fedora only supports the Server flavour on ppc64le, but we can convert that to a Workstation version after it's installed.
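If you want to sanity-check a candidate distro's kernel against those two thresholds, a minimal sketch follows. The function names and the sample release strings are hypothetical illustrations of mine; only the 4.13 and 4.16 figures come from Raptor's guidance above.

```python
import re

MINIMUM = (4, 13)      # minimally required, per Raptor's tech support
RECOMMENDED = (4, 16)  # strongly recommended, per Raptor's tech support

def kernel_tuple(release):
    """Extract (major, minor) from a release string such as '4.16.3-301.fc28.ppc64le'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))

def kernel_status(release):
    """Classify a kernel release against the two thresholds."""
    version = kernel_tuple(release)
    if version < MINIMUM:
        return "too old"
    if version < RECOMMENDED:
        return "bootable, but older than recommended"
    return "ok"

# Hypothetical release strings, for illustration only.
for sample in ("4.9.0-6-amd64", "4.14.0", "4.16.3-301.fc28.ppc64le"):
    print(sample, "->", kernel_status(sample))
```

On a running Linux system you could pass in the output of `platform.release()` from the Python standard library instead of a sample string.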

The system boot sequence has several stages. You can read about them in a bit more detail on the RCS Wiki, but the breakdown is not unlike that of a modern POWER server, since this is mostly a modified OpenPOWER design. Immediately when power is applied, the system boots the "BMC" Baseboard Management Controller, which runs on an ARM6L service processor. This sits idle when the main processors are powered down. When the power button is pressed, the BMC starts the Initial Program Load (IPL) process on the main POWER9 CPUs from PNOR flash. Through a complex six-stage process the IPL terminates with loading Petitboot, a simple loader inside a tiny Linux environment called Skiroot. Since Petitboot is running in a tiny Linux, its presence simplifies driver support for the main operating system by handling platform functionality directly via the OpenPOWER Abstraction Layer. Petitboot, in turn, kexec()s into the OS kernel and, at least in theory, away you go.

Initially the fans came on at IPL at a terrific volume such that my wife and I could not reliably hold a conversation in the same room (no exaggeration). In addition, while I could get Petitboot to display on the Radeon card, the operating system wouldn't appear -- I had to use the onboard VGA to boot, which was inconvenient for my KVM and meant my expensive workstation card was doing nothing. Raptor's tech guys listened to my frustrated pleas and notified me immediately when the most current firmware was available. When I loaded this firmware on the system, it worked beautifully for about 10 or 15 minutes and then started freaking out, failing to see the NVMe, kernel panicking on the Fedora disc that it used to boot from, etc. Raptor got me another command to blank the GUARD partition on the PNOR flash to try to reset it, and the machine started working! In fact, I've been using it since about 11am today non-stop, so I consider that to be an excellent burn-in period. Oddly, I can't get Petitboot to appear on the WX 7100 output now, but the OS does, so it's not a big deal (I can just switch to VGA if I need to get into the bootloader until that gets fixed). This all happened literally within the space of a few days. The fans rev up and down periodically, which can be a bit disconcerting next to my usually quiet Quad G5, but they are no longer anywhere near as shrill or constant and I only notice them if the room is warm.

Power usage, incredibly, is quite modest for a machine of this specification. When you connect the power, the BMC sits idle at around 13W with a very quiet fan running. Starting IPL, the fans do still come on full blast at least initially and power usage jumps immediately to 143W. This climbs slowly to 212W by the time you hear the beep from the system indicating Petitboot has started. Petitboot then starts Fedora and once Fedora has booted and we are at the login screen, power usage drops back to around 150W and the fans automatically throttle down. I got out my infrared thermometer and checked the heat coming out the back, and found it was a very reasonable 91 to 99 degrees Fahrenheit. (The cat likes the G5 better for heat.) Most tasks barely moved the needle after that. I did some installations with dnf and the power usage barely rose to 160W. Compiling OpenSSL got it up to 177W. This is all less than the Quad G5 next to it, which right now is reading 238W on the UPS while sitting largely idle in Reduced power mode. On the other hand, I'm not using the video card very heavily, so this output could jump quite a bit once I get some games running on it. There are also no drives connected to the RAID yet, just the NVMe SSD.
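To put those readings side by side, here is a small sketch that tabulates them against the idle Quad G5 and converts the exhaust temperature to Celsius. The numbers are the ones quoted above (several of them approximate); the stage labels and helper names are my own.

```python
# Power readings quoted in the post, in watts (some are "around" figures).
TALOS_WATTS = {
    "BMC idle": 13,
    "IPL start": 143,
    "Petitboot beep": 212,
    "Fedora login screen": 150,
    "dnf installs": 160,
    "compiling OpenSSL": 177,
}
G5_IDLE_WATTS = 238  # Quad G5 on the UPS, largely idle, Reduced power mode

def savings_vs_g5(watts):
    """Watts saved relative to the idle Quad G5 reading."""
    return G5_IDLE_WATTS - watts

def fahrenheit_to_celsius(deg_f):
    return (deg_f - 32) * 5 / 9

for stage, watts in TALOS_WATTS.items():
    print(f"{stage:22s} {watts:4d} W  ({savings_vs_g5(watts):+d} W vs. idle G5)")

low_c = fahrenheit_to_celsius(91)
high_c = fahrenheit_to_celsius(99)
print(f"Exhaust: {low_c:.1f} to {high_c:.1f} C")
```

Even the peak reading during IPL stays under what the G5 draws while idling.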

You can expect an upgrade path with this hardware as well. Besides the system accepting any hardware accessories that are compatible with Linux (though see below), Raptor is planning to make additional processor options available if you have the thermal headroom and the power capacity, even the 22-core monster. Unlike the Quad G5, this system shouldn't be a dead end.

So that's the hardware. Let's talk about the software. This isn't under Raptor's control necessarily, but it will play into your decision-making process should you make an investment in one.

There are really two kinds of customers for the Talos II: people like me who dislike x86 on technical grounds and Intel's continued hegemony and wish to support alternative architectures of comparable performance, and people who are paranoid and want a system they can audit and trust, one far less opaque and closed than what passes for commodity hardware these days. There is naturally some overlap between these two groups. The second group will probably put up with a little more inconvenience for the sake of ultimate privacy than the first group, which is more concerned about functionality. You should think long and hard about where you fall here, because it will affect how you perceive the system.

The major selling point for the second group is that the firmware is fully open-source and auditable even down to the FPGA level. Schematics are included! You can download and build your own FPGA flash image, your own BMC flash image and your own PNOR flash image. In fact, you are expected to, though Raptor provides pre-built versions assuming you trust them and their warrant canary. As long as you don't brick the BMC -- though this is doable if you are incautious -- you can play around with Petitboot and Skiboot pretty much at will and just reflash if you screw it up. Programming the FPGA means you'll need your own SPI programmer, but there's a JTAG port on the board and you can plug right in. Note that this scheme isn't perfect because you still have to trust a certain amount of the other firmware in the system, mostly in the various peripheral devices, but it's clearly better than what you'd get from any other system and it's a strong start towards reclaiming control of our own machines. Although I haven't tried writing my own custom firmware yet, it's very easy to build and flash the prefab releases, and the process is well documented. Upgrading to the current firmware from the v1.02 my machine came with did not require flashing the FPGA, so I could do it all from my G5 by talking to the BMC over SSH.

For the first group, however, that alone won't be satisfactory, because we actually want to use this thing as a computer. Frankly, my plan is to make this the Power Mac G5 successor that never was. It certainly has the specs for it. Unfortunately, this part is the bit that's not yet complete. I haven't tried any operating systems other than Fedora 28 yet, but I can't imagine the experience is much different, so take these observations at face value.

Because Fedora doesn't offer a direct download for Workstation on little-endian 64-bit PowerPC, you have to install Server first, and then switch to the Workstation environment. This will download the remaining missing pieces; I selected the default Workstation environment, which is based on GNOME. Hopefully future distros supporting this machine will do better than this process. While in use the system is perfectly responsive but seems slower than it ought to be at times, particularly compared to the Quad G5, though this is probably an unfair comparison. The G5 is running Mac OS X 10.4.11, an OS written by its manufacturer and highly optimized for it. The Talos II has to contend with an OS for which it is not the primary target, nor one that is particularly tuned for any PowerPC system.

There are various glitches and many things don't work yet. The first and most important deficiency is that I still can't get amdgpu working in Xorg, so I'm using framebuffer support (fbdev). This means the video card is still going largely underutilized. I expect this to improve as Polaris support improves, but it's not there yet. For this reason I haven't even bothered trying to load any games on it so far.

Multimedia is also limited because I don't have a sound device. lspci alleges the WX 7100 has some sort of audio support, but the open source drivers don't currently support it. I'll likely solve this problem with some sort of USB audio out in the meantime, but that's suboptimal. I haven't tried playing DVD or BD movies on it yet either for that reason.

Fedora didn't like the GBU-421 Bluetooth USB dongle I use with the G5. The G5 needed no drivers and it just "works," but GNOME doesn't see it. I had to transfer the picture above from my phone to the G5, and then to the Talos.

A few of the included applications either misbehave or don't work at all, though most fortunately do. The GNOME Software application kept complaining about incorrect checksums, but dnf was fine from the command line. Firefox 59 crashes with a segmentation fault on start-up (like I say, I guess I've got a project now). GNOME Web (formerly Epiphany) does work, but it's WebKit and I don't like that, and it too is not very well optimized. It does pretty well, though, considering; it got 2455ms on SunSpider, which would seem like a dismal number given that the Quad G5 managed 2255ms in TenFourFox, except that TenFourFox has a JIT and GNOME Web here is running interpreted, and TenFourFox is compiled with CPU optimizations specifically for the G5 while GNOME Web and the system WebKit have no specific optimizations. I'm also unhappy there's no Gopher support. On the other hand, you would expect YouTube videos to be a slideshow (no JIT, little or no SIMD), and yet they play at a surprisingly good framerate, just muted. This post is being written in GNOME Web.
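For a sense of how close those two SunSpider totals really are, a quick back-of-the-envelope calculation (the helper name is mine; the two timings are the ones quoted above, and lower is better):

```python
GNOME_WEB_MS = 2455   # Talos II, GNOME Web, running interpreted
TENFOURFOX_MS = 2255  # Quad G5, TenFourFox, with its JIT

def percent_slower(candidate_ms, baseline_ms):
    """How much longer the candidate took, as a percentage of the baseline."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100

gap = percent_slower(GNOME_WEB_MS, TENFOURFOX_MS)
print(f"GNOME Web took {gap:.1f}% longer than TenFourFox")
```

That works out to a gap of under 9%, with no JIT and no CPU-specific tuning on the Talos side.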

However, much of the rest of it does work. Since I intend this to be a successor to my G5, I spent most of the afternoon making GNOME more Mac-like. Using Fondu, I copied the Lucida Grande font from Tiger and converted it back to TTF (to compile Fondu on the Talos, configure it with ./configure x86_64-unknown-linux-gnu, since it doesn't know what the heck a little-endian PowerPC is) and installed it. I then installed the GNOME Tweaks tool with dnf and a Mac GNOME theme and Dock extension. (Some other ideas are on this how-to.) I switched the system font to Lucida Grande in the Tweaks tool, disabled hinting entirely and left it with greyscale antialiasing, turned on User shell themes in Tweaks, and wrote a minimal shell theme to make the top bar more like a Mac menu bar. It's not perfect, but it's a good start. I'll provide it later if people are interested.

To get my Mac shortcut keys back, I installed AutoKey (autokey-gtk), and started making equivalents. A few clashed with GNOME, which I changed from Settings, and I altered a couple others in Terminal, but they mostly just worked with everything else including GNOME Web.

Let's bottom line it. As far as value for money, the machine is well-assembled, solidly built (if in an unexciting enclosure) and consists of quality components. I think the above paragraphs also demonstrate that the level of support from Raptor is absolutely commensurate with what you would expect for a $7000+ computer. Frankly, it's one of the best technical support experiences I've ever had with any system. Part of that is undoubtedly the low production numbers and highly technical engineering audience, but I have never felt like the machine was an unrecoverable doorstop even when it wasn't suitable for use yet.

Software, however, is still a work in progress. You should not expect a 100% functional system at the end and you don't even get a functional system out of the box. Not only will you have to install an OS and go through that process, you're also pretty much guaranteed that something won't work when that part is done. And even when everything you need actually is working, nothing is optimized for it; many things will run abnormally slowly until "someone" (tm) does this work. It's been a long time since PowerPC was a common desktop platform, so many of the optimizations Intel systems take for granted just don't exist, and some desktop apps aren't even tested.

But all of these things are correctable. The hardware is solid. The firmware rudiments are coming together; look at how quickly this machine evolved in just a few short days. Software is likely to be an easier nut to crack on the little-endian Talos than on previous big-endian PowerPC systems, too. Assuming there aren't dependencies on complex assembly code blocks, more code is likely to "just" work with fewer or no modifications because the assumptions made for mainstream x86 will now largely apply here as well. This depresses me personally since I think in big-endian, and have used big-endian systems for decades, but that's the way things are now.
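To make the endianness point concrete, here is a minimal sketch using Python's standard `struct` module. The sample value and helper names are mine and purely illustrative. It shows the same 32-bit integer laid out in both byte orders, and how a reader that assumes little-endian layout gets the wrong answer when handed big-endian data.

```python
import struct
import sys

value = 0x0A0B0C0D

little = struct.pack("<I", value)  # layout on x86 and ppc64le
big = struct.pack(">I", value)     # layout on the G5 and big-endian ppc64

print("little-endian bytes:", little.hex())
print("big-endian bytes:   ", big.hex())

def read_u32_native(raw):
    """Interpret 4 bytes in whatever order the host CPU uses."""
    return struct.unpack("=I", raw)[0]

def read_u32_explicit_le(raw):
    """Interpret 4 bytes as little-endian regardless of the host."""
    return struct.unpack("<I", raw)[0]

print("host byte order:", sys.byteorder)
# Code written with x86 in mind effectively does this, which only works
# when the data really is little-endian:
print(hex(read_u32_explicit_le(little)))  # round-trips to the original value
print(hex(read_u32_explicit_le(big)))     # byte-swapped garbage
```

On a little-endian host, including a Talos II running ppc64le, `read_u32_native` agrees with the little-endian reader, which is why x86-era assumptions tend to carry over unmodified.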

I'm looking forward to this system becoming my daily driver and it might even happen in just a few months. I need to get Firefox working, and I need to get QEMU optimized to run my old Power Mac software. That's all doable. Once the video card and sound options are fixed, I can even start using it for multimedia and games, and the G5 can then become a well-cherished part of my collection.

This is a fully free system you can live with. This is a fully free system that can kick ass. The promise was kept and the dream is real. It's time to get busy.

17 comments:

  1. Tiny little comment that the Talos II can run big-endian software, and our distro Adélie is targeting ppc64 specifically; not ppc64le. We have Firefox and Otter (a Qt WebKit-based browser) and KDE and LXQt all running on G5s with no issue, and when I get my Talos II in, I'm going to be working on the kernel bits to make it run well on 4.14 (the LTS branch of Linux).

    So it's not all little-endian out there, and a lot more is working on ppc64 than it seems if you go with a distro that tunes for PPC first :)

    Replies
    1. Yes, it certainly can be run in BE mode, though I don't envy the task needed to get everything to work together. I'm sad to lose BE, but everything else is going LE, so it's sensible for IBM to go that way too. I'll be interested to see how your distro fares on the Talos though I'm likely to keep this system LE for the foreseeable future.

  2. Thanks for posting updates on the Talos. As usual, a new, different system is a lot of fun in the beginning.

    No, Linux distros definitely don't feel snappy in comparison to Mac OS, especially on PowerPC. Personally I got tired of GNOME and all the metapackages that are part of standard distros quite fast. The last distro I used was Gentoo, and though it requires a heck of a lot of patience (source install), you can at least configure the whole system install entirely. That could be a possibility since 4.16 is the recommended kernel. That's quite the long way around compared to Fedora, so maybe not.

    I expect PPC64LE to be less prone to breaking drivers than big endian.

  3. My Talos just arrived yesterday (got the single bundle, 18-core CPU), so I'll be spending time setting it up over the next several days, and then get started porting FreeBSD to it. My hope is to have FreeBSD working on it reliably, if maybe not optimally, by June. Oh, and FreeBSD on PowerPC is big-endian only.

    Replies
    1. Hey, congrats! Hope you like it as much as I do. Just one CPU?

      I'm heartened to hear all the interest in keeping it BE, though I'm surprised there isn't a ppc64le FreeBSD (I'm not up on FreeBSD, I'm mostly a NetBSD dweeb). I certainly hope you document the porting process, I'm interested to hear how that goes.

  4. This comment has been removed by the author.

  5. Now I know this will sound like heresy, but with all the power you have to spare on that, it really seems like a quality emulator is the thing to work toward. QEMU is OK, but I wonder if there is a way to go for more of an open-source Parallels/Virtual PC solution. I have a crappy old Pentium D 820 (slowest one) that can transcode a DVD on LXLE nearly as fast as my brother's Mac Pro with Quad Xeons.

    I just wonder if it is easier to bridge at the higher-level, or at the bit-swap level. QEMU always seems so slow compared to Virtual PC, there has got to be a better emulator with the kind of power you have there to burn. If you can get a sandbox going that thinks it's AMD64, then the world really opens up and with bit-swapping going on, that surely must disrupt cache-picking exploits.

  6. Hi good to finally see an open box on this. Been itching to plunk down the money for one but wanting to see some reviews first.

    Just wondering if you might experiment with running a big-endian Linux on it through KVM, and then running MOL on top of it. That way you can run your old Mac OS apps on it and properly test its performance vs. your old G5 on a more like-for-like basis?

    Replies
    1. Maybe, or see if I can get KVM-PR to run a BE guest. I haven't played with this much yet.

    2. Yes, KVM (with proper kernel support) can do bit-swapped guests. I'm using this to do some very limited LE test runs on a G5 (970), which of course only supports BE mode in the firmware. I'm told it's just as easy to do it the other way.

  7. I'd like to know what OS you installed and what media you used. Did you go with the Fedora 28? And did you burn it to a DVD?

    Because I just spent four hours learning a lot more than I wanted to know about Anaconda, Dracut and Systemd tonight, heh.

    I came to the conclusion that no one has tested a USB key installation of PPC64 Fedora in a long time. I don't see how it can possibly work because the kernel command line on the installer doesn't have enough arguments to boot on anything but CD/DVD/Bluray. As far as I can tell.

    Replies
    1. Fedora 28, on a DVD. I'm just old school like that.

  8. In the interests of accuracy: I kept writing Microsemi, but this is actually the LSI SAS card. Not sure what I was thinking at the time.

  9. I wonder what impact it would have on the Phoronix performance tests if they were compiled with the IBM XL C compiler.
    https://www.ibm.com/developerworks/downloads/r/xlcpluslinux/index.html

    Back in the day, IBM XL C almost always produced considerably faster and smaller PowerPC binaries.

  10. Fantastic post! I have been waiting for something like this since the Power Mac G5. I’m a really big fan of the POWER/PowerPC architecture too, and very glad to see that there are some alternatives for “real people” and not only IBM's ultra-expensive options.

    It would be really appreciated if you posted some real-world benchmarks to compare the performance with other processors/architectures. I’m very interested in buying one. My main work is 3D modelling and rendering, so if you have the time and inclination to run some Blender benchmarks with “Cycles Render”, it would be fantastic. For example, the old and trusty BMW benchmark: https://www.blender.org/download/demo-files/

    If you are using Fedora 28, I think Blender is in the official repositories in the ppc64le branch, so in theory you can install Blender in a few seconds.

    So congratulations for your new machine, I think you will have fun with all that POWER!!! : )

  11. Fantastic. You can try Debian testing or unstable. There is Linux 4.16 there. Debian supports ppc64, ppc64el and powerpc (32-bit), but from what I see, only ppc64el has installation media available, and it is probably the only one officially supported. You can install from the Debian testing weekly installation CD.

    I really hope you manage to backport various Firefox fixes upstream. Please! :)

    AMD Polaris is still newish. It should work, but be sure to have Mesa 18 too. You might need to play with some kernel parameters, and make sure to use the amdgpu module. I see "llvmpipe" in your screenshot, which is a software rendering pipeline and is slow (even if llvmpipe uses AltiVec and co.).

    For the sound I highly recommend a small USB DAC (good ones are around $150, if not a little less). They are actually superior (lower noise, better amplifiers for headphones, better DACs, easy-to-access volume knobs, etc.) to internal PCIe sound cards or the ones usually found integrated on the board. (You do not want to waste a PCIe slot on such a low-bandwidth device.)

  12. What kind of power supplies does the Talos II use?
    I've been thinking about getting a motherboard to build my own system inside the case of an old IBM dual-CPU netburst Intel server with an Extended ATX board (I'm relatively confident that if it can handle two 3.6GHz netbursts, it can handle POWER9!). What standard (if any) is followed for them?
