Parallel computers are rad! I know we are all spoiled these days with our simultaneous multi-threaded, multi-core microprocessors – the computer I am typing this blog on has 22 physical cores – but forty years ago, no ordinary individual had access to a parallel machine. As a curious computer enthusiast, I was always fascinated by these super systems. Thanks to my college professors, I got a glimpse into a few of these inaccessible machines.
Sometimes, my exposure was purely theoretical, which proves, by the way, that one doesn’t need the real thing to understand how it works. I recall our algorithmics professor, Patrick Greussay, showing us how to microprogram a MasPar-like computer. It was mind-boggling. We learned to write microcode to drive a systolic grid of one-bit elementary processors to implement parallel algorithms. This first contact with a Single Instruction Multiple Data (SIMD) architecture was an enlightening experience! A few years later, my teacher and mentor, the late Jean Méhat, exposed me to the Transputers he was using in his research on AI and the game of Go. Years later, I could use my personal multi-processor computers day in, day out (here), but as good as these are, they are not parallel computers. A few months ago, I decided to get a Transputer. After all, in the 21st century, shouldn’t all individuals be able to play with one of these marvels? The quest was set; only the execution remains.
But before diving into the nuts and bolts – that’s for later posts – let’s look at what a Transputer is. The Transputer microprocessor is the brainchild of Michael David May, a British computer scientist (here). For his teaching and research in robotics at the University of Warwick, he had to design systems relying upon serial interconnects to drive and synchronize computations between multiple distributed circuits and sensors. Just as breaking warp speed in Star Trek drew the Vulcans’ attention to Dr. Zefram Cochrane, these concepts opened the doors of Inmos International (Inmos) to May. Inmos – with foundries in Newport in the UK and Colorado Springs in the US – was founded in 1978 by Iann Barron, Richard Petritz, and Paul Schroeder. Like many chip makers of the era, Inmos started making money selling memory, with a focus on EEPROMs and SRAMs. But, in the background, the goal was always to produce a microprocessor for parallel supercomputers. In fact, Inmos wanted its chips to be useful in every part of a computer. This, of course, includes the CPU, the I/O controllers, the peripherals, and even embedded systems. A very ambitious goal. No surprise that Inmos’ propaganda team coined the name Transputer as the contraction of transistor and computer. The Transputer was meant to be the transistor of the future!
To make this vision a reality, the Transputer’s design and implementation had to be dirt cheap. Or at least inexpensive enough compared to the other parallel architectures. But success is a hard creature to tame. So, however brilliant the Transputer was, and despite the promising T9000, it failed to become the foundation of modern computers. Inmos struggled to become profitable while it burnt through up to 210 million pounds of UK taxpayer money. Even Margaret Thatcher’s privatizations couldn’t help. In 1989, Inmos was sold to the French-Italian SGS-Thomson (today’s STMicroelectronics), and it disappeared as such in 1994. Nevertheless, former Inmos employees engaged in new endeavors, and the technological legacy of the Transputer still fuels modern computers.
If you wonder whether any Transputers are still in use, you will be amused to learn that a few of them made it into space! The simple architecture of the Transputer makes it less prone to the adverse effects of space, while it can deliver serious computational power. Satellites such as NASA’s SOHO or models from Surrey Satellite Technology and SunSpace run on Transputers. Similarly, the SpaceWire standard, developed in 2003, draws its origins from Inmos concepts. Although STMicroelectronics hasn’t used the Inmos brand since ’94, its STBus was influenced by the Transputer. And I am sure I am missing many more out there.
So, what made the Transputer so special? Well, almost everything. To achieve Iann’s goals, May et al. designed arguably the first System on a Chip (SoC). Indeed, to make the Transputer ubiquitous, a single chip had to carry the computational circuits – a simple low-instruction-count CISC 32-bit core with a 5 MHz input clock – along with local memory and some form of I/O. I would say that the most remarkable part of the architecture is in the un-core: the serial links used for point-to-point communications (at 10 Mbits/sec). Using the same link protocol in every chip makes building a Transputer system, even with heterogeneous nodes, relatively easy.
Although a single-Transputer computer is viable – who can afford more than one anyway? – it’s really when you interconnect a buttload of them that you can build a worthy parallel computer. When you do so, you must ensure that they talk to each other in the most efficient way to run your algorithms. This is where the interconnect topology comes into the picture. Should it be a binary tree, a loop, a grid, or a hypercube? As a system designer, you pick one, and you assemble it. For this purpose, the Transputer offers one or more – typically four – serial links that handle all aspects of inter-chip communication in hardware.
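To make the topology question a bit more concrete, here is a minimal Python sketch of the wiring for one of those options, the hypercube. It is my own illustration, not Inmos tooling: the function names and the node numbering are assumptions. The point is simply that a node with four links fits a 4-dimensional hypercube of 16 nodes, where any two nodes are at most four hops apart.

```python
def hypercube_neighbors(node: int, dimensions: int) -> list[int]:
    """Return the nodes directly wired to `node` in a hypercube.

    Nodes are numbered 0 .. 2**dimensions - 1 (a hypothetical numbering).
    Two nodes are neighbors when their numbers differ in exactly one bit,
    so every node needs one serial link per dimension.
    """
    if not 0 <= node < 2 ** dimensions:
        raise ValueError("node number out of range")
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hops(src: int, dst: int) -> int:
    """Minimum number of links a message crosses between two nodes."""
    # Each differing bit costs one hop along the matching dimension.
    return bin(src ^ dst).count("1")
```

Calling `hypercube_neighbors(5, 4)` lists the four nodes that node 5 would be cabled to, and `hops(0, 15)` gives the worst-case distance in that 16-node machine.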
From the get-go, Inmos insisted on using the OCCAM language to program Transputer-based systems. Although you can use languages such as C to write a process running on a Transputer, in the end, only OCCAM makes it simple to express concurrency between processes and processors. Note that on a Transputer, everything is a process, and multiple processes can run on a single Transputer or on as many Transputers as needed. Inmos claimed that the entire system could be described in OCCAM! What about machine language? Well, the Transputer has a CISC ISA and uses microcode to implement its instructions. While most of the 32-ish instructions execute in a few cycles and are encoded in a single byte, more complex ones need longer µ-code, many clock cycles, and multi-byte encodings. Sorry, no RISC here. Note that the µ-code has the same structure regardless of the CPU word length, making it standard across all Transputers. I believe you could write code in machine language, but this ISA was designed with the compiler in mind, not the programmer. The message is simple: stick to OCCAM!
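For the curious, here is a small Python sketch of how I understand the single-byte and multi-byte encodings to fit together. It assumes, from my reading of the published Inmos instruction set documentation, that each byte carries a 4-bit function code and a 4-bit operand, and that two prefix functions (pfix and nfix) build up larger operands four bits at a time. The function code values and the helper names are assumptions of this sketch, so check them against the official documentation before relying on them.

```python
# Function codes as I recall them from the Inmos documentation (assumption).
PFIX, NFIX, LDC = 0x2, 0x6, 0x4


def encode(function: int, operand: int) -> bytes:
    """Encode one instruction, adding prefix bytes when the operand needs them."""
    low_nibble = bytes([(function << 4) | (operand & 0xF)])
    if 0 <= operand < 16:
        return low_nibble                      # fits in a single byte
    if operand >= 16:
        return encode(PFIX, operand >> 4) + low_nibble
    return encode(NFIX, (~operand) >> 4) + low_nibble


def decode(code: bytes) -> tuple[int, int]:
    """Replay the bytes through a model of the operand register."""
    oreg = 0
    for byte in code:
        function, data = byte >> 4, byte & 0xF
        oreg |= data
        if function == PFIX:
            oreg <<= 4                         # make room for the next nibble
        elif function == NFIX:
            oreg = (~oreg) << 4                # complement, then make room
        else:
            return function, oreg              # a real instruction ends the chain
    raise ValueError("byte stream ended on a prefix")
```

With these assumptions, `encode(LDC, 5)` is a single byte, while `encode(LDC, 0x754)` grows to three bytes. That is the compiler-friendly trade-off: small constants and offsets stay tiny, and the compiler, not the programmer, worries about the prefixes.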
At its core, the Transputer has a scheduler – implemented in µ-code – that runs the processes with two priority levels. As a devoted FORTH programmer, I was pleased to learn about a register stack in the Transputer’s core – yeah, with push and pop semantics, and no hardware bounds checks for simplicity. Unfortunately, the stack is made of only three registers out of six in total (A, B, and C, plus AF, BF, and CF in the FP unit). That’s fewer than an HP calculator! Bummer. Since the FP stack has two banks (one per priority level of the scheduler), maybe there is room for some witchcraft & trickery… Of course, several FORTHs were implemented on the Transputer, but they use traditional local-memory stacks. Inmos’ justification for such a small register file is the close local memory in the Transputer. The scheduler provides direct support for the concurrency expressed in OCCAM. To do so, active processes are maintained as a linked list. Two registers point to the head and tail (Front & Back). The currently executing process has its context pointed at by the Workspace register. A running process is descheduled when it waits on I/O or a timer. Since a process can spawn processes itself, two instructions, start process (STARTP) and end process (ENDP), help manage the linked list, adding and removing workspaces as needed. A simple and elegant approach. Who needs a kernel anyway?
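Here is a toy Python model of that run list, just to show how little machinery is involved. It is a sketch under loud assumptions: processes are Python generators, each `yield` stands in for a point where the process gives up the CPU, and a descheduled process goes straight back to the end of the list (the real chip keeps a waiting process off the list until its I/O or timer completes). The class and method names are mine; only the Front/Back idea and the STARTP/ENDP roles come from the description above. Priorities are left out.

```python
class Process:
    def __init__(self, name, body):
        self.name = name
        self.body = body      # generator; stands in for the process workspace
        self.next = None      # link to the next runnable process


class Scheduler:
    def __init__(self):
        self.front = None     # head of the run list
        self.back = None      # tail of the run list
        self.trace = []       # names of processes, in the order they ran

    def _append(self, proc):
        proc.next = None
        if self.back is None:
            self.front = proc
        else:
            self.back.next = proc
        self.back = proc

    def start_process(self, name, body):
        """Plays the role of STARTP: put a new process at the back."""
        self._append(Process(name, body))

    def run(self):
        while self.front is not None:
            proc = self.front                 # take the process at the front
            self.front = proc.next
            if self.front is None:
                self.back = None
            self.trace.append(proc.name)
            try:
                next(proc.body)               # run until it gives up the CPU
            except StopIteration:
                continue                      # plays the role of ENDP: drop it
            self._append(proc)                # still alive: back of the list


def worker(steps):
    """A process that gives up the CPU `steps` times, then ends."""
    for _ in range(steps):
        yield


def parent(scheduler):
    """A process that spawns a child before giving up the CPU."""
    scheduler.start_process("child", worker(0))
    yield
```

Start a couple of workers with `start_process`, call `run()`, and the `trace` attribute shows them taking turns until each one ends. No kernel in sight.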
As I mentioned earlier, the point-to-point, synchronized, and unbuffered communications, expressed as OCCAM channels, are the magic of the Transputer. From a hardware point of view, this is a blessing, as no queues or buffers are required. Intra-Transputer channels are represented by a single word in memory. Inter-Transputer channels are implemented using hardware links. Similar to the dedicated CPU instructions that handle processes, message passing has its own CPU support: input and output message (IN and OUT). These instructions behave the same regardless of the nature of the communication (intra or inter). Both the transmitter and the receiver must be ready before the data exchange can take place. So, if the OUT instruction of the transmitter process doesn’t find a matching IN instruction executing in the receiver process, the transmitter is demoted to the inactive state by the scheduler. Again, simple and elegant. Note that point-to-point interconnects are very efficient for processing data with high locality – such as images – but when communications need to reach distant Transputers, system designers had to rely upon switches. For example, the T9000’s companion IMS C104 chip is a 32-way non-blocking crossbar switch with an optimistic claimed latency of 700 ns.
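To see why a single word is enough for an intra-Transputer channel, here is a hedged Python sketch of the rendezvous. It is my own model, not Inmos microcode: processes are generators that yield hypothetical `("out", channel, value)` or `("in", channel)` requests, and the channel word is either empty or holds whichever party arrived first. The first to arrive is parked in the word and leaves the run queue; the second completes the transfer and puts both back on the queue. The sketch assumes exactly one sender and one receiver per channel.

```python
from collections import deque

EMPTY = None  # stands in for the "nobody is waiting" marker in the channel word


class Channel:
    def __init__(self):
        self.word = EMPTY  # the single word: empty, or the party that arrived first


def run(processes):
    """Run named generator processes until none is ready; return an event log."""
    ready = deque((name, gen, None) for name, gen in processes.items())
    log = []
    while ready:
        name, gen, resume_value = ready.popleft()
        try:
            request = gen.send(resume_value)   # run until the next IN or OUT
        except StopIteration:
            continue                           # process ended
        op, chan = request[0], request[1]
        if chan.word is EMPTY:
            chan.word = (name, gen, request)   # first to arrive: park and wait
            log.append((name, "waits"))
            continue
        other_name, other_gen, other_request = chan.word
        assert other_request[0] != op, "one sender and one receiver per channel"
        chan.word = EMPTY                      # rendezvous: the word is free again
        value = request[2] if op == "out" else other_request[2]
        log.append((name, "completes"))
        # Both parties become runnable; only the receiver resumes with the value.
        ready.append((name, gen, value if op == "in" else None))
        ready.append((other_name, other_gen, value if op == "out" else None))
    return log


def producer(chan, items):
    for item in items:
        yield ("out", chan, item)              # blocks until the consumer is ready


def consumer(chan, count, received):
    for _ in range(count):
        value = yield ("in", chan)             # blocks until the producer is ready
        received.append(value)
```

Run a `producer` and a `consumer` over one `Channel` and the log shows them taking turns waiting on each other, with no buffer anywhere. Whoever shows up first simply sleeps in the channel word.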
Inmos developed and released several VLSI Transputers over the years, along with many specialized companion chips. A few notable Inmos products are the first T414, released in 1985; the T800 with floating-point capabilities, released in 1987; and the previously mentioned and promising T9000 – darn, with one less zero, it could have been Terminator-worthy – which unfortunately never made it to the market. Because many Inmos chips look alike to the inexperienced eye, here is a small product number (PN) decoder ring just for you. PNs take the form IMS abbbc-xyyz, where a is the product group (the most important ones are A for DSP, B for PC board, C for communications, G for graphics, and T for Transputer), bbb is the product identifier, c is the revision, x is the package type, yy is the part’s speed, and z is the chip’s spec (military, standard, etc.).
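If you prefer your decoder ring in software, here is a minimal Python sketch of the scheme above. The regular expression is my own assumption about how strictly the pattern applies: it tolerates a missing space after IMS and a missing revision letter, and it only names the five product groups listed above. The sample PN in the usage note is an illustrative string, not a reference to a specific datasheet.

```python
import re

# Product groups listed in the text; other letters are reported as "unknown".
PRODUCT_GROUPS = {
    "A": "DSP",
    "B": "PC board",
    "C": "communications",
    "G": "graphics",
    "T": "Transputer",
}

# IMS abbbc-xyyz, with the revision letter c treated as optional (assumption).
PN_PATTERN = re.compile(r"^IMS\s*([A-Z])(\d{3})([A-Z]?)-([A-Z])(\d{2})([A-Z])$")


def decode_part_number(pn: str) -> dict:
    """Split an Inmos-style product number into its fields."""
    match = PN_PATTERN.match(pn.strip().upper())
    if match is None:
        raise ValueError(f"not in the form IMS abbbc-xyyz: {pn!r}")
    group, identifier, revision, package, speed, spec = match.groups()
    return {
        "group": PRODUCT_GROUPS.get(group, "unknown"),
        "identifier": identifier,
        "revision": revision or None,   # None when the PN carries no revision
        "package": package,             # raw letter; the text gives no table for it
        "speed": speed,                 # raw digits, as printed on the part
        "spec": spec,                   # raw letter (military, standard, etc.)
    }
```

For example, `decode_part_number("IMS T800D-G20S")` reports a Transputer, identifier 800, revision D, package G, speed 20, and spec S.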
Transputers were often used in professional applications or supercomputers. But what about the consumer market? Well, it depends on your definition of consumer. In my book, it can be anything I can buy – with some hefty cash – but without having to engage in some form of occult ritual involving representatives, smugglers, or even lawyers. To my knowledge, two such products existed: the Kuma K-Max and the Atari Transputer Workstation, a.k.a. ATW-800.
To experience the Transputer without having to give up an arm and a kidney, you can play with the Transputer Emulator (here). If you want the real vintage hardware, you can always scour eBay, but I am afraid you will need several kidneys in stock. The other path is to look into current projects involving Transputers. Regardless of the path you pick, visit the excellent Geekdot website by Axel Muhr (here). It is a gold mine if you want to learn anything practical about the Transputer. In my next post dedicated to the Transputer, I will share my experience with two of Axel’s creations: the T2A2 and AM-B404. Stay tuned!