A Japanese company has announced a massive 800-teraflop real-time ray tracing (RTRT) system that gangs together nine 73-core chips into a single machine that fits inside a desktop computer form factor. The new chip, which is being jointly developed with Toyota and Unisys, is aimed at the auto industry, where designers will use it to prototype body designs and paint combinations.
As for how this system works, there are currently only two sources of information: a Japanese description on the website of the chip's maker, TOPS Systems Corporation, and a Nikkei article in English that's presumably a summary of the Japanese original. Given the paucity of information and the relative shallowness of my technical knowledge of ray tracing, I'll give my best shot at explaining this system and putting it in context, and I'll invite others to weigh in with more info in the comments thread.
As I noted above, the overall system consists of nine identical 45nm ASICs ganged together via some unspecified interconnect scheme. Each individual ASIC consists of nine compute clusters connected to one another through a shared bus. (See this excellent diagram from Nikkei.) This bus also hosts a 64-bit RISC master controller that presumably takes in work in batches and assigns it to the compute clusters, which then do the grunt work of computing the rays. There are also I/O and memory interfaces attached to this shared bus, which link the chip to the rest of the system.
There are a few things that are interesting about these clusters, one of which is pictured below. First, you'll notice that each cluster is made of eight heterogeneous cores, each of which is supposed to handle one part of the ray tracing algorithm.
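As a quick sanity check on the numbers quoted so far, here is a back-of-the-envelope sketch in Python. The chip, cluster, and core counts come straight from the descriptions above. Treating the RISC master controller as the 73rd core is my own inference, and splitting the 800 teraflops evenly across the nine chips is purely an assumption for illustration, not a figure from TOPS.

```python
# Figures quoted in the article.
CHIPS = 9                 # identical ASICs in the system
CLUSTERS_PER_CHIP = 9     # compute clusters on each ASIC
CORES_PER_CLUSTER = 8     # heterogeneous cores in each cluster
SYSTEM_TFLOPS = 800.0     # headline throughput for the whole system

# Assumption: the single RISC master controller counts as one core.
MASTER_CORES_PER_CHIP = 1


def cores_per_chip() -> int:
    """Cluster cores plus the master controller on one ASIC."""
    return CLUSTERS_PER_CHIP * CORES_PER_CLUSTER + MASTER_CORES_PER_CHIP


def total_cores() -> int:
    """Cores across all nine ASICs."""
    return CHIPS * cores_per_chip()


def tflops_per_chip() -> float:
    """Assumption: the headline figure divides evenly across the chips."""
    return SYSTEM_TFLOPS / CHIPS


if __name__ == "__main__":
    print("cores per chip:", cores_per_chip())
    print("total cores:", total_cores())
    print("teraflops per chip (even split):", round(tflops_per_chip(), 1))
```

Under those assumptions, nine clusters of eight cores plus one controller gives the 73 cores per chip quoted in the announcement, which suggests the count is at least internally consistent.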
The heterogeneous cores are connected by a high-bandwidth, three-bus link (system bus, data bus, and instruction bus), which lets a job move in stages from one core to the next. Clearly, this is a pipeline setup, with one core per stage, and in this respect the ASIC is very much a ray-tracing GPU—analogous to the fixed-function GPUs of yesteryear, which had custom hardware blocks dedicated to each stage of the rasterization pipeline.
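To make the one-core-per-stage idea concrete, here is a toy model in Python. It is a sketch under loud assumptions, not a description of the TOPS hardware: the eight stage names are hypothetical (the available material doesn't say which part of the algorithm each core handles), and I'm assuming every stage takes exactly one cycle, which real hardware almost certainly does not.

```python
from collections import deque
from typing import List, Optional, Tuple

# Hypothetical stage names, one per core in a cluster. The source material
# only says each core handles one part of the ray tracing algorithm.
STAGES = [
    "generate_ray", "traverse_upper", "traverse_lower", "intersect",
    "compute_normal", "shade", "spawn_secondary", "accumulate",
]


def run_pipeline(num_jobs: int, num_stages: int = len(STAGES)) -> Tuple[int, List[int]]:
    """Push jobs through the stages one cycle at a time.

    Assumes every stage takes exactly one cycle. Returns the number of
    cycles used and the order in which jobs completed.
    """
    pending = deque(range(num_jobs))
    # slots[i] holds the job currently occupying stage i, or None if idle.
    slots: List[Optional[int]] = [None] * num_stages
    done: List[int] = []
    cycles = 0
    while pending or any(job is not None for job in slots):
        # Every job moves to the next core; a new job enters the first stage.
        slots = [pending.popleft() if pending else None] + slots[:-1]
        cycles += 1  # each occupied core does its stage's work this cycle
        if slots[-1] is not None:
            # The job in the final stage is finished at the end of the cycle.
            done.append(slots[-1])
            slots[-1] = None
    return cycles, done


if __name__ == "__main__":
    for n in (1, 8, 100):
        cycles, _ = run_pipeline(n)
        print(n, "jobs take", cycles, "cycles")
```

In this model, a batch of N jobs takes N + 7 cycles on eight stages, so once the pipeline is full, one job completes every cycle. That is the appeal of a staged design, and it only holds as long as every stage stays busy.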
The fact that this is a ray-tracing GPU has very important implications for the part's future in gaming. To wit, it has no such future. But more on that in a moment.
The second interesting thing about this system is that it addresses RTRT as a compute problem, instead of as a data management problem like the much less ambitious Caustic Graphics solution. You'll recall that Caustic's solution relies on the traditional GPU to do the computational heavy lifting, with the Caustic board accelerating the data lookup part of the problem. The TOPS design, in contrast, is a more traditional brute-force, multicore-plus-caches solution whose main novel twist is this one-core-per-pipeline-stage idea.
There's a reason that the other plans for a hardware-based RTRT solution have been homogeneous multicore designs that throw bandwidth and highly parallel math hardware at the problem: you can repurpose a homogeneous design for other applications in different verticals, thereby gaining the sales volume needed to make producing the IC profitable. A chip built from fixed, stage-specific cores has no such fallback. This brings me to the reason why you shouldn't plan on using a successor to this chip in a computer game.
Don't count on ever buying one of these for your PC
It seems like ages since Intel jumped on the real-time ray-tracing bandwagon, flogging the rendering technique as the inevitable successor to conventional rasterization techniques for generating video game graphics. But this turned out to be a classic case of a lot of smoke but no real fire, as the graphics community pushed back fairly hard against the RTRT hype (some of which I admit to falling for), with many arguing for the eventual dominance of a hybrid rasterization/RTRT approach.
In a nutshell, unless you really need to see an accurate representation of the way that a particular paint shade looks on a surface in different lighting conditions, it's not clear that an RT-only approach will ever have an advantage over rasterization for real-time rendering. Actually, it would be more accurate for me to say that people whose opinions I trust on all things graphical are very insistent that RT-only will never supplant rasterization for either real-time or offline rendering in non-industrial-design contexts, while the advocates of "RT everywhere" are in the minority. (I hate having to "report the controversy" like a journalist, but that's the only responsible thing for me to do here.)
So given that this new ASIC is a very ambitious, complex design that will be fabricated at the current 45nm process node, it will be quite expensive. And given that the only possible use of this extremely expensive, boutique ASIC is to accelerate ray tracing, a rendering technique whose commercial prospects outside of the narrow market of industrial design are debatable, there doesn't seem to be any way that this product can achieve enough volume to come down in price. This means that the TOPS part will always be a boutique item intended for a very specific, relatively small vertical, and all of its customers will pay through the nose for it in perpetuity, sort of like the SGI workstations of old, but worse.
There's a chance that the much cheaper Caustic RTRT solution could come down in price enough to pass the "what the heck, I'll buy one" threshold and create a mass market for RTRT, in which case there may eventually be a premium niche for an RTRT GPU to fill. But that's too many what-ifs strung together to make anything but the most hypothetical case for an eventual TOPS-derived RTRT mass-market GPU.