Mike Hutton (Google, USA): Accelerating Deep Learning with TPUs

Abstract: Google introduced the first Tensor Processing Unit (TPU) in a 2017 ISCA paper. The TPU is a domain-specific, coarse-grained VLIW processor with dedicated matrix-multiply units, designed to accelerate machine learning workloads over large-scale data. Multiple generations later, current TPUs support both inference and training, deliver massive compute (training systems exceeding 100 petaflops), and drive all of the internal machine learning efforts at Google. This presentation will give an overview of the TPU and its evolution, including some of the design principles and decisions that have shaped the architecture.

Bio: Mike Hutton received his BMath in Computer Science in 1989 and MMath in Computer Science in 1991, both from the University of Waterloo, and his Ph.D. in Computer Science in 1997 from the University of Toronto. Across 20 years at Altera, Tabula and Intel, he worked on FPGA architecture, CAD and applications. He is the author of 30 published papers and 100+ US patents in these areas, has served on multiple FPGA and CAD program committees, and is a former Associate Editor of IEEE Transactions on VLSI. In 2018 he joined Google to lead a group focused on performance modeling for the Tensor Processing Unit (TPU) architecture.