Towards Aggregated Computing on Mobile System-On-Chips for Generative AI


Personal data on mobile devices is becoming a natural resource for generative AI computation. The growing demand for on-device AI computation requires that today's isolated, tailored processing units (PUs), such as the APU, NPU, GPU, VPU, and ISP on the main SoC of a flagship smartphone, work in an orchestrated fashion on inference workloads. Alongside these "flexible" computing engines, the main SoC also contains the modem engine, which runs the 5G/4G stack. Because of stringent throughput requirements, the modem engine is mostly hardwired and occupies close to a quarter of the entire SoC. This talk will discuss the challenges and opportunities for aggregated computing on the next generation of SoCs, with a special focus on the new software and hardware stack paradigm needed to accelerate generative AI while performing software-defined modem and xPU tasks.