In this case study, we explore compressing neural networks for efficient deployment on resource-constrained edge devices. We focus on practical techniques such as quantization, pruning, and tensorization, using off-the-shelf open-source tools.
Our aim is to illustrate a typical model compression workflow, highlighting the approaches and techniques used to analyse a …