
Case Study: Compressing DeepMind’s RepNet for Edge Deployment


In this case study, we explore compressing neural networks for efficient deployment on resource-constrained edge devices, applying practical techniques such as quantization, pruning, and tensorization with off-the-shelf open-source tools.

Our aim is to illustrate a typical model compression workflow, highlighting the approaches and techniques used to analyse a …
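
As a rough sketch of the first step in such a workflow, measuring the model's baseline footprint before compressing it, here is a minimal PyTorch snippet; the toy `nn.Sequential` model is a hypothetical stand-in for the actual network:

```python
import torch
import torch.nn as nn

def model_size_mb(model: nn.Module) -> float:
    """Rough in-memory size of a model's parameters and buffers, in MB."""
    param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    buffer_bytes = sum(b.numel() * b.element_size() for b in model.buffers())
    return (param_bytes + buffer_bytes) / 1e6

# Hypothetical stand-in for the network under study.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
print(f"baseline size: {model_size_mb(model):.2f} MB")
```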


Tensorization: Breaking through the Ranks

Having covered quantization and pruning, it's time to move on to some of the more popular algorithms and libraries that leverage tensorization. As usual, we assume that you have gone over the introductory post of our model compression series. Read more here.
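
For a flavour of the core idea, here is a minimal plain-PyTorch sketch (not necessarily the libraries covered in the post) that replaces a linear layer's weight matrix with a truncated-SVD low-rank factorization:

```python
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace a Linear layer with two smaller ones via a truncated SVD.

    The (out, in) weight matrix W is approximated as U @ V, where U is
    (out, rank) and V is (rank, in), cutting parameters when
    rank << min(out, in).
    """
    W = layer.weight.data                        # shape: (out, in)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U = U[:, :rank] * S[:rank]                   # fold singular values into U
    Vh = Vh[:rank, :]

    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data = Vh.clone()
    second.weight.data = U.clone()
    if layer.bias is not None:
        second.bias.data = layer.bias.data
    return nn.Sequential(first, second)

layer = nn.Linear(512, 512)
compressed = factorize_linear(layer, rank=64)
```

With `rank=64`, the two factor matrices hold 512·64·2 ≈ 65k parameters in place of the original 512·512 ≈ 262k, at the cost of some approximation error.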


Model Pruning: Keeping the Essentials

In the previous post of our model compression series we went over the available quantization libraries and their features. We will now go over the packages and tools that enable us to apply different kinds of pruning methods to our machine learning models. Read more here.
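
As one concrete example of the kind of tooling involved, PyTorch ships magnitude-based pruning in `torch.nn.utils.prune`; a minimal sketch on a hypothetical toy model:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model standing in for whatever network is being compressed.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# Zero out the 30% of weights with the smallest L1 magnitude, per layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)

# Make the pruning permanent: fold the mask into the weight tensor.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.remove(module, "weight")
```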


Quantization: A Bit Can Go a Long Way

Following up on our model compression blog post series, we will now delve into quantization, one of the more powerful compression techniques that we can leverage to reduce the size and memory footprint of our models. Read more here.
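
To give a taste before diving in, here is a minimal sketch of post-training dynamic quantization using PyTorch's built-in API, applied to a hypothetical toy model:

```python
import torch
import torch.nn as nn

# Toy float32 model standing in for the network being compressed.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# Post-training dynamic quantization: weights are stored as int8 and
# activations are quantized on the fly, shrinking Linear layers ~4x.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
```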


Model Compression: A Survey of Techniques, Tools, and Libraries


Machine learning has witnessed a surge of interest in recent years, driven by several factors: the availability of large datasets, advances in transfer learning, and the development of more powerful neural network architectures, all giving rise to capable models with a wide range of applications. Read more here.

© Anwaar Khalid. Built using Pelican. Theme by Giulio Fidente on github.