Balancing Accuracy and Latency in Multipath Neural Networks

25 Apr 2021 · Mohammed Amer, Tomás Maul, Iman Yi Liao ·

The growing capacity of neural networks has strongly contributed to their success at complex machine learning tasks and the computational demand of such large models has, in turn, stimulated a significant improvement in the hardware necessary to accelerate their computations. However, models with high latency aren't suitable for limited-resource environments such as hand-held and IoT devices. Hence, many deep learning techniques aim to address this problem by developing models with reasonable accuracy without violating the limited-resource constraint. In this work, we use a one-shot neural architecture search model to implicitly evaluate the performance of an intractable number of multipath neural networks. Combining this architecture search with a pruning technique and architecture sample evaluation, we can model the relation between the accuracy and the latency of a spectrum of models with graded complexity. We show that our method can accurately model the relative performance between models with different latencies and predict the performance of unseen models with good precision across different datasets.

PDF Abstract

Code

Add Remove Mark official

No code implementations yet. Submit your code now

Tasks

Add Remove

Neural Architecture Search

Datasets

Add Datasets introduced or used in this paper

Results from the Paper

Edit

Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods

Add Remove

Pruning

Edit Social Preview

Balancing Accuracy and Latency in Multipath Neural Networks

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Edit

Methods

Add Remove