no code implementations • 14 Apr 2023 • Abhisek Kundu, Naveen K. Mellempudi, Dharma Teja Vooturi, Bharat Kaul, Pradeep Dubey
We integrated GA with the latest learnable pruning methods to create an automated sparse training algorithm called AutoSparse, which achieves better accuracy and/or lower training/inference FLOPS than existing learnable pruning methods for sparse ResNet50 and MobileNetV1 on ImageNet-1K. For ResNet50 on ImageNet at 80% sparsity, AutoSparse achieves a 2x reduction in training FLOPS and a 7x reduction in inference FLOPS.
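As an illustrative aside (not the paper's exact algorithm), the learnable-pruning idea that AutoSparse builds on can be sketched as a linear layer whose pruning threshold is itself a trainable parameter. The sketch below follows the soft-threshold reparameterization style; the class name, parameter names, and initial values are hypothetical choices for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftThresholdLinear(nn.Module):
    """Linear layer with a learnable soft threshold on weight magnitudes.

    Illustrative sketch only: weights whose magnitude falls below the
    learned threshold are zeroed in the forward pass, and the threshold
    parameter `s` receives gradients through the soft-threshold function.
    """

    def __init__(self, in_features, out_features, init_s=-5.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.s = nn.Parameter(torch.tensor(init_s))  # threshold = sigmoid(s)

    def forward(self, x):
        thr = torch.sigmoid(self.s)
        # Soft-threshold: shrink magnitudes by thr, zeroing small weights.
        w_sparse = torch.sign(self.weight) * F.relu(self.weight.abs() - thr)
        return F.linear(x, w_sparse, self.bias)
```

Training such a layer jointly learns the weights and the per-layer threshold, which is the kind of mechanism an automated sparse-training schedule then controls.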
no code implementations • 24 Jun 2020 • Dharma Teja Vooturi, Girish Varma, Kishore Kothapalli
We also propose using products of Ramanujan graphs, which give the best connectivity for a given level of sparsity.
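As a rough sketch of the graph-product idea (illustrative only, with a hypothetical base graph standing in for an actual Ramanujan graph), the sparsity mask of a large weight matrix can be formed as the Kronecker product of the biadjacency matrices of two small bipartite graphs:

```python
import numpy as np

def graph_product_mask(base_a: np.ndarray, base_b: np.ndarray) -> np.ndarray:
    """Build a sparsity mask for a weight matrix as the Kronecker product of
    two small bipartite-graph biadjacency matrices.

    Illustrative only: the paper uses products of Ramanujan bipartite graphs
    (good spectral expanders); `base_a` and `base_b` here are hypothetical
    biadjacency matrices standing in for such graphs.
    """
    return np.kron(base_a, base_b)

# Example: a 4x4 circulant biadjacency matrix of degree 2 (density 0.5),
# combined with itself, yields a 16x16 mask with 25% density.
base = np.array([[1, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 1],
                 [1, 0, 0, 1]], dtype=np.float32)
mask = graph_product_mask(base, base)
print(mask.shape, mask.mean())   # (16, 16) 0.25
```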
no code implementations • 29 May 2019 • Dhiraj Kalamkar, Dheevatsa Mudigere, Naveen Mellempudi, Dipankar Das, Kunal Banerjee, Sasikanth Avancha, Dharma Teja Vooturi, Nataraj Jammalamadaka, Jianyu Huang, Hector Yuen, Jiyan Yang, Jongsoo Park, Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Abhisek Kundu, Misha Smelyanskiy, Bharat Kaul, Pradeep Dubey
In this paper, we discuss the flow of tensors and various key operations in mixed precision training, and delve into the details of operations such as the rounding modes used when converting FP32 tensors to BFLOAT16.
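For context, one common rounding mode for the FP32-to-BFLOAT16 conversion discussed here is round-to-nearest-even applied to the 16 discarded mantissa bits. Below is a minimal numpy sketch of that conversion (not the paper's kernel, and omitting special handling of NaN):

```python
import numpy as np

def fp32_to_bf16_rne(x: np.ndarray) -> np.ndarray:
    """Convert FP32 values to BFLOAT16 bit patterns with round-to-nearest-even.

    BFLOAT16 keeps the 8 exponent bits of FP32 and truncates the mantissa
    from 23 to 7 bits; the rounding bias below implements RNE on the
    discarded 16 mantissa bits.
    """
    bits = x.astype(np.float32).view(np.uint32)
    lsb = (bits >> 16) & 1                      # least significant kept bit
    rounded = bits + 0x7FFF + lsb               # round-to-nearest-even bias
    return (rounded >> 16).astype(np.uint16)    # upper 16 bits = BF16 pattern

def bf16_to_fp32(b: np.ndarray) -> np.ndarray:
    """Expand BFLOAT16 bit patterns back to FP32 by zero-filling the low bits."""
    return (b.astype(np.uint32) << 16).view(np.float32)

x = np.array([1.0, 3.14159265, 1e-3, -2.71828183], dtype=np.float32)
print(bf16_to_fp32(fp32_to_bf16_rne(x)))
```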
no code implementations • 10 Aug 2018 • Dharma Teja Vooturi, Dheevatsa Mudigere, Sasikanth Avancha
In this work, we jointly address the accuracy and performance of sparse DNNs using our proposed class of sparse neural networks, HBsNN (Hierarchical Block sparse Neural Networks).
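A hierarchical block-sparse structure of this flavor can be sketched as two-level magnitude pruning: first over blocks, then within the kept blocks. This is an illustrative stand-in, not the paper's HBsNN construction, and the function and parameter names are hypothetical.

```python
import numpy as np

def hierarchical_block_sparse(weight, block, block_keep, elem_keep):
    """Two-level magnitude pruning: keep the top `block_keep` fraction of
    blocks by Frobenius norm, then the top `elem_keep` fraction of elements
    inside each kept block. Illustrative sketch only.
    """
    out = np.zeros_like(weight)
    rows, cols = weight.shape
    blocks = []
    for i in range(0, rows, block):
        for j in range(0, cols, block):
            b = weight[i:i + block, j:j + block]
            blocks.append((np.linalg.norm(b), i, j))
    blocks.sort(reverse=True, key=lambda t: t[0])
    for _, i, j in blocks[:int(len(blocks) * block_keep)]:
        b = weight[i:i + block, j:j + block]
        thr = np.quantile(np.abs(b), 1.0 - elem_keep)
        out[i:i + block, j:j + block] = np.where(np.abs(b) >= thr, b, 0.0)
    return out

w = np.random.randn(8, 8).astype(np.float32)
w_sparse = hierarchical_block_sparse(w, block=4, block_keep=0.5, elem_keep=0.5)
```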
no code implementations • 1 Nov 2017 • Dharma Teja Vooturi, Saurabh Goyal, Anamitra R. Choudhury, Yogish Sabharwal, Ashish Verma
The large number of weights in deep neural networks makes the models difficult to deploy in low-memory environments such as mobile phones, IoT edge devices, and "inferencing as a service" environments on the cloud.