Search Results for author: Igor Gitman

Found 11 papers, 5 papers with code

OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset

1 code implementation • 15 Feb 2024 • Shubham Toshniwal, Ivan Moshkov, Sean Narenthiran, Daria Gitman, Fei Jia, Igor Gitman

Building on recent progress in open-source LLMs, our proposed prompting novelty, and some brute-force scaling, we construct OpenMathInstruct-1, a math instruction tuning dataset with 1.8M problem-solution pairs.

Ranked #1 on Math Word Problem Solving on MAWPS (using extra training data)

Arithmetic Reasoning GSM8K +2
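The abstract above describes generating solutions with an LLM at scale and keeping problem-solution pairs. A minimal sketch of one plausible ingredient, answer-filtered synthetic data generation, is shown below; the function names and the `#### <answer>` solution format are illustrative assumptions, not the authors' actual pipeline.

```python
# Hypothetical sketch of answer-filtered synthetic data generation,
# loosely in the spirit of OpenMathInstruct-1 (NOT the paper's real code).
# `generate_solutions` stands in for sampling an open-source LLM.

def extract_answer(solution: str) -> str:
    """Pull the final answer, assuming solutions end with '#### <answer>'."""
    return solution.rsplit("####", 1)[-1].strip()

def build_dataset(problems, generate_solutions, samples_per_problem=4):
    """Keep only sampled solutions whose final answer matches the reference."""
    pairs = []
    for problem, reference_answer in problems:
        for solution in generate_solutions(problem, n=samples_per_problem):
            if extract_answer(solution) == reference_answer:
                pairs.append({"problem": problem, "solution": solution})
    return pairs

# Toy stand-in generator: canned strings instead of LLM samples.
def toy_generator(problem, n):
    return ["Add the numbers. #### 4", "Add the numbers. #### 5"][:n]

dataset = build_dataset([("What is 2 + 2?", "4")], toy_generator)
# Only the solution with the correct final answer survives the filter.
```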

Confidence-based Ensembles of End-to-End Speech Recognition Models

no code implementations • 27 Jun 2023 • Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg

Second, we demonstrate that it is possible to combine base and adapted models to achieve strong results on both original and target data.

Language Identification Model Selection +2
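A confidence-based ensemble of this kind can be sketched very simply: run each model and keep the transcript the most confident model produced. The sketch below uses mean token log-probability as the confidence score; the callable signature and score choice are assumptions for illustration, not the paper's actual method or API.

```python
# Hedged sketch: per utterance, pick the ASR model whose own output it is
# most confident about (here: mean token log-probability).
import math

def mean_logprob(token_logprobs):
    return sum(token_logprobs) / len(token_logprobs)

def confidence_ensemble(utterance, models):
    """models: callables returning (transcript, token_logprobs)."""
    best_text, best_conf = None, -math.inf
    for model in models:
        text, logprobs = model(utterance)
        conf = mean_logprob(logprobs)
        if conf > best_conf:
            best_text, best_conf = text, conf
    return best_text

base = lambda u: ("hello world", [-0.1, -0.2])    # confident on this domain
adapted = lambda u: ("hallo welt", [-1.5, -2.0])  # less confident here
print(confidence_ensemble("audio", [base, adapted]))  # → hello world
```

This is how combining base and adapted models can serve both original and target data: whichever model "recognizes" the input domain scores higher and wins the selection.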

Powerful and Extensible WFST Framework for RNN-Transducer Losses

no code implementations • 18 Mar 2023 • Aleksandr Laptev, Vladimir Bataev, Igor Gitman, Boris Ginsburg

This paper presents a framework based on Weighted Finite-State Transducers (WFST) to simplify the development of modifications for RNN-Transducer (RNN-T) loss.

Convergence Analysis of Gradient Descent Algorithms with Proportional Updates

no code implementations • 9 Jan 2018 • Igor Gitman, Deepak Dilipkumar, Ben Parr

The basic idea of both of these algorithms is to make each step of the gradient descent proportional to the current weight norm and independent of the gradient magnitude.
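The idea in the abstract, a step whose size is proportional to the weight norm and independent of the gradient magnitude, can be written in a few lines. This is a minimal illustrative sketch of a LARS-style proportional update, not the paper's reference implementation; the `trust` coefficient name is an assumption.

```python
# Minimal sketch of a proportional (LARS-style) update: the per-layer step
# is scaled by trust * ||w|| / ||g||, so the update magnitude is
# proportional to the weight norm and independent of the gradient scale.
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def proportional_step(w, grad, trust=0.01, eps=1e-9):
    local_lr = trust * norm(w) / (norm(grad) + eps)
    return [wi - local_lr * gi for wi, gi in zip(w, grad)]

w = [3.0, 4.0]          # ||w|| = 5
g = [0.6, 0.8]          # ||g|| = 1; any rescaling of g gives the same step
w1 = proportional_step(w, g)
w2 = proportional_step(w, [100 * gi for gi in g])
# Both updates move w by exactly trust * ||w|| = 0.05 in the direction of g.
```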

Large Batch Training of Convolutional Networks

12 code implementations • 13 Aug 2017 • Yang You, Igor Gitman, Boris Ginsburg

Using LARS, we scaled AlexNet up to a batch size of 8K, and ResNet-50 to a batch size of 32K, without loss in accuracy.

