1 code implementation • 15 Feb 2024 • Shubham Toshniwal, Ivan Moshkov, Sean Narenthiran, Daria Gitman, Fei Jia, Igor Gitman
Building on the recent progress in open-source LLMs, our proposed prompting novelty, and some brute-force scaling, we construct OpenMathInstruct-1, a math instruction tuning dataset with 1.8M problem-solution pairs.
Ranked #1 on Math Word Problem Solving on MAWPS (using extra training data)
no code implementations • 27 Jun 2023 • Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg
Second, we demonstrate that it is possible to combine base and adapted models to achieve strong results on both original and target data.
no code implementations • 18 Mar 2023 • Aleksandr Laptev, Vladimir Bataev, Igor Gitman, Boris Ginsburg
This paper presents a framework based on Weighted Finite-State Transducers (WFST) to simplify the development of modifications for RNN-Transducer (RNN-T) loss.
1 code implementation • NeurIPS 2019 • Igor Gitman, Hunter Lang, Pengchuan Zhang, Lin Xiao
The use of momentum in stochastic gradient methods has become a widespread practice in machine learning.
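The momentum practice mentioned here is the classic heavy-ball update, in which a velocity term accumulates past gradients. A minimal sketch (the function name and hyperparameter values are illustrative, not from the paper):

```python
def momentum_step(w, v, grad, lr=0.01, beta=0.9):
    """One heavy-ball momentum step.

    The velocity v accumulates an exponentially decayed sum of past
    gradients; the weights move along the velocity, not the raw gradient.
    """
    v = beta * v + grad   # accumulate gradient history
    w = w - lr * v        # step along the accumulated direction
    return w, v
```

With `beta=0` this reduces to plain SGD; larger `beta` smooths the trajectory across noisy stochastic gradients.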
no code implementations • WS 2018 • Oleksii Kuchaiev, Boris Ginsburg, Igor Gitman, Vitaly Lavrukhin, Carl Case, Paulius Micikevicius
We present OpenSeq2Seq - an open-source toolkit for training sequence-to-sequence models.
Automatic Speech Recognition (ASR) +4
3 code implementations • 25 May 2018 • Oleksii Kuchaiev, Boris Ginsburg, Igor Gitman, Vitaly Lavrukhin, Jason Li, Huyen Nguyen, Carl Case, Paulius Micikevicius
We present OpenSeq2Seq - a TensorFlow-based toolkit for training sequence-to-sequence models that features distributed and mixed-precision training.
Automatic Speech Recognition (ASR) +4
1 code implementation • 28 Apr 2018 • Igor Gitman, Jieshi Chen, Eric Lei, Artur Dubrawski
In this paper, we propose two novel approaches to solving this problem.
no code implementations • 9 Jan 2018 • Igor Gitman, Deepak Dilipkumar, Ben Parr
The basic idea of both of these algorithms is to make each step of the gradient descent proportional to the current weight norm and independent of the gradient magnitude.
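The idea described above — a step whose size is proportional to the current weight norm and independent of the gradient magnitude — can be sketched as follows. This is a minimal illustration under that description only; the function name and `eps` safeguard are assumptions, not the paper's implementation:

```python
import numpy as np

def proportional_update(w, grad, lr=0.01, eps=1e-8):
    """One gradient step whose magnitude is lr * ||w||.

    The gradient is used only for its direction (normalized to unit
    length), so the step size does not depend on the gradient magnitude.
    """
    grad_dir = grad / (np.linalg.norm(grad) + eps)  # unit direction
    step = lr * np.linalg.norm(w)                   # scale with weight norm
    return w - step * grad_dir
```

Because the step scales with the weight norm rather than the gradient norm, the relative change of the weights per iteration stays roughly constant, which is the same intuition behind layer-wise schemes such as LARS.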
no code implementations • ICLR 2018 • Boris Ginsburg, Igor Gitman, Yang You
Using LARS, we scaled AlexNet and ResNet-50 to a batch size of 16K.
no code implementations • 24 Sep 2017 • Igor Gitman, Boris Ginsburg
However, it is not clear if these algorithms could replace BN in practical, large-scale applications.
12 code implementations • 13 Aug 2017 • Yang You, Igor Gitman, Boris Ginsburg
Using LARS, we scaled AlexNet up to a batch size of 8K, and ResNet-50 to a batch size of 32K without loss in accuracy.