1 code implementation • 16 Oct 2023 • Nilesh Gupta, Devvrit Khatri, Ankit S Rawat, Srinadh Bhojanapalli, Prateek Jain, Inderjit Dhillon
We propose the decoupled softmax loss, a simple modification of the InfoNCE loss that overcomes the limitations of existing contrastive losses.
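The paper defines its loss precisely; as an illustration only, the sketch below contrasts a standard InfoNCE loss with one common "decoupled" variant that removes the positive pair's term from the softmax denominator. The function names and this particular formulation are assumptions for exposition, not the paper's definition.

```python
import numpy as np

def infonce_loss(sim, pos_idx, tau=0.1):
    # sim: (n_queries, n_docs) similarity matrix; pos_idx[i] is the
    # index of the positive document for query i.
    logits = sim / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    pos = exp[np.arange(len(pos_idx)), pos_idx]
    return float(np.mean(-np.log(pos / exp.sum(axis=1))))

def decoupled_softmax_loss(sim, pos_idx, tau=0.1):
    # Illustrative "decoupled" variant: the positive logit is excluded
    # from the denominator, so positives and negatives are handled
    # separately rather than competing inside one softmax.
    logits = sim / tau
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    pos = exp[np.arange(len(pos_idx)), pos_idx]
    neg_sum = exp.sum(axis=1) - pos
    return float(np.mean(-np.log(pos / neg_sum)))
```

Because the decoupled denominator omits the positive term, its loss is always strictly below the InfoNCE value on the same similarities.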
no code implementations • 13 Oct 2023 • Ramnath Kumar, Anshul Mittal, Nilesh Gupta, Aditya Kusupati, Inderjit Dhillon, Prateek Jain
Such techniques use a two-stage process: (a) contrastive learning to train a dual encoder to embed both the query and documents and (b) approximate nearest neighbor search (ANNS) for finding similar documents for a given query.
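The two-stage pipeline can be sketched as follows. The toy `encode` function (a deterministic bag-of-words hash) is a stand-in assumption for a trained dual encoder, and exact inner-product search stands in for an ANNS index.

```python
import numpy as np

def encode(texts, dim=16):
    # Toy stand-in for a trained dual encoder: hash each token into a
    # bag-of-words bucket and L2-normalize. A real system would embed
    # text with a contrastively trained neural encoder.
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for tok in text.split():
            vecs[i, sum(ord(c) for c in tok) % dim] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-12)

docs = ["deep learning for retrieval",
        "kernel ridge regression",
        "gossip protocols"]
index = encode(docs)              # stage (a): embed the corpus once

def search(query, k=1):
    # Stage (b): nearest-neighbor search by inner product. Exact here;
    # at industrial scale this is replaced by an ANNS structure.
    scores = index @ encode([query])[0]
    return np.argsort(-scores)[:k]
```

A query like `search("kernel regression")` ranks the second document first because it shares the most token buckets with the query.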
2 code implementations • 11 Oct 2023 • Devvrit, Sneha Kudugunta, Aditya Kusupati, Tim Dettmers, KaiFeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, Prateek Jain
Furthermore, we observe that smaller encoders extracted from a universal MatFormer-based ViT (MatViT) encoder preserve the metric-space structure for adaptive large-scale retrieval.
no code implementations • 3 Aug 2022 • Samarth Gupta, Daniel N. Hill, Lexing Ying, Inderjit Dhillon
Due to noise, the policy learned from the estimated model is often far from the optimal policy of the underlying model.
no code implementations • 1 Jun 2022 • Anish Acharya, Sujay Sanghavi, Li Jing, Bhargav Bhushanam, Dhruv Choudhary, Michael Rabbat, Inderjit Dhillon
We extend this paradigm to the classical positive unlabeled (PU) setting, where the task is to learn a binary classifier given only a few labeled positive samples, and (often) a large amount of unlabeled samples (which could be positive or negative).
no code implementations • 21 Feb 2022 • Haoya Li, Hsiang-Fu Yu, Lexing Ying, Inderjit Dhillon
Entropy regularized Markov decision processes have been widely used in reinforcement learning.
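A minimal sketch of what entropy regularization does in this setting: the max in the Bellman backup becomes a soft (log-sum-exp) backup, and the greedy policy becomes a softmax policy. The two-state dynamics `P`, rewards `R`, and temperature `tau` below are illustrative assumptions, not from any paper in this list.

```python
import numpy as np

# Soft (entropy-regularized) value iteration on a tiny 2-state, 2-action MDP.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],   # P[s, a, s'] transition probabilities
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[1.0, 0.0],                 # R[s, a] rewards
              [0.0, 2.0]])
gamma, tau = 0.9, 0.5                     # discount factor, entropy temperature

V = np.zeros(2)
for _ in range(500):
    Q = R + gamma * P @ V                           # Q[s, a]
    V = tau * np.log(np.exp(Q / tau).sum(axis=1))   # soft (log-sum-exp) backup

# The optimal regularized policy is a softmax over Q-values.
pi = np.exp(Q / tau) / np.exp(Q / tau).sum(axis=1, keepdims=True)
```

As `tau -> 0` the soft backup recovers the standard max and the softmax policy becomes greedy.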
1 code implementation • NAACL 2022 • Yuanhao Xiong, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit Dhillon
To learn the semantic embeddings of instances and labels with raw text, we propose to pre-train Transformer-based encoders with self-supervised contrastive losses.
Multi-Label Text Classification +2
no code implementations • NeurIPS 2021 • Pei-Hung Chen, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh
In addition to compressing standard models, our method can also be used on distilled BERT models to further improve the compression rate.
no code implementations • 5 Oct 2021 • Haoya Li, Samarth Gupta, Hsiang-Fu Yu, Lexing Ying, Inderjit Dhillon
This paper proposes an approximate Newton method for the policy gradient algorithm with entropy regularization.
1 code implementation • 4 Jun 2021 • Philip A. Etter, Kai Zhong, Hsiang-Fu Yu, Lexing Ying, Inderjit Dhillon
In industrial applications, these models operate at extreme scales, where every bit of performance is critical.
1 code implementation • 15 Feb 2021 • Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon
We show that our algorithm has a regret guarantee of $O(k\sqrt{(A-k+1)T \log (|\mathcal{F}|T)})$, where $A$ is the total number of arms and $\mathcal{F}$ is the class containing the regression function, while only requiring $\tilde{O}(A)$ computation per time step.
Computational Efficiency, Extreme Multi-Label Classification +2
no code implementations • 28 Nov 2020 • Devvrit, Minhao Cheng, Cho-Jui Hsieh, Inderjit Dhillon
Several previous attempts tackled this problem by ensembling soft-label predictions, but these have been shown to be vulnerable to the latest attack methods.
1 code implementation • 20 Nov 2020 • Abolfazl Hashemi, Anish Acharya, Rudrajit Das, Haris Vikalo, Sujay Sanghavi, Inderjit Dhillon
In this paper, we show that, in such compressed decentralized optimization settings, there are benefits to having multiple gossip steps between subsequent gradient iterations, even when the cost of doing so is appropriately accounted for, e.g., by reducing the precision of the compressed information.
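A minimal sketch of the idea, assuming a 4-node ring with a hand-picked doubly stochastic mixing matrix `W` and a crude uniform quantizer standing in for message compression. It only illustrates that extra gossip rounds shrink disagreement between nodes; it is not the paper's algorithm or analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ring of 4 nodes; W is a doubly stochastic gossip (mixing) matrix.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

def quantize(x, levels=8):
    # Crude uniform quantizer standing in for message compression.
    lo, hi = x.min(), x.max()
    if hi == lo:
        return x.copy()
    step = (hi - lo) / (levels - 1)
    return lo + np.round((x - lo) / step) * step

def gossip(x, rounds):
    # Each round, nodes exchange *compressed* values and mix with W.
    for _ in range(rounds):
        x = W @ quantize(x)
    return x

x0 = rng.normal(size=4)         # each node's local value (e.g., a gradient coordinate)
err1 = np.std(gossip(x0, 1))    # disagreement after 1 gossip step
err5 = np.std(gossip(x0, 5))    # after 5 gossip steps between gradient updates
```

Running several gossip rounds drives the nodes much closer to consensus (`err5 < err1`), at the cost of more communication, which the quantizer partly offsets.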
no code implementations • ICML 2020 • Yanyao Shen, Hsiang-Fu Yu, Sujay Sanghavi, Inderjit Dhillon
Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC sizes.
1 code implementation • ICML 2020 • Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh
The main reason is that position information among input units is not inherently encoded, i.e., the models are permutation equivariant; this is why all existing models are accompanied by a sinusoidal encoding/embedding layer at the input.
Ranked #5 on Semantic Textual Similarity on MRPC
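The sinusoidal encoding layer referred to above is the standard Transformer construction (even dimensions get sines, odd dimensions get cosines, with geometrically spaced frequencies); a minimal NumPy version, assuming an even `d_model`:

```python
import numpy as np

def sinusoidal_encoding(n_positions, d_model):
    # Standard Transformer sinusoidal position encoding:
    # pe[p, 2i]   = sin(p / 10000^(2i / d_model))
    # pe[p, 2i+1] = cos(p / 10000^(2i / d_model))
    pos = np.arange(n_positions)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((n_positions, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe
```

Each position gets a distinct vector, which is what breaks the permutation symmetry of the attention layers.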
no code implementations • 17 Feb 2020 • Minhao Cheng, Qi Lei, Pin-Yu Chen, Inderjit Dhillon, Cho-Jui Hsieh
Adversarial training has become one of the most effective methods for improving robustness of neural networks.
1 code implementation • NeurIPS 2019 • Rajat Sen, Hsiang-Fu Yu, Inderjit Dhillon
Forecasting high-dimensional time series plays a crucial role in many applications such as demand forecasting and financial predictions.
2 code implementations • 7 May 2019 • Wei-Cheng Chang, Hsiang-Fu Yu, Kai Zhong, Yiming Yang, Inderjit Dhillon
However, naively applying deep transformer models to the XMC problem leads to sub-optimal performance due to the large output space and the label sparsity issue.
Extreme Multi-Label Classification, General Classification +4
no code implementations • 1 Nov 2018 • Anish Acharya, Rahul Goel, Angeliki Metallinou, Inderjit Dhillon
Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, outperforming alternative methods such as fixed-point quantization and offline word embedding compression.
no code implementations • 5 Aug 2016 • Rashish Tandon, Si Si, Pradeep Ravikumar, Inderjit Dhillon
In this paper, we investigate a divide and conquer approach to Kernel Ridge Regression (KRR).
no code implementations • 19 Feb 2016 • Prateek Jain, Nikhil Rao, Inderjit Dhillon
Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups.
no code implementations • 4 Sep 2015 • Arnaud Vandaele, Nicolas Gillis, Qi Lei, Kai Zhong, Inderjit Dhillon
Given a symmetric nonnegative matrix $A$, symmetric nonnegative matrix factorization (symNMF) is the problem of finding a nonnegative matrix $H$, usually with far fewer columns than $A$, such that $A \approx HH^T$.
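A minimal sketch of one way to compute such a factorization, assuming a simple multiplicative-style update for $\min_{H \ge 0} \|A - HH^T\|_F^2$; the paper itself develops more efficient coordinate-descent algorithms, so this is only a baseline illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def symnmf(A, r, iters=500, beta=0.5, eps=1e-12):
    # Multiplicative-style update for min_{H >= 0} ||A - H H^T||_F^2.
    # Each step scales H by a nonnegative factor, so H stays nonnegative.
    n = A.shape[0]
    H = rng.random((n, r)) + 0.1
    for _ in range(iters):
        num = A @ H                      # "pull" toward fitting A
        den = H @ (H.T @ H) + eps        # current reconstruction term
        H = H * (1.0 - beta + beta * num / den)
    return H

# A symmetric nonnegative matrix with an exact rank-3 symNMF factorization.
G = rng.random((6, 3))
A = G @ G.T
H = symnmf(A, 3)
```

On this synthetic input the relative reconstruction error drops well below the error of the random initialization.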
1 code implementation • 1 Dec 2013 • Hyokun Yun, Hsiang-Fu Yu, Cho-Jui Hsieh, S. V. N. Vishwanathan, Inderjit Dhillon
One of the key features of NOMAD is that the ownership of a variable is asynchronously transferred between processors in a decentralized fashion.
Distributed, Parallel, and Cluster Computing