Search Results for author: Suhas Kotha

Found 6 papers, 3 papers with code

Jailbreaking is Best Solved by Definition

no code implementations • 20 Mar 2024 • Taeyoun Kim, Suhas Kotha, Aditi Raghunathan

The rise of "jailbreak" attacks on language models has led to a flurry of defenses aimed at preventing the output of undesirable responses.

Repetition Improves Language Model Embeddings

2 code implementations • 23 Feb 2024 • Jacob Mitchell Springer, Suhas Kotha, Daniel Fried, Graham Neubig, Aditi Raghunathan

In this work, we address an architectural limitation of autoregressive models: token embeddings cannot contain information from tokens that appear later in the input.

Language Modelling

Understanding Catastrophic Forgetting in Language Models via Implicit Inference

1 code implementation • 18 Sep 2023 • Suhas Kotha, Jacob Mitchell Springer, Aditi Raghunathan

We lack a systematic understanding of the effects of fine-tuning (via methods such as instruction-tuning or reinforcement learning from human feedback), particularly on tasks outside the narrow fine-tuning distribution.

In-Context Learning

Provably Bounding Neural Network Preimages

3 code implementations • NeurIPS 2023 • Suhas Kotha, Christopher Brix, Zico Kolter, Krishnamurthy Dvijotham, Huan Zhang

Most work on the formal verification of neural networks has focused on bounding the set of outputs that correspond to a given set of inputs (for example, bounded perturbations of a nominal input).

Adversarial Robustness

CELESTIAL: Classification Enabled via Labelless Embeddings with Self-supervised Telescope Image Analysis Learning

no code implementations • 20 Jan 2022 • Suhas Kotha, Anirudh Koul, Siddha Ganju, Meher Kasam

To solve this problem, we establish CELESTIAL, a self-supervised learning pipeline for effectively leveraging sparsely-labeled satellite imagery.

Image Retrieval, Retrieval, +2 more
