Search Results for author: Devanshu Agrawal

Found 6 papers, 4 papers with code

Can't Remember Details in Long Documents? You Need Some R&R

1 code implementation · 8 Mar 2024 · Devanshu Agrawal, Shang Gao, Martin Gajek

Long-context large language models (LLMs) hold promise for tasks such as question-answering (QA) over long documents, but they tend to miss important information in the middle of context documents (arXiv:2307.03172v3).

Question Answering

Densely Connected $G$-invariant Deep Neural Networks with Signed Permutation Representations

1 code implementation · 8 Mar 2023 · Devanshu Agrawal, James Ostrowski

In contrast to other $G$-invariant architectures in the literature, the preactivations of the $G$-DNNs presented here are able to transform by \emph{signed} permutation representations (signed perm-reps) of $G$.

3D Object Classification
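
As a rough illustration of the signed perm-rep idea above (a toy example chosen for this listing, not the paper's $G$-DNN construction): for $G = \mathbb{Z}_2$ acting on $\mathbb{R}^2$ by swapping the two coordinates, a simple weight-sharing pattern makes the preactivations transform by a signed permutation, and an invariant output is recovered after the ReLU nonlinearity. The group, weight sharing, and readout below are assumptions made for the sketch.

```python
import numpy as np

# Toy example: G = Z2 acting on R^2 by swapping the two coordinates.
# With this weight sharing, the preactivations z = W x transform by a
# *signed* permutation: swapping the inputs swaps z1 and z2 and flips signs.
a, b = 0.7, -1.3                       # shared parameters (arbitrary values)
W = np.array([[a, b],
              [-b, -a]])

def preactivation(x):
    return W @ x

def invariant_output(z):
    # ReLU(z_i) + ReLU(-z_i) = |z_i|; summing over units cancels the
    # signed permutation and gives a G-invariant network output.
    return np.sum(np.maximum(z, 0) + np.maximum(-z, 0))

x = np.array([0.4, 2.1])
gx = x[::-1]                           # group action: swap coordinates

z, z_g = preactivation(x), preactivation(gx)
signed_perm = np.array([[0, -1],
                        [-1, 0]])      # signed permutation representing the swap
assert np.allclose(z_g, signed_perm @ z)
assert np.isclose(invariant_output(z), invariant_output(z_g))
```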

A Classification of $G$-invariant Shallow Neural Networks

1 code implementation · 18 May 2022 · Devanshu Agrawal, James Ostrowski

In this paper, we take a first step towards this goal; we prove a theorem that gives a classification of all $G$-invariant single-hidden-layer or ``shallow'' neural network ($G$-SNN) architectures with ReLU activation for any finite orthogonal group $G$, and we prove a second theorem that characterizes the inclusion maps or ``network morphisms'' between the architectures that can be leveraged during neural architecture search (NAS).

General Classification, Neural Architecture Search
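
For a minimal sense of what a $G$-invariant shallow ReLU network can look like, the sketch below builds one for the cyclic group acting by shifts: the hidden units are the orbit of a single filter, with output weights tied across the orbit. This is only an illustrative example, not the paper's classification; the group and architecture choices here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                                   # G = cyclic group C_6 acting by shifts

# One hidden "orbit" of ReLU units: the weight matrix rows are all cyclic
# shifts of a single filter w, and the output weight c is shared across
# the orbit, so the network value is unchanged by any cyclic shift of x.
w = rng.standard_normal(n)
b = 0.3
c = 1.5

def g_snn(x):
    hidden = [np.maximum(w @ np.roll(x, k) + b, 0.0) for k in range(n)]
    return c * np.sum(hidden)

x = rng.standard_normal(n)
for shift in range(n):
    assert np.isclose(g_snn(x), g_snn(np.roll(x, shift)))
```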

A Group-Equivariant Autoencoder for Identifying Spontaneously Broken Symmetries

1 code implementation · 13 Feb 2022 · Devanshu Agrawal, Adrian Del Maestro, Steven Johnston, James Ostrowski

We use group theory to deduce which symmetries of the system remain intact in all phases, and then use this information to constrain the parameters of the GE-autoencoder such that the encoder learns an order parameter invariant to these ``never-broken'' symmetries.
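
One generic way to make an encoder invariant to a never-broken symmetry is to symmetrize its output over the group orbit of the input; the sketch below (assuming PyTorch, a small MLP encoder, and cyclic translations of a 1D configuration) does exactly that. It only illustrates the invariance constraint, not the GE-autoencoder's actual parameter constraints, and the architecture and symmetry group are assumptions made here.

```python
import torch
import torch.nn as nn

class TranslationInvariantEncoder(nn.Module):
    """Encoder made invariant to cyclic translations by averaging its
    output over the orbit of the input under the translation group
    (generic symmetrization; not the paper's parameter-sharing scheme)."""

    def __init__(self, n_sites: int, latent_dim: int = 1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_sites, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        self.n_sites = n_sites

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Average over all cyclic shifts of the configuration, so the
        # learned "order parameter" cannot depend on the never-broken
        # translation symmetry.
        outs = [self.net(torch.roll(x, shifts=k, dims=-1))
                for k in range(self.n_sites)]
        return torch.stack(outs, dim=0).mean(dim=0)

# Quick invariance check on a random configuration
enc = TranslationInvariantEncoder(n_sites=8)
x = torch.randn(1, 8)
assert torch.allclose(enc(x), enc(torch.roll(x, 3, dims=-1)), atol=1e-5)
```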

Deep Ensemble Kernel Learning

no code implementations · 1 Jan 2021 · Devanshu Agrawal, Jacob D Hinkle

In the deep kernel learning (DKL) paradigm, a deep neural network or "feature network" is used to map inputs into a latent feature space, where a GP with a "base kernel" acts; the resulting model is then trained in an end-to-end fashion.

Gaussian Processes
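
A minimal sketch of the DKL setup described above, assuming PyTorch, a small MLP as the feature network, an RBF base kernel on the latent features, and exact GP regression trained end to end by maximizing the log marginal likelihood (the toy data and hyperparameters are arbitrary choices for this sketch):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy 1D regression data
X = torch.linspace(-3, 3, 60).unsqueeze(-1)
y = torch.sin(2 * X).squeeze(-1) + 0.1 * torch.randn(60)

# "Feature network": maps inputs into a latent feature space
feature_net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 2))

# "Base kernel" (RBF) hyperparameters plus observation noise, in log scale
log_lengthscale = nn.Parameter(torch.zeros(()))
log_outputscale = nn.Parameter(torch.zeros(()))
log_noise = nn.Parameter(torch.tensor(-2.0))

def rbf_kernel(z1, z2):
    # Squared distances computed directly (avoids sqrt-at-zero gradients)
    d2 = (z1.unsqueeze(1) - z2.unsqueeze(0)).pow(2).sum(-1)
    return log_outputscale.exp() * torch.exp(-0.5 * d2 / log_lengthscale.exp() ** 2)

def neg_log_marginal_likelihood(X, y):
    z = feature_net(X)                                   # latent features
    K = rbf_kernel(z, z) + log_noise.exp() * torch.eye(len(y))
    L = torch.linalg.cholesky(K)
    alpha = torch.cholesky_solve(y.unsqueeze(-1), L)
    # 0.5 * y^T K^{-1} y + log|K|^{1/2}  (constant term dropped)
    return 0.5 * (y @ alpha.squeeze(-1)) + torch.log(torch.diagonal(L)).sum()

# End-to-end training: feature-net weights and kernel hyperparameters together
params = list(feature_net.parameters()) + [log_lengthscale, log_outputscale, log_noise]
opt = torch.optim.Adam(params, lr=0.01)
for step in range(200):
    opt.zero_grad()
    loss = neg_log_marginal_likelihood(X, y)
    loss.backward()
    opt.step()
```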

Wide Neural Networks with Bottlenecks are Deep Gaussian Processes

no code implementations · 3 Jan 2020 · Devanshu Agrawal, Theodore Papamarkou, Jacob Hinkle

There has recently been much work on the "wide limit" of neural networks, where Bayesian neural networks (BNNs) are shown to converge to a Gaussian process (GP) as all hidden layers are sent to infinite width.

Gaussian Processes
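
The wide-limit correspondence mentioned above can be checked numerically for a one-hidden-layer ReLU network: with Gaussian priors on the weights, the output covariance across random networks matches the arc-cosine (NNGP) kernel in closed form. The sketch below (NumPy, toy inputs; it illustrates the background wide-limit result, not the paper's bottleneck construction) compares a Monte Carlo estimate against that formula.

```python
import numpy as np

rng = np.random.default_rng(0)
d, width, n_nets = 3, 1000, 4000

# Two fixed inputs at which to evaluate the output covariance
x1 = rng.standard_normal(d)
x2 = rng.standard_normal(d)

def network_output(x, W, v):
    # One-hidden-layer ReLU network with 1/sqrt(width) output scaling
    return np.maximum(W @ x, 0.0) @ v / np.sqrt(width)

# Monte Carlo over random networks: W_ij ~ N(0, 1/d), v_i ~ N(0, 1)
f1, f2 = np.empty(n_nets), np.empty(n_nets)
for k in range(n_nets):
    W = rng.standard_normal((width, d)) / np.sqrt(d)
    v = rng.standard_normal(width)
    f1[k], f2[k] = network_output(x1, W, v), network_output(x2, W, v)

empirical_cov = np.mean(f1 * f2)

# Closed-form NNGP kernel (arc-cosine kernel of degree 1, Cho & Saul 2009)
cos_t = x1 @ x2 / (np.linalg.norm(x1) * np.linalg.norm(x2))
theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
nngp = (np.linalg.norm(x1) * np.linalg.norm(x2) / (2 * np.pi * d)) * (
    np.sin(theta) + (np.pi - theta) * np.cos(theta))

print(empirical_cov, nngp)   # these agree to within Monte Carlo error
```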
