Search Results for author: Devanshu Agrawal

Found 6 papers, 4 papers with code

Can't Remember Details in Long Documents? You Need Some R&R

1 code implementation · 8 Mar 2024 · Devanshu Agrawal, Shang Gao, Martin Gajek

Long-context large language models (LLMs) hold promise for tasks such as question-answering (QA) over long documents, but they tend to miss important information in the middle of context documents (arXiv:2307.03172v3).

Question Answering

Densely Connected $G$-invariant Deep Neural Networks with Signed Permutation Representations

1 code implementation · 8 Mar 2023 · Devanshu Agrawal, James Ostrowski

In contrast to other $G$-invariant architectures in the literature, the preactivations of the $G$-DNNs presented here are able to transform by \emph{signed} permutation representations (signed perm-reps) of $G$.

3D Object Classification
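
As a rough illustration of the signed perm-rep idea above (a toy example chosen for this listing, not the paper's $G$-DNN construction): for $G = \mathbb{Z}_2$ acting on $\mathbb{R}^2$ by swapping the two coordinates, a simple weight-sharing pattern makes the preactivations transform by a signed permutation, and an invariant output is recovered after the ReLU nonlinearity. The group, weight sharing, and readout below are assumptions made for the sketch.

```python
import numpy as np

# Toy example: G = Z2 acting on R^2 by swapping the two coordinates.
# With this weight sharing, the preactivations z = W x transform by a
# *signed* permutation: swapping the inputs swaps z1 and z2 and flips signs.
a, b = 0.7, -1.3                       # shared parameters (arbitrary values)
W = np.array([[a, b],
              [-b, -a]])

def preactivation(x):
    return W @ x

def invariant_output(z):
    # ReLU(z_i) + ReLU(-z_i) = |z_i|; summing over units cancels the
    # signed permutation and gives a G-invariant network output.
    return np.sum(np.maximum(z, 0) + np.maximum(-z, 0))

x = np.array([0.4, 2.1])
gx = x[::-1]                           # group action: swap coordinates

z, z_g = preactivation(x), preactivation(gx)
signed_perm = np.array([[0, -1],
                        [-1, 0]])      # signed permutation representing the swap
assert np.allclose(z_g, signed_perm @ z)
assert np.isclose(invariant_output(z), invariant_output(z_g))
```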

A Classification of $G$-invariant Shallow Neural Networks

1 code implementation · 18 May 2022 · Devanshu Agrawal, James Ostrowski

In this paper, we take a first step towards this goal; we prove a theorem that gives a classification of all $G$-invariant single-hidden-layer or ``shallow'' neural network ($G$-SNN) architectures with ReLU activation for any finite orthogonal group $G$, and we prove a second theorem that characterizes the inclusion maps or ``network morphisms'' between the architectures that can be leveraged during neural architecture search (NAS).

General Classification, Neural Architecture Search
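
For a minimal sense of what a $G$-invariant shallow ReLU network can look like, the sketch below builds one for the cyclic group acting by shifts: the hidden units are the orbit of a single filter, with output weights tied across the orbit. This is only an illustrative example, not the paper's classification; the group and architecture choices here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                                   # G = cyclic group C_6 acting by shifts

# One hidden "orbit" of ReLU units: the weight matrix rows are all cyclic
# shifts of a single filter w, and the output weight c is shared across
# the orbit, so the network value is unchanged by any cyclic shift of x.
w = rng.standard_normal(n)
b = 0.3
c = 1.5

def g_snn(x):
    hidden = [np.maximum(w @ np.roll(x, k) + b, 0.0) for k in range(n)]
    return c * np.sum(hidden)

x = rng.standard_normal(n)
for shift in range(n):
    assert np.isclose(g_snn(x), g_snn(np.roll(x, shift)))
```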

A Group-Equivariant Autoencoder for Identifying Spontaneously Broken Symmetries

1 code implementation · 13 Feb 2022 · Devanshu Agrawal, Adrian Del Maestro, Steven Johnston, James Ostrowski

We use group theory to deduce which symmetries of the system remain intact in all phases, and then use this information to constrain the parameters of the GE-autoencoder such that the encoder learns an order parameter invariant to these ``never-broken'' symmetries.
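
One generic way to make an encoder invariant to a never-broken symmetry is to symmetrize its output over the group orbit of the input; the sketch below (assuming PyTorch, a small MLP encoder, and cyclic translations of a 1D configuration) does exactly that. It only illustrates the invariance constraint, not the GE-autoencoder's actual parameter constraints, and the architecture and symmetry group are assumptions made here.

```python
import torch
import torch.nn as nn

class TranslationInvariantEncoder(nn.Module):
    """Encoder made invariant to cyclic translations by averaging its
    output over the orbit of the input under the translation group
    (generic symmetrization; not the paper's parameter-sharing scheme)."""

    def __init__(self, n_sites: int, latent_dim: int = 1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_sites, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        self.n_sites = n_sites

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Average over all cyclic shifts of the configuration, so the
        # learned "order parameter" cannot depend on the never-broken
        # translation symmetry.
        outs = [self.net(torch.roll(x, shifts=k, dims=-1))
                for k in range(self.n_sites)]
        return torch.stack(outs, dim=0).mean(dim=0)

# Quick invariance check on a random configuration
enc = TranslationInvariantEncoder(n_sites=8)
x = torch.randn(1, 8)
assert torch.allclose(enc(x), enc(torch.roll(x, 3, dims=-1)), atol=1e-5)
```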

Deep Ensemble Kernel Learning

no code implementations · 1 Jan 2021 · Devanshu Agrawal, Jacob D Hinkle

In the deep kernel learning (DKL) paradigm, a deep neural network or "feature network" is used to map inputs into a latent feature space, where a GP with a "base kernel" acts; the resulting model is then trained in an end-to-end fashion.

Gaussian Processes
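
A minimal sketch of the DKL setup described above, assuming PyTorch, a small MLP as the feature network, an RBF base kernel on the latent features, and exact GP regression trained end to end by maximizing the log marginal likelihood (the toy data and hyperparameters are arbitrary choices for this sketch):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy 1D regression data
X = torch.linspace(-3, 3, 60).unsqueeze(-1)
y = torch.sin(2 * X).squeeze(-1) + 0.1 * torch.randn(60)

# "Feature network": maps inputs into a latent feature space
feature_net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 2))

# "Base kernel" (RBF) hyperparameters plus observation noise, in log scale
log_lengthscale = nn.Parameter(torch.zeros(()))
log_outputscale = nn.Parameter(torch.zeros(()))
log_noise = nn.Parameter(torch.tensor(-2.0))

def rbf_kernel(z1, z2):
    # Squared distances computed directly (avoids sqrt-at-zero gradients)
    d2 = (z1.unsqueeze(1) - z2.unsqueeze(0)).pow(2).sum(-1)
    return log_outputscale.exp() * torch.exp(-0.5 * d2 / log_lengthscale.exp() ** 2)

def neg_log_marginal_likelihood(X, y):
    z = feature_net(X)                                   # latent features
    K = rbf_kernel(z, z) + log_noise.exp() * torch.eye(len(y))
    L = torch.linalg.cholesky(K)
    alpha = torch.cholesky_solve(y.unsqueeze(-1), L)
    # 0.5 * y^T K^{-1} y + log|K|^{1/2}  (constant term dropped)
    return 0.5 * (y @ alpha.squeeze(-1)) + torch.log(torch.diagonal(L)).sum()

# End-to-end training: feature-net weights and kernel hyperparameters together
params = list(feature_net.parameters()) + [log_lengthscale, log_outputscale, log_noise]
opt = torch.optim.Adam(params, lr=0.01)
for step in range(200):
    opt.zero_grad()
    loss = neg_log_marginal_likelihood(X, y)
    loss.backward()
    opt.step()
```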

Wide Neural Networks with Bottlenecks are Deep Gaussian Processes

no code implementations · 3 Jan 2020 · Devanshu Agrawal, Theodore Papamarkou, Jacob Hinkle

There has recently been much work on the "wide limit" of neural networks, where Bayesian neural networks (BNNs) are shown to converge to a Gaussian process (GP) as all hidden layers are sent to infinite width.

Gaussian Processes
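
The wide-limit correspondence mentioned above can be checked numerically for a one-hidden-layer ReLU network: with Gaussian priors on the weights, the output covariance across random networks matches the arc-cosine (NNGP) kernel in closed form. The sketch below (NumPy, toy inputs; it illustrates the background wide-limit result, not the paper's bottleneck construction) compares a Monte Carlo estimate against that formula.

```python
import numpy as np

rng = np.random.default_rng(0)
d, width, n_nets = 3, 1000, 4000

# Two fixed inputs at which to evaluate the output covariance
x1 = rng.standard_normal(d)
x2 = rng.standard_normal(d)

def network_output(x, W, v):
    # One-hidden-layer ReLU network with 1/sqrt(width) output scaling
    return np.maximum(W @ x, 0.0) @ v / np.sqrt(width)

# Monte Carlo over random networks: W_ij ~ N(0, 1/d), v_i ~ N(0, 1)
f1, f2 = np.empty(n_nets), np.empty(n_nets)
for k in range(n_nets):
    W = rng.standard_normal((width, d)) / np.sqrt(d)
    v = rng.standard_normal(width)
    f1[k], f2[k] = network_output(x1, W, v), network_output(x2, W, v)

empirical_cov = np.mean(f1 * f2)

# Closed-form NNGP kernel (arc-cosine kernel of degree 1, Cho & Saul 2009)
cos_t = x1 @ x2 / (np.linalg.norm(x1) * np.linalg.norm(x2))
theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
nngp = (np.linalg.norm(x1) * np.linalg.norm(x2) / (2 * np.pi * d)) * (
    np.sin(theta) + (np.pi - theta) * np.cos(theta))

print(empirical_cov, nngp)   # these agree to within Monte Carlo error
```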
