no code implementations • 25 Sep 2023 • Nidhi Hegde, Sujoy Paul, Gagan Madan, Gaurav Aggarwal
Recent document question answering models consist of two key components: the vision encoder, which captures layout and visual elements in images, and a Large Language Model (LLM) that helps contextualize questions to the image and supplements them with external world knowledge to generate accurate answers.
no code implementations • 29 Aug 2023 • Debapriya Tula, Sujoy Paul, Gagan Madan, Peter Garst, Reeve Ingle, Gaurav Aggarwal
While text line recognition models are generally trained on large corpora of real and synthetic data, such models can still make frequent mistakes if the handwriting is inscrutable or the image acquisition process adds corruptions such as noise, blur, or compression artifacts.
no code implementations • 12 Jun 2023 • Sujoy Paul, Gagan Madan, Akankshya Mishra, Narayan Hegde, Pradeep Kumar, Gaurav Aggarwal
In this work, we focus on the complex problem of extracting medicine names from handwritten prescriptions using only weakly labeled data.
1 code implementation • 30 Mar 2023 • Lucas Beyer, Bo Wan, Gagan Madan, Filip Pavetic, Andreas Steiner, Alexander Kolesnikov, André Susano Pinto, Emanuele Bugliarello, Xiao Wang, Qihang Yu, Liang-Chieh Chen, Xiaohua Zhai
A key finding is that a small decoder learned on top of a frozen pretrained encoder works surprisingly well.
no code implementations • 7 Oct 2022 • Soumyabrata Pal, Prateek Varshney, Prateek Jain, Abhradeep Guha Thakurta, Gagan Madan, Gaurav Aggarwal, Pradeep Shenoy, Gaurav Srivastava
We then study the framework in the linear setting, where the problem reduces to that of estimating the sum of a rank-$r$ and a $k$-column sparse matrix using a small number of linear measurements.
1 code implementation • 23 Nov 2021 • Ameya Daigavane, Gagan Madan, Aditya Sinha, Abhradeep Guha Thakurta, Gaurav Aggarwal, Prateek Jain
Graph Neural Networks (GNNs) are a popular technique for modelling graph-structured data and computing node-level representations via aggregation of information from the neighborhood of each node.
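As a rough illustration of the neighborhood aggregation described above, the sketch below performs one round of mean aggregation on a tiny undirected graph. It is a generic example and not the method of the listed paper: the function name `mean_aggregate`, the self-loop, and the toy graph are assumptions made for illustration, and a real GNN layer would also apply learned weights and a nonlinearity.

```python
def mean_aggregate(features, edges):
    """One round of mean neighborhood aggregation on an undirected graph.

    features: dict mapping node -> list of floats (all the same length)
    edges: iterable of (u, v) pairs
    Returns a new dict in which each node's vector is the average of its
    own vector and the vectors of its neighbors.
    """
    # Assumption: each node includes itself (a self-loop) in its neighborhood.
    neighbors = {node: [node] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, nbrs in neighbors.items():
        dim = len(features[node])
        # Average each coordinate over the node's neighborhood.
        updated[node] = [
            sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)
        ]
    return updated


# Hypothetical toy graph: a path a - b - c with 2-dimensional features.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "b"), ("b", "c")]
print(mean_aggregate(features, edges))
```

Stacking several such rounds lets information travel further across the graph, since each round mixes in features from nodes one hop further away.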
1 code implementation • 2 Jul 2018 • Gagan Madan, Ankit Anand, Mausam, Parag Singla
These orbits are represented compactly using permutations over variables, and variable-value (VV) pairs, but they can miss several state symmetries in a domain.
no code implementations • 10 May 2017 • Gagan Madan
Relation Extraction refers to the task of populating a database with tuples of the form $r(e_1, e_2)$, where $r$ is a relation and $e_1$, $e_2$ are entities.
no code implementations • 15 Mar 2016 • Ramandeep S Randhawa, Parag Jain, Gagan Madan
We propose a new algorithm for topic modeling, Vec2Topic, that identifies the main topics in a corpus using semantic information captured via high-dimensional distributed word embeddings.