no code implementations • 21 Mar 2024 • Daniel Garibi, Or Patashnik, Andrey Voynov, Hadar Averbuch-Elor, Daniel Cohen-Or
However, applying these methods to real images necessitates the inversion of the images into the domain of the pretrained diffusion model.
no code implementations • 2 Mar 2024 • Moran Yanuka, Morris Alper, Hadar Averbuch-Elor, Raja Giryes
Web-scale training on paired text-image data is becoming increasingly central to multimodal learning, but is challenged by the highly noisy nature of datasets in the wild.
no code implementations • 14 Feb 2024 • Chen Dudai, Morris Alper, Hana Bezalel, Rana Hanocka, Itai Lang, Hadar Averbuch-Elor
To bolster such models with fine-grained knowledge, we leverage large-scale Internet data containing images of similar landmarks along with weakly-related textual information.
1 code implementation • 6 Dec 2023 • Assaf Ben-Kish, Moran Yanuka, Morris Alper, Raja Giryes, Hadar Averbuch-Elor
To this end, we propose a framework for addressing hallucinations in image captioning in the open-vocabulary setting.
no code implementations • 29 Nov 2023 • Etai Sella, Gal Fiebelman, Noam Atia, Hadar Averbuch-Elor
We are witnessing rapid progress in automatically generating and manipulating 3D assets due to the availability of pretrained text-image diffusion models.
no code implementations • 6 Nov 2023 • Yuval Alaluf, Daniel Garibi, Or Patashnik, Hadar Averbuch-Elor, Daniel Cohen-Or
Recent advancements in text-to-image generative models have demonstrated a remarkable ability to capture a deep semantic understanding of images.
no code implementations • NeurIPS 2023 • Morris Alper, Hadar Averbuch-Elor
Although the mapping between sound and meaning in human language is assumed to be largely arbitrary, research in cognitive science has shown that there are non-trivial correlations between particular sounds and meanings across languages and demographic groups, a phenomenon known as sound symbolism.
1 code implementation • 6 Sep 2023 • Noriyuki Kojima, Hadar Averbuch-Elor, Yoav Artzi
Key to tasks that require reasoning about natural language in visual contexts is grounding words and phrases to image regions.
1 code implementation • ICCV 2023 • Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, Noah Snavely
Our evaluation shows that our method can distinguish illusory matches in difficult cases, and can be integrated into SfM pipelines to produce correct, disambiguated 3D reconstructions.
1 code implementation • CVPR 2023 • Haotong Lin, Qianqian Wang, Ruojin Cai, Sida Peng, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely
Specifically, we represent the scene as a space-time radiance field with a per-image illumination embedding, where temporally-varying scene changes are encoded using a set of learned step functions.
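The learned step functions mentioned above can be illustrated with a small sketch, outside any rendering framework: a sharp sigmoid acts as a differentiable step, and a sum of such steps blends per-segment values over time. The function names, the `beta` sharpness parameter, and the scalar setting are hypothetical simplifications, not the paper's implementation.

```python
import math

def soft_step(t, tau, beta=0.05):
    """Differentiable step: ~0 before time tau, ~1 after (sigmoid with sharpness beta)."""
    return 1.0 / (1.0 + math.exp(-(t - tau) / beta))

def time_varying_value(t, states, change_times):
    """Blend one value per temporal segment using a set of step functions.

    states: value for each segment; change_times: transition times between them.
    """
    value = states[0]
    for s_prev, s_next, tau in zip(states, states[1:], change_times):
        # Each step function switches in the difference to the next segment's value.
        value += (s_next - s_prev) * soft_step(t, tau)
    return value
```

In the actual method the blended quantities are learned scene parameters rather than hand-set scalars, but the piecewise-constant-in-time structure is the same idea.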
no code implementations • ICCV 2023 • Morris Alper, Hadar Averbuch-Elor
We show that the pseudo-labels produced by this procedure can be used to train a captioning model to effectively understand human-human interactions in images, as measured by a variety of metrics capturing the textual and semantic faithfulness and factual groundedness of our predictions.
1 code implementation • ICCV 2023 • Etai Sella, Gal Fiebelman, Peter Hedman, Hadar Averbuch-Elor
Our method takes oriented 2D images of a 3D object as input and learns a grid-based volumetric representation of it.
1 code implementation • CVPR 2023 • Morris Alper, Michael Fiman, Hadar Averbuch-Elor
We show that SOTA multimodally trained text encoders outperform unimodally trained text encoders on the VLU tasks while underperforming them on the NLU tasks, lending new context to previously mixed results regarding the NLU capabilities of multimodal models.
1 code implementation • ICCV 2023 • Or Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or
In this paper, we present a technique to generate a collection of images that depicts variations in the shape of a specific object, enabling an object-level shape exploration process.
1 code implementation • 13 Oct 2022 • Eric Ming Chen, Jin Sun, Apoorv Khandelwal, Dani Lischinski, Noah Snavely, Hadar Averbuch-Elor
How can one visually characterize people in a decade?
1 code implementation • 25 May 2022 • Jiaming Sun, Xi Chen, Qianqian Wang, Zhengqi Li, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely
We are witnessing an explosion of neural implicit representations in computer vision and graphics.
1 code implementation • 21 Jan 2022 • Yotam Elor, Hadar Averbuch-Elor
Balancing the data before training a classifier is a popular technique to address the challenges of imbalanced binary classification in tabular data.
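One common balancing strategy is random oversampling of the minority class; the sketch below illustrates that baseline only, and is not the paper's method (the function name and interface are invented for illustration).

```python
import random

def oversample_minority(X, y, seed=0):
    """Randomly duplicate minority-class samples until the two classes are balanced.

    X: list of feature rows; y: binary labels (0/1). Returns a new, balanced dataset.
    """
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    # Draw duplicates (with replacement) from the minority class to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]
```

Duplication-based balancing changes the class prior seen by the classifier without synthesizing new feature values, which is why its effect on tabular data is worth evaluating empirically.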
1 code implementation • ICCV 2021 • Claire Yuqing Cui, Apoorv Khandelwal, Yoav Artzi, Noah Snavely, Hadar Averbuch-Elor
We present a task and benchmark dataset for person-centric visual grounding, the problem of linking between people named in a caption and people pictured in an image.
Ranked #1 on Person-centric Visual Grounding on Who’s Waldo (using extra training data)
1 code implementation • ICCV 2021 • Xiaoshi Wu, Hadar Averbuch-Elor, Jin Sun, Noah Snavely
The abundance and richness of Internet photos of landmarks and cities has led to significant progress in 3D vision over the past two decades, including automated 3D reconstructions of the world's landmarks from tourist photos.
1 code implementation • CVPR 2021 • Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor
We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap.
no code implementations • 18 Mar 2021 • Or Perel, Oron Anschel, Omri Ben-Eliezer, Shai Mazor, Hadar Averbuch-Elor
Nowadays, as cameras have become part of our daily routine, images of documents are increasingly abundant.
no code implementations • 27 Nov 2020 • Margot Hanley, Apoorv Khandelwal, Hadar Averbuch-Elor, Noah Snavely, Helen Nissenbaum
Important ethical concerns arising from computer vision datasets of people have been receiving significant attention, and a number of datasets have been withdrawn as a result.
no code implementations • ECCV 2020 • Jin Sun, Hadar Averbuch-Elor, Qianqian Wang, Noah Snavely
Predicting where people can walk in a scene is important for many tasks, including autonomous driving systems and human behavior analysis.
1 code implementation • ECCV 2020 • Ruojin Cai, Guandao Yang, Hadar Averbuch-Elor, Zekun Hao, Serge Belongie, Noah Snavely, Bharath Hariharan
Point cloud generation thus amounts to moving randomly sampled points to high-density areas.
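The idea of moving randomly sampled points toward high-density areas can be sketched with gradient ascent on the log-density of a toy 1-D Gaussian mixture standing in for a learned shape distribution; the mixture, its parameters, and the update rule here are illustrative assumptions, not the paper's learned model.

```python
import math
import random

# Two-component 1-D Gaussian mixture (equal weights) as a stand-in "shape" density.
MODES = [-2.0, 3.0]
SIGMA = 0.5

def grad_log_density(x):
    """Analytic gradient of log p(x) for the equal-weight mixture."""
    weights = [math.exp(-((x - m) ** 2) / (2 * SIGMA ** 2)) for m in MODES]
    total = sum(weights)
    # d/dx log p(x) = sum_k w_k * (m_k - x) / sigma^2, normalized by sum_k w_k.
    return sum(w * (m - x) / SIGMA ** 2 for w, m in zip(weights, MODES)) / total

def generate_points(n=100, steps=200, lr=0.05, seed=0):
    """Start from uniform random samples and ascend the log-density."""
    rng = random.Random(seed)
    pts = [rng.uniform(-6.0, 6.0) for _ in range(n)]
    for _ in range(steps):
        pts = [x + lr * grad_log_density(x) for x in pts]
    return pts
```

After enough steps the samples concentrate at the density modes; in the generative setting the analytic gradient is replaced by a learned gradient field over 3-D space.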
1 code implementation • 17 May 2020 • Anna Darzi, Itai Lang, Ashutosh Taklikar, Hadar Averbuch-Elor, Shai Avidan
As image generation techniques mature, there is a growing interest in explainable representations that are easy to understand and intuitive to manipulate.
1 code implementation • ACL 2020 • Noriyuki Kojima, Hadar Averbuch-Elor, Alexander M. Rush, Yoav Artzi
Visual features are a promising signal for bootstrapping the learning of textual models.
1 code implementation • CVPR 2020 • Zekun Hao, Hadar Averbuch-Elor, Noah Snavely, Serge Belongie
We are seeing a Cambrian explosion of 3D shape representations for use in machine learning.
3 code implementations • CVPR 2020 • Sharon Fogel, Hadar Averbuch-Elor, Sarel Cohen, Shai Mazor, Roee Litman
This is especially true for handwritten text recognition (HTR), where each author has a unique style, unlike printed text, where the variation is smaller by design.
no code implementations • 1 Sep 2019 • Akshay Gadi Patil, Omri Ben-Eliezer, Or Perel, Hadar Averbuch-Elor
Creating large varieties of plausible document layouts can be a tedious task, requiring numerous constraints to be satisfied, including local constraints relating different semantic elements and global constraints on overall appearance and spacing.
no code implementations • 15 Apr 2019 • Yiftach Ginger, Dov Danon, Hadar Averbuch-Elor, Daniel Cohen-Or
As a result, in recent years more attention has been given to techniques that learn the mapping from unpaired sets.
1 code implementation • 22 Mar 2018 • Sharon Fogel, Hadar Averbuch-Elor, Jacov Goldberger, Daniel Cohen-Or
In this paper, we depart from centroid-based models and suggest a new framework, called Clustering-driven deep embedding with PAirwise Constraints (CPAC), for non-parametric clustering using a neural network.
no code implementations • 31 Jan 2017 • Hadar Averbuch-Elor, Johannes Kopf, Tamir Hazan, Daniel Cohen-Or
Thus, to disambiguate what the common foreground object is, we introduce a weakly-supervised technique, where we assume only a small seed, given in the form of a single segmented image.
1 code implementation • 14 Dec 2016 • Hadar Averbuch-Elor, Nadav Bar, Daniel Cohen-Or
In this paper, we present a novel non-parametric clustering technique.
no code implementations • CVPR 2015 • Etai Littwin, Hadar Averbuch-Elor, Daniel Cohen-Or
In this paper, we introduce a spherical embedding technique to position a given set of silhouettes of an object as observed from a set of cameras arbitrarily positioned around the object.