Search Results for author: Burak Uzkent

Found 20 papers, 9 papers with code

Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models

1 code implementation • 5 Nov 2023 • Jingru Yi, Burak Uzkent, Oana Ignat, Zili Li, Amanmeet Garg, Xiang Yu, Linda Liu

While we demonstrate our data augmentation method with the MDETR framework, the proposed approach is applicable to common grounding-based vision and language tasks with other frameworks.

Data Augmentation, Phrase Grounding, +1

GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer

no code implementations • 13 Jan 2023 • Miao Yin, Burak Uzkent, Yilin Shen, Hongxia Jin, Bo Yuan

We first develop a graph-based ranking for measuring the importance of attention heads, and the extracted importance information is further integrated into an optimization-based procedure to impose heterogeneous structured sparsity patterns on the ViT models.

Dynamic Inference With Grounding Based Vision and Language Models

no code implementations • CVPR 2023 • Burak Uzkent, Amanmeet Garg, Wentao Zhu, Keval Doshi, Jingru Yi, Xiaolong Wang, Mohamed Omar

For example, recent image and language models with more than 200M parameters have been proposed to learn visual grounding in the pre-training step and show impressive results on downstream vision and language tasks.

Language Modelling, Referring Expression, +3

Lite-MDETR: A Lightweight Multi-Modal Detector

no code implementations • CVPR 2022 • Qian Lou, Yen-Chang Hsu, Burak Uzkent, Ting Hua, Yilin Shen, Hongxia Jin

The key primitive is Dictionary-Lookup-Transformations (DLT), proposed to replace Linear Transformation (LT) in multi-modal detectors, where each weight in LT is approximately factorized into a smaller dictionary, index, and coefficient.

Object Detection +3

Negative Data Augmentation

2 code implementations • ICLR 2021 • Abhishek Sinha, Kumar Ayush, Jiaming Song, Burak Uzkent, Hongxia Jin, Stefano Ermon

Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.

Ranked #6 on Image Generation on CIFAR-100

Action Recognition, Anomaly Detection, +9

Efficient Conditional Pre-training for Transfer Learning

no code implementations • 20 Nov 2020 • Shuvam Chakraborty, Burak Uzkent, Kumar Ayush, Kumar Tanmay, Evan Sheehan, Stefano Ermon

Finally, we improve standard ImageNet pre-training by 1-3% by tuning available models on our subsets and pre-training on a dataset filtered from a larger scale dataset.

Transfer Learning

Geography-Aware Self-Supervised Learning

1 code implementation • ICCV 2021 • Kumar Ayush, Burak Uzkent, Chenlin Meng, Kumar Tanmay, Marshall Burke, David Lobell, Stefano Ermon

Contrastive learning methods have significantly narrowed the gap between supervised and unsupervised learning on computer vision tasks.

Ranked #5 on Semantic Segmentation on SpaceNet 1 (using extra training data)

Contrastive Learning, Image Classification, +4

Predicting Livelihood Indicators from Community-Generated Street-Level Imagery

1 code implementation • 15 Jun 2020 • Jihyeon Lee, Dylan Grosz, Burak Uzkent, Sicheng Zeng, Marshall Burke, David Lobell, Stefano Ermon

Major decisions from governments and other large organizations rely on measurements of the populace's well-being, but making such measurements at a broad scale is expensive and thus infrequent in much of the developing world.

Efficient Poverty Mapping using Deep Reinforcement Learning

no code implementations • 7 Jun 2020 • Kumar Ayush, Burak Uzkent, Kumar Tanmay, Marshall Burke, David Lobell, Stefano Ermon

The combination of high-resolution satellite imagery and machine learning has proven useful in many sustainability-related tasks, including poverty prediction, infrastructure measurement, and forest monitoring.

Object Detection +2

Farmland Parcel Delineation Using Spatio-temporal Convolutional Networks

no code implementations • 11 Apr 2020 • Han Lin Aung, Burak Uzkent, Marshall Burke, David Lobell, Stefano Ermon

Satellite imaging can be a scalable and cost-effective way to perform farm parcel delineation and collect this valuable data.

Segmentation

Learning When and Where to Zoom with Deep Reinforcement Learning

2 code implementations • CVPR 2020 • Burak Uzkent, Stefano Ermon

While high-resolution images contain semantically more useful information than their lower-resolution counterparts, processing them is computationally more expensive, and in some applications, e.g., remote sensing, they can be much more expensive to acquire.

Reinforcement Learning (RL)

Generating Interpretable Poverty Maps using Object Detection in Satellite Images

no code implementations • 5 Feb 2020 • Kumar Ayush, Burak Uzkent, Marshall Burke, David Lobell, Stefano Ermon

Accurate local-level poverty measurement is an essential task for governments and humanitarian organizations to track the progress towards improving livelihoods and distribute scarce resources.

Feature Importance, Humanitarian, +2

Cloud Removal in Satellite Images Using Spatiotemporal Generative Networks

3 code implementations • 14 Dec 2019 • Vishnu Sarukkai, Anirudh Jain, Burak Uzkent, Stefano Ermon

In contrast, we cast the problem of cloud removal as a conditional image synthesis challenge, and we propose a trainable spatiotemporal generator network (STGAN) to remove clouds.

Ranked #6 on Cloud Removal on SEN12MS-CR-TS

Cloud Removal, Earth Observation, +3

Efficient Object Detection in Large Images using Deep Reinforcement Learning

3 code implementations • 9 Dec 2019 • Burak Uzkent, Christopher Yeh, Stefano Ermon

Traditionally, an object detector is applied to every part of the scene of interest, and its accuracy and computational cost increase with higher-resolution images.

Object Detection +2

Learning to Interpret Satellite Images in Global Scale Using Wikipedia

3 code implementations • 7 May 2019 • Burak Uzkent, Evan Sheehan, Chenlin Meng, Zhongyi Tang, Marshall Burke, David Lobell, Stefano Ermon

Despite recent progress in computer vision, fine-grained interpretation of satellite images remains challenging because of a lack of labeled training data.

Predicting Economic Development using Geolocated Wikipedia Articles

no code implementations • 5 May 2019 • Evan Sheehan, Chenlin Meng, Matthew Tan, Burak Uzkent, Neal Jean, David Lobell, Marshall Burke, Stefano Ermon

Progress on the UN Sustainable Development Goals (SDGs) is hampered by a persistent lack of data regarding key social, environmental, and economic indicators, particularly in developing countries.

Learning to Interpret Satellite Images Using Wikipedia

no code implementations • 19 Sep 2018 • Evan Sheehan, Burak Uzkent, Chenlin Meng, Zhongyi Tang, Marshall Burke, David Lobell, Stefano Ermon

Despite recent progress in computer vision, fine-grained interpretation of satellite images remains challenging because of a lack of labeled training data.

EnKCF: Ensemble of Kernelized Correlation Filters for High-Speed Object Tracking

2 code implementations • 20 Jan 2018 • Burak Uzkent, Young-Woo Seo

Experimental results showed that our method achieves, on average, 70.10% precision at 20 pixels and a 53.00% success rate on the OTB100 data, and 54.50% and 40.2% on the UAV123 data.

Object Tracking, Vocal Bursts Intensity Prediction