Search Results for author: Piotr Dollár

Found 33 papers, 29 papers with code

Segment Anything

18 code implementations • ICCV 2023 • Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick

We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation.

Ranked #2 on Zero-Shot Instance Segmentation on LVIS v1.0 val

Event-based Object Segmentation Image Segmentation +3

127,344

The effectiveness of MAE pre-pretraining for billion-scale pretraining

1 code implementation • ICCV 2023 • Mannat Singh, Quentin Duval, Kalyan Vasudev Alwala, Haoqi Fan, Vaibhav Aggarwal, Aaron Adcock, Armand Joulin, Piotr Dollár, Christoph Feichtenhofer, Ross Girshick, Rohit Girdhar, Ishan Misra

While MAE has only been shown to scale with the size of models, we find that it scales with the size of the training dataset as well.

Ranked #1 on Few-Shot Image Classification on ImageNet - 10-shot (using extra training data)

Action Classification Action Recognition +6

Revisiting Weakly Supervised Pre-Training of Visual Perception Models

2 code implementations • CVPR 2022 • Mannat Singh, Laura Gustafson, Aaron Adcock, Vinicius de Freitas Reis, Bugra Gedik, Raj Prateek Kosaraju, Dhruv Mahajan, Ross Girshick, Piotr Dollár, Laurens van der Maaten

Model pre-training is a cornerstone of modern visual recognition systems.

Ranked #1 on Out-of-Distribution Generalization on ImageNet-W (using extra training data)

Fine-Grained Image Classification Out-of-Distribution Generalization +3

169

Masked Autoencoders Are Scalable Vision Learners

49 code implementations • CVPR 2022 • Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick

Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels.

Ranked #1 on Out-of-Distribution Generalization on ImageNet-W

Decoder Domain Generalization +5

6,889

Early Convolutions Help Transformers See Better

1 code implementation • NeurIPS 2021 • Tete Xiao, Mannat Singh, Eric Mintun, Trevor Darrell, Piotr Dollár, Ross Girshick

To test whether this atypical design choice causes an issue, we analyze the optimization behavior of ViT models with their original patchify stem versus a simple counterpart where we replace the ViT stem by a small number of stacked stride-two 3×3 convolutions.

Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

2 code implementations • CVPR 2021 • Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, Alexander Kirillov

We perform an extensive analysis across different error types and object sizes and show that Boundary IoU is significantly more sensitive than the standard Mask IoU measure to boundary errors for large objects and does not over-penalize errors on smaller objects.

Image Segmentation Object +2

209

Fast and Accurate Model Scaling

4 code implementations • CVPR 2021 • Piotr Dollár, Mannat Singh, Ross Girshick

This leads us to propose a simple fast compound scaling strategy that encourages primarily scaling model width, while scaling depth and resolution to a lesser extent.

30,332

Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details

2 code implementations • 1 Feb 2021 • Achal Dave, Piotr Dollár, Deva Ramanan, Alexander Kirillov, Ross Girshick

On one hand, this is desirable as it treats all classes equally.

Benchmarking object-detection +2

2,023

Designing Network Design Spaces

24 code implementations • CVPR 2020 • Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár

In this work, we present a new network design paradigm.

Ranked #1 on Out-of-Distribution Generalization on ImageNet-W

Image Classification Out-of-Distribution Generalization

30,332

Are Labels Necessary for Neural Architecture Search?

2 code implementations • ECCV 2020 • Chenxi Liu, Piotr Dollár, Kaiming He, Ross Girshick, Alan Yuille, Saining Xie

Existing neural network architectures in computer vision -- whether designed by humans or by machines -- were typically found using both images and their associated labels.

Neural Architecture Search

2,117

LVIS: A Dataset for Large Vocabulary Instance Segmentation

3 code implementations • CVPR 2019 • Agrim Gupta, Piotr Dollár, Ross Girshick

We plan to collect ~2 million high-quality instance segmentation masks for over 1000 entry-level object categories in 164k images.

Instance Segmentation Object +4

399

On Network Design Spaces for Visual Recognition

4 code implementations • ICCV 2019 • Ilija Radosavovic, Justin Johnson, Saining Xie, Wan-Yen Lo, Piotr Dollár

Compared to current methodologies of comparing point and curve estimates of model families, distribution estimates paint a more complete picture of the entire design landscape.

Neural Architecture Search

2,117

TensorMask: A Foundation for Dense Object Segmentation

2 code implementations • ICCV 2019 • Xinlei Chen, Ross Girshick, Kaiming He, Piotr Dollár

To formalize this, we treat dense instance segmentation as a prediction task over 4D tensors and present a general framework called TensorMask that explicitly captures this geometry and enables novel operators on 4D tensors.

Ranked #90 on Instance Segmentation on COCO test-dev

Instance Segmentation Object +4

29,111

Panoptic Feature Pyramid Networks

12 code implementations • CVPR 2019 • Alexander Kirillov, Ross Girshick, Kaiming He, Piotr Dollár

In this work, we perform a detailed study of this minimally extended version of Mask R-CNN with FPN, which we refer to as Panoptic FPN, and show it is a robust and accurate baseline for both tasks.

Ranked #4 on Panoptic Segmentation on Indian Driving Dataset

Instance Segmentation Panoptic Segmentation +2

29,111

Rethinking ImageNet Pre-training

1 code implementation • ICCV 2019 • Kaiming He, Ross Girshick, Piotr Dollár

We report competitive results on object detection and instance segmentation on the COCO dataset using standard models trained from random initialization.

Ranked #81 on Object Detection on COCO minival

Instance Segmentation object-detection +2

6,296

Panoptic Segmentation

9 code implementations • CVPR 2019 • Alexander Kirillov, Kaiming He, Ross Girshick, Carsten Rother, Piotr Dollár

We propose and study a task we name panoptic segmentation (PS).

Ranked #23 on Panoptic Segmentation on Cityscapes val (using extra training data)

Image Segmentation Instance Segmentation +4

411

Data Distillation: Towards Omni-Supervised Learning

4 code implementations • CVPR 2018 • Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He

We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data.

Keypoint Detection object-detection +1

26,179

Learning to Segment Every Thing

3 code implementations • CVPR 2018 • Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, Ross Girshick

Most methods for object instance segmentation require all training examples to be labeled with segmentation masks.

Instance Segmentation Segmentation +1

26,179

Focal Loss for Dense Object Detection

230 code implementations • ICCV 2017 • Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár

Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.

Ranked #3 on Region Proposal on COCO test-dev

Dense Object Detection Knowledge Distillation +5

76,728

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

71 code implementations • 8 Jun 2017 • Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He

To achieve this result, we adopt a hyper-parameter-free linear scaling rule for adjusting learning rates as a function of minibatch size and develop a new warmup scheme that overcomes optimization challenges early in training.

Stochastic Optimization

4,391

Detecting and Recognizing Human-Object Interactions

2 code implementations • CVPR 2018 • Georgia Gkioxari, Ross Girshick, Piotr Dollár, Kaiming He

Our hypothesis is that the appearance of a person -- their pose, clothing, action -- is a powerful cue for localizing the objects they are interacting with.

Ranked #53 on Human-Object Interaction Detection on HICO-DET

Human-Object Interaction Detection Object

26,179

Mask R-CNN

172 code implementations • ICCV 2017 • Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick

Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.

Ranked #1 on Keypoint Estimation on GRIT

3D Instance Segmentation Human Part Segmentation +12

76,722

Learning Features by Watching Objects Move

1 code implementation • CVPR 2017 • Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan

Given the extensive evidence that motion plays a key role in the development of the human visual system, we hope that this straightforward approach to unsupervised learning will be more effective than cleverly designed 'pretext' tasks studied in the literature.

object-detection Object Detection +1

260

Feature Pyramid Networks for Object Detection

85 code implementations • CVPR 2017 • Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie

Feature pyramids are a basic component in recognition systems for detecting objects at different scales.

Ranked #3 on Pedestrian Detection on TJU-Ped-campus

Object Object Detection +1

39,644

Aggregated Residual Transformations for Deep Neural Networks

58 code implementations • CVPR 2017 • Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He

Our simple design results in a homogeneous, multi-branch architecture that has only a few hyper-parameters to set.

Ranked #3 on Image Classification on GasHisSDB

Domain Generalization General Classification +1

15,622

A MultiPath Network for Object Detection

1 code implementation • 7 Apr 2016 • Sergey Zagoruyko, Adam Lerer, Tsung-Yi Lin, Pedro O. Pinheiro, Sam Gross, Soumith Chintala, Piotr Dollár

To address these challenges, we test three modifications to the standard Fast R-CNN object detector: (1) skip connections that give the detector access to features at multiple network layers, (2) a foveal structure to exploit object context at multiple object resolutions, and (3) an integral loss function and corresponding network adjustment that improve localization.

Ranked #104 on Instance Segmentation on COCO test-dev

Instance Segmentation Object +2

1,341

Unsupervised Learning of Edges

no code implementations • CVPR 2016 • Yin Li, Manohar Paluri, James M. Rehg, Piotr Dollár

In this work we present a simple yet effective approach for training edge detectors without human supervision.

Edge Detection Motion Estimation +2

Semantic Amodal Segmentation

2 code implementations • CVPR 2017 • Yan Zhu, Yuandong Tian, Dimitris Metaxas, Piotr Dollár

Specifically, we create an amodal segmentation of each image: the full extent of each region is marked, not just the visible pixels.

object-detection Object Detection +2

What makes for effective detection proposals?

no code implementations • 17 Feb 2015 • Jan Hosang, Rodrigo Benenson, Piotr Dollár, Bernt Schiele

Current top performing object detectors employ detection proposals to guide the search for objects, thereby avoiding exhaustive sliding window search across images.

Object object-detection +1

From Captions to Visual Concepts and Back

1 code implementation • CVPR 2015 • Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollár, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, C. Lawrence Zitnick, Geoffrey Zweig

The language model learns from a set of over 400,000 image descriptions to capture the statistics of word usage.

Ranked #1 on Image Captioning on COCO Captions test

Image Captioning Language Modelling +3

150

Fast Edge Detection Using Structured Forests

no code implementations • 20 Jun 2014 • Piotr Dollár, C. Lawrence Zitnick

We formulate the problem of predicting local edge masks in a structured learning framework applied to random decision forests.

Edge Detection Image Segmentation +1

Local Decorrelation For Improved Detection

no code implementations • 4 Jun 2014 • Woonhyun Nam, Piotr Dollár, Joon Hee Han

In fact, orthogonal trees with our locally decorrelated features outperform oblique trees trained over the original features at a fraction of the computational cost.

object-detection Object Detection

Microsoft COCO: Common Objects in Context

35 code implementations • 1 May 2014 • Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, Piotr Dollár

We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.

Instance Segmentation Object +5

12,252
