no code implementations • 11 Dec 2023 • Guglielmo Camporese, Alessandro Bergamo, Xunyu Lin, Joseph Tighe, Davide Modolo
For example, on early recognition observing only the first 10% of each video, our method improves the SOTA by +2.23 Top-1 accuracy on Something-Something-v2, +3.55 on UCF-101, +3.68 on SSsub21, and +5.03 on EPIC-Kitchens-55, where prior work used either multi-modal inputs (e.g. optical-flow) or batched inference.
no code implementations • 20 Sep 2023 • Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joseph Tighe, Alessandro Bergamo
It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions, and then uses stacked Transformer encoders to capture person interactions that are important for action recognition in general scenarios.
no code implementations • ICCV 2023 • Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joseph Tighe, Alessandro Bergamo
It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions, and then uses stacked Transformer encoders to capture person interactions that are important for action recognition in the wild.
Ranked #2 on Human Interaction Recognition on NTU RGB+D
1 code implementation • ECCV 2022 • Bing Shuai, Alessandro Bergamo, Uta Buechler, Andrew Berneshawi, Alyssa Boden, Joseph Tighe
This paper presents a new large-scale multi-person tracking dataset, PersonPath22, which is over an order of magnitude larger than currently available high-quality multi-object tracking datasets such as MOT17, HiEve, and MOT20.
Ranked #1 on Multi-Object Tracking on PersonPath22
no code implementations • 13 Sep 2014 • Loris Bazzani, Alessandro Bergamo, Dragomir Anguelov, Lorenzo Torresani
This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training.
no code implementations • CVPR 2013 • Alessandro Bergamo, Sudipta N. Sinha, Lorenzo Torresani
In this paper we propose a new technique for learning a discriminative codebook for local feature descriptors, specifically designed for scalable landmark classification.
no code implementations • NeurIPS 2011 • Alessandro Bergamo, Lorenzo Torresani, Andrew W. Fitzgibbon
In contrast to previous approaches to learning compact codes, we optimize explicitly for (an upper bound on) classification performance.
no code implementations • NeurIPS 2010 • Alessandro Bergamo, Lorenzo Torresani
In this paper we investigate and compare methods that learn image classifiers by combining very few manually annotated examples (e.g., 1-10 images per class) and a large number of weakly-labeled Web photos retrieved using keyword-based image search.