no code implementations • 21 Mar 2024 • Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Jiangkang Deng, Xiatian Zhu
Audio-Visual Segmentation (AVS) aims to identify, at the pixel level, the object in a visual scene that produces a given sound.
1 code implementation • 17 Mar 2024 • Xi Chen, Haosen Yang, Huicong Zhang, Hongxun Yao, Xiatian Zhu
Source-free unsupervised domain adaptation (SFUDA) aims to enable the utilization of a pre-trained source model in an unlabeled target domain without access to source data.
no code implementations • 14 Mar 2024 • Hong Liu, Haosen Yang, Paul J. van Diest, Josien P. W. Pluim, Mitko Veta
In particular, our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task (CAMELYON16 dataset).
1 code implementation • 2 Nov 2023 • Haosen Yang, Chuofan Ma, Bin Wen, Yi Jiang, Zehuan Yuan, Xiatian Zhu
Understanding the semantics of individual regions or patches within unconstrained images, such as in open-world object detection, represents a critical yet challenging task in computer vision.
no code implementations • 13 Sep 2023 • Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Xiatian Zhu
Particularly, in situations where existing supervised AVS methods struggle with overlapping foreground objects, our models still excel in accurately segmenting overlapped auditory objects.
1 code implementation • 9 Oct 2022 • Haosen Yang, Deng Huang, Bin Wen, Jiannan Wu, Hongxun Yao, Yi Jiang, Xiatian Zhu, Zehuan Yuan
As a result, our model can effectively extract both static appearance and dynamic motion spontaneously, leading to superior spatiotemporal representation learning capability.
no code implementations • 21 Jul 2022 • Boyang Xia, Wenhao Wu, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang
On the video level, a temporal attention module is learned under dual video-level supervision on both the salient and the non-salient representations.
Ranked #4 on Action Recognition on ActivityNet
1 code implementation • 15 Dec 2021 • Haosen Yang, Wenhao Wu, Lining Wang, Sheng Jin, Boyang Xia, Hongxun Yao, Hujie Huang
To evaluate the confidence of proposals, existing works typically predict the action scores of proposals, supervised by the temporal Intersection-over-Union (tIoU) between each proposal and the ground truth.
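The tIoU supervision signal mentioned above is a standard metric: the overlap between a predicted temporal interval and the ground-truth interval, divided by their union. A minimal sketch (the function name and interval representation are illustrative, not from the paper):

```python
def temporal_iou(proposal, ground_truth):
    """Temporal Intersection-over-Union between two (start, end) intervals.

    Both arguments are (start, end) tuples in seconds (or frames).
    Returns a value in [0, 1]; 0 when the intervals do not overlap.
    """
    p_start, p_end = proposal
    g_start, g_end = ground_truth
    # Length of the overlapping segment (clamped at zero for disjoint intervals).
    intersection = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    # Union = sum of lengths minus the double-counted overlap.
    union = (p_end - p_start) + (g_end - g_start) - intersection
    return intersection / union if union > 0 else 0.0
```

For example, a proposal (0, 10) against a ground truth (5, 15) overlaps for 5 units out of a 15-unit union, giving a tIoU of 1/3.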
no code implementations • 25 May 2021 • Lining Wang, Haosen Yang, Wenhao Wu, Hongxun Yao, Hujie Huang
Conventionally, the temporal action proposal generation (TAPG) task is divided into two main sub-tasks: boundary prediction and proposal confidence prediction, which rely on frame-level dependencies and proposal-level relationships, respectively.