Search Results for author: Wen-Huang Cheng

Found 27 papers, 8 papers with code

EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning

1 code implementation • 25 Apr 2024 • HongXia Xie, Chu-Jun Peng, Yu-Wen Tseng, Hung-Jen Chen, Chan-Feng Hsu, Hong-Han Shuai, Wen-Huang Cheng

Visual Instruction Tuning represents a novel learning paradigm involving the fine-tuning of pre-trained language models using task-specific instructions.

Emotion Classification Emotion Recognition

Lightweight Deep Learning for Resource-Constrained Environments: A Survey

no code implementations • 8 Apr 2024 • Hou-I Liu, Marco Galindo, HongXia Xie, Lai-Kuan Wong, Hong-Han Shuai, Yung-Hui Li, Wen-Huang Cheng

Over the past decade, the dominance of deep learning has prevailed across various domains of artificial intelligence, including natural language processing, computer vision, and biomedical signal processing.

MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection

no code implementations • 7 Apr 2024 • Hou-I Liu, Christine Wu, Jen-Hao Cheng, Wenhao Chai, Shian-Yun Wang, Gaowen Liu, Jenq-Neng Hwang, Hong-Han Shuai, Wen-Huang Cheng

Subsequently, we introduce the cross-modal residual distillation to transfer the 3D spatial cues.

Autonomous Driving Knowledge Distillation +3

DQ-DETR: DETR with Dynamic Query for Tiny Object Detection

no code implementations • 4 Apr 2024 • Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai, Wen-Huang Cheng

Although previous DETR-like methods have performed well in generic object detection, tiny object detection remains challenging for them because the positional information of object queries is not customized for detecting tiny objects, whose scale is far smaller than that of general objects.

Object object-detection +1

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text

no code implementations • 31 Jul 2023 • Junchen Zhu, Huan Yang, Wenjing Wang, Huiguo He, Zixi Tuo, Yongsheng Yu, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu, Jiebo Luo

In the basic generation, we take advantage of the pretrained image diffusion model, and adapt it to a high-quality open-domain vertical video generator for mobile devices.

Video Generation

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images

no code implementations • 12 Jun 2023 • Junchen Zhu, Huan Yang, Huiguo He, Wenjing Wang, Zixi Tuo, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu

To generate videos, we extend the capabilities of a pretrained text-to-image diffusion model through a two-stage process.

Retrieval

Size Does Matter: Size-aware Virtual Try-on via Clothing-oriented Transformation Try-on Network

1 code implementation • ICCV 2023 • Chieh-Yun Chen, Yi-Chung Chen, Hong-Han Shuai, Wen-Huang Cheng

COTTON leverages clothing structure with landmarks and segmentation to design a novel landmark-guided transformation for precisely deforming clothes, allowing for size adjustment during try-on.

Virtual Try-on

Most Important Person-guided Dual-branch Cross-Patch Attention for Group Affect Recognition

no code implementations • ICCV 2023 • HongXia Xie, Ming-Xian Lee, Tzu-Jui Chen, Hung-Jen Chen, Hou-I Liu, Hong-Han Shuai, Wen-Huang Cheng

Then, the Cross-Patch Attention module is proposed to fuse the features of MIP and global context together to complement each other.

Open-Ended Question Answering

Power Efficient Video Super-Resolution on Mobile NPUs with Deep Learning, Mobile AI & AIM 2022 challenge: Report

no code implementations • 7 Nov 2022 • Andrey Ignatov, Radu Timofte, Cheng-Ming Chiang, Hsien-Kai Kuo, Yu-Syuan Xu, Man-Yu Lee, Allen Lu, Chia-Ming Cheng, Chih-Cheng Chen, Jia-Ying Yong, Hong-Han Shuai, Wen-Huang Cheng, Zhuang Jia, Tianyu Xu, Yijian Zhang, Long Bao, Heng Sun, Diankai Zhang, Si Gao, Shaoli Liu, Biao Wu, Xiaofeng Zhang, Chengjian Zheng, Kaidi Lu, Ning Wang, Xiao Sun, HaoDong Wu, Xuncheng Liu, Weizhan Zhang, Caixia Yan, Haipeng Du, Qinghua Zheng, Qi Wang, Wangdu Chen, Ran Duan, Mengdi Sun, Dan Zhu, Guannan Chen, Hojin Cho, Steve Kim, Shijie Yue, Chenghua Li, Zhengyang Zhuge, Wei Chen, Wenxu Wang, Yufeng Zhou, Xiaochen Cai, Hengxing Cai, Kele Xu, Li Liu, Zehua Cheng, Wenyi Lian, Wenjing Lian

While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices.

Video Super-Resolution

Vision Transformers: State of the Art and Research Challenges

no code implementations • 7 Jul 2022 • Bo-Kai Ruan, Hong-Han Shuai, Wen-Huang Cheng

Transformers have achieved great success in natural language processing.

3D Reconstruction Image Segmentation +5

Fast Vehicle Detection and Tracking on Fisheye Traffic Monitoring Video using CNN and Bounding Box Propagation

no code implementations • 4 Jul 2022 • Sandy Ardianto, Hsueh-Ming Hang, Wen-Huang Cheng

We design a fast car detection and tracking algorithm for traffic-monitoring video from fisheye cameras mounted at crossroads.

Fast Vehicle Detection object-detection +1

Mask or Non-Mask? Robust Face Mask Detector via Triplet-Consistency Representation Learning

no code implementations • 1 Oct 2021 • Chun-Wei Yang, Thanh-Hai Phung, Hong-Han Shuai, Wen-Huang Cheng

To automate the monitoring process, one of the promising solutions is to leverage existing object detection models to detect the faces with or without masks.

object-detection Object Detection +1

Technical Report for Valence-Arousal Estimation in ABAW2 Challenge

no code implementations • 8 Jul 2021 • Hong-Xia Xie, I-Hsuan Li, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng

In this work, we describe our method for tackling the valence-arousal estimation challenge from ABAW2 ICCV-2021 Competition.

Arousal Estimation

Multimodal Deep Learning Framework for Image Popularity Prediction on Social Media

no code implementations • 18 May 2021 • Fatma S. Abousaleh, Wen-Huang Cheng, Neng-Hao Yu, Yu Tsao

In this study, motivated by multimodal learning, which uses information from various modalities, and the current success of convolutional neural networks (CNNs) in various fields, we propose a deep learning model, called visual-social convolutional neural network (VSCNN), which predicts the popularity of a posted image by incorporating various types of visual and social features into a unified network model.

ZYELL-NCTU NetTraffic-1.0: A Large-Scale Dataset for Real-World Network Anomaly Detection

no code implementations • 8 Mar 2021 • Lei Chen, Shao-En Weng, Chu-Jun Peng, Hong-Han Shuai, Wen-Huang Cheng

Network security has long been an active research topic.

Anomaly Detection Intrusion Detection

Template-Free Try-on Image Synthesis via Semantic-guided Optimization

no code implementations • 6 Feb 2021 • Chien-Lung Chou, Chieh-Yun Chen, Chia-Wei Hsieh, Hong-Han Shuai, Jiaying Liu, Wen-Huang Cheng

Afterward, given an in-shop clothing image, a user image, and a synthesized pose, we propose a novel model for synthesizing a human try-on image with the target clothing in the best fitting pose.

Image Generation Virtual Try-on

Spatiotemporal Dilated Convolution with Uncertain Matching for Video-based Crowd Estimation

1 code implementation • 29 Jan 2021 • Yu-Jen Ma, Hong-Han Shuai, Wen-Huang Cheng

In this paper, we propose a novel SpatioTemporal convolutional Dense Network (STDNet) to address the video-based crowd counting problem, which contains the decomposition of 3D convolution and the 3D spatiotemporal dilated dense convolution to alleviate the rapid growth of the model size caused by the Conv3D layer.

Crowd Counting

DAF:re: A Challenging, Crowd-Sourced, Large-Scale, Long-Tailed Dataset For Anime Character Recognition

2 code implementations • 21 Jan 2021 • Edwin Arkel Rios, Wen-Huang Cheng, Bo-Cheng Lai

In this work we tackle the challenging problem of anime character recognition.

Face Recognition Image Classification +1

FashionMirror: Co-Attention Feature-Remapping Virtual Try-On With Sequential Template Poses

1 code implementation • ICCV 2021 • Chieh-Yun Chen, Ling Lo, Pin-Jui Huang, Hong-Han Shuai, Wen-Huang Cheng

In the second stage, we first remove the clothes on the source human via the removed mask and warp the clothing features conditioning on the try-on clothing mask to fit the next frame human.

Segmentation Semantic Segmentation +1

Naturalistic Physical Adversarial Patch for Object Detectors

1 code implementation • ICCV 2021 • Yu-Chih-Tuan Hu, Bo-Han Kung, Daniel Stanley Tan, Jun-Cheng Chen, Kai-Lung Hua, Wen-Huang Cheng

Most prior works on physical adversarial attacks mainly focus on the attack performance but seldom enforce any restrictions over the appearance of the generated adversarial patches.

Generative Adversarial Network Object

An Overview of Facial Micro-Expression Analysis: Data, Methodology and Challenge

no code implementations • 21 Dec 2020 • Hong-Xia Xie, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng

Facial micro-expressions indicate brief and subtle facial movements that appear during emotional communication.

Micro Expression Recognition Micro-Expression Recognition +2

MER-GCN: Micro Expression Recognition Based on Relation Modeling with Graph Convolutional Network

no code implementations • 19 Apr 2020 • Ling Lo, Hong-Xia Xie, Hong-Han Shuai, Wen-Huang Cheng

Micro-Expression (ME) is the spontaneous, involuntary movement of a face that can reveal the true feeling.

Graph Classification Micro Expression Recognition +2

Fashion Meets Computer Vision: A Survey

no code implementations • 31 Mar 2020 • Wen-Huang Cheng, Sijie Song, Chieh-Yun Chen, Shintami Chusnul Hidayati, Jiaying Liu

Fashion is the way we present ourselves to the world and has become one of the world's largest industries.

Attribute Fashion Synthesis +2

SMP Challenge: An Overview of Social Media Prediction Challenge 2019

no code implementations • 4 Oct 2019 • Bo Wu, Wen-Huang Cheng, Peiye Liu, Bei Liu, Zhaoyang Zeng, Jiebo Luo

In the SMP Challenge at ACM Multimedia 2019, we introduce a novel prediction task Temporal Popularity Prediction, which focuses on predicting future interaction or attractiveness (in terms of clicks, views or likes etc.)

Multimedia recommendation