no code implementations • 26 Feb 2024 • Zimu Lu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li
We augment the ground-truth solutions of our seed data and train a back-translation model to translate the augmented solutions back into new questions.
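The augment-then-back-translate pipeline described above can be sketched as follows. This is a minimal illustration only: `augment_solution`, `BackTranslator`, and `synthesize_pairs` are hypothetical stand-ins for the authors' actual components, which use trained language models.

```python
# Hypothetical sketch of question synthesis via solution back-translation.
# All names here are illustrative placeholders, not the authors' code.

def augment_solution(solution: str) -> list[str]:
    # Placeholder: in practice a model rewrites or perturbs the
    # ground-truth solution to produce new solution variants.
    return [solution + " (variant)"]

class BackTranslator:
    # Placeholder for a trained model that maps a solution back to a
    # question whose answer is that solution.
    def solution_to_question(self, solution: str) -> str:
        return f"Question recovered from: {solution}"

def synthesize_pairs(seed_data: list[dict]) -> list[dict]:
    # Augment each seed solution, then back-translate every variant
    # into a new question, yielding new (question, solution) pairs.
    translator = BackTranslator()
    new_pairs = []
    for example in seed_data:
        for variant in augment_solution(example["solution"]):
            question = translator.solution_to_question(variant)
            new_pairs.append({"question": question, "solution": variant})
    return new_pairs
```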
no code implementations • 22 Feb 2024 • Ke Wang, Junting Pan, Weikang Shi, Zimu Lu, Mingjie Zhan, Hongsheng Li
Recent advancements in Large Multimodal Models (LMMs) have shown promising results in mathematical reasoning within visual contexts, with models approaching human-level performance on existing benchmarks such as MathVista.
Ranked #1 on Multimodal Reasoning on MATH-V (using extra training data)
2 code implementations • 11 Jan 2024 • Zhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Van Tu Vu, Zhida Huang, Tao Wang
Beyond capturing global information like other multi-modal models, our proposed model excels at tasks demanding a detailed understanding of local information within the input.
no code implementations • 29 Oct 2023 • Nan He, Hanyu Lai, Chenyang Zhao, Zirui Cheng, Junting Pan, Ruoyu Qin, Ruofan Lu, Rui Lu, Yunchen Zhang, Gangming Zhao, Zhaohui Hou, Zhiyuan Huang, Shaoqing Lu, Ding Liang, Mingjie Zhan
Based on TeacherLM-7.1B, we augmented 58 NLP datasets and taught student models of various parameter scales from the OPT and BLOOM series in a multi-task setting.
no code implementations • 25 Jul 2023 • Alireza Shafizadeh, Hossein Shahbeik, Mohammad Hossein Nadian, Vijai Kumar Gupta, Abdul-Sattar Nizami, Su Shiung Lam, WanXi Peng, Junting Pan, Meisam Tabatabaei, Mortaza Aghbashlo
A database covering a variety of catalyst characteristics and reaction conditions is compiled from the literature.
no code implementations • 15 Jun 2023 • Junting Pan, Ziyi Lin, Yuying Ge, Xiatian Zhu, Renrui Zhang, Yi Wang, Yu Qiao, Hongsheng Li
Video Question Answering (VideoQA) has been significantly advanced by the scaling of recent Large Language Models (LLMs).
Ranked #3 on Temporal/Causal QA on NExT-QA (using extra training data)
no code implementations • 24 May 2023 • Hossein Shahbeik, Alireza Shafizadeh, Mohammad Hossein Nadian, Dorsa Jeddi, Seyedali Mirjalili, Yadong Yang, Su Shiung Lam, Junting Pan, Meisam Tabatabaei, Mortaza Aghbashlo
The input features are constructed using an innovative approach to reflect the physics of the process.
1 code implementation • 22 May 2023 • Guo Chen, Yin-Dong Zheng, Jiahao Wang, Jilan Xu, Yifei Huang, Junting Pan, Yi Wang, Yali Wang, Yu Qiao, Tong Lu, Limin Wang
Building upon this insight, we propose a novel framework called VideoLLM that leverages the sequence reasoning capabilities of pre-trained LLMs from natural language processing (NLP) for video sequence understanding.
1 code implementation • 4 May 2023 • Renrui Zhang, Zhengkai Jiang, Ziyu Guo, Shilin Yan, Junting Pan, Xianzheng Ma, Hao Dong, Peng Gao, Hongsheng Li
Driven by large-data pre-training, the Segment Anything Model (SAM) has been demonstrated to be a powerful, promptable framework, revolutionizing segmentation models.
Ranked #1 on Personalized Segmentation on PerSeg
no code implementations • ICCV 2023 • Aojun Zhou, Yang Li, Zipeng Qin, Jianbo Liu, Junting Pan, Renrui Zhang, Rui Zhao, Peng Gao, Hongsheng Li
In this paper, we aim to reduce the complexity of large vision transformers pretrained with MAE, with the assistance of sparse training.
2 code implementations • 6 Dec 2022 • Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Hongjie Zhang, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, Limin Wang, Yu Qiao
Specifically, InternVideo efficiently explores masked video modeling and video-language contrastive learning as the pretraining objectives, and selectively coordinates video representations of these two complementary frameworks in a learnable manner to boost various video applications.
Ranked #1 on Action Recognition on Something-Something V1 (using extra training data)
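The "learnable coordination" of two complementary representations can be illustrated with a toy gating scheme. This is a simplified sketch under stated assumptions, not InternVideo's actual mechanism: a single learnable scalar gate interpolates between features from the masked-modeling branch and the contrastive branch.

```python
import math

def sigmoid(x: float) -> float:
    # Squash a learnable logit into a gate value in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def combine(feat_masked: list[float],
            feat_contrastive: list[float],
            gate_logit: float) -> list[float]:
    # Illustrative learnable fusion: a scalar gate g weights the
    # masked-modeling features against the contrastive features.
    g = sigmoid(gate_logit)
    return [g * a + (1.0 - g) * b
            for a, b in zip(feat_masked, feat_contrastive)]
```

With `gate_logit = 0.0` the gate is 0.5 and the two branches are averaged; training would adjust the logit to favor whichever branch helps a given downstream task.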
2 code implementations • 17 Nov 2022 • Guo Chen, Sen Xing, Zhe Chen, Yi Wang, Kunchang Li, Yizhuo Li, Yi Liu, Jiahao Wang, Yin-Dong Zheng, Bingkun Huang, Zhiyu Zhao, Junting Pan, Yifei Huang, Zun Wang, Jiashuo Yu, Yinan He, Hongjie Zhang, Tong Lu, Yali Wang, Limin Wang, Yu Qiao
In this report, we present our champion solutions to five tracks of the Ego4D challenge.
Ranked #1 on State Change Object Detection on Ego4D
1 code implementation • 27 Jun 2022 • Junting Pan, Ziyi Lin, Xiatian Zhu, Jing Shao, Hongsheng Li
This has led to a new research direction in parameter-efficient transfer learning.
Ranked #23 on Action Recognition on Something-Something V2 (using extra training data)
1 code implementation • 6 May 2022 • Junting Pan, Adrian Bulat, Fuwen Tan, Xiatian Zhu, Lukasz Dudziak, Hongsheng Li, Georgios Tzimiropoulos, Brais Martinez
In this work, pushing further along this under-studied direction, we introduce EdgeViTs, a new family of light-weight ViTs that, for the first time, enable attention-based vision models to compete with the best light-weight CNNs in the tradeoff between accuracy and on-device efficiency.
2 code implementations • 16 Jun 2020 • Siyu Chen, Junting Pan, Guanglu Song, Manyuan Zhang, Hao Shao, Ziyi Lin, Jing Shao, Hongsheng Li, Yu Liu
This technical report introduces our winning solution to the spatio-temporal action localization track, AVA-Kinetics Crossover, in ActivityNet Challenge 2020.
3 code implementations • CVPR 2021 • Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu, Jing Shao, Hongsheng Li
We propose to explicitly model the Actor-Context-Actor Relation, which is the relation between two actors based on their interactions with the context.
Ranked #2 on Action Recognition on AVA v2.1
2 code implementations • CVPR 2019 • Junting Pan, Chengyu Wang, Xu Jia, Jing Shao, Lu Sheng, Junjie Yan, Xiaogang Wang
This paper proposes the novel task of video generation conditioned on a single semantic label map, which provides a good balance between flexibility and quality in the generation process.
no code implementations • 3 Mar 2019 • Lu Sheng, Junting Pan, Jiaming Guo, Jing Shao, Xiaogang Wang, Chen Change Loy
Imagining multiple consecutive frames given one single snapshot is challenging, since it is difficult to simultaneously predict diverse motions from a single image and faithfully generate novel frames without visual distortions.
no code implementations • ECCV 2018 • Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giro-i-Nieto, Shih-Fu Chang
We aim to tackle a novel task in action detection - Online Detection of Action Start (ODAS) in untrimmed, streaming videos.
3 code implementations • 4 Jan 2017 • Junting Pan, Cristian Canton Ferrer, Kevin McGuinness, Noel E. O'Connor, Jordi Torres, Elisa Sayrol, Xavier Giro-i-Nieto
We introduce SalGAN, a deep convolutional neural network for visual saliency prediction trained with adversarial examples.
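The SalGAN objective pairs a per-pixel content loss with an adversarial term from a discriminator. The sketch below shows that shape of generator loss; the `alpha` weight and the exact form of the adversarial term are illustrative assumptions, not the paper's precise formulation.

```python
import math

def bce(pred: float, target: float, eps: float = 1e-7) -> float:
    # Binary cross-entropy for a single value, clamped for stability.
    pred = min(max(pred, eps), 1.0 - eps)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))

def generator_loss(pred_map: list[float],
                   gt_map: list[float],
                   disc_score_on_fake: float,
                   alpha: float = 0.005) -> float:
    # Content term: mean per-pixel BCE between predicted and
    # ground-truth saliency maps.
    content = sum(bce(p, t) for p, t in zip(pred_map, gt_map)) / len(pred_map)
    # Adversarial term: the generator wants the discriminator to
    # score its saliency map as real (target = 1).
    adversarial = bce(disc_score_on_fake, 1.0)
    return alpha * content + adversarial
```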
1 code implementation • CVPR 2016 • Junting Pan, Kevin McGuinness, Elisa Sayrol, Noel O'Connor, Xavier Giro-i-Nieto
The prediction of salient areas in images has been traditionally addressed with hand-crafted features based on neuroscience principles.
1 code implementation • 6 Jul 2015 • Junting Pan, Xavier Giró-i-Nieto
The prediction of salient areas in images has been traditionally addressed with hand-crafted features based on neuroscience principles.