1 code implementation • CoNLL (EMNLP) 2021 • Yang Hou, Houquan Zhou, Zhenghua Li, Yu Zhang, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan
In the coarse labeling stage, the joint model outputs a bracketed tree, in which each node corresponds to one of four labels (i.e., phrase, subphrase, word, subword).
no code implementations • Findings (NAACL) 2022 • Le Qi, Yu Zhang, Qingyu Yin, Guidong Zheng, Wen Junjie, Jinlong Li, Ting Liu
In this process, there are two kinds of critical information that are commonly employed: the representation information of original questions and the interactive information between pairs of questions.
no code implementations • COLING 2022 • Meiguo Wang, Benjamin Yao, Bin Guo, Xiaohu Liu, Yu Zhang, Tuan-Hung Pham, Chenlei Guo
To evaluate the performance of a multi-domain goal-oriented Dialogue System (DS), it is important to understand what the users’ goals are for the conversations and whether those goals are successfully achieved.
no code implementations • ECCV 2020 • Song Zhang, Yu Zhang, Zhe Jiang, Dongqing Zou, Jimmy Ren, Bin Zhou
A detail-enhancing branch is proposed to reconstruct daylight-specific features from the domain-invariant representations in a residual manner, regularized by a ranking loss.
1 code implementation • Findings (ACL) 2022 • Le Qi, Shangwen Lv, Hongyu Li, Jing Liu, Yu Zhang, Qiaoqiao She, Hua Wu, Haifeng Wang, Ting Liu
Open-domain question answering has been used in a wide range of applications, such as web search and enterprise search, which usually takes clean texts extracted from various formats of documents (e.g., web pages, PDFs, or Word documents) as the information source.
no code implementations • EMNLP 2020 • Zheng Li, Mukul Kumar, William Headden, Bing Yin, Ying Wei, Yu Zhang, Qiang Yang
The recent emergence of multilingual pre-trained language models (mPLMs) has enabled breakthroughs on various downstream cross-lingual transfer (CLT) tasks.
no code implementations • 3 May 2024 • Kaiyuan Chen, Xingzhuo Guo, Yu Zhang, Jianmin Wang, Mingsheng Long
The precision weighting mechanism posits that the brain allocates more attention to signals with lower precision, contributing to the cognitive ability of human brains.
no code implementations • 1 May 2024 • Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James T. Kwok
As the capabilities of large language models (LLMs) have expanded dramatically, aligning these models with human values presents a significant challenge, posing potential risks during deployment.
no code implementations • 1 May 2024 • Ziyi Chen, Xiaolong Wu, Yu Zhang
Specifically, we integrate view-dependent biases in monocular normal priors into the neural implicit representation of the scene.
1 code implementation • 25 Apr 2024 • Toru Lin, Yu Zhang, Qiyang Li, Haozhi Qi, Brent Yi, Sergey Levine, Jitendra Malik
Two significant challenges exist: the lack of an affordable and accessible teleoperation system suitable for a dual-arm setup with multifingered hands, and the scarcity of multifingered hand hardware equipped with touch sensing.
1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie Zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, Huimin Zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu
This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), to which various solutions were submitted and evaluated on the KVQ dataset collected from the popular short-form video platform, i.e., the Kuaishou/Kwai Platform.
no code implementations • 15 Apr 2024 • Moyu Zhang, Yongxiang Tang, Jinxin Hu, Yu Zhang
To enhance the model's capacity to capture user interests from historical behavior sequences in each scenario, we develop a ranking framework named the Scenario-Adaptive Fine-Grained Personalization Network (SFPNet), which designs a kind of fine-grained method for multi-scenario personalized recommendations.
1 code implementation • 10 Apr 2024 • Bowen Jin, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Suhang Wang, Yu Meng, Jiawei Han
Then, we propose a simple and effective framework called Graph Chain-of-thought (Graph-CoT) to augment LLMs with graphs by encouraging LLMs to reason on the graph iteratively.
no code implementations • 4 Apr 2024 • Beibei Wang, Shuang Meng, Lu Zhang, Chenjie Wang, Jingjing Huang, Yao Li, Haojie Ren, Yuxuan Xiao, Yuru Peng, Jianmin Ji, Yu Zhang, Yanyong Zhang
Numerous roadside perception datasets have been introduced to propel advancements in autonomous driving and intelligent transportation systems research and development.
no code implementations • 1 Apr 2024 • Shourya Bose, Yu Zhang, Kibaek Kim
The advent of smart meters has enabled pervasive collection of energy consumption data for training short-term load forecasting models.
1 code implementation • 25 Mar 2024 • Kaipeng Zeng, Bo Yang, Xin Zhao, Yu Zhang, Fan Nie, Xiaokang Yang, Yaohui Jin, Yanyan Xu
Single-step retrosynthesis prediction, a crucial step in the planning process, has witnessed a surge in interest in recent years due to advancements in AI for science.
1 code implementation • 24 Mar 2024 • Yan Jia, Yuxin Song, Zihou Liu, Qingyin Tan, Fangming Wang, Yu Zhang, Zheli Liu
From the security and privacy perspective, this survey seeks out the new characteristics in CIoT traffic analysis, the state-of-the-art progress in CIoT traffic analysis, and the challenges yet to be solved.
no code implementations • 16 Mar 2024 • Xuehao Wang, Feiyang Ye, Yu Zhang
Furthermore, we introduce a modified SAM (mSAM) for multi-task learning, where we remove the prompt encoder of SAM and use task-specific no-mask embeddings and a mask decoder for each task.
no code implementations • 14 Mar 2024 • Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-yan Yeung, James T. Kwok, Yu Zhang
Multimodal large language models (MLLMs) have shown impressive reasoning abilities, which, however, are also more vulnerable to jailbreak attacks than their LLM predecessors.
no code implementations • 11 Mar 2024 • Qing Xiao, Siyeop Yoon, Hui Ren, Matthew Tivnan, Lichao Sun, Quanzheng Li, Tianming Liu, Yu Zhang, Xiang Li
Alzheimer's Disease (AD) is a neurodegenerative condition characterized by diverse progression rates among individuals, with changes in cortical thickness (CTh) closely linked to its progression.
no code implementations • 8 Mar 2024 • Jiabao Zhang, Yu Zhang
Mapping plays a crucial role in location and navigation within automatic systems.
1 code implementation • 8 Mar 2024 • Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, Zeyu Wang
We present SplattingAvatar, a hybrid 3D representation of photorealistic human avatars with Gaussian Splatting embedded on a triangle mesh, which renders over 300 FPS on a modern GPU and 30 FPS on a mobile device.
no code implementations • 29 Feb 2024 • Yu Zhang, Long Wen, Xiangtong Yao, Zhenshan Bing, Linghuan Kong, Wei He, Alois Knoll
Subsequently, the hyperparameters of the Gaussian model are trained with a specially compound kernel, and the Gaussian model's online inferential capability and computational efficiency are strengthened by updating a solitary inducing point derived from new samples, in conjunction with the learned hyperparameters.
no code implementations • 26 Feb 2024 • Yu Zhang, Guangyao Tian, Long Wen, Xiangtong Yao, Liding Zhang, Zhenshan Bing, Wei He, Alois Knoll
This paper proposes a LiDAR-based goal-seeking and exploration framework, addressing the efficiency of online obstacle avoidance in unstructured environments populated with static and moving obstacles.
no code implementations • 26 Feb 2024 • Jinxu Zhang, Yongqi Yu, Yu Zhang
Document Visual Question Answering (DVQA) is a task that involves responding to queries based on the content of images.
1 code implementation • 20 Feb 2024 • Yanzhen Shen, Yu Zhang, Yunyi Zhang, Jiawei Han
Entity Set Expansion, Taxonomy Expansion, and Seed-Guided Taxonomy Construction are three representative tasks that can be used to automatically populate an existing taxonomy with new entities.
no code implementations • 19 Feb 2024 • Yu Zhang, Hui-Ling Zhen, Zehua Pei, Yingzhao Lian, Lihao Yin, Mingxuan Yuan, Bei Yu
In this paper, we propose a novel solver-layer adaptation (SoLA) method, where we introduce a solver as a new layer of the LLM to differentially guide solutions towards satisfiability.
no code implementations • 4 Feb 2024 • Yun Long, Haifeng Luo, Yu Zhang
Recognizing the knowledge-intensive and labor-intensive nature of traditional qualitative methods in educational research, this study investigates the potential of LLM to streamline and enhance the analysis process.
1 code implementation • 4 Feb 2024 • Yanbin Wei, Qiushi Huang, James T. Kwok, Yu Zhang
Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications.
no code implementations • 3 Feb 2024 • Yanbin Wei, Shuai Fu, Weisen Jiang, James T. Kwok, Yu Zhang
In this paper, we take the first step in incorporating visual information into graph reasoning tasks and propose a new benchmark GITQA, where each sample is a tuple (graph, image, textual description).
no code implementations • 31 Jan 2024 • Mengxi Liu, Vitor Fortes Rey, Yu Zhang, Lala Shakti Swarup Ray, Bo Zhou, Paul Lukowicz
While IMUs are currently the prominent fitness tracking modality, through iMove, we show bio-impedance can help improve IMU-based fitness tracking through sensor fusion and contrastive learning. To evaluate our methods, we conducted an experiment including six upper body fitness activities performed by ten subjects over five days to collect synchronized data from bio-impedance across two wrists and an IMU on the left wrist. The contrastive learning framework uses the two modalities to train a better IMU-only classification model, where bio-impedance is only required at the training phase; the average Macro F1 score with the input of a single IMU was improved by 3.22%, reaching 84.71% compared to the 81.49% of the IMU baseline model.
no code implementations • 29 Jan 2024 • Heyang Gong, Chaochao Lu, Yu Zhang
In the field of causal modeling, potential outcomes (PO) and structural causal models (SCMs) stand as the predominant frameworks.
1 code implementation • 23 Jan 2024 • Yu Zhang, Yunyi Zhang, Yanzhen Shen, Yu Deng, Lucian Popa, Larisa Shwartz, ChengXiang Zhai, Jiawei Han
In this paper, we study the task of seed-guided fine-grained entity typing in science and engineering domains, which takes the name and a few seed entities for each entity type as the only supervision and aims to classify new entity mentions into both seen and unseen types (i.e., those without seed entities).
no code implementations • 23 Jan 2024 • W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath
In the era of large models, the autoregressive nature of decoding often results in latency serving as a significant bottleneck.
no code implementations • 22 Jan 2024 • Zelin Gao, Weichen Dai, Yu Zhang
We propose Hierarchical Geometric Guidance (HGG) to incorporate the attachment of Structure from Motion (SfM), namely sparse depth prior, into the scene representations.
no code implementations • 22 Jan 2024 • Yu Zhang, Mei Di, Haozheng Luo, Chenwei Xu, Richard Tzong-Han Tsai
Recognizing the lack of extensive, publicly available datasets for SM, we have created and open-sourced the HDXSM dataset from the public humanitarian data.
no code implementations • 17 Jan 2024 • Feiyang Ye, Baijiong Lin, Xiaofeng Cao, Yu Zhang, Ivor Tsang
In this paper, we study the Multi-Objective Bi-Level Optimization (MOBLO) problem, where the upper-level subproblem is a multi-objective optimization problem and the lower-level subproblem is for scalar optimization.
no code implementations • 13 Jan 2024 • Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, Ge Zhang, Jiakai Wang, Haoran Que, Yukang Chen, Wenbo Su, Tiezheng Ge, Jie Fu, Wenhu Chen, Bo Zheng
Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources.
no code implementations • 9 Jan 2024 • Didier Sornette, Yu Zhang
Defining age-dependent transaction flows as the fraction of bitcoins that are traded at a given time and that were born (last traded) at some specific earlier time, we document that the time-averaged transaction flow fraction has a power law dependence as a function of age, with an exponent close to $-1.5$, a value compatible with priority queuing theory.
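The power-law dependence described above can be written compactly (with $f$ the time-averaged transaction flow fraction and $\tau$ the coin age; this notation is assumed for illustration, not taken from the paper):

```latex
f(\tau) \;\propto\; \tau^{-\alpha}, \qquad \alpha \approx 1.5
```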
1 code implementation • 8 Jan 2024 • Pengxin Guo, Pengrong Jin, Ziyue Li, Lei Bai, Yu Zhang
To make the model trained on historical data better adapt to future data in a fully online manner, this paper conducts the first study of the online test-time adaptation techniques for spatial-temporal traffic flow forecasting problems.
1 code implementation • 6 Jan 2024 • Shuhao Chen, Yulong Zhang, Weisen Jiang, Jiangang Lu, Yu Zhang
Recent advances achieved by deep learning models rely on the independent and identically distributed assumption, hindering their applications in real-world scenarios with domain shifts.
no code implementations • 19 Dec 2023 • Yunhao Gou, Zhili Liu, Kai Chen, Lanqing Hong, Hang Xu, Aoxue Li, Dit-yan Yeung, James T. Kwok, Yu Zhang
Instruction tuning of Large Vision-language Models (LVLMs) has revolutionized the development of versatile models with zero-shot generalization across a wide range of downstream vision-language tasks.
no code implementations • 17 Dec 2023 • Yu Zhang, Rongjie Huang, RuiQi Li, Jinzheng He, Yan Xia, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao
Moreover, existing SVS methods encounter a decline in the quality of synthesized singing voices in OOD scenarios, as they rest upon the assumption that the target vocal attributes are discernible during the training phase.
1 code implementation • 13 Dec 2023 • Hong Zhang, Yu Zhang
In this paper, we propose the reversible spiking neural network to reduce the memory cost of intermediate activations and membrane potentials during training.
no code implementations • 8 Dec 2023 • Jinjing Zhu, Feiyang Ye, Qiao Xiao, Pengxin Guo, Yu Zhang, Qiang Yang
Specifically, the proposed LIWUDA method constructs a weight network to assign weights to each instance based on its probability of belonging to common classes, and designs Weighted Optimal Transport (WOT) for domain alignment by leveraging instance weights.
no code implementations • 6 Dec 2023 • Zixuan Gong, Qi Zhang, Guangyin Bao, Lei Zhu, Yu Zhang, Ke Liu, Liang Hu, Duoqian Miao
The limited data availability and the low signal-to-noise ratio of fMRI signals lead to the challenging task of fMRI-to-image retrieval.
1 code implementation • 2 Dec 2023 • Yu Zhang, Songpengcheng Xia, Lei Chu, Jiarui Yang, Qi Wu, Ling Pei
This paper introduces a novel human pose estimation approach using sparse inertial sensors, addressing the shortcomings of previous methods reliant on synthetic data.
no code implementations • 27 Nov 2023 • Chenglin Yang, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan Yuille, Jiahui Yu
To tackle this problem, we redesign the scoring objective for the captioner to alleviate the distributional bias and focus on measuring the gain of information brought by the visual inputs.
1 code implementation • 21 Nov 2023 • Yongliang Lin, Yongzhi Su, Praveen Nathan, Sandeep Inuganti, Yan Di, Martin Sundermeyer, Fabian Manhardt, Didier Stricker, Jason Rambach, Yu Zhang
In this work, we present a novel dense-correspondence method for 6DoF object pose estimation from a single RGB-D image.
no code implementations • 21 Nov 2023 • Shourya Bose, Yu Zhang, Kibaek Kim
The widespread adoption of smart meters provides access to detailed and localized load consumption data, suitable for training building-level load forecasting models.
no code implementations • 6 Nov 2023 • Xulong Wang, Yu Zhang, Menghui Zhou, Tong Liu, Jun Qi, Po Yang
The experimental results show that, compared with direct ROI-based learning, our proposed method is more effective in predicting disease progression.
no code implementations • 6 Nov 2023 • Zhipeng Yao, Yu Zhang, Dazhou Li
To address this contradiction, we propose a novel optimization method that aims to accelerate the convergence rate of SGD without loss of generalization.
no code implementations • 2 Nov 2023 • Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen
Instead, E3 TTS models the temporal structure of the waveform through the diffusion process.
no code implementations • 23 Oct 2023 • Yu Zhang, Yanzhen Shen, Xiusi Chen, Bowen Jin, Jiawei Han
As many academic conferences are overwhelmed by a rapidly increasing number of paper submissions, automatically finding appropriate reviewers for each submission becomes a more urgent need than ever.
no code implementations • 11 Oct 2023 • Carlo da Cunha, Nobuyuki Aoki, David Ferry, Kevin Vora, Yu Zhang
In the realm of quantum-effect devices and materials, two-dimensional electron gases (2DEGs) stand as fundamental structures that promise transformative technologies.
no code implementations • 11 Oct 2023 • Siru Ouyang, Jiaxin Huang, Pranav Pillai, Yunyi Zhang, Yu Zhang, Jiawei Han
In this study, we propose OnEFET, where we (1) enrich each node in the ontology structure with two types of extra information: instance information for training sample augmentation and topic information to relate types to contexts, and (2) develop a coarse-to-fine typing algorithm that exploits the enriched information by training an entailment model with contrasting topics and instance-based augmented training samples.
1 code implementation • 11 Oct 2023 • Yu Zhang, Yue Zhang, Leyang Cui, Guohong Fu
In this work, we propose a novel non-autoregressive text editing method to circumvent the above issues, by modeling the edit process with latent CTC alignments.
no code implementations • 10 Oct 2023 • Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Han Zhao, Jiawei Han
Mainstream text representation learning methods use pretrained language models (PLMs) to generate one embedding for each text unit, expecting that all types of relations between texts can be captured by these single-view embeddings.
no code implementations • 10 Oct 2023 • Peng Di, Jianguo Li, Hang Yu, Wei Jiang, Wenting Cai, Yang Cao, Chaoyu Chen, Dajun Chen, Hongwei Chen, Liang Chen, Gang Fan, Jie Gong, Zi Gong, Wen Hu, Tingting Guo, Zhichao Lei, Ting Li, Zheng Li, Ming Liang, Cong Liao, Bingchang Liu, Jiachen Liu, Zhiwei Liu, Shaojun Lu, Min Shen, Guangpei Wang, Huan Wang, Zhi Wang, Zhaogui Xu, Jiawei Yang, Qing Ye, Gehao Zhang, Yu Zhang, Zelin Zhao, Xunjin Zheng, Hailian Zhou, Lifu Zhu, Xianying Zhu
It is specifically designed for code-related tasks with both English and Chinese prompts and supports over 40 programming languages.
no code implementations • 3 Oct 2023 • Weisen Jiang, Baijiong Lin, Han Shi, Yu Zhang, Zhenguo Li, James T. Kwok
Recently, various merging methods have been proposed to build a multi-task model from task-specific finetuned models without retraining.
no code implementations • 30 Sep 2023 • Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Yongqiang Wang, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul Rubenstein, Lukas Zilka, Dian Yu, Zhong Meng, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu
We present a joint Speech and Language Model (SLM), a multitask, multilingual, and dual-modal model that takes advantage of pretrained foundational speech and language models.
no code implementations • 26 Sep 2023 • Qiao Yang, Yu Zhang, Jian Zhang, Zijing Zhao, Shunli Zhang, Jinqiao Wang, Junzhe Chen
Most existing learning-based infrared and visible image fusion (IVIF) methods exhibit massive redundant information in the fused images, i.e., yielding edge-blurring effects or making objects unrecognizable to detectors.
no code implementations • 26 Sep 2023 • Qiao Yang, Yu Zhang, Jian Zhang, Zijing Zhao, Shunli Zhang, Jinqiao Wang, Junzhe Chen
Infrared and visible image fusion (IVIF) is used to generate fusion images with comprehensive features of both images, which is beneficial for downstream vision tasks.
no code implementations • 25 Sep 2023 • Ping Li, Yu Zhang, Li Yuan, Jian Zhao, Xianghua Xu, Xiaoqin Zhang
Particularly, the gradients from the segmentation model are exploited to discover the easily confused region, in which it is difficult to identify the pixel-wise objects from the background in a frame.
no code implementations • 23 Sep 2023 • Yulong Zhang, Shuhao Chen, Weisen Jiang, Yu Zhang, Jiangang Lu, James T. Kwok
However, the performance of existing UDA methods is constrained by the large domain shift and limited target domain data.
1 code implementation • 23 Sep 2023 • Xiang Geng, Zhejian Lai, Yu Zhang, Shimin Tao, Hao Yang, Jiajun Chen, ShuJian Huang
We generate pseudo MQM data using parallel data from the WMT translation task.
no code implementations • 21 Sep 2023 • Ping Li, Yu Zhang, Li Yuan, Huaxin Xiao, Binbin Lin, Xianghua Xu
Unsupervised Video Object Segmentation (VOS) aims at identifying the contours of primary foreground objects in videos without any prior knowledge.
no code implementations • 21 Sep 2023 • Ping Li, Yu Zhang, Li Yuan, Xianghua Xu
Referring Video Object Segmentation (RVOS) requires segmenting the object in video referred by a natural language query.
1 code implementation • 21 Sep 2023 • Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu
Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%.
no code implementations • 21 Sep 2023 • Riko I Made, Jing Lin, Jintao Zhang, Yu Zhang, Lionel C. H. Moh, Zhaolin Liu, Ning Ding, Sing Yang Chiam, Edwin Khoo, Xuesong Yin, Guangyuan Wesley Zheng
Battery health assessment and recuperation play a crucial role in the utilization of second-life Li-ion batteries.
no code implementations • 19 Sep 2023 • Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa
Spoken language identification refers to the task of automatically predicting the spoken language in a given utterance.
no code implementations • 14 Sep 2023 • Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang
We show that the USM-SCD model can achieve more than 75% average speaker change detection F1 score across a test set that consists of data from 96 languages.
1 code implementation • ICCV 2023 • Hao Chen, Chenyuan Qu, Yu Zhang, Chen Chen, Jianbo Jiao
It is understandable as the model is designed to learn paired mapping (e.g., from a noisy image to its clean version).
no code implementations • 4 Sep 2023 • Zhexiao Xiong, Feng Qiao, Yu Zhang, Nathan Jacobs
We introduce a novel training strategy for stereo matching and optical flow estimation that utilizes image-to-image translation between synthetic and real image domains.
1 code implementation • 3 Sep 2023 • Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
1 code implementation • NeurIPS 2023 • Xiaolong Wang, Runsen Xu, Zuofan Cui, Zeyu Wan, Yu Zhang
In this paper, we introduce a novel approach to fine-grained cross-view geo-localization.
no code implementations • 30 Aug 2023 • Yukun Su, Ruizhou Sun, Xin Shu, Yu Zhang, Qingyao Wu
Multi-Object Tracking (MOT) is a crucial computer vision task that aims to predict the bounding boxes and identities of objects simultaneously.
1 code implementation • 23 Aug 2023 • Baijiong Lin, Weisen Jiang, Feiyang Ye, Yu Zhang, Pengguang Chen, Ying-Cong Chen, Shu Liu, James T. Kwok
Multi-task learning (MTL), a learning paradigm to learn multiple related tasks simultaneously, has achieved great success in various fields.
1 code implementation • 16 Aug 2023 • Ben Chen, Xuechao Zou, Kai Li, Yu Zhang, Junliang Xing, Pin Tao
Lake extraction from remote sensing imagery is a complex challenge due to the varied lake shapes and data noise.
no code implementations • 15 Aug 2023 • Weisen Jiang, Han Shi, Longhui Yu, Zhengying Liu, Yu Zhang, Zhenguo Li, James T. Kwok
Instead of using forward or backward reasoning alone, we propose FOBAR to combine FOrward and BAckward Reasoning for verification.
1 code implementation • 8 Aug 2023 • Xuechao Zou, Kai Li, Junliang Xing, Yu Zhang, Shiying Wang, Lei Jin, Pin Tao
Optical satellite images are a critical data source; however, cloud cover often compromises their quality, hindering image applications and analysis.
1 code implementation • 8 Aug 2023 • Ben Chen, Xuechao Zou, Yu Zhang, Jiayu Li, Kai Li, Junliang Xing, Pin Tao
LEFormer contains three main modules: CNN encoder, Transformer encoder, and cross-encoder fusion.
no code implementations • 2 Aug 2023 • Zhenyuan Ning, Yixiao Mao, Qianjin Feng, Shengzhou Zhong, Yu Zhang
The complex scenario of ultrasound images, in which adjacent tissues (i.e., background) share similar intensity with, and even contain richer texture patterns than, the lesion region (i.e., foreground), poses a unique challenge for accurate lesion segmentation.
1 code implementation • 27 Jul 2023 • Jing Xiong, Tianqi Hong, Dongbo Zhao, Yu Zhang
Non-intrusive load monitoring (NILM) identifies the status and power consumption of various household appliances by disaggregating the total power usage signal of an entire house.
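The disaggregation task described above is conventionally formulated as follows (a standard NILM formulation given for illustration; the symbols are assumptions, not taken from the paper): the aggregate signal is modeled as the sum of per-appliance consumptions plus noise,

```latex
P_{\text{total}}(t) \;=\; \sum_{i=1}^{N} p_i(t) \;+\; e(t),
```

where $p_i(t)$ is the power drawn by appliance $i$ at time $t$ and $e(t)$ is measurement noise; NILM recovers the $p_i(t)$ (and appliance on/off states) from $P_{\text{total}}(t)$ alone.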
no code implementations • 25 Jul 2023 • Enqiang Zhu, Yu Zhang, Shengzhi Wang, Darren Strash, Chanjuan Liu
Given a graph, the minimum dominating set (MinDS) problem is to identify a smallest set $D$ of vertices such that every vertex not in $D$ is adjacent to at least one vertex in $D$.
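The definition above is easy to state operationally. As a minimal sketch (a verifier only, not the paper's solver; the graph encoding is an assumption for illustration), the following checks whether a candidate set $D$ dominates a graph given as an adjacency list:

```python
# Verify whether a vertex set D is a dominating set of an undirected graph.
# adj maps each vertex to the set of its neighbors.
def is_dominating_set(adj, D):
    D = set(D)
    # Every vertex must be in D or adjacent to at least one vertex in D.
    return all(v in D or not D.isdisjoint(adj[v]) for v in adj)

# Toy graph: a path 0-1-2-3-4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(is_dominating_set(adj, {1, 3}))  # True: 1 and 3 cover all five vertices
print(is_dominating_set(adj, {0}))     # False: vertices 2, 3, 4 are uncovered
```

Finding a *minimum* such $D$ is NP-hard, which is why heuristic and exact solvers like the one proposed in the paper are of interest.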
no code implementations • 20 Jul 2023 • Ian P. Roberts, Yu Zhang, Tawfik Osman, Ahmed Alkhateeb
Noteworthy strides continue to be made in the development of full-duplex millimeter wave (mmWave) communication systems, but most of this progress has been built on theoretical models and validated through simulation.
no code implementations • 20 Jul 2023 • Qian Wan, Siying Hu, Yu Zhang, Piaohong Wang, Bo Wen, Zhicong Lu
This collaborative process champions the human in a dominant role, in addition to mixed and shifting levels of initiative that exist between humans and LLMs.
1 code implementation • 24 Jun 2023 • Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen Shen, Yunyi Zhang, Yu Meng, Jiawei Han
Instead of relying on human-annotated training samples to build a classifier, weakly supervised scientific paper classification aims to classify papers only using category descriptions (e.g., category names, category-indicative keywords).
no code implementations • 22 Jun 2023 • Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Frank
AudioPaLM inherits the capability to preserve paralinguistic information such as speaker identity and intonation from AudioLM and the linguistic knowledge present only in text large language models such as PaLM-2.
no code implementations • 22 Jun 2023 • Yu Zhang, Hao Zeng, Bowen Ma, Wei Zhang, Zhimeng Zhang, Yu Ding, Tangjie Lv, Changjie Fan
The discriminator is shape-aware and relies on a semantic flow-guided operation to explicitly calculate the shape discrepancies between the target and source faces, thus optimizing the face swapping network to generate highly realistic results.
no code implementations • 20 Jun 2023 • Yu Zhang, Long Cheng, Xiuze Xia, Haoyu Zhang
The proposed approach involves the estimation of full stiffness matrices from human demonstrations, which are then combined with sensed forces and motion information to create a model using the non-parametric method.
no code implementations • 13 Jun 2023 • Nanxin Chen, Izhak Shafran, Yu Zhang, Chung-Cheng Chiu, Hagen Soltau, James Qin, Yonghui Wu
However, finetuning all parameters from the self-supervised learned model can be computationally expensive, and becomes infeasible as the size of the model and the number of downstream tasks scale.
no code implementations • 13 Jun 2023 • Xu Han, Bin Guo, Yoon Jung, Benjamin Yao, Yu Zhang, Xiaohu Liu, Chenlei Guo
Personalized dialogue agents (DAs) powered by large pre-trained language models (PLMs) often rely on explicit persona descriptions to maintain personality consistency.
no code implementations • 12 Jun 2023 • Yu Zhang, Jia Li, Jie Ding, Xiang Li
Learning and analysis of network robustness, including controllability robustness and connectivity robustness, is critical for various networked systems against attacks.
1 code implementation • 7 Jun 2023 • Xiusi Chen, Yu Zhang, Jinliang Deng, Jyun-Yu Jiang, Wei Wang
Few-shot question answering (QA) aims at precisely discovering answers to a set of questions from context passages while only a few training samples are available.
no code implementations • 6 Jun 2023 • Zhishan Zhao, Jingyue Gao, Yu Zhang, Shuguang Han, Siyuan Lou, Xiang-Rong Sheng, Zhe Wang, Han Zhu, Yuning Jiang, Jian Xu, Bo Zheng
In this architecture, the pre-ranking model is expected to be a lightweight approximation of the ranking model, which handles more candidates with strict latency requirements.
1 code implementation • CVPR 2023 • Yingjie Wang, Jiajun Deng, Yao Li, Jinshui Hu, Cong Liu, Yu Zhang, Jianmin Ji, Wanli Ouyang, Yanyong Zhang
LiDAR and Radar are two complementary sensing approaches in that LiDAR specializes in capturing an object's 3D shape while Radar provides longer detection ranges as well as velocity hints.
1 code implementation • 1 Jun 2023 • Han Cui, Shangzhan Li, Yu Zhang, Qi Shi
The generation of explanation graphs is a significant task that aims to produce explanation graphs in response to user input, revealing the internal reasoning process.
1 code implementation • 1 Jun 2023 • Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-Yi Lee, Tara N. Sainath
In this work, we introduce a "score-based assessment" framework for estimating the transferability of pre-trained speech models (PSMs) for fine-tuning target tasks.
1 code implementation • 1 Jun 2023 • Weisen Jiang, Yu Zhang, James T. Kwok
Combining meta-learning the prompt pool and RepVerb, we propose MetaPrompter for effective structured prompting.
no code implementations • 30 May 2023 • Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna
The constituent samples of LibriTTS-R are identical to those of LibriTTS, with only the sound quality improved.
no code implementations • 25 May 2023 • Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Francoise Beaufays
We evaluate the proposed model on a set of 12 languages, and achieve an average 11.9% relative improvement in WER over the baseline.
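As a quick aside on the metric quoted above, "relative improvement in WER" is the fraction of baseline errors removed. A minimal sketch; the 20.0% / 17.62% figures below are illustrative assumptions, not numbers from the paper:

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER improvement: share of the baseline error that is removed."""
    return (baseline_wer - new_wer) / baseline_wer * 100.0

# e.g. a baseline WER of 20.0% reduced to 17.62% is ~11.9% relative
print(round(relative_improvement(20.0, 17.62), 1))  # → 11.9
```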
1 code implementation • 23 May 2023 • Yunyi Zhang, Minhao Jiang, Yu Meng, Yu Zhang, Jiawei Han
Weakly-supervised text classification trains a classifier using the label name of each target class as the only supervision, which largely reduces human annotation efforts.
no code implementations • 23 May 2023 • Yu Zhang, Hao Cheng, Zhihong Shen, Xiaodong Liu, Ye-Yi Wang, Jianfeng Gao
Scientific literature understanding tasks have gained significant attention due to their potential to accelerate scientific discovery.
1 code implementation • 22 May 2023 • Zhangming Chan, Yu Zhang, Shuguang Han, Yong Bai, Xiang-Rong Sheng, Siyuan Lou, Jiacen Hu, Baolin Liu, Yuning Jiang, Jian Xu, Bo Zheng
However, we observe that a well-trained CVR prediction model often performs sub-optimally during sales promotions.
no code implementations • 20 May 2023 • Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Xinyang Zhang, Qi Zhu, Jiawei Han
A real-world text corpus sometimes comprises not only text documents but also semantic links between them (e.g., academic papers in a bibliographic network are linked by citations and co-authorships).
1 code implementation • 13 May 2023 • Yu Zhang, Siqi Chen, Mingdao Wang, Xianlin Zhang, Chuang Zhu, Yue Zhang, Xueming Li
Extensive experiments demonstrate that our method outperforms other methods in maintaining temporal consistency both qualitatively and quantitatively.
1 code implementation • 10 May 2023 • Guoqing Yang, Chuang Zhu, Yu Zhang
Weakly supervised semantic segmentation (WSSS) based on image-level labels is challenging since it is hard to obtain complete semantic regions.
1 code implementation • 8 May 2023 • Jing Xiong, Yu Zhang
In this paper, we propose a unifying deep learning framework for load forecasting, which includes time-varying feature weighting, hierarchical temporal attention, and feature-reinforced error correction.
1 code implementation • 4 May 2023 • Kaixin Ma, Hao Cheng, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
Our approach outperforms recent self-supervised retrievers in zero-shot evaluations and achieves state-of-the-art fine-tuned retrieval performance on NQ, HotpotQA and OTT-QA.
Ranked #4 on Question Answering on HotpotQA
1 code implementation • 3 May 2023 • Xu Yang, Jiawei Peng, Zihua Wang, Haiyang Xu, Qinghao Ye, Chenliang Li, Songfang Huang, Fei Huang, Zhangzikang Li, Yu Zhang
In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs.
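The paper's TSG module is not reproduced here, but the core idea of using multi-head attention as the neighborhood-aggregation step of a GNN can be sketched in plain NumPy. All names, the identity Q/K/V projections, and the toy graph are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mha_graph_layer(X, adj, num_heads=2):
    """Aggregate each node's neighborhood with multi-head scaled dot-product
    attention. X: (n, d) node features; adj: (n, n) 0/1 adjacency.
    Identity projections keep the sketch minimal."""
    n, d = X.shape
    dh = d // num_heads
    A = adj + np.eye(n)                  # let every node attend to itself
    mask = np.where(A > 0, 0.0, -1e9)    # block attention to non-neighbors
    heads = []
    for h in range(num_heads):
        Q = K = V = X[:, h * dh:(h + 1) * dh]
        scores = Q @ K.T / np.sqrt(dh) + mask
        heads.append(softmax(scores) @ V)
    return np.concatenate(heads, axis=1)

X = np.random.default_rng(0).normal(size=(4, 8))
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
out = mha_graph_layer(X, adj)
print(out.shape)  # (4, 8)
```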
no code implementations • 28 Apr 2023 • Weisen Jiang, Hansi Yang, Yu Zhang, James Kwok
Sharpness-aware minimization (SAM), which searches for flat minima by min-max optimization, has been shown to be useful in improving model generalization.
no code implementations • 27 Apr 2023 • Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang
Recently, a number of approaches to train speech models by incorporating text into end-to-end models have been developed, with Maestro advancing state-of-the-art automatic speech recognition (ASR) and Speech Translation (ST) performance.
Automatic Speech Recognition (ASR) +2
no code implementations • 25 Apr 2023 • Yu Zhang, Lin Zhang
In this study, we investigated nine promising models to evaluate their performance in pavement surface crack detection in terms of model accuracy, computational complexity, and model stability.
no code implementations • 20 Apr 2023 • Chenglu Sun, Yichi Zhang, Yu Zhang, Ziling Lu, Jingbin Liu, Sijia Xu, Weidong Zhang
We propose asymmetric-evolution training (AET), a novel multi-agent reinforcement learning framework that can train multiple kinds of agents simultaneously in AMP games.
no code implementations • 16 Apr 2023 • Yu Zhang, Huaming Chen, Wei Bao, Zhongzheng Lai, Zao Zhang, Dong Yuan
Being able to identify and track all the pedestrians in the dense crowd scene with computer vision approaches is a typical challenge in this field, also known as the Multiple Object Tracking (MOT) challenge.
1 code implementation • 13 Apr 2023 • Siqi Chen, Xueming Li, Xianlin Zhang, Mingdao Wang, Yu Zhang, Yue Zhang
Previous methods search for correspondences across the entire reference image, and this type of global matching is prone to mismatches.
1 code implementation • 6 Apr 2023 • Yu Zhang, Xiaoguang Di, Junde Wu, Rao Fu, Yong Li, Yue Wang, Yanwu Xu, Guohui YANG, Chunhui Wang
In this paper, to make the learning easier in low-light image enhancement, we introduce FLW-Net (Fast and LightWeight Network) and two relative loss functions.
no code implementations • 4 Apr 2023 • Akkamahadevi Hanni, Andrew Boateng, Yu Zhang
The goal of SEP is to find behaviors that align with human expectations while adhering to the specified safety criterion.
no code implementations • 2 Apr 2023 • Sicong Liang, Junchao Tian, Shujun Yang, Yu Zhang
The key challenge of FL is the heterogeneity of local data in different clients, such as heterogeneous label distribution and feature shift, which could lead to significant performance degradation of the learned models.
no code implementations • 27 Mar 2023 • Siqi Chen, Xueming Li, Xianlin Zhang, Mingdao Wang, Yu Zhang, Jiatong Han, Yue Zhang
Exemplar-based video colorization is an essential technique for applications like old movie restoration.
no code implementations • 22 Mar 2023 • Guoliang You, Xiaomeng Chu, Yifan Duan, Jie Peng, Jianmin Ji, Yu Zhang, Yanyong Zhang
In particular, we specify a prompt-transformer for representation conversion and propose a two-step training process to train the prompt-transformer for the target environment, while the rest of the DRL pipeline remains unchanged.
no code implementations • 17 Mar 2023 • Yulong Zhang, Shuhao Chen, Yu Zhang, Jiangang Lu
The generated samples can well simulate the data distribution of the target domain and help existing UDA methods transfer from the source domain to the target domain more easily, thus improving the transfer performance.
1 code implementation • 3 Mar 2023 • Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani
Experiments show that Miipher (i) is robust against various kinds of audio degradation and (ii) enables us to train a high-quality text-to-speech (TTS) model from restored speech samples collected from the Web.
no code implementations • 2 Mar 2023 • Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu
We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages.
Automatic Speech Recognition (ASR) +3
1 code implementation • 28 Feb 2023 • Yu Zhang, Junle Yu, Xiaolin Huang, Wenhui Zhou, Ji Hou
Different from previous methods that only use geometry representation, our module is specifically designed to effectively correlate color into geometry for the point cloud registration task.
1 code implementation • 21 Feb 2023 • Bowen Jin, Yu Zhang, Yu Meng, Jiawei Han
Edges in many real-world social/information networks are associated with rich text information (e.g., user-user communications or user-product reviews).
no code implementations • 17 Feb 2023 • Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman
In this work, we propose to train a single multilingual language model (LM) for shallow fusion in multiple languages.
Automatic Speech Recognition (ASR) +2
no code implementations • 16 Feb 2023 • Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran
We propose JEIT, a joint end-to-end (E2E) model and internal language model (ILM) training method to inject large-scale unpaired text into ILM during E2E training which improves rare-word speech recognition.
1 code implementation • 7 Feb 2023 • Yu Zhang, Bowen Jin, Qi Zhu, Yu Meng, Jiawei Han
Due to the exponential growth of scientific publications on the Web, there is a pressing need to tag each paper with fine-grained topics so that researchers can track their interested fields of study rather than drowning in the whole literature.
no code implementations • 4 Feb 2023 • Haojie Ren, Sha Zhang, Sugang Li, Yao Li, Xinchen Li, Jianmin Ji, Yu Zhang, Yanyong Zhang
In this paper, we propose TrajMatch -- the first system that can automatically calibrate for roadside LiDARs in both time and space.
no code implementations • 3 Feb 2023 • Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Francoise Beaufays
The FM encoder adapter and decoder are then finetuned to the target domain with a small amount of supervised in-domain data.
no code implementations • 28 Jan 2023 • Kejun Chen, Yu Zhang
Probabilistic power flow (PPF) analysis is critical to power system operation and planning.
no code implementations • 19 Jan 2023 • Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman
In this work, we propose a new parameter-efficient learning framework based on neural model reprogramming for cross-lingual speech recognition, which can re-purpose well-trained English automatic speech recognition (ASR) models to recognize the other languages.
Automatic Speech Recognition (ASR) +1
no code implementations • 17 Jan 2023 • Yu Zhang, Yue Wang, Zhi Tian, Geert Leus, Gong Zhang
This paper proposes a super-resolution harmonic retrieval method for uncorrelated strictly non-circular signals, whose covariance and pseudo-covariance present Toeplitz and Hankel structures, respectively.
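The Toeplitz and Hankel structures stated above can be checked numerically. This is an illustrative sketch of the structure tests only (not the paper's retrieval method), using a small made-up matrix:

```python
import numpy as np

def is_toeplitz(M, tol=1e-9):
    """Constant along each diagonal: M[i, j] == M[i+1, j+1]."""
    return np.allclose(M[:-1, :-1], M[1:, 1:], atol=tol)

def is_hankel(M, tol=1e-9):
    """Constant along each anti-diagonal: M[i, j] == M[i+1, j-1]."""
    return np.allclose(M[:-1, 1:], M[1:, :-1], atol=tol)

# A Hankel matrix flipped left-right is Toeplitz, and vice versa.
H = np.array([[1, 2, 3],
              [2, 3, 4],
              [3, 4, 5]], dtype=float)
print(is_hankel(H), is_toeplitz(np.fliplr(H)))  # True True
```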
no code implementations • 13 Jan 2023 • Xiaomeng Chu, Jiajun Deng, Yuan Zhao, Jianmin Ji, Yu Zhang, Houqiang Li, Yanyong Zhang
To this end, we propose OA-BEV, a network that can be plugged into the BEV-based 3D object detection framework to bring out the objects by incorporating object-aware pseudo-3D features and depth features.
no code implementations • 5 Jan 2023 • Zihua Wang, Xu Yang, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Weiwei Sun, Ming Yan, Songfang Huang, Fei Huang, Yu Zhang
We design a novel global-local Transformer named Ada-ClustFormer (ACF) to generate captions.
no code implementations • ICCV 2023 • Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang
To amend this, we propose a novel TW-BERT to learn Trajectory-Word alignment by a newly designed trajectory-to-word (T2W) attention for solving video-language tasks.
no code implementations • CVPR 2023 • ZHIYANG YU, Yu Zhang, Dongqing Zou, Xijun Chen, Jimmy S. Ren, Shunqing Ren
Continuous-time video frame interpolation is a fundamental technique in computer vision for its flexibility in synthesizing motion trajectories and novel video frames at arbitrary intermediate time steps.
no code implementations • ICCV 2023 • Zelin Gao, Weichen Dai, Yu Zhang
Neural Radiance Fields have shown great potential to synthesize novel views with only a few discrete image observations of the world.
1 code implementation • ICCV 2023 • Yunshan Qi, Lin Zhu, Yu Zhang, Jia Li
To solve this problem, we propose a novel Event-Enhanced NeRF (E2NeRF) by utilizing the combination data of a bio-inspired event camera and a standard RGB camera.
1 code implementation • CVPR 2023 • Junle Yu, Luwei Ren, Yu Zhang, Wenhui Zhou, Lili Lin, Guojun Dai
Recently, incorporating Transformers into point cloud feature representation has achieved huge success; such methods usually adopt a self-attention module to learn intra-point-cloud features first, then utilize a cross-attention module to perform feature exchange between input point clouds.
1 code implementation • 19 Dec 2022 • Qiao Xiao, Boqian Wu, Yu Zhang, Shiwei Liu, Mykola Pechenizkiy, Elena Mocanu, Decebal Constantin Mocanu
The receptive field (RF), which determines the region of time series to be "seen" and used, is critical to improve the performance for time series classification (TSC).
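As background for the RF discussion (a standard recurrence, not the paper's method), the receptive field of a stack of (dilated) convolutions can be computed as:

```python
def receptive_field(layers):
    """Receptive field of a 1D conv stack.
    layers: list of (kernel_size, stride, dilation) tuples."""
    rf, jump = 1, 1
    for k, s, d in layers:
        rf += (k - 1) * d * jump   # each layer widens the RF by its span
        jump *= s                  # stride compounds the step between taps
    return rf

# three dilated conv layers, kernel 3, stride 1, dilations 1, 2, 4
print(receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)]))  # → 15
```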
no code implementations • 19 Dec 2022 • Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna
We present Mu$^{2}$SLAM, a multilingual sequence-to-sequence model pre-trained jointly on unlabeled speech, unlabeled text and supervised data spanning Automatic Speech Recognition (ASR), Automatic Speech Translation (AST) and Machine Translation (MT), in over 100 languages.
Automatic Speech Recognition (ASR) +7
1 code implementation • 12 Dec 2022 • Yu Zhang, Yunyi Zhang, Martin Michalski, Yucheng Jiang, Yu Meng, Jiawei Han
Instead of mining coherent topics from a given text corpus in a completely unsupervised manner, seed-guided topic discovery methods leverage user-provided seed words to extract distinctive and coherent topics so that the mined topics can better cater to the user's interest.
no code implementations • 7 Dec 2022 • Kejun Chen, Shourya Bose, Yu Zhang
Non-convex AC optimal power flow (AC-OPF) is a fundamental optimization problem in power system analysis.
no code implementations • 5 Dec 2022 • Yu Zhang, Yunyi Zhang, Yucheng Jiang, Martin Michalski, Yu Deng, Lucian Popa, ChengXiang Zhai, Jiawei Han
Given a few seed entities of a certain type (e.g., Software or Programming Language), entity set expansion aims to discover an extensive set of entities that share the same type as the seeds.
1 code implementation • 2 Dec 2022 • Tao Zhou, Yi Zhou, Chen Gong, Jian Yang, Yu Zhang
In this paper, we propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.
no code implementations • 29 Nov 2022 • Junde Wu, Huihui Fang, Yehui Yang, Yu Zhang, Haoyi Xiong, Huazhu Fu, Yanwu Xu
In this paper, we call them expert-level classification.
no code implementations • 24 Nov 2022 • Yueqing Sun, Yu Zhang, Le Qi, Qi Shi
In this paper, we aim to address the above limitation by leveraging the implicit knowledge stored in PrLMs and propose a two-stage prompt-based unsupervised commonsense question answering framework (TSGP).
no code implementations • CVPR 2023 • Yunhao Gou, Tom Ko, Hansi Yang, James Kwok, Yu Zhang, Mingxuan Wang
(2) Under-utilization of the unmasked tokens: CMLM primarily focuses on the masked token but it cannot simultaneously leverage other tokens to learn vision-language associations.
no code implementations • 16 Nov 2022 • Juan Zha, Zheng Li, Ying WEI, Yu Zhang
However, most prior works assume that all the tasks are sampled from a single data source, which cannot adapt to real-world scenarios where tasks are heterogeneous and lie in different distributions.
no code implementations • 13 Nov 2022 • Xuetong Wang, Kanhao Zhao, Rong Zhou, Alex Leow, Ricardo Osorio, Yu Zhang, Lifang He
Normative modeling is an emerging and promising approach to effectively study disorder heterogeneity in individual participants.
1 code implementation • 7 Nov 2022 • Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang
Instead of extracting features from the tensor program itself, TLP extracts features from the schedule primitives.
1 code implementation • 6 Nov 2022 • Yu Meng, Martin Michalski, Jiaxin Huang, Yu Zhang, Tarek Abdelzaher, Jiawei Han
In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set.
no code implementations • 2 Nov 2022 • Yu Zhang, Mitchell Bucklew
In this paper, we introduce Max Markov Chain (MMC), a novel representation for a useful subset of High-order Markov Chains (HMCs) with sparse correlations among the states.
no code implementations • 2 Nov 2022 • Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee
We propose a quantum kernel learning (QKL) framework to address the inherent data sparsity issues often encountered in training large-scale acoustic models in low-resource scenarios.
2 code implementations • 1 Nov 2022 • Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, Yehui Yang, Haoyi Xiong, Huiying Liu, Yanwu Xu
Inspired by the success of DPM, we propose the first DPM-based model for general medical image segmentation tasks, which we name MedSegDiff.
no code implementations • 31 Oct 2022 • Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno
In this work, we propose a modular hybrid autoregressive transducer (MHAT) that has structurally separated label and blank decoders to predict label and blank distributions, respectively, along with a shared acoustic encoder.
no code implementations • 29 Oct 2022 • Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani
We propose a novel method to accelerate training and inference process of recurrent neural network transducer (RNN-T) based on the guidance from a co-trained connectionist temporal classification (CTC) model.
no code implementations • 28 Oct 2022 • Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding
Adapting a neural text-to-speech (TTS) model to a target speaker typically involves fine-tuning most if not all of the parameters of a pretrained multi-speaker backbone model.
1 code implementation • 28 Oct 2022 • Xubo Liu, Qiushi Huang, Xinhao Mei, Haohe Liu, Qiuqiang Kong, Jianyuan Sun, Shengchen Li, Tom Ko, Yu Zhang, Lilian H. Tang, Mark D. Plumbley, Volkan Kılıç, Wenwu Wang
Audio captioning aims to generate text descriptions of audio clips.
no code implementations • 27 Oct 2022 • Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran
This paper proposes Virtuoso, a massively multilingual speech-text joint semi-supervised learning framework for text-to-speech synthesis (TTS) models.
Automatic Speech Recognition (ASR) +2
1 code implementation • 27 Oct 2022 • Qiushi Huang, Yu Zhang, Tom Ko, Xubo Liu, Bo Wu, Wenwu Wang, Lilian Tang
Persona-based dialogue systems aim to generate consistent responses based on historical context and predefined persona.
no code implementations • 18 Oct 2022 • Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro Moreno, Nanxin Chen
First, we show that by combining speech representations with byte-level text representations and using language embeddings, we can dramatically reduce the Character Error Rate (CER) on languages with no supervised speech from 64.8% to 30.8%, a relative reduction of 53%.
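CER, the metric quoted above, is the edit distance between hypothesis and reference, normalized by reference length. A minimal sketch, not the paper's implementation:

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance via dynamic programming, O(len(ref) * len(hyp))."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edits normalized by reference length."""
    return edit_distance(ref, hyp) / len(ref)

print(round(cer("speech", "speach"), 3))  # → 0.167
```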
1 code implementation • 14 Oct 2022 • Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-Yi Lee
Self-supervised learned (SSL) speech pre-trained models perform well across various speech processing tasks.
no code implementations • 13 Oct 2022 • Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman
In addition, we explore JOIST using a streaming E2E model with an order of magnitude more data, which are also novelties compared to previous works.
no code implementations • 11 Oct 2022 • Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman
Knowledge distillation is an effective machine learning technique to transfer knowledge from a teacher model to a smaller student model, especially with unlabeled data.
Automatic Speech Recognition (ASR) +3
no code implementations • 4 Oct 2022 • Zixiao Wang, Yuluo Guo, Jin Zhao, Yu Zhang, Hui Yu, Xiaofei Liao, Biao Wang, Ting Yu
In this paper, we propose a Graph Inception Diffusion Networks (GIDN) model.
Ranked #1 on Link Property Prediction on ogbl-ddi
no code implementations • 3 Oct 2022 • Yu Zhang, Li Liu, Chen Diao, Ning Cai
Computer models have been extensively adopted to overcome the time limitations of studying language evolution by transforming language theory into physical modeling mechanisms, which helps to explore the general laws of evolution.
2 code implementations • British Machine Vision Conference 2022 • Pengxin Guo, Jinjing Zhu, Yu Zhang
To solve this problem, we propose a Selective Partial Domain Adaptation (SPDA) method, which selects useful data for the adaptation to the target domain.
Ranked #1 on Partial Domain Adaptation on VisDA2017
no code implementations • 26 Sep 2022 • Gabriel Intriago, Andres Intriago, Charalambos Konstantinou, Yu Zhang
This paper proposes a strategy based on observers and residuals for detecting internal faults in grid-forming inverters with power-sharing coordination.
no code implementations • 26 Sep 2022 • Xinnan Ding, Shan Du, Yu Zhang, Kejun Wang
The critical goal of gait recognition is to acquire the inter-frame walking habit representation from the gait sequences.
no code implementations • 25 Sep 2022 • Gabriel Intriago, Yu Zhang
Instance selection is a vital technique for energy big data analytics.
no code implementations • 22 Sep 2022 • Shengcai Liu, Yu Zhang, Ke Tang, Xin Yao
Hopefully, this work would help with a better understanding of the strengths and weaknesses of NCO and provide a comprehensive evaluation protocol for further benchmarking NCO approaches in comparison to other approaches.
no code implementations • 21 Sep 2022 • Yu Zhang, Bing-Zhao Li
In this paper, we propose and design the definition of the discrete linear canonical transform on graphs (GLCT), which is an extension of the discrete linear canonical transform (DLCT), just as the graph Fourier transform (GFT) is an extension of the discrete Fourier transform (DFT).
no code implementations • 9 Sep 2022 • Yu Zhang, Tawfik Osman, Ahmed Alkhateeb
Furthermore, a hardware proof-of-concept prototype based on mmWave phased arrays is built and used to implement and evaluate the developed online beam learning solutions in realistic scenarios.
no code implementations • 26 Aug 2022 • Yu Zhang, Shuaifei Chen, Jiayi Zhang
Cell-free massive multiple-input-multiple-output is promising to meet the stringent quality-of-experience (QoE) requirements of railway wireless communications by coordinating many successional access points (APs) to serve the onboard users coherently.
1 code implementation • journal 2022 • Shujun Yang, Yu Zhang, Yuheng Jia, and Weijia Zhang
By taking advantage of the local manifold structure, a Laplacian graph is constructed from the superpixels to ensure that a typical pixel should be similar to its neighbors within the same superpixel.
no code implementations • 16 Aug 2022 • Enqiang Zhu, Yu Zhang, Chanjuan Liu
The maximum independent set (MIS) problem, a classical NP-hard problem with extensive applications in various areas, aims to find the largest set of vertices with no edge among them.
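To illustrate the problem statement (a textbook greedy heuristic, not the paper's algorithm), a maximal, though not necessarily maximum, independent set can be built by repeatedly taking a minimum-degree vertex:

```python
def greedy_mis(adj):
    """Greedy maximal independent set via the min-degree heuristic.
    adj: dict mapping vertex -> set of neighbor vertices."""
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    result = set()
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))   # pick a min-degree vertex
        result.add(v)
        removed = adj.pop(v) | {v}                # drop v and its neighbors
        for u in list(adj):
            if u in removed:
                adj.pop(u)
            else:
                adj[u] -= removed
    return result

# 5-cycle: the maximum independent set has size 2
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(len(greedy_mis(cycle)))  # → 2
```

The heuristic is only an approximation; MIS is NP-hard in general, which is what motivates the dedicated algorithms studied in work like the above.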
no code implementations • 7 Aug 2022 • Zesheng Ye, Lina Yao, Yu Zhang, Sylvia Gustin
Recent studies demonstrate the use of a two-stage supervised framework to generate images that depict human perception to visual stimuli from EEG, referring to EEG-visual reconstruction.
1 code implementation • 5 Aug 2022 • Yongxiang Tang, Wentao Bai, Guilin Li, Xialong Liu, Yu Zhang
In this paper, we propose the Customizable Recall@N Optimization Loss (CROLoss), a loss function that can directly optimize the Recall@N metrics and is customizable for different choices of N. The proposed CROLoss formulation defines a more generalized loss function space, covering most of the conventional loss functions as special cases.
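For reference, the Recall@N metric that CROLoss targets can be computed directly; a minimal sketch with made-up data, not the paper's code:

```python
def recall_at_n(ranked_items, relevant, n):
    """Fraction of relevant items that appear in the top-n of the ranking."""
    top = set(ranked_items[:n])
    return len(top & set(relevant)) / len(relevant)

ranked = ["a", "b", "c", "d", "e"]   # model's ranking, best first
relevant = {"b", "e"}                # ground-truth positives
print(recall_at_n(ranked, relevant, 2))  # → 0.5
print(recall_at_n(ranked, relevant, 5))  # → 1.0
```

Because the top-n cutoff is a hard, non-differentiable selection, such a metric cannot be optimized by gradient descent directly, which is the gap a surrogate loss like CROLoss is designed to fill.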
1 code implementation • 5 Aug 2022 • Junde Wu, Yu Zhang, Rao Fu, Yuanpei Liu, Jing Gao
Then, to ensure that the method adapts to the dynamic and unseen person flow, we propose Graph Convolutional Network (GCN) with a simple Nearest Neighbor (NN) strategy to accurately cluster the instances of CSG.
no code implementations • 3 Aug 2022 • Qibing Bai, Tom Ko, Yu Zhang
In human speech, the attitude of a speaker cannot be fully expressed only by the textual content.
1 code implementation • 18 Jul 2022 • Xinyu Shi, Dong Wei, Yu Zhang, Donghuan Lu, Munan Ning, Jiashun Chen, Kai Ma, Yefeng Zheng
A key to this challenging task is to fully utilize the information in the support images by exploiting fine-grained correlations between the query and support images.
Ranked #4 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)
no code implementations • 16 Jul 2022 • Jiahao Qi, Zhiqiang Gong, Xingyue Liu, Kangcheng Bin, Chen Chen, YongQian Li, Wei Xue, Yu Zhang, Ping Zhong
Deep learning methodology contributes a lot to the development of hyperspectral image (HSI) analysis community.
1 code implementation • 7 Jul 2022 • Jiashun Chen, Donghuan Lu, Yu Zhang, Dong Wei, Munan Ning, Xinyu Shi, Zhe Xu, Yefeng Zheng
In this study, we propose a novel Deformer module along with a multi-scale framework for the deformable image registration task.
no code implementations • 18 Jun 2022 • Zhanghao Sun, Yu Zhang, Yicheng Wu, Dong Huo, Yiming Qian, Jian Wang
We propose three applications using our redundancy codes: (1) Self error-correction for SL imaging under strong ambient light, (2) Error detection for adaptive reconstruction under global illumination, and (3) Interference filtering with device-specific projection sequence encoding, especially for event camera-based SL and light curtain devices.
no code implementations • 4 Jun 2022 • Xiaochen Li, Xin Song, Pengjia Yuan, Xialong Liu, Yu Zhang
In this paper, we focus on a new type of user interest, i. e., user retargeting interest.
1 code implementation • 25 May 2022 • Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna
We introduce FLEURS, the Few-shot Learning Evaluation of Universal Representations of Speech benchmark.
Automatic Speech Recognition (ASR) +6
no code implementations • 24 May 2022 • Shourya Bose, Sifat Chowdhury, Yu Zhang
Mobile energy storage systems (MESS) offer great operational flexibility to enhance the resiliency of distribution systems in an emergency condition.
1 code implementation • 20 May 2022 • Bowen Jin, Yu Zhang, Qi Zhu, Jiawei Han
In heterogeneous text-rich networks, this task is more challenging due to (1) presence or absence of text: Some nodes are associated with rich textual information, while others are not; (2) diversity of types: Nodes and edges of multiple types form a heterogeneous network structure.
no code implementations • 19 May 2022 • Yu Zhang, Zhiqiang Gong, Yichuang Zhang, YongQian Li, Kangcheng Bin, Jiahao Qi, Wei Xue, Ping Zhong
Transferable adversarial attack is always in the spotlight since deep learning models have been demonstrated to be vulnerable to adversarial samples.
1 code implementation • 18 May 2022 • Qianqian Dong, Fengpeng Yue, Tom Ko, Mingxuan Wang, Qibing Bai, Yu Zhang
Direct Speech-to-speech translation (S2ST) has drawn more and more attention recently.
1 code implementation • NAACL 2022 • Yu Zhang, Yu Meng, Xuan Wang, Sheng Wang, Jiawei Han
Discovering latent topics from text corpora has been studied for decades.
no code implementations • 3 May 2022 • Yun Li, Zhe Liu, Lina Yao, Molly Lucas, Jessica J. M. Monaghan, Yu Zhang
With the development of digital technology, machine learning has paved the way for the next generation of tinnitus diagnoses.
no code implementations • 2 May 2022 • Kejun Chen, Yu Zhang
With an increasing high penetration of solar photovoltaic generation in electric power grids, voltage phasors and branch power flows experience more severe fluctuations.
no code implementations • 29 Apr 2022 • Shourya Bose, Yu Zhang
Distributed energy storage systems (ESSs) can be efficiently leveraged for load restoration (LR) for a microgrid (MG) in island mode.
no code implementations • 27 Apr 2022 • Houliang Zhou, Lifang He, Yu Zhang, Li Shen, Brian Chen
Identification of brain regions related to the specific neurological disorders are of great importance for biomarker and diagnostic studies.