Search Results for author: Lu Yin

Found 22 papers, 14 papers with code

OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning

1 code implementation • 28 May 2024 • Pengxiang Li, Lu Yin, Xiaowei Gao, Shiwei Liu

In this paper, we propose Outlier-weighed Layerwise Sampled Low-Rank Projection (OwLore), a new memory-efficient fine-tuning approach, inspired by the layerwise outlier distribution of LLMs, which dynamically samples pre-trained layers to fine-tune instead of adding additional adaptors.

Paper
Code

CourseGPT-zh: an Educational Large Language Model Based on Knowledge Distillation Incorporating Prompt Optimization

no code implementations • 8 May 2024 • Zheyan Qu, Lu Yin, Zitong Yu, Wenbo Wang, Xing Zhang

Moreover, considering the alignment of LLM responses with user needs, a novel method for discrete prompt optimization based on LLM-as-Judge is introduced.

Knowledge Distillation Language Modelling +2

Paper
Add Code

FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping

no code implementations • 5 Apr 2024 • Ajay Jaiswal, Bodun Hu, Lu Yin, Yeonju Ro, Shiwei Liu, Tianlong Chen, Aditya Akella

In this work, we observed the saturation of computationally expensive feed-forward blocks of LLM layers and proposed FFN-SkipLLM, which is a novel fine-grained skip strategy of autoregressive LLMs.

Attribute Hallucination +1

Paper
Add Code

E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation

1 code implementation • 7 Dec 2023 • Boqian Wu, Qiao Xiao, Shiwei Liu, Lu Yin, Mykola Pechenizkiy, Decebal Constantin Mocanu, Maurice van Keulen, Elena Mocanu

E2ENet achieves comparable accuracy on the large-scale challenge AMOS-CT, while saving over 68\% parameter count and 29\% FLOPs in the inference phase, compared with the previous best-performing method.

Brain Tumor Segmentation Image Segmentation +2

Paper
Code

A Structural-Clustering Based Active Learning for Graph Neural Networks

1 code implementation • 7 Dec 2023 • Ricky Maulana Fajri, Yulong Pei, Lu Yin, Mykola Pechenizkiy

To address this problem, we propose the Structural-Clustering PageRank method for improved Active learning (SPA) specifically designed for graph-structured data.

Active Learning Clustering +2

Paper
Code

REST: Enhancing Group Robustness in DNNs through Reweighted Sparse Training

1 code implementation • 5 Dec 2023 • Jiaxu Zhao, Lu Yin, Shiwei Liu, Meng Fang, Mykola Pechenizkiy

These bias attributes are strongly spuriously correlated with the target variable, causing the models to be biased towards spurious correlations (i. e., \textit{bias-conflicting}).

Paper
Code

Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity

1 code implementation • 8 Oct 2023 • Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Gen Li, Ajay Jaiswal, Mykola Pechenizkiy, Yi Liang, Michael Bendersky, Zhangyang Wang, Shiwei Liu

Large Language Models (LLMs), renowned for their remarkable performance across diverse domains, present a challenge when it comes to practical deployment due to their colossal model size.

Network Pruning

Paper
Code

Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs

1 code implementation • 29 Sep 2023 • Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, Zhangyang Wang

Contrary to this belief, this paper presents a counter-argument: small-magnitude weights of pre-trained model weights encode vital knowledge essential for tackling difficult downstream tasks - manifested as the monotonic relationship between the performance drop of downstream tasks across the difficulty spectrum, as we prune more pre-trained weights by magnitude.

Quantization

Paper
Code

Enhancing Adversarial Training via Reweighting Optimization Trajectory

1 code implementation • 25 Jun 2023 • Tianjin Huang, Shiwei Liu, Tianlong Chen, Meng Fang, Li Shen, Vlaod Menkovski, Lu Yin, Yulong Pei, Mykola Pechenizkiy

Despite the fact that adversarial training has become the de facto method for improving the robustness of deep neural networks, it is well-known that vanilla adversarial training suffers from daunting robust overfitting, resulting in unsatisfactory robust generalization.

Adversarial Robustness

Paper
Code

Are Large Kernels Better Teachers than Transformers for ConvNets?

1 code implementation • 30 May 2023 • Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu

We hereby carry out a first-of-its-kind study unveiling that modern large-kernel ConvNets, a compelling competitor to Vision Transformers, are remarkably more effective teachers for small-kernel ConvNets, due to more similar architectures.

Knowledge Distillation

256

Paper
Code

Supervised Feature Selection with Neuron Evolution in Sparse Neural Networks

1 code implementation • 10 Mar 2023 • Zahra Atashgahi, Xuhao Zhang, Neil Kichler, Shiwei Liu, Lu Yin, Mykola Pechenizkiy, Raymond Veldhuis, Decebal Constantin Mocanu

Feature selection that selects an informative subset of variables from data not only enhances the model interpretability and performance but also alleviates the resource demands.

feature selection

Paper
Code

You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets

1 code implementation • 28 Nov 2022 • Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu

Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i. e., untrained networks).

Out-of-Distribution Detection

Paper
Code

Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost

1 code implementation • 23 Aug 2022 • Lu Yin, Shiwei Liu, Meng Fang, Tianjin Huang, Vlado Menkovski, Mykola Pechenizkiy

We call our method Lottery Pools.

Paper
Code

Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training

no code implementations • 30 May 2022 • Lu Yin, Vlado Menkovski, Meng Fang, Tianjin Huang, Yulong Pei, Mykola Pechenizkiy, Decebal Constantin Mocanu, Shiwei Liu

Recent works on sparse neural network training (sparse training) have shown that a compelling trade-off between performance and efficiency can be achieved by training intrinsically sparse neural networks from scratch.

Paper
Add Code

Semantic-Based Few-Shot Learning by Interactive Psychometric Testing

no code implementations • 16 Dec 2021 • Lu Yin, Vlado Menkovski, Yulong Pei, Mykola Pechenizkiy

In this work, we advance the few-shot learning towards this more challenging scenario, the semantic-based few-shot learning, and propose a method to address the paradigm by capturing the inner semantic relationships using interactive psychometric learning.

Few-Shot Learning

Paper
Add Code

Hierarchical Semantic Segmentation using Psychometric Learning

no code implementations • 7 Jul 2021 • Lu Yin, Vlado Menkovski, Shiwei Liu, Mykola Pechenizkiy

One of the major challenges in the supervised learning approaches is expressing and collecting the rich knowledge that experts have with respect to the meaning present in the image data.

Image Segmentation Metric Learning +2

Paper
Add Code

Sparse Training via Boosting Pruning Plasticity with Neuroregeneration

2 code implementations • NeurIPS 2021 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Zahra Atashgahi, Lu Yin, Huanyu Kou, Li Shen, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu

Works on lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) have raised a lot of attention currently on post-training pruning (iterative magnitude pruning), and before-training pruning (pruning at initialization).

Ranked #3 on Sparse Learning on ImageNet

Network Pruning Sparse Learning

Paper
Code

Rethinking Lifelong Sequential Recommendation with Incremental Multi-Interest Attention

no code implementations • 28 May 2021 • Yongji Wu, Lu Yin, Defu Lian, Mingyang Yin, Neil Zhenqiang Gong, Jingren Zhou, Hongxia Yang

With the rapid development of these services in the last two decades, users have accumulated a massive amount of behavior data.

Sequential Recommendation

Paper
Add Code

Linear-Time Self Attention with Codeword Histogram for Efficient Recommendation

1 code implementation • 28 May 2021 • Yongji Wu, Defu Lian, Neil Zhenqiang Gong, Lu Yin, Mingyang Yin, Jingren Zhou, Hongxia Yang

Inspired by the idea of vector quantization that uses cluster centroids to approximate items, we propose LISA (LInear-time Self Attention), which enjoys both the effectiveness of vanilla self-attention and the efficiency of sparse attention.

Quantization Sequential Recommendation

Paper
Code

Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training

4 code implementations • 4 Feb 2021 • Shiwei Liu, Lu Yin, Decebal Constantin Mocanu, Mykola Pechenizkiy

By starting from a random sparse network and continuously exploring sparse connectivities during training, we can perform an Over-Parameterization in the space-time manifold, closing the gap in the expressibility between sparse training and dense training.

Ranked #4 on Sparse Learning on ImageNet

Image Classification Sparse Learning

Paper
Code

Knowledge Elicitation using Deep Metric Learning and Psychometric Testing

no code implementations • 14 Apr 2020 • Lu Yin, Vlado Menkovski, Mykola Pechenizkiy

The main reason for such a reductionist approach is the difficulty in eliciting the domain knowledge from the experts.

Metric Learning

Paper
Add Code

DymSLAM:4D Dynamic Scene Reconstruction Based on Geometrical Motion Segmentation

no code implementations • 10 Mar 2020 • Chenjie Wang, Bin Luo, Yun Zhang, Qing Zhao, Lu Yin, Wei Wang, Xin Su, Yajun Wang, Chengyuan Li

The only input of DymSLAM is stereo video, and its output includes a dense map of the static environment, 3D model of the moving objects and the trajectories of the camera and the moving objects.

Motion Segmentation

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.