Sparse representation-based over-sampling technique for classification of imbalanced dataset

As one of the most popular research fields in machine learning, research on imbalanced datasets has received increasing attention in recent years. The imbalanced problem usually occurs when minority classes have far fewer samples than the others. Traditional classification algorithms do not take the distribution of the dataset into consideration, so they fail to handle class-imbalanced learning, and classification performance tends to be dominated by the majority class. SMOTE is one of the most effective over-sampling methods for this problem: it changes the distribution of the training set by increasing the size of the minority class. However, SMOTE can easily lead to over-fitting because it produces many repetitive data samples. To address this issue, this paper proposes an improved method based on sparse representation theory and over-sampling, named SROT (Sparse Representation-based Over-sampling Technique). SROT uses a sparse dictionary to create synthetic samples directly, solving the imbalanced problem. Experiments are performed on 10 UCI datasets using C4.5 as the learning algorithm. The experimental results show that, compared with Random Over-sampling, SMOTE, and other methods, SROT achieves better performance in terms of AUC value.
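The abstract does not detail SROT's sparse-dictionary construction, but the baseline it improves on, SMOTE, has a well-known core step: each synthetic minority sample is an interpolation between a real minority sample and one of its k nearest minority-class neighbours. The sketch below illustrates only that generic SMOTE idea (function name, toy data, and the simple brute-force neighbour search are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, rng=None):
    """Generate n_new synthetic minority samples by interpolating
    between each sample and one of its k nearest minority-class
    neighbours (the core idea behind SMOTE; not the paper's SROT)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # Pairwise Euclidean distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]    # indices of k nearest neighbours
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)              # pick a random minority sample
        j = nn[i, rng.integers(k)]       # and one of its k neighbours
        gap = rng.random()               # interpolation factor in [0, 1]
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.asarray(synthetic)

# Toy minority class: 5 points in 2-D
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                  [1.0, 1.0], [0.5, 0.5]])
X_new = smote_like_oversample(X_min, n_new=10, k=2, rng=0)
print(X_new.shape)  # (10, 2)
```

Because each synthetic point lies on a segment between two existing minority points, this enlarges the minority class without leaving its local neighbourhood; the over-fitting risk the abstract mentions arises when many such points cluster around the same originals.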
