Learning with Instance-Dependent Label Noise: A Sample Sieve Approach

ICLR 2021 · Hao Cheng, Zhaowei Zhu, Xingyu Li, Yifei Gong, Xing Sun, Yang Liu

Human-annotated labels are often prone to noise, and the presence of such noise degrades the performance of the resulting deep neural network (DNN) models. Much of the literature on learning with noisy labels (with several recent exceptions) focuses on the case where the label noise is independent of features. In practice, annotation errors tend to be instance-dependent and often depend on the difficulty of the recognition task. Applying existing results from instance-independent settings would require significant estimation of noise rates; providing theoretically rigorous solutions for learning with instance-dependent label noise therefore remains a challenge. In this paper, we propose CORES$^{2}$ (COnfidence REgularized Sample Sieve), which progressively sieves out corrupted examples. The implementation of CORES$^{2}$ does not require specifying noise rates, yet we are able to provide theoretical guarantees that CORES$^{2}$ filters out the corrupted examples. This high-quality sample sieve allows us to treat clean and corrupted examples separately when training a DNN solution, and such a separation is shown to be advantageous in the instance-dependent noise setting. We demonstrate the performance of CORES$^{2}$ on the CIFAR-10 and CIFAR-100 datasets with synthetic instance-dependent label noise and on Clothing1M with real-world human noise. Of independent interest, our sample sieve provides generic machinery for anatomizing noisy datasets and a flexible interface for various robust training techniques to further improve performance. Code is available at https://github.com/UCSC-REAL/cores.
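The abstract describes a loss with a confidence regularizer (cross-entropy on the noisy label minus a scaled expected cross-entropy over the noisy-label distribution) and a sieve that keeps examples whose regularized loss stays below a per-epoch threshold. The PyTorch sketch below only illustrates that idea under simplifying assumptions: a uniform noisy-label prior and a fixed threshold `alpha_t` are placeholders, and `beta` is an illustrative hyperparameter. It is not the authors' reference implementation, which is available at the repository linked above.

```python
# Minimal sketch of a confidence-regularized loss and sample sieve in the
# spirit of CORES^2. The uniform label prior, beta, and alpha_t below are
# illustrative assumptions, not the authors' exact settings.
import torch
import torch.nn.functional as F

def confidence_regularized_loss(logits, noisy_labels, beta=2.0):
    """Per-sample loss: CE(f(x), noisy label) - beta * E_Y[CE(f(x), Y)].

    The subtracted expectation discourages over-confident fitting of wrong
    labels. Here the prior over noisy labels is assumed uniform for simplicity.
    """
    ce = F.cross_entropy(logits, noisy_labels, reduction="none")  # (N,)
    log_probs = F.log_softmax(logits, dim=1)                      # (N, K)
    # Expected CE under a uniform label prior = -mean_k log p_k(x)
    expected_ce = -log_probs.mean(dim=1)                          # (N,)
    return ce - beta * expected_ce

def sieve_clean_mask(logits, noisy_labels, beta=2.0, alpha_t=0.0):
    """Flag examples whose regularized loss falls below threshold alpha_t.

    Examples above the threshold are treated as likely corrupted and can be
    excluded, or handled separately, in subsequent training epochs.
    """
    with torch.no_grad():
        per_sample = confidence_regularized_loss(logits, noisy_labels, beta)
        return per_sample < alpha_t

# Example usage with random data (10 classes, batch of 8):
if __name__ == "__main__":
    logits = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    loss = confidence_regularized_loss(logits, labels).mean()
    clean_mask = sieve_clean_mask(logits, labels, alpha_t=0.0)
    print(loss.item(), clean_mask.tolist())
```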

| Task | Dataset | Model | Metric | Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Image Classification with Label Noise | CIFAR-100, 20% Asymmetric Noise | CORES2* | Accuracy | 75.19% | #1 |
| Image Classification with Label Noise | CIFAR-100, 20% IDN | CORES2* | Accuracy | 72.91% | #1 |
| Image Classification with Label Noise | CIFAR-100, 30% Asymmetric Noise | CORES2* | Accuracy | 73.81% | #1 |
| Image Classification with Label Noise | CIFAR-100, 40% IDN | CORES2* | Accuracy | 70.66% | #1 |
| Image Classification with Label Noise | CIFAR-100, 40% Symmetric Noise | CORES2* | Accuracy | 72.22% | #1 |
| Image Classification with Label Noise | CIFAR-100, 60% IDN | CORES2* | Accuracy | 63.08% | #1 |
| Image Classification with Label Noise | CIFAR-100, 60% Symmetric Noise | CORES2* | Accuracy | 59.16% | #1 |
| Learning with noisy labels | CIFAR-100N | CORES | Accuracy (mean) | 61.15 | #8 |
| Learning with noisy labels | CIFAR-100N | CORES* | Accuracy (mean) | 55.72 | #21 |
| Image Classification with Label Noise | CIFAR-10, 20% Asymmetric Noise | CORES2* | Accuracy | 95.18% | #1 |
| Image Classification with Label Noise | CIFAR-10, 20% IDN | CORES2* | Accuracy | 95.42% | #1 |
| Image Classification with Label Noise | CIFAR-10, 30% Asymmetric Noise | CORES2* | Accuracy | 94.67% | #1 |
| Image Classification with Label Noise | CIFAR-10, 40% IDN | CORES2* | Accuracy | 88.45% | #1 |
| Image Classification with Label Noise | CIFAR-10, 40% Symmetric Noise | CORES2* | Accuracy | 93.76% | #1 |
| Image Classification with Label Noise | CIFAR-10, 60% IDN | CORES2* | Accuracy | 85.53% | #1 |
| Image Classification with Label Noise | CIFAR-10, 60% Symmetric Noise | CORES2* | Accuracy | 89.78% | #1 |
| Learning with noisy labels | CIFAR-10N-Aggregate | CORES* | Accuracy (mean) | 95.25 | #5 |
| Learning with noisy labels | CIFAR-10N-Aggregate | CORES | Accuracy (mean) | 91.23 | #16 |
| Learning with noisy labels | CIFAR-10N-Random1 | CORES | Accuracy (mean) | 89.66 | #17 |
| Learning with noisy labels | CIFAR-10N-Random1 | CORES* | Accuracy (mean) | 94.45 | #5 |
| Learning with noisy labels | CIFAR-10N-Random2 | CORES | Accuracy (mean) | 89.91 | #12 |
| Learning with noisy labels | CIFAR-10N-Random2 | CORES* | Accuracy (mean) | 94.88 | #3 |
| Learning with noisy labels | CIFAR-10N-Random3 | CORES | Accuracy (mean) | 89.79 | #13 |
| Learning with noisy labels | CIFAR-10N-Random3 | CORES* | Accuracy (mean) | 94.74 | #3 |
| Learning with noisy labels | CIFAR-10N-Worst | CORES | Accuracy (mean) | 83.60 | #11 |
| Learning with noisy labels | CIFAR-10N-Worst | CORES* | Accuracy (mean) | 91.66 | #6 |
| Image Classification | Clothing1M | CORES2 | Accuracy | 73.24% | #32 |

Methods


No methods listed for this paper.