Kai Tong, Xiao Ke, IdentifyMix: An efficient two-stage learning approach to combating label noise

DOI: 10.23952/jano.5.2023.2.04
Volume 5, Issue 2, 1 August 2023, Pages 237-253

Abstract. Deep neural networks require correct label annotations for supervised learning. It is inevitable, however, that some labels are corrupted during the labeling process. A deep neural network memorizes incorrect labels during training, which degrades its performance. Therefore, it is essential to identify samples whose labels are likely correct. State-of-the-art methods use a sample selection strategy that chooses small-loss samples for subsequent training. However, because selection is performed within each mini-batch, these methods typically ignore the imbalance in noise ratios between mini-batches. Further, numerous valuable samples with high losses are discarded, which adversely affects the generalization performance of the model, particularly at high noise ratios. To this end, this paper proposes IdentifyMix, an effective two-stage learning approach for noise-robust learning that combines a unique sample selection strategy with semi-supervised learning. AUM (Area Under the Margin), a criterion derived from observing the dynamics of network training, is applied in this research to identify mislabeled data. Moreover, by combining semi-supervised learning with contrastive learning and data augmentation, it is possible to extract more useful information from mislabeled samples. Experiments on several synthetic and real-world noise benchmarks demonstrate the effectiveness of IdentifyMix compared with state-of-the-art methods.
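
As an illustration of the AUM criterion mentioned in the abstract, the following minimal sketch computes a per-sample Area Under the Margin from logits recorded at each epoch. It follows the general definition of AUM (the assigned-class logit minus the largest other logit, averaged over epochs) and is not taken from the paper itself. The function name area_under_margin, the array layout, and the toy values are assumptions made for this example; how IdentifyMix thresholds AUM or uses it inside its two-stage pipeline is not stated in the abstract and is not reproduced here.

```python
import numpy as np


def area_under_margin(logits_per_epoch, labels):
    """Average margin of the assigned label over training epochs.

    logits_per_epoch: array of shape (epochs, samples, classes), assumed
        to hold the logits recorded for every sample at every epoch.
    labels: integer array of shape (samples,) with the (possibly noisy)
        assigned labels.
    Returns an array of shape (samples,); low or negative values suggest
    that the assigned label may be wrong.
    """
    logits = np.asarray(logits_per_epoch, dtype=float)
    labels = np.asarray(labels)
    idx = np.arange(labels.shape[0])

    # Logit of the assigned class, shape (epochs, samples).
    assigned = logits[:, idx, labels]

    # Largest logit among all other classes: mask out the assigned class.
    others = logits.copy()
    others[:, idx, labels] = -np.inf
    largest_other = others.max(axis=2)

    # Margin per epoch, then averaged over epochs.
    return (assigned - largest_other).mean(axis=0)


# Toy data (hypothetical): 2 epochs, 2 samples, 3 classes.
toy_logits = np.array([
    [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0]],
    [[4.0, 1.0, 0.0], [0.0, 5.0, 1.0]],
])
toy_labels = np.array([0, 2])
toy_aum = area_under_margin(toy_logits, toy_labels)
```

In this toy case the first sample has a positive AUM (the network increasingly agrees with its label), while the second has a negative AUM (the network consistently prefers another class), which is the kind of signal a selection strategy could use to flag a sample as potentially mislabeled.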

How to Cite this Article:
K. Tong, X. Ke, IdentifyMix: An efficient two-stage learning approach to combating label noise, J. Appl. Numer. Optim. 5 (2023), 237-253.