Editing training data for multi-label classification with the k-nearest neighbor rule

Sawsan Kanj, Fahed Abdallah, Thierry Denœux, Kifah Tout

Research output: Contribution to journal, Article, peer-review

41 Citations (Scopus)

Abstract

Multi-label classification allows instances to belong to several classes at once. It has received significant attention in machine learning and has found many real-world applications in recent years, such as text categorization, automatic video annotation, and functional genomics, which has led to the development of many multi-label classification methods. Based on labeled examples in the training dataset, a multi-label method extracts inherent information in order to output a function that predicts the labels of unlabeled data. Because of problems such as errors in the input vectors or in their labels, this information may be wrong and may cause the multi-label algorithm to fail. In this paper, we propose a simple algorithm for overcoming these problems by editing the existing training dataset and then using the edited set with different multi-label classification methods. Evaluation on benchmark datasets demonstrates the usefulness and effectiveness of our approach.
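The abstract does not spell out the editing rule, so the following is only an illustrative sketch of what editing a multi-label training set with a k-nearest neighbor rule could look like, not the authors' algorithm. It assumes a Wilson-style leave-one-out pass: each training instance is re-predicted by a per-label majority vote among its k nearest neighbors, and it is discarded when the Hamming loss between its predicted and recorded label vectors exceeds a threshold. The function name `edit_multilabel_training_set`, the parameters `k` and `max_hamming`, the Euclidean distance, and the toy data are all assumptions made for this example.

```python
import numpy as np


def edit_multilabel_training_set(X, Y, k=3, max_hamming=0.5):
    """Return the indices of training instances kept after one editing pass.

    X: (n, d) feature matrix; Y: (n, q) binary label matrix.
    Hypothetical helper: a Wilson-style rule generalized to label vectors.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)

    # Pairwise squared Euclidean distances between all training instances.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # Leave-one-out: an instance must not count as its own neighbor.
    np.fill_diagonal(dist, np.inf)

    keep = []
    for i in range(X.shape[0]):
        neighbors = np.argsort(dist[i])[:k]
        # A label is predicted when more than half of the neighbors carry it.
        predicted = (Y[neighbors].mean(axis=0) > 0.5).astype(int)
        # Hamming loss: fraction of labels on which prediction and record differ.
        hamming = np.mean(predicted != Y[i])
        if hamming <= max_hamming:
            keep.append(i)
    return np.array(keep, dtype=int)


# Toy data (assumed): two well-separated clusters and one mislabeled point.
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
         [10, 10], [10, 11], [11, 10], [11, 11]]
Y_toy = [[1, 0], [1, 0], [1, 0], [1, 0], [0, 1],
         [0, 1], [0, 1], [0, 1], [0, 1]]
kept = edit_multilabel_training_set(X_toy, Y_toy, k=3, max_hamming=0.5)
X_edited = np.asarray(X_toy, dtype=float)[kept]
Y_edited = np.asarray(Y_toy, dtype=int)[kept]
```

In this sketch, the edited pair `X_edited`, `Y_edited` would then be handed to whichever multi-label classifier is being trained. The threshold `max_hamming` controls how aggressively suspect instances are dropped.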

Original language: English
Pages (from-to): 145-161
Number of pages: 17
Journal: Pattern Analysis and Applications
Volume: 19
Issue number: 1
Publication status: Published - 1 Feb 2016
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2015, Springer-Verlag London.

Keywords

  • Classification
  • Edition
  • k-Nearest neighbor
  • Multi-label
  • Prototype selection
