Editing training data for multi-label classification with the k-nearest neighbor rule

Sawsan Kanj; Fahed Abdallah; Thierry Denœux; Kifah Tout

doi:10.1007/s10044-015-0452-8

Research output: Contribution to journal › Article › Peer-reviewed

41 Citations (Scopus)

Abstract

Multi-label classification allows instances to belong to several classes at once. It has received significant attention in machine learning and has found many real-world applications in recent years, such as text categorization, automatic video annotation and functional genomics, resulting in the development of many multi-label classification methods. Based on labeled examples in the training dataset, a multi-labeled method extracts inherent information to output a function that predicts the labels of unlabeled data. Due to several problems, like errors in the input vectors or in their labels, this information may be wrong and might lead the multi-label algorithm to fail. In this paper, we propose a simple algorithm for overcoming these problems by editing the existing training dataset, and adapting the edited set with different multi-label classification methods. Evaluation on benchmark datasets demonstrates the usefulness and effectiveness of our approach.

Original language	English
Pages (from-to)	145-161
Number of pages	17
Journal	Pattern Analysis and Applications
Volume	19
Issue number	1
DOIs	https://doi.org/10.1007/s10044-015-0452-8
State	Published - 1 Feb 2016
Externally published	Yes

Bibliographical note

Publisher Copyright:
© 2015, Springer-Verlag London.

Cite this

@article{b6c80f9237ce4807bafbfc14ea3de599,

title = "Editing training data for multi-label classification with the k-nearest neighbor rule",

abstract = "Multi-label classification allows instances to belong to several classes at once. It has received significant attention in machine learning and has found many real-world applications in recent years, such as text categorization, automatic video annotation and functional genomics, resulting in the development of many multi-label classification methods. Based on labeled examples in the training dataset, a multi-labeled method extracts inherent information to output a function that predicts the labels of unlabeled data. Due to several problems, like errors in the input vectors or in their labels, this information may be wrong and might lead the multi-label algorithm to fail. In this paper, we propose a simple algorithm for overcoming these problems by editing the existing training dataset, and adapting the edited set with different multi-label classification methods. Evaluation on benchmark datasets demonstrates the usefulness and effectiveness of our approach.",

keywords = "Classification, Edition, k-Nearest neighbor, Multi-label, Prototype selection",

author = "Sawsan Kanj and Fahed Abdallah and Thierry Den{\oe}ux and Kifah Tout",

note = "Publisher Copyright: {\textcopyright} 2015, Springer-Verlag London.",

year = "2016",

month = feb,

day = "1",

doi = "10.1007/s10044-015-0452-8",

language = "English",

volume = "19",

pages = "145--161",

journal = "Pattern Analysis and Applications",

issn = "1433-7541",

publisher = "Springer London",

number = "1",

}

TY - JOUR

T1 - Editing training data for multi-label classification with the k-nearest neighbor rule

AU - Kanj, Sawsan

AU - Abdallah, Fahed

AU - Denœux, Thierry

AU - Tout, Kifah

PY - 2016/2/1

Y1 - 2016/2/1

N2 - Multi-label classification allows instances to belong to several classes at once. It has received significant attention in machine learning and has found many real-world applications in recent years, such as text categorization, automatic video annotation and functional genomics, resulting in the development of many multi-label classification methods. Based on labeled examples in the training dataset, a multi-labeled method extracts inherent information to output a function that predicts the labels of unlabeled data. Due to several problems, like errors in the input vectors or in their labels, this information may be wrong and might lead the multi-label algorithm to fail. In this paper, we propose a simple algorithm for overcoming these problems by editing the existing training dataset, and adapting the edited set with different multi-label classification methods. Evaluation on benchmark datasets demonstrates the usefulness and effectiveness of our approach.

AB - Multi-label classification allows instances to belong to several classes at once. It has received significant attention in machine learning and has found many real-world applications in recent years, such as text categorization, automatic video annotation and functional genomics, resulting in the development of many multi-label classification methods. Based on labeled examples in the training dataset, a multi-labeled method extracts inherent information to output a function that predicts the labels of unlabeled data. Due to several problems, like errors in the input vectors or in their labels, this information may be wrong and might lead the multi-label algorithm to fail. In this paper, we propose a simple algorithm for overcoming these problems by editing the existing training dataset, and adapting the edited set with different multi-label classification methods. Evaluation on benchmark datasets demonstrates the usefulness and effectiveness of our approach.

KW - Classification

KW - Edition

KW - k-Nearest neighbor

KW - Multi-label

KW - Prototype selection

UR - http://www.scopus.com/inward/record.url?scp=84954376160&partnerID=8YFLogxK

U2 - 10.1007/s10044-015-0452-8

DO - 10.1007/s10044-015-0452-8

M3 - Article

AN - SCOPUS:84954376160

SN - 1433-7541

VL - 19

SP - 145

EP - 161

JO - Pattern Analysis and Applications

JF - Pattern Analysis and Applications

IS - 1

ER -