Purifying training data to improve performance of multi-label classification algorithms

Sawsan Kanj, Fahed Abdallah, Thierry Denoux

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

2 Citations (Scopus)


Multi-label classification assumes that each object in the training set is associated with a set of labels, and the goal is to assign a set of labels to each unseen instance. Algorithms based on k-nearest neighbors address the multi-label problem by using the information carried by the neighbors of the observation to be classified. Because of problems such as errors in the input vectors or in their labels, this information may be wrong and can cause the multi-label algorithm to fail. In this paper, we propose a simple algorithm that edits out some training instances, selected by a vote over several metrics, in order to purify the existing training sample. This purifying approach is applied to the recently proposed evidential k-nearest neighbors method for multi-label classification. Comparative experimental results on various data sets demonstrate the usefulness and effectiveness of our approach.
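The editing idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the metrics, the thresholds, or the voting rule, so everything below is an assumption made for illustration: the function names (`knn_predict`, `purify`), the two metrics (Hamming loss and Jaccard distance), the 0.5 thresholds, and the strict-majority vote are all hypothetical. It also uses a plain majority-vote k-nearest-neighbors predictor in place of the evidential k-nearest neighbors method named in the abstract.

```python
from math import dist

def knn_predict(X, Y, i, k):
    """Leave-one-out prediction of the label set of training instance i."""
    others = [j for j in range(len(X)) if j != i]
    others.sort(key=lambda j: dist(X[i], X[j]))
    neighbours = others[:k]
    # A label is predicted when more than half of the neighbours carry it.
    return [int(2 * sum(Y[j][q] for j in neighbours) > len(neighbours))
            for q in range(len(Y[0]))]

def hamming_loss(y_true, y_pred):
    """Fraction of labels on which the two label vectors disagree."""
    return sum(a != b for a, b in zip(y_true, y_pred)) / len(y_true)

def jaccard_distance(y_true, y_pred):
    """One minus the overlap of the two label sets (0.0 if both are empty)."""
    inter = sum(a and b for a, b in zip(y_true, y_pred))
    union = sum(a or b for a, b in zip(y_true, y_pred))
    return 0.0 if union == 0 else 1.0 - inter / union

def purify(X, Y, k=3, thresholds=(0.5, 0.5)):
    """Return indices of instances to keep.

    Assumed rule (not taken from the paper): each metric votes to remove an
    instance when its value exceeds the matching threshold; the instance is
    removed only if a strict majority of metrics vote for removal.
    """
    metrics = (hamming_loss, jaccard_distance)
    keep = []
    for i in range(len(X)):
        y_pred = knn_predict(X, Y, i, k)
        votes = sum(m(Y[i], y_pred) > t for m, t in zip(metrics, thresholds))
        if 2 * votes <= len(metrics):
            keep.append(i)
    return keep

# Toy data: two clusters; the last point sits in the first cluster
# but carries the label set of the second one.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
Y = [[1, 0], [1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [0, 1]]
kept = purify(X, Y, k=3)
```

On this toy data the instance at index 7 disagrees with all of its neighbors under both metrics and is edited out, while the other seven instances are kept. The purified sample `[X[i] for i in kept]` would then be passed to whichever multi-label classifier is being trained.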

Original language: English
Title of host publication: 15th International Conference on Information Fusion, FUSION 2012
Number of pages: 8
Publication status: Published - 2012
Externally published: Yes
Event: 15th International Conference on Information Fusion, FUSION 2012 - Singapore, Singapore
Duration: 7 Sept 2012 to 12 Sept 2012

Publication series

Name: 15th International Conference on Information Fusion, FUSION 2012


Conference: 15th International Conference on Information Fusion, FUSION 2012
