Multi-Instance Learning (MIL) by Finding an Optimal Set of Classification Exemplars (OSCE) Using Linear Programming


1 Faculty of Electrical Engineering, Czech Technical University, Prague, Czech Republic

2 Faculty of Computer Engineering, Rouzbahan University, Sari, Iran


This paper describes how to classify a data set using an optimal set of exemplars to determine the label of a query instance, addressing the classification run-time problem in large data sets. We use these exemplars to classify positive and negative bags in a synthetic data set.
Several methods exist for implementing multi-instance learning (MIL), such as SVM, CNN, and Diverse Density. In this paper, an Optimal Set of Classification Exemplars (OSCE) is used to recognize positive bags (those containing tumor patches).
The goal of this paper is to speed up classifier run time by choosing a small set of exemplars. We formulate a linear programming (LP) problem that optimizes a hinge loss cost function comparing estimated labels with actual labels during training. The estimated label of a query point is computed from the Euclidean distances to its k nearest neighbors and their actual label values. To select exemplars with nonzero weights, two strategies are suggested for better results: the first restricts attention to the k closest neighbors; the second solves the LP and thresholds the resulting variables, keeping only the largest values, which are the most significant for defining the exemplar set. There is also a trade-off between classifier run time and accuracy. On large data sets, the OSCE classifier performs better than ANN and k-NN classifiers, and OSCE is faster than a nearest-neighbor classifier. After describing the OSCE method, we apply it to recognize cancer in a synthetic data set. Indeed, we define OSCE so that it can be applied to MIL for cancer detection.
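The exemplar-selection scheme described above can be sketched as a small linear program. The following is a minimal illustration, not the authors' implementation: the inverse-distance similarity, the L1 penalty `lam`, and the threshold `1e-6` are all assumptions chosen to make the idea concrete. Hinge constraints are linearized with slack variables, and sparsity in the weight vector yields the exemplar set.

```python
# Hedged sketch of exemplar selection via an L1-regularized hinge-loss LP.
# The similarity kernel, penalty lam, and threshold below are assumptions,
# not values taken from the paper.
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
# Toy two-class data with labels in {-1, +1}
X = np.vstack([rng.normal(2.0, size=(20, 2)), rng.normal(-2.0, size=(20, 2))])
y = np.hstack([np.ones(20), -np.ones(20)])
n = len(X)

# Similarity from Euclidean distances, restricted to the k nearest neighbors
k = 5
D = cdist(X, X)
S = np.zeros_like(D)
for i in range(n):
    nbrs = np.argsort(D[i])[1:k + 1]          # skip the point itself
    S[i, nbrs] = 1.0 / (1.0 + D[i, nbrs])     # closer neighbors weigh more

A = S * y[None, :]        # signed influence of each candidate exemplar
lam = 0.1                 # L1 penalty encourages a small exemplar set

# Variables: [w_1..w_n, xi_1..xi_n]; minimize lam*sum(w) + sum(xi)
c = np.hstack([lam * np.ones(n), np.ones(n)])
# Hinge constraint 1 - y_i*(A w)_i <= xi_i  rewritten as  -y_i*A_i.w - xi_i <= -1
A_ub = np.hstack([-(y[:, None] * A), -np.eye(n)])
b_ub = -np.ones(n)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")

w = res.x[:n]
exemplars = np.where(w > 1e-6)[0]             # thresholding keeps few exemplars
print("selected", len(exemplars), "of", n, "candidate points")
```

Raising `lam` shrinks the exemplar set (faster classification); lowering it keeps more exemplars (higher accuracy), mirroring the run-time/accuracy trade-off noted above.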

