Dr. Anna Khalemsky

Research Interests: 


Dynamic Segmentation Rules for Online Classifier of Big Data via Small Data Buffers

In classification processes, new cases are classified according to a model that was built from past cases. As long as the new cases are "similar enough" to past cases, classification proceeds normally. However, when a new case is substantially different from the known cases, a reexamination is required. We assume that in a dynamic data environment it is not possible to reexamine all past data, so this research proposes using small groups of selected cases, stored in small memory buffers, instead. The paper presents an incremental dynamic classifier that can dynamically update both the segmentation sets (classification categories) and the classification rules. The suggested model thus enables automatic decision making in real time and provides an effective tool for dynamic Big Data environments. The main innovation of this study is the use of small data buffers to supply ongoing representative cases for each classification category. To reduce the computational effort of unsupervised clustering and of the updating process, which must constantly keep up with the new elements and trends that appear in a Big Data environment, all calculations in the suggested approach are based only on the relevant data buffers and their representative cases. The suggested model and prototype therefore offer an autonomous dynamic classifier that can be very useful in industry.

Model evaluation tests show that the model creates an optimal classification set in a relatively small number of iterations while retaining its ability to change the segments dynamically. We call it Dynamic Archetypes Base Reasoning.
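
The buffer idea described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the class name `BufferedClassifier`, the Euclidean distance to a buffer centroid, the fixed `threshold` for "similar enough", and the first-in-first-out buffer policy are all assumptions made for illustration. The sketch only shows how a classifier might consult small per-segment buffers, rather than all past data, and open a new segment when a case is substantially different.

```python
from collections import deque
import math


class BufferedClassifier:
    """Hypothetical sketch: one small FIFO buffer of recent cases per segment."""

    def __init__(self, threshold, buffer_size=5):
        self.threshold = threshold      # assumed cutoff for "similar enough"
        self.buffer_size = buffer_size  # assumed maximum number of cases per buffer
        self.buffers = []               # one deque of representative cases per segment

    @staticmethod
    def _centroid(buffer):
        # Coordinate-wise mean of the cases currently held in the buffer.
        n = len(buffer)
        return [sum(values) / n for values in zip(*buffer)]

    def classify(self, case):
        """Return the segment index for `case`, opening a new segment if needed."""
        best, best_dist = None, math.inf
        for index, buffer in enumerate(self.buffers):
            dist = math.dist(case, self._centroid(buffer))
            if dist < best_dist:
                best, best_dist = index, dist

        if best is None or best_dist > self.threshold:
            # The case is substantially different from every buffer: new segment.
            self.buffers.append(deque([tuple(case)], maxlen=self.buffer_size))
            return len(self.buffers) - 1

        # Similar enough: store the case; the oldest case drops out when full.
        self.buffers[best].append(tuple(case))
        return best


clf = BufferedClassifier(threshold=2.0, buffer_size=3)
cases = [(0, 0), (0.5, 0.2), (10, 10), (9.5, 10.2), (0.1, 0.4)]
labels = [clf.classify(case) for case in cases]
print(labels)  # [0, 0, 1, 1, 0]
```

Because each buffer holds at most `buffer_size` cases, the cost of classifying a new case depends on the number of segments and the buffer size, not on the total number of cases seen so far. A fuller treatment would also need rules for merging or retiring segments, which this sketch omits.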