On Application of Case-Based Reasoning to Personalise Learning

The paper aims to present the application of Educational Data Mining, and particularly Case-Based Reasoning (CBR), to student profiling and, further, to the design of a personalised intelligent learning system. The main aim is to develop a recommender system that helps learners create the learning units (scenarios) most suitable for them. First, a systematic literature review on the application of CBR and its possible use to personalise learning was performed. After that, a methodology for applying CBR to personalise learning is presented, in which learning styles play a dominant role as the key factor in the proposed personalised intelligent learning system model, based on student profiling and a personalised learning process model. The algorithm (the sequence of steps) to implement this model is also presented in the paper.


Introduction
Integrating Data Mining and Case-Based Reasoning (CBR) techniques is a rather complex task to implement. However, many articles have recently been published in which the authors present their ways of integrating these technologies.
Thus, algorithms (Wang et al., 2012), models and mechanisms have been proposed for decision-making in problems such as diagnosing diseases (Huang et al., 2007), estimating the probability of bankruptcy, choosing suppliers (Zhao, 2011), and personalising learning for students in Virtual Learning Environments (VLEs) (Huang et al., 2007b).
Selecting an appropriate algorithm for a new data set is a difficult task because no single classifier is equally well suited to all data sets. In practice, it is very important to choose the proper classification, clustering or other algorithm for a particular data set.
In personalised learning, first of all, an integrated learner profile (model) should be implemented using, e.g., the Soloman-Felder Index of Learning Styles Questionnaire (1991). After that, learning components (learning objects, activities, and environments) should be interlinked with learners' profiles, and an ontologies-based personalised recommender system should be created to suggest learning components suitable to particular learners according to their profiles (Kurilovas, 2016).
Interlinking and ontologies creation should be based on expert evaluation results. Experienced experts should evaluate learning components in terms of their suitability to particular learners according to their learning styles and other preferences / needs. A recommender system should form the preference lists of the learning components according to the expert evaluation results.
Probabilistic suitability indexes should be identified for all learning components in terms of their suitability to particular learners (Kurilovas et al., 2016). These indexes can easily be calculated for all learning components and all students by multiplying the suitability ratings of the learning components, obtained when experts evaluate their suitability to particular learning styles, by the probabilities of the particular students' learning styles. The suitability indexes should be included in the recommender system, and all learning components should be linked to particular students according to them. The higher the suitability index, the better the learning component fits the needs of the particular learner.
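The multiplication described above can be sketched as follows. This is a minimal illustration only: the component names, the rating values, the function names and, in particular, the aggregation of the per-style products into one overall index by summation are assumptions made for the example and are not taken from the cited works.

```python
from typing import Dict

# Hypothetical expert ratings in [0, 1]: how well each learning component
# suits each learning style (values invented for illustration).
EXPERT_RATINGS: Dict[str, Dict[str, float]] = {
    "video_lecture": {"VIS": 0.9, "VER": 0.4, "ACT": 0.3, "REF": 0.7},
    "group_project": {"VIS": 0.5, "VER": 0.5, "ACT": 0.9, "REF": 0.2},
}


def suitability_indexes(
    ratings: Dict[str, Dict[str, float]], style_probs: Dict[str, float]
) -> Dict[str, Dict[str, float]]:
    """Multiply each expert rating by the student's probability of that style."""
    return {
        component: {
            style: rating * style_probs.get(style, 0.0)
            for style, rating in per_style.items()
        }
        for component, per_style in ratings.items()
    }


def overall_index(per_style_index: Dict[str, float]) -> float:
    """Assumption: aggregate the per-style products by summing them."""
    return sum(per_style_index.values())


# Hypothetical student: probabilities of the learning styles in the profile.
student = {"VIS": 0.8, "VER": 0.2, "ACT": 0.25, "REF": 0.75}
indexes = suitability_indexes(EXPERT_RATINGS, student)
ranking = sorted(indexes, key=lambda c: overall_index(indexes[c]), reverse=True)
```

Under these invented numbers, `ranking` lists the components from the most to the least suitable for this student, which is the form of preference list the recommender system is expected to produce.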
Thus, personalised learning units / scenarios (i.e. personalised methodological sequences of learning components) could be created for particular learners. An optimal learning unit / scenario (i.e. a learning unit of the highest quality) for a particular student is a methodological sequence of learning components having the highest suitability indexes.
A number of intelligent technologies should be applied to implement this methodology, for example, ontologies, recommender systems, intelligent software agents, and decision support systems to evaluate the quality of learning units / scenarios. This pedagogically sound methodology for personalising learning units / scenarios is aimed at improving learning motivation and thus learning quality and effectiveness. The level of students' competences, that is, knowledge / understanding, skills and attitudes / values, directly depends on the level of application of high-quality learning units / scenarios in real pedagogical practice (Kurilovas, 2016).
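As a rough sketch of how such a sequence could be assembled, the snippet below picks, for each position in a methodological sequence, the candidate component with the highest suitability index. The slot structure, the component names and the index values are hypothetical; the source does not prescribe this particular selection procedure.

```python
from typing import Dict, List


def build_learning_unit(
    slots: List[List[str]], index: Dict[str, float]
) -> List[str]:
    """For each slot of the sequence, choose the candidate with the highest index."""
    return [max(candidates, key=lambda c: index[c]) for candidates in slots]


# Hypothetical overall suitability indexes of components for one student.
student_index = {
    "video_lecture": 1.4,
    "text_reading": 0.9,
    "quiz": 0.6,
    "group_project": 0.875,
}

# Hypothetical methodological sequence: a presentation step, then a practice step.
sequence_slots = [["video_lecture", "text_reading"], ["quiz", "group_project"]]

unit = build_learning_unit(sequence_slots, student_index)
```

Here `unit` holds one component per step, each being the best-rated alternative for this student at that step.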
Existing learning activities and tools should be analysed so that they can be interlinked with the appropriate students' learning styles. For this purpose, e.g., the Felder-Silverman learning styles model (FSLSM, Felder and Silverman, 1988) should be applied. Students' learning styles according to FSLSM should be interlinked with the most suitable learning activities and tools using the expert evaluation method. FSLSM classifies students according to where they fit on a number of scales pertaining to the ways they receive and process information: (a) By information type: (1) Sensory (SEN): concrete, practical, oriented towards facts and procedures vs (2) Intuitive (INT): conceptual, innovative, oriented towards theories and meaning; (b) By sensory channel: (3) Visual (VIS): prefer visual representations of presented material, e.g. pictures, diagrams, flow charts vs (4) Verbal (VER): prefer written and spoken explanations; (c) By information processing: (5) Active (ACT): learn by trying things out, working with others vs (6) Reflective (REF): learn by thinking things through, working alone; and (d) By understanding: (7) Sequential (SEQ): linear, orderly, learn in small incremental steps vs (8) Global (GLO): holistic, systems thinkers, learn in large leaps.
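The four FSLSM scales can be represented as a small data structure, and questionnaire answers can be turned into per-style probabilities. The sketch below assumes a questionnaire with 11 two-choice items per scale and assumes that the probability of a pole is simply the share of answers favouring it; both the conversion rule and all names are illustrative assumptions, not a method stated in the source.

```python
from typing import Dict, Tuple

# The four FSLSM scales and their two poles, as listed in the text.
DIMENSIONS: Dict[str, Tuple[str, str]] = {
    "information_type": ("SEN", "INT"),
    "sensory_channel": ("VIS", "VER"),
    "information_processing": ("ACT", "REF"),
    "understanding": ("SEQ", "GLO"),
}

# Assumption: 11 two-choice questionnaire items per scale.
ITEMS_PER_DIMENSION = 11


def style_probabilities(first_pole_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert the number of answers favouring the first pole into probabilities.

    Assumption: probability of a pole = share of answers that favour it.
    """
    probs: Dict[str, float] = {}
    for dimension, (first, second) in DIMENSIONS.items():
        count = first_pole_counts[dimension]
        if not 0 <= count <= ITEMS_PER_DIMENSION:
            raise ValueError(f"invalid answer count for {dimension}: {count}")
        probs[first] = count / ITEMS_PER_DIMENSION
        probs[second] = 1.0 - probs[first]
    return probs


# Hypothetical student answers: count of items answered in favour of the first pole.
answers = {
    "information_type": 11,
    "sensory_channel": 0,
    "information_processing": 5,
    "understanding": 8,
}
profile = style_probabilities(answers)
```

The resulting `profile` maps each of the eight style labels to a value between 0 and 1, which is the kind of input the suitability index calculation needs.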
Next, students should be analysed to identify their individual learner profiles. After that, probabilistic suitability indexes should be calculated for each analysed student and each learning activity to identify which learning activities or tools are the most suitable for the particular student. From a theoretical point of view, the higher the probabilistic suitability index, the better the learning activity or tool fits the particular student's needs.
On the other hand, students have already used some learning activities or tools in real learning practice in VLEs (e.g. Moodle) before the aforementioned probabilistic suitability indexes were identified. Here we could hypothesise that students preferred to use those VLE-based learning activities or tools that best fit their learning needs.
Thus, using appropriate Educational Data Mining methods and techniques, it would be helpful to analyse which learning activities or tools were actually used by these students in the VLE, and to what extent.
After that, the data on the practical use of VLE-based learning activities or tools should be compared with the students' probabilistic suitability indexes. In the case of any noticeable discrepancies, the students' profiles and the accompanying suitability indexes should be identified more precisely, and the students' personal learning paths (i.e. learning units / scenarios) in the VLE should be corrected according to the newly identified data. In this way, after several iterations, we could noticeably enhance students' motivation, learning quality and effectiveness.
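One possible way to make this comparison concrete is sketched below: both the suitability indexes and the observed usage counts are normalised to shares, and activities whose shares differ by more than a threshold are flagged. The normalisation, the threshold value and all names and numbers are assumptions chosen for illustration; the source does not specify how discrepancies are to be measured.

```python
from typing import Dict


def normalise(values: Dict[str, float]) -> Dict[str, float]:
    """Scale the values so that they sum to 1."""
    total = sum(values.values())
    if total == 0:
        raise ValueError("cannot normalise values that sum to zero")
    return {key: value / total for key, value in values.items()}


def find_discrepancies(
    suitability: Dict[str, float],
    usage_counts: Dict[str, int],
    threshold: float = 0.15,  # assumed cut-off for a "noticeable" discrepancy
) -> Dict[str, float]:
    """Return observed-minus-expected share for activities that differ noticeably."""
    expected = normalise(suitability)
    observed = normalise({k: float(usage_counts.get(k, 0)) for k in suitability})
    return {
        activity: observed[activity] - expected[activity]
        for activity in suitability
        if abs(observed[activity] - expected[activity]) > threshold
    }


# Hypothetical data for one student: suitability indexes and VLE log counts.
suitability = {"forum": 0.5, "quiz": 0.3, "wiki": 0.2}
usage = {"forum": 10, "quiz": 30, "wiki": 60}

flagged = find_discrepancies(suitability, usage)
```

With these invented numbers the forum is used far less, and the wiki far more, than the profile predicts, so both would be candidates for re-examining the student's profile.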
However, personalisation using CBR is still an open question. The rest of the paper is organised as follows. A systematic review on using CBR to personalise learning is presented in Section 2. A methodology for applying CBR to personalise learning is presented in Section 3, where learning styles play a dominant role as the key factor in the proposed personalised intelligent learning system model, based on student profiling and a personalised learning process model. The algorithm (the sequence of steps) to implement this model is also presented in Section 3. The paper is concluded by Section 4.

Systematic Review
In order to identify scientific methods and possible results on CBR application to personalise learning, the systematic literature review method devised by Kitchenham (2004) has been used.
The following research questions have been raised to perform the systematic literature review: What is CBR? We see that during this period, 17 papers were published in the Clarivate Analytics Web of Science database according to the topic (case based reasoning AND learning personalisation) (Fig. 1). The analysis results are as follows.
Galitsky (2017) presented a report from the field on a linguistic-based relevance technology based on learning of parse trees for processing, classification and delivery of a stream of texts. The author described the content pipeline for the eBay entertainment domain which employs this technology, and showed that text processing relevance is the main bottleneck for its performance. In the partial case where short expressions are commonly used terms such as Facebook likes, syntactic generalization (SG) ascends to the level of categories, and a reasoning technique is required to map these categories in the course of relevance assessment. A number of content pipeline components employ web mining, which needs SG to compare web search results. The author described how SG works in a number of components in the content pipeline, including personalisation and recommendation, and provided the evaluation results for the eBay deployment.
According to Miranda et al. (2016), Subject Ontologies represent conceptualisations of disciplinary domains in which concepts symbolise topics that are relevant for the considered domain and are associated with each other by means of specific relations. Usually, this kind of lightweight ontology is adopted in knowledge-based educational environments to enable semantic organisation and search of resources and, in other cases, to support personalisation and adaptation features for learning and teaching experiences. For this reason, applying effective management methodologies for Subject Ontologies is a crucial aspect in engineering the environments. This paper proposes an approach to use SKOS (a Semantic Web-based vocabulary providing a standard way to represent knowledge organisation systems) for modelling subject ontologies. It focuses on alternative strategies for storing and accessing ontologies in order to support the knowledge sharing, knowledge reusing, planning, assessment, customisation and adaptation processes related to learning scenarios. The results of an early experimentation allowed the authors to define a framework able to support, from both methodological and technological viewpoints, the use of Subject Ontologies in the context of a Semantic Web-based Educational System. The defined framework performs well in terms of response, and this may really improve the user experience.
Khamparia and Pandey (2015) consider that e-learning is the use of technology that enables people to learn at any time from anywhere. Various single knowledge-based methods (KBM) such as rule-based reasoning (RBR) and case-based reasoning (CBR); and intelligent computing methods (ICM) such as genetic algorithm (GA), particle swarm optimisation (PSO), artificial neural network (ANN), multi-agent systems (MAS), ant colony optimisation (ACO), fuzzy logic (FL) etc.
Integrated KBM-ICM methods such as GA-CBR, ANN-RBR, GA-Ontology and ANN-Mining have been used in various e-learning contexts such as learning path generation, adaptive course sequencing and personalisation of recommended learning objects. From the results, it is observed that a single KBM is not deployed to solve any e-learning problem. A single ICM and integrated KBM-ICM methods are used to solve various e-learning problems.
Yakoumettis et al. (2014) propose a weight rectification strategy that improves weight estimation by exploiting metadata interrelations defined through an ontology. Subsequently, a genetic optimisation algorithm is incorporated to select the most user-preferred routes based on a multi-criteria minimisation approach. To increase the degree of personalisation in 3D navigation, the authors have also introduced an efficient algorithm for estimating 3D trajectories around objects of interest by merging the best selected 2D projected views that contain the faces most preferred by the users. Qualitative comparisons have also been performed using a use case route scenario.
According to Ogiela (2013), cognitive categorisation systems are used for in-depth analyses of data which contains significant layers of information.Adding semantic analysis modules to personal identification systems represents a novel scientific proposition which marks the beginning of the use of semantic analysis processes for biological modelling and personalisation tasks.
Llorente and Guerrero (2012) consider that a major task of research in conversational recommender systems is personalisation. The expectation is that in each cycle the system retrieves the products that best satisfy the user's soft product preferences from a minimal information input. In this paper, the authors present a novel technique that increases retrieval quality based on a combination of compatibility and similarity scores. Under the hypothesis that a user learns during the recommendation process, the authors propose two novel exponential Reinforcement Learning approaches for compatibility that take into account both the instant at which the user makes a critique and the number of satisfied critiques. They also propose a Global Weighting approach that uses a common weight for nearest cases in order to focus on groups of relevant products. Their methodology significantly improves recommendation efficiency in four data sets of different sizes in terms of session length in comparison with state-of-the-art approaches. Moreover, this recommender shows higher robustness against noisy user data when compared to classical approaches.
Scotton et al. (2010) focus on recent work on strategy modularisation and merger development in the authoring process of adaptive hypermedia. The reason for the modularisation of strategies is to break a complex adaptation decision into a number of simpler ones, which may be reused more easily and applied in different orders. The rationale for strategy merger is to be able to apply multiple adaptation strategies over the same content, a challenge which is not yet fully addressed in current adaptive hypermedia systems.
Ha (2008) introduced a personalised counselling system based on context mining. As a technique for context mining, the author has developed an algorithm called CANSY. It adopts trained neural networks for feature weighting and a value difference metric in order to measure distances between all possible values of symbolic features. CANSY plays a core role in classifying and presenting the most similar cases from a case base. Experimental results show that CANSY along with a rule base can provide personalised information with a relatively high level of accuracy, and it is capable of recommending appropriate products or services.
According to Singal et al. (2016), recommender systems are means of web personalisation that tailor the browsing experience to the users' specific needs, and are tools for interacting with large and complicated information spaces. They give a personalised view of these spaces, ranking items likely to be of interest to the user. Recommender systems research has integrated a wide range of artificial intelligence techniques including machine learning, data mining, user modelling, case-based reasoning, and constraint satisfaction, among others. The purpose of this paper is to show how recommendations can be generated for case-based scenarios using the AdaBoost machine learning algorithm. The technique has been used to predict the restaurants a user may like based on data gathered from the past.
Huertas (2016) considers that learning analytics can be defined as the measurement, collection, analysis, and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs. The availability of data on the interactions of online students is an opportunity to improve learning processes in formal education. Data produced by the student provide valuable information about the reality of the learning process and suggest opportunities for improvement to educators. Among other things, such data make it possible to identify students at risk, assist students in achieving goals, and provide students with knowledge about their own learning habits and with recommendations for improvement. This research focuses on e-assessment analytics for STEM, i.e. the application of learning analytics techniques to improve e-assessment in the field of STEM subjects. This real case research focuses on the analysis of student activity in relation to the process of e-assessment in the Logic course.
In (Loeckx, 2016), a recommender system is presented that suggests (classical) piano pieces to a particular student, based on his/her history of played pieces. It learns from human teachers how to make sensible recommendations by recording the paths of real students' curricula. The recommendations made by the proposed system have been evaluated and compared to human suggestions in a blind test performed by piano teachers. Preliminary evidence suggests that the quality of suggestions is similar, and that teachers had trouble identifying which recommendations were made by a real teacher and which by a computer.
According to Limongelli and Sciarrone (2014), in the context of e-learning courses, personalisation is an increasingly studied issue, its advantages in terms of time and motivation having been widely demonstrated. Course personalisation basically means understanding students' needs: to this aim, several Artificial Intelligence methodologies, such as planning, case-based reasoning, or fuzzy logic, to cite just some of them, have been used to model students for tailoring e-learning courses and to provide didactic strategies.
In (Chu and Liu, 2012), a cooperative negotiation and problem solving method, ANC (Automated Negotiation and Case-based reasoning), is proposed. The goal of this method is to provide a suitable smart home service for multiple user requirements. The motivation of this paper is that agreeing on a common service is difficult when different users propose different requirements. Therefore, in ANC, cooperative negotiation is used to resolve conflicts and make the requirements consistent among users, and, based on such requirements, a common solution is provided through a reasoning process. There are five stages in ANC: issue acquisition, conflict detection, issue decision, automated negotiation, and problem solving. To make the ANC system provide personalisation, a learning method based on attribute weighting has been integrated, taking advantage of the constant learning ability of case-based reasoning.
According to Gouttaya and Begdouri (2012), current context-aware adaptation techniques in smart environments are limited in their support for proactivity and user personalisation. A reliance on developer modification and an inability to automatically learn from user interactions hinder their use for providing proactive services that can be adapted to the frequent changes of the context of individuals. To address these problems, the authors propose a proactive and personalised approach to adaptation. Their approach integrates both Case-Based Reasoning (CBR) and data mining techniques. It is based on CBR, but aided by data mining to extract user patterns and adaptation knowledge from users' interaction history.
Ciloglugil and Inceoglu (2010) think that the one-size-fits-all approach is being replaced by the adaptive, personalised perspective in recently developed learning environments. This study takes a look at the need for personalisation in e-learning systems and at the adaptivity and distribution features of adaptive distributed learning environments. By focusing on how personalisation can be achieved in e-learning systems, the technologies used for establishing adaptive learning environments are explained and evaluated briefly. Some of these technologies are web services, multi-agent systems, the semantic web and AI techniques such as case-based reasoning, neural networks and Bayesian networks used in intelligent tutoring systems. Finally, by discussing some of the adaptive distributed learning systems, an overall state of the art of the field is given with some future trends.
The aim of the research by M'tir et al. (2008) is to build a cooperative e-Learning system adapted to different learners' profiles (knowledge levels, pedagogical preferences, goals, etc.). In order to improve e-Learning systems, the authors propose to capitalise on and reuse learning experiences. The capitalisation consists in modelling the learning situation of a given learner. The learning situation model includes the learner profile as well as the learning process features. The reuse consists in exploiting previous experiences in order to offer the current learner the experience best suited to his/her needs. This experience should already be validated and evaluated by other learners having similar learning profiles. This experience reuse approach is based on Case-Based Reasoning, which defines a reasoning approach based on the reuse concept.
According to Leake and Powell (2008), how to endow case-based reasoning systems with effective case adaptation capabilities is a classic problem. A significant impediment to developing automated adaptation procedures is the difficulty of acquiring the required knowledge. Initial work on WebAdapt proposed addressing this problem with "just-in-time" knowledge mining from Web sources. This paper addresses two key questions building on that work. First, to develop flexible, general and extensible procedures for gathering adaptation-relevant knowledge from the Web, it proposes a knowledge planning approach in which a planner takes explicit knowledge goals as input and generates a plan for satisfying them from a set of general operators. Second, to focus selection of candidate adaptations from the potentially enormous space of possibilities, it proposes personalising adaptations based on learned information about user preferences. Evaluations of the system are encouraging for the use of knowledge planning and learned preference information to improve adaptation performance.
This systematic review reveals that CBR is already actively used in learning, but its application to personalise learning should be further analysed. One of the possible CBR applications to personalise learning is proposed in the following section.

Methodology on CBR Application to Personalise Learning
According to Jevsikova et al. (2017), learning software and the whole learning process should be personalised according to the main characteristics / needs of the learners. Learners have different needs and characteristics, i.e. prior knowledge, intellectual level, interests, goals, cognitive traits (working memory capacity, inductive reasoning ability, and associative learning skills), learning behavioural type (according to their self-regulation level), and, finally, learning styles. These characteristics should be included in students' learning profiles.
CBR could be successfully applied to personalise learning according to students' profiles.
For example, the study of Masood et al. (2017) applied CBR for learning personalisation. It measures the ability of the CBR algorithm to suggest the most suitable learning material based on specific information supplied by the user of the system. In order to test the ability of the application to recommend learning material, two versions of the application were created. The first version displayed the most suitable learning material, and the second version displayed the least preferable learning material. The results show that the first version of the application successfully assigns students to the most suitable learning material when compared with the second version.
According to Aamodt and Plaza (1994), CBR has been formalised for purposes of computer reasoning as a four-step process:
• Retrieve: Given a target problem, retrieve from memory cases relevant to solving it. A case consists of a problem, its solution, and, typically, annotations about how the solution was derived. For example, suppose Fred wants to prepare blueberry pancakes. Being a novice cook, the most relevant experience he can recall is one in which he successfully made plain pancakes. The procedure he followed for making the plain pancakes, together with justifications for decisions made along the way, constitutes Fred's retrieved case.
• Reuse: Map the solution from the previous case to the target problem. This may involve adapting the solution as needed to fit the new situation. In the pancake example, Fred must adapt his retrieved solution to include the addition of blueberries.
• Revise: Having mapped the previous solution to the target situation, test the new solution in the real world (or a simulation) and, if necessary, revise. Suppose Fred adapted his pancake solution by adding blueberries to the batter. After mixing, he discovers that the batter has turned blue, an undesired effect. This suggests the following revision: delay the addition of blueberries until after the batter has been ladled into the pan.
• Retain: After the solution has been successfully adapted to the target problem, store the resulting experience as a new case in memory. Fred, accordingly, records his new-found procedure for making blueberry pancakes, thereby enriching his set of stored experiences, and better preparing him for future pancake-making demands.
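The four steps can be illustrated with a deliberately small sketch. The case representation (a dictionary of numeric features plus a solution label), the similarity function, the trivial reuse step without adaptation, and the feedback callback standing in for real-world testing are all assumptions made for this example; practical CBR systems differ considerably in each of these choices.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Case:
    problem: Dict[str, float]  # hypothetical numeric description of the problem
    solution: str              # hypothetical solution label


def similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Assumed similarity: inverse of the summed absolute feature differences."""
    keys = set(a) | set(b)
    distance = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)
    return 1.0 / (1.0 + distance)


class CaseBase:
    def __init__(self, cases: Iterable[Case]) -> None:
        self.cases: List[Case] = list(cases)

    def retrieve(self, problem: Dict[str, float]) -> Case:
        """Retrieve: find the stored case most similar to the target problem."""
        return max(self.cases, key=lambda c: similarity(c.problem, problem))

    def reuse(self, retrieved: Case, problem: Dict[str, float]) -> str:
        """Reuse: map the old solution to the new problem (no adaptation here)."""
        return retrieved.solution

    def revise(self, solution: str, feedback: Callable[[str], str]) -> str:
        """Revise: test the proposal; the callback returns the corrected solution."""
        return feedback(solution)

    def retain(self, problem: Dict[str, float], solution: str) -> None:
        """Retain: store the confirmed experience as a new case."""
        self.cases.append(Case(problem, solution))

    def solve(self, problem: Dict[str, float], feedback: Callable[[str], str]) -> str:
        proposed = self.reuse(self.retrieve(problem), problem)
        confirmed = self.revise(proposed, feedback)
        self.retain(problem, confirmed)
        return confirmed


# Hypothetical case base: learner style probabilities -> suitable component.
base = CaseBase([
    Case({"VIS": 0.9, "ACT": 0.2}, "video_lecture"),
    Case({"VIS": 0.2, "ACT": 0.9}, "group_project"),
])
result = base.solve({"VIS": 0.8, "ACT": 0.3}, feedback=lambda s: s)
```

In this toy run the new learner is closest to the first stored case, the proposed solution is accepted unchanged by the feedback callback, and the case base grows by one case.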
CBR is a classic artificial intelligence algorithm, which is a part of Data Mining technologies. In any case, historical data and information play an indispensable part in CBR as precedents, on the basis of which decisions are made. It is argued that this method is not just a good method of automating reasoning, but also a widespread behaviour in everyday human life or, more broadly, that all reasoning is based on personal experience. In Data Mining, historical data play a dominant role in forecasting and prediction, being at the same time one of the key points in decision-making. Forecasting is aimed at determining trends in the dynamics of a particular object or event on the basis of historical data, i.e. an analysis of its state in the past and present. Thus, the solution of the prediction problem requires the selection of some training data.
The task of forecasting can perhaps be considered one of the most complex tasks of Data Mining: it requires careful study of the initial data set and of the methods suitable for analysis.
Many Data Mining methods are used to solve classification and forecasting problems, for example, linear regression, neural networks, and decision trees (which are sometimes called classification and prediction trees).
The tasks of classification and forecasting have both similarities and differences. The similarity is that both problems are solved by a two-stage process: a model is built on the training set and is then used to predict the unknown values of the dependent variable.
The difference between the classification and forecasting tasks is that the class of the dependent variable is predicted in the former, whereas in the latter the numerical values of the dependent variable, missing or unknown (relating to the future), are predicted.
Based on this, an experiment was carried out (Mamcenko and Kurilovas, 2017) on data in which the learning styles of a group of students were defined by the following FSLSM-based (1988) indices: Information Processing, Information Type, Sensory Channel and Understanding (Fig. 2). This made it possible to assign students to classes that describe how they perceive information and educational material. For further intelligent analysis, these classes will serve as a starting point for classifying students using CBR on the basis of information perception.
The use of CBR will speed up the process of student profiling, because the profiling classes of already profiled students are known, and students who have not yet been profiled can be separated into groups by the similarity of their characteristics to those of the profiled students.
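A simple way to picture this grouping is a nearest-neighbour vote: a student without a profile receives the class that is most common among the most similar profiled students. The feature vectors, the class labels, the Euclidean distance and the value of k below are hypothetical choices for illustration; the source does not state which similarity measure or voting scheme the prototype uses.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

# A profiled student: (feature vector, known learning-style class).
ProfiledStudent = Tuple[Sequence[float], str]


def knn_profile(
    profiled: List[ProfiledStudent], features: Sequence[float], k: int = 3
) -> str:
    """Assign the majority class among the k most similar profiled students."""
    nearest = sorted(profiled, key=lambda item: math.dist(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: share of VLE time spent on video and on text resources.
profiled_students: List[ProfiledStudent] = [
    ((0.9, 0.1), "VIS"),
    ((0.8, 0.2), "VIS"),
    ((0.1, 0.9), "VER"),
    ((0.2, 0.7), "VER"),
]

new_student_class = knn_profile(profiled_students, (0.7, 0.3), k=3)
```

An odd k with two classes avoids tied votes in this toy setting; with more classes or an even k, a tie-breaking rule would have to be chosen explicitly.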
However, the resulting student classification model requires supervision and correction depending on the subject and its complexity, on the students themselves (age and level of education, e.g. secondary school or college), and on the form of education (full-time, correspondence). Therefore, to update the model, namely the classes obtained earlier, it is proposed to use CBR, which will be an auxiliary tool in the life cycle of the intelligent model and will make it possible to predict the model of training for new students more accurately. To automate the profile determination process, we propose to design a prototype system whose block diagram is shown in Fig. 3.
The design of the personalised learning process model is based on the development of a curriculum that takes into account the above factors, which affect the choice and content of a personal programme. The described method engages CBR and Educational Data Mining to design a system, similar to an expert system, which automatically profiles students by the features and factors of information perception in order to create a form of learning based on the students' personal characteristics, that is, on the data received and used in CBR decision-making.
The main scientific contributions of the paper are the student profiling and profiled learning process model using CBR, presented in Fig. 3, and the sequence of 11 steps to implement this model, presented in Section 3.
In the paper, it is proposed to use CBR, which will be an auxiliary tool in the life cycle of the intelligent model and will make it possible to predict the model of training for new students more accurately. To automate the profile determination process, the authors propose to design a prototype system whose block diagram is shown in Fig. 3.
The proposed model consists of the following steps: (1) determination of learning objectives; (2) asking students to complete learning styles questionnaires; (3) determination of students' learning styles, also using CBR and Data Mining, to prepare personalised profiles; (4) development of learning profiles using the results received; (5) design of profiled course content; (6) examination after study under personalised profiling; (7) evaluation of exam results; (8) improvement of learning profiling depending on the exam results received; and, finally, (9) creation of a system for saving the improved course profiles.

• Is CBR already used to personalise learning?
• What are scientific methods and results of applying CBR to personalise learning?
Systematic literature review was performed in the Clarivate Analytics (formerly Thomson Reuters) Web of Science database, timespan 2008-2018. The search history is as follows:
Fig. 1. Search history in Clarivate Analytics Web of Science.