<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">INFEDU</journal-id>
      <journal-title-group>
        <journal-title>Informatics in Education</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2335-8971</issn>
      <issn pub-type="ppub">1648-5831</issn>
      <publisher>
        <publisher-name>VU</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">INFEDU_2023_1_15</article-id>
      <article-id pub-id-type="doi">10.15388/infedu.2023.15</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Automatically detecting previous programming knowledge from novice programmer code compilation history</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Lokkila</surname>
            <given-names>Erno</given-names>
          </name>
          <email xlink:href="mailto:eolokk@utu.fi">eolokk@utu.fi</email>
          <xref ref-type="aff" rid="j_INFEDU_aff_000"/>
        </contrib>
        <aff id="j_INFEDU_aff_000">University of Turku, Faculty of Technology, Department of Computation, Turku, Finland; University of Turku, Faculty of Natural Sciences, Turku Research Institute of Learning Analytics, Turku, Finland</aff>
        <contrib contrib-type="author">
          <name>
            <surname>Christopoulos</surname>
            <given-names>Athanasios</given-names>
          </name>
          <email xlink:href="mailto:atchri@utu.fi">atchri@utu.fi</email>
          <xref ref-type="aff" rid="j_INFEDU_aff_001"/>
        </contrib>
        <aff id="j_INFEDU_aff_001">University of Turku, Faculty of Natural Sciences, Turku Research Institute of Learning Analytics, Turku, Finland</aff>
        <contrib contrib-type="author">
          <name>
            <surname>Laakso</surname>
            <given-names>Mikko-Jussi</given-names>
          </name>
          <email xlink:href="mailto:milaak@utu.fi">milaak@utu.fi</email>
          <xref ref-type="aff" rid="j_INFEDU_aff_002"/>
        </contrib>
        <aff id="j_INFEDU_aff_002">University of Turku, Faculty of Natural Sciences, Turku Research Institute of Learning Analytics, Turku, Finland</aff>
      </contrib-group>
	  
      <volume>22</volume>
      <issue>2</issue>
      <fpage>277</fpage>
      <lpage>294</lpage>
	  
      <pub-date pub-type="epub">
        <day>02</day>
        <month>09</month>
        <year>2022</year>
      </pub-date>
      <permissions>
        <copyright-year>2022</copyright-year>
        <copyright-holder>Vilnius University, ETH Zürich</copyright-holder>
        <license license-type="open-access">
          <license-p>Open access article under the CC BY license.</license-p>
        </license>
      </permissions>
      <abstract>
        <p>Students' prior programming knowledge has a major impact on introductory programming courses. Those with prior experience often seem to breeze through the course, while those without it see others progress with ease and disengage from the material or drop out. The purpose of this study is to demonstrate that novice programmers' behavior can be modeled as a Markov process. The resulting transition matrix can then be used as input to machine learning algorithms to create clusters of similarly behaving students. We describe in detail the state machine used in the Markov process and how to compute the transition matrix. We compute the transition matrix for 665 students and cluster them using the k-means clustering algorithm. We set the number of clusters to three based on an analysis of the dataset. We show that the resulting clusters have statistically significantly different means for students' prior programming knowledge, measured on a Likert scale of 1 to 5.</p>
      </abstract>
      <kwd-group>
        <label>Keywords</label>
        <kwd>machine learning</kwd>
        <kwd>higher education</kwd>
        <kwd>programming skill</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
