Informatics in Education

Source Code Plagiarism Detection in Academia with Information Retrieval: Dataset and the Observation
Volume 18, Issue 2 (2019), pp. 321–344
Oscar KARNALIM, Setia BUDI, Hapnes TOBA, Mike JOY

https://doi.org/10.15388/infedu.2019.15
Pub. online: 16 October 2019. Type: Article

Abstract

Source code plagiarism is an emerging issue in computer science education, and a number of techniques have been proposed to address it. Comparing these techniques is difficult, however, because each is typically evaluated on its own private dataset(s). This paper contributes a public dataset for comparing them, designed specifically for evaluation from an Information Retrieval (IR) perspective. The dataset consists of 467 source code files covering seven introductory programming assessment tasks. Unique to this dataset, both the intention to plagiarise and advanced plagiarism attacks are considered in its construction. The dataset's characteristics were observed by comparing three IR-based detection techniques. The results show that most IR-based techniques are less effective than a baseline technique that relies on Running-Karp-Rabin Greedy-String-Tiling, even though some of them are far more time-efficient.
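To make the IR perspective concrete, the sketch below treats one submission as a query and ranks the other files by similarity to it. It is only an illustration under stated assumptions: the tokenizer, the use of token bigrams, the cosine measure, and all function names (`tokenize`, `ngram_counts`, `cosine`, `rank`) are hypothetical choices made here, not the three techniques or the baseline evaluated in the paper.

```python
import math
import re
from collections import Counter

# Assumption: a crude lexer that splits text into identifiers, numbers,
# and single punctuation symbols. Real detectors use language-aware lexers.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def tokenize(source):
    """Return the list of lexical tokens found in a source string."""
    return TOKEN_RE.findall(source)


def ngram_counts(tokens, n=2):
    """Count overlapping token n-grams (bigrams by default)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, corpus, n=2):
    """Rank corpus files (name -> source text) by similarity to the query."""
    q = ngram_counts(tokenize(query), n)
    scored = [(name, cosine(q, ngram_counts(tokenize(text), n)))
              for name, text in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: a query file, a copy with renamed
    # identifiers, and an unrelated file.
    query = "int total = 0; for (int i = 0; i < n; i++) total += a[i];"
    corpus = {
        "renamed_copy": "int sum = 0; for (int j = 0; j < n; j++) sum += a[j];",
        "unrelated": 'System.out.println("hello");',
    }
    for name, score in rank(query, corpus):
        print(name, round(score, 3))
```

Ranking by a vector-space score like this needs only one pass over each file, which is one plausible reason IR-style techniques can be faster than pairwise string tiling; the sketch makes no claim about their relative effectiveness.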

Keywords
source code plagiarism, dataset, programming, computer science education

INFORMATICS IN EDUCATION

  • Online ISSN: 2335-8971
  • Print ISSN: 1648-5831