- Title:
- TF-IDF inspired detection for cross-language source code plagiarism and collusion
- Authors:
- Karnalim, Oscar
- Subjects:
-
source code plagiarism and collusion
cross-language detection
TF-IDF
computing education
information retrieval
- Publisher:
- Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
- Links:
- https://bibliotekanauki.pl/articles/305519.pdf
- Description:
- Several computing courses allow students to choose which programming language to use for a programming task. This can lead to cross-language code plagiarism and collusion, in which the copied code file is rewritten in another programming language. In response, this paper proposes a detection technique that can accurately compare code files written in different programming languages while requiring limited effort to accommodate those languages at the development stage. The only language-dependent feature used in the technique is the source code tokeniser, and no code conversion is applied. The impact of coincidental similarity is reduced by applying a TF-IDF inspired weighting, in which rare matches are prioritised. Our evaluation shows that the technique outperforms common techniques in academia for handling language conversion disguises. Furthermore, it is comparable to those techniques when dealing with conventional disguises.
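The idea of prioritising rare matches can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details are not given in this record. It assumes that a per-language tokeniser (hypothetical, not shown) has already mapped each submission to a shared set of generic token names such as `IDENT` or `LOOP`, and it swaps in a plain TF-IDF cosine similarity over token 3-grams. The file names, token names, and function names are all invented for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n=3):
    """Return the overlapping n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def idf_weights(docs):
    """Map each n-gram to log(N / df), so rarer n-grams weigh more."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each n-gram once per submission
    n_docs = len(docs)
    return {gram: math.log(n_docs / count) for gram, count in df.items()}


def weighted_similarity(a, b, idf):
    """Cosine similarity between the TF-IDF vectors of two n-gram lists."""
    tf_a, tf_b = Counter(a), Counter(b)
    shared = tf_a.keys() & tf_b.keys()
    dot = sum(tf_a[g] * tf_b[g] * idf[g] ** 2 for g in shared)
    norm_a = math.sqrt(sum((tf * idf[g]) ** 2 for g, tf in tf_a.items()))
    norm_b = math.sqrt(sum((tf * idf[g]) ** 2 for g, tf in tf_b.items()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical generic token streams; in practice these would come from
# one tokeniser per programming language (an assumption of this sketch).
submissions = {
    "alice.java": ["IDENT", "ASSIGN", "NUM", "LOOP", "IDENT", "CALL", "RETURN"],
    "bob.py": ["IDENT", "ASSIGN", "NUM", "LOOP", "IDENT", "CALL", "RETURN"],
    "carol.py": ["IDENT", "ASSIGN", "NUM", "IF", "IDENT", "ASSIGN", "RETURN"],
}

grams = {name: ngrams(tokens) for name, tokens in submissions.items()}
idf = idf_weights(list(grams.values()))

sim_ab = weighted_similarity(grams["alice.java"], grams["bob.py"], idf)
sim_ac = weighted_similarity(grams["alice.java"], grams["carol.py"], idf)
print(f"alice vs bob:   {sim_ab:.3f}")
print(f"alice vs carol: {sim_ac:.3f}")
```

In this toy data the only 3-gram shared by `alice.java` and `carol.py` appears in every submission, so its weight is zero and it adds nothing to their score, whereas the rarer 3-grams shared by `alice.java` and `bob.py` dominate theirs. A real detector would need smoothing and a much larger corpus for the weights to be meaningful.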
- Content provider:
- Biblioteka Nauki
Article